Parses a string as a Syslog message.
parse_syslog [octet_counting=bool, raw=bool, schema=string, selector=string, schema_only=bool, unflatten_separator=string]Description
Section titled “Description”Parses a string as a Syslog message.
Tenzir supports reading syslog messages in both the standardized “Syslog Protocol” format (RFC 5424), and the older “BSD syslog Protocol” format (RFC 3164).
input: string
Section titled “input: string”The string to parse as a syslog message.
octet_counting = bool (optional)
Section titled “octet_counting = bool (optional)”Controls handling of length-prefixed syslog messages as defined in RFC 6587. Some syslog implementations prepend the message length in bytes, followed by a space:
<length> <syslog-message>For example, where 65 is the byte count of the syslog message that follows:
65 <165>1 2023-10-11T22:14:15.003Z host app 1234 ID01 - Test messageThe parameter supports three modes:
- Not specified (default): Auto-detect. Treats the length prefix as a hint, not an authority. The parser first tries to parse the full message; if that fails and a valid prefix is present, it falls back to parsing the truncated message. This maximizes leniency for messages that happen to start with digits.
octet_counting=true: Require and trust the length prefix. Emits a warning and returns null if the input lacks a valid prefix. If the message is longer than the stated length, truncates to the stated length before parsing.octet_counting=false: Never strip a length prefix. Parse the input as-is, treating any leading digits as part of the message.
raw = bool (optional)
Section titled “raw = bool (optional)”Use only the raw types that are native to the parsed format. Fields that have a
type specified in the chosen schema will still be parsed according to the
schema.
schema = string (optional)
Section titled “schema = string (optional)”Provide the name of a schema to be used by the parser.
If a schema with a matching name is installed, the result will always have all fields from that schema.
- Fields that are specified in the schema, but did not appear in the input will be null.
- Fields that appear in the input, but not in the schema will also be kept. Use
schema_only=trueto reject fields that are not in the schema.
If the given schema does not exist, this option instead assigns the output schema name only.
The schema option is incompatible with the selector option.
selector = string (optional)
Section titled “selector = string (optional)”Designates a field value as schema name with an optional dot-separated prefix.
The string is parsed as <fieldname>[:<prefix>]. The prefix is optional and
will be prepended to the field value to generate the schema name.
For example, the Suricata EVE JSON format includes a field event_type that
contains the event type. Setting the selector to event_type:suricata causes an
event with the value flow for the field event_type to map onto the schema
suricata.flow.
The selector option is incompatible with the schema option.
schema_only = bool (optional)
Section titled “schema_only = bool (optional)”When working with an existing schema, this option will ensure that the output schema has only the fields from that schema.
If the schema name is obtained via a selector and it does not exist, this has
no effect.
This option requires either schema or selector to be set.
unflatten_separator = string (optional)
Section titled “unflatten_separator = string (optional)”A delimiter that, if present in keys, causes values to be treated as values of nested records.
A popular example of this is the Zeek JSON
format. It includes the fields id.orig_h, id.orig_p, id.resp_h, and
id.resp_p at the top-level. The data is best modeled as an id record with
four nested fields orig_h, orig_p, resp_h, and resp_p.
Without an unflatten separator, the data looks like this:
{ "id.orig_h": "1.1.1.1", "id.orig_p": 10, "id.resp_h": "1.1.1.2", "id.resp_p": 5}With the unflatten separator set to ., Tenzir reads the events like this:
{ "id": { "orig_h": "1.1.1.1", "orig_p": 10, "resp_h": "1.1.1.2", "resp_p": 5 }}Duplicate Keys
Section titled “Duplicate Keys”If the parser encounters a duplicate key in an event, it will transparently upgrade the field to be a list of values instead.
For a simple example, consider this JSON file:
{"key": 7}{"key": 0.0, "key": 1}{"key": 42}{key: 7}{key: [0.0, 1.0]}{key: 42}If the values are of different type, conversions to a common type will be attempted, such as to a common number type. Ultimately values will be stringified if they do not share a common type:
{"key": 0.0, "key": "1.1.1.1", "key": "example.com"}{key: ["0", "1.1.1.1", "example.com"]}Examples
Section titled “Examples”Parse a RFC5424 syslog string
Section titled “Parse a RFC5424 syslog string”from { input: r#"<165>8 2023-10-11T22:14:15.003Z mymachineexamplecom evntslog 1370 ID47 [exampleSDID@32473 eventSource="Application" eventID="1011"] Event log entry"#}output = input.parse_syslog(){ input: "<165>8 2023-10-11T22:14:15.003Z mymachineexamplecom evntslog 1370 ID47 [exampleSDID@32473 eventSource=\"Application\" eventID=\"1011\"] Event log entry", output: { facility: 20, severity: 5, version: 8, timestamp: 2023-10-11T22:14:15.003Z, hostname: "mymachineexamplecom", app_name: "evntslog", process_id: "1370", message_id: "ID47", structured_data: { "exampleSDID@32473": { eventSource: "Application", eventID: 1011, }, }, message: "Event log entry", },}Parse an octet-counted syslog message
Section titled “Parse an octet-counted syslog message”When dealing with syslog messages that use RFC 6587 octet counting, the function auto-detects and strips the length prefix:
from { input: "119 <165>1 2023-10-11T22:14:15.003Z mymachineexamplecom evntslog 1370 ID47 [exampleSDID@32473 iut=\"3\"] Event log entry"}output = input.parse_syslog(){ input: "119 <165>1 2023-10-11T22:14:15.003Z mymachineexamplecom evntslog 1370 ID47 [exampleSDID@32473 iut=\"3\"] Event log entry", output: { facility: 20, severity: 5, version: 1, timestamp: 2023-10-11T22:14:15.003Z, hostname: "mymachineexamplecom", app_name: "evntslog", process_id: "1370", message_id: "ID47", structured_data: { "exampleSDID@32473": { iut: 3, }, }, message: "Event log entry", },}Use octet_counting=true to require the prefix or octet_counting=false to
disable auto-detection.