Formats
A format is the bridge between raw bytes and structured data. A format provides a parser and/or printer:
- Parser: translates raw bytes into structured event data
- Printer: translates structured events into raw bytes
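To make the two roles concrete, here is a minimal Python sketch of a parser and a printer. It is an illustration only, not Tenzir's implementation: the function names `parse_ndjson` and `print_csv` and the sample events are hypothetical, and events are modeled as plain dictionaries.

```python
import csv
import io
import json


def parse_ndjson(raw: bytes) -> list[dict]:
    """Parser: translate raw bytes into structured events (one JSON object per line)."""
    events = []
    for line in raw.decode("utf-8").splitlines():
        if line.strip():  # skip blank lines
            events.append(json.loads(line))
    return events


def print_csv(events: list[dict]) -> bytes:
    """Printer: translate structured events into raw bytes (CSV with a header row)."""
    if not events:
        return b""
    buffer = io.StringIO()
    # Assumption for this sketch: all events share the fields of the first event.
    writer = csv.DictWriter(buffer, fieldnames=list(events[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(events)
    return buffer.getvalue().encode("utf-8")


# Hypothetical input bytes, as they might arrive from a connector.
raw_in = b'{"src": "10.0.0.1", "port": 443}\n{"src": "10.0.0.2", "port": 80}\n'
events = parse_ndjson(raw_in)   # bytes -> events
raw_out = print_csv(events)     # events -> bytes
```

The parser and the printer are independent of each other: any parser's events can be handed to any printer, which is what allows a format to provide only one of the two.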
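Parsers and printers interact with their corresponding dual from a connector: a parser consumes the raw bytes that a loader produces, and a printer produces the raw bytes that a saver consumes.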
Formats appear as an argument to the `read` and `write` operators:
read <format>
write <format>
from <connector> [read <format>]
to <connector> [write <format>]
When a printer constructs raw bytes, it sets a MIME content type so that savers can make assumptions about the otherwise opaque content. For example, the `http` saver uses this value to populate the `Content-Type` header when copying the raw bytes into the HTTP request body.
The built-in printers set the following MIME types:
Format | MIME Type |
---|---|
CSV | text/csv |
JSON | application/json |
NDJSON | application/x-ndjson |
Parquet | application/x-parquet |
PCAP | application/vnd.tcpdump.pcap |
SSV | text/plain |
TSV | text/tab-separated-values |
YAML | application/x-yaml |
Zeek TSV | application/x-zeek |
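The way a saver might use this value can be sketched in Python. The MIME types below are copied from the table above; everything else is an assumption made for illustration, including the lowercase dictionary keys, the helper name `build_http_headers`, and the `application/octet-stream` fallback for formats that are not listed.

```python
# MIME types from the table above. The lowercase keys are hypothetical lookup names.
PRINTER_MIME_TYPES = {
    "csv": "text/csv",
    "json": "application/json",
    "ndjson": "application/x-ndjson",
    "parquet": "application/x-parquet",
    "pcap": "application/vnd.tcpdump.pcap",
    "ssv": "text/plain",
    "tsv": "text/tab-separated-values",
    "yaml": "application/x-yaml",
    "zeek-tsv": "application/x-zeek",
}


def build_http_headers(format_name: str, body: bytes) -> dict[str, str]:
    """Build request headers for a body that a printer produced.

    The fallback to application/octet-stream is an assumption of this sketch,
    not documented behavior.
    """
    mime_type = PRINTER_MIME_TYPES.get(format_name, "application/octet-stream")
    return {
        "Content-Type": mime_type,
        "Content-Length": str(len(body)),
    }


headers = build_http_headers("csv", b"a,b\n")
```

Because the saver treats the body as opaque bytes, the MIME type set by the printer is the only information it has about the content.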
Tenzir ships with the following formats:
- `bitz`: Reads and writes BITZ, Tenzir's internal wire format.
- `cef`: Parses events in the Common Event Format (CEF).
- `csv`: The `csv` format is a configuration of the `xsv` format.
- `feather`: Reads and writes the Feather file format, a thin wrapper around …
- `gelf`: Reads Graylog Extended Log Format (GELF) events.
- `grok`: Parses a string using a grok-pattern, backed by regular expressions.
- `json`: Reads and writes JSON.
- `kv`: Reads key-value pairs by splitting strings based on regular expressions.
- `leef`: Parses events in the Log Event Extended Format (LEEF).
- `lines`: Parses and prints events as lines.
- `parquet`: Reads events from a Parquet file. Writes events to a Parquet file.
- `pcap`: Reads and writes raw network packets in PCAP file format.
- `ssv`: The `ssv` format is a configuration of the `xsv` format.
- `suricata`: Reads Suricata's EVE JSON output. The parser is an alias …
- `syslog`: Reads syslog messages.
- `time`: Parses a datetime/timestamp using a strptime-like format string.
- `tsv`: The `tsv` format is a configuration of the `xsv` format.
- `xsv`: Reads and writes lines with separated values.
- `yaml`: Reads and writes YAML.
- `zeek-json`: The `zeek-json` format is an alias for `json` with the arguments: …
- `zeek-tsv`: Reads and writes Zeek tab-separated values.
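The relationship between xsv and its csv, ssv, and tsv configurations can be pictured with a small Python sketch. It is a simplified illustration under stated assumptions, not the actual xsv implementation: it varies only the field separator (comma, space, tab, as the format names suggest), while the real configurations may set further parameters that this page does not list. The names `XSV_FIELD_SEPARATORS`, `print_xsv`, and `print_as` are hypothetical.

```python
import csv
import io

# Assumed field separators, derived from the format names
# (comma-, space-, and tab-separated values).
XSV_FIELD_SEPARATORS = {"csv": ",", "ssv": " ", "tsv": "\t"}


def print_xsv(rows: list[list], field_sep: str) -> str:
    """Generic printer: write rows as lines of values joined by field_sep."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=field_sep, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


def print_as(format_name: str, rows: list[list]) -> str:
    """Named configuration: the generic printer with a preset separator."""
    return print_xsv(rows, XSV_FIELD_SEPARATORS[format_name])


# Hypothetical sample data: a header row followed by one record.
rows = [["host", "port"], ["example.org", 443]]
as_csv = print_as("csv", rows)
as_ssv = print_as("ssv", rows)
as_tsv = print_as("tsv", rows)
```

In this picture, each named format is the generic one with its parameters fixed in advance, so the three share a single code path.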