🚀 Features
Precise Parsing
Sep 27, 2024 · @IyeOnline · #4527
The CEF, CSV, GELF, JSON, KV, LEEF, Suricata, Syslog, XSV, YAML and Zeek JSON parsers now properly adhere to the schema of the read data. Previously, parsers would merge heterogeneous input into a single, growing schema, inserting nulls for fields that did not exist in some events.
The fluent-bit source now properly adheres to the schema of the read data.
The CEF, CSV, GELF, JSON, KV, LEEF, Suricata, Syslog, XSV, YAML and Zeek JSON
parsers now all support the --schema, --selector flags to parse their data
according to some given schema, as well as various other flags to more
precisely control their output schema.
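As a rough illustration of the difference between the previous merged schema and the new behavior, the following Python sketch models events as plain dictionaries. The function names `merged_schema` and `precise_schema` are hypothetical and exist only for this example. The sketch computes the merged schema over a whole batch at once, which is a simplification and not how Tenzir's parsers are implemented.

```python
# Illustrative sketch only: events are modeled as dicts, and both function
# names are hypothetical. This is not Tenzir's implementation.

def merged_schema(events):
    """Previous behavior (simplified): one growing schema, nulls for absent fields."""
    fields = []
    for event in events:
        for key in event:
            if key not in fields:
                fields.append(key)
    # Every event is widened to the union of all fields seen in the batch.
    return [{key: event.get(key) for key in fields} for event in events]


def precise_schema(events):
    """New behavior (simplified): each event keeps exactly the fields it was read with."""
    return [dict(event) for event in events]


events = [{"a": 1}, {"b": "x"}]
print(merged_schema(events))   # [{'a': 1, 'b': None}, {'a': None, 'b': 'x'}]
print(precise_schema(events))  # [{'a': 1}, {'b': 'x'}]
```

In the merged form both events carry both fields, padded with nulls. In the precise form each event retains only what was present in the input.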
Implement the azure-blob-storage connector
Sep 25, 2024 · @IyeOnline · #4617
The new azure-blob-storage connector allows reading from and writing
to Azure Blob Storage via a URI.
Make the kv-parser consider quotes when looking for separators
Sep 22, 2024 · @IyeOnline · #4591
The kv parser now allows keys and values to be enclosed in double quotes.
Separator matches inside quotes are not considered, and the quotes are trimmed
off keys and values. For example, "key"="nested = value, fun" now successfully
parses as { "key" : "nested = value, fun" }.
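The quote handling described above can be sketched in Python as follows. This is a minimal model and not the kv parser's actual code: the helper names, and the choice of `,` as field separator and `=` as value separator, are assumptions made for the example. Escaped quotes are not handled.

```python
# Illustrative sketch only: helper names and default separators are
# assumptions for this example, not the kv parser's real implementation.

def split_outside_quotes(text, sep):
    """Split text on sep, ignoring occurrences inside double quotes."""
    if not sep:
        raise ValueError("separator must not be empty")
    parts, current, in_quotes, i = [], [], False, 0
    while i < len(text):
        ch = text[i]
        if ch == '"':
            in_quotes = not in_quotes
            current.append(ch)
            i += 1
        elif not in_quotes and text.startswith(sep, i):
            parts.append("".join(current))
            current = []
            i += len(sep)
        else:
            current.append(ch)
            i += 1
    parts.append("".join(current))
    return parts


def trim_quotes(token):
    """Strip surrounding whitespace and one pair of enclosing double quotes."""
    token = token.strip()
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        return token[1:-1]
    return token


def parse_kv(line, field_sep=",", value_sep="="):
    """Parse key-value pairs, honoring double quotes around keys and values."""
    result = {}
    for field in split_outside_quotes(line, field_sep):
        pieces = split_outside_quotes(field, value_sep)
        if len(pieces) < 2:
            continue  # no value separator outside quotes; skipped in this sketch
        key = pieces[0]
        value = value_sep.join(pieces[1:])
        result[trim_quotes(key)] = trim_quotes(value)
    return result


print(parse_kv('"key"="nested = value, fun"'))  # {'key': 'nested = value, fun'}
```

Because both the `=` and the `,` inside the quoted value are skipped during splitting, the whole quoted string ends up as a single value.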
Add a --null option to the lines parser
The lines parser can now handle null-delimited “lines” with the --null flag.
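To illustrate what null-delimited input looks like, here is a small Python sketch that splits a byte string on NUL bytes instead of newlines. The function name `split_null_delimited` is hypothetical, and the sketch only models the idea, not the parser's implementation. Input of this shape is produced, for example, by `find -print0`.

```python
# Illustrative sketch only: the function name is hypothetical and this does
# not reflect how the lines parser is implemented.

def split_null_delimited(data: bytes):
    """Split a byte string into records separated by NUL bytes."""
    records = data.split(b"\0")
    if records and records[-1] == b"":
        records.pop()  # drop the empty piece after a trailing NUL
    return [record.decode("utf-8") for record in records]


# A record may itself contain a newline, which is why NUL is used as delimiter.
print(split_null_delimited(b"first\0second line\nwith newline\0"))
# ['first', 'second line\nwith newline']
```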
Dynamically grow simdjson buffer if necessary
Sep 16, 2024 · @IyeOnline · #4590
The JSON parser can now also handle extremely large events when not using the NDJSON or GELF mode.
Metrics for TCP connections
metrics tcp shows metrics for TCP connections, emitted once every second per
connection. The metrics contain the reads and writes on the socket and the
number of bytes transmitted.
Support bytes inputs in the buffer operator
Sep 16, 2024 · @dominiklohmann · #4594
The buffer operator now works with bytes inputs in addition to the existing
support for events inputs.
🔧 Changes
Switch the index to basic messaging
We removed the unused --snapshot option from the lookup operator.
Prefer recent partitions for retro lookups
Oct 2, 2024 · @dominiklohmann · #4636
The lookup operator now prefers recent data in searches for lookups against
historical data instead of using the order in which context updates arrive.
Stabilize the bitz format
Oct 1, 2024 · @dominiklohmann · #4633
Tenzir’s internal wire format bitz is now considered stable. Note that the
format underwent significant changes as part of its stabilization, and is
incompatible with bitz from Tenzir Node v4.20 and older.
Precise Parsing
Sep 27, 2024 · @IyeOnline · #4527
The JSON parser’s --precise option is now deprecated, as the “precise” mode
is the new default. Use --merge to get the previous “imprecise” behavior.
The JSON parser’s --no-infer option has been renamed to --schema-only. The
old name is deprecated and will be removed in the future.
🐞 Bug Fixes
Stabilize the bitz format
Oct 1, 2024 · @dominiklohmann · #4633
We fixed a very rare crash in the zero-copy parser implementation of read feather and read parquet that was caused by releasing shared memory too
early.
Precise Parsing
Sep 27, 2024 · @IyeOnline · #4527
We fixed various edge cases in parsers where values would not be properly parsed as typed data and were stored as plain text instead. No input data was lost, but no valuable type information was gained either.
Keep from tcp pipelines running on connection failures
Pipelines starting with from tcp no longer enter the failed state when an
error occurs in one of the connections.
Make read json --arrays-of-objects faster
Sep 16, 2024 · @dominiklohmann · #4601
We fixed an accidentally quadratic scaling with the number of top-level array
elements in read json --arrays-of-objects. As a result, using this option will
now be much faster.
Stop using connection timeout to get node components
Sep 16, 2024 · @dominiklohmann · #4597
The import and partitions operators and the tenzir-ctl rebuild command no
longer occasionally fail with request timeouts when the node is under high load.