Tenzir Node v4.9.0

Download the release on GitHub.

Features

Implement `context load`, `context save`, and `context reset`

The context reset operator allows for clearing the state of a context.

The context save and context load operators allow serializing and deserializing the state of a context to/from bytes.

By @eliaskosunen in #3908.

Add new ‘—file’ option to the python operator

The python operator gained a new --file flag that allows loading python code from a file instead of providing it as part of the pipeline definition.

By @lava in #3901.

Add Bloom filter context

The new bloom-filter context represents large sets in a space-efficient manner.

By @mavam in #3834.

Improve the `export` operator

The export operator gained a --low-priority option, which causes it to interfere less with regular priority exports at the cost of potentially running slower.

By @dominiklohmann in #3909.

Handle nested fields and integers as selectors in JSON parser

The --selector option of the json parser now works with nested fields, and integer fields.

By @jachris in #3900.

Add `lines` printer

The lines printer enables simple line-delimited formatting of events.

By @eliaskosunen in #3847.

Add the `openapi` operator

The openapi source operator generates Tenzir’s OpenAPI specification. Use openapi | to ./openapi.yaml to generate a file with the canonical format.

By @dominiklohmann in #3898.

Add structured_data to syslog output

The structured_data field in RFC 5424-style syslog messages is now parsed and included in the output.

By @eliaskosunen in #3871.

Implement a context content dumping mechanism

The new context inspect <context-name> command dumps a specific context’s user-provided data, usually the context’s content.

By @Dakostu in #3893.

Support parsing numeric timestamps since epoch

When specifying a schema with a field typed as time #unit=<unit>, numeric values will be interpreted as offsets from the epoch.

By @jachris in #3927.

Add running and paused times to pipeline metrics

Operator metrics now separately track the time that an operator was paused or running in the time_paused and time_running values in addition to the wall-clock time in time_total. Throughput rates now exclude the paused time from their calculation.

By @dominiklohmann in #3940.

Rewrite `chart` and `set-attributes` operators

The chart operator adds metadata to the schema of the input events, enabling rendering events as bar, area, line, or pie charts on app.tenzir.com.

By @eliaskosunen in #3866.

Update the main repository to include timestamped pipelines

show pipelines and the /pipeline API endpoints now include created_at and last_modified fields that track the pipeline’s creation and last manual modification time, respectively. Pipelines created with older versions of Tenzir will use the start time of the node as their creation time.

By @Dakostu in #3869.

Implement more malleable lookup data for contexts

The context match events now contain a new field mode that states the lookup mode of this particular match.

The enrich operator gained a --filter option, which causes it to exclude enriched events that do not contain a context.

By @Dakostu in #3920.

Update the main repository to include the pipeline run ID

Managed pipelines now contain a new total_runs parameter that counts all started runs. The new run field is available in the events delivered by the metrics and diagnostics operators.

By @Dakostu in #3883.

Changes

Context versioning

The binary format used by contexts for saving on disk on node shutdown is now versioned. A node can support loading of multiple different versions, and automigrate between them.

By @eliaskosunen in #3945.

Remove reader, writer, and language plugin types

We removed the tenzir-ctl start subcommand. Users should switch to the tenzir-node command instead, which accepts the same arguments and presents the same command-line interface.

By @dominiklohmann in #3899.

Disable colors if `NO_COLOR` or not a terminal

Color escape codes are no longer emitted if NO_COLOR is set to a non-empty value, or when the output device is not a terminal.

By @jachris in #3952.

Allow plugins to bundle further builtins

Plugins may now depend on other plugins. Plugins with unmet dependencies are automatically disabled. For example, the lookup and enrich plugins now depend on the context plugin. Run show plugins to see all available plugins and their dependencies.

By @dominiklohmann in #3877.

Replace `tenzir.db-directory` with `tenzir.state-directory`

The option tenzir.db-directory is deprecated in favor of the tenzir.state-directory option and will be removed in the future.

By @dominiklohmann in #3889.

Bug Fixes

Add support for commas in seconds in the time data parser

Commas are now allowed as subsecond separators in timestamps in TQL. Previously, only dots were allowed, but ISO 8601 allows for both.

By @eliaskosunen in #3903.

Update the repository to include lookup lifetime fixes

Retroactive lookups will now properly terminate when they have finished.

By @Dakostu in #3910.

Make `/serve` more consistent

The /serve API sometimes returned an empty string for the next continuation token instead of null when there are no further results to fetch. It now consistently returns null.

By @dominiklohmann in #3885.

Prevent duplicate fields in schema

Invalid schema definitions, where a record contains the same key multiple times, are now detected and rejected.

By @jachris in #3929.

Gracefully handle misaligned header and values in `xsv` parser

The xsv parser (and by extension the csv, tsv, and ssv parsers) skipped lines that had a mismatch between the number of values contained and the number of fields defined in the header. Instead, it now fills in null values for missing values and, if the new --auto-expand option is set, also adds new header fields for excess values.

By @dominiklohmann in #3874.

Fix restart on failure

The option to automatically restart on failure did not correctly trigger for pipelines that failed an operator emitted an error diagnostic, a new mechanism for improved error messages introduced with Tenzir v4.8. Such pipelines now restart automatically as expected.

By @dominiklohmann in #3947.

Fix logger deadlock in python tests

We fixed a rare deadlock by changing the internal logger behavior from blocking until the oldest messages were consumed to overwriting them.

By @lava in #3911.

Improve the `export` operator

We fixed a bug that under rare circumstances led to an indefinite hang when using a high-volume source followed by a slow transformation and a fast sink.

By @dominiklohmann in #3909.

Tenzir Node v4.9.0

Features

Implement context load, context save, and context reset

Add new ‘—file’ option to the python operator

Add Bloom filter context

Improve the export operator

Handle nested fields and integers as selectors in JSON parser

Add lines printer

Add the openapi operator

Add structured_data to syslog output

Implement a context content dumping mechanism

Support parsing numeric timestamps since epoch

Add running and paused times to pipeline metrics

Rewrite chart and set-attributes operators

Update the main repository to include timestamped pipelines

Implement more malleable lookup data for contexts

Update the main repository to include the pipeline run ID

Changes

Context versioning

Remove reader, writer, and language plugin types

Disable colors if NO_COLOR or not a terminal

Allow plugins to bundle further builtins

Replace tenzir.db-directory with tenzir.state-directory

Bug Fixes

Add support for commas in seconds in the time data parser

Update the repository to include lookup lifetime fixes

Make /serve more consistent

Prevent duplicate fields in schema

Gracefully handle misaligned header and values in xsv parser

Fix restart on failure

Fix logger deadlock in python tests

Improve the export operator

Implement `context load`, `context save`, and `context reset`

Improve the `export` operator

Add `lines` printer

Add the `openapi` operator

Rewrite `chart` and `set-attributes` operators

Disable colors if `NO_COLOR` or not a terminal

Replace `tenzir.db-directory` with `tenzir.state-directory`

Make `/serve` more consistent

Gracefully handle misaligned header and values in `xsv` parser

Improve the `export` operator