This release introduces the ocsf::cast operator to streamline schema transformations for OCSF events and adds support for one-level recursion in OCSF objects, enabling recursive relations such as process.parent_process and analytic.related_analytics.
Download the release on GitHub.
Features
ocsf::cast operator
The new ocsf::cast operator handles common schema transformations when working
with OCSF events, such as homogenizing events of the same OCSF type or
converting timestamps to integer counts to strictly adhere to the schema.
This also deprecates the less flexible ocsf::apply operator, which is now
equivalent to ocsf::cast null_fill=true.
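As a minimal sketch of the migration path, the pipeline below replaces a former ocsf::apply call with the equivalent invocation named above. The input record is illustrative only and reuses the shape of the process event shown later in these notes; no other options of ocsf::cast are assumed here.

```tql
// Illustrative input event (hypothetical values).
from {
  metadata: {version: "1.5.0"},
  class_uid: 1007,
  process: {pid: 1234},
}
// Previously: ocsf::apply
// Per the note above, this invocation is equivalent.
ocsf::cast null_fill=true
```

Existing pipelines that use ocsf::apply should keep working, but switching to the explicit form makes the null-filling behavior visible in the pipeline itself.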
Changes
Reduced memory consumption during import
The memory usage while importing events has been significantly optimized.
Previously, importing would leave a trail of memory usage that only decreased
slowly over a period corresponding to tenzir.active-partition-timeout. Now,
events are properly released immediately after being written to disk, preventing
unnecessary memory accumulation.
We also eliminated redundant copies throughout the import path, reducing memory usage by 2-4x depending on the dataset. Additionally, we optimized the memory usage of buffered synopses, which are used internally when building indexes during import. This optimization avoids unnecessary copies of strings and IP addresses, roughly halving the memory consumption of the underlying component.
By @jachris in #5532, #5533, #5535.
Dynamic clean up of expired keys in deduplicate
The deduplicate operator now also takes the configured timeouts into account
when calculating how often to clean up expired state. This results in lower
memory usage if a timeout is under 15min.
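To illustrate the kind of pipeline that benefits, here is a hedged sketch of deduplicate with a short timeout. The create_timeout option name, the 5min value, and the input fields are assumptions for illustration; check the operator reference for the exact options available in your version.

```tql
// Two hypothetical events that share the same key.
from {src_ip: 10.0.0.1, action: "login"},
     {src_ip: 10.0.0.1, action: "login"}
// Assumed option: forget a key 5 minutes after it was first seen.
// With a timeout below 15min, expired keys should now be cleaned up sooner.
deduplicate src_ip, create_timeout=5min
```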
Flip pipeline subprocesses option semantics
We renamed the configuration option to tenzir.pipeline-subprocesses and
kept the feature opt-in to avoid surprising users upgrading from earlier
releases. Set the option to true to enable subprocess execution:

```yaml
tenzir:
  pipeline-subprocesses: true
```

Expose one-level recursion for OCSF objects
We now support recursive OCSF objects at depth one, as opposed to dropping
recursive objects entirely. For example, pipelines can safely follow
relationships such as process.parent_process or analytic.related_analytics:

```tql
from {
  metadata: {version: "1.5.0"},
  class_uid: 1007,
  process: {
    pid: 1234,
    parent_process: {
      pid: 5678,
    },
  },
}
ocsf::apply
// New!
assert process.parent_process.pid == 5678
assert not process.parent_process.has("parent_process")
```

The first assertion now succeeds while deeper ancestry is trimmed automatically, preserving schema compatibility for downstream consumers.
Bug Fixes
Lambda capture extraction
Lambda captures now work correctly for field accesses where the left side is not
a constant field path. For example, .map(x => a[x].b) previously did not
capture a, even though that is required to correctly evaluate the body of the
lambda. This now works as expected.
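A small sketch of a pipeline that exercises this fix follows. The field names a, indices, and values are hypothetical, and the example assumes that list indexing with an integer is valid inside the lambda body.

```tql
// Hypothetical event: `a` is a list of records, `indices` selects entries.
from {
  a: [{b: 10}, {b: 20}],
  indices: [0, 1],
}
// The lambda body reads the outer field `a`, so `a` must be captured
// in addition to the parameter `x`.
values = indices.map(x => a[x].b)
```

Before the fix, the access a[x].b did not register a as a capture because the left side of .b is an index expression and not a constant field path.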