This release introduces two new powerful OCSF operators that automate enum derivation and provide intelligent field trimming. The update also includes string padding functions, better HTTP requests, IP categorization and much more!
Download the release on GitHub.
Features
Improved node robustness
We added an experimental feature to run node-independent operators of a pipeline
in dedicated subprocesses. This brings improved error resilience and resource
utilization. You can opt in to this feature with the setting
tenzir.disable-pipeline-subprocesses: false in tenzir.yaml. We plan to
enable this feature by default in the future.
Operations on concatenated secrets
You can now arbitrarily nest operations on secrets. This is useful for APIs that expect authentication as an encoded blob:

let $headers = {
  auth: f"{secret("user")}:{secret("password")}".encode_base64()
}

By @IyeOnline in #5324.
New string padding functions
Ever tried aligning threat actor names in your incident reports? Or formatting CVE IDs with consistent spacing for your vulnerability dashboard? We’ve all been there, fighting with inconsistent string lengths that make our security tools’ output look like alphabet soup. 🍲
Meet your new formatting friends: pad_start() and pad_end()!
Live Threat Feed Dashboard
Create a real-time threat indicator board with perfectly aligned columns:

from {time: "14:32", actor: "APT29", target: "energy", severity: 9},
     {time: "14:35", actor: "Lazarus", target: "finance", severity: 10},
     {time: "14:41", actor: "APT1", target: "defense", severity: 8}
select threat_line = time + " │ " + actor.pad_end(12) + " │ " + target.pad_end(10) + " │ " + severity.string().pad_start(2, "0")
write_lines

14:32 │ APT29        │ energy     │ 09
14:35 │ Lazarus      │ finance    │ 10
14:41 │ APT1         │ defense    │ 08

CVE Priority Matrix
Format CVE IDs and CVSS scores for your vulnerability management system:

from {cve: "CVE-2024-1337", score: 9.8, vector: "network", status: "🔴"},
     {cve: "CVE-2024-42", score: 7.2, vector: "local", status: "🟡"},
     {cve: "CVE-2024-31415", score: 5.4, vector: "physical", status: "🟢"}
select priority = status + " " + cve.pad_end(16) + " [" + score.string().pad_start(4) + "] " + vector.pad_start(10, "·")
write_lines

🔴 CVE-2024-1337    [ 9.8] ···network
🟡 CVE-2024-42      [ 7.2] ·····local
🟢 CVE-2024-31415   [ 5.4] ··physical

Network Flow Analysis
Build clean firewall logs with aligned source/destination pairs:

from {src: "10.0.0.5", dst: "8.8.8.8", proto: "DNS", bytes: 234},
     {src: "192.168.1.100", dst: "13.107.42.14", proto: "HTTPS", bytes: 8924},
     {src: "172.16.0.50", dst: "185.199.108.153", proto: "SSH", bytes: 45812}
select flow = src.pad_start(15) + " → " + dst.pad_start(15) + " [" + proto.pad_end(5) + "] " + bytes.string().pad_start(7) + " B"
write_lines

       10.0.0.5 →         8.8.8.8 [DNS  ]     234 B
  192.168.1.100 →    13.107.42.14 [HTTPS]    8924 B
    172.16.0.50 → 185.199.108.153 [SSH  ]   45812 B

Both padding functions accept three parameters:
- String to pad (required)
- Target length (required)
- Padding character (optional, defaults to space)
If your string is already longer than the target length, it is returned unchanged. Multi-character padding? That’s a paddlin’ (returns an error).
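The rules above can be modeled in a few lines of Python. This is only a sketch of the described semantics, not Tenzir's implementation. The Python function names mirror the TQL ones for readability, the `ValueError` stands in for whatever error TQL reports, and the length is counted in Unicode code points, which is an assumption.

```python
def pad_start(s: str, length: int, pad: str = " ") -> str:
    """Left-pad s to the target length (model of the described behavior)."""
    if len(pad) != 1:
        # Assumption: multi-character padding is rejected with an error.
        raise ValueError("padding must be a single character")
    # str.rjust returns s unchanged when it is already long enough.
    return s.rjust(length, pad)


def pad_end(s: str, length: int, pad: str = " ") -> str:
    """Right-pad s to the target length (model of the described behavior)."""
    if len(pad) != 1:
        raise ValueError("padding must be a single character")
    # str.ljust returns s unchanged when it is already long enough.
    return s.ljust(length, pad)


print(pad_start("9", 2, "0"))         # 09
print(pad_start("network", 10, "·"))  # ···network
print(pad_end("toolong", 3))          # toolong
```

The single-character check comes first so that an invalid padding argument is reported even when the input needs no padding at all.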
Your SOC dashboards never looked so clean! 🎯
Sinks in HTTP Parsing Pipelines
Parsing pipelines in the from_http and http operators now support sinks. This
worked already in from_file parsing pipelines and now works, as expected, also
in the HTTP parsing pipelines. For example, you can now write:

from_http "https://cra.circl.lu/opendata/geo-open/mmdb-country-asn/latest.mmdb" {
  context::load "geo-open-country-asn"
}

HTTP request body encoding
The from_http and http operators now support using record values for the
request body parameter. By default, the record is serialized as JSON. You can
also specify encode="form" to send the body as URL-encoded form data. When
using form encoding, nested fields are flattened using dot notation (e.g.,
foo: {bar: "baz"} => foo.bar=baz). This supersedes the payload parameter,
which therefore is now deprecated.
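To make the dot-notation flattening concrete, here is a rough Python model of form encoding for a nested record. It is a sketch of the described behavior only: the helper names `flatten` and `encode_form` are hypothetical, and how Tenzir handles lists, nulls, or percent-escaping of special characters is not covered by the text above, so this sketch simply defers to `urllib.parse.urlencode`.

```python
from urllib.parse import urlencode


def flatten(record: dict, prefix: str = "") -> list[tuple[str, object]]:
    """Flatten nested dicts into (dotted.key, value) pairs, keeping field order."""
    pairs: list[tuple[str, object]] = []
    for key, value in record.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            pairs.extend(flatten(value, name))
        else:
            pairs.append((name, value))
    return pairs


def encode_form(record: dict) -> str:
    """URL-encode the flattened pairs (hypothetical stand-in for encode="form")."""
    return urlencode(flatten(record))


body = encode_form({"foo": {"bar": "baz"}, "count": 42})
print(body)       # foo.bar=baz&count=42
print(len(body))  # 20
```

The 20-character result corresponds to the Content-Length a request with this form-encoded body would carry.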
Examples
By default, setting body to a record will JSON-encode it:

http "https://api.example.com/data", body={foo: "bar", count: 42}

POST /data HTTP/1.1
Host: api.example.com
Content-Type: application/json
Content-Length: 33

{
  "foo": "bar",
  "count": 42
}

To change the encoding, you can use the encode option:

http "https://api.example.com/data", body={foo: {bar: "baz"}, count: 42}, encode="form"

POST /data HTTP/1.1
Host: api.example.com
Content-Type: application/x-www-form-urlencoded
Content-Length: 20

foo.bar=baz&count=42

Arbitrary body contents can be sent by using a string or blob:

http "https://api.example.com/data", body="hello world!"

POST /data HTTP/1.1
Host: api.example.com
Content-Length: 12

hello world!

IP address categorization functions
Ever wondered if that suspicious traffic is coming from inside the corporate network? 🏢 We’ve got you covered with a new suite of IP address classification functions that make network analysis a breeze.
is_private() - Quickly spot internal RFC 1918 addresses in your logs.
Perfect for identifying lateral movement or distinguishing between internal and
external threats:

where src_ip.is_private() and dst_ip.is_global()
// Catch data exfiltration attempts from your internal network

is_global() - Find publicly routable addresses. Essential for tracking
external attackers or monitoring outbound connections:

where src_ip.is_global() and failed_login_count > 5
// Detect brute force attempts from the internet

is_multicast() - Identify multicast traffic (224.0.0.0/4, ff00::/8).
Great for spotting mDNS, SSDP, and other multicast protocols that shouldn’t
cross network boundaries:

where dst_ip.is_multicast() and src_ip.is_global()
// Flag suspicious multicast from external sources

is_link_local() - Detect link-local addresses (169.254.0.0/16,
fe80::/10). Useful for identifying misconfigurations or APIPA fallback:

where server_ip.is_link_local()
// Find services accidentally binding to link-local addresses

is_loopback() - Spot loopback addresses (127.0.0.0/8, ::1). Hunt for
suspicious local connections or tunneled traffic:

where src_ip != dst_ip and dst_ip.is_loopback()
// Unusual loopback connections might indicate malware

ip_category() - Get the complete classification in one shot. Returns
"global", "private", "multicast", "link_local", "loopback", "broadcast", or
"unspecified":

where src_ip.ip_category() == "private" and dst_ip.ip_category() == "multicast"
// Analyze traffic patterns by IP category

These functions work seamlessly with both IPv4 and IPv6 addresses, making them future-proof for your dual-stack environments. Happy hunting! 🔍
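For intuition, the categories can be approximated with Python's standard `ipaddress` module. This is a rough model, not Tenzir's implementation. The order of the checks, the treatment of IPv6 unique local addresses (fc00::/7) as "private", and the fallback to "global" for everything else are assumptions, so special ranges such as carrier-grade NAT or documentation prefixes may be classified differently by Tenzir.

```python
import ipaddress

# RFC 1918 ranges, plus fc00::/7 as an assumed IPv6 counterpart.
PRIVATE_NETS = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "fc00::/7")
]
BROADCAST = ipaddress.ip_address("255.255.255.255")


def ip_category(text: str) -> str:
    """Classify an IPv4 or IPv6 address (approximate model, see assumptions)."""
    addr = ipaddress.ip_address(text)
    if addr.is_unspecified:   # 0.0.0.0 or ::
        return "unspecified"
    if addr.is_loopback:      # 127.0.0.0/8 or ::1
        return "loopback"
    if addr.is_link_local:    # 169.254.0.0/16 or fe80::/10
        return "link_local"
    if addr.is_multicast:     # 224.0.0.0/4 or ff00::/8
        return "multicast"
    if addr == BROADCAST:
        return "broadcast"
    # Membership tests across IP versions evaluate to False, so mixing
    # IPv4 and IPv6 networks in one list is safe.
    if any(addr in net for net in PRIVATE_NETS):
        return "private"
    return "global"           # assumption: everything else counts as global


for ip in ("10.0.0.5", "8.8.8.8", "224.0.0.251", "fe80::1", "::1"):
    print(ip, ip_category(ip))
```

The sketch checks the narrow special-purpose ranges before the private ranges so that each address receives exactly one label.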
ocsf::trim and ocsf::derive
Tenzir now provides two new operators for processing OCSF events:
ocsf::derive automatically assigns enum strings from their integer
counterparts and vice versa. It performs bidirectional enum derivation for OCSF
events and validates consistency between existing enum values.

from {
  activity_id: 1,
  class_uid: 1001,
  metadata: {version: "1.5.0"},
}
ocsf::derive

This transforms the event to include the derived activity_name: "Create" and
class_name: "File System Activity" fields.
ocsf::trim intelligently removes fields from OCSF events to reduce data
size while preserving essential information. You can also explicitly control
whether optional and recommended fields are removed.

from {
  class_uid: 3002,
  class_name: "Authentication",
  user: {
    name: "alice",
    display_name: "Alice",
  },
  status: "Success",
}
ocsf::trim

This removes non-essential fields like class_name and user.display_name
while keeping critical information intact.
Compression for write_bitz
Tenzir’s internal wire format, which is accessible through the read_bitz and
write_bitz operators, now uses Zstd compression internally, resulting in a
significantly smaller output size. This change is backwards-compatible.
By @dominiklohmann in #5335.
Changes
Better query optimization
Previously, queries that used export followed by a where that used fields
such as this["field name"] were not optimized. Now, the same optimizations
apply as with normal fields, improving the performance of such queries.
Improved join behavior
The join function now also works with empty lists that are typed as
list<null>. Furthermore, it now emits more helpful warnings.
Respecting error responses from Azure Log Analytics
The to_azure_log_analytics operator now emits an error when it receives any
response indicating an internal error. Such responses normally point to
configuration errors, so the pipeline now stops with an error instead of
continuing to send data that will not be received correctly.
Renamed to_asl
We renamed our Amazon Security Lake integration operator from to_asl to
to_amazon_security_lake. The old name is now deprecated and will be removed
in the future.
By @IyeOnline in #5340.
kv parser no longer produces empty fields
Our key-value parsers (the read_kv operator and parse_kv function) previously
produced empty values if the value_split was not found.
With this change, a “field” missing a value_split is considered an extension
of the previous field’s value instead:

from \
  {input: "x=1 y=2 z=3 4 5 a=6"},
this = { ...input.parse_kv() }

Previous result:

{x:1, y:2, z:"3", "4":"", "5":"", a:6}

New result:

{x:1, y:2, z:"3 4 5", a:6}

By @IyeOnline in #5313.
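The new continuation rule of the kv parser can be modeled with a short Python sketch. It illustrates the described behavior under simplifying assumptions and is not Tenzir's parser: the separators are plain strings, the `field_split` parameter name and its default are assumptions, values stay strings (no type inference), and a token without a separator that appears before any key is dropped.

```python
def parse_kv(text: str, field_split: str = " ", value_split: str = "=") -> dict[str, str]:
    """Parse key-value pairs; tokens without value_split extend the previous value."""
    result: dict[str, str] = {}
    last_key = None
    for token in text.split(field_split):
        if value_split in token:
            key, value = token.split(value_split, 1)
            result[key] = value
            last_key = key
        elif last_key is not None:
            # No value_split found: treat the token as a continuation of the
            # previous field's value instead of creating an empty field.
            result[last_key] += field_split + token
        # Assumption: a separator-less token before the first key is ignored.
    return result


print(parse_kv("x=1 y=2 z=3 4 5 a=6"))
# {'x': '1', 'y': '2', 'z': '3 4 5', 'a': '6'}
```

Unlike the TQL output, every value here remains a string because the sketch skips type inference.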
Bug Fixes
Non-default databases in to_clickhouse
The to_clickhouse operator erroneously rejected table arguments of the form
database_name.table_name. This is now fixed, allowing you to write to
non-default databases.
By @IyeOnline in #5355.
Remove file size limit from Amazon Security Lake Integration
We removed the 256 MB file size limit from the Amazon Security Lake integration.
By @IyeOnline in #5340.
Newlines before else
Previously, the if … { … } else { … } construct required that there was no
newline before else. This restriction is now lifted, which allows placing
else at the beginning of the line:

if x {
  …
}
else if y {
  …
}
else {
  …
}

Fixed encrypt_cryptopan function
We fixed a bug that sometimes caused the encrypt_cryptopan function to fail
with the error “got ip, expected ip”, which was caused by an incorrect type
check. The function now works as expected again.
By @dominiklohmann in #5345.
Fixed list_separator option name in print_csv
The print_csv, print_ssv and print_tsv functions had an option incorrectly
named field_separator. Instead, these functions have an option list_separator
now, allowing you to change the list separator.
You cannot set a custom field_separator on these functions. If you want to
print with custom field_separators, use print_xsv instead.
By @IyeOnline in #5357.
Fix context::create_geoip without db_path
The context::create_geoip operator failed with a message_mismatch error when
no db_path option was provided. This was caused by an internal serialization
error, which we have now fixed. This is the only known place where this error
occurred.
By @dominiklohmann in #5342.
Fix http operator pagination
The http operator dropped all provided HTTP headers after the first request
when performing paginated requests. The operator now preserves the headers for
all requests.