sigma
Filter the input with Sigma rules and output matching events.
Synopsis
sigma <rule> [--refresh-interval <refresh-interval>]
sigma <directory> [--refresh-interval <refresh-interval>]
Description
The sigma operator executes Sigma rules on its input. If a rule matches, the operator emits a tenzir.sigma event that wraps the input record into a new record along with the matching rule. The operator discards all events that do not match the provided rules.
For each rule, the operator transpiles the YAML into an expression and instantiates a where operator, followed by a put operator that generates the output.
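The filter-then-wrap behavior described above can be sketched in a few lines of Python. This is only an illustration of the data flow, not the operator's implementation: the function name apply_rule and the output field names event and rule are assumptions chosen for the sketch.

```python
def apply_rule(events, rule, predicate):
    """Yield one wrapper record per event that satisfies the predicate."""
    for event in events:
        if predicate(event):  # corresponds to the "where" step
            # Corresponds to the "put" step: wrap the input record together
            # with the rule that matched. Field names are hypothetical.
            yield {"event": event, "rule": rule}


rule = {"title": "Example rule"}
events = [{"a": 42, "b": "evil"}, {"a": 1, "b": "benign"}]
matches = list(apply_rule(events, rule, lambda e: e["a"] == 42 and e["b"] == "evil"))
```

Events that fail the predicate produce no output, which mirrors how the operator discards non-matching events.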
Here's how the transpilation works. The Sigma rule YAML format requires a detection attribute that includes a map of named sub-expressions called search identifiers. In addition, detection must include a final condition that combines search identifiers using Boolean algebra (AND, OR, and NOT) or syntactic sugar to reference groups of search expressions, e.g., using the 1/all of * or plain wildcard syntax. Consider a detection embedded in a rule that defines two search identifiers, foo and bar, plus a condition that combines them.
We translate this detection by building a symbol table of all keys (foo and bar). Each sub-expression is a valid expression in itself:
foo: a == 42 && b == "evil"
bar: c == 1.2.3.4
Finally, we combine the sub-expressions into a single expression according to condition.
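As a rough illustration of these steps, the following Python sketch builds the symbol table and then substitutes each search identifier into the condition. It is a toy transpiler under stated assumptions, not Tenzir's code: the condition string "foo or not bar" is hypothetical, only map-valued search identifiers are handled, and the helper names are invented for the example.

```python
import ipaddress
import re


def render_value(value):
    """Render a YAML scalar as an expression literal."""
    if isinstance(value, str):
        try:
            # Strings that parse as IP addresses are emitted unquoted.
            return str(ipaddress.ip_address(value))
        except ValueError:
            return '"' + value + '"'
    return str(value)


def build_symbol_table(detection):
    """Map each search identifier to a conjunction of equality predicates."""
    table = {}
    for name, fields in detection.items():
        if name == "condition":
            continue
        table[name] = " && ".join(
            f"{field} == {render_value(value)}" for field, value in fields.items()
        )
    return table


def combine(detection):
    """Substitute search identifiers into the condition."""
    table = build_symbol_table(detection)
    operators = {"and": "&&", "or": "||", "not": "!"}
    parts = []
    for token in re.findall(r"\(|\)|[^\s()]+", detection["condition"]):
        if token in table:
            parts.append(f"({table[token]})")
        else:
            parts.append(operators.get(token.lower(), token))
    return " ".join(parts)


detection = {
    "foo": {"a": 42, "b": "evil"},
    "bar": {"c": "1.2.3.4"},
    "condition": "foo or not bar",  # hypothetical condition for illustration
}
expression = combine(detection)
```

With the assumed condition, expression evaluates to (a == 42 && b == "evil") || ! (c == 1.2.3.4).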
We parse the YAML string values according to Tenzir's richer data model, e.g., the expression c: 1.2.3.4 becomes a field named c and value 1.2.3.4 of type ip, rather than a string. Sigma also comes with its own event taxonomy to standardize field names. The sigma operator currently does not normalize fields according to this taxonomy but rather takes the field names verbatim from the search identifier.
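To make the idea of typed parsing concrete, here is a minimal Python sketch that tries progressively richer types before falling back to a string. The function parse_value and the order of the attempted types are assumptions for illustration; Tenzir's actual parser and type set differ.

```python
import ipaddress


def parse_value(text):
    """Return the most specific typed value that the text parses as."""
    for parser in (int, float, ipaddress.ip_address, ipaddress.ip_network):
        try:
            return parser(text)
        except ValueError:
            continue
    return text  # nothing richer applies, keep the string


ip = parse_value("1.2.3.4")        # an IPv4Address, not a string
subnet = parse_value("10.0.0.0/8")  # an IPv4Network
number = parse_value("42")          # an int
plain = parse_value("evil")         # stays a string
```

Typed values let comparisons such as subnet membership operate on addresses instead of on their textual form.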
Sigma uses value modifiers to select a concrete relational operator for a given search predicate. Without a modifier, Sigma uses equality comparison (==) of field and value. For example, the contains modifier changes the relational operator to substring search, and the re modifier switches to a regular expression match. The table below shows what modifiers the sigma operator supports, where ✅ means implemented, 🚧 not yet implemented but possible, and ❌ not yet supported:
Modifier | Use | sigmac | Tenzir |
---|---|---|---|
contains | perform a substring search with the value | ✅ | ✅ |
startswith | match the value as a prefix | ✅ | ✅ |
endswith | match the value as a suffix | ✅ | ✅ |
base64 | encode the value with Base64 | ✅ | ✅ |
base64offset | encode value as all three possible Base64 variants | ✅ | ✅ |
utf16le/wide | transform the value to UTF-16 little endian | ✅ | 🚧 |
utf16be | transform the value to UTF-16 big endian | ✅ | 🚧 |
utf16 | transform the value to UTF-16 | ✅ | 🚧 |
re | interpret the value as a regular expression | ✅ | ✅ |
cidr | interpret the value as an IP CIDR | ❌ | ✅ |
all | change the expression logic from OR to AND | ✅ | ✅ |
lt | compare less than (<) the value | ❌ | ✅ |
lte | compare less than or equal to (<=) the value | ❌ | ✅ |
gt | compare greater than (>) the value | ❌ | ✅ |
gte | compare greater than or equal to (>=) the value | ❌ | ✅ |
expand | expand value to placeholder strings, e.g., %something% | ❌ | ❌ |
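The following Python sketch shows one way to read the table: each modifier selects a different comparison, all switches the quantifier over a value list from OR to AND, and base64offset expands a value into the three Base64 substrings that can appear depending on its byte alignment. The names MODIFIERS, matches, and base64_offset_variants are hypothetical, comparisons are case-sensitive for simplicity, and none of this reflects how Tenzir implements the modifiers internally.

```python
import base64
import ipaddress
import operator
import re

# Each modifier maps to a binary test over (field value, rule value).
MODIFIERS = {
    None: operator.eq,  # no modifier: equality
    "contains": lambda field, value: value in field,
    "startswith": lambda field, value: field.startswith(value),
    "endswith": lambda field, value: field.endswith(value),
    "re": lambda field, value: re.search(value, field) is not None,
    "cidr": lambda field, value: ipaddress.ip_address(field) in ipaddress.ip_network(value),
    "lt": operator.lt,
    "lte": operator.le,
    "gt": operator.gt,
    "gte": operator.ge,
}


def matches(field_value, modifier, values, require_all=False):
    """Test a field against a list of values; require_all models the all modifier."""
    test = MODIFIERS[modifier]
    quantifier = all if require_all else any
    return quantifier(test(field_value, value) for value in values)


def base64_offset_variants(value):
    """Return the Base64 substrings of value for byte offsets 0, 1, and 2."""
    start = (0, 2, 3)      # leading characters influenced by padding bytes
    end = (None, -3, -2)   # trailing characters influenced by following bytes
    return [
        base64.b64encode(b" " * i + value)[start[i]:end[(len(value) + i) % 3]]
        for i in range(3)
    ]


variants = base64_offset_variants(b"evil")
```

A Base64-encoded payload that contains the bytes evil at any position then contains at least one of the three variants as a substring, which is what makes the expansion useful for substring search.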
<rule.yaml>
The rule to match.
This invocation transpiles rule.yaml at the time of pipeline creation.
<directory>
The directory to watch.
This invocation watches a directory and attempts to parse each contained file as a Sigma rule. The sigma operator matches if any of the contained rules match, effectively creating a disjunction of all rules inside the directory.
--refresh-interval <refresh-interval>
How often the sigma operator checks the specified rule or directory of rules to update its internal state.
Defaults to 5 seconds.
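Conceptually, a refresh interval amounts to polling. The sketch below shows one plausible way to detect new or modified rule files between polls by comparing modification times. It is an assumption-laden illustration, not a description of the operator's internals; scan_rules, watch, and the reload callback are hypothetical names.

```python
import os
import time


def scan_rules(directory, known):
    """Return (changed paths, new snapshot) relative to the previous snapshot."""
    current = {}
    for entry in os.scandir(directory):
        if entry.is_file():
            current[entry.path] = entry.stat().st_mtime_ns
    # A path counts as changed if it is new or its modification time differs.
    changed = sorted(path for path, mtime in current.items() if known.get(path) != mtime)
    return changed, current


def watch(directory, reload, refresh_interval=5.0):
    """Poll the directory forever and reload every changed rule file."""
    known = {}
    while True:
        changed, known = scan_rules(directory, known)
        for path in changed:
            reload(path)  # e.g., re-transpile the rule at this path
        time.sleep(refresh_interval)
```

Because the snapshot is rebuilt on every poll, files removed from the directory drop out of it automatically.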
Examples
Apply a Sigma rule to an EVTX file using evtx_dump:
Apply a Sigma rule over historical data in a node from the last day:
export | where :timestamp > 1 day ago | sigma rule.yaml
Watch a directory of Sigma rules and apply all of them on a continuous stream of Suricata events:
from file --follow eve.json read suricata | sigma /tmp/rules/
When you add a new file to /tmp/rules, the sigma operator transpiles it and matches it against all subsequent inputs.