Skip to main content
Version: Next


Executes YARA rules on byte streams.


yara [-B|--blockwise] [-C|--compiled-rules] [-f|--fast-scan] <rule> [<rule>..]


The yara operator applies YARA rules to an input of bytes, emitting rule context upon a match.

We modeled the operator after the official yara command-line utility to enable a familiar experience for the command users. Similar to the official yara command, the operator compiles the rules by default, unless you provide the option -C,--compiled-rules. To quote from the above link:

This is a security measure to prevent users from inadvertently using compiled rules coming from a third-party. Using compiled rules from untrusted sources can lead to the execution of malicious code in your computer.

The operator uses a YARA scanner under the hood that buffers blocks of bytes incrementally. Even though the input arrives in non-contiguous blocks of memories, the YARA scanner engine support matching across block boundaries. For continuously running pipelines, use the --blockwise option that considers each block as a separate unit. Otherwise the scanner engine would simply accumulate blocks but never trigger a scan.


Match on every byte chunk instead of triggering a scan when the input exhausted.

This option makes sense for never-ending dataflows where each chunk of bytes constitutes a self-contained unit, such as a single file.


Interpret the rules as compiled.

When providing this flag, you must exactly provide one rule path as positional argument.


Enable fast matching mode.


The path to the YARA rule(s).

If the path is a directory, the operator attempts to recursively add all contained files as YARA rules.


The examples below show how you can scan a single file and how you can create a simple rule scanning service.

Perform one-shot scanning of files

Scan a file with a set of YARA rules:

load file --mmap evil.exe | yara rule.yara
Memory Mapping Optimization

The --mmap flag is merely an optimization that constructs a single chunk of bytes instead of a contiguous stream. Without --mmap, the file loader generates a stream of byte chunks and feeds them incrementally to the yara operator. This also works, but performance is better due to memory locality when using --mmap.

Let's unpack a concrete example:

rule test {
string = "string meta data"
integer = 42
boolean = true

$foo = "foo"
$bar = "bar"
$baz = "baz"

($foo and $bar) or $baz

You can produce test matches by feeding bytes into the yara operator:

echo 'foo bar' | tenzir 'load stdin | yara /tmp/test.yara'

You will get one yara.match per matching rule:

"rule": {
"identifier": "test",
"namespace": "default",
"tags": [],
"meta": {
"string": "string meta data",
"integer": 42,
"boolean": true
"strings": {
"$foo": "foo",
"$bar": "bar",
"$baz": "baz"
"matches": {
"$foo": [
"data": "Zm9v",
"base": 0,
"offset": 0,
"match_length": 3
"$bar": [
"data": "YmFy",
"base": 0,
"offset": 4,
"match_length": 3

Each match has a rule field describing the rule and a matches record indexed by string identifier to report a list of matches per rule string.

Build a YARA scanning service

Let's say you want to build a service that scans malware sample that you receive over a Kafka topic malware.

Launch the processing pipeline as follows:

load kafka --topic malware | yara --blockwise /path/to/rules

If you run this pipeline on the command line via tenzir <pipeline>, you see the matches arriving as JSON. You could also send the matches via the fluent-bit sink to Slack, Splunk, or any other Fluent Bit output. For example, via Slack:

load kafka --topic malware
| yara --blockwise /path/to/rules
| fluent-bit slack webhook=<url>

This pipeline requires that every Kafka message is a self-contained malware sample. Because the pipeline runs continuously, we supply the --blockwise option so that the yara triggers a scan for every Kafka message, as opposed to accumulating all messages indefinitely and only initiating a scan when the input exhausts.

You can now submit a malware sample by sending it to the malware Kafka topic:

load file --mmap evil.exe | save kafka --topic malware

This pipeline loads the file evil.exe as single blob and sends it to Kafka, at topic malware.