Deduplicate events
The deduplicate operator provides a powerful mechanism to remove duplicate events in a pipeline.
There are numerous use cases for deduplication, such as reducing noise, optimizing costs, and making threat detection and response more efficient. Read our blog post for a high-level discussion.
Analyze unique host pairs
Let's say you're investigating an incident and would like to get a better picture of which entities are involved in the communication. To this end, you would like to extract all unique host pairs to identify who communicated with whom.
Here's what this looks like with Zeek data:
Providing id.orig_h and id.resp_h to the operator restricts the output to all unique host pairs. Note that flipped connections occur twice here, i.e., A → B as well as B → A are present.
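To make the semantics concrete, here is a minimal Python sketch of what deduplicating on a pair of fields means. It is an illustration only, not Tenzir code: the function name `unique_host_pairs`, the flattened field keys, and the sample records are all assumptions made for this example.

```python
def unique_host_pairs(events):
    """Keep the first event for each (id.orig_h, id.resp_h) combination."""
    seen = set()
    result = []
    for event in events:
        # The key is ordered: (A, B) and (B, A) are different keys.
        key = (event["id.orig_h"], event["id.resp_h"])
        if key not in seen:
            seen.add(key)
            result.append(event)
    return result


# Hypothetical sample records, with Zeek's nested id fields flattened.
events = [
    {"id.orig_h": "10.0.0.1", "id.resp_h": "10.0.0.2"},
    {"id.orig_h": "10.0.0.1", "id.resp_h": "10.0.0.2"},  # duplicate, dropped
    {"id.orig_h": "10.0.0.2", "id.resp_h": "10.0.0.1"},  # flipped, kept
]
pairs = unique_host_pairs(events)
```

Because the key is an ordered tuple, the flipped connection survives as its own pair. To treat A → B and B → A as the same pair, you could normalize the key first, for example by sorting the two addresses.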
Remove duplicate alerts
Are you overloaded with alerts, like every analyst? Let's remove some noise from our alerts.
First, let's check what our alert dataset looks like:
Hundreds of thousands of alerts! Maybe I'm just interested in one per hour per affected host pair? Here's the pipeline for this:
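Independent of the pipeline syntax, the idea of "one alert per hour per host pair" can be sketched in plain Python. This is an illustration, not Tenzir code. The field names `ts`, `src_ip`, and `dest_ip`, the function name `dedup_per_window`, and the sample alerts are hypothetical. The sketch also assumes that alerts arrive in timestamp order and that the one-hour window starts at the last alert that was let through for a given pair.

```python
from datetime import datetime, timedelta


def dedup_per_window(alerts, window=timedelta(hours=1)):
    """Let through at most one alert per host pair per time window."""
    last_emitted = {}  # host pair -> timestamp of the last alert kept
    kept = []
    for alert in alerts:
        key = (alert["src_ip"], alert["dest_ip"])
        previous = last_emitted.get(key)
        # Keep the alert if the pair is new or its window has expired.
        if previous is None or alert["ts"] - previous >= window:
            last_emitted[key] = alert["ts"]
            kept.append(alert)
    return kept


# Hypothetical sample alerts in timestamp order.
t0 = datetime(2024, 1, 1, 12, 0)
alerts = [
    {"ts": t0, "src_ip": "10.0.0.1", "dest_ip": "10.0.0.2"},
    {"ts": t0 + timedelta(minutes=10), "src_ip": "10.0.0.1", "dest_ip": "10.0.0.2"},
    {"ts": t0 + timedelta(minutes=15), "src_ip": "10.0.0.3", "dest_ip": "10.0.0.2"},
    {"ts": t0 + timedelta(minutes=70), "src_ip": "10.0.0.1", "dest_ip": "10.0.0.2"},
]
kept = dedup_per_window(alerts)
```

Here the second alert is suppressed because it falls within an hour of the first alert for the same pair, while the alert at minute 70 is let through again. The alert for the other host pair is tracked independently.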