summarize
Groups events and applies aggregate functions to each group.
Description
The summarize operator groups events according to certain fields and applies aggregation functions to each group. The operator consumes the entire input before producing any output, and may reorder the event stream.
The order of the output fields follows the sequence of the provided arguments. Unspecified fields are dropped.
Because no output is produced until the entire input has been consumed, take care when using this operator with large inputs.
group
To group by a certain field, use the syntax <field> or <field>=<field>. For each unique combination of the group fields, a single output event will be returned.
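As a rough illustration of these grouping semantics (not the operator's implementation), the Python sketch below emits one event per unique combination of the group fields. The helper name group_keys, the field names, and the sample events are made up for this example.

```python
def group_keys(events, fields):
    """Return one output event per unique combination of the group fields."""
    seen = {}
    for event in events:
        key = tuple(event.get(f) for f in fields)
        # A dict keeps first-seen order here; the real operator may reorder output.
        seen.setdefault(key, dict(zip(fields, key)))
    return list(seen.values())


# Hypothetical input events.
events = [
    {"src_ip": "10.0.0.1", "proto": "tcp", "bytes": 10},
    {"src_ip": "10.0.0.1", "proto": "tcp", "bytes": 20},
    {"src_ip": "10.0.0.1", "proto": "udp", "bytes": 5},
    {"src_ip": "10.0.0.2", "proto": "tcp", "bytes": 7},
]

groups = group_keys(events, ["src_ip", "proto"])
```

Four input events collapse into three groups, and the bytes field is dropped because it is neither a group field nor an aggregation result.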
aggregation
The aggregation functions applied to each group are specified with f(…) or <field>=f(…), where f is the name of an aggregation function (see below) and <field> is an optional name for the result. The aggregation function will produce a single result for each group.
If no name is specified, the aggregation function call will automatically generate one. If processing continues after summarize, we strongly recommend specifying a custom name.
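The following Python sketch models how named aggregations could behave: each group yields exactly one result per aggregation, stored under the name chosen by the caller. The summarize helper, its signature, and the sample data are assumptions for illustration only, and the automatically generated names are deliberately not modeled.

```python
from collections import defaultdict


def summarize(events, group_fields, aggregations):
    """Group events and apply named aggregations to each group.

    `aggregations` maps an output field name to (function, input field).
    """
    buckets = defaultdict(list)
    for event in events:
        buckets[tuple(event[f] for f in group_fields)].append(event)
    out = []
    for key, members in buckets.items():
        # Output fields follow the argument order: group fields, then results.
        row = dict(zip(group_fields, key))
        for name, (func, field) in aggregations.items():
            row[name] = func(m[field] for m in members)
        out.append(row)
    return out


# Hypothetical input events.
events = [{"x": 1, "y": "a"}, {"x": 2, "y": "a"}, {"x": 5, "y": "b"}]

per_group = summarize(events, ["y"], {"total": (sum, "x")})
overall = summarize(events, [], {"total": (sum, "x")})
```

With an empty list of group fields, all events fall into a single group, which corresponds to aggregating over the whole input.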
Examples
Compute the sum of a field over all events
Group over y and compute the sum of x for each group:
Gather unique values in a list
Group the input by src_ip and aggregate all unique dest_port values into a list:
Same as above, but produce a count of the unique number of values instead of a list:
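In Python terms, the two variants differ only in what is emitted per group: the list of distinct values, or its length. This is a sketch of the semantics with made-up events and output field names (dest_ports, count), not the pipeline syntax.

```python
from collections import defaultdict

# Hypothetical input events.
events = [
    {"src_ip": "10.0.0.1", "dest_port": 80},
    {"src_ip": "10.0.0.1", "dest_port": 443},
    {"src_ip": "10.0.0.1", "dest_port": 80},
    {"src_ip": "10.0.0.2", "dest_port": 22},
]

ports_by_src = defaultdict(list)
for event in events:
    ports = ports_by_src[event["src_ip"]]
    if event["dest_port"] not in ports:  # keep each value only once
        ports.append(event["dest_port"])

# Variant 1: list of unique values per group.
unique_ports = [{"src_ip": ip, "dest_ports": p} for ip, p in ports_by_src.items()]
# Variant 2: number of unique values per group.
unique_counts = [{"src_ip": ip, "count": len(p)} for ip, p in ports_by_src.items()]
```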
Compute min and max of a group
Compute minimum and maximum of the timestamp field per src_ip group:
Compute minimum and maximum of the timestamp field over all events:
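The difference between the per-group and the global variant can be sketched in Python as follows. The sample events are invented, and the code only mirrors the semantics, not the operator's syntax.

```python
from datetime import datetime

# Hypothetical input events.
events = [
    {"src_ip": "10.0.0.1", "timestamp": datetime(2024, 1, 1, 10, 0)},
    {"src_ip": "10.0.0.1", "timestamp": datetime(2024, 1, 1, 12, 30)},
    {"src_ip": "10.0.0.2", "timestamp": datetime(2024, 1, 1, 9, 15)},
]

# Per src_ip group: track a running (min, max) pair for each address.
per_src = {}
for e in events:
    ip, ts = e["src_ip"], e["timestamp"]
    lo, hi = per_src.get(ip, (ts, ts))
    per_src[ip] = (min(lo, ts), max(hi, ts))

# Over all events: no group field, so there is a single (min, max) pair.
overall = (
    min(e["timestamp"] for e in events),
    max(e["timestamp"] for e in events),
)
```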
Check if any value of a group is true
Create a boolean flag originator that is true if any value in the src_ip group is true:
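A minimal Python sketch of this "any" semantics is shown below. The input field name is_orig and the sample events are assumptions, since the source does not name the boolean field being aggregated.

```python
# Hypothetical input events; `is_orig` is an assumed field name.
events = [
    {"src_ip": "10.0.0.1", "is_orig": False},
    {"src_ip": "10.0.0.1", "is_orig": True},
    {"src_ip": "10.0.0.2", "is_orig": False},
]

originator = {}
for e in events:
    ip = e["src_ip"]
    # The flag becomes true as soon as one value in the group is true.
    originator[ip] = originator.get(ip, False) or e["is_orig"]
```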
Create 1-hour time buckets
Create 1-hour groups and produce a summary of network traffic between host pairs:
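The bucketing idea can be sketched in Python: map every timestamp to the start of its hour, then use that value as an additional group field alongside the host pair. The field names ts and bytes, the sample events, and the choice to floor (rather than round) timestamps are assumptions made for this illustration.

```python
from collections import defaultdict
from datetime import datetime


def hour_bucket(ts):
    """Floor a timestamp to the start of its hour."""
    return ts.replace(minute=0, second=0, microsecond=0)


# Hypothetical input events.
events = [
    {"ts": datetime(2024, 1, 1, 10, 5), "src_ip": "10.0.0.1", "dest_ip": "10.0.0.9", "bytes": 100},
    {"ts": datetime(2024, 1, 1, 10, 50), "src_ip": "10.0.0.1", "dest_ip": "10.0.0.9", "bytes": 50},
    {"ts": datetime(2024, 1, 1, 11, 10), "src_ip": "10.0.0.1", "dest_ip": "10.0.0.9", "bytes": 25},
]

# Group by (hour bucket, source, destination) and sum the traffic volume.
traffic = defaultdict(int)
for e in events:
    traffic[(hour_bucket(e["ts"]), e["src_ip"], e["dest_ip"])] += e["bytes"]
```

The two events inside the 10:00 hour end up in one group, while the 11:10 event starts a new one.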