VAST's query language is designed for effective subsetting of data. The syntax
borrows from the terseness of
awk and from the network-centric focus of
tcpdump, enhanced with a rich type system.
A query is a boolean expression that declaratively defines the subset of
interest. Sub-expressions can be connected as conjunctions (
&&), disjunctions (
||), and negations (
!). Expression operands are either sub-expressions
or predicates at the leaves.
The following example shows an abstract syntax tree (AST) along with the corresponding expression.
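To make the tree structure concrete, here is a minimal Python sketch of a boolean expression tree with predicates at the leaves. It is not VAST's implementation: the class names (Predicate, Conjunction, Disjunction, Negation), the dictionary-based event, and the proto and service fields are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]  # hypothetical stand-in for a single event


@dataclass
class Predicate:
    """Leaf node: a test applied to one event."""
    test: Callable[[Event], bool]

    def evaluate(self, event: Event) -> bool:
        return self.test(event)


@dataclass
class Conjunction:
    """Inner node for &&: every operand must hold."""
    operands: List[Any]

    def evaluate(self, event: Event) -> bool:
        return all(op.evaluate(event) for op in self.operands)


@dataclass
class Disjunction:
    """Inner node for ||: at least one operand must hold."""
    operands: List[Any]

    def evaluate(self, event: Event) -> bool:
        return any(op.evaluate(event) for op in self.operands)


@dataclass
class Negation:
    """Inner node for !: inverts its single operand."""
    operand: Any

    def evaluate(self, event: Event) -> bool:
        return not self.operand.evaluate(event)


# Tree for an expression of the shape (A || B) && !C.
expr = Conjunction([
    Disjunction([
        Predicate(lambda e: e.get("orig_bytes", 0) >= 10 * 1024),
        Predicate(lambda e: e.get("proto") == "udp"),
    ]),
    Negation(Predicate(lambda e: e.get("service") == "dns")),
])
```

Calling expr.evaluate(event) walks the tree from the root and only consults the event at the leaves, which mirrors the split between sub-expressions and predicates described above.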
Let's take a look at the individual components in more depth.
A predicate has the form
LHS op RHS, where
LHS denotes the left-hand
side operand and
RHS the right-hand side operand. The relational operator
op is typed, i.e., only a subset of the cross product
of operand types is valid. An operand is either an
extractor or a data value. Operands are always typed,
but for extractors the type can sometimes only be inferred at query runtime.
In this case, the type check also takes place lazily.
The following operators separate two operands:
|not equal to|
|in (left to right)|
|not in (left to right)|
|in (right to left)|
|not in (right to left)|
The table below illustrates a partial function over the cross product of
available types. Green cells represent a valid combination of
operand types for the given set of operator classes.
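The idea of a partial function over operand types, including the lazy check for extractors whose type is not yet known, can be sketched in Python as follows. This is an illustration only: the operator class names (equality, ordering, membership) are hypothetical, and the set of valid combinations is a small subset read off the example queries in this document, not VAST's full compatibility table.

```python
from typing import Optional

# Hypothetical grouping of operators into classes.
OPERATOR_CLASS = {
    "==": "equality",
    ">": "ordering",
    ">=": "ordering",
    "in": "membership",
}

# Illustrative subset of valid (LHS type, operator class, RHS type) triples,
# derived from example queries in this document.
VALID_COMBINATIONS = {
    ("time", "ordering", "time"),        # ts > 1 day ago
    ("count", "ordering", "count"),      # orig_bytes >= 10Ki
    ("addr", "equality", "addr"),        # :addr == <address>
    ("addr", "membership", "subnet"),    # orig_h in 192.168.0.0/24
    ("string", "membership", "string"),  # "evil" in :string
}


def check_predicate(lhs_type: Optional[str], op: str,
                    rhs_type: Optional[str]) -> Optional[bool]:
    """Return True/False if the combination can be checked now.

    Return None when an operand type is still unknown, which models
    deferring the type check until query runtime.
    """
    if lhs_type is None or rhs_type is None:
        return None
    op_class = OPERATOR_CLASS.get(op)
    return (lhs_type, op_class, rhs_type) in VALID_COMBINATIONS
```

Any triple missing from the set is rejected, so the function is partial in the same sense as the table: most of the cross product is simply not defined.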
An extractor retrieves a certain aspect of an event. VAST has the following extractor types:
Field: extracts all fields whose names match a given record field name.
Type: extracts all event types that have a field of a given type.
Meta: matches on the type name or field name of a layout instead of the values contained in actual events.
Field extractors have the form
x or x.y.z, where x, y, and
z match on
record field names. The dot (.) accesses fields in nested records. Using a type name as the
leftmost element before a
. is also possible.
A field extractor has suffix semantics. It is possible to just write
z or y.z instead of
x.y.z. In fact, writing
z is equivalent to
*.z and creates a
disjunction of all fields ending in
z.
ts > 1 day ago
zeek.conn.id.orig_h in 192.168.0.0/24
orig_bytes >= 10Ki
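The suffix semantics of field extractors can be sketched in Python as below. This assumes matching happens on whole dot-separated components (so h does not match orig_h), which is one reading of the *.z equivalence and not a confirmed detail of VAST. The function names and the sample field paths are hypothetical.

```python
from typing import List


def matches_field(extractor: str, field_path: str) -> bool:
    """Check whether an extractor matches the tail of a full field path."""
    want = extractor.split(".")
    have = field_path.split(".")
    if want[0] == "*":
        # "*.z" is the explicit spelling of the suffix form "z".
        want = want[1:]
    return 0 < len(want) <= len(have) and have[-len(want):] == want


def resolve(extractor: str, field_paths: List[str]) -> List[str]:
    """Return every field the extractor would cover (the disjunction)."""
    return [path for path in field_paths if matches_field(extractor, path)]


# Hypothetical fully qualified field paths, type name leftmost.
FIELDS = [
    "zeek.conn.id.orig_h",
    "zeek.conn.id.resp_h",
    "zeek.dns.id.orig_h",
    "zeek.conn.orig_bytes",
]
```

With these sample paths, resolve("orig_h", FIELDS) covers the field in both zeek.conn and zeek.dns, while the fully qualified zeek.conn.id.orig_h narrows the result to a single field.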
Type extractors have the form
:T, where
T is the type of a field. Type
extractors work for all basic types and
type aliases.
A search for type
:T includes all aliased types. For example, given the alias
type port = count, the
:count type extractor will also consider
instances of type
port. However, a
:port query does not include
count types because an alias is a strict refinement of an existing type.
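The one-directional relationship between an alias and its underlying type can be sketched in Python as follows. The alias table holds only the port = count alias from the text; the function names are hypothetical and the lookup is an illustration of the rule, not VAST's type system.

```python
from typing import Dict, List

# Alias name -> aliased type, using the example alias from the text.
ALIASES: Dict[str, str] = {"port": "count"}


def type_chain(name: str, aliases: Dict[str, str]) -> List[str]:
    """Return the type followed by every type it (transitively) aliases."""
    chain = [name]
    while chain[-1] in aliases and aliases[chain[-1]] not in chain:
        chain.append(aliases[chain[-1]])
    return chain


def type_extractor_matches(query_type: str, field_type: str,
                           aliases: Dict[str, str]) -> bool:
    """A :T extractor matches a field if T is the field's type or one it aliases."""
    return query_type in type_chain(field_type, aliases)
```

A field of type port has the chain port, count, so both :port and :count match it. A field of plain type count has the chain count only, so :port does not match it.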
:timestamp > 1 hour ago
:addr == 184.108.40.206
:count > 42M
"evil" in :string
Meta extractors have the forms
#type and
#field. They work on the layouts of
events instead of the value domain. The
#type form matches on the event layout name, or event type, hence the
name. Analogously, the
#field form matches on the field names of events.
These forms are useful when you're interested in exporting all events of a certain type or those containing a particular field independent of the event's content.
#type == "zeek.conn"
"suricata" in #type
#field == "community_id"
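Because meta extractors look only at layouts, they can be evaluated without touching any event values. The Python sketch below illustrates this with a hypothetical table of layouts; the layout contents and function names are made up for the example, and treating #field == as an exact match on a field name is an assumption.

```python
from typing import Dict, List

# Hypothetical layouts: layout (event type) name -> field names.
LAYOUTS: Dict[str, List[str]] = {
    "zeek.conn": ["ts", "id.orig_h", "orig_bytes", "community_id"],
    "suricata.flow": ["timestamp", "src_ip", "community_id"],
    "zeek.dns": ["ts", "query"],
}


def type_equals(name: str) -> List[str]:
    """Layouts selected by a query like: #type == "zeek.conn"."""
    return [layout for layout in LAYOUTS if layout == name]


def type_contains(substring: str) -> List[str]:
    """Layouts selected by a query like: "suricata" in #type."""
    return [layout for layout in LAYOUTS if substring in layout]


def has_field(field: str) -> List[str]:
    """Layouts selected by a query like: #field == "community_id"."""
    return [layout for layout, fields in LAYOUTS.items() if field in fields]
```

Each function returns layout names rather than events: once the matching layouts are known, every event of those layouts qualifies regardless of its content.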
Predicates with type extractors and equality operators can be written tersely
as value predicates. That is, if a predicate has the form
:T == X where
X is a value and
T the type of
X, it suffices to write
X.
The predicate parser deduces the type of
X automatically in this case.
For example, 220.127.116.11 is a valid predicate and expands to
:addr == 220.127.116.11.
This allows for quick type-based point queries, such as
(22.214.171.124 || 80/tcp) && "evil".
Value predicates of type
subnet expand more broadly. Given a subnet
10.0.0.0/8, the parser expands this to:
:subnet == 10.0.0.0/8 || :addr in 10.0.0.0/8
This makes it easier to search for IP addresses belonging to a specific subnet.
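The expansion of value predicates can be sketched in Python using the standard ipaddress module to deduce the type of a literal. This is a simplified illustration, not VAST's parser: it recognizes only addresses, subnets, and double-quoted strings, the function names are hypothetical, and the subnet branch assumes the expansion covers both subnet values and addresses contained in the subnet.

```python
import ipaddress


def infer_type(token: str) -> str:
    """Guess the type of a literal; only addr, subnet, and string are handled."""
    try:
        ipaddress.ip_address(token)
        return "addr"
    except ValueError:
        pass
    if "/" in token:
        try:
            ipaddress.ip_network(token, strict=True)
            return "subnet"
        except ValueError:
            pass
    if len(token) >= 2 and token[0] == token[-1] == '"':
        return "string"
    raise ValueError(f"unsupported literal: {token}")


def expand_value_predicate(token: str) -> str:
    """Rewrite a bare value X into an explicit type-extractor predicate."""
    value_type = infer_type(token)
    if value_type == "subnet":
        # Assumed broader expansion: match the subnet itself or any
        # address that lies inside it.
        return f":subnet == {token} || :addr in {token}"
    return f":{value_type} == {token}"
```

For instance, expand_value_predicate("220.127.116.11") yields an :addr equality predicate, while a subnet literal such as 10.0.0.0/8 yields a disjunction with two type extractors.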