export

Synopsis

parameters:
[-h | -? | --help] prints the help text
[-c | --continuous] marks a query as continuous
[-u | --unified] marks a query as unified
[--disable-taxonomies] don't substitute taxonomy identifiers
[-n | --max-events=] <uint64> maximum number of results
[-r | --read=] <string> path for reading the query
[-w | --write=] <string> path to write events to
[-d | --uds] treat -w as a UNIX domain socket to connect to
subcommands:
zeek exports query results in Zeek format
csv exports query results in CSV format
ascii exports query results in ASCII format
json exports query results in JSON format
null exports query results without printing them (debug option)
arrow exports query results in Arrow format
pcap exports query results in PCAP format

Documentation

The export command retrieves a subset of data according to a given query expression. The export format must be explicitly specified:

vast export [options] <format> [options] <expr>

This is best explained with an example:

vast export --max-events=100 --continuous json ':timestamp < 1 hour ago'

The above command outputs newline-delimited JSON, with one event per line:

{"ts": "2020-08-06T09:30:12.530972", "nodeid": "1E96ADC85ABCA4BF7EE5440CCD5EB324BEFB6B00#85879", "aid": 9, "actor_name": "pcap-reader", "key": "source.start", "value": "1596706212530"}

The above command instructs the running server to export at most 100 events to the export command, and to do so continuously, i.e., matching only newly arriving data rather than data that was previously imported. Only events that have a field of type timestamp are exported, and only if the value of that field is more than one hour in the past relative to the current time at the node.

The default mode of operation for the export command is the historical query, which exports data that has already been archived and indexed by the node. The --unified flag can be used to export both historical and continuous query results.
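For example, the following command first delivers all matching historical events and then keeps delivering new matches as they arrive (the field name resp_h and the address are purely illustrative):

vast export --unified json 'resp_h == 192.168.1.104'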

For more information on the query expression, see the query language documentation.

Some export formats have format-specific options. For example, the pcap export format has a --flush-interval option that determines after how many packets the output is flushed to disk. A list of format-specific options can be retrieved with vast export <format> help, and individual documentation is available via vast export <format> documentation.
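For example, to list the options of the pcap export format and to read its full documentation:

vast export pcap help
vast export pcap documentation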

export pcap

Synopsis

exports query results in PCAP format
parameters:
[-h | -? | --help] prints the help text
[-f | --flush-interval=] <uint64> flush to disk after this many packets

Documentation

The PCAP export format uses libpcap to write the events of a query result as a PCAP trace.

This command only supports events of type pcap.packet. As a result, VAST transforms a provided query expression E into #type == "pcap.packet" && E.
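For example, a command like the following (the field name sport is only illustrative):

vast export pcap 'sport == 443'

behaves as if the type restriction had been spelled out explicitly:

vast export pcap '#type == "pcap.packet" && sport == 443'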

export arrow

Synopsis

exports query results in Arrow format
parameters:
[-h | -? | --help] prints the help text

Documentation

The Arrow export format renders events in Apache Arrow, a development platform for in-memory data with bindings for many different programming languages.

Primitive VAST types are mapped to Arrow types as follows:

VAST          Arrow
none          NullType
bool          BooleanType
integer       Int64Type
count         UInt64Type
real          DoubleType
time          TimestampType
duration      Int64Type
string        StringType
pattern       StringType
enumeration   UInt64Type
address       FixedSizeBinary(16)
subnet        FixedSizeBinary(17)

The name of the event type contained in a record batch is stored in the metadata field of the schema under the key "name".
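A minimal sketch of retrieving that name with pyarrow, assuming batch holds a record batch read from the stream (schema metadata maps bytes keys to bytes values and may be absent):

def event_type(batch):
    # Schema metadata is a dict of bytes keys/values, or None if unset.
    metadata = batch.schema.metadata or {}
    return metadata.get(b"name", b"").decode("utf-8")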

For example, the Python program below reads Arrow-formatted data from stdin and prints the schema of each batch to stdout.

#! /usr/bin/env python
# Example usage:
# vast -N export arrow '#type ~ /suricata.*/' | ./scripts/print-arrow.py
import sys
import pyarrow

# Open stdin in binary mode.
istream = pyarrow.input_stream(sys.stdin.buffer)
batch_count = 0
row_count = 0
# An Arrow reader consumes a stream of batches with the same schema. When
# reading the result for a query that returns multiple schemas, VAST will use
# multiple writers. Hence, we try to open record batch readers until an
# exception occurs.
try:
    while True:
        print("open next reader")
        reader = pyarrow.ipc.RecordBatchStreamReader(istream)
        try:
            while True:
                batch = reader.read_next_batch()
                batch_count += 1
                row_count += batch.num_rows
                print(str(batch.schema))
        except StopIteration:
            print("done with current reader, rows: " + str(row_count))
            batch_count = 0
            row_count = 0
except Exception:
    # Opening a reader on an exhausted stream raises; treat it as end of input.
    print("done with all readers")

export null

Synopsis

exports query results without printing them (debug option)
parameters:
[-h | -? | --help] prints the help text

Documentation

The null export format does not render its results, and is used for debugging and benchmarking only.

export json

Synopsis

exports query results in JSON format
parameters:
[-h | -? | --help] prints the help text
[--flatten] flatten nested objects into the top-level

Documentation

The JSON export format renders events as newline-delimited JSON (also known as JSONL).
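To illustrate --flatten, a nested record such as the following (made up for this example)

{"id": {"orig_h": "192.168.1.102", "resp_h": "8.8.8.8"}}

is rendered with the nested field names joined into top-level keys instead:

{"id.orig_h": "192.168.1.102", "id.resp_h": "8.8.8.8"}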

export ascii

Synopsis

exports query results in ASCII format
parameters:
[-h | -? | --help] prints the help text

Documentation

The ASCII export format renders events according to VAST's data grammar. It merely dumps the data, without type information, and is therefore useful when digging for specific values.

export csv

Synopsis

exports query results in CSV format
parameters:
[-h | -? | --help] prints the help text

Documentation

The export csv command renders comma-separated values in tabular form. The first line contains a header that describes the field names. The remaining lines contain concrete values; except for the header, one line corresponds to one event.
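For instance, the output for a hypothetical schema with fields ts, src, and dst could be shaped as follows (values are made up):

ts,src,dst
2020-08-06T09:30:12.530972,192.168.1.102,8.8.8.8
2020-08-06T09:30:13.304956,192.168.1.103,8.8.4.4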

export zeek

Synopsis

exports query results in Zeek format
parameters:
[-h | -? | --help] prints the help text
[--disable-timestamp-tags] don't print #open/#close timestamp tags in the output

Documentation

The Zeek export format writes events in Zeek's tab-separated value (TSV) style.
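A sketch of the general shape of such output, with a made-up connection event and tabs rendered as whitespace; the --disable-timestamp-tags option suppresses the #open and #close lines:

#separator \x09
#path    conn
#open    2020-08-06-09-30-12
#fields  ts                  id.orig_h      id.resp_h
#types   time                addr           addr
1596706212.530972            192.168.1.102  8.8.8.8
#close   2020-08-06-09-30-13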