import

Synopsis

parameters:
[-h | -? | --help] prints the help text
[-t | --table-slice-type=] <atom> table slice type
[-s | --table-slice-size=] <uint64> the suggested size for table slices
[-b | --blocking] block until the IMPORTER has forwarded all data
[-n | --max-events=] <uint64> the maximum number of events to import
[--read-timeout=] <string> read timeout after which data is forwarded to the importer
subcommands:
zeek imports Zeek logs from STDIN or file
csv imports CSV logs from STDIN or file
json imports JSON with schema
suricata imports Suricata EVE JSON
syslog imports syslog messages
test imports random data for testing or benchmarking
pcap imports PCAP logs from STDIN or file
netflow imports NetFlow records (default: listens on :9995/udp)
corelight-json imports Corelight JSON

Documentation

The import command ingests data. An optional filter expression allows for restricting the input to matching events. The format of the imported data must be explicitly specified:

vast import [options] <format> [options] [expr]

The import command is the dual to the export command.

The --type / -t option filters known event types based on a prefix. E.g., vast import json -t zeek matches all event types that begin with zeek, and restricts the event types known to the import command accordingly.

VAST permanently tracks imported event types. They do not need to be specified again for consecutive imports.
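
For example, the following invocation combines a global option, a format, and a filter expression (the expression uses VAST's type extractor syntax and is illustrative):

# Import at most 1000 events whose addresses lie in 192.168.0.0/16.
vast import -n 1000 zeek ':addr in 192.168.0.0/16' < conn.log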

import corelight-json

Synopsis

imports Corelight JSON
parameters:
[-h | -? | --help] prints the help text
[-l | --listen=] <string> the endpoint to listen on ([host]:port/type)
[-r | --read=] <string> path to input where to read events from
[-s | --schema-file=] <string> path to alternate schema
[-S | --schema=] <string> alternate schema as string
[-t | --type=] <string> filter event type based on prefix matching
[-d | --uds] treat -r as listening UNIX domain socket

Documentation
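
The corelight-json import format consumes line-delimited JSON logs as exported by Corelight sensors. A minimal sketch, assuming a log file at an illustrative path:

# Import Corelight JSON logs from a file.
vast import corelight-json -r path/to/logs.json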

import netflow

Synopsis

imports NetFlow records (default: listens on :9995/udp)
parameters:
[-h | -? | --help] prints the help text
[-l | --listen=] <string> the endpoint to listen on ([host]:port/type)
[-r | --read=] <string> path to input where to read events from
[-s | --schema-file=] <string> path to alternate schema
[-S | --schema=] <string> alternate schema as string
[-t | --type=] <string> filter event type based on prefix matching
[-d | --uds] treat -r as listening UNIX domain socket
[--disable-community-id] disable computation of community id for every record

Documentation

The netflow import listens for NetFlow messages (i.e., NetFlow v5, NetFlow v9, or IPFIX).

Note that only the raw NetFlow formats are supported, not the thin nffile wrapper used by the nfdump family of tools. For compatibility with nffile, the nfreplay tool can be used:

# listen on :9995/udp
vast import netflow -l :9995/udp
# replay NetFlow v5 records to :9995/udp
nfreplay -v 5 < path/to/sample.nfcapd

VAST automatically calculates the Community ID for every NetFlow record for better pivoting support. The extra computation induces an overhead of approximately 10% of the ingestion rate. The option --disable-community-id can be used to disable the computation completely.
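
To trade pivoting support for ingestion throughput, disable the computation at the listener:

# Listen for NetFlow without computing Community IDs.
vast import netflow --disable-community-id -l :9995/udp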

For example, to filter records by community ID, you can use the query expression:

netflow.v5.community_id == "1:RZGAbvndscGaLqAvFH3bOls3Jh4="

The community_id field exists in all NetFlow record types and facilitates pivoting to PCAP, Zeek, Suricata, and other data formats that come with community ID support.

import pcap

Synopsis

imports PCAP logs from STDIN or file
parameters:
[-h | -? | --help] prints the help text
[-l | --listen=] <string> the endpoint to listen on ([host]:port/type)
[-r | --read=] <string> path to input where to read events from
[-s | --schema-file=] <string> path to alternate schema
[-S | --schema=] <string> alternate schema as string
[-t | --type=] <string> filter event type based on prefix matching
[-d | --uds] treat -r as listening UNIX domain socket
[-i | --interface=] <string> network interface to read packets from
[-c | --cutoff=] <uint64> skip flow packets after this many bytes
[-m | --max-flows=] <uint64> number of concurrent flows to track
[-a | --max-flow-age=] <uint64> max flow lifetime before eviction
[-e | --flow-expiry=] <uint64> flow table expiration interval
[-p | --pseudo-realtime-factor=] <uint64> factor c delaying packets by 1/c
[--snaplen=] <uint64> snapshot length in bytes
[--drop-rate-threshold=] <real64> drop rate that must be exceeded for warnings to occur
[--disable-community-id] disable computation of community id for every packet

Documentation

The PCAP import format uses libpcap to read network packets from a trace or an interface.

VAST automatically calculates the Community ID for PCAPs for better pivoting support. The extra computation induces an overhead of approximately 15% of the ingestion rate. The option --disable-community-id can be used to disable the computation completely.
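
For example, to read packets from a live interface or from a trace file (the interface name and cutoff value are illustrative):

# Read packets from a network interface (typically requires elevated privileges).
vast import pcap -i eth0
# Read packets from a trace file, skipping per-flow payload after 64 KiB.
vast import pcap -r path/to/trace.pcap -c 65536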

import test

Synopsis

imports random data for testing or benchmarking
parameters:
[-h | -? | --help] prints the help text

Documentation

The test format exists primarily for testing and benchmarking purposes. It generates random data for a given schema.
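
A minimal invocation bounds the output with the global --max-events option:

# Generate 1000 random events.
vast import -n 1000 test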

import syslog

Synopsis

imports syslog messages
parameters:
[-h | -? | --help] prints the help text
[-l | --listen=] <string> the endpoint to listen on ([host]:port/type)
[-r | --read=] <string> path to input where to read events from
[-s | --schema-file=] <string> path to alternate schema
[-S | --schema=] <string> alternate schema as string
[-t | --type=] <string> filter event type based on prefix matching
[-d | --uds] treat -r as listening UNIX domain socket

Documentation

The syslog import format ingests Syslog messages into VAST. The following formats are supported:

  • RFC 5424
  • A fallback format that consists only of the Syslog message.

# Import from a file.
vast import syslog -r path/to/sys.log
# Continuously import from a stream.
syslog | vast import syslog

import suricata

Synopsis

imports Suricata EVE JSON
parameters:
[-h | -? | --help] prints the help text
[-l | --listen=] <string> the endpoint to listen on ([host]:port/type)
[-r | --read=] <string> path to input where to read events from
[-s | --schema-file=] <string> path to alternate schema
[-S | --schema=] <string> alternate schema as string
[-t | --type=] <string> filter event type based on prefix matching
[-d | --uds] treat -r as listening UNIX domain socket

Documentation

The suricata import format consumes EVE JSON logs from Suricata. EVE output is Suricata's unified format for logging all types of activity as a single stream of line-delimited JSON.

For each log entry, VAST parses the field event_type to determine the specific record type and then parses the data according to the known schema.

vast import suricata < eve.log
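
For continuous operation, Suricata can also write EVE logs to a UNIX domain socket that VAST listens on (the socket path is illustrative):

# Treat the -r argument as a listening UNIX domain socket.
vast import suricata -d -r /var/run/suricata/eve.sock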

import json

Synopsis

imports JSON with schema
parameters:
[-h | -? | --help] prints the help text
[-l | --listen=] <string> the endpoint to listen on ([host]:port/type)
[-r | --read=] <string> path to input where to read events from
[-s | --schema-file=] <string> path to alternate schema
[-S | --schema=] <string> alternate schema as string
[-t | --type=] <string> filter event type based on prefix matching
[-d | --uds] treat -r as listening UNIX domain socket

Documentation

The json import format consumes line-delimited JSON objects according to a specified schema. That is, one line corresponds to one event. The object field names correspond to record field names.

JSON can express only a subset of VAST's data model. For example, VAST has first-class support for IP addresses, but JSON can only represent them as strings. To get the most out of your data, it is therefore important to define a schema that provides a differentiated view of the data.

The infer command also supports schema inference for JSON data. For example, head data.json | vast infer prints a raw schema that can be supplied to --schema-file / -s as a file or to --schema / -S as a string. However, after infer dumps the schema, the generic type name should still be adjusted; this is also the time to annotate fields with additional attributes, such as #timestamp or #skip.
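
For illustration, a schema for a simple JSON event stream might look like this; the type name and fields are hypothetical and must match your data:

type custom.log = record{
  ts: time #timestamp,
  src: addr,
  msg: string
}

Passing the schema to the reader then looks as follows:

# Import line-delimited JSON according to the schema in custom.schema.
vast import json -s custom.schema < data.json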

If no type prefix is specified with --type / -t, or multiple types match based on the prefix, VAST uses an exact match based on the field names to automatically deduce the event type for every line in the input.

import csv

Synopsis

imports CSV logs from STDIN or file
parameters:
[-h | -? | --help] prints the help text
[-l | --listen=] <string> the endpoint to listen on ([host]:port/type)
[-r | --read=] <string> path to input where to read events from
[-s | --schema-file=] <string> path to alternate schema
[-S | --schema=] <string> alternate schema as string
[-t | --type=] <string> filter event type based on prefix matching
[-d | --uds] treat -r as listening UNIX domain socket

Documentation

The CSV import format consumes comma-separated values in tabular form. The first line in a CSV file must contain a header that describes the field names. The remaining lines contain concrete values. Except for the header, one line corresponds to one event.

Because CSV has no notion of typing, it is necessary to select a layout via --type/-t whose field names correspond to the CSV header field names.
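
For example, assuming the hypothetical type custom.log from above with fields ts, src, and msg, a matching CSV input starts with the header ts,src,msg:

# Import CSV whose header matches the fields of type custom.log.
vast import csv -t custom.log -r data.csv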

import zeek

Synopsis

imports Zeek logs from STDIN or file
parameters:
[-h | -? | --help] prints the help text
[-l | --listen=] <string> the endpoint to listen on ([host]:port/type)
[-r | --read=] <string> path to input where to read events from
[-s | --schema-file=] <string> path to alternate schema
[-S | --schema=] <string> alternate schema as string
[-t | --type=] <string> filter event type based on prefix matching
[-d | --uds] treat -r as listening UNIX domain socket

Documentation

The Zeek import format consumes Zeek logs in tab-separated value (TSV) style.
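
For example, to ingest a Zeek connection log:

# Read from a file.
vast import zeek -r conn.log
# Read from standard input.
vast import zeek < conn.log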

Here's an example of a typical Zeek conn.log:

#separator \x09
#set_separator ,
#empty_field (empty)
#unset_field -
#path conn
#open 2014-05-23-18-02-04
#fields ts uid id.orig_h id.orig_p id.resp_h id.resp_p proto service duration orig_bytes resp_bytes conn_state local_orig missed_bytes history orig_pkts orig_ip_bytes resp_pkts resp_ip_bytes tunnel_parents
#types time string addr port addr port enum string interval count count string bool count string count count count count set[string]
1258531221.486539 Pii6cUUq1v4 192.168.1.102 68 192.168.1.1 67 udp - 0.163820 301 300 SF - 0 Dd 1 329 1 328 (empty)
1258531680.237254 nkCxlvNN8pi 192.168.1.103 137 192.168.1.255 137 udp dns 3.780125 350 0 S0 - 0 D 7 546 0 0 (empty)
1258531693.816224 9VdICMMnxQ7 192.168.1.102 137 192.168.1.255 137 udp dns 3.748647 350 0 S0 - 0 D 7 546 0 0 (empty)
1258531635.800933 bEgBnkI31Vf 192.168.1.103 138 192.168.1.255 138 udp - 46.725380 560 0 S0 - 0 D 3 644 0 0 (empty)
1258531693.825212 Ol4qkvXOksc 192.168.1.102 138 192.168.1.255 138 udp - 2.248589 348 0 S0 - 0 D 2 404 0 0 (empty)
1258531803.872834 kmnBNBtl96d 192.168.1.104 137 192.168.1.255 137 udp dns 3.748893 350 0 S0 - 0 D 7 546 0 0 (empty)
1258531747.077012 CFIX6YVTFp2 192.168.1.104 138 192.168.1.255 138 udp - 59.052898 549 0 S0 - 0 D 3 633 0 0 (empty)
1258531924.321413 KlF6tbPUSQ1 192.168.1.103 68 192.168.1.1 67 udp - 0.044779 303 300 SF - 0 Dd 1 331 1 328 (empty)
1258531939.613071 tP3DM6npTdj 192.168.1.102 138 192.168.1.255 138 udp - - - - S0 - 0 D 1 229 0 0 (empty)
1258532046.693816 Jb4jIDToo77 192.168.1.104 68 192.168.1.1 67 udp - 0.002103 311 300 SF - 0 Dd 1 339 1 328 (empty)
1258532143.457078 xvWLhxgUmj5 192.168.1.102 1170 192.168.1.1 53 udp dns 0.068511 36 215 SF - 0 Dd 1 64 1 243 (empty)
1258532203.657268 feNcvrZfDbf 192.168.1.104 1174 192.168.1.1 53 udp dns 0.170962 36 215 SF - 0 Dd 1 64 1 243 (empty)
1258532331.365294 aLsTcZJHAwa 192.168.1.1 5353 224.0.0.251 5353 udp dns 0.100381 273 0 S0 - 0 D 2 329 0 0 (empty)

When Zeek rotates logs, it regularly produces compressed batches of *.log.gz files. Ingesting a compressed batch involves decompressing and concatenating the input before sending it to VAST:

zcat *.gz | vast import zeek