Distribution

VAST offers a flexible way of distributing its key components over processes, thanks to the actor model. This flexibility accommodates a wide range of possible deployment scenarios, ranging from highly integrated single-process applications to distributed setups.

Ingestion

When ingesting data, there are two options to spin up sources: either run them as a separate client process via vast import or spawn them directly inside the server process via vast spawn. We discuss both options below.

Source as Client Process

By default, starting VAST only spawns the necessary server components, such as archive and index. This works well for ad-hoc scenarios where users perform manual imports, or for low-volume continuous data sources. The VAST client process connects to the server and sends the parsed data over a TCP connection. The setup looks like this:
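To make the client/server split concrete, here is a minimal sketch of the two processes involved. The endpoint flag and the default port 42000 are assumptions for illustration; adjust them to your deployment.

```shell
# Terminal 1: start the VAST server, which spawns archive, index, etc.
vast start

# Terminal 2: run an import client that connects to the server
# and streams parsed Zeek events over TCP (endpoint shown explicitly
# for illustration; it defaults to the local node).
gzcat conn.log.gz | vast --endpoint=localhost:42000 import zeek
```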

The import command implements this form of ingestion. For example, to import a bunch of zipped Zeek logs, you can invoke VAST like this:

gzcat *.log.gz | vast import zeek

Source within Server Process

The second method of ingesting data involves spawning a source actor directly inside the server process. The advantage is performance: this approach cuts out inter-process communication entirely, so handing parsed data from the source to other components amounts to passing a pointer in memory. For high-volume formats, such as PCAP or NetFlow, avoiding this stress on the I/O path can make a major difference.

The spawn and kill commands add and remove components inside a remote VAST node. For example, to spawn a PCAP source that listens on the interface em0, you would do this:

vast spawn source pcap -i em0
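As a counterpart to spawn, the kill command removes a previously spawned component from the node. A hedged sketch follows; the component label pcap-source is a hypothetical name, not taken from the source, so check your node's actual component listing for the real label.

```shell
# Remove the previously spawned PCAP source from the remote node.
# "pcap-source" is an assumed component label for illustration only.
vast kill pcap-source
```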