This feature is only available in the pro version of VAST. Please contact us if you are interested in trying it out.
Live intelligence matching is a mechanism that allows VAST to check whether the value of a specified field in imported data is contained in a set of known values. Each such value is called a threat indicator or IoC (indicator of compromise), and each match is called a sighting.
The matching is implemented using high-performance probabilistic data
structures to minimize the performance overhead of this feature and enable
usage even on large, high-volume data sets. As a trade-off, matching is
restricted to equality comparisons: for advanced queries such as conjunctions,
disjunctions, or combinations of data from several sources, the regular
command needs to be used.
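The text does not name the probabilistic data structure VAST uses. As an illustration only, the following minimal sketch assumes a Bloom filter, a common choice for this kind of membership test; the class name, its parameters, and the sample address are hypothetical. It shows where the equality-only restriction comes from: the structure stores hash positions rather than the values themselves, so it can only answer "possibly present" or "definitely absent" for one exact value.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: set membership with false positives, no false negatives."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, value: str):
        # Derive one bit position per hash by salting the value with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{value}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value: str) -> None:
        for pos in self._positions(value):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, value: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(value)
        )


# Hypothetical indicator value from a documentation address range.
indicators = BloomFilter(num_bits=8 * 1024, num_hashes=4)
indicators.add("198.51.100.7")
print(indicators.might_contain("198.51.100.7"))  # True: added values are always found
```

Because only bit positions are kept, range or substring lookups cannot be answered from such a structure, which is consistent with the trade-off described above.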
The basic usage pattern is to pick a source of threat indicators and a set of
record fields, and to then spawn a
vast matcher to perform the live matching
between the two sources.
The easiest way to do this is to use the
vast matcher start subcommand, which
spawns a matcher, attaches to its result stream, and outputs one line of JSON
data for each sighting. The
can be used to select which fields of the input stream should be matched against
the indicators of this matcher. By default, all fields with the
attribute are matched.
The vast matcher start command attaches to the output stream of the started
matcher, outputs one line of JSON data for every sighting, and removes the
matcher when the command is interrupted.
When more flexibility is required, one can use the
vast spawn matcher
--name=<matcher-name> command, which takes the same arguments as
start, to spawn a matcher that lives independently of the command
invocation. The sightings of that matcher can then be exported in any format
by a query like
vast export --continuous json "intel.sighting.matcher == <matcher-name>".
In fact, that is how vast matcher start is implemented internally.
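Since sightings arrive as one line of JSON each, a downstream consumer can process them line by line. The sketch below is a minimal example of such a consumer; the function name and the field names in the sample input ("matcher", "ioc") are hypothetical, and real sightings may carry different fields.

```python
import io
import json
from typing import Iterator, TextIO


def read_sightings(stream: TextIO) -> Iterator[dict]:
    """Yield one dict per non-empty line of newline-delimited JSON."""
    for line in stream:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        yield json.loads(line)


# Hypothetical sample input standing in for the matcher's output stream.
sample = io.StringIO(
    '{"matcher": "feodo", "ioc": "198.51.100.7"}\n'
    "\n"
    '{"matcher": "feodo", "ioc": "203.0.113.9"}\n'
)
for sighting in read_sightings(sample):
    print(sighting["ioc"])
```

In practice the stream would typically be `sys.stdin`, with the export command piped into the script.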
To add new indicators to a matcher, simply use
vast import to import new
records of the appropriate type. If no type was explicitly specified, the
intel.indicator type is used:
The supported values for the "type" field are currently
For example, if you have a file
indicators.csv containing indicators according
to the above type description, you could import the indicators as follows:
It is currently not possible to configure a matcher to ignore new indicators that are imported into VAST after the matcher was started.
To remove individual IoCs from a matcher, use the
vast matcher ioc-remove
subcommand, which requires a matcher name and the IoC string and type of the
indicator to be removed.
Note that removing IoCs in this way slightly increases the matcher overhead. To
remove many IoCs in bulk, it is recommended to start a new matcher with an
IoC query yielding the desired remaining IoCs instead.
When the same indicator is added multiple times and subsequently removed, all of the previously added copies are removed from the matcher. It is not necessary to remove the indicator once for every time it was added.
This command only removes the specified indicators from a matcher, not from the database itself.
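The add and remove behavior described above matches plain set semantics rather than reference counting. The following toy model illustrates only that observable behavior; it is not how VAST implements removal, and the class name and the "ip" type value are hypothetical.

```python
class IndicatorSet:
    """Toy model: an indicator is either active or not, however often it was added."""

    def __init__(self) -> None:
        self._active = set()

    def add(self, ioc: str, ioc_type: str) -> None:
        # Adding the same (ioc, type) pair again has no additional effect.
        self._active.add((ioc, ioc_type))

    def remove(self, ioc: str, ioc_type: str) -> None:
        # A single removal is enough, regardless of the number of earlier adds.
        self._active.discard((ioc, ioc_type))

    def matches(self, ioc: str, ioc_type: str) -> bool:
        return (ioc, ioc_type) in self._active


matcher = IndicatorSet()
matcher.add("198.51.100.7", "ip")
matcher.add("198.51.100.7", "ip")
matcher.remove("198.51.100.7", "ip")
print(matcher.matches("198.51.100.7", "ip"))  # False
```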
Custom IoC types
You can use any type you like as the IoC type for a matcher, as long as it has
fields named 'ioc' and 'type' with the same semantics as the
The field names are currently hard-coded; a more flexible way of specifying them is on our roadmap.
Let's assume we want to match IP addresses listed on the Feodo Tracker, a list of active C2 servers maintained by the popular anti-malware site abuse.ch. We first download the raw data:
To use the data in a matcher, we need to import it into VAST as an indicator
type. The generic indicator type
intel.indicator requires the "type" and
"reference" fields to be set in addition to the raw IoC value from the block list.
In addition, the downloaded block list contains commented-out lines starting
with the '#' character and Windows-style "\r\n" line breaks, so we add some
pre-processing before piping the enriched JSON data into VAST.
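The pre-processing described here can be sketched as a small filter. This is a minimal illustration rather than the original command: the function name is hypothetical, and the "ip" type value and the "feodo-tracker" reference string are assumptions, since the list of supported type values is not shown. Only the field names "ioc", "type", and "reference" come from the text.

```python
import json
from typing import Iterable, Iterator


def blocklist_to_indicators(
    lines: Iterable[str], ioc_type: str, reference: str
) -> Iterator[str]:
    """Turn raw block list lines into JSON indicator records, one per line."""
    for raw in lines:
        value = raw.strip()  # also drops the trailing "\r\n"
        if not value or value.startswith("#"):
            continue  # skip blank and commented-out lines
        yield json.dumps({"ioc": value, "type": ioc_type, "reference": reference})


# Hypothetical excerpt of a downloaded block list with Windows-style line breaks.
raw_lines = [
    "# Feodo Tracker block list\r\n",
    "198.51.100.7\r\n",
    "\r\n",
    "203.0.113.9\r\n",
]
for record in blocklist_to_indicators(raw_lines, ioc_type="ip", reference="feodo-tracker"):
    print(record)
```

Each printed line is a self-contained JSON record, so the output can be piped directly into an import of JSON data.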
Next, let's assume we have a Zeek source continuously importing data, and we want to be alerted if any connection to one of the C2 servers is detected. So we start the following matcher:
intel.indicator is the default IoC type, so we would not have needed
to specify it in the example above.
Also, this matcher will use all indicators of type
intel.indicator to match
against. If we wanted to restrict that, we could pass a custom IoC query to
select the indicators to be loaded at startup.
It is possible to specify multiple fields at once for a given matcher, e.g.,
The matcher will print all confirmed sightings to its standard output. To test it, we can run Zeek in a different terminal window to monitor connections and continuously import the results into VAST:
Now, every time you visit one of the compromised servers listed in the block list (be careful!), one line of output should be printed by the matcher command.
The main tuning parameter for a matcher is its state size: the more space it is allowed to allocate, the lower the false positive rate will be and the less the overall system will be affected. As more data points are imported, a low false positive rate becomes increasingly important, because the absolute number of false positives grows linearly with the number of match queries, on top of the base system load already being higher due to the various running importers.
The following table shows some example values for appropriate state sizes:
| Deployment size | Matcher data | Appropriate state size |
|---|---|---|
| Small | 500 indicators, 10K data points/s input | 8 MiB |
| Medium | 200K indicators, 500K data points/s input | 128 MiB |
| Large | 30M indicators, 1M data points/s input | 600 MiB |
Note that "input" in the table above refers to the number of matches that are performed, not the total amount of data that is imported into VAST. Additionally, even though the number of indicators and the matches per second are combined in the table above, each of them independently increases the appropriate state size for a matcher.
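To make the relationship between state size, indicator count, and match rate concrete, the sketch below uses the textbook false positive approximation for a Bloom filter, (1 - e^(-kn/m))^k. This is an assumption for illustration: the text does not say that VAST uses a Bloom filter, the hash count of 4 is arbitrary, and the numbers are not meant to reproduce the table above.

```python
import math


def bloom_false_positive_rate(
    num_indicators: int, state_bytes: int, num_hashes: int
) -> float:
    """Approximate Bloom filter false positive rate: (1 - e^(-k*n/m))^k."""
    num_bits = state_bytes * 8
    return (1.0 - math.exp(-num_hashes * num_indicators / num_bits)) ** num_hashes


def false_positives_per_second(rate: float, matches_per_second: float) -> float:
    """Expected false positives grow linearly with the number of lookups."""
    return rate * matches_per_second


MIB = 1024 * 1024
for size_mib in (8, 32, 128):
    # Assumed workload: 30M indicators and 1M matches per second.
    rate = bloom_false_positive_rate(30_000_000, size_mib * MIB, num_hashes=4)
    print(size_mib, rate, false_positives_per_second(rate, 1_000_000))
```

Under these assumptions, a larger state lowers the per-lookup false positive rate, while more indicators raise it and more matches per second scale the absolute number of false positives, which is why each factor independently pushes the appropriate state size up.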
The matchers spawned by VAST are configured to use a state size of 128 MiB each. If the number of false positives produced by a matcher exceeds the acceptable threshold, we recommend dividing the input space across multiple parallel matchers. If that is not possible, it will be necessary to rebuild VAST from source after adjusting the amount of memory used by each matcher.
A way to configure the state size for each individual matcher from the command line or configuration is in progress.