This is an experimental feature: the API is subject to change and robustness is not yet comparable to production-grade features.
Using transforms allows you to apply automatic data transformations as events enter or leave VAST.
VAST supports import and export transforms. The former apply to new data ingested into the system; the latter apply to the results of a VAST query. Both import and export transforms can run in the server or client process. For imports, the client is the source generating the data. For exports, the client is the sink receiving the exported data.
The flexible combination of transform location and transform type enables multiple use cases:
| Location | Transform Type | Use case | Example |
|----------|----------------|----------|---------|
| Client | Import | Enrichment | Add community ID to flow telemetry |
| Server | Import | Compliance | Anonymize PII data |
| Client | Export | Post-processing | Compute expensive function (e.g., string entropy) |
| Server | Export | Access control | Remove sensitive fields |
A transform configuration consists of two parts:
- The transform definition as a list of named steps.
- The transform trigger determining the execution context.
To define a transform, add it to vast.transforms in the configuration file as a named list of steps. For example, a transform named example_transform can consist of two sequential steps that execute in order.
A trigger determines when and where a transform executes. The trigger configuration is a dictionary with three keys:
- transform: the name of a previously defined transform.
- events: a list of event types for which the transform fires.
The triggers reside under a dedicated configuration key, whose subkeys import and export specify the transform type.
For example, a trigger configuration can set a transform to run on the server side during import.
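To make the trigger semantics concrete, here is a minimal Python sketch of how a trigger could be matched against an incoming event. It models the idea only and is not VAST's implementation or configuration syntax. The keys transform and events come from the text above; the location key, the import/export grouping as a dictionary, the event type name example.event, and the function transforms_for are assumptions made for illustration.

```python
from typing import Any, Dict, List

# Hypothetical in-memory mirror of a trigger configuration.
# "transform" and "events" are named in the text; "location" and the
# "import"/"export" grouping are assumptions made for this sketch.
TRIGGERS: Dict[str, List[Dict[str, Any]]] = {
    "import": [
        {
            "transform": "example_transform",
            "location": "server",
            "events": ["example.event"],
        },
    ],
    "export": [],
}


def transforms_for(
    triggers: Dict[str, List[Dict[str, Any]]],
    transform_type: str,
    location: str,
    event_type: str,
) -> List[str]:
    """Return the names of all transforms that fire for the given context."""
    return [
        trigger["transform"]
        for trigger in triggers.get(transform_type, [])
        if trigger["location"] == location and event_type in trigger["events"]
    ]


# Fires: import on the server for a listed event type.
print(transforms_for(TRIGGERS, "import", "server", "example.event"))
# Does not fire: no export triggers are configured.
print(transforms_for(TRIGGERS, "export", "server", "example.event"))
```

In this model, a transform runs only when the transform type, the location, and the event type all match a trigger.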
Transforms consist of a sequence of steps, each of which defines a function that maps a batch of events to a new batch, possibly with a different layout. Steps are plugins and users can write their own.
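The idea of a transform as a chain of batch-to-batch functions can be sketched in a few lines of Python. This is a conceptual model, not the VAST plugin API: a batch is represented as a list of dictionaries, and the names apply_transform, drop_password, and uppercase_user are hypothetical.

```python
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]
Batch = List[Event]
Step = Callable[[Batch], Batch]


def apply_transform(steps: List[Step], batch: Batch) -> Batch:
    """Run each step in order, feeding the output of one into the next."""
    for step in steps:
        batch = step(batch)
    return batch


# Two hypothetical steps, for illustration only.
def drop_password(batch: Batch) -> Batch:
    # Changes the layout: the output events have one field fewer.
    return [{k: v for k, v in event.items() if k != "password"} for event in batch]


def uppercase_user(batch: Batch) -> Batch:
    # Keeps the layout and changes a value.
    return [{**event, "user": str(event["user"]).upper()} for event in batch]


batch = [{"user": "alice", "password": "hunter2"}]
result = apply_transform([drop_password, uppercase_user], batch)
print(result)  # [{'user': 'ALICE'}]
```

Each step returns a new batch instead of mutating its input, so the original events stay intact and steps can be reordered or reused freely.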
We strongly recommend building VAST with Arrow support when using the transform steps described below, since most of them can work far more efficiently on table slices encoded in native Arrow format.
VAST currently ships with the following native steps:
Deletes a field from the input.
- field (string): the name of the field to delete.
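As a rough illustration of what deleting a field means for a batch, the following Python sketch removes one key from every event. It assumes events are plain dictionaries; the function name delete_step and the sample field names are hypothetical and not part of VAST.

```python
from typing import Any, Dict, List


def delete_step(batch: List[Dict[str, Any]], field: str) -> List[Dict[str, Any]]:
    """Return a new batch in which `field` is absent from every event."""
    return [{k: v for k, v in event.items() if k != field} for event in batch]


events = [{"src_ip": "10.0.0.1", "note": "internal"}]
deleted = delete_step(events, "note")
print(deleted)  # [{'src_ip': '10.0.0.1'}]
```

The output layout differs from the input layout, because the deleted column no longer exists.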
Replaces a field with a constant value.
- field (string): the name of the field to replace.
- value (any): the new field value.
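The replacement behavior can be modeled in Python as follows. This sketch assumes events are plain dictionaries and that events lacking the field pass through unchanged; that second point is an assumption of this example, not documented behavior. The function name replace_step is hypothetical.

```python
from typing import Any, Dict, List


def replace_step(
    batch: List[Dict[str, Any]], field: str, value: Any
) -> List[Dict[str, Any]]:
    """Return a new batch in which `field` holds the constant `value`."""
    return [
        # Assumption: events without the field are copied unchanged.
        {**event, field: value} if field in event else dict(event)
        for event in batch
    ]


events = [{"user": "alice", "token": "s3cr3t"}, {"user": "bob"}]
replaced = replace_step(events, "token", "REDACTED")
print(replaced)
```

Unlike deletion, replacement keeps the layout intact, which can matter when downstream consumers expect the field to exist.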
Computes a SHA256 hash digest of a given field.
- field (string): the name of the field over which the hash is computed.
- out (string): the name of the field in which to store the digest.
- salt (string, optional): a salt value for the hash.
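The following Python sketch shows one way such a hashing step could work, using the standard hashlib module. How VAST serializes the field value and combines it with the salt is not stated here, so this example assumes the value is converted to a string and the salt is appended before hashing. The function name hash_step and the sample field names are hypothetical.

```python
import hashlib
from typing import Any, Dict, List, Optional


def hash_step(
    batch: List[Dict[str, Any]],
    field: str,
    out: str,
    salt: Optional[str] = None,
) -> List[Dict[str, Any]]:
    """Store the SHA256 hex digest of `field` in a new field `out`."""
    result = []
    for event in batch:
        new_event = dict(event)
        if field in event:
            data = str(event[field])
            if salt is not None:
                # Assumption: the salt is appended to the value.
                data += salt
            new_event[out] = hashlib.sha256(data.encode("utf-8")).hexdigest()
        result.append(new_event)
    return result


events = [{"src_ip": "10.0.0.1"}]
unsalted = hash_step(events, field="src_ip", out="pseudonym")
salted = hash_step(events, field="src_ip", out="pseudonym", salt="pepper")
print(unsalted[0]["pseudonym"] != salted[0]["pseudonym"])  # True
```

A salt makes digests harder to reverse by dictionary lookup, which is relevant for low-entropy values such as IP addresses. Note that this sketch keeps the original field; combining it with a deletion of that field would be needed for pseudonymization.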