Aging

Experimental Feature

This is an experimental feature: the API is subject to change and robustness is not yet comparable to production-grade features.

The aging feature enables periodic deletion of events based on user-specified queries. When would you use this?

  • Your retention policy requires storing data for at most N months.
  • The volume of incoming data will exceed your storage budget in the foreseeable future.
  • Sensitive data may enter the system but must not be retained.

VAST supports two aging mechanisms: quota-based aging and content-based aging.

Disk-Quota Aging

The most straightforward way to constrain the disk space used by VAST is to configure a disk quota:

vast start --disk-budget-high=1TiB

Whenever VAST detects that its database directory has grown beyond the configured budget, it erases the oldest data in the database. You can additionally specify --disk-budget-low to define a corridor for disk space usage: once the high mark is exceeded, VAST erases data until the directory size falls below the low mark. This avoids running VAST permanently at the upper limit and instead batches the deletion operations together.
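The corridor between the high and low marks can be illustrated with a minimal sketch. This is not VAST's implementation: the function name, the "chunk" tuples, and the byte sizes are hypothetical, and the sketch only models the rule "once above the high mark, erase oldest data until below the low mark".

```python
from collections import deque


def enforce_disk_budget(chunks, high, low):
    """Erase oldest chunks once the total size exceeds `high`.

    `chunks` is a deque of (name, size_in_bytes) tuples, oldest first.
    Deletion continues until the total size falls below `low`.
    Returns the names of the erased chunks.
    """
    erased = []
    total = sum(size for _, size in chunks)
    if total <= high:
        return erased  # Within budget: nothing to erase.
    # Over budget: erase oldest-first until the total drops below the low mark.
    while chunks and total >= low:
        name, size = chunks.popleft()
        total -= size
        erased.append(name)
    return erased


# Hypothetical store: four chunks of 300 bytes each, oldest first.
store = deque([("c0", 300), ("c1", 300), ("c2", 300), ("c3", 300)])
erased = enforce_disk_budget(store, high=1000, low=500)
print(erased)  # ['c0', 'c1', 'c2']
```

With only a high mark, a single chunk would be erased on nearly every check. The low mark lets one deletion cycle free enough space that the next several checks find nothing to do.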

The full set of available options looks like this:

system:
  start:
    # Triggers removal of old data when the DB dir exceeds the disk budget.
    disk-budget-high: 0K
    # When the DB dir exceeds the budget, VAST erases data until the directory size
    # falls below this value.
    disk-budget-low: 0K
    # Seconds between successive disk space checks.
    disk-budget-check-interval: 90

note

When using this method, we recommend placing the log file outside of the database directory. A log file inside the database directory counts toward the size calculation but cannot be deleted automatically during a deletion cycle.
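To see why the log file matters, consider a rough sketch of a recursive directory size calculation. It is an illustration only, not how VAST measures its database directory, and the file names data.bin and server.log are hypothetical.

```python
import os
import tempfile


def directory_size(root):
    """Sum the sizes of all regular files below `root`, in bytes."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            total += os.path.getsize(os.path.join(dirpath, filename))
    return total


# Hypothetical layout: one data file and one log file in the same directory.
with tempfile.TemporaryDirectory() as db_dir:
    with open(os.path.join(db_dir, "data.bin"), "wb") as f:
        f.write(b"\0" * 1000)
    with open(os.path.join(db_dir, "server.log"), "wb") as f:
        f.write(b"\0" * 250)
    measured = directory_size(db_dir)

print(measured)  # 1250: the log file is part of the measured size
```

Because the 250 bytes of log output are included in the measurement but are never erased, a growing log file leaves less of the budget for actual data.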

Content-Based Aging

An alternative way to keep the size of the database in check is to erase data based on content. To do this, an aging query can be specified when starting VAST:

vast --aging-frequency=<timespan> --aging-query=<query> start

The query is periodically executed and returns a set of events that VAST erases from the archive.

note

Content-based deletion currently does not delete the corresponding data in the index, because doing so might require re-indexing, which is not yet implemented.

As a side effect, index statistics from the vast status command are no longer reliable after an aging cycle.

It is also possible to set up the aging query using a vast.yaml configuration file:

system:
  # Run the aging algorithm twice every day.
  aging-frequency: 12 hours
  # Remove all events that have a timestamp associated with them that's older
  # than 7 days.
  aging-query: "#timestamp < 7 days ago"
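The effect of a single aging cycle with the query above can be sketched as follows. This is a simplified model under stated assumptions, not VAST's code: events are plain dictionaries, the function names are hypothetical, and the sketch covers only the event store, not the index.

```python
from datetime import datetime, timedelta, timezone


def select_expired(events, now, max_age):
    """Return the events whose timestamp is older than `now - max_age`."""
    cutoff = now - max_age
    return [event for event in events if event["timestamp"] < cutoff]


def run_aging_cycle(events, now, max_age):
    """Erase expired events in place and return how many were removed."""
    expired_ids = {event["id"] for event in select_expired(events, now, max_age)}
    events[:] = [event for event in events if event["id"] not in expired_ids]
    return len(expired_ids)


# Hypothetical events relative to a fixed point in time.
now = datetime(2021, 6, 15, tzinfo=timezone.utc)
events = [
    {"id": 1, "timestamp": now - timedelta(days=10)},
    {"id": 2, "timestamp": now - timedelta(days=7, seconds=1)},
    {"id": 3, "timestamp": now - timedelta(days=6)},
    {"id": 4, "timestamp": now - timedelta(hours=1)},
]
removed = run_aging_cycle(events, now, max_age=timedelta(days=7))
remaining_ids = [event["id"] for event in events]
print(removed, remaining_ids)  # 2 [3, 4]
```

In this model, an aging frequency of 12 hours corresponds to calling run_aging_cycle twice a day. Each call evaluates the cutoff against the current time, so events gradually move past the 7-day boundary and are removed in a later cycle.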