The VAST project is roughly a decade old. But what happened over the last 10
years? This blog post looks back over time through the lens of the git merge
commits.
Why merge commits? Because they represent a unit of completed contribution.
Feature work takes place in dedicated branches, with the merge to the main
branch sealing the deal. Some feature branches have just one commit, whereas
others have dozens. The distribution is not uniform. As of 6f9c84198 on Sep 2,
2022, there are a total of 13,066 commits, with 2,334 being merges (17.9%).
We’ll take a deeper look at the merge commits.
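The headline numbers can be reproduced with plain git. The following is a minimal sketch in Python (the post's own analysis uses R), assuming a local clone of the repository; the helper names `count_commits` and `merge_share` are hypothetical and not part of the original analysis.

```python
import subprocess


def count_commits(repo: str, rev: str = "HEAD", merges_only: bool = False) -> int:
    """Count commits reachable from `rev` via `git rev-list --count`."""
    cmd = ["git", "-C", repo, "rev-list", "--count"]
    if merges_only:
        cmd.append("--merges")  # keep only commits with more than one parent
    cmd.append(rev)
    out = subprocess.run(cmd, check=True, capture_output=True, text=True).stdout
    return int(out.strip())


def merge_share(total: int, merges: int) -> float:
    """Return the percentage of merge commits, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * merges / total, 1)
```

Calling `count_commits(".")` and `count_commits(".", merges_only=True)` inside a checkout yields the two counts; `merge_share(13066, 2334)` gives the 17.9% quoted above.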
For the statistics, we’ll switch to R. In all subsequent figures, a single point
corresponds to a merge commit. The reduced opacity alleviates the effects of
overplotting.
Before Tenzir took ownership of the project and its development, VAST was a
dissertation project that evolved during PhD work at the University of
California, Berkeley. We can see that the first pre-submission crunch started a
few months before the NSDI ’16 paper.
Tenzir was born in fall 2017. Real-world contributions arrived as of 2018 when
the small team set sail. Throughput increased as core contributors joined the
team. Fast-forward to 2020 when we started doing public releases. The figure
below shows how this process matured.
As the tag labels show, we used CalVer for a while, but ultimately switched to
SemVer. Because we already had commercial users at the time, this helped us
better communicate breaking vs. non-breaking changes.
Let’s zoom in on all releases since v1.0. At this time, we had a solid
engineering and release process in place.
The v2.0 release was a hard one for us, given the long distance to v1.1. We
merged too much and testing took forever. Burnt by the time sunk into testing
and fixups, we decided to switch to an LPU model (“least publishable unit”) to
shorten the time between releases. We didn’t manage to implement this model
until after v2.1 though, when the gap between releases finally shrinks. A
monthly release feels about right for our team size.
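To put a number on the spacing between releases, one could compute the days between consecutive tags. The sketch below assumes a listing with one `<tag> <YYYY-MM-DD>` pair per line, which `git for-each-ref --format='%(refname:short) %(creatordate:short)' refs/tags` should produce; the helper names `parse_tags` and `release_gaps` are hypothetical.

```python
from datetime import date


def parse_tags(listing: str) -> list[tuple[str, date]]:
    """Parse '<tag> <YYYY-MM-DD>' lines into (tag, date) pairs sorted by date."""
    tags = []
    for line in listing.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, day = line.split()
        tags.append((name, date.fromisoformat(day)))
    return sorted(tags, key=lambda t: t[1])


def release_gaps(tags: list[tuple[str, date]]) -> list[tuple[str, str, int]]:
    """Return (previous_tag, current_tag, days_between) for consecutive releases."""
    return [(a[0], b[0], (b[1] - a[1]).days) for a, b in zip(tags, tags[1:])]
```

A roughly monthly rhythm would show up as gaps of about 30 days.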
The key challenge is minimizing the feature freeze phase. The first release
candidate (RC) kicks this phase off, and the final release lifts the
restriction. In this period, features are not allowed to be merged.[1] This is
a delicate time window: too long and the fixups in the RC phase cause the
postponed pull requests to diverge, too short and we compromise on testing
rigor, causing a release that doesn’t meet our QA requirements.
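The footnote to this passage mentions a blocked label that prevents merging during the freeze. As a rough illustration only, such a gate boils down to a check like the one below; the function name `may_merge` and the case-insensitive matching are assumptions, not a description of the project's actual CI setup.

```python
BLOCKING_LABEL = "blocked"  # label name as mentioned in the post's footnote


def may_merge(labels: list[str], blocking_label: str = BLOCKING_LABEL) -> bool:
    """Return True if none of the pull request's labels is the blocking label."""
    normalized = {label.strip().lower() for label in labels}
    return blocking_label.lower() not in normalized
```

A CI job would fetch the pull request's labels, call `may_merge`, and fail the required status check when it returns `False`.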
This is where we stand as of today. We’re happy with how far we’ve come, but
many challenges still lie ahead of us. Increased automation and deeper testing
are the overarching themes, e.g., code coverage, fuzzing, GitOps. We’re
constantly striving to improve our processes. With a small team of passionate,
senior engineers, this is a lot of fun!
[1] We enforced this with a blocked label. CI doesn’t allow
merging when this label is on a pull request.