The Semantic Versioning Anti-Pattern

The classic X.Y.Z style of software numbering, sometimes known as Semantic Versioning, may be hurting more than it helps. Semver.org has codified how these numbers should be used for some cases, describing what is meant by the major and minor version numbers, when to change them, what represents maturity of your software, what denotes breaking backwards compatibility, and generally the “right” way to signal all sorts of things with your version... but it doesn't work for all cases, and sometimes people take it too far.

While this codification of versioning is nice on the surface, at the end of the day it feels overly pedantic, and it reminds me of the labyrinthine process used in Renaissance Venice to elect a Doge. If you are unfamiliar with it, this was a multi-layered election in which bodies of people were voted in so they could elect other people, who in turn elected other people. The process even included going to St Mark's Square and choosing the first boy found to draw ballots from an urn. Suffice it to say, it was a highly involved and complicated system.

The modern world of software services has an expectation of “always on,” and with Continuous Delivery (CD) we have a new paradigm in software development, where we leverage powerful testing and automation that allow us to make small, incremental changes on a regular, continuous basis. These are good things! I expound more upon the values of CD in another post on Sisyphus and Software Delivery.

If you have ever tried implementing CD and an always-on service, you have probably wrestled with the challenge of build numbers and what to use with your software, which might have left you feeling like semantic versioning stands in your way, like a stubborn child insisting they get attention.

I believe the rigamarole of semantic versioning overloads the CD process and undermines its very intent. It may have a place in some areas of software releasing, such as libraries, but I contend it is an anti-pattern in the CD and always-on world.

With modern CD systems, we want to simplify. Changes should be small and frequent, which leads to stable and gradual improvement without much pain. But this frequency itself becomes a counter pattern to what semantic versioning is all about, which is the plodding software development process of old.

I think that as developers it is easy for us to put too much weight on a version. We are lulled by the pleasant pattern of encoding a message into a small set of characters. Perhaps the beauty of this secret code harkens back to a boyhood fascination with spy serials.

Whatever the reason, we should remember the primitive needs we are trying to communicate:

  • Something changed.
  • This is stable software… or, conversely, this is not stable!
  • When something breaks, what build was it from?
  • Compatibility: the APIs may or may not work.

These are very different messages, and perhaps this is why it often becomes confusing and problematic.

Remember, with always-on systems you do not have unstable public releases. What you release publicly, continuously, is always production worthy. Anything that may be unstable should be built and hidden behind a feature flag, and delivered in parallel. This does mean more work, because you have to think about, for example, how to support two versions of a data structure in parallel. But that extra thought itself helps to create more stable migrations. You no longer have the luxury of taking a production system down for a weekend to make a migration. That production system must stay live, 24x7x365.
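To make the "two versions of a data structure in parallel" idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical and chosen only for illustration: the `FLAGS` dictionary stands in for whatever flag service you actually use, and the record shapes (`name` versus `first_name`/`last_name`) are an invented example of a migration in progress.

```python
# Hypothetical in-memory flag store; a real system would read this from
# configuration or a feature-flag service.
FLAGS = {"split_name_fields": False}


def read_name(record: dict) -> str:
    """Read a name from either record shape, so old and new data coexist."""
    if "first_name" in record:
        # New shape: separate first and last name fields.
        return f"{record['first_name']} {record['last_name']}"
    # Old shape: a single combined field.
    return record["name"]


def write_name(full_name: str) -> dict:
    """Write the new shape only while the flag is switched on."""
    if FLAGS["split_name_fields"]:
        first, _, last = full_name.partition(" ")
        return {"first_name": first, "last_name": last}
    return {"name": full_name}
```

The reader accepts both shapes from the first deploy, while the writer only produces the new shape once the flag is flipped. That ordering is what lets the production system stay up through the migration, and it also lets you turn the flag back off without breaking reads.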

It really feels like semantic versioning is about the software development world before always-on continuous software delivery systems. If you are working in this earlier software development world, then perhaps semantic versioning is okay for you.

If instead you are using CD and building always-on systems, then I suggest simplifying and splitting these primitive needs apart. When you release, just update a number, and stop worrying about encoding multiple messages into your version. More and more projects are doing this (Google Chrome, Firefox, any number of always-on websites, Ubuntu, Microsoft). Some even use the date of release instead of incrementing a number.

My preference is just to use something that intentionally does not look like a semantic version (this is the important step), as it helps to signal that you are simplifying the version message. A simple way I have found to do this with CD looks like:

YYMM.xxxx

Here you have a two-digit year, a two-digit month, and then a zero-padded, incrementing build number that resets whenever the date stamp changes. The four-digit counter assumes you will not do more than 10k builds in a given month. With this you can know what build you are on, and you can also easily support continuous delivery systems. A few examples of what this might look like:

1612.0009
1612.0013
1701.0253

And so forth.
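As a rough sketch of how a build system might hand out these labels, the Python function below computes the next one from the previous label and today's date. The function name `next_build` is hypothetical, and starting each month at 0001 (rather than 0000) is my assumption; the scheme itself does not depend on that choice.

```python
from datetime import date
from typing import Optional


def next_build(previous: Optional[str], today: date) -> str:
    """Return the next YYMM.xxxx label.

    `previous` is the last label issued, or None for the very first build.
    The counter restarts whenever the YYMM stamp changes.
    """
    stamp = today.strftime("%y%m")  # two-digit year, two-digit month
    counter = 1  # assumption: each month starts at 0001
    if previous is not None:
        prev_stamp, prev_counter = previous.split(".")
        if prev_stamp == stamp:
            counter = int(prev_counter) + 1
    if counter > 9999:
        # The four-digit field is full; the scheme assumes this never happens.
        raise ValueError("more than 9999 builds this month")
    return f"{stamp}.{counter:04d}"
```

Because both fields are fixed-width and zero-padded, the labels also sort correctly as plain strings within a century, which is handy when listing artifacts.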

But wait, you say, what about APIs and compatibility? These should not be based on the release number! APIs should do their own version management, independent of the build number. A full conversation on this is available in Ideal API Versioning.

That leaves releases that are not ready for general availability. For these you can use feature flags, and the delivery process itself should include internal stages where unstable code can be tested and vetted before being delivered. If you have a longer-term change being worked on in a branch, you can prefix your build numbers with a branch label:

dev-YYMM.xxxx  - a development release
pr-YYMM.xxxx   - a Pull Request
YYMM.xxxx      - the public release
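If tooling needs to tell these labels apart, a small parser is enough. The sketch below is one possible approach in Python; the names `Build` and `parse_build` are hypothetical, and the rule that a branch label is lowercase letters and digits followed by a hyphen is my assumption based on the `dev-` and `pr-` examples.

```python
import re
from typing import NamedTuple, Optional

# Optional lowercase branch label and hyphen, then YYMM, a dot, four digits.
LABEL = re.compile(
    r"^(?:(?P<branch>[a-z][a-z0-9]*)-)?(?P<stamp>\d{4})\.(?P<counter>\d{4})$"
)


class Build(NamedTuple):
    branch: Optional[str]  # None means a public release
    stamp: str             # YYMM
    counter: int

    @property
    def is_public(self) -> bool:
        return self.branch is None


def parse_build(label: str) -> Build:
    """Split a build label into its parts, rejecting anything else."""
    match = LABEL.match(label)
    if match is None:
        raise ValueError(f"not a YYMM.xxxx build label: {label!r}")
    return Build(match.group("branch"), match.group("stamp"), int(match.group("counter")))
```

A deploy step could then refuse to push anything to the public stage unless `parse_build(label).is_public` is true. A semantic-style string such as `1.2.3` is rejected outright, which fits the goal of not looking like a semantic version.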

Whatever you do, I hope this has prompted some thought and consideration. I may be wrong, but at least for the use cases I dig into, I have found that semantic versioning gets in the way more than it helps.