OSTree: rigorous and reliable deployment

February 26, 2014

I sometimes describe OSTree as being even more rigorous than traditional dpkg/rpm type package systems. Now, there are some of you out there who probably can’t imagine how that’s possible. You found packaging so tedious and painful that you gave up, and you now write Go code (because Google wrote it, it must be good, right?) and you hack on your MacBook from a coffee shop, and when you’re ready scp your statically linked binary to staging and then to production. Maybe you don’t even have staging. It’s so simple! Look how fast it is!

If you are one of those people, just think about what happens when you forgot to “git push” for a while, or you had “origin” be a local mirror or something, and then you lose your MacBook, and now you have a big statically linked blob running in production to which you no longer have the source.

In contrast to this developer, packaging is pretty rigorous. Production build systems ensure that all the source to particular artifacts are tracked, have a distinct, clean, and (mostly) reproducible build environment. For example, that your build system isn’t downloading stuff from the Internet in the middle.

On the deployment side, with packaging you can always log into your server and see what is installed, with version numbers. There’s a lot of advantages to that over a developer deploying binaries with scp.

Knowing what’s running

So how is OSTree more rigorous than traditional packaging? It’s very simple – when you run ostree admin status – you are also getting a description of what is running, not merely installed. At the moment, the simple implementation of that is that you must reboot to have a change take effect. On the plus side, you have fully atomic upgrades. But – we definitely can do partial live upgrades, which is the subject of this post.

With dpkg/rpm and friends, there isn’t a reliable link between the package system and the init system (today, systemd or historically one of the sysvinit implementations). For example, whether or not a service gets restarted on package change is up to the packager of the daemon, and furthermore it’s just a shell script called out from a %post. There’s nothing in the system to audit whether or not the daemon has been successfully restarted, and how that relates to the package change.

Conversely, it’s a pretty sad state of affairs that systemd is totally unaware of packaging. Now most existing administrators understand this, and know the technological/organizational/political[1] reasons this is the case.

Whether or not a daemon got restarted is only one of the obvious ways in which installed and running become distinct. A much more common case is upgrading a shared library such as libc.so.6. If we understand that not all daemons or code may be instantly restarted, then we have a situation where the package system is recording merely what’s installed – an administrator later logging in and debugging a failure may have to reconstruct that this system was live upgraded via noting the (deleted) suffix on the shared library in /proc/pid/maps. If they know to look there of course…

So how do I plan to preserve the present property that OSTree has in that it describes what’s running and not merely installed? Let’s be honest, it’s a hard problem. But take a simple case – we are running a tree with checksum ac81df, and we live-apply a subset of the files from the new tree 59da1a as an overlay on top of the running tree. Then ostree admin status might say something like this:

* fedora-atomic ccc6ff1d1d6fdfcb7309700af8fec5de61511767b6ed43f77feb549f7afcaefb.0
    origin refspec: local:fedora-atomic/rawhide/x86_64/buildmaster/base/core
  Dynamic overlays:

Here we’d be seeing the case where our new tree pulled in an updated bash binary, and a new libc. Furthermore, we can backreference from the (device, inode) pair in any running processes /proc/pid/maps to the originating tree – because it won’t be physically deleted as long as it’s still referenced. Also, for any live-upgrade system via OSTree, I plan for it to be fully aware of systemd, and carefully audit the return values from service restarts, correlating it with the state of the filesystem.

This is still a relatively simple case. Think about the situation where you upgrade two or three times, and do partial live updates from each of them. OSTree would carefully maintain the precise manner in which you upgraded – it would be reproducible by others. You’d be able to backreference from any code in memory to the originating tree, which contains the manifest of binary versions, which finally link to source code.

Now with yum history, one could theoretically reconstruct a lot of this, but again yum (really rpm) suffers from being so generic that the core operation of interacting with things like systemd is just a callout to an un-audited shell script. Your current desktop and servers are probably a messy blend what I call “partial live updates”.

[1] And when I say political, let’s imagine what happens when someone posts the first patch to show the package name from systemctl status

One Response to “OSTree: rigorous and reliable deployment”

  1. “for any live-upgrade system via OSTree, I plan for it to be fully aware of systemd”

    Is this somewhat related to https://www.youtube.com/watch?v=8M4FYC2hOp4 ?

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: