I’ve been working on Free Software for a long time now, on many different projects over the years. In approximate order: Emacs, a variety of things in Debian (PowerPC port, GNOME packaging, GPG verification for apt, CDBS, SELinux porting), then Rhythmbox, then many years of Fedora and Red Hat Enterprise Linux, DBus maintenance, a side stint in web services, and of course my favorite project, GNOME. Which in turn includes working on a lot of infrastructure like systemd, accountsservice, gdm, polkit. A lot of these have been very fun to work on, and mostly useful contributions.
The documentation provides an overview in more depth, but briefly here: it’s a tool for parallel installation and atomic upgrades
of general-purpose Linux-kernel based operating systems, and designed to integrate well with a systemd/GNU userspace. You can use it to safely upgrade client machines over plain HTTP, and longer term, underneath package systems (but above filesystem and block storage layer; use whatever you want there).
I have been obsessed with upgrades for a long time, on and off. (Side note: interesting that KSplice does now exist). How OS upgrades work affects everything in the system architecture, from the development process to end system reliablity. It’s a problem domain that spans client machines, cloud deployments, traditional servers, and embedded devices.
Atomic and safe upgrades
Over the past few years, my interest in the domain rekindled for several reasons. One is that I happened to stumble across NixOS. While obviously there are a lot of software deployment mechanisms and such are out there, if you filter by “has atomic and safe upgrades”, the list becomes quite small and most of what’s there is quite specialized. So I studied Nix carefully; the executive summary is that while they have cool ideas, rebuilding or redownloading the entire system for a glibc security update is deeply impractical. But their approach of having a symlink farm which is the target of a bootloader entry stuck in my mind.
Also in the last few years, Chromium OS appeared with its autoupdater. The Chromium OS updater is an extremely efficient design…for their use case. But it’s hard to generalize the design to the wider world; doubling the disk space usage in every cloud image is a rather large penalty. Furthermore, the Chromium OS model doesn’t have much of a story for locally generated systems. If you want to customize the OS, you are completely unable to reuse their updates server, as it is all about deltas between fixed disk images. Again, this all makes sense if your model is that the only apps are web apps, but that’s a very fixed use case.
In short, OSTree is more efficient than Nix in a number of ways, and most importantly only handles filesystem trees; it’s not a package system. I posted plans for approaching the efficiency of the Chromium OS updater on the wire.
But, if OSTree is so cool, why isn’t it powering your package system? The simple answer is because it’s really quite deeply invasive for existing package systems like dpkg/rpm and all the others that are basically just clones of the same idea with different names. This quote from a LWN commenter sums it up:
What OSTree is proposing is somewhat unclear, but appears to require rebooting on *every single package upgrade* so as to switch into a new chroot containing that package. That means several times a day for me. Not a bloody chance am I letting something like *that* near any of my systems outside a VM: it is transparently ridiculous and optimizing for a very few programs that might need to take extra measures to avoid being broken by updates happening underneath them…
Right. On the plus side, you get atomic upgrades, and this is a tradeoff that a substantial number of people would likely take. Ultimately of course, as I replied to the commenter, it’s certainly possible to imagine carefully engineering the OS so that a certain subset of changes can be “live applied”, while still preserving atomic upgrades. Furthermore, while OSTree does not come with or force any particular independent application installation mechanism, it is designed to provide a fundamental layer for existing and new ones.
Parallel installation, the OS development process, and system quality
The core of OSTree is so simple – it’s just booting into hardlinked chroots – that it was relatively easy to enable something else besides atomic upgrades, which is easy parallel installation of operating systems. Not only does it make it easy to dual boot say a stable OS and the bleeding edge, if you have the disk space, you can thousand-boot, or more.
Why is this so critical? It’s because while package systems have a lot of flexibility, there’s one extremely important gap: The ability to try new code, and go back if it doesn’t work. This was covered in my original GUADEC presentation. Typically package-based distributions manage this by creating several different layers. Debian has stable, testing, unstable, and experimental. But if you upgrade from stable to unstable to see if suspend works for example, the package system will fight you trying to downgrade; the concept of “newer is better” is baked deeply into dpkg/rpm and everything built on top.
Being a software engineer working on an extremely complex general-purpose system like GNOME without massive development resources, let me tell you – it’s easy to break things unintentionally. Having a subset of users (but not everyone) that run the bleeding edge, like Firefox has in Nightly would be a real benefit, while also giving them a mechanism to fall back to the previous working build. And in fact, I have a separate project gnome-ostree that’s intended to be exactly that. Although it has an uninspired name, it’s better than “nightly” – it’s fully continuous, updated easily 70 times a day as git commits are made. But while it serves as an important testing base for validating the core OSTree designs in a relatively constrained scenario, it’s a separate project, and not the topic of this blog post.
OSTree underneath package systems
There are a large number of systems which fit into the model of efficiently replicating pre-constructed OS trees from a build server; many basic “client” workloads as well as cloud deployments are best delivered this way. That said, the “package” model where filesystem trees are computed dynamically on individual machines is very flexible, and some of that flexibility is entirely valid. Particularly for organizations which have invested heavily in it, it doesn’t make sense to toss out that investment; I want to support it.
While I’ve been relatively quiet about OSTree so far, I think it’s finally reached a point in implementation quality and design where I’d like to see more package system maintainers and distributions attempt to experiment with it; that’s the goal of this blog post. A quick weekend hack a while ago resulted in fedora-ostree. Since then, I worked on it a bit more this weekend, and updated it.
This is a long term effort; as the LWN commenter above said, OSTree has wildly different tradeoffs from existing package system semantics. There is a new section of the OSTree manual describing changes that many existing general-purpose distributions will have to make to adapt.
And clearly, hashing out a design where some changes can be applied live (after they are atomically set up for the next boot) would be really nice. If you’re logged into a system and want to zypper/yum/apt-get install strace, there’s no reason since that’s just a new file in /usr/bin that we can’t just make it appear right away. But as you go up from there in complexity, it gets more difficult to do without race conditions. But luckily, we have the complete source code to the operating system; and starting from a fundamental basis of reliability and safety, it is much easier to add features like speed and flexibility.
If you too share my passion for atomic upgrades, operating system upgrade engineering, continuous integration and such, then check out the git repository and join the mailing list; it’s a great time to join the project, as there are several new contributors, and it’s just fun to work on!