Peer review, FOSS, and packaging/containers etc

Lately whenever I give a presentation, I often at least briefly mention one of my primary motivations for doing what I do:  I really like working in global community of people on Free Software.

A concrete artifact of that work is the code landing in git repositories.  But I believe it’s not just about landing code – peer review is a fundamental ingredient.

Many projects of course start out as just one person scratching an itch or having fun.  And it’s completely fine for many to stay that way.  But once a project reaches a certain level of maturity and widespread usage, I think it’s generally best for the original author to “step down” and become a peer.  That’s what I’ve now done for the OSTree project.

In other words, landing code in git master for a mature project should require at least one other person to look at it.  This may sound obvious, but you’d be surprised…there are some very critical projects that don’t have much the way of peer review.

To call out probably the most egregious example, the bash shell.  I’m deliberately linking to their “git log” because it violates all modern standards for git commit messages.  Now,  I don’t want to overly fault Chet for the years and years he’s put into maintaining the Bash project on his own time.  His contribution to Free Software is great and deserves recognition and applause.  But I believe that getting code into bash should involve more than just him replying to a mail message and running git push.  Bash isn’t the only example of this in what I would call the “Linux distribution core”.

Another major area where there are gaps are the “language ecosystems like Node.js, Rust’s cargo, Python’s pip etc.  Many projects on there are “one person scratching an itch” that other people mostly just consume.

There’s no magical solution to this – but in e.g. the language ecosystem case, if you happen to maintain a library which depends on another one, maybe consider spending a bit of your time looking at open pull requests and jumping in with review?

A vast topic related to this is “who is qualified to review” and “how intensively do I review”, but I think some qualified people are too timid about this – basically it’s much better to have a lightweight but shallow process than none at all.

Now finally, I included “packaging” in the title of this blog, so how does that relate?  It’s pretty simple, I also claim that most people doing what is today known as “packaging” should sign up to participate in upstream peer review.  Things like build fixes should go upstream rather than being kept downstream.  And if upstream doesn’t have peer review, reconsider packaging it – or help ensure peer review happens upstream!



Github, accounts, and ease of contribution

At the moment we’re making plans to move OSTree to Github (from GNOME), and while there are a few reasons for this, one thing I want to talk about is the “account problem” and specifically how it relates to free and open source software.

The “account problem” is simply that requiring users to create them is a barrier to contribution.   It’s problematic to require people to have a Sourceforge account, a GNOME account, a Github account, an Apache Bugzilla account, a Fedora/CentOS account, etc.  People who are committed to making a larger contribution can obviously easily overcome this, but for smaller contributions it hurts.

Particularly for projects like GNOME that have distinct accounts for bugzilla and commit.  Having to create an account just to file a bug is bad.  Yes, there’s OpenID, but still.

I’ll note at this point that software freedom is quite important to me, and the fact that Github is proprietary software is a problem.  But – making it easy for people to contribute to Free Software is also a major benefit.

I wonder how things would have turned out if Sourceforge had been…well, let’s say “less crappy”.  Anyways, now we have Github.

And when we move OSTree, I’d like to avoid becoming too dependent on it.  Particularly for things that aren’t actually git, like the issue tracker. Hopefully if GNOME doesn’t disagree, we’ll maintain our mailing list and bugzilla there so that people who prefer that can use it.

But allowing people to create Github PRs easily is really critical in my mind.  (On this topic, we are also planning to use the Homu bot, which I really like)


Thoughts on unikernels/rump kernels

I spend most of my time working on Project Atomic to further Linux containers deriving from a traditional upstream Linux distribution model, but the space of software delivery/runtime mechanisms is vast, and in particular, I have thought Unikernels were an interesting development.   While I do like writing C, the thought of an OS/library in a high level language is an interesting one (particularly interesting to me for a long time is how garbage collection could be better if integrated with the OS).

That was before Docker, Inc. acquired a unikernel company – now, I’m certainly curious where they’re going to go with it.

My thoughts before this were that the Unikernel model might make sense in the scenario where you have a “large” application and your sole deployment target is required to be virtualized (e.g. AWS, GCE, etc.).

In this case, it’s not really possible to share anything between virtual machines directly (modulo KSM and similar ad-hoc techniques which cost CPU and aren’t always predictable) – and so because you can’t share anything between these apps, it could gain you efficiency to dump the parts of the OS and userspace that you aren’t using in that VM, which could be a lot.

But, if you have any smaller microservice applications, it seems to me that having a shared kernel and userspace (as we provide with the Project Atomic and OpenShift 3 models) is going to be a lot more efficient than doing a VM-per-microservice, even if your VMs are unikernels.

And even with the “large app only for virt” scenario, what about debugging?  Ah yes, I just found a blog from Bryan Cantrill on this topic, and I have to say I agree.

Still though, there’s lots of middle ground here.  We can do far better at helping application authors to produce smaller apps (and host images) than we are with Docker normally right now, for example.

New Atomic Host verb: rpm-ostree deploy

TL;DR: We’ve improved the host version management in Fedora Atomic Host, and you can now use atomic host deploy $version to atomically switch to a well-known version.

Longer version:

The awesome Cockpit project has been working on a UI for managing Atomic Host/OSTree updates. See this page for some background on their design.

If you download the most recent Fedora Atomic Host release, then atomic host upgrade, you’ll get a new rpm-ostree release which in turn has a new “deploy” verb. This was created to help implement the above Cockpit design; it’s a command line talking to code equivalent to what the Cockpit UI pull request will use.

This is noteworthy for several reasons. First, it really unlocks the “server side history” aspect of OSTree for the host tree. This is similar to tagged builds in a Docker repository for a container.

In order to explain this, one needs to understand that currently in Fedora, there is at most one content release per day. This is true of the traditional single “big repository of RPMs”, and also the OSTree commits derived from that used for Atomic Host.

OSTree has support for a metadata key per commit called ostree.version which is what you see when you type atomic host status. At present, we’re implementing a model where the version numbers are of the form “$major.$increment”, and at the time of this writing the version is 23.33, or 33 commits from release.

With that background out of the way, the interesting thing about the new rpm-ostree deploy (mapped via atomic host deploy) command is it allows you atomically switch one or more in a cluster of machines to a pre-determined version you have tested and validated.

For example, if you’re trying the current Fedora Atomic Host build, you can invoke:

# atomic host deploy 23.32


Transaction complete; bootconfig swap: no deployment count change: 0
Freed objects: 2.1 kB
krb5-libs 1.14-3.fc23 -> 1.14-2.fc23
lua 5.3.2-2.fc23 -> 5.3.0-4.fc23
Run "systemctl reboot" to start a reboot

If you contrast this with the traditional yum update or atomic host upgrade – these commands will both by default pick the latest versions of the components. If the OS vendor is providing updates while you’re in the middle of an upgrade, you could get hosts with a mix of updated or not, with changes you haven’t validated.

Now of course, there are several projects which help in implementing versioning on top of the OS vendor content. The Pulp project is an example which allows importing upstream RPM (or other) content, and managing well-known snapshots of it. Then you configure your client machines to pull from those immutable snapshots, rather than directly from upstream.

Doing this sort of downstream repository management makes a lot of sense for greater than small scales – among other things one often wants local mirroring as well. But even with a local versioned content mirror, it can be very convenient to have the intelligence to traverse the repository history built into the client. It also helps the repository management case as it can reuse the upstream versions, rather than trying to synthesize them downstream.

There’s a lot more work to do on top of this of course. I just posted a proposal for reworking the commit stream which I think would make this nicer. And the above linked Cockpit pull request will be very cool to see!

The bash vulnerability and Docker containers

In a previous post about Docker, I happened to randomly pick bash as a package shared between the host and containers. I had thought of it as a relatively innocent package, but the choice turned out to be prescient.  The bash vulnerability announced today shows just how important even those apparently innocent packages can be.

The truth is that whenever you run code, you need to have an understanding of who’s responsible for it over time. With the Project Atomic model for software delivery, we are also responsible for providing a base image from the upstream distribution, and that base image includes security updates. Are your application vendors on top of bash security updates? It will be interesting to see how rapidly public application containers are updated.

To me, a key goal of Atomic is making use of the flexibility and power of containers – while retaining the benefits of the maintenance model of trusted distributions, and this bash vulnerability shows why that’s important.

Project Atomic + Docker: A post-package world?

I recently was talking with a friend over lunch about Project Atomic and Docker, and he asked: are we entering a “post-package” world?

My short answer: No. The slightly longer answer is that we’re seeing an evolution of delivery coupled with a lot of innovation in management and orchestration.

Evolution of delivery

As part of Project Atomic, we’re evolving from the context of a “traditional” distribution, where distribution is a set of packages. The Project Atomic pattern is introducing two new higher order delivery vehicles: Docker, and rpm-ostree (also nicknamed via symlink atomic). The theme behind the name Atomic is that both of these technologies group software into indivisible units of management.

Let’s look at two artifacts from Project Atomic we’re working on in the Fedora distribution that are shipped in this way: the Atomic Cloud Image and the Docker Container Image. An essential fact to note is that both artifacts are composed of RPM packages.

For example, both the host system and container share a set of essentials such as the bash package. In fact, the idea is, at release they will have the same binary version. Both the Docker base image and the Atomic tree are reflecting the upstream RPM content. This is quite crucial for a distribution such as Fedora; both from the perspective of the maintainers as well as the downstream consumers. If you want to check whether an Atomic host or a container is affected by a security vulnerability, you can use the regular rpm -q, or any of the many higher order tools and frameworks built upon that core concept of an inventory of versioned component parts.

Runtime management power

So what’s the value, then, in boxing up the same old packages in new ways? For Docker base images (and derived images), there’s a massive increase in flexibility – it makes Linux containers very, very easy to use. A simple example is that the host system can now be decoupled from the applications; when the Fedora 22 release of Atomic comes out, your Fedora 21 base image containers can function effectively unchanged, except they will have a newer kernel. You can take advantage of newer hardware support in the kernel or other host features, and stage a migration of applications to newer base images.

It’s much easier now to take those same RPM packages for services and multi-instantiate. For example, you can have a Docker base image that contains a mariadb-server RPM, and instantiate multiple writable containers from that, each with their own copy of /etc/my.cnf.

Things get even more interesting with projects like geard, which make it easy to spin up and configure many containers across multiple host systems. This sort of orchestration is much more complex and expensive with virtual machines.

The continuing need for packages

Whenever a single organization starts to produce multiple products, there comes a very strong pull to define a common shared base. And the Project Atomic artifacts are not the only product of Fedora! It still needs to deliver traditional products, such as the Server and Workstation.

In the Server case, for example, there will obviously be a strong continuing demand for a virtualization host system, manifesting as projects such as oVirt. There’s also a case for a system capable of both virtualization and Linux containers.

And I think the Workstation case still makes sense. I use Linux and other Free Software on my desktop for real work – with Docker, it’s quite nice to be able to test server containers locally before pushing them. I can have the same Docker version on my workstation and servers, or decouple them. The distribution mechanism should continue to cover this. For that matter, the desktop system I’m writing from of course has virtual machines running Atomic, thus bringing four deliverables together.

One might ask: does it even make sense to do this many products within the context of a single organization? I think it does. There are a lot of powerful benefits to still receiving atomic host system updates and Docker base images from the same organization. A lot of userspace is shared, and it allows crossover for things like management tooling.

Now, one can of course find sub-groups within the (large) Docker community that are farther down the “post-package” spectrum, but I just don’t think it’s a realistic viewpoint. As an example, look at the upstream docker-registry While it tells you how to pull the binary Docker from the upstream registry, it also documents how to acquire the individual pieces and run it directly. And there’s real reasons for that, such as being able to build the registry from source code and improve it. To do that, you need the build dependencies, distinct from the runtime. Yes, the README uses pip instead of dpkg/rpm, but the concept is the same.

To be clear then, it’s not about Docker replacing packages: the realistic endpoint is blending the strengths of the two technologies. One example of that some people have been looking at is using Docker as a buildroot construction system for RPM packages.

Finally, on the OSTree side, things are quite a bit simpler. Conceptually, it’s just a way to compose packages on a server (instead of per client), providing each with atomic upgrades. Then rpm-ostree is a tool bridging the world of RPM and OSTree; it’s very much oriented around being a complement to RPM. The rpm-ostree tool also links to hawkey to allow it to inspect and operate on the RPM database inside the trees. More information about that here.

Getting involved

With Project Atomic, we’re not just introducing new software; we’re attempting to change how we deliver software, something deeply fundamental to a distribution project like Fedora. And furthermore, we’re changing how it’s deployed and managed, which impacts application authors and systems administrators. That said, I believe the benefits of Linux containers and Docker are very real.

Want to get involved? Jump in on the Project Atomic community, or see active SIGs and discussions in targeted distributions such as CentOS and the Fedora Cloud group which is hosting the Changes/Atomic Cloud Image. There’s plenty to do in infrastructure and release engineering. Check out fedora-dockerfiles for lots of example Dockerfiles, and try building your own apps. And don’t hesitate to ask questions!

GNOME West Coast Summit end

The West Coast Summit 2014 is over now, and I’m glad I was able to attend. There’s absolutely no substitute for getting a distributed group of people together for face to face conversations about their common interest in GNOME. Thanks to Endless Mobile for providing their office as a venue and sponsoring the event!

It was really great to see familiar faces like Germán, Giovanni, and Kristian (among many others!). Breakout sessions on topics like GNOME on Wayland and Gjs were very successful. It was cool to see GNOME on Wayland (well, it looked the same actually which was the goal 😉 ). Giovanni did an amazing amount of work on investigating the Spidermonkey GC. Christian wowed people with a demo of Builder. I worked on Continuous and OSTree. In particular, on the OSTree branch for static deltas, which should significantly speed up downloads.

See also posts from Sri and Matthias.

OSTree: rigorous and reliable deployment

I sometimes describe OSTree as being even more rigorous than traditional dpkg/rpm type package systems. Now, there are some of you out there who probably can’t imagine how that’s possible. You found packaging so tedious and painful that you gave up, and you now write Go code (because Google wrote it, it must be good, right?) and you hack on your MacBook from a coffee shop, and when you’re ready scp your statically linked binary to staging and then to production. Maybe you don’t even have staging. It’s so simple! Look how fast it is!

If you are one of those people, just think about what happens when you forgot to “git push” for a while, or you had “origin” be a local mirror or something, and then you lose your MacBook, and now you have a big statically linked blob running in production to which you no longer have the source.

In contrast to this developer, packaging is pretty rigorous. Production build systems ensure that all the source to particular artifacts are tracked, have a distinct, clean, and (mostly) reproducible build environment. For example, that your build system isn’t downloading stuff from the Internet in the middle.

On the deployment side, with packaging you can always log into your server and see what is installed, with version numbers. There’s a lot of advantages to that over a developer deploying binaries with scp.

Knowing what’s running

So how is OSTree more rigorous than traditional packaging? It’s very simple – when you run ostree admin status – you are also getting a description of what is running, not merely installed. At the moment, the simple implementation of that is that you must reboot to have a change take effect. On the plus side, you have fully atomic upgrades. But – we definitely can do partial live upgrades, which is the subject of this post.

With dpkg/rpm and friends, there isn’t a reliable link between the package system and the init system (today, systemd or historically one of the sysvinit implementations). For example, whether or not a service gets restarted on package change is up to the packager of the daemon, and furthermore it’s just a shell script called out from a %post. There’s nothing in the system to audit whether or not the daemon has been successfully restarted, and how that relates to the package change.

Conversely, it’s a pretty sad state of affairs that systemd is totally unaware of packaging. Now most existing administrators understand this, and know the technological/organizational/political[1] reasons this is the case.

Whether or not a daemon got restarted is only one of the obvious ways in which installed and running become distinct. A much more common case is upgrading a shared library such as If we understand that not all daemons or code may be instantly restarted, then we have a situation where the package system is recording merely what’s installed – an administrator later logging in and debugging a failure may have to reconstruct that this system was live upgraded via noting the (deleted) suffix on the shared library in /proc/pid/maps. If they know to look there of course…

So how do I plan to preserve the present property that OSTree has in that it describes what’s running and not merely installed? Let’s be honest, it’s a hard problem. But take a simple case – we are running a tree with checksum ac81df, and we live-apply a subset of the files from the new tree 59da1a as an overlay on top of the running tree. Then ostree admin status might say something like this:

* fedora-atomic ccc6ff1d1d6fdfcb7309700af8fec5de61511767b6ed43f77feb549f7afcaefb.0
    origin refspec: local:fedora-atomic/rawhide/x86_64/buildmaster/base/core
  Dynamic overlays:

Here we’d be seeing the case where our new tree pulled in an updated bash binary, and a new libc. Furthermore, we can backreference from the (device, inode) pair in any running processes /proc/pid/maps to the originating tree – because it won’t be physically deleted as long as it’s still referenced. Also, for any live-upgrade system via OSTree, I plan for it to be fully aware of systemd, and carefully audit the return values from service restarts, correlating it with the state of the filesystem.

This is still a relatively simple case. Think about the situation where you upgrade two or three times, and do partial live updates from each of them. OSTree would carefully maintain the precise manner in which you upgraded – it would be reproducible by others. You’d be able to backreference from any code in memory to the originating tree, which contains the manifest of binary versions, which finally link to source code.

Now with yum history, one could theoretically reconstruct a lot of this, but again yum (really rpm) suffers from being so generic that the core operation of interacting with things like systemd is just a callout to an un-audited shell script. Your current desktop and servers are probably a messy blend what I call “partial live updates”.

[1] And when I say political, let’s imagine what happens when someone posts the first patch to show the package name from systemctl status

Giving a *name* to your root filesystem

First, OSTree v2014.1 is out! Nothing earthshaking, but I’m happy with some of the fixes and features there.

One thing that’s absolutely fundamental about OSTree is that it forces one to name complete filesystem trees. While the system does not mandate any convention (they’re just strings), you have seen some examples in previous posts, like gnome-continuous/buildmaster/x86_64-devel-debug and fedostree/20/x86_64/base/minimal. Here the “OS” name starts first, and after that, you can choose whatever format you want. Now, traditional dpkg/rpm packages are names for partial filesystem trees (plus some metadata and scripts that run as root). When they’re assembled via a package manager onto the root partition of your drive, that collection is not normally named – what you have is an anonymous, and very often unique, custom set of packages.

There are of course efforts in various package systems and GNU/Linux distributions to attempt higher level management of software beyond “set of packages”. In Debian, metapackages are common. In Fedora, there is comps.

I could talk for quite a while about the management differences between the metapackage vs comps approaches, particularly after YumGroupsAsObjects. But suffice to say that I think both suffer badly from being glued on top of the “set of packages” model. In many cases they end up making things more complex, not less. Here is a blog entry that describes how Debian’s metapackages clash badly with another tool which tries to remove “unused” packages. From my observations in the Fedora context, comps groups are mainly used for initial system installation (in Anaconda) and early set up perhaps you do yum install @virtualization after installing a workstation.

How OSTree is less flexible, but more rigorous

With OSTree, you can say something like “I’m running fedostree/20/x86_64/base/minimal”. This is a name for a filesystem that was replicated from the rpm-ostree build server – and it is immutable. OSTree itself comes with no application mechanism, or even the ability to layer trees. So this is a far stronger and more rigorous description of the contents of your (visible) root filesystem.

For example, with the current rpm-ostree, if I remove a package from products.json, then it drops out of the filesystem tree composed on the server side, and thus will also disappear when clients upgrade. It’s really quite simple. The problem of removing old, unused packages is a messy subject in package systems like dpkg/rpm – it’s painful at the distribution level with things like Obsoletes, and if you are a downstream consumer of the distribution, if you installed a package at some point on your servers that you want no longer installed, your best bet is to use something like Puppet to assert that packages are removed.

Now, you still may be thinking “OSTree sounds cool, but I want to be able to install things!”.

Downstream tree construction and naming custom trees

I mentioned in the previous post that I plan to implement a feature like yum-ostree install strace, which would assemble a new filesystem tree from packages (just like rpm-ostree does the server side), and set it up for the next boot. But an interesting question arises – how should I name this filesystem tree? We could represent “install” by appending the string “+strace” to our current tree; so we might end up with a tree named “fedostree/20/x86_64/base/minimal+strace”. Now obviously this doesn’t scale really far – and perhaps leads us back towards wanting e.g. a “tracing-and-debugging-tools” metapackage (or comps group); if you care to install strace, why not also perf? With Fedora’s comps, it’s actually quite nice that we have a reserved symbol “@” and a distinct namespace from the normal package set. So we could synthesize a name like “fedostree/20/x86_64/base/minimal+@tracing”.

What I’m going for here really is that I’d like to cut down on the combinatoric complexity of packages by emphasizing layering over arbitrary additions and removals. This doesn’t mean that we need to completely restrict the system to layering – one could clearly implement yum-ostree remove X (for naming, append e.g. “-X” to the tree name). The lowest OSTree level lets you put whatever filesystem trees you want in it. But for many cases where people want to do this kind of thing, we can turn it into configuration instead of system manipulation. For example, using systemctl mask firewalld.service over yum remove firewalld. If something is supported via system configuration, we should prefer doing that rather than creating new filesystem trees; it’s more efficient and safe to replicate a pre-built tree that’s been tested and known to work, then add configuration.

Why OSTree requires “/usr/etc”

While I often lump “dpkg/rpm” together since they’re very similar architecturally, their implementation of configuration file management is a small but nontrivial difference. This is because dpkg allows interactivity, rpm does not.

With dpkg, in the case where a file is modified locally by the user (say /etc/httpd/conf/httpd.conf), dpkg will prompt the admin interactively. It only has at this time the modified config file, and the new file. Now, personally I think dpkg prompting in the middle of open heart surgery on your root file system is completely insane. You really must use screen or equivalent if doing remote administration. But being able to see what files changed is useful.

RPM is quite opposite. While it also does open heart surgery, you might see one or two messages related to config files go by in the output spam. At the end, it’s up to you to search for .rpmnew and such in /etc. Of course there does exist tooling on top to detect this after – I’m just talking about the out-of-the-box experience. Furthermore, allowing packagers to create a distinction between %config and %config(noreplace) makes the whole thing very inconsistent from a total system perspective. It’s hard to know what will happen on an upgrade unless you know beforehand what the packager chose.

The handling of /etc for OSTree took me a while of thought. The executive summary is that OSTree requires the existence of /usr/etc which is read-only defaults. Whenever you do an upgrade (more generally, switching trees), OSTree does a basic 3-way merge. It doesn’t attempt to understand the contents of files – if you have modified a config file in any way, that wins. Unlike the simplistic “split partitions” approaches out there, this does mean that you get new default config files, and also any config files that were removed in the new tree also vanish.

Unlike the rpm and dpkg model, OSTree is fully atomic. The /etc I’m talking about on upgrades is a separate copy from your running /etc. That allows you to do interactively run whatever scripts or programs you want to ensure the system is correctly set up for the next boot, with total safety.

Another cool feature enabled by the existence of /usr/etc is ostree admin config-diff. This is something I always wanted from Unix – it can give you, at any time, all of the files you modified in /etc. This makes it easier to replicate and reproduce system state after the fact. Your OS can have a “reset all configuration” button, and revert back to a pristine /etc at any time.

There are of course other systems out there you can layer on top of dpkg/rpm to improve the situation. etckeeper is a popular one. See also this proposal for /etc/defaults in Ubuntu. etckeeper suffers from being glued on after the fact. Furthermore, you don’t really need to keep a history of the upstream defaults in git – the main feature is having a 3 way merge. And /usr/etc is a simple way to implement it – and helps OSTree preserve the “feel” of the traditional Unix. You can just vi /etc/some/configfile.ini any time you want. The config state isn’t stored in a separate partition (e.g. CoreOS mounts theirs at /media/state). But OSTree still provides atomic upgrades as the many image-based upgrade systems out there do.