Mar 14, 2024 · Julian K. Arni

Call by hash

What happens if we make URLs immutable? A somewhat unusual idea that can substantially improve and simplify deployments.

It seems then, Hermogenes, that the giving of a name is not, as you imagine, an ordinary matter for any ordinary random persons.

Plato, Cratylus (tr. David Horan)

The best developer experience

For me, the thing that really gives programming its sheen, that makes it sing with appeal, overflowing from the borders of work and work hours into evenings and weekends and hobby projects, is the feeling of omnipotence that it gives. Just write something down to make it true. Fiat sorted. In the small world of our program, programmers are gods.

But that small world comes crashing into the bigger one when system administration, deployments, reliability, distributed systems, real hardware, and state get involved. The words fail: the request you made never returns, the file you opened can't be opened. Or the words never had sufficient reach: the files in the repo describe what should happen when that code is run without a history, but they say nothing about how an upgrade process should happen.

At garnix we're developing a Platform-as-a-Service (PaaS — think Heroku) based on Nix and NixOS; the premise is to see how far we can go in this direction, insofar as infrastructure is concerned. Specifically, we are striving to make truths about the current codebase be the only truths programmers should care about. Getting the current code to production, avoiding downtime with new changes, worrying about interactions between co-existing versions of the same app, or about long-running processes being allowed to complete, or about what happens during a database migration, etc., shouldn't be part of the job. Once code is in main, it becomes truth, and the whole truth.

A tongue-in-cheek way of putting it is that the best developer experience is being a God. And so that's what we should provide.

(Note that what I'll describe isn't just an idea: garnix already implements it, and there are already people trying it out. But it is still in alpha, and we're only onboarding gradually to handle feedback properly. If you want to try it out, sign up for the waitlist.)

A short history of failed apotheoses

In order to get to that developer experience, a number of subgoals have to be met in turn. The ones I'll be talking about today are:

  1. No unhandled requests (zero-downtime 1): there is no moment in time when a request could be dropped because there is no server up to handle it.
  2. Process every request to completion (zero-downtime 2): servers that are handling requests do not get killed before being done. This should be true even if the request needs to be processed by multiple servers in turn. A lot of processing in modern servers happens asynchronously — such as via queues — and requests must be processed to completion throughout all of their lifetimes (which may be potentially hours long!).
  3. Fast deployments: Deploying a set of services shouldn't take substantially longer than the longest time-till-readiness of any service.
  4. Infrastructure-wide atomic upgrades: It should be possible to deploy new versions of two or more services together, such that none of the new versions ever talk to the old ones, and none of the old ones ever talk to the new ones. Otherwise we're stuck maintaining backwards-compatibility of every API till the end of time, or else having downtime.
  5. Simplicity: The infrastructure should not become substantially more complex (requiring extra components such as service meshes and control planes) or costly (requiring a lot of server duplication).

If we can achieve these, we're well on our way. But current practices tend to achieve only some of them, at the cost of others. Let's delve deeper into the specifics of deployment practices, and the trade-offs they make.

Simple (pit-stop) deployment. The rectangles represent services, and arrows request direction; the rightmost arrow represents the external-facing API. In pit-stop deployment, the old service is taken offline, updated, and restarted, causing downtime (red arrow).

The most basic form of deployment is to take down the service we want to upgrade, copy over in some way the new version, then start that. Let's call it pit-stop deployment, since the server is taken out of the race for the upgrade. This is certainly simple and fast. But it causes downtime in both senses, which is a major reason it's often replaced by other techniques. It's also impossible to have infrastructure-wide atomic upgrades.

Next we have rolling deployments. Here, each service is composed of multiple instances behind a load-balancer. When a new version of an application is available, a small number of new instances are started. When they reach a state of readiness, they're added to the pool, and an equal number of old instances removed. This continues until all instances are new.

Rolling deployment. Notice that different versions of the same service coexist, making it very hard to make incompatible changes to an API.

This method is quite popular nowadays, and it's the default in Kubernetes. It always has a server ready to handle requests, and usually handles all of them to completion. However, deployments aren't necessarily fast, and atomic upgrades are not possible. In fact, not only are deployments not atomic at the infrastructure-wide level (i.e., new versions of two different services can't be deployed atomically together), they're not even atomic at the single-service level. It does maintain relative simplicity: load-balancers are necessary everywhere, but they were likely needed anyhow for redundancy. (This is not to say Kubernetes itself is simple!)

Then we come to blue-green deployments. The idea here is to spin up a whole new, shadow infrastructure, with the new versions of services (or old ones where nothing changed). When all of the services are ready, traffic is switched to this new infrastructure.

This achieves infrastructure-wide atomic upgrades, which is very useful if you don't want to be forever stuck with your (internal!) API decisions. It also manages the first type of downtime well. The second, however, is tricky. The problem is that, even if a machine can accurately report whether it has finished its ongoing requests, a new request from a different service in that same version of the infrastructure may still come in later. Even though no more requests are coming in from the outside world to the old version of the infrastructure, we can never be sure that there isn't a request still bouncing around its services (we can't ask all of the services at exactly the same moment whether they're currently processing requests).

Moreover, some system must be put in place to make sure services in the new deployment group are talking to services in their own group (rather than the old one). They presumably address each other with the same name, but resolve to different things! This can be achieved with custom DNS servers, or appending to hosts files, or service meshes, but it adds extra complexity.

Not only that, but the deployment can take longer than necessary, since even servers that haven't changed must be redeployed. These may take longer to start up, and dependencies between services can make things even worse.

Blue-green deployment. An entire new infrastructure has to be deployed.

Knowing, then, the status quo, can we come up with a better deployment process? It turns out we can, by using a very nix-y idea: immutable URLs that contain the hash of their behavior.

In this blog post, we're going to talk about the compute part of that idea. Persistence is hard; we'll touch on it briefly here, but it will likely need another post (or two) to do it justice.

Nix, tendentiously

In order to get a sense for our approach, it will help to have a very basic understanding of Nix.

Consider the following script.

#!/bin/sh

SHOULD_I="Do not"

if [ "Do" == ${SHOULD_I/not/} ]; then
  find . -type f -executable -print | perl -ne 'print if /important/' | rm
else
  echo -n "How many -n do you see?"
fi

If you have written enough shell scripts, you are probably hearing alarm bells. This script is a museum of unportable behavior: its behavior will change depending on which versions of /bin/sh, perl, test/[, and find exist on the system where it is run (if they exist at all). (echo will be a shell built-in.)

When /bin/sh interprets this script, it first looks up the PATH variable for directories to search, then searches those directories for binaries named perl and find, then reads and executes those files. If the PATH variable changes, or if the contents of those files do, the behavior is different. The system is a name resolution mechanism, mapping short names such as perl to contents.

Since lots of things can change how that resolution happens, lots of things can go wrong. You might not have perl installed at all. You could have a Python script named perl in your PATH, for example. More often, though, one gets the “right” program, but with the “wrong” version. Sticking to POSIX might be good advice, but it just isn't enough.

The insight of Nix is that one can move the name resolution step to the build phase of software, rather than runtime. To do that, a build step could replace all occurrences of executable names with their contents. But that's awkward — embedding binaries everywhere, multiple times, isn't a very neat solution. The next best thing is to leave a filename there, but a filename that uniquely refers to that content — by, say, including the hash of that content. (Even better: instead of hashing the contents, we can hash all of the inputs that went into making that content what it is. That way, anyone can know the relevant names just from the source, without building, which has a lot of advantages, and one or two disadvantages.)

In practice, you end up writing something like:

#!${pkgs.bash}/bin/sh

SHOULD_I="Do not"

if [ "Do" == \${SHOULD_I/not/} ]; then
  ${pkgs.find}/bin/find . -type f -executable -print | ${pkgs.perl}/bin/perl -ne 'print if /important/' | rm
else
  ${pkgs.echo}/bin/echo -n "How many -n do you see?"
fi

And the references get resolved to specific versions at build time (the pkgs map or attribute set is a set of package versions), resulting in something like:

#!/nix/store/000ilzxq801jkh3bk9dv17lbhx6bhy7r-bash/bin/sh

SHOULD_I="Do not"

if [ "Do" == ${SHOULD_I/not/} ]; then
  /nix/store/0brmhqqa6hanlkqjwizkx267rbksz04s-find/bin/find . ...
else
  /nix/store/021gp41ajq0x60cg345n2slzg7111za8-echo/bin/echo -n ...
fi

But these names are different from the ones we had before: what they resolve to never changes. /nix/store/000ilzxq801jkh3bk9dv17lbhx6bhy7r-bash/bin/sh will always be exactly the same version of bash, built with the same flags and the same compiler.

This script will always “be” the same. Everything about it that we care about, down to the dependency versions, is embedded in the script (assuming the system takes care of populating /nix/store with the right versions). So we can also hash it, and call it by that name, with the guarantee that the name represents a fixed thing.
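To make this concrete, here is a minimal sketch using nixpkgs' writeShellScript helper; the script body and package choices are illustrative, not taken from the post. The store path it evaluates to already carries a hash computed from every input.

let
  pkgs = import <nixpkgs> { };

  # Every external program is referred to by a /nix/store path,
  # interpolated at build time.
  script = pkgs.writeShellScript "count-executables" ''
    ${pkgs.findutils}/bin/find . -type f -executable -print \
      | ${pkgs.coreutils}/bin/wc -l
  '';
in
# `script` evaluates to a path like /nix/store/<hash>-count-executables,
# where <hash> is derived from all of the inputs above, so the name pins
# down the behavior.
script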

Note that during the replacement step we can record which variables were used. That way, we can keep track of the script's dependencies in a semi-automated way. (And, because it's so unlikely that someone will accidentally hard-code these store paths directly, we know that those are all the dependencies.)
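As a small illustration of that bookkeeping (this snippet is mine, assuming a reasonably recent Nix): Nix attaches a "string context" to any string that a store path has been interpolated into, and you can inspect it at evaluation time.

let
  pkgs = import <nixpkgs> { };
  fragment = "${pkgs.perl}/bin/perl -ne 'print if /important/'";
in
# The context records which derivations were interpolated into the string.
# This evaluates to something like [ "/nix/store/<hash>-perl-5.38.2.drv" ].
builtins.attrNames (builtins.getContext fragment)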

Deterministic name server

To recapitulate, Nix unifies all of the name resolution, so that it happens only once rather than on every execution, and moves it towards the person most qualified to do it: the author. The resolution happens, in other words, at build time rather than at runtime, and gets prebaked into the thing deployed.

This terminology isn't how Nix is usually described. But it serves to bring out an analogy: DNS.

DNS associates a name with an IP address (or with another name). This association is mutable: a name can be updated to refer to a different IP address. But just as importantly, the code deployed at that IP address (or even the server it refers to) can itself change.

But what if we changed that — what if we gave each server a URL that corresponded in a fixed way to the code running in that server? NixOS, which takes the core ideas of Nix-the-package-manager and makes a Linux distro out of it, can already give us a hash to use as the name. (Docker too, in quite a different way.) We allow variable interpolation for more easily referring to these URLs:

{
  nixosConfigurations = rec {
    backend = ...;
    frontend = ... ''
      ...
      runFrontend --backendUrl ${mkHashUrl backend};
      ...
    '';
  };
}

The above code is a Nix map (or “attrset” in Nix terminology) with a backend and a frontend key. This is a lot like the pkgs variable in the script mentioned above: a package set, except that here the packages describe whole server configurations that can be deployed. The details of what is in each don't matter for us; the only point to note is that the frontend uses the hash-based URL of the backend. The hash of both the backend and the frontend is calculated from their respective definitions, exactly like Nix package hashes are calculated. Note that this in turn means that if the hash of the backend changes, the hash of the frontend also changes, but the converse is not true. Note also that a “dependency” in this context means that one service makes a request to another (the arrows in our little animations).
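garnix's actual mkHashUrl isn't shown in the post; the following is only a plausible sketch of such a helper, where the attribute path and the URL scheme are my assumptions.

mkHashUrl = nixosConfiguration:
  let
    # The built NixOS system; its store path already encodes a hash of
    # everything that determines the machine's behavior.
    toplevel = nixosConfiguration.config.system.build.toplevel;
    # Store paths look like /nix/store/<32-char-hash>-<name>; keep the hash.
    hash = builtins.substring 0 32 (baseNameOf toplevel.outPath);
  in
  "https://${hash}.garnix.me";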

When garnix deploys this infrastructure, it creates a DNS record for each server, named with the hash of its definition (this hash-based name is also what mkHashUrl returns). Something like:

backend:  14sk2wdvisrsc[...].garnix.me
frontend: 09lb3m28bd5v4[...].garnix.me

To reiterate, 09lb3m28bd5v4.garnix.me only ever talks to 14sk2wdvisrsc.garnix.me; if there's a new version of the backend, a new version of the frontend that uses it will also have to be deployed (though this happens automatically).

The deployment works as in the video below. When a new version of a service is to be deployed, all of its upstream services also need to be redeployed, but nothing else does. This has the advantages of blue-green deployments, but with a much smaller footprint.

Immutable deployments. A new version of a service (circled) causes service changes upstream only. This is analogous to how persistent data structures work.

In fact, the footprint can quickly become smaller than even rolling or pit-stop deployments, because servers can be shared across environments. For example, just today we added a feature that enables PR deployments: ephemeral instances of the infrastructure that are automatically created when a pull request is opened, and destroyed when it's closed.

The really cool thing is that if two PRs only differ in one service, they'll automatically share the servers that are the same! This can speed up setup, and substantially reduce costs. (If there's a PR-specific persistence service, such as a database, this forces a separation of all services that talk to the database. The details of persistence-related questions will be the topic of future blog posts.)

This setup gives us another big advantage, somewhat implicit in the above: we can statically tell that frontend depends on backend. Even more importantly: we can say that nothing else does (since nothing could “guess” the right URL without using this interpolation mechanism and hence being known to us).

Of course, all of this logic is automated: you don't have to keep tabs on what changed and what therefore needs to be redeployed.

Count our sheep

  1. We know all the consumers of an API, and can therefore make backwards-incompatible changes more confidently; especially together with —

  2. Infrastructure-wide atomic changes.

  3. We can more easily discover the causes of misbehavior. Any change to a service, including an upgrade to its dependencies, results in a name change, and so can be more easily noticed and attributed in the logs.

  4. Every request is processed to completion, even if those requests pass through several different servers. Unlike blue-green deployments, we can establish a shutdown order that guarantees this: starting from the “topmost” services, which aren't depended on (see the sketch after this list).

  5. Smaller footprint. We redeploy only part of our infrastructure on changes, rather than all of it. And we can share services across environments. This also leads to faster deployments.

  6. Simplicity. We don't need a network overlay, or a deployment-attached DNS server, or a mesh, to get servers to communicate to the right versions of applications.
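To illustrate point 4, here is a hypothetical sketch (mine, not garnix's implementation) of deriving a safe shutdown order from the statically known dependency graph: services that nothing else depends on go down first.

let
  # Who makes requests to whom (the arrows in the animations); hypothetical.
  deps = { frontend = [ "backend" ]; backend = [ ]; };

  # Services in `remaining` that no other remaining service depends on.
  roots = remaining:
    builtins.filter
      (s: !(builtins.any (o: builtins.elem s deps.${o}) remaining))
      remaining;

  # Repeatedly peel off the current roots: a reverse topological order.
  shutdownOrder = remaining:
    if remaining == [ ] then [ ]
    else
      let rs = roots remaining;
      in rs ++ shutdownOrder
        (builtins.filter (s: !(builtins.elem s rs)) remaining);
in
# Evaluates to [ "frontend" "backend" ]: the frontend is drained and
# stopped before the backend it depends on goes away.
shutdownOrder (builtins.attrNames deps)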

This list mostly subsumes our original desiderata, but also includes things that weren't originally there. Nix, and nix-y ideas, are often like that: they give you things you didn't even know you wanted.

Conclusion

A possibility we didn't talk about, which is not implemented in garnix yet, but is very exciting, is combining the above system with socket activation and scale-to-zero servers. What that gives us is something where every version of every service exists forever — you can talk to it if you know its name. The implications for asynchronous processing are, we think, significant. It may moreover enable fearless breaking changes in public-facing APIs.

We also didn't talk about persistence, which is evidently a big part of the picture. The next two blog posts in this series will go into more detail there.

We also didn't talk about webhooks and circular dependencies, which complicate the picture.

We've started rolling out our hosting, based on these ideas, to alpha testers. If you want to try it out, sign up for our waitlist. (You can already use the CI and caching components of garnix.)
