Stop trusting Nix caches
With most caches, you are giving a lot of people a lot of access
If you've been using Nix for any length of time, you've probably used projects that recommend adding their cache, and their signing key, to your Nix configuration. Or they do that sort-of-automatically, by setting the substituter in the flake.nix, so you are prompted to accept their cache when building anything.
We strongly recommend against that. External caches give people with access to the cache an easy path to replacing most of your executables with malicious ones, allowing potential remote code execution and privilege escalation. And because of how most of them are set up, there are a lot of people with that access!
(Adding more caches also happens to slow down most operations in Nix, since all the caches are queried for every path you want to build. But that's a small issue relative to security.)
How the attack works
In Nix, it's common to set up a cache (either via S3, or Cachix, or a self-hosted solution such as attic or nix-serve), and to populate that cache with the result of builds from CI. This in turn gets used by the next runs of CI itself to avoid duplicating work.
People who use or contribute to the project are then often encouraged to set up their dev machine configuration to use that cache too. That way, you don't have to build locally since CI does it for you (which, if done correctly, is wonderful). This can either be configured globally, or, with flakes, on a per-flake basis.
When configured globally, any time you build any package, those caches you configured will be queried for that package. Anyone with access to the cache can push malicious versions of that software, so that you end up installing (and using) it. A malicious actor just has to figure out a bunch of plausible candidates for packages (for example, upcoming versions of popular packages on nixpkgs) and upload those.
These candidate packages and executables can be ones that are frequently invoked with sudo (such as nix itself, due to nix-daemon and nixos-rebuild), in which case the attacker can gain root access.
Reading the source code of the project, or sandboxing the application you intend to run, is no protection at all, since the attack doesn't use the source, happens before you run the application, and affects completely unrelated executables.
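The substitution step described above can be sketched as a toy model in Python. This is not how Nix is implemented: real Nix signs cache entries with Ed25519 key pairs, while this sketch uses HMAC from the standard library as a stand-in for "signed by a key I trust" (so here the verifier holds the same secret as the signer, which is not true of real public-key signatures). The `Cache` class, the `substitute` function, and the store path are all hypothetical names made up for illustration.

```python
import hashlib
import hmac


def content_hash(contents: bytes) -> str:
    """Hash of the artifact's contents (stand-in for a NAR hash)."""
    return hashlib.sha256(contents).hexdigest()


def sign(secret: bytes, store_path: str, contents: bytes) -> str:
    """Toy signature over (store path, content hash). HMAC is an assumption."""
    message = f"{store_path};{content_hash(contents)}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


class Cache:
    """A toy binary cache mapping a store path to (contents, signature)."""

    def __init__(self, name: str, secret: bytes):
        self.name = name
        self.secret = secret
        self.entries = {}

    def push(self, store_path: str, contents: bytes) -> None:
        # Whoever holds the secret can publish ANY contents under ANY path.
        self.entries[store_path] = (contents, sign(self.secret, store_path, contents))


def substitute(store_path, caches, trusted_secrets):
    """Return (cache name, contents) for the first trusted entry, else None."""
    for cache in caches:
        entry = cache.entries.get(store_path)
        if entry is None:
            continue  # this cache does not have the path; ask the next one
        contents, signature = entry
        for secret in trusted_secrets:
            if hmac.compare_digest(signature, sign(secret, store_path, contents)):
                return cache.name, contents
    return None  # nothing trusted was found, so the path is built locally


# Usage: the path is for an "upcoming" package the official cache lacks.
official = Cache("official", b"official-secret")
project = Cache("project", b"project-ci-secret")
upcoming = "/nix/store/hypothetical-hash-nix-9.9.9"  # made-up store path
project.push(upcoming, b"malicious build")
result = substitute(upcoming, [official, project], [b"official-secret", b"project-ci-secret"])
print(result)
```

In this model, nothing about the requested path ties it to the project whose cache answered. The only check is that some trusted key signed the entry, which is why the number of people holding a trusted key matters so much.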
The problem becomes worse given how the more popular CIs work and are used. In particular, in CIs like GitHub Actions, everyone with write access to the repo has read access to the secrets. For many projects that's a lot of people (the nix-community organization, for example, has over a hundred members). In order to automatically push artifacts to the cache, the secrets for uploading (e.g. signing key) must be uploaded to the CI, so all of those people have push access to the cache. Thus, if one of those people with write access is malicious or compromised, they can poison the cache.
For most CIs, secrets can only be extracted by pushing a malicious commit that the CI then runs. That leaves a paper trail in the form of the build logs, allowing attribution. However, CIs also often have a retention period (e.g., 90 days for GitHub Actions); if the attack is only noticed after this period, or if the attacker waits this length of time after getting the cache secrets before mounting the attack, it's not clear that there is a way to figure out which contributor was malicious or compromised.
If the cache is set up per-flake, the situation is a little better. That cache only gets used for builds triggered in the project, rather than for any build at all. However, it still allows any of the project's developers to install malicious software, and potentially to escalate privileges (by e.g. having a dependency on an upcoming version of nix that they poisoned, which you then end up installing and, when you eventually choose to upgrade your running nix version, using). Not even your coworkers should have that kind of access to your computer, much less the dozen maintainers of some CLI tool you like.
How it should work
Obviously, the number of entities you trust to certify a cache artifact should be small. Every project or organization having their own cache (which every member also has access to, and can do anything with) that you trust is the opposite of that.
Instead, the entities doing the builds (e.g. GitHub Actions, nixbuild.net, garnix) should be doing the signing. Those are much fewer, and to various degrees those already have to be trusted anyhow.
That's how garnix works. We have a single cache that needs to be added, and a single key you trust, that only we have access to. Even though we build artifacts for thousands of customers, and cache all of those artifacts so anyone (with the right permissions) can download them, only we sign artifacts, and we only sign things we built ourselves.
Hydra also goes in the right direction, in separating the people who can build from those who can upload directly to the cache. For the former group, the only way to end up with artifacts on the cache is to actually build them, which means they can't be malicious substitutions. Privilege creep is a danger even then, and it can often be the case that all contributors to a repo have access to the Hydra instance.
GitHub, GitLab, Azure, and similar CIs, however, are unlikely to ever provide this type of Nix-specific provenance information. For now, you should be very careful about trusting the cache of a project that uses these CIs.
There are efforts like Trustix to help decentralize trust, which allow for requiring multiple caches (and build pipelines) to agree on what an artifact should be before being trusted. This is a good effort, but in the presence of CI pipelines like GitHub Actions, it only slightly ameliorates the problem. If, as is the case in these pipelines, many people have access to the cache, it's still easy to have access to two or more of them (and for that slight protection a lot of complexity, and more cache misses, are introduced). But Trustix opens the door for different pipelines, such as having dedicated verifiers, and that's very exciting.
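The idea of requiring several builders to agree can be illustrated with a small quorum check. This is only a sketch of the general principle, not Trustix's actual protocol or API; the function name `agreed_hash`, the builder names, and the hash strings are all hypothetical.

```python
from collections import Counter
from typing import Dict, Optional


def agreed_hash(reports: Dict[str, str], quorum: int) -> Optional[str]:
    """Pick the output hash that enough independent builders agree on.

    `reports` maps a builder name to the output hash it claims for one
    store path. The hash is accepted only if exactly one value reaches
    the quorum; otherwise nothing is trusted.
    """
    counts = Counter(reports.values())
    winners = [h for h, n in counts.items() if n >= quorum]
    return winners[0] if len(winners) == 1 else None


# One compromised builder out of three cannot win a quorum of two.
one_bad = {"builder-a": "good", "builder-b": "good", "builder-c": "evil"}
print(agreed_hash(one_bad, quorum=2))

# But someone with push access to two of the three caches can.
two_bad = {"builder-a": "evil", "builder-b": "evil", "builder-c": "good"}
print(agreed_hash(two_bad, quorum=2))
```

The second case is the weakness described above: when many people hold the secrets for several caches, reaching the quorum with a poisoned artifact is not much harder than poisoning one cache.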
What you should do now
Review all the substituters, extra-substituters, trusted-substituters, and trusted-public-keys in:
- /etc/nix/nix.conf
- ~/.config/nix/nix.conf
- Your configuration.nix
Also review ~/.local/share/nix/trusted-settings.json, which contains a record of which flake caches and keys you marked as permanently trusted.
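As a starting point for that review, here is a minimal Python sketch that lists the cache-related settings found in the two nix.conf files. It makes some assumptions: it treats nix.conf as simple `name = value` lines with `#` comments, it does not follow `include` directives, it also looks for `extra-trusted-public-keys` (not mentioned above), and it does not look at configuration.nix or trusted-settings.json, which you would still need to check by hand. The function name `cache_settings` is made up for this example.

```python
from pathlib import Path

# Settings that decide which caches and signing keys Nix will accept.
CACHE_SETTINGS = {
    "substituters",
    "extra-substituters",
    "trusted-substituters",
    "trusted-public-keys",
    "extra-trusted-public-keys",
}


def cache_settings(conf_text: str) -> dict:
    """Collect cache-related settings from nix.conf-style text.

    Every occurrence is collected, even if a later line would override
    an earlier one, so that nothing is hidden from the review.
    """
    found = {}
    for raw_line in conf_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if "=" not in line:
            continue  # blank lines and include directives
        # Split on the first "=" only: public keys end in base64 "=" padding.
        name, _, value = line.partition("=")
        name = name.strip()
        if name in CACHE_SETTINGS:
            found.setdefault(name, []).extend(value.split())
    return found


if __name__ == "__main__":
    candidates = [
        Path("/etc/nix/nix.conf"),
        Path.home() / ".config" / "nix" / "nix.conf",
    ]
    for path in candidates:
        if not path.is_file():
            continue
        print(path)
        for name, values in cache_settings(path.read_text()).items():
            for value in values:
                print(f"  {name}: {value}")
```

Anything printed other than caches and keys you deliberately chose to trust deserves a closer look.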
We recommend removing caches that don't have a good story about limiting people with push access. In some limited cases, such as your work cache, you might decide after a cost-benefit analysis that removing the cache isn't worth it, and that's fair. Still, we recommend considering migrating to a system without as much risk.
If you are a maintainer of an open source project that has a cache, we suggest either moving to a system with more limited access, such as Hydra or garnix, or removing (or adding warnings to) mentions of how to use the cache.
garnix gives no maintainers unrestricted push access to its cache, and many Hydra installations give it only to some. Even so, these systems too can be compromised, and the consequences are just as serious. Ultimately, using even these caches involves a risk you should be aware of.
Conclusion
This blog post started because someone in the garnix Discord asked about adding another, external cache to speed up builds. Indeed, using these caches is a pretty widespread practice. Hopefully this blog post makes the associated risks clearer.
Of course, having caches is a wonderful feature of Nix, and I wouldn't want to lose it altogether. In time I hope we migrate to safer ways of setting up these caches.
(Thanks to Arian van Putten for some pointers, and Alex David and Sönke Hahn for comments.)