Fix your FODs
A supply-chain attack on Nix, and our approach to solving it.
A couple of weeks ago, we wrote about an attack that proceeds via Nix caches. Today we're again going to talk about an attack, which again is somewhat well known already in the corridors of NixCon but not elsewhere (despite already having a CVE record), and which again works because of how Nix caches artifacts. That, however, is where similarities end.
The basic idea of the attack is to use hashes which people rarely pay attention to but which Nix often cares more about than anything else. This divergence adds up to an important vulnerability (I would rank this as one of the most important Nix vulnerabilities I'm aware of), letting things easily slip through code review.
In this blog post we'll talk about the vulnerability; how garnix, with the support of Mercury, implemented a CI check to validate against it; and what you can do to stay safe out there if you don't use garnix.
Fixed-output derivations
Nix is, for the most part, an input-addressed system: the store paths it gives to the things it builds are based on hashes of the build recipe, not of the build output. One of the main exceptions is fixed-output derivations (FODs); these are derivations that get built in a less sandboxed environment (and so have e.g. access to the internet) but which must produce an output whose content matches a declared hash. A popular example is pkgs.fetchurl, which can be used to fetch things from the internet if you already know that thing's hash:
pkgs.fetchurl {
  url = "https://ftp.gnu.org/gnu/gcc/gcc-15.2.0/gcc-15.2.0.tar.gz";
  hash = "sha256-cpTWXMGgVYy4Fa8MqMd2PYb3oxGZeU7eP2MMDRsKVyM=";
}
This is used a lot to fetch the source code of projects.
A problem
A fact about FODs which might be surprising is that if Nix already has an output with hash X in its store, it will reuse that instead of rebuilding for any FOD that declares the same hash (and the same name), even if other things, such as the url, are different.
You can try this yourself; in fact, let's do it with a nixos.org URL. In a nix repl, try the following:
nix-repl> :l <nixpkgs>
Added 24820 variables.
nix-repl> pkgs.fetchurl { url = "https://evil.site/index.html"; hash = "sha256-VA0HcXrXnXqcnmNp4itxkO4FVSJkAmDAzTzfA+j9oso="; }
«derivation /nix/store/qdaql7izydxbhqpq9nql4g1j5n0fq5c6-index.html.drv»
nix-repl> pkgs.fetchurl { url = "https://nixos.org/index.html"; hash = "sha256-VA0HcXrXnXqcnmNp4itxkO4FVSJkAmDAzTzfA+j9oso="; }
«derivation /nix/store/2a6pc1imh1ckmjv2bca4wx0anj2g2ypa-index.html.drv»
We got two different drv files. But let's try building them; in your terminal, try:
% nix build '/nix/store/qdaql7izydxbhqpq9nql4g1j5n0fq5c6-index.html.drv^*' --print-out-paths
/nix/store/ym6icikfzsgpr7n5mfqaaxyj6s9lfi6i-index.html
% nix build '/nix/store/2a6pc1imh1ckmjv2bca4wx0anj2g2ypa-index.html.drv^*' --print-out-paths
/nix/store/ym6icikfzsgpr7n5mfqaaxyj6s9lfi6i-index.html
When built, they're the same! This makes some sense: their recipes are different, but their outputs are, if we believe their hashes, the same. And if the store paths of these packages were different, changing a mirror, for example, could cause a lot of rebuilds.
But to most code reviewers, these look very different. The nixos.org URL is very well known, and trusted. The other, however, isn't.
This, incidentally, is the same issue that in a less malicious form bites a lot of Nix users. If you update a source URL or version in a fetcher, or more generally the contents of an FOD, but you don't update the hash, building the packages will succeed, but you'll be getting the wrong version. This can be very annoying to track down.
Figure: the essence of the attack. To a human, the first box might look innocuous. They may even check out the commit mentioned to be sure.
So now we have the pieces needed for the attack:
- First, we get our malicious code into Nixpkgs. This is supremely easy; many derivations are added to Nixpkgs automatically without review (for example, almost all the versions of all the Haskell packages). This is so far not a problem for Nixpkgs, because no one is going to use that package.
- But now we submit a PR updating a package that people do use; we keep the URL something unsuspicious, but use the hash of our malicious package. Everyone who uses that package will be executing the malicious code.
Whether the attack works, then, depends on whether reviewers actually manually verify the hashes. And we know they don't always, because PRs with the wrong hash often make it in (at the time of writing, I only had to look about 12 hours back in the Nixpkgs commit log for evidence of that).
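For fetchers that hash a single downloaded file (pkgs.fetchurl is one), a reviewer can check a declared hash without involving Nix or any cache. A minimal sketch, assuming GNU coreutils and curl are available; the function names are made up for illustration, and it does not apply to fetchers that hash an unpacked tree, since those use NAR hashes:

```shell
#!/bin/sh
# Convert an SRI hash ("sha256-<base64>") to the hex form printed by sha256sum.
sri_to_hex() {
  printf '%s' "${1#sha256-}" | base64 -d | od -An -v -tx1 | tr -d ' \n'
}

# Succeed only if the sha256 of file $1 matches the declared SRI hash $2.
check_file_hash() {
  cfh_actual=$(sha256sum "$1" | cut -d' ' -f1)
  cfh_declared=$(sri_to_hex "$2")
  if [ "$cfh_actual" = "$cfh_declared" ]; then
    echo "ok: $1"
  else
    echo "MISMATCH: $1 (declared $cfh_declared, actual $cfh_actual)" >&2
    return 1
  fi
}

# Download URL $1 to a temporary file and check it against SRI hash $2.
# Returns 0 on a match, 1 on a mismatch, 2 if the download itself failed.
check_url_hash() {
  cuh_tmp=$(mktemp) || return 2
  if ! curl --fail --silent --show-error --location --output "$cuh_tmp" "$1"; then
    cuh_status=2
  elif check_file_hash "$cuh_tmp" "$2"; then
    cuh_status=0
  else
    cuh_status=1
  fi
  rm -f "$cuh_tmp"
  return "$cuh_status"
}
```

With this, a reviewer could run check_url_hash with the url and hash exactly as they appear in the PR. Because the download never touches the Nix store, an already-cached output with the same hash cannot mask a mismatch.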
A variant of the attack skips step 1 altogether, and instead uses an older hash of the target package itself. The older version is picked to have a known vulnerability that has been fixed in later versions.
It's not crucial that this be (partly or entirely) in Nixpkgs. What matters is only that the cache of the build of the honest-about-its-hash-but-still-malicious package be available when building the dishonest one. In almost all cases an attacker can achieve that; garnix, for instance, has a global cache, but beyond that almost everyone uses the NixOS cache, which as we saw does not vet everything it caches. Then the second step might be a pull request to one of your projects or a dependency that is not in Nixpkgs (but with a hash that is). If your project is open source and you accept external contributions, it's important to be aware of this.
Note that not using cache.nixos.org does not help. If you're using the Nixpkgs code, and running any form of caching (which includes the local /nix/store/!), you are susceptible to this attack.
How to safeguard against the attack
If you use garnix and have a paid plan, keeping yourself safe is relatively easy. Just enable our fodChecks feature by adding this line to your garnix.yaml (creating it if you don't have one yet):
fodChecks: true
garnix stores FOD input-content hash mappings in a global database; this makes the check much faster than running it on your own (see below). As it is run in parallel with your other checks, it likely will not affect overall CI times very often, except when first enabled.
Currently FOD-checking is still resource intensive (where the relevant resources are network traffic, local disk, and rate limit budgets with various upstreams), so we aren't making it available to free plans. As we verify more and more of Nixpkgs though, and cache the results, this will hopefully change.
If you don't use garnix, you can run fod-oracle to check whether any FODs have the wrong hash.
You should probably run this on CI. However, because fod-oracle does not cache which FODs correctly specify their hash, this check can be extremely slow (a run for larger packages can take dozens of minutes). A potential compromise is to only run it (locally) when a) you receive external contributions; or b) you update or add any Nix dependency (e.g., update your version of Nixpkgs).
Another significant problem is that some FODs fail only intermittently — an upstream server is unreachable, for example. Without a database, it's hard to re-run only those failures.
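For individual FODs, Nix itself can repeat a fetch: nix build --rebuild (nix-build --check in the older CLI) builds a derivation again even though its output is already in the store, which for an FOD should mean fetching again and checking the result against the declared hash. It only works for outputs that are already present locally. A rough sketch of a loop around that, which records failures in a file so that only those are retried later; the function name and file layout are hypothetical, and this is not a description of how fod-oracle or garnix work internally:

```shell
#!/bin/sh
# Re-run the fetch for every installable listed in file $1 (one per line, for
# example "nixpkgs#hello.src" or a "/nix/store/...drv^*" path) and write the
# ones that fail to file $2. Succeeds only if nothing failed.
recheck_fods() {
  rf_list=$1
  rf_failed=$2
  : > "$rf_failed.tmp"
  while IFS= read -r rf_installable; do
    [ -n "$rf_installable" ] || continue
    # stdin is redirected so that nix cannot swallow the rest of the list.
    if ! nix build --rebuild --no-link "$rf_installable" </dev/null; then
      printf '%s\n' "$rf_installable" >> "$rf_failed.tmp"
    fi
  done < "$rf_list"
  # Writing to a temporary file first lets $1 and $2 be the same file.
  mv "$rf_failed.tmp" "$rf_failed"
  [ ! -s "$rf_failed" ]
}
```

A first pass might be recheck_fods all-fods.txt failed.txt, followed some time later by recheck_fods failed.txt failed.txt to retry only what failed. Entries that keep failing across retries are the ones worth investigating.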
What to do if an FOD check fails
The first thing to do is update your Nixpkgs (or whatever code it is where the FOD is defined) to fix the failure. Currently most projects will still have several failures, and it's annoying to fix them. But we hope that by having more people be aware of this issue, and equipped with the tools to fix them, this number will quickly go down.
If you want to figure out whether a particular FOD failure was actually malicious, and thus you were potentially exploited, things get a bit hairier.
Some categories of failures can be ruled out as innocuous: for example, a website that is intermittently down, or one whose TLS certificate has expired, when the hash of the response is actually the same.
For most others, including hash mismatches or servers that don't respond at all or respond with errors, it is harder to tell, but we have landed on two different approaches:
- Trying to figure out what the original source (e.g., website, not source code) of the content was, and then deciding whether you trust that. In order to do that, you can look for the same hash (and its alternative formats) in all of the Nixpkgs history to find the first occurrence. That one should be the "real" source. Nixpkgs is quite a big project, though, so be prepared to let your computer work for a while.
- Checking the actual contents of the cached FOD. Since this can be the entire repo of a project (and sometimes even binaries), this is often very hard. If you know the trusted origin, you can potentially find the specific version you're using, and compare it.
For the first approach, we suggest using the git log -G<regex> option.
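A small sketch of that search, assuming a local clone of Nixpkgs; the function name is hypothetical. It lists the matching commits oldest first, so the first line of output is the earliest commit that touched a line containing the hash:

```shell
#!/bin/sh
# List the commits in repository $1 whose diff adds or removes a line
# containing the literal hash $2, oldest first.
find_hash_history() {
  # -G takes an extended regular expression. "+" is the only character of the
  # base64, hex and Nix base32 alphabets that needs escaping.
  fhh_pattern=$(printf '%s' "$2" | sed 's/+/\\+/g')
  git -C "$1" log --reverse --date=short --format='%h %ad %s' -G"$fhh_pattern"
}
```

For example, find_hash_history ~/src/nixpkgs 'VA0HcXrXnXqcnmNp4itxkO4FVSJkAmDAzTzfA+j9oso=' | head -n 1, where the clone path is a placeholder. Run it once per representation of the hash (SRI, hex, Nix base32), since older commits may use a different format. git log -S takes a literal string instead of a regex and avoids the escaping, at the cost of only reporting commits where the number of occurrences changes.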
How Nix can make this better
What we built works well, but what we really want is a more Nix-integrated solution. After all, we still have some problems:
- Nothing will prevent you from running nix build locally (which will not do FOD checks).
- It only runs in garnix.
- There's no integration with other FOD checkers.
Jonas Chevalier (zimbatm) and others at Numtide have a very clear idea of how to "correctly" solve this issue. The essence of the idea is to have Nix store all the input-output hash pairs it has built, or received from a trusted source. Caches would keep that information in the narinfo file, so substitution can easily use it.
Jonas has a much more detailed design, which I think is fantastic. It will likely take a decent amount of full-time work. If you can sponsor his work, do!
Another option is to change FODs to hash to their source rather than their declared output hash. This can be combined with content-addressed derivations so that early-cutoff still happens. (It's easy enough to try this out without any changes to Nix by writing a replacement for pkgs.fetchurl that sets the name to a hash of the entire URL instead of just the last component.) Because CA derivations are largely already a reality, this approach might be easier. But because CA derivations are not fully stable, this approach also has downsides.
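To make the parenthetical experiment concrete, here is one way such a replacement might look. This is only a sketch: the file name and the urlTag binding are made up, it assumes callers pass a single url (no urls list), and it keeps the last URL component as a suffix so that tooling which looks at file extensions keeps working. The shell snippet just writes the Nix file:

```shell
#!/bin/sh
# Write a hypothetical fetchurl wrapper to ./fetchurl-by-source.nix.
cat > fetchurl-by-source.nix <<'EOF'
# Assumption: callers pass a single `url` attribute.
{ pkgs ? import <nixpkgs> { } }:
args@{ url, ... }:
let
  # First 32 hex characters of sha256(url), used as a tag in the store name.
  urlTag = builtins.substring 0 32 (builtins.hashString "sha256" url);
in
pkgs.fetchurl (args // {
  # The name is part of a fixed-output store path, so a different URL now
  # yields a different path even when `hash` is unchanged.
  name = "${urlTag}-${baseNameOf url}";
})
EOF

# Example use (needs Nix and network access):
#   nix-build -E 'import ./fetchurl-by-source.nix { } { url = "https://ftp.gnu.org/gnu/gcc/gcc-15.2.0/gcc-15.2.0.tar.gz"; hash = "sha256-cpTWXMGgVYy4Fa8MqMd2PYb3oxGZeU7eP2MMDRsKVyM="; }'
```

The flip side is the one mentioned earlier: with this wrapper, switching to a different mirror changes the store path and so triggers rebuilds of everything downstream.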
Along a more incremental route, we would be happy to provide API access to the list of our verified FODs; other projects could too, and then CLI tools like fod-oracle could incorporate the sources you pick to avoid re-checking.
How this can make Nix better
Unreproducible FODs are a problem, and not just because of the possible malicious attacks discussed in this post. They also often indicate incorrectly-updated versions, out-of-date mirrors, and missing security fixes. In particular, FOD checking can help point out commits where you intended to update a package, but forgot to update the hash so it effectively didn't happen.
There have been, and there continue to be, efforts to fix these FODs. But by making the fact and significance of FOD failures more obvious to downstream users, and by pinpointing the ones people actually depend on, there is a route towards collectively fixing all our FODs.
Conclusion
Nix, with its unified and comprehensive dependency management, and with its support for more reproducible builds, has the potential to make supply chain security much better than the current status quo. Lying FODs are a major roadblock, but with garnix's fodChecks we already have the tools to protect ourselves.
Thanks to Ólafur Bogason, Jonas Chevalier, Gabriella Gonzalez, Jade Lovelace and Alex Truslow for helpful discussions at various points in this work.