The Xz Backdoor Highlights the Vulnerability of Open Source Software—and Its Strengths

@[email protected] · 3 months ago

The Xz Backdoor Highlights the Vulnerability of Open Source Software—and Its Strengths

@[email protected] · 3 months ago

I wonder what sort of mitigations we can take to prevent such kind of attacks, wherein someone contributes to an open-source project to gain trust and to ultimately work towards making users of that software vulnerable. Besides analyzing with bigger scrutiny other people’s contributions (as the article mentioned), I don’t see what else one could do. There are many ways vulnerabilities can be introduced and a lot of them are hard to spot (especially in C with stuff like undefined behavior and lack of modern safety features) , so I don’t think “being more careful” is going to be enough.

I imagine such attacks will become more common now, and that these kind of attacks could become very appealing for governments.

@[email protected] · 3 months ago

As far as I understand, in this case opaque binary test data was gradually added to the repository. Also the built binaries did not correspond 1:1 with the code in the repo due to some buildchain reasons. Stuff like this makes it difficult to spot deliberately placed bugs or backdors.

I think some measures can be:

establish reproducible builds in CI/CD pipelines
ban opaque data from the repository. I read some people expressing justification for this test-data being opaque, but that is nonsense. There’s no reason why you couldn’t compress+decompress a lengthy creative commons text, or for binary data encrypt that text with a public password, or use a sequence from a pseudo random number generator with a known seed, or a past compiled binary of this very software, or … or … or …
establish technologies that make it hard to place integer overflows or deliberately miss array ends. That would make it a lot harder to plant a misbehavement in the code without it being so obvious that others note easily. Rust, Linters, Valgrind etc. would be useful things for that.

So I think from a technical perspective there are ways to at least give attackers a hard time when trying to place covert backdoors. The larger problem is likely who does the work, because scalability is just such a hard problem with open source. Ultimately I think we need to come together globally and bear this work with many shoulders. For example the “prossimo” project by the Internet Security Research Group (the organisation behind Let’s Encrypt) is working on bringing memory safety to critical projects: https://www.memorysafety.org/ I also sincerely hope the german Sovereign Tech Fund ( https://www.sovereigntechfund.de/ ) takes this incident as a new angle to the outstanding work they’re doing. And ultimately, we need many more such organisations and initiatives from both private companies as well as the public sector to protect the technology that runs our societies together.

@[email protected] · 3 months ago

The test case purported to be bad data, which you presumably want to test the correct behaviour of your dearchiver against.

Nothing this did looks to involve memory safety. It uses features like ifunc to hook behaviour.

The notion of reproducible CI is interesting, but there’s nothing preventing this setup from repeatedly producing the same output in (say) a debian package build environment.

There are many signatures here that look “obvious” with hindsight, but ultimately this comes down to establishing trust. Technical sophistication aside, this was a very successful attack against that teust foundation.

It’s definitely the case that the stack of C tooling for builds (CMakeLists.txt, autotools) makes obfuscating content easier. You might point at modern build tooling like cargo as an alternative - however, build.rs and proc macros are not typically sandboxed at present. I think it’d be possible to replicate the effects of this attack using that tooling.

@[email protected] · 3 months ago

I’m assuming with as small as a tin foil hat as possible that the CIA and NSA purposefully do this in major closed and open source software

@[email protected] · edit-2 3 months ago

In the sysadmin world, the current approach is to follow a zero-trust and defense-in-depth model. Basically you do not trust anything. You assume that there’s already a bad actor/backdoor/vulnerability in your network, and so you work around mitigating that risk - using measures such as compartmentalisation and sandboxing (of data/users/servers/processes etc), role based access controls (RBAC), just-enough-access (JEA), just-in-time access (JIT), attack surface reduction etc.

Then there’s network level measures such as conditional access, and of course all the usual firewall and reverse-proxy tricks so you’re never exposing a critical service such as ssh directly to the web. And to top it all off, auditing and monitoring - lots of it, combined with ML-backed EDR/XDR solutions that can automatically baseline what’s “normal” across your network, and alert you of any abnormality. The move towards microservices and infrastructure-as-code is also interesting, because instead of running full-fledged VMs you’re just running minimal, ephemeral containers that are constantly destroyed and rebuilt - so any possible malware wouldn’t live very long and would have to work hard at persistence. Of course, it’s still possible for malware to persist in a containerised environment, but again that’s where the defense-in-depth and monitoring comes into play.

So in the case of xz, say your hacker has access to ssh - so what? The box they got access to was just a jumphost, they can’t get to anywhere else important without knowing what the right boxes and credentials are. And even if those credentials are compromised, with JEA/JIT/MFA, they’re useless. And even if they’re compromised, they’d only get access into a very specific box/area. And the more they traverse across the network, the greater the risk of leaving an audit trail or being spotted by the XDR.

Naturally none of this is 100% bullet-proof, but then again, nothing is. But that’s exactly what the zero-trust model aims to combat. This is the world we live in, where we can no longer assume something is 100% safe. Proprietary software users have been playing this game for a long time, it’s about time we OSS users also employ the same threat model.

chebra · 3 months ago

@Faresh 1.) Making it easier to analyze. There are multiple steps in the whole process which may be hiding an exploit. The “tarball-not-same-as-git” is a clear example. Sure, reviewing will still be necessary and it will still be difficult, but it doesn’t have to be as difficult as today. 2.) stop giving maintainer rights, fork instead. That’s what pull requests are for. 3.) we should be careful if our critical infrastructure depends on a hobby project - either pay, or don’t depend.