Some thoughts on the xz backdoor

Gobbel2000 · 1 year ago


kbal@fedia.io · 1 year ago

Too many people are acting as if this type of attack is somehow peculiar to open source projects. Very similar methods can be, and by now probably have been, used against closed-source non-free software. The level of defence against it probably varies even more in the rest of the software world, outside free software. How many of the vulnerabilities that have been discovered in Cisco routers were deliberately planted by outside adversaries? I don’t suppose there’s any way to know for sure. The main thing the open source nature of the project did here was make the attack easier to detect and impossible to cover up afterwards.

chameleon@kbin.social · 1 year ago

These things are always easy to say in hindsight, but I do believe that a closer review of the build system shenanigans used to install the backdoor would have at least raised some questions.

Nobody noticed it because nobody is reviewing autotools spaghetti, and especially not autotools spaghetti that only exists as shipped in a tarball. Minor differences in those files are perfectly normal as the contents of them are copied in from the shared autoconf-archive project, but every distro ships a different version of that, so what any given thing looks like will depend on the maintainer’s computer. And almost nobody has a good understanding of what any given line in a .m4 file will ultimately cause to be executed anyway, so why bother investigating any differences? The maintainer of Meson has a good take on this.

Shipping tarballs without any form of generated files and having a process to validate release tarballs against the repo would be a good step, but is much easier said than done for a variety of reasons. The same can be said for shipping without any binary files in the repo: integration tests are highly valuable, and xz’s README for the test blobs has correctly included this paragraph for 16 years:

Many of the files have been created by hand with a hex editor, thus there is no better “source code” than the files themselves.
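To make the "validate release tarballs against the repo" idea concrete, here is a minimal sketch in Python. It is not how xz or any distro actually does it. The function names are made up for illustration, and it assumes the tarball has a single top-level directory (like `pkg-1.0/`) and that you have a checkout of the matching tag on disk.

```python
import hashlib
import tarfile
from pathlib import Path, PurePosixPath


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def tarball_digests(tarball: Path) -> dict[str, str]:
    """Digest every regular file in the tarball, dropping the top-level directory."""
    digests = {}
    with tarfile.open(tarball, "r:*") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            # Assumption: names look like "pkg-1.0/path/to/file".
            parts = PurePosixPath(member.name).parts[1:]
            if not parts:
                continue
            digests["/".join(parts)] = sha256_hex(tar.extractfile(member).read())
    return digests


def checkout_digests(checkout: Path) -> dict[str, str]:
    """Digest every file in the checkout, skipping the .git directory."""
    digests = {}
    for path in sorted(checkout.rglob("*")):
        relative = path.relative_to(checkout)
        if path.is_file() and ".git" not in relative.parts:
            digests[relative.as_posix()] = sha256_hex(path.read_bytes())
    return digests


def compare_release(tarball: Path, checkout: Path) -> tuple[list[str], list[str]]:
    """Return (files only in the tarball, files whose contents differ)."""
    released = tarball_digests(tarball)
    tracked = checkout_digests(checkout)
    only_in_tarball = sorted(set(released) - set(tracked))
    modified = sorted(
        name for name in released.keys() & tracked.keys()
        if released[name] != tracked[name]
    )
    return only_in_tarball, modified
```

The "easier said than done" part shows up immediately: with autotools, the first list will be full of legitimately generated files (`configure` and friends), so someone still has to decide which of those are expected.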

Corngood@lemmy.ml · 1 year ago

Many of the files have been created by hand with a hex editor, thus there is no better “source code” than the files themselves.

I don’t buy that. There would have been some rationale behind the contents that could be automated, like “compressed file with bytes 3-7 in the header zeroed”.
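As a rough sketch of what I mean, in Python using the standard `lzma` module: the helper names are made up, and I'm reading "bytes 3-7" as 0-based offsets 3 through 7 inclusive, which is an assumption. It doesn't correspond to any actual file in xz's test suite.

```python
import lzma


def make_corrupt_header_sample(payload: bytes, first: int = 3, last: int = 7) -> bytes:
    """Compress payload as .xz, then zero header bytes first..last (inclusive)."""
    blob = bytearray(lzma.compress(payload, format=lzma.FORMAT_XZ))
    blob[first:last + 1] = bytes(last - first + 1)
    return bytes(blob)


def is_rejected(blob: bytes) -> bool:
    """True if the decoder refuses the input."""
    try:
        lzma.decompress(blob, format=lzma.FORMAT_XZ)
    except lzma.LZMAError:
        return True
    return False


sample = make_corrupt_header_sample(b"hello")
print(is_rejected(sample))
```

The recipe is the "source code" here: a short script plus a description of the damage, instead of an opaque blob. It does lean on an existing compressor to produce the valid starting point, though.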

You also probably don’t need these test files to be available in the environment where the library itself is built. There are various ways you could avoid that.

I do agree about the autotools stuff though.

Minor differences in those files are perfectly normal as the contents of them are copied in from the shared autoconf-archive project, but every distro ships a different version of that, so what any given thing looks like will depend on the maintainer’s computer.

This seems avoidable. We shouldn’t be copying code around like that.

chameleon@kbin.social · 1 year ago

Test files often represent states that can’t be represented in the library proper. Things like “a tree where node A is a child of B and node B is a child of A”, “the previous instruction repeated x times” where x was never set or there was no previous instruction, or weird combinations of mutually exclusive effects. More often than not, you can’t really generate those using the library itself, as libraries tend to be written to reject those kinds of invalid states (there’s only so much you can do in C but in functional programming land, “make invalid states unrepresentable” is a straight up mantra).
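A toy illustration of the tree case, in Python (everything here is hypothetical and has nothing to do with xz's actual code): the public API refuses to build the cycle, so the only way to test how the loader copes with one is to write the bad input by hand.

```python
class Node:
    def __init__(self, name: str):
        self.name = name
        self.parent = None

    def set_parent(self, parent: "Node") -> None:
        # Walk up from the proposed parent; reaching ourselves means a cycle.
        ancestor = parent
        while ancestor is not None:
            if ancestor is self:
                raise ValueError(f"{self.name} would become its own ancestor")
            ancestor = ancestor.parent
        self.parent = parent


def load_parents(raw: dict[str, str]) -> dict[str, Node]:
    """Loader under test: builds nodes from untrusted child -> parent data."""
    nodes = {name: Node(name) for name in set(raw) | set(raw.values())}
    for child, parent in raw.items():
        nodes[child].set_parent(nodes[parent])
    return nodes


# No sequence of valid API calls produces this; it has to be typed in by hand,
# which is the in-memory equivalent of a hex-edited test blob.
HAND_WRITTEN_CYCLE = {"A": "B", "B": "A"}
```

Feeding `HAND_WRITTEN_CYCLE` to `load_parents` is the interesting test, and there is no "generator" for it other than the literal itself.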

Even if you did manage to do that, using the system under test to generate test data for the system under test is generally not very useful by itself; you’d need some kind of extra protections on top to make sure the actual test files continue to be identical between revisions (like hashing them). Otherwise, a major incompatibility could be easily overlooked. But that also makes it hard to make any kind of valid changes to the library at all. Worse yet, some libraries don’t implement everything needed to generate the test files. Even xz is missing pieces; for example, there’s an lzip decompressor but no compressor.
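The hashing part could look something like the following Python sketch. The function names and the idea of keeping the recorded digests in a dict (which you would commit alongside the tests, e.g. as JSON) are assumptions for illustration, not an existing tool.

```python
import hashlib
from pathlib import Path


def build_manifest(test_dir: Path) -> dict[str, str]:
    """Map each file under test_dir to its SHA-256 hex digest."""
    return {
        path.relative_to(test_dir).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(test_dir.rglob("*"))
        if path.is_file()
    }


def changed_files(test_dir: Path, recorded: dict[str, str]) -> list[str]:
    """List files that were added, removed, or altered since the manifest was recorded."""
    current = build_manifest(test_dir)
    return sorted(
        name for name in current.keys() | recorded.keys()
        if current.get(name) != recorded.get(name)
    )
```

This only tells you that regenerated files drifted, not whether the drift is a bug or an intended format change, which is exactly why it gets in the way of legitimate changes.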

There are some arguments to be made for separating the test system from the main distribution, but the end result will likely be that nobody runs the testsuite at all. It’s difficult enough to get distros to do it in the first place.

Corngood@lemmy.ml · 1 year ago

Yeah, that’s fair. If you want to test that you can still decompress something compressed with some random old version, you either need to keep the old algorithm around, or the data.

Gobbel2000 · 1 year ago

I (luckily) haven’t had much experience using autotools, but I do suppose it was no coincidence that the injection was initiated there. I really like the comparison that was made in the post of the Meson maintainer you linked:

Several “undefeatable” fortresses have been taken over by attackers entering via sewage pipes.

Jumuta@sh.itjust.works · 1 year ago

Companies that profit off of foss works should def contribute and monitor the projects, but I’m not sure how you could force them to.

lurch (he/him)@sh.itjust.works · 1 year ago

you force them by sneaking in a funny exploit twice per year 🤣

no seriously, don’t do that. it’s unethical.

unalivejoy@lemm.ee · 1 year ago

“My funds are getting low. I guess it’s time to hold the open source world for ransom.”

some_guy@lemmy.sdf.org · 1 year ago

Best breakdown that I’ve seen. I believe I saw it here yesterday.

https://research.swtch.com/xz-script