Secrets of the Filecode ransomware revealed

Earlier this week we wrote about OSX/Filecode, a Mac ransomware family that follows a path of digital extortion that is well-known to Windows users.

Generally speaking, ransomware hits when you download a file, or are tricked into running an attachment, that claims to be one thing, anywhere from a fake invoice to a software crack…

…but turns out to be quite another.

Filecode pretends to be a cracking tool.

The malware pops up a dialog to say that it “may take up to 10 minutes” to do its job, which is supposedly to hack Adobe Premiere or Office 2016 so you can keep on using it without paying the licensing fees.

Software cracks of this sort do exist, and some of them may even work, but figuring out when you can trust shady software from shady corners of the internet that you downloaded for your own shady purpose…

…is always going to be a hard task.

The good news is that Filecode doesn’t contain any software cracking code that might get you in trouble with the piracy police.

The bad news is that it is entirely focused on scrambling your data, using a randomly-generated 25-character encryption key that exists only in memory while the malware runs.

After that, the malware author gives you instructions on how to pay him a ransom, in return for which he’ll unscramble the data for you.

This raises two questions.

If the decryption key is lost after the malware finishes running, how can the crook possibly recover your files?

And if he can do it, what’s to stop you following the same process and recovering them for yourself for free?

Most ransomware creators deal with these decryption issues in one of these ways:

  1. Call home first to a web server that generates a random encryption key on demand. This way, the crooks have a guaranteed record of the decryption key that matches your computer, but you don’t. The generated key is what they sell back to you later. If your network security software blocks the call-home request, the ransomware is stalled and can’t run at all.
  2. Generate a random key locally and use it in memory only, but call home during or immediately after the encryption process to back up the key remotely. If you block the call-home request in cases like this, the ransomware runs but the key is lost.
  3. Generate a random key locally, use it, and save it locally after encrypting it again with a public key. As long as the crooks keep the corresponding private key to themselves, they alone can decrypt the decryption key that you need. (It’s OK if you need to read that twice.)

Approach (3) works because of the difference between symmetric and asymmetric encryption. Symmetric, or secret-key, cryptography uses the same key to lock and unlock data, and it is usually used for protecting large quantities of data such as disks and files. But if you want to share symmetrically-encrypted data across the internet, you need a secure way to share the secret key first. Asymmetric, or public-key, cryptography uses a different key to lock and unlock, making it an excellent tool for sharing data securely over a network. You share your public key with everyone so they can lock data to send it to you; you keep your private key private so you’re the only one who can unlock the data at the other end. That way, it can’t be snooped or modified in transit. But public-key cryptography is far too slow for encrypting files and disks, so a hybrid approach is used: encrypt the data with a random symmetric key, and encrypt the symmetric key with the public key. Now, only the holder of the private key can unlock the symmetric key, so only the holder of the private key can unlock the symmetrically-encrypted data.
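To make the hybrid idea concrete, here is a minimal sketch in Python. It is a toy, not real cryptography: the RSA numbers are textbook-sized, the symmetric cipher is a simple hash-based XOR stream we made up for the illustration, and all the function names are our own. What it does show is the shape of approach (3): the data is locked with a random symmetric key, and that key is locked with a public key that only the private-key holder can undo.

```python
import hashlib
import secrets

# Toy RSA key pair (textbook-sized and deliberately insecure).
P, Q = 61, 53
N = P * Q                           # public modulus
E = 17                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent, kept by the key owner


def keystream(key, length):
    """Stretch the symmetric key into `length` bytes (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_cipher(key, data):
    """Symmetric step: the same key both locks and unlocks the data."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def wrap_key(key):
    """Asymmetric step: lock each key byte using the PUBLIC values (E, N)."""
    return [pow(b, E, N) for b in key]


def unwrap_key(wrapped):
    """Only someone who knows the PRIVATE exponent D can reverse wrap_key()."""
    return bytes(pow(c, D, N) for c in wrapped)


def hybrid_encrypt(data):
    session_key = secrets.token_bytes(16)   # random symmetric key
    return wrap_key(session_key), xor_cipher(session_key, data)


def hybrid_decrypt(wrapped, ciphertext):
    return xor_cipher(unwrap_key(wrapped), ciphertext)


if __name__ == "__main__":
    wrapped, scrambled = hybrid_encrypt(b"quarterly sales figures")
    print(hybrid_decrypt(wrapped, scrambled))
```

In ransomware that works this way, the wrapped key is what gets left behind on your disk: it is useless to you without the private exponent, which never leaves the crooks.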

Filecode, however, doesn’t use any of these methods.

The malware uses the zip program to encrypt all your files with a randomly-chosen 25-character password, and throws the key away.

The crook then tells you to pay him 0.25 bitcoins (BTC 0.25 was about $300 at the end of February 2017), hook your computer up to the internet so he can access it remotely, and contact him by email to let him know you’re ready.

He says that, within 24 hours (or within 10 minutes if you pay the premium fee of BTC 0.45 instead), he’ll connect into your Mac and recover your files.

So, if the crook can recover your data from first principles by cracking the now-forgotten password, why can’t you?

The answer is, “You can,” and here’s how we did it on our test Mac.

Secrets revealed

The encryption used by default in zip files comes from the original PKZIP software, named after the late Phil Katz, who wrote it and started PKWARE, the company that sold it.

But the PKZIP cipher (officially denoted “traditional PKWARE encryption”, but we’ll call it just zipcrypt from now on) was designed by a programmer, not by a cryptographer.

Zipcrypt simply doesn’t mix things up well enough to be secure.

Internally, the algorithm uses three 32-bit key values, key0, key1 and key2, that are repeatedly mixed together so that each key changes after every byte is encrypted; the byte you’ve just encrypted is added into the mix as well:

function update_keys(newbyte)
   key0 = crc32(key0,newbyte)          --mix the new byte into a running CRC-32
   key1 = key1 + low8bitsof(key0)      --add the bottom byte of key0 into key1
   key1 = key1 * 134775813 + 1         --apply linear congruential PRNG to key1
   key2 = crc32(key2,top8bitsof(key1)) --mix the top byte of key1 into a running CRC-32

Don’t worry if you aren’t a programmer or don’t understand the notation above.

What’s important is that at each byte-sized encryption step:

  • key0 is updated by mixing up its current value with the plaintext character you just encrypted. The mixing algorithm is the CRC-32 checksum algorithm.
  • key1 is updated by adding in one byte from the updated key0 and then mixing up the value using a simple pseudo-random number generator (PRNG). This is the same pseudo-random code used by the old Turbo Pascal and early versions of Borland Delphi.
  • key2 is updated by mixing up its current value with one byte of the updated key1, using CRC-32.
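If you prefer real code to pseudocode, the three steps above can be sketched in Python as follows. The function names are our own. The three starting constants are the ones given in PKWARE's published ZIP specification for traditional encryption, and the password is fed through the same update function one byte at a time to produce the initial key values. The CRC-32 step is written out as the usual table lookup, and the last few lines check it against Python's built-in zlib.crc32.

```python
import zlib


def make_crc_table():
    """Build the standard 256-entry CRC-32 table (reflected polynomial 0xEDB88320)."""
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
        table.append(c)
    return table


CRC_TABLE = make_crc_table()


def crc32_step(crc, byte):
    """Mix one byte into a running CRC-32 value (no initial or final inversion)."""
    return CRC_TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)


def update_keys(keys, byte):
    """One zipcrypt key update, given the plaintext byte just processed."""
    key0, key1, key2 = keys
    key0 = crc32_step(key0, byte)                # mix the new byte into key0
    key1 = (key1 + (key0 & 0xFF)) & 0xFFFFFFFF   # add the bottom byte of key0
    key1 = (key1 * 134775813 + 1) & 0xFFFFFFFF   # linear congruential step
    key2 = crc32_step(key2, key1 >> 24)          # mix the top byte of key1 into key2
    return key0, key1, key2


# Starting constants from the ZIP specification's traditional encryption.
INITIAL_KEYS = (0x12345678, 0x23456789, 0x34567890)


def keys_from_password(password):
    """Churn the password bytes through update_keys to get the starting keys."""
    keys = INITIAL_KEYS
    for b in password:
        keys = update_keys(keys, b)
    return keys


if __name__ == "__main__":
    # Sanity check: our byte-at-a-time CRC agrees with zlib's CRC-32.
    crc = 0xFFFFFFFF
    for b in b"hello zip":
        crc = crc32_step(crc, b)
    print((crc ^ 0xFFFFFFFF) == zlib.crc32(b"hello zip"))
    print(["%08x" % k for k in keys_from_password(b"example")])
```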

To encrypt a byte, you XOR it with a cipher-stream byte that is computed from the bottom 16 bits of key2, like this:

function encrypt_byte(newbyte)
   local cipherbyte
   cipherbyte = bottom16bitsof(key2)             -- take 16 bits from key2
   cipherbyte = cipherbyte OR 2                  -- set the second-bottom bit to 1
                                                 -- (this ensures cipherbyte is never zero)
   cipherbyte = cipherbyte * (cipherbyte XOR 1)  -- mix together the 16 bits of cipherbyte
   cipherbyte = secondtobottombyteof(cipherbyte) -- take the top 8 of the bottom 16 bits
   return newbyte XOR cipherbyte                 -- XOR this cipher byte into the stream
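
The same step can be sketched in Python, with function names of our own choosing. The sketch also checks one observation that follows directly from the arithmetic: because the second-bottom bit is forced to 1 and the value is then multiplied by itself with the bottom bit flipped, the bottom two bits of key2 have no effect on the result. In other words, each cipher-stream byte depends on only 14 of the 96 key bits.

```python
def stream_byte(key2):
    """Compute the cipher-stream byte from the bottom 16 bits of key2."""
    temp = (key2 & 0xFFFF) | 2            # take 16 bits, force bit 1 on
    return ((temp * (temp ^ 1)) >> 8) & 0xFF


def encrypt_byte(key2, newbyte):
    """XOR the stream byte into the data; the same call also decrypts."""
    return newbyte ^ stream_byte(key2)


def depends_only_on_bits_2_to_15():
    """True if clearing the bottom two bits of key2 never changes the output."""
    return all(stream_byte(k) == stream_byte(k & 0xFFFC) for k in range(1 << 16))


if __name__ == "__main__":
    print(depends_only_on_bits_2_to_15())
```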

With three 32-bit internal keys that produce one XOR key-byte at a time, zipcrypt is technically a stream cipher with a 96-bit key (3×32 = 96), which means there are 2^96 keys if you want to try them all in a brute-force attack.

At first glance

How secure is zipcrypt?

At first glance, you might think that this algorithm doesn’t do anywhere near as much mixing-mincing-shredding-and-liquidising as you’d expect, at least if you compare it with cryptographic algorithms that work on bigger blocks of data, such as AES or SHA-3.

There also doesn’t seem to be much of what is often referred to as avalanche or diffusion between the three internal key values, with just 8 bits of new data percolating into each key value at every step.

For example, key0 never incorporates bits from key1 or key2, nor key1 from key2, and so on, and the new bits that come into key0 from the current plaintext byte take two more encryption cycles to have any effect on key2.

You might also be concerned that the core “mixing” algorithms that were chosen as randomisers, CRC-32 and the Turbo Pascal PRNG, were not designed for cryptographic use, but rather for speed and simplicity.

Indeed, the problem with both CRC-32 and the PRNG used by zipcrypt is that if at any point you can figure out where you are in the cycle of random numbers they’ve just produced, you can figure out what comes next and thus reconstruct the random sequence from then on, no matter how secretly the algorithm was initialised, or seeded, in the first place.
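
Here is a small Python sketch of that predictability for the key1 step, using helper names of our own. Once you know key1 and the byte that was added in, the next value follows mechanically. And because the multiplier is an odd number, it has an inverse modulo 2^32, so the step can be run backwards just as easily as forwards.

```python
MASK = 0xFFFFFFFF
MULT = 134775813
MULT_INV = pow(MULT, -1, 1 << 32)   # modular inverse; needs Python 3.8 or later


def lcg_forward(key1, low_byte_of_key0):
    """The key1 update: add in one byte, then apply the PRNG step."""
    key1 = (key1 + low_byte_of_key0) & MASK
    return (key1 * MULT + 1) & MASK


def lcg_backward(key1_next, low_byte_of_key0):
    """Undo lcg_forward: subtract 1, multiply by the inverse, remove the byte."""
    key1 = ((key1_next - 1) * MULT_INV) & MASK
    return (key1 - low_byte_of_key0) & MASK


if __name__ == "__main__":
    start = 0x23456789
    nxt = lcg_forward(start, 0x5A)
    print(hex(nxt), lcg_backward(nxt, 0x5A) == start)
```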

And if you think that there might therefore be cryptographic chinks in the internals of zipcrypt, you’d be right, as two well-known names in the cryptographic community, Eli Biham and Paul Kocher, found in the mid-1990s.

Their paper, A known plaintext attack on the PKZIP cipher, documents how they uncovered a way to compare a known plaintext file with its zipcrypt ciphertext equivalent, and from there to work backwards to figure out the starting values of key0, key1 and key2 that were used to encrypt it.

Any other encrypted files in the same ZIP archive, or any other ZIP files using the same password, can then be decrypted without any further effort.

Note that this means you don’t need to figure out the actual password used in the first place, which is itself churned through the zipcrypt algorithm as many times as there are characters in the password to produce the three starting values for key0, key1 and key2.
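
To illustrate why the three key values are as good as the password, here is a compact Python sketch of the whole cipher, built from the pseudocode earlier in the article. The class and method names are our own, the starting constants come from PKWARE's published ZIP specification, and the sketch leaves out the 12-byte random header that real ZIP archives put in front of each encrypted file. It encrypts with a password, then decrypts using nothing but key0, key1 and key2.

```python
import zlib

MASK = 0xFFFFFFFF


def crc32_step(crc, byte):
    """One raw CRC-32 table step, borrowed from zlib by undoing its inversions."""
    return (zlib.crc32(bytes([byte]), crc ^ MASK) ^ MASK) & MASK


class ZipCrypt:
    def __init__(self, keys):
        self.key0, self.key1, self.key2 = keys

    @classmethod
    def from_password(cls, password):
        z = cls((0x12345678, 0x23456789, 0x34567890))
        for b in password:
            z._update(b)
        return z

    def keys(self):
        return (self.key0, self.key1, self.key2)

    def _update(self, plain_byte):
        self.key0 = crc32_step(self.key0, plain_byte)
        self.key1 = (self.key1 + (self.key0 & 0xFF)) & MASK
        self.key1 = (self.key1 * 134775813 + 1) & MASK
        self.key2 = crc32_step(self.key2, self.key1 >> 24)

    def _stream_byte(self):
        temp = (self.key2 & 0xFFFF) | 2
        return ((temp * (temp ^ 1)) >> 8) & 0xFF

    def encrypt(self, data):
        out = bytearray()
        for p in data:
            out.append(p ^ self._stream_byte())
            self._update(p)          # keys are always updated with plaintext
        return bytes(out)

    def decrypt(self, data):
        out = bytearray()
        for c in data:
            p = c ^ self._stream_byte()
            self._update(p)
            out.append(p)
        return bytes(out)


if __name__ == "__main__":
    password = b"0123456789abcdefghijABCDE"          # a made-up 25-character password
    scrambled = ZipCrypt.from_password(password).encrypt(b"Large Spreadsheet Sales")
    stolen_keys = ZipCrypt.from_password(password).keys()
    print(ZipCrypt(stolen_keys).decrypt(scrambled))   # no password needed here
```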

That’s just as well, because the Filecode malware chooses a 25-character password from the characters A-Za-z0-9, for a total of 62 characters and thus 62^25 possible passwords, or close to one billion billion billion billion billion.

You don’t even need to try out all 2^96 possible combinations of key0, key1 and key2, which is roughly 79 billion billion billion.
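
If you want to check the sizes of those two search spaces for yourself, Python's arbitrary-precision integers make it a one-liner each. The variable names below are ours.

```python
import math

passwords = 62 ** 25    # 25 characters, each chosen from A-Za-z0-9
key_states = 2 ** 96    # three 32-bit internal key values

print(f"possible passwords:  {passwords:.3e}")
print(f"possible key states: {key_states:.3e}")
print(f"password space is about {math.log2(passwords):.1f} bits; key space is 96 bits")
```

Note that there are far more possible passwords than internal key states, so brute-forcing the password itself would be the harder of the two jobs.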

In fact, with a handy utility that also dates back to the 1990s, PKCRACK, we did the job in 42 seconds.

Cracking the keys

After the malware had finished running on our test Mac, all our files had been ZIPped with a password and renamed to have the extension .crypt:

/Users/duck/Documents/Large Spreadsheet Sales (Excel).xls.crypt

We needed just one file for which we had a backup copy of the original, so we chose a logo file we could find again online: corplogo.png.

By ZIPping up the plaintext version of the file using the same compression options as the malware, we then had a corresponding set of plaintext and ciphertext ZIP files, both containing a file called corplogo.png:

duck@testmac:~/Temp$ zip -0 corplogo.png
  adding: corplogo.png (stored 0%)
duck@testmac:~/Temp$ ls -l corplogo*
-rw-r--r--  1 duck  staff  7189 28 Feb 10:00 corplogo.png        --plaintext original file 
-rw-r--r--  1 duck  staff  7435 13 Feb  2010 corplogo.png.crypt  --encrypted ransomware ZIP 
-rw-r--r--  1 duck  staff  7365 28 Feb 11:47  --plain ZIP created above

The malware uses the zip -0 option for “no compression”, presumably to make the scrambling process faster, because compressing files can be very slow. That’s why we did a matching zip -0 above. The unusual timestamp on the ZIP file corplogo.png.crypt (2010-02-13) is deliberately set by the malware. We don’t know what significance this date has, except perhaps to be unusual.
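
If you don't have the zip command handy, a stored (uncompressed) archive can also be made with Python's standard zipfile module. This is a sketch of an alternative, not what we ran above: the function name is our own, and we haven't tested its output with PKCRACK.

```python
import os
import zipfile


def store_without_compression(src_path, zip_path, arcname=None):
    """Create a ZIP holding one file with no compression, like `zip -0`."""
    name = arcname or os.path.basename(src_path)
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_STORED) as zf:
        zf.write(src_path, arcname=name)
    # Re-open the archive and return the entry's metadata for checking.
    with zipfile.ZipFile(zip_path) as zf:
        return zf.infolist()[0]
```

For a stored entry, the returned record's compress_size equals its file_size, which is a quick way to confirm that nothing was compressed.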

Then, we set PKCRACK to work.

The options it needs are: the name of the encrypted ZIP; the unencrypted ZIP to compare it to; the name of the corresponding file in each archive; and a new name for its own decrypted version of the ZIP so we could check that the process succeeded:

duck@testmac:~/Temp$ pkcrack -C corplogo.png.crypt -P \
                                  -c Users//duck/Pictures/corplogo.png  -p corplogo.png \
                                  -d corplogo.png.decrpyt
Files read. Starting stage 1 on Tue Feb 28 11:50:05 2017
Generating 1st generation of possible key2_7200 values...done.
Found 4128994 possible key2-values.
Now we're trying to reduce these...
Lowest number: 969 values at offset 4095
Lowest number: 952 values at offset 4076
[. . .]
Lowest number: 487 values at offset 399
Done. Left with 487 possible Values. bestOffset is 399.
Stage 1 completed. Starting stage 2 on Tue Feb 28 11:50:22 2017
Ta-daaaaa! key0=ac5d37ee, key1=b7c718ab, key2=3bc7973b
[. . .]
Stage 2 completed. Starting zipdecrypt on Tue Feb 28 11:50:47 2017
Decrypting Users//duck/Pictures/corplogo.png (c407e43e418b6e49cbc43e75)... OK!
Finished on Tue Feb 28 11:50:47 2017

42 seconds later, and PKCRACK was done, giving us the 96 bits of key material that we needed to unlock all our other files.

The double-slash in the filename Users//duck/Pictures/corplogo.png in the PKCRACK command line above is needed to match the way the malware creates its encrypted archives. Unix filenames can have one or more slashes to separate each directory name in a path, so that path/file.txt and path//file.txt refer to the same filesystem object. You can check the names of files inside an encrypted ZIP file using unzip -l, because only the file contents are encrypted – that’s how we spotted the extra slash after the first part of the directory path.
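
As an alternative to unzip -l, Python's standard zipfile module can read the same directory listing without a password. The sketch below uses a function name of our own. It relies on the fact that bit 0 of each entry's general-purpose flags marks the entry as encrypted.

```python
import zipfile


def list_entries(zip_path):
    """Return (name, is_encrypted) for every entry; no password is needed."""
    with zipfile.ZipFile(zip_path) as zf:
        return [(info.filename, bool(info.flag_bits & 0x1)) for info in zf.infolist()]
```

Calling list_entries on one of the .crypt files should show the name exactly as it is stored in the archive, which is the name that PKCRACK's -c option needs.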

We UNZIPped the decrypted ZIP generated by PKCRACK, just to make sure we were on the right track:

duck@testmac:~/Temp$ unzip -j corplogo.png.decrpyt 
Archive:  corplogo.png.decrpyt
replace corplogo.png? [y]es, [n]o, [A]ll, [N]one, [r]ename: r
new name: corplogo.recovered.png
 extracting: corplogo.recovered.png  
duck@testmac:~/Temp$ shasum -a 256 corplogo.recovered.png corplogo.png
7190c9d479e6c344fcd6ebcf2455ec8d9d00d10a09386cd56308b39d70c7ccec  corplogo.recovered.png
7190c9d479e6c344fcd6ebcf2455ec8d9d00d10a09386cd56308b39d70c7ccec  corplogo.png

The logo file we extracted from the unscrambled ZIP file matched the original copy, so we carried on.
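
The same comparison can be scripted if you have many recovered files to check against backups. This is a minimal Python sketch with helper names of our own, doing what shasum -a 256 did above.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so that large files needn't fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_contents(path_a, path_b):
    """True if both files have the same SHA-256 hash."""
    return sha256_of(path_a) == sha256_of(path_b)
```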

Next, we used PKCRACK’s special zipdecrypt tool to convert all our other files – that’s like a special version of unzip that takes in raw values for key0, key1 and key2 where unzip would require you to put in the original password:

duck@testmac:~/Temp$ for f in *.crypt; do zipdecrypt ac5d37ee b7c718ab 3bc7973b "$f" "$f.recovered"; done
Decrypting Users//duck/Desktop/letterlegal5.doc.crypt (02e59dba51ca8cd8daa8c8f3)... OK!
Decrypting Users//duck/Desktop/lorem_document_PDF.pdf.crypt (01c51007238cccd0d7ac6d86)... OK!
Decrypting Users//duck/Desktop/shattered-1200.jpg.crypt (4891b0abcdda0c17614c5804)... OK!
Decrypting Users//duck/Desktop/wt-tmpdkhvbs-500.png.crypt (ec5f2177ef15d77e5aee108f)... OK!
Decrypting Users//duck/Desktop/YankeeHotelFoxtrot.mp3.crypt (d350f0eb554eeec6d1322382)... OK!
Decrypting Users//duck/Documents/.localized.crypt (108efd41d240b1005e7281c6)... OK!
Decrypting Users//duck/Documents/2003-example-spreadsheet.xls.crypt (f0f93d9533d8f6c660c2a2cd)... OK!
Decrypting Users//duck/Documents/2014_04_a4_format.doc.crypt (3ba2cc77598e2536e2a28208)... OK!
Decrypting Users//duck/Documents/Document-English.docx.crypt (f2e63ab43578659571be7b02)... OK!
Decrypting Users//duck/Documents/Large Spreadsheet Sales (Excel).xls.crypt (c9e2ce71ce5e9d4602cdc7a5)... OK!
Decrypting Users//duck/Documents/officialformat.doc.crypt (b125ea6bca507020b8401ac5)... OK!
Decrypting Users//duck/Documents/Thesis-and-Dissertation-Templete.doc.crypt (1c65a78878f3c52dcd5622bc)... OK!
Decrypting Users//duck/Music/Webdriver_Torso.mp3.crypt (139aa20cbb6edcb9c86af6ea)... OK!
Decrypting Users//duck/Music/YankeeHotelFoxtrot.mp3.crypt (3c50baedc1735187809a2591)... OK!
Decrypting Users//duck/Pictures/corplogo.png.crypt (c407e43e418b6e49cbc43e75)... OK!
Decrypting Users//duck/Pictures/shattered-1200.jpg.crypt (3c50baedc1735187809a2591)... OK!
Decrypting Users//duck/Pictures/wt-tmpdkhvbs-500.png.crypt (dee3bd6d201fd31b1ca377a3)... OK!

The last step was to unzip all the now-decrypted ZIP files, and that was that:

duck@testmac:~/Temp$ for f in *.recovered; do unzip -d / "$f"; done
Archive:  Users/duck/Desktop/letterlegal5.doc.crypt.recovered
 extracting: Users/duck/Desktop/letterlegal5.doc
Archive:  Users/duck/Desktop/lorem_document_PDF.pdf.crypt.recovered
 extracting: Users/duck/Desktop/lorem_document_PDF.pdf
Archive:  Users/duck/Desktop/shattered-1200.jpg.crypt.recovered
 extracting: Users/duck/Desktop/shattered-1200.jpg
Archive:  Users/duck/Desktop/wt-tmpdkhvbs-500.png.crypt.recovered
 extracting: Users/duck/Desktop/wt-tmpdkhvbs-500.png
Archive:  Users/duck/Desktop/YankeeHotelFoxtrot.mp3.crypt.recovered
 extracting: Users/duck/Desktop/YankeeHotelFoxtrot.mp3
Archive:  Users/duck/Documents/.localized.crypt.recovered
 extracting: Users/duck/Documents/.localized
Archive:  Users/duck/Documents/2003-example-spreadsheet.xls.crypt.recovered
 extracting: Users/duck/Documents/2003-example-spreadsheet.xls
Archive:  Users/duck/Documents/2014_04_a4_format.doc.crypt.recovered
 extracting: Users/duck/Documents/2014_04_a4_format.doc
Archive:  Users/duck/Documents/Document-English.docx.crypt.recovered
 extracting: Users/duck/Documents/Document-English.docx
Archive:  Users/duck/Documents/Large Spreadsheet Sales (Excel).xls.crypt.recovered
 extracting: Users/duck/Documents/Large Spreadsheet Sales (Excel).xls
Archive:  Users/duck/Documents/officialformat.doc.crypt.recovered
 extracting: Users/duck/Documents/officialformat.doc
Archive:  Users/duck/Documents/Thesis-and-Dissertation-Templete.doc.crypt.recovered
 extracting: Users/duck/Documents/Thesis-and-Dissertation-Templete.doc
Archive:  Users/duck/Music/Webdriver_Torso.mp3.crypt.recovered
 extracting: Users/duck/Music/Webdriver_Torso.mp3
Archive:  Users/duck/Music/YankeeHotelFoxtrot.mp3.crypt.recovered
 extracting: Users/duck/Music/YankeeHotelFoxtrot.mp3
Archive:  Users/duck/Pictures/corplogo.png.crypt.recovered
 extracting: Users/duck/Pictures/corplogo.png
Archive:  Users/duck/Pictures/shattered-1200.jpg.crypt.recovered
 extracting: Users/duck/Pictures/shattered-1200.jpg
Archive:  Users/duck/Pictures/wt-tmpdkhvbs-500.png.crypt.recovered
 extracting: Users/duck/Pictures/wt-tmpdkhvbs-500.png

The malware zips up the original files with their full paths, so we used unzip -d / for our unzipping command above so that the files would be created relative to the root directory, thus ending up back in the directory tree /Users/duck where they started out. Note that the leading / is left out of the names inside the archive to stop you unzipping them into absolute locations by mistake, where you might overwrite system files unexpectedly. It’s a good idea to verify the filenames in an archive before using -d /, in case there are files such as bin/bash or usr/bin/vi sneakily included in there that would replace core apps on your Mac.
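
One way to do that verification is sketched below in Python. The function name and the default prefix are our own assumptions, chosen to match the test Mac in this article. The idea is to flag any archive entry that is absolute or that, once its path is tidied up, would land outside the home directory you expect.

```python
import posixpath


def unexpected_names(names, allowed_prefix="Users/duck/"):
    """Return the entry names that would not unpack under allowed_prefix."""
    flagged = []
    for name in names:
        # Collapse doubled slashes and resolve any ".." components.
        normalized = posixpath.normpath(name)
        if name.startswith("/") or not normalized.startswith(allowed_prefix):
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    sample = [
        "Users//duck/Pictures/corplogo.png",
        "bin/bash",
        "Users/duck/../../usr/bin/vi",
    ]
    print(unexpected_names(sample))
```

You could feed it the names reported for each recovered archive and go ahead with the unzip only if the returned list is empty.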

$530 saved in 10 minutes

The whole process took about 10 minutes, including downloading the macOS command line compiler tools from Apple and compiling our own copy of PKCRACK, just to make sure we could do it from scratch.

It would have cost us $530 (BTC 0.45) to “hire” the Filecode criminal to do the same job, assuming he were awake, and competent…

…and trustworthy.

We know what we think of his rating on the last score.