The Code is dead. Long live the Code!

Three years ago internet banking Trojans, along with their associated downloader Trojans, began to proliferate: samples started flooding in by the thousands. The poor way to deal with these would be to wait for them to come in and then issue thousands of specific signatures. A much more effective approach was to search for definitive characteristics within early samples, identify the main families, and detect them proactively.

One of the easier downloader families to detect was Troj/SmDown-Fam. Shortly after some standard initialization code from their Borland Delphi compiler, samples in this family contained a distinctive decryption loop:

Decryption Code Loop 1

The trained eye can easily spot a decryption loop in code. A typical code sequence consists of:

1. Indexed read of a value from memory.
2. Performing some arithmetic.
3. Indexed write back to memory.
4. Incrementing (inc) or decrementing (dec) the index register.
5. Looping until an end of string marker is reached or a length counter reaches zero.

Here we see a slight variation. There is the indexed read of a single byte (mov dl, [edx+esi-1]); the arithmetic to convert the byte to the desired character consists of a subtraction and a not operation; but the writing of the decrypted string is performed indirectly, via a couple of library routines to append the new character to the end of the destination string. This is not the fastest way of doing things, but given that this code is not speed critical it serves the desired purpose.
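As a rough illustration, the loop described above can be sketched in Python. This is only a sketch under assumptions: the subtraction constant (KEY) is hypothetical, since the actual value is not given here, and the order of the two operations (subtract first, then not) is assumed from the description. The encrypt helper is likewise hypothetical and exists only to produce test input.

```python
KEY = 0x05  # hypothetical constant; the real value is not stated in the article


def decrypt(source: bytes, key: int = KEY) -> str:
    """Byte-by-byte decryption: subtract a constant, then bitwise NOT."""
    dest = ""
    # 1-based index, mirroring the [edx+esi-1] addressing of a Delphi string
    for i in range(1, len(source) + 1):
        b = source[i - 1]       # indexed read of a single byte
        b = (b - key) & 0xFF    # subtraction, wrapped to 8 bits
        b = ~b & 0xFF           # not operation, kept to 8 bits
        dest += chr(b)          # append the character to the destination string
    return dest


def encrypt(plain: str, key: int = KEY) -> bytes:
    """Hypothetical inverse of decrypt, used only to build sample input."""
    return bytes(((~ord(c) & 0xFF) + key) & 0xFF for c in plain)
```

Because each byte is transformed independently of its position, equal plaintext characters always produce equal ciphertext bytes, whatever the constant happens to be.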

So what is being decrypted? Looking back a bit earlier in the code one discovers that the memory referenced by “SourceString” is loaded with

Encrypted URI 1

Suppose we did not already know this was a downloading Trojan. There is a big clue as to what is about to happen here: in the encrypted string we see that the 2nd and 3rd characters are the same. We have already noted that the above decryption loop is a simple byte-by-byte decryption, therefore the decrypted string will also have characters 2 and 3 the same. Likewise the 6th and 7th characters. That pattern should trigger a question in every threat researcher’s mind: “Byte-wise encryption of ‘http://’?”
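That intuition can be turned into a small check. The sketch below is an illustration rather than any production heuristic: it reduces a byte string to its pattern of repeated characters and compares the first seven bytes with the pattern of "http://". The function names shape and could_be_http are hypothetical, and the check assumes a fixed byte-for-byte substitution in which distinct plaintext bytes map to distinct ciphertext bytes.

```python
def shape(data: bytes) -> list:
    """Replace each byte with the order in which its value first appeared."""
    seen = {}
    return [seen.setdefault(x, len(seen)) for x in data]


def could_be_http(ciphertext: bytes) -> bool:
    """True if the first seven bytes repeat in the same places as 'http://'."""
    prefix = b"http://"
    if len(ciphertext) < len(prefix):
        return False
    return shape(ciphertext[:len(prefix)]) == shape(prefix)
```

For "http://" the pattern is [0, 1, 1, 2, 3, 4, 4]: the 2nd and 3rd positions match, as do the 6th and 7th, and all other positions differ. A match is only a hint that prompts closer analysis, not proof.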

Sure enough if we emulate the above decryption algorithm on the string supplied, the result is:

Decrypted URI 1

and it is no surprise to find, a bit later on, code to pass that decrypted URL to the URLDownloadToFileA API call. There is also an API call to GetSystemDirectoryA. That is another telltale sign. The file at the above URL (now long gone) was not a .jpg picture but an executable banking Trojan, which the downloading Trojan attempts to install (this time with the correct .exe extension) in the system folder. It is a common malware downloading trick.

So, all members of Troj/SmDown-Fam can be detected by scanning for that decryption loop shortly after the standard initialization code, and then (just in case some legitimate application ever happens to use the same encryption method) checking that this is followed by the API calls GetSystemDirectoryA and URLDownloadToFileA. With the publication of the Troj/SmDown-Fam identity, never again would any member of that malware family be permitted to run on a computer protected by SAV. I never expected to see that decryption loop again; the code was dead.
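The shape of such a check might look like the sketch below. It is not the actual identity: LOOP_SIGNATURE holds placeholder bytes rather than the real loop, the 4096-byte window is an arbitrary assumption standing in for "shortly after the initialization code", and looking for the API names anywhere in the file is a simplification of checking that the calls follow the loop. The name matches_family is hypothetical.

```python
# Placeholder bytes standing in for the decryption loop; not a real signature.
LOOP_SIGNATURE = b"\x8a\x54\x32\xff"

# API names the article says accompany the loop in this family.
REQUIRED_APIS = (b"GetSystemDirectoryA", b"URLDownloadToFileA")


def matches_family(image: bytes, search_from: int = 0, window: int = 4096) -> bool:
    """Targeted check: loop bytes inside a small window, plus both API names."""
    region = image[search_from:search_from + window]
    if LOOP_SIGNATURE not in region:
        return False  # loop not found near the expected location
    # Guard against legitimate code that happens to use the same loop.
    return all(name in image for name in REQUIRED_APIS)
```

Restricting the search to a small window is what keeps a check like this fast; the second condition is there to reduce false positives.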

Or so I thought. Analysing a much more recent downloader Trojan I came across the following:

Decryption Code Loop 8

Deja vu?

This family of downloader Trojans has evolved a long way since the days of Troj/SmDown-Fam. The new family includes extra functionality and is considerably larger. As a result, that decryption loop does not appear until later on in the file. It is still being used to decrypt URLs, but now appears several times as downloads are attempted from several sources. That is another common trick in today's downloaders.

Theoretically it would be possible to publish a new identity which scans the whole code of a file for that decryption loop, but it would be slow. In practice malware detection needs to be targeted to maintain reasonable scanning speeds. Nevertheless, it was quite easy for me to perform a test scan within the labs. The results? Scanning 2 million recent malicious files found that 221 of them use that decryption loop. Scanning 1 million clean files: not one occurrence.

This is a classic demonstration of how malware authors continue to reuse custom source code routines, even over a period of years. Some source code may be shared widely across the malware community and gets used in many different malware families. However, I suspect this particular routine is a hallmark of one of the key groups behind Brazilian banking Trojans, one that has persisted even when the actual malware families have changed radically as the authors have added extra functionality and attempted to evade detection.

Malware attempting to steal your bank login credentials is still very prevalent, but we have several methods to detect it. I won’t be publishing a slow scan for the decryption loop, because our behavioural genotype technology already detects these new banker downloaders in eight different ways: a typical sample is detected as Mal/DelpDldr-C, Mal/DownLdr-AC, Mal/Heuri-E, Sus/DwnLdr-A, Sus/Delf-J, Sus/SvchostS-A, Sus/SmDldr-A, and Sus/DldrFilt-A! If you use SAV you are thoroughly protected.