Mal/Xpaj-B – how to avoid looking like a virus

Many midinfecting viruses leave one or more tell-tale signs in their infected files, which can raise suspicion and increase the chances of heuristic detection. These include a writable code section, unusual imports, cross-section jumps and a large block of encrypted data near the end of the file. The authors of Mal/Xpaj-B have gone to considerable effort to avoid all of these.
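Two of the signs listed above are easy to test for mechanically. The following is a minimal sketch of how a scanner might check for them, not a description of any real product's heuristics. The three flag values come from the PE/COFF specification; the function names, the 4 KB tail size and the 7.5 bits-per-byte threshold are assumptions chosen purely for illustration.

```python
import math
from collections import Counter

# Section characteristic flags as defined in the PE/COFF specification.
IMAGE_SCN_CNT_CODE = 0x00000020
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_WRITE = 0x80000000


def is_writable_code(characteristics: int) -> bool:
    """True if a section is marked as code or executable and is also writable."""
    is_code = bool(characteristics & (IMAGE_SCN_CNT_CODE | IMAGE_SCN_MEM_EXECUTE))
    return is_code and bool(characteristics & IMAGE_SCN_MEM_WRITE)


def shannon_entropy(data: bytes) -> float:
    """Entropy in bits per byte: 0.0 for constant data, 8.0 for uniform data."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def tail_looks_encrypted(file_bytes: bytes, tail_size: int = 4096,
                         threshold: float = 7.5) -> bool:
    """Hypothetical heuristic: the last tail_size bytes have very high entropy."""
    return shannon_entropy(file_bytes[-tail_size:]) >= threshold
```

A typical compiler-generated code section (code, execute, read) passes the first check, whereas the same section with the write flag added does not. Compressed resources also have high entropy, so the second check on its own would produce false positives.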

The virus reuses most of the code from July’s Mal/Xpaj-A:

  • When searching for files to infect, it targets network drives, removable drives and any programs that start automatically.
  • The whole of the virus body has been run through a code obfuscator. This confuses automated analysis tools, although it also makes the virus slow and bulky.
  • The backdoor phones home to a different domain, but the encrypted communication protocol is the same.

However, Xpaj-B has a major new feature: multi-layer encryption. While Xpaj-A hid its strings and data with a rolling XOR key, Xpaj-B goes a few steps further. The whole of the virus body (including the already-encrypted data) has been put through another layer of encryption; the decryption routine for that layer is invoked by a Virtual Machine; and the bytecode for that VM is stored, encrypted yet again, after the virus body.
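For readers unfamiliar with the term, a rolling XOR is one in which the key changes as it moves through the data. The sketch below shows the general idea and how an analyst might peel off nested layers. The key schedule (an 8-bit key that increases by a fixed step after every byte) is an assumption for illustration only; the actual schedules used by Xpaj-A and Xpaj-B are not described here.

```python
def rolling_xor(data: bytes, key: int, step: int) -> bytes:
    """XOR each byte with an 8-bit key that advances by `step` after every byte.

    The update rule (key = key + step, modulo 256) is an assumed example,
    not the virus's real key schedule. Because the key sequence does not
    depend on the data, applying the function twice restores the input.
    """
    out = bytearray()
    for b in data:
        out.append(b ^ key)
        key = (key + step) & 0xFF
    return bytes(out)


def peel_layers(blob: bytes, layers: list[tuple[int, int]]) -> bytes:
    """Remove nested layers in the order given; each layer is (key, step)."""
    for key, step in layers:
        blob = rolling_xor(blob, key, step)
    return blob


# Usage: wrap some data in two layers, then remove them again.
inner = rolling_xor(b"example strings", 0x5A, 3)
outer = rolling_xor(inner, 0x11, 7)
recovered = peel_layers(outer, [(0x11, 7), (0x5A, 3)])
```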

The stack-based Virtual Machine uses only seven instructions, each of which takes at most two 32-bit arguments:

  • Push immediate value
  • Push from indirect address
  • Pop to indirect address
  • Add top two values
  • Compare top two values and jump if equal
  • Call non-VM function
  • Push from FS region
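To show how little is needed to get useful work out of such an instruction set, here is a toy interpreter for the seven operations listed above. It is a sketch for analysts, not a reconstruction of the virus's VM: the mnemonics, the operand layout, the dictionary-based memory and FS region, and the example program are all invented for illustration. The Call operation is modelled as a call to an ordinary Python function.

```python
MASK = 0xFFFFFFFF  # keep every value within 32 bits


def run(program, memory=None, fs=None, natives=None, max_steps=10_000):
    """Execute a list of (opcode, *operands) tuples; return (stack, memory)."""
    memory = {} if memory is None else memory
    fs = fs or {}
    natives = natives or {}
    stack, pc, steps = [], 0, 0
    while pc < len(program):
        steps += 1
        if steps > max_steps:
            raise RuntimeError("step limit exceeded")
        op, *args = program[pc]
        pc += 1
        if op == "PUSH_IMM":      # push immediate value
            stack.append(args[0] & MASK)
        elif op == "PUSH_MEM":    # push from address (assumed to be an operand)
            stack.append(memory.get(args[0], 0))
        elif op == "POP_MEM":     # pop to address
            memory[args[0]] = stack.pop()
        elif op == "ADD":         # add top two values
            b, a = stack.pop(), stack.pop()
            stack.append((a + b) & MASK)
        elif op == "JE":          # compare top two values, jump if equal
            b, a = stack.pop(), stack.pop()
            if a == b:
                pc = args[0]
        elif op == "CALL":        # call non-VM function: (name, argument count)
            name, nargs = args
            call_args = [stack.pop() for _ in range(nargs)][::-1]
            stack.append(natives[name](*call_args) & MASK)
        elif op == "PUSH_FS":     # push from FS region (modelled as a dict)
            stack.append(fs.get(args[0], 0))
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return stack, memory


# Hypothetical program: count to 3 in a loop, then pass the counter to a native.
PROGRAM = [
    ("PUSH_IMM", 0), ("POP_MEM", 0x100),                                  # counter = 0
    ("PUSH_MEM", 0x100), ("PUSH_IMM", 1), ("ADD",), ("POP_MEM", 0x100),   # counter += 1
    ("PUSH_MEM", 0x100), ("PUSH_IMM", 3), ("JE", 12),                     # done if counter == 3
    ("PUSH_IMM", 0), ("PUSH_IMM", 0), ("JE", 2),                          # unconditional loop back
    ("PUSH_MEM", 0x100), ("CALL", "double", 1),                           # native(counter)
]
stack, mem = run(PROGRAM, natives={"double": lambda x: x * 2})
```

Note that there is no unconditional jump: the example builds one by comparing two equal immediates. There is no subtraction or XOR either, so loops can only count upwards to a target value.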

The Call non-VM function operation may sound like a lazy option, but it is used for three essential steps: once to call ZwProtectVirtualMemory (which is located manually within ntdll rather than through the usual import structures) to make the virus body executable; once for the decryption loop (which would need an additional XOR instruction, and would run much more slowly, if executed entirely within the VM); and finally to jump to the decrypted virus body. To disguise the distinctive loops and jumps further, even the non-VM functions are first constructed four bytes at a time using the VM’s other instructions.
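Building a function four bytes at a time leaves an obvious trail for anyone emulating the VM: a series of 32-bit stores to adjacent addresses. The sketch below shows how an analyst might stitch such a trace back into a contiguous buffer for disassembly. The trace format, a list of (address, value) pairs, is an assumption, as is little-endian byte order (the norm on x86).

```python
import struct


def reassemble(stores: list[tuple[int, int]]) -> tuple[int, bytes]:
    """Rebuild a buffer from (address, 32-bit value) stores.

    Returns (base_address, data). Gaps between stores are zero-filled and,
    where stores overlap, the later one wins.
    """
    if not stores:
        return 0, b""
    base = min(addr for addr, _ in stores)
    end = max(addr for addr, _ in stores) + 4
    buf = bytearray(end - base)
    for addr, value in stores:
        # "<I" is an unsigned 32-bit little-endian integer.
        struct.pack_into("<I", buf, addr - base, value & 0xFFFFFFFF)
    return base, bytes(buf)


# Usage with a made-up trace, deliberately out of order.
base, data = reassemble([(0x1004, 0x68676665), (0x1000, 0x64636261)])
```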

These instructions are all fetched, decrypted and executed by a handful of chunks of polymorphic code scattered around the host file’s code section. Here is the compare and jump if equal function from two different samples:

[Figure: Mal/Xpaj-B polymorphic code]

The registers and local variables are shuffled between infections, to prevent virus chunks from being found by simple pattern-matching. The routine that generates this code isn’t as complex as the obfuscator that was used on Xpaj’s body, but it doesn’t need to be. Its aim is not to make the code unreadable but to ensure it looks similar to the output of common high-level language (HLL) compilers.
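Register shuffling of this kind defeats a fixed byte signature, but not a comparison that ignores which register is which. The sketch below illustrates one such approach: renaming registers in order of first use before comparing. The two instruction sequences are invented for the example and are not taken from real samples, and the tuple representation stands in for the output of a disassembler.

```python
REGISTERS = {"eax", "ebx", "ecx", "edx", "esi", "edi"}


def normalise(instructions):
    """Replace each register with a placeholder numbered by first appearance."""
    mapping = {}
    out = []
    for mnemonic, *operands in instructions:
        norm_ops = []
        for op in operands:
            if op in REGISTERS:
                # First time we see a register, give it the next placeholder.
                mapping.setdefault(op, f"r{len(mapping)}")
                norm_ops.append(mapping[op])
            else:
                norm_ops.append(op)
        out.append((mnemonic, *norm_ops))
    return out


# Hypothetical chunks: the same logic with different register choices.
sample_a = [("mov", "eax", "ARG0"), ("mov", "ecx", "ARG1"),
            ("cmp", "eax", "ecx"), ("jne", "SKIP")]
sample_b = [("mov", "esi", "ARG0"), ("mov", "edx", "ARG1"),
            ("cmp", "esi", "edx"), ("jne", "SKIP")]

same_shape = normalise(sample_a) == normalise(sample_b)
```

This only handles renamed registers. Shuffled local variable offsets would need the same treatment, and code that is deliberately built to resemble compiler output makes any such generic match prone to false positives.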

Writing all these chunks into the code section substantially damages the host. During infection, the virus has to save the overwritten bytes after its own body. Once the infected copy has started its own infection thread, it decrypts the saved bytes and executes them at their new address. Only then does it pass control back to the host. The host isn’t repaired by the virus at any point, even in memory.

Xpaj-A was an unremarkable parasitic infector, but Xpaj-B is a significant advance. What new tricks are the authors planning for the next variant?