VMware has confirmed that source code posted on PasteBin by a hacker calling himself “Hardcore Charlie”, is a portion of code from its ESX Hypervisor product.
An advisory posted on VMware’s website, explains that its security team had become aware that a single file from the VMware ESX source code had been posted publicly.
VMware’s statement acknowledged that more files may be posted in the future, but attempted to reassure customers by explaining that the published code dated from 2003/2004.
According to Iain Mulholland, director of VMware’s Security Response Center, “the fact that the source code may have been publicly shared does not necessarily mean that there is any increased risk to VMware customers.”
There is some mystery, however, as to exactly whose security was breached for the source code to fall into the hands of hackers. According to VMware, the firm actively “shares its source code and interfaces with other industry participants to enable the broad virtualization ecosystem today.”
The speculation is, however, that the leak occurred following an alleged network breach at Beijing-based military contractor, the China National Electronics Import-Export Corporation (CEIEC).
For its part, CEIEC denied earlier this month that it had been hacked.
Meanwhile, Hardcore Charlie claims to have gained access to source code from other technology companies. Only time will tell if he goes ahead with his plan to publish it.
What kind of file is this? Do we know?
and it seems that he promised more disclosures in May.
I've never even thought about this before now, but how does any application protect its source code? Why can't anyone who has a copy of the application get access to the source code? Is it all encrypted?
I realize that the license agreement might prohibit it, but that doesn't stop thieves. If it's news that VMWare's source code was stolen, I'm inferring that source code thievery isn't a common occurrence. But why not? What keeps ne'er-do-wells from stealing source code all the time?
Programs are compiled or “built” from the source code before they are used. When you run an application on your computer you are running the end result of the compiled source code, not the source code itself. There is a whole science behind reverse engineering programs into source code, and sometimes it can be done with great success, but I don’t think you could ever take a complex program like HyperVisor and reverse engineer the entire source code (folks who are experts in that area can feel free to correct me if I’m wrong).
There are cases where people are expected to compile the code into a program (this is especially true with Linux) but most desktop users never even need to know what source code is. The software that they install has already been compiled for them.
Perhaps this analogy will help: Think of the source code like a recipe and the program as the end product that was created from the recipe. Trying to get the source code from the end product is kind of like trying to get the recipe for coke by drinking one, or a cake recipe by eating a cake. You might be able to figure out some parts of it (sugar, egg, etc) but figuring out the exact combinations so that you could recreate it yourself would be practically impossible. Also, the more complex the recipe is, the harder it would be to figure out by taste.
The recipe analogy also helps understand why some companies protect their source code so much. If you ask your grandma for her chocolate chip cookie recipe then she’ll probably give it to you. If you ask Coca Cola for the coke recipe they’ll just laugh. The amount of secrecy that surrounds it is directly related to how important it is to the individual or corporation.
The same holds true for source code. In many cases it is a company’s product. Many people and companies write free programs and publish the source code for free (Android, Open Office, and most things Linux are just a few examples), but companies like Microsoft and VMWare guard their source code with top of the line security. If the source code got out into the wild then anyone could modify and re-release their program, or unethical competitors could secretly use it to build a competing product that could topple the original company.
Source code is seperate from binary Application executable code. Source is of a form tangible to humans for them to more easily be able to code and maintain, while executable Application code is in very difficult and cumbersome to understand for a human machine code, which machines natively execute.
Customers get the machine code and not the source code. Converting machine code back into source code is hit and miss, you don’t get the original source code or any of the inline comments which document it and often binary machine code is intentionally obfuscated in such as way as to hamper effective de-compilation to source code. Meaning that the source code you get does not make much more sense than the binary machine code.
Hackers getting their hands on actual closed-source source code, means that someone either hacked in or there was a misplaced trust issue with someone not complying with an NDA or similar.