MS-DOS and MS-Word source code released for review – get it while it’s new!

What would you do if you received an email entitled “Better late than never”?

What if the email contained little more than two URLs, implicitly inviting you to click through?

It sounds like the sort of message you probably ought to delete without a second thought.

But the one that dropped into my mailbox a couple of days ago was from my friend, colleague and fellow Sophos Naked Security writer, Gabor Szappanos, better known as Szappi.

Regular readers will know that Szappi has been writing about Advanced Persistent Threats lately, especially those that use exploits against Microsoft Word, for example in booby-trapped RTF files.

And we just wrote about a newly-discovered zero-day in Word that has been used in the wild by cybercrooks armed with malicious RTFs.

“That must be it,” I thought, “Microsoft has come out with a permanent fix. Better late than never.”

(Technically speaking, patches for zero-days are always “late”, even if they’re quick, since a zero-day means the crooks got there first.)

But it was much more interesting than that.

In fact, the links in Szappi’s email told their own story:

My next thought, of course, was along the lines of, “I need to tell the readers of Sophos Naked Security!”

And as soon as I had come up with a security-related angle to sneak in at the end, lest anyone complain I had drifted off-topic, I decided I’d do just that.

All the heroes of the single-tasking era are there in one shape or another: Tim Paterson, Mark Zbikowski, even the dark lord of errors himself, General Failure.

The earliest source code in these new bundles, arranged by San Francisco’s Computer History Museum, is MS-DOS 1.25.

As Tim Paterson, the Father of DOS himself, notes in an email included in the archive, MS-DOS 1.25 was technically the same as IBM DOS 1.10.

But the MS-DOS flavouring was to make this a version of great historical importance: it marks the release of DOS “into the wild,” with the 1.25 variant going to vendors other than IBM for the first time.

They really don’t make them like they used to.

The operating system is split into two parts, and consists of just three files.

There’s COMMAND.ASM, which is the command interpreter; there’s MSDOS.ASM, which is effectively the kernel; and there is STDDOS.ASM, a mere 22 lines that you assemble to build the kernel (admittedly, it does INCLUDE MSDOS.ASM).

All are written in Tim Paterson’s lean-but-clean old-school assembler style: all code is UPPER CASE; crisp comments are in mixed-case; and lining up the text is done with TABs, set to eight spaces as the Good Lord intended:

Text lines, of course, end with CR+LF, and all files end with a Ctrl-Z character.
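If you want to check whether a file you have unpacked still follows those conventions, here is a minimal Python sketch. The function names and the sample line are my own inventions for illustration, not taken from the archive; the only assumptions are that Ctrl-Z is byte 26 (0x1A) and that TAB stops sit every eight columns.

```python
CTRL_Z = b"\x1a"  # Ctrl-Z, byte 26, the old DOS end-of-file marker


def check_dos_text(data: bytes) -> dict:
    """Report whether raw file bytes follow the DOS text conventions."""
    ends_with_ctrl_z = data.endswith(CTRL_Z)
    body = data[:-1] if ends_with_ctrl_z else data
    # Remove every CR+LF pair; any CR or LF left over is a stray line ending.
    leftover = body.replace(b"\r\n", b"")
    crlf_only = b"\r" not in leftover and b"\n" not in leftover
    return {"crlf_only": crlf_only, "ends_with_ctrl_z": ends_with_ctrl_z}


def expand(line: str) -> str:
    """Expand TABs to eight-column stops so columns line up as intended."""
    return line.expandtabs(8)


# Hypothetical sample line in the upper-case style described above.
sample = b"START:\tMOV\tAX,BX\r\n" + CTRL_Z
print(check_dos_text(sample))
print(expand("START:\tMOV\tAX,BX"))
```

Read the file in binary mode (`open(path, "rb")`) before calling `check_dos_text`, otherwise Python may quietly translate the line endings for you.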

There are surprisingly few differences between the MS-DOS and the IBM DOS builds.

What has become the well-known Microsoft C-prompt, or C:\>, actually ended with a colon in the IBM DOS version.

DOS 1 didn’t have a C-drive (no hard disk support) or path names (no directory structure).

So the prompt was simply A> to users of MS-DOS, and A: to those with the IBM flavour:

And the operating system announced itself differently, of course.

Interestingly, the code to display the name and version number is in MSDOS.ASM for the MS-DOS build, but in COMMAND.ASM for the IBM version:

In contrast, the Word for Windows source code seems much more modern, although it is less than a decade younger than DOS 1.

The impact of Microsoft’s Charles Simonyi, inventor of the so-called Apps Hungarian notation, where variable names in programs are prefixed with characters to remind you what they are for, is clear:

So fFalse reminds you it’s a flag that simply reflects a state, while a vfAwfulNoise is a variable that saves a flag to do with whether you’ve made the AwfulNoise yet, or not.

And ppfb is a memory aid that you’re dealing with a pointer to a pointer to a formatting block.

Windows programmers will know that Simonyi’s notation found its way from the Word and Excel teams into Windows itself, tortured into a form called Systems Hungarian, where you somewhat purposelessly use the prefixes to encode the actual C type of the variable into its name.

So, instead of helpfully tagging a variable name with some reminder of the sort of data it measured or contained, you simply re-encoded what the compiler already knew about how it was stored in memory, leading to something of a backlash against Hungarian Notation in general.
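To make the difference concrete, here is a small sketch in Python. The prefixes and names are hypothetical, chosen only for illustration and not lifted from the Word source: `n_` stands for "it's a number", while `xl_` and `xw_` mean "x-coordinate relative to the layout" and "x-coordinate relative to the window".

```python
# Systems Hungarian style: the prefix repeats the storage type,
# which the language already knows.
n_caret = 120
n_origin = 100

# Apps Hungarian style: the prefix says what kind of quantity it is.
# xl_ = x relative to the page layout, xw_ = x relative to the window
# (both prefixes are hypothetical).
xl_caret = 120
xl_window_origin = 100


def xw_from_xl(xl, xl_window_origin):
    """Convert a layout x-coordinate into a window x-coordinate."""
    return xl - xl_window_origin


xw_caret = xw_from_xl(xl_caret, xl_window_origin)
print(xw_caret)
```

With the second style, an assignment such as `xw_caret = xl_caret` looks wrong at a glance, because the prefixes disagree. With the first, `n_caret = n_origin` looks perfectly respectable, even though it mixes up two different things.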

But that is a religious argument for another time.

I’ll leave you with one gem from the Word source, found for me by none other than Szappi, who carried out extensive work on Word macro virus detection in the 1990s.

Malware for Microsoft Word first came out for Word 6, and was later “backported” to work in versions from Word 2 onwards.

There were never any viruses for Word 1.1, even though it contained a macro programming language like Word 2 and later.

Nevertheless, somebody in Microsoft seems to have predicted that there might be a security problem looming.

That person obviously had the nagging feeling that allowing user-supplied macros (inside the document you were about to open!) to take over by default from built-in commands might need reconsidering:

Global macros will be selected over commands automaticaly [sic] because the commands are listed in the command table first. If it is decided that the user cannot replace global commands with a macro, this still will not need to change, only adding a macro will.

That “user cannot replace global commands” decision was never made, leaving the door wide open to macro Trojans embedded in documents.
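Here is a toy model of that design choice, written in Python. It is my own guess at the general shape, not Word’s actual implementation: the table layout, the function names and the `allow_override` flag are all assumptions, and the command names are merely examples. Built-in commands are listed first, lookup lets a later entry win, and the only place a policy would need enforcing is the code that adds a macro.

```python
def new_table():
    """Start a command table with the built-in commands listed first."""
    return [("FileOpen", "builtin"), ("FileSave", "builtin")]


def add_macro(table, name, allow_override=True):
    """Append a macro; optionally refuse names that shadow a built-in."""
    shadows_builtin = any(n == name and kind == "builtin" for n, kind in table)
    if shadows_builtin and not allow_override:
        raise ValueError(f"macro {name!r} may not replace a built-in command")
    table.append((name, "macro"))


def lookup(table, name):
    """Scan the whole table and keep the last match, so later entries win."""
    found = None
    for n, kind in table:
        if n == name:
            found = kind
    return found


table = new_table()
add_macro(table, "FileOpen")      # permissive policy: allowed
print(lookup(table, "FileOpen"))  # the macro now runs instead of the built-in
```

Calling `add_macro(table, "FileOpen", allow_override=False)` on a fresh table raises an error instead, and `lookup` never needs to change, which is the point the comment in the source was making.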

And not just Trojans, because Microsoft went on to add the MacroCopy command into Word’s macro language.

That made it easy to transfer your malware from an incoming booby-trapped document right into the user’s Word environment.

Once you’d infected Word itself, that same MacroCopy command could be abused to replicate your virus into every document the victim opened thereafter, giving you a full-blown, fast-spreading computer virus.

As Szappi wistfully remarked to me in an email:

After Word 1.1 Microsoft made two decisions: [1] stick with the possibility to override global commands with user macros, and [2] add the MacroCopy command.

These two combined led to the emergence of macro viruses, dominant for a decade. Without these two, the 1990s would have been totally different.

You can quote me on that.

Just think: we could have had so much more time to fret about the Millennium Bug!

And you can quote me on that.