Harvard Ethics youngster charged with massive online theft

A 24-year-old, described as a researcher at Harvard University’s Center for Ethics, has been arrested in Massachusetts, USA, on a raft of computer crime charges.

The youngster, Aaron Swartz, has an interesting history, considering his age. He is a co-author of the RSS 1.0 specification, which was published just one month after his 14th birthday, and he joined the W3C’s Resource Description Framework (RDF) working group just five months after that.

Since then, he’s done lots of other things. He describes himself as the cofounder of social networking site Reddit (perhaps a bit of a stretch, since Swartz joined Reddit at 19 when the original founders acquired his company); as the founder of online political activist site Demand Progress; and as the founder of the librarian’s Wikipedia, the anyone-can-edit library catalog Open Library.

Loosely speaking, the charges allege that Swartz used MIT’s network (not Harvard’s) to download a whole bucketload of academic articles from not-for-profit academic journal archive JSTOR, beyond what he was entitled to, with the aim of republishing them without restriction.

The charges describe Swartz’s alleged actions in a surprisingly well-written series of entry-level hacking How-Tos – which, incidentally, also make it compellingly obvious why entry-level hackers get caught. (Hint 1. Don’t call your leeching program keepgrabbing.py. Hint 2. Don’t call the next version of your leeching program keepgrabbing2.py.)

This part of the charge-sheet is intended to establish that Swartz knew that his bucket-sized download was excessive. The charges claim that MIT blocked his laptop and banned him from the network, whereupon Swartz repeatedly took steps to circumvent the ban, and continued his extensive leeching of the JSTOR database.

Eventually, claims the charge-sheet, Swartz “simply hard-wired into the network and assigned himself two IP addresses”.

The connection he made was inside a cabling closet, and Swartz is said to have “hid[den] the Acer laptop and a succession of external storage drives under a box in the closet, so that they would not be obvious to anyone who might enter the closet.”

Then, in what sounds surprisingly like a scene from one of those hacker movies you’ve never quite managed to watch until the end:

“Swartz returned to the wiring closet to remove his computer equipment. This time he attempted to evade identification at the entrance to the restricted area. As Swartz entered the wiring closet, he held his bicycle helmet like a mask to shield his face, looking through ventilation holes in the helmet. Swartz then removed his computer equipment from the closet, put it in his backpack, and left, again masking his face with the bicycle helmet before peering through a crack in the double doors and cautiously stepping out.”

The courts now have to decide whether Swartz really did steal a whopping 4.8 million articles from an academic archive, including 1.7 million articles which weren’t free.

And the courts also need to decide whether this amounts to the sort of criminality which – as the media are chillingly happy to remind us – apparently carries a maximum sentence of 35 years in prison, restitution and forfeiture, and a fine of $1 million.

Swartz's colleagues at Demand Progress are, unsurprisingly, aghast at the charges, with executive director David Segal suggesting that this is "like trying to put someone in jail for allegedly checking too many books out of the library."

Swartz's detractors – presumably including some of the authors whose non-free works were amongst the 1.7 million he allegedly helped himself to – might see it differently. They might argue that it was a bit more like photocopying or scanning all the books in the library and running off to use the copied material in a project of your own.

Hang on a moment…hasn't that been done already?