Tool scrubs hidden tracking data from printed documents

Just because a document isn’t digital doesn’t mean it doesn’t contain metadata. Printed documents often have their own hidden details, and now German researchers have developed tools to help you scrub them clean.

We have known for over a decade that most colour laser printers embed unique details to trace each document back to its source. These typically take the form of tiny patterns of yellow dots, invisible to the naked eye, that encode information such as the printer's serial number and when the document was printed.

Now, researchers have released software to strip documents of that information. This could help whistleblowers to reveal sensitive information without getting caught, they claim.

Printer manufacturers have included this feature for years. The devices add the invisible dots to the image just before it hits the paper. The information hides in plain sight as a repeating matrix, nestled in the document’s white spaces, viewable only with a blue LED light and a magnifier, but it can trace every printout uniquely to your printer. Manufacturers rarely notify customers about these features, but law enforcement uses them to fight counterfeiters.

Timo Richter and Stephan Escher, researchers at TU Dresden’s Chair of Privacy and Data Security, cited NSA whistleblower Reality Leigh Winner as an example of what happens when governments and companies use these tracking dots to invade people’s privacy.

Winner, who worked for Pluribus International Corporation, was stationed at the NSA where she printed a top-secret document detailing a cyber attack by Russian military intelligence on US election infrastructure.

She had produced the documents using NSA printers, and investigative journalism site the Intercept then scanned and reproduced them online. Winner’s arrest affidavit shows that she was identified following an ‘internal audit’.

Errata Security showed at the time how the document contained a dot pattern showing when it was printed, and on what device, which may have been one of many clues leading to her arrest. Winner is set to serve at least five years in jail after reaching a plea deal last week.

Reading between the lines

The TU Dresden researchers wanted to give people the chance to manipulate these dots for themselves. They analysed 1286 prints from 141 printers spanning 18 manufacturers to document the patterns those printers used. They found four separate pattern formats used by different manufacturers.

Along with colleagues Dagmar Schönfeld and Thorsten Strufe, the duo created a tool, called Dot Extraction, Decoding and Anonymisation (DEDA). They also wrote a paper detailing its inner workings.

The tool offers a range of functions in two broad groups: analysis and anonymization.

On the analysis side, DEDA ‘reads’ the dots in a scanned document to find out what pattern it uses and to extract any information it can. If the tool cannot read any information from the dot pattern, it can extract the dots for further analysis. Users wanting to forensically analyse several files at once can also use the tool to find any produced by different printers.
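To get a feel for what ‘reading’ the dots involves, here is a minimal Python sketch of the first step: finding pale yellow pixels in a scan. It is not DEDA's actual algorithm. The function names, the colour thresholds and the tiny hand-made image are all assumptions made for illustration. A yellow dot reflects plenty of red and green light but little blue, which is also why the dots stand out under a blue light.

```python
# Illustrative sketch only: NOT DEDA's real detection code.
# An image is a list of rows; each pixel is an (R, G, B) tuple with values 0-255.

def is_yellow_dot(pixel, min_rg=200, max_blue=150):
    """Return True if the pixel looks yellow: strong red and green, weak blue.

    The thresholds are assumed values chosen for this toy example.
    """
    r, g, b = pixel
    return r >= min_rg and g >= min_rg and b <= max_blue


def find_yellow_dots(image):
    """Return the (row, column) position of every yellow-looking pixel."""
    return [
        (y, x)
        for y, row in enumerate(image)
        for x, pixel in enumerate(row)
        if is_yellow_dot(pixel)
    ]


WHITE = (255, 255, 255)
YELLOW = (255, 255, 80)
BLACK = (0, 0, 0)

# A hypothetical 3x4 'scan': white paper, one black text pixel, two yellow dots.
scan = [
    [WHITE, YELLOW, WHITE, WHITE],
    [WHITE, WHITE, BLACK, WHITE],
    [WHITE, WHITE, WHITE, YELLOW],
]

print(find_yellow_dots(scan))  # prints [(0, 1), (2, 3)]
```

A real tool would then have to work out which of the known pattern formats the dot positions match before decoding anything, a step this sketch does not attempt.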

On the anonymization side, DEDA can anonymize a scanned image by wiping all the dots from its whitespace. It can also anonymize a document for printing by adding more dots to the existing pattern, confusing anyone who tries to read the information. This is a more time-consuming process, involving the production of a mask which must then be aligned with the scanned document before printing the anonymized version.
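Both approaches can be sketched in a few lines of Python. Again, this is a simplified illustration and not DEDA's implementation: the helper names, the colour thresholds and the idea of treating the pattern as a small grid of cells are assumptions. In particular, filling every empty cell is just one simple way to make a pattern unreadable; the article does not say exactly which extra dots DEDA adds.

```python
# Illustrative sketch only: NOT DEDA's real anonymisation code.
# Pixels are (R, G, B) tuples; dot positions are (row, column) grid cells.

WHITE = (255, 255, 255)


def looks_yellow(pixel, min_rg=200, max_blue=150):
    """Assumed test for a tracking dot: strong red and green, weak blue."""
    r, g, b = pixel
    return r >= min_rg and g >= min_rg and b <= max_blue


def wipe_dots(image):
    """Return a copy of a scanned image with yellow-looking pixels set to white."""
    return [[WHITE if looks_yellow(p) else p for p in row] for row in image]


def build_mask(existing_dots, rows, cols):
    """Return the grid cells that do not yet hold a dot.

    Printing extra dots in these cells fills the whole (hypothetical) grid,
    so the combined pattern no longer distinguishes one printer from another.
    """
    occupied = set(existing_dots)
    return [
        (r, c)
        for r in range(rows)
        for c in range(cols)
        if (r, c) not in occupied
    ]
```

The second function hints at why the print route is slower: the mask is only useful if its extra dots land precisely on the printer's own grid, which is where the alignment step comes in.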

TU Dresden’s isn’t the only project to target these yellow dots. A year ago, CryptoAUSTRALIA researcher Gabor Szathmari submitted a pull request to an open source sanitising tool called PDF Redact Tool, produced by the Intercept’s owner, First Look Media. The changes, which were added to the product, take a lower-tech approach by converting images to black and white, effectively removing the tracking dots.
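Why does going black and white work? A rough Python sketch shows the idea. This is not the code from PDF Redact Tool; the function name and the threshold of 128 are assumptions, and the brightness formula is the standard Rec. 601 luma weighting. Pale yellow is almost as bright as white paper, so a simple brightness cut-off turns the dots white while dark text stays black.

```python
# Illustrative sketch only: NOT the PDF Redact Tool implementation.
# Pixels are (R, G, B) tuples with values 0-255.

def to_black_and_white(image, threshold=128):
    """Convert an RGB image to pure black (0) and white (255) pixels.

    Brightness uses the Rec. 601 luma weights; the threshold is an assumed value.
    """
    result = []
    for row in image:
        new_row = []
        for r, g, b in row:
            luminance = 0.299 * r + 0.587 * g + 0.114 * b
            new_row.append(255 if luminance >= threshold else 0)
        result.append(new_row)
    return result


page = [
    [(255, 255, 255), (255, 255, 80)],  # white paper, yellow tracking dot
    [(20, 20, 20), (255, 255, 255)],    # dark text pixel, white paper
]

print(to_black_and_white(page))  # prints [[255, 255], [0, 255]]
```

The trade-off is obvious: any colour in the original document is lost along with the dots.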

User beware

Does all this mean that you can safely use these tools to scrub your whistleblowing documents of any identifying data? Perhaps not.

The EFF, in its no-longer-updated list of yellow dot-producing printers, cites documents that it received from the government in FOIA requests. These suggest that all major manufacturers may have entered into an agreement to embed some kind of forensic tracking technology, it says, adding:

It appears likely that all recent commercial laser printers print some kind of forensic tracking codes, not necessarily using yellow dots. This is true whether or not those codes are visible to the eye and whether or not the printer models are listed here. This also includes the printers that are listed here as not producing yellow dots.

There are also other tracking mechanisms (which the TU Dresden team describes as ‘passive’ in their paper). These include analyzing halftone patterns in printed images and looking for slight geometrical differences in printed characters. Forensic analysts used that technique to trace typewritten documents long before printers came along.

So if you’re planning to blow the lid off a scandal by scanning and reprinting the telltale documents, be careful – there may, quite literally, be more than meets the eye.