Google strips private medical data from searches

Google has quietly amended its search engine indexing to exclude past and present personal medical data for the first time.

It’s a deceptively straightforward change the company describes in its data removal policy homepage as relating to “confidential, personal medical records of private people.”

The shock of this – which perhaps also explains why it is being sneaked in as a single line of text on a help page very few people visit – is that Google’s search engine would index such a sensitive category of data in the first place.

Other data categories already excluded include social security, bank and credit card numbers, personal signatures and, from 2015 on, non-consensual “revenge porn”.

The move raises the question of why it’s taken so long to institute a change that’s been a pressing problem since stolen and inadvertently leaked medical data started finding its way onto the public internet as long as a decade ago.

Whether breached data ends up in a form that can be seen by search engines depends in part on how the breach occurred. If it’s lost on a stolen laptop or siphoned off from deep inside an organisation then the chances are it will remain within criminal circles. This isn’t secure, of course, but it’s something beyond the ken of a search engine.

By contrast, unsecured data accidentally left in an open state can be picked up by search engines pretty quickly. This is the problem Google is trying to fix. Equally, just stopping Google from displaying unsecured data in its search results doesn’t mean that data is gone from the internet. It may still be visible in the results of other search engines and, even if it isn’t, it can still be found by more laborious means.

For US citizens at least, this must seem like a paradoxical state of affairs. Medical providers are governed by the federal Health Insurance Portability and Accountability Act (HIPAA) which sets out strict rules about how private medical data should be handled and accessed.

Unfortunately, the minute that data is breached in some way it enters a netherworld where protection is assumed to have stopped in a practical if not a legal sense.

In 2015, the Office for Civil Rights (OCR) recorded that US providers alone were affected by 253 medical data breaches, involving a combined 112 million records.

The bulk of those records, roughly 99 million of the 112 million, came from just three large incidents: Anthem (78 million), Premera (11 million), and Excellus (10 million), all the result of hacking rather than accidental loss. The biggest of these, Anthem, was later deemed to have originated from a single malicious email opened by one person.

What Google doesn’t say is how it is going to scrub data from searches. Google search, like all search engines, operates as a black box.

Monitoring suggests that Google’s algorithms are frequently tweaked and updated, but getting the search giant to admit that it’s searching differently, let alone how or why, is almost as rare as unicorn dressage.

It’s possible to understand the crawling that happens at one end and see what comes out the other, but what happens in the middle remains a trade secret.

Having spent years indexing every corner of the public web it thinks users and advertisers might be interested in, Google is gradually introducing exceptions. This is hardly the great retreat from Moscow, more an admission that searching and indexing everything has downsides after all:

“We want to organize the world’s information and make it universally accessible, but there are a few instances where we will remove content from Search.”

Thankfully that now includes your health records.