No sooner has Netflix made an interactive TV show than people are pulling apart its privacy implications and fretting about its potential to leak private information. Research published last week said that it is possible to deduce viewers’ choices from the platform’s interactive TV shows, like Bandersnatch.
After a couple of smaller projects, Bandersnatch was Netflix’s first big foray into interactive TV. Set in 1984, the episode in Charlie Brooker’s Black Mirror series lets the viewer control the actions of young video games programmer Stefan Butler, who idolises established games programmer Colin Ritman. The decisions include seemingly innocuous ones, such as which cereal to eat, and they guide you down a range of paths concluding in one of several endings for the story.
It’s an idea that anyone who grew up on the Choose Your Own Adventure and Fighting Fantasy book series will warm to. Unlike the books, Netflix records your story choices digitally, and the researchers believe that could pose a privacy problem.
According to their paper, although Netflix uses end-to-end encryption to send those choices from your viewing device to its servers, communication flaws still make it possible to snoop on what you choose. The paper says:
Recent advancements in the domain of encrypted network traffic analysis make it possible to infer basic information about the preferences of Netflix viewers.
The researchers realised that viewers’ devices indicated their choices by sending a JSON file (JSON is a human-readable text format commonly used to pass data between web clients and servers). For each choice, the device would send one of two different types of JSON file, depending on what the user chose. By working out the JSON file type and the point in the program when it was sent, they could work out the users’ choices.
Netflix encrypts those JSON files using the SSL encryption mechanism, but they got around that by looking at the record length of each SSL request in bytes. The lengths almost always fell into distinct ranges, meaning that they could identify the two types of JSON files – and therefore the viewer’s choice – 96% of the time.
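To make the idea concrete, here is a minimal sketch of that kind of length-based guessing. The byte ranges and the labels are made up for illustration; the article does not give the real values, and this is not the researchers’ code.

```python
from typing import List, Optional

# ASSUMPTION: hypothetical byte ranges for the two kinds of encrypted
# record. The real ranges are not given in the article.
TYPE_1_RANGE = range(2200, 2220)
TYPE_2_RANGE = range(2990, 3020)


def classify_record(length: int) -> Optional[str]:
    """Guess which JSON type an encrypted record carries from its size alone."""
    if length in TYPE_1_RANGE:
        return "type-1"
    if length in TYPE_2_RANGE:
        return "type-2"
    return None  # unrelated traffic, e.g. video chunks or handshakes


def infer_choices(record_lengths: List[int]) -> List[str]:
    """Keep only the records that fall into one of the two known ranges."""
    labels = (classify_record(length) for length in record_lengths)
    return [label for label in labels if label is not None]


# Record lengths as an eavesdropper might see them, in order of arrival.
observed = [517, 2211, 1460, 3005, 2213, 88]
print(infer_choices(observed))  # ['type-1', 'type-2', 'type-1']
```

The snooper never decrypts anything here: the sizes alone, read in order, are enough to reconstruct a sequence of guesses.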
The problem is readily fixable, the researchers concluded:
An easy fix for the problem would be to either split the JSON file or to compress it so that it becomes indistinguishable.
Does it matter?
So, that’s the technical bit done with. The real question is: who cares? The research team thinks that it’s a potential issue:
The choices made and the path followed can potentially reveal viewer information that ranges from benign (e.g., their food and music preferences) to sensitive (e.g., their affinity to violence and political inclination).
True, the choices you make in Bandersnatch could identify you as a Thompson Twins or a ‘Now That’s What I Call Music’ fan (that’s one of the choices you get to make). Some would argue that this doesn’t matter all that much. However, the paranoid may worry that future shows – and Netflix is planning more – could get you to reveal more about yourself.
While it’s possible that third-party network providers could slurp your SSL packets to work out your choices and use them to try to infer things about you, it’s more likely that Netflix itself would use this data to understand the choices its audiences make at an aggregate level – and of course it doesn’t need to snoop on its own data.
This could enable its production partners to factor that feedback into their writing. Are more people choosing confrontational or peaceful paths? What percentage of its audience choose the romantic ending rather than the sad one?
The best comment on the whole affair comes from one Register reader in the comment section:
Maybe Charlie Brooker should write an episode of Black Mirror about how someone’s Bandersnatch choices turned them into a social pariah.
What do you think? Are you worried about Netflix – or anyone else – monitoring your interactive TV choices?
6 comments on “Researchers fret over Netflix interactive TV traffic snooping”
SSL? Aren’t they using TLS? And doesn’t TLS v1.3 help with this? Yes, it’s relatively new, but Netflix should have the budget to jump on it and nail this.
The paper (which is, ah, kind of perfunctory – perhaps it was a brief class project) refers to “SSL”. I assume that the authors used SSL in a very generic way – just as we still call the famous TLS library “OpenSSL”, or talk about “taping” shows or “dialling” phones.
So let’s assume some flavour of TLS was used. The point is that the authors claim (though the paper is based on a very small sample) that the timing and packet sizes tend to give your choices away. That’s it. Netflix can increase the disguise and obfuscation in all of this easily enough. However, stopping default choices from standing out (that’s where the viewer doesn’t respond at all for 10 seconds) would reduce the interactivity for viewers who reply quickly, because the system would have to wait for the full timeout period every time to keep the time-to-choose delay constant.
Could an easy fix for Netflix be to just insert a random-sized buffer into the JSON payload?
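A rough sketch of what that random padding could look like. The field name `_pad`, the `max_pad` parameter and the payload shape are all hypothetical; nothing here reflects Netflix’s actual message format.

```python
import json
import secrets
import string


def pad_payload(choice: dict, max_pad: int = 1024) -> bytes:
    """Serialise a choice with a random-length filler field added.

    ASSUMPTION: "_pad" is a made-up field name that the server would ignore.
    """
    pad_len = secrets.randbelow(max_pad + 1)  # 0..max_pad inclusive
    filler = "".join(secrets.choice(string.ascii_letters) for _ in range(pad_len))
    padded = dict(choice, _pad=filler)
    return json.dumps(padded).encode("utf-8")


# The same choice now produces a different length on most sends.
print(len(pad_payload({"choice": "A"})))
print(len(pad_payload({"choice": "A"})))
```

Random padding only blurs the signal if `max_pad` is larger than the size gap between the two message types; otherwise the two length ranges still do not overlap and can be told apart.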
There’s also the issue of timing – you have to make sure that a snooper can’t distinguish between “user pressed a button to make a choice” and “timed out, so choosing defaults”.
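One way to picture the fixed-delay idea is below: the report always goes out at the end of the timeout window, whether the viewer pressed a button or not. This is a sketch only; the function and parameter names are invented, and the 10-second figure is the one quoted in the comments.

```python
from typing import Optional, Tuple

TIMEOUT_SECONDS = 10.0  # the no-response timeout mentioned in the comments


def schedule_report(
    prompt_time: float,
    press_time: Optional[float],
    choice: str,
    default: str,
) -> Tuple[float, str]:
    """Return (send_time, value_to_send) for one on-screen decision.

    press_time is None when the viewer never responded. The send time is
    the same in every case, so timing alone reveals nothing.
    """
    timed_out = press_time is None or press_time - prompt_time > TIMEOUT_SECONDS
    value = default if timed_out else choice
    return prompt_time + TIMEOUT_SECONDS, value


# A quick responder and a non-responder both report at t = 110.0.
print(schedule_report(100.0, 102.5, "B", "A"))  # (110.0, 'B')
print(schedule_report(100.0, None, "B", "A"))   # (110.0, 'A')
```

The cost is exactly the one raised above: quick responders gain nothing on the network side, because the message is held back until the window closes.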
Presumably there’s also going to be an issue with how long each chosen segment lasts – if the film segment that plays when you elect to pay the bribe takes 2’45” but the segment that plays when you turn and run away takes 3’16” then that is leaking information about your choice, too.
Sneaky – Didn’t think of the length of the segment!
Isn’t an easier answer just to ensure the file sizes are the same? As it is, each decision is a two-choice question, so it seems weird that decisions aren’t just being stored as “1 or 0” or another binary representation?