We already know that if you threaten to shoot up a school on the ostensibly anonymous social media messaging platform Yik Yak, the law will come knocking, and that gossamer veil of not-really privacy will be shredded.
As Yik Yak says in its guidelines for law enforcement, it maintains a log with the following non-publicly available information for every message posted:
- The date and time of the interaction with the app, beginning with installation of the app
- The IP address used to access the app
- The GPS coordinates of the device used to access the app
- The user-agent string associated with the device used to access the app
- The user’s handle
Now, researchers have found that Yik Yak anonymity can be erased even without a warrant or Yik Yak’s compliance with US laws that force it to turn over user information. The researchers did it by relying on publicly available location data from the app, mixed with location-spoofing and message-recording on a device outfitted with simple machine learning.
Keith Ross, a Computer Science professor at New York University’s Tandon School of Engineering and Dean of Engineering and Computer Science at NYU Shanghai, will present a paper on his team’s findings, titled “You Can Yak But You Can’t Hide: Localizing Anonymous Social Network Users”, at the Association for Computing Machinery (ACM) Internet Measurements Conference in November.
The researchers set out to test Yik Yak’s susceptibility to localization attacks by conducting a series of experiments using “a comprehensive data collection and supervised learning methodology that does not require any reverse engineering of the Yik Yak protocol, is fully automated, and can be remotely run from anywhere.”
The experiments were a success:
We show that we can accurately predict the locations of messages up to a small average error of 106 meters.
We also devise an experiment where each message emanates from one of nine dorm colleges on the University of California Santa Cruz campus. We are able to determine the correct dorm college that generated each message 100% of the time.
Yik Yak, a free mobile app, allows users to create and view posts – called Yaks – within a 5-mile radius. The idea of limiting messages to people within the same general vicinity has made it enormously popular with college students, who’ve used it on campuses, and with younger students on grade school campuses.
Predictably enough, the anonymous chatting has all too often turned toxic, with bullies hiding behind their favorite shield: supposed anonymity.
In fact, many schools, and entire school districts, have banned Yik Yak. (Though they’ve admitted that the bans are a symbolic move, given that they don’t provide any actual, technological barrier to students accessing the app through their phones.)
As a writeup of the research paper from Tandon describes it, Yik Yak thrives on the promise of anonymity, whether a post is praise for a local restaurant or a bullying comment about teachers or peers.
The researchers reasoned that if it’s possible to locate the geographical origin of a Yik Yak comment, or “yak,” it might be possible to identify the person who posted it.
So, how easy would it be to unveil the people behind those comments?
Experiments showed that yaks can be localized through a “fairly simple machine learning algorithm” that an undergraduate computer science student could program and run in a matter of hours.
In fact, Ross and his team managed to trace a Yik Yak user to within 300 feet. In one experiment, they identified the college dormitories from which yaks originated with 100% accuracy.
The integrity of user anonymity is central to Yik Yak and similar anonymous social media apps, and this research shows that it’s possible for a third party to compromise it.
At this stage, we can narrow down a location to a building, which, when combined with other side information, could potentially de-anonymize the author of any given yak.
The experiments were actually conducted from China: specifically, Shanghai.
The researchers fed spoofed GPS coordinates to a phone, making it think it was located on one of the two US college campuses they ran the experiment on. That’s important, given that a yak won’t show up if a phone is outside Yik Yak’s 5-mile radius.
The researchers designed an automated system to place themselves – again, through the fake GPS coordinates – at different locations in and around the campuses and to then record which yaks were available at each location.
Using machine learning, their system then processed the yaks to predict the location from which each of the messages had been posted.
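To make the idea concrete, here is a minimal sketch of how visibility records from spoofed locations can pin down where a message came from. It is not the researchers' supervised learning method: it swaps in a much simpler geometric search, and it assumes that a yak is visible exactly when the viewer is within 5 miles of the poster, and that positions are given as flat east/north offsets in meters. All function names here are hypothetical.

```python
import math

# Assumption: visibility is a hard 5-mile disc around the poster (in meters).
VISIBILITY_RADIUS_M = 5 * 1609.344


def is_visible(origin, probe, radius_m=VISIBILITY_RADIUS_M):
    """True if a viewer at `probe` would see a yak posted at `origin`."""
    return math.dist(origin, probe) <= radius_m


def agreement(candidate, observations, radius_m=VISIBILITY_RADIUS_M):
    """Count the observations that a candidate origin would explain."""
    return sum(
        is_visible(candidate, probe, radius_m) == seen
        for probe, seen in observations
    )


def estimate_origin(observations, candidates, radius_m=VISIBILITY_RADIUS_M):
    """Average the candidate origins that explain the most observations."""
    scores = [agreement(c, observations, radius_m) for c in candidates]
    best = max(scores)
    winners = [c for c, s in zip(candidates, scores) if s == best]
    x = sum(w[0] for w in winners) / len(winners)
    y = sum(w[1] for w in winners) / len(winners)
    return (x, y)


def grid(lo, hi, step):
    """Square grid of (x, y) points in meters, inclusive of both ends."""
    return [(x, y) for x in range(lo, hi + 1, step)
            for y in range(lo, hi + 1, step)]


if __name__ == "__main__":
    # Simulated poster location; a real attack would not know this.
    true_origin = (400.0, -300.0)
    # Spoofed viewing positions, one every kilometer around the campus.
    probes = grid(-12000, 12000, 1000)
    observations = [(p, is_visible(true_origin, p)) for p in probes]
    guess = estimate_origin(observations, grid(-3000, 3000, 250))
    print("estimate:", guess, "error (m):", math.dist(guess, true_origin))
```

The point of the sketch is that every spoofed position where a yak appears or fails to appear rules out part of the map, so enough probes shrink the possible origin to a small area. The researchers' learned model plays the same role with real, noisier data.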
Imagine such a tool in the hands of a college professor who’s been bruised by anonymous comments, Ross suggested:
It wouldn’t be difficult for a professor to figure out the dorm from which a derogatory yak was posted, then couple this information with student housing information to de-anonymize the yak, and that’s concerning.
The researchers have notified Yik Yak about their findings.
Their recommendations to harden privacy protections against these types of localization attacks include improving localization authentication for users, which they say would make it easier to identify and block users who use forged GPS coordinates.
Another strategy: keep the 5-mile radius, but get rid of more specific location data, so that Yik Yak displays the exact same set of messages no matter where the app is being used on a campus.
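One way to picture that second recommendation is to round every location to a coarse grid cell before applying the radius, so that everyone inside the same cell gets the same feed. The sketch below is only an illustration under assumptions: the 2 km cell size, the flat east/north coordinates in meters, and the function names are all hypothetical, and nothing here reflects how Yik Yak actually works.

```python
import math

RADIUS_M = 5 * 1609.344  # the 5-mile radius, in meters
CELL_M = 2000.0          # hypothetical cell size, roughly campus-sized


def snap(point, cell_m=CELL_M):
    """Replace a precise position with the center of its grid cell."""
    return tuple((math.floor(c / cell_m) + 0.5) * cell_m for c in point)


def feed(viewer, yaks, radius_m=RADIUS_M, cell_m=CELL_M):
    """Return the yak texts visible to a viewer, using only snapped positions."""
    v = snap(viewer, cell_m)
    return [text for pos, text in yaks
            if math.dist(v, snap(pos, cell_m)) <= radius_m]


if __name__ == "__main__":
    yaks = [((100, 100), "a"), ((7900, 0), "b"), ((20000, 0), "c")]
    # Two viewers at opposite corners of the same cell get identical feeds.
    print(feed((10, 10), yaks))
    print(feed((1990, 1990), yaks))
```

Because the feed depends only on the cell, moving a spoofed phone around inside a cell reveals nothing new, which limits how finely an observer can localize a post.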