Concerns over privacy and data anonymity lead Netflix to abandon contest on improving movie recommendations
In a move reported by the Wall Street Journal online, online movie rental powerhouse Netflix announced that it is canceling a second planned contest intended to help the company improve its movie recommendations to members. The decision responds to concerns raised by the Federal Trade Commission and comes in the wake of a settled lawsuit. As part of the first contest, which launched in 2006, concluded in 2009, and is credited by Netflix with improving its recommendation system by 10 percent, Netflix made available a database of member movie ratings, rental dates, and unique subscriber ID numbers. It had promised to add customer demographics such as age, gender, and zip code for the second iteration of the contest.

The data were supposed to be sufficiently anonymized to protect Netflix member privacy, but University of Texas researchers Arvind Narayanan and Vitaly Shmatikov demonstrated that Netflix customers could be identified by comparing the member ratings in the Netflix-provided datasets with publicly posted ratings such as those on the Internet Movie Database website. Narayanan and Shmatikov published a paper describing the process they used to “re-identify” the anonymized Netflix customers in the datasets.

One member, alleging that Netflix had caused her sexual orientation to become known, claimed in a class action lawsuit that Netflix had violated its own privacy policy with respect to guarding customers’ personal information. Such a claim (when it has merit) is usually sufficient to get the FTC involved, inasmuch as violations of stated privacy policies can be considered unfair and deceptive trade practices, which are prohibited under Section 5 of the FTC Act. The case also has implications well beyond Netflix: it adds evidence to the argument that anonymization of personal records can be reversed through correlation with third-party data.
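To make the idea of correlating "anonymized" records with third-party data concrete, here is a minimal sketch in Python. It is a toy illustration only, not the procedure Narayanan and Shmatikov published: the function names, the tolerance values, and all of the sample data are assumptions invented for this example. It counts how many movies in a public profile line up with each anonymized record (similar rating, nearby date) and reports a subscriber ID only when one record clearly stands out.

```python
from datetime import date

def match_score(anon_ratings, public_ratings, rating_tol=1, day_tol=14):
    """Count movies rated similarly, and near the same date, in both records.

    The tolerances are arbitrary assumptions chosen for this illustration.
    """
    score = 0
    for movie, (stars, when) in public_ratings.items():
        if movie not in anon_ratings:
            continue
        anon_stars, anon_when = anon_ratings[movie]
        close_rating = abs(anon_stars - stars) <= rating_tol
        close_date = abs((anon_when - when).days) <= day_tol
        if close_rating and close_date:
            score += 1
    return score

def reidentify(anon_db, public_ratings, min_score=2, min_gap=1):
    """Return the best-matching subscriber ID, or None if no clear winner."""
    scored = sorted(
        ((match_score(ratings, public_ratings), sid)
         for sid, ratings in anon_db.items()),
        reverse=True,
    )
    if not scored:
        return None
    best_score, best_id = scored[0]
    runner_up = scored[1][0] if len(scored) > 1 else 0
    # Require both a minimum overlap and a margin over the next candidate.
    if best_score >= min_score and best_score - runner_up >= min_gap:
        return best_id
    return None

# Hypothetical "anonymized" release: subscriber ID -> {movie: (stars, date)}.
anon_db = {
    1001: {
        "Movie A": (5, date(2005, 3, 1)),
        "Movie B": (2, date(2005, 3, 9)),
        "Movie C": (4, date(2005, 4, 2)),
    },
    1002: {
        "Movie A": (3, date(2005, 6, 1)),
        "Movie D": (5, date(2005, 6, 3)),
    },
}

# Hypothetical ratings one named person posted on a public site.
public_profile = {
    "Movie A": (5, date(2005, 3, 3)),
    "Movie C": (4, date(2005, 4, 5)),
}

print(reidentify(anon_db, public_profile))
```

Even in this toy setting, two overlapping ratings are enough to single out one record, and once the record is linked to a name, every other rating in it (here, "Movie B") is exposed as well. That is the sense in which removing names alone does not make such a dataset anonymous.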