On Aug. 24, 2017, India’s Supreme Court issued a landmark judgment declaring privacy to be a fundamental individual right. In response, a committee headed by a former Supreme Court justice drafted a data protection bill to be presented to parliament later this year. The measure introduces a new type of entity, a “data fiduciary,” which will ensure that data is used only for designated purposes. Just as a financial fiduciary is bound to act in the best interest of the client, a data fiduciary (one or more regulated entities) will ensure, via an app, that the user has approved any transaction using his or her data.
That innovation may be useful elsewhere, given that data-driven artificial intelligence platforms are spreading across our lives, cutting across transportation, law enforcement, finance, defense and governance. The more we give machines access to our data — from our location to our social contacts — the better they get at making decisions, the more we trust them to do everything from reminding us of our flights to organizing our news feed, and the easier they make our lives.
But how much machines know about us will increasingly shape both individual privacy and democratic participation. For example, if a platform’s profit-maximizing algorithm cannot tell fake content from real, or malicious actors from ordinary users, we should expect its precision-targeting algorithms to continue being used to influence us, politically and otherwise.
The four models of data use
Four distinct models of data use have emerged in the Internet era. These approaches emphasize different goals — call them monetization, liability, control and participation. These correspond to the models of the United States, Europe/Japan, China and India, and each encourages a certain type of data use.
1. The U.S. approach emphasizes moneymaking
The U.S. Internet giants have focused on monetizing data. The U.S. intellectual property doctrine at work here is called “sweat of the brow,” which means that businesses are permitted to “own” the data they’ve invested in collecting, whether by observing Internet browsing patterns or through a credit bureau. That gives them an intangible asset with economic value, even if the asset cannot be valued on the balance sheet. In other words, although Facebook’s financial statements list its assets, data is not listed — even though it is Facebook’s real asset, responsible for its market valuation of a half-trillion dollars.
As a result, companies that established data hegemony before lawmakers or regulators could craft regulations about data now own a trove of knowledge about their hundreds of millions of users that is difficult for anyone else to create. Although it’s true that U.S. lawmakers and regulators are likely to create regulations that respond to the 2016 and 2018 Russian disinformation campaigns, the United States looks unlikely to significantly restrict any company’s ability to collect and sell data about its users.
2. The European approach imposes risks if companies misuse or lose individuals’ data
Meanwhile, this year the European Union’s new privacy law, the General Data Protection Regulation, took effect, protecting the privacy of the individual. The GDPR imposes costs and penalties on data collectors or processors that allow data to be misused, lost or stolen. What’s more, Europe limits the amount of data that businesses can collect in the first place: under the GDPR’s data-minimization principle, a data controller may “hold and process only the data that is absolutely necessary for the completion of its duties.”
“Absolutely necessary” is a strong standard. For example, a navigation application does not need to know your annual income, so if it collects such data, it is risking a penalty. Similarly, if an app infers something monetizable from your navigation patterns — such as where you work, live or shop — sharing or selling such data would be risky, because this would have nothing to do with your reason for using the app. Europe therefore makes sure businesses know that others’ data is not an asset to be used freely and for profit in perpetuity.
3. The Chinese approach emphasizes controlling data — and people
The Chinese government requires every entity doing business in China to host its data locally, and to give the government complete access to such data. As a result, it is relatively free to link and use all kinds of data as it wishes. For instance, China has been punishing traders or their companies for “undesirable” activities, even if their behavior does not violate the law. Using closed-circuit security camera footage, ID card checks, WiFi phone and computer connections, and health, banking and legal records, China’s government now has AI systems that can recognize anyone in the country in real time, and can link that identification to other data about them. Chinese police have been using this data to catch criminals and detect illegal financial activity, and to detain people preemptively if software predicts they’re likely to get involved in “subversive” political activity, Human Rights Watch reported this year.
4. India is treating critical digital platforms as public goods
To begin along this path, India launched the “Aadhaar” initiative in 2009, the largest biometrically based national identification database in the world, providing real-time authentication for 1.2 billion users. It assigns a random 12-digit number to each resident and associates it with fingerprint and iris scans to answer a single question: Are you who you say you are? The goal was to bring hundreds of millions of citizens who had no established identity or verifiable transaction history into the economic mainstream — for instance, allowing potential lenders to make loan decisions they couldn’t before.
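The yes/no authentication model described above can be sketched in a few lines of code. This is an illustrative simplification, not the real Aadhaar system: the registry, `enroll` and `authenticate` names are invented for this sketch, and a real deployment would use fuzzy biometric matching rather than exact hashes.

```python
import hashlib
import secrets

# Hypothetical in-memory registry mapping a 12-digit ID to a hash of the
# enrolled biometric template. Illustrative only; not the real Aadhaar API.
_registry: dict[str, str] = {}

def enroll(biometric_template: bytes) -> str:
    """Assign a random 12-digit number and store a hash of the biometric template."""
    uid = "".join(str(secrets.randbelow(10)) for _ in range(12))
    _registry[uid] = hashlib.sha256(biometric_template).hexdigest()
    return uid

def authenticate(uid: str, biometric_template: bytes) -> bool:
    """Answer only one question, yes or no: are you who you say you are?"""
    stored = _registry.get(uid)
    return stored == hashlib.sha256(biometric_template).hexdigest()
```

The design point worth noticing is how little the check reveals: the service answers true or false and returns no demographic or transactional data, which is what lets a minimal identity layer serve as shared infrastructure.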
In 2016, India passed a law requiring authorities to use the Aadhaar ID to distribute public subsidies and benefits. Aadhaar has quickly become India’s most-used ID.
Regardless of how India’s data protection bill shapes up as law, the new data fiduciary entity or entities that it creates will ensure that any data transaction is consensual — obtaining the individual’s consent, expressed explicitly in code that specifies how long that data can be used. Initially, data fiduciaries are expected to oversee transactions in financial services; later they’re expected to expand to health, employment and education, areas involving authentication and sensitive personal data.
If we want control over data, the Indian model offers a path toward individual data empowerment for the Internet age.
Vasant Dhar (@vasantdhar) is a professor at the Stern School of Business and the Center for Data Science, and director of graduate studies in data science at New York University. He is also the founder of SCT Capital Management, a machine-learning-based hedge fund in New York.
This article is one in a series supported by the MacArthur Foundation Research Network on Opening Governance that seeks to work collaboratively to increase our understanding of how to design more effective and legitimate democratic institutions using new technologies and new methods. Neither the MacArthur Foundation nor the network is responsible for the article’s specific content.