Sarah E. Igo is a professor of history at Vanderbilt University and the author of “The Averaged American: Surveys, Citizens, and the Making of a Mass Public” (2007).
Data-ism
By Steve Lohr
HarperBusiness. 239 pp. $29.99
Even if we aren’t certain what to make of big data, we all know that it’s, well, big. The phrase is suddenly everywhere, tripping off the tongue in domains from marketing to management to medicine. Newly massive and complex data sets, along with advanced methods of probing them, it is said, are remaking our world. Whether one believes that big data is the key to solving intractable social problems or, instead, the source of new ones — say, surveillance at an unprecedentedly intimate scale — we surely can agree that careful assessments of the phenomenon are in order.
Enter Steve Lohr, a longtime technology reporter for the New York Times. Lohr was investigating big data before it was Big Data, and that experience gives him a vantage point on what he calls “data-ism” (a word coined by his Times colleague David Brooks) that differs somewhat from those of other devotees and detractors. His book is a dispatch from the field, a chronicle of the present state of big data and the possible future it portends.
But first: What is data-ism? Lohr describes it as a transformative way of measuring and seeing, born of improved methods of analyzing data of all kinds, from browsing histories to GPS locations to genomic information. Powerful algorithms, as well as machine-learning software and artificial intelligence — technologies that can sense and communicate — are at its core. More important, Lohr calls data-ism “a point of view, or philosophy, about how decisions will be — and perhaps should be — made in the future.”
An advocate of the long view, Lohr argues persuasively that big data is just now coming of age and that it has a lot of maturing still to do. By these lights, the consumer Internet is simply the first, but not nearly the most profound, instance of data-ism. Bringing the cutting-edge technologies of Google and Facebook “to huge industries of the physical world, like medicine, energy, and agriculture, is a more difficult challenge — and ultimately a more significant achievement.” This is where Lohr’s hopes for a big data “revolution” lie, and his tour of the inroads that data-ism has made into a diverse set of organizations and industries is the most absorbing part of his book.
Lohr narrates big data’s career, and humanizes it, through biography — both individual and corporate. He profiles figures who have been pivotal in the development of data science as well as the businesses that have underwritten its uses. His touchstone is Jeffrey Hammerbacher, a Wall Street quant who became the first data scientist at Facebook and then the co-founder of Cloudera. Chagrined, as he put it, that “the best minds of my generation are thinking about how to make people click ads,” Hammerbacher moved on, putting his expertise to work on disease modeling at the Mount Sinai medical school in New York. Hammerbacher’s trajectory mirrors Lohr’s optimistic sense of where big data is headed: toward ever more efficient, beneficial, even lifesaving applications.
The other anchor for Lohr’s story is IBM, and especially its 2014 decision to “bet the company” on big data. Like Hammerbacher’s career zigzags, IBM’s deliberate makeover from a computer hardware firm to a data analytics company — encapsulated in its “Smarter Planet” campaign — provides an entry point for what Lohr sees as a deeper transformation of the entire economy. Traffic flow, millennial shopping habits, disease discovery, home temperature control and hotel management may not otherwise have much in common, but a “data-first” approach, he shows, has altered understandings of each.
One fascinating case is precision viticulture in California’s Central Valley, where a “data-guided” system employing sensors and satellite imaging has retooled grape-growing, vine by vine. The upshot: 25 percent more — and better — grapes, as well as a challenge to the wine industry truism about inevitable trade-offs between quality and quantity. Lohr characterizes this as a “layer of intelligence being added to the physical universe.” Pulling our attention away from social media and toward other data projects afoot, such as the “industrial Internet,” he alerts us to developments that may be more significant in the long run. An example is General Electric’s quest to make turbines and jet planes more energy efficient by facilitating their communication with other machines — what one computer scientist calls “Facebook for engines.” Along the way, Lohr proves an adept guide to contemporary big data debates over the importance of causality vs. correlation, as well as the proper balance between human and machine decision-making.
New technologies tend to spawn utopian and dystopian thinking in equal measure. For all his caveats about the unproven promise of big data, Lohr is clearly one of the enthusiasts. He has been captured by data-ism, evincing open admiration for those on its leading edge. Perhaps for this reason, he focuses more on the benign “stumbles” of big data than on the serious ones: the humorous misfires of IBM’s artificially intelligent computer system, Watson, as it trained to compete on the game show “Jeopardy!” (identifying Wonder Woman as the first female astronaut, for example) rather than the more consequential mistakes of financial-market modelers in 2008. Lohr’s cheery view of “ubiquitous connectivity” similarly leads him to give short shrift to the darker side of the ever-expanding hunger for data, from newly granular ways of tracking individuals to the prospect of “discrimination by statistical inference.” When he writes that “there is no opt-out from the big-data world,” he suggests that there is little reason, and perhaps little point, in resisting the coming algorithmic order.
And yet, in relaying his book’s themes through a collection of individual decision-makers, whether data scientists or corporate strategists, Lohr implicitly suggests how wrong it is to treat data-ism as an autonomous force. Data does not actually (yet) operate on its own, and big data is not, in fact, an actor in the traditional sense. Beyond the fine reporting in “Data-ism,” we will require careful thinking by the humans still in charge about the politics and ethics of data systems — those now in place and those yet to be designed. To Lohr’s questions — What is big data doing for us? And what might it do? — we will need to add another: What ought we not entrust to it?