Debates are raging about whether big data still holds the promise once expected of it or whether it was just a big bust. The failure of the much-hyped Google Flu Trends to accurately predict peak flu levels since August 2011 has heightened the concerns.
Over the centuries, we gathered data on things such as climate, demographics, and business and government transactions. Our farmers kept track of the weather so that they would know when to grow their crops; we had land records so that we could own property; and we developed phone books so that we could find people. About 15 years ago we started creating Web pages on the Internet. Interested parties started collecting data about what news we read, where we shopped, what sites we surfed, what music we listened to, what movies we watched, and where we traveled to. With the advent of LinkedIn, MySpace, Facebook, Twitter and many other social-media tools, we began to volunteer private information about our work history and social and business contacts and what we like—our food, entertainment, even our sexual preferences and spiritual values.
Today, data are accumulating at exponentially increasing rates. There are more than 100 hours of video uploaded to YouTube every minute, and even more video is being collected worldwide through the surveillance cameras that you see everywhere. Mobile-phone apps are keeping track of our every movement: everywhere we go; how fast we move; what time we wake. Soon, devices that we wear or that are built into our smartphones will monitor our body’s functioning; our sequenced DNA will reveal the software recipe for our physical body.
The NSA has been mining our phone metadata and occasionally listening in; marketers are correlating information about our gender, age, education, location, and socioeconomic status and using this to sell more to us; and politicians are fine-tuning their campaigns.
This is baby stuff compared to what lies ahead. The available tools for analyzing data are still crude; there are very few good data scientists; and companies such as Google still haven’t figured out which data are best to analyze. This will surely change rapidly as artificial-intelligence technologies evolve and computers become more powerful and connected. We will be able to analyze all the data we have collected from the beginning of time, as if we were entering a data time machine.
We will be revisiting crime cases from the past, re-auditing tax returns, tracking down corruption, and learning who were the real heroes and villains. An artificially intelligent cybercop scanning all the camera data that were gathered, as well as phone records, e-mails, bank-account and credit-card data, and medical data on everyone in a city or a country, will instantly solve a crime better than Sherlock Holmes could. Our grandchildren will know of the sins we committed; Junior may wonder why grandpa was unfaithful to grandma.
What is scary is that we will lose our privacy, opening the door to new types of crime and fraud. Governments and employers will gain more control over us, and corporations will reap greater profits from the information that we innocently handed over to them. More data and more computing will mean more money and power. Look at the advantage that bankers on Wall Street have already gained with high-frequency trading and how they are skimming billions of dollars from our financial system.
We surely need stronger laws and technology protections. And we need to be aware of the perils. We must also realize that with our misdeeds, there will be nowhere to hide—not even in our past.
There are also many opportunities in this new age of data.
Consider what becomes possible if we correlate information about a person’s genome, lifestyle habits, and location with their medical history and the medications they take. We could understand the true effectiveness of drugs and their side effects. This would change the way drugs are tested and prescribed. And then, when genome data become available for hundreds of millions of people, we could discover the links between disease and DNA to prescribe personalized medications—tailored to an individual’s DNA. We are talking about a revolution in health and medicine.
In schools, classes are usually so large that the teacher does not get to know the student — particularly the child’s other classes, habits, and development through the years. What if a digital tutor could keep track of a child’s progress and learn his or her likes and dislikes, teaching-style preferences, and intellectual strengths and weaknesses? Using data gathered by digital learning devices, test scores, attendance, and habits, the teacher could be informed of which students to focus on, what to emphasize, and how best to teach an individual child. This could change the education system itself.
Combine the data that are available on a person’s shopping habits with knowledge of their social preferences, health, and location. We could have shopping assistants and personal designers creating new products, including clothing, that are 3D-printed or custom-manufactured for the individual. An artificial-intelligence-based digital assistant could anticipate what a person wants to wear or to eat and have it ready for them.
All of these scenarios will become possible, as will thousands of other applications of data in agriculture, manufacturing, transportation, and other fields. The only question is how fast we will get there, and what new nightmares we will create.