‘Big data’ from social media, elsewhere online redefines trend-watching
From a trading desk in London, Paul Hawtin monitors the fire hose of more than 340 million Twitter posts flying around the world each day to try to assess the collective mood of the populace.
The computer program he uses generates a global sentiment score from 1 to 50 based on how pessimistic or optimistic people seem to be from their online conversations. Hawtin, chief executive of Derwent Capital Markets, buys and trades millions of dollars of stocks for private investors based on that number: When everyone appears happy, he generally buys. When anxiety runs high, he sells short.
Hawtin has seen a gain of more than 7 percent in the first quarter of this year, and his method shows the advantage individuals, companies and governments are gaining as they take hold of the unprecedented amount of data online. Traders such as Hawtin say analyzing mathematical trends on the Web delivers insights and news faster than traditional investment approaches.
“Big data,” as it has been dubbed by researchers, has become so valuable that the World Economic Forum, in a report published last year, deemed it a new class of economic asset, like oil.
“Business boundaries are being redrawn,” the report said. Companies with the ability to mine the data are becoming the most powerful, it added.
While the human brain cannot comprehend that much information at once, advances in computer power and analytics have made it possible for machines to tease out patterns in topics of conversation, calling habits, purchasing trends, use of language, popularity of sports, spread of disease and other expressions of daily life.
“This is changing the world in a big way. It enables us to watch changes in society in real time and make decisions in a way we haven’t been able to ever before,” said Gary King, a social science professor at Harvard University and a co-founder of Crimson Hexagon, a data analysis firm based in Boston.
The Obama campaign employs rows of people manning computers that monitor Twitter sentiment about the candidates in key states. Google scientists are working with the Centers for Disease Control and Prevention to track the spread of flu around the world by analyzing what people are typing into its search engine. And the United Nations is measuring inflation through computers that analyze the price of bread advertised in online supermarkets across Latin America.
Many questions about big data remain unanswered. Concerns are being raised about personal privacy and how consumers can ensure that their information is being used fairly. Some worry that savvy technologists could use Twitter or Google to create false trends and manipulate markets.
Even so, sociologists, software engineers, economists, policy analysts and others in nearly every field are jumping into the fray. And nowhere has big data been as transformative as it has been in finance.
Wall Street is all about information advantage. Every little bit could mean the difference between a bonanza and a devastating loss, and so big data is being fed into computers to power high-frequency trading algorithms, and directly to traders in every way imaginable.
Hedge funds are experimenting with scanning comments on Amazon product pages to try to predict sales. Banks are tallying job listings on Monster as an indicator of hiring. Investment firms are conducting computer analyses of the financial statements of public companies to search for signs of a bankruptcy.
Why wait for the government to release official numbers on auto sales, home sales and retail sales when the trends could be gleaned weeks or even months earlier by analyzing publicly available data online?
Five years ago, only 2 percent of investment firms were incorporating Twitter analysis and other forms of “unstructured” data into their trading decisions, according to a report by Adam Honore, a research director at Aite, a financial services consulting group based in Boston. By 2010, the share of companies experimenting with this technology jumped to 35 percent. Today, Honore said, that number is closer to 50 percent.
“Big data is fundamentally changing how we trade,” Honore said.
‘Data in motion’
Richard Tibbetts, chief technology officer at StreamBase, a Lexington, Mass., company that provides tools for analyzing large amounts of data, calls it “examining data in motion.” The trick is to be able to find the digital smoke signals amid all the other stuff. Traders who were analyzing Twitter for unusual activity, for instance, were able to get the news of Osama bin Laden’s death and a massacre in Norway hours before the information was officially confirmed, giving them a significant jump on their colleagues who learned of the events through traditional news sources.
“The new generation of trader expects to have dozens of tools at their fingertips instead of just a Bloomberg terminal,” Tibbetts said.
Hawtin began experimenting with trading on a social-media sentiment algorithm in the spring of 2011, tapping $40 million from his now-closed hedge fund. He has repeatedly warned potential investors that there is a high level of risk. “It’s a very new area we don’t fully understand yet,” he said. But the interest in his project was so great that in April he began offering his technology to retail investors.
In addition to its efforts to gauge the collective mood of the world, the company now examines messages on Twitter, Facebook and other social-media outlets to create measures for individual stocks and commodities.
On a recent weekday, Hawtin was studying his global sentiment monitor when he noticed something troubling: a surge in anxiety after two days of relative calm.
After deliberating for a few minutes, he decided it was too early to take any action. If the anxiety continued to trend up the following day, he said, he would probably start selling.
“There’s a delay between how you’re feeling about your economic situation and having that sentiment turned into a decision like buying or selling a stock or adjusting your portfolio,” he said.
The numbers support Hawtin’s strategy, at least so far. His investors beat the main London stock index sevenfold in the first quarter of this year.
But programs such as Hawtin’s are only as good as the data being entered, and a growing backlash against big data may threaten the flow of that information.
Companies and governments are pushing the envelope in the use and reuse of data in ways not originally intended, and privacy groups are pushing back. Even the basic definition of personal data varies widely from one country to another, making it unclear how it can be used. The regulatory framework has not caught up with the technology.
Tim Berners-Lee, a founder of the World Wide Web, has become so concerned about the misuse of personal information by companies and governments that he has warned people to be cautious about what they put online. The data sets are so large that they are normally analyzed in aggregate, but privacy advocates worry that information can still be tied to individuals.
Civil liberties groups have sued to stop a U.S. government program that monitors social media data for national security threats, arguing that it could be used to unjustly label people as bad credit risks — or even terrorists — and chill free speech.
There is also the danger of what scholars call information asymmetry, where certain parties have an unfair advantage because they have better information than others — a phenomenon that some have argued shakes the foundation of a market economy.
“It increases opportunities for those who are already richer and disadvantages those that are poor,” said Jay Stanley, a lawyer with the American Civil Liberties Union in Washington.
Beyond the civil liberties issue, data streams can be manipulated. You can spam Twitter streams with positive words about a stock to make it look as if there is a groundswell of optimism about the company. Or you can use the same techniques to try to sink a stock.
Vagelis Hristidis, an associate professor of computer science at the University of California at Riverside, is the lead author of a paper detailing another investment strategy based on Twitter. During a four-month simulation, his approach outperformed other baseline strategies and indexes, including the Dow Jones industrial average, by between 1.4 percent and 11 percent.
“A model that predicts the stock market,” Hristidis said, “can only be successful as long as people don’t know about it.”