Algorithmic wizardry already picks which movies we watch, which songs we listen to and which search results we see on Google. Increasingly, big data has converged on another field: what books we read — and even how those books are written.
Scribd, an ebook start-up that operates along the Netflix subscription model, has lifted the proverbial curtain on books and big data by releasing a stream of back-end insights on how (and what, and where) people read. It’s no secret, of course, that Web-savvy retailers like Amazon draw from readers’ past purchases and browsing history to recommend books. But Scribd, which claims its data will help people publish better books, takes that approach one step further — linking reading habits, as tracked by e-readers, to specific genres, demographics and geographies, and then making those links public.
Scribd can tell, for example, exactly how quickly a reader moves through a book and where that reader decides to give up. The company knows that people frequently finish biographies and that they get through erotica faster.
Scribd can also break books down along demographic and geographic lines; drawing on its data, for instance, the company compiled this list of the books that are relatively most popular in each state. And because Scribd tracks actual reading habits among its 80 million monthly readers, the list is based on how many times each book was actually completed, not merely how many times it was bought or downloaded. (As with last week’s viral state music map, a word of caution: The list shows which book is most popular in each state relative to other states, Scribd’s data team explained in an e-mail to The Post. It is not a raw ranking of which books are read the most, which would not vary much by state and would, ultimately, be pretty boring.)
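Scribd did not publish the formula behind its list, so the following is only a minimal sketch of one common way to measure "relative" popularity: compare a book's share of completed reads within a state to its share nationwide, and pick the book with the highest ratio. The function name, the data layout and the sample titles and counts are all hypothetical, invented purely for illustration.

```python
from collections import defaultdict


def most_distinctive_book(completions):
    """Return {state: book} for the book most over-represented in each state.

    `completions` maps (state, book) -> number of completed reads.
    Assumption: this ratio-of-shares method is illustrative, not Scribd's own.
    """
    # Ignore empty entries so the divisions below are always safe.
    counts = {key: n for key, n in completions.items() if n > 0}

    state_totals = defaultdict(int)
    book_totals = defaultdict(int)
    for (state, book), n in counts.items():
        state_totals[state] += n
        book_totals[book] += n
    grand_total = sum(state_totals.values())

    best = {}
    for (state, book), n in counts.items():
        state_share = n / state_totals[state]           # share within the state
        national_share = book_totals[book] / grand_total  # share across all states
        ratio = state_share / national_share
        if state not in best or ratio > best[state][1]:
            best[state] = (book, ratio)
    return {state: book for state, (book, _) in best.items()}


# Hypothetical counts: one bestseller dominates raw totals in both states.
sample = {
    ("Ohio", "Bestseller"): 90,
    ("Ohio", "Local Memoir"): 10,
    ("Texas", "Bestseller"): 95,
    ("Texas", "Local Memoir"): 1,
    ("Texas", "Ranch Cookbook"): 4,
}
result = most_distinctive_book(sample)
```

In this made-up sample, a raw ranking would name the same bestseller in both states, while the relative measure picks a different title for each, which is the distinction Scribd's data team was drawing.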
Larger retailers, like Amazon and Barnes & Noble, are a bit more circumspect about the reading data they track; both companies declined to share similar data with The Post. But Kashif Zafar, the director of ebooks for Barnes & Noble’s NOOK, said the company has actually moved well beyond demographics in its quest to give readers the perfect book. NOOK pioneered Netflix-style categories (clusters of books that match a certain mood or interest rather than a traditional genre) even before Netflix did. Now the company bolsters algorithms built on purchase behavior with actual human input from Barnes & Noble editors and other internal taste-makers.
“We’ve got very sophisticated algorithms that put the right book in front of the right people at the right time,” Zafar said. Barnes & Noble doesn’t even bother breaking its data down regionally, he added: “We’re more atomized at the customer level.”