The engineers, eight in all, started jumping in: “Pepperoni.” “Half cheese.” “Caesar salad.” Emboldened by the result, they peppered Viv with more commands: Add more toppings. Remove toppings. Change medium size to large.
About 40 minutes later — and after a few hiccups when Viv confused the office address — a Pizz’a Chicago driver showed up with four made-to-order pizzas.
The engineers erupted in cheers as the pizzas arrived. They had ordered pizza, from start to finish, without placing a single phone call and without doing a Google search — without any typing at all, actually. Moreover, they did it without downloading an app from Domino’s or Grubhub.
Of course, a pizza is just a pizza. But for Silicon Valley, a seemingly small change in consumer behavior or design can mean a tectonic shift in the commercial order, with ripple effects across an entire economy. Engineers here have long been animated by the quest to achieve the path of least friction — to use the parlance of the tech world — to the proverbial pizza.
The stealthy, four-year-old Viv is among the furthest along in an endeavor that many in Silicon Valley believe heralds that next big shift in computing — and digital commerce itself. Over the next five years, that transition will turn smartphones — and perhaps smart homes and cars and other devices — into virtual assistants with supercharged conversational capabilities, said Julie Ask, an expert in mobile commerce at Forrester.
Powered by artificial intelligence and unprecedented volumes of data, they could become the portal through which billions of people connect to every service and business on the Internet. It’s a world in which you can order a taxi, make a restaurant reservation and buy movie tickets in one long unbroken conversation — no more typing, searching or even clicking.
Viv, which will be publicly demonstrated for the first time at a major industry conference on Monday, is one of the most highly anticipated technologies expected to come out of a start-up this year. But Viv is by no means alone in this effort. The quest to define the next generation of artificial-intelligence technology has sparked an arms race among the five major tech giants: Apple, Google, Microsoft, Facebook and Amazon.com have all announced major investments in virtual-assistant software over the past year.
Two of them, Google and Facebook, have made offers to buy Viv, according to people familiar with the matter. (Facebook chief executive Mark Zuckerberg is also an investor in Viv through the firm Iconiq Capital.)
Viv also has the ultimate pedigree in the elite universe of technologists who strive to build machines that can talk to people. Its creators, Dag Kittlaus and Adam Cheyer, were also co-founders of Siri, the app that became the first widely distributed virtual assistant when it was acquired by Apple in 2010.
“It’s about taking the way that humans have naturally interacted with each other for thousands of years and applying that to the way they interact with services,” said Kittlaus, Viv's chief executive. “Everyone knows how to hold a conversation.”
The goal is not just to build great artificial intelligence. Companies see in this effort the opportunity to become the ultimate intermediary between businesses and their customers.
Search engines were among the first of these “platforms,” enabling Google to generate a fortune from organizing the vast array of Web pages for ordinary users. Then, with the rise of smartphones, came apps that pulled consumers out of desktop search into the mobile world. Apple and Google raced to become the gatekeepers of these smartphone programs by building app stores that take a cut of the profits.
But despite apps growing into a $50 billion business, consumer enthusiasm for most new apps is waning, according to ComScore and the analytics company App Annie.
“Little siloed chiclets, none of which speak to each other, living inside the walled gardens of rival app stores owned by Apple and Google,” said John Battelle, a Web entrepreneur and the chairman of digital-ad company Sovrn Holdings.
Too much data used up, too many passwords to remember, too many useless notifications, concluded Dan Grover, product manager at WeChat — the popular Chinese messaging platform that is itself helping to make many apps irrelevant — in a recent blog post.
Mobile users now spend 80 percent of their time in just five apps, according to 2015 data from Forrester. “It’s just too inconvenient for consumers to hop in and out of so many apps,” Ask said. “So consumers are consolidating where they spend their time. There’s now a much bigger bar to get over if you’re going to build an app.”
Chris Messina, developer-experience lead at Uber, one of the most highly valued apps on the market, said that “apps will still have a place. But the landscape is going to get a lot broader.”
Virtual assistants offer an alternative. But the difficulty, stemming back to the early artificial-intelligence efforts in the 1960s, has always been understanding the nuances of how humans talk.
Most virtual assistants today can understand a set of human questions. But those queries have to be stated in a precise way, and they trigger largely scripted responses. What distinguishes Viv is that it aims to mimic the spontaneity and knowledge base of a human assistant, said Oren Etzioni, chief executive of the Allen Institute for Artificial Intelligence in Seattle.
By working with data from movie-ticket vendors, it can understand the multitude of ways people can ask it to buy movie tickets. It can look up showtimes and, on its own, suggest entertainment alternatives from other vendors if the desired showing is sold out. And it can compare prices and then buy the tickets, along with making a restaurant reservation beforehand. If the user changes her mind, the assistant can take care of the cancellations and let her know it's done.
Grubhub chief executive Matt Maloney said he rushed to sign up with Viv two years ago, impressed with the idea of allowing consumers to perform different activities without having to toggle between services. “No one has been able to say, ‘I want the movie ticket, and the bottle of wine, and some flowers on the side’ — all in one breath,” he said.
Achieving that level of communication is a very high bar, Etzioni said. And no technologist has come close to achieving it. In a way, Viv’s founders are among the staunchest adherents to the original Turing Test — the proposition, laid out by artificial-intelligence pioneer Alan Turing over a half century ago, that a machine has achieved intelligence if it can carry on a conversation that is indistinguishable from a human one.
“If it were anybody else, I’d say it was probably too ambitious,” Etzioni said of the Viv team. “If anybody has a shot at doing this, it’s them.”
Viv’s 26-person team has been toiling away for longer than just about anyone else. The effort preceded Siri and stems back to 2003, when Cheyer led a 300-person team at SRI International, the nonprofit, government-funded research-and-development lab in Palo Alto, working on a sprawling Defense Department project to create a next-generation personal assistant.
Kittlaus, an SRI colleague and former Motorola executive, persuaded Cheyer to build the technology into a mobile app after he saw the popularity of smartphones. (Kittlaus, who is Norwegian American, named the product Siri after a former co-worker — he liked that the Nordic word meant “beautiful woman who leads you to victory.”)
Though Siri is known for her conversational skills — which included some dry wit and sass — there’s a lot she and other virtual assistants can’t do. Ask Siri to “buy me a ticket for the Beyoncé concert” and she’ll pull up a link to Ticketmaster’s Web page. Ask her to reserve a table at a restaurant near your house and she can pull up the time and date you requested, but you can’t book the reservation unless you have the OpenTable app installed.
That wasn’t how it was supposed to be, Kittlaus said. The original Siri wasn’t supposed to be a clever AI chatbot. The goal was to reinvent mobile commerce itself. When it initially launched as an independent app in 2010, Siri could buy tickets, reserve tables and summon a taxi — all the while bypassing search pages and without a user having to open or download another app. She was able to siphon data from 42 Web services, including Yelp, StubHub, OpenTable and Google Maps.
But nearly all of the partnerships were dissolved once Apple took over. To build them, Kittlaus had essentially gone door-to-door to various tech companies asking for permission to connect to their stores of proprietary data. Kittlaus and Cheyer, who became close with Apple's Steve Jobs before his death in 2011, will not discuss what happened beyond this from Kittlaus: “Steve had some ideas about the first version, and it wasn’t necessarily aligned with all the things that we were doing.” Kittlaus quietly left Apple the following year. A third of the original Siri engineering team members, including Cheyer, eventually followed him and are now building Viv.
Viv “is what they wanted Siri to become — an open system,” said Bart Swanson, adviser at the venture-capital firm Horizons Ventures and an investor in Viv, Siri and other artificial-intelligence technologies.
Today, Viv has replicated its pizza experiment with about 50 partners. You can tell Viv to order a car and it will present your nearby options using data from Uber. Viv will order flowers using data from a service called FTD. Viv will turn lights on and off via a home automation platform called Ivee. Other partners include SeatGuru, Zocdoc and Grubhub. Kittlaus is talking to television companies, car companies, media companies and makers of smart refrigerators in his quest to unite all of them into a single, unbroken conversation. The data from these services enables the Viv brain to seem “intelligent.”
Maloney, the Grubhub chief executive, said he liked the idea of getting access to voice and conversational technology without having to build it himself.
The prospect of a new channel that would bypass the primary gatekeepers for apps — Apple and Google’s app stores — was also attractive. “Right now, the main conduits to a consumer are owned by Google and Apple,” he said. “My job is to get my restaurants in front of people. This gives us a new route.”
The landscape has changed dramatically since Kittlaus and Cheyer released Siri and even more so since they, along with a third co-founder, Chris Brigham, started building Viv. For example, Amazon, which last year released its conversational virtual assistant Alexa — a cylindrical device for the home — has opened its capabilities to third parties. You can now order an Uber car by talking aloud to Alexa in your home, and she can read you news, weather and traffic information. Alexa not only bypasses apps and Google — she bypasses the smartphone itself. (Amazon chief executive Jeffrey P. Bezos owns The Washington Post.)
Facebook, meanwhile, is trying to turn its popular messaging app, Messenger, into a portal for businesses. At its annual developer conference last month, Facebook enabled a handful of companies such as Expedia and 1-800-Flowers.com to conduct basic customer service over chat on Messenger. Early reviews found the product to be cumbersome, but businesses see big possibilities. In an interview, Expedia chief executive Dara Khosrowshahi said chatbots and artificial intelligence have the ability to return online travel to the roots of the traditional agent, who knew the customers and their preferences.
Facebook’s initiative takes a page from WeChat and Telegram, two wildly popular apps in China and Europe that have implemented similar systems to great effect. In China, it’s common for a young person to order movie tickets over chat.
There are also major efforts to integrate services that don’t involve conversation or chat. Wand Labs, a start-up founded by a former Google executive who led the company’s virtual-assistant project, Google Now, is one of them. It enables users of messaging applications to send a friend an icon that contains a bit of information, such as a playlist from Spotify or a password to a home WiFi account. The receiver can make use of it in a single click, without being directed to or forced to download an app.
“For a lot of things, it’s just easier to click on something than to have a conversation about it,” said Wand Labs CEO Vishal Sharma.
Sharma has a point. While entrepreneurs such as Kittlaus and Cheyer are caught up in the quest to create the ultimate conversational interface, consumers may glom onto simpler methods that have nothing to do with conversation. “The Turing Test is just a bad design, and it kind of set the industry off on the wrong foot,” said Phil Libin, a venture capitalist who has funded virtual-assistant start-ups.
With such promising technology and partnerships already emerging, the biggest challenge for Kittlaus and Cheyer will be to find a distribution model that gets Viv into the hands of as many people as possible — without compromising the vision.
The two faced a similar choice six years ago, when Jobs offered to buy their little-known app and distribute it to millions of people. Jobs took them to his home in Palo Alto, and the group talked for three hours by the fireplace. They left his home convinced that they shared a vision. It didn’t turn out quite that way.
Today Kittlaus and Cheyer find themselves in a similar position: Do they sell to a giant or go it alone?
“Our goal is ubiquity,” Kittlaus said. “There’s no way to predict where that goes except to say we'll pick the path that gets us there. Either way, we will finish the job.”
Correction: Earlier versions of this article misspelled the names of Matt Maloney and Phil Libin.