Computers, as a general rule, do only what they're told to do. They don't have artificial intelligence in the classic sense. They have no common sense. IBM's Gruhl, the chief architect of a new product called WebFountain, points out that no computer has ever learned what any 2-year-old human knows.
A computer, he says, can become easily confused by the sentence "Tommy hit a boy with a broken leg." The computer doesn't understand that a broken leg is not going to be an instrument used in an attack. "Common sense, how the world works, even something like irony, are very difficult for computers to understand," says Gruhl.
When Google first appeared, in 1998, it seemed like a throwback. Rather than a jazzy "portal," it was just a plain ol' boring search engine.
To achieve common sense, the Web needs to go through the infantile process of self-discovery. The Web doesn't really understand itself. There's lots of information on the Web, but not much "information about information," also known as "metadata."
If you're a robotic search engine, you look for words in the text of a page, but ideally the page would have all manner of encoded labels that describe who wrote the material, and why, and when, and for what purpose, and in what context.
Hendler explains the problem this way: If you type into Google the words "how many cows in Texas," Google will rummage through sites with the words "cow" and "many" and "Texas," and so forth, but you may have trouble finding out how many cows there are in Texas. The typical Web page involving cows and Texas doesn't have anything to do with the larger concept of bovine demographics. (The first Google result that comes up is an article titled "Mineral Supplementation of Beef Cows in Texas" by the unbelievably named Dennis Herd.)
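The mechanics behind Hendler's example can be sketched in a few lines. This is a deliberately naive toy, not how Google actually ranks pages, and the page texts are invented for illustration; it only shows why matching words is not the same as answering a question.

```python
# A toy keyword search: rank pages by how many distinct query words they
# contain. Real engines weigh links, word proximity, and much more; this
# sketch only illustrates the word-overlap idea.

def keyword_score(query: str, page_text: str) -> int:
    """Count how many distinct query words appear in the page text."""
    query_words = set(query.lower().split())
    page_words = set(page_text.lower().split())
    return len(query_words & page_words)

# Hypothetical page snippets (invented for this example).
pages = {
    "Mineral Supplementation of Beef Cows in Texas":
        "mineral supplementation of beef cows in texas by dennis herd",
    "Texas Cattle Inventory Report":
        "texas statewide cattle inventory estimated at several million head",
}

query = "how many cows in texas"
ranked = sorted(pages, key=lambda t: keyword_score(query, pages[t]),
                reverse=True)
```

Run against these snippets, the beef-supplementation page outranks the inventory report, because it shares more words with the query ("cows," "in," "texas") even though the other page is the one that actually answers the question about bovine demographics.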
Hendler, along with World Wide Web inventor Tim Berners-Lee, is working on the Semantic Web, a project to implant the background tags, the metadata, on Web sites. The dream is to make it easier not only for humans but also for machines to search the Web. Moreover, searches will go beyond text and look at music, films, and anything else that's digitized. "We're trying to make the Web a little smarter," Hendler says.
But Peter Norvig, director of search quality at Google, points out that the current keyword-driven searching system, clumsy though it may be, and heavily reliant on serendipity, still works well in most situations.
"Part of the problem is that keywords are so good," he says. "Most of the time the words do what you want them to do."
Billions of dollars are at stake in this race to build a better mousetrap, and Google faces serious challenges. Yahoo! has long had a partnership with Google, using it to power many of its searches, but Yahoo! has since acquired two other search engine companies, and plans to drop Google in favor of its own Web crawlers. Microsoft, meanwhile, is sure to make search a fundamental element of the next version of its operating system, due in 2006 and code-named Longhorn.
Will Google get steamrolled like Netscape?
"We spend most of our time worrying about ourselves and not our competition," says Google's Norvig.
Technology creates a horizon beyond which human destiny is unknowable, because we can't anticipate all the crazy stuff that brilliant people will invent. The author Michael Crichton has pointed out that a person in the year 1900 might have contemplated all the human beings who would be on the planet in the year 2000, and wondered how it would be possible to obtain enough horses for everyone.
And where would they put all the horse droppings?
Specific predictions are usually wrong. But a general trend has emerged over the course of centuries: Information escapes confinement. Information has been able to break free from monasteries, libraries, school-board-sanctioned textbooks, and corporate publishers. In the Middle Ages, books were kept chained to desks. Information is now completely unchained.
It has a life of its own -- and someday perhaps that won't be just a metaphor.