.COM – LIVE
Hosted by Leslie Walker
Washington Post Columnist
Thursday, November 4, 1999 at 1 p.m.
Welcome to .com-LIVE. This week our guest is Sergey Brin, 26, co-founder of the search
engine Google. Instead of relying on human intelligence to compile
information from the Web, Google relies on machine smarts, using
computers to analyze a site's value.
A University of Maryland
science graduate, Brin was three months shy of his PhD at Stanford
winter when he and a pal launched Google.
Google's secret formula uses algorithms to assess the importance of a
site based on the volume and "authority" of other sites linking to it.
If a site is good, the thinking goes, many other Web sites will link
to it. Google weighs the placement of links on each page, along with font
sizes and capitalization. It has developed a loyal following among
researchers for its ability to display highly relevant sites high in
its results.
As the volume of Web sites skyrockets daily, and the content of
sites changes just as quickly, Brin doubts human-compiled directories
can continue tracking the Web in a meaningful way. Sergey will talk about the quest to make search engines easier and smarter and to deliver more precise, relevant results.
"Magical" is one word I've seen used to describe the results you get from searching on Google. I know your formula is secret, but can you tell us more about how you analyze "links" on Web pages to get relevant results?
Sergey Brin: Our formula is secret but I can tell you about three main ingredients: user interface, hypertext analysis, and link analysis. The link analysis you refer to allows Google to estimate the importance of each page on the Web. For example, Bill Clinton's home page is more important than my home page, and that in turn is more important than my "What I had for lunch" page. The analysis is fairly complex: every web page affects the rank of every other web page. It amounts to solving an equation with 400 million variables and 3 billion terms.
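For readers curious what "every web page affects the rank of every other web page" can look like in practice, here is a minimal sketch of an iterative link-ranking scheme in the spirit of the published PageRank idea. It is not Google's secret formula: the six-page toy graph, the page names, the damping value of 0.85, and the function name `rank_pages` are all assumptions made for illustration.

```python
def rank_pages(links, damping=0.85, iterations=100):
    """Estimate page importance from a link graph by repeated averaging.

    links maps each page name to the list of pages it links to.
    Returns a dict of page name -> score; the scores sum to 1.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current score evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical toy Web (all names invented for this example).
toy_web = {
    "news_site": ["clinton_home"],
    "gov_portal": ["clinton_home"],
    "stanford_dir": ["clinton_home", "brin_home"],
    "brin_home": ["brin_lunch", "clinton_home", "stanford_dir"],
    "brin_lunch": [],
    "clinton_home": [],
}

ranks = rank_pages(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page:14s} {score:.3f}")
```

On this toy graph the heavily linked `clinton_home` ends up above `brin_home`, which in turn ends up above `brin_lunch`, mirroring the ordering in Brin's example. The real problem has hundreds of millions of pages rather than six, which is why he describes it as one enormous equation.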
Let's start with your name.
Google comes from a mathematical term--"googol"--right? What does it mean and why did you choose it?
Sergey Brin: That's right. We chose a derivative of googol because it is a huge number - 1 followed by 100 zeroes and we have to work with something huge - the WWW. In our last crawl, we crawled over 200 million web pages and discovered 3 billion links. We have over 2000 computers in our PC farm with over 80 terabytes of disk space.
What typical mistakes do you see people making in trying to enter search queries or in general find things online?
Sergey Brin: Larry Page and I had a fundamental philosophy when we started Google - the user is never wrong. Other search engine developers would tell us "that query is too general" or "this query is poorly formed". We try to answer all queries well. That said, there are quite a few typos that people make and it is a difficult problem to solve well.
Also, Google, unlike other search engines, works better when you use just a few words to enter your query rather than lots of words.
How is Google different from all the other search engines?
Sergey Brin: Google has been designed from the outset to work on the WWW.
Other search engines have come from traditional information retrieval technologies. Google factors in all aspects of hypertext including fonts, headers, links, and nearby documents. As a result we are able to produce more relevant search results.
Hi Sergey. I was wondering how many people it takes to develop and run a search engine like Google. Can you give us a rundown on how big your research team is, how many technical people you have and in general what they do? Also, what portion of your company's employment is technical vs. business and marketing.
Sergey Brin: We view Google as a technology company so most of our resources are in engineering and research. We have about 50 people total and about 30 are in engineering and we have a newly formed research group. We have about a dozen PhDs. The remainder of our employees are in business development, marketing, and administration.
How does Google plan to make money? Unlike the other big search engines-most of which added a bevy of services and have become mini-AOLs rather than search services-your Web site has little more than a search box.
So where will your revenue come from?
Sergey Brin: Leslie, have you visited our online t-shirt store?
More seriously, we currently make money from cobranding with partners like Netscape and RedHat. We also recently launched our ad program which is quite differentiated.
What percent of the Web do you currently index? How fresh is the database (i.e., if a new page goes up today, how long until you get to it)?
Along the same lines, I'd like to know how long it takes your crawler to do its work. You said you recently crawled 200 million pages--in what time frame?
Sergey Brin: We target to refresh our index about once a month though at times we have slipped. However, our latest crawl completed in under two weeks and we are ramping that technology.
Our current database online is 100 million pages and we are working hard to put up a 200 million page database. The number of different documents we can return is twice those numbers.
Despite the recent NEC studies, it is hard to put our coverage in percentage terms. There are many infinite spaces on the Web and it is not clear which documents to count.
Can you give us simple advice on how to make our search queries smarter? I had a devil of a time trying to find information about tribal healing practices in South Africa. It gave me everything about South Africa, healing and tribes--and health--but not what I was looking for.
Sergey Brin: When I search for: tribal healing practices south africa
I get the following results in positions 3 and 4.
They seem to be quite useful. In general I recommend using terms that you imagine someone writing an answer to your question would use.
Traditional Healers Gain Recognition In South Africa
...Healers Gain Recognition In South Africa March 31, 1999 By Craig...
...Craig Urquart Pretoria, South Africa (PANA) - The South...
www.africanews.org/south/southafrica/stories/19990331_feat41.html New! Try out GoogleScout
Spiritual healing: a comparison between New Age groups and African Initiated C
...INTRODUCTION Healing is a term much used in South Africa...
...why these practices are so prevalent in South Africa at...
www.unisa.ac.za/dept/press/rt/32/steyn.html Cached (71k) New! Try out GoogleScout
How many questions do people ask Google every day? Also, do you plan to put a human face on google--maybe do natural language questions like AskJeeves does?
Sergey Brin: Google answers over 4 million searches per day and that is growing rapidly.
Google seems to do OK at many natural language queries already. We might try to improve that. However, it is important to realize that natural language is very powerful. Current technology, including Ask Jeeves, does little more than pick the keywords out of the question and search on them. So until there is better technology, the benefit of typing a question versus several search terms is limited.
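To make "pick the keywords out of the question" concrete, here is a minimal sketch of that approach. It is not how Ask Jeeves or any real engine works internally; the stop-word list and the function name `question_to_keywords` are assumptions chosen for illustration.

```python
import re

# Hypothetical, hand-made list of words that carry little search value.
STOP_WORDS = {
    "a", "an", "the", "is", "are", "was", "were", "do", "does", "did",
    "what", "who", "where", "when", "why", "how", "can", "i", "you", "me",
    "to", "of", "in", "on", "for", "about", "find", "tell",
}


def question_to_keywords(question):
    """Lowercase the question, split it into words, and drop stop words."""
    words = re.findall(r"[a-z0-9']+", question.lower())
    return [word for word in words if word not in STOP_WORDS]


keywords = question_to_keywords("What are tribal healing practices in South Africa?")
print(" ".join(keywords))
```

For this question the function keeps `tribal healing practices south africa`, which is the same keyword query Brin typed earlier in the chat. That is the sense in which phrasing a full question adds little for this kind of system.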
Hello. Please tell us your favorite Web sites!
Sergey Brin: I like Yahoo a lot. They have a bunch of great services.
However, I was a little disappointed when they got rid of their movie review summaries.
I also like shopper.com a lot. It is a great way to find good deals on computer equipment.
Slashdot.org is a good "news for nerds" site.
Washingtonpost.com is one of your new partners--they just debuted your search box in their redesigned Web site last week. What is the business model behind these relationships? Do you share revenue?
Sergey Brin: Relationships like the Washington Post are an important source of revenue to Google. I can't comment specifically on the terms of the agreement but we are thrilled to work with the Washington Post and the partnership is a great benefit to both companies.
What are the top five search queries people enter into Google?
Sergey Brin: Beanie baby, sex, yahoo, hotmail, mp3
These fluctuate a bit but in total they account for less than one half of one percent of queries so they are not representative.
We are more than halfway through today's chat, folks. Keep those questions rolling in for Sergey, who is answering from his office in California.
Suggest you market by offering Google to various government intranets as an additional search engine. There are many in this town who could use an alternative to Excite, Lycos, etc., most of which frustrate!
On that subject, how many distribution partners do you have now? Also, what are your plans for advertising?
Maybe you can explain the thinking behind the text links to Amazon books at the top of many search returns now, and how you plan to expand that.
Sergey Brin: Thanks for the .gov suggestion. We currently have a search over the entire US Government extranet off our home page.
I will follow up on that.
As far as distribution and revenue partners, we have about 10 and growing rapidly. Our advertising program (currently in a pilot phase) is focussed on producing fast loading and relevant ads. Right now we are doing quite a bit of experimentation in content and presentation so expect some fluctuation.
The goal is for the ads to be useful to users and not annoying. Books are an interesting example because there are many different books and often they contain useful information to answer user queries.
Kiryat Yearim, Israel:
No matter how sophisticated the algorithm, will it not always come up 'short' when it comes to 'common sense'? Won't human intervention ultimately be needed, after the computer has done the heavy duty work of sifting out what is not wanted? Have you been able to solve the 'fruit flies like a banana' ambiguity?
Sergey Brin: The ideal searcher would be something with human intelligence and all knowledge in the world. Currently, humans have the former and computers have the latter (well, close to it) so you do have to sift through search results.
In the future, who knows...
How much data do you collect about what individuals search for---do you place a cookie on the hard drives of users' computers, for example, so you can track what questions they ask over time? Or do you only track questions in aggregate-in other words, detached from the people who asked them?
Also, in general, how much data do you plan to save and analyze?
Sergey Brin: We do currently cookie our users so that we can at a minimum maintain a count of our users but in truth even that simple analysis is low in priority and on the back burner.
In the future, we will collect a lot of aggregate statistics based on that data like "users who tend to use short queries also don't look past the first result page". I.e. we just collect aggregate data but the cookies can be useful for that. We never attempt to identify users.
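As a rough sketch of the kind of aggregate statistic described here, the example below groups searches by an anonymous cookie value and reports only group-level rates. The log format, the sample records, the two-word cutoff for "short" queries, and the function names are all hypothetical; nothing here reflects Google's actual logs or tooling.

```python
from collections import defaultdict

# Hypothetical records: (anonymous cookie id, query text, deepest result page viewed).
sample_log = [
    ("c1", "mp3", 1),
    ("c1", "yahoo", 1),
    ("c2", "tribal healing practices south africa", 3),
    ("c2", "chilli beef recipes", 2),
    ("c3", "hotmail", 1),
]


def summarize_by_cookie(records):
    """Collapse raw records into per-cookie totals (no query text is kept)."""
    per_cookie = defaultdict(lambda: {"words": 0, "searches": 0, "max_page": 1})
    for cookie, query, deepest_page in records:
        entry = per_cookie[cookie]
        entry["words"] += len(query.split())
        entry["searches"] += 1
        entry["max_page"] = max(entry["max_page"], deepest_page)
    return per_cookie


def first_page_only_share(records, short_cutoff=2.0):
    """Share of short-query and long-query users who never left result page 1."""
    counts = {"short": [0, 0], "long": [0, 0]}  # [users, users who stayed on page 1]
    for entry in summarize_by_cookie(records).values():
        average_words = entry["words"] / entry["searches"]
        group = "short" if average_words <= short_cutoff else "long"
        counts[group][0] += 1
        if entry["max_page"] == 1:
            counts[group][1] += 1
    return {
        group: (stayed / users if users else None)
        for group, (users, stayed) in counts.items()
    }


shares = first_page_only_share(sample_log)
print(shares)
```

The cookie serves only to tie one visitor's searches together; the output contains rates per group and no individual identifiers or queries.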
The biggest part of the data we save is past copies of the Web which I believe will be very interesting years from now.
Any guesses what people are hoping to find when they search for "Yahoo"?
Funny question! Makes me wonder what kinds of interesting things you learn from looking at your search logs? Any surprising stuff in there???
Sergey Brin: Three hypotheses:
1) they are new users who don't know to go to yahoo.com
2) they are people who test the search engine
3) they use Google's I'm feeling lucky feature instead of bookmarks (this we could test)
However, they do only account for about 0.05% - one twentieth of one percent of all queries so these are not representative queries. In general people search for very diverse things. We have a monitor in our office where we display currently running queries (a small sample of them anyway) and here are a few over the past minute:
chilli beef recipes
Do you see Yahoo, Excite, Go, etc. as competitors? What about the Inktomis?
Sergey Brin: I see these companies as potential partners.
As far as Inktomi, we only compete on a small portion of their business.
Given the many different search engines currently being used what modifications and changes do you believe will take place in the WWW and how will search engines adapt to them?
Sergey Brin: I think that as the web grows, every parent will have a webcam pointed at their baby. That will be a lot of data to deal with. That is one challenge.
Another is that users will become more sensitive to how long it takes to do things and simpler faster interfaces will win out.
Sergey Brin: Leslie, it has been good talking to you and your viewers.
There were a lot of good questions. If you would like more info please see our Web site - www.google.com.
This is a very exciting time.
I couldn't agree more about how exciting these times are! Thanks for joining us today, Sergey.
And thanks to all you folks who participated in our Web talk show. That's all we have time for today. We were glad to have Sergey Brin with us, typing answers to our questions live from his office in California.
Hope to see you all back soon!
© Copyright 1999 The Washington Post Company