Interview With Barney Pell and Ramez Naam About Microsoft's Powerset Acquisition: Integration By End Of Year
MA: Barney, how many of your current employees previously worked at Microsoft? Did anybody get hired back after leaving Microsoft?
BP: Actually I have not counted. I think we have a few Microsoft people, but it is not a high proportion.
MA: One of the things that has obviously hindered Powerset is that you need to index the entire web in a different way than search engines index it today because, as you say, you're reading web pages instead of just noting keywords. Publicly you have said that you are not prepared to do that yet because it costs money, and you wanted to prove it out with the beta product that looks at Wikipedia first. Beyond the fact that it is more expensive to index the web that way, and obviously expense is not as much of an issue now that you are part of Microsoft, how long will it take? If you turned on the gas full blast now and wanted to launch a full version of Powerset that indexed the web, what is the fastest we could expect to see it?
BP: Umm, we are just getting together as a team to look at technical integration and at the best ways for our teams to work together, and how we are going to combine and really leverage the resources that Microsoft has, so it is a little early to say how long it is going to take before you see it. What I can say...
MA: Barney, you have become media trained.
(Laughter)
MA: (Mocking) We are Microsoft. We cannot comment on future product releases. You gave me thirty seconds of nothing.
BP: No, no, I prefaced it (laughter). I am not finished yet. With all that said about what I can or can't actually say, what I think I can say is that Powerset has already been doing some experiments processing web pages, arbitrary, random web pages, using our technology, and those results are looking pretty good. It is already a pretty parallel system, so to some extent the basic experience you see right now could be replicated just by running our technology over the larger set of content that Microsoft already has, on the machines that Microsoft already has. Now that doesn't mean that you would get the full search experience, because there are all the other features that Microsoft has developed that we would want to integrate to give a really coherent and good search experience. But some of the things you see already, like the facts that Powerset extracts from the documents, the automatically built profiles of any kind of concept, and the ability to show pages with their automatically generated summaries: a lot of those features could really be done, at least to some level of quality, today, just by running it on Microsoft infrastructure with resources that exist today. So we are going to have to figure out in what order we develop what, but we feel that fundamentally the main barriers that were in our way to getting this up to web scale are, with Microsoft, now removed.
RN: First, Barney is really showing his media training here; I am really impressed. His answer is also spot on, and something to bear in mind is that, at this point, it has been primarily senior people across the teams who have been talking. And we really do have a very bottom-up culture inside of search. I think Powerset does as well. So we are going to connect more and more engineering teams now that we have announced this, and we can start working on detailed plans. What we have super high confidence in is that this is a great fit, with great people. The cultures are actually very similar, and this is right on strategy with what we see as the big barriers to customers getting high-quality results. You are going to see some short-term stuff. We are going to get some stuff out there that is available to you on the Live Search site before the end of this year for sure. And then we are going to, as Barney was saying, take the current technology and start to scale it out, out, out. Will we go straight from Wikipedia to the entire web? Will we have some interim stuff? I am not sure yet. But we will start scaling it up, and getting more and more benefit for customers over time.
MA: So do you think that you will launch this technology on Live Search, or will you launch something on Powerset, and sort of keep the brands separate for a while? Or are you ditching the Powerset brand? Have you thought about that yet?
RN: We are going to keep Powerset alive. We think it is a fantastic technology showcase, and we will probably always have some things that are really interesting to play with and show people, but that aren't quite ready yet to be exposed to all of our customers. But the real payoff for us is integrating the Powerset technology deep within Live Search, and really making that product the one that shines, in addition to Powerset. We want to take Powerset's technology and really broaden it out and impact tens of millions of people, if not hundreds of millions of people, with the benefits of what Powerset brings.
MA: In December of last year, Peter Norvig, head of research at Google, was interviewed, and he said some things about natural language search that were interesting. I'll link to this when we post the podcast, but I will quote him, and I would love to get you guys' reaction to this on just a product and science level. "We don't think it's a big advance to be able to pose something as a question as opposed to keywords. Typing 'what is the capital of France' won't get you better results than 'capital of France.'" To me that doesn't really respond at all to what Powerset is promising to do, and what it is already doing with Wikipedia. But then he went on to talk about the limited value, in his opinion, of natural language search. He said, "We think that what's important about natural language search is the mapping of words to concepts that users are looking for." He gives some examples: "New York is different from York, but Vegas is the same as Las Vegas, and Jersey may or may not be the same as New Jersey. That is a natural language aspect that we are focusing on. Most of what we do is at the word and phrase level. We are not concentrating on the sentence. We think it's important to get the right results rather than change the interface." What is your response to that?
RN: I think what Peter Norvig is saying has some degree of accuracy, and that he is also ignoring some things. Even for normal queries, queries that are not phrased as questions, there is a lot of linguistic structure. Say someone types in the query "2 bedroom apartments, under 1000 dollars, within a mile of Potrero Hill." That query is loaded with linguistic content. And that's a realistic query. That is the type of thing that customers actually want to find on the web. Today there is a sort of helplessness, where customers know that certain queries are too complicated, and they won't even issue them to a search engine. They will go to some deep vertical search engine where they can enter different data into different boxes. "What is the capital of France" vs. "capital of France": that is not really an area that is that interesting. But some of these more complex queries really are. For example, shrub vs. tree. If I do a search for "decorative shrubs for my yard," and the ideal web page has "small decorative trees for my garden," the search really should have matched that page and brought it up as a good result. But today Google won't do it, Yahoo won't do it, and Live won't do it. So even in these normal queries there is a lot of value in the linguistics.
BP: That's right. So in addition, Powerset just launched the product, and I think that some of the features are really well called out in the iPhone product that we just launched. It's just another version of our web site, but designed to be used on an iPhone. And Mike, you have sort of blogged about it. I have been using Powerset on a mobile device ever since we launched, and it's kind of funny because you have very limited real estate, and you know what you want in your head, but you know it is going to take a long time for the pages to come up. I see a movie, Iron Man, and I wanna know, what other movies did Jeff Bridges star in? How do you want to ask that question? How do you want to get the information? You want to say, "What movies has Jeff Bridges starred in?" "Who was that blonde reporter in Iron Man?" How are you supposed to ask that? All these things that we think in our head in language, and then we have to figure out how to translate it. It doesn't mean that you should have to do more typing to get back worse results, but it means that you should be able to do anything in the most natural way possible. We are humans, language is our unique human endowment, yet we have not been able to take advantage of that when interacting with machines.
MA: Wait, wait, wait. So I have been an internet user for 13 years now, roughly, and I know better than to type a sentence into a search bar. What I would do is...
BP: That's the learned helplessness.
MA: Yes, but what I would do is type in "Iron Man," and look up the name of the blonde reporter from there. I have learned to do that because I have been using the internet for so long. Do you think that anyone still searches that way, with long sentences? It seems we tried in the early days and realized it didn't work. So, does anyone even bother searching that way? And a follow-up question would be, Barney, with regard to what you are seeing in the Wikipedia engine, are you seeing longer queries sort of slowly developing as people learn to speak to a search engine?
BP: Let me answer the question of does anybody actually search this way. The answer is yes, people do this. It isn't the most common mode, but we do see that probably 5% of queries are natural language queries. These are not all queries that are phrased in complete sentences, but they are queries where the customer has issued something that has some sort of linguistic structure. Almost any query with a preposition: X and Y, A near B, attribute A of Y, etc. Those things are loaded with linguistic structure.
BP: So there's a couple of pieces. One was, does anybody do this? I think we all have the experience: if you just enter your most basic expression of the query and the system comes back with a result that's good enough, you're done and you're happy. Well, what is it that happens when you don't get back the result the first time? You have that moment of frustration and you know you're in for a project. What happens is that moment of prayer, where you've basically tried a few different versions and you're just frustrated, and how do you express your query? You express your prayer and you say, just let me say what I want, and I know I'm not going to get results, but darn it, I'm just going to poke.
MA: I think that's why Yahoo Answers pops up so often, because they have those questions that are literally a quotation of the question. Somebody else has asked and answered it, and that may not be the best resource for the answer, but it's the best place the search engine can find to send me to.
RN: I have a list of some natural language queries in front of me. Can we just show you some queries that our customers have actually sent to us? These are random examples. "The first person to see the dark side of the moon." "How to get a credit card in Malaysia." "Enabling system restore in group policy on domain controller." "Timeline of Nvidia." "How to measure for draperies." "What is the difference between misses and women's sizes?" "Does my baby have acid reflux?" I could just go on and on. These fit in the category we've labeled that matches about five percent of queries, and they're really just cases where the customer can't think of a simpler way to express it.
BP: Now I'm going to elaborate, Mike, on the second part of your question, which was: since Powerset launched, have we seen that users are actually doing anything with natural language, and do the queries look at all different?
MA: Yes.
BP: And the answer is absolutely yes. Our users have had absolutely no problem at all in throwing longer, more interesting, more complex queries at the system. You know, it's just a flood of them, and so when we watched the initial queries come in at launch, it was kind of a fun moment for us because it was some sense of initial reputation. There was no issue about whether users, who use English or other ways of expressing themselves in all of their daily lives, could actually manage to do that with a search engine if given the chance. Absolutely, if users are given the chance, the users do and users will. I want to go back to another point though. We don't want to harp only on the query side and the expression of intent, because all of these billions of documents you'll look at are loaded with language. So there is the ability to read them in advance, extract the key information, and then use that, even if you just did a small, simple query, by automatically generating a profile. For example, Henry VIII, which I think you've blogged about, Mike. Or, when you're reading an article, giving you the summaries of the article.
MA: You return answers, not web pages, sometimes, and that's amazing.
BP: We return answers. We actually synthesize, so if you were to say, "What did Tom Cruise star in," you actually get not just the movies, but the cover art for the different movies. It synthesizes multiple pieces of information to give you a whole different kind of presentation. Or, if you were just to say, "Bill Gates," you'd be given an automatically generated profile of Bill Gates, pulled from across many, many articles. It's no longer just about 10 links, although we can certainly do a more relevant job (and will) with the blue links, and a better job of presenting those links. With the language understanding systems which we now have, we can go way beyond that and open up a whole new door in user experience until you think, "oh god, that's how I used to search, now I want this whole new different kind of thing." And now the question, which our users are asking, is how do I get this on the whole web, and with this partnership we're now going to deliver.
MA: Ok, I'm out of questions. This was really helpful. There's a million other things that I'd love to ask, but you're not going to answer them yet. I look forward to seeing the Powerset technology launch with a full web index and Microsoft's ranking technology behind it. I think it's going to be great. Ramez, are you promising a full launch, or some kind of launch, by the end of the year? You mentioned that earlier in the podcast.
RN: What I'm saying is that by the end of the year you will definitely see Powerset technology improving the experience for customers on Live Search.
MA: Ok. Alright guys, thanks very much for your time and congratulations to both of you.
BP: Hey thanks Mike. Bye.
RN: Bye.


![[techcrunch]](http://media.washingtonpost.com/wp-dyn/content/graphic/2008/04/04/GR2008040401977.gif)
