The Washington Post
    .COM – LIVE

    Hosted by Leslie Walker
    Washington Post Columnist
    Thursday, May 6, 1999 at 1 p.m.

    Welcome to ".com - Live," a real-time, moderated discussion with the people who are shaping business strategies in the era of electronic commerce. My guest this week is W.S. "Ozzie" Osborne, the IBM executive who is leading IBM's efforts to change the way humans communicate with computers.

    As general manager of IBM's speech and pen systems business unit, Osborne heads a team of more than 250 people working on technologies that allow computers to recognize and respond to human speech.

    Osborne was online Thursday to answer your questions about speech recognition technology: how it works, how it is moving beyond the desktop, and how being able to talk directly to computers is likely to impact people and the institutions where they work, learn and play.

    You can read more about IBM's speech recognition research and development projects.





    Discussion Transcript

    Leslie Walker: Hello all and welcome to our guest, Ozzie Osborne, who is here to talk about efforts under way to humanize our computers. Let's go right to the questions.


    Leslie Walker: It might help if you start with the basics. Can you explain why it's so hard for a computer to understand human speech, and give us a thumbnail on how it works?

    W.S. Ozzie Osborne: When you speak, the computer digitizes your vocal signal, breaking it into a grid of sounds and matching it up to established patterns. When the computer finds a match, it writes down the word you said. Speech recognition also allows you to command your computer. Once the computer recognizes and understands what you’ve said, it can then complete an action based on your spoken command, such as deleting a file or creating a table in Word.

    There are over 200,000 words that the system has to recognize. It also needs to understand words that sound the same but are spelled differently, such as to, too, and two.
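
    The two steps described above (matching digitized sound against established patterns, then choosing among words that sound alike) can be illustrated with a toy sketch. This is not how ViaVoice or any IBM product works internally; the feature vectors, the pattern names, and the word-pair counts below are all invented for illustration, and a real recognizer uses far richer acoustic and language models.

```python
import math

# Hypothetical "established patterns": made-up feature vectors, one per sound.
TEMPLATES = {
    "tu": [0.9, 0.1, 0.3],
    "fayl": [0.2, 0.8, 0.5],
    "dilit": [0.4, 0.4, 0.9],
}

# Words that sound the same share one pattern but have different spellings.
SPELLINGS = {
    "tu": ["to", "too", "two"],
    "fayl": ["file"],
    "dilit": ["delete"],
}

# Invented counts of how often a spelling follows a given previous word.
WORD_PAIR_COUNTS = {
    ("delete", "two"): 5,
    ("delete", "to"): 1,
    ("file", "too"): 3,
}


def closest_pattern(features):
    """Return the pattern whose vector is nearest (Euclidean) to the input."""
    return min(TEMPLATES, key=lambda name: math.dist(TEMPLATES[name], features))


def pick_spelling(pattern, previous_word):
    """Choose among same-sounding words using the previous word as context."""
    candidates = SPELLINGS[pattern]
    # Ties (including "no context at all") fall back to the first candidate.
    return max(candidates, key=lambda w: WORD_PAIR_COUNTS.get((previous_word, w), 0))


def transcribe(frames):
    """Turn a sequence of feature vectors into a list of written words."""
    words = []
    for features in frames:
        pattern = closest_pattern(features)
        previous = words[-1] if words else None
        words.append(pick_spelling(pattern, previous))
    return words


if __name__ == "__main__":
    # Two slightly noisy inputs: one near "dilit", one near "tu".
    print(transcribe([[0.41, 0.38, 0.88], [0.88, 0.12, 0.31]]))
```

    In this sketch the second sound is written as "two" only because the invented counts favor "two" after "delete"; with no preceding word, the code simply falls back to the first spelling in the list.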


    Leslie Walker: Everyone I know is skeptical that voice recognition is ready for prime time--either in the world of business or the consumer market. What can you tell us about the current state of speech as an input device for computers--how accurate are the leading speech recognition programs today, and how widely are they deployed?

    W.S. Ozzie Osborne: Speech is still an emerging technology but has made great progress in the last few years. This is due to the ability to recognize continuous speech as well as natural language understanding.


    Washington DC: Do you really think speech will be a popular interface with computers in the corporate world? I mean, how noisy would the office be if everyone was talking out loud to their computers?

    W.S. Ozzie Osborne: As computers enter all walks of life, including cell phones, appliances and other devices that don't have keyboards or screens, the only practical interface is speech.


    Denver: I have heard it takes a long time for these speech systems to learn a person's voice. Is that true--and how long do you have to use the dictation machines before they really know your voice?

    W.S. Ozzie Osborne: No, that is not the case. Many speech applications don't require any time. An example is speech applications over the telephone. For desktop applications, the enrollment time today is less than 15 minutes.


    Baltimore MD: Can you give us detailed examples of where speech recognition is saving any companies money today?

    W.S. Ozzie Osborne: The biggest area where speech is saving money is in call centers, such as travel reservations, stocks, ...

    This saves people as well as increases customer satisfaction.


    San Francisco: What is the minimum amount of computing power someone needs to run your consumer speech recognition software?

    W.S. Ozzie Osborne: The minimum required to run ViaVoice 98 is a 166 MHz processor and 32 MB of memory. However, the more the better.


    Ft. Myer Heights, VA: For voice recognition to really work, won't we need an operating system that uses natural language rather than GUI stuff like Windows? Is that what MS Bob was all about?

    W.S. Ozzie Osborne: Absolutely! We will see operating systems with a conversational user interface (CUI). This will allow people to get the best productivity.


    Newport R.I.: What is the current state of the art in software that reads Web pages out loud to blind people? Does IBM have commercial software that does anything like that?

    W.S. Ozzie Osborne: IBM has a program called Page Reader that uses ViaVoice to read pages for the blind. If you need more information, call me.


    Raleigh NC: Is the present state of the art for speech recognition such that you could have used it to automatically textualize your spoken responses to these questions in real time?

    W.S. Ozzie Osborne: Yes, we have a program called Online companion that supports chatting.


    Boston MA: When will I be able to talk to my Web page?

    W.S. Ozzie Osborne: Yes, in fact today with our products you can speak into your browser. IBM also has a speech browser in beta that you can download from our Alphaworks site.


    Leslie Walker: Jaguar has a new car hitting the market next month that will have 44 voice-activated commands, and most car manufacturers are showing concept cars that let you open your sunroof or activate your radio with your voice. How long do you think it will be --if ever--before we will be navigating our cars with speech?

    W.S. Ozzie Osborne: There is a lot happening in this space. We should see cars with speech navigation as an option within 2 years.


    Boston, MA: I read today about IBM working with Philips on speech recognition. Can you tell me a little more about that?

    W.S. Ozzie Osborne: Philips has licensed our text-to-speech (TTS) engine and 9 languages. Together we will develop more languages for both TTS and speech recognition.


    Fairfax, VA: If you were to choose a voice recognition system for the home computer user today, which would you select? We would like something moderately priced and easy to use (tall order?).

    W.S. Ozzie Osborne: Of course IBM ViaVoice on an IBM Aptiva. ViaVoice costs from $49 to $149 depending on which product you select.


    Leslie Walker: IBM is using speech recognition to route internal calls at its North America call center. What kind of cost savings and error rates is IBM expecting from this technology? How are IBM employees responding--is there any frustration at dealing with a computerized voice?

    W.S. Ozzie Osborne: Yes, we have the 200,000 employees up on our directory dialing application. We are seeing a saving of 1/4 the cost from the previous system. We have had a very positive reaction from the people using it.


    Boston MA: What are some of the benefits of using speech recognition?

    W.S. Ozzie Osborne: It is a productive interface. It allows you to converse in environments that are hands-free or eyes-free. It allows you to get information without a computer, such as through telephony speech applications.


    Bethesda, MD: Can you talk a little about the product that is being included in Red Hat 6.0?

    W.S. Ozzie Osborne: We have included our speech recognition engine, the language data for U.S. English, the tools to generate applications, as well as source for sample applications.


    Provincetown, MA: Do you have any plans to develop speech recognition for the Macintosh?

    W.S. Ozzie Osborne: We have had a lot of requests for ViaVoice on the Mac. Today we have our Worldbook and Edmark products running on the Mac.


    Leslie Walker: Well, folks, we are about halfway through this fascinating discussion. I now realize Ozzie Osborne doesn't need speech recognition because he is a speed demon on the keyboard!
    By the way, Ozzie is typing his answers from an IBM Thinkpad 600E at a hotel in Washington, while I am entering questions from The Washington Post newsroom and our washingtonpost.com producers are monitoring the chat from their office in Rosslyn. So keep those questions rolling in!


    Washington, D.C.: Will your speech technology replace me as a court reporter in the near future? Will it have the ability to make a transcript of a roundtable discussion where 10 people are speaking (not at the same time)?

    W.S. Ozzie Osborne: Don't go get a new job anytime soon. We have technology in our research lab that can identify and verify people's voices. With that we can separate the speakers. However, it's not accurate enough to replace you anytime soon.


    Raleigh NC: So, the speech recognition software is what will make the phone the ultimate "dumb terminal." Would you care to hazard a guess on how long it will be before speech recognition, combined with that new IBM hardfile the size of a quarter, will permit super small, keyboardless, near full function palm tops?

    W.S. Ozzie Osborne: It will happen a lot sooner than people think. With the increase in bandwidth of cell phones and the increased power of chips, you will see systems shortly that have the power of a PC but don't have a keyboard. Speech will be the interface of choice.

    With the growth of pervasive computing and with a conversational interface, computers will become transparent.


    BETHESDA MARYLAND: I use IBM's ViaVoice now. Will this product be more accurate if I use a high-end machine like a Pentium III?

    W.S. Ozzie Osborne: The faster the system is, the better accuracy you will get, and you will have more responsive results.


    Miami FL: When do you expect Microsoft to include speech recognition software in the Windows or NT operating systems?

    W.S. Ozzie Osborne: You will have to ask Microsoft.


    Reston, VA: What is the relationship between the voice recognition software you are talking about and "chatterbots"?

    W.S. Ozzie Osborne: The speech you hear from a Chatterbot is a text to speech engine.


    Boston, MA: How did you get the nickname "Ozzie?"

    Leslie Walker: Yes, and we want to know whether you make music, too!
    :-)

    W.S. Ozzie Osborne: I work for IBM during the day because it's fun. I then get the band together, put on the wig and make money at night.

    Seriously, I can't sing. I've had the name longer than he has been famous. Even my father was called Ozzie.


    Vienna VA: Is IBM working with America Online to make speech recognition work with AOL's internal software? How about with the AOL Anywhere vision--any partnerships going on there?

    W.S. Ozzie Osborne: IBM ViaVoice and AOL would be a great solution. However, I can't discuss what we are doing.


    Rosslyn Hts., VA: From time to time I read about Microsoft efforts to 'own' the standard on voice recognition. Will that happen?

    W.S. Ozzie Osborne: IBM is committed to open industry standards. We are working with many in the industry to develop these standards on all platforms. Today we support JSAPI and SAPI 4.0, and we are working on developing the VXML standards with Motorola, Lucent, and AT&T. All of these are important standards to grow the speech industry.


    Washington, DC: Earlier you said "Yes, we have a program called Online companion that supports chatting." Why aren't you using this now? Why are you typing your answers on your ThinkPad?

    W.S. Ozzie Osborne: I have used it before on other chats, but I'm not using it now since I'm on the phone with Leslie. It just gets too confusing to talk to two people at the same time.


    Herndon VA: Following up on the question from Provincetown MA... do you have any plans to do a speech recognition product for Mac?
    (you didn't answer the original question)

    W.S. Ozzie Osborne: Let me say I didn't say No.


    Rockville, MD: What kind of research is IBM doing to improve speech recognition, and what kind of breakthroughs are you aiming for?

    W.S. Ozzie Osborne: The key technology that we are working on is natural language recognition. This will allow people to converse with the computer in a normal manner. The computer will understand us, as opposed to today, where we have to learn how to use computers.

    This will make the interface more accurate, since the system will ask questions when it doesn't understand. Just like us humans.


    Wash. D.C.: Which hardware is most important for effective speech recognition: microphone, sound card, or CPU? Do you have success with built-in sound, or is a separate sound card important?

    W.S. Ozzie Osborne: They are important. Many of our recognition problems are due to faulty sound cards, low quality sound chips, and low quality microphones.

    The system CPU power and larger memory will increase accuracy.


    Leslie Walker: You also supervise interface research for projects other than speech recognition. Can you tell us what other kinds of work your teams are doing, such as with handwriting recognition? Also, should we be expecting to see brand new computer interfaces besides speech in the future?

    W.S. Ozzie Osborne: My group has pen as well as speech technology. We believe the best interface will be multimodal: pen and speech.


    Seattle: Where are the most interesting--and exciting-- business applications currently using speech recognition? Can you give us examples?

    W.S. Ozzie Osborne: We have many. Of course, dictation is commonly used in the professional fields: doctors, lawyers, ...

    There are many telephony examples such as mutual funds, travel, home banking. However, the best use for me is the many dyslexic students who have become highly motivated after using our product. It makes my job very fulfilling.


    Arlington, VA: I know that Lernout & Hauspie (sorry to mention that dread name!) is working on some text-to-speech with more natural-sounding inflections. Is IBM doing any work along these lines? Most TTS applications sound so wooden, and they're hard to understand.

    W.S. Ozzie Osborne: We have a very natural sounding technology in our telephony products. We also have some breakthrough technology in our research lab in Yorktown.


    W.S. Ozzie Osborne: Thank you to everyone for their time. I have enjoyed it. Unfortunately we are out of time. If you want to send me more questions or want more information on our products you may e-mail me at ozzie@us.ibm.com.

    Bye all, and watch out for those bat heads.


    Leslie Walker: We're glad Ozzie Osborne could be here today. This has been a wide-ranging discussion. Great questions, and informative answers.

    Thanks so much to Ozzie for taking time during his business trip here to talk with us, and thanks to all of you who joined in. Hope to see you back here in two weeks!


    © Copyright 1999 The Washington Post Company
