By Rob Pegoraro
Sunday, September 10, 2006
Science fiction has been clear about this for a long time: Your future doesn't have a keyboard in it.
Instead of typing to enter data or issue instructions to a computer, you will just speak out loud. Like Han Solo with C-3PO or Capt. Kirk on the Enterprise's bridge, you'll say what you want, and the computer will understand you.
Somehow, though, that future hasn't arrived. We're well into the 21st century, with TVs hanging on the wall and robots washing the floor -- yet we still communicate with computers by pounding away on plastic buttons.
For years, we had the excuse of inadequate voice-dictation software that required lengthy training to learn the intricacies of your voice. The latest version of the leading program in this category, Nuance Communications Inc.'s Dragon NaturallySpeaking (http://www.nuance.com), delivers on the long-standing promise of keyboard-less "typing."
The software is available for Windows 2000 or XP and costs $200 for the "Preferred" edition I tested or $100 for a pared-down "Standard" release. After installing this software, I spent a few minutes reciting three snippets of text, then let the program read through my e-mail and word-processing documents to get a grasp of my vocabulary.
Then I stared at the blinking cursor in momentary panic. My usual writing style is to hammer out the front end of a sentence before deciding how I'll finish it, an approach guaranteed to confuse a dictation program.
After diagramming a sentence or two in my head, I started talking . . . and the software functioned as advertised. The correct words appeared on the screen moments after I spoke them.
Dragon's software is smart enough to notice pauses and drop in commas and periods appropriately, though you still have to specify other forms of punctuation: "Wow, this program really works exclamation point."
Fixing the occasional transcription error is a matter of identifying the incorrect text and letting Dragon offer its next best guesses of what you'd said. When "surge protector" appeared as "certain protector," for example, saying "Select certain protector" produced a list of Dragon's other interpretations, allowing me to call out the number of the correct option: "Choose 2."
Spelling a strange word or abbreviation requires telling Dragon to go one letter at a time -- "spell mode on" -- before reciting those characters, one at a time.
You can select and edit text and move around in a document by speaking simple commands: "Move to end of line," "Move left five words," "Select next word."
After a week of using Dragon to answer e-mail, send quick messages to colleagues, take notes in a Word document and write my e-mail newsletter, a few things have become clear.
Not only is it possible to abandon the keyboard, but it's also surprisingly pleasant. Being able to sit back and speak your thoughts into the headset microphone included with the Dragon software instead of hunching over the keys is relaxing.
But Dragon also forces you to pick and choose among your software. Programs -- for instance, Firefox, Thunderbird and OpenOffice -- that aren't built from the same standard blocks of Windows code as Internet Explorer, Outlook Express or Microsoft Office make revising text by voice difficult.
This is the same issue faced by "screen reader" programs that let people with limited or no vision use a computer.
Mac users face a different software-compatibility problem: No Mac OS X version of Dragon is available. And the one current Mac dictation program, MacSpeech's iListen -- http://www.macspeech.com, Mac OS X 10.3 or newer, $179 -- requires much more extensive training to approach Dragon's accuracy and features a far clunkier correction mode.
Talking instead of typing also demands far more accuracy the first time around.
When I tried entering a story from Thursday's Post into Microsoft Word, microphone easily beat keyboard when I didn't stop to fix any errors: I needed just 2 minutes and 22 seconds to recite the story, versus 4 minutes 16 seconds to type it.
But both versions had more than 20 mistakes, with more of them in the Dragon version. (And since a dictation program that matches your speech to entries in a dictionary can't misspell anything, Word didn't flag Dragon's errors.)
Then I tried for complete accuracy, correcting each typo and, er, "speako" as it happened. I had the story typed in less than five minutes, but I needed more than 12 using just my voice.
In this respect, speech-recognition software works like such earlier ventures into handwriting recognition as Palm's Graffiti or Apple's Newton: You have to let the software train you before it will work to its fullest.
With a program such as Dragon, you must reprogram yourself to think in complete sentences, avoiding "ers" and "ums" and maintaining a steady volume and cadence.
But it will always take longer to jump to an error in a document by speech than by flicking the cursor there. And once you've put your hands back on the keyboard and mouse, it's hard to break the habit of using them to edit copy.
Lastly, there's the location I used for my tests -- my home. After editing my first text in Dragon, I realized that my fellow cubicle dwellers would kill me if they had to spend an entire day listening to me chirp out commands like "Select previous word . . . scratch that."
I also couldn't avoid thinking of how I would not only invite but force every co-worker within earshot to read over my shoulder.
The social implications of "typing" by voice may never be overcome (or, perhaps, the same lack of manners that allows some creeps to talk on cellphones in public toilets will let everybody yammer away into microphones in public).
But the editing process can and ought to be improved. What if a computer could use its own webcam to track your eyeballs and see what part of the screen you've focused on? What if it had a touch-sensitive screen that permitted you to designate and move text with a swipe of a fingertip?
In the meantime, many users contemplating a switch to voice input may keep coming to the same conclusion uttered by one of sci-fi's most infamous English-speaking computers, HAL 9000: "I'm afraid I can't do that."
Living with technology, or trying to? E-mail Rob Pegoraro at robp@washpost.com.