August 01, 1998, Issue: 908
Section: Reviews
Voice Recognition Makes Itself Heard
Owen Linderholm
Ever since 2001: A Space Odyssey's HAL made talking to your
computer a popular idea, voice recognition has been a goal for several software companies.
A few years ago, Dragon Systems, IBM and Kurzweil
introduced PC voice-recognition programs that take dictation. To put it mildly, these
programs weren't very good. But last year, releases from Dragon and IBM took a huge step
forward. Although they still required you to spend long hours training the programs to
understand your voice inflections, the dictation actually worked.
The latest releases of these programs take more than
another step forward; they're completely new and advanced products. We looked at beta
versions of Dragon Systems' new NaturallySpeaking Preferred Edition 3.0 and IBM's new
ViaVoice 98 Executive Edition, as well as VoiceXpress Plus from Lernout & Hauspie
(which purchased Kurzweil). All three let you dictate and correct, control and format your
document purely through verbal commands. In addition, both NaturallySpeaking and ViaVoice
allow even greater control over your computer by permitting you to dictate into almost any
application. Though ViaVoice also gives you control over much of Windows itself,
NaturallySpeaking proved the best, with superior voice-recognition accuracy and speed, and
acceptable voice control of applications and Windows.
NaturallySpeaking Preferred Edition 3.0
NaturallySpeaking is the least obtrusive of the three
programs I tested: once it is up and running, it waits as a small microphone icon on your
taskbar. However, the process of getting the program going isn't easy. As with other
products, you must perform enrollment, a training process that teaches the software to
adapt to your voice and inflections. In NaturallySpeaking's case, you need to choose one
of several passages to train with; each takes about half an hour to complete.
NaturallySpeaking, like the other programs, also adjusts audio and microphone volume
levels and measures the sound level in the background.
Once you've finished reading the enrollment passage,
NaturallySpeaking spends an additional quarter of an hour processing your input and
adapting to the way you speak. Next, the setup wizard asks you to load a representative
sample of your existing documents into its Vocabulary Builder. It analyzes these to
identify your typical sentence structures and to notice the kinds of words you use. At the
end, it asks you to train the system on all those words it doesn't already recognize.
Finally, NaturallySpeaking runs a short multimedia
presentation on how to make best use of the program before calling up the application
itself. NaturallySpeaking automatically launches its main screen, which is a simple word
processor for taking dictation, with an extended menu system to control voice recognition
operations and options.
I found NaturallySpeaking remarkably accurate right after
enrollment and a quick practice: I dictated my 495-word test document straight into the
program, correcting errors as I went. NaturallySpeaking's big advantage over its
competitors is a winning combination of superior initial recognition and the ability to
select phrases as well as single words for correction.
Why is that important? Voice recognition programs
typically can't identify a single word in isolation. For example, the sentence "It was too
much to have to eat two pies" has four occurrences of a word pronounced the same way,
but with three meanings and spellings. That means you have to say the whole sentence at
once rather than pausing after each word. Only programs that make use of context can
get this right. NaturallySpeaking actually heard this sentence correctly the first time.
The point is, when these programs make mistakes (as they
often do), they typically make mistakes over whole phrases rather than single words.
Consequently, you want to be able to correct the whole phrase in one shot, and
NaturallySpeaking is the only program that lets you rapidly select and correct an entire
phrase.
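To see why context matters for the "too/to/two" sentence, here is a minimal Python sketch. It is a toy illustration only, not a description of how NaturallySpeaking or any other reviewed product works: the word-pair table, the "TU" placeholder for the ambiguous sound and the best_spelling function are all invented for this example.

```python
from itertools import product

# Hypothetical, hand-picked word pairs that commonly occur together.
# A real recognizer would learn such statistics from large amounts of text.
LIKELY_PAIRS = {
    ("was", "too"), ("too", "much"), ("much", "to"), ("to", "have"),
    ("have", "to"), ("to", "eat"), ("eat", "two"), ("two", "pies"),
}

# The three spellings that share one pronunciation.
HOMOPHONES = ("to", "too", "two")


def best_spelling(heard):
    """Pick the spelling of each "TU" sound that best fits its neighbors."""
    # Each ambiguous sound may be any homophone; other words are fixed.
    slots = [HOMOPHONES if word == "TU" else (word,) for word in heard]

    def score(words):
        # Count adjacent word pairs that appear in the likely-pairs table.
        return sum(pair in LIKELY_PAIRS for pair in zip(words, words[1:]))

    # Try every combination of spellings and keep the best-scoring one.
    return max(product(*slots), key=score)


heard = "it was TU much TU have TU eat TU pies".split()
print(" ".join(best_spelling(heard)))
```

With the whole sentence available, the neighboring words settle each spelling. Given the single sound on its own, as in best_spelling(["TU"]), the sketch has nothing to go on and simply returns the first candidate, which is the situation you create by pausing after every word.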
Because making corrections was so easy, I was able to
dictate and correct my test document (with business text, a normal prose passage and a
short snippet of poetry) in only 11 minutes, 28 seconds, at a speed of 43 corrected words
per minute. I'm not a lightning-fast typist, nor terribly slow, but I was able to retype
the same passage (again, going back and completely correcting it) at 37.4 corrected words
per minute.
We chose to measure corrected words per minute for this
test since most users are looking to create final, accurate documents. Most importantly, I
was able to dictate and correct this document without touching the mouse or keyboard,
making NaturallySpeaking truly beneficial for those unable to type because of physical
handicaps, ailments or illness.
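For readers who want to check the arithmetic or time their own dictation, the corrected-words-per-minute figure is simply the length of the finished, fully corrected document divided by the total time spent dictating and correcting. The short Python sketch below assumes that definition; the function name corrected_wpm is our own, and the inputs are the 495-word test document and the 11-minute, 28-second NaturallySpeaking time reported above.

```python
def corrected_wpm(words: int, minutes: int, seconds: int) -> float:
    """Words in the finished, corrected document per minute of total time."""
    total_minutes = minutes + seconds / 60
    return words / total_minutes


TEST_DOCUMENT_WORDS = 495  # length of the review's test document

# NaturallySpeaking: 11 minutes, 28 seconds to dictate and correct.
speed = corrected_wpm(TEST_DOCUMENT_WORDS, 11, 28)
print(f"{speed:.1f} corrected words per minute")
```

Because the clock keeps running while you fix mistakes, a program that is slow to correct scores lower even if its raw dictation is fast.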
NaturallySpeaking falls short in one area, however.
Although it can control individual applications, it cannot effectively control the
operation of Windows itself. An application called MouseGrid lets you move the cursor
around the screen and select and click items. You can also select most items in menus and
dialog boxes using only your voice, but it's too much of a struggle to be truly
worthwhile. IBM's ViaVoice 98 Executive Edition does a better job of controlling the
Windows environment easily, even though it lacks a function to directly control the mouse
cursor's position.
Overall, NaturallySpeaking Preferred Edition is a
breakthrough in dictation and voice recognition. At $159 it's the most expensive of these
three programs, but it's still relatively cheap. Dragon Systems also sells a less capable
version (the Personal Edition) for $99 and a simple dictation-only program called Point
& Speak for $59. NaturallySpeaking Preferred 3.0 is the first voice recognition
program we've seen that has real utility, and it easily earns a spot on the WinList.
ViaVoice 98 Executive Edition
ViaVoice 98 actually comes in three flavors: Home, Office
and Executive versions. The Home Edition lets you dictate into its own word processor,
called SpeakPad, and into Microsoft's Word 97; you can control and format documents in
both. The Office Edition adds the ability to control the Windows Desktop and to launch,
control and close other applications. It also includes an extended vocabulary for business
and finance. The Executive Edition adds the ability to dictate text into any application
at all, with correction and voice command functions in all applications.
I tested the Executive Edition. It proved easy to install
and set up, and took about half an hour to enroll.
ViaVoice 98 uses a VoiceCenter command bar across the top
of the screen that provides access to voice recognition commands and options. Very
extensive help is always available and can be opened and searched using simple voice
commands.
Dictation was decent, but not as good as in
NaturallySpeaking Preferred Edition 3.0. Although you can correct as you go along in
ViaVoice, I eventually found it faster to dictate the entire document and then return to
fix errors. Using this technique, I was able to dictate and correct my test document in 14
minutes, 20 seconds, for a speed of 34.5 corrected words per minute. However, I did have
to use the mouse a couple of times. As an example of the top speed you can expect to get
with these programs, I also measured how long it took to dictate and correct the document,
making full use of the mouse and keyboard where expedient. Using this technique, I
finished the test document in 10 minutes, 43 seconds for a speed of 46 corrected words per
minute, close to NaturallySpeaking's speed without the keyboard and mouse.
I found ViaVoice 98 to be good at recognition and very easy
to use, but corrections can be tedious since the program can only handle a single word at
a time. Making corrections using voice-only was a struggle. In particular, I had trouble
spelling words out using letters and had to resort to using the alpha-bravo-charlie
phonetic alphabet, a useful tool to guarantee accuracy, but a time-consuming process.
Launching applications is a snap with ViaVoice 98. You
simply say something like "Open program Calculator" to launch Calculator. You
can then say what you want the Calculator to do: For example, you'd say "two thousand
and seventeen divided by twenty-three equals" to produce an answer. You can also give
commands like "Move Notepad window two inches right." All in all, command and
control of Windows and applications in ViaVoice 98 was effortless.
ViaVoice 98 includes a lot of voice tools, as well as a
vocabulary extender that analyzes your existing documents for writing style and vocabulary
to improve recognition. IBM also provides very good documentation on how to use the
program most effectively, far more than either of the other programs.
I would recommend ViaVoice 98 Executive Edition to anyone
who needs or wants to use voice to control the overall operation of a PC. It isn't as good
for dictation as NaturallySpeaking Preferred, so it doesn't make the WinList. But at $149,
ViaVoice 98 is a very good value and a worthwhile consideration.
VoiceXpress Plus
Lernout & Hauspie's VoiceXpress Plus is the oldest of
these programs. It's also the only one I looked at in a final shipping version.
Installation and enrollment were harder for VoiceXpress Plus than they were for the other
programs. It takes a very long time to load whenever it's used, and since it has the worst
recognition of the three (but still far better than previous-generation products), it also
takes longer to enroll.
I found myself repeating sections a couple of times to get
them right. Enrollment takes well over an hour.
In most other respects, VoiceXpress Plus is remarkably
similar to ViaVoice 98. Both programs use a command bar at the top of the screen to
control voice recognition commands and options, but VoiceXpress Plus only works within its
own VoicePad application and Word 97.
VoiceXpress Plus has good initial recognition, about as good
as ViaVoice 98. However, it's much harder to correct errors in VoiceXpress. You can select
and correct single words by saying "correct" followed by the incorrect word to
bring up a correction dialog box. But, unlike the other programs, VoiceXpress Plus doesn't
offer a list of potential matches. And you must spell out your correction into the dialog
box. While this is sometimes the method you have to fall back on with the other two
programs, it's the only option in VoiceXpress Plus, and it makes correction slow and
difficult. It took 20 minutes, 26 seconds to dictate and correct the test document with
VoiceXpress Plus, and I was forced to use the keyboard frequently. This translates to 24.2
corrected words per minute: not terrible, but not nearly as good as NaturallySpeaking
Preferred or ViaVoice.
VoiceXpress is easy to use and gives you a few novel
features, such as the ability to set it for female or male speakers in various age ranges
before you begin enrollment. It has a very good vocabulary extender, a lot like that of
NaturallySpeaking Preferred. But it doesn't recognize speech well enough or work for
anything other than dictation into its own word processor or Word 97. It is, however, the
cheapest of the three programs and is perhaps more fairly compared to the lower-end
versions of the other programs. Even so, they still have better basic recognition
accuracy. We recommend you look at NaturallySpeaking Preferred or ViaVoice 98 instead.
Coming in loud and clear
These programs have more in common than they have
differences. They all ship with headset microphones to assure good audio quality. They all
do an adequate or better job of dictating into word processors, and they all allow you to
format, correct and control your document while dictating. They are also all susceptible
to abrupt changes in background noise and require extensive enrollment to be effective.
Paradoxically, all work best if you can keep up a decent clip while speaking, since they
use context to recognize words and differentiate commands from dictation.
NaturallySpeaking Preferred Edition 3.0 and ViaVoice 98
Executive Edition stand out because of their ability to work with other applications,
control the Desktop and, in particular, produce higher recognition quality. ViaVoice 98
gets the nod if you need more control over the Windows environment. But if pure dictation
is your primary concern, get NaturallySpeaking Preferred Edition 3.0; we're adding it to
the WinList.
--Quick View--
NaturallySpeaking Preferred Edition 3.0
Bottom Line: Best program for pure dictation into any
application
Price: $159
Platforms: 95, NT
Pros: Takes dictation and lets you correct without mouse or
keyboard; very accurate
Cons: Hard to control Windows; less formatting and fewer
control commands for documents
Strongest Rival: ViaVoice 98 Executive Edition
Dragon Systems, 800-4-DRAGON, 617-965-5200. Winfo #671
--
ViaVoice 98 Executive Edition
Bottom Line: Good program for dictation that can also
control all applications and Windows by voice
Price: $149
Platforms: 95, NT
Pros: Good control of Windows interface; can take dictation
into any application; decent accuracy
Cons: A bit harder to correct unrecognized words
Strongest Rival: NaturallySpeaking Preferred Edition 3.0
IBM Corp., 800-825-5263, 914-642-3000.
--
VoiceXpress Plus
Bottom Line: Decent accuracy, but can format and control
only in Word
Price: $99.99
Platforms: 95, NT
Pros: Accuracy; decent format and control in Word
Cons: Can't control Windows or applications other than
Word; difficult to make corrections
Strongest Rival: NaturallySpeaking Preferred Edition 3.0
Lernout & Hauspie, 800-380-1234, 781-203-5000. Winfo
#769
SIDEBAR: Ten Tips to Help Your PC Hear You Better
1. Use the headset microphone that comes with the program,
not the crummy microphone that came with your PC.
2. Work in a consistent environment; keep background noise
as low as you can make it.
3. Do all the training and enrollment the program wants.
4. Always go back and correct dictation mistakes properly
using the program, rather than just retyping them. This helps the program to learn.
5. Use the vocabulary extender with your existing
documents.
6. Be sure different users enroll with their own settings.
7. Make different enrollments and speaker settings for
different environments: notebook, in office, at home, traveling on an airplane and so
forth.
8. Adjust your microphone carefully and reset its volume
level every time you put it on.
9. Speak clearly and distinctly, but without pausing.
10. Give the program time to learn the way you speak. It
really does get better and better.
Copyright © 1998 CMP Media Inc.