Welcome to the 21st Century Eloquence Page,
where we discuss new technological advances,
enhancements and inventions toward Speech / Voice
Recognition Technology. This page is updated
regularly, so you may wish to revisit it every
quarter or so.
Speech recognition software is being used
every day by hundreds of thousands of people.
Have you ever tried to place a call with a calling card
and heard the message "Please say Collect,
Calling Card, Third Number, Person to Person, or
Operator now"? If you haven't heard that
message, maybe you have heard a similar one. This is
speech recognition.
Whether you know it or not, you have probably
used speech recognition. It's a known fact that
this technology is being implemented everywhere.
It's been around for more than 50 YEARS! Why is
it so popular now? There are two reasons. The
first is that computer hardware is now
able to take advantage of the technology,
and the second is that it is now
affordable. Previously, computer hardware powerful
enough to run this technology was too expensive
for the general public to purchase. But now, with
the advancements in Central Processing Units
and Digital Signal Processors, it has become
not a dream, but reality.
The current market consists of several forms
of Voice Recognition:
SPEAKER DEPENDENT
SPEAKER INDEPENDENT
COMMAND & CONTROL
DISCRETE SPEECH INPUT
CONTINUOUS SPEECH INPUT
NATURAL SPEECH INPUT
SPEAKER DEPENDENT - This technology
requires users to participate in extensive
training exercises that can last several hours.
Once you are done "drilling the
machine", the computer runs several
calculations on the data it has received from
your exercises. From these calculations, the
computer builds a voice profile that attempts to
match your speech patterns.
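As a rough illustration of what "building a voice profile" can mean, here is a minimal sketch. It assumes each training utterance has already been reduced to a short list of numbers (a feature vector); the function names and the sample data are hypothetical, and real products use far more elaborate statistical models.

```python
import math

def build_profile(training_samples):
    """Average each word's training vectors into one template per word."""
    profile = {}
    for word, vectors in training_samples.items():
        count = len(vectors)
        # zip(*vectors) groups the values column by column
        profile[word] = [sum(column) / count for column in zip(*vectors)]
    return profile

def recognize(profile, features):
    """Return the word whose template lies closest to the new features."""
    return min(profile, key=lambda word: math.dist(profile[word], features))

# Hypothetical training data: two repetitions of each word
samples = {
    "yes": [[1.0, 0.0], [0.8, 0.2]],
    "no": [[0.0, 1.0], [0.2, 0.8]],
}
profile = build_profile(samples)
print(recognize(profile, [0.7, 0.3]))
```

The more repetitions you feed in during training, the closer each template gets to how you actually say the word, which is why the exercises take so long.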
SPEAKER INDEPENDENT - This technology, on
the other hand, does not require a person to go
through exercises. A user may begin using the
Voice Recognition program upon installation.
DISCRETE SPEECH INPUT - This type of input
requires the user to pause between words so that
the computer may distinguish the beginning and
ending of words. Although your speech has to be
modified slightly, hence slowing your regular
dictation, you can achieve well over 80 WPM, the
speed of an advanced typist. Some have even
reported speeds of up to 125 WPM.
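To see why the pauses matter, here is a minimal sketch of pause-based word splitting. It assumes the audio has already been turned into one loudness value per short time frame; the threshold and minimum pause length are illustrative values, not taken from any product.

```python
def split_on_pauses(energies, threshold=0.1, min_pause=3):
    """Return (start, end) frame ranges, end exclusive, for each word.

    A word is considered finished once `min_pause` consecutive frames
    fall below `threshold`.
    """
    words = []
    start = None   # frame where the current word began
    quiet = 0      # consecutive quiet frames seen inside the current word
    for i, energy in enumerate(energies):
        if energy >= threshold:
            if start is None:
                start = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_pause:
                # the word ended at the last loud frame
                words.append((start, i - quiet + 1))
                start = None
                quiet = 0
    if start is not None:
        # the recording ended before a full pause was seen
        words.append((start, len(energies) - quiet))
    return words

frames = [0, 0, 0.5, 0.6, 0.4, 0, 0, 0, 0.7, 0.8, 0, 0, 0, 0]
print(split_on_pauses(frames))  # [(2, 5), (8, 10)]
```

A pause shorter than `min_pause` frames does not split a word, which is why the speaker has to leave a clear gap between words.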
CONTINUOUS SPEECH INPUT - This technology is
currently available only for very small vocabularies
(2000 words) and numbers, from very few vendors.
This speech input requires the user to say only
words that are known to the system. You are also
limited by the expandability of the libraries.
This technology is currently not useful for
dictation, but is very useful for specific
functions or programs, e.g. data entry systems. I
estimate this technology becoming available for
dictation around the fall of '97 or beginning of
'98, with systems starting at around $2000. I have
also heard that you should start seeing it around
the end of '96, but those systems will be
specialized for specific purposes and will start
at more than $5000. I figure it will be
affordable and usable by the year 2000. No, I'm
not kidding. That's only 3 1/2 years from the
time of this writing.
NATURAL SPEECH INPUT - This is the ultimate
goal in Voice Recognition Technology: to be able
to talk to your computer in no specific manner
and have the computer understand what you
want, then apply the commands or words (e.g. "New
letter to Michael Rooney from ABC Systems"). The
computer will know to bring up a letter template
with Michael's name and the inside address of ABC
Systems and stop where dictation is to resume.
This unfortunately is not yet available.
Realistically, I don't expect to see this until
the year 2002.
However, there are some systems in the market
that are excellent today. Depending on what your
goals are, you may be able to gain major benefits
from choosing the right product.
The company I work for knows this and that is
why they carry all of the popular major products
that you would ever consider for your task. This
way, they can give the right advice, and the
right product for everyone's specific use.
I used to see many articles about Voice
Recognition. Some of them said it's not ready.
Since the writing of my first article 11/95, that
has changed and many writers and journalists now
use speech recognition software to write their
articles. There are still some journalists who
feel the technology is not ready, but if they
typed no more than 25 wpm, and did not have the
luxury of letting someone else type their work, and
had not received a repetitive strain injury
(RSI), they would be using it too. [I GUARANTEE
IT!].
(some of the players)
Articulate Systems
Articulate Systems develops speech recognition
products for the Macintosh. The PowerSecretary
system uses the Dragon Dictate speech engine. So
far, they are the only vendor to offer speech
recognition for Macintosh platforms. Of course,
Macintosh has the Apple PlainTalk (I'm sure there
is a trademark somewhere in there). PlainTalk is
great for developers wanting to voice
activate their programs; however, I do not see it
as a solution for dictation (not as of this
writing).
Dragon Dictation Systems
Dragon Tools
Allows programmers to customize their
applications with Voice Recognition Technology.
Using continuous speech with switchable
vocabularies, much can be done with this product.
Dragon Dictate for DOS
One of the first and most popular programs
that was quickly used by users and programmers. I
personally think that Dragon has the edge on
their DOS product. It is very simple, fast and
easy to customize. Dragon has what it calls an
OOPS buffer. This buffer holds the last ten words
you've spoken in memory. This allows you to
dictate quickly, then easily go back and
correct errors.
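The idea of a buffer that remembers only the last ten words can be sketched as follows. This is an illustration of the concept, not Dragon's implementation; the class and method names are hypothetical.

```python
from collections import deque

class OopsBuffer:
    """Keeps only the most recent words so they can still be corrected."""

    def __init__(self, size=10):
        # a deque with maxlen silently drops the oldest word when full
        self.words = deque(maxlen=size)

    def add(self, word):
        self.words.append(word)

    def correct(self, steps_back, replacement):
        """Replace the word spoken `steps_back` words ago (1 = most recent)."""
        if not 1 <= steps_back <= len(self.words):
            raise IndexError("that word is no longer in the buffer")
        self.words[-steps_back] = replacement

    def text(self):
        return " ".join(self.words)

buffer = OopsBuffer()
for word in "please send the latter to the client today".split():
    buffer.add(word)
buffer.correct(5, "letter")   # fix the word spoken five words ago
print(buffer.text())          # please send the letter to the client today
```

Once a word scrolls out of the buffer it can no longer be corrected by voice in this sketch, which is the trade-off for being able to keep dictating without stopping at every mistake.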
Dragon Dictate for WINDOWS
Dragon Dictate for Windows is almost identical
to its DOS engine. Actually, I have had the
opportunity to test their new 2.5 product which
is NT compatible. Dictation is fast, accuracy is
great, and functionality is awesome. With its
new release, Dragon has outdone itself. I
can't imagine what they'll do with their 3.0
version. I guess we'll have to see, and I don't
expect to see it for another 9 - 12 months. So
far, it takes my editor's pick for
general and practice-specific dictation.
IBM VoiceType for Windows
This is one of the fastest dictation systems,
using unique Trigram technology that
surpasses the speed of all the systems mentioned
here; however, I must mention, Dragon Dictate is
right behind it. It offers speeds of up to 125 wpm and
accuracy above 95%: great speed and great
accuracy. It does not, however, let you "see
what you are saying". There is a delay
between words. The IBM system thinks while you
continue your dictation. You can see it think
too. If you want to dictate, and let someone
else do the proofreading and formatting, this is
an awesome system.
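In speech recognition, "trigram" usually refers to a language model that guesses each word from the two words before it. How IBM applies the idea inside VoiceType is not something I can confirm, so treat the following as a minimal sketch of the general technique with made-up training text.

```python
from collections import Counter, defaultdict

def train_trigrams(words):
    """Count which word follows each pair of consecutive words."""
    counts = defaultdict(Counter)
    for first, second, third in zip(words, words[1:], words[2:]):
        counts[(first, second)][third] += 1
    return counts

def most_likely_next(counts, first, second):
    """Return the most frequent follower of the pair, or None if unseen."""
    following = counts.get((first, second))
    if not following:
        return None
    return following.most_common(1)[0][0]

text = ("please say collect now please say operator now "
        "please say collect now").split()
model = train_trigrams(text)
print(most_likely_next(model, "please", "say"))  # collect
```

A model like this helps the recognizer choose between similar-sounding words by looking at their neighbors, which may be one reason a system using it waits for more speech before committing to a word.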
IBM VoiceType for OS/2
IBM VoiceType is also available for OS/2, but
if you are a true WARP user, the new WARP
(code named "Merlin") includes dictation.
(Thanks IBM)
IBM VoiceType Application Factory
IBM Continuous Speech Series is for
developers. This program allows programmers to
generate up to 1000 word vocabularies for
specific programs or functions. You can have
several 1000 word vocabularies, but you cannot
use them at the same time. My knowledge of
this product is limited since I don't actually
use it.
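The restriction described here (several vocabularies, only one in use at a time) can be pictured with a small sketch. The class below is hypothetical and is not IBM's API; only the 1000 word limit comes from the text above.

```python
class SwitchableVocabularies:
    """Holds several small vocabularies, with exactly one active at a time."""

    MAX_WORDS = 1000  # per-vocabulary limit quoted in the article

    def __init__(self):
        self.vocabularies = {}
        self.active = None  # name of the vocabulary currently in use

    def add(self, name, words):
        unique = {word.lower() for word in words}
        if len(unique) > self.MAX_WORDS:
            raise ValueError(f"{name!r} exceeds {self.MAX_WORDS} words")
        self.vocabularies[name] = unique

    def switch_to(self, name):
        if name not in self.vocabularies:
            raise KeyError(name)
        self.active = name

    def accepts(self, word):
        """Only words in the active vocabulary can be recognized."""
        if self.active is None:
            return False
        return word.lower() in self.vocabularies[self.active]

engine = SwitchableVocabularies()
engine.add("menu", ["open", "close", "print"])
engine.add("digits", ["one", "two", "three"])
engine.switch_to("menu")
print(engine.accepts("print"), engine.accepts("two"))  # True False
```

Keeping the active word list small is what makes continuous recognition practical here: the fewer words the engine has to tell apart at any moment, the easier its job.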
Kurzweil VoiceMED clinical reporting systems
Kurzweil Applied Intelligence leads the way
for medical reporting using their patented
VoiceReport technology, although new continuous
speech engines, which should be out by the end of
1996, will challenge these products. Kurzweil's
VoiceReport systems are speaker independent, and
use a unique embedded, structured type of
reporting that gives their systems Quality
Assurance and Risk Management features. You might
say something like,
"New Patient"
"Patient Name", "Juan Pedro" (you'll have to type that in)
"Patient Demographics" .... "30", "Year", "Hispanic", "male"
"SOAP"
"Abdominal Pain"
"Routine"
"Take Defaults"
With these few short words, the Kurzweil
system writes something to the effect of:
Juan Pedro presents today, 7/1/96 for a new visit.
Juan Pedro is a 30 year old Hispanic male.
SUBJECTIVE: Abdominal Pain
ROUTINE: [here, you will have a paragraph of
information relating to default medical
diagnosis.]
Specific systems include:
- VoiceMED - for internal medicine, family
practice and general medical practices
- VoiceRAD - for Radiology
- VoiceORTHO - for Orthopedic Surgeons
- VoiceEM - for Emergency Medicine
- VoicePATH - for Pathology
- VoiceCARD - for Invasive Cardiology
- VoiceDI - for Diagnostic Imaging
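To show how a handful of spoken keywords can turn into full sentences, as in the Juan Pedro example above, here is a minimal sketch of template filling. The function and its parameters are hypothetical; Kurzweil's actual VoiceReport templates are not described in this article.

```python
def build_report(name, age, ethnicity, sex, complaint, visit_date):
    """Expand a few dictated values into the opening lines of a report."""
    lines = [
        f"{name} presents today, {visit_date} for a new visit.",
        f"{name} is a {age} year old {ethnicity} {sex}.",
        f"SUBJECTIVE: {complaint}",
    ]
    return "\n".join(lines)

print(build_report("Juan Pedro", 30, "Hispanic", "male",
                   "Abdominal Pain", "7/1/96"))
```

Because every report is produced from the same template, the wording stays consistent from one patient to the next.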
Kurzweil Voice for Windows
A Speaker Independent system for Windows.
Allows one to dictate into any Windows
application without having to train it. This
system also has a continuous digit recognizer for
easy data entry of addresses, spreadsheets and
other miscellaneous tasks involving numbers.
Kolvox Communications, Inc.
Kolvox Communications, Inc. so far leads the
way for 3rd party business solutions using voice
recognition technology. Kolvox has recently
merged with PureData and the new company's name
is Wild Card. For details, visit the Kolvox site
(kolvox.com).
Kolvox Communications has two products.
OfficeTALK
Speech solutions which give all types of users
powerful voice word processing solutions.
LawTALK
Speech solutions which give attorneys the power
of dictating without the costs of transcription
or tying up the secretary.
Both of these systems use either the Dragon
engine, the Kurzweil VOICE engine or the IBM engine.
These products are tailored by adding thousands
of macros and words to the dictionaries, giving
a user the ability not only to dictate, but also to do
easy word processing, adding power to the
voice engine. Examples include customized marked forms
for easy voice fill-in operation, an address book
application to merge your addresses straight into
your document, and text manipulation commands:
"Bold Sentence" bolds a sentence and
"Move Paragraph" moves a paragraph.
These are things that the voice recognition
engines do not come with, and Kolvox has done a
great job of filling them in. Some engines currently
offer some of the features of Kolvox's products;
however, none of them has it all. And some have
none.
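A voice macro layer of this kind can be pictured as a lookup table that sits between the recognizer and the word processor. The sketch below is hypothetical and is not Kolvox's code: the `<b>` tags stand in for real formatting, and "Move Paragraph" is assumed to move the current paragraph up one position.

```python
class Document:
    def __init__(self, paragraphs):
        # each paragraph is a list of sentence strings
        self.paragraphs = [list(p) for p in paragraphs]
        self.para = 0   # index of the current paragraph
        self.sent = 0   # index of the current sentence in that paragraph

def bold_sentence(doc):
    sentences = doc.paragraphs[doc.para]
    sentences[doc.sent] = f"<b>{sentences[doc.sent]}</b>"

def move_paragraph_up(doc):
    i = doc.para
    if i > 0:
        doc.paragraphs[i - 1], doc.paragraphs[i] = (
            doc.paragraphs[i], doc.paragraphs[i - 1])
        doc.para = i - 1

# spoken phrase -> editing action
MACROS = {
    "bold sentence": bold_sentence,
    "move paragraph": move_paragraph_up,
}

def handle_utterance(doc, utterance):
    """Run a macro if the phrase is a known command, else treat it as text."""
    action = MACROS.get(utterance.strip().lower())
    if action is None:
        doc.paragraphs[doc.para].append(utterance)
        doc.sent = len(doc.paragraphs[doc.para]) - 1
        return "dictated"
    action(doc)
    return "command"

doc = Document([["First paragraph."], ["Second paragraph."]])
doc.para = 1
handle_utterance(doc, "Bold Sentence")
handle_utterance(doc, "Move Paragraph")
print(doc.paragraphs)
```

Adding a new command is then just a matter of adding one more entry to the table, which is roughly how thousands of macros can be layered on top of an existing engine.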
Several people wonder, is this for me? Do I
really want, or need this technology?
Here is a quick summary of the types of people
who are currently using the system, and how
helpful it is for their daily work or play.
Physicians
Physicians may use these systems to replace
their transcription personnel or their
transcription services. Using this technology,
Physicians gain a lot. First is cost. Transcription
services are very expensive. Depending
on the amount of work and the field in which a
Physician is practicing, costs may range from
$1000 - $3000 monthly per Physician. A Physician
can spend more than $30,000 yearly on services or
$20,000 yearly on a full time transcriptionist.
This does not include the costs of unemployment
compensation, insurance, Workers Compensation and
payroll. Second is Risk Management. Using the
VoiceReport system, a Physician only needs to say
a few key words like "Chief Complaint",
"Abdominal Pain", "Routine
Checkup" and the VoiceReport system will
bring up several different procedures that are
usually performed when doing a routine checkup for
abdominal pain. This reminds the Physician of
the most common procedures to be
performed in this type of examination,
assuring a precise and complete examination. On
the other hand, if the findings are all normal,
you can say "All Normal" and the
VoiceReport system will produce a well
documented report. Third is Quality
Assurance. All reports will be complete, and wording
will be correct and consistent for these types of
examinations. Malpractice insurance may also be
lower as a result of this technology. Currently,
there are insurance companies that offer
discounts for using systems such as this. In addition,
with sophisticated configurations, a physician
can have automated faxing through
the system, link the system with their current
Medical Management packages, have
prescriptions printed out automatically, and even
send ICD-9 Codes to their billing systems for
automatic billing. There go the Medical Coders.
There is much to be gained using this type of
voice technology.
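The transcription cost figures quoted above are easy to check with a little arithmetic. The helper below is only an illustration; the monthly range is the one given in the text.

```python
def yearly_cost(monthly_cost, months=12):
    """Convert a monthly transcription bill into a yearly figure."""
    return monthly_cost * months

low = yearly_cost(1000)
high = yearly_cost(3000)
print(f"${low:,} - ${high:,} per Physician per year")
# $12,000 - $36,000 per Physician per year
```

The upper end of that range is where the "more than $30,000 yearly" figure for transcription services comes from.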
Attorneys - Let me tell you a short story
Once upon a time, there was an attorney who
had a secretary. This attorney depended on his
secretary for most of the office work. One day,
there was a little friction in the office and she
left the job. This was two days before his
vacation and there was a lot of work to be done.
Well, he did everything that had to be done, but
he couldn't type and his handwriting was bad. He
knew if he sent his notes to transcription, they would
all come back with errors. He decided to take his
handwritten notes and spend 2 hours a night on
his vacation typing them in himself. Well, he
finished, and it only took him an average of 4
hours per night to finish.
Attorney (before LawTALK): "I hate this stupid computer"
Well, to make a long story short, he bought
LawTALK for DOS and now does all of his documents
by voice. He doesn't depend on anyone for
anything now.
Attorney now: "Computer, print document and read"
Executives
Executives can use this technology as a great presentation
tool or office tool. You can simply tell your
computer, "Smith & Company file"
and watch the computer bring it up. You can call
up reports, or say "Resume Draft" to
continue editing a draft. You can also use this
with your contact management program. Give it
commands like "Find Company", "21st
Century Eloquence", "Print
Profile", "Call Company". Pretty
darn cool!
Keyboard Impaired Individuals
For Keyboard Impaired individuals such as
people with Carpal Tunnel Syndrome or other
Disabilities, this will allow you to be as
productive as any other person who types. Let's
face it, whether it is for college, work or
hobby, we all type at some point. Dragon Dictate
is the only "Hands Free" product in
this roundup.
Programmers
This may be the most lucrative way to enhance your
applications and your programming. You can build
voice-ready applications, and your product will be
state of the art. Most
programmers type fairly well, but can you imagine
telling your program "For Next Loop"
and having it bring in a long procedure written by
you? You can essentially create voice batch files, so
you can call procedures, objects, libraries or
functions instead of typing them in. Now that
will save you programming time. You can also
give your applications a
"voice-aware" option for easy
navigation, data entry, and functionality.
Currently, you can play a Klingon game by voice
and do lots of other things. Voice-aware
applications will be the next generation.
"Mark my words"
Well, lots of people really don't know how to
choose from all of the software, and truly, I
don't blame them. There is a lot of software and
each product has different features. I'll try to
give my best suggestions depending on your needs.
Look at my reviews for specific likes and
dislikes, then refer to my software suggestion list
broken up by profession.
Reviews.
| Product | Likes | Dislikes | Overall rating |
| Articulate PowerSecretary | Dictate 45 - 55 wpm; adapts very quickly. (No program to compare it to.) | You must train the system before using it. | 3 1/2 stars |
| Dragon Dictate | Very adaptive; quick dictation; continuous speech in various areas. Best technical support. The product is SOLID. Easy to use and learn. | You must switch back and forth between dictation mode and command mode. | 4 1/2 stars |
| IBM VoiceType | Dictate without looking at your screen. Very accurate. Large vocabulary. | No ability to correct or change while dictating. | 4 stars |
| Kolvox OfficeTALK | Fills in all the missing pieces from dictation engines. Adds the most obviously wanted features. | They do not have their own engine. | 4 stars |
| Kurzweil Voice | Easy to correct text. Commands and text always active. Very easy to learn and handle. | Learns slowly. Dictation speed tops out at 60 wpm. | 3 1/2 stars |
Software suggestion list
Please send me your
suggestions, questions & disagreements.
Please be advised that the preceding article is
based on the opinion of the ex-chief voice recognition
specialist at 21st Century Eloquence. In no way
does this article represent 21st Century
Eloquence, Kurzweil Applied Intelligence, Kolvox
Communications, Dragon Systems, IBM,
MicroIntrovoice or any other company mentioned.
Also, companies / products mentioned are
Trademarks or Registered Trademarks of their
respective entities.
This information may not be reproduced in ANY
media unless written permission is granted.
I hope I have been helpful in guiding you
through Voice Recognition Technology. Overall,
all products do more or less what they're
supposed to do: "Type while you
dictate".
When you are ready to order, you may click here, or from
anywhere in this site, click on the ORDER button.
To contact me, you may email.
Thank You.