Eric S. Fishman, M.D.
Palm Beach, Florida, USA: August 15th, 1997
There is a new landscape in Speech Recognition Software as of June 23, 1997. Many of the older products are either no longer sold, or are merely shadows of their former success.
Why is this? It is because of the tremendous progress made in continuous speech, aka natural language programs.
For years, while we spoke ... like ... this ... to our computers, pausing after every word, we were all waiting for the day when we could speak like this to them.
In September 1996, IBM announced the first shipments of IBM MedSpeak/Radiology, the first truly continuous speech recognition program. It is an incredibly accurate program. Unfortunately, it is exclusively for radiologists. IBM has since announced IBM MedSpeak/Pathology, again a continuous speech program for only one specific medical specialty.
Notwithstanding their excellent quality, these programs have made only a tiny dent in the continuous speech market because each is limited to a single specialty.
However, on June 23, 1997, Dragon Systems started shipping the first truly continuous speech, general purpose, large vocabulary speech recognition system. It is called Dragon NaturallySpeaking, Personal Edition, V1.0, and has truly turned the industry upside down.
Dragon NaturallySpeaking allows for continuous speech processing on any high-end Pentium computer with loads of RAM. Minimum requirements are a Pentium 133 with 32 MB of RAM (48 MB for Windows NT); however, we highly recommend a Pentium 200 or better with 48 MB of RAM, even for Windows 95.
With this hardware and software combination what can you do? You can talk to your computer almost as quickly as a New Yorker speaks. And if you are from the Midwest, you probably won't be required to alter your speech patterns at all. Accepting speech at clocked rates of over 160 words per minute, with an accuracy frequently quoted in the high 90% range (98% - 99% and higher are frequently mentioned in the voice-users email group), it is truly incredible. Most users are getting 95% accuracy immediately after the entertaining 20 minutes of initial training.
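To put those accuracy figures in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes that the quoted accuracy is measured per word and that errors are spread evenly through the dictation; neither assumption comes from Dragon Systems, and the function name is purely illustrative.

```python
def expected_errors_per_minute(words_per_minute, accuracy):
    """Estimate misrecognized words per minute of dictation.

    Assumes accuracy is a per-word fraction (0.95 means 95%).
    """
    return words_per_minute * (1.0 - accuracy)


# 160 words per minute is the clocked rate quoted above.
for accuracy in (0.95, 0.98, 0.99):
    errors = expected_errors_per_minute(160, accuracy)
    print(f"{accuracy:.0%} accuracy at 160 wpm: about {errors:.1f} errors per minute")
```

Under these assumptions, the gap between 95% and 99% is the difference between correcting roughly eight words a minute and correcting fewer than two.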
It is not Nirvana, however. Unfortunately, the initially released Personal Edition (the only one available until October of this year; see below) does not permit the user to dictate into anything other than its own word processor. To get your information into Microsoft Word, for instance, you need to have the NaturallySpeaking window and the Word window open simultaneously. After your dictation you say 'copy to clipboard', 'switch to previous window', 'paste that', and your dictation has miraculously moved into Microsoft Word. The NaturallySpeaking word processor does have a few features such as centering, bolding, and some font characteristics. However, the more sophisticated formatting will need to be done within your word processor.
Additionally, macro capabilities are severely limited, particularly when compared with those of Dragon Dictate and other discrete speech programs. Dragon NaturallySpeaking will allow for any continuous string of up to 128 characters, including spaces. However, it will not permit non-printing characters, such as the Enter key, within a macro. Thus, if you wish to make your letterhead into a macro, you will need a separate macro for each line of your address. With the recently announced and soon to be released (October 1997) Deluxe Version of NaturallySpeaking, some of these deficits will be corrected.
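As an illustration of what the macro limits mean in practice, the following Python sketch plans how a block of text would have to be divided into separate macros. It does not use any Dragon Systems interface; it only applies the two rules described above (at most 128 characters, no non-printing characters), and the function name `plan_macros` is hypothetical.

```python
MAX_MACRO_CHARS = 128  # limit described above, spaces included


def plan_macros(text, max_chars=MAX_MACRO_CHARS):
    """Split text into strings that would each fit in one macro.

    Line breaks cannot be stored in a macro, so every line of the
    input becomes at least one separate macro.
    """
    macros = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # nothing to store for a blank line
        if not line.isprintable():
            raise ValueError(f"non-printing character in line: {line!r}")
        # Crude split of over-long lines into chunks of max_chars;
        # a real plan would break at a word boundary instead.
        for start in range(0, len(line), max_chars):
            macros.append(line[start:start + max_chars])
    return macros


letterhead = """21st Century Eloquence
7108 Fairway Drive Suite 101
Palm Beach, FL 33480"""

for number, body in enumerate(plan_macros(letterhead), start=1):
    print(f"macro {number}: {body}")
```

For a three-line address this yields three macros, one per line, which is exactly the inconvenience described above.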
There is an acceptable workaround provided by a third party vendor, and for the adventurous, this may well be worthwhile. Speechtrieve has announced a program called SpeechLinks. SpeechLinks is expected to offer many of the capabilities that Dragon Systems is building into its Deluxe Version of Dragon NaturallySpeaking, as well as many features that will not be in the Deluxe Edition. In fact, SpeechLinks links Dragon NaturallySpeaking's capabilities with those of Dragon Dictate, offering the sophisticated macro capabilities and command and control found in Dragon Dictate. Of course, at the present time, the end-user must purchase a separate Dragon NaturallySpeaking license for each user on one machine if it is used in conjunction with SpeechLinks. SpeechLinks retails for $129. To be useful, however, the user must have a copy of Dragon NaturallySpeaking as well as Dragon Dictate. Using SpeechLinks, however, will preserve that enormous macro file that you have built up using Dragon Dictate during the previous few years.
Dragon NaturallySpeaking was initially announced at a manufacturer's suggested retail price of $695. It quickly came out with a street price of $299. The current manufacturer's suggested retail price is now $349, and it is still widely available through Value Added Resellers at $299.
In addition, a similar product is just now going into general distribution through cataloguers and mass merchandisers. The price at these locations is not known at this time, but is expected to be somewhat lower than $299, and possibly as low as $199.
There are some differences between the products available through mass merchandisers and the Value Added Reseller channel that should be kept in mind by all but the extremely price conscious. One of these differences is the ability to upgrade to the soon to be released Deluxe Version. It is anticipated that end-users who purchase through the Value Added Reseller channel will have less expensive, and thus easier, access to the upgrade to the Deluxe Version. While this is a matter of personal business practice with each Value Added Reseller, it is anticipated that the upgrade will have a manufacturer's suggested retail price of over $300, but will likely be available for significantly less than this through the VAR channel for at least the remainder of 1997.
What features will the Deluxe Version include that are currently lacking in the Personal Edition, and will they be worth the upgrade price? As usual, Dragon Systems is true to its end-users, and is promising a significant improvement over the Personal Edition.
The following additional features will be available:
It is expected that the manufacturer's suggested retail price of Dragon NaturallySpeaking Deluxe Version will be $695 when released. Release is expected in October 1997. Again, there will be a relatively gentle upgrade path for all end-users who purchase the Personal Edition through the VAR channel. Because of the anticipated low upgrade cost through the VAR channel, it is probable that an end-user who purchases Dragon NaturallySpeaking Personal Edition now and upgrades to the Deluxe Edition when available will end up paying less in total than an end-user who waits and purchases just the Deluxe Edition.
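The buy-now-and-upgrade reasoning can be checked with a little arithmetic, sketched here in Python. The $299 and $695 figures are the ones quoted in this article; the upgrade prices in the loop are purely hypothetical, since no VAR upgrade price has been announced, and the comparison assumes the Deluxe Version actually sells at its expected list price.

```python
PERSONAL_STREET_PRICE = 299  # current VAR price quoted in this article
DELUXE_LIST_PRICE = 695      # expected list price quoted in this article


def total_via_upgrade(upgrade_price, personal_price=PERSONAL_STREET_PRICE):
    """Total paid when buying the Personal Edition now and upgrading later."""
    return personal_price + upgrade_price


def break_even_upgrade_price(personal_price=PERSONAL_STREET_PRICE,
                             deluxe_price=DELUXE_LIST_PRICE):
    """Upgrade price at which both purchase routes cost the same."""
    return deluxe_price - personal_price


print(f"break-even upgrade price: ${break_even_upgrade_price()}")

# Hypothetical upgrade prices, for illustration only.
for upgrade_price in (150, 250, 350):
    total = total_via_upgrade(upgrade_price)
    saving = DELUXE_LIST_PRICE - total
    print(f"upgrade at ${upgrade_price}: total ${total}, saving ${saving}")
```

Under these assumptions, any upgrade price below $396 makes buying the Personal Edition now the cheaper route.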
While Dragon NaturallySpeaking is currently available only in US English, Dragon does have plans to release this program in multiple additional languages in the near future. Sorry, we cannot give exact details regarding these languages at this point in time.
Dragon Systems, to the best of our knowledge, does not intend to issue their continuous speech program in a Macintosh compatible format any time in the immediate future.
Where does this leave Dragon Dictate, and why would someone wish to purchase it?
Dragon Dictate Version 3.0 has been announced, and will soon be released. Again, it comes with a significant price drop, and a change in the breadth of products available. It will be available simultaneously in U.S. English, British English, French, Spanish, German, and Italian.
First of all, the $99 Singles and the low-end Personal Editions have been dropped from the line-up. The Classic Edition, by far the previous best seller, previously retailing at $695, is now offered at $149, and the Power Edition (previously $1695) is now offered at $695. The Classic Edition offers a 30,000 word vocabulary and the Power Edition offers a 60,000 word vocabulary. The Power Edition comes with one of the DragonPro specialized vocabularies, such as Medical, Legal, Technical, Business or Journalism. Additional DragonPro vocabularies are available for $299.
Dragon Dictate 3.0 offers the following features:
Dragon Dictate continues to support Windows 3.x and Windows 95, and has limited support for Windows NT.
IBM has been successfully selling its IBM MedSpeak/Radiology to small and large Radiology installations throughout the United States. It is compatible with large hospital mainframes as well as with stand alone diagnostic centers.
However, IBM is only entering the large vocabulary, general purpose, continuous speech dictation software business this month.
IBM's ViaVoice was announced in June 1997, with a scheduled release date sometime in August 1997 and a manufacturer's suggested retail price of $199. This price has been lowered even before its initial release, and it is now expected to go on sale at $99.
ViaVoice has many of the features of Dragon NaturallySpeaking, but is lacking at least one major feature: the ability to correct your written text by voice. Dragon NaturallySpeaking has significantly improved its correction methods, now allowing you to spell out your corrections using just the alphabet. For instance, you would correct the word 'two' by saying 't w o'. Gone are the days of the alpha bravo charlie alphabet for corrections. However, IBM's ViaVoice offers neither of these methods, insisting on manual correction using the keyboard.
It does, however, currently allow for dictation within Microsoft Office Word 97. This is a feature which is expected in the Deluxe version of Dragon NaturallySpeaking. However, again, it will be available with the first release of ViaVoice.
Philips Speech Processing has released a number of subject-specific continuous speech recognition programs, although they have not yet gained wide acceptance. Offering programs for radiologists, orthopedic surgeons, and emergency physicians, as well as legal contexts for litigation and bankruptcy, the Speech Magic programs are just beginning to be successful in a number of beta site installations. It is expected that these will become significantly more widely available, at significantly lower prices, within the next 3-6 months. Currently a Philips Speech Processing system, including the underlying engine and the context, is likely to cost between $5,000 and $10,000. While no firm date or specifics for a pricing change have been offered, we are expecting this to drop below $5,000 by the beginning of next year. For those whose specific field of work has been covered by one of the Philips products, these will soon be a viable, albeit significantly more expensive, alternative to the general purpose continuous speech dictation programs. However, the added features may well be worth looking into if price is of no major concern.
Kurzweil has joined with Lernout & Hauspie, and to the best of our knowledge, has not released any new products recently. They are still offering Kurzweil VoicePad, VoicePlus and Kurzweil VoicePro. These are discrete speech products, and while they have their adherents, and were state of the art prior to the release of the currently available continuous speech products, they are not as successful at the present time due to their discrete speech nature. However, the VoicePad, at under $50, is still available for students etc., who do not have the funds to purchase the hardware necessary to run the more sophisticated continuous speech programs.
We anxiously await what the combination of Lernout & Hauspie and Kurzweil will produce as both of them have had viable products in the past, and with their combined resources, it is expected that they will offer again, and probably in the near future, state of the art products for the vertical markets in the speech recognition field.
Unfortunately for WildCard, previously known as Kolvox, essentially all of the add-on features that it packaged under the names LawTalk and OfficeTalk have been included in the manufacturers' more recent releases. Therefore these programs are no longer widely available or supported.
Voice Pilot is a program geared for those who are involved in international events. Using the IBM VoiceType engine, this program translates your spoken words into other languages. This program works inside of its own window. It still does not quite use a natural speech pattern. Nonetheless, it is an interesting concept.
VXI, known for making the best microphones available for speech recognition, has several styles of microphone that make the speech system more comfortable. There are even adapters that allow you to use the telephone and speech system with the same microphone.
Andrea Microphones are widely accepted, tested and approved by many speech recognition manufacturers. Many manufacturers offer this microphone with the initial software purchase.
Previous Versions of the Speech Recognition Update are available. These earlier articles define many of the terms, and describe in fuller detail many of the products discussed above.
This article is ©1997 21st Century Eloquence. All rights reserved. However, any or all of this article MAY be reproduced as long as all identifying information concerning its origin is prominently noted in the reproduced work. You may identify us in any or all of the following manners using, as a minimum, our company name, phone number and url:
Eric S. Fishman, M.D.
21st Century Eloquence
7108 Fairway Drive Suite 101
Palm Beach, FL 33480
http://www.voicerecognition.com
http://www.voicerecognition.com/article_8_97.html
http://www.continuous-speech.com
http://www.speechrecognition.com
1.800.245.2133
1.561.689.0055
voice@voicerecognition.com