Full Version: Dragon iPhone App
VoiceRecognition.com Forum > Voice Recognition Software and Hardware > Dragon NaturallySpeaking & Dragon Medical
mcrose
Is the Dragon iPhone App the wave of the future? I was impressed with its accuracy and speed of processing.

Just wondering what the rest of you guys think....


MCR
Chucker
MCR,

The Dragon iPhone app is not performing any speech recognition on your iPhone. It's an interface that connects to the Dragon NaturallySpeaking transcription server over a phone connection through your cell phone service provider, just as if you were calling a friend to have a normal phone conversation. In other words, it dials a phone number (Nuance) and transfers what you say just as if you were talking to another person, and your speech is transcribed at the Nuance end as if you were dictating directly into Dragon NaturallySpeaking. In essence, it works like an answering machine, except that instead of recording what you say it engages the Dragon NaturallySpeaking transcription server and transcribes your dictation on the fly. The results of that recognition are then returned to you over the same connection as text, just as if your friend on the other end were texting you back after listening to what you said.

This type of transcription has been available for years. Various companies have set up methods whereby doctors dictate their reports by phone directly into a workflow-based transcription server, after which the transcription (text) is converted into a document and sent back to the doctor as a file. The difference here is that instead of being sent back to you as a file, the transcription is sent back to you using the texting capabilities of your cell phone. The transcription services available to doctors are more sophisticated, as well as more complex, but the principle is the same. The app just makes use of the capabilities of smartphones, in this case and at this time specifically the iPhone. At some point it will be available for most smartphones. Nevertheless, it works as if you were dictating remotely over the phone.

Basically, when you hit the record button, the application dials the Nuance phone number and listens to what you say. Then it sends the text back to you. Since it only takes a few milliseconds to perform the transcription, and since most users aren't going to sit there and dictate for half an hour, the process is very quick. It's also highly accurate because every time you use the application, everything that you say is used as data for improving the Acoustic Model on the server end. Everybody who uses the service contributes to the further improvement of the accuracy. This is why it's free at this point. Right now, Nuance is providing it to you free because you're scratching their back by providing the acoustic data that they are interested in acquiring. Since this is designed to be a speaker-independent service, a large corpus of acoustic data is necessary in order to improve the accuracy across many types of speakers. When Nuance feels that it has acquired sufficient data, it will offer the service for a specified monthly fee, and even then the data will still be collected for this purpose. At this point, Nuance is simply performing a long-term beta test and offering it to you at no charge.
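
For anyone who wants to see that round trip laid out, here is a minimal sketch in Python. It is only a toy model of the flow described above, not Nuance's actual service or API: the class name, the function name, and the lookup table standing in for the recognizer are all hypothetical. It shows the two points that matter: the client does no recognition itself, and the server keeps every piece of audio it receives as data for later improvement.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyTranscriptionServer:
    """Hypothetical stand-in for a remote recognizer (not Nuance's real service)."""

    # A lookup table plays the role of the speech recognizer in this sketch.
    known_utterances: Dict[bytes, str]
    # Audio retained from every request, mirroring the data collection described above.
    collected_audio: List[bytes] = field(default_factory=list)

    def transcribe(self, audio: bytes) -> str:
        # Every request contributes to the corpus, recognized or not.
        self.collected_audio.append(audio)
        return self.known_utterances.get(audio, "[unrecognized]")


def dictate(server: ToyTranscriptionServer, audio: bytes) -> str:
    """Client side: no recognition happens here; send the audio, get text back."""
    return server.transcribe(audio)


server = ToyTranscriptionServer({b"audio-1": "hello world"})
first = dictate(server, b"audio-1")
second = dictate(server, b"audio-2")
print(first, second, len(server.collected_audio))
```

A real service would replace the lookup table with an acoustic model and the direct method call with a network connection, but the division of labor between handset and server is the same.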

There is nothing special here, except for the methodology and the fact that it is a public service. If you want to see how this has been used over the last few years, take a look at vendors like CustomsSpeechUSA, which provides this type of workflow and transcription service to doctors, lawyers, transcriptionists, etc. Also, Nuance is not the first to come out with this. IBM and VoxForge have been providing this type of service for a fee for a little more than a year now. However, their approach is not restricted to smartphones. Anyone with a cell phone can opt into that one. The difference there is that the results are sent back to your e-mail, because the average cell phone can't multitask like smartphones can.

Is it the wave of the future? One of them. If you want to look down the road, the technology is moving in the direction of the Star Trek computer. In the next 10 or 15 years, you will carry your personal computer in your pocket just like your iPhone, except that it will run on fuel cells that last three or four years while constantly on, have the power of today's supercomputers, and be always connected to the Internet via your ISP, which will also provide your phone service. You'll wear your video display as a pair of glasses, or sunglasses, that work like the heads-up displays available in automobiles but are as sophisticated as those in modern aircraft (e.g., military aircraft and the new Boeing 787). That is the technology of the future, and speech recognition will be the major user interface for it. Why? Where are you going to stick your keyboard, unless you have a pair of jeans with an oversized back pocket? Even with touch keyboards on your cell phone, or in this case your computer, who's going to want to use them? They will be an incredible time waster when speech is much faster and, at that point, will be pretty close to 100% accurate most of the time.

Chuck Runquist
Technical Project Manager
VoiceTeach LLC

We live in a society exquisitely dependent on science and technology, in which hardly anyone knows anything about science and technology. - Carl Sagan
mcrose
Hi Chuck --

Thank you for your lengthy and very informative reply -- especially your view of the future of computing/voice recognition, etc. One of my psychiatrist buddies 'outsources' his transcription to India where it is almost instantaneously transcribed and returned to him for pennies on the dollar. He is still leery of DNS and does not yet feel it is ready for prime-time -- at least for him.

Thanks again. I have learned a ton of stuff about DNS, computers, etc. reading your posts. I hope you will continue to illuminate us.

Merry Christmas!

MCR



Chucker
MCR,

I would be willing to bet that your psychiatrist buddy would be very surprised to find out that his transcriptions were being done by an outsourced service using Dragon NaturallySpeaking.

Merry Christmas to you, and have a safe and happy holiday.

Chuck Runquist
GEMCCON - The Choice of Intelligence
Speech Recognition Consulting and Training

We would often be sorry if our wishes were gratified. - Aesop (620 BC - 560 BC)
KnowBrainer Tech Support
Just adding to Chuck's excellent advice... NaturallySpeaking includes a 30-day customer satisfaction guarantee with no restocking fee, which is backed by the manufacturer. It also helps to purchase NaturallySpeaking from a Nuance-licensed speech recognition solutions partner for support reasons. The 2 most common reasons for NaturallySpeaking returns are a lack of support and/or a substandard sound system (microphone & soundcard).