> Pros and Cons of voice recognition program
Alvin l
post Jul 28 2004, 09:07 PM
Post #1


Junior Member
*

Group: Members
Posts: 2
Joined: 28-July 04
Member No.: 1,566



Hi people!
I'm doing a project on voice recognition. Can you help by telling me the pros and cons of voice recognition, the impact on society of using this program, and how to tackle the disadvantages? I really need your help, so please reply as soon as possible. Thank you.
 
John Wickett
post Jul 29 2004, 12:51 PM
Post #2


Moderator
****

Group: Super Moderators
Posts: 4,266
Joined: 9-January 01
From: wickworks@xplornet.com
Member No.: 51
VR User?: about 18 years
Which Program?: DragonDictate then DNS
Why do you Use VR?: Ease, speed, and to stay current to assist disabled clients
Where are you Located: Village Green, PEI, Canada



Marty,

> >People can think you are strange, progressive, anti-social because you talk to your computer.

People used to think that of me, but now that my computer talks back to me they are convinced! :-)

John
Dictated with Dragon Naturally Speaking
John Wickett
PEI Canada
902-368-0141
902-651-2152
jwickett@auracom.com
john.wickett@vac-acc.gc.ca
 
BryanCrow
post Aug 3 2004, 02:17 AM
Post #3


Member
***

Group: Members
Posts: 115
Joined: 9-June 03
Member No.: 1,141



Alvin, I regret to say that I'm inclined to agree with Martin that speech recognition is unlikely to become 100% accurate in my lifetime, but then I'm 65. Without captions, I can't understand a lot of movie dialogue myself! I used to think that was just because my hearing isn't as good as it used to be, and it’s true that it isn't. But teenagers and people in their twenties really don't understand movie dialogue much better when they watch with me. It turns out that they just gloss over what they don't understand.

I wonder if human speech recognition is 99% accurate.

A futurist named Ray Kurzweil believes computer science is advancing much faster than we realize. Let me recommend his The Age of Spiritual Machines: When Computers Exceed Human Intelligence. In this book, he traces the history of computing devices and especially the progress made in the past few years. He makes a fairly persuasive argument that computers will soon outstrip human brains— at least, in their structural complexity. That's pretty impressive when you consider that there are 100 billion neurons between your ears, each connected to as many as a thousand others.

As anyone can tell, I really don't know much about this subject myself, but friends who know this field better than I tell me that the main challenge is software instead of hardware. Futurists tell us that the next step in software will enable computers to "Do as I mean" instead of doing quite literally exactly what we command. Furthermore, according to my computer gurus, software designers will increasingly use computers to handle many of the difficult aspects of program writing, which is already accelerating the advancement of software.

I'm a big fan of novelist Greg Iles whose most recent book, Footprints of God, was inspired by Kurzweil's book. In the Iles novel, scientists construct a device with the computational power of the human brain. Then they go a step further. Using high-speed microscopic imagery of actual human brains, they're able to trace its moment-by-moment circuitry—that is, the circuitry of human thought, human feeling, and even human spirituality.

All of these things, scientists claim, are really different aspects of the same processes. It may be convenient and comforting to divide them into intellectual and nonintellectual processes, but you can't do mental arithmetic without having some feeling about it or having some feeling at the same time. And even the most intense emotions have content. By that, they just mean that when we get mad, we get mad at somebody or about something. You don't just get mad, do you?

If you see where I'm going, you see where the plot of the novel goes. The scientists in the Iles novel copy human personalities into a machine. Iles’ genius is that one of these personalities has a profound religious experience at the site of Christ's death on the cross. I won't spoil the book by telling more, but I think it's notable that Iles can accept the implications of both technology and spirituality.

100% accurate speech recognition in my lifetime or Martin's? Unlikely, indeed, but I do believe HAL is coming. I believe speech recognition will begin to usurp mouse and keyboard in our lifetimes. I bought the first version of Dragon NaturallySpeaking and every succeeding one. The early versions seem extremely limited and flawed to me now, but a lot of us hung in there and kept working with it. (Don't get me wrong, Alvin, I'm one of the least expert people here.) Progress has come faster, in my opinion, in the last version or two, and DNS 7 was, for me, a dramatic breakthrough.

Again, I’m just speaking for myself when I say this: Speech recognition (and I mean DNS, because I know of nothing currently available to the public that's better) is no longer a toy or a novelty. It’s an entirely practicable technology now. I can definitely create Word documents faster with it than I can produce them by typing. And my typical document is hardly boilerplate, as you see. What’s more, the job goes easier as well as faster. I’m almost as adept now at thinking for dictation as I am thinking through my fingertips or with a roller writer. The way things are going, I think I’ll soon be able to compose prose better by (as Martin would put it) "talking to the computer."

I started using computers in 1972, desktop-sized ones in 1978, and PCs in the early nineties. In 1996, Newsweek wrote that computers were about to revolutionize the world. I was skeptical, but they have changed us faster than I anticipated. I think we’re at the Model T stage. Before Ford’s T Model, it was mostly rich hobbyists who operated motor cars, and they either had their own mechanics or they had mechanical aptitude themselves and time to fool around with a then-unreliable technology. The Model T ushered in a big change. John Doe could afford it, and while it seems primitive now, it was reliable enough to be truly useful.

Its tires regularly went flat, it wouldn’t start or it broke your arm when you cranked it, and it leaked weather, but it had one huge advantage over the horse: speed.
If you had a horse-drawn carriage as comfortable as a Mercedes S 600, it wouldn’t be very useful at all compared to the T Model. I rode 100 miles on my bicycle Monday. No horse could cover that distance, even in a whole day. I ran the Boston Marathon in 1980. If there had been horses in that race, even those of us in the middle of the pack would’ve quickly run them down. Horses can run fast only for short distances.

Why do we fly? Speed may kill, but it’s the very raison d’être for the automobile.

I say computers are at the Model T stage. They were supposed to increase our productivity, and they have—despite all their glitches. But they’ve hardly even begun. They’re buggy and quirky, and while it’s human to screw up, the computer gives us the facility to do it royally. But that's becoming less and less true. We’re headed for that day when computers, as I’ve said, won’t “blindly” follow our commands but will do as we intend them to, will help us make them do what we want.

You ask what the implications of speech recognition are. That can only be answered by predicting what computers will mean to us. It’s not science fiction to foresee a kid coming home from school and having the computer hidden within the structure of his home recognize him and unlock the door yet keep it locked if he’s accompanied by an unknown adult. Why couldn’t it fix him a snack and draw his bath? Why, in other words, couldn’t it control all our appliances? Cleaning a house is a pretty complicated job. No single machine can do it today, but under the direction of the house computer or some central computer somewhere, there’s no reason machines can’t do anything a human servant can do.

How will we direct machines in the future? I don’t think it’s at all farfetched to expect speech recognition to prevail over current interfaces. It isn’t some isolated technology. It’s computer technology.


--------------------
Bryan Crow
 
John Wickett
post Aug 3 2004, 12:47 PM
Post #4


Moderator
****

Group: Super Moderators
Posts: 4,266
Joined: 9-January 01
From: wickworks@xplornet.com
Member No.: 51
VR User?: about 18 years
Which Program?: DragonDictate then DNS
Why do you Use VR?: Ease, speed, and to stay current to assist disabled clients
Where are you Located: Village Green, PEI, Canada



Bryan,

>>A futurist named Ray Kurzweil believes computer science is advancing much....

It may interest you to know that a number of years ago I had to pick between Kurzweil Voice and DragonDictate for my clients. Kurzweil was then the most accurate with regard to recognition, but DragonDictate made it much easier to make corrections. Since most of my clients have a mobility problem, the ability to easily correct and edit took precedence over the then extreme accuracy of Kurzweil.

John
 
BryanCrow
post Aug 3 2004, 01:01 PM
Post #5


Member
***

Group: Members
Posts: 115
Joined: 9-June 03
Member No.: 1,141



Wow, John. I remember Kurzweil did SR work. He mentioned it in either his most recent book or his previous one. I should've realized you'd know about him.

BC


--------------------
Bryan Crow
 
Alvin l
post Aug 4 2004, 09:37 PM
Post #6


Junior Member
*

Group: Members
Posts: 2
Joined: 28-July 04
Member No.: 1,566



Hmmm.... my teacher wants me to verify the source... Would you mind assisting me by maybe giving me your contacts and sources please, thanks.
 
BryanCrow
post Aug 5 2004, 01:47 AM
Post #7


Member
***

Group: Members
Posts: 115
Joined: 9-June 03
Member No.: 1,141



Yeah, Alvin, why don't you read Martin's first post again? You haven't made it clear whether it's speech recognition and its impact on society your teacher wants you to write about, as we assume, or voice recognition. And as Martin says, we don't know whether you're addressing him, since he answered first, or all of us who responded. I don't consider myself especially well-informed about speech recognition. I was just spouting opinions. The books about speech recognition that I'm aware of are either practical how-to books or highly technical theoretical treatises that I wish I could say I've read but haven't.


--------------------
Bryan Crow
 
jnuttallphd
post Aug 5 2004, 04:20 AM
Post #8


Member
***

Group: Members
Posts: 832
Joined: 21-April 02
Member No.: 543



Hello forum members:

I'm a bit more optimistic than Bryan. I think we are going to get pretty much 100% accuracy within 10 years. I'm not at 100% speech recognition myself. Ask my wife how often I say, "What was that?" Of course, she thinks this is "convenient spousal hard-of-hearingness".

For example, take a look at OCR work, which I do a great deal of. It used to be very buggy. I spent more time cleaning up documents than actually doing the scanning. But now FineReader Professional 7.0 is almost 100% accurate when doing OCR. By the way, Ray Kurzweil also made a significant contribution to OCR. Another good example is text-to-speech. Text-to-speech when I was at the University was produced by mechanical voice synthesizers. They sounded like Robby the Robot or the Chipmunks with a bad cold. Today, text-to-speech voices operate on any computer and are made up of real human speech phonemes. Thus, the audio books that I produced using text-to-speech perform better than the vast majority of human readers. And voices keep getting better and better. ScanSoft has a demonstration of a Speechify text-to-speech voice that is dramatic in its emphasis and indistinguishable from a professional actor reading a script.

Speech recognition is not the preferred interface for many people over the age of 30 :-(. We are just too tied to our keyboards/mice, due to overuse. But for the younger generation speech recognition will become common and preferred. For example, many companies are now going to speech recognition for telephone center work.

By the way, I have low vision and I use speech recognition mainly for navigating my computer. I actually do little dictation. 90% of my work is the navigation of my computer. My voice is a substitute for the visual/mouse interface for Windows.

Jim -- Michigan

[This message has been edited by jnuttallphd (edited 08-05-2004).]


--------------------
Jim -- Michigan
 
Judy Evans
post Aug 5 2004, 07:10 AM
Post #9


Moderator
****

Group: Super Moderators
Posts: 2,242
Joined: 26-November 01
From: UK
Member No.: 369
VR User?: 10 years
Which Program?: NaturallySpeaking Pro 8.0
Why do you Use VR?: 1. I'm disabled 2. at home 3. ex-academic (disability retiree)
Where are you Located: Cardiff, Wales



Hello Alvin

We are our own contacts and sources! (I didn't reply because your query was a bit vague and I haven't got time to write an essay on, for example, the actual and potential impact on society of voice and speech recognition -- if you mean *voice* recognition, you should perhaps ask elsewhere, though some of us here do know about that.)


>>>
Hmmm.... my teacher wants me to verify the source... Would you mind assisting me by maybe giving me your contacts and sources please, thanks.
>>>

*Verify* the source/s? or simply provide one/s? If you want first-hand sources you've been given one here, Ray Kurzweil: look him up.

We could help you more (if we had the time...) if we knew what you're studying and what course this is for and how much time you have (etc.).

Judy
 
BryanCrow
post Aug 5 2004, 09:08 AM
Post #10


Member
***

Group: Members
Posts: 115
Joined: 9-June 03
Member No.: 1,141



Dear Jim,

Maybe we shouldn't waste forum resources with attaboys, me-too's, and golly gee's, but golly gee, I enjoyed your post. What text-to-speech application do you recommend? I also enjoyed the post you wrote to John and me about CPU power.

The following is for Alvin, but it bears on your comments:

Bill Gates is betting that advances in hardware and computing will make it possible for computers to interact with people via speech and that computers which can recognize handwriting will become as ubiquitous as Windows. This is what he said recently: "Ten years out, in terms of actual hardware costs you can almost think of hardware as being free -- I'm not saying it will be absolutely free -- but in terms of the power of the servers, the power of the network will not be a limiting factor. Many of the holy grails of computing that have been worked on over the last 30 years will be solved within [the next] 10-year period, with speech being in every device…a device that's like a tablet that you just carry around."


In the Toronto Star on January 6, 2000, Myles White wrote this:

“By 2010, continuous speech recognition will have finally come of age when its developers discover that a 200 GHz computer was all it really needed.”

This was not said altogether with tongue in cheek.

The rest of his predictions are available at
http://www.computerwriter.com/Star/2000/ja...predictions.htm

Most of his predictions through 2004 have been exceeded.

This last quote is from Nick Stam on The ExtremeTech website
http://www.extremetech.com/article2/0,3973...3,533051,00.asp

He wrote this in September 2002.

One of the big goals today: Simplifying human/machine interaction, particularly improving the accuracy of speech recognition. The MRL is refining algorithms that process array microphone inputs to form conical beams to effectively isolate noise, while using sensors to allow that beam to "track" the talker as he moves.
Such tasks are quite CPU-intensive, but it doesn't stop there. A video camera will read human lip movement and combine the visual data with the audio inputs to improve accuracy, particularly in noisy environments. In a study involving 295 people, Intel found that this technique reduced recognition errors by 55% versus audio-only.

Speech recognition is fraught with many other computational problems, like interpreting different dialects and slang, or using contextual analysis to recognize identical sounding words that have different meanings (which/witch, bear/bare, etc). MRL is trying to understand the problems, define algorithms that help solve those problems, and then develop new microprocessor hardware architectures and/or instruction set extensions if needed, so the algorithms can run most effectively. Pinfold explained that natural signal processing requires enormous amounts of CPU power. It'll be a long time before your notebook can act as a high-quality speakerphone, and be able to take voice-dictation with 100% accuracy.

BC


--------------------
Bryan Crow
 
jnuttallphd
post Aug 5 2004, 09:24 AM
Post #11


Member
***

Group: Members
Posts: 832
Joined: 21-April 02
Member No.: 543



Hello Bryan:

For text-to-speech I highly recommend a program called TextAloud MP3 at www.nextup.com

If you purchased their TextAloud 1.0 you can go to their user forum and download the beta copy of 2.0. I also purchased two sets of the voices -- AT&T Natural Voices (Mike and Crystal for US English); there are some good voices for UK English too. I also purchased Neospeech Paul and Kate, which are an improvement over the AT&T voices. These are also better voices than Jennifer, which is bundled with Dragon. ScanSoft has also been acquiring better voices.

With my low vision I scan books and then make them into audio books using TextAloud. Fortunately I just got access to a high-speed scanner that can scan a 400-page book in 15 minutes. This beats the two days of hard work that I used to do for such scanning.

Jim -- Michigan


--------------------
Jim -- Michigan
 
BryanCrow
post Aug 5 2004, 09:55 AM
Post #12


Member
***

Group: Members
Posts: 115
Joined: 9-June 03
Member No.: 1,141



Thanks, Jim.

I didn't throw out that 1998 Dell. I have a wonderful German soundcard Toslinked to a MiniDisc as well as applications, some good old ones, that don't support XP. The old Dell is mainly a jukebox now. My wife and I listen to things like NPR and Audible.com throughout our house.

Do you think a 450MHz CPU and 768MB of RAM can handle TextAloud MP3?

We won't be investing in a high-speed scanner like yours, but there are plenty of things on the Web we'd like to listen to.

Besides, I'm a writer manqué who subscribes to Elmore Leonard's dictum: "If it sounds like writing, rewrite it." It's useful to listen to what I've written, but the text-to-speech applications I have now, including the one that comes with DNS, sound disappointingly unnatural to me.

Bryan


--------------------
Bryan Crow
 
John Wickett
post Aug 5 2004, 12:39 PM
Post #13


Moderator
****

Group: Super Moderators
Posts: 4,266
Joined: 9-January 01
From: wickworks@xplornet.com
Member No.: 51
VR User?: about 18 years
Which Program?: DragonDictate then DNS
Why do you Use VR?: Ease, speed, and to stay current to assist disabled clients
Where are you Located: Village Green, PEI, Canada



Bryan and Jim,

Looks like you two have been very busy today, and have even managed to coax Judy out of her lair!

Those of us who have been dictating for a number of years, and have learned the proper techniques of accurate dictation, operate at close to 100% accuracy. For myself, any inaccuracies are either words that I have never used before with DNS, mispronunciation on my part or not dictating long enough phrases for DNS to interpret what words are required.

By this I mean that if I dictate the phrase "I would like to book a table for four people" as one continuous phrase then DNS is able to distinguish between "for" and "four". If I paused in the middle of that phrase then DNS would likely misinterpret it.

If I was in the woods, I could then ask the bear to bear with me even though I was bare because I just climbed out of the swimming hole! I could also go crazy here and go to a coven and decide to try to find out which witch was which.
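The point about longer phrases can be sketched with a toy example. This is only an illustration, assuming a simple word-pair (bigram) model; it is not a description of how DNS works internally, and the counts, the `HOMOPHONES` table, and the function names are all made up for the sketch.

```python
# Toy illustration only: the counts below are invented, and this is not
# how Dragon NaturallySpeaking is implemented.
HOMOPHONES = {"for": ["for", "four"], "four": ["for", "four"]}

# Hypothetical counts of adjacent word pairs seen in some training text.
BIGRAM_COUNTS = {
    ("table", "for"): 50,
    ("table", "four"): 1,
    ("for", "four"): 20,
    ("for", "for"): 1,
    ("four", "for"): 2,
    ("four", "four"): 1,
    ("for", "people"): 2,
    ("four", "people"): 40,
}


def score(words):
    """Multiply add-one-smoothed pair counts along the word sequence."""
    total = 1
    for prev, cur in zip(words, words[1:]):
        total *= BIGRAM_COUNTS.get((prev, cur), 0) + 1
    return total


def disambiguate(heard):
    """Try every spelling of each homophone and keep the best-scoring sentence."""
    candidates = [[]]
    for word in heard:
        options = HOMOPHONES.get(word, [word])
        candidates = [c + [o] for c in candidates for o in options]
    return max(candidates, key=score)


if __name__ == "__main__":
    # Both sounds arrive as the same token; context decides the spelling.
    print(" ".join(disambiguate("a table for for people".split())))
```

With the surrounding words available, the pair counts favour "table for four people". A single word dictated on its own has no neighbours to score, so both spellings tie and the choice is arbitrary, which is roughly what happens after a pause in mid-phrase.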

No, we may never achieve 100% accuracy but I believe that more and more any inaccuracies are due to the users rather than the software.

John
Dictated with Dragon Naturally Speaking
John Wickett
PEI Canada
902-368-0141
902-651-2152
jwickett@auracom.com
john.wickett@vac-acc.gc.ca
 
jnuttallphd
post Aug 6 2004, 03:44 AM
Post #14


Member
***

Group: Members
Posts: 832
Joined: 21-April 02
Member No.: 543



Hello John:

I must agree with you. Whenever I'm getting really lousy accuracy it really is due to all my mumbling. I can't remember who mentioned this to me about a half year ago. But it's really true. When I make an effort to really enunciate clearly my accuracy goes way up. My dictation is faster and more accurate than my typing. So I do dictate my memos and e-mails rather than type them. In fact, as I have mentioned in other posts, my spelling is not very good, so I would rather dictate, since Dragon's spelling is better than mine!

Jim -- Michigan

[This message has been edited by jnuttallphd (edited 08-06-2004).]


--------------------
Jim -- Michigan
 
