Voice-activated products are used in a wide range of areas, including:
- Manufacturing
- Laboratory reporting
- Data management
- Inventory control
- Quality assurance
- Order fulfillment
- Field data capture
One industry where speech input is invaluable is food processing, where food
handlers must record FDA information as they perform inspections. Keyboard
data entry is impractical because using a keyboard while handling food
products would risk contamination. Speech input lets workers perform both
tasks simultaneously without compromising health regulations.
Features and Benefits
- Frees users' hands and eyes for other tasks. In situations where users
need their hands or eyes for other work, speech input allows them to
complete that work while simultaneously inputting data.
- Simplifies computing for novice users. Voice commands are more natural and
easier to remember than complex keystroke sequences. Speech input makes it
easier to use computers, reducing training time and lowering training costs for
new workers.
- Improves data entry speed and accuracy. Speech input can result in faster
data entry with fewer errors when used instead of manual typing.
- Automates processes requiring instant data access. Speech recognition makes
it easier to enter data on-line to improve the efficiency of automated
processes. Because systems gain immediate access to statistical data, they can
react more quickly to correct problems and speed throughput.
- Protects workers against repetitive stress injuries. By reducing the amount
of typing required to run your application, you also reduce workers' risk
of developing repetitive stress injuries (RSI), the leading cause of
work-related injuries today. Speech input gives people with RSI an alternative
to taking disability leave, and helps them continue to be productive employees.
Powerful capabilities for building voice interfaces
Developed by Dragon Systems, a world leader in PC speech recognition, Dragon
VoiceTools is a software developer's kit that enables you to add a
voice-activated interface to your application.
Here's how it works: A user speaks into a microphone attached to a
sound card inside the computer. The card converts the sound into digital form.
The Dragon Speech Driver, which handles voice processing and conversion,
interprets the digital signal and compares it to all the words you have defined
in your application. The Driver then determines which word was spoken and
returns it to your application.
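The compare-and-return step described above can be pictured with a minimal C sketch. It is an illustration only and does not use the SDAPI: the `WordModel` type, the `recognize_word` function, and the three-value "feature" vectors are hypothetical stand-ins, and real acoustic models are far richer than this toy nearest-match comparison.

```c
#include <assert.h>
#include <stddef.h>

#define FEATURES 3  /* toy feature count, chosen only for illustration */

/* Hypothetical stand-in for one word defined by the application. */
typedef struct {
    const char *text;        /* the word returned to the application */
    float model[FEATURES];   /* toy stand-in for the word's voice model */
} WordModel;

/* Compare a digitized utterance against every defined word and return
 * the text of the closest match (smallest squared distance). */
const char *recognize_word(const float utterance[FEATURES],
                           const WordModel *words, size_t count)
{
    assert(words != NULL && count > 0);

    size_t best = 0;
    float best_dist = -1.0f;   /* negative means "no candidate yet" */

    for (size_t i = 0; i < count; i++) {
        float dist = 0.0f;
        for (size_t f = 0; f < FEATURES; f++) {
            float d = utterance[f] - words[i].model[f];
            dist += d * d;
        }
        if (best_dist < 0.0f || dist < best_dist) {
            best_dist = dist;
            best = i;
        }
    }
    return words[best].text;
}
```

In this sketch the application supplies the word list and receives back only the matched word, which mirrors the division of labor between your application and the Speech Driver.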
Dragon VoiceTools lets you incorporate speech recognition as the primary
input source or as one source in a combination of input modes. It provides you
with everything you need to build applications that recognize human speech as
input.
DOS and Windows compatibility
You can integrate speech into your MS-DOS and Microsoft Windows applications.
Speech recognition with little or no user training
Dragon VoiceTools' speaker-independent acoustic voice models for
American English ensure that your application will recognize speech from
practically anyone.
Speaker adaptation
You can add the capability to learn and adapt to each user's voice for
improved recognition accuracy.
Speaker-dependent speech recognition
Once the Speech Driver has adapted the voice models to a specific user, they
become speaker-dependent. To accelerate this process, Dragon VoiceTools allows
you to enroll the user to create speaker-dependent voice models.
Continuous speech for digits
You can program continuous speech recognition for digits. This feature is
invaluable for speech input of zip codes, account numbers, social security
numbers and telephone numbers.
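As a rough illustration of what an application might do with a run of recognized digits, the C sketch below turns spoken digit words into a digit string such as a zip code. It assumes the recognizer hands back each digit as a lowercase word; that delivery format, and the names `digit_for_word` and `words_to_digits`, are assumptions made for this example and are not part of Dragon VoiceTools.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Map one spoken digit word to its character, or return 0 if the word
 * is not a digit word. */
static char digit_for_word(const char *word)
{
    static const char *const names[] = {
        "zero", "one", "two", "three", "four",
        "five", "six", "seven", "eight", "nine"
    };
    for (int i = 0; i < 10; i++) {
        if (strcmp(word, names[i]) == 0)
            return (char)('0' + i);
    }
    if (strcmp(word, "oh") == 0)   /* common spoken form of zero */
        return '0';
    return 0;
}

/* Convert a sequence of recognized words into a NUL-terminated digit
 * string. Returns the number of digits written, or -1 if a word is not
 * a digit word or the output buffer is too small. */
int words_to_digits(const char *const *words, size_t count,
                    char *out, size_t out_size)
{
    assert(words != NULL && out != NULL);

    if (out_size < count + 1)
        return -1;

    for (size_t i = 0; i < count; i++) {
        char c = digit_for_word(words[i]);
        if (c == 0) {
            out[0] = '\0';   /* leave no partial result behind */
            return -1;
        }
        out[i] = c;
    }
    out[count] = '\0';
    return (int)count;
}
```

Rejecting the whole sequence when any word is not a digit keeps a partly recognized account number or zip code from being stored by mistake.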
Flexible interface
You can integrate speech recognition into any source code that can call C
functions.
Library of C-function calls
Dragon VoiceTools contains all of the functions you need to control the
Speech Driver's powerful capabilities.
Dragon Speech Driver
For best performance, speech recognition functions are performed in a
separate Speech Driver that resides in extended memory. The Driver controls:
- Word recognition. The Driver compares utterances against the
application-specific vocabularies to determine which word was spoken.
- Vocabularies. The Driver handles the total collection of words defined in an
application, along with the word pronunciations and frequency information.
- States. The Driver manages developer-defined application states so you can
specify which words are available at any given time.
- Microphone input. The Driver sets up communication channels between the
Speech Driver and the microphone, and turns the microphone on and off.
- Word adaptation. The Driver can build and adapt word models for the speech
patterns of specific users.
- User files. The Driver stores and updates acoustic information for words and
phrases for each user.
- Speech data collection. The Driver collects, saves and loads speech samples
for training or recognition.
Developer-specified application vocabularies. Dragon VoiceTools
lets you design the vocabularies for your application.
The total vocabulary is limited to 10,000 words or phrases. Up to 1,000 words
or phrases can be active at any time. You specify the words and phrases to be
used within your application in an easy-to-read text format and compile them
with the Dragon VoiceTools Finite State Grammar Compiler.
In a large-vocabulary application, such as inventory control for an
automobile parts store, you could easily divide the inventory into categories.
Then you would define a different vocabulary state for each category, such as
chassis parts, engine parts, and transmission parts. Each state could have up
to 1,000 entries. For example, the vocabulary for the state engine parts might
include fuel injector, distributor cap, alternator, timing belt and spark plug.
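The idea of vocabulary states can be sketched in C as follows. This is not the Finite State Grammar format or the SDAPI; the `VocabState` type and the `state_is_valid` and `word_is_active` functions are hypothetical names used only to show how an application might think about which words are available in a given state, using the 1,000-entry limit stated above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ACTIVE_WORDS 1000   /* active-word limit stated in the text */

/* Hypothetical in-memory form of one vocabulary state. */
typedef struct {
    const char *name;            /* for example, "engine parts" */
    const char *const *words;    /* words and phrases active in this state */
    size_t word_count;
} VocabState;

/* A state is usable only if it is non-empty and within the limit. */
int state_is_valid(const VocabState *state)
{
    assert(state != NULL);
    return state->word_count > 0 && state->word_count <= MAX_ACTIVE_WORDS;
}

/* Return 1 if the word or phrase is available in the given state,
 * 0 otherwise. */
int word_is_active(const VocabState *state, const char *word)
{
    assert(state != NULL && word != NULL);
    for (size_t i = 0; i < state->word_count; i++) {
        if (strcmp(state->words[i], word) == 0)
            return 1;
    }
    return 0;
}
```

With a state named engine parts holding the five entries listed above, `word_is_active` would report timing belt as available and any chassis or transmission part as unavailable until the application switches states.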
Flexible speech model options. Dragon VoiceTools works by creating models of
the sounds of human speech. You decide how to handle the speech models for your
application, choosing one or any combination of these approaches:
- Use the speaker-independent acoustic voice models included with Dragon
VoiceTools.
- Build and adapt your own acoustic voice models.
- Include no models, and have users train the speech recognition.
Dragon VoiceTools components
(DOS and Windows) Dragon VoiceTools is a software developer's kit that
lets you integrate speech into your PC applications.
Software
DOS and Windows Speech Drivers: The Speech Driver is a program that handles
the processing and conversion of speech into data that your program can
understand. Dragon VoiceTools includes both DOS and Windows versions of the
Speech Driver.
Speech Driver Application Program Interface (SDAPI) Library: A
collection of C language functions that you can link to your program code to
control the Speech Driver. These functions work with Borland and Microsoft C
and C++ compilers.
Finite State Grammar (FSG) Compiler: A tool that creates a vocabulary file
specific to your application. The FSG Compiler compiles an easy-to-read text
file, which you create, that determines the words and phrases your application
will recognize at any particular time.
Speaker-Independent Acoustic Voice Models
Derived from a broad range of speakers, speaker-independent acoustic voice
models are included for over 110,000 of the most commonly spoken American
English words and phrases. These models allow you to develop applications
requiring little or no training.
Easy-to-follow Example Programs
DOS and Windows examples are included to guide you through the process of
integrating speech recognition.
Documentation
Programmer's Guide: A step-by-step explanation of the development
process and conventions used in speech recognition.
Reference Guide: A list of function calls in the SDAPI library, with
explanations and examples of each function.
Hardware
Microphone: The microphone shipped with the Dragon VoiceTools Developer's
System is certified by Dragon Systems to provide outstanding performance and
noise-canceling capabilities.
Additional Required Component: Sound card (sold separately). For DOS, Dragon
VoiceTools requires a Dragon Systems-certified DSP audio board with a 1/8-inch
microphone jack.
For Windows, you can use an industry-standard 16-bit sound card or a Dragon
Systems-certified DSP audio board.
To sum it up, Dragon VoiceTools is a software developer's kit that lets
developers write speech-aware PC software applications for either DOS or Windows
operating environments.
System Requirements
Operating system:
Windows 3.1 or above (not Windows 95) or DOS 5.0, on desktop and laptop PCs.
System:
IBM PC or compatible, 386 or 486 (486DX required for continuous speech
recognition of digits) 20 MHz or faster, ISA, EISA or MCA architectures. One
8-bit expansion slot. System requirements vary according to the application and
vocabulary size.
Audio:
Creative Labs Sound Blaster 16, Microsoft Windows Sound System, or Dragon
Systems-certified DSP audio board. For DOS, a Dragon Systems-certified DSP
audio board.
Memory:
5 MB RAM
Storage:
28 MB minimum free hard disk space, plus 2 MB to compile and store example
programs.