Regarding the error message and the single huge file: you can have
Vocabulary Builder process a list of smaller documents. You don't have to
dump all the text into a single large document. And no, this will not
adversely affect the language modeling as a whole; it was designed to work
that way.
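
As a rough illustration only (this is not how Vocabulary Builder is implemented, and the function names here are made up for the example), simple word-frequency statistics come out the same whether the text arrives as one big file or as a list of smaller ones:

```python
from collections import Counter

def word_counts(text):
    # Hypothetical helper: lowercase and split on whitespace.
    # A real tokenizer would handle punctuation and much more.
    return Counter(text.lower().split())

def counts_from_documents(documents):
    # Accumulate counts one document at a time.
    total = Counter()
    for doc in documents:
        total.update(word_counts(doc))
    return total

documents = [
    "the patient was seen today",
    "the patient was discharged",
]
one_big_file = "\n".join(documents)

# Same word frequencies either way.
same = counts_from_documents(documents) == word_counts(one_big_file)
print(same)
```

Statistics that look at neighboring words would differ only at the points where one document ends and the next begins.
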
Regarding Lincoln's Gettysburg Address, did you run the Vocabulary Builder
on that text first? Keep in mind that there's no such thing as "plain
English" (and if there is, "four score" certainly isn't it ;-). Unless
you're secretly Abraham Lincoln, this dictation is very probably *not*
typical of the files you processed with the Vocabulary Builder.
This is actually a common question. People have dictated everything from
Genesis (the Old Testament, not the rock band) to Jabberwocky, laughing at
the misrecognitions. But what is the probability that your average
businessperson or doctor is going to be dictating "begat" or "brillig"?
They stop laughing after they run the Vocabulary Builder on that text and
try it again. You might be surprised. Now all you have to do is figure out
how you're supposed to pronounce "slithy toves"!
Regards,
--Ted Kempster
--Dragon Systems, Inc.
> And *this* is what I said, around 160 wpm, 140 wpm and 120 wpm,
> respectively:
> Four score and seven years ago, our fathers brought forth upon this
> continent a new nation. Conceived in liberty and dedicated to the
> proposition that all men are created equal.