msm6679a-110 Oki Semiconductor, msm6679a-110 Datasheet - Page 22

Manufacturer Part Number: MSM6679A-110
Description: SI/SD Voice Recognizer, Recorder/Player, and Speech Synthesizer
Manufacturer: Oki Semiconductor
Datasheet
MSM6679A-110 Voice Recognition Processor
To achieve high accuracy rates, phrase selection, data collection, background initialization
strategy, and control software need careful consideration. There are no published standards for
recognition accuracy.
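Oki defines accuracy by:
    Accuracy = 100% - E_RATE
with the following definition:
    E_RATE = E_SUB + 1/2 E_REJ
where E_SUB is the substitution error rate and E_REJ is the rejection error rate (see Parameters for Recognition Accuracy).
A typical target accuracy of 97% is achieved with a 3% E_RATE, composed of a 1.5% E_SUB rate and a 3% E_REJ rate.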
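The accuracy definition can be checked with a few lines of arithmetic. The following is a minimal sketch in Python; the function names are illustrative and not part of any Oki tool, and `rate_percent` assumes that error rates are measured as a percentage of test utterances, which the text does not state explicitly.

```python
def error_rate(e_sub: float, e_rej: float) -> float:
    """Combined error rate in percent: E_RATE = E_SUB + 1/2 * E_REJ."""
    return e_sub + 0.5 * e_rej


def accuracy(e_sub: float, e_rej: float) -> float:
    """Accuracy in percent: 100% - E_RATE."""
    return 100.0 - error_rate(e_sub, e_rej)


def rate_percent(error_count: int, trials: int) -> float:
    """Turn an observed error count into a percentage of trials.

    Assumption: rates are measured per test utterance.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    return 100.0 * error_count / trials


if __name__ == "__main__":
    # Typical target from the text: 1.5% substitution, 3% rejection.
    print(accuracy(e_sub=1.5, e_rej=3.0))  # 97.0
```

Rejections count at half weight because the operator gets a chance to repeat the word, whereas a substitution produces a wrong result.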
SD Recognition
In SD recognition mode, the MSM6679A-110 can be trained to recognize up to 61 words. The
MSM6679A-110 can support multiple speakers by switching vocabularies, but only one speaker’s
vocabulary should be active at one time.
The end user enrolls a phrase in the MSM6679A-110’s vocabulary by recording the phrase three
times or more. The host Micro Controller Unit (MCU) controls the number of times each phrase
is enrolled. Generally, higher recognition accuracy is achieved with each additional enrollment.
The word set is made more robust by pronouncing each phrase slightly differently during initial
enrollment.
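The enrollment procedure can be pictured as a host-side loop. The sketch below is in Python for readability only; a real host MCU program would issue the device's own training commands, which are not shown on this page. `train_once` is a hypothetical callback standing in for "prompt the user and record one pass of this phrase".

```python
from typing import Callable, Dict, List

MIN_ENROLLMENTS = 3  # "three times or more" per the text
MAX_WORDS = 61       # SD vocabulary limit per the text


def enroll_vocabulary(
    phrases: List[str],
    train_once: Callable[[int, str], None],  # hypothetical: record one pass
    passes: int = MIN_ENROLLMENTS,
) -> Dict[str, int]:
    """Enroll every phrase `passes` times and return the pass count per phrase."""
    if passes < MIN_ENROLLMENTS:
        raise ValueError("each phrase needs at least three enrollment passes")
    if len(phrases) > MAX_WORDS:
        raise ValueError("SD vocabulary is limited to 61 words")
    counts: Dict[str, int] = {}
    for index, phrase in enumerate(phrases):
        for _ in range(passes):
            train_once(index, phrase)  # the host decides how many passes to run
        counts[phrase] = passes
    return counts
```

Raising `passes` above three trades a longer setup session for the higher accuracy the text attributes to additional enrollments.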
In addition to enrollment training, adaptive template updating can drive the accuracy towards
100%. The host MCU updates templates by first asking the speaker to confirm a recognized
phrase with a “yes” or “no” response, and subsequently updating the template for corresponding
words. The use of name tags (see next paragraph) facilitates this process.
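The confirm-then-update flow described above can be sketched as follows. This is an illustration in Python under stated assumptions, not the device's command interface: `recognize`, `play_name_tag`, `ask_yes_no`, and `update_template` are hypothetical callbacks that a host program would map onto the actual MSM6679A-110 commands, and the retry limit is an arbitrary choice.

```python
from typing import Callable, Optional


def recognize_with_confirmation(
    recognize: Callable[[], Optional[int]],   # hypothetical: word index, or None on rejection
    play_name_tag: Callable[[int], None],     # hypothetical: play the tag for that word
    ask_yes_no: Callable[[], bool],           # hypothetical: True if the speaker says "yes"
    update_template: Callable[[int], None],   # hypothetical: adapt the stored template
    max_attempts: int = 3,                    # assumption: not specified in the text
) -> Optional[int]:
    """Return the confirmed word index, or None if nothing was confirmed."""
    for _ in range(max_attempts):
        word = recognize()
        if word is None:
            continue  # rejection: give the speaker another try
        play_name_tag(word)  # let the speaker hear what was recognized
        if ask_yes_no():
            update_template(word)  # adapt only after a confirmed match
            return word
    return None
```

Updating only after a "yes" keeps a substitution error from being folded into the wrong word's template.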
Name Tag Recording
To facilitate SD recognition, the MSM6679A-110 supports recording and playback of name tags.
Name tags are used to confirm correct responses in SD recognition. For example, in a phone
dialer application, the user associates a “name” (which is recorded into memory) with a phone
number. The MSM6679A-110 then plays back the name tag so that the user can verify that the
recognized phrase is the correct one.
The VRP stores name tags in memory using an ADPCM compression algorithm at a speech data rate
of 28 kbps. The length of a name tag is controlled with a command from the user's host MCU
program. The maximum number of name tags possible is 61, but the actual number is dependent
upon record time and memory available. See the section on memory interface for more detail.
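The trade-off between record time and name tag count follows from the 28 kbps rate. A rough sketch in Python, assuming 1 kbps = 1000 bits per second and ignoring any per-tag overhead or memory shared with recognition templates (neither is specified on this page); the memory size in the usage note is a made-up figure.

```python
import math

ADPCM_BITS_PER_SECOND = 28_000  # 28 kbps from the text; k = 1000 is an assumption
MAX_NAME_TAGS = 61              # upper limit from the text


def name_tag_bytes(seconds: float) -> int:
    """Bytes needed to store one name tag of the given length."""
    return math.ceil(seconds * ADPCM_BITS_PER_SECOND / 8)


def name_tags_that_fit(memory_bytes: int, seconds_per_tag: float) -> int:
    """Number of equal-length name tags that fit, capped at the 61-tag limit."""
    per_tag = name_tag_bytes(seconds_per_tag)
    if per_tag <= 0:
        raise ValueError("seconds_per_tag must be positive")
    return min(MAX_NAME_TAGS, memory_bytes // per_tag)
```

For example, a one-second tag takes 3,500 bytes under these assumptions, so a hypothetical 32 KB region would hold 9 such tags, well short of the 61-tag maximum.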
Parameters for Recognition Accuracy

Name                      Symbol   Condition
Substitution Error        E_SUB    Most critical type of error, e.g., say "Five", recognize "Nine"
Rejection Error           E_REJ    Word not recognized; opportunity for operator to repeat
Gap Error                 E_GAP    Word spoken before recognizer is ready
Time-Out Error            E_TME    Word length is too long
Spurious Response Error   E_SPU    Sound or invalid word classified as a valid word
                                   (i.e., drop handset or speak wrong word)

Accuracy = 100% - E_RATE, where E_RATE = E_SUB + 1/2 E_REJ.