Why Concatenated Voice Technology Is Your Best Choice for Public Safety Applications
In plain English, concatenated voice technology is the best choice for public safety applications, where a mispronounced word could send a fire/EMS crew to the wrong address during an emergency.
Concatenated voice technology is based on a pre-recorded audio database. A voice talent records the necessary information (street names, incident types, etc.), and the voice "engine" breaks the recordings apart and stores them as fragments in the audio database. When a call comes through, the engine assembles specific word "bits" and audio phrases to form sentences, i.e., the emergency dispatch.
The "Pros" of Concatenated Speech Technology
• Natural sounding because it's based on actual recordings of a human voice
• Clear and easy to understand
• Sentences are spoken with natural rhythm and pitch
• Little chance for pronunciation errors because the audio is pre-recorded
• Recordings are created in a noise-free environment
• Often can't tell the difference between a concatenated and "live" announcement
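The assembly step described earlier, in which the engine strings pre-recorded fragments together into a dispatch, can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: the `AUDIO_DB` table, the file paths, and the `assemble_dispatch` function are hypothetical and do not describe any particular product's engine. The point it shows is that every spoken phrase must already exist as a recording, which is why pronunciation errors are unlikely.

```python
# Hypothetical fragment table: phrase text -> file recorded by the voice talent.
# All names and paths here are illustrative assumptions.
AUDIO_DB = {
    "structure fire": "incident/structure_fire.wav",
    "medical emergency": "incident/medical_emergency.wav",
    "at": "common/at.wav",
    "main street": "street/main_street.wav",
    "oak avenue": "street/oak_avenue.wav",
}


def assemble_dispatch(phrases):
    """Return the ordered list of recorded fragments for one announcement."""
    keys = [p.lower() for p in phrases]
    missing = [k for k in keys if k not in AUDIO_DB]
    if missing:
        # Never guess a pronunciation: report the gap so a recording can be added.
        raise LookupError("no recording for: " + ", ".join(missing))
    return [AUDIO_DB[k] for k in keys]


# The fragments would be played back in this order to speak the dispatch.
playlist = assemble_dispatch(["Structure fire", "at", "Main Street"])
print(playlist)
```

A phrase that was never recorded raises an error instead of being improvised, which is the trade-off behind needing a pre-recorded audio database in the first place.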
The "Cons" of Concatenated Speech Technology
• There aren't many
• Concatenated speech technology requires the development of a pre-recorded audio database
• It requires disk space, RAM, and processing power
  • Example: A 15k-entry audio database would require about 2 gigabytes of storage, which is much less than what's typically available on today's PCs
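The storage example above can be turned into a rough per-entry figure. The short Python sketch below does the arithmetic; it assumes "2 gigabytes" means 2 × 10^9 bytes and that storage grows linearly with the number of entries, neither of which the example states. The name `estimate_storage_gb` is hypothetical.

```python
ENTRIES = 15_000          # database size from the example
TOTAL_BYTES = 2 * 10**9   # "about 2 gigabytes", read as decimal gigabytes (assumption)

# Average space taken by one recorded entry.
bytes_per_entry = TOTAL_BYTES / ENTRIES


def estimate_storage_gb(entries, per_entry=bytes_per_entry):
    """Scale the example linearly to a database of a different size (assumption)."""
    return entries * per_entry / 10**9


print(f"{bytes_per_entry / 1000:.0f} KB per entry on average")
print(f"{estimate_storage_gb(40_000):.1f} GB for 40,000 entries")
```

Under these assumptions each entry averages roughly 133 KB, so even a database several times larger than the example stays within a few gigabytes.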
Another Technology Option: Synthesized Speech
Synthesized speech technology has some serious drawbacks for mission-critical
public safety applications.
• Can sound robotic and unnatural
• Requires correctly spelled, well-punctuated, unambiguous text to be accurate
• Mispronounces and misconstrues words quite easily
• Can convey only limited emphasis and emotion