Text-to-Speech

Application Design Considerations

Using Text-to-Speech for Short Phrases
An application should use text-to-speech only for short phrases or notifications, not for reading long passages of text. Listening to a synthesized voice read more than a few sentences demands sustained concentration, and a user can become irritated.
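One way to apply this guideline is to check the length of a message before passing it to a speech engine. The following is a minimal Python sketch, not part of any particular text-to-speech API. The function name and both limits are assumptions chosen for illustration; the guideline itself only says that more than a few sentences is too long.

```python
import re

# Hypothetical limits: the guideline gives no exact numbers.
MAX_SENTENCES = 3
MAX_WORDS = 40


def is_short_enough_to_speak(text: str) -> bool:
    """Return True if the text is short enough to be spoken aloud."""
    # Naive sentence split on ., ! and ?; empty fragments are discarded.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = text.split()
    return len(sentences) <= MAX_SENTENCES and len(words) <= MAX_WORDS
```

Text that fails the check would be shown on screen only, rather than spoken.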

Presenting Important Information Visually
An application should communicate critical information visually as well as audibly, and it should not rely solely on text-to-speech to communicate important information. The user can miss spoken messages for a variety of reasons: no speakers or headphones may be attached to the computer, the user may be distracted or out of earshot when the application speaks, or the user may simply have turned off text-to-speech.
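This guideline can be sketched as a notification routine in which the visual channel is mandatory and speech is an optional extra. The Python below is illustrative only. The names `notify`, `show`, and `speak` are hypothetical, and the two callbacks stand in for whatever display and speech functions the application actually provides.

```python
from typing import Callable, Optional


def notify(
    message: str,
    show: Callable[[str], None],
    speak: Optional[Callable[[str], None]] = None,
) -> None:
    """Deliver a message visually, and audibly when speech is available."""
    # The visual channel always runs first, so the message is never lost.
    show(message)
    if speak is not None:
        try:
            speak(message)
        except Exception:
            # A speech failure (for example, no audio device) is ignored
            # because the message has already been displayed.
            pass
```

Because `show` is called unconditionally, a missing audio device or a disabled speech engine never hides the message.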

Avoiding a Mix of Text-to-Speech and Recorded Voice
The synthesized voice provided by even the best text-to-speech engine is noticeably different from a digital-audio recording of a human voice. Mixing the two in the same utterance can be jarring to the user, and it usually makes the text-to-speech voice sound worse by comparison.
For example, to have an application speak "The number is 56,738," do not prerecord "The number is" and then use text-to-speech to speak the number. Either prerecord everything or use text-to-speech for everything.
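The all-or-nothing rule can be sketched as a small planning function that picks exactly one output method per utterance. This Python is a hedged illustration. `plan_utterance`, the `recordings` mapping, and the file name are all hypothetical, and a real application would replace the returned tuple with its own playback calls.

```python
from typing import Dict, Tuple


def plan_utterance(text: str, recordings: Dict[str, str]) -> Tuple[str, str]:
    """Choose one output method for the whole utterance.

    `recordings` maps complete utterances to prerecorded audio files.
    The utterance is never split between a recording and synthesis.
    """
    clip = recordings.get(text)
    if clip is not None:
        # The entire utterance exists as a recording: play it as is.
        return ("recording", clip)
    # Otherwise synthesize the entire utterance, including any fixed prefix.
    return ("tts", text)


recordings = {"Welcome back.": "welcome.wav"}  # hypothetical clip
plan = plan_utterance("The number is 56,738.", recordings)
```

Here `plan` selects text-to-speech for the full sentence, because no recording covers the whole utterance.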

Making Text-to-Speech Optional
An application should always allow the user to turn off text-to-speech. Some users work in environments in which a talking computer may distract coworkers or in which privacy may be important. Also, some users may simply dislike the sound of a synthesized voice.
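A user-controlled switch of this kind might look like the following Python sketch. `SpeechSettings` and `speak_if_enabled` are hypothetical names, and `speak` stands in for the application's real speech call. How the setting is stored and exposed in the user interface is left out.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SpeechSettings:
    """User preferences for speech output (hypothetical structure)."""
    enabled: bool = True


def speak_if_enabled(
    text: str,
    settings: SpeechSettings,
    speak: Callable[[str], None],
) -> bool:
    """Speak the text only if the user has left speech turned on.

    Returns True if the text was passed to the speech function.
    """
    if not settings.enabled:
        return False
    speak(text)
    return True
```

Routing every spoken message through one such function keeps the user's choice in a single place, so no code path can speak after speech has been turned off.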

Speech synthesis: The creation of a synthetic replica of speech, that is, machine-generated output that simulates speech either electronically (by modelling the changing resonances of the vocal tract) or by splicing together samples of recorded speech.

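Phoneme: The basic sound unit of speech. The phonemic repertoire of a language comprises all the distinct sounds its speakers use to tell words apart. For example, English is usually described as having about 44 phonemes, though the exact count varies by dialect.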

TTS in Multimedia Builder