Speech perception


A term broadly used to refer to how an individual understands what others are saying. More narrowly, speech perception is viewed as the way a listener can interpret the sound that a speaker produces as a sequence of discrete linguistic categories such as phonemes, syllables, or words. See Psycholinguistics

Classical work in the 1950s and 1960s concentrated on uncovering the basic acoustic cues that listeners use to hear the different consonants and vowels of a language. It revealed a surprisingly complex relationship between sound and percept. The same physical sound (such as a noise burst at a particular frequency) can be heard as different speech categories depending on its context (as “k” before “ah,” but as “p” before “ee” or “oo”), and the same category can be cued by different sounds in different contexts. Spoken language is thus quite unlike typed or written language, where there is a relatively invariant relationship between the physical stimulus and the perceived category.

The reasons for the complex relationship lie in the way that speech is produced: the sound produced by the mouth is influenced by a number of continuously moving and largely independent articulators. This complex relationship has caused great difficulties in programming computers to recognize speech, and it raises a paradox. Computers readily recognize the printed word but have great difficulty recognizing speech. Human listeners, on the other hand, find speech naturally easy to understand but have to be taught to read (often with difficulty). It is possible that humans are genetically predisposed to acquire the ability to understand speech, using special perceptual mechanisms usually located in the left cerebral hemisphere. See Hemispheric laterality

Building on the classical research, more recent work has drawn attention to the important contribution that vision makes to normal speech perception; has explored the changing ability of infants to perceive speech and contrasted it with that of animals; and has studied the way that speech sounds are coded by the auditory system and how speech perception breaks down in those with hearing impairment. There has also been substantial research on the perception of words in continuous speech.

Adult listeners are exquisitely sensitive to the differences between sounds that are distinctive in their language. The voicing distinction in English (between “b” and “p”) is cued by the relative timing of two different events (stop release and voice onset). At a difference of around 30 milliseconds, listeners hear an abrupt change from one category to another, so that a shift of only 5 ms can change the percept. On the other hand, a similar change around a different absolute value, where both sounds are heard as the same category, would be imperceptible. The term categorical perception refers to this inability to discriminate two sounds that are heard as the same speech category.
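The contrast between across-boundary and within-category discrimination can be illustrated with a small numerical sketch. It assumes a logistic labeling function centered on the 30 ms boundary mentioned above, and it assumes that listeners discriminate two sounds only by comparing the labels they assign, which gives a predicted proportion correct of 0.5 + 0.5 × (difference in labeling probability)². The slope value and the function names are hypothetical and chosen only for illustration.

```python
import math

BOUNDARY_MS = 30.0    # category boundary, taken from the text
SLOPE_PER_MS = 1.0    # assumed steepness of the labeling function (illustrative)

def prob_voiceless(vot_ms):
    """Probability of labeling a sound as "p" under the assumed logistic function."""
    return 1.0 / (1.0 + math.exp(-SLOPE_PER_MS * (vot_ms - BOUNDARY_MS)))

def predicted_discrimination(vot_a_ms, vot_b_ms):
    """Predicted proportion correct if listeners compare labels only (0.5 is chance)."""
    diff = prob_voiceless(vot_a_ms) - prob_voiceless(vot_b_ms)
    return 0.5 + 0.5 * diff * diff

# The same 5 ms step, placed across the boundary and well inside one category.
across = predicted_discrimination(27.5, 32.5)
within = predicted_discrimination(5.0, 10.0)
print(f"across boundary: {across:.2f}, within category: {within:.2f}")
```

Under these assumptions the 5 ms step that straddles the boundary is discriminated well above chance, while the same step inside the "b" category stays at chance.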

Categorical perception can arise for two reasons: it can have a cause that is independent of the listener's language (for instance, the auditory system may be more sensitive to some changes than to others), or it can be acquired as part of the process of learning a particular language. The example described above appears to be language-independent, since similar results have been found in animals such as chinchillas whose auditory systems resemble those of humans. But other examples have a language-specific component. The ability to hear a difference between “r” and “l” is trivially easy for English listeners, but Japanese listeners perform almost at chance unless they are given extensive training. How such language-specific skills are developed has become clearer following intensive research on speech perception in infants.

Newborn infants are able to distinguish many of the sounds that are contrasted by the world's languages. Their pattern of sucking on a blind nipple signals a perceived change in a repeated sound. They are also able to hear the similarities between sounds such as those that are the same vowel but have different pitches. The ability to respond to such a wide range of distinctions changes dramatically in the first year of life. By 12 months, infants no longer respond to some of the distinctions that are outside their native language, while infants from language communities that do make those same distinctions retain the ability. Future experience could reinstate the ability, so it is unlikely that low-level auditory changes have taken place; the distinctions, although still coded by the sensory system, do not readily control the infant's behavior.

Although conductive hearing losses can generally be treated adequately by appropriate amplification of sound, sensorineural hearing loss involves a failure of the frequency-analyzing mechanism in the inner ear that cannot yet be compensated for. Not only do sounds need to be louder before they can be heard, but they are also less well separated by the ear into different frequencies. In addition, the patient with sensorineural deafness tolerates only a limited range of sound intensities; amplified sounds soon become unbearable (loudness recruitment).

These three consequences of sensorineural hearing loss lead to severe problems in perceiving a complex signal such as speech. Speech consists of many rapidly changing frequency components that normally can be perceptually resolved. The lack of frequency resolution in the sensorineural patient makes it harder for the listener to identify the peaks in the spectrum that distinguish the simplest speech sounds from each other; and the use of frequency-selective automatic gain controls to alleviate the recruitment problem reduces the distinctiveness of different sounds further. These patients may also be less sensitive than people with normal hearing to sounds that change over time, a disability that further impairs speech perception.
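The loss of distinctiveness caused by frequency-selective gain control can be shown with a rough sketch. It assumes a simple compressor applied independently in each frequency band: a fixed gain below a threshold, and compressed growth above it. The threshold, ratio, gain, and example levels are hypothetical values, not figures from any real hearing aid.

```python
def compress_db(level_db, threshold_db=45.0, ratio=3.0, gain_db=30.0):
    """Output level in dB for one band: linear gain below threshold, compressed above."""
    if level_db <= threshold_db:
        return level_db + gain_db
    return threshold_db + gain_db + (level_db - threshold_db) / ratio

# Hypothetical levels of a spectral peak and a neighboring valley, each in its own band.
peak_db, valley_db = 70.0, 52.0

contrast_in = peak_db - valley_db
contrast_out = compress_db(peak_db) - compress_db(valley_db)
print(f"peak-to-valley contrast: {contrast_in:.1f} dB in, {contrast_out:.1f} dB out")
```

With these assumed settings an 18 dB difference between peak and valley shrinks to about 6 dB after compression, which keeps the output within a narrower range of intensities at the cost of flattening the spectral shape that identifies the sound.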

Some profoundly deaf patients can identify some isolated words by using multichannel cochlear implants. Sound is filtered into different frequency channels, or different parameters of the speech are automatically extracted, and electrical pulses are then conveyed to different locations in the cochlea by implanted electrodes. The electrical pulses stimulate the auditory nerve directly, bypassing the inactive hair cells of the damaged ear. Such devices cannot reconstruct the rich information that the normal cochlea feeds to the auditory nerve. See Hearing (human), Perception, Psychoacoustics, Speech
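The channel-filtering step can be sketched in a few lines. This is only a simplified illustration, not the processing used by any actual implant: it measures the energy of one short frame of sound in a handful of frequency bands with a naive discrete Fourier transform and reports which band, standing in for an electrode position, receives the most energy. The sample rate, band edges, and function names are all assumptions.

```python
import math

SAMPLE_RATE = 16000  # assumed sampling rate in Hz
# Hypothetical band edges in Hz; each adjacent pair defines one channel (electrode).
BAND_EDGES_HZ = [200, 500, 1000, 2000, 4000, 7000]

def band_energies(frame):
    """Energy of one frame in each channel, computed with a naive DFT."""
    n = len(frame)
    energies = [0.0] * (len(BAND_EDGES_HZ) - 1)
    for k in range(1, n // 2):
        freq = k * SAMPLE_RATE / n
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(frame))
        im = sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(frame))
        power = (re * re + im * im) / n
        for ch in range(len(energies)):
            if BAND_EDGES_HZ[ch] <= freq < BAND_EDGES_HZ[ch + 1]:
                energies[ch] += power
    return energies

def strongest_channel(frame):
    """Index of the channel that would receive the largest stimulation."""
    energies = band_energies(frame)
    return max(range(len(energies)), key=lambda ch: energies[ch])

def tone(freq_hz, n=256):
    """A pure test tone, used here in place of real speech."""
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]

print(strongest_channel(tone(1500)))  # 1500 Hz lies in the 1000-2000 Hz band, index 2
```

Five channels carry far less detail than the frequency analysis of a healthy cochlea, which is one way to see why such devices cannot reconstruct the full information normally passed to the auditory nerve.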
