speech
systematic communication by vocal symbols. It is a universal characteristic of the human species. Nothing is known of its origin, although scientists have identified a gene that clearly contributes to the human ability to use language.
A set of audible sounds produced by disturbing the air through the integrated movements of certain groups of anatomical structures. Humans attach symbolic values to these sounds for communication. There are many approaches to the study of speech.
The physiology of speech production may be described in terms of respiration, phonation, and articulation. These interacting processes are activated, coordinated, and monitored by acoustical and kinesthetic feedback through the nervous system.
Most of the speech sounds of the major languages of the world are formed during exhalation. Consequently, during speech the period of exhalation is generally much longer than that of inhalation. The aerodynamics of the breath stream influence the rate and mode of the vibration of the vocal folds. This involves interactions between the pressures initiated by thoracic movements and the position and tension of the vocal folds. See Respiration
The phonatory and articulatory mechanisms of speech may be regarded as an acoustical system whose properties are comparable to those of a tube of varying cross-sectional dimensions. At the lower end of the tube, or the vocal tract, is the larynx. It is situated directly above the trachea and is composed of a group of cartilages, tissues, and muscles. The upper end of the vocal tract may terminate at the lips, at the nose, or both. The length of the vocal tract averages 6.5 in. (16 cm) in men and may be increased by either pursing the lips or lowering the larynx.
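As a rough illustration of the tube analogy, the sketch below estimates the resonances of an idealized uniform tube that is closed at the larynx and open at the lips, using the standard quarter-wavelength relation F_k = (2k - 1)c / 4L. The uniform cross-section, the speed of sound of 350 m/s, and the function name are assumptions made for this example. The real vocal tract varies in cross-section, so its resonances depart from these values.

```python
def uniform_tube_resonances(length_m, count=3, speed_of_sound=350.0):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other.

    Assumption: the vocal tract is treated as a uniform tube, which is only a
    first approximation to the varying cross-section described in the text.
    """
    return [(2 * k - 1) * speed_of_sound / (4.0 * length_m)
            for k in range(1, count + 1)]


# Average male vocal tract length quoted in the text: 16 cm.
resonances = uniform_tube_resonances(0.16)
print([round(f) for f in resonances])
```

In this simplified model, lengthening the tube (by pursing the lips or lowering the larynx) lowers every resonance.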
The larynx is the primary mechanism for phonation, that is, the generation of the glottal tone. The vocal folds consist of connective tissue and muscular fibers which attach anteriorly to the thyroid cartilage and posteriorly to the vocal processes of the arytenoid cartilages. The vibrating edge of the vocal folds measures about 0.92–1.08 in. (23–27 mm) in men and considerably less in women. The aperture between the vocal folds is known as the glottis. The tension and position of the vocal folds are adjusted by the intrinsic laryngeal muscles, primarily through movement of the two arytenoid cartilages. See Larynx
When the vocal folds are brought together and there is a balanced air pressure to drive them, they vibrate laterally in opposite directions. During phonation, the vocal folds do not transmit the major portion of the energy to the air. They control the energy by regulating the frequency and amount of air passing through the glottis. Their rate and mode of opening and closing are dependent upon the position and tension of the folds and the pressure and velocity of airflow. The tones are produced by the recurrent puffs of air passing through the glottis and striking into the supralaryngeal cavities.
Speech sounds produced during phonation are called voiced. Almost all of the vowel sounds of the major languages and some of the consonants are voiced. In English, voiced consonants may be illustrated by the initial and final sounds in the following words: “bathe,” “dog,” “man,” “jail.” The speech sounds produced when the vocal folds are apart and are not vibrating are called unvoiced; examples are the consonants in the words “hat,” “cap,” “sash,” “faith.” During whispering all the sounds are unvoiced.
The rate of vibration of the vocal folds is the fundamental frequency of the voice (F0). It correlates well with the perception of pitch. The frequency increases when the vocal folds are made taut. Relative differences in the fundamental frequency of the voice are utilized in all languages to signal some aspects of linguistic information.
Many languages of the world are known as tone languages, because they use the fundamental frequency of the voice to distinguish between words. Mandarin Chinese is a classic example of a tone language. There are four distinct tones in Mandarin speech. Said with a falling fundamental frequency of the voice, ma means “to scold.” Said with a rising fundamental frequency, it means “hemp.” With a level fundamental frequency it means “mother,” and with a dipping fundamental frequency it means “horse.” In Mandarin, changing a tone has the same kind of effect on the meaning of a word as changing a vowel or consonant in a language such as English.
The activity of the structures above and including the larynx in forming speech sounds is known as articulation. It involves some of the muscles of the pharynx, palate, tongue, and face, as well as the muscles of mastication.
The primary types of speech sounds of the major languages may be classified as vowels, nasals, plosives, and fricatives. They may be described in terms of degree and place of constriction along the vocal tract.
The only source of excitation for vowels is at the glottis. During vowel production the vocal tract is relatively open and the air flows over the center of the tongue, causing a minimum of turbulence. The phonetic value of the vowel is determined by the resonances of the vocal tract, which are in turn determined by the shape and position of the tongue and lips.
The nasal cavities can be coupled onto the resonance system of the vocal tract by lowering the velum and permitting airflow through the nose. Vowels produced with the addition of nasal resonances are known as nasalized vowels. Nasalization may be used to distinguish meanings of words made up of otherwise identical sounds, such as bas and banc in French. If the oral passage is completely constricted and air flows only through the nose, the resulting sounds are nasal consonants. The three nasal consonants in “meaning” are formed with the constriction successively at the lips, the alveolar ridge, and the soft palate.
Plosives are characterized by the complete interception of airflow at one or more places along the vocal tract. The places of constriction and the manner of the release are the primary determinants of the phonetic properties of the plosives. The words “par,” “bar,” “tar,” and “car” begin with plosives. When the interception is brief and the constriction is not necessarily complete, the sound is classified as a flap. By tensing the articulatory mechanism in proper relation to the airflow, it is possible to set the mechanism into vibrations which quasiperiodically intercept the airflow. These sounds are called trills.
Fricatives are produced by a partial constriction along the vocal tract which results in turbulence. Their properties are determined by the place or places of constriction and the shape of the modifying cavities. The fricatives in English may be illustrated by the initial and final consonants in the words “vase,” “this,” “faith,” “hash.”
The ability to produce meaningful speech is dependent in part upon the association areas of the brain. It is through them that the stimuli which enter the brain are interrelated. These areas are connected to motor areas of the brain which send fibers to the motor nuclei of the cranial nerves and thence to the muscles. Three neural pathways are directly concerned with speech production: the pyramidal tract, the extrapyramidal tract, and the cerebellar motor paths. It is the combined control of these pathways upon nerves arising in the medulla and ending in the muscles of the tongue, lips, and larynx which permits the production of speech. See Nervous system (vertebrate)
communication by means of language; a type of human communicative activity. Speech originated between individuals in a collective as a means of coordinating joint work activity and as a manifestation of nascent consciousness. The means of expression used in this process gradually lost their “natural” character and became a system of artificial signals. The signals do not merely organize in some way or another an activity that is theoretically independent of the signals but introduce into the activity new and objective content—the word combining communication and generalization—and thus serve to alter the structure of the activity. A language symbol fixes not only the external, natural relations between objects but also the relations that arise in the very process of activity.
Conflicting views on the nature and function of speech can be found in scholarly literature. B. Croce regards speech as a means of emotional expression. O. Dittrich, K. Jaberg, and K. Vossler ascribe two main functions to speech: expression and communication. For A. Marty and P. Wegener, speech is only a means of exerting influence. K. Bühler distinguishes between expression, address, and information. Soviet psychologists point to two main functions of language (speech): as a means or instrument of communication and as a means of generalization and instrument of thought. Speech also serves to express, influence, and indicate.
Speech as a psychological phenomenon is usually regarded as a special type of activity, alongside work, cognition, memory, and so on, and as the speech actions or operations included in the above-mentioned kinds of activity. In this sense, speech is on a level with such categories as thought and memory. From the psychological and physiological standpoints, speech is one of the higher mental functions of man.
The physiological basis of speech is a complex organization of several functional systems, which are partly specialized and partly serve other kinds of activity. This organization is multimember and multilevel and embraces both elementary physiological mechanisms of the stimulus-reaction type and specific mechanisms hierarchically structured and characteristic solely of the higher forms of speech activity. The psychophysiological organization of speech contains both completely automatic components and components that are consciously perceived; meaning and sometimes words, grammar, and even sounds are perceived. The nature of perception varies with the type of speech, the level to which the speaker’s verbal abilities are developed, the social situation, and other factors. As an infant develops mentally, it gradually learns to speak as a result of the interaction of the increasingly complex processes of communication (and the processes of using speech for purposes other than communication) and the emergence of other kinds of activity that underlie speech.
Linguistics studies speech as one of two basic categories, language and speech, in their unity and mutual opposition. Speech here is usually taken to mean the realization of a language system. Language is potentiality: it is social and exists as an abstraction. Speech, on the other hand, is realization: it is actual and individual. Such an interpretation generally derives from F. de Saussure’s Course in General Linguistics, which distinguished between language (langue) and linguistic ability (faculté de langage) as social and individual, respectively. Langue and faculté de langage are united within the framework of verbal activity (langage). Langage is opposed to speech (parole) as potentiality is to realization; the potential is equated with the social and the real with the individual. Neither Saussure’s conception nor the usual interpretation of the relationship between speech and language in modern linguistics is in complete harmony with the views on the nature and function of communication (speech) found in modern Soviet philosophy, sociology, and psychology. Soviet scholars emphasize the relationship of verbal communication to the process of interaction and to the system of social relations manifested in communication.
Since language is used not only for communication but also for other kinds of activity, such as thought and memory reinforcement, it is possible to distinguish between speech proper, or external speech, and internal speech. The former is used for communication, that is, it is intended to be understood by other persons so that minds and actions may be influenced and social interaction stimulated. The latter, on the other hand, is essentially communication with oneself intended to formulate and solve a particular cognitive problem. The orientation of internal speech to cognitive problems leads to the use of not only the language units and constructions found in speech proper but also various kinds of auxiliary means, such as images and diagrams. It also results in the emergence of the specific patterns of the syntactic structure of internal speech that were first described by L. S. Vygotskii. Internal speech differs from talking to oneself, silent speech, and internal programming, that is, the creation of a plan for a future statement or series of statements. It is genetically derived from speech proper. Only the use of speech for communication makes it possible to use speech for cognition, both in phylogeny and ontogeny.
Speech proper (external speech) is divided into dialogic and monologic, the former being genetically primary. A distinction is also made between oral and written speech. Aside from the fact that written speech has a specific structure and is physically embodied in a writing system, written speech differs from oral speech in that the writer has a much greater opportunity arbitrarily or consciously to select and organize any language elements that he wishes.
In addition to being communicative, external speech can perform a variety of supplementary functions at the same time. These functions include the poetic (speech as a way of communicating through art), magic (“acting” on the real world by means of words), nominative (speech as denomination), and diacritic (speech directly included in a verbal situation and used only to make the verbal situation more exact and to supplement nonverbal actions). In these functions speech may acquire specific patterns of internal organization, so that one can speak of poetic speech, magic speech, and so on.
REFERENCES
Voloshinov, V. N. Marksizm i filosofiia iazyka, 2nd ed. Leningrad, 1930.
Saussure, F. de. Kurs obshchei lingvistiki. Moscow, 1933. (Translated from French.)
Vygotskii, L. S. Izbr. psikhologicheskie issledovaniia. Moscow, 1956.
Zhinkin, N. I. Mekhanizmy rechi. Moscow, 1958.
Zhinkin, N. I. “Psikhologicheskie osnovy razvitiia rechi.” In V zashchitu zhivogo slova. Moscow, 1966.
Nikolaeva, T. M. “Pis’mennaia rech’ i spetsifika ee izucheniia.” Voprosy iazykoznaniia, 1961, no. 3.
Iazyk i rech’: Tezisy dokladov. Moscow, 1962.
Miller, G. “Rech’ i iazyk.” In Eksperimental’naia psikhologiia, vol. 2. Moscow, 1963.
Rech’: Artikuliatsiia i vospriiatie. Moscow-Leningrad, 1965.
Kholodovich, A. A. “O tipologii rechi.” In Istoriko-filologicheskie issledovaniia. Moscow, 1967.
Sokolov, A. N. Vnutrenniaia rech’ i myshlenie. Moscow, 1968.
Leont’ev, A. A. Iazyk, rech’, rechevaia deiatel’nost’. Moscow, 1969.
Leont’ev, A. N. Problemy razvitiia psikhiki, 3rd ed. Moscow, 1972.
Osnovy teorii rechevoi deiatel’nosti. Moscow, 1974.
Bühler, K. Sprachtheorie. Jena, 1934.
Coseriu, E. Sistema, norma y habla. Montevideo, 1952.
Jakobson, R. “Linguistics and Poetics.” In the collection Style in Language. New York-London, 1960.
Slama-Cazacu, T. Langage et contexte. The Hague, 1961.
Moscovici, S. “Communication Processes and the Properties of Language.” In the collection Advances in Experimental Social Psychology, vol. 3. New York-London, 1967.
A. A. LEONTEV
The physiological functions that enable people to communicate with each other by means of speech sounds consist in speech production and perception. A number of organs are used to produce speech sounds. The organs that participate in speech production are called the vocal apparatus. The lungs and respiratory muscles, the initial energy source, produce pressure and air flow in the vocal tract, which consists of the larynx, pharynx, mouth, nose, soft palate, and lips. The activity of the speech organs is strictly coordinated, resulting in the production of articulate sounds. The process of speech production is largely organized by the nervous system and subordinated to a hierarchical control principle. The main levels of speech production are synthesis of the sentence to be uttered, organization of the program of articulation, realization of the program in a succession of articulatory movements, and the actual production of the sounds.
The auditory system and the nervous system, in which acoustic signals are transformed, participate in the perception of speech and ensure the ultimate understanding of the meaning of the verbal communication. The process is organized on a hierarchical principle: discrimination by the ear of the signal’s spectral and temporal characteristics, which are the distinguishing features of speech sounds; phonetic analysis to transform the flow of signal characteristics into a series of discrete elements of communication (phonemes or syllables); and analysis of the syntax and meaning of the communication.
From the physical standpoint, oral speech consists of a succession of speech sounds (vowels and consonants) that are usually run together, with pauses only after individual words or groups of sounds. The running together of sounds that results from the continuous articulatory movements of the speech organs causes neighboring sounds to influence each other. The size of the speech organs varies by individual, and each person has his own manner of pronunciation. As a result, each person’s speech sounds have an individual character. For all their variety, however, speech sounds are the physical realization of only a small number of phonemes. A phoneme is the smallest phonetic unit of a given language, which in speech can have any number of concrete realizations. Russian has 41 phonemes: six vowels (/a/, /o/, /u/, /e/, /i/, and /ɨ/), three hard consonants (/ʃ/, /ʒ/, and /ts/), two soft consonants (/tʃ/ and /j/), and 15 consonants that can be either hard or soft, each counted as two phonemes. Although the sounds [ja], [ju], [je], and [jo] are represented by single letters in the Russian alphabet, the sounds consist of two phonemes each.
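The phoneme total quoted above can be checked with a few lines of arithmetic. The sketch below only restates the counts given in the text; the variable names are illustrative, and the doubling of the 15 paired consonants is an interpretation that makes the stated total come out.

```python
# Phoneme inventory of Russian as counted in the text.
vowels = 6
hard_only_consonants = 3
soft_only_consonants = 2
paired_consonants = 15  # assumed to count twice: once hard, once soft

consonants = hard_only_consonants + soft_only_consonants + 2 * paired_consonants
total_phonemes = vowels + consonants
print(consonants, total_phonemes)  # prints "35 41"
```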
Speech sounds are not all informative to the same degree. Vowels contain little information about the meaning of speech, whereas voiceless consonants are the most informative. For example, in the word posylka (“parcel”) the sequence o-y-a says nothing, but p-s-lk- gives a fairly clear idea of the meaning of the word. The precision with which speech is transmitted (for example, in communication systems) is determined by the articulation method. A set of speech elements (words, syllables) reflecting the makeup of speech sounds in a particular language is transmitted, and the relative number of elements perceived is determined. The intelligibility of speech is largely determined by the intelligibility of the voiceless consonants.
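The articulation method described above amounts to computing the share of transmitted elements that listeners perceive correctly. A minimal sketch follows, assuming the transmitted and perceived elements are available as two aligned lists. The function name and the sample syllables are hypothetical, and real articulation tests use standardized syllable or word tables.

```python
def articulation_score(transmitted, perceived):
    """Fraction of transmitted speech elements that were perceived correctly.

    Assumption: both lists have the same length and are aligned item by item.
    """
    if len(transmitted) != len(perceived):
        raise ValueError("lists must be aligned and of equal length")
    if not transmitted:
        raise ValueError("at least one element is required")
    correct = sum(1 for sent, heard in zip(transmitted, perceived) if sent == heard)
    return correct / len(transmitted)


# Hypothetical test material: four syllables sent, three heard correctly.
sent = ["po", "syl", "ka", "da"]
heard = ["po", "syl", "ga", "da"]
score = articulation_score(sent, heard)
print(score)  # prints 0.75
```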
Air pulses created by the vocal cords during the articulation of voiced speech sounds can fairly accurately be regarded as periodic. The corresponding period of oscillations is called the period of the main tone of voice and the reciprocal is the frequency of the main tone, which usually ranges from 70 to 450 hertz (Hz). The frequency of the main tone changes when speech sounds are pronounced. This change is called intonation. Each person has his own range of change in the main tone—usually a little more than an octave—and his own intonation. The latter is very important for voice recognition.
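The relation between the period and the frequency of the main tone, and the width of a frequency range in octaves, can be sketched as follows. The function names are illustrative and the 10 ms period is an arbitrary example value, not a figure from the text.

```python
import math


def main_tone_frequency(period_s):
    """Frequency of the main tone (Hz) as the reciprocal of its period (s)."""
    return 1.0 / period_s


def span_in_octaves(low_hz, high_hz):
    """Width of a frequency range in octaves (one octave is a doubling)."""
    return math.log2(high_hz / low_hz)


print(main_tone_frequency(0.01))           # a 10 ms period, about 100 Hz
print(round(span_in_octaves(70, 450), 2))  # overall range quoted in the text
```

The overall range of 70 to 450 Hz spans roughly 2.7 octaves across speakers, which is wider than the range of a little more than one octave available to any one person.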
Because pulses of the main tone are saw-toothed in form, pulses that are repeated periodically create a discrete spectrum with a large number of overtones or harmonics. When plosives or fricatives are pronounced, the flow of air is forced through narrow portions of the vocal tract, forming eddies that create noises with a continuous broadband spectrum. Therefore, when speech sounds are articulated, a signal with a tonal or noise spectrum or both passes through the vocal tract. The vocal tract is a complex acoustic filter with a number of resonant cavities created by the articulatory organs. As a result, the output signal, that is, the speech sound that is pronounced, has a complex wavelike spectrum envelope. The maximums of energy concentration in a sound spectrum are called formants, and the sharp dips are known as antiformants. Since every speech sound has its own resonances and antiresonances in the vocal tract, the spectrum envelope of the sound has an individual form. Most vowels have a characteristic position of the formants and antiformants and a correlation of the levels between the two. In addition, the change in formant parts over time is important in the case of consonants (see Figures 1, 2, and 3).
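To illustrate the discrete spectrum of a periodic saw-toothed pulse train, the sketch below lists the harmonics of an ideal sawtooth wave, whose k-th harmonic lies at k times the main-tone frequency and has an amplitude proportional to 1/k. The ideal sawtooth shape is an assumption made for the example. Real glottal pulses only approximate it, and the vocal tract filter then reshapes this source spectrum into formants and antiformants.

```python
import math


def sawtooth_harmonics(f0_hz, count):
    """Return (frequency in Hz, level in dB relative to the first harmonic).

    Assumption: an ideal sawtooth, whose k-th harmonic has amplitude
    proportional to 1/k. Real glottal pulses only approximate this shape.
    """
    return [(k * f0_hz, 20.0 * math.log10(1.0 / k)) for k in range(1, count + 1)]


for freq, level in sawtooth_harmonics(100.0, 4):
    print(f"{freq:6.0f} Hz  {level:6.1f} dB")
```

Under this assumption each doubling of harmonic number lowers the level by about 6 dB.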
Voiced speech sounds, especially vowels, have a high level of intensity, whereas voiceless sounds have a very low level of intensity. Therefore, the loudness of speech changes continuously and very abruptly when plosives are pronounced. The range of speech levels varies from 35 to 45 decibels (dB). The duration of vowels and consonants averages 0.15 and 0.08 sec, respectively, and that of the sound [p] is about 0.03 sec.
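A level range in decibels can be converted to an intensity ratio with the usual relation ratio = 10^(dB/10). The sketch below applies it to the 35 to 45 dB range quoted above, on the assumption that this range describes the difference between the loudest and the softest speech sounds measured as intensity (power) levels.

```python
def intensity_ratio(level_difference_db):
    """Intensity (power) ratio corresponding to a level difference in dB."""
    return 10.0 ** (level_difference_db / 10.0)


for db in (35, 45):
    print(db, "dB ->", round(intensity_ratio(db)), "times the intensity")
```

On that reading, the loudest speech sounds carry several thousand to several tens of thousands of times the intensity of the softest ones.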
Speech sounds are produced after commands are issued in the form of bioelectric signals to the muscles of the articulatory organs from the speech center in the brain. There are no more than ten such signals, and they change slowly, at the rate at which speech sounds succeed one another (from five to ten sounds per sec). The total flow of these signals amounts to between 50 and 100 information units (bits/sec), whereas an entire speech signal is 1,000 times greater. This is because a speech signal is a unique modulated carrier. All the information consists in spectral modulation, that is, in change of the shape of the spectrum envelopes and speech level, whereas the carrier proper contains no information about the meaning of speech.
The main purpose of speech is to transmit information from one person to another, whether the persons are talking directly to each other or by some means of communication. Because a communication channel must have a capacity of between 50,000 and 70,000 bits/sec to transmit natural speech, efforts are made to compress the speech signal flow at the transmitting end of the tract and to expand the flow at the receiving end; this is done to save on capacity and to increase the number of possible conversations proportionately. For example, lowering the level of loud speech sounds decreases the difference in levels between loud and soft sounds; the dynamic range is compressed. The frequency band of a sound signal can be compressed in the same way. Portions of a signal not carrying information (the middle of long sounds) can be eliminated from speech, that is, speech can be compressed in time. At the receiving end, the bands can be reconstructed accordingly and the eliminated portions of the sounds filled in.
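Compression of the dynamic range at the transmitting end and expansion at the receiving end can be sketched with a simple static level curve. The threshold of -20 dB, the ratio of 2, and the function names are illustrative assumptions, not values from the text.

```python
def compress_level(level_db, threshold_db=-20.0, ratio=2.0):
    """Static compression curve: levels above the threshold are reduced.

    Assumption: threshold and ratio are illustrative values. Above the
    threshold, every `ratio` dB of input yields 1 dB of output.
    """
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) / ratio


def expand_level(level_db, threshold_db=-20.0, ratio=2.0):
    """Inverse curve applied at the receiving end."""
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) * ratio


loud, soft = 0.0, -40.0
print(loud - soft)                                  # 40.0 dB before compression
print(compress_level(loud) - compress_level(soft))  # 30.0 dB after compression
```

Applying expand_level to a compressed level restores the original value, which corresponds to the reconstruction at the receiving end.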
If the modulating signal is separated from the carrier, the communication channel requires even less capacity to transmit speech. Vocoders are used for this purpose in communication systems.
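The saving in channel capacity can be put into rough numbers using the figures given in the text: at least 50,000 bits/sec for natural speech against at most 100 bits/sec for the underlying control signals. The sketch below is an idealized bound that ignores coding overhead. The channel size and the function name are hypothetical.

```python
def conversations_per_channel(channel_bits_per_sec, speech_bits_per_sec):
    """How many simultaneous conversations fit in a channel of given capacity."""
    return channel_bits_per_sec // speech_bits_per_sec


natural_speech = 50_000  # lower bound for natural speech, from the text
control_signals = 100    # upper bound for the articulatory commands, from the text
channel = 50_000         # hypothetical channel sized for one natural-speech call

print(conversations_per_channel(channel, natural_speech))   # prints 1
print(conversations_per_channel(channel, control_signals))  # prints 500
```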
Present-day research on human communication with machines is concerned with two matters: automatic control of machines and processes by speech (oral input into a computer, automatic typewriters) and speech synthesis by various code signals (oral output from a computer, talking machines to read to the blind).
M. A. Sapozhkov
Specialists in acoustics, psychoacoustics, and phonetics are doing research on the mechanisms of the acoustic and phonetic analysis of speech. Linguists, psycholinguists, and specialists in the physiology of the second signaling system are engaged in syntactic and semantic analysis of communication. Speech production is investigated by means of X-ray motion pictures, electromyography, and sensors for air pressure and currents, acoustic phenomena, and movements of the vocal apparatus. The most important method of studying the perception of speech involves determining perception characteristics in relation to the physical properties of natural or synthetic speech sounds. Physical and mathematical simulation is very important. Data are needed for linguistics, logopedics, the study of deafness, communications technology, and the construction of automatic systems for speech recognition and synthesis.
REFERENCES
Fant, G. Akusticheskaia teoriia recheobrazovaniia. Moscow, 1964. (Translated from English.)
Rech’: artikuliatsiia i vospriiatie. Moscow-Leningrad, 1965.
Flanagan, J. L. Analiz, sintez i vospriiatie rechi. Moscow, 1968. (Translated from English.)
Chistovich, L. A., and V. A. Kozhevnikov. “Vospriiatie rechi.” In Fiziologiia sensornykh sistem, part 2. Leningrad, 1972. (A manual on physiology.)
Sapozhkov, M. A. Rechevoi signal v kibernetike i sviazi. Moscow, 1963.
Speech disorders include impairments of speech perception and formation. They are caused by anatomical defects of the peripheral vocal apparatus and disturbances of its innervation, as well as by organic and functional changes in parts of the central nervous system concerned with speech activity. Disorders of speech formation are manifested by disturbances of the syntactic structure of phrases, changes in vocabulary and sounds, and changes in the technique, tempo, and smoothness of speech. In disorders of speech perception, there is impairment of the processes involved in recognition of speech elements and in grammatical and semantic analysis of received information. Disturbances of perception resulting from injuries to the peripheral auditory system are not regarded as forms of speech pathology. Disturbances of speech formation are studied by physiological and biophysical methods, phonetic and linguistic analysis of patients’ speech productions, and techniques allowing for acoustic analysis of speech signals. Disturbances of perception are studied by psychoacoustic and psycho-linguistic methods.
Speech disorders are classified on the basis of principal manifestations, associated neurological symptoms, and the nature of the anatomical changes in the vocal apparatus. Disturbances of the processes involved in analysis and synthesis of communications and disorders of speech memory caused by local cerebral lesions are types of aphasia. Similar lesions of the central nervous system that develop in children before they learn to speak result in alalia. The various types of aphasia are associated with disturbances of speech formation caused by the loss of complex verbal coordination (verbal apraxia). Disturbances of speech formation caused by lesions of the cranial nerves, the nuclei of cranial nerves, and some subcortical structures are forms of dysarthria. Disorders caused by the anatomical characteristics of the vocal apparatus (skeleton and soft tissues) are forms of dysplasia. Dysplasia may result from a harelip, cleft palate, malocclusion, or traumas. It includes disturbances of vocalization caused by paralysis or paresis of the vocal cords, scar changes of the vocal cords, laryngeal tumors, and various diseases that cause speech defects, the most severe being those that develop after total removal of the larynx (laryngectomy).
Functional speech disorders include stuttering, functional ankyloglossia (tongue-tie), mutism, and disturbances caused by laryngeal dysfunction, such as aphonias of psychogenic origin. Secondary speech disorders develop in deafness and hardness of hearing, their nature and severity dependent on the degree of hearing loss and on the time the disease developed. If hearing is lost before speech is learned, deaf-mutism may result. Speech disorders are often accompanied by difficulty in reading (dyslexia) or writing (dysgraphia), to the point where these capabilities are completely lost (alexia and agraphia).
Speech disorders are diagnosed by neuropathologists and specialists in logopedics.
Outpatient treatment of those with speech disorders is provided at logopedic centers, and inpatient treatment is provided at psychoneurological and specialized speech clinics. Therapeutic measures include medication, surgery, psychotherapy, and special speech training (logopedic exercises). Children with major speech defects are taught in special schools.
REFERENCES
Luriia, A. R. Vysshie korkovye funktsii cheloveka i ikh narusheniia pri lokal’nykh porazheniiakh mozga. Moscow, 1962.
Pravdina, O. V. Logopediia. Moscow, 1969.
Handbook of Speech Pathology. Edited by L. E. Travis. New York .
Luchsinger, R., and G. E. Arnold. Lehrbuch der Stimm- und Sprachheilkunde, 2nd ed. Vienna, 1959.
IU. I. KUZ’MIN