
Running head: Listening to Talking Faces

Listening to Talking Faces: Motor Cortical Activation During Speech Perception

Jeremy I. Skipper,1,2 Howard C. Nusbaum,1 and Steven L. Small1,2

Departments of 1Psychology and 2Neurology, and the Brain Research Imaging Center

The University of Chicago

Address correspondence to:


Abstract

Neurophysiological research suggests that understanding the actions of others harnesses neural circuits that would be used to produce those actions directly. We used fMRI to examine brain areas active during language comprehension in which the speaker was seen and heard while talking (audiovisual), heard but not seen (audio-alone), or seen talking with the audio track removed (video-alone). We also examined brain areas active during speech production. We found that speech perception in the audiovisual, but not audio-alone or video-alone, conditions activated a network of brain regions overlapping cortical areas involved in speech production and in proprioception related to speech production. These regions included the posterior part of the superior temporal gyrus and sulcus, the superior portion of the pars opercularis, the dorsal aspect of premotor cortex, adjacent primary motor cortex, somatosensory cortex, and the cerebellum. Activity in dorsal premotor cortex and posterior superior temporal gyrus and sulcus was modulated by the amount of visually distinguishable phonemes in the stories. These results suggest that integrating observed facial movements into the speech perception process involves a network of brain regions associated with speech production. We suggest that this distributed network serves to represent the visual configuration of observed facial movements, the motor commands that could have been used to generate that configuration, and the associated expected auditory consequences of executing that hypothesized motor plan. These regions do not, on average, contribute to speech perception in the presence of the auditory or visual signals alone.


Neurobiological models of language processing have traditionally assigned receptive and expressive language functions to anatomically and functionally distinct brain regions. This division originates from the observation that Broca's aphasia, characterized by nonfluent spontaneous speech and fair comprehension, is the result of more anterior brain lesions than Wernicke's aphasia, which is characterized by fluent speech and poor comprehension and is the result of more posterior brain lesions (Geschwind, 1965).

As with many such simplifications, the distinction between expressive and receptive language functions within the brain is not as straightforward as it may have appeared. Comprehension in Broca's aphasia is almost never fully intact, nor is production in Wernicke's aphasia normal (Goodglass, 1993). Electrical stimulation of sites in both the anterior and posterior aspects of the brain can disrupt both speech production and speech perception (Ojemann, 1979; Penfield & Roberts, 1959). Neuroimaging studies have also shown that classically defined receptive and expressive brain regions are often both active in tasks that are specifically designed to investigate either perception or production (Braun et al., 2001; Buchsbaum et al., 2001; Papathanassiou et al., 2000).

Recent neurophysiological evidence from nonhuman primates suggests an explanation for the observed interactions between brain regions traditionally associated with either language comprehension or production. The explanation requires examination of motor cortices and their role in perception. Regions traditionally considered to be responsible for motor planning and motor control appear to play a role in perception and comprehension of action (Graziano & Gandhi, 2000; Romanski & Goldman-Rakic, 2002). Certain neurons with visual and/or auditory and motor properties in these regions discharge both when an action is performed and during perception of another person performing the same action (Gallese et al., 1996; Kohler et al., 2002; Rizzolatti et al., 1996). In the macaque brain, these neurons reside in area F5, which is the proposed homologue of Broca's area, the classic speech production region of the human (Rizzolatti et al., 2002).

The existence of "mirror neurons" suggests the hypothesis that action observation aids action understanding via activation of similar or overlapping brain regions used in action performance. If this is the case, perhaps speech understanding, classically thought to be an auditory process (e.g., Fant, 1960), might be aided in the context of face-to-face interaction by cortical areas more typically associated with speech production. Seeing facial motor behaviors corresponding to speech production (e.g., lip and mouth movements) might aid language understanding by recognition of the intended gesture within the motor system, thus further constraining possible interpretations of the intended message.

Audiovisual Language Comprehension

Most of our linguistic interactions evolved, develop, and occur in a setting of face-to-face interaction where multiple perceptual cues can contribute to determining the intended message. Although we are capable of comprehending auditory speech without any visual input (e.g., listening to the radio, talking on the phone), observation of articulatory movements produces significant effects on comprehension throughout the lifespan. Infants are sensitive to various characteristics of audiovisual speech (Kuhl & Meltzoff, 1982; Patterson & Werker, 2003). By adulthood, the availability of visual information about speech production significantly enhances recognition of speech sounds in a background of noise (Grant & Seitz, 2000; Sumby & Pollack, 1954) and improves comprehension even when the auditory speech signal is clear (Reisberg et al., 1987). Furthermore, incongruent audiovisual information can change the identity of a speech percept. For example, when an auditory /ba/ is dubbed onto the video of someone making mouth movements appropriate for production of /ga/, the resulting percept is usually /da/. We are susceptible to this audiovisual integrative illusion from early childhood through adulthood (Massaro, 1998; McGurk & MacDonald, 1976).

Our experience as talkers and as listeners may associate the acoustic patterns of speech with motor planning and with proprioceptive and visual information about accompanying mouth movements and facial expressions. Thus experience reinforces the relationships among acoustic, visual, and proprioceptive sensory patterns and between sensory patterns and motor control of articulation, so that speech becomes an "embodied signal" rather than just an auditory signal. That is, information relevant to the phonetic interpretation of speech may derive partly from experience with articulatory movements that are generated by a motor plan during speech production. The mechanisms that mediate these associations could provide a neural account for some of the observed interactions between acoustic and visual information in speech perception that may not be apparent by studying acoustic speech perception alone.

The participation of brain areas critical for language production during audiovisual speech perception has not been fully explored. It may be that the observed effects on speech comprehension produced by observation of a speaker's face involve visual cortical areas or other multisensory areas (e.g., posterior superior temporal sulcus), and not actual speech production areas. However, the evidence from nonhuman primates with regard to "mirror neurons" suggests that production centers, in concert with other brain regions, are likely candidates for the neural structures mediating these behavioral findings.

Neurophysiological Studies of Audiovisual Language

Relatively little is known about the neural structures mediating the comprehension of audiovisual language. This may be because, when language comprehension is not viewed as modality-independent, spoken language comprehension is seen as an essentially auditory process that should be investigated as such in neuropsychological and brain imaging studies. However, visual input plays an important role in spoken language comprehension, a role that cannot be accounted for as solely a cognitive bias to categorize linguistic units according to visual characteristics when acoustic and visual information are discrepant (Green, 1998).

Neuroimaging studies of speech processing incorporating both auditory and visual modalities are often focused on the problem of determining specific sites of multisensory integration (Calvert et al., 2000; Mottonen et al., 2002; Olson et al., 2002; Sams et al., 1991; Surguladze et al., 2001). Other studies have focused on only one (potential) component of audiovisual language comprehension, speech (i.e., lip) reading (Calvert et al., 1997; Calvert & Campbell, 2003; Campbell et al., 2001; Ludman et al., 2000; MacSweeney et al., 2000; MacSweeney et al., 2002a; MacSweeney et al., 2001; Surguladze et al., 2001). However, few studies have investigated the extent of the entire network of brain regions involved in audiovisual language comprehension overall (Callan et al., 2001; MacSweeney et al., 2002b). Nonetheless, these experiments have collectively yielded a fairly consistent result: audiovisual speech integration and perception produce activation of auditory cortices, predominantly posterior superior temporal gyrus and superior temporal sulcus. Though studies have reported activation in areas important for speech production (e.g., MacSweeney et al., 2002b), there has not been much theoretical interpretation of these activations. This may be in part because some studies use tasks that require an explicit motor response (e.g., Calvert et al., 1997; MacSweeney et al., 2002b; Olson et al., 2002), which limit the inferences that can be drawn about the role of motor areas in perception (Small & Nusbaum, in press). However, it would be surprising if brain regions important for language production (e.g., Broca's area and the precentral gyrus and sulcus) did not play a role in audiovisual speech perception, given the known connectivity between frontal and superior temporal structures (Barbas & Pandya, 1989; Hackett et al., 1999; Petrides & Pandya, 1988, 2002; Romanski et al., 1999) and the multisensory sensitivity of these areas (Graziano & Gandhi, 2000; Kohler et al., 2002; Romanski & Goldman-Rakic, 2002) in nonhuman primates.

In the present study, we used fMRI with a block design to investigate whether audiovisual language comprehension activates a network of brain regions that are also involved in speech production, and whether this network is sensitive to visual characteristics of observed speech. We also investigated whether auditory language comprehension alone (without visual information about the mouth movements accompanying speech production) would activate the same motor regions, as it has long been proposed that speech perception (whether multimodal or unimodal) occurs by reference to the speech production system (e.g., Liberman & Mattingly, 1985). Finally, we investigated whether visual observation of the mouth movements accompanying speech activates this network even without the speech signal. In an audio-alone condition (A), participants listened to spoken stories. In an audiovisual condition (AV), participants saw and heard the storyteller telling these stories. In the video-alone condition (V), participants watched video clips of the storyteller telling these stories, but without the accompanying soundtrack. Participants were instructed to listen to and/or watch the stories attentively. No other instructions were given (e.g., in the V condition, participants were not overtly asked to speech-read). Stories were approximately 20 seconds in duration. Finally, a second group of participants produced consonant-vowel syllables (S) in the scanner so that we could identify brain regions involved in phonetic speech production. The data from this group allow us to ascertain the overlap between the regions activated during actual speech production and those activated in the different conditions of language comprehension.

Results

Group Results

The total brain volume of activation for the A and V conditions together accounted for only 8% of the variance of the total volume associated with the AV condition (F(1, 8) = .561, p = .4728). This suggests that when the auditory and visual modalities are presented together, emergent activation occurs. The emergent activation in the AV condition appears to be mostly in frontal areas and posterior superior temporal gyrus and sulcus (STG/STS).
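The computation behind this statistic is not spelled out in the text; one plausible reading is a simple regression of each participant's AV activation volume on the summed A and V volumes. Below is a minimal sketch under that assumption, with fabricated placeholder volumes (the participant count of 10 is inferred from the reported degrees of freedom):

```python
# Sketch of a variance-accounted-for test: regress AV activation volume on
# the summed A + V volume across participants and report R^2 and F(1, n-2).
# All volumes below are fabricated placeholders, not the study's data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 10                                    # inferred from F(1, 8): n - 2 = 8
vol_a_plus_v = rng.uniform(2e4, 6e4, n)   # hypothetical A + V volumes (mm^3)
vol_av = rng.uniform(4e4, 9e4, n)         # hypothetical AV volumes (mm^3)

fit = stats.linregress(vol_a_plus_v, vol_av)
r2 = fit.rvalue ** 2                      # proportion of variance explained
f_stat = r2 / (1 - r2) * (n - 2)          # F statistic for simple regression
p_val = stats.f.sf(f_stat, 1, n - 2)      # upper-tail p-value
print(f"R^2 = {r2:.3f}, F(1, {n - 2}) = {f_stat:.3f}, p = {p_val:.4f}")
```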

Indeed, relative to baseline (i.e., rest), the AV but not the A condition activated a network of brain regions involved in sensory and motor control and critical for speech production (see the discussion for further details; Table 1; Figure 1). These areas include the inferior frontal gyrus (IFG; BA 44 and 45), the precentral gyrus and sulcus (BA 4 and 6), the postcentral gyrus, and the cerebellum. Of these regions, the A condition activated only a cluster in the IFG (BA 45). In the direct statistical contrast of the AV and A conditions (AV-A), the AV condition produced greater activation in the IFG (BA 44, 45, and 47), the middle frontal gyrus (MFG), the precentral gyrus and sulcus (BA 4, 6, and 9), and the cerebellum, whereas the A condition produced greater activation in the superior frontal gyrus and inferior parietal lobule. The AV-V contrast showed that AV produced greater activation in all frontal areas with the exception of the MFG, superior parietal lobule, and right IFG (BA 44), for which V produced greater activation.

Relative to baseline, the AV but not the A or V conditions activated more posterior aspects of the STG/STS (BA 22), a region previously associated with biological motion perception, multimodal integration, and speech production. Though both the AV and A conditions activated the STG/STS (BA 41/42/22) bilaterally, regions commonly associated with auditory language comprehension, activation in the AV condition was more extensive and extended more posteriorly from the transverse temporal gyrus than activation in the A condition. The AV-A and AV-V contrasts confirmed this pattern.

The AV and V conditions activated cortices associated with visual processing (BA 18, 19, 20, and 21), whereas the A condition did not. However, the V condition activated only small clusters in the inferior occipital gyrus and the inferior temporal gyrus relative to baseline, whereas the AV condition activated more extensive regions of occipital cortex as well as the left fusiform gyrus (BA 18). The AV-V contrast revealed that the AV condition produced greater activation in these areas in the left hemisphere, whereas the V condition produced greater activation in these areas in the right hemisphere.


The conjunction of the AV and S conditions revealed overlapping activation in regions associated with speech production, including the pars opercularis, the dorsal precentral gyrus, and posterior aspects of the STG/STS (posterior BA 22). The conjunction of the A or V with the S condition produced no significant overlap in these regions (Figure 2).

The specific ROIs chosen, aimed at permitting finer anatomical statements about differences between the AV and A conditions in the speech production areas, included the pars opercularis of the IFG (F3o), the pars triangularis of the IFG (F3t), the dorsal two-thirds (PMd) and ventral one-third (PMv) of the precentral gyrus excluding primary motor cortex, the posterior aspect of the STG and the upper bank of the STS (T1p), and the posterior aspect of the supramarginal gyrus and the angular gyrus (SGp-AG). Table 2 describes the ROIs, their anatomical boundaries, and their functional properties. We were particularly interested in F3o because the distribution of "mirror neurons" is hypothesized to be greatest in this area (Rizzolatti et al., 2002). Another ROI, the anterior aspect of the STG/STS (T1a), was drawn with the hypothesis that activation in this area would be more closely associated with processing of connected discourse (Humphries et al., 2001) and therefore would not differ between the AV and A conditions. Finally, we included an ROI that encompassed the occipital lobe and temporal-occipital visual association cortex (including the lower bank of the posterior STS; TO2-OL) with the hypothesis that activity in this region would reflect visual processing and should not be active in the A condition.

After delimiting these regions, we determined the total volume of activation within each ROI for each condition for each participant. We collected all voxels with a significant change in signal intensity for each task compared to baseline, i.e., voxels exceeding the threshold of z > 3.28, p < .001 corrected. To determine the difference between conditions, we compared the total volume of activation across participants for the AV and A conditions within each ROI using paired t-tests, correcting for multiple comparisons (p < .004 unless otherwise stated).
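To make the procedure concrete, here is a minimal sketch of the per-ROI volume measurement and paired comparison described above. The voxel size, image shape, and data are placeholders (the study's actual pipeline is not given in the text), and the group size of 9 is inferred from the reported t(8) values:

```python
# Sketch of the ROI analysis: count suprathreshold voxels per ROI per
# condition, convert to volume, then compare conditions with a paired t-test.
import numpy as np
from scipy import stats

Z_THRESH = 3.28            # z > 3.28, p < .001 corrected (from the text)
VOXEL_MM3 = 3.0 ** 3       # hypothetical 3 mm isotropic voxels

def activation_volume(z_map, roi_mask):
    """Total volume (mm^3) of significantly active voxels inside one ROI."""
    return np.sum((z_map > Z_THRESH) & roi_mask) * VOXEL_MM3

# Placeholder z-maps and one ROI mask; real data would come from image files.
rng = np.random.default_rng(1)
shape = (64, 64, 32)
roi_mask = rng.random(shape) > 0.99        # a sparse hypothetical ROI
vols_av, vols_a = [], []
for _ in range(9):                          # n = 9, inferred from t(8)
    vols_av.append(activation_volume(rng.normal(0.5, 1.5, shape), roi_mask))
    vols_a.append(activation_volume(rng.normal(0.0, 1.5, shape), roi_mask))

# Paired t-test across participants within this ROI; the paper then applies
# a multiple-comparison correction (significance at p < .004).
t_stat, p_val = stats.ttest_rel(vols_av, vols_a)
print(f"t(8) = {t_stat:.2f}, p = {p_val:.4f}")
```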

- Insert Table 2 about here -

As in the group data, AV differed from A in volume of activation in a network of brain regions related to speech production. These regions included left PMd (t(8) = 5.19), right PMd (t(8) = 3.70), left F3o (t(8) = 4.06), left F3t (t(8) = 3.54), left T1p (t(8) = 4.12), and right T1p (t(8) = 4.45). There was no significant difference in the right F3o, right F3t, and bilateral SGp-AG. There were no significant differences in bilateral T1a, an area less closely associated with speech production and more closely associated with language comprehension. Finally, the AV and A conditions differed in the volume of activation in left TO2-OL (t(8) = 3.45) and right TO2-OL (t(8) = 3.74), areas associated primarily with visual processing.

Viseme Results

There were a variety of observable "non-linguistic" (e.g., head nods) and "linguistic" (e.g., place of articulation) movements produced by the talker in the AV condition. Some of the latter conveyed phonetic feature information, though most mouth movements by themselves are not sufficient for phonetic classification. However, a subset of visual speech movements, "visemes", are sufficient (i.e., without the accompanying auditory modality) for phonetic classification.

In this analysis we wished to determine whether visemes, in contrast to other observable information about face and head movements in the AV stories, modulated activation in those motor regions that distinguish the AV from A conditions. This assesses whether the observed motor system activity was specific to perception of a particular aspect of motor behavior (i.e., speech production) on the part of the observed talker. That is, if the motor system activity is in service of understanding the speech, this activity should be modulated by visual information that is informative about phonetic features, and the presence of visemes within a story should relate to the amount of observed motor system activity.

All stories were phonetically transcribed using the automated Oregon Speech Toolkit (Sutton et al., 1998) and the Center for Spoken Language Understanding (CSLU) labeling guide (Lander & Metzler, 1994). The proportion of visemes, derived from a prior list of visemes (Goldschen, 1993), relative to the total number of phonemes in each story was determined, and stories were grouped into quartiles according to the number of visemes. Stories in the first and fourth (t(6) = 23.97, p < .00001) and the first and third (t(6) = 13.86, p < .00001) quartiles significantly differed in the proportion of visemes. The volume and intensity of brain activity were compared in ROIs for the AV condition between the first and fourth and the first and third viseme quartiles. As a control, we also performed these comparisons for the A condition.
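For illustration only (the authors' transcription pipeline is not reproduced here), a per-story viseme proportion and quartile grouping could be computed along these lines; the viseme set below is a hypothetical stand-in for the Goldschen (1993) list:

```python
# Sketch: proportion of visemes per story, then quartile assignment.
import numpy as np

# Hypothetical stand-in for the Goldschen (1993) viseme inventory.
VISEMES = {"p", "b", "m", "f", "v", "th", "w", "r"}

def viseme_proportion(phonemes):
    """Fraction of a story's phonemes that are visually distinguishable."""
    return sum(ph in VISEMES for ph in phonemes) / len(phonemes)

# One phoneme sequence per story (toy examples standing in for the
# automatic transcriptions described above).
stories = [
    ["s", "i", "g", "a", "k", "t"],
    ["b", "a", "t", "m", "i", "n"],
    ["f", "o", "w", "p", "e", "l"],
    ["t", "e", "r", "d", "o", "m"],
]
props = np.array([viseme_proportion(s) for s in stories])

# Group stories into quartiles by viseme proportion (1 = lowest quartile);
# the paper then contrasts brain activity between quartile groups.
edges = np.quantile(props, [0.25, 0.5, 0.75])
quartiles = np.digitize(props, edges) + 1
for prop, q in zip(props, quartiles):
    print(f"viseme proportion = {prop:.2f} -> quartile {q}")
```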

The intensity of activity significantly increased when comparing the first and fourth quartiles in the AV condition in the same regions distinguishing the AV from A condition, regions that were also active during speech production as identified in our speech production control group. These were the right T1p (t(8) = 1.89, p < .05) and right PMd (t(8) = 2.81, p < .01). The first and third quartiles also differed for the AV condition in three areas: left T1p (t(8) = 3.38, p < .005), right T1p (t(8) = 4.26, p < .002), and right PMd (t(8) = 2.42, p < .02). None of these regions significantly differed for the A condition. In regions identified as being less closely related to speech production and more closely related to language comprehension, only left T1a (t(8) = 2.55, p < .02) for the AV condition, and left T1a (t(8) = 2.79, p < .01) and right SGp-AG (t(8) = 2.21, p < .03) for the A condition, differed when comparing the first and third quartiles. There were no differences between the first and second quartiles for either the AV or A conditions.

Discussion

The present results show that audiovisual language comprehension activates brain areas that are involved in both sensory and motor aspects of speech production. This is an extensive network that comprises Broca's area (i.e., the pars opercularis) of the inferior frontal gyrus, the dorsal aspect of the precentral gyrus, including both premotor and adjacent primary motor cortices, the postcentral gyrus, and the cerebellum. In these areas, there was a paucity of activation in either the audio-alone or video-alone conditions. For the auditory comprehension condition, only the postcentral gyrus was activated both in language comprehension and in speech production. For the visual condition there was some tendency for activation in the pars opercularis, which was active during speech production.

Activation of speech production areas during audiovisual but not audio-alone language comprehension cannot be attributed simply to methodological considerations (e.g., an overly conservative correction for multiple comparisons). Nor are these results likely attributable to differences in speech comprehensibility across conditions. Participants understood and reported details of the stories in both AV and A conditions, and there were no differences in these reports between conditions. The participants attended to and understood the stories in both conditions. In addition, previous research has shown that certain areas associated with language comprehension show greater activity with increasing difficulty of sentence comprehension (Just et al., 1996). If the A condition was more difficult to understand than the AV condition, then we would expect to see greater activity in these areas during the audio-alone condition, but we did not.

When considering the results of the audio-alone condition, we attribute the lack of activity in those cortical areas typically associated with speech production to the fact that under normal conditions listeners can process language solely in terms of its acoustic properties (cf. Klatt, 1979; Stevens & Blumstein, 1981) and thus may not need to recruit the motor system to understand speech (Liberman & Mattingly, 1985). This aspect of our results is consistent with previous functional imaging studies in which passive listening to auditory stimuli does not reliably elicit Broca's area, premotor, or primary motor activation, whereas overt phonetic decisions (among other overt tasks) do (for a review see Small & Burton, 2001). These tasks may engage parts of the brain involved in language production through covert rehearsal and/or working memory (e.g., Jonides et al., 1998; Smith & Jonides, 1999). However, in "normal" listening environments the production system is not normally involved (or is only weakly involved) in auditory language comprehension, although it certainly can be engaged.

Our interpretation of our results is that audiovisual speech activates areas traditionally associated with both speech production and speech comprehension to encode observed facial movements and to integrate them into the overall process of understanding spoken language. This does not occur in the absence of the visual modality. When the visual modality is presented alone, a subset of these processes may take place. In the following sections, we elaborate on this interpretation in relation to our results and to prior research in language comprehension and production.


Broca’s Area

Broca's area was significantly active during both audiovisual and audio-alone language comprehension. This activity was primarily restricted to the pars triangularis in the A condition. Broca's area is traditionally viewed as supporting a mechanism by which phonological forms are coded into articulatory forms (Geschwind, 1965). It is commonly activated during both overt and covert speech production (Friederici et al., 2000; Grafton et al., 1997; Huang et al., 2001; Papathanassiou et al., 2000). However, results of production studies seem to suggest that Broca's area is not itself involved in controlling articulation per se (Bookheimer et al., 1995; Huang et al., 2001; Wise et al., 1999), but may be a "pre-articulatory" region (Blank et al., 2002). In support of this, naming is interrupted in fewer than 36% of patients stimulated at the posterior aspect of the inferior frontal gyrus (Ojemann et al., 1989). Furthermore, lesions restricted to Broca's area are clinically associated with Broca's aphasia for only a few days (Knopman et al., 1983; Masdeu & O'Hara, 1983; Mohr et al., 1978), and the role of Broca's area in producing Broca's aphasia is unclear (Dronkers, 1996, 1998). Further supporting the notion that Broca's area is not involved in controlling articulation per se is the fact that activation in this area is not specific to oral speech: Broca's area is activated during production of American Sign Language (Braun et al., 2001; Corina et al., 1999) and is activated by the observation and imitation of nonlinguistic but meaningful goal-directed movements (Binkofski et al., 2000; Ehrsson et al., 2000; Grezes et al., 1999; Hermsdorfer et al., 2001; Iacoboni et al., 1999; Koski et al., 2002). Nor does activation of Broca's area in nonlinguistic domains simply represent covert verbal coding of the tasks given to subjects (Heiser et al., 2003).

This review suggests that Broca's area, though playing a role in speech production, is not simply a speech production area but rather, given its functional properties, is a general-purpose mechanism for relating (multimodal) perception and action. This review also suggests that refinements are necessary in the functional neuroanatomy of Broca's area in both speech comprehension and production. We distinguished between the pars triangularis and the pars opercularis. We postulate that the common activation of the pars triangularis in both audiovisual and auditory language comprehension may reflect semantic or memory processing related to discourse comprehension in the two conditions (Devlin et al., 2003; Friederici et al., 2000; Gabrieli et al., 1998), and may not be related to cortical systems playing a role in speech production per se. However, as one moves more posteriorly along the gyrus (i.e., toward the pars opercularis), functions tend to be more closely related to production.


Broca’s Area: Pars Opercularis

Our results indicate that AV language comprehension specifically activates the dorsal aspect of the pars opercularis and that this activation overlaps that associated with speech production. Although there is no clear sulcal boundary between areas 44 and 45 (Amunts et al., 1999), we consider the pars opercularis a proxy for area 44, as this is where most of this cytoarchitectural region is located. The results of the ROI analysis confirm that activation was truly in the opercular region of individual subjects. Region 44 is the suggested homologue of macaque inferior premotor cortex (area F5), a region containing mirror neurons (Rizzolatti et al., 2002). Our results are consistent with the known properties of these neurons, namely that they fire upon perception (i.e., hearing and/or seeing) and execution of particular types of goal-directed hand or mouth movements (Fadiga et al., 2000; Gallese et al., 1996; Kohler et al., 2002; Rizzolatti, 1987; Rizzolatti et al., 1996; Umilta et al., 2001). This result is also consistent with other neuroimaging evidence suggesting that the pars opercularis in the human has functional properties consistent with those of mirror neurons (Binkofski et al., 2000; Iacoboni et al., 1999; Koski et al., 2002). More specifically, our results are consistent with the claim that the dorsal aspect of the pars opercularis has more mirror-cell-like properties than the ventral portion, as the dorsal aspect is activated during both observation and imitation of goal-oriented actions, whereas the more ventral portion is activated during imitation only (Koski et al., 2002; Molnar-Szakacs et al., 2002).

From our results and the reviewed literature, we argue that neurons in the dorsal aspect of the pars opercularis of Broca's area play a role in perception of articulatory gestures in audiovisual speech comprehension due to their mirror-like properties. That is, we suggest that the perceived articulatory gesture is represented in this area as a tentative phonological simulation or hypothesis, rather than as a parametric specification of the particular muscle movements necessary to produce it. The increased activation in this region during the V condition suggests that, in the absence of auditory perceptual contingencies or constraints, this area has to work harder to simulate or hypothesize movements.

Dorsal Precentral Gyrus

The observed precentral gyrus and sulcus activity occurred only in the AV condition. This activation was primarily in the dorsal aspect, included both premotor (PMd) and adjacent primary motor cortex, and did not include the classically defined frontal eye fields (Geyer et al., 2000). Activation in this region overlapped with that occurring during speech production. In addition, activation in the right PMd region was modulated by the amount of viseme content in the AV condition and in no other condition. These activation patterns are consistent with the hypothesized role of this area in human speech production. Stimulation of the PMd region has been shown to disrupt vocalization (Ojemann, 1979) and to do so more consistently than stimulation of the inferior frontal gyrus (Ojemann et al., 1989). In addition, this region has been shown to be more consistently active than the pars opercularis during overt speech production (Huang et al., 2001; Wise et al., 1999). Stimulation of these sites, however, does not produce speech arrest nearly to the extent that occurs in the more ventral aspect of the precentral gyrus (Ojemann et al., 1989).

PMd may serve to integrate the hypothesized articulatory goal of the observed talker (specified in the dorsal aspect of the pars opercularis) with the actual motor commands that would be necessary to achieve that goal in production. This hypothesis has been previously suggested in the monkey literature with regard to integrating visual information about motor movements with motor commands (Halsband & Passingham, 1985). Similarly, in humans it has been suggested that PMd is involved in selecting motor acts based on arbitrary visual or auditory stimuli (Grafton et al., 1998; Iacoboni et al., 1998; Kurata et al., 2000). Kurata et al. (2000) conclude that PMd plays an important role in conditional sensory-motor integration. We believe that modulation of activity in this region by the viseme content of the stories reflects the role of PMd in sensory-motor integration: as the ambiguity of the audiovisual signal decreases, there is a concomitant increase in PMd activity, suggesting it is easier to derive a motor plan. This specificity is not apparent in the pars opercularis because there the level of representation is more abstract and multiple hypotheses are being represented. Finally, activation of adjacent primary motor cortex may be inhibitory or may reflect subthreshold activation of the motor plan generated in PMd.

Superior Temporal Gyrus and Sulcus

The superior temporal gyrus and sulcus posterior to primary auditory cortex and anterior to the supramarginal and angular gyri were more active during the AV than during the A condition. This area was also active during speech production. Furthermore, during audiovisual comprehension, activity in this region was significantly modulated by the amount of viseme content in the audiovisual stories, becoming more active as viseme content increased. Previous research has shown that damage to posterior superior temporal cortex results in a deficit in repeating speech (Anderson et al., 1999; Hickok, 2000) and that stimulation of these sites results in speech production errors (Ojemann, 1979; Ojemann et al., 1989). On the perceptual side, research indicates that the STS is activated by the observation of biologically relevant movements (such as the mouth movements in the present study) and by implied movements of the eyes, mouth, and hands (for a review see Allison et al., 2000). In addition, this area is activated to a greater extent by linguistically meaningful facial movements than by facial movements that do not have linguistic meaning (e.g., Campbell et al., 2001). In the present study, the activation produced by the presence of visemes is consistent with the sensitivity of this region to biologically relevant movements and specifically to speech movements. In addition, our finding is consistent with the interpretation that this area is a site participating in the integration of seen and heard speech (Calvert et al., 2000; Sams et al., 1991).

In sum, the posterior superior temporal gyrus and sulcus seem to participate in both speech perception and production as a cortical convergence zone (Damasio et al., 1990) having auditory, visual, and motor properties. We offer an active "knowledge-based" model of speech perception to account for these results (Nusbaum & Schwab, 1986). The posterior superior temporal gyrus and sulcus may provide an audiovisual description of speech actions that can interface with cortical areas that are important in speech production. We suggest it is through this interface that the pars opercularis and dorsal premotor cortex receive audiovisual pattern information about heard and observed speech movements. These areas, in turn, work to generate audio-visual-motor "hypotheses" consistent with the audiovisual properties. The winning hypothesis is generated in pre/primary motor cortex and sent back to the posterior superior temporal area to help constrain interpretation of a given linguistic segment. Such motor-to-sensory discharges have been found to occur during speech production (Paus et al., 1996a; Paus et al., 1996b). We suggest that such discharge also occurs during perception in the audiovisual context because speech is being (covertly) produced or modeled in the form of this hypothesis about what is heard and observed. This also suggests an explanation for the activation of the postcentral gyrus during audiovisual language comprehension, as this area is associated with proprioceptive feedback related to speech production (Lotze et al., 2000). The somatotopy in this region corresponds to the mouth and tongue, and the region is activated by tongue stimulation and movement (Cao et al., 1993; Sakai et al., 1995). This hypothesized movement may help constrain interpretation of a given linguistic segment by interacting with, or by comparison with, the results of sensory processing in posterior STG/STS. This has been proposed for the imitation of manual movements, where it was found that a "reafferent copy" of an actual movement interacts with observed movements in the STG/STS.
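The hypothesize-and-test account in this section can be caricatured in a few lines of code. The following toy sketch is our illustration, not the authors' model: each candidate motor hypothesis carries predicted auditory and visual consequences (fabricated feature vectors), and the hypothesis whose predictions best match the observed evidence wins, in the spirit of the feedback loop described above:

```python
# Toy analysis-by-synthesis sketch: motor "hypotheses" predict sensory
# consequences; the best-matching hypothesis constrains interpretation.
import numpy as np

# Fabricated templates: expected auditory and visual features if the talker
# had produced each candidate syllable (stand-ins for learned associations).
TEMPLATES = {
    "ba": {"audio": np.array([1.0, 0.1]), "visual": np.array([0.9, 0.2])},
    "ga": {"audio": np.array([0.2, 1.0]), "visual": np.array([0.1, 0.8])},
    "da": {"audio": np.array([0.6, 0.6]), "visual": np.array([0.5, 0.5])},
}

def best_hypothesis(audio_obs, visual_obs):
    """Return the hypothesis whose predicted audio and visual consequences
    jointly best match the observed evidence (smallest squared error)."""
    def fit(name):
        t = TEMPLATES[name]
        return -(np.sum((t["audio"] - audio_obs) ** 2)
                 + np.sum((t["visual"] - visual_obs) ** 2))
    return max(TEMPLATES, key=fit)

# McGurk-like input: audio near /ba/, mouth movements near /ga/; the
# intermediate hypothesis /da/ wins, echoing the illusion described earlier.
print(best_hypothesis(np.array([0.9, 0.2]), np.array([0.2, 0.7])))
```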
