1980, Vol. 27 (5), 421-434
Contextual effects in vowel perception II:
Evidence for two processing mechanisms
JAMES R. SAWUSCH, HOWARD C. NUSBAUM, and EILEEN C. SCHWAB
State University of New York, Buffalo, New York 14226
Recent experiments have indicated that contrast effects can be obtained with vowels by
anchoring a test series with one of the endpoint vowels. These contextual effects cannot
be attributed to feature detector fatigue or to the induction of an overt response bias. In
the present studies, anchored ABX discrimination functions and signal detection analyses
of identification data (before and after anchoring) for an [i]-[I] vowel series were used to
demonstrate that [i] and [I] anchoring produce contrast effects by affecting different
perceptual mechanisms. The effects of [i] anchoring were to increase within-[i] category
sensitivity, while [I] anchoring shifted criterion placements. When vowels were placed in CVC
syllables to reduce available auditory memory, there was a significant decrease in the size
of the [I]-anchor contrast effects. The magnitude of the [i]-anchor effect was unaffected by
the reduction in vowel information available in auditory memory. These results suggest that
[i] and [I] anchors affect mechanisms at different levels of processing. The [i] anchoring
results may reflect normalization processes in speech perception that operate at an early
level of perceptual processing, while the [I] anchoring results represent changes in response
criterion mediated by auditory memory for vowel information.
Previous research in speech perception has consistently revealed differences between the perception of stop-consonants and vowels. In experiments on dichotic listening, stop-consonants consistently yield a right-ear advantage (Shankweiler & Studdert-Kennedy, 1967; Studdert-Kennedy & Shankweiler, 1970), while a right-ear advantage is not found for steady-state vowels under similar circumstances (Darwin, 1971; Haggard, 1971; Studdert-Kennedy & Shankweiler, 1970). Experiments on categorical perception also yield consistent differences between stop-consonants and vowels. For the stop consonants, discrimination is typically categorical. That is, the ability to determine whether two stimuli are different is limited by the ability to identify the same stimuli (Liberman, Harris, Hoffman, & Griffith, 1957; Pisoni, 1971, 1973). The discrimination of vowels, however, is typically much better than would be predicted from identification data (Pisoni, 1973, 1975; Fujisaki & Kawashima, Note 1, Note 2). Finally, stop-consonants tend to show little or no influence of contextual information on their identification (Eimas, 1963; Fry, Abramson, Eimas, & Liberman, 1962; Simon & Studdert-Kennedy, 1978; Sawusch & Pisoni, Note 3), while vowels show large changes in identification as a function of context and surrounding vowels (Fry et al., 1962; Ladefoged & Broadbent, 1957; Repp, Healy, & Crowder, 1979; Sawusch & Nusbaum, 1979).
This work was supported by NINCDS Grant NS-12179 to Indiana University (which supported development of the speech synthesizer used in Experiment 3), NIMH Grant MH31468-01 to SUNY/Buffalo, NSF Grant BNS7817068 to SUNY/Buffalo, and SUNY Research Foundation and University Awards grants. The authors would like to thank Dr. David B. Pisoni for making the facilities of the Speech Perception Laboratory at Indiana University available for stimulus preparation and Jerry C. Forshee for his assistance in constructing the tapes for Experiments 1 and 2. Reprint requests should be sent to the first author at the Department of Psychology, 4230 Ridge Lea Road, Buffalo, New York 14226.
The question of contextual influences in speech perception has been addressed by a number of procedures. Eimas (1963), Fry et al. (1962), and, more recently, Repp et al. (1979) have requested subjects to identify stimuli presented in a discrimination task format. One general finding for all these experiments is that identification of any individual stimulus item tends to migrate toward categories other than those of the items it is presented with. This contrastive effect is especially pronounced for ambiguous syllables (those near a phonetic category boundary). Furthermore, the contrastive effects in isolated, steady-state vowel identification are substantially larger than the effects found with stop-consonants (see Eimas, 1963). Using a similar procedure in which stimuli were presented in groups of four, Diehl, Elman, and McCusker (1978; Diehl, Lang, & Parker, in press) have also reported small contrastive changes in stop-consonant identification for near-boundary stimuli.
A different procedure was employed by Ladefoged and Broadbent (1957; Broadbent & Ladefoged, 1960), in which an ambiguous word was placed at the end of a sentence. The first formant frequencies of the carrier sentence were systematically varied. The effect of the various carrier sentences on the ambiguous test items (which differed in their vowel) was one of contrast. That is, with a carrier sentence that was synthesized with first-formant frequencies low in their range, a word that was ambiguous between "bit" and "bet" would be heard as "bet" (higher F1). With a high first-formant carrier sentence, the same word would be heard as "bit" (lower F1).
Copyright 1980 Psychonomic Society, Inc. 0031-5117/80/050421-14$01.65/0
A third procedure that has been used to investigate contextual influences in speech perception is anchoring (Sawusch & Nusbaum, 1979; Simon & Studdert-Kennedy, 1978; Sawusch & Pisoni, Note 3; Rosen, Note 4). In this procedure, subjects are presented with a set of stimuli to be identified under two conditions. The first is an equiprobable control in which each stimulus occurs equally often. In the second, anchor condition, one of the stimuli occurs more often than the other stimuli. This procedure has been used to investigate the perception of brightness (Helson, 1964), dots varying in numerosity (Helson & Kozaki, 1968), heaviness of lifted weights (Parducci, 1963, 1965), and tones varying in frequency or intensity (Cuddy, Pinn, & Simons, 1973; Sawusch & Pisoni, Note 3), as well as vowels and stop-consonants. In general, small contrast effects or no effects at all have been found for stop-consonants as a function of anchoring (Simon & Studdert-Kennedy, 1978; Sawusch & Pisoni, Note 3). However, large, consistent effects have been found in anchoring experiments with vowels (Sawusch & Nusbaum, 1979; Simon & Studdert-Kennedy, 1978; Sawusch & Pisoni, Note 3).
A number of possible processing mechanisms have been considered in connection with these contrast effects with vowels. Sawusch and Nusbaum grouped these into three classes: feature detector fatigue, changes in auditory ground (adaptation level), and changes in response bias. The feature detector fatigue explanation seems to be implausible because the extra occurrences of a stimulus in the anchor condition are usually widely separated in time and are interspersed with presentations of other stimuli. Thus, although virtually identical patterns of results are found for vowels with adaptation (Morse, Kass, & Turkienicz, 1976) and anchoring procedures (see Sawusch & Nusbaum, 1979), both results probably reflect processes other than feature detector fatigue.
The response bias explanation was explored in an experiment by Sawusch and Nusbaum (1979), who found identical contrast effects for subjects who were informed of the extra occurrences of the anchoring vowel and subjects who were not. This would seem to eliminate any overt response bias explanation of the vowel anchoring results in which the subjects simply tried to use the available response categories equally often (cf. Parducci, 1975).
The third possibility concerns changes to an auditory ground (see Sawusch & Nusbaum, 1979; Simon & Studdert-Kennedy, 1978). The auditory ground represents a standard against which incoming stimuli are compared. The composition of this auditory ground could include information from long-term memory about auditory characteristics, patterns, or features for various items, as well as information from auditory memory concerning the immediately preceding stimuli. During baseline identification testing, no one stimulus would dominate the auditory ground, since each stimulus is equally likely. However, when, in an anchoring procedure, one stimulus occurs more often than any of the other stimuli, it is more likely that auditory memory information about this stimulus will be available for comparison with subsequent stimuli. Thus, the auditory ground is more likely to contain information about the anchoring stimulus than any other stimulus. This will cause ambiguous stimuli to be mapped onto categories other than that of the anchoring stimulus.
The influence of the more frequently occurring stimulus could come about in one of two ways. One possibility is that the presentation of any particular stimulus now has a higher probability of being preceded by the anchor stimulus. If the subject retains some trace of the quality of the preceding stimulus and uses this in evaluating the current stimulus, the effect of anchoring would be to give the anchored stimulus the largest weight in this comparison. The second possibility is that a cumulative adaptation level, as suggested by adaptation level theory (Helson, 1964), is the basis of vowel anchoring results. Since the anchored vowel occurs more often than any other vowel, it would have a disproportionate weight in determining this adaptation level. Recent results reported by Nusbaum and Sawusch (Note 5) support the adaptation level description. In their experiment, a target vowel from the middle of an [i]-[I] vowel series was preceded by either an [i] or an [I] endpoint vowel, and the interval between the two vowels was varied. At very short ISIs, both endpoint vowels caused contrast effects in the identification of ambiguous test vowels. However, this influence decreased substantially as ISI was increased to 500 msec. Given that the ISI in previous anchoring studies was 4 sec, the contrast effects found in vowel anchoring cannot be adequately explained simply on the basis of auditory memory for the immediately preceding stimulus. Rather, vowel anchoring seems to involve the build-up of information about the anchoring vowel, possibly in the form of an adaptation level. If this adaptation level is, indeed, auditory in nature, it may be in a form similar to Massaro's (1972) synthesized auditory memory.
The auditory ground explanation is consistent with the vowel anchoring results previously reported. It is also consistent with the anchoring data for stop-consonants. Since stops show less evidence of auditory memory than do vowels (Pisoni, 1971, 1973; Fujisaki & Kawashima, Note 2), they should have a smaller auditory memory component in determining their auditory ground. This should lead to little or no effect of anchoring upon stop-consonant identification, which has, in general, been the case (see Simon & Studdert-Kennedy, 1978; Sawusch & Pisoni, Note 3, for review). The experiments described below were conducted as a further test of the auditory ground and response bias explanations of anchoring effects with vowels.
EXPERIMENT 1
The first experiment was designed to test the effects of anchoring upon ABX discrimination of vowels. Previous experiments that have investigated the relationship between isolated steady-state vowel identification and ABX discrimination have found that listeners' discrimination is not categorical (Pisoni, 1971, 1973, 1975; Fujisaki & Kawashima, Note 1, Note 2). Rather, the discrimination of vowels from within a phonetic category is typically well above chance. Thus, listeners can discriminate vowels that are identified as belonging to the same category. Pisoni (1973, 1975) and Fujisaki and Kawashima (Note 2) have attributed this within-category discrimination to the use of information in auditory memory about the three stimuli in an ABX triad. Thus, to the extent that changes in auditory memory or other early perceptual processes underlie the contrast effects found with the anchoring procedure, we would expect systematic changes in ABX discrimination within a phonetic category as a function of anchoring. That is, discrimination should change systematically at the anchored end of the vowel series.
On the other hand, if the contrast effects due to anchoring are a result of a criterion shift in the labeling (identification) process, then only between-category changes in discriminability would be expected as a function of anchoring. The change in the phonetic category boundary caused by anchoring would lead to a shift in the peak of the ABX discrimination function (since discrimination across the category boundary seems to rely on identification labels in STM; see Pisoni, 1973, 1975; Repp et al., 1979; Fujisaki & Kawashima, Note 2). However, unless changes were found in identification within a phonetic category as a function of anchoring, no change in within-category discriminability would be expected. That is, no changes in discriminability would be expected for the anchored end of the series. Thus, the nature of changes in ABX discrimination as a function of anchoring will provide evidence of the relative involvement of early perceptual processes vs. later criterion shifts in vowel contrast effects.
Method
Subjects. The subjects in this experiment were 12 undergraduate and graduate students at the State University of New York at Buffalo. All subjects were right-handed, native speakers of English with no reported histories of any speech or hearing disorders. The subjects were paid $3/h for their participation.
Stimuli. The stimuli consisted of a set of seven isolated, steady-state vowels which ranged perceptually from [i] as in beet to [I] as in bit. These vowels were originally generated by Pisoni (1971) using the vocal tract analogue synthesizer at the Research Laboratory of Electronics, Massachusetts Institute of Technology. All of the stimuli were 300 msec in duration and contained five formants. These stimuli varied in the frequencies of their first three formants from 270 Hz (F1), 2,300 Hz (F2), and 3,019 Hz (F3) for the [i] end of the series to 374 Hz (F1), 2,070 Hz (F2), and 2,666 Hz (F3) for the [I] end of the series in six logarithmic steps. A more complete description of these stimuli can be found in Pisoni (1971). These seven vowels were recorded on audiotape and then digitized using the PDP-11 computer in the Speech Perception Laboratory at Indiana University. These stimuli were then reconverted to analogue form to make six test tapes. Three of these tapes were identification tapes. In the baseline identification tape, each of the seven stimuli occurred 10 times in random order. In the [i]-anchor tape, Stimulus 1 occurred 40 times and each of the other six stimuli occurred 10 times. In the [I]-anchor tape, Stimuli 1 through 6 each occurred 10 times and Stimulus 7 occurred 40 times. In each of the anchor tapes, the order of stimuli was randomized, with the restriction that no single stimulus could occur more than three times in succession. All three tapes were recorded with 4 sec between stimuli.
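The series spacing and the tape ordering described above can be sketched as follows. This is only an illustration, not the procedure actually used to build the tapes: it assumes that "logarithmic steps" means equal frequency ratios between adjacent stimuli, and it enforces the run restriction by reshuffling until no stimulus repeats more than three times in a row. The function names (`log_steps`, `longest_run`, `tape_order`) and the seed are hypothetical.

```python
import random

def log_steps(start_hz, end_hz, n_stimuli=7):
    """Return n_stimuli frequencies from start_hz to end_hz in equal-ratio steps.

    Assumption: "six logarithmic steps" means a constant ratio between
    adjacent stimuli; the source does not spell this out.
    """
    ratio = (end_hz / start_hz) ** (1.0 / (n_stimuli - 1))
    return [start_hz * ratio ** k for k in range(n_stimuli)]

def longest_run(seq):
    """Length of the longest run of identical consecutive items."""
    best = run = 0
    prev = object()  # sentinel that equals no stimulus number
    for item in seq:
        run = run + 1 if item == prev else 1
        best = max(best, run)
        prev = item
    return best

def tape_order(counts, max_run=3, seed=0):
    """Shuffle stimuli (given as {stimulus: occurrences}) until no stimulus
    occurs more than max_run times in succession (simple rejection sampling,
    an assumed method)."""
    rng = random.Random(seed)
    items = [s for s, n in counts.items() for _ in range(n)]
    while True:
        rng.shuffle(items)
        if longest_run(items) <= max_run:
            return list(items)

# F1 values for the seven-step series, 270 Hz ([i]) to 374 Hz ([I]).
f1 = log_steps(270.0, 374.0)

# [i]-anchor identification tape: Stimulus 1 x 40, Stimuli 2-7 x 10 each.
i_anchor = tape_order({1: 40, 2: 10, 3: 10, 4: 10, 5: 10, 6: 10, 7: 10})
```

The same `tape_order` call with `{7: 40, ...}` would give an [I]-anchor ordering, and equal counts of 10 give a baseline ordering.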
The other three tapes were ABX discrimination tapes. All discrimination tapes were composed of the six one-step ABX triads. In any given triad, the first two stimuli (A and B) were adjacent stimuli from the series. The third stimulus (X) was identical to either the first or the second stimulus. For any given pair of stimuli, this allowed four distinct triads: ABA, ABB, BAA, and BAB. Each of these four compositions for each of the six adjacent pairs occurred four times (96 total triads) in the baseline ABX tape. In the [i]-anchor ABX tape, the Stimulus 1,2 triads (121, 122, 211, 212) each occurred 16 times and the other triads occurred 4 times each. In the [I]-anchor ABX tape, the Stimulus 6,7 triads occurred 16 times each and the other triads occurred 4 times each. In all three ABX tapes, the order of triads was random, with the restriction that no more than three triads from one particular stimulus pair could occur in succession. In all tapes, there was 1 sec between items within a triad and 4 sec between triads.
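As a check on the triad counts given above, the composition of the ABX tapes can be enumerated in a few lines. This is a sketch for illustration only; the function name `abx_triads` and its parameters are hypothetical, and the ordering restriction is not modeled here.

```python
def abx_triads(n_stimuli=7, reps=4, anchor_pair=None, anchor_reps=16):
    """List every triad on an ABX tape (unordered; randomization not modeled).

    Each adjacent pair (a, a+1) contributes the four compositions
    ABA, ABB, BAA, BAB. Triads from anchor_pair are repeated anchor_reps
    times; all other triads are repeated reps times.
    """
    triads = []
    for a in range(1, n_stimuli):
        b = a + 1
        n = anchor_reps if (a, b) == anchor_pair else reps
        for triad in [(a, b, a), (a, b, b), (b, a, a), (b, a, b)]:
            triads.extend([triad] * n)
    return triads

baseline = abx_triads()                       # 6 pairs x 4 triads x 4 reps
i_anchor_abx = abx_triads(anchor_pair=(1, 2))  # Stimulus 1,2 triads x 16
I_anchor_abx = abx_triads(anchor_pair=(6, 7))  # Stimulus 6,7 triads x 16
```

Under these counts the baseline tape holds 96 triads, matching the text, and each anchor tape holds 64 anchored triads plus 80 others.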
Procedure. The subjects were divided into two groups of six subjects each. They were run in small groups of from two to four subjects each. Each subject participated in two 1-h sessions on successive days. The stimulus tapes were reproduced on a Revox A-700 tape deck and presented binaurally to subjects via Telephonics TDH-39 matched and calibrated headphones. The intensity of the stimuli was set to 80 dB SPL for a steady-state calibration vowel ([i]) for all tapes. Each group listened to the baseline identification and ABX tapes at the beginning of each session. The subjects were informed that they would be listening to synthetic syllables that would sound like the vowels [i] and [I]. They were asked to make two responses to each item on the identification tape. First, they were requested to identify each vowel as either [i] or [I]. Their second response was to be a rating indicating how sure they were that they had identified the stimulus correctly. A 4-point scale was used, with a 1 indicating that the subject was positive her (his) identification was correct, a 2 indicating a probable correct, a 3 indicating a possible correct, and a 4 indicating a guess. For the ABX tapes, the subjects were informed that they would be hearing groups of three stimuli. In these groups, the first two stimuli would always be different, while the third would be identical to either the first or the second. They were to indicate whether the third item sounded most like the first stimulus or most like the second.
Following the control tapes, each of the two groups listened to a different set of anchor tapes. The [i] group heard [i] identification and [i] ABX anchor tapes, while the other group heard the corresponding [I]-anchor tapes. The subjects were not given any new instructions regarding these tapes. They used the same response procedures for the two types of anchor tapes that they had used for the baseline tapes. By the end of the experiment, each subject had provided at least 20 identification responses to each stimulus under both baseline and anchoring conditions. They had also provided at least 32 discrimination responses to each pair of stimuli under each condition.
Figure 1. Rating functions (plotted against stimulus value) for the control (solid circles) and anchor (open circles) conditions, with data for the [i]-anchor group on the left and the [I]-anchor group on the right.
Results
The identification and rating responses for the identification tapes were converted to an 8-point scale. A rating of 1 indicated an extremely confident [i] response, ratings of 4 and 5 indicated [i] and [I] guesses, while a rating of 8 indicated a positive [I] response. The results for the two groups are shown in Figure 1. Both groups showed significant shifts in their category boundaries toward the category of the anchoring stimulus [t(5) = 3.80, p < .02, for the [i]-anchor group and t(5) = 4.90, p < .01, for the [I]-anchor group].1 These contrast effects are essentially identical to those previously reported by Sawusch and Nusbaum (1979) and Sawusch and Pisoni (Note 3).
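The conversion to the 8-point scale can be illustrated with a short sketch. The endpoints and the two guess values are taken from the text above; the treatment of the intermediate confidence levels (a simple mirror of the 4-point scale) is an assumption. The `boundary` helper, which places the category boundary where the mean rating crosses 4.5 by linear interpolation, is likewise a hypothetical illustration and not necessarily the boundary definition used for the reported tests.

```python
def to_eight_point(label, confidence):
    """Combine a label ('i' or 'I') and a 1-4 confidence rating
    (1 = positive, 4 = guess) into a single 1-8 rating.

    [i] responses map to 1..4 and [I] responses to 8..5, so that
    1 = positive [i], 4 = [i] guess, 5 = [I] guess, 8 = positive [I].
    """
    if not 1 <= confidence <= 4:
        raise ValueError("confidence must be between 1 and 4")
    if label == "i":
        return confidence
    if label == "I":
        return 9 - confidence
    raise ValueError("label must be 'i' or 'I'")

def boundary(mean_ratings, midpoint=4.5):
    """Stimulus value (numbered from 1) at which the mean rating crosses
    the scale midpoint, by linear interpolation (assumed definition)."""
    for k in range(len(mean_ratings) - 1):
        lo, hi = mean_ratings[k], mean_ratings[k + 1]
        if lo <= midpoint < hi:
            return (k + 1) + (midpoint - lo) / (hi - lo)
    return None  # the function never crosses the midpoint

# Made-up mean ratings for Stimuli 1-7, used only to exercise the helper.
example_boundary = boundary([1.0, 1.5, 2.0, 4.0, 5.0, 7.0, 8.0])
```

With a boundary computed this way for each subject in the baseline and anchor conditions, a shift toward the anchored category would appear as a change in the interpolated stimulus value.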
The ABX discrimination results are shown in Figure 2. The category boundaries for the corresponding identification functions are shown by the arrows. The peak in the [i]-anchor group discrimination function is shifted toward the [i] end of the series, relative to the baseline condition (Figure 2, left side). A corresponding shift in the [I]-anchor ABX peak was found for the [I] group (right side, Figure 2). Of the 12 subjects, 11 show this pattern of a shift in the discrimination function peak toward the anchored category (which is significant, p = .012, using a two-tailed sign test). The one subject who did not show the expected shift was in the [I]-anchor group. This subject showed no evidence of a discrimination peak shift.
In addition to a shift in peak discriminability, the [i]-anchor group showed a marked increase in the discriminability of the Stimulus 1,2 pair (the anchoring pair). Each of the six [i]-anchor-group subjects showed this increase in discriminability, which was significant [t(5) = 5.72, p < .01, for the 16.8% mean increase in discriminability]. No comparable increase in discriminability for the Stimulus 6,7 pair was found for the [I]-anchor group. (Three subjects showed increases in discrimination and three showed decreases following the [I] anchor.)
Discussion
For the [I]-anchor group, no changes in within-category discriminability were found as a function of anchoring. The discriminability of the Stimulus 6,7 pair showed no change due to anchoring. Thus, for the [I] anchor, the contrast effects found could be due to criterion shifts. Criterion shifts in the identification of stimuli would lead to a shift in the ABX discrimination peak if, as is usually assumed, implicit identification (categorization) underlies the between-category discrimination of subjects (see Pisoni, 1973; Repp et al., 1979; Fujisaki & Kawashima, Note 2). The criterion shift explanation of anchoring predicts no change in within-category ABX discrimination because no change in identification performance within either category was found.
The criterion shift explanation does not, however, appear to be an adequate explanation of the [i]-anchor results. In the [i]-anchor group, a large increase in discriminability was found within the [i] category, for the anchored Stimulus 1,2 pair. This increase cannot be accounted for by the small and inconsistent change in identification (rating) for Stimulus 2 following [i] anchoring (see Figure 1, left side).2 The [i] anchoring results seem to be due, in part, to a change in perceptual processing prior to identification of the stimulus. Thus, these results indicate that two distinct types of processing changes may be involved in the contrast effects found with vowels. The next experiment was conducted as a further test of whether [i] and [I] anchoring effects reflect the involvement of distinct perceptual processing mechanisms.
Figure 2. Percent correct ABX discrimination (plotted against stimulus value) for the [i]-anchor group (left) and the [I]-anchor group (right) in the control and anchor conditions. Rating category boundaries (see text) are marked by arrows.
EXPERIMENT 2
If two distinct processes are involved in anchoring effects with vowels, and one of these represents an early perceptual change while the other represents a higher level change, then we might expect them to show up as sensitivity changes and criterion shifts, respectively, in a signal detection analysis. However, our previous experiments have collected far too few judgments per stimulus to allow the use of this type of data analysis. The present experiment was designed to collect a sufficient number of subject responses to each stimulus for a signal detection analysis of individual subject data. The variation of signal detection theory proposed by Durlach and Braida (1969; Braida & Durlach, 1972) will be used to evaluate the data. If the within-category discriminability increase for [i] vowels that was found in Experiment 1 reflects an early perceptual change, we would expect an increase in sensitivity for the Stimulus 1,2 pair as a result of [i] anchoring. However, to the extent that anchoring induces changes at a later stage in perceptual processing, we would expect to find criterion shifts as a result of both [i] and [I] anchoring.
Method
Subjects. The subjects in this experiment were 14 undergraduates at the State University of New York at Buffalo who participated for course credit. These subjects met the same requirements as those in Experiment 1.
Stimuli. The same seven-vowel series used in Experiment 1 was also used here. These stimuli were recorded to make two additional baseline, two [i]-anchor, and two [I]-anchor tapes (yielding three tapes of each type). As before, all stimuli were recorded in random order, with no more than three occurrences of any stimulus in succession.
Procedure. The experimental tapes were reproduced and presented to subjects in a manner similar to that of Experiment 1. The subjects were divided into two groups of seven. All seven subjects in a group were run simultaneously. Each subject participated for a total of 5 h spread over 4 days. In each session, the subjects listened to three presentations of baseline tapes (all using different stimulus orders on any given day) and then, after a short break, listened to three presentations of an anchoring tape (again, all different on any given day). One group listened to [i]-anchor tapes and the other listened to [I]-anchor tapes. In addition, on Day 1, the subjects listened to the seven vowel stimuli in order and one extra presentation of a baseline tape for practice purposes.
The subjects were informed that the tapes contained random orders of seven different vowels varying from [i] to [I]. The subjects then listened to the seven stimuli, in order, from the [i] endpoint to the [I] endpoint. They were asked to use a 7-point response scale and to attempt to uniquely identify each of the seven vowels. A response of 1 was to denote the [i] endpoint vowel, while a 7 was to denote the [I] vowel. The values in between were to identify the intermediate vowels. Subjects then listened to the vowels in order a second time. Following this, they used the seven responses to identify the stimuli from a baseline tape for practice. The results of the practice tape were not included in the data analysis. By the end of the experiment, each subject had provided at least 120 responses (4 sessions × 3 tapes × 10 occurrences) to each of the seven stimuli in both baseline and one of the anchoring conditions, exclusive of the practice tape.
Results
The data for 2 of the 14 subjects were dropped from the experiment because these subjects did not use all seven responses. One subject in the [i] group did not use Responses 3 and 4, and one subject in the [I] group did not use Response 4. The average rating functions for the remaining six subjects in each group are shown in Figure 3. In both groups, a significant shift in the category boundary toward the category of the anchor was found [t(5) = 4.71, p < .01, and t(5) = 5.21, p < .01, for the [i]- and [I]-anchor groups, respectively]. Each of the 12 subjects showed the expected shift in their category boundary.
Figure 3. Rating functions (plotted against stimulus value) for the [i]-anchor (left) and [I]-anchor (right) groups of Experiment 2 in both control (solid circles) and anchored (open circles) conditions.
Figure 4. Paired d' values (plotted by stimulus pair) for the [i]-anchor (left) and [I]-anchor (right) groups in both control (solid circles) and anchored (open circles) conditions.
The confusion matrices (seven stimuli by seven responses) for both baseline and anchored conditions for each subject were submitted to a signal detection analysis. The version of signal detection analysis proposed by Durlach and Braida (1969; Braida & Durlach, 1972) was used. The individual confusion matrices were converted to cumulative probability matrices with entries accumulated over responses for each stimulus. These cumulative probabilities were converted to z scores, with the restriction that only probabilities between .008 and .992 were converted. Cells with probabilities outside this range were considered indefinite and were not used in determining either sensitivities (d') or criterions.3 The d' for each adjacent pair of stimuli was computed by taking the mean difference between z scores for the two stimuli over the seven response alternatives.
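The computation just described (cumulative probabilities, z transformation with the .008 to .992 restriction, and the mean z-score difference for adjacent stimuli) can be sketched as below. This is a minimal reading of the verbal description, not the authors' program: the function names are hypothetical, the range limits are treated as inclusive, and a difference is averaged only over response cut points where both stimuli have a definite z score, which is an assumption about how indefinite cells were handled.

```python
from statistics import NormalDist, mean

def cumulative_z(row, lo=0.008, hi=0.992):
    """Convert one stimulus's response counts to cumulative z scores.

    Entries whose cumulative probability falls outside [lo, hi] are
    treated as indefinite and returned as None.
    """
    total = sum(row)
    zs, running = [], 0
    for count in row:
        running += count
        p = running / total
        zs.append(NormalDist().inv_cdf(p) if lo <= p <= hi else None)
    return zs

def paired_dprime(confusion):
    """d' for each adjacent stimulus pair in a stimulus-by-response matrix.

    Each d' is the mean difference between the two stimuli's z scores,
    taken over the response cut points where both are definite.
    Returns None for a pair with no usable cut points.
    """
    z = [cumulative_z(row) for row in confusion]
    result = []
    for first, second in zip(z, z[1:]):
        diffs = [a - b for a, b in zip(first, second)
                 if a is not None and b is not None]
        result.append(mean(diffs) if diffs else None)
    return result

# Tiny made-up example with two stimuli and two responses.
toy = [[84, 16],
       [16, 84]]
toy_dprime = paired_dprime(toy)
```

Note that the last cumulative entry in every row equals 1.0 and is therefore always indefinite, so seven response categories yield at most six usable cut points per stimulus.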
The paired d' values for baseline and [i]-anchored conditions are shown in Figure 4 on the left. A number of aspects of the data should be noted. First, every subject showed an increase in d' for the Stimulus 1,2 pair and every subject showed a decrease in d' for the Stimulus 4,5 pair. Second, for three of the subjects, the category boundary fell between Stimuli 4 and 5, while for the other three subjects, the boundary fell between Stimuli 3 and 4. Each of the six subjects showed a decrease in d' for the stimulus pair which spanned (one stimulus on either side of) the baseline category boundary. The paired d' values for the baseline and [I]-anchor conditions are shown in the right-hand side of Figure 4. In contrast to the [i]-anchor results, there was no significant change in d' for the stimulus pair spanning the baseline category boundary [t(5) = 1.64, p > .1].
Separate two-way ANOVAs were used to evaluate the d' results for [i] and [I] anchoring. For the [i] anchoring condition, the main effect of stimuli was significant [F(5,25) = 7.50, p < .001], but the main effect of anchoring was not [F(1,5) = .32, p > .25]. The interaction between anchoring and stimuli was significant [F(5,25) = 2.71, p < .05]. Post hoc Newman-Keuls tests revealed that for the Stimulus 1,2 pair, anchoring caused a significant increase in d', while for the Stimulus 4,5 pair, anchoring led to a significant decrease (both p < .05). For the comparable [I]-anchor analysis, the main effect of stimuli was significant [F(5,25) = 6.01, p < .001], but the effect of anchoring was not [F(1,5) = 1.04, p > .25]. The interaction was marginally significant [F(5,25) = 2.42, .05 < p < .1]. Post hoc tests showed that only the increase in d' for the Stimulus 4,5 pair was significant (p < .05). Thus, the d' results from the present experiment are generally consistent with the discriminability changes in Experiment 1. Consistent d' changes were found both within the [i] category, for the Stimulus 1,2 pair, and across the category boundary for [i] anchoring, but little consistent change in d' was found for the [I]-anchor condition.
Criterion cut points were determined for each response pair, based on the previously computed values for d’ Cumulative criterion placements were calculated by determining the z score that cor-responded to each criterion based on a cumulative d’ scale The six cumulative criterion placements for the seven response categories for both baseline and [i] anchoring are shown on the left side of Figure 5 Although there does appear to be an overall shift
in the criteria toward the [i] end of the series, this was not consistent across subjects Rather, four of the subjects showed a shift in all of their criteria toward the [i] end of the series, while two showed
a shift in all criteria toward the [I] end of the series. Thus, the direction of criterion shift for two of the subjects is opposite the shift in the category boundary. However, these two subjects also showed the smallest category boundary shifts due to [i] anchoring. Thus, the criterion shifts may account for part of the category boundary shifts for the [i] anchor. However, all of the [i]-anchor shift for two of the subjects was due to sensitivity changes, while at least part of the [i]-anchor effect for the other four subjects was due to sensitivity changes. The cumulative criteria for the baseline and [I]-anchored functions are shown on the right side of Figure 5. Every one of the six subjects exhibited a shift in all six of their criteria toward the [I] end of the series as a result of [I] anchoring.
Figure 5. Cumulative criterion cutpoints (in z units) for the [i]-anchor (left) and [I]-anchor (right) groups in both control (solid circles) and anchored (open circles) conditions. (Abscissa: response pair.)
Figure 6. Paired values of the sensitivity index P(A) for the [i]-anchor (left) and [I]-anchor (right) groups in both control (solid circles) and anchored (open circles) conditions. (Abscissa: stimulus pair.)
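The placement of criteria on a cumulative d’ scale can be illustrated with a minimal sketch. It assumes an equal-variance normal model in which the z score of each cumulative response proportion equals the criterion position minus the stimulus position; the function names, the layout of the count matrix, and the clipping of extreme proportions are illustrative assumptions and not necessarily the computation used in the present analysis.

```python
from statistics import NormalDist

_Z = NormalDist().inv_cdf  # inverse of the standard normal CDF


def cumulative_z(counts):
    """z-transform the cumulative response proportions.

    counts[i][k] is how often stimulus i received rating k, with ratings
    ordered from the [i] end to the [I] end (a hypothetical layout).
    Returns z[i][k] = z(P(rating <= k | stimulus i)) for all but the last k.
    """
    table = []
    for row in counts:
        n = sum(row)
        running, z_row = 0, []
        for c in row[:-1]:
            running += c
            # Clip proportions of 0 or 1 so the z score stays finite.
            p = min(max(running / n, 0.5 / n), 1 - 0.5 / n)
            z_row.append(_Z(p))
        table.append(z_row)
    return table


def cumulative_scale(counts):
    """Return (paired d', cumulative stimulus positions, criterion placements)."""
    z = cumulative_z(counts)
    n_stim, n_crit = len(z), len(z[0])
    # Equal-variance model: z[i][k] = c_k - mu_i, so the z difference between
    # adjacent stimuli estimates d'; average it over the criteria.
    d_primes = [
        sum(a - b for a, b in zip(z[i], z[i + 1])) / n_crit
        for i in range(n_stim - 1)
    ]
    positions = [0.0]
    for d in d_primes:
        positions.append(positions[-1] + d)
    # Each criterion sits at mu_i + z[i][k]; average the estimates over stimuli.
    criteria = [
        sum(positions[i] + z[i][k] for i in range(n_stim)) / n_stim
        for k in range(n_crit)
    ]
    return d_primes, positions, criteria
```

Comparing the `criteria` list obtained from a baseline matrix with the one obtained from an anchored matrix gives the kind of criterion shift plotted in Figure 5, under the assumptions stated above.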
As a check on the validity of the d’ results, a nonparametric measure of sensitivity was also computed for each pair of stimuli from the cumulative probability matrix for each subject. The area under the ROC curve [P(A); see Green & Swets, 1974] was computed as our alternative measure of paired sensitivity, since it does not depend upon the equal variance, normal distribution assumptions of the Durlach and Braida (1969) model.³ The mean values of P(A) (across subjects) for the baseline and [i]-anchor conditions are shown in Figure 6 (left side). As with the d’ results, a significant decrease in sensitivity was found for the stimulus pair spanning the baseline category boundary for each subject [t(5) = 4.92, p < .01]. A similar P(A) analysis was done for the [I]-anchor subjects, and the mean results are shown in Figure 6 on the right. As with the parametric analysis, no significant change in sensitivity was found for the stimulus pair spanning the baseline category boundary [t(5) = 1.23, p > .2]. As with the d’ data, separate two-way ANOVAs were run on the P(A) data for the [i]- and [I]-anchor conditions. For the [i]-anchor group, the main effect of stimuli was significant, while the effect of anchoring was not [F(5,25) = 5.62, p < .01, and F(1,5) = .33, p > .25, respectively]. The Stimulus by Anchoring interaction was significant [F(5,25) = 6.62, p < .001], and post hoc tests revealed that both the Stimulus 1,2 and Stimulus 2,3 pairs showed significant increases in sensitivity as a result of anchoring, while both the Stimulus 3,4 and Stimulus 4,5 pairs showed significant decreases (all p < .05). For the [I]-anchor group, the main effect of stimuli was significant and the effect of anchoring was not [F(5,25) = 6.183, p < .001, and F(1,5) = 0.21, p > .25, respectively]. The Stimulus by Anchor interaction was marginally significant [F(5,25) = 2.23, .05 < p < .1], and none of the stimulus pairs showed a significant change using post hoc tests. No consistent differences were found in P(A) as a function of [I] anchoring. Thus, the results for P(A) and d’ for the [I]-anchor subjects are virtually identical and show little or no influence of the [I] anchor on sensitivity measures.
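As an illustration of the nonparametric measure, P(A) for one stimulus pair can be approximated by connecting the empirical ROC points with straight lines and summing the trapezoids beneath them. The sketch below assumes rating counts ordered from the most [i]-like to the most [I]-like response category; the function name and the example counts are hypothetical.

```python
def roc_area(counts_a, counts_b):
    """Trapezoidal area under the rating ROC for two stimuli.

    counts_a and counts_b hold rating-category counts for the stimulus nearer
    the [i] end and the stimulus nearer the [I] end, respectively, with
    categories ordered from most [i]-like to most [I]-like.
    """
    n_a, n_b = sum(counts_a), sum(counts_b)
    false_alarm = hit = 0.0
    area = 0.0
    # Sweep the criterion from the strictest to the laxest "[I]" response.
    for c_a, c_b in zip(reversed(counts_a), reversed(counts_b)):
        new_fa = false_alarm + c_a / n_a
        new_hit = hit + c_b / n_b
        area += (new_fa - false_alarm) * (hit + new_hit) / 2.0
        false_alarm, hit = new_fa, new_hit
    return area


# Hypothetical counts for an adjacent stimulus pair (three rating categories).
example_area = roc_area([50, 30, 20], [20, 30, 50])
```

A value of .5 indicates that the two stimuli produce indistinguishable rating distributions, and a value of 1.0 indicates complete separation.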
Discussion
The increases in sensitivity found with both P(A) (area under the ROC curve) and d’ for stimuli within the [i] category following [i] anchoring mirror the increase in ABX discriminability for the [i] category found in Experiment 1. The Stimulus 1,2 pair showed increases in discriminability in both experiments. In addition, decreases in sensitivity were found for both measures at the category boundary after [i] anchoring. When coupled with the inconsistent changes in criterion placement across subjects for the [i] anchor, these results indicate that the contrast effects found with [i] anchoring on an [i]-[I] series are predominantly due to changes in sensitivity. By comparison, consistent changes in sensitivity were not found as a consequence of [I] anchoring. Rather, systematic shifts in criteria toward the [I] category would seem to be responsible for the contrast effects found for an [I] anchor.
Although both Experiments 1 and 2 demonstrate that different processing changes underlie [i] and [I] vowel anchoring, these experiments do not isolate the processing changes themselves. However, since one of the major differences between these vowel stimuli (which show anchoring effects) and stop consonants (which do not consistently show anchoring effects) is the degree of available auditory memory (Pisoni, 1973; Fujisaki & Kawashima, Note 2), some form of auditory memory, as outlined earlier, may underlie part of the anchoring effects found with vowels. Experiment 3 was designed to investigate the role of auditory memory in anchoring.
EXPERIMENT 3
As noted earlier, Nusbaum and Sawusch (Note 5) have provided evidence that if some form of auditory memory is involved in vowel-anchoring effects, it is not the auditory memory trace of only the immediately preceding vowel. This would seem to indicate that the auditory memory explored by Crowder (1971, 1973; Crowder & Morton, 1969), termed precategorical acoustic storage, or PAS, is probably not responsible for our vowel-anchoring effects. The duration of PAS is typically around 2 sec (Darwin, Turvey, & Crowder, 1972), while our ISIs were 4 sec. Thus, if auditory memory is involved in our vowel-anchoring results, it is probably not the
short-lived PAS. However, there is evidence that some form of precategorical information does persist at durations of 2 sec or more. Repp et al. (1979) found that even with a 2-sec interval filled with repetitions of an extraneous semivowel, AX vowel discrimination was well above chance. Thus, some vowel information does seem to persist in a form where it could influence the identification of a following vowel. This vowel information may be involved in our anchoring results. The discriminability of vowels, using either an AX procedure (Repp et al., 1979) or an ABX procedure (Pisoni, 1973, 1975), can thus serve as an index of the strength of this memory trace in determining the role of auditory memory in vowel-anchoring results.
The vowel stimuli used in Experiments 1 and 2 are perceived more nearly continuously than categorically (see Pisoni, 1973). In the present experiment, we embedded vowels in CVC contexts in an effort to reduce the influence of auditory memory upon identification (see Stevens, 1968; Sachs, Note 6). To the extent that this is successful, ABX discrimination results for the CVCs should be less continuous than the isolated vowel results of Pisoni (1973) and more categorical. The results of Experiments 1 and 2 showed that two distinct processes are involved in vowel-anchoring effects. Reducing the information in auditory memory could have its primary influence on the magnitude of either the [i]-anchor contrast effects or the [I]-anchor effects.
If auditory memory underlies the influence of the [i] anchor and the [I] anchor induces changes at a later, response stage, then reducing available auditory memory should also reduce the size of the contrast effects found for [i]. In addition, we might also expect the size of the [I]-induced contrast effects to decrease, since reducing available auditory memory renders perception more categorical (Pisoni, 1973, 1975) and previous results have shown that categorically perceived speech stimuli usually show smaller contrast effects than noncategorically perceived stimuli (Eimas, 1963; Sawusch & Pisoni, Note 3). Thus, if the CVC series show less influence of auditory memory in ABX discrimination and both the CiC and CIC anchor effects are drastically reduced in magnitude, we will have evidence that some form of auditory memory underlies the [i] (and [I]) vowel-anchoring effects.
On the other hand, if auditory memory is not a major factor in [i] anchoring, then a different pattern of results would be expected. According to this explanation, reducing the available auditory memory should reduce the contrast effects only for a CIC anchor (as outlined above). The effects of CiC anchoring would be relatively unaffected by auditory memory and should remain despite decreases in available auditory memory.
A third possibility is that reducing the information available in auditory memory will have no effect on the contrast effects caused by either CiC or CIC anchors. This result is predicted by a model recently proposed by Fujisaki and Shigeno (1979), in which auditory memory underlies assimilation effects (not contrast) in identification. According to this model, contrast effects in identification are mediated by categorical (phonetic) short-term memory, while auditory memory underlies assimilation effects. Thus, reducing the available auditory memory information should either increase the size of the contrast effects found as a function of vowel anchoring (due to a larger reliance on categorical STM) or leave these contrast effects unchanged. The following experiment, in which vowels were placed in CVC syllables to reduce the available auditory memory for vowels, should allow us to discern the role of auditory memory in vowel anchoring.
Method
Subjects. The subjects were 20 undergraduates at the State University of New York at Buffalo, who participated to fulfill a course requirement. They met the same requirements as subjects in the previous experiments.
Stimuli. The stimuli consisted of two sets of seven CVC syllables. One set ranged perceptually from [sis] (as in cease) to [sIs] (as in sister), while the other ranged from [bit] (beet) to [bIt] (bit). In the [sVs] series, 250-msec steady-state vowels were embedded between initial and final [s] fricatives. The [s] fricatives were both 200 msec in duration and consisted of band-limited noise between 3,400 and 5,000 Hz. The seven syllables varied only in the first three formants for the vowel. The actual formant frequencies, bandwidths, and fundamental frequency were identical to those of the isolated vowels used in Experiments 1 and 2 (see also Pisoni, 1971). Thus, the [sis]-[sIs] stimuli represent the embedding of vowels similar to our original isolated vowels in an [sVs] context.
The [bit]-[bIt] series varied both the initial consonantal transitions and the formant frequencies for the vowel in equal logarithmic steps. All seven stimuli were 200 msec in duration and consisted of an initial 30-msec consonantal transition followed by a 90-msec dynamic vowel. During the vowel, the first three formants gradually changed from their values at the end of the consonant to "target" values. The target values were attained 60 msec into the vowel and held for the last 30 msec of the vowel. The formant frequencies for the first three formants of the [bit] and [bIt] endpoints at onset (0 msec), end of consonantal transitions (30 msec), and end of the vowel transitions (90 msec) are shown in Table 1. The values between these points were determined by linear interpolation. The first 35 msec of each vowel contained vocalic excitation, while the last 55 msec were aspirated in preparation for the voiceless final stop. In all seven stimuli, the vowels were followed by a 60-msec silent period and then a 20-msec burst appropriate for the voiceless stop [t]. The endpoints of the [bVt] series were patterned after sound spectrograms of utterances of one of the authors. Spectrograms of these two endpoints are shown in Figure 7. All stimuli were generated using a software cascade synthesizer (Klatt, Note 7, or see Kewley-Port, Note 8) in the Speech Perception Laboratory at the State University of New York at Buffalo.
Table 1. Formant Frequency Values for the [bit] (Stimulus 1) and [bIt] (Stimulus 7) Endpoints of the [bit]-[bIt] Series at Onset (0 msec), 30 msec, and 90 msec.
Figure 7. Sound spectrograms of the [bit] and [bIt] endpoints of the [bit]-[bIt] series used in Experiment 3.
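The construction of the [bVt] series can be sketched in two steps: spacing a formant value across the seven stimuli in equal logarithmic steps, and interpolating each formant linearly in time between the three specified time points (0, 30, and 90 msec). The function names are illustrative, and any frequencies passed to them in an example are hypothetical placeholders rather than the Table 1 values.

```python
def log_steps(start_hz, end_hz, n_steps=7):
    """Frequencies from start_hz to end_hz with a constant ratio between steps."""
    ratio = (end_hz / start_hz) ** (1.0 / (n_steps - 1))
    return [start_hz * ratio ** i for i in range(n_steps)]


def formant_at(t_msec, onset_hz, transition_end_hz, target_hz):
    """Piecewise-linear formant frequency at t_msec after stimulus onset.

    0-30 msec:  consonantal transition (onset value to end-of-transition value)
    30-90 msec: vowel transition toward the target value
    after 90:   target value held until the end of the vowel
    """
    if t_msec <= 30.0:
        return onset_hz + (transition_end_hz - onset_hz) * t_msec / 30.0
    if t_msec <= 90.0:
        fraction = (t_msec - 30.0) / 60.0
        return transition_end_hz + (target_hz - transition_end_hz) * fraction
    return target_hz
```

For example, `log_steps(250.0, 400.0)` returns seven values whose successive ratios are equal, which is what equal logarithmic spacing means; the two endpoint frequencies in that call are made up for illustration.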
The stimuli were converted to analogue form and recorded to make four test tapes for each of the two stimulus series. The baseline identification tape contained 10 occurrences of each stimulus from a series in random order. The [sis] and [bit] endpoint anchor tapes each contained 40 occurrences of the [sis] or [bit] endpoint stimulus and 10 occurrences of each of the other six stimuli in random order. The [sIs] and [bIt] anchor tapes were constructed in a similar fashion. The final tape for each series was an ABX discrimination tape. This tape consisted of two-step comparisons (i.e., Stimuli 1 and 3, 2 and 4, etc.) with 500 msec between stimuli within a triad. This tape was constructed in a fashion similar to the baseline ABX tape for Experiment 1. In all four tapes, there were 4 sec between trials (triads). All stimuli were presented in random order, with the restriction that no stimulus (or stimulus pair for ABX) could occur more than three times in succession.
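A trial order satisfying this restriction can be generated in several ways. The sketch below uses simple rejection sampling (reshuffle until no stimulus occurs more than three times in succession); the function names are illustrative, and this is an assumed procedure, not necessarily the one used to prepare the tapes.

```python
import random


def longest_run(sequence):
    """Length of the longest run of identical consecutive items."""
    best = run = 0
    previous = object()  # sentinel that equals nothing in the sequence
    for item in sequence:
        run = run + 1 if item == previous else 1
        best = max(best, run)
        previous = item
    return best


def constrained_order(counts, max_repeat=3, rng=None, max_tries=10000):
    """Shuffle trials until no stimulus repeats more than max_repeat times.

    counts maps each stimulus number to its number of occurrences on the tape.
    """
    rng = rng or random.Random()
    trials = [stim for stim, n in counts.items() for _ in range(n)]
    for _ in range(max_tries):
        rng.shuffle(trials)
        if longest_run(trials) <= max_repeat:
            return list(trials)
    raise RuntimeError("no acceptable order found; relax the restriction")


# Baseline tape: 10 occurrences of each of the seven stimuli.
baseline_tape = constrained_order({s: 10 for s in range(1, 8)},
                                  rng=random.Random(0))
# Anchor tape: 40 occurrences of Stimulus 1 and 10 of each other stimulus.
anchor_counts = {1: 40, **{s: 10 for s in range(2, 8)}}
anchor_tape = constrained_order(anchor_counts, rng=random.Random(1))
```

Rejection sampling keeps every acceptable order equally likely, at the cost of occasionally needing several reshuffles for the anchor tape, where 40 of the 100 trials are the same stimulus.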
Procedure. The tapes were reproduced and played to the subjects on the same equipment and in a fashion similar to that used in Experiments 1 and 2. Eight of the subjects listened to the [bit]-[bIt] series and 12 listened to the [sis]-[sIs] series. All subjects were run in small groups of from two to six at a time. Each subject participated in two 1-h sessions on separate days. On each day, the subjects listened to one presentation of the baseline identification tape followed by the ABX discrimination tape. Following these, two presentations of the [bit] ([sis]) anchoring tape were presented on one day and two presentations of the [bIt] ([sIs]) anchoring tape on the other. The order of these tapes was counterbalanced across subjects. The subjects used the same identification plus rating response for the identification tapes that was used in Experiment 1. For the ABX tapes, the subjects also used the same response procedure that was used in Experiment 1. By the end of the experiment, each subject had provided at least 20 responses to each stimulus in each of the three identification conditions and 32 responses to each stimulus pair for the ABX discrimination tapes.
Results
The baseline identification and ABX discrimination results for the [sis]-[sIs] group are shown in Figure 8 on the right. For comparison purposes, the ABX results for these same vowels in isolation are shown (from Pisoni, 1973), as are the predictions of the Haskins categorical perception model for our [sis]-[sIs] series.⁴ The ABX results for this series are clearly not categorical [χ²(4) = 28.7, p < .001].⁵ However, the obtained [sis]-[sIs] discrimination is not as continuous as the long, isolated vowel data of Pisoni [χ²(4) = 7.79, .05 < p < .1]. Thus, the [sis]-[sIs] series seems to allow some use of auditory memory but less than that for the vowels used in Experiments 1 and 2.
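For readers unfamiliar with the categorical prediction, its usual two-category form can be sketched as follows. The model assumes that listeners discriminate two stimuli only to the extent that they label them differently, so that predicted ABX accuracy is 0.5[1 + (pA - pB)²], where pA and pB are the probabilities of assigning stimuli A and B to the same category. The identification probabilities below are hypothetical, and the exact computation behind the predictions in Figure 8 may differ in detail.

```python
def haskins_abx(p_a, p_b):
    """Predicted proportion correct in ABX for two stimuli.

    p_a and p_b are the probabilities that stimuli A and B are identified
    as members of the same category (for example, as [i]).
    """
    return 0.5 * (1.0 + (p_a - p_b) ** 2)


def predicted_function(identification, step=2):
    """Predicted ABX accuracy for every comparison `step` stimuli apart."""
    return [
        haskins_abx(identification[i], identification[i + step])
        for i in range(len(identification) - step)
    ]


# Hypothetical probabilities of an [i] response for Stimuli 1 through 7.
ident = [1.0, 1.0, 0.9, 0.5, 0.1, 0.0, 0.0]
two_step_predictions = predicted_function(ident)
```

With seven stimuli and two-step comparisons there are five predicted values, peaking for the pair that straddles the category boundary and falling to chance (.5) for pairs that are labeled identically.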
The identification plus rating data were also converted into the same 8-point rating scale used in Experiment 1. The baseline rating, [sis]-anchor, and [sIs]-anchor functions appear on the left in Figure 8. Each of the anchoring conditions caused a significant shift in the category boundary toward the anchored end of the series [t(11) = 7.72, p < .001, for the [sis] anchor and t(11) = 2.93, p < .02, for the [sIs] anchor]. Furthermore, the [sis]-anchor-induced shift was significantly larger than the [sIs]-anchor shift [t(11) = 3.73, p < .01]. Thus, since the [sis]-[sIs] series shows less use of auditory memory than the isolated vowels and contrast effects at the [I] end of this series seem to have been reduced, it appears that auditory memory is the mediating factor in the anchoring effects of [I] but not of [i].
Figure 8. Rating functions (left) and ABX discrimination (right) data for the [sis]-[sIs] series in Experiment 3. Control (solid triangles), [sis]-anchor (open circles), and [sIs]-anchor (open squares) rating functions are shown on the left. Control rating data (solid triangles), obtained ABX discrimination data (solid circles), Haskins predicted ABX discrimination (open circles), and obtained ABX discrimination for isolated vowels (open squares) are shown on the right. (Abscissa: stimulus value.)
A similar pattern of results was found for the [bit]-[bIt] series. The ABX results are shown on the right side of Figure 9. Again, the predictions of the Haskins model and the data of Pisoni (1971) are shown for comparison. The [bit]-[bIt] data were significantly less categorical than the Haskins predictions [χ²(4) = 26.7, p < .001]. Also, as before, the obtained [bit]-[bIt] discrimination data were not as continuous as the isolated vowels [χ²(4) = 6.01, .1 < p < .2]. The baseline, [bit]-anchor, and [bIt]-anchor rating functions are shown on the left side of Figure 9. Both anchors produced significant shifts in the category boundary toward the category of the anchor [t(7) = 7.04, p < .001, and t(7) = 3.12, p < .02, for the [bit] and [bIt] anchors, respectively]. As with the [sis]-[sIs] series, the [bit] anchor produced a significantly larger shift than the [bIt] anchor [t(7) = 3.45, p < .02].
Discussion
The reduction in available auditory memory for both of these test series, when compared to the long, isolated vowels, clearly indicates that these series are intermediate between categorical and continuous perception. Furthermore, both the [i] and the [I] ends of the series appear to have equal amounts of auditory memory information available, since they have equal within-category discriminability (approximately 65% correct). However, only the anchoring at the [I] end of the series appears to have been affected by this reduction in auditory memory.
In Experiments 1 and 2, the [i] and [I] anchoring effects were approximately equal, with a slightly larger [i] effect in Experiment 1 and a slightly larger [I] effect in Experiment 2. Although direct comparison of the magnitudes of the shifts across experiments is inappropriate because the stimuli are different and hence the category boundary shifts (measured in stimulus units) are not strictly comparable,⁶ the results of Experiment 3 are clearly different from those of Experiments 1 and 2 and from those of Sawusch and Nusbaum (1979). The results with both the [bit]-[bIt] series and the [sis]-[sIs] series indicate that reducing the available auditory memory for vowel information reduces the contrast effects for the [I]-anchor stimuli. Thus, the contrast effects found with [I]-vowel anchoring appear to be mediated by auditory memory for vowel information. As auditory memory was reduced, the contrast effects for [I] anchors were reduced correspondingly. These results are inconsistent with the model proposed by Fujisaki and Shigeno (1979), as outlined earlier. The effect of the [i] anchors, however, was not related to the use of auditory memory. Instead, the [i] effect appears to represent a change in early perceptual processing, separate from auditory memory, possibly the retuning of a prototype from long-term memory.
Figure 9. Rating functions (left) and ABX discrimination (right) data for the [bit]-[bIt] series in Experiment 3. Control (solid triangles), [bit]-anchor (open circles), and [bIt]-anchor (open squares) rating functions are shown on the left. Control rating data (solid triangles), obtained ABX discrimination data (solid circles), Haskins predicted ABX discrimination (open circles), and obtained ABX discrimination for isolated vowels (open squares) are shown on the right. (Abscissa: stimulus value.)