Perception of Contingency in Conditioning: Scalar Timing, Response Bias, and Erasure of Memory by Reinforcement
Peter R. Killeen and James Phillip Smith
Arizona State University
Pigeons' key pecks turned off a key light, which also went off independently of their pecks. The pigeons were able to correctly discriminate the cause of the stimulus change, although their attributions were strongly affected by the amount and the delay of reward given for correct responses. Their discrimination was based on the asynchrony between a response and the change in the key light. A simple detection model that combined detectability and motivational factors provided a good description of the data. It was shown that the discriminative criteria did not change with changes in distribution of noise events in an optimal fashion and that the pigeons therefore were not "ideal detectors." In the final experiment, the pigeons were asked to discriminate the cause of a key light change, of a hopper illumination, and of a feeding. Performance decreased with each condition and with the duration of the last two events. It was noted that the memory trace for a stimulus change decays at the same rate as the primary reinforcement gradient but that it decays faster when the delay is filled with an event such as reinforcement. The possibility that the effects of reinforcement may be blocked by reinforcement is briefly discussed.
Discriminative stimuli are defined in terms of their impact on current behavior, but they also affect subsequent behavior. Analysis of memory in animals has been stimulated by theories of learning that posit an important role for it and other "cognitive" processes (e.g., Grant, Brewster, & Stierhoff, 1983; Shimp, 1976a; Wagner, Rudy, & Whitlow, 1973). The stimuli to be remembered may be not only lights and tones but also reinforcement schedules (Lattal, 1975; Rilling & McDiarmid, 1965), the animal's own behavior (Reynolds, 1966; Shimp, 1983), and even the animal's characterization of a particular behavior-reinforcement contingency (see, e.g., Commons & Nevin, 1981). The techniques used in these studies permit evaluation of sensitivity to events in a motivational context different from the ones in which they occurred and thus allow the measurement of detectability separate from motivational biases.
An instance of this approach that provides the prototype for the research to be reported here is provided by Killeen (1978, 1981b), who asked pigeons to discriminate whether they had brought about a change in the illumination of a key light, or whether that change was a random event initiated by the computer. The pigeons were good at the task, but their performance was strongly affected by the amount of food they received for correct detections relative to the amount for correct rejections. Accounts of "superstitious" conditioning that emphasized discrimination failure were clearly incorrect: Elevated response rates in the presence of noncontingent rewards are better viewed as the effects of bias on temporal generalization gradients.

The present experiments provide a stronger basis for that conclusion: The original experiment is replicated, the motivational variables are changed, better control over response-event asynchronies is provided, a model of the discrimination/motivation interaction is developed, and memory for response-light sequences is contrasted with memory for response-reinforcer sequences.

This research was supported in part by Grant BNS 76-24534 from the National Science Foundation.
Requests for reprints should be sent to Peter Killeen, Department of Psychology, Arizona State University, Tempe, Arizona 85287.
Experiment 1
Method
Subjects. Two White Carneaux pigeons (A and B) and 1 Silver King pigeon (C), all with previous histories of experimentation, served as subjects. They were maintained at 80%-85% of their free-feeding weights.

Table 1
The Order in Which the Pigeons Were Exposed to Each of the Delay Conditions

                          Order
Subject   1         2         3         4         5
A         1.5/1.0   2.5/1.0   1.0/1.5   1.0/2.5   1.0/1.0
B         1.0/1.5   1.0/2.5   2.5/1.0   1.5/1.0   1.0/1.0
C         1.0/2.5   1.0/1.5   2.5/1.0   1.5/1.0   1.0/1.0

Note. The values represent the delay in seconds to the rewards available for correct responses to the left/right keys.
Apparatus. The experimental apparatus comprised a standard three-key Lehigh Valley operant chamber connected to a PDP11/05 computer. The center key was a Gerbrands pigeon key, activated by 0.14 N of force; the side keys were original equipment. A food hopper located beneath the center key provided access to mixed grain. A house-light atop the front panel provided diffuse illumination during the session.
Procedure. The pigeons were trained to peck on the center key for occasional rewards of access to grain. After several days they were shifted to a schedule in which pecks on the center key had a 5% chance of causing its light to go out and the side-key lights to go on. Occasionally the center-key light would go off and the side-key lights would go on independently of the pigeons' behavior. A single peck on one of the side keys would provide either 2.5 s of access to food or 2.5 s during which the chamber was darkened. The subjects would receive the food if and only if they chose the correct side key. If their center-key peck was the event that had caused the transition from center- to side-key lights, then a choice of the right key would be rewarded. If the transition occurred independently of the pigeon's behavior, then a choice of the left key would be rewarded. After reward or time-out, a 3-s intertrial interval elapsed during which only the house-light was illuminated, followed by a reversion to the original condition with the center key lit. The colors of the keys were green (left), white (center), and red (right). The session terminated after 50 rewards.
The probability that the transition would occur without a key peck was determined by letting the computer generate "pseudopecks" at the same rate as the pigeon's pecks and generating a transition with the same probability (5%). The computer updated its estimate of the pigeons' interresponse time (IRT) every second using an exponentially weighted moving average, weighting the most recent latency 20% and the previous average 80%. When the time from the last pseudopeck exceeded the current estimate of the pigeon's IRT, the computer would make another "response." This technique has three implications: (a) The rates of response-dependent and response-independent transitions were approximately equal during a session. (b) The latter events were not truly independent of the pigeon's behavior, because their rates were correlated; during any one trial the correlations were close to zero, but over the course of a session they approached 1.0. (c) A computer-initiated transition could occur "simultaneously" with a pigeon's response and yet be categorized by the computer as computer initiated.
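The running IRT estimate and the pseudopeck rule just described can be sketched in a few lines. This is only an illustration of the stated 20%/80% weighting and the 5% transition probability, not the original PDP11/05 program; the function names and the example latencies are hypothetical.

```python
import random

def update_irt_estimate(previous_average, latest_latency, weight=0.2):
    """Exponentially weighted moving average: 20% newest latency, 80% old average."""
    return weight * latest_latency + (1.0 - weight) * previous_average

def pseudopeck_due(time_since_last_pseudopeck, irt_estimate):
    """A pseudopeck is emitted once the elapsed time exceeds the current IRT estimate."""
    return time_since_last_pseudopeck > irt_estimate

def pseudopeck_causes_transition(rng, probability=0.05):
    """Each pseudopeck triggers a transition with the same 5% chance as a real peck."""
    return rng.random() < probability

# Hypothetical latencies in seconds, one per 1-s update of the estimate.
estimate = 1.0
for latency in [0.5, 0.5, 0.5]:
    estimate = update_irt_estimate(estimate, latency)

rng = random.Random(0)  # seeded only so the sketch is repeatable
transition = pseudopeck_due(0.8, estimate) and pseudopeck_causes_transition(rng)
```

Because the estimate tracks the bird's own response rate, the pseudopeck rate follows it across a session, which is the source of the correlation noted in implication (b).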
After 10 sessions, a 1-s delay was instituted between the offset of the side keys and reward (or time out). This condition was in force for 30 sessions, followed by exposure to the major experimental conditions, in which the value of the delay was varied over the range listed in Table 1. Each condition was in effect for approximately 40 sessions.
Results
The pigeons learned the discrimination; over the course of the experiment the average probability of a correct response was 65%. However, this statistic underestimates the ability of the pigeons, whose behavior was quite sensitive to the differential payoff for correct yes and no responses. Figure 1 plots the data from the various conditions in the traditional coordinates of signal detectability theory: the probability of saying yes given that the pigeons had indeed caused the transition ("hits") and the probability of saying yes given that the computer had caused the transition ("false alarms"). As the relative payoff for hits increased, the data points moved from the lower left corner to the upper right corner of the graph. Maximizing the probability of being correct did not maximize reward; for instance, when the relative payoff for hits was great, it was to the pigeon's advantage to presume that it caused all transitions about which there was any question, rather than be unbiased. A statistic that reflects accuracy independently of bias is A', the area under an average curve through the points (Grier, 1971; Pollack & Norman, 1964). This estimates the accuracy that would obtain if the subjects were unbiased; its value was 79 ± 4% under the conditions of this experiment.
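For readers who want to compute a bias-free accuracy index from a single pair of hit and false-alarm rates, the sketch below uses the closed-form computing formula commonly attributed to Grier (1971). It is an assumption that this single-point form matches the averaging over points described above; the function name and the example rates are hypothetical.

```python
def a_prime(hit_rate, false_alarm_rate):
    """Nonparametric area estimate A' from one (hit, false-alarm) pair."""
    h, f = hit_rate, false_alarm_rate
    if h == f:
        return 0.5  # chance performance lies on the positive diagonal
    if h > f:
        return 0.5 + ((h - f) * (1.0 + h - f)) / (4.0 * h * (1.0 - f))
    # Below-chance case: mirror image of the formula above.
    return 0.5 - ((f - h) * (1.0 + f - h)) / (4.0 * f * (1.0 - h))

# Hypothetical rates, not data from the experiment.
example = a_prime(0.8, 0.2)
```

With hits at 0.8 and false alarms at 0.2 the index is 0.875, whereas the raw proportion correct for the same point would be 0.8.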
How did the pigeons do so well? Presumably they based their choices on a temporal discrimination: If a transition immediately followed a center-key response, they responded as though they had generated the transition, whereas if the stimulus change was delayed, they responded as though it was computer initiated. Figure 2 bears out this presumption. The open symbols are the probability of a right-key response (yes) as a function of the asynchrony between a center-key peck and the subsequent stimulus change for four of the bias conditions. (In the 1.0/2.5 condition, the pigeons were most strongly biased away from the right key by the 2.5-s delay of reward, and they almost never responded there.) The data represent the averages for asynchronies of 825 to 475 ms, 475 to 250 ms, 250 to 60 ms, 60 to -50 ms, and -50 to -150 ms, taken from the last seven sessions at each condition. It can be seen that as the asynchrony between a response and stimulus change decreases, the probability of a right-key response increases. The solid points represent hits, the probability of a right-key response when the stimulus change was caused by the pigeon. They are plotted 60 ms to the left of the origin because that was the measured latency of the key-light change on these trials. Points to the right of the origin represent center-key pecks that occurred after the stimulus change; these responses were infrequent and probably were initiated before the change took place.

Figure 1. The probability of a right-key response given that the transition to lit side keys was caused by a center-key peck ("hits") versus the probability of a right-key response given that the transition was caused by a pseudorandom device in the computer ("false alarms"). (The smooth line, a "perceiver operating characteristic," is derived from Equation 1, assuming a signal-to-noise ratio [S] of [400/60].)

Figure 2. The probability of a right-key response as a function of the asynchrony between the previous center-key peck and the transition to lit side keys. (The parameters signify the delay of reinforcement for correct responses on the left and right keys. For clarity, each curve and its associated data points are elevated 10% above the ones below it. Negative asynchronies indicate that a center-key peck occurred after the transition had begun. Filled symbols represent hits, open symbols false alarms. The data are averages over subjects over the last seven sessions; the curves in the left panel are derived from the discrimination model.)
Discussion
Skinner (1948) proposed that superstitious behavior is caused by adventitious contiguities between a response and reward. Figure 2 shows, however, that pigeons can distinguish between events that they originate and those that they don't, even when the latter occur very soon after a response. But Figures 1 and 2 also show that these gradients are affected by motivational variables; pigeons may behave as though they caused an event whose latency they can clearly discriminate as greater than zero if it is to their advantage to do so. If the cost of such an assertion is low, as it often is in conditioning experiments where it may involve only a few responses that an animal is well prepared to make (Seligman, 1970), a few instances of approximate contiguity may be adequate to generate durable "superstitious" conditioning (Neuringer, 1970). Conversely, only moderate differential costs may be effective in shifting the attribution of causality; the parametric values of delay used in this study were established after months of pilot work during which we searched for differentials small enough to avoid driving the animals to exclusive choices.
The performance of the pigeons may be described by a simple detection model that combines discriminability of the signal with motivational variables. (More sophisticated models are available from Church & Gibbon, 1982; Davison & Tustin, 1978; McCarthy & Davison, 1980; and Nevin, Jenkins, Whittaker, & Yarenski, 1982.) It is based on the choice model of Bradley, Terry, and Luce (Luce, 1963) and predicts the probability of saying "Yes, a signal is present" to be

$$P_y = \frac{S}{S + M'}, \tag{1}$$

where $M'$ is a measure of the motivational bias to say no relative to that to say yes, and $S$ is a measure of the similarity between the event and the signal to be detected. Because the signal to be detected is a minimal asynchrony, similarity must be a decreasing function of the lag between a response and the transition. The simplest such function is

$$S = a_y / a_x, \quad a > 0, \tag{2}$$

where $a_y$ is the asynchrony of the signal (here 60 ms), and $a_x$ is the asynchrony of the event in question. See the Appendix for further discussion of this model.
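Equations 1 and 2 can be evaluated directly, as in the minimal sketch below. The function names are hypothetical, and the bias value of 1.0 used in the example is an arbitrary illustration rather than a fitted value.

```python
def similarity(signal_asynchrony_ms, event_asynchrony_ms):
    """Equation 2: similarity falls as the event lags further behind the response."""
    return signal_asynchrony_ms / event_asynchrony_ms

def p_yes(event_asynchrony_ms, m_prime, signal_asynchrony_ms=60.0):
    """Equation 1: probability of reporting that the peck caused the transition."""
    s = similarity(signal_asynchrony_ms, event_asynchrony_ms)
    return s / (s + m_prime)

# With no net bias (M' = 1, an illustrative value), a 60-ms event is a coin flip
# and longer asynchronies are claimed less and less often.
gradient = [p_yes(a, 1.0) for a in (60.0, 100.0, 200.0, 400.0)]
```

Raising the bias term lowers the whole gradient and lowering it raises the gradient, while the dependence on asynchrony comes entirely from the similarity term.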
For our measure of motivational bias, we turn to incentive theory (Killeen, 1982a, 1982b). A basic assumption of incentive theory is that two factors control responding, a directive factor that comprises the effects of the delayed primary reinforcer (which decay exponentially with delay) and the effects of the conditioned reinforcers signaling the delay (whose strength approximately equals the reciprocal of the delay). The directive factor is multiplied by an arousal factor ($A$), which is proportional to the overall rate of reinforcement, and a response bias parameter ($b$):

$$M = bA\left(e^{-qD} + 1/D\right). \tag{3}$$

The value of $q$ usually lies close to 0.13, and that is the value assigned to it here. Equation 3 is evaluated for both no and yes alternatives, with the ratio of those values being $M'$. Although this treatment of motivational bias is relatively secure, many other concave functions would do about as well, because this factor merely sets the level of the curves in Figure 2; their curvature is determined by the measure of similarity as defined in Equation 2. The single free parameter in this model is the ratio of response biases for the two outcomes, $b' = b_n / b_y$. Assigning $b'$ a value of 0.37 yields the curves in the left panel of Figure 2.
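The bias ratio M' can be sketched as the ratio of Equation 3 evaluated for the no (left-key) and yes (right-key) alternatives. The sketch assumes that the arousal factor A is common to both alternatives and therefore cancels, and that D is expressed in seconds; both are assumptions made for illustration, and the function names are hypothetical.

```python
import math

def directive_factor(delay_s, q=0.13):
    """Bracketed term of Equation 3: primary plus conditioned reinforcement."""
    return math.exp(-q * delay_s) + 1.0 / delay_s

def bias_ratio(delay_no_s, delay_yes_s, b_prime=0.37, q=0.13):
    """M' = M_no / M_yes, assuming the arousal factor A cancels in the ratio."""
    return b_prime * directive_factor(delay_no_s, q) / directive_factor(delay_yes_s, q)

# Delay conditions written as left/right, i.e., no/yes.
m_equal = bias_ratio(1.0, 1.0)       # 1.0/1.0 condition
m_late_yes = bias_ratio(1.0, 2.5)    # 1.0/2.5 condition
m_late_no = bias_ratio(2.5, 1.0)     # 2.5/1.0 condition
```

With equal delays the ratio reduces to b' itself; delaying the reward for yes raises M' and so lowers the predicted probability of a right-key response.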
We may also collapse these curves over their abscissas and ask what is the probability of saying yes given that a signal occurred versus saying yes given that the stimulus change was independent of the animal's behavior for any value of $M'$. We know that the value of $a_y$ for a signal was 60 ms; we can evaluate Equation 1 for a range of values of $M'$ to obtain the ordinates of the curve in Figure 1. We may then do the same for the average asynchrony of the computer-initiated transitions to generate coordinate probabilities of false alarms. Unfortunately, the value of that asynchrony varied with conditions and response rates of individual subjects and is not generally known. However, we may treat it as a free parameter and assign it the value of 400 ms and thus obtain the abscissas for the "operating characteristic" drawn in Figure 1. This analysis is different from traditional signal detectability theory in that it uses independent variables to predict both dependent variables, rather than merely using one dependent variable to predict the other. As with more familiar signal detection models, the distance of the curve from the positive diagonal reflects the animal's sensitivity to the signal; the operating position on that curve is determined by the animal's motivation ($M'$).
This experiment demonstrated an impressive ability of pigeons to discriminate their role in bringing about a subsequent event and a surprising sensitivity of the animals' bias to small differences in the delay of reward. The data supported a simple model of performance based on a temporal discrimination and were extended to the coordinate system of signal detectability theory. But the indeterminacy of the asynchronies of computer-initiated events (their dependence on the pigeons' center-key response rate and their variability from one condition to another) somewhat undermines the strength of this last demonstration, requiring that a parameter be estimated post hoc. That shortcoming is remedied in the next experiment.
Experiment 2
The first experiment demonstrated strong control of pigeons' attributions by small shifts in the delay of rewards. The present experiment replicates it with several modifications. In this experiment motivation is controlled by differential amounts of reward, rather than differential delays of reward. In the first experiment the asynchrony for a transition caused by the animals was 60 ms; in this experiment it is reduced to 20 ms. In the first experiment the asynchrony between a response and a computer-initiated transition varied as a function of the animal's response rate on the center key; in this experiment those events are independent.
Method
Subjects. The subjects were 4 naive Roller pigeons maintained at 80%-85% of their free-feeding weight.
Table 2
The Order in Which the Pigeons Were Exposed to Each of the Amount Conditions

                     Subject
Order   D         E         F         G
1       2.5/2.5   2.5/2.5   2.5/2.5   2.5/2.5
2       1.0/4.0   3.5/1.5   1.0/4.0   3.5/1.5
3       3.5/1.5   1.0/4.0   3.5/1.5   1.0/4.0

Note. The values represent the amount of reward (seconds access to grain) available for correct responses to the left/right keys. The duration of blackout was the same as the duration of reward on that key.
Apparatus and procedure. The apparatus was the same as in the first experiment. The procedure was similar to that of the first experiment. Pecks on the white center key would cause it to go off and the side keys to come on, with a probability of .05. The delay between a response and this transition was 20 ms, which includes the half-life for the decay in brightness of the center-key light (10 ms). If the pigeon then correctly responded to the right key, it would receive grain; if it incorrectly responded to the left key, it would receive a brief blackout. After these events, a 2.5-s intertrial interval ensued, followed by a reversion to the original condition, with the center key lit. Pecks on the center key would also be followed, after a delay, by offset of the center key and onset of the side keys with a probability of .05. On these trials responses to the left key would be rewarded, and responses to the right key would be punished with blackout. The delays were randomly chosen from the set 40, 120, 200, 280, and 360 ms. The measured asynchronies were 20 ms greater than these values, yielding an average asynchrony of 220 ms for the "noise" event. The pigeons received 60 rewards per session and approximately 35 sessions at each of the conditions listed in Table 2.

Insofar as the controlling variable in the first experiment was the asynchrony between the response and the transition, this experiment provides much better control over that variable. However, the problem the pigeons must solve has shifted from one of determining causality to one of making a temporal discrimination. Of course, it is our contention that the latter is often the basis for the former.
Results
This task appears more difficult than the first, with the difference between the "signal" and the average "noise" asynchronies being only 0.2 s and with some of the noise events occurring only 0.06 s after a response. Yet the pigeons performed about as accurately in this experiment as they did in the first, with values of A' being 82, 84, and 75 for Conditions 1/4, 2.5/2.5, and 3.5/1.5, respectively. Their performance, averaged over the last eight sessions at each condition, is shown in Figure 3, where it is evident that small shifts in the amount of reward brought about large shifts in response bias.
The curves through the data in Figure 3 are parameter-free predictions based on the detection model (Equations 1 and 2). The average ratio of asynchronies for hits to false alarms was always 0.137 ($S = \sum (a_y / a_n) / N$), and this completely determined the height of the curves above the positive diagonal. We see some systematic error, with detectability lower than predicted in the 3.5/1.5 condition and higher elsewhere, but overall the predictions are good.
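The value of 0.137 can be checked from the procedure, taking the signal asynchrony as 20 ms and the five measured noise asynchronies as the nominal delays plus 20 ms (60, 140, 220, 300, and 380 ms). The sketch assumes that the average is an unweighted mean of the five ratios, which is consistent with the delays being chosen at random from the set; the variable names are hypothetical.

```python
signal_ms = 20.0
noise_ms = [60.0, 140.0, 220.0, 300.0, 380.0]  # nominal delays + 20 ms

# Mean of the per-event similarities a_y / a_n (Equation 2 applied to each noise value).
mean_similarity = sum(signal_ms / a for a in noise_ms) / len(noise_ms)
rounded = round(mean_similarity, 3)  # 0.137
```

Note that this is the mean of the ratios, not the ratio of the means: 20/220 would give about 0.091 instead.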
In Figure 4 we plot the average temporal gradients for the three conditions. The data from these closely spaced time intervals are irregular, with a strong secondary peak at 300 ms. The variability in Figure 4 is greater than can be explained by sampling error (binomial "error" for these data would be about 4%). The peak at 300 ms is noteworthy: This latency is of the same magnitude as the most common interresponse time on comparable schedules of reinforcement (250-400 ms; Blough, 1963; Gott & Weiss, 1972). It is quite likely, given the high probability of multiple responses at this tempo, that a peck will strike the key just as it changes. Pigeons close their eyes during the dozen milliseconds of impact with the key and could easily take the change to be due to such a peck.

Figure 3. Operating characteristics for the subjects in Experiment 2; the curves are parameter-free predictions derived from Equations 1 and 2. (Circles, triangles, and squares represent food ratios of 1/4, 2.5/2.5, and 3.5/1.5, respectively, for left/right choices.)

Figure 4. The probability of a right-key response as a function of the asynchrony between a center-key peck and the transition to lit side keys in Experiment 2. (The parameters signify the amount of food available for correct responses on the left and right keys. For clarity, each curve is elevated 10% above the ones below it. Filled symbols represent hits, open symbols false alarms. The data are averages over subjects over the last eight sessions.)
The foregoing interpretation reconciles the data in Figure 4 with those in Figure 2 (where the larger bias obscured the effect) and Figure 5 (where there is a similar, though less marked, effect). It implies, however, that the reported values for A' are not true maxima: By excluding from calculation trials in which a peck immediately followed a stimulus change (and trials in which a peck struck off-key or with subthreshold force), even larger values for A' would be obtained. But this experiment was not designed as an evaluation of temporal discriminative ability (for which alternatives of constant asynchrony would be preferred). It was designed to emulate more naturally occurring situations in which organisms must make future decisions (respond again, quit the situation, etc.) based on their detection of a causal relation between their behavior and a change in the environment. Animals may use other cues to establish that relation, being well prepared to recognize some types of causal links and contraprepared to recognize others. Although the absolute values of A' are likely to vary with the nature of the response, the stimulus, and their compatibility, we have demonstrated that a second factor, bias, is also involved. Superstitions cannot be attributed simply to failures of discrimination (response-event asynchronies characteristic of adventitious rewards can easily be discriminated by these animals), but should be thought of in terms of a signal-detection task, in which both signal strength and motivational variables play a joint role in determining performance.

Figure 5. The probability of a right-key response as a function of the asynchrony between a center-key peck and the transition to lit side keys in Experiment 3. (The different symbols represent the performance in the different conditions of the experiment: circles, 120 ms; triangles, 220 ms; squares, 420 ms. Filled symbols represent hits, open symbols false alarms. The data are averages over subjects over the last eight sessions.)
Whereas we have shown that the pigeons' bias is affected by payoff variables in a rational way, this is not to say that they are acting as optimal decision makers. Our model predicts that the animal's choice will depend only on the asynchrony of a response-stimulus sequence but not on the statistics of the distribution of asynchronies due to computer-initiated stimuli. But an optimal decision strategy would take both items into account and move the criterion for saying yes to longer asynchronies as the typical lag between a response and computer-initiated events increased. In the next experiment we stress that implication.
Experiment 3
Method
The subjects and apparatus were the same as in Experiment 2.
The procedure was basically the same as in Experiment 2. The amounts of grain available for hits and correct rejections were equal (2.5 s). The asynchrony of "caused" events was 20 ms; the asynchrony of "uncaused" events, depending on the experimental condition, was drawn from one of three sets with average asynchronies of 120, 220, and 420 ms. The compositions of these sets were as follows: 120 ms: 40, 80, 120, 160, 200; 220 ms: 60, 140, 220, 300, 380; 420 ms: 100, 260, 420, 580, 740. All values are in ms and include the response time of the computer and light bulb. All subjects experienced these sets in the order 420 ms, 220 ms, and 120 ms, with approximately 40 sessions at each condition.
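The stated averages of the three asynchrony sets can be checked with simple arithmetic. In the sketch below the first set is taken to be 40, 80, 120, 160, and 200 ms; the last value is an assumption, chosen because it is the evenly spaced value that reproduces the stated mean of 120 ms. The variable names are hypothetical.

```python
noise_sets_ms = {
    120: [40, 80, 120, 160, 200],    # last value assumed so that the mean is 120 ms
    220: [60, 140, 220, 300, 380],
    420: [100, 260, 420, 580, 740],
}

# Mean asynchrony of each set and the constant step between neighboring values.
means = {label: sum(values) / len(values) for label, values in noise_sets_ms.items()}
steps = {label: values[1] - values[0] for label, values in noise_sets_ms.items()}
```

Each set is centered on its nominal average, and the spacing widens from 40 to 80 to 160 ms, so the 420-ms set extends much further beyond the 150-ms floor of the gradients than the other two.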
Results and Discussion
Figure 5 shows the probability of a response on the right key as a function of the asynchrony between a center-key peck and the transition to the side key lights. The data come from the last eight sessions at each condition. The curves are essentially congruent and confirm our earlier impressions of the acuteness of the pigeon's perception of short delays between its behavior and subsequent events: The curves reach their floors at delays of approximately 150 ms. Again, there is a slight rise in the curves between 250 and 400 ms. The average values of A' were 90, 85, and 80 for Conditions 420, 220, and 120. Differences in accuracy among conditions were not due to different elevations of the curves but to different extensions beyond 150 ms.

As the average asynchrony of delayed events decreased, the animals could optimize their performance by tightening their criterion for saying yes: In the conditions where computer-initiated events typically occurred sooner after a response (e.g., 120 ms), it was to the animal's benefit to say no to some of the more questionable events that it might otherwise have claimed. This did not happen; the circles do not lie below the other curves between 100 and 200 ms. However, the pigeons are operating close to the optimal criteria, and judging from optimized decision-theory models, this systematic error cost them only 4% of the rewards available in the session.
The accuracy of pigeons in making these discriminations may be due in part to the fact that the interval to be timed is initiated by a key peck. That accuracy may be portrayed with two different statistics, the just noticeable difference (jnd) and the Weber fraction. If we measure the jnd as a function of the semi-interquartile range (in Figure 5, the abscissa coordinate to 0.375 minus the abscissa coordinate to 0.625), the present experiments give an impressive value of approximately 50 ms for all curves. If we divide that by the point of subjective equality (the abscissa coordinate to 0.50) to obtain the Weber fraction, the result is a ratio of 0.65, not nearly so impressive (values of 0.1 being obtainable at longer intervals). Thus, although absolute sensitivity (the jnd) is excellent, relative sensitivity (the Weber fraction) is considerably below the best obtainable.
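The relation between the two statistics is a single division, which the sketch below makes explicit. The point of subjective equality is not reported in the text; the value computed here is only the one implied by the stated jnd (about 50 ms) and Weber fraction (0.65). The function names are hypothetical.

```python
def weber_fraction(jnd_ms, pse_ms):
    """Relative sensitivity: just noticeable difference over the point of subjective equality."""
    return jnd_ms / pse_ms

def implied_pse(jnd_ms, weber):
    """Invert the definition to recover the PSE implied by a jnd and a Weber fraction."""
    return jnd_ms / weber

pse_ms = implied_pse(50.0, 0.65)          # about 77 ms
check = weber_fraction(50.0, pse_ms)      # recovers 0.65
```

The same 50-ms jnd at an interval of 500 ms would give a Weber fraction of 0.1, which is why the short intervals used here make relative sensitivity look poor even though absolute sensitivity is fine.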
We conclude that, whereas pigeons are sensitive to the rewards contingent on their decisions and thus bias their responses in a rational fashion, they are not in this paradigm ideal decision makers, in that their bias does not change with changes in the signal-to-noise ratio in an optimal fashion. Their absolute sensitivity is excellent, their relative sensitivity less so.
Experiment 4
In the previous experiments we have demonstrated the acute sensitivity of pigeons to the relationship between their behavior and an ensuing change of cue lights, and we ask in this experiment whether pigeons will be yet more sensitive to the relationship between their behavior and a food reward.
Method
Subjects and apparatus. The subjects were 3 Silver King pigeons (H, I, and J) and 1 common pigeon (K). All had previous histories of reinforcement in other experiments. They were maintained at 80% of their ad libitum weight.
The apparatus was the same as in the previous experiments.
Procedure. There were three major conditions: food, "lite," and null. In the food condition, pecks on the center key would occasionally cause it to go off and the food hopper to be elevated. After a feeding, the side-key lights would come on. If the feeding was an immediate consequence of the pigeon's peck, a response to the right key would yield an additional reinforcement of 2.5 s; if not, a 2.5-s blackout. Conversely, if the feeding was asynchronous with the pigeon's peck, a response to the left key would yield a reinforcement of 2.5 s; if not, a 2.5-s blackout. The feedings were scheduled in the same manner as the transitions in the previous experiment, with the delay of "immediate" events being 20 ms and that of asynchronous events averaging 420 ms, with the same rectangular distribution as in Experiment 3.
The procedure for the lite condition was the same as that for the food condition, but when the center key went off, the light in the food hopper went on (as in the food condition), but the food hopper was not activated and no food was available. After the lite event, the side keys went on, and the pigeons were reinforced for correct discriminations. The purpose of the lite condition was to move the pigeons away from the center key or from their position elsewhere in the box, thus breaking up response chains that might mediate the discrimination, while providing a delay between the event and the subsequent discrimination that was equal to that experienced in the food condition.
The procedure for the null condition was identical to that in the 420-ms condition of Experiment 3: When the center key light went off, the side key lights came on immediately, and correct discriminative responses were reinforced.
The durations of the food and lite events were equal in each phase of the experiment and took values of 1, 2, 3, and 4 s, in that order.
The basic experimental cycle consisted of a 6-day week, with the lite condition on Monday and Thursday, the food condition on Tuesday and Friday, and the null condition on Wednesday and Saturday. The first (1-s) condition lasted for 12 cycles, and the subsequent conditions for 6 cycles.
A session ended after 60 correct responses.
Results
It may be seen in Figure 6 that accuracy decreased as the duration of the food or light increased and that the decrease was much greater for the food condition than for the lite condition; sensitivities to the response-event relation were equal only at the 1-s condition, where the pigeons would have received little or no food in the food condition. Conversely, accuracy in the null condition was high (A' = 89%, the same as that found in Experiment 3 with different subjects under comparable conditions) and did not decrease when event duration was increased in the other conditions. This indicates that general motivational changes, such as satiation, were probably not responsible for the observed decrements in the other conditions.
Discussion
This experiment shows a remarkable decrease in pigeons' ability to identify the relation between their behavior and a subsequent reward as the value of that reward increases. This relationship does not seem to make sense from a functional viewpoint, even though the mechanism (decay of the memory trace as it ages, and greater decay when the interval is filled with distracting events) is well documented. However, the task we have set the animals, that of remembering the events in question, is not the same as the task of learning instrumental responses (Maki, 1979a). This distinction will be discussed later.

Figure 6. Accuracy of the pigeons in reporting the temporal relation between a center-key peck and the ensuing event, either illumination of the hopper light (lite), delivery of food (food), or direct transition to lit side keys (null). (The data are plotted for individual subjects, and averages over subjects, as a function of the duration of the event, that is, the duration of light or food in seconds, from 1 to 4. The solid lines are regressions; the dashed line is an exponential decay function originating at A' = 89 and having a rate constant of 0.13/s.)
We may characterize the rate of decay of accuracy by its half-life, the time necessary for the performance to fall half of the distance from its maximum (at 0 s delay) to chance level. If we take the maximum to be A' = 89, the half-lives for the lite and food conditions were 4.5 s and 2.0 s, respectively. These values are in the range found by other investigators for memory of various events: pigeons' memory of whether they made one or two pecks (2.5 s; Kramer, 1982); pigeons' memory of the location of a red or green stimulus (2.5 s; Jitsumori & Sugimoto, 1982); and pigeons' memory for the location of a previous response (4-6 s; Shimp, 1976b). Jans and Catania (1980) studied pigeons' memory for the color of a stimulus over a "standard" delay (houselight only on) and over a delay filled with "activity" (the feeder was operated). Converting their results to A' and interpolating gives median half-lives of 4 s and 1 s for the standard and activity conditions, respectively. The value of the half-lives will be affected by other factors such as recency (Shimp, 1976b; Weisman, Wasserman, Dodd, & Larew, 1980), "surprisingness" (Maki, 1979b; Wagner, Rudy, & Whitlow, 1973; cf. Colwill & Dickinson, 1980), blocking effects (Cook, 1980; Grant & Roberts, 1976; Maki, Moe, & Bierley, 1977; Tranberg & Rilling, 1980; Wilkie, Summers, & Spetch, 1981; Williams, 1978), and, of course, mediating behavior (Kramer, 1982, Experiments 1 & 2; Smith, Niedorowski, & Attwood, 1982; Zentall, Hogan, Howard, & Moore, 1978). Where mediation is minimized, the results speak to a very rapid decay of the memory trace.
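The half-life defined above can be read off a set of accuracy scores by interpolation. The following is a minimal sketch of one way to do that, assuming chance performance is 50% (an assumption for a two-choice report, not stated in this passage) and using straight-line interpolation between adjacent delays. The function name and the example accuracies are hypothetical and are not data from the experiment.

```python
def half_life_by_interpolation(delays_s, accuracies, a_max=89.0, chance=50.0):
    """Delay at which accuracy first falls halfway from a_max to chance.

    a_max is treated as the accuracy at 0 s delay. Adjacent points are joined
    by straight lines (an assumed interpolation rule). Returns None if the
    halfway level is never reached within the delays supplied.
    """
    target = chance + 0.5 * (a_max - chance)
    points = [(0.0, a_max)] + list(zip(delays_s, accuracies))
    for (t0, a0), (t1, a1) in zip(points, points[1:]):
        if a0 >= target >= a1 and a0 != a1:
            # Linear interpolation between (t0, a0) and (t1, a1).
            return t0 + (a0 - target) * (t1 - t0) / (a0 - a1)
    return None


# Made-up accuracies (percent) at delays of 1 to 4 s, for illustration only.
example_delays = [1, 2, 3, 4]
example_accuracy = [85.0, 75.0, 65.0, 55.0]
print(half_life_by_interpolation(example_delays, example_accuracy))
```

With a maximum of 89 and chance of 50, the halfway level is 69.5, so the made-up series above crosses it between the 2-s and 3-s delays.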
We may compare these results with those issuing from the study of the delay of reinforcement gradient, an enterprise with a venerable tradition in experimental psychology (Benjamin & Perloff, 1982). Estimates of the half-life vary widely, depending in large part on the investigator's empirical success at minimizing mediation by conditioned reinforcement (Renner, 1964). A theoretical way of removing the effects of conditioned reinforcement is provided by Equation 3. In over a dozen different studies Killeen (1982b) found that the contribution of primary reinforcement to the maintenance of differential responding could be captured by an exponential decay function with a half-life of 5.5 s (q = 0.13); this function provides the dashed line through the lite data in Figure 6 (assuming maximum A' of 89 and correcting for chance).¹ Insofar as that line parallels the data, it is evidence that the decay of discriminative (memorial) strength follows a time course similar to the decay in reinforcing strength.
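The dashed line described above can be sketched numerically. The code below assumes that "correcting for chance" means the exponential decay applies only to the portion of accuracy above a 50% chance level; that reading, the constant names, and the function names are assumptions made for illustration. For an exponential with rate constant q, the half-life is ln 2 / q, which for q = 0.13 is about 5.3 s, close to the 5.5 s quoted in the text; the small difference presumably reflects rounding of q.

```python
import math

A_MAX = 89.0   # maximum A' (percent) assumed in the text
CHANCE = 50.0  # assumed chance level for a two-choice report
Q = 0.13       # rate constant (per second) quoted in the text


def dashed_line(t_s, q=Q, a_max=A_MAX, chance=CHANCE):
    """Predicted accuracy when the above-chance portion decays exponentially."""
    return chance + (a_max - chance) * math.exp(-q * t_s)


def half_life(q=Q):
    """Time for an exponential with rate constant q to fall to half its value."""
    return math.log(2) / q


for t in (1, 2, 3, 4):
    print(t, round(dashed_line(t), 1))
print(round(half_life(), 2))
```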
The disruptiveness of interpolated reinforcement on the memory trace, suggested by Shimp (1976b) and Staddon (1974), is depicted by the food line in Figure 6 and by the "activity" data of Jans and Catania (1980). Just as a reinforcer may intervene between a stimulus and an animal's subsequent report of it, thereby impairing memory of the stimulus, it may also intervene between a response and a subsequent reinforcer, thereby impairing the effectiveness of the delayed reinforcer. Arguments for the disruptiveness of interpolated reinforcement on the delay of reinforcement gradient are provided by Killeen (1981a, 1982a), who used that assumption to generate models for predicting response rates as a function of reinforcement rates. Killeen argued that, because the strengthening effects of reinforcement on earlier responses were disrupted more by an intervening period of reinforcement than by a period of quiescence, increasing rates of reinforcement would have marginally decreasing effectiveness as it became ever more likely that the range of impact of a reinforcer would be truncated by interposition of other reinforcers. These considerations led to a model formally equivalent to those of Herrnstein (1970) and Catania (1973), although the interpretation of the parameters differed. The present paradigm provides a basis for independent measurement of a key parameter in that model, the disruptiveness of reinforcement as a function of its duration.
General Discussion
We have demonstrated that pigeons are easily able to discriminate between events that they cause and those that are independent of their responding (Experiment 1), even when the contiguity between a response and an adventitious event is quite close (Experiment 3). The discrimination appears to be based on a temporal discrimination in which the critical variable is the ratio of the asynchrony between a response and stimulus to the typical asynchrony between a response and stimulus that the pigeon has been reinforced for calling caused. Although the pigeons' attributions are quite sensitive to the relative payoffs (Experiments 1 & 2), they do not change with changes in the distribution of asynchronies of delayed events as they should if they were "ideal observers" maximizing payoff (Experiment 3). In Experiment 4 we demonstrated that accuracy in attributing the locus of control over the hopper light onset decayed along approximately the same time course as the theoretical decrease in the effectiveness of primary reinforcement. This suggests that when there is no significant blocking or differential conditioned reinforcement, the ability of a delayed reinforcer to strengthen a response is on the same order as the animals' ability to remember the response at the onset of reinforcement. But as reinforcement continues, memory deteriorates at an accelerated rate. Whereas memory decays rapidly, perhaps exponentially, with the duration of a reinforcer, the ability of that reinforcer to strengthen the response should be correlated not with memory at some instant but with the integral of memory over the course of the reinforcer and should thus increase as a concave function of
¹ Equation 3 was originally used to capture the differential impact on choice responses of delays between them and reward. The exponential decay of primary reinforcement strength was derived as a consequence of potential blocking of the association between two events as a function of the delay between them. The same rationale is appropriate here, even though the events differ: If there is a constant probability of forgetting an event (or having its association blocked), we expect an exponential function. Whether the similarity in values of the time constants is reliable remains to be seen.