
Perception of Contingency in Conditioning: Scalar Timing, Response Bias, and Erasure of Memory by Reinforcement

Peter R. Killeen and James Phillip Smith

Arizona State University

Pigeons' key pecks turned off a key light, which also went off independently of their pecks. The pigeons were able to correctly discriminate the cause of the stimulus change, although their attributions were strongly affected by the amount and the delay of reward given for correct responses. Their discrimination was based on the asynchrony between a response and the change in the key light. A simple detection model that combined detectability and motivational factors provided a good description of the data. It was shown that the discriminative criteria did not change with changes in the distribution of noise events in an optimal fashion and that the pigeons therefore were not "ideal detectors." In the final experiment, the pigeons were asked to discriminate the cause of a key light change, of a hopper illumination, and of a feeding. Performance decreased with each condition and with the duration of the last two events. It was noted that the memory trace for a stimulus change decays at the same rate as the primary reinforcement gradient but that it decays faster when the delay is filled with an event such as reinforcement. The possibility that the effects of reinforcement may be blocked by reinforcement is briefly discussed.

Discriminative stimuli are defined in terms of their impact on current behavior, but they also affect subsequent behavior. Analysis of memory in animals has been stimulated by theories of learning that posit an important role for it and other "cognitive" processes (e.g., Grant, Brewster, & Stierhoff, 1983; Shimp, 1976a; Wagner, Rudy, & Whitlow, 1973). The stimuli to be remembered may be not only lights and tones but also reinforcement schedules (Lattal, 1975; Rilling & McDiarmid, 1965), the animal's own behavior (Reynolds, 1966; Shimp, 1983), and even the animal's characterization of a particular behavior-reinforcement contingency (see, e.g., Commons & Nevin, 1981). The techniques used in these studies permit evaluation of sensitivity to events in a motivational context different from the ones in which they occurred and thus allow the measurement of detectability separate from motivational biases.

An instance of this approach that provides the prototype for the research to be reported here is provided by Killeen (1978, 1981b), who asked pigeons to discriminate whether they had brought about a change in the illumination of a key light, or whether that change was a random event initiated by the computer. The pigeons were good at the task, but their performance was strongly affected by the amount of food they received for correct detections relative to the amount for correct rejections. Accounts of "superstitious" conditioning that emphasized discrimination failure were clearly incorrect: Elevated response rates in the presence of noncontingent rewards are better viewed as the effects of bias on temporal generalization gradients.

The present experiments provide a stronger basis for that conclusion: The original experiment is replicated, the motivational variables are changed, better control over response-event asynchronies is provided, a model of the discrimination/motivation interaction is developed, and memory for response-light sequences is contrasted with memory for response-reinforcer sequences.

This research was supported in part by Grant BNS 76-24534 from the National Science Foundation. Requests for reprints should be sent to Peter Killeen, Department of Psychology, Arizona State University, Tempe, Arizona 85287.

Experiment 1

Method

Subjects. Two White Carneaux pigeons (A and B) and 1 Silver King pigeon (C), all with previous histories of experimentation, served as subjects. They were maintained at 80%-85% of their free-feeding weights.

Table 1
The Order in Which the Pigeons Were Exposed to Each of the Delay Conditions

Order   Subject A   Subject B   Subject C
1       1.5/1.0     1.0/1.5     1.0/2.5
2       2.5/1.0     1.0/2.5     1.0/1.5
3       1.0/1.5     2.5/1.0     2.5/1.0
4       1.0/2.5     1.5/1.0     1.5/1.0
5       1.0/1.0     1.0/1.0     1.0/1.0

Note. The values represent the delay in seconds to the rewards available for correct responses to the left/right keys.

Apparatus. The experimental apparatus comprised a standard three-key Lehigh Valley operant chamber connected to a PDP11/05 computer. The center key was a Gerbrands pigeon key, activated by 0.14 N of force; the side keys were original equipment. A food hopper located beneath the center key provided access to mixed grain. A house-light atop the front panel provided diffuse illumination during the session.

Procedure. The pigeons were trained to peck on the center key for occasional rewards of access to grain. After several days they were shifted to a schedule in which pecks on the center key had a 5% chance of causing its light to go out and the side-key lights to go on. Occasionally the center-key light would go off and the side-key lights would go on independently of the pigeons' behavior. A single peck on one of the side keys would provide either 2.5 s of access to food or 2.5 s during which the chamber was darkened. The subjects would receive the food if and only if they chose the correct side key. If their center-key peck was the event that had caused the transition from center- to side-key lights, then a choice of the right key would be rewarded. If the transition occurred independently of the pigeon's behavior, then a choice of the left key would be rewarded. After reward or time-out, a 3-s intertrial interval elapsed during which only the house-light was illuminated, followed by a reversion to the original condition with the center key lit. The colors of the keys were green (left), white (center), and red (right). The session terminated after 50 rewards.

The probability that the transition would occur without a key peck was determined by letting the computer generate "pseudopecks" at the same rate as the pigeon's pecks and generating a transition with the same probability (5%). The computer updated its estimate of the pigeons' interresponse time (IRT) every second using an exponentially weighted moving average, weighting the most recent latency 20% and the previous average 80%. When the time from the last pseudopeck exceeded the current estimate of the pigeon's IRT, the computer would make another "response." This technique has three implications: (a) The rates of response-dependent and response-independent transitions were approximately equal during a session. (b) The latter events were not truly independent of the pigeon's behavior, because their rates were correlated; during any one trial the correlations were close to zero, but over the course of a session they approached 1.0. (c) A computer-initiated transition could occur "simultaneously" with a pigeon's response and yet be categorized by the computer as computer initiated.
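The pseudopeck scheme can be sketched in a few lines of Python. This is only an illustration of the weighting rule described above (20% to the most recent latency, 80% to the previous average); the function names are hypothetical, and how the original program measured "the most recent latency" at each 1-s update is an assumption here.

```python
import random


def update_irt_estimate(previous_average_s, latest_latency_s, weight=0.2):
    """Exponentially weighted moving average of the interresponse time.

    The most recent latency gets `weight` (20% in the text) and the
    previous average gets the remainder (80%).
    """
    return weight * latest_latency_s + (1.0 - weight) * previous_average_s


def pseudopeck_due(time_since_last_pseudopeck_s, irt_estimate_s):
    """The computer emits a pseudopeck once the elapsed time exceeds the
    current IRT estimate."""
    return time_since_last_pseudopeck_s > irt_estimate_s


def peck_causes_transition(rng, probability=0.05):
    """Each peck (real or pseudo) produces a transition with probability 5%.

    `rng` is any object with a random() method, e.g. random.Random(seed).
    """
    return rng.random() < probability
```

Because real pecks and pseudopecks share the same transition probability and the pseudopeck rate tracks the bird's own rate, the two kinds of transition occur about equally often, which is implication (a) in the text.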

After 10 sessions, a 1-s delay was instituted between the offset of the side keys and reward (or time-out). This condition was in force for 30 sessions, followed by exposure to the major experimental conditions, in which the value of the delay was varied over the range listed in Table 1. Each condition was in effect for approximately 40 sessions.

Results

The pigeons learned the discrimination; over the course of the experiment the average probability of a correct response was 65%. However, this statistic underestimates the ability of the pigeons, whose behavior was quite sensitive to the differential payoff for correct yes and no responses. Figure 1 plots the data from the various conditions in the traditional coordinates of signal detectability theory: the probability of saying yes given that the pigeons had indeed caused the transition ("hits") and the probability of saying yes given that the computer had caused the transition ("false alarms"). As the relative payoff for hits increased, the data points moved from the lower left corner to the upper right corner of the graph. Maximizing the probability of being correct did not maximize reward; for instance, when the relative payoff for hits was great, it was to the pigeon's advantage to presume that it caused all transitions about which there was any question, rather than be unbiased. A statistic that reflects accuracy independently of bias is A', the area under an average curve through the points (Grier, 1971; Pollack & Norman, 1964). This estimates the accuracy that would obtain if the subjects were unbiased; its value was 79 ± 4% under the conditions of this experiment.
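For readers who want to compute A' themselves, the following Python sketch uses the single-point computing formula usually attributed to Grier (1971). It is an assumption that this per-point formula matches how the values reported here were obtained; the text describes A' only as the area under an average curve through the points. The function name is hypothetical.

```python
def a_prime(hit_rate, false_alarm_rate):
    """Nonparametric sensitivity index A' for one (hit, false-alarm) pair.

    Returns 0.5 at chance and 1.0 for perfect discrimination. The mirrored
    branch handles below-chance performance (hits < false alarms).
    """
    h, f = hit_rate, false_alarm_rate
    if h == f:
        return 0.5
    if h > f:
        return 0.5 + ((h - f) * (1.0 + h - f)) / (4.0 * h * (1.0 - f))
    return 0.5 - ((f - h) * (1.0 + f - h)) / (4.0 * f * (1.0 - h))
```

Unlike the raw proportion correct, this index stays roughly constant as a subject moves along one operating characteristic, which is why it is preferred when payoffs bias responding.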

How did the pigeons do so well? Presumably they based their choices on a temporal discrimination: If a transition immediately followed a center-key response, they responded as though they had generated the transition, whereas if the stimulus change was delayed, they responded as though it was computer initiated. Figure 2 bears out this presumption. The open symbols are the probability of a right-key response (yes) as a function of the asynchrony between a center-key peck and the subsequent stimulus change for four of the bias conditions. (In the 1.0/2.5 condition, the pigeons were most strongly biased away from the right key by the 2.5-s delay of reward, and they almost never responded there.) The data represent the averages for asynchronies of 825 to 475 ms, 475 to 250 ms, 250 to 60 ms, 60 to -50 ms, and -50 to -150 ms, taken from the last seven sessions at each condition. It can be seen that as the asynchrony between a response and stimulus change decreases, the probability of a right-key response increases. The solid points represent hits, the probability of a right-key response when the stimulus change was caused by the pigeon. They are plotted 60 ms to the left of the origin because that was the measured latency of the key-light change on these trials. Points to the right of the origin represent center-key pecks that occurred after the stimulus change; these responses were infrequent and probably were initiated before the change took place.

[Figure 1 omitted. Horizontal axis: probability of a false alarm.]

Figure 1. The probability of a right-key response given that the transition to lit side keys was caused by a center-key peck ("hits") versus the probability of a right-key response given that the transition was caused by a pseudorandom device in the computer ("false alarms"). (The smooth line, a "perceiver operating characteristic," is derived from Equation 1, assuming a signal-to-noise ratio [S] of [400/60].)

[Figure 2 omitted. Horizontal axis: asynchrony of event (ms).]

Figure 2. The probability of a right-key response as a function of the asynchrony between the previous center-key peck and the transition to lit side keys. (The parameters signify the delay of reinforcement for correct responses on the left and right keys. For clarity, each curve and its associated data points are elevated 10% above the ones below it. Negative asynchronies indicate that a center-key peck occurred after the transition had begun. Filled symbols represent hits, open symbols false alarms. The data are averages over subjects over the last seven sessions; the curves in the left panel are derived from the discrimination model.)

Discussion

Skinner (1948) proposed that superstitious behavior is caused by adventitious contiguities between a response and reward. Figure 2 shows, however, that pigeons can distinguish between events that they originate and those that they don't, even when the latter occur very soon after a response. But Figures 1 and 2 also show that these gradients are affected by motivational variables; pigeons may behave as though they caused an event whose latency they can clearly discriminate as greater than zero if it is to their advantage to do so. If the cost of such an assertion is low, as it often is in conditioning experiments where it may involve only a few responses that an animal is well prepared to make (Seligman, 1970), a few instances of approximate contiguity may be adequate to generate durable "superstitious" conditioning (Neuringer, 1970). Conversely, only moderate differential costs may be effective in shifting the attribution of causality; the parametric values of delay used in this study were established after months of pilot work during which we searched for differentials small enough to avoid driving the animals to exclusive choices.

The performance of the pigeons may be described by a simple detection model that combines discriminability of the signal with motivational variables. (More sophisticated models are available from Church & Gibbon, 1982; Davison & Tustin, 1978; McCarthy & Davison, 1980; and Nevin, Jenkins, Whittaker, & Yarenski, 1982.) It is based on the choice model of Bradley, Terry, and Luce (Luce, 1963) and predicts the probability of saying "Yes, a signal is present" to be

$P_y = S/(S + M')$, (1)

where $M'$ is a measure of the motivational bias to say no relative to that to say yes, and $S$ is a measure of the similarity between the event and the signal to be detected. Because the signal to be detected is a minimal asynchrony, similarity must be a decreasing function of the lag between a response and the transition. The simplest such function is

$S = a_y/a_x, \quad a > 0$, (2)

where $a_y$ is the asynchrony of the signal (here 60 ms), and $a_x$ is the asynchrony of the event in question. See the Appendix for further discussion of this model.
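Equations 1 and 2 translate directly into code. The sketch below is a minimal Python rendering; the function names are hypothetical, and the bias value used in any call is whatever the reader chooses to supply, not a fitted quantity.

```python
def similarity(signal_asynchrony_ms, event_asynchrony_ms):
    """Equation 2: S = a_y / a_x, defined only for positive asynchronies."""
    if signal_asynchrony_ms <= 0 or event_asynchrony_ms <= 0:
        raise ValueError("asynchronies must be positive")
    return signal_asynchrony_ms / event_asynchrony_ms


def p_yes(similarity_value, motivational_bias):
    """Equation 1: probability of responding 'yes', P_y = S / (S + M')."""
    return similarity_value / (similarity_value + motivational_bias)
```

With the signal asynchrony fixed at 60 ms, an event at 60 ms has S = 1, so an unbiased subject (M' = 1) says yes half the time; events at longer asynchronies have smaller S and therefore a lower probability of a yes response.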

For our measure of motivational bias, we turn to incentive theory (Killeen, 1982a, 1982b). A basic assumption of incentive theory is that two factors control responding. The first is a directive factor that comprises the effects of the delayed primary reinforcer (which decay exponentially with delay) and the effects of the conditioned reinforcers signaling the delay (whose strength approximately equals the reciprocal of the delay). The directive factor is multiplied by an arousal factor ($A$), which is proportional to the overall rate of reinforcement, and a response bias parameter ($b$):

$M = bA(e^{-qD} + 1/D)$, (3)

where $D$ is the delay of reinforcement. The value of $q$ usually lies close to 0.13, and that is the value assigned to it here. Equation 3 is evaluated for both no and yes alternatives, with the ratio of those values being $M'$. Although this treatment of motivational bias is relatively secure, many other concave functions would do about as well, because this factor merely sets the level of the curves in Figure 2; their curvature is determined by the measure of similarity as defined in Equation 2. The single free parameter in this model is the ratio of response biases for the two outcomes, $b' = b_n/b_y$. Assigning $b'$ a value of 0.37 yields the curves in the left panel of Figure 2.
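As a rough illustration of how Equation 3 yields the bias ratio M', the Python sketch below evaluates the directive factor for the no (left-key) and yes (right-key) delays and scales the ratio by b' = 0.37. It assumes that the arousal factor A is common to both alternatives within a condition and therefore cancels in the ratio; that cancellation, and the function names, are assumptions of this sketch rather than statements from the text.

```python
import math

Q = 0.13        # decay rate per second, the value assigned in the text
B_PRIME = 0.37  # ratio of response biases b_n / b_y reported in the text


def directive_factor(delay_s, q=Q):
    """Bracketed term of Equation 3: exp(-q * D) + 1 / D, for delay D > 0."""
    if delay_s <= 0:
        raise ValueError("delay must be positive")
    return math.exp(-q * delay_s) + 1.0 / delay_s


def bias_ratio(delay_no_s, delay_yes_s, b_prime=B_PRIME):
    """M' = M_no / M_yes, with the arousal factor A assumed to cancel."""
    return b_prime * directive_factor(delay_no_s) / directive_factor(delay_yes_s)
```

For example, `bias_ratio(1.0, 2.5)` corresponds to the 1.0/2.5 condition of Table 1 (left/right delays) and gives a larger M', that is, a stronger bias toward no, than the equal-delay condition `bias_ratio(1.0, 1.0)`, which reduces to b' itself.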

We may also collapse these curves over their abscissas and ask what is the probability of saying yes given that a signal occurred versus saying yes given that the stimulus change was independent of the animal's behavior for any value of $M'$. We know that the value of $a_y$ for a signal was 60 ms; we can evaluate Equation 1 for a range of values of $M'$ to obtain the ordinates of the curve in Figure 1. We may then do the same for the average asynchrony of the computer-initiated transitions to generate coordinate probabilities of false alarms. Unfortunately, the value of that asynchrony varied with conditions and response rates of individual subjects and is not generally known. However, we may treat it as a free parameter and assign it the value of 400 ms and thus obtain the abscissas for the "operating characteristic" drawn in Figure 1. This analysis is different from traditional signal detectability theory in that it uses independent variables to predict both dependent variables, rather than merely using one dependent variable to predict the other. As with more familiar signal detection models, the distance of the curve from the positive diagonal reflects the animal's sensitivity to the signal; the operating position on that curve is determined by the animal's motivation ($M'$).
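The construction of the operating characteristic can be sketched as follows. The Python code sweeps the bias M' and evaluates Equation 1 once for a caused transition (asynchrony 60 ms) and once for a computer-initiated transition (asynchrony 400 ms, the value treated as a free parameter in the text). The particular bias values in the sweep are hypothetical and chosen only to trace out the curve.

```python
def operating_characteristic(bias_values, signal_ms=60.0, noise_ms=400.0):
    """Return (false_alarm, hit) pairs, one for each value of the bias M'."""
    s_signal = signal_ms / signal_ms  # S for a caused transition: a_y / a_y = 1
    s_noise = signal_ms / noise_ms    # S for a computer-initiated transition
    points = []
    for m in bias_values:
        hit = s_signal / (s_signal + m)         # Equation 1 on signal trials
        false_alarm = s_noise / (s_noise + m)   # Equation 1 on noise trials
        points.append((false_alarm, hit))
    return points


# Illustrative sweep of M'; these values are not fitted to any data.
curve = operating_characteristic([0.05, 0.1, 0.2, 0.5, 1.0, 2.0, 5.0])
```

Small values of M' (a strong bias toward yes) place the operating point near the upper right corner, and large values place it near the lower left; every point lies above the positive diagonal because the signal's similarity exceeds that of the noise event.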

This experiment demonstrated an impressive ability of pigeons to discriminate their role in bringing about a subsequent event and a surprising sensitivity of the animals' bias to small differences in the delay of reward. The data supported a simple model of performance based on a temporal discrimination and were extended to the coordinate system of signal detectability theory. But the indeterminacy of the asynchronies of computer-initiated events (their dependence on the pigeons' center-key response rate and their variability from one condition to another) somewhat undermines the strength of this last demonstration, requiring that a parameter be estimated post hoc. That shortcoming is remedied in the next experiment.

Experiment 2

The first experiment demonstrated strong control of pigeons' attributions by small shifts in the delay of rewards. The present experiment replicates it with several modifications. In this experiment motivation is controlled by differential amounts of reward, rather than differential delays of reward. In the first experiment the asynchrony for a transition caused by the animals was 60 ms; in this experiment it is reduced to 20 ms. In the first experiment the asynchrony between a response and a computer-initiated transition varied as a function of the animal's response rate on the center key; in this experiment those events are independent.

Method

Subjects. The subjects were 4 naive Roller pigeons maintained at 80%-85% of their free-feeding weight.

Table 2
The Order in Which the Pigeons Were Exposed to Each of the Amount Conditions

Order   Subject D   Subject E   Subject F   Subject G
1       2.5/2.5     2.5/2.5     2.5/2.5     2.5/2.5
2       1.0/4.0     3.5/1.5     1.0/4.0     3.5/1.5
3       3.5/1.5     1.0/4.0     3.5/1.5     1.0/4.0

Note. The values represent the amount of reward (seconds access to grain) available for correct responses to the left/right keys. The duration of blackout was the same as the duration of reward on that key.

Apparatus and procedure. The apparatus was the same as in the first experiment. The procedure was similar to that of the first experiment. Pecks on the white center key would cause it to go off and the side keys to come on, with a probability of .05. The delay between a response and this transition was 20 ms, which includes the half-life for the decay in brightness of the center-key light (10 ms). If the pigeon then correctly responded to the right key, it would receive grain; if it incorrectly responded to the left key, it would receive a brief blackout. After these events, a 2.5-s intertrial interval ensued, followed by a reversion to the original condition, with the center key lit. Pecks on the center key would also be followed, after a delay, by offset of the center key and onset of the side keys with a probability of .05. On these trials responses to the left key would be rewarded, and responses to the right key would be punished with blackout. The delays were randomly chosen from the set 40, 120, 200, 280, and 360 ms. The measured asynchronies were 20 ms greater than these values, yielding an average asynchrony of 220 ms for the "noise" event. The pigeons received 60 rewards per session and approximately 35 sessions at each of the conditions listed in Table 2.

Insofar as the controlling variable in the first experiment was the asynchrony between the response and the transition, this experiment provides much better control over that variable. However, the problem the pigeons must solve has shifted from one of determining causality to one of making a temporal discrimination. Of course, it is our contention that the latter is often the basis for the former.

Results

This task appears more difficult than the first, with the difference between the "signal" and the average "noise" asynchronies being only 0.2 s and with some of the noise events occurring only 0.06 s after a response. Yet the pigeons performed about as accurately in this experiment as they did in the first, with values of A' being .82, .84, and .75 for Conditions 1/4, 2.5/2.5, and 3.5/1.5, respectively. Their performance, averaged over the last eight sessions at each condition, is shown in Figure 3, where it is evident that small shifts in the amount of reward brought about large shifts in response bias.

The curves through the data in Figure 3 are parameter-free predictions based on the detection model (Equations 1 and 2). The average ratio of asynchronies for hits to false alarms was always 0.137 ($S = \sum (a_y/a_n)/N$), and this completely determined the height of the curves above the positive diagonal. We see some systematic error, with detectability lower than predicted in the 3.5/1.5 condition and higher elsewhere, but overall the predictions are good.
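The value 0.137 can be reproduced by averaging the ratio of the signal asynchrony (20 ms) to each of the five measured noise asynchronies used in Experiment 2 (the scheduled delays plus 20 ms). The Python sketch below does this; the function name is hypothetical.

```python
def mean_similarity(signal_ms, noise_set_ms):
    """Average of a_y / a_n over the N noise asynchronies."""
    ratios = [signal_ms / a_n for a_n in noise_set_ms]
    return sum(ratios) / len(ratios)


scheduled_delays_ms = (40, 120, 200, 280, 360)
measured_noise_ms = [delay + 20 for delay in scheduled_delays_ms]
S = mean_similarity(20, measured_noise_ms)
```

Here `S` rounds to 0.137. Note that this is the mean of the individual ratios; dividing the signal asynchrony by the mean noise asynchrony (20/220) would instead give about 0.09, so the per-event average appears to be the quantity the text reports.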

In Figure 4 we plot the average temporal gradients for the three conditions. The data from these closely spaced time intervals are irregular, with a strong secondary peak at 300 ms. The variability in Figure 4 is greater than can be explained by sampling error (binomial "error" for these data would be about 4%). The peak at 300 ms is noteworthy: This latency is of the same magnitude as the most common interresponse time on comparable schedules of reinforcement (250-400 ms; Blough, 1963; Gott & Weiss, 1972). It is quite likely, given the high probability of multiple responses at this tempo, that a peck will strike the key just as it changes. Pigeons close their eyes during the dozen milliseconds of impact with the key and could easily take the change to be due to such a peck.

[Figure 3 omitted. Horizontal axis: probability of a false alarm.]

Figure 3. Operating characteristics for the subjects in Experiment 2; the curves are parameter-free predictions derived from Equations 1 and 2. (Circles, triangles, and squares represent food ratios of 1/4, 2.5/2.5, and 3.5/1.5, respectively, for left/right choices.)

[Figure 4 omitted. Horizontal axis: asynchrony of event (ms).]

Figure 4. The probability of a right-key response as a function of the asynchrony between a center-key peck and the transition to lit side keys in Experiment 2. (The parameters signify the amount of food available for correct responses on the left and right keys. For clarity, each curve is elevated 10% above the ones below it. Filled symbols represent hits, open symbols false alarms. The data are averages over subjects over the last eight sessions.)

The foregoing interpretation reconciles the data in Figure 4 with those in Figure 2 (where the larger bias obscured the effect) and Figure 5 (where there is a similar, though less marked, effect). It implies, however, that the reported values for A' are not true maxima: By excluding from calculation trials in which a peck immediately followed a stimulus change (and trials in which a peck struck off-key or with subthreshold force), even larger values for A' would be obtained. But this experiment was not designed as an evaluation of temporal discriminative ability (for which alternatives of constant asynchrony would be preferred). It was designed to emulate more naturally occurring situations in which organisms must make future decisions (respond again, quit the situation, etc.) based on their detection of a causal relation between their behavior and a change in the environment. Animals may use other cues to establish that relation, being well prepared to recognize some types of causal links and contraprepared to recognize others.

Although the absolute values of A' are likely to vary with the nature of the response, the stimulus, and their compatibility, we have demonstrated that a second factor, bias, is also involved. Superstitions cannot be attributed simply to failures of discrimination (response-event asynchronies characteristic of adventitious rewards can easily be discriminated by these animals), but should be thought of in terms of a signal-detection task, in which both signal strength and motivational variables play a joint role in determining performance.

[Figure 5 omitted. Horizontal axis: asynchrony of event (ms). Legend: separate symbols for the 120-ms, 220-ms, and 420-ms conditions.]

Figure 5. The probability of a right-key response as a function of the asynchrony between a center-key peck and the transition to lit side keys in Experiment 3. (The different symbols represent the performance in the different conditions of the experiment: Filled symbols represent hits, open symbols false alarms. The data are averages over subjects over the last eight sessions.)

Whereas we have shown that the pigeons' bias is affected by payoff variables in a rational way, this is not to say that they are acting as optimal decision makers. Our model predicts that the animal's choice will depend only on the asynchrony of a response-stimulus sequence but not on the statistics of the distribution of asynchronies due to computer-initiated stimuli. But an optimal decision strategy would take both items into account and move the criterion for saying yes to longer asynchronies as the typical lag between a response and computer-initiated events increased. In the next experiment we stress that implication.

Experiment 3

Method

The subjects and apparatus were the same as in Experiment 2.

The procedure was basically the same as in Experiment 2. The amounts of grain available for hits and correct rejections were equal (2.5 s). The asynchrony of "caused" events was 20 ms; the asynchrony of "uncaused" events, depending on the experimental condition, was drawn from one of three sets with average asynchronies of 120, 220, and 420 ms. The compositions of these sets were as follows: 120 (40, 80, 120, 160, 200); 220 (60, 140, 220, 300, 380); 420 (100, 260, 420, 580, 740). All values are in ms and include the response time of the computer and light bulb. All subjects experienced these sets in the order 420 ms, 220 ms, and 120 ms, with approximately 40 sessions at each condition.

Results and Discussion

Figure 5 shows the probability of a response on the right key as a function of the asynchrony between a center-key peck and the transition to the side-key lights. The data come from the last eight sessions at each condition. The curves are essentially congruent and confirm our earlier impressions of the acuteness of the pigeon's perception of short delays between its behavior and subsequent events: The curves reach their floors at delays of approximately 150 ms. Again, there is a slight rise in the curves between 250 and 400 ms. The average values of A' were .90, .85, and .80 for Conditions 420, 220, and 120. Differences in accuracy among conditions were not due to different elevations of the curves but to different extensions beyond 150 ms.

As the average asynchrony of delayed events decreased, the animals could optimize their performance by tightening their criterion for saying yes: In the conditions where computer-initiated events typically occurred sooner after a response (e.g., 120 ms), it was to the animal's benefit to say no to some of the more questionable events that it might otherwise have claimed. This did not happen; the circles do not lie below the other curves between 100 and 200 ms. However, the pigeons are operating close to the optimal criteria, and judging from optimized decision-theory models, this systematic error cost them only 4% of the rewards available in the session.

The accuracy of pigeons in making these discriminations may be due in part to the fact that the interval to be timed is initiated by a key peck. That accuracy may be portrayed with two different statistics, the just noticeable difference (jnd) and the Weber fraction. If we measure the jnd as a function of the semi-interquartile range (in Figure 5, the abscissa coordinate to 0.375 minus the abscissa coordinate to 0.625), the present experiments give an impressive value of approximately 50 ms for all curves. If we divide that by the point of subjective equality (the abscissa coordinate to 0.50) to obtain the Weber fraction, the result is a ratio of 0.65, not nearly so impressive (values of 0.1 being obtainable at longer intervals). Thus, although absolute sensitivity (the jnd) is excellent, relative sensitivity (the Weber fraction) is considerably below the best obtainable.
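The arithmetic relating the two statistics is simple enough to write out. The Python sketch below defines the jnd and Weber fraction as described above and backs out the point of subjective equality implied by the two reported values (a jnd of about 50 ms and a Weber fraction of 0.65). The function names are hypothetical, and the implied point of subjective equality is a derived figure, not one stated in the text.

```python
def jnd_from_curve(x_at_p375_ms, x_at_p625_ms):
    """jnd as defined in the text: the abscissa at which the curve crosses
    0.375 minus the abscissa at which it crosses 0.625."""
    return x_at_p375_ms - x_at_p625_ms


def weber_fraction(jnd_ms, pse_ms):
    """Weber fraction: jnd divided by the point of subjective equality."""
    return jnd_ms / pse_ms


# Point of subjective equality implied by the reported jnd and Weber fraction.
implied_pse_ms = 50.0 / 0.65
```

The implied point of subjective equality is roughly 77 ms, which shows why the Weber fraction looks poor: a 50-ms jnd is small in absolute terms but large relative to so short a reference interval.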

We conclude that, whereas pigeons are sensitive to the rewards contingent on their decisions and thus bias their responses in a rational fashion, they are not in this paradigm ideal decision makers, in that their bias does not change with changes in the signal-to-noise ratio in an optimal fashion. Their absolute sensitivity is excellent, their relative sensitivity less so.

Experiment 4

In the previous experiments we have demonstrated the acute sensitivity of pigeons to the relationship between their behavior and an ensuing change of cue lights, and we ask in this experiment whether pigeons will be yet more sensitive to the relationship between their behavior and a food reward.

Method

Subjects and apparatus. The subjects were 3 Silver King pigeons (H, I, and J) and 1 common pigeon (K). All had previous histories of reinforcement in other experiments. They were maintained at 80% of their ad libitum weight. The apparatus was the same as in the previous experiments.

Procedure. There were three major conditions: food, "lite," and null. In the food condition, pecks on the center key would occasionally cause it to go off and the food hopper to be elevated. After a feeding, the side-key lights would come on. If the feeding was an immediate consequence of the pigeon's peck, a response to the right key would yield an additional reinforcement of 2.5 s; if not, a 2.5-s blackout. Conversely, if the feeding was asynchronous with the pigeon's peck, a response to the left key would yield a reinforcement of 2.5 s; if not, a 2.5-s blackout. The feedings were scheduled in the same manner as the transitions in the previous experiment, with the delay of "immediate" events being 20 ms and that of asynchronous events averaging 420 ms, with the same rectangular distribution as in Experiment 3.

The procedure for the lite condition was the same as that for the food condition, except that when the center key went off, the light in the food hopper went on (as in the food condition) but the food hopper was not activated and no food was available. After the lite event, the side keys went on, and the pigeons were reinforced for correct discriminations. The purpose of the lite condition was to move the pigeons away from the center key or from their position elsewhere in the box, thus breaking up response chains that might mediate the discrimination, while providing a delay between the event and the subsequent discrimination that was equal to that experienced in the food condition. The procedure for the null condition was identical to that in the 420-ms condition of Experiment 3: When the center-key light went off, the side-key lights came on immediately, and correct discriminative responses were reinforced.

The durations of the food and lite events were equal in each phase of the experiment and took values of 1, 2, 3, and 4 s, in that order.

The basic experimental cycle consisted of a 6-day week, with the lite condition on Monday and Thursday, the food condition on Tuesday and Friday, and the null condition on Wednesday and Saturday. The first (1-s) condition lasted for 12 cycles, and the subsequent conditions for 6 cycles.

A session ended after 60 correct responses.

Results

It may be seen in Figure 6 that accuracy decreased as the duration of the food or light increased and that the decrease was much greater for the food condition than for the lite condition; sensitivities to the response-event relation were equal only at the 1-s condition, where the pigeons would have received little or no food in the food condition. Conversely, accuracy in the null condition was high (A' = 89%, the same as that found in Experiment 3 with different subjects under comparable conditions) and did not decrease when event duration was increased in the other conditions. This indicates that general motivational changes, such as satiation, were probably not responsible for the observed decrements in the other conditions.

Figure 6. Accuracy of the pigeons in reporting the temporal relation between a center-key peck and the ensuing event, either illumination of the hopper light (lite), delivery of food (food), or direct transition to lit side keys (null). (The data are plotted for individual subjects, and as averages over subjects, as a function of the duration of the event. The solid lines are regressions; the dashed line is an exponential decay function originating at A' = 89 and having a rate constant of 0.13/s.) Horizontal axis: duration of light or food (s), from 1 to 4.

Discussion

This experiment shows a remarkable decrease in pigeons' ability to identify the relation between their behavior and a subsequent reward as the value of that reward increases. This relationship does not seem to make sense from a functional viewpoint, even though the mechanism (decay of the memory trace as it ages, and greater decay when the interval is filled with distracting events) is well documented. However, the task we have set the animals, that of remembering the events in question, is not the same as the task of learning instrumental responses (Maki, 1979a). This distinction will be discussed later.

We may characterize the rate of decay of accuracy by its half-life, the time necessary for the performance to fall half of the distance from its maximum (at 0-s delay) to chance level. If we take the maximum to be A' = 89, the half-lives for the lite and food conditions were 4.5 s and 2.0 s, respectively. These values are in the range found by other investigators for memory of various events: pigeons' memory of whether they made one or two pecks (2.5 s; Kramer, 1982); pigeons' memory of the location of a red or green stimulus (2.5 s; Jitsumori & Sugimoto, 1982); and pigeons' memory for the location of a previous response (4-6 s; Shimp, 1976b). Jans and Catania (1980) studied pigeons' memory for the color of a stimulus over a "standard" delay (houselight only on) and over a delay filled with "activity" (the feeder was operated). Converting their results to A' and interpolating gives median half-lives of 4 s and 1 s for the standard and activity conditions, respectively. The value of the half-lives will be affected by other factors such as recency (Shimp, 1976b; Weisman, Wasserman, Dodd, & Larew, 1980), "surprisingness" (Maki, 1979b; Wagner, Rudy, & Whitlow, 1973; cf. Colwill & Dickinson, 1980), blocking effects (Cook, 1980; Grant & Roberts, 1976; Maki, Moe, & Bierley, 1977; Tranberg & Rilling, 1980; Wilkie, Summers, & Spetch, 1981; Williams, 1978), and, of course, mediating behavior (Kramer, 1982, Experiments 1 & 2; Smith, Niedorowski, & Attwood, 1982; Zentall, Hogan, Howard, & Moore, 1978). Where mediation is minimized, the results speak to a very rapid decay of the memory trace.
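The half-life definition above can be made concrete with a short sketch. It assumes, for illustration only, that accuracy decays exponentially toward chance and that chance for A' is 50 (percent); the function names `rate_from_half_life` and `accuracy_at` are hypothetical and are not taken from the article. The half-lives of 4.5 s (lite) and 2.0 s (food) and the maximum of A' = 89 are the values reported in the text.

```python
import math

CHANCE = 50.0  # assumed chance level for A' (percent); an assumption, not stated in this passage
A_MAX = 89.0   # maximum accuracy at 0-s delay, as taken in the text


def rate_from_half_life(half_life_s: float) -> float:
    """Rate constant (per second) of an exponential decay with the given half-life."""
    return math.log(2) / half_life_s


def accuracy_at(delay_s: float, half_life_s: float,
                a_max: float = A_MAX, chance: float = CHANCE) -> float:
    """Accuracy after delay_s seconds, assuming exponential decay toward chance.

    After one half-life, accuracy has fallen half of the distance
    from a_max to chance, which matches the definition in the text.
    """
    return chance + (a_max - chance) * 0.5 ** (delay_s / half_life_s)


if __name__ == "__main__":
    # Illustrative curves at the four event durations used in the experiment.
    for label, half_life in (("lite", 4.5), ("food", 2.0)):
        curve = [round(accuracy_at(d, half_life), 1) for d in (1, 2, 3, 4)]
        print(label, round(rate_from_half_life(half_life), 3), curve)
```

The values printed are what the assumed exponential form implies, not the data plotted in Figure 6; the solid lines in that figure are regressions.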

We may compare these results with those issuing from the study of the delay of reinforcement gradient, an enterprise with a venerable tradition in experimental psychology (Benjamin & Perloff, 1982). Estimates of the half-life vary widely, depending in large part on the investigator's empirical success at minimizing mediation by conditioned reinforcement (Renner, 1964). A theoretical way of removing the effects of conditioned reinforcement is provided by Equation 3. In over a dozen different studies Killeen (1982b) found that the contribution of primary reinforcement to the maintenance of differential responding could be captured by an exponential decay function with a half-life of 5.5 s (q = 0.13); this function provides the dashed line through the lite data in Figure 6 (assuming maximum A' of 89 and correcting for chance).¹ Insofar as that line parallels the data, it is evidence that the decay of discriminative (memorial) strength follows a time course similar to the decay in reinforcing strength.
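As a check on the quoted constants, the dashed line can be written out explicitly. This is a sketch under two assumptions that are not spelled out in this passage: that "correcting for chance" means the decay runs from A' = 89 toward a chance level of 50, and that q is expressed per second.

```latex
% Dashed line of Figure 6 (assumed chance level of 50, q in s^{-1})
A'(t) = 50 + (89 - 50)\, e^{-q t}, \qquad q = 0.13\ \mathrm{s^{-1}}

% Half-life implied by a rate constant q
t_{1/2} = \frac{\ln 2}{q} \approx \frac{0.693}{0.13} \approx 5.3\ \mathrm{s}

% Rate constant implied by a half-life of 5.5 s
q = \frac{\ln 2}{5.5\ \mathrm{s}} \approx 0.126\ \mathrm{s^{-1}}
```

On this reading, the half-life of 5.5 s and the rate constant of 0.13 agree to within rounding.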

The disruptiveness of interpolated reinforcement on the memory trace, suggested by Shimp (1976b) and Staddon (1974), is depicted by the food line in Figure 6 and by the "activity" data of Jans and Catania (1980). Just as a reinforcer may intervene between a stimulus and an animal's subsequent report of it, thereby impairing memory of the stimulus, it may also intervene between a response and a subsequent reinforcer, thereby impairing the effectiveness of the delayed reinforcer. Arguments for the disruptiveness of interpolated reinforcement on the delay of reinforcement gradient are provided by Killeen (1981a, 1982a), who used that assumption to generate models for predicting response rates as a function of reinforcement rates. Killeen argued that, because the strengthening effects of reinforcement on earlier responses were disrupted more by an intervening period of reinforcement than by a period of quiescence, increasing rates of reinforcement would have marginally decreasing effectiveness as it became ever more likely that the range of impact of a reinforcer would be truncated by interposition of other reinforcers. These considerations led to a model formally equivalent to those of Herrnstein (1970) and Catania (1973), although the interpretation of the parameters differed. The present paradigm provides a basis for independent measurement of a key parameter in that model, the disruptiveness of reinforcement as a function of its duration.

General Discussion

We have demonstrated that pigeons are easily able to discriminate between events that they cause and those that are independent of their responding (Experiment 1), even when the contiguity between a response and an adventitious event is quite close (Experiment 3). The discrimination appears to be based on a temporal discrimination in which the critical variable is the ratio of the asynchrony between a response and stimulus to the typical asynchrony between a response and stimulus that the pigeon has been reinforced for calling caused. Although the pigeons' attributions are quite sensitive to the relative payoffs (Experiments 1 & 2), they do not change with changes in the distribution of asynchronies of delayed events as they should if the pigeons were "ideal observers" maximizing payoff (Experiment 3).

In Experiment 4 we demonstrated that accuracy in attributing the locus of control over the hopper light onset decayed along approximately the same time course as the theoretical decrease in the effectiveness of primary reinforcement. This suggests that when there is no significant blocking or differential conditioned reinforcement, the ability of a delayed reinforcer to strengthen a response is on the same order as the animals' ability to remember the response at the onset of reinforcement. But as reinforcement continues, memory deteriorates at an accelerated rate. Whereas memory decays rapidly, perhaps exponentially, with the duration of a reinforcer, the ability of that reinforcer to strengthen the response should be correlated not with memory at some instant but with the integral of memory over the course of the reinforcer and should thus increase as a concave function of

¹ Equation 3 was originally used to capture the differential impact on choice responses of delays between them and reward. The exponential decay of primary reinforcement strength was derived as a consequence of potential blocking of the association between two events as a function of the delay between them. The same rationale is appropriate here, even though the events differ: If there is a constant probability of forgetting an event (or having its association blocked), we expect an exponential function. Whether the similarity in values of the time constants is reliable remains to be seen.
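The rationale in the footnote, and the integral argument in the General Discussion, can be sketched as a short derivation. The symbols M(t) for the strength of the memory trace, q for the constant probability of forgetting per unit time, and d for the duration over which memory is integrated are introduced here for illustration and do not appear in the article.

```latex
% Constant probability q of forgetting per unit time gives exponential decay
\frac{dM}{dt} = -q\, M(t)
\quad\Longrightarrow\quad
M(t) = M(0)\, e^{-q t}

% Integral of memory over an interval of duration d
\int_0^{d} M(0)\, e^{-q t}\, dt = \frac{M(0)}{q}\left(1 - e^{-q d}\right)

% The integral increases with d, but at a decreasing rate (concave in d)
\frac{\partial}{\partial d}\left[\frac{M(0)}{q}\left(1 - e^{-q d}\right)\right] = M(0)\, e^{-q d} > 0,
\qquad
\frac{\partial^2}{\partial d^2}\left[\frac{M(0)}{q}\left(1 - e^{-q d}\right)\right] = -q\, M(0)\, e^{-q d} < 0
```

If memory does decay exponentially, the integral approaches the ceiling M(0)/q as d grows, which is one way to read the claim that strengthening increases as a concave function.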

Date posted: 13/10/2022
