Title: Auditory Psychophysics for Coding Applications
Authors: W. K. Jenkins, J. L. Hall
Editors: Vijay K. Madisetti, Douglas B. Williams
Publisher: CRC Press LLC
Subject: Digital Signal Processing
Type: Chapter
Year of publication: 1999
City: Boca Raton

Jenkins, W.K. “Auditory Psychophysics for Coding Applications”

Digital Signal Processing Handbook

Ed. Vijay K. Madisetti and Douglas B. Williams

Boca Raton: CRC Press LLC, 1999


Hall, J.L. “Auditory Psychophysics for Coding Applications”

Digital Signal Processing Handbook

Ed. Vijay K. Madisetti and Douglas B. Williams

Boca Raton: CRC Press LLC, 1999


39.2 Definitions

Loudness • Pitch • Threshold of Hearing • Differential Threshold • Masked Threshold • Critical Bands and Peripheral Auditory Filters

39.3 Summary of Relevant Psychophysical Data

Loudness • Differential Thresholds • Masking

39.4 Conclusions

References

In this chapter we review properties of auditory perception that are relevant to the design of coders for acoustic signals. The chapter begins with a general definition of a perceptual coder, then considers what the “ideal” psychophysical model would consist of and what use a coder could be expected to make of this model. We then present some basic definitions and concepts. The chapter continues with a review of relevant psychophysical data, including results on threshold, just-noticeable differences, masking, and loudness. Finally, we attempt to summarize the present state of the art, the capabilities and limitations of present-day perceptual coders for audio and speech, and what areas most need work.

1 Perceptual coding is not limited to speech and audio. It can be applied also to image and video [16]. In this paper we consider only coders for acoustic signals.


... these physical measures do not directly address the appropriate issue. For signals that are to be listened to by people, the “best” coder is the one that sounds the best. There is a very clear distinction between physical and perceptual measures of a signal (frequency vs. pitch, intensity vs. loudness, for example). A perceptual coder can be defined as a coder that minimizes some measure of the difference between original and coded signal so as to minimize the perceptual impact of the coding noise. We can define the best coder given a particular set of constraints as the one in which the coding noise is least objectionable.

It follows that the designer of a perceptual coder needs some way to determine the perceptual quality of a coded signal. “Perceptual quality” is a poorly defined concept, and it will be seen that in some sense it cannot be uniquely defined. We can, however, attempt to provide a partial answer to the question of how it can be determined. We can present something of what is known about human auditory perception from psychophysical listening experiments and show how these phenomena relate to the design of a coder.

One requirement for successful design of a perceptual coder is a satisfactory model for the signal-dependent sensitivity of the auditory system. Present-day models are incomplete, but we can attempt to specify what the properties of a complete model would be. One possible specification is that, for any given waveform (the signal), it accurately predicts the loudness, as a function of pitch and of time, of any added waveform (the noise). If we had such a complete model, then we would in principle be able to build a transparent coder, defined as one in which the coded signal is indistinguishable from the original signal, or at least we would be able to determine whether or not a given coder was transparent. It is relatively simple to design a psychophysical listening experiment to determine whether the coding noise is audible, or equivalently, whether the subject can distinguish between original and coded signal. Any subject with normal hearing could be expected to give similar results to this experiment. While present-day models are far from complete, we can at least describe the properties of a complete model.

There is a second requirement that is more difficult to satisfy. This is the need to be able to determine which of two coded samples, each of which has audible coding noise, is preferable. While a satisfactory model for the signal-dependent sensitivity of the auditory system is in principle sufficient for the design of a transparent coder, the question of how to build the best nontransparent coder does not have a unique answer. Often, design constraints preclude building a transparent coder. Even the best coder built under these constraints will result in audible coding noise, and it is under some conditions impossible to specify uniquely how best to distribute this noise. One listener may prefer the more intelligible version, while another may prefer the more natural sounding version. The preferences of even a single listener might very well depend on the application. In the absence of any better criterion, we can attempt to minimize the loudness of the coding noise, but it must be understood that this is an incomplete solution.

Our purpose in this paper is to present something of what is known about human auditory perception in a form that may be useful to the designer of a perceptual coder. We do not attempt to answer the question of how this knowledge is to be utilized, how to build a coder. Present-day perceptual coders for the most part utilize a feedforward paradigm: analysis of the signal to be coded produces specifications for allowable coding noise. Perhaps a more general method is a feedback paradigm, in which the perceptual model somehow makes possible a decision as to which of two coded signals is “better”. This decision process can then be iterated to arrive at some optimum solution. It will be seen that for proper exploitation of some aspects of auditory perception the feedforward paradigm may be inadequate and the potentially more time-consuming feedback paradigm may be required. How this is to be done is part of the challenge facing the designer.


39.2 Definitions

In this section we define some fundamental terms and concepts and clarify the distinction between physical and perceptual measures.

39.2.1 Loudness

When we increase the intensity of a stimulus its loudness increases, but that does not mean that intensity and loudness are the same thing. Intensity is a physical measure. We can measure the intensity of a signal with an appropriate measuring instrument, and if the measuring instrument is standardized and calibrated correctly anyone else anywhere in the world can measure the same signal and get the same result. Loudness is perceptual magnitude. It can be defined as “that attribute of auditory sensation in terms of which sounds can be ordered on a scale extending from quiet to loud” ([23], p. 47). We cannot measure it directly. All we can do is ask questions of a subject and from the responses attempt to infer something about loudness. Furthermore, we have no guarantee that a particular stimulus will be as loud for one subject as for another. The best we can do is assume that, for a particular stimulus, loudness judgments for one group of normal-hearing people will be similar to loudness judgments for another group.

There are two commonly used measures of loudness. One is loudness level (unit phon) and the other is loudness (unit sone). These two measures differ in what they describe and how they are obtained. The phon is defined as the intensity, in dB SPL, of an equally loud 1-kHz tone. The sone is defined in terms of subjectively measured loudness ratios. A stimulus half as loud as a one-sone stimulus has a loudness of 0.5 sones, a stimulus ten times as loud has a loudness of 10 sones, etc. A 1-kHz tone at 40 dB SPL is arbitrarily defined to have a loudness of one sone.

The argument can be made that loudness matching, the procedure used to obtain the phon scale, is a less subjective procedure than loudness scaling, the procedure used to obtain the sone scale. This argument would lead to the conclusion that the phon is the more objective of the two measures and that the sone is more subject to individual variability. This argument breaks down on two counts. First, for dissimilar stimuli even the supposedly straightforward loudness-matching task is subject to large and poorly understood order and bias effects that can only be described as subjective. While loudness matching of two equal-frequency tone bursts generally gives stable and repeatable results, the task becomes more difficult when the frequencies of the two tone bursts differ. Loudness matching between two dissimilar stimuli, as for example between a pure tone and a multicomponent complex signal, is even more difficult and yields less stable results. Loudness-matching experiments have to be designed carefully, and results from these experiments have to be interpreted with caution. Second, it is possible to measure loudness in sones, at least approximately, by means of a loudness-matching procedure. Fletcher [6] states that under some conditions loudness adds. Binaural presentation of a stimulus results in loudness doubling; and two equally-loud stimuli, far enough apart in frequency that they do not mask each other, are twice as loud as one. If loudness additivity holds, then it follows that the sone scale can be generated by matching loudness of a test stimulus to binaural stimuli or to pairs of tones. This approach must be treated with caution. As Fletcher states, “However, this method [scaling] is related more directly to the scale we are seeking (the sone scale) than the two preceding ones (binaural or monaural loudness additivity)” ([6], p. 278). The loudness additivity approach relies on the assumption that loudness summation is perfect, and there is some more recent evidence [28,33] that loudness summation, at least for binaural vs. monaural presentation, is not perfect.


39.2.3 Threshold of Hearing

Since the concept of threshold is basic to much of what follows, it is worthwhile at this point to discuss it in some detail. It will be seen that thresholds are determined not only by the stimulus and the observer but also by the method of measurement. While this discussion is phrased in terms of threshold of hearing, much of what follows applies as well to differential thresholds (just-noticeable differences) discussed in the next subsection.

By the simplest definition, the threshold of hearing (equivalently, auditory threshold) is the lowest intensity that the listener can hear. This definition is inadequate because we cannot directly measure the listener’s perception. A first-order correction, therefore, is that the threshold of hearing is the lowest intensity that elicits from the listener the response that the sound is audible. Given this definition, we can present a stimulus to the listener and ask whether he or she can hear it. If we do this, we soon find that identical stimuli do not always elicit identical responses. In general, the probability of a positive response increases with increasing stimulus intensity and can be described by a psychometric function such as that shown for a hypothetical experiment in Fig. 39.1. Here the stimulus intensity (in dB) appears on the abscissa and the probability $P(C)$ of a positive response appears on the ordinate. The yes-no experiment could be described by a psychometric function that ranges from zero to one, and threshold could be defined as the stimulus intensity that elicits a positive response in 50% of the trials.

FIGURE 39.1: Idealized psychometric functions for hypothetical yes-no experiment (zero to one) and for hypothetical two-interval forced-choice experiment (0.5 to one).


A difficulty with the simple yes-no experiment is that we have no control over the subject’s criterion level. The subject may be using a strict criterion (“yes” only if the signal is definitely present) or a lax criterion (“yes” if the signal might be present). The subject can respond correctly either by a positive response in the presence of a stimulus (hit) or by a negative response in the absence of a stimulus (correct rejection). Similarly the subject can respond incorrectly either by a negative response in the presence of a stimulus (miss) or by a positive response in the absence of a stimulus (false alarm). Unless the experimenter is willing to use an elaborate and time-consuming procedure that involves assigning rewards to correct responses and penalties to incorrect responses, the criterion level is uncontrolled.

The field of psychophysics that deals with this complication is called detection theory. The field of psychophysical detection theory is highly developed [12] and a complete description is far beyond the scope of this paper. Very briefly, the subject’s response is considered to be based on an internal decision variable, a random variable drawn from a distribution with mean and standard deviation that depend on the stimulus. If we assume that the decision variable is normally distributed with a fixed standard deviation $\sigma$ and a mean that depends only on stimulus intensity, then we can define an index of sensitivity $d'$ for a given stimulus intensity as the difference between $m_0$ (the mean in the absence of the stimulus) and $m_s$ (the mean in the presence of the stimulus), divided by $\sigma$. An ideal observer (a hypothetical subject who does the best possible job for the task at hand) gives a positive response if and only if the decision variable exceeds an internal criterion level. An increase in criterion level decreases the probability of a false alarm and increases the probability of a miss.
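The effect of the criterion level in the yes-no experiment can be illustrated with a minimal sketch of the equal-variance Gaussian model described above. The function names and the numerical values of the means, standard deviation, and criteria are assumptions chosen only for illustration; they do not come from the chapter.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def d_prime(m0, ms, sigma):
    """Index of sensitivity: difference of the two means divided by sigma."""
    return (ms - m0) / sigma


def yes_no_rates(m0, ms, sigma, criterion):
    """Return (hit, miss, false_alarm, correct_rejection) probabilities.

    The observer answers "yes" whenever the decision variable exceeds
    `criterion`. The decision variable is N(ms, sigma) when the stimulus
    is present and N(m0, sigma) when it is absent.
    """
    hit = 1.0 - normal_cdf((criterion - ms) / sigma)
    false_alarm = 1.0 - normal_cdf((criterion - m0) / sigma)
    return hit, 1.0 - hit, false_alarm, 1.0 - false_alarm


if __name__ == "__main__":
    m0, ms, sigma = 0.0, 1.0, 1.0  # illustrative values only
    print("d' =", d_prime(m0, ms, sigma))
    for criterion in (0.0, 0.5, 1.0):  # lax to strict
        hit, miss, fa, cr = yes_no_rates(m0, ms, sigma, criterion)
        print(f"criterion={criterion:.1f}  hit={hit:.3f}  miss={miss:.3f}  "
              f"false alarm={fa:.3f}  correct rejection={cr:.3f}")
```

With the sensitivity held fixed, moving the criterion from lax to strict trades false alarms for misses, which is why a threshold estimated from "yes" responses alone depends on a quantity the experimenter does not control.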

A simple and satisfactory way to deal with the problem of uncontrolled criterion level is to use a criterion-free experimental paradigm. The simplest is perhaps the two-interval forced choice (2IFC) paradigm, in which the stimulus is presented at random in one of two observation intervals. The subject’s task is to determine which of the two intervals contained the stimulus. The ideal observer selects the interval that elicits the larger decision variable, and criterion level is no longer a factor. Now the subject has a 50% chance of choosing the correct interval even in the absence of any stimulus, so the psychometric function goes from 0.5 to 1.0 as shown in Fig. 39.1. A reasonable definition of threshold is $P(C) = 0.75$, halfway between the chance level of 0.5 and one. If the decision variable is normally distributed with a fixed standard deviation, it can be shown that this definition of threshold corresponds to a $d'$ of 0.95.

The number of intervals can be increased beyond two. In this case, the ideal observer responds correctly if the decision variable for the interval containing the stimulus is larger than the largest of the N-1 decision variables for the intervals not containing the stimulus. A common practice is, for an N-interval forced choice paradigm (NIFC), to define threshold as the point halfway between the chance level of 1/N and one. This is a perfectly acceptable practice so long as it is recognized that the measured threshold is influenced by the number of alternatives. For a 3IFC paradigm this definition of threshold corresponds to a $d'$ of 1.12 and for a 4IFC paradigm it corresponds to a $d'$ of 1.24.
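The threshold values of d' quoted above can be checked numerically. The sketch below assumes the equal-variance Gaussian model with unit standard deviation and an ideal observer who picks the interval with the largest decision variable. The probability of a correct response is then the integral of phi(x - d') * Phi(x)^(N-1) over x, which is evaluated here with a simple trapezoidal rule and inverted by bisection. Function names, integration limits, and step counts are illustrative choices, not part of the chapter.

```python
import math


def normal_pdf(z):
    """Probability density of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def p_correct_nifc(d_prime, n_intervals, steps=2000):
    """P(correct) for an ideal observer in an N-interval forced-choice task.

    The signal interval yields x ~ N(d', 1); the other N-1 intervals yield
    independent N(0, 1) values. The response is correct when x is the largest.
    """
    lo, hi = -8.0, d_prime + 8.0
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoidal end weights
        total += weight * normal_pdf(x - d_prime) * normal_cdf(x) ** (n_intervals - 1)
    return total * h


def threshold_d_prime(n_intervals):
    """d' at which P(correct) is halfway between chance (1/N) and one."""
    target = 0.5 * (1.0 / n_intervals + 1.0)
    lo, hi = 0.0, 5.0
    for _ in range(40):  # bisection on a monotonically increasing function
        mid = 0.5 * (lo + hi)
        if p_correct_nifc(mid, n_intervals) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(f"{n}IFC threshold d' = {threshold_d_prime(n):.2f}")
```

For N = 2 the integral reduces to Phi(d'/sqrt(2)), so the 2IFC threshold can also be obtained in closed form; the numerical route is used here because it covers any number of intervals.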

39.2.4 Differential Threshold

The differential threshold is conceptually similar to the auditory threshold discussed above, and many of the same comments apply. The differential threshold, or just-noticeable difference (JND), is the amount by which some attribute of a signal has to change in order for the observer to be able to detect the change. A tone burst, for example, can be specified in terms of frequency, intensity, and duration, and a differential threshold for any of these three attributes can be defined and measured.

The first attempt to provide a quantitative description of differential thresholds was made by the German physiologist E. H. Weber in the first half of the 19th century. According to Weber’s law, the just-noticeable difference $\Delta I$ is proportional to the stimulus intensity $I$, or $\Delta I / I = K$, where the constant of proportionality $\Delta I / I$ is known as the Weber fraction. This was supposed to be a general description of sensitivity to changes of intensity for a variety of sensory modalities, not limited just to hearing, and it has since been applied to perception of nonintensive variables such as frequency. It was recognized at an early stage that this law breaks down at near-threshold intensities, and in the latter half of the 19th century the German physicist G. T. Fechner suggested the modification that is now known as the modified Weber law, $\Delta I / (I + I_0) = K$, where $I_0$ is a constant. While Weber’s law provides a reasonable first-order description of intensity and frequency discrimination in hearing, in general it does not hold exactly, as will be seen below.
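The difference between Weber's law and the modified Weber law is easiest to see in how the Weber fraction behaves at low intensity. The following sketch evaluates both forms; the values of K and I0 and the intensity units are hypothetical and serve only to show the trend.

```python
def weber_jnd(intensity, k):
    """Weber's law: the just-noticeable increment is K times the intensity."""
    return k * intensity


def modified_weber_jnd(intensity, k, i0):
    """Modified Weber law: the increment is K times (intensity + I0)."""
    return k * (intensity + i0)


if __name__ == "__main__":
    K = 0.1   # hypothetical Weber fraction
    I0 = 1.0  # hypothetical constant, in the same arbitrary units as intensity
    for intensity in (0.01, 0.1, 1.0, 10.0, 1000.0):
        plain = weber_jnd(intensity, K) / intensity
        modified = modified_weber_jnd(intensity, K, I0) / intensity
        print(f"I={intensity:8.2f}  Weber fraction: plain={plain:.3f}  "
              f"modified={modified:.3f}")
```

Under Weber's law the fraction is the same at every intensity. Under the modified law it approaches K when the intensity is large compared with I0 and grows without bound as the intensity approaches zero, which is the near-threshold departure the modification was introduced to describe.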

As with the threshold of hearing, the differential threshold can be measured in different ways, and the result depends to some extent on how it is measured. The simplest method is a same-different paradigm, in which two stimuli are presented and the subject’s task is to judge whether or not they are the same. This method suffers from the same drawback as the yes-no paradigm for auditory threshold: we do not have control over the subject’s criterion level.

If the physical attribute being measured is simply related to some perceptual attribute, then the differential threshold can be measured by requiring the subject to judge which of two stimuli has more of that perceptual attribute. A just-noticeable difference for frequency, for example, could be measured by requiring the subject to judge which of two stimuli is of higher pitch; or a just-noticeable difference for intensity could be measured by requiring the subject to judge which of two stimuli is louder. As with the 2IFC paradigm discussed above for auditory threshold, this method removes the problem of uncontrolled criterion level.

There are more general methods that do not assume a knowledge of the relationship between the physical attribute being measured and a perceptual attribute. The most useful, perhaps, is the N-interval forced choice method: N stimuli are presented, one of which differs from the other N-1 along the dimension being measured. The subject’s task is to specify which one of the N stimuli is different from the other N-1.

Note that there is a close parallel between the differential threshold and the auditory threshold described in the previous subsection. The auditory threshold can be regarded as a special case of the just-noticeable difference for intensity, where the question is by how much the intensity has to differ from zero in order to be detectable.

39.2.5 Masked Threshold

The masked threshold of a signal is defined as the threshold of that signal (the probe) in the presence of another signal (the masker). A related term is masking, which is the elevation of threshold of the probe by the masker: it is the difference between masked and absolute threshold. More generally, the reduction of loudness of a supra-threshold signal is also referred to as masking. It will be seen that masking can appear in many forms, depending on spectral and temporal relationships between probe and masker.

Many of the comments that applied to measurement of absolute and differential thresholds also apply to measurement of masked threshold. The simplest method is to present masker plus probe and ask the subject whether or not the probe is present. Once again there is a problem with criterion level. Another method is to present stimuli in two intervals and ask the subject which one contains the probe. This method can give useful results but can, under some conditions, give misleading results. Suppose, for example, that the probe and masker are both pure tones at 1 kHz, but that the two signals are 180° out of phase. As the intensity of the probe is increased from zero, the intensity of the composite signal will first decrease, then increase. The two signals, masker alone and masker plus probe, may be easily distinguishable, but in the absence of additional information the subject has no way of telling which is which.
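The out-of-phase example can be made concrete with a few lines of arithmetic. The sketch below assumes two sinusoids of identical frequency that are exactly 180° out of phase, so their amplitudes subtract. Levels are expressed in dB relative to the masker alone; the function name and the probe levels in the loop are illustrative.

```python
import math


def composite_level_db(probe_level_db):
    """Level of (masker + probe) relative to the masker alone, in dB.

    Assumes both signals are sinusoids of the same frequency, 180 degrees
    out of phase. `probe_level_db` is the probe level relative to the masker.
    """
    probe_amplitude = 10.0 ** (probe_level_db / 20.0)  # masker amplitude is 1
    composite_amplitude = abs(1.0 - probe_amplitude)
    if composite_amplitude == 0.0:
        return float("-inf")  # complete cancellation
    return 20.0 * math.log10(composite_amplitude)


if __name__ == "__main__":
    for probe_db in (-40.0, -20.0, -6.0, 0.0, 6.0, 12.0):
        print(f"probe {probe_db:+6.1f} dB re masker -> "
              f"composite {composite_level_db(probe_db):+7.2f} dB re masker")
```

The composite level falls as the probe grows toward the masker level, cancels completely when the two are equal, and exceeds the masker-alone level only after the probe amplitude is more than twice the masker amplitude (about 6 dB above it). A listener asked "which interval contains the probe" therefore cannot rely on the louder interval being the right answer.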

A more robust method for measuring masked threshold is the N-interval forced choice method described above, in which the subject specifies which of the N stimuli differs from the other N-1. Subjective percepts in masking experiments can be quite complex and can differ from one observer to another. In the N-interval forced choice method the observer has the freedom to base judgments on whatever attribute is most easily detected, and it is not necessary to instruct the observer what to listen for.

Note that the differential threshold for intensity can be regarded as a special case of the masked threshold in which the probe is an intensity-scaled version of the masker.

A note on terminology: suppose two signals, $x_1(t)$ and $[x_1(t) + x_2(t)]$, are just distinguishable. [...] the probe. In either case, the difference can be described in several ways. These ways include (1) the intensity increment between $x_1(t)$ and $[x_1(t) + x_2(t)]$, $\Delta I$; (2) the intensity increment relative to [...] increment in dB, $10 \log_{10}(\Delta I / I)$; and (5) the intensity ratio in dB, $10 \log_{10}[(I + \Delta I)/I]$. These ways are equivalent in that they show the same information, although for a particular application one way may be preferable to another for presentation purposes. Another measure that is often used, particularly in the design of perceptual coders, is the intensity of the probe $x_2(t)$. This measure is subject to misinterpretation and must be used with caution. Depending on the coherence between [...] resulting ambiguity has been responsible for some confusion.
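The measures listed in this note can be computed from one another, and the ambiguity of the probe-intensity measure can be shown with a short calculation. In the sketch below, the dictionary keys are descriptive labels of our own (parts of the original list are missing from this copy), and the expressions follow the two dB formulas that survive in the text. The second function assumes the standard relation for the sum of two signals with normalized correlation c, I_total = I1 + I2 + 2c*sqrt(I1*I2); the chapter does not state this formula, so it is an assumption of the example.

```python
import math


def difference_measures(intensity, delta_i):
    """Equivalent descriptions of a positive intensity increment delta_i."""
    return {
        "increment": delta_i,
        "relative_increment": delta_i / intensity,
        "ratio": (intensity + delta_i) / intensity,
        "relative_increment_db": 10.0 * math.log10(delta_i / intensity),
        "ratio_db": 10.0 * math.log10((intensity + delta_i) / intensity),
    }


def increment_from_probe(masker_intensity, probe_intensity, coherence):
    """Intensity increment produced by adding a probe to a masker.

    `coherence` is the normalized correlation between the two signals:
    +1 for in phase, 0 for uncorrelated, -1 for 180 degrees out of phase.
    """
    cross_term = 2.0 * coherence * math.sqrt(masker_intensity * probe_intensity)
    return probe_intensity + cross_term


if __name__ == "__main__":
    print(difference_measures(1.0, 0.21))
    # Same probe intensity (20 dB below the masker), three coherence values.
    for c in (1.0, 0.0, -1.0):
        delta_i = increment_from_probe(1.0, 0.01, c)
        ratio_db = 10.0 * math.log10(1.0 + delta_i)
        print(f"coherence={c:+.0f}  delta_I={delta_i:+.3f}  "
              f"intensity ratio={ratio_db:+.3f} dB")
```

A probe 20 dB below the masker changes the composite intensity by very different amounts, and even in different directions, depending on coherence, which is why quoting the probe intensity alone does not specify the intensity increment.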

39.2.6 Critical Bands and Peripheral Auditory Filters

The concepts of critical bands and peripheral auditory filters are central to much of the auditory modeling work that is used in present-day perceptual coders. Scharf, in a classic review article [33], defines the empirical critical bandwidth as “that bandwidth at which subjective responses rather abruptly change”. Simply put, for some psychophysical tasks the auditory system behaves as if it consisted of a bank of bandpass filters (the critical bands) followed by energy detectors. Examples of critical-band behavior that are particularly relevant for the designer of a coder include the relationship between bandwidth and loudness (Fig. 39.5) and the relationship between bandwidth and masking (Fig. 39.10). Another example of critical-band behavior is phase sensitivity: in experiments measuring the detectability of amplitude and of frequency modulation, the auditory system appears to be sensitive to the relative phase of the components of a complex sound only so long as the components are within a critical band [9,45].

The concept of the critical band was introduced more than a half-century ago by Fletcher [6], and since that time it has been studied extensively. Fletcher’s pioneering contribution is ably documented by Allen [1], and Scharf’s 1970 review article [33] gives references to some later work. More recently, Moore and his co-workers have made extensive measurements of peripheral auditory filters [24].

The value of critical bandwidths has been the subject of some discussion, because of questions of definition and method of measurement. Figure 39.2 ([31], Fig. 1) shows critical bandwidth as a function of frequency for Scharf’s empirical definition (the bandwidth at which subjective responses undergo some sort of change). Results from several experiments are superimposed here, and they are in substantial agreement with each other. Moore and Glasberg [26] argue that the bandwidths shown in Fig. 39.2 are determined not only by the bandwidth of peripheral auditory filters but also by changes in processing efficiency. By their argument, the bandwidth of peripheral auditory filters is somewhat smaller than the values shown in Fig. 39.2 at frequencies above 1 kHz and substantially smaller, by as much as an octave, at lower frequencies.

39.3 Summary of Relevant Psychophysical Data

In Section 39.2, we introduced some basic concepts and definitions. In this section, we review some relevant psychophysical results. There are several excellent books and book chapters that have been written on this subject, and we have neither the space nor the inclination to duplicate material found in these other sources. Our attempt here is to make the reader aware of some relevant results and to refer him or her to sources where more extensive treatments may be found.

FIGURE 39.2: Empirical critical bandwidth. (Source: Scharf, B., Critical bands, ch. 5 in Foundations of Modern Auditory Theory, Vol. 1, Tobias, J.V., ed., Academic Press, NY, 1970. With permission.)

39.3.1 Loudness

Loudness Level and Frequency

For pure tones, loudness depends on both intensity and frequency. Figure 39.3 (modified from [37], p. 124) shows loudness level contours. The curves are labeled in phons and, in parentheses, sones. These curves have been remeasured many times since, with some variation in the results, but the basic conclusions remain unchanged. The most sensitive region is around 2-3 kHz. The low-frequency slope of the loudness level contours is flatter at high loudness levels than at low. It follows that loudness level grows more rapidly with intensity at low frequencies than at high. The 38- and 48-phon contours are (by definition) separated by 10 dB at 1 kHz, but they are only about 5 dB apart at 100 Hz.

This figure also shows contours that specify the dynamic range of hearing. Tones below the 8-phon contour are inaudible, and tones above the dotted line are uncomfortable. The dynamic range of hearing, the distance between these two contours, is greatest around 2 to 3 kHz and decreases at lower and higher frequencies. In practice, the useful dynamic range is substantially less. We know today that extended exposure to sounds at much lower levels than the dotted line in Fig. 39.3 can result in temporary or permanent damage to the ear. It has been suggested that extended exposure to sounds as low as 70 to 75 dB(A) may produce permanent high-frequency threshold shifts in some individuals [39].

FIGURE 39.3: Loudness level contours. Parameters: phons (sones). The bottom curve (8 phons) is at the threshold of hearing. The dotted line shows Wegel’s 1932 results for “threshold of feeling”. This line is many dB above levels that are known today to produce permanent damage to the auditory system. (Modified from Stevens, S.S. and Davis, H.W., Hearing, John Wiley & Sons, New York, 1938.)

Loudness and Intensity

Figure 39.4 (modified from [32], Fig. 5) shows loudness growth functions, the relationship between stimulus intensity in dB SPL and loudness in sones, for tones of different frequencies. As can be seen in Fig. 39.4, the loudness growth function depends on frequency. Above about 40 dB SPL for a 1-kHz tone the relationship is approximately described by the power law $L(I) = (I/I_0)^{1/3}$, so that if the intensity $I$ is increased by 9 dB the loudness $L$ is approximately doubled (see footnote 2). The relationship between loudness and intensity has been modeled extensively [1,6,46].
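The 9 dB figure follows directly from the cube-root power law: doubling the loudness requires an eightfold increase in intensity, and 10 log10(8) is about 9.03 dB. The sketch below works this out for a 1-kHz tone. It assumes that the reference intensity I0 is the intensity at 40 dB SPL, which is consistent with the definition of one sone given in Section 39.2.1 but is not stated explicitly next to the formula; the function names are ours.

```python
import math


def loudness_sones_1khz(level_db_spl):
    """Approximate loudness of a 1-kHz tone from the cube-root power law.

    Assumes I0 is the intensity at 40 dB SPL (one sone); per the text the
    approximation applies above about 40 dB SPL.
    """
    intensity_ratio = 10.0 ** ((level_db_spl - 40.0) / 10.0)  # I / I0
    return intensity_ratio ** (1.0 / 3.0)


def db_for_loudness_ratio(loudness_ratio):
    """Intensity change in dB needed to scale loudness by `loudness_ratio`."""
    return 10.0 * 3.0 * math.log10(loudness_ratio)


if __name__ == "__main__":
    print(f"dB needed to double loudness: {db_for_loudness_ratio(2.0):.2f}")
    for level in (40.0, 49.0, 58.0, 70.0):
        print(f"{level:5.1f} dB SPL -> {loudness_sones_1khz(level):.2f} sones")
```

Each additional 9 dB step therefore corresponds to roughly one more doubling of loudness under this approximation.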

Loudness and Bandwidth

The loudness of a complex sound of fixed intensity, whether a tone complex or a band of noise, depends on its bandwidth, as is shown in Fig. 39.5 ([48], Fig. 3). For sounds well above threshold, the loudness remains more or less constant so long as the bandwidth is less than a critical band. If the bandwidth is greater than a critical band, the loudness increases with increasing bandwidth. Near threshold the trend is reversed, and the loudness decreases with increasing bandwidth (see footnote 3).
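One way to build intuition for this reversal is a toy calculation in which total loudness is the sum of a per-band loudness that is zero at the band's threshold and follows a cube-root power law well above it. Both the per-band function and the threshold value below are assumptions made purely for illustration; this is not the model used in the chapter or in the studies it cites.

```python
def band_loudness(energy, threshold_energy=1.0):
    """Toy per-band loudness: zero at or below threshold, cube-root above."""
    return max(0.0, energy ** (1.0 / 3.0) - threshold_energy ** (1.0 / 3.0))


def total_loudness(total_energy, n_bands, threshold_energy=1.0):
    """Sum the toy band loudness with the energy split equally over bands."""
    energy_per_band = total_energy / n_bands
    return n_bands * band_loudness(energy_per_band, threshold_energy)


if __name__ == "__main__":
    for total_energy in (2.0, 5.0, 1000.0):  # in units of threshold energy
        one_band = total_loudness(total_energy, 1)
        two_bands = total_loudness(total_energy, 2)
        print(f"energy={total_energy:7.1f}  one band={one_band:.3f}  "
              f"two bands={two_bands:.3f}")
```

Near threshold the toy function rises steeply from zero, so spreading a fixed energy over two bands pushes each band toward inaudibility and the total drops. Far above threshold the function is compressive, so the same split raises the total.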

2 This power-law relationship between physical and perceptual measures of a stimulus was studied in great detail by S. S. Stevens. This relationship is now commonly referred to as Stevens’ Law. Stevens measured exponents for many sensory modalities, ranging from a low of 0.33 for loudness and brightness to a high of 3.5 for electric shock produced by a 60-Hz electric current delivered to the skin.

3 These data were obtained by comparing the loudness of a single 1-kHz tone and the loudness of a four-tone complex of the specified bandwidth centered at 1 kHz. The systematic difference between results when the tone was adjusted (“T” symbol) and when the complex was adjusted (“C” symbol) is an example of the bias effects mentioned in Section 39.2.1 (Loudness).

FIGURE 39.4: Loudness growth functions. (Modified from Scharf, B., Loudness, ch. 6 in Handbook of Perception, Vol. IV, Hearing, Carterette, E.C. and Friedman, M.P., eds., Academic Press, New York, 1978. With permission.)

These phenomena have been modeled successfully by utilizing the loudness growth functions shown in Fig. 39.4 in a model that calculates total loudness by summing the specific loudness per critical band [49]. The loudness growth function is very steep near threshold, so that dividing the total energy of the signal into two or more critical bands results in a reduction of total loudness. The loudness growth function well above threshold is less steep, so that dividing the total energy of the signal into two or more critical bands results in an increase of total loudness.

Loudness and Duration

Everything we have talked about so far applies to steady-state, long-duration stimuli. These results are reasonably well understood and can be modeled reasonably well by present-day models. However, there is a host of psychophysical data having to do with aspects of temporal structure of the signal that are less well understood and less well modeled. The subject of temporal dynamics of auditory perception is an area where there is a great deal of room for improvement in models for perceptual auditory coders. One example of this subject is the relationship between loudness and duration discussed here. Other examples appear in a later section on temporal aspects of masking.

There is general agreement that, for fixed intensity, loudness increases with duration up to stimulus durations of a few hundred milliseconds. (Other factors, usually discussed under the terms adaptation or fatigue, come into play for longer durations of many seconds or minutes. We will not discuss these factors here.) The duration below which loudness increases with increasing duration is sometimes referred to as the critical duration. Scharf [32] provides an excellent summary of studies of the relationship between loudness and duration. In his survey, he cites values of critical duration ranging from 10 msec to over 500 msec. About half the studies in Scharf’s survey show that the total energy (intensity x duration) stays constant below the critical duration for constant loudness, while the remaining studies are about evenly split between total energy increasing and total energy decreasing with increasing duration.

FIGURE 39.5: Loudness vs. bandwidth of tone complex. (Source: Zwicker, E. et al., Critical bandwidth in loudness summation, J. Acoust. Soc. Am., 29: 548-557, 1957. With permission.)

FIGURE 39.6: Frequency JND as a function of frequency and intensity. (Modified from Wier, C.C. et al., Frequency discrimination as a function of frequency and sensation level, J. Acoust. Soc. Am., 61: 178-184, 1977. With permission.)

One possible explanation for this confused state of affairs is the inherent difficulty of making loudness matches between dissimilar stimuli, discussed above in Section 39.2.1 (Loudness). Two stimuli of different durations differ by more than “loudness”, and depending on a variety of poorly-understood experimental or individual factors what appears to be the same experiment may yield different results in different laboratories or with different subjects.

Some support for this explanation comes from the fact that studies of threshold intensity as a function of duration are generally in better agreement with each other than studies of loudness as a function of duration. As discussed above in Section 39.2.3 (Threshold of Hearing), measurements of auditory threshold depend to some extent on the method of measurement, but it is still possible to establish an internally-consistent criterion-free measure. The exact results depend to some extent on signal frequency, but there is reasonable agreement among various studies that total energy at threshold remains approximately constant between about 10 msec and 100 msec. (See [41] for a survey of studies of threshold intensity as a function of duration.)
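If total energy (intensity x duration) at threshold is constant, then threshold intensity is inversely proportional to duration, which amounts to a 10 dB rise in threshold for every tenfold reduction in duration. The sketch below expresses this relative to a reference duration. The choice of 100 msec as reference and the function name are assumptions for illustration, and the relation is meant only for the range of roughly 10 to 100 msec given in the text.

```python
import math


def threshold_shift_db(duration_ms, reference_ms=100.0):
    """Threshold intensity in dB relative to the threshold at `reference_ms`.

    Assumes intensity x duration is constant at threshold, which the text
    describes as approximately true between about 10 and 100 msec.
    """
    return 10.0 * math.log10(reference_ms / duration_ms)


if __name__ == "__main__":
    for duration in (100.0, 50.0, 20.0, 10.0):
        print(f"{duration:5.0f} msec -> threshold {threshold_shift_db(duration):+.1f} dB "
              f"re 100 msec")
```

Halving the duration within this range raises the threshold intensity by about 3 dB.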

39.3.2 Differential Thresholds

Frequency

Figure 39.6 shows frequency JND as a function of frequency and intensity as measured in the most recent comprehensive study [43]. The frequency JND generally increases with increasing frequency and decreases with increasing intensity, ranging from about 1 Hz at low frequency and moderate intensity to more than 100 Hz at high frequency and low intensity.

The results shown in Fig. 39.6 are in basic agreement with results from most other studies of frequency JND’s with the exception of the earliest comprehensive study, by Shower and Biddulph ([43], p. 180). Shower and Biddulph [35] found a more gradual increase of frequency JND with frequency. As we have noted above, the results obtained in experiments of this nature are strongly influenced by details of the method of measurement. Shower and Biddulph measured detectability of frequency modulation of a pure tone; most other experimenters measured the ability of subjects to correctly identify whether one tone burst was of higher or lower frequency than another. Why this difference in procedure should produce this difference in results, or even whether this difference in procedure is solely responsible for the difference in results, is unclear.

The Weber fraction $\Delta f / f$, where $\Delta f$ is the frequency JND, is smallest at mid frequencies, in the region from 500 Hz to 2 kHz. It increases somewhat at lower frequencies, and it increases very sharply at high frequencies above about 4 kHz. Wier et al. [43] in their Fig. 1, reproduced here as our Fig. 39.6, plotted $\log \Delta f$ against $\sqrt{f}$. They found that this choice of axes resulted in the closest fit to a straight line. It is not clear that this choice of axes has any theoretical basis; it appears simply to be a choice that happens to work well. There have been extensive attempts to model frequency selectivity. These studies suggest that the auditory system uses the timing of individual nerve impulses at low frequencies, but that at high frequencies above a few kHz this timing information is no longer available and the auditory system relies exclusively on place information from the mechanically tuned inner ear.

Rosenblith and Stevens [30] provide an interesting example of the interaction between method of
