Glossary of Statistical Terms and Symbols

Part of the document ASTM MNL 26 2005 (pages 84-87)

This glossary contains definitions for frequently encountered statistical terms and symbols. It is not meant to be all-inclusive. The bibliography of this manual lists several statistical texts that should be consulted for more extensive definitions and explanations. Formulas and symbols are included in the definitions where they are appropriate. An illustration of some common operations on two arbitrary data sets is included to aid in using this section.

1. Definitions

Alternate Hypothesis—See section on "Hypothesis Testing."

Symbol $H_a$

Confidence Interval—This is the interval within which a population value is expected to be found with some specified probability. The usual statement is "the 95% confidence interval for $x$ is $\bar{x} \pm k$." The statement is interpreted as saying that there is a 95% probability that the value of $x$ will be between $(\bar{x} - k)$ and $(\bar{x} + k)$. Note that there is a relationship to "statistical significance," since the statement implies that the value of $x$ will be outside the confidence interval less than 5% of the time.

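As an illustration of the confidence interval statement, the following Python sketch computes a 95% interval for the mean of Set A from the illustrative examples at the end of this glossary. The glossary does not say how k is obtained. The sketch assumes one common choice, k = t × SEM, where 2.262 is the tabulated two-sided 95% value of Student's t for 9 degrees of freedom. The variable names are illustrative only.

```python
import math
import statistics

# Set A from the illustrative examples in this glossary.
set_a = [5, 6, 4, 3, 5, 6, 4, 4, 3, 7]

n = len(set_a)
mean = statistics.mean(set_a)
sem = statistics.stdev(set_a) / math.sqrt(n)  # standard error of the mean

# Assumption: k = t * SEM, with the tabulated two-sided 95% t value
# for n - 1 = 9 degrees of freedom. The glossary itself does not define k.
T_CRIT_95_DF9 = 2.262
k = T_CRIT_95_DF9 * sem

lower, upper = mean - k, mean + k
print(f"{mean:.2f} +/- {k:.2f} -> ({lower:.2f}, {upper:.2f})")
# prints: 4.70 +/- 0.96 -> (3.74, 5.66)
```

Under these assumptions the interval runs from about 3.74 to 5.66. A different sample size would need a different tabulated t value.
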
Copyright 1996 by ASTM International www.astm.org

SENSORY TESTING METHODS: SECOND EDITION

Degrees of Freedom—This is a difficult concept related to the independence of observations. A complete discussion is beyond the scope of this manual. However, the basic meaning can be shown by the following. If the mean of n observations is known, then specifying the values of any n - 1 of the observations fixes the value of the remaining observation, and the set is said to have n - 1 degrees of freedom. For example, if the mean of 6 observations is 3.5 and we know that 5 of the observations are 2, 5, 3, 6, and 3, then the remaining observation must be (6 × 3.5) - 2 - 5 - 3 - 6 - 3 = 2. The degrees of freedom are 5, or n - 1. Also see the example data sets and the section on "Hypothesis Testing."
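The arithmetic in the degrees-of-freedom example can be checked with a short Python sketch. The variable names are illustrative and are not part of the manual.

```python
# Once the mean of n observations is known, any n - 1 of them
# determine the last one.
n = 6
mean = 3.5
known = [2, 5, 3, 6, 3]  # the five observations given in the example

remaining = n * mean - sum(known)  # 21 - 19
degrees_of_freedom = n - 1

print(remaining)           # 2.0
print(degrees_of_freedom)  # 5
```
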

Geometric Mean—The geometric mean of a set of n numbers is the nth root of the product of the numbers. Another way of defining the geometric mean is as the antilog of the arithmetic mean of the logarithms of the numbers in the set. This is usually easier to calculate. Note that a geometric mean may be calculated only for data sets where all values are positive.

Symbol $\bar{x}_g$

Formula $\bar{x}_g = (x_1 \cdot x_2 \cdot x_3 \cdots x_n)^{1/n}$

Calculation formula $\bar{x}_g = \operatorname{antilog}\left(\frac{\log(x_1) + \log(x_2) + \log(x_3) + \cdots + \log(x_n)}{n}\right)$
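The two definitions of the geometric mean can be compared in a minimal Python sketch. The function names and the data values are illustrative assumptions, not taken from the manual. Base-10 logarithms are assumed here; any base gives the same result as long as the antilog uses the same base.

```python
import math

def geometric_mean_product(values):
    """n-th root of the product of the values."""
    if any(v <= 0 for v in values):
        raise ValueError("all values must be positive")
    return math.prod(values) ** (1 / len(values))

def geometric_mean_logs(values):
    """Antilog of the arithmetic mean of the base-10 logarithms."""
    if any(v <= 0 for v in values):
        raise ValueError("all values must be positive")
    mean_log = sum(math.log10(v) for v in values) / len(values)
    return 10 ** mean_log

data = [2, 4, 8]  # illustrative values; their product is 64
by_product = geometric_mean_product(data)
by_logs = geometric_mean_logs(data)

# Both routes agree to within floating-point rounding (about 4).
print(round(by_product, 6), round(by_logs, 6))
```

The logarithm route avoids forming a very large or very small product when the data set is long, which is one reason it is usually the easier calculation.
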

Interval Data—Numbers used to denote a distance or location on a known continuous scale with a zero point that is usually arbitrary.

Examples: time or temperature

Mean—The arithmetic average of a set of observed values.

Symbol $\bar{x}$ ($\mu$ is used for the population mean)

Formula $\bar{x} = \frac{\sum x}{n}$

where $\sum x$ = sum of the individual values and $n$ = the number of individual values.

Median—The midpoint of a set of observed values which have been ordered from the lowest to the highest. Exactly half the values are higher than the median and half are lower. When the number of values is even, the median is the average of the two middle values. See example data sets.
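The mean and median definitions can be illustrated with Set A from the example data sets in this glossary. The sketch below computes each value by hand and compares it with Python's standard `statistics` module. The variable names are illustrative.

```python
import statistics

# Set A from the illustrative examples in this glossary.
set_a = [5, 6, 4, 3, 5, 6, 4, 4, 3, 7]
n = len(set_a)

# Mean: sum of the individual values divided by their number.
mean_by_formula = sum(set_a) / n

# Median: midpoint of the ordered values; with an even count,
# the average of the two middle values.
ordered = sorted(set_a)
middle = n // 2
if n % 2 == 0:
    median_by_hand = (ordered[middle - 1] + ordered[middle]) / 2
else:
    median_by_hand = ordered[middle]

print(mean_by_formula, statistics.mean(set_a))     # 4.7 4.7
print(median_by_hand, statistics.median(set_a))    # 4.5 4.5
```
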

Nominal Data—Numbers or symbols used to denote membership in a group or class.

Examples: Zip codes, male/female, area codes

Null Hypothesis—See section on "Hypothesis Testing."

Symbol $H_0$

Probability Distribution—A mathematical equation that relates the value of an observation (for example, an individual's height) to the likelihood of observing that value.

One-Sided and Two-Sided Hypothesis Test—See section on "Hypothesis Testing."

CHAPTER 7 ON STATISTICAL PROCEDURES

Ordinal Data—Numbers used to denote a ranking within or between groups or classes.

Examples: preference, socioeconomic status

p-Value—The probability associated with some observation or statistic.

Ratio Data—A special case of interval data where a true zero exists.

Examples: mass, volume, density

Random Sample—A sample taken from a population in such a way as to give each individual in the population an equal chance of being selected.

Sample—A subset of observed values from a population of values.

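Standard Deviation—The square root of the variance. Dividing the standard deviation by $\sqrt{n}$ gives the standard error of the mean.

Symbol SEM for the standard error of the mean ($s_{\bar{x}}$ is sometimes used)

Formula $\mathrm{SEM} = \frac{s}{\sqrt{n}}$, where $s$ is the standard deviation and $n$ is the number of observed values.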

Statistic—A function of the observed values in a sample. Statistics are used to estimate the population values (for example, $\bar{x}$ estimates the population mean, $\mu$). Statistics are also used to test hypotheses (for example, $\chi^2$ or the t-test).

Statistical Significance—See section on "Hypothesis Testing."

Subscripts—Subscripts are used to identify members of a set of data. For example: $x_1, x_2, x_3, \ldots, x_n$. They may also be used to identify different sets of data, such as $x_A$ and $x_B$.

Variance—A measure of the scatter or dispersion of a set of observed values about the mean of the set.

Symbol $s^2$ ($\sigma^2$ is used for the population variance)

Formula $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$

Calculation formula $s^2 = \frac{n \sum x^2 - \left(\sum x\right)^2}{n(n - 1)}$

2. Some Other Common Symbols

$d$ The difference between two values.

$\bar{d}$ The mean difference between two sets of paired values.

$p, q$ The two values of proportion such that $q = 1 - p$.

$\alpha$ The Type I error probability (see "Statistical Errors" section).

$\beta$ The Type II error probability (see "Statistical Errors" section).


Illustrative Examples of Some Statistical Calculations

Value              Set A    Set A²     Set B    Set B²     A - B  (A - B)²
x_1                    5        25         3         9         2         4
x_2                    6        36         1         1         5        25
x_3                    4        16         4        16         0         0
x_4                    3         9         5        25        -2         4
x_5                    5        25         6        36        -1         1
x_6                    6        36         4        16         2         4
x_7                    4        16         2         4         2         4
x_8                    4        16         5        25        -1         1
x_9                    3         9         4        16        -1         1
x_10                   7        49         3         9         4        16
Sum                   47       237        37       157        10        60
n                     10                  10                  10
Median               4.5                   4                   1
Mean (sum/n)         4.7                 3.7                 1.0

Variance
  Set A:  (237 - (47 * 47)/10) / (10 - 1) = 1.789
  Set B:  (157 - (37 * 37)/10) / (10 - 1) = 2.233
  A - B:  (60 - (10 * 10)/10) / (10 - 1) = 5.556

Standard deviation
  Set A:  √1.789 = 1.337
  Set B:  √2.233 = 1.494
  A - B:  √5.556 = 2.357

SEM
  Set A:  1.337/√10 = 0.423
  Set B:  1.494/√10 = 0.472
  A - B:  2.357/√10 = 0.745
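The illustrative calculations can be reproduced with a short Python sketch. It uses the variance calculation formula with denominator n(n - 1) and compares the result with `statistics.variance` from the standard library, which applies the definition formula. The helper name `summarize` and the dictionary keys are illustrative assumptions, not part of the manual.

```python
import math
import statistics

set_a = [5, 6, 4, 3, 5, 6, 4, 4, 3, 7]
set_b = [3, 1, 4, 5, 6, 4, 2, 5, 4, 3]
diff = [a - b for a, b in zip(set_a, set_b)]  # paired differences A - B

def summarize(values):
    """Return the summary statistics used in the illustrative examples."""
    n = len(values)
    total = sum(values)
    sum_sq = sum(v * v for v in values)
    # Calculation formula: (n * sum(x^2) - (sum x)^2) / (n * (n - 1))
    variance = (n * sum_sq - total ** 2) / (n * (n - 1))
    sd = math.sqrt(variance)
    return {
        "sum": total,
        "n": n,
        "median": statistics.median(values),
        "mean": total / n,
        "sum_sq": sum_sq,
        "variance": variance,
        "sd": sd,
        "sem": sd / math.sqrt(n),
    }

for name, values in (("Set A", set_a), ("Set B", set_b), ("A - B", diff)):
    s = summarize(values)
    # The definition formula should agree with the calculation formula.
    agrees = math.isclose(s["variance"], statistics.variance(values))
    print(f"{name}: mean={s['mean']:.1f} variance={s['variance']:.3f} "
          f"sd={s['sd']:.3f} formulas agree={agrees}")
```

The mean of the A - B column is the mean difference for the paired values. Carrying full precision, the SEM for Set B comes out near 0.4726; the 0.472 shown in the examples results from dividing the rounded standard deviation 1.494 by √10.
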
