DOCUMENT INFORMATION

Basic information

Title: Definitions of Some Terms in Biostatistics
Author: Lê Hoàng Ninh
Institution: University
Field: Biostatistics
Type: Lecture
Year of publication: 2006
Pages: 48
File size: 1.02 MB

Content

Page 1

Biostatistics

Prof. Dr. Lê Hoàng Ninh

Page 2

© 2006 Evidence-based Chiropractic

Definitions of some terms in biostatistics

• Data

– Measurements or observations of a variable

• Variable

– A characteristic that is observed or measured

– Can take different values from one subject to another

Page 3

Definitions of terms used in statistics

• Independent variable

– Precedes the dependent variable; the cause of some outcome

– Smoking -> lung cancer

– Drug A -> recovery

• Dependent variable

– A measure of the effect or outcome

– Its value depends on the independent variable

Page 4

Page 5

Population

• The population is the set of individuals from which a sample is drawn

– e.g., headache patients in a chiropractic office; automobile crash victims in an emergency room

• In research, it is not feasible to measure or observe the entire population

• A sample (a subset of the population) must therefore be drawn

Page 6

Random sample

• Subjects are drawn from the population so that every individual has an equal chance of being selected

• A random sample is representative of the population

• A non-random sample is not representative

– May be biased regarding age, severity of the condition, socioeconomic status, etc.

Page 7

Random sample (cont.)

• Random samples are rare in patient care research

• Instead, subjects are randomly assigned to a treatment group or a control group

– Each person has an equal chance of being assigned to either of the groups

• Random assignment to groups = randomization
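The difference between drawing a random sample and randomizing subjects into groups can be sketched in a few lines of Python. This is only an illustration: the patient identifiers, the helper names `draw_random_sample` and `randomize`, and the seeds are assumptions made for the example, not part of the lecture.

```python
import random


def draw_random_sample(population, n, seed=None):
    """Simple random sample without replacement: each individual is equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(population, n)


def randomize(subjects, seed=None):
    """Randomly split subjects into a treatment group and a control group."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical population of 20 patients (identifiers invented for illustration)
patients = [f"patient_{i:02d}" for i in range(1, 21)]

sample = draw_random_sample(patients, 6, seed=1)
treatment, control = randomize(sample, seed=2)
print("treatment:", treatment)
print("control:  ", control)
```

Shuffling first and then cutting the list in half gives every subject the same chance of landing in either group, which is the property the slide describes.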

Page 8

• A way of summarizing data

• Illustrates the shape, central tendency, and variability of a set of data

– The shape of data has to do with the frequencies of the values of observations

Page 9

• Descriptive statistics differ from inferential statistics

– Descriptive statistics cannot be used to test hypotheses

Page 10

• Distribution provides a summary of:

– Frequencies of each of the values

Page 12

FREQUENCY DISTRIBUTION DISPLAYED AS A HISTOGRAM

Page 13

Histograms (cont.)

• A histogram is a type of bar chart, but there are

no spaces between the bars

• Histograms are used to visually depict frequency distributions of continuous data

• Bar charts are used to depict categorical

information

– e.g., Male–Female, Mild–Moderate–Severe, etc
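As a rough sketch of how a frequency distribution of continuous data becomes a histogram, the Python below groups values into equal-width bins and prints one bar per bin. The ages and the 10-year bin width are invented for illustration.

```python
from collections import Counter

# Hypothetical ages in years (invented for illustration)
ages = [23, 27, 31, 34, 35, 38, 41, 42, 44, 45, 47, 52, 53, 58, 61, 67]
BIN_WIDTH = 10  # assumed bin width


def bin_start(value, width=BIN_WIDTH):
    """Return the lower edge of the bin that contains value."""
    return (value // width) * width


# Frequency distribution: number of observations falling in each bin
frequencies = Counter(bin_start(a) for a in ages)

# Text histogram: adjacent bins, bar length equal to the frequency
for start in sorted(frequencies):
    print(f"{start}-{start + BIN_WIDTH - 1}: " + "#" * frequencies[start])
```

For categorical data such as Mild–Moderate–Severe, the same `Counter` could count the categories directly, and the result would be drawn as a bar chart rather than a histogram.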

Page 14

MEASURES OF CENTRAL TENDENCY

• Mean

– The most commonly used descriptive statistic (DS)

• Calculating the mean

– Add all values of a series of numbers and then divide by the total number of elements

Page 15

• Sample mean: $\bar{X} = \frac{\sum X}{n}$

• Population mean: $\mu = \frac{\sum X}{N}$

• ΣX is a command that adds all of the X values

• n is the total number of values in the series of a sample, and N is the same for a population
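The calculation is short enough to write out directly. The sketch below assumes a small, made-up series of ages; the function name `mean` is chosen for the example.

```python
def mean(values):
    """Add all values and divide by the number of values."""
    if not values:
        raise ValueError("the mean of an empty series is undefined")
    return sum(values) / len(values)


# Hypothetical ages (invented for illustration)
ages = [25, 30, 35, 40, 45]
print(mean(ages))  # 175 / 5 = 35.0
```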

Page 16

Measures of central tendency

• Mode

– The most frequently occurring value in a series

– The modal value is the highest bar in a histogram

Page 17

Measures of central tendency

• Median

– The value that divides a series of values in half when they are all listed in order

– When there is an odd number of values

• The median is the middle value

– When there is an even number of values

• Count from each end of the series toward the middle and then average the 2 middle values
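The rules for the mode and the median can be sketched as follows. The function names are chosen for the example; returning every tied value from `modes` is a design choice of this sketch, since the slides do not say how ties are handled.

```python
from collections import Counter


def median(values):
    """Middle value of the ordered series; average of the 2 middle values if the count is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty series is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def modes(values):
    """Most frequently occurring value(s); all tied values are returned (an assumption)."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)


print(median([3, 1, 2]))      # odd count: the middle value
print(median([4, 1, 3, 2]))   # even count: average of the 2 middle values
print(modes([1, 2, 2, 3]))
```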

Page 18

Measures of central tendency

• Each of the three methods of measuring central tendency has certain advantages and

disadvantages

• Which method should be used?

– It depends on the type of data that is being

analyzed

– e.g., categorical, continuous, and the level of measurement that is involved

Page 19

Levels of measurement

• There are 4 levels of measurement

– Nominal, ordinal, interval, and ratio

1. Nominal

– Data are coded by a number, name, or letter that is assigned to a category or group

– Examples

• Gender (e.g., male, female)

• Treatment preference (e.g., manipulation, mobilization, massage)

Page 20

Levels of measurement

2. Ordinal

– Is similar to nominal because the measurements involve categories

– However, the categories are ordered by rank

– Examples

• Pain level (e.g., mild, moderate, severe)

• Military rank (e.g., lieutenant, captain, major, colonel, general)

Page 21

Levels of measurement

• Ordinal values only describe order, not quantity

– Thus, severe pain is not the same as 2 times mild pain

• The only mathematical operations allowed for nominal and ordinal data are counting of

categories

– e.g., 25 males and 30 females

Page 22

Levels of measurement

3. Interval

– Measurements are ordered (like ordinal data)

– Have equal intervals

– Do not have a true zero

– Examples

• The Fahrenheit scale, where 0° does not correspond to an absence of heat (no true zero)

• In contrast to Kelvin, which does have a true zero

Page 23

Levels of measurement

4. Ratio

– Measurements have equal intervals

– There is a true zero

– Ratio is the most advanced level of measurement, which can handle most types of mathematical operations

Page 24

Levels of measurement (cont.)

• Ratio examples

– Range of motion

• No movement corresponds to zero degrees

• The interval between 10 and 20 degrees is the same as between 40 and 50 degrees

– Lifting capacity

• A person who is unable to lift scores zero

• A person who lifts 30 kg can lift twice as much as one who lifts 15 kg

Page 25

Levels of measurement (cont.)

• NOIR is a mnemonic to help remember the

names and order of the levels of measurement

– Nominal

Ordinal

Interval

Ratio

Page 26

Levels of measurement

Measurement scale | Permissible mathematical operations | Best measure of central tendency
Ordinal | Greater or less than operations | Median
Interval | Addition and subtraction | Symmetrical – Mean; Skewed – Median
Ratio | Addition, subtraction, multiplication, and division | Symmetrical – Mean; Skewed – Median
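The table can be read as a small decision rule. The sketch below encodes the ordinal, interval, and ratio rows as given; the nominal branch (mode) is an assumption added for completeness, because no nominal row survives in this copy of the table. The function name is hypothetical.

```python
def best_central_tendency(scale, symmetrical=True):
    """Suggest a measure of central tendency for a measurement scale."""
    scale = scale.lower()
    if scale == "nominal":
        # Assumption: not taken from the table above, which has no nominal row
        return "mode"
    if scale == "ordinal":
        return "median"
    if scale in ("interval", "ratio"):
        # Mean for symmetrical distributions, median for skewed ones
        return "mean" if symmetrical else "median"
    raise ValueError(f"unknown measurement scale: {scale!r}")


print(best_central_tendency("ordinal"))
print(best_central_tendency("ratio", symmetrical=False))
```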

Page 27

Shape of data

• Histograms of frequency distributions have shape

• Distributions are often symmetrical with most scores falling in the middle and fewer toward the extremes

• Most biological data are symmetrically distributed and form a normal curve (bell-shaped curve)

Page 28

Page 29

The normal distribution

• The area under a normal curve has a normal distribution (Gaussian distribution)

• Properties of a normal distribution

– It is symmetric about its mean

– The highest point is at its mean

Page 30

The normal distribution (cont.)

[Figure: normal curve with the mean marked]

• A normal distribution is symmetric about its mean

• As one moves away from the mean in either direction, the height of the curve decreases, approaching but never reaching zero

• The highest point of the overlying normal curve is at the mean

Page 31

The normal distribution (cont.)

Mean = Median = Mode

Page 32

Skewed distributions

– A small number of extreme values are located in the limits of the opposite end

Page 33

Skewed distributions (cont.)

• Skew is always toward the direction of the longer tail

– Positive if skewed to the right

– Negative if to the left

The mean is shifted the most

Page 34

Skewed distributions (cont.)

– It will be the central point of any distribution

– 50% of the values are above and 50% below the median
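A small numeric sketch shows why the mean is shifted more than the median in a skewed distribution. Both data sets are invented for illustration; the second replaces the largest value with one extreme value to create a long right tail.

```python
from statistics import mean, median

# Hypothetical data sets (invented for illustration)
symmetric = [1, 2, 3, 4, 5, 6, 7]
right_skewed = [1, 2, 3, 4, 5, 6, 70]  # one extreme value in the right tail

for name, data in (("symmetric", symmetric), ("right-skewed", right_skewed)):
    print(f"{name}: mean = {mean(data)}, median = {median(data)}")
```

The median stays at 4 in both cases, with half of the values above it and half below, while the mean moves from 4 to 13 in the direction of the longer tail.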

Page 35

Properties of the normal curve

• About 68.3% of the area under a normal curve is within one standard deviation (SD) of the mean

• About 95.5% is within two SDs

• About 99.7% is within three SDs
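These percentages can be checked numerically. For a normal distribution, the proportion of the area within k SDs of the mean equals erf(k / √2); the sketch below evaluates that with Python's standard `math.erf`. The function name `area_within` is chosen for the example.

```python
import math


def area_within(k):
    """Proportion of the area under a normal curve within k SDs of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} SD: {area_within(k):.4f}")
```

The two-SD value is about 0.9545, which is commonly quoted as either 95.4% or 95.5% depending on rounding.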

Page 36

More properties of normal curves (cont.)

Page 37

Standard deviation (SD)

• SD is a measure of the variability of a set of data

• The mean represents the average of a group of scores, with some of the scores being above the mean and some below

– This range of scores is referred to as variability or spread

• Variance (S²) is another measure of spread

Page 38

Page 39

SD (cont.)

Ages are spread out along an X axis

The amount ages are spread out is known as dispersion or spread

Page 40

Distances ages deviate above and below the mean

Adding deviations always equals zero

Page 41

• However, the total always equals zero

– Values must first be squared, which cancels the negative signs

Page 42

S² is not in the same units (age), but SD is
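The steps from deviations to variance to SD can be traced with a short worked example. The ages are invented, and dividing by n − 1 (the sample variance) is an assumption of this sketch, since the formula itself is not reproduced in the text.

```python
import math

# Hypothetical ages (invented for illustration)
ages = [30, 35, 40, 45, 50]

mean_age = sum(ages) / len(ages)

# Distances each age deviates above or below the mean; these always add up to zero
deviations = [a - mean_age for a in ages]

# Squaring removes the negative signs so the deviations no longer cancel
squared = [d ** 2 for d in deviations]

# Sample variance S² (assumed n - 1 denominator), in squared units (years²)
variance = sum(squared) / (len(ages) - 1)

# SD is the square root of the variance, back in the original units (years)
sd = math.sqrt(variance)

print("sum of deviations:", sum(deviations))
print("variance:", variance)
print("SD:", round(sd, 2))
```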

Page 43

Wide spread results in higher SDs; narrow spread results in lower SDs

Page 44

Spread is important when comparing 2 or more group means

It is more difficult to see a clear distinction between groups in the upper example because the spread is wider, even though the means are the same

Page 45

z-scores

• The number of SDs that a specific score is above or below the mean in a distribution

• Raw scores can be converted to z-scores by subtracting the mean from the raw score and then dividing the difference by the SD
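The conversion is a single line of arithmetic. The sketch below uses made-up numbers (a raw score of 50 in a distribution with mean 40 and SD 10); the function name `z_score` is chosen for the example.

```python
def z_score(raw, mean, sd):
    """Number of SDs that a raw score lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("SD must be positive")
    return (raw - mean) / sd


print(z_score(50, 40, 10))  # one SD above the mean
print(z_score(25, 40, 10))  # one and a half SDs below the mean
```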

Page 46

z-scores (cont.)

• Standardization

– The process of converting raw scores to z-scores

– The resulting distribution of z-scores will always have a mean of zero, an SD of one, and an area under the curve equal to one

• The proportion of scores that are higher or lower than a specific z-score can be determined by referring to a z-table
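Standardizing a whole series can be sketched as below. The raw scores are invented, and using the sample SD (`statistics.stdev`, n − 1 denominator) is an assumption of this example.

```python
from statistics import mean, stdev


def standardize(values):
    """Convert every raw score in a series to a z-score."""
    m = mean(values)
    s = stdev(values)  # sample SD (n - 1); an assumption of this sketch
    return [(v - m) / s for v in values]


# Hypothetical raw scores (invented for illustration)
raw = [30, 35, 40, 45, 50]
z = standardize(raw)

print([round(v, 3) for v in z])
print("mean of z-scores:", round(mean(z), 6))
print("SD of z-scores:  ", round(stdev(z), 6))
```

Whatever the original units, the z-scores come out with a mean of zero and an SD of one, up to floating-point rounding. A series in which every value is identical has an SD of zero and cannot be standardized this way.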

Page 47

z-scores (cont.)

Refer to a z-table to find proportion under the curve

Page 48

z-scores (cont.)

Partial z-table (to z = 1.5) showing proportions of the area under a normal curve for different values of z.

z   0.00   0.01   0.02   0.03   0.04   0.05   0.06   0.07   0.08   0.09
0.0 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
0.1 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
0.2 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
0.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
0.4 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
0.5 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
0.6 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
0.7 0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852
0.8 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133
0.9 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389
1.0 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
1.1 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830
1.2 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015
1.3 0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177
1.4 0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319
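Entries of this kind are values of the standard normal cumulative distribution function, so a z-table lookup can also be computed. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `z_table_row` is chosen for the example, and individual entries may differ from a printed table in the last digit because of rounding.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, SD 1
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def z_table_row(z_tenths):
    """Proportions of the area below z for z_tenths + 0.00 ... z_tenths + 0.09."""
    return [round(standard_normal.cdf(z_tenths + h / 100), 4) for h in range(10)]


# Print rows for z = 0.0 to 1.4, one row per tenth
for i in range(15):
    z = i / 10
    print(f"{z:.1f} " + " ".join(f"{p:.4f}" for p in z_table_row(z)))

# Proportion of scores lower and higher than z = 1.0
below = standard_normal.cdf(1.0)
print(f"below z = 1.0: {below:.4f}, above: {1 - below:.4f}")
```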

Posted: 20/04/2022, 14:15

RELATED IMAGES

Frequency distribution table (Page 11)
Shape of data (Page 27)
Shape of data (Page 28)