Research Methodology_Chapter 7.Pdf

Chapter 7 Analyzing quantitative data

Chapter 7 Quantitative Research Methods

• Raw quantitative data that have not been processed or analyzed convey very little meaning to most people

• For these da[.]


7.1 Preparing, checking and inputting data

Types of data:

• Quantitative data can be divided into two groups: categorical data and numerical data

• Categorical data are those whose values cannot be measured numerically but can be classified into sets/categories according to the characteristics that describe or identify the variable, or placed in rank order

• There are two types of categorical data:

• Descriptive/nominal data – for these data one can only count the number of occurrences in each category of a variable. When a variable is divided into two categories (female/male, for example), the data are known as dichotomous data

• Ranked/ordinal data – these are a more precise form of categorical data, in which the categories can be placed in rank order


• Alternatively, numerical data are those whose values are numerically measured or counted as quantities (Berman 2008)

• Numerical data are therefore more precise than categorical ones because one can assign each data value a position on a numerical scale.

• Numerical data can be subdivided in two ways: into interval and ratio data, or into continuous and discrete data

• With interval data one can state the difference (interval) between any two data values of a certain variable, whereas with ratio data one can also calculate the relative difference (ratio) between any two data values of a certain variable.

• Continuous data are those whose values can take any value within a range (given that you measure them accurately enough), while discrete data can be measured precisely because they take only certain distinct values (often whole numbers/integers).


• After determining the types of data that are to be collected, the researcher can start to enter the data into data processing software (SPSS/Excel)

• To do this the data need to be coded using numerical codes. This enables the researcher to enter the data quickly and with fewer errors

• When this is done the data should be checked for errors
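The coding and checking steps can be sketched in a few lines of Python. This is only an illustration: the codebook, the missing-data code of -1 and both function names are assumptions made for the example, not part of the chapter.

```python
# Hypothetical codebook: each category label maps to a numerical code.
GENDER_CODES = {"female": 1, "male": 2}
MISSING = -1  # assumed code for missing or unrecognised answers

def code_responses(responses, codebook, missing=MISSING):
    """Translate raw text answers into numerical codes."""
    return [codebook.get(r.strip().lower(), missing) for r in responses]

def find_illegitimate_codes(codes, codebook, missing=MISSING):
    """Return the positions of codes that do not appear in the codebook."""
    allowed = set(codebook.values()) | {missing}
    return [i for i, c in enumerate(codes) if c not in allowed]

raw = ["Female", "male", "MALE ", "n/a"]  # made-up answers
coded = code_responses(raw, GENDER_CODES)
print(coded)  # [1, 2, 2, -1]

# A typing slip (code 3 does not exist) is caught by the check.
entered = [1, 2, 3, 2]
print(find_illegitimate_codes(entered, GENDER_CODES))  # [2]
```

Looking for illegitimate codes is only one kind of check; illogical combinations between variables would need separate rules.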


7.2 Exploring and presenting the data

• Tukey's (1977) exploratory data analysis (EDA) is a useful approach to start the analysis of quantitative data. This approach focuses on the use of diagrams to explore and understand the data. It may also reveal other relationships in the data which the research was not designed to test

• When looking at the collected data it is best to explore specific values, highest and lowest values, trends over time, proportions and distributions

• Once these have been explored one can start to compare them and look for (causal) relationships between variables


Exploring variables

Shapes of diagrams

Comparing variables


• Exploring variables:

• The easiest way of summarizing the data is by using tables. However, tables give no visual emphasis to the highest or lowest values, so diagrams may be a better option for summarizing the data

• One way to present data is a bar chart, where the height or length of each bar represents the frequency of occurrence

• Histograms are similar to bar charts, but here the area of each bar represents the frequency of occurrence, and the continuous nature of the data is emphasized by the absence of gaps between bars.
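As a rough illustration of the difference, the sketch below builds a frequency table and a text bar chart for a categorical variable, and bin counts for a numerical variable as the basis of a histogram. The data and the function names are made up for the example, and equal-width bins are an assumption (with equal widths, bar height and bar area tell the same story).

```python
from collections import Counter

def frequency_table(values):
    """Count how often each category occurs, most frequent first."""
    return Counter(values).most_common()

def text_bar_chart(table):
    """Render one bar per category; the bar length equals the frequency."""
    width = max(len(str(label)) for label, _ in table)
    return "\n".join(
        f"{str(label):<{width}} | {'#' * count} {count}" for label, count in table
    )

def histogram_counts(values, start, bin_width, n_bins):
    """Count numerical values in adjacent, equal-width bins [low, high)."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // bin_width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts

transport = ["bus", "car", "bus", "bike", "bus", "car"]  # made-up answers
table = frequency_table(transport)
print(text_bar_chart(table))

ages = [18, 21, 22, 25, 29, 31, 34, 38]  # made-up ages
print(histogram_counts(ages, start=10, bin_width=10, n_bins=3))  # [1, 4, 3]
```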


• Comparing variables

• Contingency tables (cross-tabulations) are one approach to examining the interdependence between variables. Other approaches are:

• Multiple bar chart – used to compare highest and lowest values.

• Percentage component bar chart – used to compare proportions between variables.

• Multiple line graph – used to compare trends and conjunctions (points where the lines cross).

• Stacked bar chart – used to compare totals between variables.

• Comparative proportional pie chart – used to compare proportions of each category or value as well as the totals between variables.

• Scatter graph or scatter plot – often used to explore possible relationships between ranked and numerical data variables by plotting one variable against another.
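A contingency table can be built by counting how often each combination of values occurs. The following is a minimal sketch with made-up survey answers; the function name `cross_tabulate` and the nested-dictionary layout are choices made for the example.

```python
from collections import Counter

def cross_tabulate(rows, cols):
    """Build a contingency table as {row_value: {col_value: count}}."""
    counts = Counter(zip(rows, cols))
    row_labels = sorted(set(rows))
    col_labels = sorted(set(cols))
    return {r: {c: counts[(r, c)] for c in col_labels} for r in row_labels}

# Made-up answers, one pair per respondent.
gender = ["female", "male", "female", "male", "female", "male"]
owns_car = ["yes", "yes", "no", "yes", "no", "no"]

table = cross_tabulate(gender, owns_car)
for row_label, row in table.items():
    print(row_label, row)
```

Each inner dictionary is one row of the table, so row and column totals can be obtained by summing over it.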


7.3 Describing data with use of statistics

• Tukey's exploratory data analysis approach is a good way to understand the data using diagrams

• Descriptive statistics, on the other hand, enable one to describe the variables numerically. Statistics that describe a variable focus on its central tendency and its dispersion

• Central tendency gives a general impression of the values that could be seen as common, middling or average. It is measured by:

• The mode – the value that occurs most often

• The median – the middle value or mid-point after the data have been ranked

• The mean – also known as the average: the sum of all values divided by the number of values
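The three measures can be computed with Python's standard `statistics` module. The values below are made up; they include one extreme value to show that the mean is pulled towards it while the mode and median are not.

```python
import statistics

values = [1, 2, 2, 2, 3, 4, 5, 6, 20]  # made-up data with one extreme value

mode = statistics.mode(values)      # value that occurs most often
median = statistics.median(values)  # middle value of the ranked data
mean = statistics.mean(values)      # sum of values divided by their number

print(mode, median, mean)  # 2 3 5
```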


• The dispersion (how data are distributed around the central tendency) could be described by:

• Inter-quartile range – the difference between the upper and lower quartiles, that is, the spread of the middle 50 per cent of values

• Standard deviation – the extent to which values differ from the mean

• Range – the difference between the lowest and the highest values

• Coefficient of variation – used to compare the relative spread of data between distributions of different magnitudes, for example hundreds of tons with billions of tons (calculated by dividing the standard deviation by the mean and multiplying the answer by 100)
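A small sketch of these four measures, using the standard `statistics` module. The data are made up, the function name `dispersion_summary` is a choice for this example, and the sample standard deviation is assumed. Quartiles can be defined in several slightly different ways; this sketch uses the module's default method, so other software may report slightly different quartiles.

```python
import math
import statistics

def dispersion_summary(values):
    """Range, inter-quartile range, standard deviation and coefficient of variation."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (assumption)
    return {
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "sd": sd,
        "cv_percent": sd / mean * 100,
    }

small = [100, 120, 130, 150, 170, 180, 200]  # hundreds of tons (made up)
large = [v * 10_000_000 for v in small]      # the same pattern in billions of tons

summary_small = dispersion_summary(small)
summary_large = dispersion_summary(large)
print(summary_small["range"], summary_small["iqr"])
print(round(summary_small["cv_percent"], 1), round(summary_large["cv_percent"], 1))
```

The two standard deviations differ by a factor of ten million, yet the coefficient of variation is the same for both series, which is why it suits comparisons across magnitudes.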


7.4 Explore relationships, differences and trends using statistics

• In research one often wishes to find out whether there is a relationship between variables

• Testing this is called hypothesis testing (significance testing), where one compares the collected data with what one would expect to happen

• There are two general groups of statistical significance tests: non-parametric tests (used when the data are not normally distributed) and parametric tests (used with numerical data, typically assumed to be normally distributed)


Testing for normal distribution

Testing for significance

Type 1 and 2 errors


• Testing for normal distribution:

• A way to test for normality is to use statistics to determine whether the distribution for a variable differs significantly from a comparable normal distribution

• This could be done using statistical software that offers the Kolmogorov-Smirnov test and the Shapiro-Wilk test. A probability of 0.05 means that, if the data really came from a normal distribution, there would be only a 5 per cent chance of seeing a difference this large by chance alone

• Thus if the probability is lower than 0.05, the difference is statistically significant and the data are treated as not normally distributed
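To make the idea concrete, the sketch below computes the Kolmogorov-Smirnov statistic D, the largest gap between the cumulative distribution of the data and that of a normal distribution with the same mean and standard deviation. It is a rough illustration only: instead of a probability it uses the large-sample 5 per cent critical value 1.36/√n, which is too lenient when the mean and standard deviation are estimated from the same data. Statistical software applies a correction and reports a probability. The function name and both data sets are made up.

```python
import math
from statistics import NormalDist, mean, stdev

def ks_normality_check(values):
    """Return (D, critical_value, looks_normal) for a rough KS check."""
    data = sorted(values)
    n = len(data)
    fitted = NormalDist(mean(data), stdev(data))  # comparable normal distribution
    d = 0.0
    for i, x in enumerate(data, start=1):
        cdf = fitted.cdf(x)
        # Largest gap between the empirical and the normal cumulative distribution.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    critical = 1.36 / math.sqrt(n)  # approximate 5 per cent critical value
    return d, critical, d <= critical

n = 200
# Evenly spaced quantiles of a normal and of a strongly skewed distribution.
bell = [NormalDist(50, 10).inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
skewed = [-math.log(1 - (i - 0.5) / n) for i in range(1, n + 1)]

print(ks_normality_check(bell)[2])    # True: no significant difference
print(ks_normality_check(skewed)[2])  # False: differs from a normal distribution
```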


• Testing for significance

• If the test shows a statistically significant relationship between variables, the researcher rejects the null hypothesis and accepts the alternative hypothesis

• It is difficult to obtain a significant test statistic with a small sample. As the sample size increases, more of the relationships found will be significant

• This is because a larger sample more closely resembles the population from which it was selected
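The effect of sample size can be illustrated with the usual t statistic for a correlation coefficient, t = r·√((n − 2)/(1 − r²)). This formula is not given in the chapter and is used here only as an example; the function name is made up.

```python
import math

def t_statistic_for_r(r, n):
    """t statistic for testing a correlation r observed in a sample of size n."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# The same modest correlation, observed in samples of increasing size.
for n in (10, 30, 100):
    print(n, round(t_statistic_for_r(0.3, n), 2))
```

For r = 0.3 the statistic is roughly 0.89, 1.66 and 3.11. The two-tailed 5 per cent critical value of t is about 2.31 for n = 10 and about 1.98 for n = 100, so the same correlation is significant only in the largest sample.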


• Type 1 and 2 errors

• A Type 1 error occurs when the null hypothesis is wrongly rejected, so the alternative hypothesis should not have been accepted. In other words, the researcher states that two variables are related when they are actually not. The statistical significance level is the probability of making a Type 1 error

• A Type 2 error occurs when the researcher does not reject the null hypothesis when it should have been rejected. The researcher then states that two variables are not related when they actually are


• Testing for association: the chi square test

• When descriptive or numerical data are summarized as a two-way contingency table it is helpful to use a chi square test

• A chi square test makes it possible to determine how likely it is that the association observed between two variables occurred by chance alone

• In order to do this test two assumptions should be met:

• The categories of the contingency table are mutually exclusive. Each observation falls into one category only

• Not more than 25 per cent of the cells can have expected values of less than 5. When the table consists of two rows and two columns, no expected values can be less than 10.


• Exploring the strength of a relationship

• There are two kinds of relationships:

• Correlation: a change in one variable is accompanied by a change in another variable, but it is not clear which variable has caused the other to change

• Cause-and-effect relationship: a change in one or more variables causes a change in another variable


• The correlation coefficient quantifies the strength of a linear relationship between two ranked or numerical variables as a number between +1 and -1

• A value of +1 means a perfect positive correlation: the two variables are exactly related and when one increases, the other increases as well

• A value of -1 means a perfect negative correlation: the two variables are exactly related, but when one increases, the other decreases
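As an illustration, the sketch below computes Pearson's product moment correlation coefficient for numerical data. The chapter does not name a specific coefficient, so this choice is an assumption, and ranked data would normally call for a rank correlation coefficient instead. The data and the function name are made up.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    co_variation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    variation_x = sum((a - mean_x) ** 2 for a in x)
    variation_y = sum((b - mean_y) ** 2 for b in y)
    return co_variation / math.sqrt(variation_x * variation_y)

hours = [1, 2, 3, 4, 5]  # made-up data
r_up = pearson_r(hours, [10, 20, 30, 40, 50])    # rises exactly in step
r_down = pearson_r(hours, [50, 40, 30, 20, 10])  # falls exactly in step
r_weak = pearson_r(hours, [3, 1, 4, 1, 5])       # loosely related

print(r_up, r_down, round(r_weak, 2))
```

The first two results are the extreme values +1 and -1; values closer to 0 indicate a weaker linear relationship.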

Title	Chapter 7: Quantitative Research Methods
Author	V.T.P.Mai
University	Foreign Trade University
Major	Quantitative Research Methods
Type	chapter
Year of publication	2023
City	Hanoi
