The Science and Art of Implementing Quantitative Evaluation Surveys
John Hoddinott
Deputy Director, Food Consumption and Nutrition Division
Conducting a quantitative evaluation survey is both a science and an art.
The science pertains to the construction of a sample that is representative of the population of interest.
The art pertains to the implementation of a survey instrument (such as a questionnaire) that generates the information you need for your evaluation.
The Science
The science …
• Deciding on your unit of observation
• Describing the universe: The sample frame
• Drawing the sample
• Choosing the sample size
The Unit of Observation
The unit of observation is simply the unit that is of
interest given the objectives of your study:
• Students
• Young women
• Workers
The Sampling Frame
The frame for a sample is a list of the units in the population (or universe) from which the units that will be enumerated in the sample area are selected. It may be an actual list, a set of index cards, a map, or data stored in a computer. The frame is a set of physical materials (census statistics, maps, lists, directories, records) that enables us to take hold of the universe piece by piece.
(Casley and Lury, 1987, p. 52)
The Sampling Frame, cont’d
It is important that you examine carefully any sampling
frame that is made available to you for:
Drawing the sample
Probabilistic sampling methods use some mechanism involving chance to determine which observations appear in the sample. These mechanisms include:
• Systematic sampling
• Simple random sampling
• Stratified random sampling
• Cluster based sampling
Systematic methods involve the selection of every nth observation.
For example, suppose we want a sample of 250 observations from our population of 1000 students.
We could take the first student on our list, the fifth, the ninth and so on. This method is relatively straightforward.
The drawback is that the ordering of students from 1 to 1000 must be random. If there is some subtle, difficult-to-observe ordering of the list (for example, older children tend to be assigned even numbers), the observations drawn will not be a random sample of the population.
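The every-nth rule can be sketched in a few lines of Python. This is an illustration only: the student IDs are hypothetical, and the random starting point is a common variant that is assumed here rather than taken from the slide, which always starts with the first student.

```python
import random

def systematic_sample(units, n, rng):
    """Select every k-th unit after a random start, where k = len(units) // n."""
    k = len(units) // n           # sampling interval (4 when drawing 250 from 1000)
    start = rng.randrange(k)      # assumed variant: random start within the first interval
    return units[start::k][:n]

students = list(range(1, 1001))   # hypothetical student IDs 1..1000
sample = systematic_sample(students, 250, random.Random(1))
print(len(sample), sample[:3])
```

A random start does not remove the drawback noted above: if the list has a hidden pattern with the same period as the interval, the sample is still unrepresentative.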
Sampling, cont’d
Simple random sampling is a better alternative.
The simplest way to do this is to use a statistics package like Stata.
Suppose we have a listing of 1000 students and we want to randomly select 50 of them for interview. We use the command:
• sample 50, count
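For readers without Stata, the same kind of draw can be sketched in Python. The student IDs and the seed value are hypothetical; fixing a seed is an added suggestion so that the draw can be reproduced and audited later.

```python
import random

rng = random.Random(12345)            # fixed seed so the draw is reproducible
students = list(range(1, 1001))       # hypothetical student IDs 1..1000
selected = rng.sample(students, 50)   # 50 distinct students, drawn without replacement
print(sorted(selected)[:5])
```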
Sampling, cont’d
There is a potential weakness with this approach. Suppose we are drawing a sample of 100 students from a population of 1000. We know that 30% of these have completed grades 5-8, so our sample should contain 30 such students. However, this is only true on average! Although our sample is likely to contain close to 30 such students, it is also possible that it contains 20, 25 or 40.
The solution to this problem is stratified random sampling. The first step is to divide the population into groups or strata. Here, the division would be between the 300 students in grades 5-8 and the 700 other students. Using the random number method, select 10% of students in each category, so the resultant sample contains 30 students in grades 5-8 and 70 others. The proportions in the sample are identical to those in the underlying population.
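A minimal Python sketch of the stratified draw described above, assuming the population list already records which stratum each student belongs to. The stratum names and ID ranges are hypothetical.

```python
import random

def stratified_sample(strata, fraction, rng):
    """Draw the same sampling fraction independently within each stratum."""
    return {
        name: rng.sample(units, round(len(units) * fraction))
        for name, units in strata.items()
    }

strata = {
    "grades_5_8": list(range(1, 301)),    # 300 hypothetical student IDs
    "other": list(range(301, 1001)),      # 700 hypothetical student IDs
}
sample = stratified_sample(strata, 0.10, random.Random(7))
print({name: len(units) for name, units in sample.items()})
```

Because the fraction is applied within each stratum, the 30/70 split is guaranteed rather than holding only on average.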
Sampling, cont’d
The final approach is to use cluster based sampling.
Here, you select a cluster (e.g. a school) and sample units within it.
Cluster based sampling is especially appropriate in the
context of randomized designs where randomization
occurs at the cluster level.
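As a rough sketch of randomization at the cluster level, the Python below assigns hypothetical schools to treatment and control and then draws students within a school. The school names, roster and sample sizes are illustrative assumptions, not part of the original design.

```python
import random

rng = random.Random(2024)
schools = [f"school_{i:03d}" for i in range(1, 101)]   # hypothetical cluster IDs

# Stage 1: randomize at the cluster level (half treatment, half control).
shuffled = rng.sample(schools, len(schools))
treatment = set(shuffled[:50])
control = set(shuffled[50:])

def draw_within(roster, n, rng):
    """Stage 2: simple random sample of up to n units inside one cluster."""
    return rng.sample(roster, min(n, len(roster)))

roster = [f"student_{j}" for j in range(1, 41)]         # hypothetical roster of 40 students
interviewees = draw_within(roster, 25, rng)
print(len(treatment), len(control), len(interviewees))
```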
Sample Size Calculations
Recall some basic statistical concepts:
• Significance level: The probability of rejecting a null hypothesis that is true, i.e. of making a Type I error. This is often expressed as a percentage, so that a test of significance level α is referred to as a 100α% level test.
• Power: The probability, 1 − β, of correctly rejecting a null hypothesis that is false.
• For a given sample size, there is a trade-off between increasing power and reducing Type I errors.
• We can only increase power and reduce Type I errors simultaneously by increasing sample size.
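The trade-off can be made concrete with a small Python sketch. It assumes a two-sided z-test comparing two group means with a common standard deviation, and the numbers (a difference of 10, standard deviation of 30, 100 or 200 observations per group) are purely illustrative.

```python
from math import sqrt
from statistics import NormalDist

def power_two_means(delta, sd, n_per_group, alpha):
    """Approximate power of a two-sided z-test for a difference in two means."""
    z = NormalDist()
    se = sd * sqrt(2.0 / n_per_group)         # standard error of the difference
    critical = z.inv_cdf(1 - alpha / 2)       # two-sided critical value
    return z.cdf(abs(delta) / se - critical)  # ignores the negligible opposite tail

for alpha in (0.10, 0.05, 0.01):
    for n in (100, 200):
        print(alpha, n, round(power_two_means(10, 30, n, alpha), 3))
```

Holding n fixed, a stricter significance level lowers power; only a larger n improves both at once.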
Sample Size Calculations, cont’d
In addition, in order to calculate sample sizes, we need to know:
• The size of the impact that we would like to detect
• Estimates of standard deviations
• (If using a cluster design), estimates of the intra-cluster correlation, which determines the design effect
Using Stata, we can automate these calculations.
Sample Size Calculations, cont’d
(If using a clustered design), first calculate the intra-cluster correlation coefficient.
Suppose we have the variable score and our clusters are defined by the variable school_id.
The command:
• loneway score school_id
gives the intra-cluster correlation.
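To show what an intra-cluster correlation measures, here is a Python sketch of the standard one-way ANOVA estimator. It is assumed, not confirmed by the slides, that this matches what Stata reports; the scores are hypothetical.

```python
from statistics import mean

def anova_icc(groups):
    """One-way ANOVA estimate of the intra-cluster correlation.

    groups: list of lists, one list of scores per cluster (e.g. per school).
    """
    k = len(groups)
    total_n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / total_n
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (total_n - k)
    # "Average" cluster size, adjusted for unequal cluster sizes.
    n0 = (total_n - sum(len(g) ** 2 for g in groups) / total_n) / (k - 1)
    return (ms_between - ms_within) / (ms_between + (n0 - 1) * ms_within)

scores_by_school = [[52, 55, 58], [70, 74, 69], [61, 60, 66]]  # hypothetical scores
print(round(anova_icc(scores_by_school), 3))
```

Values near 1 mean pupils in the same school resemble each other closely, so each extra pupil per school adds little new information.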
Sample Size Calculations, cont’d
Suppose I want to detect a 20% increase as a result of my intervention (from 100 for the control group to 120 for the treatment group).
I want to have statistical power of 0.80.
The standard deviation is 60 in both groups.
I run the Stata command:
• sampsi 100 120, p(0.80) sd1(60) sd2(60)
Sample Size Calculations, cont’d
Test Ho: m1 = m2, where m1 is the mean in population 1 and m2 is the mean in population 2
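The calculation behind this test can be sketched with the usual normal-approximation formula for comparing two means. The sketch assumes a two-sided test at the 5% significance level and equal group sizes; the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(m1, m2, sd1, sd2, power=0.80, alpha=0.05):
    """Sample size per group for a two-sided comparison of two means."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value for the significance level
    z_power = z.inv_cdf(power)           # quantile matching the desired power
    n = (z_alpha + z_power) ** 2 * (sd1 ** 2 + sd2 ** 2) / (m1 - m2) ** 2
    return ceil(n)                       # round up to whole observations

print(n_per_group(100, 120, 60, 60))
```

With the inputs from the example (means of 100 and 120, standard deviations of 60, power of 0.80) this gives 142 observations per group before any adjustment for clustering.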
Sample Size Calculations, cont’d
Finally, to account for clustering.
Suppose I intend to survey 100 schools (50 treatment and 50 control) with 25 observations per school, and the intra-cluster correlation is 0.30.
I run
• sampclus, obsclus(25) rho(0.30)
which will give me sample sizes adjusted for clustering.
Sample Size Calculations, cont’d
Sample Size Adjusted for Cluster Design
n1 (uncorrected) = 142
n2 (uncorrected) = 142
Intraclass correlation = .3
Average obs per cluster = 25
Minimum number of clusters = 94
Estimated sample size per group:
n1 (corrected) = 1165
n2 (corrected) = 1165
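These figures are consistent with the standard design-effect adjustment, 1 + (m − 1)ρ, where m is the number of observations per cluster and ρ the intra-cluster correlation. The Python sketch below reproduces them; that the Stata routine rounds in exactly this way is an assumption.

```python
from math import ceil

def cluster_adjusted(n_uncorrected, obs_per_cluster, rho, groups=2):
    """Inflate a per-group sample size by the design effect 1 + (m - 1) * rho."""
    design_effect = 1 + (obs_per_cluster - 1) * rho
    n_corrected = ceil(n_uncorrected * design_effect)
    min_clusters = ceil(groups * n_corrected / obs_per_cluster)
    return design_effect, n_corrected, min_clusters

deff, n_corrected, clusters = cluster_adjusted(142, 25, 0.30)
print(round(deff, 2), n_corrected, clusters)
```

A design effect of about 8.2 means each group needs roughly eight times as many observations as a simple random sample would.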
Sample Size Calculations: Concluded
The final point to note here is that having insufficient sample sizes dooms the evaluation survey even before you leave the office.
You need to make these calculations on the basis of conservative assumptions regarding:
• Effect size
• Intra-cluster correlations
The art
The “Hard Arts”
• Questionnaire design and length; pilot testing
– Post-research obligations
The Hard Arts
Questionnaire Design
Getting this right is critical to the success of your project.
The best way to design the questionnaire is to ‘work backwards’. That is, start by thinking about what your report will look like:
• What are the outcomes that you want to measure?
• (In the case of non-randomized designs), what variables determine participation? What covariates would you put in the probit you estimate when doing matching?
Questionnaire Design, cont’d
• Do treatment observations actually receive the treatment? Do they receive only partial treatment? Are there problems with quality? What constraints/problems did they face in accessing the intervention?
– Helps explain why you might not find significant impact
– Allows you to set up a “treatment on the treated” model as an alternative to your “intent to treat”
– Operational details are of considerable interest to program managers
• What are the characteristics of your sample? Are there particular sub-groups that you want to identify?
Questionnaire Design, cont’d
Practical Considerations:
• Develop a logical sequence of questions – think of this as a conversation rather than an interview
• Start with easy, gentle questions of the “tell me about yourself” type
• Consider which questions should be pre-coded and which should be open-ended. For example, a question on marital status could be
– Precoded: 1 if single; 2 if married; 3 if widowed; 4 if
Questionnaire Design, cont’d
days? How many days per week did you work in the last seven days? How much are you paid for this wage work, allowing respondent to answer in terms of hourly, daily or weekly wage
Questionnaire Design: Pilot Testing
Before starting your survey, you need to make sure that your questionnaire works – this is called pilot testing.
You should try the questionnaire on 10-20 respondents, who represent a variety of respondent ‘types’.
Pilot testing should reveal the following:
a) Are definitions used in the questionnaire appropriate? This applies both to definitions of units of observation (does the definition of a household correspond with the definition used by the people being studied?) and to particular questions (e.g. "holdings"; "assets"; "income")
b) Do respondents understand the questions?
c) Do they know the answers?
Pilot Testing, cont’d
d) Are questions being asked that cause respondents unease or that they refuse to answer?
e) Are there problems associated with translating particular concepts?
f) Is the layout and sequencing of questions sensible?
g) Can greater use be made of pre-coding?
h) Should there be more space for open-ended questions?
Pilot Testing, cont’d
After pilot testing, review results with enumerators.
Depending on results, re-do with a smaller number of respondents (5-10).
Questionnaire design: A final word
When doing an evaluation survey with a longitudinal design, it is extremely important that at the time of the baseline, you obtain information on how to contact respondents in the future:
• GPS coordinates are ideal
• Cellphone numbers
• Information on people who could aid in a follow up contact
Not only is this information helpful, it can also be used as
additional regressors in attrition probits
Using PDAs or handheld computers in surveys
There is increasing interest in using Personal Digital Assistants (PDAs) such as PalmPilots or handheld computers for data collection.
Using these requires:
• Purchasing the hardware ($200-$500 each)
• Purchasing the software, such as Pendragon Forms or PC Pocket Creations
• Someone to write the data entry program using this software
Handhelds: Advantages
Speed: Data are available for use immediately
Filters and skips can ensure that unnecessary questions
are not asked
Eliminates many costs associated with data entry
Handhelds: Non-Issues
It's harder to train enumerators
• Not in our experience.
Battery life is a problem
• Buy extra batteries
Transferring data to computers is hard
• It is surprisingly easy
Handhelds: Disadvantages
Need access to reliable source of electricity
Sometimes hard to see in bright daylight
In our experience, data entry errors are more frequent
when using PDAs compared to paper questionnaires and back office data entry, although these may converge with further experience
In our experience, surveys take slightly longer when using
handhelds for numeric information and much longer for
Handhelds: Disadvantages
If a response does not seem quite right, it is relatively straightforward to take the questionnaire back into the field. Note too that checking one response often leads to revisions to other responses.
If a PDA is lost, then all data are lost. In some cases (such as Pocket PCs), this is also true if the battery dies.
Safety/security of enumerators – carrying an expensive
electronic device can make them a target for criminals
Data Entry, Cleaning, and Management
Researchers typically spend a lot of time thinking about critical issues at the outset of their project (sample size calculations, questionnaires) and at the end of the project (data analysis, report writing).
They typically spend less time worrying about the intermediate phase: data entry, cleaning and management.
This is a mistake. Many evaluation studies fall apart, or fall badly behind schedule, because insufficient attention is paid to data management.
Data Management, cont’d
Avoiding these problems requires:
• Paying attention to the design and implementation of the
data entry software early
• Developing protocols for data management
Data Management, cont’d
Software: good choices include:
• CS PRO (www.cspro.org)
• Microsoft Access (often bundled with Microsoft Office)
• SPSS/Data Entry Module (but this can be expensive)
Start work on data entry programs as soon as questionnaire design (or designs of certain modules) is finalized.
They should be finished before data collection begins.
Data Management, cont’d
Example of data management protocol:
1. Enumerator checks questionnaire for completeness before giving it to supervisor
2. Supervisor does quick check to make sure form is complete and that critical topics are correct. She passes form to:
3. Data checkers/verifiers who go through form in detail. Forms with errors are sent back to supervisors/enumerators for correction. Otherwise, they are given to data entry team.
4. Data are entered:
– Question as to whether to use single or double data entry
– Compromise: enter 10-20% of data twice to do quality check on entry
5. Violations of range/value are sent back to field for checking
6. Additional checks are made when variable aggregates are constructed
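Steps 4 and 5 of the protocol lend themselves to simple automated checks. The Python sketch below is one possible shape for them; the field names, allowed ranges and records are all hypothetical, and real data entry packages provide their own versions of these checks.

```python
def range_violations(records, limits):
    """Return (record id, field, value) for each value missing or outside its allowed range."""
    problems = []
    for rec in records:
        for field, (low, high) in limits.items():
            value = rec.get(field)
            if value is None or not (low <= value <= high):
                problems.append((rec["id"], field, value))
    return problems

def double_entry_mismatches(first_pass, second_pass):
    """Compare two independent keyings of the same forms, matched on record id."""
    second_by_id = {rec["id"]: rec for rec in second_pass}
    mismatches = []
    for rec in first_pass:
        other = second_by_id.get(rec["id"])
        if other is None:
            continue  # this form was not part of the re-entered subsample
        for field, value in rec.items():
            if other.get(field) != value:
                mismatches.append((rec["id"], field, value, other.get(field)))
    return mismatches

# Hypothetical first keying of three forms and a re-keying of one of them.
first = [
    {"id": 1, "age": 34, "hh_size": 5},
    {"id": 2, "age": 210, "hh_size": 4},
    {"id": 3, "age": 27, "hh_size": 6},
]
second = [{"id": 2, "age": 21, "hh_size": 4}]
limits = {"age": (0, 110), "hh_size": (1, 30)}

violations = range_violations(first, limits)
mismatches = double_entry_mismatches(first, second)
print(violations, mismatches)
```

Here both checks flag the same form: the age of 210 is out of range, and the second keying suggests it was a typing slip for 21.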
Data Management, cont’d
Finally, two major considerations:
questionnaires – these cannot be lost!
Ensure data are backed up regularly.
• Minimum once a week
• This means that you have more than one copy held in
different locations
Budgeting for quantitative surveys has three components:
• What do you need?
• How many _ do you need?
• How much do these cost? This is country specific
Budgets: What do you need?
Budget: How many
“How many” depends on:
• The size of the sample you are collecting
• Whether it is concentrated in a few places or widely dispersed
• How quickly you want to complete the survey
Example:
• My survey is working in 40 spatially dispersed locations
• I am interviewing 30 households in each location
• Each interview lasts approximately two hours and so I assume that an interviewer can complete two interviews per day
• I assume that enumerators work five days, then have one day off
Budget: How many
If I hire three enumerators, they will complete the survey in one locality in five days
• Each day, 6 interviews are completed (3 enumerators x 2 interviews per day)
• In five days, 30 interviews are completed (6 per day x 5 days)
• One day to move to new site, one day off
• Repeat three times
This implies that one team of three enumerators will cover four localities in one month.
So, ten teams of enumerators (30 enumerators in total) will complete the survey over a four week period.
Suppose you allocate one supervisor to two teams. Then five supervisors are needed.
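The staffing arithmetic in this example can be wrapped in a small Python function so that the assumptions are easy to vary. The function and parameter names are hypothetical; the default of one travel day and one rest day per locality follows the example.

```python
from math import ceil

def fieldwork_plan(locations, households_per_location, interviews_per_day,
                   enumerators_per_team, teams, teams_per_supervisor,
                   travel_days=1, rest_days=1):
    """Work out survey duration and staffing from the budgeting assumptions."""
    per_team_per_day = enumerators_per_team * interviews_per_day
    interview_days = ceil(households_per_location / per_team_per_day)
    days_per_location = interview_days + travel_days + rest_days
    locations_per_team = ceil(locations / teams)
    return {
        "days_per_location": days_per_location,
        "locations_per_team": locations_per_team,
        "total_days": locations_per_team * days_per_location,
        "enumerators": teams * enumerators_per_team,
        "supervisors": ceil(teams / teams_per_supervisor),
    }

plan = fieldwork_plan(locations=40, households_per_location=30,
                      interviews_per_day=2, enumerators_per_team=3,
                      teams=10, teams_per_supervisor=2)
print(plan)
```

With the numbers above, each locality takes seven days, each team covers four localities in 28 days, and the survey needs 30 enumerators and 5 supervisors. Halving the number of teams doubles the duration, which is the trade-off between staffing cost and speed.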
The Soft Arts
Selecting enumerators
Selecting supervisors
Interacting with respondents
• Preparing the ground
• Interview setting
• Interacting with respondents, including:
– Informed consent
– Payments
– Post-research obligations
The job description for the ideal enumerator would include:
• Communication skills
• Good knowledge of English (or French, or Portuguese or Spanish) as well as the local language(s)
• A perceptive intelligence
• Inexhaustible patience
• Unfailing dependability
• Wonderful people skills
• Willingness to work long hours
• An ability to get along with all elements of the local population
Amazingly, such people do exist.