MARIAN POWER
A QUICK GUIDE TO
HUMAN RESOURCE TESTING
Marian Power qualified as a psychologist in 1973, and has been registered since 1987. She has
worked as an organisational psychologist, equal opportunity manager and human resources
manager in federal, state and local government. Her roles encompassed recruitment and
selection, conflict resolution and management of grievances. Marian is currently employed as
a consultant psychologist with the Australian Council for Educational Research, providing advice
to psychologists and human resource professionals regarding the selection of the most
appropriate assessment instruments for their particular purpose. She also provides
accreditation training in the use of restricted tests.
Marian is an active member of the Australian Psychological Society, the College of
Organisational Psychologists, the Australian Association of Psychological Type, the Australian
Human Resources Institute and the Australian Association of Careers Counsellors.
A QUICK GUIDE TO HUMAN RESOURCE TESTING
Can you afford to make the wrong recruitment selection decision?
The cost of a wrong selection decision can be up to one-and-a-half times the salary of the job,
let alone the time taken in the re-hiring process.
Studies have shown that appropriate assessment tools enhance the chances of making a
good selection and recruitment decision. Testing is also important to the human resources
(HR) practitioner in a variety of other contexts, including team building, change management
and ongoing organisational needs.
A Quick Guide to Human Resource Testing is designed as an introduction, a refresher and a
quick reference guide for HR practitioners who use or plan to use assessment instruments in
any context. It includes explanations, tips, case studies and suggestions to help you get the
most out of your HR testing.
ISBN 0-86431-458-2
ACER Press
First published 2004
by ACER Press
Australian Council for Educational Research Ltd
19 Prospect Hill Road, Camberwell, Victoria 3124
Copyright © 2004 Marian Power
All rights reserved. Except under the conditions
described in the Copyright Act 1968 of
Australia and subsequent amendments, no part
of this publication may be reproduced,
stored in a retrieval system or transmitted in any
form or by any means, electronic, mechanical,
photocopying, recording or otherwise, without
the written permission of the publishers.
Edited by Ronél Redman
Cover and text design by Mason Design
Printed by bpa DIGITAL
National Library of Australia Cataloguing-in-Publication data:
Power, Marian
A quick guide to human resource testing
ISBN 0 86431 458 2
1. Employee selection - Australia - Handbooks, manuals, etc.
2. Employment tests - Handbooks, manuals, etc.
3. Employees - Recruiting - Australia - Handbooks, manuals, etc.
4. Employee selection - Law and legislation - Australia.
I. Title.
658.3110994
Visit our website: www.acerpress.com.au
In writing this guide, I would like to acknowledge the support I received from my husband, Adrian; and Dominic, Stephen and Caithlin, who freed computer time for me and provided encouragement. Ralph Saubern, Test Publisher at ACER Press, offered frequent support and constructive advice. My colleagues in Organisational Psychology and Human Resource Management have shared their experience over twenty-five years, which has been an invaluable addition to my formal learning. Thank you all.
Section I: Test Selection and Administration
Section II: Test Interpretation
Normal Population
Mean
Range
Standard Deviation
Changing a Raw Score to a Standardised Score
Percentile Ranks
Stens and Stanines
T-scores
Statistics 3: Reliability, Validity and
Reliability
Validity
Margins of Error
Section III: Reporting and Feedback
Report Types
Using Reports for Feedback
Feedback to Candidates
Feedback to Managers
Section IV: Ethical and Legal Issues
Equal Employment Opportunity (EEO) Legislation
Discrimination in Testing
Direct and Indirect Discrimination
Ability, Aptitude and Skills
Personality
Interest Inventories
An ounce of prevention is worth a ton of cure.
Those who are put in charge of recruiting and selecting new staff face a decision-making process that needs to be responsible in its implementation and produce a positive outcome. The cost of a wrong selection decision has been estimated to be anywhere between one and a half times and five times the salary of the job in question. Think of the advertising costs, time spent reading through and short-listing applications, interviewing, testing, sourcing referees and notifying unsuccessful candidates, for a start. Then the successful applicant commences but is not working at full capacity for a number of months, with existing staff taking time out to help train the new recruit. After all that effort, what if the person selected turns out to be unsuitable for the position? The recruitment process starts all over again.
Therefore, getting the right candidate in the first place is well worth the effort!
It is important to investigate the most efficient and effective ways of conducting a selection process to maximise the chances of a positive outcome. Testing is an important part of this process. Studies have shown that the chances of making a good decision in recruiting or selecting staff are enhanced when structured interviews are combined with objective comments from referees as well as appropriately chosen assessment tools.
Testing is also important in a variety of other contexts, including team building, change management and other ongoing organisational needs. Good practice and appropriate use of tests are as vital to these areas as they are in the selection and recruitment of staff.
This book is designed as an introduction, a refresher and a quick reference guide for human resource practitioners who use, or plan to use, assessment instruments in any context. I hope the explanations, tips, case studies and suggestions help you form a solid base for sound testing practice, and encourage you to read further about how testing can help you provide the best possible human resource services.
In the Appendices there is a section on frequently asked questions, a glossary of terms that will clarify any technical jargon, suggestions for which test to use in particular selection contexts, and an annotated list of tests together with icons that indicate appropriate usage areas.
Marian Power
When to use tests
In human resource management, many decisions are made based on information that is gathered and presented – for example, information about strategy planning, leave arrangements, observance of occupational health and safety regulations. When dealing with important decision making regarding people and the workplace, information should be gathered from all possible reliable sources.
HR practitioners use testing to collect reliable, objective information in order to optimise the decision-making in a range of situations. Table 1 details some of these situations, which are further discussed below.

Table 1: Using Testing in Human Resource Management
Recruitment/Selection                  Organisational Development   Career Planning
Screening                              Staff development            Career choice
Recruitment                            Organisation development     Career change
Order of merit for ongoing selection   Promotion                    Redundancy support
                                       Team building                Succession planning
                                       Change management
Recruitment/Selection
When a large number of applications is received for
an advertised vacancy, it is usual to eliminate, in a
first round based on each applicant’s letter and
résumé, those candidates who fail to
demonstrate that they adequately meet the
selection criteria. When there is still an abundance
of possible contenders, a screening test may be
used. Typically, such a test is of reasonably short
duration, may be administered to a large group
and assesses a common skill or ability required for
the position. The examiner, after consulting the
associated test manual, may decide to keep in the recruitment pool only those candidates who score above a predetermined cut-off score. Only these candidates proceed to the next stage.
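The screening step described above can be sketched in a few lines of Python. This is only an illustration: the `Candidate` class, the applicant names, the scores and the cut-off of 25 are all hypothetical, and in practice the cut-off would be set with reference to the test manual.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    screening_score: int  # raw score on the screening test


def screen(candidates, cut_off):
    """Keep only the candidates who score above the cut-off."""
    return [c for c in candidates if c.screening_score > cut_off]


# Hypothetical applicants and cut-off, for illustration only.
applicants = [
    Candidate("Applicant A", 31),
    Candidate("Applicant B", 24),
    Candidate("Applicant C", 27),
]
shortlist = screen(applicants, cut_off=25)
print([c.name for c in shortlist])
```

The comparison is strict (`>`), matching the wording "score above a predetermined cut-off score"; a policy that also admits candidates exactly on the cut-off would use `>=` instead.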
The next stage involves interviewing and more specific testing – often two or more additional tests.
If only a small number of applicants applied for the position, they should proceed directly to this stage. Depending on individual preferences of the HR practitioner and the selection committee, testing can occur before the interviews, with only the best
performing candidates then being invited to an
interview. Alternatively, all candidates can be
interviewed and only the best performers tested.
The tests used in this phase are chosen because
they assess abilities, skills and attitudes that are
clearly related to the selection criteria for the job in
question. Verbal reasoning, numerical reasoning,
work style preference, manual dexterity, spatial
reasoning, and personality measures are examples
of tests that are commonly used. (See page 4 for
more detail.)
From these test results, organisations that have
almost continuous recruitment needs can use
candidates’ performance to compile an order of
merit for further reference. This allows for
candidates who achieve scores above a certain
predetermined level to be invited to participate
further in the selection process as their position on
this ‘ladder’ is reached.
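As a rough sketch of how such an order of merit might be compiled, the Python function below ranks candidates whose scores are above a predetermined level, best first. The function name, the candidate labels, the scores and the minimum level are hypothetical, and breaking ties alphabetically is an assumption made only so that the ordering is reproducible.

```python
def order_of_merit(scores, minimum):
    """Rank candidates scoring above `minimum`, highest score first.

    `scores` maps a candidate identifier to that candidate's test score.
    """
    eligible = {name: s for name, s in scores.items() if s > minimum}
    # Sort by score, descending; ties are broken alphabetically
    # (an assumption, not a rule taken from any test manual).
    return sorted(eligible, key=lambda name: (-eligible[name], name))


# Hypothetical scores, for illustration only.
ladder = order_of_merit({"A": 40, "B": 55, "C": 47, "D": 55}, minimum=45)
print(ladder)
```

Candidates would then be invited to the next stage in the order in which they appear on the list.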
Organisational Development
When discrepancies are noted between the skills
or attitudes of current employees and the
requirements of the organisation, a range of testing
protocols is available to assist the HR practitioner
to implement change. These are largely chosen from
the same pool of assessments available for initial
recruitment purposes, and should relate to specific
needs as they arise. For example, if an employer is
concerned that staff are experiencing difficulty with
new financial reporting requirements, numerical
reasoning tests may be given to identify those
staff members who would benefit from training to
address any gaps in their skills. This process needs
careful handling to ensure that employees see it
as constructive.
In addition to assessing individual performance,
personality or work style assessments are often
used with great success to identify deficiencies
in team performance, instigate more constructive work style interactions and develop appropriate team-building activities.
The change management process is often more successfully implemented when the results of assessments can be used to help address staff members’ individual needs, communication styles and attitudes to change.
Career Planning
There is a range of assessment tools available that will assist in identifying employees’ career interests and help redirect others who are facing redundancy. Many of these tools are the same as those used in initial career guidance for school leavers – interest inventories, measures of values, card sorts, etc. – whereas others specifically consider any blockages that someone may be experiencing in making a career change. Outcomes enable the career counsellor to work more effectively with the client in formulating constructive future plans.
Personality inventories are also helpful in this scenario – they identify those aspects of personal style with which an individual is comfortable as well as those that may cause them distress.
Exploring career paths that accommodate these preferences is a positive outcome of the process.
In the area of succession planning, tests that assess abilities and skills required in jobs at a higher level in the organisation are popular in assisting managers to plan for advancement of their staff.
Sensitivity is required in the management of this process so that employees see it as constructive.
Types of tests
Tests can be classified in a number of ways. One option would be to classify them according to what they are assessing. For example, tests may be assessing optimum performance (as in ability or aptitude tests) or practical knowledge (as in achievement tests). Alternatively, they may be assessing emotional responses to gain a picture of typical response patterns or to identify a person’s preferences, likes and dislikes.
The following are some of the major categories of HR tests.
Ability and Aptitude Tests
Ability tests involve questions that require complex
sets of mental processes and are designed to test a
candidate’s natural ability in a particular area. Often
ability tests explore relationships between two or
more words, numbers or pictures and ask the
candidate to extend a pattern or make an assertion
based on an understanding of the relationship.
Aptitude tests are similar to ability tests, but are
designed to give an indication of a candidate’s likely
successful future performance on the attribute that
is being assessed.
In selecting personnel, ability and aptitude tests are widely used as good general indicators of
someone’s potential to perform the duties of the job
to a satisfactory standard, and to demonstrate an
ability to apply knowledge gained in new situations.
Non-verbal tests are also often used when selecting
staff for positions that demand skills not directly
related to formal education outcomes.
Example 1 on page 5 is an example from a numerical ability test.
Ability and Aptitude Test Types
The main kinds of ability and aptitude tests are
listed in Table 2 on page 5, along with the common
selection criteria relevant to each test type.
Achievement Tests
Achievement tests are designed to measure what the individual has learned in the past. Many educational tests are designed as achievement tests.
Employers can also use achievement tests for promotional activities within their organisation. For example, insurance assessors may have to demonstrate that they have learned risk categories and appropriate application of policy levels before being considered eligible for their next promotion. Example 2 on page 5 is from an educational achievement test.
Personality Assessments
Personality assessments are designed to provide information about the way a person typically behaves in certain situations, their preferences and personal styles, and how they see themselves and others, but care must be taken in their use. It would be difficult, for example, to argue that only one personality type may successfully fulfil the requirements of a particular job. There may be jobs where particular personality profiles are more or less desirable. An applicant for the police force whose personality profile indicated an aggressive component could be considered highly unsuitable; positions involving ‘cold calling’ in sales often attract extroverted personalities.
Table 2: Ability and Aptitude Test Types and Selection Criteria
Columns: Ability/Aptitude Type, Example, Selection Criteria
Examples:
• ACER Applied Reading Test
• ACER Select – verbal
• MOST Verbal checking
• APTS – abstract
• SPM
Selection criteria:
• Attention to detail with numerical elements
• Understanding verbal content
• General ability (verbal intelligence)
• Attention to detail with written materials
• General ability (non-verbal intelligence)
• General problem-solving skills
• Conceptual and planning abilities
• Useful for measuring general ability where language may be a barrier
• Ability to understand and work with visual representations of the real world, e.g. maps, designs, plans
• Understanding basic laws of physics and mechanics and their application to the real world
Personality assessments can also help to decide the
most appropriate management style for a candidate
and the way they are most likely to contribute to an
existing team.
Because of the sensitivity involved in interpreting
personality assessment results and in providing
professional feedback, these assessments are usually
available only to psychologists or people who have
successfully completed prescribed accreditation
training. Accreditation training is usually available to
HR practitioners and other professionals on an
instrument-by-instrument basis. If a psychologist is
not available to assist, a structured interview and
referee comments may provide helpful sources of
information on a candidate’s personal style.
Example 3 is from a personality inventory.
Interest Inventories
Candidates are sometimes asked to complete
vocational interest inventories to assist in placing
them in the most appropriate job. These inventories
are most effective when the person answers honestly
to give the most accurate picture of themselves.
These assessments are very helpful in career
management programs, working with people facing
redundancy or those voluntarily changing career
direction. Example 4 is from a career interest
inventory.

Example 1 (numerical ability test)
Find the two missing numbers in the following sequence.
1 3 ■■ 7 ■■ 11

Example 3 (personality inventory)
I’d rather go to a party than read a book.
Strongly Agree 1 2 3 4 5 Strongly Disagree
Types of test questions
Test questions can be categorised by the type of response required from the candidate and by the type of content presented in the test. Both types are discussed below.
Response Type
HR questions most commonly require either
multiple-choice responses, range-type responses
(for example, ‘Strongly agree’ to ‘Strongly disagree’)
or open-ended responses.
• For multiple-choice questions, candidates are
usually required to select the best answer from two or more possible answers provided. See examples 1, 2, 3, 6 and 7 on page 7.
• Range-type responses (sometimes called ‘Likert
Scales’) are often used to indicate preference or strength of feelings. See example 8 on page 7.
• Open-ended responses require the candidate to
write an answer in a blank area. This can be a single number or word, or a longer written response. See examples 4 and 5 on page 7.
Not all test questions are actual questions. Some
‘questions’ are simply a statement
to which the candidate is asked to respond; for
example, they may be asked to respond to the
statement ‘I prefer dogs to cats’. For this reason,
test questions are often referred to as items rather
than questions.
Content Type
Most HR tests present test questions either in written format or in pictorial form. Written test questions can either use words (verbal) or numbers (numerical); pictorial test questions can use a variety of pictures, diagrams, mazes, maps and visual puzzles. Examples of the different options for test content are listed below.
Numerical (or Quantitative)
Numerical test questions require the use of numbers and numerical symbols and concepts. They can be used in a variety of test types, including tests of ability, achievement and aptitude. Some numerical items require calculations, some require pattern recognition, while others require the candidate to check for errors.
Examples 4 and 5 on page 7 are numerical items.
Verbal (or Linguistic)
Verbal test questions are based on words and textual information. They can be used in a variety of test types, including tests of ability, achievement, aptitude and personality. Some verbal items require word knowledge or logical reasoning, while others require reading comprehension ability. Others are simply written statements to which the candidate is asked to respond by indicating their level of agreement or disagreement.
Examples 3, 6 and 8 on page 7 are verbal items.
Non-verbal or abstract test questions usually
contain a series of shapes as the stem or base of the
question. The candidate needs to select another
shape or pattern from a selection of possible
answers in order to continue or complete the series.
Example 2 opposite is a non-verbal item.
Spatial-visual and Mechanical
Spatial-visual and mechanical test questions
require a candidate to look at a visual
representation of a physical object, such as a piece
of equipment, a shape or a geographical map, and
follow some instructions that involve manipulating
the object through space. Some examples are given
below:
• A candidate is shown a complex shape and asked
what it would look like if it were rotated 180
degrees and flipped over.
• A candidate is shown a piece of simple
machinery and asked what direction one part
would move if another, connected, part were
moved down.
• A candidate is given a map with a bird’s eye view
of a landscape and asked questions about what
the landscape would look like from the ground
looking north.
Example 1 opposite is a spatial-visual item.
Example 7 is a mechanical item.
Test formats
Tests vary in the way they are presented, although the basic elements of questions, answers, scoring and interpretation/reporting are always present.
The test manual is the primary source of information about the test and often contains administration guides, score keys, and tables for interpreting data.
Questions
Questions are usually presented in a test, item or
question booklet. Sometimes these test booklets
also include space for candidates to record their
answers; sometimes the answers are recorded on a
separate answer sheet. When answers are recorded
separately, it means the test booklets can be reused
by the next candidate. The administrator should
ensure that such reusable test booklets have not
been marked or damaged in any way and that all
are collected at the end of each test session.
Often questions are presented in a
multiple-choice format. In this format there is usually a stem
or base question and a number of alternative
answers. The candidate is instructed to choose the
best answer from the alternatives offered.
Sometimes the candidate needs to select two
answers, which together form the correct response.
The answers that are incorrect are known as
‘distractors’. Sometimes distractors provide a correct
answer in one sense, but not the best answer.
Answers
There are many different kinds of answer sheets. Sometimes the questions and answers are recorded
in the same booklet; sometimes the answers have to
be recorded on a separate answer sheet.
If candidates are asked to record their answers
on a separate sheet, they need to locate the correct place to record a response (usually by matching the question number in the test booklet with a number
on the answer sheet). It will also be necessary for the candidate to record their own details on the answer sheet.
Often answer sheets are in the form of OMRs
(optical mark recognition) – that is, sheets that are designed so that they can be read by a computer scanner. This is particularly useful for large-scale screening programs. OMR sheets can also be scored
by hand if computer scoring is not required. OMR answer sheets need to be marked carefully by candidates so that the computer scanner can read the responses correctly. Detailed instructions on how to mark the answer sheets are always included.
Another type of answer sheet is the carbonised sheet. Once the candidate has finished recording their answers, the examiner removes the top layer and the carbon copy of the answers is designed in such a way as to facilitate simple and immediate hand scoring.
Score Keys
A score key provides the examiner with information
needed to score a candidate’s responses.
Sometimes the examiner needs to count the
number of correct and incorrect answers to obtain
a raw score, in which case a list of correct answers
will be provided. Sometimes each answer is given a
different value and the values need to be added to
obtain the raw score.
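The two scoring approaches just described can be illustrated with a short Python sketch. The function names, item numbers, answer options and values below are hypothetical; a real score key comes from the test manual or the publisher's scoring materials.

```python
def raw_score_from_key(responses, answer_key):
    """Count the responses that match the list of correct answers."""
    return sum(
        1
        for item, correct in answer_key.items()
        if responses.get(item) == correct  # unanswered items score nothing
    )


def raw_score_from_values(responses, value_key):
    """Add up the value assigned to each answer the candidate chose."""
    return sum(
        value_key[item][choice]
        for item, choice in responses.items()
        if item in value_key and choice in value_key[item]
    )


# Hypothetical score keys and responses, for illustration only.
answer_key = {1: "B", 2: "D", 3: "A"}
ability_responses = {1: "B", 2: "C", 3: "A"}
print(raw_score_from_key(ability_responses, answer_key))

value_key = {
    1: {"agree": 2, "neutral": 1, "disagree": 0},
    2: {"agree": 0, "neutral": 1, "disagree": 2},
}
style_responses = {1: "agree", 2: "neutral"}
print(raw_score_from_values(style_responses, value_key))
```

Either function yields only a raw score, which still has to be interpreted against the tables in the manual.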
Score keys come in a variety of formats:
• in the test manual
• as a separate card or clear plastic overlay
• on the carbonised section of the answer sheet
More frequently nowadays, score keys are part of a
software system into which the examiner transfers
the candidate’s responses. These systems
automatically score the responses and provide a
report on the candidate’s results.
Interpretation/Reporting
Once the test has been scored, the raw score needs
to be transformed into a standardised score (see
page 18). This is done using tables provided in the
manual, or by the computerised scoring systems.
In the case of ability tests, the standardised score
is usually all that is reported on the candidate. For
personality and work style tests, more complex
reports are often prepared by the examiner or
generated by computer software. These reports will
provide a variety of information and comments to
aid the examiner’s interpretation of the candidate’s
personal style. Figure 1 opposite is an example of a
computer-generated report.
Computerised Testing
Increasingly, tests are available for delivery via the Internet or otherwise on-screen on a computer. The candidate sits at the computer and accesses the test material by using a unique password that has been allotted to that candidate by the potential employer
at an earlier stage of the selection process.
The software system for computerised testing usually includes scoring, interpretation and reporting.
The Manual
The Test Manual (also called a User’s Guide) contains information on the development of the test, its purpose, the target audience, precise administration instructions, conversion of raw scores to standard scores, and sometimes case studies and other information to assist interpretation.
The group(s) of candidates used to obtain a standard score comparison is also described so that the examiner may select the most appropriate comparison sector. See page 18 for more information about standardised scores.
Personality Interpretive Report: Jon Sample (continued…), 29 February, 2004
Anxiety
According to his responses, Jon Sample is no more or less anxious than most people. He has a tendency to trust people and therefore may not be as vigilant as others in examining people’s motives.
[Chart: the factor ‘Anxiety – general’ plotted on a sten scale from 1 to 10; scale labels: Stable, Trusting, Assured, Tense]
Figure 1: Example of a Computer-generated Report
Choosing the most appropriate test – the job
There are two main job-related issues that you need to consider in choosing the right test or tests for your
selection exercise. Firstly, you should consider the selection criteria for the job and secondly, you should consider the general level of the job.
Selection Criteria
It is critical that any test can be clearly demonstrated
to relate to one or more of the selection criteria.
This will ensure that the HR practitioner receives
more relevant information from the test session
and that the candidate appreciates the relevance of
that test session to the final hiring decision. As well,
there are important ethical and equal employment
opportunity (EEO) considerations, which require
the tests to relate directly to the selection criteria of
a job. (See page 29 for a list of example selection
criteria and appropriate tests.)
Usually, tests will be chosen from one or more of the broad groupings below in order to accommodate
the needs of the particular selection exercise.
Verbal vs Numerical
If a job requires good verbal and written
communication but no real involvement with
numerical work, then tests dealing with reading
comprehension or verbal reasoning may be chosen.
On the other hand, many jobs involve regular work
with numerical calculations but little or no verbal
or written communication. In this case it would be
appropriate to use a numerical assessment and not
a verbal one.
Technical
In technical fields, skills such as spatial-visual or
mechanical reasoning are often relevant. There are
assessment tools that address these areas.
Decision Making and Problem Solving
Many jobs require different degrees of decision-making and problem-solving skills. A range of instruments is designed specifically to address these areas. As well, tests of abstract or non-verbal reasoning are considered excellent measures of problem-solving and conceptual-thinking abilities.
Personality and Interest
Personality attributes are another consideration in making a good selection decision. For example, does the candidate need to be able to work effectively in a team environment? Are strong interpersonal skills critical to successful performance in the job? Is a person who is very change-oriented required? These and many other characteristics may be assessed using a number of tools including personality assessments and interest inventories.
The table in Appendix III lists some common selection criteria and examples of the types of tests that might be used.
Job Level
Some basic and entry-level jobs require no formal
education or training, although they may require
on-the-job training. Factory assembly line or
shop-floor roles are in this category. Tests at a lower level
that assess general reading ability, numerical
checking, and speed and accuracy are most
relevant here. Scenario 1 below illustrates a typical
example.
Another class of job involves making decisions
based on the understanding of written and/or
verbal communication. For example, a customer
service officer is required to listen to concerns and
make decisions for future action based on his/her
interpretation of that information. A test that
assesses verbal reasoning at a medium difficulty
level will provide useful data in this selection
decision. Similarly, a team leader in this
environment is required to be comfortable
Scenario 2 illustrates this.
Management roles (or graduate recruitment for positions that progress through to management level) typically demand that the successful occupant demonstrates sound conceptual and planning skills. Tests of abstract reasoning are excellent tools for assessing candidates’ abilities in this area. When combined with high-level tests of verbal and numerical reasoning, they provide a strong base for collecting relevant data on the capability and capacity of job applicants. Scenario 3
is a typical case.
Scenario 1
A warehouse needs to recruit staff to work in despatch of orders. An ability to read and understand fairly routine messages and accuracy in marking orders against picking tickets are required. Candidates are unlikely to have sat for any tests since leaving school, and less than 1 hour is available for this part of the selection process.
Test selection
An applied reading test is recommended for the verbal comprehension component of the job. This test is used in technical trade environments where there is a need to read and understand a limited range of materials, such as union and OH&S notices and company requirements. A numerical checking test, used to determine speed and accuracy when reading numbers, is recommended for the other major requirement of the job.
These tests will take around 45 minutes in total to administer.

Scenario 2
A team leader in a customer service environment is being recruited. Sound communication skills, together with the ability to monitor sales figures and report concerns to management, are required. A desire to help customers is essential.
Test selection
A verbal reasoning test at a medium difficulty level, such as APTS, addresses the first criterion. A numerical awareness test, which assesses the ability to do calculations and detect discrepancies, will measure that component of the job. A profile, such as the Work Aspect Preference Scale, provides information that would assist with the assessment of personal qualities.
Testing time will be around 1 hour.

Scenario 3
A travel agency is recruiting a manager for a busy suburban location. Sound skills in written and verbal communication, together with the ability to manage budgets and develop marketing strategies, are required.
Test selection
Verbal and numerical reasoning tests appropriate for a junior middle-management level are recommended for the first two criteria listed. An abstract reasoning test will be a suitable means of assessing a candidate’s conceptual and planning abilities required for developing new strategies.
Tests will take less than 2 hours to complete.
Tests vary considerably in their administration
time. The test publisher or distributor typically
provides brief descriptions of their assessment
tools in their catalogues, which include information
on the purpose, cost and time of administering
each test. Obviously, if you have just one hour
available for testing, you cannot choose a test that
takes 50 minutes but addresses only one of the
three criteria you wish to assess. The numerical test
described in Scenario 2 on page 11 takes only
8 minutes to complete, whereas those for the
management role in Scenario 3 take 30 minutes
so if the presentation of reports is not a major issue, choosing a test with hand reporting may save money
There are many practical matters that influence the selection of appropriate test instruments. The main issues include:
• available time
• the budget for purchasing test instruments
• hand scoring vs computer scoring
• qualifications required for purchasing and using test instruments
Choosing the most appropriate test – the practicalities
Hand Scoring vs. Computer Scoring
Some tests have separate answer sheets that can be
scanned and scored directly by computer (as well as
by hand). These answer sheets are called OMRs (see
also page 8). If the testing program involves a large
number of candidates – for example, a recruitment
screening exercise – it may be appropriate to have
the answer sheets computer scored. This is a service
offered by the test publisher or distributor. Other
tests that do not have OMR answer sheets can also
be scored by the test publisher or distributor on a
fee-for-service basis.
If the testing program involves a small number
of candidates, hand scoring is usually faster and
In most cases, qualified psychologists may use any published test instrument, although a few products require specialised training even for psychologists.
Table 3: Test Levels and Administrators
Test Level Typical Tests Test Administrators
High
Medium
Low
• Personality instruments
• Individual psychological tests
• Individual intelligence tests
→ The test administration process: structure & rationale
→ Test administration practice
→ Scoring tests
→ The tasks of the test administrator
→ Basic ethics in testing
Outcomes for participants will be:
→ A thorough grounding in the principles of test administration and scoring
→ The opportunity to learn and practise the administration and scoring of ability and personality tests
→ The skills to assist qualified test users in administering and scoring tests, so as to free their time for interpretation and decision making
Let us consider the optimal environment
for conducting a testing session. The room should
be large enough to accommodate comfortably
the anticipated number of candidates. Give
consideration to the ventilation, lighting and
expected external noise levels. A candidate will not
give their best performance if the room is too hot or
cold, too crowded or noisy, or if it is too hard to hear
your instructions or see you or your assistant.
If you expect more than fifteen candidates, you
need an assistant to help with distributing and
collecting materials. Add another assistant for each
fifteen to twenty candidates beyond that number.
The logistics of managing an exam session
professionally are well and truly challenged when
you are trying to deal with an inappropriately
supervised group.
Getting Started
A general introductory chat is an important first step. It serves the dual purpose of providing useful general information as well as giving nervous candidates the chance to settle for a few moments before commencing the actual test(s). This does not need to be a lengthy exercise. A few facts are useful – for example, checking that the selection test is in fact the one the candidates are expecting to do. There have been occasions when perplexed candidates have attempted a test for a position they later found to be not of their choosing – their session was in another room, on another floor or at another time of the day! Applicants are often understandably nervous, so clarifying information to prevent such scenarios is important – and it means that none of you has wasted valuable time and energy.
It is also useful to remind candidates where the test session fits in the selection exercise. Is this an initial screening? Who will have access to the results? Where will the results be stored? When and where may candidates request feedback on their results? Candidates are well aware of privacy legislation and their rights under this law.
Remember to let them know how long the test session should take, whether there will be any breaks, where toilets are located, etc.
To gain the most useful information from the test session while being completely fair to all candidates, you need to follow strictly the guidelines set down for the administration of the tests you are using. The Test Manual or User’s Guide, which is an essential companion to any psychometric test, will contain a section detailing instructions for the proper administration of the test you are using.
The following are important aspects to consider in setting up a test session.
Test administration – best practice
It is wise to check if everyone can see and hear
you. Does anyone need reading glasses or need to
move to the front of the room because of a hearing
impairment? Is anyone feeling unwell? Again, the
aim is to tap into their best performance, so it may
occasionally be necessary to reschedule a
candidate.
Mobile phones are a distraction. Remind all
candidates that all phones need to be switched off
as a courtesy to everyone.
Using Aids
Many tests do not permit the use of calculators or
such aids during the test session. Again, candidates
should be reminded of this.
The Manual
Ensure that you deliver the formal instructions
exactly as they are printed.
When giving any test from the ability/aptitude
range, it is essential that the administration
instructions in the manual be followed scrupulously.
A general description will be provided, followed by
a script for introducing the test, giving practice
examples and working through those answers,
starting candidates on the test itself, and finishing
strictly after the allotted time period.
While these rules may sound pedantic, the
whole purpose of a standardised test session is that
candidates’ performance may reasonably be
compared across different test venues, different
administrators and different times. If, for example,
one supervisor is casual about the time frame,
applicants in that session may have an extra one or
two minutes on a 10-minute test. Arguably, this
would give them an unfair advantage over
candidates who are tested according to the
instructions.
There are, of course, some ability tests that are
untimed, and the timing rule is irrelevant in those
situations. However, the general administration
instructions and practice items must still be
followed exactly.
Personality tests usually have no time limit.
Administrators are advised to suggest that candidates mark the first option that comes to mind and not to spend too long on any one item.
For practical reasons, it is sensible to schedule untimed tests at the end of an assessment session so that people may leave as soon as they have completed all components of the testing session.
Collecting Materials
It is essential that all materials are collected and accounted for before candidates leave the test room. Copyright legislation prohibits the copying of materials and, to maintain confidence that test integrity is being preserved, materials need to be counted in and counted out.
Normal Population
The natural world includes many examples of the
so-called ‘normal population’, characteristics of
which are often described by the bell-shaped curve
commonly called the normal curve (see Figure 2
opposite). In a normal population, most of the
people are closer to the average measurement of a
given attribute than to the extremes. For example,
there are many more people who are about average
height than there are extremely tall or extremely
short people. This means that if you measure some
characteristic that is normally distributed, most of
the population will ‘bunch up’ around the middle.
This produces the distinctive bell-shape of the curve.
If you went into a busy suburban street and measured the height of the first 100 adult males
who passed by (these comprise your ‘sample’
population) and plotted the frequency that each
height occurred, the resulting graph would look
something like that in Figure 2. There is a bunching
up of heights between 165cm and 185cm; far fewer
lie towards each extreme – not many men in this
sample were below 160cm or above 190cm in
height. This is really just commonsense: we all
know that while everyone is a different height, most
adult people are really about the same height, give
or take 20cm. Very few people are extremely tall or
extremely short.
The same is true of intelligence and other human characteristics that are normally distributed. Once we understand this concept of the normal population, we can begin to describe its characteristics. Three ways that we can do this are detailing where the centre of the population is (that is, the mean), what the expected range of results is, and how quickly (in terms of the unit of measurement, such as centimetres or IQ points) we deviate from the centre to the extremes.
Mean
The most commonly used description of the centre
of a population is the mean, or average score. This
is known as a measure of central tendency.
To calculate the mean, all scores gained by all candidates on a test are summed. The total is divided by the number of candidates. The resulting figure is the mean score.
This figure serves as a benchmark against which other scores may be measured; for example, ‘his score is well above the mean’, ‘she obtained the same score as the mean’, etc. This is the most frequently reported central tendency score cited in manuals for selection tests.
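The calculation of the mean can be sketched in a few lines of Python. The scores below are made-up values used purely for illustration; they do not come from any real test.

```python
# Hypothetical raw scores for six candidates (illustrative only).
scores = [52, 61, 68, 70, 75, 82]

# Sum all the scores, then divide by the number of candidates.
mean_score = sum(scores) / len(scores)

print(mean_score)
```

A candidate who scored 75 in this made-up group would be described as having scored above the mean.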
To interpret test results properly, an understanding of some of the basic terms used in statistics is required.
It is important to be aware of their relevance to help you make sense of test results and decide whether the data reported is sufficiently valid and reliable for your HR testing exercise.
In this section we will cover four basic statistical concepts:
Statistics 1 – the basics
16
Range
It is also helpful to know how far candidates’ scores
are dispersed from the centre. Probably the most
commonly used term is ‘range’, which is the
distance between the lowest score obtained by a
candidate on a test to the highest score gained. For
example, on a test of numerical reasoning, the
mean score may be 68, but the range of scores may
be from 12 to 99. You will notice that the mean is not
the ‘middle’ of the range but the average of all
scores obtained. If the mean is much higher than
the middle of the range, it means that more people
are obtaining high scores on the test than people
scoring very low.
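The difference between the mean and the middle of the range can be seen in a short Python sketch. The ten scores are invented for illustration (only the lowest and highest, 12 and 99, echo the example above), so treat the figures as an assumption rather than real test data.

```python
# Hypothetical scores on a numerical reasoning test (illustrative only).
scores = [12, 55, 64, 70, 72, 78, 81, 85, 90, 99]

lowest, highest = min(scores), max(scores)
score_range = highest - lowest           # distance from lowest to highest score
midpoint = (lowest + highest) / 2        # the 'middle' of the range
mean_score = sum(scores) / len(scores)   # the average of all scores

print(score_range, midpoint, mean_score)
```

Here the mean sits well above the midpoint of the range, because most of the made-up candidates scored towards the high end and only one scored very low.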
Standard Deviation
Once we know the mean and the range, it is useful
to understand how quickly the population moves
away from the mean towards the extremes of the
range (or the ‘spread’ of the scores). One way of
measuring the ‘spread’ of the population is the
‘standard deviation’.
The standard deviation of a normal population
is derived from a mathematical equation. In a
normal population, about 68% of the population
falls within one standard deviation higher or lower
than the mean, about 95% of the population falls
within two standard deviations of the mean, and
virtually all the population (99.7%) falls within
three standard deviations of the mean.
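The equation itself is not reproduced in this guide. As a minimal sketch, assuming the usual population formula (the square root of the average squared distance from the mean), the calculation could look like this in Python. The heights are made-up values chosen so that the arithmetic is easy to follow.

```python
import math

# Hypothetical heights in cm (illustrative only).
heights = [160, 170, 175, 180, 190]

mean_height = sum(heights) / len(heights)

# Average the squared distances from the mean, then take the square root.
variance = sum((h - mean_height) ** 2 for h in heights) / len(heights)
std_dev = math.sqrt(variance)

print(mean_height, std_dev)
```

Some statistical packages divide by one less than the number of scores when working from a sample, which gives a slightly larger figure, so it is worth checking which version a test manual reports.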
To go back to our height example, if the mean
height was 175cm and the standard deviation was
calculated to be 10cm, then 68% of the population
would fall between the heights of 165cm and
185cm, and 99.7% of the population between the
heights of 145cm and 205cm (that is, three standard
deviations either side of the mean). If this sample of
the population was considered to be representative
of the whole male population, it would mean that
only 0.3% (that is, about three people in a thousand)
would be taller than 205cm or shorter than 145cm.
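For readers who want to check these proportions, the share of a normal population within a given number of standard deviations can be computed with the error function in Python's standard `math` module. This is a sketch that assumes a perfectly normal population; the helper name `share_within` is purely illustrative, and the printed figures may differ in the last digit from values that have been rounded or read off a chart.

```python
import math

def share_within(k):
    """Share of a normal population lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

mean_height, sd = 175, 10  # values from the height example

for k in (1, 2, 3):
    low, high = mean_height - k * sd, mean_height + k * sd
    print(f"{low}cm to {high}cm: {share_within(k) * 100:.1f}%")
```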
If the standard deviation is very small, it means
that scores a small distance from the mean could be
considered extreme or unusual scores, while if the
standard deviation is very large, a score would have
to fall very far from the mean to be extreme. For example, if in our height example, the standard deviation was only 1cm, then a person 3cm taller than the mean (that is, three standard deviations from the mean) would be in the tallest 0.1% of the population. If the standard deviation was 20cm, a person would have to be 60cm taller than the mean
to be in the top 0.1% of the population.
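One way to make this comparison concrete is to express a measurement as a number of standard deviations from the mean. The short Python sketch below does this for the height example; the function name `sds_from_mean` is an illustrative choice, and the mean of 175cm is carried over from the earlier example.

```python
def sds_from_mean(value, mean, sd):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / sd

# A man 3cm taller than the mean, when the standard deviation is 1cm.
print(sds_from_mean(178, 175, 1))

# The same man, when the standard deviation is 20cm.
print(sds_from_mean(178, 175, 20))

# A man 60cm taller than the mean, when the standard deviation is 20cm.
print(sds_from_mean(235, 175, 20))
```

The first and third cases both come out at three standard deviations above the mean, which is why they are equally unusual; the second is only a small fraction of a standard deviation and so is entirely ordinary.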
We all have a good understanding of height and how tall or short someone has to be before they would be considered taller or shorter than usual.
As each test has its own way of measuring the underlying attribute or quality that is being assessed, we need statistics to fully understand the meaning of any test score compared with the whole population. Understanding the centre, the range and the spread of scores is an important first step.
17
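Figure 2: Normal Curve. Horizontal axis: height (cm) from 145 to 205, centred on the mean height (175), with standard deviations marked from –3σ to +3σ. The six bands, each one standard deviation wide, contain from left to right 2.1%, 13.6%, 34.1%, 34.1%, 13.6% and 2.1% of the population.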
Changing a Raw Score to a Standardised Score
As well as mean, range and standard deviation (see
pages 16 and 17), test manuals usually contain
‘norm tables’, which report standardised scores
such as percentile ranks, stanines, sten scores and
t-scores. Standardised scores are often also referred
to as norm scores, norm-referenced scores and
derived scores.
To obtain one of these standardised scores, the number of correct responses scored by a candidate
(the raw score) is looked up in a table in order to
locate the corresponding standardised score. This
standardised score allows us to compare each
candidate’s score with the sample population. This
helps us to understand whether a raw score of, say,
11 is a high score, an average score or a low score.
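The look-up step can be pictured as a simple table search. The Python sketch below uses an invented norm table; the raw scores and percentile ranks in it are hypothetical and are not taken from any published test manual.

```python
# Hypothetical norm table: raw score -> percentile rank (made-up values).
norm_table = {9: 20, 10: 27, 11: 35, 12: 44, 13: 54}

def to_percentile(raw_score):
    """Look up the percentile rank for a raw score in the norm table."""
    return norm_table[raw_score]

print(to_percentile(11))
```

In practice the table comes from the test manual, and a raw score that does not appear in it should be queried rather than estimated.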
Percentile Ranks
Percentile ranks are one of the most commonly
used standardised scores. The following are
examples of how this ranking works.
• Someone with a raw score that converts to a percentile rank of 50% has scored the mean or average score. They are right in the centre of the comparison population; 50% of the sample population has a higher score than they obtained and 50% has lower.
• Someone who has a percentile rank of 80% has scored higher than 80% of the sample population.
• Someone who has a percentile rank of 15% has scored higher than only 15% of the sample population.
Percentile ranks are very useful for ranking candidates in order of merit (especially with ability, aptitude or achievement tests), and for simple explanations of where a candidate’s score lies in relation to the rest of the sample population.
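As a rough sketch of where a percentile rank comes from, the Python below counts the share of a comparison group that scored lower than a given candidate. The norm group is a made-up list of ten scores, and the definition used (strictly lower scores only) is an assumption; published norm tables may treat tied scores differently.

```python
def percentile_rank(raw_score, norm_group_scores):
    """Percentage of the norm group scoring lower than raw_score."""
    below = sum(1 for s in norm_group_scores if s < raw_score)
    return 100 * below / len(norm_group_scores)

# Hypothetical norm group of ten raw scores (illustrative only).
norm_group = [4, 6, 7, 8, 9, 10, 11, 12, 14, 15]

print(percentile_rank(12, norm_group))
```

A real norm group would be far larger than ten people; the small list simply keeps the counting easy to follow.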
Stens and Stanines
Sten (‘Standard Ten’) scores and stanine (‘Standard Nine’) scores are other ways of comparing a candidate’s performance with the whole population. In both sten and stanine scores there are numbered categories that cover the whole population – stens have ten categories: 1 to 10; stanines have nine categories: 1 to 9. A score of 9 or 10 indicates a very high level relative to the reference group, while a score of 1 indicates a very low relative level.
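This guide does not say how the categories are drawn. A minimal sketch, assuming the conventional definitions (each category is half a standard deviation wide, with stanines centred on 5 and stens centred on 5.5), is shown below in Python. The input `z` is the number of standard deviations a score lies from the mean, and the function names are illustrative only.

```python
import math

def stanine(z):
    """Stanine (1 to 9) for a score z standard deviations from the mean."""
    return max(1, min(9, math.floor(2 * z + 5.5)))

def sten(z):
    """Sten (1 to 10) for a score z standard deviations from the mean."""
    return max(1, min(10, math.floor(2 * z + 6)))

for z in (-3, -0.2, 0.2, 1, 3):
    print(z, stanine(z), sten(z))
```

Note that a score slightly below the mean and a score slightly above it share the same stanine but fall into different stens, because the sten scale has no single middle category.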
What does it mean if a person obtains a score of 11 correct responses on a test? If there are 11 items, that’s pretty good! If there are 100 items, perhaps it is not so great.
In order to understand the meaning of a score of 11, the developers of test instruments provide users with a standard against which they can measure the merits of a candidate’s performance. This puts them in a much better position to make a fair comparison. The standard is based on a set benchmark that was determined by assessing a comparison group, or sample, that is selected to represent the population. This sample is often called the norm group.
Statistics 2 – test scores
Figure 3: Sample Norm Table (columns: Raw Score, Percentile, Stanine, T-Score)
Sten and stanine scores are useful when wanting
to avoid over-emphasising small, unimportant
differences between candidates.
T-scores
T-scores are another common type of standardised
score. A t-score of 50 is the mean score, or 50th
percentile. With t-scores, each 10 points indicate a
standard deviation. Therefore, a t-score of 60 is one
standard deviation above the mean, while a t-score
of 40 is one standard deviation below the mean. In
the same way, a t-score of 70 is two standard
deviations above the mean.
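The rule that 50 is the mean and every 10 points is one standard deviation can be written as a short Python sketch. The norm-group mean of 68 and standard deviation of 12 are made-up figures for illustration; in practice a t-score is read from the norm table in the test manual rather than calculated by the test user.

```python
def t_score(raw_score, norm_mean, norm_sd):
    """Convert a raw score to a t-score (mean 50, standard deviation 10)."""
    z = (raw_score - norm_mean) / norm_sd  # standard deviations from the mean
    return 50 + 10 * z

# Hypothetical norm group: mean raw score 68, standard deviation 12.
print(t_score(80, 68, 12))  # one standard deviation above the mean
print(t_score(56, 68, 12))  # one standard deviation below the mean
print(t_score(92, 68, 12))  # two standard deviations above the mean
```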
T-scores are a popular method of reporting
candidates’ scores on psychological test
instruments.
The differences between these (and other)
standardised scores are interesting but quite
technical. The important thing to remember in the
use of HR tests is that standardised scores, not raw
scores, should always be used in reporting, so that
the candidates’ scores can be properly compared to the reference population.
Remember – what does a score of 11 correct answers really mean?
Figure 3 shows an example of a norm table with
raw scores and three different standardised scores.
The HR practitioner can use whichever of these standard scores best suits the situation. Often norm tables only provide one kind of standard score.
Figure 4 is a single graph showing the relationship between the normal population, mean, standard deviation and some commonly used standardised scores.
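The link between t-scores and percentile ranks can also be approximated numerically. The Python sketch below assumes that scores are perfectly normally distributed, which real norm groups only approximate, so a published norm table should always take precedence. The function name `percentile_from_t` is illustrative.

```python
import math

def percentile_from_t(t):
    """Approximate percentile rank for a t-score, assuming a normal population."""
    z = (t - 50) / 10  # standard deviations from the mean
    return 100 * 0.5 * (1 + math.erf(z / math.sqrt(2)))

for t in (30, 40, 50, 60, 70):
    print(t, round(percentile_from_t(t)))
```

Under this assumption a t-score of 60 sits at roughly the 84th percentile and a t-score of 40 at roughly the 16th.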
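Figure 4 (axis values): t-scores of 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75 and 80 correspond to percentile ranks of 0.1, 1, 2, 7, 16, 31, 50, 69, 84, 93, 98, 99 and 99.9 respectively.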