process produced a list of 150 unsorted non-technical skills, such as ‘coordinates the team’ and ‘confirms understanding with assistant’, as raw input data for system development in Phase 2.
Phase 2: Development of the NOTSS System
The goal of Phase 2 was to develop a system that could be used by surgeons to rate other surgeons’ behaviours in vivo in the operating theatre, rather than to develop a comprehensive taxonomy or research instrument. The tri-level hierarchical format used for behavioural marker systems in anaesthesia (Fletcher et al. 2004) and European civil aviation (Flin et al. 2003) was adopted. This format structures skills into category and element levels, with observable behaviours (markers) indicative of good and poor performance for each element. The prototype system was developed in three stages to (i) refine the skill set that emerged from Phase 1, (ii) sort those skills into a skills taxonomy, and (iii) identify observable behaviours that were indicative of each skill in the taxonomy.
The aim of Stage 1 was to refine the skills that emerged from Phase 1 and remove duplication without diluting the conceptual breadth of the skills that emerged from the task analysis. This process was to form the basis of the system.
To achieve this, the multidisciplinary research group reduced and refined the list of 150 skills extracted from the transcripts, considering the results of the literature review, survey, and observations in theatre. The skills taxonomy was developed according to design criteria derived from the JARTEL (Joint Aviation Requirements: Translation and Elaboration of Legislation) project (Flin et al. 2003), an expert panel on behavioural markers (Klampfer et al. 2001) and from Cognitive Task Analysis (Seamster et al. 1997).1 The reduced skills list was then organized thematically, and broad categories emerged, each broken down into component elements.
In Stage 2, an iterative process was used with four independent panels of consultant surgeons from four hospitals, who modified the structure into a skills taxonomy. The panels checked the wording and labelling of elements, and ensured that the framework was relevant to the surgical domain. This formed the basis for the behavioural marker system.
In Stage 3, observable behaviours (markers) indicative of good and poor performance were developed for each element by 16 consultant surgeons. The surgeons were asked to think of behaviours that could be either directly observed or inferred through communication. Two subsequent multidisciplinary review meetings refined this set of illustrative behaviours, all phrased as active verbs. This ensured that the system had cognitive and interpersonal functionality, was grounded in surgery, and complied with the guidelines on system design (Gordon 1993) and the criteria for development of behavioural markers mentioned earlier.
1 See Table 1 in Yule et al. (2006b) for the full set of design criteria.
The NOTSS Rating Scale
The aim of the system is to allow surgeons to rate skills they observe. After considering the possible rating scale formats, a four-point scale was chosen (4 good, 3 acceptable, 2 marginal, 1 poor), with an additional N/A (not applicable) option. The ‘not applicable’ rating applies when the behaviour was not required in a given clinical scenario. If the skill should have been observed but was not, then a rating of 1 (poor) should be given. Behaviours which potentially endanger patient safety should also be given this rating.
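The scoring rules just described can be sketched in a few lines of Python. This is only an illustration of the rules as stated in the text; the names `Rating` and `resolve_rating` and the three input flags are hypothetical and do not come from the NOTSS handbook.

```python
from enum import IntEnum
from typing import Optional


class Rating(IntEnum):
    """The four scale points named in the text."""
    POOR = 1
    MARGINAL = 2
    ACCEPTABLE = 3
    GOOD = 4


def resolve_rating(required: bool,
                   observed: Optional[Rating],
                   endangered_patient: bool = False) -> Optional[Rating]:
    """Return the rating to record, or None for 'not applicable'.

    Hypothetical helper: the argument names are illustrative assumptions.
    """
    if endangered_patient:
        # Behaviour that potentially endangers patient safety is rated 1 (poor).
        return Rating.POOR
    if not required:
        # The behaviour was not required in this scenario: N/A.
        return None
    if observed is None:
        # The skill should have been observed but was not: 1 (poor).
        return Rating.POOR
    return observed
```

For example, `resolve_rating(required=True, observed=None)` yields `Rating.POOR`, whereas `resolve_rating(required=False, observed=None)` yields `None` (N/A).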
Phase 3: System evaluation
The aim of Phase 3 was to evaluate the NOTSS v1.1 system, specifically to assess its psychometric properties of (i) sensitivity, (ii) inter-rater reliability, and (iii) internal structure and consistency. To exert some control over the evaluation and the stimuli used, we used a pseudo-experimental design in which 44 consultant surgeons rated standardized video clips of surgeons’ intraoperative behaviour.
To achieve this we filmed eleven video scenarios illustrating a range of surgeons’ non-technical skills in general and orthopaedic surgery. The scenarios were filmed using a patient simulator in operating rooms, with practising surgeons, anaesthetists and nurses acting the main roles. The scenarios were designed by surgeons, anaesthetists and psychologists who were experienced in non-technical skills training. From these, three scenarios were selected for training and six for the evaluation, the longest of which ran for 5 minutes and 40 seconds. The participating surgeons attended a half-day training session on how to use the NOTSS system, with some guidance on behaviour rating (Baker et al. 2001). They were instructed to watch each scenario and to rate the observed skills of the consultant surgeon using the NOTSS rating form. Participants were informed of the simulated nature of the scenarios.
Table 2.1 shows the criteria for each of the evaluation metrics used in this study and the corresponding results. For more details on the evaluation see Yule et al. (2008a) and Yule et al. (2009). The table shows that the system was moderately sensitive, but operated best when observers had to decide whether the behaviour was acceptable or not. Within-group agreement was acceptable for the interpersonal skill categories but below acceptable criteria for cognitive skills. Internal reliability was high, with an overall mean difference of 0.25 scale points between categories and elements.
There were also differences in the way the scenarios were rated: two scenarios yielded either floor or ceiling ratings as the behaviours were explicitly good or poor, whereas other scenarios displayed more ambiguous behaviours and were rated in the mid-range of the scale. Orthopaedic surgeons were found to agree on rated behaviours significantly more than general surgeons (Yule et al. 2008a).
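Two of the metrics summarized above, sensitivity and internal reliability, are both described as mean absolute differences in scale points. A minimal Python sketch of that arithmetic follows; the function names and the sample numbers are illustrative assumptions and are not data from the study.

```python
from statistics import mean
from typing import Sequence


def sensitivity(rater_scores: Sequence[int], reference: int) -> float:
    """Mean absolute scale-point difference between raters and a reference rating.

    Lower values mean the group of raters is closer to the reference.
    """
    return mean(abs(score - reference) for score in rater_scores)


def internal_consistency(category_rating: int,
                         element_ratings: Sequence[int]) -> float:
    """Mean absolute difference between one rater's element ratings
    and that rater's rating for the corresponding category.

    Values tending to zero indicate closer agreement.
    """
    return mean(abs(element - category_rating) for element in element_ratings)


# Made-up example: four raters score a behaviour whose reference rating is 3.
print(sensitivity([3, 4, 2, 3], reference=3))
# Made-up example: a rater gives the category 3 and its two elements 3 and 4.
print(internal_consistency(3, [3, 4]))
```

Both functions work on a single behaviour or category; study-level figures such as those reported in Table 2.1 would be averages of these values over raters, scenarios and categories.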
On the basis of the evaluation, a number of changes were made to the taxonomy, the most important being the removal of ‘Task Management’. This was done for three reasons: conceptually, many of the task management behaviours were actually more reflective of situation awareness; some reliability tests did not reach an acceptable threshold for the category; and practically, removing a category and its elements from the taxonomy reduced the cognitive load for raters, who have a finite capacity for holding a number of categories and elements in working memory while engaged in a real-time observation and rating task (Yule et al. 2008a). This produced the NOTSS taxonomy version 1.2 (see Figure 2.2).

Table 2.1 Summary of NOTSS v1.1 evaluation results (see Yule et al. 2008a for detailed results)

Type of evaluation: Sensitivity
Why it is important: This is a measure of how accurate the group of raters are in absolute ratings of behaviour compared with reference ratings.
How calculated and criteria: Mean number of scale points' difference between raters and reference, represented as a decimal, usually <1.
Result of test: Mean sensitivity across all categories was 0.67.

Type of evaluation: Within-group agreement (rwg)
Why it is important: This is a measure of statistical agreement between a number of raters. In this study, it represents the degree to which the groups of participants agree on the absolute ratings they give to behaviours in the scenarios that reflect the NOTSS categories and elements.
How calculated and criteria: Scores lie between 0 (no agreement) and 1 (perfect agreement); scores above 0.7 are deemed acceptable. rwg was calculated for the NOTSS category and element ratings for each of the 6 experimental groups.
Result of test: rwg exceeded the criterion of >0.7 for two categories: Leadership, and Communication & Teamwork. rwg for Decision-making and Task Management approached the criterion, but the value of rwg for Situation Awareness was 0.51.

Type of evaluation: Internal reliability
Why it is important: There should be a high degree of consistency between the category rating and the ratings for the two or three underpinning elements, due to conceptual overlap.
How calculated and criteria: The mean absolute difference between raters’ element ratings and their rating for the corresponding category. Lower scores (tending to zero) indicate closer agreement.
Result of test: Mean difference for all categories was <0.25 of a scale point between elements and category on a 4-point scale. Consistency between category and element was deemed very high for all categories.
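The within-group agreement index reported in Table 2.1 can be illustrated with a short Python sketch. The chapter does not say which variant of rwg was used, so this assumes the single-item index of James, Demaree and Wolf (1984), which compares the observed variance of ratings with the variance expected if raters responded uniformly at random across the scale. The function name, the clipping of negative values to 0 and the sample ratings are assumptions made for illustration.

```python
from statistics import variance
from typing import Sequence


def rwg(ratings: Sequence[int], scale_points: int = 4) -> float:
    """Single-item within-group agreement for one group of raters.

    rwg = 1 - (observed variance / expected variance under a uniform null),
    where the uniform null variance for A scale points is (A**2 - 1) / 12.
    """
    if len(ratings) < 2:
        raise ValueError("rwg needs ratings from at least two raters")
    expected_var = (scale_points ** 2 - 1) / 12   # 1.25 for a 4-point scale
    observed_var = variance(ratings)              # sample variance (n - 1)
    # Negative values (more disagreement than chance) are clipped to 0 here,
    # a common convention rather than something stated in the chapter.
    return max(0.0, 1 - observed_var / expected_var)


# Made-up ratings from six raters for one category in one scenario.
print(round(rwg([3, 3, 4, 3, 3, 4]), 2))
```

Under this assumption, a group whose raters all give the same score reaches 1, and the 0.7 criterion in Table 2.1 corresponds to an observed variance of at most 0.375 on the 4-point scale.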
The NOTSS v1.2 Handbook
A user handbook (Flin et al. 2006b) was then written, which contained background information on the development of NOTSS, advice for using the system in clinical practice, definitions and behavioural examples of the NOTSS categories and elements, a set of rating forms for users, indicative good and poor behaviours for each element, and advice on how to use the rating scale. Practical tips to help surgeons embed non-technical skills observations into clinical practice were included, as was advice for surgical trainers planning to use NOTSS with higher surgical trainees.
Phase 4: System usability
A follow-up study was conducted to evaluate system usability with 22 surgical trainers and their trainees from three Scottish hospitals. The trainers were asked to use the NOTSS rating form and supporting handbook to rate and provide feedback to trainees as soon as possible after each of ten cases where the trainee had contributed significantly to the operation. Inguinal hernia repair and laparoscopic cholecystectomy were typical operations observed during this trial, but it was recommended that specific use of NOTSS be determined by the educational needs of the trainees. For example, with junior trainees, the focus of training is on developing basic surgical expertise, so it was advised that the NOTSS system be used for general discussion of non-technical skills and their importance to clinical practice. For more senior trainees such as specialist registrars (SpRs), it was suggested that the NOTSS system be used to rate skills and provide feedback during increasingly challenging cases.
Figure 2.2 NOTSS skills taxonomy v1.2 (labels recoverable from the figure: Understanding information; Projecting and anticipating future state; Selecting and communicating option; Implementing and reviewing decisions; Communication and Teamwork; Exchanging information; Establishing a shared understanding; Coordinating team; Supporting others; Coping with pressure)
Most of the consultant surgeons had been trained to use the system in the three-hour group session for the system evaluation study reported previously. Those who did not participate in this session were given the same training course in a one-to-one setting. Trainees attended an information session about non-technical skills and the usability trial at their hospital. During this session, it was explained that the NOTSS system had been designed to aid the development of professional skills and that we were evaluating the system rather than assessing their skills during the study. An online post-trial questionnaire was used to establish whether using NOTSS was of any value as an adjunct to the currently available surgical education and assessment methods. An initial invitation to complete it was followed up with a reminder after one month and a further reminder a month later. Self-report measures were selected as the most appropriate method of gathering data on user experiences, although they are not without limitations, as such data are by their nature subjective and susceptible to memory decay and social desirability bias.
In total, eleven consultant surgeons completed the usability trial. Data on trainee surgeons were not tracked (to ensure that they were confident that the purpose of the study was solely to assess the usability of the tool, rather than their own competence), but analysis of completed feedback forms indicates that at least 12 trainees took part. The NOTSS system was used to observe and debrief on non-technical skills during a total of 43 cases (mean 4 per consultant, range 1–8 cases). In all cases, the trainee was lead surgeon. In some cases the consultant was an unscrubbed observer, and on other occasions was scrubbed and assisting as well as observing. The majority of trainers (90 per cent) thought that they had received enough training to use the system, and most preferred to conduct the debrief immediately after the operation (81 per cent) in the operating theatre suite. The median length of the debrief session was 3–5 minutes. See Figure 2.3 for an example of a NOTSS rating card completed after a laparoscopic cholecystectomy, which mainly focused on the trainee’s ability to gather information about the patient, communicate decisions to the team and work with the assistant and consultant surgeon in a coordinated manner.
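The summary figures above can be checked with simple arithmetic. The sketch below assumes that each reported percentage is a head count out of the eleven trainers who completed the trial; the chapter does not state the denominator, so this is an assumption, and the function names are hypothetical.

```python
N_TRAINERS = 11    # consultants who completed the usability trial
TOTAL_CASES = 43   # cases observed and debriefed with NOTSS


def mean_cases_per_trainer(total_cases: int = TOTAL_CASES,
                           n_trainers: int = N_TRAINERS) -> float:
    """Average number of cases per consultant."""
    return total_cases / n_trainers


def nearest_count(percent: float, n: int = N_TRAINERS) -> int:
    """Head count out of n whose share is closest to a reported percentage.

    Assumes the percentage was computed with n as the denominator.
    """
    return min(range(n + 1), key=lambda k: abs(100 * k / n - percent))


print(round(mean_cases_per_trainer(), 1))  # close to the reported mean of 4
print(nearest_count(90), nearest_count(81))
```

Under that assumption, 90 per cent and 81 per cent would correspond to 10 and 9 of the 11 trainers respectively.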
All trainers used ‘communication & teamwork’, 90 per cent used ‘situation awareness’, 72 per cent used ‘decision-making’, and just over half (54 per cent) used the ‘leadership’ category. Some categories were not used by some trainers due to the level of the trainee and the complexity of the procedure being completed. The majority of surgical trainers thought that the NOTSS system was useful for debriefing trainees and a valuable adjunct to currently available assessment tools. The trainers were all in agreement that NOTSS provided a common language to discuss non-technical skills and was useful to support reflective practice, but there were mixed opinions regarding the ease of rating non-technical skills. Although 45 per cent of trainers agreed that cognitive and interpersonal skills were easy to rate, 27 per cent found interpersonal skills difficult to rate compared with only 9 per cent who felt cognitive skills were difficult to rate (Yule et al. 2008b). The remaining trainers were ambivalent regarding ease of rating. Time can be a precious commodity in the operating theatre, but only 9 per cent of trainers thought using NOTSS to debrief added too much time to their operating list, and 73 per cent thought that routine use of NOTSS would enhance safety in the operating theatre. All trainers thought that NOTSS has a place in surgical education and assessment.
Comments from trainers indicated that positive aspects of the system for surgical education were the transparent structure; common language; ability to objectively assess skills; framework for providing feedback; ease of use in real-life situations; and that using the system made time to discuss aspects of surgical performance that are ‘usually ignored’. Although some trainers reported no difficulties rating behaviours using NOTSS, four main problems were articulated. These related to understanding some descriptors in the NOTSS handbook; selecting an appropriate trainee and case; observing and rating behaviours while also scrubbed; and an over-reliance on communication to infer cognitive skills.
Figure 2.3 Completed NOTSS rating form
Discussion
The aims of the NOTSS project were to develop and evaluate a behavioural marker system for surgeons’ non-technical skills using human factors methods, and to base the system development and associated rating scale on a skills taxonomy. These aims were met, and the prototype NOTSS system is being used by practising surgeons and research groups in Australasia, Japan, Europe, and North America. Further development of the tool is required and there remain some unanswered questions, such as the amount of training required for a practising surgeon to be able to use the tool reliably, and whether observations and ratings have to be made by surgeons (as opposed to anaesthetists, nurses or even psychologists) to be valid and meaningful. A research group at Sheffield (see Chapter 4 of this volume) is attempting to answer some of these questions. Other research teams have developed tools to observe and rate the behaviours of surgical teams (Undre et al. 2007 – Imperial College) or have adapted the NOTECHS tool from civil aviation (Flin et al. 2003) for use with surgeons in the operating theatre (Sevdalis et al. 2008 – Imperial College, Mishra et al. 2008 – University of Oxford). These lines of research differ in concept and approach but nonetheless enrich our understanding of non-technical skills in surgery.
The focus of surgical training still heavily favours technical skill acquisition, yet surgeons increasingly operate in teams with whom they may be unfamiliar, especially in an emergency setting. The adoption of specific training in non-technical areas of expertise is still done on an ad hoc basis, although the Royal Colleges of Surgery in Great Britain and Ireland all provide training in this emerging area to some extent. These courses have so far been taken by enthusiastic surgeons, both consultant and trainee, but are not compulsory aspects of surgical training. The Royal College of Surgeons of Ireland, however, provides funding for all trainee surgeons to attend a human factors training course. As part of the NOTSS evaluation, it emerged that training in using the system was not sufficient for many users, as they did not have background knowledge in psychology and human factors. Therefore, we developed and ran training courses for surgeons, introducing human factors and the basics of workplace assessment of behaviour. This developed into a two-day course, specifically on the NOTSS system, in 2006, run with the Royal College of Surgeons of Edinburgh. This course was then further developed to include wider surgical safety issues to become the SOS (Safer Operative Surgery) courses, which were run in 2007. These courses were designed for higher trainee and consultant surgeons only and were based on task analysis of surgeons’ non-technical skills, the NOTSS behaviour rating system, and underlying psychology (Flin et al. 2007). In 2008/09 the Royal College of Surgeons of Edinburgh is developing these courses for a multidisciplinary audience.
The Future of Non-Technical Skills in Surgical Education
Although not formally achieved yet, the future of surgical training will need to encompass more than just clinical and technical skills (Davidson 2002). If the aviation model were to be adopted in surgery, then experienced consultant surgeons would be taken off clinical work for a period to concentrate on assessing other consultants’ non-technical skills. Assessments would be done using a framework such as NOTSS to rate observable skills in a simulated environment and during real cases in the operating theatre (similar to LOSA checks in aviation; see Chapter 25 in this volume by Musson). The assessors would be trained, calibrated, and their competence to rate others assured at an acceptable a priori level. Crucially, the assessments would be ‘high stakes’, and surgeons would have to pass the assessment by displaying appropriate behaviours in order to continue operative surgery. Surgeons who did not pass would be able to attend a remedial training course for those skills requiring attention. This would require courses to be developed (e.g., Flin et al. 2007), and the surgeon to then be assessed at a future point before being allowed back into clinical practice. This process would apply to consultant surgeons, although senior trainee surgeons would be assessed and given feedback on their non-technical skills as part of their ongoing training and might have to pass a non-technical skills assessment as part of the selection process into consultant grades. Research teams may be involved in the training and assurance of assessors, instructors and practising surgeons, and would be interested in the development of measures of behaviour and performance.
This model may not be appropriate for surgery and competence assessment at this time, but in the near future, recertification will be introduced as a part of revalidation, which will require global assessment of professional performance including the skills referred to above. Moreover, there are some promising advancements: research teams are developing, validating and collecting data with observational tools, appraisals are commonplace, and the introduction of Procedure-Based Assessments (PBAs) has demonstrated that there is more to surgery than technical skills, and that workplace assessment is the method by which consultant surgeons of the future will be assessed. Perhaps as important is that in some hospitals non-technical language is becoming common parlance both intra-operatively and in the coffee room. However, the surgeons who use behaviour rating scales and discuss non-technical skills with their trainees are still in the minority. For widespread change in practice, a trigger is required, such as official endorsement by the Postgraduate Medical Education and Training Board (PMETB) or the Intercollegiate Surgical Curriculum Programme (ISCP), or inclusion in the processes of revalidation of doctors, which is currently being discussed.
The Future of NOTSS Research: Integrating Systemic Issues in the Operating Theatre
NOTSS has been widely cited in the clinical literature and adopted by professional bodies for training, and the system is being used by other research groups around the world. However, a reliance solely on individual skills, or even those of the surgical team, will not achieve the levels of safety required by patients. Feedback from users of the NOTSS system indicated that aspects of surgery such as scheduling, anaesthetic care, competence and experience of other staff, availability of equipment in theatre, new technology and training also have an impact on surgical performance and surgical outcomes. Attention to these components through systems-based thinking has been found to be particularly useful in understanding and improving the safety and reliability of complex systems in other high-consequence industries such as power generation and aviation (Perrow 1999). There is emerging research on the impact of distractions (Sevdalis et al. 2007) and latent failures (Catchpole et al. 2007) on patient safety in the operating theatre, and tools for understanding the systemic causes of adverse events in the operating theatre (Taylor-Adams and Vincent 2004), but we do not yet have a complete understanding of the systems aspects that affect patient safety.
The Accreditation Council for Graduate Medical Education in the USA explicitly demands that resident trainee surgeons obtain specific knowledge, skills and attributes to demonstrate ‘systems-based practice’ (ACGME 2007). Professional skills training needs to incorporate content on systems thinking in order to meet the demands of modern surgery, and this content should be based on research evidence. In addition to the dangers that systems pose for safety, there are also strengths embedded in surgical systems that make surgeons and surgical teams resilient in the face of dynamic, error-producing conditions. A new project, funded by the Royal College of Surgeons of Edinburgh, is attempting to make these aspects of the surgical system explicit and measurable. With this research strategy, in time we will understand more about individual skills, the role of the team and how they interact with the system to protect or harm patients, and have evidence-based tools and training to support the surgeons of the future.
References
ACGME (2007) Common Program Requirements: General Competencies. Accreditation Council for Graduate Medical Education. Available from: <www.acgme.org/outcome/comp/GeneralCompetenciesStandards21307.pdf> [accessed October 2008].
Baldwin, P.J., Paisley, A.M. and Paterson-Brown, S. (1999) Consultant surgeons’ opinions of the skills required of basic surgical trainees. British Journal of Surgery 86, 1078–82.
Baker, D., Mulqueen, C. and Dismukes, R. (2001) Training raters to assess resource management skills. In E. Salas, C. Bowers and E. Edens (eds), Improving Teamwork in Organizations. New Jersey: LEA, 131–45.
Catchpole, K.R., Giddings, A.E.B., Wilkinson, M., Hirst, G., Dale, T. and de Leval, M. (2007) Improving patient safety by identifying latent failures in successful operations. Surgery 142, 102–10.
Christian, C., Gustafson, M., Roth, E., Sheridan, T., Gandhi, T., Dwyer, K., Zinner, M. and Dierks, M. (2006) A prospective study of patient safety in the operating room. Surgery 139, 159–73.
Crandall, B., Klein, G. and Hoffman, R. (2006) Working Minds: A Practitioner’s Guide to Cognitive Task Analysis. Boston: MIT Press.
Davidson, P. (2002) The surgeon of the future and implications for training. ANZ Journal of Surgery 72, 822–8.
Edmondson, A.C. (2003) Speaking up in the operating room: How team leaders promote learning in interdisciplinary action teams. Journal of Management Studies 40(6), 1419–52.
Flanagan, J. (1954) The critical incident technique. Psychological Bulletin 51, 327–58.
Fletcher, G., Flin, R., McGeorge, P., Glavin, R., Maran, N. and Patey, R. (2004) Rating non-technical skills: Developing a behavioural marker system for use in anaesthesia. Cognition Technology and Work 6, 165–71.
Flin, R., Goeters, K., Amalberti, R., et al. (2003) The development of the NOTECHS system for evaluating pilots’ CRM skills. Human Factors and Aerospace Safety 3, 95–117.
Flin, R., Yule, S., McKenzie, L., Paterson-Brown, S. and Maran, N. (2006a) Attitudes to teamwork and safety in the operating theatre. The Surgeon 4, 145–51.
Flin, R., Yule, S., Paterson-Brown, S., Maran, N. and Rowley, D. (2006b) The Non-Technical Skills for Surgeons (NOTSS) System Handbook (v1.2). Available at: <www.abdn.ac.uk/iprc/notss>