
Advancing Quantitative Methods in Second Language Research, Luke Plonsky (Ed.), Routledge, 2015 (scan)


This text will assist graduate programs in applied linguistics and second language acquisition/studies in providing "in-house" instruction on statistical techniques using sample data and examples tailored to the variables, interests, measures, and designs particular to L2 research. Each chapter provides the conceptual motivation for and the practical, step-by-step guidance needed to carry out a relatively advanced, novel, and/or underused statistical technique. Using readily available statistical software packages such as SPSS, the chapters walk the reader from conceptualization through to output and interpretation of a range of advanced statistical procedures such as bootstrapping, mixed effects modeling, cluster analysis, discriminant function analysis, and meta-analysis. This practical hands-on volume equips researchers in applied linguistics and second language acquisition (SLA) with the necessary tools and knowledge to engage more fully with key issues and problems in SLA and to work toward expanding the statistical repertoire of the field.

Luke Plonsky (PhD, Michigan State University) is a faculty member in the Applied Linguistics program at Northern Arizona University. His interests include SLA and research methods, and his publications in these and other areas have appeared in Annual Review of Applied Linguistics, Applied Linguistics, Language Learning, Modern Language Journal, and Studies in Second Language Acquisition, among other major journals and outlets. He is also Associate Editor of Studies in Second Language Acquisition and Managing Editor of Foreign Language Annals.

SECOND LANGUAGE ACQUISITION RESEARCH SERIES

Susan M. Gass and Alison Mackey, Series Editors

Monographs on Theoretical Issues:


The Longitudinal Study of Advanced L2 Capacities (2008)

The Psychology of the Language Learner—Revisited (2015)

Monographs on Research Methodology:


Second Language Research: Methodology and Design (2005)

Gass with Behney & Plonsky

Second Language Acquisition: An Introductory Course, Fourth Edition (2013)


First published 2015

by Routledge

711 Third Avenue, New York, NY 10017

and by Routledge

2 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN

Routledge is an imprint of the Taylor & Francis Group, an informa business

© 2015 Taylor & Francis

The right of Luke Plonsky to be identified as the author of the editorial material, and of the authors for their individual chapters, has been asserted in accordance with sections 77 and 78 of the Copyright, Designs and Patents Act 1988.

All rights reserved. No part of this book may be reprinted or reproduced or utilised in any form or by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying and recording, or in any information storage or retrieval system, without permission in writing from the publishers.

Trademark notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Library of Congress Cataloging-in-Publication Data

Plonsky, Luke.

Advancing quantitative methods in second language research / Luke Plonsky, Northern Arizona University.

pages cm — (Second Language Acquisition Research Series)

Includes bibliographical references and index.

1. Second language acquisition—Research. 2. Second language acquisition—Data processing. 3. Language and languages—Study and teaching—Research. 4. Language acquisition—Research. 5. Language acquisition—Data processing. 6. Quantitative research. 7. Multilingual computing. 8. Computational linguistics. I. Title.


CONTENTS

List of Illustrations xi
Acknowledgments xvii

3 Statistical Power, p Values, Descriptive Statistics, and Effect Sizes: A "Back-to-Basics" Approach to Advancing Quantitative
Luke Plonsky

4 A Practical Guide to Bootstrapping Descriptive Statistics,
Geoffrey T. LaFlair, Jesse Egbert, and Luke Plonsky


Thom Hudson

Luke Plonsky and Frederick L Oswald

PART III

Eun Hee Jeon

Ian Cunnings and Ian Finlayson

9 Exploratory Factor Analysis and Principal

Shawn Loewen and Talip Gonulal

Beth Mackey and Steven J Ross

Index 347


ILLUSTRATIONS

3.2 Screenshot of effect size calculator for Cohen's d 32
3.4 Linear regression dialogue box used to calculate CIs
3.7 Output for descriptive statistics produced through
3.8 Descriptive statistics and CIs for abstracts
4.4 Descriptive statistics table with bootstrapped 95% CIs
4.5 Correlations output table with bootstrapped 95% CIs
4.7 Independent-Samples Test output table with

4.8 Bootstrap mean differences, Q-Q plot, and jackknife-after-boot plot of the mean difference between English and Vietnamese 66
4.9 Plot of the bootstrap T-statistics, their Q-Q plot, and the

4.10 One-way ANOVA output table with bootstrapped

5.1 Cleveland’s 1993 graphic display of barley harvest data from

5.2 Types of graphics used over last four regular issues of five

5.3 Bar chart showing means of listening scores for each category of self-rated confidence ratings with

5.5 Grouped bar chart for speaking scores by gender

5.9 Box-and-whisker plots for the five proficiency levels

5.10 Student scores (means and CIs) on five tests administered

5.11 Mean scores and 95% CIs on reading, listening,

5.12 Graphic representation of score data across levels with

5.13 Scatter plot for the relationship between reading scores

5.15 Mean state scores for NAEP data in Table 5.4 ordered

5.17 Number of weekly online posts with sparklines showing


6.2 Example of a funnel plot without the presence

6.3 Example of a funnel plot with the presence

7.6 SPSS standard multiple regression dialogue boxes:

7.7 SPSS standard multiple regression dialogue boxes:

7.10 SPSS hierarchical regression analysis dialogue boxes:

7.11 SPSS hierarchical regression analysis dialogue boxes: selections of PV for the second and final model

8.1 Q-Q plots for untransformed (left) and transformed

9.3 Example of KMO measure of sampling adequacy


10.9 Output file three-factor model with correlated error

11.1 Step 1 247

11.3 Step 3, part 1 249
11.4 Step 3, part 2 249
11.5 Step 4, part 1 250
11.6 Step 4, part 2 251

11.12 Truncated agglomeration schedule for 947

11.13 Distance between fusion coefficients


11.25 Cluster membership by score level for the

11.26 Cluster membership by score level for the

13.6 Two-dimensional output for three group average values

TABLES

3.5 General benchmarks for interpreting d and r effect

5.1 Types of graphical charts and frequency of use found in

5.2 2009 average reading scale score sorted by gender, grade

5.3 2009 average NAEP reading scale scores by gender for


5.4 2009 average NAEP reading scale scores by gender for grade 12 public schools in 11 states sorted on state mean

6.1 Suggested categories for coding within meta-analyses of L2 research 110

7.6 SPSS output for variables entered/removed in hierarchical

7.8 SPSS output for ANOVA resulting from hierarchical regression 153

11.1 Reformatted fusion coefficients for final six clusters formed 257

11.3 Means and standard deviations for the three–cluster solution 263

13.2 Box’s M output for testing homogeneity of covariance

13.4 Relationship output for individual predictor variables

13.6 Accuracy of classification output for membership

14.3 Comparison of Means software (exploratory and

ACKNOWLEDGMENTS

I want to begin by expressing my sincere gratitude to the diverse set of individuals who have contributed to this volume in equally diverse ways. I am very grateful, first of all, to all 18 chapter authors. It is clear from their work that they are not only experts in the statistical procedures they have written about but in their ability to communicate and train others on these procedures as well. I also thank the authors for their perseverance and persistence in the face of my many requests.

In addition to my own comments, each chapter was also reviewed by at least one reviewer from the target audience (graduate students or junior researchers with at least one previous course in statistics) and one from the modest pool of applied linguists with expertise in the focal procedure of each chapter. I am very thankful for the comments and suggestions of these reviewers, which led to many substantial improvements throughout the volume: Dan Brown, Meishan Chen, Euijung Cheong, Joseph Collentine, Jersus Colmenares, Scott Crossley, Deirdre Derrick, Jesse Egbert, Maria Nelly Gutierrez Arvizu, Eun Hee Jeon, Tingting Kang, Geoffrey LaFlair, Jenifer Larson-Hall, Jared Linck, Junkyu Lee, Qiandi Liu, Meghan Moran, John Norris, Gary Ockey, Fred Oswald, Steven Ross, Erin Schnur, and Soo Jung Youn. Along these lines, my thanks go to the students in my ENG 599 and 705 courses, who read and commented on prepublication versions of many of the chapters in the book. Special thanks to Deirdre Derrick for all her help on the index. I also thank Shawn Loewen and Fred Oswald, both of whom have had a (statistically) significant effect on my development as a quantitative researcher. Big thanks go to Sue Gass and Alison Mackey, series editors, for their encouragement and support in carrying this book from an idea to its current form. Last, thanks to you, the reader, for your interest in advancing the field's quantitative methods. In the words of Geoff Cumming, happy reading and "may all your confidence intervals be short!"


CONTRIBUTORS

Douglas Biber (Northern Arizona University)
James Dean Brown (University of Hawaii at Manoa)
Ian Cunnings (University of Reading)
Jesse Egbert (Brigham Young University)
Ian Finlayson (University of Edinburgh)
Talip Gonulal (Michigan State University)
Thom Hudson (University of Hawaii at Manoa)
Eun Hee Jeon (University of North Carolina, Pembroke)
Ute Knoch (University of Melbourne)
Geoffrey T. LaFlair (Northern Arizona University)
Shawn Loewen (Michigan State University)
Beth Mackey (University of Maryland)
Tim McNamara (University of Melbourne)
John M. Norris (Georgetown University)
Frederick L. Oswald (Rice University)
Luke Plonsky (Northern Arizona University)
Steven J. Ross (University of Maryland)
Rob Schoonen (University of Amsterdam)
Shelley Staples (Purdue University)


PART I

Introduction


1

INTRODUCTION

Luke Plonsky

Rationale for This Book

Several reviews of quantitative second language (L2) research have demonstrated that empirical efforts in the field rely heavily on a very narrow range of statistical procedures (e.g., Gass, 2009; Plonsky, 2013). Namely, nearly all quantitative studies employ t tests, ANOVAs, and/or correlations. In many cases, these tests are viable means to address the research questions at hand; however, problems associated with these techniques arise frequently (e.g., failing to meet statistical assumptions). More concerning, though, is the limited capacity of these tests to provide meaningful and informative answers to our questions about L2 learning, teaching, testing, use, and so forth. Also concerning is that the near-default status of these statistics restricts researchers' ability to understand relationships between constructs of interest as well as their use of analyses to examine such relationships. In other words, our research questions are being constrained by our knowledge of statistical tools.

This problem manifests itself in at least two ways. First, it is not uncommon to find researchers who convert intervally measured (independent) variables into categorical ones in order for the data to fit into an ANOVA model. Doing so trades precious variance for what appears to be a more straightforward analytical approach (see Plonsky, Chapter 3 in this volume, for further comments and suggestions related to this practice). Second, and perhaps more concerning, the relatively simple statistics found in most L2 research are generally unable to model the complex relationships we are interested in. L2 learning and use are multivariate in nature (see, e.g., Brown, Chapter 2 in this volume). Many studies account for the complexity in these processes by measuring multiple variables. Few, however, attempt to analyze them using multivariate techniques. Consequently, it is

common to find 20 or 30 univariate tests in a single study, leading to a greater chance of Type I error and, more importantly, a fractured view of the relationships of interest (Plonsky, 2013).
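The arithmetic behind this inflation is worth seeing once. Assuming independent tests run at α = .05 (an idealization; tests within one study are rarely fully independent), a short Python sketch, offered here purely as an illustration:

```python
# Chance of at least one false positive (Type I error) across k
# independent significance tests, each run at the same alpha.
def familywise_error(k, alpha=0.05):
    return 1 - (1 - alpha) ** k

# A single test keeps the nominal 5% rate...
print(round(familywise_error(1), 3))   # 0.05
# ...but 20 tests in one study push the rate toward two in three.
print(round(familywise_error(20), 3))  # 0.642
```

The function name and framing are mine, not the chapter's; the point is simply that the error rate compounds with every additional test.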

Before going on, I need to clarify two points related to the intentions behind this volume. First, neither I nor the authors who have contributed to this volume are advocating blindly applied technical or statistical sophistication. I agree wholeheartedly with the recommendation of the American Psychological Association to employ statistical procedures that are "minimally sufficient" to address the research questions being posed (Wilkinson & Task Force on Statistical Inference, 1999, p. 598). Second, the procedures described in this book are just tools. Yes, they carry great potential to help us address substantive questions that cannot otherwise be answered. We have to remember, though, that our analyses must be guided by the substantive interests and relationships in question and not the other way around. I mention this because of the tendency, particularly among novice researchers, to become fascinated with a particular method or statistic and to allow one's research questions to be driven by the method.

Having laid out these rationales and caveats, at the heart of this volume is an interest in informing and expanding the statistical repertoire of L2 researchers. Toward this end, each chapter provides the conceptual motivation for and the practical, step-by-step guidance needed to carry out a relatively advanced, novel, and/or underused statistical technique using readily available statistical software packages (e.g., SPSS). In related disciplines such as education and psychology, these techniques are introduced in statistics texts and employed regularly. Despite their potential in our field, however, they are rarely used and almost entirely absent from methodological texts written for applied linguistics.

This volume picks up where introductory texts (e.g., Larson-Hall, 2015) leave off and assumes a basic understanding of research design as well as of the basic statistical concepts and techniques used in L2 research (e.g., t test, ANOVA, correlation). The book goes beyond these procedures to provide a "second course," that is, a conceptual primer and practical tutorial on a number of analyses not currently available in other methods volumes in applied linguistics. The hope is that, by doing so, researchers in the field will be better equipped to address questions currently posed and to take on novel or more complex questions.

The book also seeks to improve methodological training in graduate programs, the need for which has been suggested by recent studies surveying both published research and researcher self-efficacy (e.g., Loewen et al., 2014; Plonsky, 2014). This text will assist graduate programs in applied linguistics and second language acquisition/studies in providing "in-house" instruction on statistical techniques using sample data and examples tailored to the variables, interests, measures, and designs particular to L2 research.

Beyond filling gaps in the statistical knowledge of the field and in available texts and reference books, this volume also seeks to contribute to the budding methodological and statistical reform movement taking place in applied linguistics. The field has seen a rapid increase in its awareness of methodological issues in the last decade. Evidence of this movement, which holds that methodological rigor and transparency are critical to advancing our knowledge of L2 learning and teaching, is found in meta-analyses (e.g., Norris & Ortega, 2000), methodological syntheses (e.g., Hashemi & Babaii, 2013; Plonsky & Gass, 2011), methodologically oriented conferences and symposia (e.g., the Language Learning Currents conference in 2013), and a number of article- and book-length treatments raising methodological issues (e.g., Norris, Ross, & Schoonen, in press; Plonsky & Oswald, 2014; Porte, 2012). This book aims to both contribute to and benefit from the momentum in this area, serving as a catalyst for much additional work seeking to advance the means by which L2 research is conducted.

Themes

In addition to the general aim of moving quantitative L2 research forward, three major themes present themselves across the volume. The first and most prevalent theme is the role of researcher judgment in conducting each of the analyses presented here. Results based on statistical analyses can obscure the decisions made throughout the research process that led to those results. As Huff (1954) states in the now-classic How to Lie with Statistics, "despite its mathematical base, statistics is as much an art as it is a science" (p. 120). As noted throughout this book, decision points abound in more advanced and multivariate statistics. These procedures involve multiple steps and are particularly subject to the judgment of individual researchers. Consequently, researchers must develop and combine not only substantive but also methodological/statistical expertise in order for the results of such analyses to maximally inform L2 theory, practice, and future research.

The second theme, transparency, builds naturally on the first. Appropriate decision making is a necessary but insufficient requisite for the theoretical and/or practical potential of a study to be realized. Choices made throughout the process must also be justified in the written report, giving proper consideration to the strengths and weaknesses resulting from each decision relative to other available options. Consumers of research can then more adequately and confidently interpret study results. Of course, the need for transparency applies not only to methodological procedures but also to the reporting of data (see Larson-Hall & Plonsky, in press).

The third major theme found throughout this volume is the interrelatedness of the procedures presented. Statistical techniques are often presented and discussed

in isolation despite great conceptual and statistical commonalities. ANOVA and multiple regression, for example, are usually considered—and taught—as distinct statistical techniques. However, ANOVA can be considered a type of regression with a single, categorical predictor variable; see Cohen's (1968) introduction to the general linear model (GLM). The relationship between these procedures can also be demonstrated statistically: The eta-squared effect size yielded by an

ANOVA will be equal to the R² from a multiple regression based on the same independent/predictor and dependent/criterion variables. Both indices express the amount of variance the independent variable accounts for in the dependent variable. Whenever applicable, the chapters in this volume have drawn attention to such similarities and shared utility among procedures.
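This equivalence can be checked with a few lines of arithmetic. The sketch below uses Python and invented scores for nine learners in three groups (the book's own demonstrations are in SPSS and R), computing eta-squared from the ANOVA sums of squares and R² from the equivalent dummy-coded regression:

```python
# Nine invented scores in three groups (illustrative data only)
groups = [[4, 5, 6], [7, 8, 9], [5, 6, 10]]
scores = [s for g in groups for s in g]
grand = sum(scores) / len(scores)

# Eta-squared from a one-way ANOVA: between-groups SS / total SS
ss_total = sum((s - grand) ** 2 for s in scores)
ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
eta_sq = ss_between / ss_total

# R-squared from the equivalent regression: with dummy-coded group
# predictors, OLS predicts each score by its own group mean, so the
# residual SS is simply the within-groups SS.
ss_resid = sum((s - sum(g) / len(g)) ** 2 for g in groups for s in g)
r_sq = 1 - ss_resid / ss_total

print(round(eta_sq, 4), round(r_sq, 4))  # both 0.4375
```

With any data set, the two indices come out identical, which is the GLM point made above.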

Structure of the Book

This book is divided into three parts containing 14 chapters written by some of the most methodologically savvy scholars in the field. Part I sets up the book and the techniques found throughout. Brown's chapter, following this brief introduction, discusses the value and place of more advanced statistics, highlighting advantages/benefits and disadvantages of applying such techniques in L2 research. The remaining two parts correspond to two complementary approaches to advancing quantitative L2 research. The chapters in Part II seek to enhance and improve upon techniques currently in use. A chapter I wrote begins Part II with a critique of the status quo of null hypothesis significance testing (NHST) in L2 research. The chapter then guides readers toward more appropriate and informative use of p values, effect sizes, and descriptive statistics, particularly in the context of means-based comparisons (t tests, ANOVAs) and correlations. LaFlair, Egbert, and I provide a step-by-step guide to an alternative approach to running these same analyses, proposed to help L2 researchers overcome some of the problems commonly found in our data (e.g., non-normality, small Ns): bootstrapping. Hudson then illustrates a number of key principles for visual presentations of quantitative data. In the final chapter of Part II, Fred Oswald and I present a practical guide to conducting meta-analyses of L2 research. (This chapter is an updated and expanded version of a similar one we published in 2012.)

The eight chapters in the second section focus on more advanced statistical procedures that, despite their potential, are not commonly found in L2 research. Each chapter begins with a conceptual overview followed by a step-by-step guide to the targeted technique. These include multiple regression (Jeon), mixed effects modeling and longitudinal analysis (Cunnings & Finlayson), factor analysis (Loewen & Gonulal), structural equation modeling (Schoonen), cluster analysis (Staples & Biber), Rasch analysis (Knoch & McNamara), discriminant function analysis (Norris), and Bayesian data analysis (Mackey & Ross). Practice data sets have been provided on the companion website to go along with each chapter in this part of the book as well as with Chapters 3, 4, and 6 in the previous part. The companion website can be found here: http://oak.ucc.nau.edu/ldp3/AQMSLR.html

Software

One of the challenges in preparing and using a book like this one is choosing the statistical software. Such a decision involves considering accessibility, cost, user friendliness, and consistency across chapters, among other issues. Furthermore, there are numerous options available, each of which possesses a unique set of strengths and weaknesses. IBM's SPSS, for example, is very user friendly but can be costly. The default settings in SPSS can also lead to users not understanding the choices that the program makes for them (e.g., Mizumoto & Plonsky, in review; Plonsky & Gonulal, in press).

As shown in Table 1.1, most analyses in this book have been demonstrated using SPSS. To a much lesser extent, Microsoft Excel and R (R Development Core Team, 2014) have also been used along with, in a small number of cases, more specialized packages.

TABLE 1.1 Software used and available for procedures in this book

Procedure (chapter): software used; additional options*
Descriptives, NHST, effect sizes (Chapter 3)
Bootstrapping (Chapter 4): SPSS, R; Excel (macro)
Meta-analysis (Chapter 6): SPSS, Excel; R
Multiple regression (Chapter 7)
Mixed effects, longitudinal (Chapter 8)
Factor analysis (Chapter 9): SPSS; R, Excel (macro)
Structural equation modeling (Chapter 10): LISREL, SPSS (AMOS); R, Excel (macro)
Cluster analysis (Chapter 11): SPSS; R, Excel (macro)
Rasch analysis (Chapter 12): Winsteps, Facets; SPSS (extension), R, Excel (macro)
Discriminant function analysis (Chapter 13)
Bayesian analysis (Chapter 14): Comparison of Means; SPSS (AMOS), R, Excel

*I have limited additional options to SPSS, R, and Excel, the three programs most commonly used for statistical analyses in applied linguistics according to Loewen et al. (2014).

References

Hashemi, M. R., & Babaii, E. (2013). Mixed methods research: Toward new research designs in applied linguistics. Modern Language Journal, 97, 828–852.
Huff, D. (1954). How to lie with statistics. New York: Norton & Company.
Larson-Hall, J. (2015). A guide to doing statistics in second language research using SPSS and R. New York: Routledge.
Larson-Hall, J., & Plonsky, L. (in press). Reporting and interpreting quantitative research findings: What gets reported and recommendations for the field. Language Learning.
Norris, J. M., & Ortega, L. (2000). Effectiveness of L2 instruction: A research synthesis and quantitative meta-analysis. Language Learning, 50, 417–528.
Norris, J. M., Ross, S., & Schoonen, R. (Eds.) (in press). Improving and extending quantitative reasoning in second language research. Malden, MA: Wiley.
Plonsky, L. (2013). Study quality in SLA: An assessment of designs, analyses, and reporting practices in quantitative L2 research. Studies in Second Language Acquisition, 35, 655–687.
Plonsky, L. (2014). Study quality in quantitative L2 research (1990–2010): A methodological synthesis and call for reform. Modern Language Journal, 98, 450–470.
Plonsky, L., & Gass, S. (2011). Quantitative research methods, study quality, and outcomes: The case of interaction research. Language Learning, 61, 325–366.
Plonsky, L., & Gonulal, T. (2015). Methodological reviews of quantitative L2 research: A review of reviews and a case study of exploratory factor analysis. Language Learning.
R Development Core Team. (2014). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing.
Wilkinson, L., & Task Force on Statistical Inference. (1999). Statistical methods in psychology journals: Guidelines and explanations. American Psychologist, 54, 594–604.


WHY BOTHER LEARNING ADVANCED QUANTITATIVE METHODS IN L2 RESEARCH?

James Dean Brown

Why would anyone bother to learn advanced quantitative methods in second language (L2) research? Isn't it bad enough that language researchers often need to learn basic statistical analyses? Well, the answer is that learning statistics is like learning anything else: About the time you think you've finished, you find that there is so much more to learn. Like so many things, every time you come to the crest of a hill, you see the next hill. So maybe instead of asking "Why bother?" you should be asking "What's next after I learn the basic stats?" That is what this book is about. In this chapter, I will summarize some of the benefits that you can reap from taking that next step and continuing to learn more advanced techniques in quantitative analysis. Naturally, such benefits must always be weighed against any disadvantages as well, so I will consider those too.

What Are the Advantages of Using Advanced Quantitative Methods?

By advantages, I mean the benefits, the plusses, and the pros of learning and using the advanced quantitative methods covered in this book and elsewhere. The primary advantages are that you can learn to measure more precisely, think beyond the basic null hypothesis significance test, avoid the problem of multiple comparisons, increase the statistical power of your studies, broaden your research perspective, align your research analyses more closely to the way people think, reduce redundancy and the number of variables, expand the number and types of variables, get more flexibility in your analyses, and simultaneously address multiple levels of analysis. Let's consider each of these advantages in turn.


Measuring More Precisely

One concern that all researchers should share is for the accuracy and precision of the ways they measure the variables in their studies. Variables can be quantified as nominal, ordinal, interval, or ratio scales (for a readily available, brief review of these four concepts, see Brown, 2011a). Variables that are nominal, ordinal, or ratio in nature can typically be observed and quantified fairly easily and reliably. However, interval scales (e.g., proficiency test scores, questionnaire subsection scores) may be more problematic. That is why you should take special care in developing and piloting such measures and should always report the reliability of the resulting scores in your study, as well as arguments that support their validity. One issue that is seldom addressed is the degree to which these "interval" scales are actually true interval scales. Can you say that the raw score points on a particular test actually represent equal intervals? If not, then defending the scores as an interval scale may not be justified. One solution to that problem is to use an advanced statistical technique called Rasch analysis. This form of analysis can help you analyze and improve any raw-score interval scales you use; as a byproduct, you can also use Rasch analyses to convert those raw scores into logit scores, which arguably form a true interval scale. There are a number of other reasons why you might want to use Rasch analysis to better understand your scales and how precisely you are measuring the variables in your studies (Knoch & McNamara, Chapter 12 in this volume).
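Rasch estimation itself requires dedicated software (Winsteps and Facets, per Table 1.1), but the logit metric it produces is just the log-odds of a success probability, and a two-line sketch shows why equal raw-score gaps are not equal interval gaps. Python is used here purely for illustration; this is the logit transform only, not a Rasch model:

```python
import math

def logit(p):
    # Log-odds of a proportion: the (arguably) true interval scale
    # that Rasch analysis maps raw scores onto.
    return math.log(p / (1 - p))

# The same 10-point raw gain is a very different gain in logits
# near the middle of the scale versus near the ceiling.
print(round(logit(0.60) - logit(0.50), 2))  # 0.41
print(round(logit(0.95) - logit(0.85), 2))  # 1.21
```

Moving from 85% to 95% correct covers roughly three times the logit distance of moving from 50% to 60%, which is exactly the sense in which raw score points do not represent equal intervals.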

Thinking Beyond the Null Hypothesis Significance Test

In this volume, in Chapter 3, Plonsky examines the proper place of null hypothesis significance testing (NHST) and the associated p values, as well as the importance of examining the descriptive statistics that underlie the NHST and of considering the statistical power of the study and the estimated effect sizes. As far back as the 1970s, I can remember my statistics teachers telling me that doing an analysis of variance (ANOVA) procedure and finding a significant result is just the beginning. They always stressed the importance of considering the assumptions and of following up with planned or post hoc comparisons, with plots of interaction effects, and with careful attention to the descriptive and reliability statistics. In the ensuing years, the importance of also considering confidence intervals (CIs), power, and effect sizes (for more on these concepts see Plonsky, Chapter 3 in this volume; Brown 2007, 2008a, 2011b) has become increasingly evident. All of these advanced follow-up strategies are so much more informative than the initial result that it is downright foolish to stop interpreting the results once you have found a significant p value. Similar arguments can be made for following up on initial multiple-regression results, on complex contingency table analyses, or on any other form of analysis you may perform. The point is that you should never stop just because you got (or didn't get) an initial significant p value. There is so much more to be learned from using follow-up analyses, and more still from thinking about all of your results as one comprehensive picture of what is going on in your data.
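To make one such follow-up concrete, a standardized mean difference (Cohen's d) takes only a few lines to compute from two groups' scores. This is the generic pooled-SD formula rather than code from this book, and the example scores are invented:

```python
import math

def cohens_d(x, y):
    # Standardized mean difference using the pooled standard deviation
    nx, ny = len(x), len(y)
    mean_x, mean_y = sum(x) / nx, sum(y) / ny
    var_x = sum((v - mean_x) ** 2 for v in x) / (nx - 1)
    var_y = sum((v - mean_y) ** 2 for v in y) / (ny - 1)
    pooled_sd = math.sqrt(((nx - 1) * var_x + (ny - 1) * var_y) / (nx + ny - 2))
    return (mean_x - mean_y) / pooled_sd

# Invented posttest scores for a treatment and a comparison group
treatment = [14, 16, 17, 19, 20]
comparison = [12, 13, 15, 16, 17]
print(round(cohens_d(treatment, comparison), 2))  # 1.16
```

Reporting that d alongside the p value tells the reader how large the difference is, not merely whether it cleared a significance threshold.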

Avoiding the Problem of Multiple Comparisons

Another important benefit of using advanced statistical analyses is that they can help you avoid the problem of multiple comparisons, also known technically as avoiding Type I errors (incorrect rejection of a true null hypothesis). This is a problem that arises when a researcher uses a univariate statistical test (one that was designed to make a single hypothesis test at a specified probability level within the NHST framework) multiple times in the same study with the same data. For more advanced ANOVA techniques with post hoc comparisons, or for studies with multiple dependent variables, multivariate ANOVA (or MANOVA) designs can greatly expand the possibilities for controlling or minimizing such Type I errors. These strategies work because they make it possible to analyse more variables simultaneously and adjust for multiple comparisons, thereby giving greater power to the study as a whole and avoiding or minimizing Type I errors. For more on this topic, see Plonsky, Chapter 3 in this volume, or Brown (1990, 2008b).
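The simplest adjustment of this kind, the Bonferroni correction, shows the logic in plain arithmetic. MANOVA and modern post hoc procedures are more sophisticated than this, so treat the Python below only as an illustration of why adjusting helps:

```python
# With 20 comparisons each run at alpha = .05, the familywise Type I
# error rate balloons; testing each at alpha / 20 pulls it back under
# the nominal 5% level.
alpha, k = 0.05, 20

unadjusted = 1 - (1 - alpha) ** k
adjusted = 1 - (1 - alpha / k) ** k

print(round(unadjusted, 3))  # 0.642
print(round(adjusted, 3))    # 0.049
```

The cost of such a blunt correction is power for any single comparison, which is one reason the omnibus multivariate designs discussed here are often preferable.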

Increasing Statistical Power

Another way of looking at the issue of multiple statistical tests just described is that many of the more complex (and multivariate) statistical analyses provide strategies and tools for more powerful tests of significance compared with a series of univariate techniques used with the same data. In the process, using these more omnibus designs, researchers are more likely to focus on CIs, effect sizes, and power instead of indulging in the mania for significance that multiple comparisons exemplify (again see Plonsky, Chapter 3 in this volume).

In addition, as LaFlair, Egbert, and Plonsky point out in Chapter 4, the advanced statistical technique called bootstrapping provides a nonparametric alternative to the t-test and ANOVA that can help to overcome problems of small sample sizes and nonnormal distributions, and do so with increased statistical power. Since many studies in our field have these problems with sample size and normality, bootstrapping is an advanced statistical technique well worth knowing about.
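The core of the percentile bootstrap fits in a few lines. The sketch below (in Python, with invented gain scores; the procedure itself is the topic of Chapter 4) resamples each group with replacement and reads a CI for the mean difference directly off the resampled distribution:

```python
import random

random.seed(42)
group_a = [12, 15, 14, 10, 18, 21, 13, 16]   # e.g., treatment gain scores (invented)
group_b = [ 9, 11, 13,  8, 10, 14, 12,  9]   # e.g., comparison gain scores (invented)

def mean(xs):
    return sum(xs) / len(xs)

def bootstrap_ci(a, b, reps=5000, level=0.95):
    """Percentile CI for mean(a) - mean(b), resampling each group with replacement."""
    diffs = []
    for _ in range(reps):
        ra = [random.choice(a) for _ in a]
        rb = [random.choice(b) for _ in b]
        diffs.append(mean(ra) - mean(rb))
    diffs.sort()
    lo = diffs[int(((1 - level) / 2) * reps)]
    hi = diffs[int((1 - (1 - level) / 2) * reps) - 1]
    return lo, hi

lo, hi = bootstrap_ci(group_a, group_b)
print(f"95% bootstrap CI for the mean difference: [{lo:.2f}, {hi:.2f}]")
```

If the resulting interval excludes zero, the group difference holds up without any normality assumption.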

Broadening Your Research Perspective

More advanced statistical analyses will also encourage you to shift from a myopic focus on single factors or pairs of factors to examining multiple relationships among a number of variables. Thus, you will be more likely to look at the larger picture for patterns. Put another way, you are more likely to consider all parts of


12 James Dean Brown

the picture at the same time, and might therefore see relationships between and among variables (all at once) that you might otherwise have missed or failed to understand.

Indeed, you will gain an even more comprehensive view of the data and results for a particular area of research by learning about and applying an advanced technique called meta-analysis. As Plonsky and Oswald explain (Chapter 6 in this volume), meta-analysis can be defined narrowly as "a statistical method for calculating the mean and the variance of a collection of effect sizes across studies, usually correlations (r) or standardized mean differences (d)" or broadly as "not only these narrower statistical computations, but also the conceptual integration of the literature and the findings that gives the meta-analysis its substantive meaning" (p. 106). Truly, this advanced form of analysis will give you the much broader perspective of comparing the results from a number of (sometimes contradictory) studies in the same area of research.
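The narrow, computational sense of meta-analysis can be sketched directly. The four studies below are hypothetical; each contributes a d value weighted by the inverse of its sampling variance, so larger and more precise studies count for more:

```python
# Invented studies: (d, n_treatment, n_control) for each.
studies = [
    (0.45, 30, 30),
    (0.80, 20, 22),
    (0.10, 55, 50),
    (0.60, 15, 15),
]

def var_d(d, n1, n2):
    """Approximate sampling variance of a standardized mean difference d."""
    return (n1 + n2) / (n1 * n2) + d * d / (2 * (n1 + n2))

weights = [1 / var_d(d, n1, n2) for d, n1, n2 in studies]
d_bar = sum(w * d for w, (d, _, _) in zip(weights, studies)) / sum(weights)
se = (1 / sum(weights)) ** 0.5

print(f"weighted mean d = {d_bar:.2f}, 95% CI = "
      f"[{d_bar - 1.96 * se:.2f}, {d_bar + 1.96 * se:.2f}]")
```

Note how the large, precise study pulls the weighted mean toward its small effect; a simple unweighted average of the four d values would overstate the overall effect.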

Aligning Your Research Analyses More Closely to the Way People Think

Because of their broadened focus, many advanced analyses more closely match the ways that you actually think (or perhaps should think) about your data. More specifically, language learning is complex and complicated to think about, and some of the advanced statistics can account for such complexity by allowing the study of multiple variables simultaneously, which of course provides a richer and more realistic way of looking at data than is provided by examining one single variable at a time or even pairs of variables.

In addition, Hudson (Chapter 5 in this volume) explains the importance of visually representing the data and results and doing so effectively. Two of the follow-up strategies mentioned earlier (plotting the interaction effects and CIs) are often effectively illustrated or explained in graphical representations (as line graphs and box-and-whisker plots, respectively). Indeed, thinking beyond the initial NHST and using more advanced statistical analyses will naturally tend to lead you to use tables and figures to visualize many relationships simultaneously. For example, a table of univariate follow-up tests adjusted for multiple comparisons puts all of the results in one place and forces you and your reader to consider them as a package; a factor analysis table shows the relationships among dozens of variables in one comprehensive way; a Rasch analysis figure can show the relationships between individual examinees' performances and the item difficulty at the same time and on the same scale; and a structural equation model figure shows the relationships among all the variables in a study in an elegant comprehensive picture. Such visual representations will not only help you interpret the complexity and richness of your data and results, but will also help your readers understand your results as a comprehensive set.

Reducing Redundancy and the Number of Variables

Few researchers think about it, but advanced statistical analyses can also help you by reducing the confusion of data that you may face. Since these advanced analyses often require careful screening of the data, redundant variables (e.g., two variables correlating at, say, .90, which means they are probably largely representing the same construct) are likely to be noticed and one of them eliminated (to avoid what is called multicollinearity). In fact, one very useful function of factor analysis (see Loewen & Gonulal, Chapter 9 in this volume) in its many forms is data reduction. For example, if a factor analysis of 32 variables reveals only eight factors, a researcher might want to consider the possibility that there is considerable redundancy in her data. As a result, she may decide to select only those eight variables with the highest loadings on the eight factors, or may decide to collapse (by averaging them) all of the variables loading together on each factor to create eight new variables, or may decide to use the eight sets of factor scores produced by the factor analysis as variables. Whatever choice is made, the study will have gone from 32 variables (with considerable dependencies, relationships, and redundancies among them) to eight variables (that are relatively orthogonal, or independent). Such a reduction in the number of variables will very often have the beneficial effect of increasing the overall power of the study as well as the parsimony in the model being examined (see Jeon, Chapter 7 in this volume).
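Even before a full factor analysis, a simple correlation screen can catch the most blatant redundancy. Here is a hedged Python illustration with invented scores (a vocabulary test, a near-duplicate retest, and a motivation scale):

```python
def pearson_r(x, y):
    """Pearson product-moment correlation between two score vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

data = {  # hypothetical measures from the same eight learners
    "vocab_test":   [55, 60, 48, 72, 66, 59, 70, 63],
    "vocab_retest": [54, 62, 47, 71, 68, 58, 71, 62],  # near-duplicate measure
    "motivation":   [ 3,  5,  2,  4,  5,  3,  4,  4],
}

names = list(data)
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        r = pearson_r(data[names[i]], data[names[j]])
        if abs(r) > 0.90:
            print(f"{names[i]} and {names[j]} overlap (r = {r:.2f}); "
                  "consider dropping or averaging one of them.")
```

The two vocabulary measures correlate near .99 and would be flagged; motivation correlates only moderately with either and survives the screen.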

Expanding the Number and Types of Variables

Paradoxically, while reducing the number of variables, advanced statistical analyses can also afford you the opportunity to expand the number and types of variables in your study in important ways. For instance, research books often devote considerable space to discussions of moderator variables and control variables, but simple univariate analyses do not lend themselves to including those sorts of variables. Fortunately, more complex analyses actually allow including such variables in a variety of ways. More precisely, multivariate analyses allow you to introduce additional moderator variables to determine the links between the independent and dependent variables or to specify the conditions under which those associations take place. Similarly, various multivariate analyses can be structured to include control variables or associations between variables, while examining still other associations (e.g., partial and semi-partial correlations, covariate analyses, hierarchical multiple regressions). Thus, moderator and control variables not only become a reality, but can also help us to more clearly understand the core analyses in a study.

Getting More Flexibility in Your Analyses

Most quantitative research courses offer an introduction to regression analysis, which is a useful form of analysis if you want to estimate the degree of relationship between two continuous scales (i.e., interval or ratio) or to predict one of those scales from the other. However, more advanced statistical analyses offer considerably more flexibility. For instance, multiple regression (see Jeon, Chapter 7 in this volume) allows you the possibility of predicting one dependent variable from multiple continuous and/or categorical independent variables. Discriminant function analysis (see Norris, Chapter 13 in this volume) makes it possible to predict a categorical variable from multiple continuous variables (or more accurately, to determine the degree to which the continuous variables correctly classify membership in the categories). Logistic regression makes it possible to predict a categorical variable such as group membership from categorical or continuous variables, or both. Loglinear modeling can be applied to purely categorical data to test the fit of a regression-like equation to the data. For excellent coverage of all of these forms of analysis, see Tabachnick and Fidell (2013).
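As an illustration of the logistic case, here is a toy fitter in Python (invented pass/fail data predicted from hours of study; a real analysis would use SPSS or a statistics library rather than this hand-rolled gradient ascent):

```python
import math

# Hypothetical data: hours studied and whether each learner passed (1) or failed (0).
hours  = [1, 2, 2, 3, 4, 5, 6, 7, 8, 9]
passed = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

b0, b1, lr = 0.0, 0.0, 0.01
for _ in range(5000):                 # stochastic gradient ascent on the log-likelihood
    for x, y in zip(hours, passed):
        p = sigmoid(b0 + b1 * x)      # current predicted probability of passing
        b0 += lr * (y - p)
        b1 += lr * (y - p) * x

for x in (2, 5, 8):
    print(f"{x} hours -> estimated P(pass) = {sigmoid(b0 + b1 * x):.2f}")
```

The fitted curve maps a continuous predictor onto the probability of category membership, which is what distinguishes logistic regression from predicting a continuous outcome.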

Other advanced statistical procedures provide the flexibility to look beyond simple relationships to patterns in relationships. For example, instead of looking at a correlation coefficient or a matrix of simple correlation coefficients, it is possible to examine patterns in those correlation coefficients by performing factor analysis, which can reveal subsets of variables in a larger set of variables that are related within subsets, yet are fairly independent between subsets. The three types of factor analysis (principal components analysis, factor analysis, and confirmatory factor analysis; see Chapter 9 in this volume for Loewen and Gonulal's explanation of the differences) can help you understand the underlying pattern of relationships among your variables, and thereby help you to: (a) determine which variables are redundant and therefore should be eliminated (as described earlier); (b) decide which variables or combination of variables to use in subsequent analyses; and (c) item-analyze, improve, and/or validate your measures. In contrast, cluster analysis is a "multivariate exploratory procedure that is used to group cases (e.g., participants or texts). Cluster analysis is useful in studies where there is extensive variation among the individual cases within predefined categories" (Staples & Biber, Chapter 11 in this volume, p. 243). Also useful is multiway analysis, which can help you study the associations among three or more categorical variables (see Tabachnick & Fidell, 2013 for more on multiway analysis).

Another form of analysis that provides you with considerable flexibility is structural equation modeling (SEM), which is

a collection of analyses that can be used for many questions in L2 research. SEM can deal with multiple dependent variables and multiple independent variables, and these variables can be continuous, ordinal or discrete [also known as categorical], and they can be indicated as observed variables (i.e., observed scores) or as latent variables (i.e., the underlying factor of a set of observed variables) (Mueller & Hancock, 2008; Ullman, 2006).

(Schoonen, Chapter 10 in this volume, p. 214)


SEM combines ideas that underlie many of the other forms of analysis discussed here, but can additionally be used to model theories (a) to investigate if your data fit them, (b) to compare that fit for several data sets (e.g., for boys and girls), or (c) to examine changes in fit longitudinally.

With regard to means comparisons, mixed effects models (see Cunnings & Finlayson, Chapter 8 in this volume), which by definition are models that include both fixed and random effects, are flexible enough to be used with data that are normally distributed or that are categorical (i.e., nonnumeric). In addition, mixed effects models are especially useful when designs are unbalanced (i.e., groups have different numbers of participants in each) or have missing data. Importantly, if you are studying learning over time, these models can accommodate repeated measures in longitudinal studies.
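The clustering that motivates random effects is easy to demonstrate. The sketch below (invented repeated measures for five learners) estimates an intraclass correlation, i.e., the share of variance due to stable between-learner differences; a high value marks precisely the situation in which a mixed effects model, rather than a model that treats all observations as independent, is needed:

```python
# Hypothetical repeated measures: each learner tested four times.
scores = {
    "s1": [52, 55, 54, 57],
    "s2": [61, 64, 62, 66],
    "s3": [45, 44, 48, 47],
    "s4": [70, 73, 71, 75],
    "s5": [58, 57, 60, 62],
}

k = 4                       # measurements per learner
n = len(scores)             # number of learners
grand = sum(sum(v) for v in scores.values()) / (n * k)

# One-way variance decomposition: between-learner vs. within-learner.
ms_between = k * sum((sum(v) / k - grand) ** 2 for v in scores.values()) / (n - 1)
ms_within = sum((x - sum(v) / k) ** 2
                for v in scores.values() for x in v) / (n * (k - 1))

# ICC(1): proportion of total variance due to between-learner differences.
icc = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
print(f"ICC = {icc:.2f} (share of variance from stable learner differences)")
```

With an ICC this high, the repeated observations are far from independent, and a random intercept per learner is the natural way to model that dependency.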

Simultaneously Addressing Multiple Levels of Analysis

Advanced statistical analyses, especially multivariate analyses, also encourage researchers to use more than one level of analysis. Indeed, these advanced analyses can provide multiple levels of analysis that help in examining data and the phenomena they represent in overarching ways. A simple example is provided by MANOVA, which is a first stage that can justify examining multiple univariate ANOVAs (with p values adjusted for the multiple comparisons) in a second stage. Stepwise regression or hierarchical/sequential versions of various analyses allow researchers to analyze predictor variables and combinations of variables in stages, even while factoring out another variable or combination of variables.

Similarly, Bayesian data analysis as Mackey and Ross apply it to item analysis in Chapter 14 (in this volume) not only provides an alternative to NHST ANOVA approaches, but in fact,

The conceptual difference between null hypothesis testing and the Bayesian alternative is that predictions about mean differences are stated a priori in a hierarchy of differences as motivated by theory-driven claims. In this approach, the null hypothesis is typically superfluous, as the researchers aim to confirm that the predicted order of mean differences are instantiated in the data. Support for the hierarchically ordered means hypothesis is evident only if the predicted order of mean differences is observed. The predicted and plausible alternative hypotheses thus must be expressed in advance of the data analysis—thus making the subsequent ANOVA confirmatory.

(Mackey & Ross, Chapter 14 in this volume, p. 334)

Clearly, this advanced alternative form of analysis not only provides a means for examining data hierarchically and with consideration to previous findings and/or theoretical predictions, but in fact, it also demands that the data be examined in that way from the outset.


What Are the Disadvantages of Using Advanced Quantitative Methods?

So far, I have shown some of the numerous advantages of learning more about advanced statistical analyses. But given the investment of time and energy involved, the disadvantages of using these advanced techniques should be weighed as well. I will take up those issues next. By disadvantages, I mean the difficulties that are likely to be encountered in learning and using advanced quantitative methods like those covered in this book.

Larger Sample Sizes

Many of the advanced statistical procedures require larger sample sizes than the more traditional and simpler univariate analyses. The sample sizes often need to be in the hundreds, if not bigger, in order to produce meaningful and interpretable results. The central problem with applying many of these advanced statistics to small samples is that the standard errors of all the estimates will tend to be large, which may make analyzing the results meaningless.
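The arithmetic behind this point is simple: the standard error of a mean shrinks only with the square root of n, so quadrupling the sample merely halves the uncertainty. A quick illustration with a hypothetical SD of 15:

```python
sd = 15.0  # hypothetical population SD of some test score
for n in (25, 100, 400):
    print(f"n = {n:3d}: standard error of the mean = {sd / n ** 0.5:.2f}")
# Each fourfold increase in n halves the SE (3.00 -> 1.50 -> 0.75).
```

Multivariate procedures estimate many such quantities at once, so the instability of small samples compounds across every parameter in the model.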

Unfortunately, getting large sample sizes is often difficult because you will need to get people to cooperate and to get approval from human subjects committees. Getting people to cooperate is a problem largely because people are busy and, more to the point, they do not feel that your study is as important as you feel it is. Getting human subjects committees to approve your research can also be vexingly difficult because those committees are often made up of researchers from other fields who have little sympathy for or understanding of the problems of doing L2 research. Nonetheless, for those doing advanced statistical analyses, getting an adequate sample size is crucial, so the data gathering stage in the research process is an important place to invest a good deal of your time and energy.

Additional Assumptions

Another disadvantage of the more advanced statistical procedures is that they tend to require that additional assumptions be met. Where a simple correlation coefficient will have three assumptions, a multiple regression analysis will have at least five assumptions, two of which will require the data screening discussed in the next section. In addition, whereas for univariate statistics a good deal is known about the robustness of violating assumptions (e.g., it is known that ANOVA is fairly robust to violations of the assumption of equal variances if the cell sizes are fairly similar), less is known about such robustness in the more complex designs of advanced statistical procedures. For a summary of assumptions underlying univariate and some multivariate statistics, see Brown (1992), or for multivariate statistics, see the early sections of each of the chapters in Tabachnick and Fidell (2013).


Need for Data Screening

In analyzing whether the data in a study meet the assumptions of advanced statistical procedures, data screening is often essential. For example, univariate normality (for each variable) and multivariate normality (for all variables taken together) are assumptions of a number of the more advanced forms of statistical analysis. Screening the data to see if these assumptions are met means examining the data for univariate and multivariate outliers, as well as examining skew and kurtosis statistics for each variable and sometimes looking at the actual histograms to ensure that they look approximately normal. Not only are such procedures tedious and time consuming, but also they may require you to eliminate cases that are outliers, change some of your data points to bring them into the distribution, or mathematically transform the data for one variable or more. Such moves are not difficult, but they are tedious. In addition, they are hard to explain to the readers of a study in applied linguistics and may seem to those readers as though you are manipulating your data. Worse yet, moves like mathematical transformations take the analysis one step away from the original data, which may start to become uncomfortable even for you (e.g., what does the correlation mean between a normally distributed scale and one transformed with its natural log, and how do you explain that to your readers?). Nonetheless, the assumptions of advanced procedures and the subsequent data screening may make such strategies absolutely necessary.
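A minimal screening pass of the kind just described can be sketched as follows (all values invented): compute skew and kurtosis, flag extreme standardized scores, and check whether a natural-log transform tames a positively skewed variable:

```python
import math

def moments(xs):
    """Mean, SD, skewness, and excess kurtosis (population formulas)."""
    n = len(xs)
    m = sum(xs) / n
    sd = (sum((x - m) ** 2 for x in xs) / n) ** 0.5
    skew = sum(((x - m) / sd) ** 3 for x in xs) / n
    kurt = sum(((x - m) / sd) ** 4 for x in xs) / n - 3
    return m, sd, skew, kurt

# 1. Outlier check on hypothetical reaction times (ms): one extreme case.
rts = [420, 455, 390, 510, 480, 445, 430, 2100, 470, 460]
m, sd, _, _ = moments(rts)
outliers = [x for x in rts if abs(x - m) / sd > 2.5]
print(f"flagged as outliers (|z| > 2.5): {outliers}")

# 2. Transform check on a positively skewed (roughly geometric) variable.
counts = [100, 120, 150, 190, 240, 300, 380, 480, 600, 760]
skew_raw = moments(counts)[2]
skew_log = moments([math.log(x) for x in counts])[2]
print(f"skew before ln transform: {skew_raw:.2f}; after: {skew_log:.2f}")
```

Whatever a routine like this flags, the decision about what to do (trim, winsorize, transform, or leave alone) remains a substantive one that must be reported and defended.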

Complexity of Analyses and Interpretations

There is no question that advanced statistical techniques, especially multivariate ones, are more difficult to analyze and interpret. First, because they involve higher-level mathematics than univariate statistics, you may find yourself learning things like matrix algebra for the first time in your life. Second, because many of the analyses involve tedious recursive procedures, it is absolutely essential to use statistical computer programs (many of which are very expensive) to analyze the data. Third, the results in the computer output of advanced statistical techniques, especially multivariate ones, are often much more difficult to interpret than those from simpler univariate statistical analyses. In short, as Tabachnick and Fidell (2013) put it: "Multivariate methods are more complex than univariate by at least an order of magnitude" (p. 1).

Are the Disadvantages Really Disadvantages?

Fortunately, I have noticed over the years that the disadvantages of learning and using advanced quantitative methods most often lead to long-term advantages.

Larger Sample Sizes

For example, the need to obtain large sample sizes forces you to get responsibly large sample sizes. These large sample sizes lead in the long run to more stable results, a higher probability of finding significant results if they exist, more powerful results, and ultimately to more credible results in your own mind as well as in the minds of your readers.

Additional Assumptions

Checking the more elaborate assumptions of advanced statistical tests forces you to slow down at the beginning of your analyses and think about the descriptive statistics, the shapes of the distributions involved, the reliability of various measurements, the amounts of variance involved and accounted for, the degrees of redundancy among variables, any univariate or multivariate outliers, and so forth. Ultimately, all of this taken together with the results of the study can and should lead to greater understanding of your data and results.

Need for Data Screening

The need for data screening similarly forces you to consider descriptive statistics, distributions, reliability, variance, redundancy, and outliers in the data, but at a time when something can be done to make the situation better by eliminating outliers or bringing them into the relevant distribution, by transforming variables that are skewed, and so forth. Even if you cannot fix a problem that you have noticed in data screening, at the very least, you will have been put on notice that a problem exists (or an assumption has been violated) such that this information can be taken into account when you interpret the results later in the study.

Complexity of Analyses and Interpretations

In discussing the complexity issue, I mentioned earlier that Tabachnick and Fidell (2013) said that "Multivariate methods are more complex than univariate by at least an order of magnitude." But it is worth noting what they said directly after that: "However, for the most part, the greater complexity requires few conceptual leaps. Familiar concepts such as sampling distributions and homogeneity of variance simply become more elaborate" (p. 1). Moreover, given the advantages of using advanced statistical techniques, they may well (a) force you to learn matrix algebra for the first time in your life, which will not only make it possible for you to understand the more advanced statistics, but also make the math underlying the simpler statistics seem like child's play; (b) motivate you to find a grant to pay for the computer software you need, or some other way to get your institution to pay for it, or indeed, to finally sit down and learn R, which is free; and (c) push you to finally get the manuals and/or supplementary books you need to actually understand the output and results of your more elaborate statistical analyses, and again, doing so will make the output from simpler statistical analyses seem like child's play. In short, the added complexity involved in advanced statistical analyses is not all bad. Indeed, it can lead you to exciting places you never thought you would go.

Conclusion

In writing this chapter, I wrestled with using the word advantages. Perhaps it is better to think about the advanced procedures described here as opening up options rather than as having advantages—but then it occurred to me that people with those options will have distinct advantages, so I stayed with the idea of advantages.

That is not to say that using advanced statistics, especially multivariate analyses, for every study will be the best way to go. For example, I once had a student who hated statistics so much that he set out to write a paper that used only descriptive statistics and a single t-test, and he did it, writing an elegant, straightforward, and interesting paper. Simple as it was, he was using exactly the right tools for that research project.

However, learning new, advanced statistical techniques can help you to stay interested and up-to-date in your research. Having multiple options can also help you avoid getting stuck in a statistical rut. For instance, I know of one researcher in our field who clearly learned multiple regression (probably for her dissertation) and has used that form of analysis repeatedly and almost exclusively across a number of studies. She is clearly stuck in a statistical rut. She is holding a hammer, so she uses it for everything, including screws. I just wish she would extend her knowledge to include some other advanced statistical procedures, especially extensions of regression like factor analysis or SEM.

The bottom line here is that advanced statistics like those covered in this book can be useful and even exciting to learn, but the harsh reality is that these forms of analysis will mean nothing without good ideas, solid research designs, reliable measurement, sound data collection, adequate data screening, careful checking of assumptions, and comprehensive interpretations that include all facets of the data, their distributions, and all of the statistics in the study.

Fortunately, you have this book in your hands. I say fortunately because this collection of chapters is a particularly good place for L2 researchers to start expanding their knowledge of advanced statistical procedures: It covers advanced statistical techniques; it was written by L2 researchers; it was written for L2 researchers; and it contains examples drawn from L2 research.

Good researching!

References

Brown, J. D. (1990). The use of multiple t tests in language research. TESOL Quarterly, 24(4), 770–773.

Brown, J. D. (1992). Statistics as a foreign language—Part 2: More things to look for in reading statistical language studies. TESOL Quarterly, 26(4), 629–664.
