Foreword by Craig Mundie, Chief Research and Strategy Officer, Microsoft
Praise for Concurrent Programming on Windows
"I have been fascinated with concurrency ever since I added threading support
to the Common Language Runtime a decade ago That's also where I met Joe, who is a world expert on this topic These days, concurrency is a first-order concern for practically all developers Thank goodness for Joe's book It is a tour
de force and I shall rely on it for many years to come."
-Chris Brumme, Distinguished Engineer, Microsoft
"I first met Joe when we were both working with the Microsoft CLR team At that time, we had several discussions about threading and it was apparent that
he was as passionate about this subject as I was Later, Joe transitioned to Microsoft's Parallel Computing Platform team where a lot of his good ideas about threading could come to fruition Most threading and concurrency books that I have come across contain information that is incorrect and explains how
to solve contrived problems that good architecture would never get you into in the first place Joe's book is one of the very few books that I respect on the matter, and this respect comes from knowing Joe's knowledge, experience, and his ability to explain concepts."
-Jeffrey Richter, Wintellect
"There are few areas in computing that are as important, or shrouded in mystery,
as concurrency It's not simple, and Duffy doesn't claim to make it so-but armed with the right information and excellent advice, creating correct and highly scalable systems is at least possible Every self-respecting Windows developer should read this book."
-Jonathan Skeet, Software Engineer, Clearswift
"What I love about this book is that it is both comprehensive in its coverage of concurrency on the Windows platform, as well as very practical in its presen tation of techniques immediately applicable to real-world software devel opment Joe's book is a 'must have' resource for anyone building native or managed code Windows applications that leverage concurrency!"
-Steve Teixeira, Product Unit Manager, Parallel Computing Platform, Microsoft Corporation
Trang 3"This book is a fabulous compendium of both theoretical knowledge and practical guidance on writing effective concurrent applications Joe Duffy is not only a preeminent expert in the art of developing parallel applications for Windows, he's also a true student of the art of writing For this book, he has combined those two skill sets to create what deserves and is destined to be a
-Stephen Toub, Program Manager Lead, Parallel Computing Platform, Microsoft
"As chip designers run out of ways to make the individual chip faster, they have moved towards adding parallel compute capacity instead. Consumer PCs with multiple cores are now commonplace. We are at an inflection point where improved performance will no longer come from faster chips but rather from our ability as software developers to exploit concurrency. Understanding the concepts of concurrent programming and how to write concurrent code has therefore become a crucial part of writing successful software. With Concurrent Programming on Windows, Joe Duffy has done a great job explaining concurrent concepts from the fundamentals through advanced techniques. The detailed descriptions of algorithms and their interaction with the underlying hardware turn a complicated subject into something very approachable. This book is the perfect companion to have at your side while writing concurrent software for Windows."
-Jason Zander, General Manager, Visual Studio, Microsoft
Concurrent Programming on Windows
Microsoft .NET Development Series
John Montgomery, Series Advisor
Don Box, Series Advisor
Brad Abrams, Series Advisor
The award-winning Microsoft .NET Development Series was established in 2002 to provide professional developers with the most comprehensive and practical coverage of the latest .NET technologies. It is supported and developed by the leaders and experts of Microsoft development technologies, including Microsoft architects, MVPs, and leading industry luminaries. Books in this series provide a core resource of information and understanding every developer needs to write effective applications.
Titles in the Series
Brad Abrams, .NET Framework Standard Library Annotated Reference, Volume 1: Base Class Library and Extended Numerics Library, 978-0-321-15489-7
Brad Abrams and Tamara Abrams, .NET Framework Standard Library Annotated Reference, Volume 2: Networking Library, Reflection Library, and XML Library
Adam Calderon, Joel Rumerman, Advanced ASP.NET AJAX Server Controls: For .NET Framework 3.5, 978-0-321-51444-8
Eric Carter and Eric Lippert, Visual Studio Tools for Office: Using C# with Excel, Word, Outlook, and InfoPath, 978-0-321-33488-6
Eric Carter and Eric Lippert, Visual Studio Tools for Office: Using Visual Basic 2005 with Excel, Word, Outlook, and InfoPath, 978-0-321-41175-4
Steve Cook, Gareth Jones, Stuart Kent, Alan Cameron Wills, Domain-Specific Development with Visual Studio DSL Tools, 978-0-321-39820-8
Krzysztof Cwalina and Brad Abrams, Framework Design Guidelines: Conventions, Idioms, and Patterns for Reusable .NET Libraries, Second Edition, 978-0-321-54561-9
Joe Duffy, Concurrent Programming on Windows, 978-0-321-43482-1
Sam Guckenheimer and Juan J. Perez, Software Engineering with Microsoft Visual Studio Team System, 978-0-321-27872-2
Anders Hejlsberg, Mads Torgersen, Scott Wiltamuth, Peter Golde, The C# Programming Language, Third Edition, 978-0-321-56299-9
Alex Homer and Dave Sussman, ASP.NET 2.0 Illustrated, 978-0-321-41834-0
Joe Kaplan and Ryan Dunn, The .NET Developer's Guide to Directory Services Programming, 978-0-321-35017-6
Mark Michaelis, Essential C# 3.0: For .NET Framework 3.5, 978-0-321-53392-0
James S. Miller and Susann Ragsdale, The Common Language Infrastructure Annotated Standard, 978-0-321-15493-4
Christian Nagel, Enterprise Services with the .NET Framework: Developing Distributed Business Solutions with .NET Enterprise Services, 978-0-321-24673-8
Brian Noyes, Data Binding with Windows Forms 2.0: Programming Smart Client Data Applications with .NET, 978-0-321-26892-1
Brian Noyes, Smart Client Deployment with ClickOnce: Deploying Windows Forms Applications with ClickOnce, 978-0-321-19769-6
Fritz Onion with Keith Brown, Essential ASP.NET 2.0, 978-0-321-23770-5
Steve Resnick, Richard Crane, Chris Bowen, Essential Windows Communication Foundation: For .NET Framework 3.5, 978-0-321-44006-8
Scott Roberts and Hagen Green, Designing Forms for Microsoft Office InfoPath and Forms Services 2007, 978-0-321-41059-7
Neil Roodyn, eXtreme .NET: Introducing eXtreme Programming Techniques to .NET Developers, 978-0-321-30363-9
Chris Sells and Michael Weinhardt, Windows Forms 2.0 Programming, 978-0-321-26796-2
Dharma Shukla and Bob Schmidt, Essential Windows Workflow Foundation, 978-0-321-39983-0
Guy Smith-Ferrier, .NET Internationalization: The Developer's Guide to Building Global Windows and Web Applications, 978-0-321-34138-9
Will Stott and James Newkirk, Visual Studio Team System: Better Software Development for Agile Teams, 978-0-321-41850-0
Paul Yao and David Durant, .NET Compact Framework Programming with C#, 978-0-321-17403-1
Paul Yao and David Durant, .NET Compact Framework Programming with Visual Basic .NET, 978-0-321-17404-8
Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations have been printed with initial capital letters or in all capitals.
The .NET logo is either a registered trademark or trademark of Microsoft Corporation in the United States and/or other countries and is used under license from Microsoft.
The author and publisher have taken care in the preparation of this book, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein.

The publisher offers excellent discounts on this book when ordered in quantity for bulk purchases or special sales, which may include electronic versions and/or custom covers and content particular to your business, training goals, marketing focus, and branding interests. For more information, please contact:
U.S. Corporate and Government Sales
Visit us on the Web: informit.com/aw
Library of Congress Cataloging-in-Publication Data
Duffy, Joe, 1980-
  Concurrent programming on Windows / Joe Duffy.
    p. cm.
  Includes bibliographical references and index.
  ISBN 978-0-321-43482-1 (pbk. : alk. paper) 1. Parallel programming (Computer science)
2. Electronic data processing--Distributed processing. 3. Multitasking (Computer science)
4. Microsoft Windows (Computer file) I. Title.
Pearson Education, Inc.
Rights and Contracts Department
501 Boylston Street, Suite 900
Boston, MA 02116
Fax (617) 671-3447
ISBN-13: 978-0-321-43482-1
ISBN-10: 0-321-43482-X
Text printed in the United States on recycled paper at Edwards Brothers in Ann Arbor, Michigan.
First printing, October 2008
For Mom & Dad
Contents at a Glance
5 Windows Kernel Synchronization
6 Data and Control Synchronization
7 Thread Pools
8 Asynchronous Programming Models
9 Fibers
PART III Techniques
10 Memory Models and Lock Freedom
11 Concurrency Hazards
12 Parallel Containers
13 Data and Task Parallelism
14 Performance and Scalability
PART IV Systems
15 Input and Output
16 Graphical User Interfaces
PART V Appendices
A Designing Reusable Libraries for Concurrent .NET Programs
B Parallel Extensions to .NET
Index
Contents
Why Not Concurrency?
Where Are We?
2 Synchronization and Time
Managing Program State
Identifying Shared vs. Private State
State Machines and Time
Isolation
Immutability
Synchronization: Kinds and Techniques
Data Synchronization
Coordination and Control Synchronization
Where Are We?
PART II Mechanisms
3 Threads
Threading from 10,001 Feet
What Is a Windows Thread?
What Is a CLR Thread?
Explicit Threading and Alternatives
The Life and Death of Threads
Thread Creation
Thread Termination
DllMain
Thread Local Storage
Where Are We?
4 Advanced Threads
Thread State
User-Mode Thread Stacks
Internal Data Structures (KTHREAD, ETHREAD, TEB)
Contexts
Inside Thread Creation and Termination
Thread Creation Details
Thread Termination Details
Thread Scheduling
Thread States
Priorities
Quantums
Priority and Quantum Adjustments
Sleeping and Yielding
Suspension
Affinity: Preference for Running on a Particular CPU
Where Are We?
5 Windows Kernel Synchronization
The Basics: Signaling and Waiting
Why Use Kernel Objects?
Waiting in Native Code
Managed Code
Asynchronous Procedure Calls (APCs)
Using the Kernel Objects
Mutex
Semaphore
A Mutex/Semaphore Example: Blocking/Bounded Queue
Auto- and Manual-Reset Events
Waitable Timers
Signaling an Object and Waiting Atomically
Debugging Kernel Objects
Where Are We?
6 Data and Control Synchronization
Mutual Exclusion
Win32 Critical Sections
CLR Locks
Reader/Writer Locks (RWLs)
Windows Vista Slim Reader/Writer Lock
.NET Framework Slim Reader/Writer Lock (3.5)
.NET Framework Legacy Reader/Writer Lock
Condition Variables
Windows Vista Condition Variables
.NET Framework Monitors
7 Thread Pools
Windows Thread Pools
Windows Vista Thread Pool
Legacy Win32 Thread Pool
Where Are We?
8 Asynchronous Programming Models
Asynchronous Programming Model (APM)
Rendezvousing: Four Ways
Implementing IAsyncResult
Where the APM Is Used in the .NET Framework
ASP.NET Asynchronous Pages
Event-Based Asynchronous Pattern
The Basics
Supporting Cancellation
Supporting Progress Reporting and Incremental Results
Where the EAP Is Used in the .NET Framework
Where Are We?
9 Fibers
An Overview of Fibers
Upsides and Downsides
Using Fibers
Creating New Fibers
Converting a Thread into a Fiber
Determining Whether a Thread Is a Fiber
Switching Between Fibers
Deleting Fibers
An Example of Switching the Current Thread
Additional Fiber-Related Topics
Fiber Local Storage (FLS)
Thread Affinity
A Case Study: Fibers and the CLR
Building a User-Mode Scheduler
The Implementation
A Word on Stack vs. Stackless Blocking
Where Are We?
PART III Techniques
10 Memory Models and Lock Freedom
Memory Load and Store Reordering
What Runs Isn't Always What You Wrote
Critical Regions as Fences
Data Dependence and Its Impact on Reordering
Hardware Atomicity
The Atomicity of Ordinary Loads and Stores
Interlocked Operations
Memory Consistency Models
Hardware Memory Models
Memory Fences
.NET Memory Models
Lock Free Programming
Examples of Low-Lock Code
Lazy Initialization and Double-Checked Locking
A Nonblocking Stack and the ABA Problem
Dekker's Algorithm Revisited
Where Are We?
11 Concurrency Hazards
Correctness Hazards
Data Races
Recursion and Reentrancy
Locks and Process Shutdown
Priority Inversion and Starvation
Where Are We?
12 Parallel Containers
Where Are We?
13 Data and Task Parallelism
Where Are We?
14 Performance and Scalability
Parallel Hardware Architecture
SMP, CMP, and HT
Superscalar Execution
The Memory Hierarchy
A Brief Word on Profiling in Visual Studio
Speedup: Parallel vs. Sequential Code
Deciding to "Go Parallel"
Measuring Improvements Due to Parallelism
Amdahl's Law
Critical Paths and Load Imbalance
Garbage Collection and Scalability
15 Input and Output
Win32 Asynchronous I/O
.NET Framework Asynchronous I/O
I/O Cancellation
Asynchronous I/O Cancellation for the Current Thread
Synchronous I/O Cancellation for Another Thread
Asynchronous I/O Cancellation for Any Thread
Where Are We?
16 Graphical User Interfaces
GUI Threading Models
Single Threaded Apartments (STAs)
Responsiveness: What Is It, Anyway?
.NET Asynchronous GUI Features
.NET GUI Frameworks
Synchronization Contexts
Asynchronous Operations
A Convenient Package: BackgroundWorker
Where Are We?
PART V Appendices
A Designing Reusable Libraries for Concurrent .NET Programs
The 20,000-Foot View
The Details
Locking Models
Using Locks
Reliability
Scheduling and Threads
Scalability and Performance
Blocking
B Parallel Extensions to .NET
Task Parallel Library
Unhandled Exceptions
Parents and Children
Cancellation
Futures
Continuations
Task Managers
Putting it All Together: A Helpful Parallel Class
Self-Replicating Tasks
Parallel LINQ
Buffering and Merging
Order Preservation
Synchronization Primitives
ISupportsCancelation
CountdownEvent
LazyInit<T>
ManualResetEventSlim
SemaphoreSlim
SpinLock
SpinWait
Concurrent Collections
BlockingCollection<T>
ConcurrentQueue<T>
ConcurrentStack<T>
Index
Foreword
THE COMPUTER INDUSTRY is once again at a crossroads. Hardware concurrency, in the form of new manycore processors, together with growing software complexity, will require that the technology industry fundamentally rethink both the architecture of modern computers and the resulting software development paradigms.

For the past few decades, the computer has progressed comfortably along the path of exponential performance and capacity growth without any fundamental changes in the underlying computation model. Hardware followed Moore's Law, clock rates increased, and software was written to exploit this relentless growth in performance, often ahead of the hardware curve. That symbiotic hardware-software relationship continued unabated until very recently. Moore's Law is still in effect, but gone is the unnamed law that said clock rates would continue to increase commensurately.
The reasons for this change in hardware direction can be summarized by a simple equation, formulated by David Patterson of the University of California at Berkeley:
Power Wall + Memory Wall + ILP Wall = A Brick Wall for Serial Performance
Power dissipation in the CPU increases proportionally with clock frequency, imposing a practical limit on clock rates. Today, the ability to dissipate heat has reached a practical physical limit. As a result, a significant
We have, therefore, arrived at an inflection point. The software ecosystem must evolve to better support manycore systems, and this evolution will take time. To benefit from rapidly improving computer performance and to retain the "write once, run faster on new hardware" paradigm, the programming community must learn to construct concurrent applications. Broader adoption of concurrency will also enable Software + Services through asynchrony and loose-coupling, client-side parallelism, and server-side cloud computing.

The Windows and .NET Framework platforms offer rich support for concurrency. This support has evolved over more than a decade, since the introduction of multiprocessor support in Windows NT. Continued improvements in thread scheduling performance, synchronization APIs, and memory hierarchy awareness (particularly those added in Windows Vista) make Windows the operating system of choice for maximizing the use of hardware concurrency. This book covers all of these areas. When you begin using multithreading throughout an application, the importance of clean architecture and design is critical to reducing software complexity and improving maintainability. This places an emphasis on understanding not only the platform's capabilities but also emerging best practices. Joe does a great job interspersing best practice alongside mechanism throughout this book.
Manycore provides improved performance for the kinds of applications we already create. But it also offers an opportunity to think completely differently about what computers should be able to do for people. The continued increase in compute power will qualitatively change the applications that we can create in ways that make them a lot more interesting and helpful to people, and able to do new things that have never been possible in the past. Through this evolution, software will enable more personalized and humanistic ways for us to interact with computers. So enjoy this book. It offers a lot of great information that will guide you as you take your first steps toward writing concurrent, manycore-aware software on the Windows platform.
Craig Mundie
Chief Research and Strategy Officer
Microsoft Corporation
June 2008
Preface
I BEGAN WRITING this book toward the end of 2005. At the time, dual-core processors were becoming standard on the mainstream PCs that ordinary (nonprogrammer) consumers were buying, and a small number of people in industry had begun to make noise about the impending concurrency problem. (Herb Sutter's paper, The Free Lunch Is Over, immediately comes to mind.) The problem people were worried about, of course, was that the software of the past was not written in a way that would allow it to naturally exploit that additional compute power. Contrast that with the never-ending increase in clock speeds. No more free lunch, indeed.
It seemed to me that concurrency was going to eventually be an important part of every software developer's job and that a book such as this one would be important and useful. Just two years later, the impact is beginning to ripple up from the OS, through the libraries, and on up to applications themselves.
This was about the same time I had just wrapped up prototyping a small side project I had been working on for six months, called Parallel Language Integrated Query (PLINQ). The PLINQ project was a conduit for me to explore the intricacies of concurrency, multicore, and specifically how parallelism might be used in real-world, everyday programs. I used it as a tool to figure out where the platform was lacking. This was in addition to spending my day job at Microsoft focused on software transactional memory (STM), a technology that in the intervening two years has become somewhat of an industry buzzword. Needless to say, I had become pretty
I set out to write a book that I'd have found fascinating and a useful way to shortcut all of the random bits of information I had to learn throughout. Although it took me a surprisingly long two-and-a-half years to finish this book, the state of the art has evolved slowly, and the state of good books on the topic hasn't changed much either. The result of my efforts, I hope, is a new book that is down to earth and useful, but still full of very deep technical information. It is for any Windows or .NET developer who believes that concurrency is going to be a fundamental requirement of all software somewhere down the road, as all industry trends seem to imply.

I look forward to kicking back and enjoying this book. And I sincerely hope you do too.
Book Structure
I've structured the book into four major sections. The first, Concepts, introduces concurrency at a high level without going too deep into any one topic. The next section, Mechanisms, focuses squarely on the fundamental platform features, inner workings, and API details. After that, the Techniques section describes common patterns, best practices, algorithms, and data structures that emerge while writing concurrent software. The fourth section, Systems, covers many of the system-wide architectural and process concerns that frequently arise. There is a progression here. Concepts is first because it develops a basic understanding of concurrency in general. Understanding the content in Techniques would be difficult without a solid understanding of the Mechanisms, and similarly, building real Systems would be impossible without understanding the rest. There are also two appendices.
• Microsoft Windows SDK. The latest versions as of this writing are the Windows Vista and Server 2008 SDKs.
• Microsoft .NET Framework SDK. This includes the Microsoft C# and Visual Basic compilers, and relevant framework libraries. The latest version as of this writing is the .NET Framework 3.5 SDK.

Both can be found on MSDN: http://msdn.microsoft.com.
In addition, it's highly recommended that you consider using Visual Studio. This is not required (in fact, much of the code in this book was written in emacs), but it provides for a more seamless development and debugging experience. Visual Studio 2008 Express Edition can be downloaded for free, although it lacks many useful capabilities such as performance profiling.
Finally, the debugging tools for Windows package, which includes the popular WINDBG debugging utility, can also come in handy, particularly if you don't have Visual Studio. It is freely downloadable from http://www.microsoft.com. Similarly, the Sysinternals utilities available from http://technet.microsoft.com/sysinternals are quite useful for inspecting aspects of the Windows OS.
Acknowledgments
MANY PEOPLE HAVE helped with the creation of this book, both directly and indirectly.
First, I have to sincerely thank Chris Brumme and Jan Gray for inspiring me to get the concurrency bug several years ago. You've both been incredibly supportive and have helped me at every turn in the road. This has led to not only this book but a never-ending stream of career, technical, and personal growth opportunities. I'm still not sure how I'll ever repay you guys. Also, thanks to Herb Sutter, who was instrumental in getting this book's contract in the first place. And also to Craig Mundie for writing a terrific Foreword and, of course, leading Microsoft and the industry as a whole into our manycore future.
Vance Morrison deserves special thanks for not only being a great mentor along the way, but also for being the toughest technical reviewer I've ever had. His feedback pushed me really hard to keep things concise and relevant. I haven't even come close to attaining his vision of what this book could have been, but I hope I'm not too far afield from it.
Next, in alphabetical order, many people helped by reviewing the book, discussing ideas along the way, or answering questions about how things work (or were supposed to work): David Callahan, Neill Clift, Dave Detlefs, Yves Dolce, Patrick Dussud, Can Erten, Eric Eilebrecht, Ed Essey, Kang Su Gatlin, Goetz Graefe, Kim Greenlee, Vinod Grover, Brian Grunkemeyer, Niklas Gustafsson, Tim Harris, Anders Hejlsberg, Jim Larus, Eric Li, Weiwen Liu, Mike Magruder, Jim Miller, Igor Ostrovsky, Joel Pobar, Jeff Richter, Paul Ringseth, Burton Smith, Stephen Toub, Roger Wolff, and Keith Yedlin. For those reviewers who were constantly promised drafts of chapters that never actually materialized on time, well, I sincerely appreciate the patience.
Infinite thanks also go out to the staff from Addison-Wesley. In particular, I'd like to give a big thanks to Joan Murray. You've been the only constant throughout the whole project and have to be the most patient person I've ever worked with. When I originally said the book would only take eight months, I wasn't lying intentionally. Hey, a 22-month underestimate isn't too bad, right? Only a true software developer would say that.
About the Author
Joe Duffy is the development lead, architect, and founder of the Parallel Extensions to the .NET Framework team at Microsoft, in the Visual Studio division. In addition to hacking code and managing a team of amazing developers, he defines the team's long-term vision and strategy. His current interests include functional programming, first-class concurrency safety in the type system, and creating programming models that will enable everyday people to exploit GPUs and SIMD-style processors. Joe previously held positions at Microsoft as the developer for Parallel LINQ (PLINQ) and the concurrency program manager for the Common Language Runtime (CLR). Before joining Microsoft, he had seven years of professional programming experience, including four years at EMC. He was born in Massachusetts and currently lives in Washington. While not indulging in technical excursions, Joe spends his time playing guitar, studying music theory, listening to and writing music, and feeding his wine obsession.
PART I
Concepts
1
Introduction
CONCURRENCY IS EVERYWHERE. No matter whether you're doing server-side programming for the web or cloud computing, building a responsive graphical user interface, or creating a new interactive client application that uses parallelism to attain better performance, concurrency is ever present. Learning how to deal with concurrency when it surfaces and how to exploit it to deliver more capable and scalable software is necessary for a large category of software developers and is the main focus of this book.

Before jumping straight into the technical details of how to use concurrency when developing software, we'll begin with a conceptual overview of concurrency, some of the reasons it can be important to particular kinds of software, the role it plays in software architecture, and how concurrency will fit progressively into layers of software in the future.

Everything in this chapter, and indeed most of the content in this book, applies equally to programs written in native C++ as it does to programs written in the .NET Framework.
Why Concurrency?
There are many reasons why concurrency may be interesting to you.

• You are programming in an environment where concurrency is already pervasive. This is common in real-time systems, OS programming, and server-side programming. It is the reason, for example, that most database programmers must become deeply familiar with the notion of a transaction before they can truly be effective at their jobs.

• You need to maintain a responsive user interface (UI) while performing some compute- or I/O-intensive activity in response to some user input. In such cases, running this work on the UI thread will lead to poor responsiveness and frustrated end users. Instead, concurrency can be used to move work elsewhere, dramatically improving the responsiveness and user experience (see the sketch after this list).

• You'd like to exploit the asynchrony that already exists in the relationship between the CPU running your program and other hardware devices. (They are, after all, separately operating and independent pieces of hardware.) Windows and many device drivers cooperate to ensure that large I/O latencies do not severely impact program performance. Using these capabilities requires that you rewrite code to deal with concurrent orchestration of events.

• Some problems are more naturally modeled using concurrency. Games, AI, and scientific simulations often need to model interactions among many agents that operate mostly independently of one another, much like objects in the real world. These interactions are inherently concurrent. Stream processing of real-time data feeds, where the data is being generated in the physical world, typically requires the use of concurrency. Telephony switches are inherently massively concurrent, leading to special purpose languages, such as Erlang, that deal specifically with concurrency as a first class concept.

• You'd like to utilize the processing power made available by multiprocessor architectures, such as multicore, which requires a form of concurrency called parallelism to be used. This requires individual operations to be decomposed into independent parts that can run on separate processors.
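To make the responsiveness bullet concrete, here is a minimal sketch, assuming a .NET GUI application: the expensive work runs on a thread pool thread while the UI stays responsive, and a captured SynchronizationContext (a mechanism covered in Chapter 16) marshals the result back. ExpensiveComputation and UpdateLabel are hypothetical placeholders, not APIs from this book.

```csharp
using System;
using System.Threading;

class ResponsiveUiSketch
{
    // Must be called from the UI thread so that the current
    // SynchronizationContext belongs to that thread.
    public static void BeginWork()
    {
        SynchronizationContext uiContext = SynchronizationContext.Current;

        ThreadPool.QueueUserWorkItem(delegate
        {
            // Runs on a thread pool thread; the UI remains responsive.
            int result = ExpensiveComputation();

            // Marshal the result back to the UI thread for display.
            uiContext.Post(state => UpdateLabel((int)state), result);
        });
    }

    static int ExpensiveComputation()
    {
        Thread.Sleep(2000); // stand-in for compute- or I/O-intensive work
        return 42;
    }

    static void UpdateLabel(int value)
    {
        Console.WriteLine("Result: " + value); // stand-in for updating a control
    }
}
```

The BackgroundWorker type discussed in Chapter 16 packages this same pattern up conveniently.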
In summary, many problem domains are ripe with inherent concurrency. If you're building a server application, for example, many requests may arrive concurrently via the network and must be dealt with simultaneously. If you're writing a Web request handler and need to access shared state, concurrency is suddenly thrust to the forefront.
While it's true that concurrency can sometimes help express problems more naturally, this is rare in practice. Human beings tend to have a difficult time reasoning about large amounts of asynchrony due to the combinatorial explosion of possible interactions. Nevertheless, it is becoming increasingly more common to use concurrency in instances where it feels unnatural. The reason for this is that microprocessor architecture has fundamentally changed; parallel processors are now widespread on all sorts of mainstream computers. Multicore has already pervaded the PC and mobile markets, and highly parallel graphics processing units (GPUs) are everywhere and sometimes used for general purpose computing. In order to fully maximize use of these newer generation processors, programs must be written in a naturally scalable manner. That means applications must contain sufficient latent concurrency so that, as newer machines are adopted, program performance automatically improves alongside by realizing that latent concurrency as actual concurrency.
In fact, although many of us program in a mostly sequential manner, our code often has a lot of inherent latent concurrency already by virtue of the way operations have been described in our language of choice. Data and control dependence among loops, if-branches, and memory moves can constrain this, but, in a surprisingly large number of cases, these are artificial constraints that are placed on code out of stylistic habit common to C-style programming.

This shift is a change from the past, particularly for client-side programs. Parallelism is the use of concurrency to decompose an operation into finer grained constituent parts so that independent parts can run on separate processors on the target machine. This idea is not new. Parallelism has been used in scientific computing and supercomputing for decades as a way to scale across tens, hundreds, and, in some cases, thousands of processors. But mainstream commercial and Web software generally has been authored with sequential techniques based on the assumption that clock speed will increase 40 to 50 percent year over year, indefinitely, and that corresponding improvements in performance would follow "for free."
Program Architecture and Concurrency
Concurrency begins with architecture. It is also possible to retrofit concurrency into an existing application, but the number of common pitfalls is vastly decreased with careful planning. The following taxonomy is a useful way to think about the structure of concurrent programs, which will help during the initial planning and architecture phases of your project:
• Agents. Most programs are already coarsely decomposed into independent agents. An agent in this context is a very abstract term, but the key attributes are: (1) state is mostly isolated within it from the outset, (2) its interactions with the world around it are asynchronous, and (3) it is generally loosely coupled with respect to peer agents. There are many manifestations of agents in real-world systems, ranging from individual Web requests, a Windows Communication Foundation (WCF) service request, a COM component call, some asynchronous activity a program has farmed off onto another thread, and so forth. Moreover, some programs have just one agent: the program's entry point.

• Tasks. Individual agents often need to perform a set of operations at once. We'll call these tasks. Although a task shares many ideas with agents (such as being asynchronous and somewhat independent), tasks are unique in that they typically share state intimately. Many sequential client-side programs fail to recognize tasks as first class concepts, but doing so will become increasingly important as fine-grained parallelism is necessary for multicore. Many server-side programs also do not have a concept of tasks, because they already use large numbers of agents in order to expose enough latent concurrency to utilize the hardware. This is OK so long as the number of active agents exceeds the number of available processors; as processor counts and the workloads a single agent is responsible for grow, this can become increasingly difficult to ensure.

• Data. Operations on data are often naturally parallel, so long as they are programmed such that the system is made aware of the latent concurrency. This is called data parallelism (see the sketch after this list). Such operations might include transformations of data from one format into another, business intelligence analysis, encryption, compression, sorting, searching data for elements with certain characteristics, summarizing data for reporting purposes, rendering images, etc. The more data there is, the more compute- and time-intensive these operations are. They are typically leaf level, very fine grained, and, if expressed properly, help to ensure future scaling. Many programs spend a large portion of their execution time working with data; thus, these operations are likely to grow in size and complexity as a program's requirements and data input evolve over time.
This taxonomy forms a nice hierarchy of concurrency, shown in Figure 1.1. While it's true that the clean hierarchy must be strictly broken in some cases (e.g., a data parallel task may need to communicate with an agent), a clean separation is a worthy goal.

FIGURE 1.1: A taxonomy of concurrent program structure (agents, such as Agent A and Agent B, exchanging send/reply messages).

State isolation also is crucial to think about while architecting concurrent programs. For example, it is imperative to strive for designs that lead to agents having state entirely isolated from one another such that they can remain loosely coupled and to ease the synchronization burden. As finer grained concurrency is used, state is often shared, but functional concepts such as immutability and purity become important: these disciplines help to eliminate concurrency bugs that can be extraordinarily difficult to track down and fix later. The topics of state and synchronization are discussed at length in Chapter 2, Synchronization and Time.
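To illustrate the immutability discipline just mentioned, here is a minimal sketch of a hypothetical immutable type (not an example from this book). Because no field can change after construction, instances can be shared freely among agents and tasks with no synchronization at all.

```csharp
// An immutable 2D point: all fields are readonly, so a constructed
// instance can be read from many threads concurrently without locks.
public sealed class ImmutablePoint
{
    private readonly double x;
    private readonly double y;

    public ImmutablePoint(double x, double y)
    {
        this.x = x;
        this.y = y;
    }

    public double X { get { return x; } }
    public double Y { get { return y; } }

    // "Mutation" returns a new instance instead of changing this one,
    // in the style of a pure function.
    public ImmutablePoint WithX(double newX)
    {
        return new ImmutablePoint(newX, this.y);
    }
}
```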
What you'll find as you read the subsequent chapters in this book is that these terms and concepts are merely guidelines on how to create structured architecture in your program, rather than being concrete technologies that you will find in Windows and the .NET Framework. Several examples of agents were already given, and both task and data parallelism may take one of many forms today. These ideas often map to work items executed in dedicated threads or a thread pool (see Chapter 7, Thread Pools), but this varies from one program to the next.
Layers of Parallelism
It is not the case that all programs can be highly parallel, nor is it the case that this should be a goal of most software developers. At least over the next half decade, much of multicore's success will undoubtedly be in the realm of embarrassingly parallel problems, where real parallel hardware is used to attain impressive speedups. These are the kinds of problems where parallelism is inherent and easily exploitable, such as compute-intensive image manipulation, financial analysis, and AI algorithms. Because parallelism is more natural in these domains, there is often less friction in getting code correct and performing well. Race conditions and other concurrency hazards are simply easier to avoid with these kinds of programs, and, when it comes to observing a parallel speedup, the ratio of success to failure is far higher.

Other compute-intensive kernels of computations will use parallelism but will require more effort. For example, math libraries, sort routines, report generation, XML manipulation, and stream processing algorithms may all use parallelism to speed up result generation. In addition, domain specific languages (DSLs) may arise that are inherently parallel. C#'s Language Integrated Query (LINQ) is one example of an embedded DSL within an otherwise imperative language, and MATLAB is yet another. Both are amenable to parallel execution. As libraries adopt parallelism, those programs that use them will receive some amount of scalability for free.
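As a sketch of what such an embedded DSL looks like in practice, an ordinary LINQ query can be made parallel with the AsParallel operator from PLINQ (part of the Parallel Extensions described in Appendix B). The IsExpensive predicate here is a hypothetical stand-in for real per-element work.

```csharp
using System;
using System.Linq; // AsParallel is supplied by PLINQ

class PlinqSketch
{
    static void Main()
    {
        int[] orders = { 18, 7, 42, 3, 56, 21, 90, 11 };

        // Identical in shape to a sequential LINQ query; AsParallel lets
        // the runtime partition the source and evaluate the predicate
        // and projection on multiple cores.
        var bigOrders = from o in orders.AsParallel()
                        where IsExpensive(o)
                        select o * 2;

        foreach (int result in bigOrders)
            Console.WriteLine(result);
    }

    static bool IsExpensive(int order)
    {
        return order > 20; // hypothetical filtering logic
    }
}
```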
FIGURE 1.2: The landscape of parallelism: parallel applications layered on top of domain parallelism (libraries, DSLs, etc.).
The resulting landscape of parallelism is visualized in Figure 1.2. If you stop to think about it, this picture is not very different from what we are accustomed to seeing for sequential software. Software developers creating libraries focus on ensuring that their performance meets customer expectations, and they spend a fair bit of time on optimization and enabling future scalability. Parallelism is similar; the techniques used are different, but the primary motivating factor, that of improving performance, is shared among them.

Aside from embarrassingly parallel algorithms and libraries, some applications will still use concurrency specifically. Many of these use cases will be in representing coarse-grained independent operations as agents. In fact, many programs already are structured this way; utilizing the benefits of multicore in these cases often requires minimal restructuring, although the scalability tends to be fixed to a small number of agents and, hence, cores. Most developers of mostly sequential applications also can use