VizSEC 2007: Proceedings of the Workshop on Visualization for Computer Security. Springer, July 2008. ISBN 3540782427.


Mathematics and Visualization


This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Printed on acid-free paper

gjconti@rumint.org

Department of Electrical Engineering and Computer Science

2008 Springer-Verlag Berlin Heidelberg

Cover design: WMX Design GmbH

Applied Visions, Inc.

Secure Decisions Division

Department of Computer Science


This volume is a collection of the papers presented at the 4th International Workshop on Computer Security, VizSec 2007. The workshop was held in conjunction with the IEEE Visualization 2007 Conference and the IEEE InfoVis Conference in Sacramento, California on October 29, 2007.

This volume includes an introductory chapter and two chapters from the workshop’s invited speakers: The Real Work of Computer Network Defense Analysts by Anita D’Amico and Kirsten Whitley, and VisAlert: From Idea to Product by Stefano Foresti and Jim Agutter. All other papers were peer-reviewed by the VizSec program committee.


• Kwan-Liu Ma, University of California at Davis

• Gregory Conti, United States Military Academy

Program Committee

• Kulsoom Abdullah, Georgia Institute of Technology

• Jim Agutter, University of Utah

• Stefan Axelsson, Blekinge Institute of Technology

• Anita D’Amico, Secure Decisions

• Glenn Fink, Pacific Northwest National Laboratory

• Deborah Frincke, Pacific Northwest National Laboratory

• John Gerth, Stanford University

• Patrick Hertzog, NEXThink S.A.

• Kiran Lakkaraju, University of Illinois at Urbana-Champaign

• Yarden Livnat, University of Utah

• Raffael Marty, Splunk

• Daniel Keim, University of Konstanz

• Stephen North, AT&T Research

• Penny Rheingans, UMBC

• Walt Tirenin, Air Force Research Laboratory

• Soon Tee Teoh, San Jose State University

• Kirsten Whitley, Department of Defense


Contents

Introduction to Visualization for Computer Security
J.R. Goodall
1 Computer Security
2 Information Visualization
3 Visualization for Computer Network Defense
3.1 Data Sources for Computer Network Defense
3.2 VizSec to Support Computer Network Defense
4 Papers in This Volume
4.1 Users and Testing
4.2 Network Security
4.3 Communication, Characterization, and Context
4.4 Attack Graphs and Scans
5 Conclusion
References

The Real Work of Computer Network Defense Analysts
A. D’Amico and K. Whitley
1 Introduction
2 Related Work
3 Methods
4 Findings
4.1 Data Transformation in CND Analysis
4.2 CND Analysis Roles
4.3 CND Analysis Workflow Across Organizations
5 Implications for Visualization
5.1 Visualization Across the CND Workflow
5.2 Visualization as Part of a CND Analysis Environment
References


Adapting Personas for Use in Security Visualization Design
J. Stoll, D. McColgin, M. Gregory, V. Crow, and W.K. Edwards
1 Introduction
2 Overview of the Personas Method and Related Work
2.1 Personas Method
2.2 Related Work
3 Case Study: First Look
3.1 Five Steps to Persona Implementation
3.2 Discussion
4 Application to Security Visualizations
5 Conclusion
References

Measuring the Complexity of Computer Security Visualization Designs
X. Suo, Y. Zhu, and G. Scott Owen
1 Introduction
2 Related Work
3 Technical Approach
3.1 Hierarchical Analysis of Data Visualization
3.2 Visual Integration
3.3 Separable Dimensions for Visual Units
3.4 Interpreting the Values of Visual Attributes
3.5 Efficiency of Visual Search
3.6 Case Study with RUMINT
4 Future Work
5 Conclusion
References

Integrated Environment Management for Information Operations Testbeds
T.H. Yu, B.W. Fuller, J.H. Bannick, L.M. Rossey, and R.K. Cunningham
1 Introduction
2 Related Work
3.1 LARIAT Overview
3.2 Design Goals
3.3 Interface and Visualization
4 Future Work
5 Conclusions
References


Visual Analysis of Network Flow Data with Timelines and Event Plots
D. Phan, J. Gerth, M. Lee, A. Paepcke, and T. Winograd
1 Introduction
2 Network Flow Data
2.1 Flow Sensor
2.2 Database Repository
3 The Investigation Process
4 Flow Maps
5 Progressive Multiples of Timelines and Event Plots
6 A Case of Mysterious IRC Traffic
7 Related Work
8 Future Work and Conclusions
References

NetBytes Viewer: An Entity-Based NetFlow Visualization Utility for Identifying Intrusive Behavior
T. Taylor, S. Brooks, and J. McHugh
1 Introduction
2 Related Work
3.1 NetBytes Viewer User Interface
3.2 User Interaction
3.3 Implementation Details
3.4 Case Studies
4 Future Work
5 Conclusions
References

Visual Analysis of Corporate Network Intelligence: Abstracting and Reasoning on Yesterdays for Acting Today
D. Lalanne, E. Bertini, P. Hertzog, and P. Bados
1 Introduction
2 Background
3 On the Need to Support Visual Analysis
3.1 Types of Analyses
3.2 Analysis Tasks
4 User and Application Centric Views of the Corporate Network
4.1 The RadViz: Visually Grouping Similar Objects
4.2 The OriginalityView: Plotting the Uncommon
5 Alarm/Event Centric Views
6 Limitations and Challenges
7 Conclusion
References


Visualizing Network Security Events Using Compound Glyphs From a Service-Oriented Perspective
J. Pearlman and P. Rheingans
1 Introduction
2 Related Work
3.1 Network Node Glyph
3.2 Layout
3.3 Comparing to a Model
3.4 Results
4 Future Work
5 Conclusions
References

High Level Internet Scale Traffic Visualization Using Hilbert Curve Mapping
B. Irwin and N. Pilkington
1 Introduction
2 Related Work
4 Results
4.1 Output Analysis
4.2 Other Applications
5 Future Work
6 Conclusions
References

VisAlert: From Idea to Product
S. Foresti and J. Agutter
1 Introduction
1.1 The Project and Team
1.2 The VisAlert Metaphor
2 Related Work
2.1 Visualization of Network Security
2.2 Design
2.3 Inter-Disciplinary Collaboration
3.1 The Team Dynamics
3.2 The Design Process
3.3 Sketches
3.4 Refined Conceptual Ideas
3.5 Implementation
4 Future Work
5 Conclusions
References


Visually Understanding Jam Resistant Communication
D. Schweitzer, L. Baird, and W. Bahn
1 Introduction
2 Related Work
2.1 BBC and Concurrent Codes
2.2 BBC Implementations
3.1 An Audio Solution
3.2 A Visual Representation
4 Future Work
5 Conclusions
References

Visualization of Host Behavior for Network Security
F. Mansman, L. Meier, and D.A. Keim
1 Introduction
2 Related Work
2.1 Analysis of Application Ports
2.2 Graph-Based Approaches for Network Monitoring
2.3 Towards Visual Analytics for Network Security
2.4 Summary
3.1 Layout Details
3.2 Implementation
3.3 User Interaction
3.4 Abstraction and Integration of the Behavior Graph in HNMap
3.5 Application and Evaluation
4 Future Work
5 Conclusions
References

Putting Security in Context: Visual Correlation of Network Activity with Real-World Information
W.A. Pike, C. Scherrer, and S. Zabriskie
1 Introduction
2 Related Work
2.1 The Importance of Maintaining Context
2.2 Visualizing Packets and Flows
2.3 Visualizing Correlated Activity
3.1 “I Just Want to Know Where to Focus My Time”
3.2 “We Need to Organize Our Hay into Smaller Piles”
3.3 Behavior Modeling
3.4 Building Context
3.5 Visualizing Behavior in Context
4 Future Work
5 Conclusions
References

An Interactive Attack Graph Cascade and Reachability Display
L. Williams, R. Lippmann, and K. Ingols
1 Introduction
2 Related Work
2.1 Limitations of Existing Approaches
2.2 NetSPA System
3.1 Design Goals
3.2 Initial System Design
3.3 Example Network Results
3.4 Field Trial Results
4 Future Work
5 Conclusions
References

Intelligent Classification and Visualization of Network Scans
C. Muelder, L. Chen, R. Thomason, K.-L. Ma, and T. Bartoletti
1 Introduction
2 Related Work
3.1 Scan Data and Representation
3.2 An Intelligent Method
3.3 Visualization Integration
3.4 A Case Study
4 Future Work
5 Conclusions
References

Using InetVis to Evaluate Snort and Bro Scan Detection on a Network Telescope
B. Irwin and J.-P. van Riel
1 Introduction
1.1 The Merits and Difficulties of Scan Detection
2 Related Work
2.1 Intrusion Detection and the False Positive Problem
2.2 Network Telescopes
2.3 Classifications of Network Scan Activity
2.4 Algorithmic Approaches to Scan Detection
2.5 Network Security Visualisation
3 InetVis Network Traffic Visualisation
3.1 Key Features and Enhancements
4 Investigative Methodology
4.1 Network Telescope Traffic Capture
4.2 Scan Detection Configuration and Processing
4.3 Graphical Exploration and Investigation with InetVis
5 Results and Analysis
5.1 Address Scans and the Distribution of Unique Addresses
5.2 Scans Discovered and Characterised with InetVis
6 Future Work
7 Conclusion
References


Introduction to Visualization for Computer Security

J.R. Goodall

Abstract Networked computers are ubiquitous, and are subject to attack, misuse, and abuse. Automated systems to combat this threat are one potential solution, but most automated systems require vigilant human oversight. This automated approach undervalues the strong analytic capabilities of humans. While automation affords opportunities for increased scalability, humans provide the ability to handle exceptions and novel patterns. One method of counteracting the ever increasing cyber threat is to provide the human security analysts with better tools to discover patterns, detect anomalies, identify correlations, and communicate their findings. This is what visualization for computer security (VizSec) researchers and developers are doing. VizSec is about putting robust information visualization tools into the hands of humans to take advantage of the power of the human perceptual and cognitive processes in solving computer security problems. This chapter is an introduction to the VizSec research community and the papers in this volume.

1 Computer Security

In The Cuckoo’s Egg, astronomer-turned-systems administrator Cliff Stoll (Stoll, 1989) recounted his experience identifying and tracking a hacker through the nascent Internet in the mid-1980s. Through perseverance, creativity (he once dangled his keys over the telephone modem lines to create interference to slow down and frustrate the intruder), and extensive coordination and collaboration with other systems administrators, Stoll’s actions led to the uncovering of an international spy ring that had infiltrated U.S. military systems. The intruder was initially detected from a 75 cent accounting error.

J.R. Goodall

Secure Decisions Division of Applied Visions, Inc., 6 Bayview Ave., Northport, NY 11768, USA, e-mail: johng@securedecisions.avi.com


In the two decades since Stoll’s investigation, computer security has become an overriding concern of all types of organizations. New systems and protocols have been developed and adopted to prevent and detect network intruders. But even with these advances, the central feature of Stoll’s story has not changed: humans are still crucial in the computer security process. Administrators must be willing to patiently observe and collect data on potential intruders. They need to think quickly and creatively. They collaborate and coordinate their actions with colleagues. Humans are still as central to computer security today as they were 20 years ago. Technologies have evolved and many security processes have been automated, but the analytic capabilities and creativity of humans are paramount in many security-related practices, particularly in intrusion detection, the focus of this chapter. Because of this, not all security work should be or can be automated. Humans are, and should be, central to security practice. This central feature of computer security is at the core of visualization for computer security (VizSec).

Many things have changed since Stoll’s time. In conjunction with the rapid growth of the Internet and increased organizational dependence on networked information technology, the frequency and severity of network-based attacks have increased drastically (Allen et al., 1999). At the same time, there is an inverse relationship between the decreasing expertise required to execute attacks and the increasing sophistication of those attacks; less skill is needed to do more damage (McHugh, 2001). As we have come more and more to rely on the ability to network computers and access information online, attacks are becoming more pervasive, easier to carry out, and more destructive.

Despite this increasing threat and concerted efforts on preventative security measures, vulnerabilities remain. The reasons for these include: programming errors, design flaws in foundational protocols, and the insider abuse problem of legitimate users misusing their privileges (Lee et al., 2000). While it is theoretically possible to remove all security vulnerabilities through formal methods and better engineering practices, practically it remains infeasible (Hofmeyr et al., 1998). Thus, even as security technologies and practices improve, the threat to network infrastructures remains.

Automated systems to combat this threat are one potential solution, but most automated systems require vigilant human oversight. This automated approach undervalues the strong analytic capabilities of humans. While automation affords opportunities for increased scalability, humans provide the ability to handle exceptions and novel patterns. A technical report on intrusion detection technologies noted that while security vendors attempt to fully automate intrusion diagnosis, a more realistic approach is to involve the human in the diagnostic loop; computers can process large amounts of data, but cannot match humans’ analytic skills (Allen et al., 1999).

Humans excel at recognizing novel patterns in complex data, and computer security support tools should integrate these intricate sense-making capabilities of the human analyst with the ability of technology to process vast quantities of data. In order to effectively support human analysts and keep them in the diagnostic loop, it is necessary to fully comprehend the work security analysts do, how they do it, and how their work processes can be improved by taking advantage of the inherent strengths of both technology and humans.

One method of counteracting this ever increasing threat is to provide the human security analysts with better tools to discover patterns, detect anomalies, identify correlations, and communicate their findings. This is what VizSec researchers and developers are doing. VizSec is about putting robust information visualization tools into the hands of humans to take advantage of the power of the human perceptual and cognitive processes in solving computer security problems.

2 Information Visualization

Because of the vast amounts of data analysts work with, the need to recognize patterns and anomalies, and the importance of keeping humans in the loop, information visualization shows great potential for supporting computer security work. Put simply, information visualization turns data into interactive graphical displays. Information visualization takes advantage of the highest bandwidth human input device, vision, and human perceptual capabilities. Information visualization can be used for exploration, discovery, decision making, and to communicate complex ideas to others.

Information visualization is distinct from the broader field of data graphics. Information visualization is interactive; the user will have tools to adjust the display in order to gain a more meaningful understanding of the data being presented. Unlike scientific visualization, which is concerned with representing physically based data (such as the human body, molecules, or geography), information visualization represents abstract data; to do so often requires creativity on the designers’ part since there is no existing structure to map the data to the graphical display. This is one of the inherent problems in developing an effective information visualization: mapping the data spatially in a meaningful manner. At the core of information visualization is the goal of amplifying cognition, the intellectual processes in which information is obtained, transformed, stored, retrieved, and used (Card, 2003). Information visualization is able to augment cognition by taking advantage of human perceptual capabilities.

Information visualization involves the use of computer-supported, visual representations of abstract data to amplify cognition by taking advantage of human perceptual capabilities (Card et al., 1999). Card, Mackinlay, and Shneiderman (1999) propose six ways that information visualization can amplify cognition: (1) increased resources, (2) reduced search, (3) enhanced recognition of patterns, (4) enabling perceptual inference, (5) using perceptual monitoring, and (6) encoding information in a manipulable medium. Visualization increases memory and processing resources by permitting parallel processing of data and offloading work from the cognitive to perceptual memory. Graphical information displays can often be processed in parallel, as opposed to textual displays, which are processed serially. Visualization shifts the cognitive processing burden to the human perceptual system, which can expand working memory and the storage of information. Information visualization reduces the processes of searching by grouping information together in a small, dense space. Pattern recognition, one of the key elements in intrusion detection, is another of the benefits of visualization, which emphasizes recognition rather than recall, another way in which working memory is expanded. Visual representations can often make an anomaly obvious to the user by taking advantage of human perceptual inference and monitoring abilities. Finally, information visualization encodes the data in a manipulable form that permits the user to browse and explore the data.

One of the most successful examples of an information visualization technique is the treemap. The original treemap layout was designed by Ben Shneiderman to effectively use display space when visualizing a hard drive’s files and their attributes, such as file size and type (Shneiderman, 1992). The treemap was a recursive algorithm that split the display space into rectangles alternating in horizontal and vertical directions. The size and the color of the leaf node rectangles can encode attributes of the data. In the original implementation visualizing a computer disk, color represented file type and size represented file size. An example application of a treemap is an alternative method of viewing software source code, as shown in Fig. 1. In this example, nodes represent source code files organized into their package hierarchy. Color is used to show the file’s last modification time, with green hues being more recently modified. Treemap visualizations have been adapted to many different applications of understanding hierarchical data, such as newsgroup activity, stock market performance, election results, and sports statistics. (For a history of treemaps and their many applications by Ben Shneiderman, see Shneiderman, 2006.)

Fig. 1 A treemap visualization of the source code for the prefuse visualization toolkit showing the hierarchy of the code as it is organized into packages, where each node represents a source code file and the size of nodes shows the file size and color the last modified date
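The recursive splitting described for the original treemap can be sketched in a few lines. This is a minimal illustration of a slice-and-dice layout, not Shneiderman's implementation; the `Node` class, the function name `slice_and_dice`, and the example file names are hypothetical, and color encoding is left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical tree node: leaves carry a size, inner nodes carry children."""
    name: str
    size: float = 0.0
    children: List["Node"] = field(default_factory=list)

    def total(self) -> float:
        """Size of a leaf, or the summed size of all leaves below this node."""
        if not self.children:
            return self.size
        return sum(child.total() for child in self.children)


Rect = Tuple[float, float, float, float]  # x, y, width, height


def slice_and_dice(node: Node, x: float, y: float, w: float, h: float,
                   depth: int = 0,
                   out: Optional[List[Tuple[str, Rect]]] = None
                   ) -> List[Tuple[str, Rect]]:
    """Assign each leaf a rectangle whose area is proportional to its size.

    The split direction alternates with depth: even depths divide the
    rectangle along x (children side by side), odd depths along y.
    """
    if out is None:
        out = []
    if not node.children:
        out.append((node.name, (x, y, w, h)))
        return out
    total = node.total()
    offset = 0.0  # fraction of the parent rectangle already handed out
    for child in node.children:
        share = child.total() / total if total else 0.0
        if depth % 2 == 0:
            slice_and_dice(child, x + offset * w, y, share * w, h, depth + 1, out)
        else:
            slice_and_dice(child, x, y + offset * h, w, share * h, depth + 1, out)
        offset += share
    return out


# Example: one file and one package holding two files, in a 100 x 100 display.
root = Node("root", children=[
    Node("a.py", size=50),
    Node("pkg", children=[Node("b.py", size=25), Node("c.py", size=25)]),
])
layout = slice_and_dice(root, 0, 0, 100, 100)
```

Because every leaf receives a share of its parent's rectangle proportional to its size, the leaf areas always sum to the full display area, which is the space-filling property the treemap was designed for.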


Fig. 2 The FilmFinder information visualization application combining a starfield display with dynamic queries. © 1994 ACM, Inc. Included here by permission.

FilmFinder, shown in Fig. 2, is an early example of an information visualization that highlights the importance of interaction (Ahlberg and Shneiderman, 1994). FilmFinder combines a starfield display, a scatterplot where each data item is represented by a point, with dynamic queries so that the display is continuously updated as the user filters to refine the selection. This is an excellent example of the importance of interaction in information visualization. The display itself is fairly simple: time is plotted on the x axis and ratings on the y axis, with color coded by genre. But the dynamic queries through sliders and other widgets prevent user errors and instantly show the results of complex queries. The system is an exemplar of the visual information-seeking mantra: overview first, zoom and filter, then details on demand (Shneiderman, 1996). This approach encourages exploration and understanding of the data set as a whole, while providing a method for drilling down to the actual data details. Many of the VizSec systems described below follow this methodology.
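The combination of a starfield display and dynamic queries can be approximated as a filter that is re-run whenever a widget changes. The sketch below is only an illustration of that idea, not FilmFinder's code; the `Film` record, the function `dynamic_query`, and the sample data are all hypothetical.

```python
from typing import Iterable, List, NamedTuple, Optional, Set, Tuple


class Film(NamedTuple):
    """Hypothetical data item behind one point of the starfield."""
    title: str
    year: int
    rating: float
    genre: str


def dynamic_query(films: Iterable[Film],
                  year_range: Tuple[int, int],
                  min_rating: float,
                  genres: Optional[Set[str]] = None
                  ) -> List[Tuple[int, float, str]]:
    """Return the (x, y, color key) points that survive the current filters.

    x is the year, y is the rating, and the genre selects the color.
    A user interface would call this again on every slider movement.
    """
    low, high = year_range
    return [
        (film.year, film.rating, film.genre)
        for film in films
        if low <= film.year <= high
        and film.rating >= min_rating
        and (genres is None or film.genre in genres)
    ]


# Made-up sample data, for illustration only.
films = [
    Film("Film A", 1975, 7.5, "drama"),
    Film("Film B", 1988, 6.0, "comedy"),
    Film("Film C", 1991, 8.2, "drama"),
]
points = dynamic_query(films, year_range=(1980, 1995), min_rating=6.0)
```

Because the filter only ever narrows an existing set of points, every slider position yields a valid result, which is one way to read the claim that dynamic queries prevent user errors.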

3 Visualization for Computer Network Defense

There are many potential applications of information visualization to the problems of computer security, including:

– Visualization for detecting anomalous activity

– Visualization for discovering trends and patterns

– Visualization for correlating intrusion detection events

– Visualization for computer network defense training

– Visualization for offensive information operations

– Visualization for seeing worm propagation or botnet activity

– Visualization for forensic analysis

– Visualization for understanding the makeup of malware or viruses

– Visualization for feature selection and rule generation

– Visualization for communicating the operation of security algorithms

This is a non-exhaustive list of the kinds of tasks that VizSec tools can be designed to support. Because networks and the Internet are so important to the operations of today’s organizations and since the network is the source of most computer based attacks, the majority of VizSec research has targeted supporting the tasks associated with the defense of enterprise networks from outside attack or insider abuse. This section will focus on the data sources and results of the research into visualization for computer network defense (CND).

3.1 Data Sources for Computer Network Defense

The research of VizSec for CND can be organized according to the level of networking data to be visualized. At the base, most raw level is a network packet trace. A packet consists of the TCP/IP header (which defines how a packet gets from point A to point B) and payload data (the contents of the packet). At a higher level of abstraction is a network flow. Originally developed for accounting purposes, network flows have been increasingly used for computer security applications. A flow is an aggregated record of the communications between two distinct machines. A flow is typically defined by the source and destination Internet Protocol addresses, the source and destination ports, and the protocol. Flows are much more compact than packet traces, but sacrifice details and have no payload data. At a higher level of abstraction are automated systems that reduce network data to information, such as an intrusion detection system (IDS). An IDS examines network traffic and automatically generates alerts of suspicious activity. All three of these levels operate on the enterprise network level. At a finer level of granularity is the visualization of data about individual computer systems or applications, and at a higher level is the visualization of data about the Internet.

The remainder of this section will describe a selection of VizSec research that targets the enterprise network level, which is generally the focus of CND.
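The step from packet traces to flows can be illustrated with a small aggregation keyed on the five fields named above. This is a simplified sketch under stated assumptions: the `Packet` and `Flow` types and the `aggregate_flows` function are hypothetical, each direction of a conversation is treated as its own flow, and the timeouts that real flow sensors use to close a flow are ignored.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, NamedTuple, Tuple


class Packet(NamedTuple):
    """Hypothetical, already-parsed packet header fields (no payload)."""
    timestamp: float
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str
    length: int


@dataclass
class Flow:
    """Aggregated record for all packets that share one flow key."""
    packet_count: int = 0
    byte_count: int = 0
    first_seen: float = float("inf")
    last_seen: float = float("-inf")


FlowKey = Tuple[str, str, int, int, str]


def aggregate_flows(packets: Iterable[Packet]) -> Dict[FlowKey, Flow]:
    """Group packets by (src IP, dst IP, src port, dst port, protocol)."""
    flows: Dict[FlowKey, Flow] = {}
    for p in packets:
        key = (p.src_ip, p.dst_ip, p.src_port, p.dst_port, p.protocol)
        flow = flows.setdefault(key, Flow())
        flow.packet_count += 1
        flow.byte_count += p.length
        flow.first_seen = min(flow.first_seen, p.timestamp)
        flow.last_seen = max(flow.last_seen, p.timestamp)
    return flows


# Three packets collapse into two flow records (addresses are examples).
trace = [
    Packet(1.0, "10.0.0.5", "192.0.2.8", 40000, 80, "TCP", 60),
    Packet(1.2, "10.0.0.5", "192.0.2.8", 40000, 80, "TCP", 1500),
    Packet(2.0, "10.0.0.5", "192.0.2.8", 40001, 80, "TCP", 60),
]
flows = aggregate_flows(trace)
```

The record keeps counts and timestamps but nothing from the payload, which reflects the trade-off described above: flows are compact, but detail is lost.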

3.2 VizSec to Support Computer Network Defense

This section presents representative visualization research projects for each of the levels of enterprise network security. The examples presented here each solve an important problem. Rumint facilitates the understanding of packet payloads; tnv allows analysts to move from a high-level overview of packet activity to raw details; NVisionIP enables analysts to use visualization to create automation rules; FlowTag assists collaboration and sharing through tagging of data; VisAlert enables the integration of multiple data sources through a what, where, when paradigm; and IDS Rainstorm highlights the importance of multiple, linked views at different levels of semantic detail.

3.2.1 Packet Trace Visualizations

At the most granular level of enterprise network data are raw packet traces. This kind of data is useful for understanding the behavior of networks and as a supplementary source for analyzing security events, but is typically collected and analyzed on an ad hoc basis, not systematically, since the data can become very large. To help analysts cope with this copious packet data, researchers are looking at ways to visualize packet headers and payloads.

Fig. 3 Rumint visualization: binary rainfall visualization where each row represents a packet and each column in the row represents a bit in the packet (left), and byte frequency visualization where each row represents one of 256 byte values and each column in the row represents the frequency of that byte in the packet (right). © 2006 IEEE, Inc. Included here by permission.

One example is rumint, shown in Fig. 3, which uses a novel visualization called binary rainfall, in which each packet is plotted one per row where each pixel represents a bit in the packet (Conti et al., 2006, 2005). Multiple packets are shown in time series order at multiple semantic levels. An additional view presents a byte frequency visualization, where each packet is plotted on a row where each pixel represents byte values of 0–255. Pixels for each row are drawn according to the frequency of that byte in the packet. The system is unique in that it provides a graphical plotting of packet payload data, plotted according to the bit value. Rumint also includes other views into the data, such as a parallel coordinate plot to show network connections.
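The two per-packet rows described for rumint can be computed directly from the raw bytes. The following is a sketch of the data behind such views, not rumint's own code; the function names are hypothetical, and plotting the most significant bit first is an assumption.

```python
from typing import List


def binary_rainfall_row(packet: bytes) -> List[int]:
    """One display row: one 0/1 value per bit of the packet.

    Assumption: bits are emitted most significant first within each byte.
    """
    return [(byte >> shift) & 1 for byte in packet for shift in range(7, -1, -1)]


def byte_frequency_row(packet: bytes) -> List[int]:
    """One display row of 256 columns: how often each byte value 0-255 occurs."""
    counts = [0] * 256
    for byte in packet:
        counts[byte] += 1
    return counts


# Stacking one row per packet, in arrival order, yields the two images.
packets = [b"\x80\x01", b"AAB"]
rainfall = [binary_rainfall_row(p) for p in packets]
frequencies = [byte_frequency_row(p) for p in packets]
```

A renderer would then map each 0/1 value to a pixel color, and each count to a pixel intensity, with one row per packet.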

Tnv, shown in Fig. 4, is a visualization tool designed to facilitate the analysis processes of CND by providing a visual display that can facilitate recognizing patterns and anomalies over time (thereby increasing support for learning and recognizing normal traffic behavior patterns), coupled with more focused views on packet-level detail that can be understood in the context of the surrounding network traffic (Goodall et al., 2005, 2006). The display is split between three areas. To the left is a narrow area that displays remote hosts, in the center is the area that displays links between hosts, and the large area to the right displays local hosts (those defined as being local to the user), which is divided into a matrix where each row represents a unique local host and each column represents a time interval, with each resulting cell color coded to the number of packets to and from that host within that time period. Bisecting the display to separately show local and remote hosts increased the scalability of the visual display, so that many more hosts can be displayed at once by dividing the available screen real estate between local and remote hosts. In addition to being able to display more hosts at a time, this partitioning also fits well with analysts’ perceptions of what they deem to be important. Because local hosts are of primary concern in ID analysis, the majority of the display space is devoted to the local hosts. The details of individual packets can be displayed on demand.

Fig. 4 Tnv visualization showing 170,000 packets. Remote hosts at the left and local hosts at the right of the display, with links drawn between them; packets are drawn for local hosts over time and color is used to represent protocol and packet frequency for a time period
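The local-host matrix can be thought of as a table of packet counts indexed by host and time interval. The sketch below shows one way to build such a table; it is not taken from tnv, and the function name, the `(timestamp, source, destination)` packet tuples, and the parameter names are assumptions made for illustration.

```python
from typing import Iterable, List, Sequence, Tuple


def host_time_matrix(packets: Iterable[Tuple[float, str, str]],
                     local_hosts: Sequence[str],
                     start: float,
                     interval: float,
                     n_columns: int) -> List[List[int]]:
    """Count packets to or from each local host in each time interval.

    Rows follow the order of local_hosts; column c covers the half-open
    time range [start + c * interval, start + (c + 1) * interval).
    """
    row_of = {host: row for row, host in enumerate(local_hosts)}
    matrix = [[0] * n_columns for _ in local_hosts]
    for timestamp, src, dst in packets:
        column = int((timestamp - start) // interval)
        if not 0 <= column < n_columns:
            continue  # outside the displayed time window
        for host in {src, dst}:  # a set, so a self-addressed packet counts once
            if host in row_of:
                matrix[row_of[host]][column] += 1
    return matrix


matrix = host_time_matrix(
    packets=[
        (0.5, "10.0.0.1", "198.51.100.7"),
        (1.5, "198.51.100.7", "10.0.0.1"),
        (1.7, "10.0.0.2", "10.0.0.1"),
        (99.0, "10.0.0.2", "198.51.100.7"),  # falls outside the window
    ],
    local_hosts=["10.0.0.1", "10.0.0.2"],
    start=0.0,
    interval=1.0,
    n_columns=2,
)
```

Each count would then be mapped to a cell color, so that busy hosts and busy time periods stand out in the overview.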

3.2.2 Network Flow Visualizations

Network flows are aggregations of packet traces according to the hosts, ports, and protocol involved. Because they are aggregated, flows can be systematically collected and stored, and then used in forensic analysis when an intrusion occurs or monitored for anomalous activity. In either case, the volume of data makes textual analysis difficult and a number of researchers are looking at visualization methods for analyzing flow data.

NVisionIP is geared to increasing an analyst’s situational awareness by visualizing flows at multiple levels of detail (Lakkaraju et al., 2004, 2005). At the highest level of aggregation, NVisionIP, shown in Fig. 5, displays an entire class-B network (65,534 possible addresses) as a scatterplot of colored hosts to facilitate understanding the state of a network. NVisionIP also provides the ability to drill down into the data through a small-multiple view and a histogram of host details. NVisionIP was also extended to “close the loop” by allowing users to create rules from the visualization that can then automatically alert on new data. This concept will likely become increasingly common in VizSec applications in the years to come. Machines excel at pattern matching; humans excel at recognizing novel patterns. This approach allows for both machines and humans to do what they do best.

Fig. 5 NVisionIP visualization’s galaxy view, a scatterplot that puts subnets (the third octet of the class-B network) along the x axis and hosts (the fourth octet) along the y axis to present an overview of network flows for a class-B network. Animation can be used to visualize traffic flows over time. © 2004 ACM, Inc. Included here by permission.
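The galaxy view's placement rule, third octet on the x axis and fourth octet on the y axis, is simple enough to sketch with Python's standard `ipaddress` module. The function name `galaxy_position` and the example network `10.1.0.0/16` are hypothetical and chosen only for illustration; this is not NVisionIP's code.

```python
import ipaddress
from typing import Tuple


def galaxy_position(ip: str, network: str = "10.1.0.0/16") -> Tuple[int, int]:
    """Map a host in a /16 (class-B sized) network to scatterplot coordinates.

    x is the third octet (the subnet) and y is the fourth octet (the host).
    """
    net = ipaddress.ip_network(network)
    if net.version != 4 or net.prefixlen != 16:
        raise ValueError("expected an IPv4 /16 network")
    addr = ipaddress.ip_address(ip)
    if addr not in net:
        raise ValueError(f"{ip} is not inside {network}")
    octets = addr.packed  # the four address bytes, in network order
    return octets[2], octets[3]


position = galaxy_position("10.1.42.7")
```

A /16 network contains 65,536 addresses; subtracting the network and broadcast addresses gives 65,534, which is consistent with the figure quoted above.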

FlowTag, shown in Fig. 6, is a system to visualize network flows and to tag the data to support analysis and collaboration (Lee and Copeland, 2006). Tagging allows analysts to label key elements during the analytic process to reduce the cognitive burden of analysis and maintain context. Tagging can also be used for sharing and collaboration. Tagging has become popular recently with social networking and social bookmarking sites; adapting the concept to CND should be encouraged in all VizSec applications. FlowTag brings the popular concept of tagging to the problems of analyzing and sharing network security data.

3.2.3 Alert Visualizations

Intrusion detection, the process of using computer network and system data to identify potential cyber attacks, has become an increasingly essential component of the information security infrastructure. However, due to the dynamic and complex nature of computer networks and the potential for inappropriate or self-damaging responses to potential attacks, IDSs are only effective when complemented by a human analyst. To help manage the analysis of IDS alerts, several researchers have turned to information visualization.

Fig. 6 FlowTag visualization showing flow connection information on a parallel coordinate plot of destination port on one axis and source IP address on the other organized in order of appearance; color represents the selection state. © 2006 ACM, Inc. Included here by permission.

VisAlert is a flexible visualization that correlates multiple data sources, such as IDS alerts and system log files (Livnat et al., 2005a, b). Correlation is based on the What, When and Where attributes of the data. VisAlert, shown in Fig. 7, integrates these into a single radial display that depicts alerts as vectors between the perimeter, representing alert time (when) and type (what), and the interior, representing network topology (where). This system represents one of the more sophisticated and novel visualizations to solve the important problem of correlating disparate events. It is a significant example of an approach that supports the integration of multiple data sources within a unified display.

IDS Rainstorm, shown in Fig. 8, focuses on scalability, mapping IDS alerts to pixels over time (Abdullah et al., 2005; Conti et al., 2006). Zooming and drilling down allow users to understand the details of their IDS data. The overview visualization aggregates 20 IP addresses for each row of pixels, organized sequentially from top to bottom, and the columns wrap around at the bottom of the display. Each column represents 24 h of alerts. By wrapping the columns, IDS Rainstorm can represent 2.5 class B IP networks (163,830 hosts) in a single display. This type of display, similar to the software visualization tool SeeSoft (Eick et al., 1992), maximizes the available display space to provide an overview of very large data sets. The color of each pixel represents the severity of the associated alerts (the highest severity of the group of 20 is used). A second display screen is used to show a zoomed-in view, which shows larger glyphs to represent alerts and also adds semantic details to show connections between the internal IP address space and external IP addresses represented in the alert. Like NVisionIP, this is a noteworthy example of synchronizing multiple views to show different levels of semantic detail.
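The overview layout described above can be sketched as a few small mapping functions. This is an illustrative Python sketch rather than IDS Rainstorm's implementation: the function names, the number of pixel rows per column, the column width, and the assumption that the 24 hours run left to right across a column's width are all hypothetical choices made for the example. Only the 20-addresses-per-row grouping and the highest-severity rule come from the text.

```python
IPS_PER_ROW = 20  # from the text: each pixel row aggregates 20 IP addresses


def overview_cell(ip_index, rows_per_column):
    """Map an address's position in the monitored range to (column, row).

    Rows fill a column from top to bottom, then wrap to the next column.
    """
    group = ip_index // IPS_PER_ROW
    return group // rows_per_column, group % rows_per_column


def time_offset(seconds_into_day, column_width_px):
    """Horizontal pixel offset inside a column that spans 24 hours (assumed layout)."""
    return min(seconds_into_day * column_width_px // 86400, column_width_px - 1)


def pixel_severity(alert_severities):
    """A pixel shows the highest severity among its group's alerts (0 if none)."""
    return max(alert_severities, default=0)


# Hypothetical display with 1024 pixel rows per column and 100-pixel-wide columns.
print(overview_cell(39, 1024))         # (0, 1): second row of the first column
print(overview_cell(20 * 1024, 1024))  # (1, 0): wraps to the top of the next column
print(time_offset(43200, 100))         # 50: noon falls in the middle of the column
print(pixel_severity([1, 3, 2]))       # 3
```

With 20 addresses per row, the 163,830 hosts mentioned in the text need about 8,192 pixel rows in total, which is why wrapping into several columns is necessary on a single screen.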


Fig. 7 VisAlert visualization of correlated intrusion detection alerts showing alerts along outer rings and network topology maps in the center. © 2005 IEEE, Inc. Included here by permission.

4 Papers in This Volume

The papers collected in this volume were presented at the Fourth VizSec Workshop for Computer Security, held in conjunction with IEEE Vis and InfoVis in Sacramento, California in 2007. This collection presents the state of the art in VizSec research.

4.1 Users and Testing

Anita D’Amico and Kirsten Whitley open this volume with an invited chapter entitled The Real Work of Computer Network Defense Analysts: The Analysis Roles and Processes that Transform Network Data into Security Situation Awareness. This chapter is intended to frame the central problems of CND work that security visualization applications attempt to solve. The authors report on the results of their cognitive task analysis of CND analysts in the U.S. Department of Defense. They cover three of the findings from the task analysis: the cognitive transformation process from raw data into security situation awareness, the identification and description of the analysis roles in CND, and CND analysts’ workflow across organizations. The authors conclude by linking their findings to visualization design, drawing valuable implications for future VizSec researchers and developers.

Fig. 8 IDS Rainstorm maps intrusion detection alerts to pixels in the overview visualization that wraps columns of IP address activity over a 24 h time period. © 2006 IEEE, Inc. Included here by permission.

Jennifer Stoll, David McColgin, Michelle Gregory, Vern Crow, and W. Keith Edwards apply a user-centered design method to VizSec in Adapting Personas for Use in Security Visualization Design. The authors turn to human–computer interaction and participatory design research to solve the problem of requirements capture by using personas. Personas are archetypal descriptions of a system’s target users that provide a framework for organizing requirements. Rather than approach users for feedback on design, designers can turn to the personas to simulate how well a design meets user requirements. This chapter demonstrates how user-centered design methodologies can be applied to VizSec software development.

Xiaoyuan Suo, Ying Zhu, and G. Scott Owen focus on evaluating VizSec software in Measuring the Complexity of Computer Security Visualization Designs. The authors propose an alternative evaluation method to user studies: complexity analysis. VizSec designers and developers can use this method to evaluate a set of factors that affect the ability of users to understand a visualization. Complexity is measured across several dimensions, including visual integration, separable dimensions for each visual unit, the complexity of interpreting the visual attributes, and the efficiency of visual search. The authors demonstrate the complexity analysis method with two VizSec applications, rumint and tnv, which were described in Sect. 3.2.1.

Tamara H. Yu, Benjamin W. Fuller, John H. Bannick, Lee M. Rossey, and Robert K. Cunningham address the difficulty of supporting network testbed operations in Integrated Environment Management for Information Operations Testbeds. Network testbeds are crucial in the design and testing of information operations software, but as testbeds become more realistic, they also become more complex to set up and manage. The authors present a visual interface that facilitates test specification, testbed control, and testbed monitoring through multiple information visualization techniques.

4.2 Network Security

Doantam Phan, John Gerth, Marcia Lee, Andreas Paepcke, and Terry Winograd present a VizSec system called Isis in Visual Analysis of Network Flow Data with Timelines and Event Plots, which was named the workshop’s Best Paper winner. Isis supports the analysis of network flow data through two visualization methods, progressive multiples of timelines and event plots, to support the iterative investigation of intrusions. Isis combines visual affordances with structured query language (SQL) to minimize user error and maximize flexibility. Isis keeps a history of a user’s investigation, easily allowing a query to be revisited and a hypothesis to be changed. A detailed case study using anonymized data of a real intrusion demonstrates the features of Isis.

Teryl Taylor, Stephen Brooks, and John McHugh present another VizSec system for network flow analysis in NetBytes Viewer: An Entity-Based NetFlow Visualization Utility for Identifying Intrusive Behavior. NetBytes Viewer plots network flow data per port of an individual host machine or subnet on a network over time in 3D. The Z axis displays the ports, the X axis displays time, and the Y axis displays the magnitude of traffic (in flows, packets, or bytes) seen by the host (or subnet) in an hour.

Denis Lalanne, Enrico Bertini, Patrick Hertzog, and Pedro Bados describe a visualization approach to support multiple user roles in Visual Analysis of Corporate Network Intelligence: Abstracting and Reasoning on Yesterdays for Acting Today. The authors present a pyramidal vision of network intelligence to support more than just the daily monitoring of networks. In addition to system and security analysts, the authors argue that other user profiles are interested in network intelligence, such as the helpdesk, the legal department, and the chief executive officer. They present two methods of network analysis: a user/application-centric view and an alarm/temporal-centric view.


Jason Pearlman and Penny Rheingans take a service-oriented perspective to visualizing network traffic in Visualizing Network Security Events Using Compound Glyphs from a Service-Oriented Perspective. The authors present a node-link visualization in which each node is represented as a compound glyph that provides an indication of the node’s service usage. Time slicing is also used in these glyphs to provide an indication of time.

Barry Irwin and Nicholas Pilkington attempt to map large IP spaces using Hilbert curves in High Level Internet Scale Traffic Visualization Using Hilbert Curve Mapping. Network telescopes (also called DarkNets) are large collections of IP space with no hosts; all traffic collected on a network telescope is sent to a nonexistent host. These dead-end communications are never legitimate and provide indications of backscatter, scanning, and worm activity. The authors use Hilbert curves, a space-filling curve that preserves locality (i.e., ordered data will remain ordered along the curve), to map the activity on large network telescopes.
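To make the locality property concrete, the sketch below converts a position along a Hilbert curve into grid coordinates using the standard iterative index-to-coordinate algorithm. It is a generic illustration, not the layout code from the chapter: the helper `ip_to_cell`, the example network, and the choice to use an address's offset within the monitored block as the curve index are assumptions made for the example.

```python
import ipaddress


def hilbert_d2xy(order, d):
    """Convert distance d along a Hilbert curve into (x, y) on a 2**order square grid."""
    x = y = 0
    t = d
    s = 1
    while s < (1 << order):
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                # Flip the quadrant before rotating it.
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def ip_to_cell(ip, network):
    """Place an address on the curve by its offset within the monitored block."""
    net = ipaddress.IPv4Network(network)
    host_bits = 32 - net.prefixlen
    if host_bits % 2:
        raise ValueError("this sketch needs an even number of host bits for a square grid")
    offset = int(ipaddress.IPv4Address(ip)) - int(net.network_address)
    if not 0 <= offset < net.num_addresses:
        raise ValueError(f"{ip} is outside {network}")
    return hilbert_d2xy(host_bits // 2, offset)


# The first four positions of the smallest curve trace a "U" shape.
print([hilbert_d2xy(1, d) for d in range(4)])  # [(0, 0), (0, 1), (1, 1), (1, 0)]
```

Consecutive addresses always land in neighboring cells, so a scan that sweeps a contiguous address range shows up as a compact patch rather than as scattered points.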

4.3 Communication, Characterization, and Context

Stefano Foresti and James Agutter present their experience with the design of a VizSec system in VisAlert: From Idea to Product. VisAlert, described above in Sect. 3.2.3, is a VizSec system that can correlate data from multiple sources into a unified visualization. In this invited chapter, the authors describe the design process, from the conception of rough visual sketches to the implementation and deployment of production-ready software, and the issues that the design team had to address to carry the project from concept to product.

Dino Schweitzer, Leemon Baird, and William Bahn present a visualization of their security algorithm in Visually Understanding Jam Resistant Communication. Their algorithm, BBC, is based on a new type of coding theory known as concurrent codes that is resistant to traditional jamming techniques. The authors found it difficult to explain the formal definition and proofs to non-mathematicians, and so turned to visualization as a communication device to visually demonstrate the algorithm’s effectiveness.

Florian Mansman, Lorenz Meier, and Daniel A. Keim present an approach to visualizing host behavior in Visualization of Host Behavior for Network Security. The authors use a force-directed graph layout to look at changes in host behavior over time to assist in the detection of uncommon behavior. A node represents the state of one host for a specific interval, and its position is determined by its state at that interval. As hosts’ states change, their positions also change, allowing analysts to easily see changes over time.

William A. Pike, Chad Scherrer, and Sean Zabriskie focus on bringing context into visualization in Putting Security in Context: Visual Correlation of Network Activity with Real-World Information, which was named the workshop’s Best Paper runner-up. The central tenet of the paper is that CND analysts use their own understanding of the world to put security events into context. In order to support this necessary analytic step, the authors demonstrate a system, called NUANCE, that creates behavior models for network entities at multiple levels of abstraction and fuses these models with contextual information on current threats and exploits from textual data sources.

4.4 Attack Graphs and Scans

Leevar Williams, Richard Lippmann, and Kyle Ingols present an elegant solution to visualizing attack graphs in An Interactive Attack Graph Cascade and Reachability Display. Attack graphs present potential critical paths that could be used by adversaries to compromise networked hosts based on their known vulnerabilities. Attack graphs are useful for understanding the vulnerability level of a network, but are often too complex to understand. The authors present a visual solution for attack graph comprehension based on treemaps. Multiple treemaps are used to cluster host groups in each subnet. Hosts within each treemap are grouped based on reachability, attacker privilege level, and prerequisites.

Chris Muelder, Lei Chen, Russell Thomason, Kwan-Liu Ma, and Tony Bartoletti combine machine learning and visualization to tackle the problem of classifying scanning activity in Intelligent Classification and Visualization of Network Scans. The authors present a system that uses associative memory learning techniques to compare network scans in order to create classifications. The classifications can be used with visualization to characterize the source of scans.

Barry Irwin and Jean-Pierre van Riel describe a 3D visualization for traffic analysis in Using InetVis to Evaluate Snort and Bro Scan Detection on a Network Telescope. Source IP address, destination IP address, and destination port are mapped to the three axes in InetVis for TCP and UDP traffic, and a separate plane is shown below this cube (with no port information) for ICMP traffic. InetVis also incorporates textual filtering and querying using the powerful and flexible Berkeley Packet Filter syntax. The authors use the visualization to examine the scan detection capabilities of two IDSs to identify possible flaws in those scan detection algorithms.

5 Conclusion

VizSec is a growing community that is attempting to solve the important problems of computer security by supporting human analysts through information visualization. This chapter has highlighted the motivation for VizSec and presented some of the tasks VizSec tools support and the data sources visualized. Examples of visualizations of packet traces, network flows, and intrusion detection alerts were presented to provide an understanding of some of the themes that VizSec research has grappled with and solved, particularly for CND.

Allen, J., Christie, A., Fithen, W., McHugh, J., Pickel, J., Stoner, E.: State of the practice of intrusion detection technologies. Tech. Rep. CMU/SEI-99-TR-028, Carnegie Mellon University/Software Engineering Institute (1999)

Card, S.K.: Information visualization. In: Jacko, J.A., Sears, A. (eds.) The Human Computer Interaction Handbook, pp. 544–582. Lawrence Erlbaum Associates, Mahwah, NJ (2003)

Card, S.K., Mackinlay, J.D., Shneiderman, B. (eds.): Information Visualization: Using Vision to Think. Morgan Kaufmann, San Francisco, CA (1999)

Conti, G., Grizzard, J., Ahamad, M., Owen, H.: Visual exploration of malicious network objects using semantic zoom, interactive encoding and dynamic queries. In: Proceedings of the IEEE Workshop on Visualization for Computer Security (VizSEC), pp. 83–90 (2005)

Conti, G., Abdullah, K., Grizzard, J., Stasko, J., Copeland, J.A., Ahamad, M., Owen, H., Lee, C.: Countering security analyst and network administrator overload through alert and packet visualization. IEEE Computer Graphics and Applications 26(2), 60–70 (2006)

Eick, S.G., Steffen, J.L., Eric, E., Sumner, J.: Seesoft-a tool for visualizing line oriented software statistics. IEEE Transactions on Software Engineering 18(11), 957–968 (1992)

Goodall, J.R., Lutters, W.G., Rheingans, P., Komlodi, A.: Preserving the big picture: Visual network traffic analysis with tnv. In: Proceedings of the IEEE Workshop on Visualization for Computer Security (VizSEC), pp. 47–54. IEEE Press, New York (2005)

Goodall, J.R., Lutters, W.G., Rheingans, P., Komlodi, A.: Focusing on context in network traffic analysis. IEEE Computer Graphics and Applications 26(2), 72–80 (2006)

Hofmeyr, S.A., Forrest, S., Somayaji, A.: Intrusion detection using sequences of system calls. Journal of Computer Security 6(3), 151–180 (1998)

Lakkaraju, K., Yurcik, W., Lee, A.J.: Nvisionip: Netflow visualizations of system state for security situational awareness. In: Proceedings of the ACM Workshop on Visualization and Data Mining for Computer Security (VizSEC/DMSEC), pp. 65–72 (2004)

Lakkaraju, K., Bearavolu, R., Slagell, A., Yurcik, W.: Closing-the-loop: Discovery and search in security visualizations. In: Proceedings of the IEEE Workshop on Information Assurance and Security (IAW), pp. 58–63 (2005)

Lee, C.P., Copeland, J.A.: Flowtag: A collaborative attack-analysis, reporting, and sharing tool for security researchers. In: Proceedings of the ACM Workshop on Visualization for Computer Security (VizSEC), pp. 103–108. ACM, New York (2006)

Lee, W., Stolfo, S.J., Mok, K.W.: Adaptive intrusion detection: A data mining approach. Artificial Intelligence Review 14(6), 533–567 (2000)

Livnat, Y., Agutter, J., Moon, S., Erbacher, R.F., Foresti, S.: A visualization paradigm for network intrusion detection. In: Proceedings of the IEEE Workshop on Information Assurance and Security (IAW), pp. 92–99 (2005a)

Livnat, Y., Agutter, J., Shaun, M., Foresti, S.A.F.S.: Visual correlation for situational awareness. In: Agutter, J. (ed.) IEEE Symposium on Information Visualization (InfoVis), pp. 95–102 (2005b)

McHugh, J.: Intrusion and intrusion detection. International Journal of Information Security 1(1),

Shneiderman, B.: Treemaps for space-constrained visualization of hierarchies (2006). http://www.cs.umd.edu/hcil/treemap-history/

Stoll, C.: The Cuckoo’s Egg: Tracking a Spy Through the Maze of Computer Espionage. Pocket Books, New York (1989)

The Real Work of Computer Network Defense Analysts

The Analysis Roles and Processes that Transform Network Data into Security Situation Awareness

A. D’Amico and K. Whitley

Abstract This paper reports on investigations of how computer network defense (CND) analysts conduct their analysis on a day-to-day basis and discusses the implications of these cognitive requirements for designing effective CND visualizations. The supporting data come from a cognitive task analysis (CTA) conducted to baseline the state of the practice in the U.S. Department of Defense CND community. The CTA collected data from CND analysts about their analytic goals, workflow, tasks, types of decisions made, data sources used to make those decisions, cognitive demands, tools used and the biggest challenges that they face. The effort focused on understanding how CND analysts inspect raw data and build their comprehension into a diagnosis or decision, especially in cases requiring data fusion and correlation across multiple data sources. This paper covers three of the findings from the CND CTA: (1) the hierarchy of data created as the analytical process transforms data into security situation awareness; (2) the definition and description of different CND analysis roles; and (3) the workflow that analysts and analytical organizations engage in to produce analytic conclusions.

1 Introduction

As government and business operations increase their reliance on computer networks and the information available on them, defending these valuable networks and information has become a necessary organizational function. Risks have appeared from many sources. The online world is witnessing increasingly sophisticated technical and social attacks from organized criminal operations. Moreover, an estimated 120 countries are using the Internet for political, military or economic espionage (McAfee, 2007).

The broad area of cyber security encompasses policy and configuration decisions, virus scanning, monitoring strategies, detection and reaction. In the commercial world, the domain of expertise for securing and defending information resources is referred to as information security (InfoSec). U.S. governmental organizations use the synonymous terms computer network defense (CND) and defensive information operations (DIO).

This paper treats the topic of CND analysis from the perspective of the people working as professional CND analysts. We discuss how their user requirements should apply to the design of CND visualization tools. To describe the nature of CND analysis, we draw upon a cognitive task analysis (CTA) that we conducted in the 2004–2005 timeframe using mainly CND analysts working within U.S. Department of Defense (DOD) organizations. D’Amico et al. (2005) provides a preliminary report on that work. The research was designed to gain a full understanding of the daily CND analysis process. Three design considerations were: to understand both the similarities and differences in how network data was analyzed across different organizations; to include analysts whose responsibilities ranged from defending local networks to looking for attacks more broadly across a community (i.e., the notion of enclave, regional and community monitoring); and to include perspectives stemming from both tactical and strategic missions.

The CTA research was undertaken with several goals in mind, including to serve as foundation material for tool developers who do not have easy access to CND analysts and to provide requirements for the design of successful visualization for computer security. These goals also motivate this paper. This paper summarizes three findings from the CTA: (1) the hierarchy of data created as the analytical process transforms data into security situation awareness; (2) the definition and description of different CND analysis roles; and (3) the workflow that analysts and analytical organizations engage in to produce analytic conclusions. We pinpoint cognitive needs of CND analysts, rather than the software and system requirements. The analytic process is a joint (both human and machine) cognitive system, and the pipeline of CND analysis will not be automated in the near future. The needs of human analysts will remain a critical component of successful CND and should be considered when designing CND visualizations.


To answer these questions, CND analysts are responsible for tasks such as collecting and filtering computer network traffic, analyzing this traffic for suspicious or unexpected behavior, discovering system misuse and unauthorized system access, reporting to the appropriate parties and working to prevent future attacks. CND analysts consult the output of automated systems that provide them with network data that have been automatically collected and filtered to focus the analyst’s attention on data most likely to contain clues regarding attacks. These automated systems (such as firewalls, border gateways, intrusion detection systems (IDSs), anti-virus systems and system administration tools) produce log files and metadata that the analyst can inspect to detect suspicious activities.

To gauge the missions and analytic tasks across the CND community, Carnegie Mellon University (Killcrece et al., 2003) conducted a study of 29 Computer Security Incident Response Teams (CSIRTs), of which 29% were military, and listed the major activities of the teams surveyed. A summary appears in Table 1 along with the percentage of organizations reporting these activities. The bold typeface highlights those activities that were of interest to our CND CTA research. Our research focused on understanding how CND analysts inspect raw data and build their comprehension into a diagnosis or decision, especially in cases requiring data fusion and correlation across multiple data sources. The CTA did not include the work of vulnerability assessments, penetration testing, insider threat or malware analysis.

Table 1 The major activities performed by CND analysts and percentages of organizations reporting these activities (Killcrece et al., 2003)

Activities of Computer Security Incident Response Teams

Perform a technology watch or monitoring service 55%

The CMU study also categorized CND activities or functions into three groups: reactive, proactive, and security quality management. Reactive activities are triggered by a preceding event or request such as a report of wide-spreading malicious code or an alert identified by an IDS or network logging system. Looking to the past, reactive tasks include reviewing log files, correlating alerts in search of patterns, forensic investigation following an attack and identification of an attacker who has already penetrated the network. Looking to the future, proactive activities are undertaken in anticipation of attacks or events that have not yet manifested. Proactive tasks include identifying new exploits before they have been used against the defended network, predicting future hostile actions and tuning sensors to adjust for predicted attacks. Security quality management activities are information technology (IT) services that support information security but that are not directly related to a specific security event; these include security training, product evaluation, and disaster recovery planning. Killcrece et al. reported, and our CTA results support, the fact that most CND analysis work is reactive, not proactive. In the CND CTA, we looked for examples of proactive work; however, the majority of the analytic activity was reactive. In describing the analysis roles below, we note instances in which proactive tasks can occur.

Alberts et al. (2004) extended the work of Killcrece et al. in a report that advocates best-practice workflows for effective incident management. Their models represent what incident response should or could be and do not necessarily represent the actual experiences of most CND analysts. By comparison, our CTA studied the state of the practice, sought to understand the existing factors that impede successful analysis and identified opportunities to improve situation awareness. Biros and Eppich (2001) conducted a CTA of rapid intrusion detection analysts (which include triage and aspects of escalation analysis, as defined below) in the U.S. Air Force and identified four requisite cognitive abilities: recognizing non-local Internet Protocol (IP) addresses, identifying source IP addresses, developing a mental model of normal, and sharing knowledge. We used their work as a starting point, but studied the larger range of CND analysis beyond triage analysis and beyond the Air Force.

3 Methods

Generally, CTA is the study of an individual’s or team’s cognitive processes, activities and communications within a specific work context. CTA uses naturalistic observation techniques to elucidate expertise and to understand the actual effect of processes and systems (e.g., software systems) built to automate or assist human decision makers. Ideally, a CTA involves both observing individuals as they go about their work and asking directed questions about the way in which they approach the problems, how they decide what steps to take, their communications with their co-workers and the difficulties of their work. In a CTA, care is taken to distinguish between the inherent work of a domain and the work that may be created by the current working environment and tools; in this way, the CTA can provide insight into how the current working environment helps or hinders the ultimate goals of the work. The output of a CTA is a detailed description of the tasks that an individual or team performs, the data on which they operate, the decisions they make and the processes and activities (cognitive, communicative and perhaps physical) that they engage in to reach those decisions.

During the CTA described in this paper, 41 CND professionals working in seven different organizations participated. Most were currently active analysts; a few were managers who were not performing analysis on a day-to-day basis. They varied in level of expertise and represented a variety of job titles and work roles, as defined by their organizations. We focused on CND analysts who look at network traffic and related data to determine whether the information assets are under attack and who the attacker is. To collect data, we used a combination of four knowledge capture techniques: semi-structured interviews, observations, review of critical incidents and hypothetical scenario construction. In semi-structured interviews, the researcher guided discussion with an analyst by using a checklist of questions, yet also used wide latitude to encourage the subject to describe the day-to-day work in detail. Observations involved watching analysts at work combined with asking questions to clarify the process. Review of critical incidents involved dissecting past incidents that challenged the analyst’s skills. The technique of scenario construction involved working with analysts to flesh out an imaginary analysis case of typical offensive actions taken by a sophisticated attacker and defensive actions taken by the CND analyst. Scenario construction allowed analysts to reveal the kinds of information they seek from available data sources, knowledge of adversary operations and techniques, and types of connections they make between seemingly disparate pieces of information.

4 Findings

While the organizations participating in the CTA differed in their stated mission (such as protecting a single network, identifying trends in computer attacks across the entire DOD, or performing CND services for customers outside one’s own organization), they had much in common. This overlap, however, was obscured by the lack of standard terminology. Whereas various members of the community used common terms (e.g., event and alert), they often used the terms differently or without a specific definition. Also, at times, the analysts used different terms that gave an initial impression that they had different missions or needs. Therefore, a primary task in the CTA data analysis became to analyze the participants’ usage of terms in the context of the details of their work. By sifting through the details, we identified similarities across the community. For tool designers, these widespread cognitive processes are valuable to understand, because they are fundamental aspects of CND.

The following sections describe CND analysis from three perspectives, highlighting aspects of human cognition. The first section considers the status of the data as raw data are processed into analytic product. The second section presents a range of analysis roles, based on work performed, not on job title. The third section describes a synthesized workflow that captures the steps in the analytic process. Throughout the discussion, the focus is on the similarities across organizations, not on exceptional cases. In these findings, we use a standardized vocabulary, not in a prescriptive way, but as a way to illuminate the CND process data for the purpose of the CTA.

4.1 Data Transformation in CND Analysis

As a CND analyst works, data are filtered, sorted, retained or discarded based on the analyst’s responsibilities and expertise. Different responsibilities involve different data. For example, some analysts primarily review the newest packet traffic or sensor data; others concentrate on data that have already been identified as suspicious, but require further analysis and correlation with additional data sources. As we discussed this process with the analysts, we ascertained that, as analysis proceeds, data are transformed as an analyst’s confidence in an analytic conclusion increases. The cognitive transformation can be seen as a hierarchy of data filters with the levels reflecting the increasing certainty that a reportable security violation has been identified (depicted in Fig. 1). It is worthwhile to note that the volume of data generally decreases from level to level.


Raw data are the most elemental data in the hierarchy. At the start of the entire CND analytic workflow, the raw data can be network packet traffic, netflow data or host-based log data. Especially because the amount of raw data is so large, analysts do not generally inspect all raw data. Instead, raw data are passed through an automated process (e.g., an IDS) that makes initial filtering decisions (e.g., based on attack signatures). The automated filtering results in a substantially reduced amount of data requiring human attention.

Interesting activity refers to the data that has been flagged by the initial automated filter and sent to a CND analyst for inspection. We heard it referred to as activity, alerts, alarms, data, logs and interesting activity. Some analysts objected to the term alerts because they felt strongly that activity is not an alert until a human analyst has inspected and verified that the activity is worthy of further attention. Interesting activity might be presented to the CND analyst in the form of packet header data from TCPDUMP or as an IDS alert. Depending on the techniques employed by the automated filter, the interesting activity may be largely composed of false positives. Analysts perform triage on interesting activity, examining the alert details and related data, throwing out false positives and retaining the remainder for closer inspection.

Suspicious activity remains after the triage process because the CND analyst believes that the activity is anomalous for the monitored network or because it adheres to a signature or attack pattern associated with malicious intent. Some CTA participants called this type of activity an event, anomaly or suspicious activity. Examples of suspicious activity include a series of scans from the same source IP address; an unusual increase in traffic to or from a server inside the network; virus infections on several workstations in a short time period; and misuse of the monitored network by employees downloading inappropriate content.

Event refers to suspicious activity that a CND analyst has a responsibility to report, based on the organization’s mission and policies. For example, an organization might be charged to report only on certain types of intrusion attempts and not on employee policy violations (e.g., using unauthorized peer-to-peer software); in this case, a policy violation would not be escalated as an event.

At the level of events, the volume of data has been significantly reduced from that of raw data. It is also the point at which CND analysts begin grouping individual activity based on common characteristics (such as source and destination IP addresses, time, attack characteristics or attacker behavior). Along the analysis workflow, CND analysts are also expanding their understanding of the data by searching for and adding new facts that show the extent of the security violation, including the actors, machines, and information that have been compromised. The work for the CND analyst inspecting event data is to confirm that a security violation has occurred and to provide as full an understanding as possible of the violation.

Incident is the point when one or more CND analysts have confirmed the occurrence and seriousness of one or more events and report on the collection of relevant data. The incident level is usually a formal, documented point in the analysis process. A CND analyst prepares a formal report describing the incident. After any required approval, the incident report is released as an official analytic product. Some organizations have more than one type of report (e.g., a rapid-release distribution mechanism to distribute early information as quickly as possible and a formal reporting mechanism which is the finalized incident description). Within the DOD, official incidents are assigned to the responsible party for incident handling. Incidents may be tagged with a category type or priority ranking. Currently, there is no international consensus on incident categories or how to measure incident severity.

Incident reports are distributed to interested parties based on factors like category type and official reporting chain. The topic of report distribution and data sharing is closely related to the fact that CND analysis is often done collaboratively across organizations. Monitoring often takes place at enclave, regional and community levels with formal or informal collaboration and sharing across levels. For example, in the DOD, CND analysis occurs at individual military bases (i.e., enclave level), at the military service level (i.e., regional level) and across the entire DOD (i.e., community level). Data and reports flow up and down this reporting chain. Currently, the Joint Task Force for Global Network Operations (JTF-GNO) provides DOD community-wide analysis. For state and local governments, US-CERT, operated by the Department of Homeland Security (DHS), provides the community level. In the commercial world, companies have corporate monitoring (i.e., enclave or regional level) and may also report to a community service (e.g., a financial institution may participate in the Financial Services Information Sharing and Analysis Center (FS-ISAC)). In the case of Managed Security Service Providers (MSSPs), incidents are reported to individual customers (i.e., enclave level); an MSSP might also perform trend analysis across its entire customer base.

The benefit of wider analysis at the community level is indisputable. Aside from individual enclave concerns about the sensitivity of their data, the value of grouping CND data stems from the fact that certain incidents cannot be fully understood within a single enclave. When protecting national interests, it is important to detect related activity and larger trends occurring across individual enclaves.

Intrusion sets are sets of related incidents. In the organizations we visited, intrusion sets and problem sets were essentially synonymous terms. Intrusion sets commonly arise at the community level when CND analysts can review incidents from different reporting organizations and group these incidents based upon shared features such as source and destination IP addresses, time, attack characteristics or attacker behavior. When a CND community suspects that separately reported incidents emanate from the same source or sponsor, the community groups the incidents into an intrusion set. Just as incidents are almost universally a formal analytic product, the designation of an intrusion set is an official decision point for the organizations in our CND CTA. The community then increases attention and resources to detecting, understanding and responding to relevant activity. This process can include decisions about tuning data collection and IDS signatures to catch all new related data.
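As a rough illustration of grouping incidents on a shared feature, the sketch below links incidents that have a source IP address in common, directly or through a chain of other incidents. It is only one possible mechanization under simplifying assumptions: the incident identifiers and addresses are hypothetical, a single feature is used, and in practice the designation of an intrusion set is an analyst decision rather than an automatic one.

```python
from collections import defaultdict


def group_into_intrusion_sets(incidents):
    """Group incidents that share a source IP, directly or transitively.

    `incidents` maps an incident id to the set of source IPs seen in it.
    Returns a list of sets of incident ids; ungrouped incidents are dropped.
    """
    parent = {inc: inc for inc in incidents}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}  # source IP -> first incident that reported it
    for inc, ips in incidents.items():
        for ip in ips:
            if ip in first_seen:
                parent[find(inc)] = find(first_seen[ip])
            else:
                first_seen[ip] = inc

    groups = defaultdict(set)
    for inc in incidents:
        groups[find(inc)].add(inc)
    return [g for g in groups.values() if len(g) > 1]


# Hypothetical incidents reported by three different enclaves.
incidents = {
    "base-A/001": {"203.0.113.9"},
    "base-B/017": {"203.0.113.9", "198.51.100.4"},
    "base-C/004": {"198.51.100.4"},
    "base-C/005": {"192.0.2.77"},
}
intrusion_sets = group_into_intrusion_sets(incidents)
print(intrusion_sets)
```

The example also shows why community-level review matters: base-A/001 and base-C/004 share no address with each other and are linked only through the report from a third enclave.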


such as rapid intrusion detection, consultation with other analysts and even training of junior analysts. In considering how to address this lack of common descriptions, we decided to categorize analytical function based on the actual tasks performed. The result of this exercise was a set of six broad analysis roles that accounted for all of the cognitive work observed: triage analysis, escalation analysis, correlation analysis, threat analysis, incident response and forensic analysis.

These roles represent categories of analysis; the roles do not directly map to job titles. An analyst with a single job title may perform work across more than one of the analysis roles. The roles illuminate the amount and types of data that the analyst is integrating and the goal of the analysis. The roles also reflect authority boundaries imposed by law and policy (e.g., relating to privacy). Some of the roles align closely with reactive analysis; some include aspects of proactive analysis.

Triage analysis is the first look at the raw data and interesting activity. The triage decision is a relatively fast decision about whether the data warrant further analysis. Triage encompasses weeding out false positives and escalating interesting activity for further analysis, all within a few minutes of viewing the data. Commonly, an analyst inspects IDS alerts and the immediate associated traffic/flow metadata and/or packet contents.

The majority of analysts in the CND CTA performed triage analysis. It is also very common that novice CND analysts are first assigned the job of triage analysis and work under the guidance of more senior analysts. The triage cases that novices encounter provide on-the-job training that increases the range of security violations that they can easily recognize.

Triage analysis is reactive in nature, since it is based on reviewing and sorting activity that has already occurred. Within the CTA, we encountered the following relevant CND job titles: level 1 analyst, first responder and real-time analyst. For the organizations in the CTA, analysts with these job titles spent the majority of their time performing triage analysis. In a small organization or at a remote site within a large organization (e.g., Air Force base), triage analysis may be performed, albeit in a limited way, by the system administrator or network manager.

Escalation analysis refers to the steps taken to investigate suspicious activity received from triage analysis. Escalation analysis requires increasing situation awareness of the suspicious activity. The process may take hours or even weeks from start to finish, during which the CND analyst marshals more data, usually from multiple data sources and from inside and outside the organization, resulting in greater comprehension of the attack methods, targets, goal and severity. The CND analyst may also make an initial assessment of attacker identity and the mission impact of an attack.


A main goal of escalation analysis is to produce incident reports. Compared to triage analysis, escalation looks at related data over longer periods of time (e.g., over the last several months of collected data) and from multiple data sources (e.g., including information from threat reports). The time needed to process these data queries and to interpret and assemble the results accounts for the fact that escalation analysis takes longer than triage analysis. In triage analysis, emphasis is on speed; correspondingly, the analysis usually involves limited queries on a single data source. In the current practice of CND analysis, the combination of triage and escalation analysis is what is often referred to as a real-time monitoring capability (although it does not actually occur in real time).

Sometimes, escalation analysis is based on tip-offs received from colleagues in other analysis groups and from cooperating organizations. This situation occurs particularly for senior analysts who have good contacts throughout the CND community.

Escalation analysis is largely reactive. Less commonly, escalation analysis involves proactive actions such as tuning sensors to look for predicted attacks or activity related to a current investigation. Within the CTA, we encountered the following relevant CND job titles: level 2 analyst and lead analyst.

Correlation analysis is the search for patterns and trends in current and historical data. At the community level, correlation analysis includes grouping data into intrusion sets; these investigations can take days to months. When conducted at the community level, correlation analysis is closely related to threat analysis.

Correlation tasks include retrospectively reviewing packet data, alert data or incident reports collected over weeks or months of CND monitoring, looking for unexplained patterns. Patterns may arise from different data attributes such as specific source or destination IP addresses, ports used, hostnames, timing characteristics, attack details and attacker behavior. By discovering patterns, CND analysts can uncover suspicious activity that was previously unnoticed. An analyst might not know what patterns they are looking for in advance; instead, the analyst might “know it when they see it.” When they encounter a pattern that they cannot explain, they form hypotheses about potential malicious intent, which they try to confirm or contradict via additional investigation.
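One simple way to surface candidate patterns in historical data is to count how many distinct days each value of an attribute appears on. The sketch below is a hedged illustration only: the alert fields, the sample data and the min_days threshold are assumptions, and it does not represent the procedure used by the analysts in the study, who often recognize patterns without a predefined rule.

```python
from collections import defaultdict
from datetime import date


def recurring_values(alerts, attribute, min_days=3):
    """Return {value: day_count} for values seen on at least `min_days` distinct days."""
    days_seen = defaultdict(set)
    for alert in alerts:
        days_seen[alert[attribute]].add(alert["day"])
    return {
        value: len(days)
        for value, days in days_seen.items()
        if len(days) >= min_days
    }


# Hypothetical alert history spanning several weeks.
alerts = [
    {"day": date(2007, 3, 1), "src_ip": "203.0.113.9", "dst_port": 22},
    {"day": date(2007, 3, 1), "src_ip": "192.0.2.77", "dst_port": 80},
    {"day": date(2007, 3, 8), "src_ip": "203.0.113.9", "dst_port": 22},
    {"day": date(2007, 3, 15), "src_ip": "203.0.113.9", "dst_port": 3389},
    {"day": date(2007, 3, 15), "src_ip": "198.51.100.4", "dst_port": 80},
]

by_source = recurring_values(alerts, "src_ip")
by_port = recurring_values(alerts, "dst_port", min_days=2)
print(by_source)
print(by_port)
```

A recurring value is only a prompt for a hypothesis; confirming or contradicting malicious intent still requires the additional investigation described above.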

In the CND CTA, we encountered few analysts whose primary role was correlation analysis. Only 5% of the CTA participants were primarily responsible for community-wide correlation; another 5% were primarily responsible for the post hoc review, at the regional level, to search for anomalies or patterns not found during triage and escalation analysis.

Correlation analysis is reactive when it focuses on discovery within existing data. It has the potential to be proactive if discovered patterns are used to make predictions about next likely actions. Within the CTA, we encountered the following relevant CND job titles: level 2 analyst, correlation analyst and site-specific analyst. We choose the term correlation analysis, not out of a technically correct use of the concept of correlation, but rather due to the prevalent use of the term in the CND community to refer to grouping related data.
