
Network Traffic Analysis Using tcpdump


DOCUMENT INFORMATION

Title: Network Traffic Analysis Using tcpdump
Author: Judy Novak
Institution: Johns Hopkins University Applied Physics Laboratory
Subject: Network Traffic Analysis
Type: lecture notes
Year of publication: 2000
City: Baltimore
Pages: 76


Introduction to tcpdump

All material Copyright © Novak, 2000, 2001. All rights reserved.


Writing tcpdump Filters

Examination of Datagram Fields

Beginning Analysis

Real World Examples

Step by Step Analysis

References


Course Objectives

Introduce the fundamentals of tcpdump

Explain how to write tcpdump filters

Examine fields in datagram for uses/misuses

Analyze traffic by placing it in categories

Demonstrate “real-world” analysis using tcpdump

Let you participate in the analysis process

The objectives of this course are to introduce you to the fundamentals and benefits of using tcpdump as a tool to analyze your network traffic. We’ll start by introducing the concepts and output of tcpdump. One of the most important aspects of using tcpdump is being able to write tcpdump filters to look for specific traffic. Filter writing is fairly basic unless you want to examine fields in an IP datagram that don’t fall on byte boundaries, which is why an entire section is devoted to the art of writing filters.

Before we start to use tcpdump to analyze traffic, we’ll examine many of the fields found in the IP datagram. This is done to familiarize you with those fields in theory and with how they might be used in practice. We’ll study how and why fields might be changed and for what purpose. Next, we’ll start the basic analysis process by looking at tcpdump output and categorizing the kind of traffic that you can see.

Then, we’ll take a look at some real-world examples of how tcpdump was used on monitored networks to discover what was happening. Next, the analysis process will be inspected step by step, often with missteps, to get you comfortable with it.

As a note, all tcpdump output shown in this course is activity that actually occurred. Source and destination hosts and IP addresses have been altered to obfuscate the true identities.


Overview

Introduction to tcpdump

Writing tcpdump filters

Examination of Datagram Fields

Beginning Analysis

Real World Examples

Step by Step Analysis

This page intentionally left blank


Introduction to tcpdump


Objectives

Examine the strengths/weaknesses of tcpdump

Organize collection/analysis process of tcpdump data via

Interpretation of payload/hex output


Introduction

Provides absolute fidelity

Universally available and used

A

One of the most important parts of your security infrastructure is at least one tool or software package that captures an audit trail, or historical record, of the traffic that enters or leaves your network. There will be times when you will be required to examine activity or connections that occurred in your network, not just traffic that caused an alarm to sound. For instance, what if you suspect that the packet filtering router that acts as your perimeter defense has been acting strangely after some major network changes were made? You would have to examine the traffic that was allowed into your network to help determine the problem. That is where tcpdump is invaluable.

Also, many tools, even firewall logs, will display suspicious traffic, yet only partial data is displayed. What if you get a log of rejected traffic, but it doesn’t display or keep TCP flags? You’ll never know what kind of connection was attempted. tcpdump allows the analyst to examine all the bits and fields that are collected. If nothing is “wrong” with the connection, examination at the bit level is unnecessary. Yet, if you suspect something “foul” with the traffic, you really need access to all the data down to the bit level.

Finally, tcpdump is universally used and very portable. If you become familiar with this software or its Windows counterpart, windump, you can use it on just about any platform to assist you in analyzing traffic.


Weaknesses

By default, doesn’t collect all the payload

Does not scale well on large networks

tcpdump can collect a large volume of data on larger networks. This can be alleviated by not collecting all the data on the network, perhaps by omitting web traffic (port 80). Another way to deal with this is more disk space and faster processors to analyze all the collected data. But, at some point, the volume gets unwieldy.

tcpdump blindly collects packet after packet. It has no notion of state and cannot tell that a given packet is anomalous because it does not follow the flow of a normal connection. And while tcpdump has some primitive arithmetic operations and ways to manipulate bits, it cannot do complex operations for analyzing data.

Finally, while it is an excellent way to collect data, tcpdump does not attempt to interpret what it sees. It does have some integrity checking operations for certain data to make sure that the data is not irregular, but the analyst has to have the training and savvy to interpret the data. For the sophisticated analyst, this is a bonus because she or he can make the correct call. Compare this with a tool that is prone to false positives and gives no way of verifying the alarmed event. But, for an analyst who has little training, tcpdump can be daunting since it does not interpret events.


tcpdump Versions

tcpdump: Unix version; official current version 3.4

ftp://ftp.ee.lbl.gov/tcpdump.tar.Z

ftp://ftp.ee.lbl.gov/libpcap.tar.Z

windump: Windows version

tcpdump is officially supported by the Lawrence Berkeley Labs. The current version is 3.4. There is also an effort to improve tcpdump and patch known problems with tcpdump and libpcap that appears to be a collective effort of anyone interested. The software for this effort can be found at www.tcpdump.org. Their current version is 3.5.

For the Unix versions of tcpdump, you need to download software known as libpcap, which implements a portable framework for capturing low-level network traffic. windump is a Windows variant of tcpdump. It also requires an application program interface to collect the traffic, known as winpcap.

The unofficial version of tcpdump has some nice enhancements. It decodes more of the protocols at the application layer and has a very nice capability of converting hexadecimal payload to character output.


07:00:48.036776 myhost.com > ping.net: icmp: echo reply (DF)

07:02:12.622460 log.net.3155 > syslog.com.514: udp 101

07:03:01.132414 send.net.32938 > mail.com.25: S 248631:248631(0) win 8760

tcpdump running on a host “sniffing” network packets

On this slide, we see a host running tcpdump and gathering records from the network interface. The three records shown are what tcpdump has collected. tcpdump has a default standard output based on the protocol (TCP, UDP, ICMP) of the record that is displayed. While the protocols share a similar format, they are also distinct in what is displayed.

By default, tcpdump will collect and print, in a standard format, all the traffic passing on the network. There are command line options for tcpdump that will alter the default behavior, either by collecting specified records, printing in a more verbose mode, printing in hexadecimal, or writing records as “raw packets” to a file instead of printing to standard output.


Sample tcpdump Output

Sample UDP Record

09:39:19.470000 nmap.edu.728 > dns.net.111: udp 56

Fields: timestamp, source host.port > destination host.port: protocol, bytes

Sample TCP Record

09:35:53.660000 nmap.edu.4 > dns.net.111: SF 136747297:136747297(0) win 1028

Fields: timestamp, source host.port > destination host.port: flags, beginning seq #:ending seq # (data bytes)

09:32:43.910000 nmap.edu.1171 > dns.net.139: S 2490962508:2490962508(0) win 512

09:32:43.910000 nmap.edu.1173 > dns.net.21: S 62697789:62697789(0) win 512

09:32:43.910000 nmap.edu.1193 > dns.net.22: S 1360146849:1360146849(0) win 512

09:32:43.920000 nmap.edu.1194 > dns.net.1114: S 372884098:372884098(0) win 512

Since we’ll review a lot of tcpdump output in this course, here’s a chance to get more comfortable with it. This is sample output from what appears to be an nmap scan, a popular and informative scan.

All records have a timestamp. The sensor host (Redhat Linux 5.2) that captured these records has the precision to capture hundredths of seconds, although tcpdump allows places for up to millionths.

Different protocols will have different representations in tcpdump output. One of the first challenges is to identify the protocol (TCP, UDP, ICMP). Most will be labeled, and while TCP isn’t explicitly labeled, it is the only one with flag bits and sequence and acknowledgment numbers, to name a few. Some protocols like DNS will be interpreted at the application layer. Because of this, you may not see the normal clues that you are used to. It may not be obvious whether the record is UDP or TCP, so it is important to look for clues as to which it is.

In general, tcpdump gives details in the form source host > destination host.

Note that the number of bytes transferred on SYN packets, shown as (0), is normally 0, since these packets carry no payload; they are just part of the three-way handshake.
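The field layout of a TCP record described above can be made concrete with a small parser. This is only a sketch, not part of tcpdump: the regular expression and the function name parse_tcp_record are assumptions made for illustration, and the pattern covers only the default record layout shown on this slide.

```python
import re

# Pattern for the default TCP record layout shown on the slide:
# timestamp src.port > dst.port: flags begin_seq:end_seq(bytes) win size
# The set of flag characters (S, F, P, R and ".") is an assumption.
TCP_RECORD = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d+) "
    r"(?P<src>\S+)\.(?P<sport>\d+) > "
    r"(?P<dst>\S+)\.(?P<dport>\d+): "
    r"(?P<flags>[SFPR.]+) "
    r"(?P<seq_begin>\d+):(?P<seq_end>\d+)\((?P<nbytes>\d+)\) "
    r"win (?P<win>\d+)"
)


def parse_tcp_record(line):
    """Return the fields of a TCP record as a dict, or None if no match."""
    match = TCP_RECORD.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the numeric fields from text to integers.
    for key in ("sport", "dport", "seq_begin", "seq_end", "nbytes", "win"):
        fields[key] = int(fields[key])
    return fields


record = parse_tcp_record(
    "09:35:53.660000 nmap.edu.4 > dns.net.111: "
    "SF 136747297:136747297(0) win 1028"
)
print(record)
```

A UDP record such as the first sample does not match this pattern and yields None, which mirrors the point that TCP is recognized by its flags and sequence numbers rather than by an explicit label.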


Collects tcpdump data in hourly files

Analyzes each hour’s data for anomalies

Formats anomalous data in html for browsing

Comes with scripts to assist in examining data

Shadow (Secondary Heuristics for Defensive Online Warfare) is an intrusion detection system available to all for free. It can be found at http://www.nswc.navy.mil/ISSEC/CID. Shadow uses tcpdump as its underlying collection and processing tool, turning tcpdump from a packet collecting tool into an intrusion detection system. Shadow collects data from the network interface and stores it in hourly files in raw tcpdump compressed format. It analyzes each hour’s collected data after the fact, running a series of tcpdump filters against it to look for anomalies and for one-to-many source IP to destination IP traffic.

Shadow formats into html all the events of interest detected by the tcpdump filters and processed by some perl programs. The analyst can examine the output with a browser and further investigate activity using some additional perl scripts to look through an hour’s or day’s worth of data.

Using Shadow relieves the analyst from having to worry about the collection of tcpdump data; it automates this process. Further, it gives the analyst an automated way of examining activity. Still, as with any other intrusion detection system, it requires a savvy analyst to accurately interpret the output. However, since it is predicated upon tcpdump, the analyst has the ability to examine all the collected data down to the bit level.


Performs traffic analysis

Primary focus on datagram headers

Pull-based architecture

Analyst reviews hourly events of interest via web browser

Requires a savvy analyst to interpret output

Freeware available from www.nswc.navy.mil

Shadow is a Unix-based intrusion detection system. It has a sensor component and an analysis component. The sensor component collects network traffic, and the analysis component fetches that traffic and analyzes it. Both the sensor and analysis hosts process data in hourly timeframes.

The entire IP datagram is not captured because Shadow is mostly concerned with anomalies or events of interest found in the header portions of the datagram. The headers examined are the IP, TCP, UDP and ICMP headers. Much insight can be gained from examining these headers. By default, some payload or data is captured in the datagram. Shadow does not attempt to analyze this, but it is there in case you want to analyze it.

Each hour the analysis host analyzes the previous hour’s traffic for events of interest. These events of interest are formatted in html for viewing by an analyst using a browser. This is known as a pull-based approach since the analyst is required to examine the records; the analyst is not informed of, or pushed alerts about, anomalous events.

Shadow was developed by the Shadow team at the Naval Surface Warfare Center. It is still maintained and upgraded by this team. Shadow can be downloaded at no cost from http://www.nswc.navy.mil/ISSEC/CID. Click on the link for Current Shadow Software.


Provides an audit trail of activity to/from network

Provides an intimate view of activity

While not the only reason to install and use Shadow, a very compelling reason is the price tag. In many cases, but not this one, you get what you pay for. Shadow is an excellent no-cost traffic analysis tool.

Another benefit is that once you master Shadow, you can change it liberally at any time. For instance, if you hear of a new exploit and can fashion a signature with a tcpdump filter, you can modify Shadow immediately. Compare this with some intrusion detection systems that do not offer the capability to change filters or signatures. You have to wait for the software company to update the filters when they get around to it, and the updates may not include signatures that you would like to see.

Also, since you get all the source code with Shadow, you can customize it for your whims and needs. This is highly unusual and allows you to make changes based on your proficiency with the software.

Shadow uses tcpdump as its collection software. By default, you will collect most activity going into and out of your network. This can be very beneficial in providing an audit trail of activity in the network. If you ever find yourself in the midst of some kind of incident, this may be a very valuable attribute for an intrusion detection system to have.

Finally, some of the more GUI-oriented intrusion detection systems do not allow the user to examine the actual traffic at the IP datagram level. Shadow, by virtue of tcpdump, allows the user a very intimate view of the data collected. You maintain fidelity of data and can use all fields for interpretation and analysis. If the traffic you are analyzing is corrupted in some way, you want to be able to inspect the entire datagram.

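[Figure: Shadow architecture. A sensor on the DMZ stores hourly data files (for example, hour 02 data); the analysis host retrieves them by secure copy, runs tcpdump filters, and produces html output.]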

The Shadow architecture is a two-host system. Typically, the sensor resides on the DMZ, but it can be placed anywhere on the network. It collects the traffic from the network interface and stores the data in hourly files, which are in raw tcpdump compressed format.

Each hour, the analysis host securely copies the files from the sensor. Using perl scripts, it orchestrates the process of running the previous hour’s tcpdump data through a set of tcpdump filters that look for anomalous activity. Another filter and perl script examine the data for signs of scans: one source IP attempting connections to multiple destination IPs. All of this information is then formatted into html for viewing by the analyst.


Traffic sent to broadcast address

Traffic from reserved private networks

Fragmentation

Initial SYN connections

Particular UDP ports

Specific ICMP traffic


For TCP records, the initial SYN connections are examined. This doesn’t necessarily mean that the connection was successful; it just indicates that the connection was attempted. Also, certain ports or hosts may have to be excluded to avoid false alarms. For UDP records, you have to maintain a list of UDP destination ports that are of interest to you.

Shadow looks for signs of a one-to-many relationship between one source IP and multiple destination hosts, which is often indicative of a scan. Finally, Shadow can be tuned to look at more granular activity directed at the core infrastructure hosts in your network.
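The one-to-many idea can be illustrated with a few lines of code. This is a minimal sketch and not Shadow’s actual implementation (which, per these notes, uses tcpdump filters and perl scripts); the function name one_to_many_sources, the threshold value and the sample address pairs are all hypothetical.

```python
from collections import defaultdict


def one_to_many_sources(pairs, threshold):
    """Return sources that contacted at least `threshold` distinct destinations."""
    destinations = defaultdict(set)
    for src, dst in pairs:
        destinations[src].add(dst)
    return {
        src: len(dsts)
        for src, dsts in destinations.items()
        if len(dsts) >= threshold
    }


# Hypothetical (source IP, destination IP) pairs pulled from one hour of records.
pairs = [
    ("5.5.5.5", "172.16.1.1"),
    ("5.5.5.5", "172.16.1.2"),
    ("5.5.5.5", "172.16.1.3"),
    ("1.2.3.4", "172.16.1.1"),
    ("1.2.3.4", "172.16.1.1"),  # repeated probe of the same host
]

suspects = one_to_many_sources(pairs, threshold=3)
print(suspects)
```

Counting distinct destinations rather than packets means that a host retrying one destination many times is not mistaken for a scanner.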

Sample Shadow Output

Shadow output is sorted tcpdump output. It is sorted by source IP and time to allow the analyst to group the activity by source IP. The above activity indicates a probe of port 3128 (squid proxy server port) by host 1.2.3.4. A second host, 2.2.2.2, is displayed because it was extracted by one of the tcpdump filters; it appears to be probing mydns.com for destination port 139, which is a NetBIOS port. Typically DNS servers do not have the NetBIOS ports open.

The final set of activity appears to be a full-blown scan from source IP 5.5.5.5. It is scanning the hosts on the 172.16.1 subnet for port 1243, which is used by a trojan known as SubSeven or BackDoorG. Having the output displayed in html makes it easier for the analyst to examine the hour’s traffic.

Examining tcpdump Output

11:55:52.069484 192.168.143.5 > 192.168.143.101: icmp: echo request

tcpdump will display any collected or processed output to standard output, typically the console or terminal. It will also attempt to resolve IP numbers to host names and to translate port numbers to known services. For instance, if a port number is 23 and it is found in the file /etc/services as being associated with telnet, tcpdump will print the service and not the port number, unless the -n option has been used to disable resolution.

As you can see, this does not display all the captured fields in the datagram. Other fields are available for display, but different command line options have to be supplied in order to see them. In the above record, we have captured an ICMP echo request.

[Figure: hexadecimal tcpdump output for the ICMP echo request. The IP header is underlined; the embedded protocol header and data (the ICMP header) follow it.]

Suppose you want to examine all the bits that are captured when tcpdump is run. There are many reasons for wanting to examine this level of detail, especially when you believe that there is some kind of deliberate crafting or alteration of the datagram.

In order to dump the bits, tcpdump has an option to display the output in hexadecimal. This is done by using the -x command line option. From the hexadecimal output, the bits can be determined. When output is displayed in hex, you will need some idea of what the fields are that you are examining. An excellent resource to assist in this task is the “bible” of TCP/IP: TCP/IP Illustrated, Volume 1 by Richard Stevens. Not only are the protocol headers conveniently located directly inside the cover, but this book uses tcpdump output to assist in the understanding of TCP/IP.

One of the first things you will need to do upon looking at the hex output is to determine where the IP header is and how long it is. We’ll see how to do that in upcoming slides. Also, you want to examine the embedded protocol and determine where that header starts and stops. Finally, you may have some kind of interest in the embedded protocol payload.


Default snaplen

Default number of bytes captured is 68

Why do we see only 54 bytes of data from tcpdump?

In the above slide, we see what appears to be a 20 byte header that is underlined. Each line of tcpdump output is 16 bytes. We see that 54 bytes of data have been captured. For the above output, the actual datagram is longer than 68 bytes (we’ll see how to compute the datagram length), but we only have 54 bytes of output. Any ideas why?


Answer: Frame Header

Frame header (Ethernet)   IP Header   ICMP Header   ICMP Data
14 bytes                  20 bytes    8 bytes       26 bytes

The reason that only 54 bytes of IP datagram data appear on the previous slide, even though the datagram is longer than 68 bytes, has to do with the collection of the frame header. In this case, we are running on a host that has an Ethernet connection. Ethernet has a 14 byte frame header, which holds fields such as the source and destination MAC addresses and the kind of embedded datagram: IP, arp or rarp. Because the default snaplen of 68 bytes includes the frame header, 14 of those bytes record the Ethernet header and only 54 bytes are left for the IP datagram.


Increasing the snaplen

For Ethernet, maximum frame size (frame header

As a test case, let’s say we want to capture the entire datagram for each record we read or process on an Ethernet network. In this case, we need to increase the snaplen to the maximum size of the datagram plus the frame header. Ethernet has a Maximum Transmission Unit (MTU) of 1500 bytes. If you add 14 bytes for the frame header, the snaplen must be 1514 bytes.

Now, to check whether we’ve collected the entire datagram, we run tcpdump with a snaplen of 1514. If we dump the collected record in hexadecimal, we find we’ve collected more than the 54 bytes. The actual datagram length is found in bytes 2-3 of the IP header (counting starts at byte 0). We discover a hex 54 in this field. In decimal, that translates to 5*16^1 + 4*16^0 = 84. And, we see that we’ve collected all 84 bytes.
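The snaplen arithmetic from the last few slides can be checked with a short sketch. The function names are made up for illustration, and the four sample header bytes are assumed values chosen to be consistent with the hex 54 length described in the notes; they are not copied from the original capture.

```python
ETHERNET_HEADER_LEN = 14  # bytes in the Ethernet frame header
DEFAULT_SNAPLEN = 68      # default number of bytes captured
ETHERNET_MTU = 1500       # largest IP datagram carried by Ethernet


def datagram_bytes_captured(snaplen, frame_header_len=ETHERNET_HEADER_LEN):
    """Bytes of the IP datagram left once the frame header is accounted for."""
    return max(snaplen - frame_header_len, 0)


def snaplen_for_full_datagram(mtu=ETHERNET_MTU,
                              frame_header_len=ETHERNET_HEADER_LEN):
    """Snaplen needed to capture a maximum-size datagram plus its frame header."""
    return mtu + frame_header_len


def ip_total_length(ip_header):
    """Read the 16-bit total length from bytes 2-3 of the IP header."""
    return int.from_bytes(ip_header[2:4], "big")


# Assumed first four bytes of an IP header whose length field is hex 0054.
sample_header_start = bytes.fromhex("45000054")

print(datagram_bytes_captured(DEFAULT_SNAPLEN))
print(snaplen_for_full_datagram())
print(ip_total_length(sample_header_start))
```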


Converting Hex Output to a Decimal Value

IP datagram length is 16 bits

16 bits = 4 hex characters

Start at the right-most character

Take each hex character and represent it as a power of 16

Simply continue labeling the remaining hex characters of the field you are converting as increasing powers of 16. Finally, after all of the characters are labeled as powers of 16, multiply each hex character by its power of 16 and add the products.

In the above example, we are looking at the length field. We have 4 hex characters because the length is a 16-bit field. We really only need to label the two right-most characters because the others are zero. After we do this, we find we have a 4 in the 16^0 position; this is really the ones position, meaning we have 4*1 or 4. The next character, 5, is in the 16^1 position. So, we multiply 5*16 for a product of 80. Therefore, the decimal conversion is 80 + 4 = 84.
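The manual procedure on this slide can be written out step by step. This is a sketch for illustration only (Python’s built-in int(text, 16) does the same job); the function name hex_to_decimal is an assumption.

```python
def hex_to_decimal(hex_chars):
    """Convert a hex string by weighting each character with a power of 16."""
    total = 0
    # Start at the right-most character; position i carries weight 16**i.
    for power, char in enumerate(reversed(hex_chars)):
        total += int(char, 16) * 16 ** power
    return total


# The 16-bit length field from the example: 4 hex characters.
print(hex_to_decimal("0054"))
```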

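[Figure: frame header fields as displayed by tcpdump: source MAC address (6 bytes), destination MAC address (6 bytes), type (2 bytes), and length (calculated).]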

There will be times that you will be interested in examining the frame header. One of the reasons for this would be to identify the source MAC address to try to determine where the packet came from: a host or perhaps a router. The frame header can be displayed for Ethernet using the -e option.

You see the source and destination MAC addresses followed by the type of packet that follows the frame header. The types of traffic you are likely to see are IP, arp and rarp. These fields are all stored in the frame header. The final field is the length, in bytes, of the frame (not including the trailing 4 bytes of cyclic redundancy check, or CRC). In this case, it is the length of the datagram plus 14 bytes since this is Ethernet. This field is not stored in the header; it is calculated and displayed in decimal when the -e option is selected.
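To see how the 14 byte frame header breaks down, here is a minimal sketch that unpacks one from raw bytes. The function name and the sample frame are hypothetical. Note that on the wire the destination MAC address comes first, even though the notes describe the display as source followed by destination.

```python
import struct

# Well-known Ethernet type values for the traffic named in the notes.
ETHERTYPES = {0x0800: "IP", 0x0806: "arp", 0x8035: "rarp"}


def parse_ethernet_header(frame):
    """Split the first 14 bytes of a frame into its three header fields."""
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])

    def as_text(mac):
        return ":".join(f"{octet:02x}" for octet in mac)

    return {
        "src": as_text(src),
        "dst": as_text(dst),
        "type": ETHERTYPES.get(ethertype, hex(ethertype)),
        # Not stored in the header: computed from the captured frame itself.
        "length": len(frame),
    }


# Hypothetical frame: broadcast destination, made-up source, type IP,
# followed by an 84 byte datagram of zero bytes as a stand-in payload.
frame = bytes.fromhex("ffffffffffff" "001122334455" "0800") + bytes(84)
header = parse_ethernet_header(frame)
print(header)
```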

Length Fields

There are several different length fields in the IP datagram. These can be somewhat confusing, since some of the values must be multiplied by a factor to determine the true length in bytes. We’ll examine these fields in this section.

[Figure: IP header layout, showing among others the flags, 13-bit fragment offset, 8-bit time to live (TTL), 8-bit protocol, 16-bit header checksum, 32-bit source IP address and 32-bit destination IP address fields. The three length fields are called out.]

4-bit IP header length: multiply by 4 to convert to bytes

16-bit IP datagram total length: already expressed in bytes

13-bit fragment offset: multiply by 8 to convert to bytes

If you look at the IP header above, you’ll see that there are three different fields containing lengths. Only the 16-bit IP datagram total length is expressed directly in bytes; the other two must be multiplied by a factor. Let’s examine these fields in more detail.


The IP header length is found in the low-order nibble (4 bits) of the first byte (byte 0) of the IP header. In the above slide, we see that the IP header length is 5. This is not 5 bytes as one might assume; it is actually 5 words. A word is defined as a 32-bit field, and since a byte has 8 bits, a word is 4 bytes. So, you have to use a multiplication factor of 4 to figure the actual number of bytes. In this case, we see that we have 20 bytes. Since this field is 4 bits long, the greatest value that can be found in it is a binary 1111, or hexadecimal 0f, which is decimal 15. This means the longest IP header is 15 * 4 = 60 bytes.
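The nibble arithmetic can be sketched as follows. The function names are assumptions for illustration; the value 0x45 is the common first byte of an IP header with version 4 and a header length of 5 words.

```python
def ip_version(first_byte):
    """High-order nibble of byte 0: the IP version."""
    return first_byte >> 4


def ip_header_length(first_byte):
    """Low-order nibble of byte 0, converted from 32-bit words to bytes."""
    words = first_byte & 0x0F
    return words * 4


print(ip_version(0x45), ip_header_length(0x45))
# The largest value a 4-bit field can hold is 15 words.
print(ip_header_length(0x4F))
```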

You may be wondering why you have to go through this conversion: why didn’t they just make the field long enough to express the length in bytes? That would require a 6-bit field (2^6 = 64) to represent the maximum of 60 bytes, which is 2 additional bits. Every IP datagram would then be 2 bits longer, increasing the volume of traffic. So, any representation that reduces the size of the datagram improves efficiency.


The type of IP option in this datagram is found at byte 20 of the IP header (counting from 0), immediately after the standard 20 byte header. There is a hex value of 0x44 there, which translates to a decimal 68 and indicates that we want to record timestamps. This option attempts to collect timestamps for all routers through which the datagram travels. Each timestamp takes up 4 bytes, and the timestamp option itself requires 4 bytes of overhead in the IP header. Since the maximum IP header is 60 bytes and the standard header takes 20 bytes, only 40 bytes are left for options. After the 4 bytes of overhead, 36 bytes remain, which allows 9 timestamps to be collected. That may not be enough to record timestamps from all the routers through which the datagram travels.
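The limit of 9 timestamps follows from simple arithmetic on the figures given in the notes. The constant names below are made up for illustration.

```python
MAX_IP_HEADER = 60             # bytes: 15 words * 4
STANDARD_IP_HEADER = 20        # bytes without options
TIMESTAMP_OPTION_OVERHEAD = 4  # bytes used by the option itself
TIMESTAMP_SIZE = 4             # bytes per recorded timestamp

option_space = MAX_IP_HEADER - STANDARD_IP_HEADER
timestamp_space = option_space - TIMESTAMP_OPTION_OVERHEAD
max_timestamps = timestamp_space // TIMESTAMP_SIZE

print(option_space, timestamp_space, max_timestamps)
```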


Fragmentation – Total Length

16:21:35.686860 ping.com > your.net: icmp: echo

If you look at the IP datagram length field, you see a hex value of 0x05dc, which computes to 1500 decimal. As you will recall, this is the MTU for Ethernet. So, it appears that this datagram went from a link with an MTU larger than 1500 to an Ethernet network. The 1500 represents the 1480 bytes of embedded fragment data plus the 20 byte IP header.

[Figure: IP header layout with the 13-bit fragmentation offset field highlighted, alongside the flags, 8-bit time to live (TTL), 8-bit protocol, 16-bit header checksum, 32-bit source IP address and 32-bit destination IP address fields.]

Looking at the above slide, the fragmentation offset field occupies the low-order 13 bits of the 6th and 7th bytes of the IP header (counting from 0). When a datagram is fragmented, this field is set to reflect the offset at which this fragment’s data is found in the reassembled datagram.

Fragment offset length 2^13 = 8192 bytes

How do you specify a fragment offset > 8192

65,536 / 8192 = 8

Need to multiply fragment offset by 8

Theoretically, it is possible to have a datagram that is 65,535 bytes long, since the datagram length field is 16 bits. Given this, it is also theoretically possible that a fragment offset can be very close to this 65,535 limit. But the fragment offset field is only 13 bits, which gives 8192 possible values (0 through 8191). Therefore, some multiplication factor must be applied to the offset to be able to represent all possible fragments.

If you divide the maximum possible IP datagram size, 2^16 (actually 2^16 - 1), by the maximum fragment offset, 2^13 (actually 2^13 - 1), you get 2^3, which is 8. More simply, 8192 * 8 = 65536. This is how we arrive at the multiplicative factor of 8 for the fragment offset.


We find a fragment offset of 0xb9, which translates to a decimal 185. But this must be multiplied by 8 to compute the actual offset, which is 1480. This most likely indicates that the datagram was fragmented for an Ethernet network with an MTU of 1500 (1480 bytes of data plus a 20 byte IP header).
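Extracting the fragment offset takes one mask and one multiplication. The sketch below assumes the first 8 bytes of an IP header are available; the function names are hypothetical, and the sample bytes (including the identification value 0x1234 and the more-fragments flag) are made up to match the 0x05dc length and 0xb9 offset discussed in the notes.

```python
def fragment_offset_bytes(ip_header):
    """Fragment offset in bytes, from the low 13 bits of bytes 6-7."""
    field = int.from_bytes(ip_header[6:8], "big")
    return (field & 0x1FFF) * 8


def more_fragments(ip_header):
    """True if the more-fragments flag (0x2000 in bytes 6-7) is set."""
    field = int.from_bytes(ip_header[6:8], "big")
    return bool(field & 0x2000)


# Assumed header start: version/IHL 0x45, TOS 0, length 0x05dc,
# identification 0x1234, then flags plus offset 0x20b9.
sample = bytes.fromhex("450005dc123420b9")

print(fragment_offset_bytes(sample), more_fragments(sample))
```

Masking with 0x1FFF matters because the top 3 bits of these two bytes hold the flags, not part of the offset.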

[Figure: TCP header layout: 16-bit source port number, 16-bit destination port number, 32-bit sequence number, 32-bit acknowledgement number, 4-bit header length, reserved (6 bits), URG/ACK/PSH/RST/SYN/FIN flags, 16-bit window size, 16-bit checksum, 16-bit urgent pointer, options (if any).]

4-bit TCP header length – multiply by 4 to convert to bytes

Another length field is found in the TCP header. This represents the length of the TCP header itself. Like the IP header, the TCP header can have options. And, like the IP header, the TCP header is normally 20 bytes long. The TCP header length is found in the high-order nibble of the 12th byte offset in the TCP header. One final similarity between the IP and TCP header lengths is that both are expressed in 32-bit words and must be multiplied by 4 to be converted to bytes.
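The TCP case mirrors the IP one, except that the length sits in the high-order nibble. A minimal sketch, with a hypothetical function name and a made-up header of zero bytes apart from byte 12:

```python
def tcp_header_length(tcp_header):
    """High-order nibble of byte 12, converted from 32-bit words to bytes."""
    words = tcp_header[12] >> 4
    return words * 4


# Made-up 20 byte TCP header: everything zero except byte 12 = 0x50,
# which encodes a header length of 5 words.
sample_tcp = bytes(12) + bytes([0x50]) + bytes(7)

print(tcp_header_length(sample_tcp))
```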
