
Debugging C and C++ code in a Unix environment

J.H.M. Dassen jdassen@wi.LeidenUniv.nl

I.G. Sprinkhuizen-Kuyper kuyper@wi.LeidenUniv.nl


Copyright and Permission Notice

Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved.


Table of Contents

Abstract

1 Introduction

2 Conventions

3 Aspects of debugging C and C++ code

Noticing and localising a bug

Understanding a bug

Repairing a bug

Types of bugs

C and C++ specific problems

Preprocessor

Strong systems dependency

Weak type system

Explicit storage allocation and deallocation

Name space pollution

Incremental building/linking

The build process

Core dumps

Debugging techniques

Using the compiler’s features

The RTFM technique

printf() debugging

Assertions: defensive programming

ANWB debugging

Code grinding (code walk through)

Tools

The editor

A version management system

The debugger

Memory allocation debugging tools

System call tracers

Profilers

Conclusions

Bibliography

Appendix A

An example makefile

Documentation formats

Manual pages

Info documentation

HTML and PDF

Flat ASCII, DVI, PostScript etc.

Abstract

This document describes several techniques and tools for debugging code in C-like languages in a Unix environment.


Chapter 1 Introduction

Debugging is the art of removing bugs from software. The software may be code, documentation, or any other intellectual product. Here, we will look at the debugging of computer programs (or libraries) written in C or C++ in a Unix environment. Most of it is also applicable to other compiled procedural and object oriented languages like Pascal, Modula and Objective C.

We will mostly focus on techniques and tools to assist in debugging. Of course, it is better to prevent bugs from slipping into your code in the first place. Sometimes it is difficult to distinguish between good coding practices and good debugging practices, because good debugging practices often involve preparation and prevention. So, we will also discuss some good coding practices that you should consider adopting. These practices will not make your programs bug-free, but they will diminish the occurrence of certain types of bugs, while preparing you better for dealing with the remaining ones.

It is our experience that many people waste large amounts of time on localising bugs that are quite easy to fix once they are found, because they are not aware of, or do not know how to use, the tools, techniques and practices available to them.

Our goal is to help you avoid wasting your time in this fashion. We hope you will invest time to study the material covered here; we are convinced this investment will pay off.


Chapter 2 Conventions

This paper follows some Unix conventions: commands are referred to by name (for example make), and manual pages are cited like this: ls(1), where the manual section is indicated in parentheses. Also, some of the terminology (‘foo’, ‘bar’, ‘RTFM’) comes from Unix hackerdom; see [JARGON] if you are interested in it.


Chapter 3 Aspects of debugging C and C++ code

Debugging C and C++ code entails noticing, localising, understanding and repairing bugs.

Noticing and localising a bug

You might think that noticing a bug is easy: you know what your code should do, and you notice that it does not do that. This easiness is deceptive. Noticing a bug involves testing. Testing is best done in a disciplined fashion and, wherever possible, in an automated fashion. For certain types of programs (e.g. compilers) it is relatively easy to construct tests (input + expected output/result) and to run these automatically, say, after each build.

You should prepare tests carefully. Make sure that if a test fails, you can see what goes wrong.
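As a minimal sketch of such an automated test, the C fragment below pairs each input with its expected result and reports every mismatch in detail. The function under test (clamp_percent) and the test data are hypothetical and serve only as an illustration; they are not taken from any real project.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical function under test: clamps a value to the range 0..100. */
static int clamp_percent(int value)
{
    if (value < 0)
        return 0;
    if (value > 100)
        return 100;
    return value;
}

/* One test case: an input together with the result we expect for it. */
struct test_case {
    int input;
    int expected;
};

/* Runs all cases and reports each failure with input, actual and
 * expected value. Returns the number of failed cases (0 = all passed). */
int run_clamp_tests(void)
{
    static const struct test_case cases[] = {
        { -5, 0 }, { 0, 0 }, { 42, 42 }, { 100, 100 }, { 250, 100 },
    };
    int failures = 0;

    for (size_t i = 0; i < sizeof cases / sizeof cases[0]; i++) {
        int actual = clamp_percent(cases[i].input);
        if (actual != cases[i].expected) {
            fprintf(stderr, "FAIL: clamp_percent(%d) = %d, expected %d\n",
                    cases[i].input, actual, cases[i].expected);
            failures++;
        }
    }
    return failures;
}
```

A small test program can call run_clamp_tests() from main and use the result as its exit status, so that a build step can stop as soon as a test fails.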

In a Unix system, a bug often manifests itself as a program crash, leaving a core dump. In the section called Core dumps, we will see what a core dump is, and how it can help you in debugging your code.

Understanding a bug

You should make sure that you understand a bug fully before you attempt to fix it. Ask yourself the following questions:

• Have I really found the cause of the problem I observed, or is this a mere symptom?

• Have I made similar mistakes (especially wrong assumptions) elsewhere in the code?

• Is this cause just a programming error, or is there a more fundamental problem (e.g. the algorithm is incorrect)?

Repairing a bug

Repairing a bug is more than modifying code. Make sure you document your fix properly in the code, and test it properly.

After repairing a bug, ask yourself what you can learn from it:


• How did I notice this bug? This might help you to write a test case to detect it if it slips in again.

• How did I track it down? This will give you better insight into which approach to take in case you encounter similar symptoms again.

• What type of bug was it (see the section called Types of bugs)? Do I encounter this type often? If so, what can I do to prevent it from re-occurring?

What you learn is probably valuable not only to you in developing this particular piece of code. Try to communicate what you learned to your colleagues, for instance by writing it down in a pattern-like fashion (e.g. ‘IF you find your program foos bars AND it does not foo bazs THEN try frobbing it’). Quite often, we find that one of the main reasons why tracking down a bug takes so long is that we have made unjustified assumptions about parts of our code.

Types of bugs

Bugs can be classified in many ways. One classification is the following:

• Build errors. Some errors can result from using object files that haven’t been rebuilt after a change that affects them. Make sure you use a Makefile, and that it accurately reflects the dependencies involved in building your project. See the section called An example makefile in Appendix A for a way to track dependencies automatically.

• Basic semantic bugs, such as using uninitialised variables, dead code and certain type problems. A compiler can often bring these to your attention, but it must be told to do so explicitly (e.g. through warning and optimisation flags; see the section called Using the compiler’s features).

• Semantic bugs, such as using the wrong variable or using ‘&’ instead of ‘&&’. No compiler or other tool can find these. You’ll have to do some thinking here. Testing your program step by step using a debugging tool can help you here.
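To illustrate why a mix-up of ‘&’ and ‘&&’ slips through, here is a small sketch with two hypothetical functions that are both meant to answer the question ‘are a and b both non-zero?’. Both compile without complaint, but the bitwise version gives a different answer for some inputs.

```c
/* Intended check: "are a and b both non-zero?" */

/* Correct: logical AND yields 1 when both operands are non-zero. */
int both_nonzero(int a, int b)
{
    return a && b;
}

/* Buggy: bitwise AND combines the individual bits of a and b.
 * For a = 2 (binary 10) and b = 1 (binary 01) no bit is shared,
 * so the result is 0 although both values are non-zero. */
int both_nonzero_buggy(int a, int b)
{
    return a & b;
}
```

With the arguments 3 and 1 both functions return 1, so a test that happens to use only such values will not reveal the bug.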


Note that there are many ways of classification, most of which are orthogonal to each other. For example, hackers tend to distinguish between Bohr bugs and Heisenbugs ([JARGON]). Bohr bugs are ‘reliable’ bugs: given a particular input, they will always manifest themselves. Heisenbugs are bugs that are difficult to reproduce reliably; they appear to depend on the phase of the moon (environmental factors like time, particular memory allocation etc.). A Heisenbug is very often the result of errors in pointers: using memory that is not allocated. So use tools (such as Electric Fence; see the section called Memory allocation debugging tools) to check all pointers and array boundaries. (Another cause is the use of uninitialised variables.)

C and C++ specific problems

There are some features of the C and C++ languages and the associated build process that often lead to problems.

Preprocessor

C and C++ use a preprocessor to expand macros, declare dependencies, import declarations and do conditional compilation. In itself, this is quite reasonable. You should realise, however, that all of these are done on a textual level. The C/C++ preprocessor does not understand the C or C++ code it is processing.

This can make it difficult to track down missing declarations, it can lead to semantic problems because of macro expansion, and it can cause other subtle problems.

If you suspect a problem due to preprocessing, check out the preprocessor’s manual (e.g. [CPP]) and let the preprocessor expand your file for examination.
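A classic instance of a semantic problem caused by textual macro expansion is sketched below. The macro and function names are made up for this illustration. With gcc, the -E option stops after preprocessing and prints the expanded source, which makes such problems visible.

```c
/* Naive macro: the argument is pasted into the body as plain text. */
#define SQUARE_NAIVE(x) x * x

/* Safer variant: parentheses around the argument and the whole body. */
#define SQUARE(x) ((x) * (x))

/* Expands to 1 + 2 * 1 + 2, which evaluates to 5, not 9. */
int naive_result(void)
{
    return SQUARE_NAIVE(1 + 2);
}

/* Expands to ((1 + 2) * (1 + 2)), which evaluates to 9. */
int safe_result(void)
{
    return SQUARE(1 + 2);
}
```

Note that even the parenthesised macro evaluates its argument twice, so an argument with side effects (such as i++) is still a problem; a function avoids this.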

Strong systems dependency

C was developed for use as a systems programming language. C and also C++ can give you access to a lot of operating system functionality. Unfortunately, there are a lot of small but significant differences among various Unix systems:

• Some system calls are not available on all systems.

• Some system calls and library functions are defined in different header files on different systems.

• There may be differing semantics for particular routines. For example, on Sys V-like systems, a signal handler installed with signal() is reset to the default action when the signal is delivered and has to be reinstalled. On BSD-like systems, a signal handler stays in place until explicitly removed.
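One common way to sidestep the differing signal handler semantics is to install handlers with the POSIX sigaction() call, which lets you state the desired behaviour explicitly. The following is a minimal sketch under that assumption; the names on_signal, install_handler and signals_seen are hypothetical helpers introduced for this example.

```c
#define _POSIX_C_SOURCE 200809L

#include <signal.h>
#include <string.h>

/* Counter that is safe to modify from within a signal handler. */
static volatile sig_atomic_t got_signal = 0;

/* The handler only records that a signal arrived. */
static void on_signal(int signo)
{
    (void)signo;
    got_signal++;
}

/* Installs on_signal for the given signal.
 * Returns 0 on success and -1 on error, like sigaction() itself. */
int install_handler(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;    /* SA_RESETHAND is not set: the handler stays installed */
    return sigaction(signo, &sa, NULL);
}

/* Number of signals the handler has seen so far. */
int signals_seen(void)
{
    return (int)got_signal;
}
```

Because SA_RESETHAND is not set, the handler remains in place after each delivery, regardless of which behaviour the system's signal() function has.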


Also, the size and representation of some of C’s and C++’s basic types depend on the underlying system. As a C or C++ programmer, you should be aware of what things are explicitly undefined in the C or C++ standard, and thus are implementation (system or compiler) dependent. There are standard ways to overcome some of these problems, like using sizeof instead of the concrete size of the variable on the current system.
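The sizeof advice can be sketched as follows. Instead of assuming that an int occupies, say, 4 bytes, the code asks the compiler for the size. The helper names (ARRAY_LEN, make_int_array, table_length) are hypothetical and chosen for this example only.

```c
#include <stdlib.h>

/* Number of elements in a true array (this does not work for pointers). */
#define ARRAY_LEN(a) (sizeof (a) / sizeof (a)[0])

/* Allocates a zero-initialised array of count ints.
 * The element size comes from sizeof, not from a hard-coded number.
 * Returns NULL if the allocation fails; the caller frees the result. */
int *make_int_array(size_t count)
{
    int *values = calloc(count, sizeof *values);
    return values;
}

/* The length is derived from the array itself, so it stays correct
 * when entries are added or the element type changes. */
size_t table_length(void)
{
    static const long table[] = { 10, 20, 30 };
    return ARRAY_LEN(table);
}
```

Writing sizeof *values rather than sizeof(int) has the additional benefit that the expression remains correct if the type of values is changed later.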

Weak type system

C and C++ have a type system, but it is very weak. You can do all kinds of conversions, many of which can be system dependent or meaningless. Also, the compiler can do some implicit conversions that may cause havoc.

Most errors due to the weak type system can be nipped in the bud by doing static analysis early; see the section called Using the compiler’s features.
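As an illustration of an implicit conversion that causes havoc, consider comparing a signed with an unsigned value. The sketch below uses two hypothetical functions; gcc's -Wsign-compare option reports the comparison in the first one.

```c
/* Mixed comparison: a is implicitly converted to unsigned int.
 * A negative a therefore becomes a very large positive number,
 * and less_than_mixed(-1, 1) returns 0. */
int less_than_mixed(int a, unsigned int b)
{
    return a < b;
}

/* Explicit handling: a negative a is smaller than any unsigned value;
 * otherwise the conversion to unsigned int preserves the value. */
int less_than_checked(int a, unsigned int b)
{
    return a < 0 || (unsigned int)a < b;
}
```

The first function is legal C and compiles by default without any diagnostic, which is exactly why enabling the relevant warnings early pays off.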

Explicit storage allocation and deallocation

In C and C++, you have to explicitly allocate and deallocate dynamic storage through malloc and free (for C) and through new and delete (for C++). If memory (de)allocation is done incorrectly, it can cause problems at run time such as memory corruption and memory leaks (the memory use of a program keeps on increasing during execution).

Common errors are:

• Trying to use memory that has not been allocated yet

• Trying to access memory that has been deallocated already

• Deallocating memory twice

These errors are difficult to correct without using proper tools; see the section called Memory allocation debugging tools.
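Tools aside, a simple coding habit can make two of these errors less harmful: reset a pointer to NULL as soon as its memory is freed. The sketch below shows this habit; free_and_clear and copy_string are hypothetical helpers written for this example, not standard library functions.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a heap-allocated copy of s, or NULL if allocation fails.
 * The caller owns the copy and must release it. */
char *copy_string(const char *s)
{
    size_t len = strlen(s) + 1;     /* +1 for the terminating '\0' */
    char *copy = malloc(len);

    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* Frees the memory *pp points to and resets *pp to NULL.
 * Calling it twice on the same pointer is harmless, because
 * free(NULL) is defined to do nothing. */
void free_and_clear(char **pp)
{
    free(*pp);
    *pp = NULL;
}
```

This does not help when other copies of the pointer exist elsewhere in the program, so it complements memory debugging tools rather than replacing them.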

Name space pollution

In C and C++, programmers commonly do not try to prevent name space pollution (name conflicts). You can limit it as follows:

• Use the static keyword to indicate functions and variables whose scope is restricted to the current file.

• Use as few global variables and functions as possible. If you have to use a large number of them, prefix their names consistently (e.g. MYPROJECT_someglobal).
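Both pieces of advice can be sketched in a few lines. The module below is hypothetical: only the prefixed function MYPROJECT_next_id is visible to other files, while the counter and the helper are hidden by static.

```c
/* Visible only within this file: other files may define their own
 * call_count or next_value without any conflict. */
static int call_count = 0;

static int next_value(void)
{
    return ++call_count;
}

/* The only name exported by this file; the MYPROJECT_ prefix makes
 * a clash with names from other libraries unlikely. */
int MYPROJECT_next_id(void)
{
    return next_value();
}
```

The first call to MYPROJECT_next_id() returns 1, the second returns 2, and so on.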

Incremental building/linking

C and C++ code can be built incrementally; usually make is used to specify dependencies among files for a build. If a Makefile does not specify dependencies properly, you can end up with executables linked to old versions of modules which can be buggy or incompatible with recently introduced changes in other modules.

The build process

Bugs you encounter may not be due to your C or C++ code; they might be the result of how your executable/library was built. Make sure that you understand how the build process is organised.

You should use a Makefile. A Makefile describes how to build your project: it lists the files involved in your project, their interdependencies and how a tool should build intermediary files and the end product. Make sure you have listed all dependencies; missing even a single dependency can lead to subtle problems.

make is a powerful tool, and it pays off to acquaint yourself with it well. For instance, in general you should not list compilation lines directly. GNU make has some built-in rules (so-called implicit rules) on how, say, .o files are built from .c files. To use those rules, you only specify the dependencies (e.g. foo.o: foo.c foo.h bar.h, or foo.cc for C++ programs), and no build rule. The implicit rules have a number of variables that you can set (e.g. CC for the C compiler and CXX for the C++ compiler, CFLAGS for the C compilation flags, LOADLIBES for the libraries). Using the implicit rules makes your makefiles shorter, easier to read and easier to modify. See [MAKETUT], [MAKETUT2] and [MAKE] for details on make. The GNU make documentation [MAKE] contains a list of the implicit rules it supports, and the variables they use.

The linker combines a number of object files and libraries to produce an executable or library. If this executable or library needs no external libraries, it is called statically linked; otherwise it is called dynamically linked.

Core dumps

A core dump is a snapshot of the execution of a program at the moment it is aborted by the operating system (e.g. for attempting to violate the memory protection). A normal core dump is not very helpful unless you are an expert. In the section called The debugger, we will see how to make core dumps more helpful for debugging.

By default, core dumps do not contain all the information you’d like them to. For example, a core dump can tell you that you were dereferencing a pointer at memory location 0x12345 while executing the instruction at 0x45678. You’d probably like to see a message that means more to you (‘The program was aborted while attempting to dereference foo, which was NULL, at bar.c line 23’). This is possible, but it requires you to include such information in advance.

Also, note that a core dump is a snapshot; it does not include the history of how your program came to the problematic state. What a core dump shows you is a manifestation of a bug; the point where a program dumps core is not always the location of the bug itself, which may be located 100000 instructions back in time. Often, you can reconstruct the history of a run from a core dump, but this is difficult. printf debugging (see the section called printf() debugging) and possibly system call tracing (see the section called System call tracers) are useful techniques to do this. Using a debugger (see the section called The debugger) is advised.


Debugging techniques

This section describes a number of debugging techniques, ranging from reading manuals to using tools.

Using the compiler’s features

A good compiler can do a good deal of static analysis of your code: the analysis of those aspects of a piece of code that can be studied without executing that code.

Static analysis can help in detecting a number of basic semantic problems such as type mismatches and dead code.

For gcc (the GNU C compiler) there are a number of options that affect what static analysis gcc does and what results will be shown. There are two types of options:

Warning options

gcc has a great number of warning flags. Most have the form -Wphrase. You should pick the ones relevant to you at the start of coding and put them into your Makefile (use the implicit rules, and put them in the CFLAGS variable). Note that -Wall does not switch on all warnings. It enables a set of warnings that gcc’s developers consider useful under nearly all circumstances. In addition to -Wall we recommend at least the following warnings when writing new code: -Wshadow -Wpointer-arith -Wcast-qual -Wcast-align -Wstrict-prototypes.

As an example, the following code will result in a warning because the possibility exists that the function returns without returning a value:

    int foo(int a) { if (a > 0) return a; }
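One possible repair of the function foo is sketched below. It assumes that 0 is an acceptable result for non-positive input; that choice is ours for the sake of the example, since the original snippet does not say what the function should return in that case.

```c
/* Returns a when it is positive and 0 otherwise, so that every
 * path through the function returns a value. */
int foo(int a)
{
    if (a > 0)
        return a;
    return 0;   /* assumption: 0 is a sensible result for a <= 0 */
}
```

Compiled with -Wall, this version no longer triggers the warning about control reaching the end of a non-void function.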

Optimisation flags

gcc also supports a number of optimisations. Some of these trigger gcc to do extensive flow analysis of your code, resulting in, for example, dead code removal. For normal use, we recommend -O2. Do not use higher optimisation levels unless you know what you are doing; the higher levels can contain experimental optimisations which could generate bad code. Also note that on some systems, enabling optimisation makes debugging using a debugger virtually impossible.

For full documentation of these options, see the chapter ‘GNU CC Command Options’ in [GCC].

The RTFM technique

RTFM stands for Read The Fine Manual. Make sure you take the time to find relevant documentation for the task at hand, i.e. the documentation of the tools (not only the compiler, but also make, the preprocessor and the linker), libraries and algorithms you are expected to use.

Title	Debugging C and C++ code in a Unix environment
Authors	J.H.M. Dassen, I.G. Sprinkhuizen-Kuyper
University	Leiden University
Field	Computer Science
Type	Manual
Year of publication	1998-1999
City	Leiden
