
The Linux Programmer’s Toolbox

Prentice Hall Open Source Software Development Series

Arnold Robbins, Series Editor

“Real world code from real world applications”

Open Source technology has revolutionized the computing world. Many large-scale projects are in production use worldwide, such as Apache, MySQL, and Postgres, with programmers writing applications in a variety of languages including Perl, Python, and PHP. These technologies are in use on many different systems, ranging from proprietary systems, to Linux systems, to traditional UNIX systems, to mainframes.

The Prentice Hall Open Source Software Development Series is designed to bring you the best of these Open Source technologies. Not only will you learn how to use them for your projects, but you will learn from them. By seeing real code from real applications, you will learn the best practices of Open Source developers the world over.

Titles currently in the series include:

Linux® Debugging and Performance Tuning: Tips and Techniques

UNIX to Linux® Porting
Alfredo Mendoza, Chakarat Skawratananond, Artis Walker
0131871099, Paper, ©2006

Linux Programming by Example: The Fundamentals
Arnold Robbins
0131429647, Paper, ©2004

The Linux® Kernel Primer: A Top-Down Approach for x86 and PowerPC Architectures
Claudia Salzberg, Gordon Fischer, Steven Smolski
0131181637, Paper, ©2006

The Linux Programmer’s Toolbox

John Fusco

Upper Saddle River, NJ • Boston • Indianapolis • San Francisco • New York • Toronto • Montreal • London • Munich • Paris • Madrid • Cape Town • Sydney • Tokyo • Singapore • Mexico City

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations have been printed with initial capital letters or in all capitals.

The author and publisher have taken care in the preparation of this book, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein.

The publisher offers excellent discounts on this book when ordered in quantity for bulk purchases or special sales, which may include electronic versions and/or custom covers and content particular to your business, training goals, marketing focus, and branding interests. For more information, please contact:

U.S. Corporate and Government Sales

Visit us on the Web: www.prenhallprofessional.com

Library of Congress Cataloging-in-Publication Data

Fusco, John.

The Linux programmer’s toolbox / John Fusco.

p. cm.

Includes bibliographical references and index.

ISBN 0-13-219857-6 (pbk. : alk. paper)

1. Linux. 2. Operating systems (Computers) I. Title.

QA76.76.O63F875 2007

005.4'32—dc22

2006039343

Copyright © 2007 Pearson Education, Inc.

All rights reserved. Printed in the United States of America. This publication is protected by copyright, and permission must be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or likewise. For information regarding permissions, write to:

Pearson Education, Inc.

Rights and Contracts Department

One Lake Street

Upper Saddle River, NJ 07458

Fax: (201) 236-3290

ISBN 0-13-219857-6

Text printed in the United States on recycled paper at Courier in Stoughton, Massachusetts.

First printing, March 2007


To my wife, Lisa, and my children, Andrew, Alex, and Samantha.


Contents

Foreword
Preface
Acknowledgments
About the Author

Chapter 1 Downloading and Installing Open Source Tools
1.1 Introduction
1.2 What Is Open Source?
1.3 What Does Open Source Mean to You?
1.3.1 Finding Tools
1.3.2 Distribution Formats
1.4 An Introduction to Archive Files
1.4.1 Identifying Archive Files
1.4.2 Querying an Archive File
1.4.3 Extracting Files from an Archive File
1.5 Know Your Package Manager
1.5.1 Choosing Source or Binary
1.5.2 Working with Packages
1.6 Some Words about Security and Packages
1.6.1 The Need for Authentication
1.6.2 Basic Package Authentication
1.6.3 Package Authentication with Digital Signatures
1.6.4 GPG Signatures with RPM
1.6.5 When You Can’t Authenticate a Package
1.7 Inspecting Package Contents
1.7.1 How to Inspect Packages
1.7.2 A Closer Look at RPM Packages
1.7.3 A Closer Look at Debian Packages
1.8 Keeping Packages up to Date
1.8.1 Apt: Advanced Package Tool
1.8.2 Yum: Yellowdog Updater Modified
1.8.3 Synaptic: The GUI Front End for APT
1.8.4 up2date: The Red Hat Package Updater
1.9 Summary
1.9.1 Tools Used in This Chapter
1.9.2 Online References

Chapter 2 Building from Source
2.1 Introduction
2.2 Build Tools
2.2.1 Background
2.2.2 Understanding make
2.2.3 How Programs Are Linked
2.2.4 Understanding Libraries
2.3 The Build Process
2.3.1 The GNU Build Tools
2.3.2 The configure Stage
2.3.3 The Build Stage: make
2.3.4 The Install Stage: make install
2.4 Understanding Errors and Warnings
2.4.1 Common Makefile Mistakes
2.4.2 Errors during the configure Stage
2.4.3 Errors during the Build Stage
2.4.4 Understanding Compiler Errors
2.4.5 Understanding Compiler Warnings
2.4.6 Understanding Linker Errors
2.5 Summary
2.5.1 Tools Used in This Chapter
2.5.2 Online References

Chapter 3 Finding Help
3.1 Introduction
3.2 Online Help Tools
3.2.1 The man Page
3.2.2 man Organization
3.2.3 Searching the man Pages: apropos
3.2.4 Getting the Right man Page: whatis
3.2.5 Things to Look for in the man Page
3.2.6 Some Recommended man Pages
3.2.7 GNU info
3.2.8 Viewing info Pages
3.2.9 Searching info Pages
3.2.10 Recommended info Pages
3.2.11 Desktop Help Tools
3.3 Other Places to Look
3.3.1 /usr/share/doc
3.3.2 Cross Referencing and Indexing
3.3.3 Package Queries
3.4 Documentation Formats
3.4.1 TeX/LaTeX/DVI
3.4.2 Texinfo
3.4.3 DocBook
3.4.4 HTML
3.4.5 PostScript
3.4.6 Portable Document Format (PDF)
3.4.7 troff
3.5 Internet Sources of Information
3.5.1 www.gnu.org
3.5.2 SourceForge.net
3.5.3 The Linux Documentation Project
3.5.4 Usenet
3.5.5 Mailing Lists
3.5.6 Other Forums
3.6 Finding Information about the Linux Kernel
3.6.1 The Kernel Build
3.6.2 Kernel Modules
3.6.3 Miscellaneous Documentation
3.7 Summary
3.7.1 Tools Used in This Chapter
3.7.2 Online Resources

Chapter 4 Editing and Maintaining Source Files
4.1 Introduction
4.2 The Text Editor
4.2.1 The Default Editor
4.2.2 What to Look for in a Text Editor
4.2.3 The Big Two: vi and Emacs
4.2.4 Vim: vi Improved
4.2.5 Emacs
4.2.6 Attack of the Clones
4.2.7 Some GUI Text Editors at a Glance
4.2.8 Memory Usage
4.2.9 Editor Summary
4.3 Revision Control
4.3.1 Revision Control Basics
4.3.2 Defining Revision Control Terms
4.3.3 Supporting Tools
4.3.4 Introducing diff and patch
4.3.5 Reviewing and Merging Changes
4.4 Source Code Beautifiers and Browsers
4.4.1 The Indent Code Beautifier
4.4.2 Astyle Artistic Style
4.4.3 Analyzing Code with cflow
4.4.4 Analyzing Code with ctags
4.4.5 Browsing Code with cscope
4.4.6 Browsing and Documenting Code with Doxygen
4.4.7 Using the Compiler to Analyze Code
4.5 Summary
4.5.1 Tools Used in This Chapter
4.5.2 References
4.5.3 Online Resources

Chapter 5 What Every Developer Should Know about the Kernel
5.1 Introduction
5.2 User Mode versus Kernel Mode
5.2.1 System Calls
5.2.2 Moving Data between User Space and Kernel Space
5.3 The Process Scheduler
5.3.1 A Scheduling Primer
5.3.2 Blocking, Preemption, and Yielding
5.3.3 Scheduling Priority and Fairness
5.3.4 Priorities and Nice Value
5.3.5 Real-Time Priorities
5.3.6 Creating Real-Time Processes
5.3.7 Process States
5.3.8 How Time Is Measured
5.4 Understanding Devices and Device Drivers
5.4.1 Device Driver Types
5.4.2 A Word about Kernel Modules
5.4.3 Device Nodes
5.4.4 Devices and I/O
5.5 The I/O Scheduler
5.5.1 The Linus Elevator (aka noop)
5.5.2 Deadline I/O Scheduler
5.5.3 Anticipatory I/O Scheduler
5.5.4 Complete Fair Queuing I/O Scheduler
5.5.5 Selecting an I/O Scheduler
5.6 Memory Management in User Space
5.6.1 Virtual Memory Explained
5.6.2 Running out of Memory
5.7 Summary
5.7.1 Tools Used in This Chapter
5.7.2 APIs Discussed in This Chapter
5.7.3 Online References
5.7.4 References

Chapter 6 Understanding Processes
6.1 Introduction
6.2 Where Processes Come From
6.2.1 fork and vfork
6.2.2 Copy on Write
6.2.3 clone
6.3 The exec Functions
6.3.1 Executable Scripts
6.3.2 Executable Object Files
6.3.3 Miscellaneous Binaries
6.4 Process Synchronization with wait
6.5 The Process Footprint
6.5.1 File Descriptors
6.5.2 Stack
6.5.3 Resident and Locked Memory
6.6 Setting Process Limits
6.7 Processes and procfs
6.8 Tools for Managing Processes
6.8.1 Displaying Process Information with ps
6.8.2 Advanced Process Information Using Formats
6.8.3 Finding Processes by Name with ps and pgrep
6.8.4 Watching Process Memory Usage with pmap
6.8.5 Sending Signals to Processes by Name
6.9 Summary
6.9.1 System Calls and APIs Used in This Chapter
6.9.2 Tools Used in This Chapter
6.9.3 Online Resources

Chapter 7 Communication between Processes
7.1 Introduction
7.2 IPC Using Plain Files
7.2.1 File Locking
7.2.2 Drawbacks of Using Files for IPC
7.3 Shared Memory
7.3.1 Shared Memory with the POSIX API
7.3.2 Shared Memory with the System V API
7.4 Signals
7.4.1 Sending Signals to a Process
7.4.2 Handling a Signal
7.4.3 The Signal Mask and Signal Handling
7.4.4 Real-Time Signals
7.4.5 Advanced Signals with sigqueue and sigaction
7.5 Pipes
7.6 Sockets
7.6.1 Creating Sockets
7.6.3 Client/Server Example Using Local Sockets
7.6.4 Client/Server Using Network Sockets
7.7 Message Queues
7.7.1 The System V Message Queue
7.7.2 The POSIX Message Queue
7.7.3 Difference between POSIX Message Queues and System V Message Queues
7.8 Semaphores
7.8.1 Semaphores with the POSIX API
7.8.2 Semaphores with the System V API
7.9 Summary
7.9.1 System Calls and APIs Used in This Chapter
7.9.2 References
7.9.3 Online Resources

Chapter 8 Debugging IPC with Shell Commands
8.1 Introduction
8.2 Tools for Working with Open Files
8.2.1 lsof
8.2.2 fuser
8.2.3 ls
8.2.4 file
8.2.5 stat
8.3 Dumping Data from a File
8.3.1 The strings Command
8.3.2 The xxd Command
8.3.3 The hexdump Command
8.3.4 The od Command
8.4 Shell Tools for System V IPC
8.4.1 System V Shared Memory
8.4.2 System V Message Queues
8.4.3 System V Semaphores
8.5 Tools for Working with POSIX IPC
8.5.1 POSIX Shared Memory
8.5.2 POSIX Message Queues
8.5.3 POSIX Semaphores
8.6 Tools for Working with Signals
8.7 Tools for Working with Pipes and Sockets
8.7.1 Pipes and FIFOs
8.7.2 Sockets
8.8 Using Inodes to Identify Files and IPC Objects
8.9 Summary
8.9.1 Tools Used in This Chapter
8.9.2 Online Resources

Chapter 9 Performance Tuning
9.1 Introduction
9.2 System Performance
9.2.1 Memory Issues
9.2.2 CPU Utilization and Bus Contention
9.2.3 Devices and Interrupts
9.2.4 Tools for Finding System Performance Issues
9.3 Application Performance
9.3.1 The First Step with the time Command
9.3.2 Understanding Your Processor Architecture with x86info
9.3.3 Using Valgrind to Examine Instruction Efficiency
9.3.4 Introducing ltrace
9.3.5 Using strace to Monitor Program Performance
9.3.6 Traditional Performance Tuning Tools: gcov and gprof
9.3.7 Introducing OProfile
9.4 Multiprocessor Performance
9.4.1 Types of SMP Hardware
9.4.2 Programming on an SMP Machine
9.5 Summary
9.5.1 Performance Issues in This Chapter
9.5.2 Terms Introduced in This Chapter
9.5.3 Tools Used in This Chapter
9.5.4 Online Resources
9.5.5 References

Chapter 10 Debugging
10.1 Introduction
10.2 The Most Basic Debugging Tool: printf
10.2.1 Problems with Using printf
10.2.2 Using printf Effectively
10.2.3 Some Final Words on printf Debugging
10.3 Getting Comfortable with the GNU Debugger: gdb
10.3.1 Running Your Code with gdb
10.3.2 Stopping and Restarting Execution
10.3.3 Inspecting and Manipulating Data
10.3.4 Attaching to a Running Process with gdb
10.3.5 Debugging Core Files
10.3.6 Debugging Multithreaded Programs with gdb
10.3.7 Debugging Optimized Code
10.4 Debugging Shared Objects
10.4.1 When and Why to Use Shared Objects
10.4.2 Creating Shared Objects
10.4.3 Locating Shared Objects
10.4.4 Overriding the Default Shared Object Locations
10.4.5 Security Issues with Shared Objects
10.4.6 Tools for Working with Shared Objects
10.5 Looking for Memory Issues
10.5.1 Double Free
10.5.2 Memory Leaks
10.5.3 Buffer Overflows
10.5.4 glibc Tools
10.5.5 Using Valgrind to Debug Memory Issues
10.5.6 Looking for Overflows with Electric Fence
10.6 Unconventional Techniques
10.6.1 Creating Your Own Black Box
10.6.2 Getting Backtraces at Runtime
10.6.3 Forcing Core Dumps
10.6.4 Using Signals
10.6.5 Using procfs for Debugging
10.7 Summary
10.7.1 Tools Used in This Chapter
10.7.2 Online Resources
10.7.3 References

Index

Foreword

OK, so you’ve mastered the basics of Linux. You can run ls, grep, find, and sort, and as a C or C++ programmer, you know how to use the Linux system calls. You know that there’s much more to life than “point and click” and that Linux will give it to you. You’re just not sure yet how. So you ask yourself, “What’s next?”

This book gives you the answer. John’s knowledge is broad, and he shows the no-longer-novice Linux user how to climb up the next part of the learning curve toward mastery.

From command-line tools for debugging and performance analysis to the range of files in /proc, John shows you how to use all of them to make your day-to-day life with Linux easier and more productive.

Besides a lot of “what” (what tools, what options, what files), there’s a lot of “why” here. John shows you why things work the way they do. In turn, this lets you understand why the “what” is effective and internalize the Zen of Linux (and Unix!).

There’s a ton of great stuff in this book. I hope you learn a lot. I know I did, and that’s saying something.

Enjoy,

Arnold Robbins

Series Editor


Preface

Linux has no shortage of tools. Many are inherited from Unix, with cryptic two-letter names that conjure up images of developers trying to preserve space on a punch card. Happily, those days are long gone, but the legacy remains.

Many of those old tools are still quite useful. Most are highly specialized. Each may do only one thing but does it very well. Highly specialized tools often have many options that can make them intimidating to use. Consider the first time you used grep and learned what a regular expression was. Perhaps you haven’t mastered regular expression syntax yet (don’t worry; no one else has, either). That’s not important, because you don’t need to be a master of regular expressions to put grep to good use.

If there’s one thing that I hope you learn from this book, it’s that there are many tools out there that you can use without having to master them. You don’t need to invest an enormous amount of time reading manuals before you can be productive. I hope you will discover new tools that you may not have been familiar with. Some of the tools this book looks at are quite old and some are new. All of them are useful. As you learn more about each tool, you will find more uses for it.

I use the term tool loosely in this book. To me, creating tools is as important as using tools, so I have included various APIs that are not usually covered in much detail in other books. In addition, this book provides some background on the internal workings of the Linux kernel that are necessary to understand what some tools are trying to tell you. I present a unique perspective on the kernel: the user’s point of view. You will find enough information to allow you to understand the ground rules that the kernel sets for every process, and I promise you will not have to read a single line of kernel source code.

What you will not find in this book is reconstituted man pages or other documentation stitched into the text. The GNU and Linux developers have done a great job of documenting their work, but that documentation can be hard to find for the inexperienced user. Rather than reprint documentation that will be out of date by the time you read this, I show you some ingenious ways to find the most up-to-date documentation.

GNU/Linux documentation is abundant, but it’s not always easy to read. You can read a 10,000-word document for a tool and still not have a clue what the tool does or how to use it. This is where I have tried to fill in the missing pieces. I have tried to explain not just how to use each tool, but also why you would want to use it. Wherever possible, I have provided simple, brief examples that you can type and modify yourself to enhance your understanding of the tools and Linux itself.

What all the tools in this book have in common is that they are available at no cost. Most come with standard Linux distributions, and for those that may not, I have included URLs so that you can download them yourself.

As much as possible, I tried to keep the material interesting and fun.

Who Should Read This Book

This book is written for intermediate to advanced Linux programmers who wish to become more productive and gain a better understanding of the Linux programming environment. If you’re an experienced Windows programmer who feels like a fish out of water in the Linux environment, then this book is for you, too.

Non-programmers should also find this book useful because many of the tools and topics I cover have applications beyond programming. If you are a system administrator, or just a Linux enthusiast, then there’s something for you in this book, too.

The Purpose of This Book

I wrote this book as a follow-up to an article I wrote for the Linux Journal entitled “Ten Commands Every Linux Developer Should Know.” The inspiration for this article came from my own experience as a Linux programmer. In my daily work I make it a point to invest some of my time in learning something new, even if it means a temporary lull in progress on my project. Invariably this strategy has paid off. I have always been amazed at how many times I learned about a tool or feature that I concluded would not be useful, only to find a use for it shortly afterward. This has always been a powerful motivation for me to keep learning. I hope that by reading this book, you will follow my example and enhance your skills on a regular basis.

It’s also just plain fun to learn about this stuff. If you are like me, you enjoy working with Linux. Motivating yourself to learn more has never been a problem. Because Linux is open source, you have the opportunity to understand all of its inner workings, which is not possible with closed source environments like Windows. In this book I present several freely available resources to help you learn more.

How to Read This Book

The chapters are presented such that each chapter can stand on its own. Later chapters require some background knowledge that is presented in the earlier chapters. Wherever possible, I have cross-referenced the material to help you find the necessary background information.

I believe the best way to learn is by example, so I have tried to provide simple examples wherever possible. I encourage the reader to try the examples and experiment.

How This Book Is Organized

Chapter 1, Downloading and Installing Open Source Tools, covers the mechanisms used to distribute open source code. I discuss the various package formats used by different distributions and the advantages and disadvantages of each. I present several tools used to maintain packages and how to use them.

Chapter 2, Building from Source, covers the basics of building an open source project. I present some of the tools used to build software and alternatives that are emerging. There are several tips and tricks in this chapter that you can use to master your use of make. I also show you how to configure projects that are distributed with GNU’s autoconf tools so that you can customize them to meet your needs. Finally, I cover the stages of the build that are often misunderstood by many programmers. I look at some of the errors and warnings you are likely to encounter and how to interpret them.

Chapter 3, Finding Help, looks at the various documentation formats tucked away in your Linux distribution that you may not know about. I look at the tools used to read these formats and discuss effective ways to use them.

Chapter 4, Editing and Maintaining Source Files, discusses the various text editors available for programmers as well as the advantages and disadvantages of each. I present a set of features that every programmer should look for in an editor and measure each editor against these. This chapter also covers the basics of revision control, which is vital for software project management.

Chapter 5, What Every Developer Should Know about the Kernel, looks at the kernel from a user’s perspective. In this chapter you will find the necessary background information required to understand the workings of a Linux system. I introduce several tools that allow you to see how your code interacts with the kernel.

Chapter 6, Understanding Processes, focuses on processes, their characteristics, and how to manage them. I cover a good deal of background required to introduce the tools in this chapter and understand why they are useful. In addition, this chapter introduces several programming APIs that you can use to create your own tools.

Chapter 7, Communication between Processes, introduces the concepts behind inter-process communication (IPC). This chapter contains mostly background information required for Chapter 8. Along with each IPC mechanism, I introduce the APIs required to use it along with a working example.

Chapter 8, Debugging IPC with Shell Commands, presents several tools available to debug applications that use IPC. It builds on the information from Chapter 7 to help you interpret the output of these tools, which can be difficult to understand.

Chapter 9, Performance Tuning, introduces tools to measure the performance of your system as well as the performance of individual applications. I present several examples to illustrate how programming can impact performance. I also discuss some of the performance issues that are unique to multi-core processors.

Chapter 10, Debugging, presents several tools and techniques that you can use to debug applications. I look at some open source memory debugging tools including Valgrind and Electric Fence. I also take an in-depth look at the capabilities of gdb, and how to use it effectively.

Acknowledgments

I would like to thank my wife, Lisa, without whom this book would not have been possible. Too often, she had to be a single mom while I worked in seclusion. Without her support, I would never have been able to take advantage of this opportunity. Thanks also to my children (Andrew, Alex, and Samantha), who had to spend too much time without their dad during the course of this work.

My thanks also go to Arnold Robbins, who provided wonderful advice and oversight. His experience and authoritative knowledge were invaluable to me during the course of this work. Thanks for making this an enjoyable learning experience for me.

Thanks also to Debra Williams Cauley for her patience and diligence putting up with my missed deadlines and schedule slips. This first-time author is grateful to you for keeping everything on track.

Finally, I would like to thank Mark Taub for recruiting me and giving me this wonderful opportunity.

About the Author

John Fusco is a software developer for GE Healthcare, based in Waukesha, Wisconsin, specializing in Linux applications and device drivers. John has worked on Unix software for more than ten years and has been developing applications for Linux since kernel version 2.0. John has written articles for Embedded Systems Programming and Linux Journal. This is his first book.

Chapter 1 Downloading and Installing Open Source Tools

1.1 Introduction

In this chapter, I discuss the different formats for distributing free software, how to manipulate them, and where to find them. I examine archive files and package files in detail, as well as the most common tools and commands used to manipulate them.

It can be dangerous to accept software from strangers. I cover various security issues that you should be aware of and things you can do to protect yourself. I introduce the concept of authentication and trust, and discuss how it applies to security. For those times when authentication is not possible, I show you how to inspect packages and archives.

Finally, I introduce some tools for managing packages on package-based distributions and how to get the most out of them.

1.2 What Is Open Source?

The term open source is a marketing term for free software, created by the Open Source Initiative (OSI) [1]. This organization was founded to promote the principles of free software that had its roots in the GNU Project, founded by Richard Stallman. One goal of OSI is to counter some of the negative stereotypes about free software and promote the free sharing of source code.

At first, many businesses were afraid of using open source software. No doubt the marketing departments of some large software companies had something to do with it. Conventional wisdom says, “You get what you pay for.” Some feared that the licenses (like the GNU General Public License) would act like a virus so that by creating projects using free software, they, too, would have to make their source code public.

Fortunately, most of those fears have subsided. Many large businesses are freely using and promoting open source code in their own projects. Some have even based entire products on open source software. The genie is out of the bottle.

1.3 What Does Open Source Mean to You?

To most people, open source software simply means a lot of high-quality software available at no cost. Unfortunately, a lot of not-so-high-quality software is available as well, but that’s part of the process. Good project ideas flourish and improve, while bad ones wither and die. Picking open source software is a bit like picking fruit: It takes some experience to know when it’s ripe.

A natural selection process is going on at many levels. At the source code level, features and code are selected (based on patches) so that only the best code gets in. As a consumer, you select the projects to download, which drives the vitality of a project. No one wants to develop code for a project that no one is using. Fewer downloads attract fewer developers. More downloads mean more developers, which in turn means more code to choose among and, thus, better code. Sometimes selecting a project to try is a gamble, but the only things at stake are your time and effort. It’s inevitable that you will make some regrettable choices once in a while, but take heart: It’s all part of the process.

For some people, not knowing what you are getting is part of the fun. It’s like opening a birthday gift. For others, it’s a nuisance and a waste of time. If you’re looking for the convenience of shrink-wrapped software that just installs and runs, there are open source projects for you, just not as many. Fortunately, there are many resources on the Internet to help you make good choices.

[1] www.opensource.org

1.3.1 Finding Tools

The first place you should look before you start trolling the Internet is your bution CDs Assuming that you installed Linux from a set of CDs or a DVD, youprobably have a lot of tools that were not installed Most distributions ship withmuch more software on the CDs than is installed in a default installation Typically,you are given a choice when you install the OS as to what kind of system you want

distri-to create This results in an arbitrary set of packages being installed distri-to your system,based on someone’s idea of what a “workstation” or a “server” is

You can always add to the set of installed software manually by locating the rawpackages on the installation CDs The drawback here is that the packages usuallyare not arranged in any particular order, so you have to know what you are lookingfor Some distributions have graphical interfaces that arrange the packages into cat-egories to help you pick which software to install

If you don’t know what you are looking for, the Internet should be your next tination Several Web sites serve as clearinghouses for open source software Onesuch site is www.freshmeat.net Here, you will find software arranged by categories

des-so that it’s easy to find what you’re looking for While writing this book, for

exam-ple, I searched Freshmeat for the term word processors and found 71 projects

avail-able Imagine having to choose among 71 different word processors!

Freshmeat allows you to filter your results to help you narrow down your choices

My results included various operating systems besides Linux and projects in variousstages of development So I chose to limit my search to projects that have Linuxsupport, that are mature, and that use an OSI-approved Open Source license.(Freshmeat results include commercial software as well.) This reduced the number

to 12 projects—a much more manageable number A closer look revealed that eral of these projects were not what I was looking for, given the broad interpreta-

sev-tion of the term word processor After trying a few more filters, I was able to uncover

a few well-known, high-quality projects, such as AbiWord, and a few I never heard

of before There were some notable absences, such as OpenOffice, which I am using

to write this book It turns out that the reason I didn’t find OpenOffice was because

it was filed under “Office/Business :: Office Suites,” not “word processors.” Themoral of the story is that if you don’t find what you are looking for, keep looking


is usually a compressed tar file. An archive file is a collection of files packed into a single file using an archiving tool such as the tar command. Usually, the files are compressed with the gzip program to save space; often, they are referred to as tar files or tarballs.

Tar files are the preferred format for distributing source code for projects. They are easy to create and use, and every programmer is familiar with the tar program. Less often, you will find tar files that have binary executables in them. This is a quick-and-dirty alternative to packaging and should be avoided unless you know what you are doing. In general, tar files are for people who have some knowledge of programming and system administration.

At some point in the process of downloading and installing open source software, you are going to encounter an archive file of one sort or another. An archive file is any file that contains a collection of other files. If you are a Windows user, you are no doubt familiar with the predominant Windows archiver, PKZip. Linux archive utilities function similarly except that unlike PKZip, they do not include compression. Instead, Linux archive tools concentrate on archiving and leave the compression to another tool (typically, gzip or bzip2). That’s the UNIX philosophy.
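The split between archiving and compression is easy to see by running the two steps separately. The following is only a sketch, assuming tar and gzip are installed; the directory and file names (project, main.c) are made up for illustration.

```shell
# Work in a throwaway directory so nothing on the system is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical project with a single source file.
mkdir project
echo 'int main(void) { return 0; }' > project/main.c

# Step 1: tar only archives; the result is an uncompressed project.tar.
tar -cf project.tar project

# Step 2: gzip only compresses; it replaces project.tar with project.tar.gz.
gzip project.tar

# The same two tools can be chained in a pipeline, with tar writing the
# archive to standard output and gzip compressing the stream.
tar -cf - project | gzip > project2.tar.gz

ls
```

Both project.tar.gz and project2.tar.gz hold the same archive; the pipeline form simply avoids the intermediate file.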

4 Chapter 1 • Downloading and Installing Open Source Tools


Naturally, because this is Linux, you have more than one choice of archivers, but as an open source consumer, you have to take what you’re given. So even though you are most likely to encounter tar files exclusively, it’s good at least to know that other tools are available.

An archive utility has some special requirements beyond just preserving filenames and data. In addition to a file’s pathname and data, the archive has to preserve each file’s metadata. Metadata includes the file’s owner, group, and other attributes (such as read/write/execute permissions). The archiver records all this information such that a file can be deleted from the file system and restored later from the archive with no loss of information. If you archive an executable file and then delete it from your file system, that file should still be executable when you restore it. In Windows, the filename would indicate whether the file is executable via the extension (such as .exe). Linux uses the file’s metadata to indicate whether it is executable, so this data must be stored by the archiver to be preserved.
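A quick round trip shows the metadata being preserved. This is a minimal sketch, assuming tar is installed; hello.sh and hello.tar are hypothetical names.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical script, marked executable with chmod.
printf '#!/bin/sh\necho hello\n' > hello.sh
chmod 755 hello.sh

# Archive it, delete the original, then restore it from the archive.
tar -cf hello.tar hello.sh
rm hello.sh
tar -xf hello.tar

# The execute permission comes back from the metadata stored in the archive,
# not from the file's name.
ls -l hello.sh
[ -x hello.sh ] && echo 'hello.sh is still executable'
```

One caveat, as far as I know: when an ordinary user extracts files, GNU tar applies the current umask to the restored permissions unless the -p option is given, so the group and other bits may come back narrower than they were stored.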

The most common archive tools used in Linux are listed in Table 1-1. By far the most popular archive format is tar. The name tar comes from a contraction of tape archive, which is a legacy from its days as a tape backup utility. These days, tar is most commonly used as a general-purpose tool to archive groups of files into a single file. An alternative to tar that you may run into less frequently is cpio, which uses a very different syntax to accomplish the same task. There’s also the POSIX standard archive utility pax, which can understand tar files, cpio files, or its own format. I have never seen anything distributed in pax format, but I mention it here for completeness.

One last archive utility worth mentioning is ar, which is most frequently used to create object code libraries used in software development, but it is also used to create package files used by the Debian distribution.

TABLE 1-1 Most Common Archive Tools

Tool   Notes
tar    Most popular.
cpio   Used internally by the RPM format; not used extensively elsewhere.
ar     Used internally by Debian packager; otherwise, used only for software development libraries. ar files have no path information.


You can also find utilities to handle .zip files created with PKZip as well as some lesser-known compressed archive utilities, such as lha. Open source programs for Linux are virtually never distributed in these formats, however. If you see a .zip archive, it’s a good bet that it’s intended for a Microsoft operating system.

For the most part, you need to know two things about each format: how to query the archive for its contents and how to extract files from the archive. Unlike Windows archivers, which have all kinds of dangerous bells and whistles, Linux archivers focus on the basics. So it’s generally safe to query and extract files from an archive, especially if you are not the root user. It’s always wise to query an archive before extracting files so that you don’t inadvertently overwrite files on your system that may have the same names.
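The habit of querying before extracting can be turned into a small check. The function below is a sketch of my own, not a standard tool: would_overwrite is a hypothetical name, and it assumes a gzip-compressed tar archive whose member names contain no newlines.

```shell
# would_overwrite ARCHIVE [DIR]
# Print every member of a gzip-compressed tar archive that already exists
# under DIR (default: the current directory) and so would be overwritten.
would_overwrite() {
    archive=$1
    dir=${2:-.}
    tar -tzf "$archive" | while IFS= read -r member; do
        # Existing directories are merged rather than replaced,
        # so only files are reported.
        if [ -e "$dir/$member" ] && [ ! -d "$dir/$member" ]; then
            printf '%s\n' "$member"
        fi
    done
}
```

Running would_overwrite foo.tar.gz before extracting prints nothing when no existing file is in the way.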

1.4.1 Identifying Archive Files

When you download an archive from the Internet, it most likely has been compressed to save bandwidth. There are some file-naming conventions for compressed files; some of these are shown in Table 1-2.

TABLE 1-2 Archive Naming Conventions

Extensions      Type
.tar.gz, .tgz   tar archive, compressed with gzip
.tar.bz2        tar archive, compressed with bzip2
.tar.Z, .taz    tar archive, compressed with the UNIX compress command
.ar, .a         ar archive, generally used only for software development
.cpio           cpio archive, uncompressed

When in doubt, remember the file command. This tool does a good job of identifying what you are looking at when the filename gives you no clue. This is useful when your Web browser or other tool munges the filename into something unrecognizable. Suppose that I have a compressed tar archive named foo.x, for example. The name tells me nothing about the contents of this file. Then I try the following command:

$ file foo.x

foo.x: gzip compressed data, from UNIX, max compression

Now I know that the file was compressed with gzip, but I still don’t know whether it’s a tar file. I can try unzipping it with gzip and try the file command again. Or I can just use the -z option of the command:

$ file -z foo.x

foo.x: tar archive (gzip compressed data, from UNIX, max compression)

Now I know exactly what I’m looking at.
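The file command can do this because it looks at a file’s contents rather than its name. As a rough illustration of the idea (not a description of how file itself is implemented), a gzip file can be recognized by its first two bytes, 0x1f and 0x8b. The function name is_gzip is hypothetical.

```shell
# is_gzip FILE
# Succeed if FILE starts with the two-byte gzip signature 0x1f 0x8b.
is_gzip() {
    # Dump the first two bytes as hex and strip the spacing od adds.
    magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}
```

For example, is_gzip foo.x && echo 'gzip data' gives a crude version of the first check above, whatever the file happens to be called.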

Normally, people follow some intuitive naming conventions, and the filename does a good job of identifying the archive type and what processing has been done.

1.4.2 Querying an Archive File

Archive files keep track of the files they contain with a table of contents, which (conveniently enough) is accessed with a -t flag for all the archivers I mentioned earlier. Following is a sample from a tar file for the Debian cron installation:

The example above added the -v option to include additional information similar to a long listing from the ls command. The output includes the file permissions in the first column, followed by the ownership in the second column. The file size (in bytes) is shown next, with directories listed as having a size of 0. When inspecting archives, you should pay careful attention to the ownership and permissions of each file.

The basic commands to list the contents of an archive for the various formats are listed in Table 1-3. All three formats produce essentially the same output.


TABLE 1-3 Archive Query Commands

Format                              Command
tar archive compressed with gzip    tar -tzvf filename
tar archive compressed with bzip2   tar -tjvf filename

and stdout as binary streams.
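To try the gzip form of the query command without downloading anything, you can build a small archive and list it. This is a sketch with made-up names (demo, hi.sh, README); the awk step assumes member names contain no spaces.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A hypothetical tree with one executable script and one plain file.
mkdir -p demo/bin
printf '#!/bin/sh\necho hi\n' > demo/bin/hi.sh
chmod 755 demo/bin/hi.sh
echo 'notes' > demo/README
chmod 644 demo/README

tar -czf demo.tar.gz demo

# -t lists the table of contents, -z filters the archive through gzip,
# -v adds the long-listing details, and -f names the archive file.
tar -tzvf demo.tar.gz

# Keep only the permission string and the member name of each entry.
tar -tzvf demo.tar.gz | awk '{ print $1, $NF }'
```

Trimming the listing down to permissions and names makes it easier to spot a file whose mode looks out of place.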

Reading the symbolic representation of a file’s permissions is fairly straightforward when you get used to it. You should be familiar with the tricks that are used to represent additional information above and beyond the usual read/write/execute permissions.

Let’s start with the permission string itself. This is represented with a ten-character string. The first character indicates the type of file, whereas the remaining three groups of three characters summarize the file owner’s permissions, the group members’ permissions, and everyone else’s permissions, respectively.

The type of file is indicated with a single character. The valid values for this character and their meanings are listed in Table 1-4.

The next nine characters can be grouped into three groups of three bits. Each bit represents the read, write, or execute permissions of the file, respectively, represented as r, w, and x. A - in a bit position indicates that that permission is not set. A - in the w position, for example, indicates that the file is not writeable. Some examples are shown in Table 1-5.
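The layout of the ten-character string can be made concrete by pulling it apart. The function below is an illustrative sketch; describe_perms is a hypothetical name, and the d type character for directories is what ls prints for them.

```shell
# describe_perms STRING
# Split a ten-character permission string such as -rwxr-xr-x into the
# file type and the owner, group, and other permission triples.
describe_perms() {
    perms=$1
    type_char=$(printf '%s' "$perms" | cut -c1)
    case $type_char in
        -) kind='regular file' ;;
        d) kind='directory' ;;
        c) kind='character device' ;;
        b) kind='block device' ;;
        l) kind='symbolic link' ;;
        *) kind='other' ;;
    esac
    printf 'type=%s owner=%s group=%s other=%s\n' "$kind" \
        "$(printf '%s' "$perms" | cut -c2-4)" \
        "$(printf '%s' "$perms" | cut -c5-7)" \
        "$(printf '%s' "$perms" | cut -c8-10)"
}

describe_perms -rwxr-xr-x
describe_perms drwxr-x---
```

The first call prints type=regular file owner=rwx group=r-x other=r-x, which is the same breakdown you do by eye when reading a long listing.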

The last things to know about permissions are the setuid, setgid, and sticky bits. These bits are not listed directly, because they affect the file’s behavior only when executing.

When the setuid bit is set, the code in the file will execute, using the file’s owner as the effective user ID. This means that the program can do anything that the file’s owner has permission to do. If a file is owned by root and the setuid bit is set, the code has permission to modify or delete any file in the system, no matter which user starts the program. Sounds dangerous, doesn’t it? Programs with the setuid bit have been the subject of attacks in the past.


The setgid bit does the same thing, except that the code executes with the privileges of the group to which the file belongs. Normally, a program executes with the privileges of the group of the user who started the program. When the setgid bit is set, the program runs with privileges as though the user belonged to the file’s group.

You can recognize a file with the setuid or setgid bit set by looking at the x bit in the permissions string. Normally, an x in this position means that the file is executable, whereas a - indicates that the file is not executable.

TABLE 1-4 File Types in an Archive Listing

Character  Type              Description
-          regular file      Includes text files, data files, executables, etc.
c          character device  A special file used to communicate with a character device driver. These files traditionally are restricted to the /dev directory; you usually don’t see them in archives.
b          block device      A special file used to communicate with a block device driver. These files traditionally are restricted to the /dev directory; you usually don’t see them in archives.
l          symbolic link     A filename that points to another filename. The file it points to may reside on a different file system or may be nonexistent.

TABLE 1-5 Examples of File Permission Bits

Permissions  Meaning
rwx          File is readable, writeable, and executable.
rw-          File is readable and writeable but not executable.
r-x          File is readable and executable but not writeable.
--x          File is executable but not writeable or readable.


The setuid and setgid bits add two more possible values for this character. A lowercase s instead of an x in the owner’s permissions means that the file is executable by the owner and the setuid bit is set. An uppercase S means that the setuid bit is set, but the owner does not have execute permission. It seems odd, but it is allowed and just as dangerous. The file could be owned by root, for example, but root has no permission to execute the file. Linux gives root execute permission if anyone has execute permission. So even if the execute bit for root is not set, as long as the current user has execute permission, the code will execute with root privileges.

Like the setuid bit, the setgid bit is indicated by modifying the x position in the group permissions. A lowercase s here indicates that the file’s setgid bit is set and that members of the group have permission to execute this file. An uppercase S indicates that the setgid bit is set, but members of the group do not have permission to execute the file.
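Because the setuid and setgid markers always sit in fixed positions of the permission string, they are easy to pick out of a long listing mechanically. The filter below is a sketch, with flag_setid as a hypothetical name; it assumes the permission string is the first field of each line, as it is in ls -l and tar -tv output.

```shell
# flag_setid
# Read a long listing on standard input and print only the lines whose
# permission string has s or S in the owner or group execute position.
flag_setid() {
    awk '{
        owner_x = substr($1, 4, 1)   # execute position for the owner
        group_x = substr($1, 7, 1)   # execute position for the group
        if (owner_x ~ /[sS]/ || group_x ~ /[sS]/) print
    }'
}
```

A typical use would be tar -tzvf filename | flag_setid, which leaves only the entries that deserve a second look before you extract anything as root.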

You can see in the cron package output, shown earlier in this chapter, that the crontab program is a setuid program owned by root. Some more permissions and their meanings are shown in Table 1-6.

TABLE 1-6 Some Examples of Permissions and Their Meanings

Permissions  Effective user ID  Effective group ID  Who can execute
-rwxr-xr-x   Current user       Current user        All users can execute this file.
-rwsr-xr-x   File owner         Current user        All users can execute this file.
-rwxr-sr-x   Current user       Group owner         All users can execute this file.
-rwsr-sr-x   File owner         Group owner         All users can execute this file.
-rwsr-Sr-x   File owner         Group owner         All users can execute this file, including the owner, but not members of the file’s group.

Everyone except the owner can execute this file.

All members of the file’s group can execute this file except the owner; everyone else except the owner can execute this file.


The sticky bit is something of a relic. The original intent of the sticky bit was to make sure that certain executable programs would load faster by keeping the code pages on the swap disk. In Linux, the sticky bit is used only in directories, where it has a completely different meaning. Normally, when you give write and execute permission to other users in a directory that you own, those users are free to create and delete files in that directory. One privilege you may not want them to have is the ability to delete other users’ files in that directory. Normally, if a user has write permission in a directory, that user can delete any file in that directory, not just the files he owns. You can revoke this privilege by setting the sticky bit on the directory. When the directory has the sticky bit set, users can delete only files that belong to them. As usual, the directory’s owner and root can delete any files. The /tmp directory on most systems has the sticky bit set for this purpose.

A directory with the sticky bit set is indicated with a t or a T in the execute permission for others. For example:

drwxrwxrwt   All users can read and write in this directory, and the sticky bit is set.
drwxrwx--T   Only the owner and group members can read or write, and the sticky bit is set.
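You can reproduce both forms on scratch directories of your own. This is a small sketch; the directory names shared and team are arbitrary, and the leading 1 in the octal mode is what sets the sticky bit.

```shell
workdir=$(mktemp -d)

# A directory that everyone can write to, with the sticky bit set.
mkdir "$workdir/shared"
chmod 1777 "$workdir/shared"
ls -ld "$workdir/shared" | cut -c1-10     # drwxrwxrwt

# Same bit, but others have no execute permission, so it shows as T.
mkdir "$workdir/team"
chmod 1770 "$workdir/team"
ls -ld "$workdir/team" | cut -c1-10       # drwxrwx--T
```

The cut command keeps just the ten-character permission string so that the t and the T are easy to compare.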

1.4.3 Extracting Files from an Archive File

Now that you know how to inspect an archive file’s contents, it’s time to extract the files to have a closer look. The basic commands are listed in Table 1-7.
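When the goal is only to have a closer look, one cautious approach is to unpack into a fresh scratch directory instead of wherever you happen to be. The helper below is a sketch under that assumption; extract_for_inspection is a hypothetical name, and it handles only gzip-compressed tar archives.

```shell
# extract_for_inspection ARCHIVE
# Unpack a gzip-compressed tar archive into a new temporary directory and
# print that directory's name, leaving the current directory untouched.
extract_for_inspection() {
    dest=$(mktemp -d) || return 1
    # -x extracts, -z filters through gzip, -f names the archive,
    # and -C changes to the destination directory before extracting.
    tar -xzf "$1" -C "$dest" || return 1
    printf '%s\n' "$dest"
}
```

Something like cd "$(extract_for_inspection foo.tar.gz)" then drops you into the unpacked tree, which can be deleted afterward with no effect on the rest of the system.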

Although it’s generally safe to extract files from an archive, you need to pay attention to the pathnames to avoid clobbering any data on your system. In particular, cpio has the ability to store absolute paths from the root directory. This means that if you try to extract a cpio archive that happens to have a bunch of files in /etc, you could clobber vital files inadvertently. Consider a cpio archive that contains a copy of /etc/hosts, among other things. If you try to extract files from this archive, it will try to overwrite your copy of /etc/hosts. You can see this by querying the archive.

cpio -t < foo.cpio

/etc/hosts


The leading / is your clue that the archive wants to restore the copy of /etc/hosts and not some other copy. So if you are extracting files for inspection, you probably don’t want to overwrite your copies of the same file (yet). You will want to make sure that you use the GNU option --no-absolute-filenames so that the hosts file will be extracted to

./etc/hosts
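Checking a listing for leading slashes by eye works for a short archive; for a long one, a filter is more reliable. The function below is a sketch with a hypothetical name, absolute_members; it reads member names on standard input, one per line.

```shell
# absolute_members
# Read archive member names on standard input and print the ones that
# begin with /, that is, the ones that would be restored to absolute paths.
absolute_members() {
    while IFS= read -r name; do
        case $name in
            /*) printf '%s\n' "$name" ;;
        esac
    done
}

# Typical use with a cpio archive (foo.cpio as in the text):
#   cpio -t < foo.cpio | absolute_members
# If anything is printed, extract with the GNU option instead:
#   cpio -i --no-absolute-filenames < foo.cpio
```

An empty result means every member is relative to the directory in which you extract.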

Fortunately, the only time you are likely to encounter a cpio archive is as part of an RPM package file, and RPM always uses pathnames relative to the current directory, so there is no chance of overwriting system files unless you want to.

Note that the version of tar found in some versions of UNIX also allows absolute pathnames. The GNU version of tar found in Linux automatically strips the leading / from files extracted from a tar archive. So if you happen upon a tar file that comes from one of these other flavors of UNIX, GNU tar will watch your back. GNU tar also strips the leading / from the pathnames in archives that it creates.

TABLE 1-7 Archive Extraction Commands

files to the current directory by default.

Package managers are sophisticated tools used to install and maintain software on your system. They help you keep track of what software is installed and where the files are located. A package manager can keep track of dependencies to make sure that new software you install is compatible with the software you have already installed. If you wanted to install a KDE package on a GNOME machine, for example, the package manager would protest, indicating that you don’t have the required runtime libraries. This is preferable to installing the package only to scratch your head trying to figure out why it won’t work.

One of the most valuable features that a package manager offers is the ability to uninstall. This allows you to install a piece of software and try it out, and then uninstall it if you don’t like it. After you uninstall the package, your system is back to the same configuration it had before you installed the package. Uninstalling a package is one way to upgrade it. You remove the old version and install the new one. Most package managers have a special upgrade command so that this can be done in a single step.

The package manager creates a centralized database to keep track of installed applications. This database is also a valuable source of information on the state of your system. You can list the applications currently installed on your computer, for example, or you can verify that a particular application has not been tampered with since installation. Sometimes, just browsing the database can be an educational experience, as you discover software you didn’t know you had.
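Browsing the database takes a single command on most systems. The sketch below assumes one of the two common query tools is present, dpkg-query on Debian-style systems or rpm on RPM-based ones; the wrapper name list_installed is hypothetical.

```shell
# list_installed
# Print the installed packages using whichever package database is present.
list_installed() {
    if command -v dpkg-query >/dev/null 2>&1; then
        # Debian-style database: one "name version" pair per line.
        dpkg-query -W -f='${Package} ${Version}\n'
    elif command -v rpm >/dev/null 2>&1; then
        # RPM database: query (-q) all (-a) installed packages.
        rpm -qa
    else
        echo 'no dpkg or rpm database found' >&2
        return 1
    fi
}
```

Piping the result through a pager, as in list_installed | less, is an easy way to discover what is already on the machine.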

Two of the most common package formats are RPM (RPM Package Manager²) and the Debian Package format. Some additional examples are listed in Table 1-8. As you might guess, RPM is used on Red Hat and Fedora distributions, but also on Suse and others. Likewise, the Debian format is used on the Debian distribution and also on several popular distributions (Knoppix, Ubuntu, and others). Other package managers include pkgtool, which is used by the Slackware distribution, and portage, which is used by the Gentoo distribution.

The decision about which package manager to use is not yours to make (unless you want to create your own distribution). Each Linux distribution chooses a single tool to manage the installed software. It makes no sense to have two package managers in your system. If you don’t like the package manager your distribution uses, you would be well advised to choose a different distribution rather than try to convert to a different package manager.

2 Formerly the Red Hat Package Manager.

Posted: 19/06/2018, 14:37
