Digital Watermarking and Steganography



Ingemar J Cox, Matthew L Miller, Jeffrey A Bloom, Jessica Fridrich, and Ton Kalker

Keeping Found Things Found: The Study and Practice of Personal Information Management

William P Jones

Web Dragons: Inside the Myths of Search Engine Technology

Ian H Witten, Marco Gori, and Teresa Numerico

Introduction to Data Compression, Third Edition

Khalid Sayood

Understanding Digital Libraries, Second Edition

Michael Lesk

Bioinformatics: Managing Scientific Data

Zoé Lacroix and Terence Critchlow

How to Build a Digital Library

Ian H Witten and David Bainbridge

Readings in Multimedia Computing and Networking

Kevin Jeffay and Hong Jiang Zhang

Multimedia Servers: Applications, Environments, and Design

Dinkar Sitaram and Asit Dan

Visual Information Retrieval

Alberto del Bimbo

Managing Gigabytes: Compressing and Indexing Documents and Images, Second Edition

Ian H Witten, Alistair Moffat, and Timothy C Bell

Digital Compression for Multimedia: Principles & Standards

Jerry D Gibson, Toby Berger, Tom Lookabaugh, Rich Baker, and David Lindbergh

Readings in Information Retrieval

Karen Sparck Jones and Peter Willett

For further information on these books and for a list of forthcoming titles,

please visit our web site at http://www.mkp.com.

The Morgan Kaufmann Series in Computer Security

Digital Watermarking and Steganography, Second Edition

Ingemar J Cox, Matthew L Miller, Jeffrey A Bloom, Jessica Fridrich, and Ton Kalker

Information Assurance: Dependability and Security in Networked Systems

Yi Qian, David Tipper, Prashant Krishnamurthy, and James Joshi

Network Recovery: Protection and Restoration of Optical, SONET-SDH, IP, and MPLS

Jean-Philippe Vasseur, Mario Pickavet, and Piet Demeester


Digital Watermarking and Steganography

Second Edition

Ingemar J Cox, Matthew L Miller, Jeffrey A Bloom, Jessica Fridrich, and Ton Kalker

AMSTERDAM • BOSTON • HEIDELBERG • LONDON • NEW YORK • OXFORD • PARIS • SAN DIEGO • SAN FRANCISCO • SINGAPORE • SYDNEY • TOKYO


Editorial Assistant Gregory Chalson

Cover Design Dennis Schaefer

Text Design Elsevier, Inc.

Indexer Distributech Scientific Indexing

Interior printer The Maple-Vail Book Manufacturing Group

Cover printer Phoenix Color

Morgan Kaufmann Publishers is an imprint of Elsevier.

30 Corporate Drive, Suite 400, Burlington, MA 01803, USA

This book is printed on acid-free paper.

Designations used by companies to distinguish their products are often claimed as trademarks or registered trademarks. In all instances in which Morgan Kaufmann Publishers is aware of a claim, the product names appear in initial capital or all capital letters. Readers, however, should contact the appropriate companies for more complete information regarding trademarks and registration.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, scanning, or otherwise) without prior written permission of the publisher.

Permissions may be sought directly from Elsevier’s Science & Technology Rights Department in Oxford, UK: phone: (+44) 1865 843830, fax: (+44) 1865 853333, E-mail: permissions@elsevier.com. You may also complete your request online via the Elsevier homepage (http://elsevier.com), by selecting “Support & Contact” then “Copyright and Permission” and then “Obtaining Permissions.”

Library of Congress Cataloging-in-Publication Data

Digital watermarking and steganography / Ingemar J Cox [et al.].

p. cm.

Includes bibliographical references and index.

ISBN 978-0-12-372585-1 (casebound: alk. paper)

1. Computer security. 2. Digital watermarking. 3. Data protection. I. Cox, I. J. (Ingemar J.)

QA76.9.A25C68 2008

005.8–dc22

2007040595

ISBN 978-0-12-372585-1

For information on all Morgan Kaufmann publications,

visit our Web site at www.mkp.com or www.books.elsevier.com

Printed in the United States of America

07 08 09 10 11 5 4 3 2 1


Ingy Cox

Age 12

May 23, 1986 to January 27, 1999

The light that burns twice as bright burns half as long—and you have burned

so very very brightly

Eldon Tyrell to Roy Batty in Blade Runner.

Screenplay by Hampton Fancher and David Peoples


Preface to the First Edition xv

Preface to the Second Edition xix

Example Watermarking Systems xxi

CHAPTER 1 Introduction 1

1.1 Information Hiding, Steganography, and Watermarking 4

1.2 History of Watermarking 6

1.3 History of Steganography 9

1.4 Importance of Digital Watermarking 11

1.5 Importance of Steganography 12

CHAPTER 2 Applications and Properties 15

2.1 Applications of Watermarking 16

2.1.1 Broadcast Monitoring 16

2.1.2 Owner Identification 19

2.1.3 Proof of Ownership 21

2.1.4 Transaction Tracking 23

2.1.5 Content Authentication 25

2.1.6 Copy Control 27

2.1.7 Device Control 31

2.1.8 Legacy Enhancement 32

2.2 Applications of Steganography 34

2.2.1 Steganography for Dissidents 34

2.2.2 Steganography for Criminals 35

2.3 Properties of Watermarking Systems 36

2.3.1 Embedding Effectiveness 37

2.3.2 Fidelity 37

2.3.3 Data Payload 38

2.3.4 Blind or Informed Detection 39

2.3.5 False Positive Rate 39

2.3.6 Robustness 40

2.3.7 Security 41

2.3.8 Cipher and Watermark Keys 43

2.3.9 Modification and Multiple Watermarks 45

2.3.10 Cost 46

2.4 Evaluating Watermarking Systems 46

2.4.1 The Notion of “Best” 47

2.4.2 Benchmarking 47

2.4.3 Scope of Testing 48

2.5 Properties of Steganographic and Steganalysis Systems 49

2.5.1 Embedding Effectiveness 49

2.5.2 Fidelity 50

2.5.3 Steganographic Capacity, Embedding Capacity, Embedding Efficiency, and Data Payload 50

2.5.4 Blind or Informed Extraction 51

2.5.5 Blind or Targeted Steganalysis 51

2.5.6 Statistical Undetectability 52

2.5.7 False Alarm Rate 53

2.5.8 Robustness 53

2.5.9 Security 54

2.5.10 Stego Key 54

2.6 Evaluating and Testing Steganographic Systems 55

2.7 Summary 56

CHAPTER 3 Models of Watermarking 61

3.1 Notation 62

3.2 Communications 63

3.2.1 Components of Communications Systems 63

3.2.2 Classes of Transmission Channels 64

3.2.3 Secure Transmission 65

3.3 Communication-Based Models of Watermarking 67

3.3.1 Basic Model 67

3.3.2 Watermarking as Communications with Side Information at the Transmitter 75

3.3.3 Watermarking as Multiplexed Communications 78

3.4 Geometric Models of Watermarking 80

3.4.1 Distributions and Regions in Media Space 81

3.4.2 Marking Spaces 87

3.5 Modeling Watermark Detection by Correlation 95

3.5.1 Linear Correlation 96

3.5.2 Normalized Correlation 97

3.5.3 Correlation Coefficient 100

3.6 Summary 102

CHAPTER 4 Basic Message Coding 105

4.1 Mapping Messages into Message Vectors 106

4.1.1 Direct Message Coding 106

4.1.2 Multisymbol Message Coding 110

4.2 Error Correction Coding 117

4.2.1 The Problem with Simple Multisymbol Messages 117

4.2.2 The Idea of Error Correction Codes 118

4.2.3 Example: Trellis Codes and Viterbi Decoding 119


4.3 Detecting Multisymbol Watermarks 124

4.3.1 Detection by Looking for Valid Messages 125

4.3.2 Detection by Detecting Individual Symbols 126

4.3.3 Detection by Comparing against Quantized Vectors 128

4.4 Summary 134

CHAPTER 5 Watermarking with Side Information 137

5.1 Informed Embedding 139

5.1.1 Embedding as an Optimization Problem 140

5.1.2 Optimizing with Respect to a Detection Statistic 141

5.1.3 Optimizing with Respect to an Estimate of Robustness 147

5.2 Watermarking Using Side Information 153

5.2.1 Formal Definition of the Problem 153

5.2.2 Signal and Channel Models 155

5.2.3 Optimal Watermarking for a Single Cover Work 156

5.2.4 Optimal Coding for Multiple Cover Works 157

5.2.5 A Geometrical Interpretation of White Gaussian Signals 158

5.2.6 Understanding Shannon’s Theorem 159

5.2.7 Correlated Gaussian Signals 161

5.3 Dirty-Paper Codes 164

5.3.1 Watermarking of Gaussian Signals: First Approach 164

5.3.2 Costa’s Insight: Writing on Dirty Paper 170

5.3.3 Scalar Watermarking 175

5.3.4 Lattice Codes 179

5.4 Summary 181

CHAPTER 6 Practical Dirty-Paper Codes 183

6.1 Practical Considerations for Dirty-Paper Codes 183

6.1.1 Efficient Encoding Algorithms 184

6.1.2 Efficient Decoding Algorithms 185

6.1.3 Tradeoff between Robustness and Encoding Cost 186

6.2 Broad Approaches to Dirty-Paper Code Design 188

6.2.1 Direct Binning 188

6.2.2 Quantization Index Modulation 188

6.2.3 Dither Modulation 189

6.3 Implementing DM with a Simple Lattice Code 189

6.4 Typical Tricks in Implementing Lattice Codes 194

6.4.1 Choice of Lattice 194

6.4.2 Distortion Compensation 194

6.4.3 Spreading Functions 195

6.4.4 Dither 195


6.5 Coding with Better Lattices 197

6.5.1 Using Nonorthogonal Lattices 197

6.5.2 Important Properties of Lattices 199

6.5.3 Constructing a Dirty-Paper Code from E8 201

6.6 Making Lattice Codes Survive Valumetric Scaling 204

6.6.1 Scale-Invariant Marking Spaces 205

6.6.2 Rational Dither Modulation 207

6.6.3 Inverting Valumetric Scaling 208

6.7 Dirty-Paper Trellis Codes 208

6.8 Summary 212

CHAPTER 7 Analyzing Errors 213

7.1 Message Errors 214

7.2 False Positive Errors 218

7.2.1 Random-Watermark False Positive 219

7.2.2 Random-Work False Positive 221

7.3 False Negative Errors 225

7.4 ROC Curves 228

7.4.1 Hypothetical ROC 228

7.4.2 Histogram of a Real System 230

7.4.3 Interpolation Along One or Both Axes 231

7.5 The Effect of Whitening on Error Rates 232

7.6 Analysis of Normalized Correlation 239

7.6.1 False Positive Analysis 240

7.6.2 False Negative Analysis 250

7.7 Summary 252

CHAPTER 8 Using Perceptual Models 255

8.1 Evaluating Perceptual Impact of Watermarks 255

8.1.1 Fidelity and Quality 256

8.1.2 Human Evaluation Measurement Techniques 257

8.1.3 Automated Evaluation 260

8.2 General Form of a Perceptual Model 263

8.2.1 Sensitivity 263

8.2.2 Masking 266

8.2.3 Pooling 267

8.3 Two Examples of Perceptual Models 269

8.3.1 Watson’s DCT-Based Visual Model 269

8.3.2 A Perceptual Model for Audio 273

8.4 Perceptually Adaptive Watermarking 277

8.4.1 Perceptual Shaping 280

8.4.2 Optimal Use of Perceptual Models 287

8.5 Summary 295


CHAPTER 9 Robust Watermarking 297

9.1 Approaches 298

9.1.1 Redundant Embedding 299

9.1.2 Spread Spectrum Coding 300

9.1.3 Embedding in Perceptually Significant Coefficients 301

9.1.4 Embedding in Coefficients of Known Robustness 302

9.1.5 Inverting Distortions in the Detector 303

9.1.6 Preinverting Distortions in the Embedder 304

9.2 Robustness to Valumetric Distortions 308

9.2.1 Additive Noise 308

9.2.2 Amplitude Changes 312

9.2.3 Linear Filtering 314

9.2.4 Lossy Compression 319

9.2.5 Quantization 320

9.3 Robustness to Temporal and Geometric Distortions 325

9.3.1 Temporal and Geometric Distortions 326

9.3.2 Exhaustive Search 327

9.3.3 Synchronization/Registration in Blind Detectors 328

9.3.4 Autocorrelation 329

9.3.5 Invariant Watermarks 330

9.3.6 Implicit Synchronization 331

9.4 Summary 332

CHAPTER 10 Watermark Security 335

10.1 Security Requirements 335

10.1.1 Restricting Watermark Operations 336

10.1.2 Public and Private Watermarking 338

10.1.3 Categories of Attack 340

10.1.4 Assumptions about the Adversary 345

10.2 Watermark Security and Cryptography 348

10.2.1 The Analogy between Watermarking and Cryptography 348

10.2.2 Preventing Unauthorized Detection 349

10.2.3 Preventing Unauthorized Embedding 351

10.2.4 Preventing Unauthorized Removal 355

10.3 Some Significant Known Attacks 358

10.3.1 Scrambling Attacks 359

10.3.2 Pathological Distortions 359

10.3.3 Copy Attacks 361

10.3.4 Ambiguity Attacks 362

10.3.5 Sensitivity Analysis Attacks 367

10.3.6 Gradient Descent Attacks 372

10.4 Summary 373


CHAPTER 11 Content Authentication 375

11.1 Exact Authentication 377

11.1.1 Fragile Watermarks 377

11.1.2 Embedded Signatures 378

11.1.3 Erasable Watermarks 379

11.2 Selective Authentication 395

11.2.1 Legitimate versus Illegitimate Distortions 395

11.2.2 Semi-Fragile Watermarks 399

11.2.3 Embedded, Semi-Fragile Signatures 404

11.2.4 Telltale Watermarks 409

11.3 Localization 410

11.3.1 Block-Wise Content Authentication 411

11.3.2 Sample-Wise Content Authentication 412

11.3.3 Security Risks with Localization 415

11.4 Restoration 419

11.4.1 Embedded Redundancy 419

11.4.2 Self-Embedding 420

11.4.3 Blind Restoration 421

11.5 Summary 422

CHAPTER 12 Steganography 425

12.1 Steganographic Communication 427

12.1.1 The Channel 428

12.1.2 The Building Blocks 429

12.2 Notation and Terminology 433

12.3 Information-Theoretic Foundations of Steganography 433

12.3.1 Cachin’s Definition of Steganographic Security 434

12.4 Practical Steganographic Methods 439

12.4.1 Statistics Preserving Steganography 439

12.4.2 Model-Based Steganography 441

12.4.3 Masking Embedding as Natural Processing 445

12.5 Minimizing the Embedding Impact 449

12.5.1 Matrix Embedding 450

12.5.2 Nonshared Selection Rule 457

12.6 Summary 467

CHAPTER 13 Steganalysis 469

13.1 Steganalysis Scenarios 469

13.1.1 Detection 470

13.1.2 Forensic Steganalysis 475

13.1.3 The Influence of the Cover Work on Steganalysis 476

13.2 Some Significant Steganalysis Algorithms 477

13.2.1 LSB Embedding and the Histogram Attack 478


13.2.2 Sample Pairs Analysis 480

13.2.3 Blind Steganalysis of JPEG Images Using Calibration 486

13.2.4 Blind Steganalysis in the Spatial Domain 489

13.3 Summary 494

APPENDIX A Background Concepts 497

A.1 Information Theory 497

A.1.1 Entropy 497

A.1.2 Mutual Information 498

A.1.3 Communication Rates 499

A.1.4 Channel Capacity 500

A.2 Coding Theory 503

A.2.1 Hamming Distance 503

A.2.2 Covering Radius 503

A.2.3 Linear Codes 504

A.3 Cryptography 505

A.3.1 Symmetric-Key Cryptography 505

A.3.2 Asymmetric-Key Cryptography 506

A.3.3 One-Way Hash Functions 508

A.3.4 Cryptographic Signatures 510

APPENDIX B Selected Theoretical Results 511

B.1 Information-Theoretic Analysis of Secure Watermarking (Moulin and O’Sullivan) 511

B.1.1 Watermarking as a Game 511

B.1.2 General Capacity of Watermarking 513

B.1.3 Capacity with MSE Fidelity Constraint 514

B.2 Error Probabilities Using Normalized Correlation Detectors (Miller and Bloom) 517

B.3 Effect of Quantization Noise on Watermarks (Eggers and Girod) 522

B.3.1 Background 524

B.3.2 Basic Approach 524

B.3.3 Finding the Probability Density Function 524

B.3.4 Finding the Moment-Generating Function 525

B.3.5 Determining the Expected Correlation for a Gaussian Watermark and Laplacian Content 527

APPENDIX C Notation and Common Variables 529

C.1 Variable Naming Conventions 529

C.2 Operators 530

C.3 Common Variable Names 530

C.4 Common Functions 532

Preface to the First Edition

... an image, audio clip, video clip, or other work of media within that work itself. Although such practices have existed for quite a long time (at least several centuries, if not millennia), the field of digital watermarking only gained widespread popularity as a research topic in the latter half of the 1990s. A few earlier books have devoted substantial space to the subject of digital watermarking [171, 207, 219]. However, to our knowledge, this is the first book dealing exclusively with this field.

PURPOSE

Our goal with this book is to provide a framework in which to conduct research and development of watermarking technology. This book is not intended as a comprehensive survey of the field of watermarking. Rather, it represents our own point of view on the subject. Although we analyze specific examples from the literature, we do so only to the extent that they highlight particular concepts being discussed. (Thus, omissions from the Bibliography should not be considered as reflections on the quality of the omitted works.)

Most of the literature on digital watermarking deals with its application to images, audio, and video, and these application areas have developed somewhat independently. This is in part because each medium has unique characteristics, and researchers seldom have expertise in all three. We are no exception, our own backgrounds being predominantly in images and video. Nevertheless, the fundamental principles behind still image, audio, and video watermarking are the same, so we have made an effort to keep our discussion of these principles generic.

The principles of watermarking we discuss are illustrated with several example algorithms and experiments (the C source code is provided in Appendix C). All of these examples are implemented for image watermarking only. We decided to use only image-based examples because, unlike audio or video, images can be easily presented in a book.

The example algorithms are very simple. In general, they are not themselves useful for real watermarking applications. Rather, each algorithm is intended to provide a clear illustration of a specific idea, and the experiments are intended to examine the idea’s effect on performance.

The book contains a certain amount of repetition. This was a conscious decision, because we assume that many, if not most, readers will not read the book from cover to cover. Rather, we anticipate that readers will look up topics of interest and read only individual sections or chapters. Thus, if a point is relevant in a number of places, we may briefly repeat it several times. It is hoped that this will not make the book too tedious to read straight through, yet will make it more useful to those who read technical books the way we do.

CONTENT AND ORGANIZATION

Chapters 1 and 2 of this book provide introductory material. Chapter 1 provides a history of watermarking, as well as a discussion of the characteristics that distinguish watermarking from the related fields of data hiding and steganography. Chapter 2 describes a wide variety of applications of digital watermarking and serves as motivation. The applications highlight a variety of sometimes conflicting requirements for watermarking, which are discussed in more detail in the second half of the chapter.

The technical content of this book begins with Chapter 3, which presents several frameworks for modeling watermarking systems. Along the way, we describe, test, and analyze some simple image watermarking algorithms that illustrate the concepts being discussed. In Chapter 4, these algorithms are extended to carry larger data payloads by means of conventional message-coding techniques. Although these techniques are commonly used in watermarking systems, some recent research suggests that substantially better performance can be achieved by exploiting side information in the encoding process. This is discussed in Chapter 5.

Chapter 7 analyzes message errors, false positives, and false negatives that may occur in watermarking systems. It also introduces whitening.

The next three chapters explore a number of general problems related to fidelity, robustness, and security that arise in designing watermarking systems, and present techniques that can be used to overcome them. Chapter 8 examines the problems of modeling human perception, and of using those models in watermarking systems. Although simple perceptual models for audio and still images are described, perceptual modeling is not the focus of this chapter. Rather, we focus on how any perceptual model can be used to improve the fidelity of the watermarked content.

Chapter 9 covers techniques for making watermarks survive several types of common degradations, such as filtering, geometric or temporal transformations, and lossy compression.

Chapter 10 describes a framework for analyzing security issues in watermarking systems. It then presents a few types of malicious attacks to which watermarks might be subjected, along with possible countermeasures. Finally, Chapter 11 covers techniques for using watermarks to verify the integrity of the content in which they are embedded. This includes the area of fragile watermarks, which disappear or become invalid if the watermarked Work is degraded in any way.

ACKNOWLEDGMENTS

First, we must thank several people who have directly helped us in making this book. Thanks to Karyn Johnson, Jennifer Mann, and Marnie Boyd of Morgan Kaufmann for their enthusiasm and help with this book. As reviewers, Ton Kalker, Rade Petrovic, Steve Decker, Adnan Alattar, Aaron Birenboim, and Gary Hartwick provided valuable feedback. Harold Stone and Steve Weinstein of NEC also gave us many hours of valuable discussion. And much of our thinking about authentication (Chapter 11) was shaped by a conversation with Dr. Richard Green of the Metropolitan Police Service, Scotland Yard. We also thank M. Gwenael Doerr for his review.

Special thanks, too, to Valerie Tucci, our librarian at NEC, who was invaluable in obtaining many, sometimes obscure, publications. And Karen Hahn for secretarial support. Finally, thanks to Dave Waltz, Mitsuhito Sakaguchi, and NEC Research Institute for providing the resources needed to write this book. It could not have been written otherwise.

We are also grateful to many researchers and engineers who have helped develop our understanding of this field over the last several years. Our work on watermarking began in 1995 thanks to a talk Larry O’Gorman presented at NECI. Joe Kilian, Tom Leighton, and Talal Shamoon were early collaborators. Joe has continued to provide valuable insights and support. Warren Smith has taught us much about high-dimensional geometry. Jont Allen, Jim Flanagan, and Jim Johnston helped us understand auditory perceptual modeling. Thanks also to those at NEC Central Research Labs who worked with us on several watermarking projects: Ryoma Oami, Takahiro Kimoto, Atsushi Murashima, and Naoki Shibata.

Each summer we had the good fortune to have excellent summer students who helped solve some difficult problems. Thanks to Andy McKellips and Min Wu of Princeton University and Ching-Yung Lin of Columbia University. We also had the good fortune to collaborate with professors Mike Orchard and Stu Schwartz of Princeton University.

We probably learned more about watermarking during our involvement in the request for proposals for watermarking technologies for DVD disks than at any other time. We are therefore grateful to our competitors for pushing us to our limits, especially Jean-Paul Linnartz, Ton Kalker (again), and Maurice Maes of Philips; Jeffrey Rhoads of Digimarc; John Ryan and Patrice Capitant of Macrovision; and Akio Koide, N. Morimoto, Shu Shimizu, Kohichi Kamijoh, and Tadashi Mizutani of IBM (with whom we later collaborated). We are also grateful to the engineers of NEC’s PC&C division who worked on hardware implementations for this competition, especially Kazuyoshi Tanaka, Junya Watanabe, Yutaka Wakasu, and Shigeyuki Kurahashi.

Much of our work was conducted while we were employed at Signafy, and we are grateful to several Signafy personnel who helped with the technical challenges: Peter Blicher, Yui Man Lui, Doug Rayner, Jan Edler, and Alan Stein (whose real-time video library is amazing).

We wish also to thank the many others who have helped us out in a variety of ways. A special thanks to Phil Feig, our favorite patent attorney, for filing many of our patent applications with the minimum of overhead. Thanks to Takao Nishitani for supporting our cooperation with NEC’s Central Research Labs. Thanks to Kasinath Anupindi, Kelly Feng, and Sanjay Palnitkar for system administration support. Thanks to Jim Philbin, Doug Bercow, Marc Triaureau, Gail Berreitter, and John Anello for making Signafy a fun and functioning place to work. Thanks to Alan Bell for making CPTWG possible. Thanks to Mitsuhito Sakaguchi (again), who first suggested that we become involved in the CPTWG meetings. Thanks to Shichiro Tsuruta for managing PC&C’s effort during the CPTWG competition, and H. Morito of NEC’s semiconductor division. Thanks to Dan Sullivan for the part he played in our collaboration with IBM. Thanks to the DHSG cochairs who organized the competition: Bob Finger, Jerry Pierce, and Paul Wehrenberg. Thanks also to the many people at the Hollywood studios who provided us with the content owners’ perspective: Chris Cookson and Paul Klamer of Warner Brothers, Bob Lambert of Disney, Paul Heimbach and Gary Hartwick of Viacom, Jane Sunderland and David Grant of Fox, David Stebbings of the RIAA, and Paul Egge of the MPAA. Thanks to Christine Podilchuk for her support. It was much appreciated. Thanks to Bill Connolly for interesting discussions. Thanks to John Kulp, Rafael Alonso, the Sarnoff Corporation, and John Manville of Lehman Brothers for their support. And thanks to Vince Gentile, Tom Belton, Susan Kleiner, Ginger Mosier, Tom Nagle, and Cynthia Thorpe.

Finally, we thank our families for their patience and support during this project: Susan and Zoe Cox, Geidre Miller, and Pamela Bloom.

Preface to the Second Edition

During this period there has been significant progress in digital watermarking, and the field of steganography has witnessed increasing interest since the terrorist events of September 11, 2001.

Digital watermarking and steganography are closely related. In the first edition of Digital Watermarking we made a decision to distinguish between watermarking and steganography and to focus exclusively on the former. For this second edition we decided to broaden the coverage to include steganography and to therefore change the title of the book to Digital Watermarking and Steganography.

Despite the new title, this is not a new book, but a revision of the original. We hope this is clear from the back cover material and apologize in advance to any reader who thought otherwise.

CONTENT AND ORGANIZATION

The organization of this book closely follows that of the original. The treatment of watermarking and steganography is, for the most part, kept separate. The reasons for this are twofold. First, we anticipate that readers might prefer not to read the book from cover to cover, but rather read specific chapters of interest. And second, an integrated revision would require considerably more work.

Chapters 1 and 2 include new material related to steganography and, where necessary, updated material related to watermarking. In particular, Chapter 2 highlights the similarities and differences between watermarking and steganography. Chapters 3, 4, 7, 8, 9, and 10 remain untouched, except that bibliographic citations have been updated.

Chapter 5 of the first edition has now been expanded to two chapters, reflecting the research interest in modeling watermarking as communications with side information. Chapter 5 provides a more detailed theoretical discussion of the topic, especially with regard to dirty-paper coding. Chapter 6 then provides a description of a variety of common dirty-paper coding techniques for digital watermarking.

Section 11.1.3 in Chapter 11 has been revised to include material on a variety of erasable watermarking methods.

Finally, two new chapters, Chapters 12 and 13, have been added. These chapters discuss steganography and steganalysis, respectively.

The authors would like to thank the following people: Alan Bell of Warner Brothers for discussions on HD-DVD digital rights management technology, John Choi for discussions relating to watermarking of MP3 files in Korea, and David Soukal for creating graphics for the Stego chapter.

And of course we would like to thank our families and friends for their support in the endeavor: Rimante Okkels; Zoe, Geoff, and Astrid Cox; Pam Bloom and her watermarking team of Joshua, Madison, Emily Giedre, Fia, and Ada; Monika, Nicole, and Kathy Fridrich; Miroslav Goljan; Robin Redding; and all the animals.

Finally, to Matt, your coauthors send their strongest wishes: get well soon!

Example Watermarking Systems

... illustrate and test some of the main points. Discussions of test results provide additional insights and lead to subsequent sections.

Each investigation begins with a preamble. If a new watermarking system is being used, a description of the system is provided. Experimental procedures and results are then described.

The watermark embedders and watermark detectors that make up these systems are given names and are referred to many times throughout the book. The naming convention we use is as follows: All embedder and detector names are written in sans serif font to help set them apart from the other text. Embedder names all start with E_ and are followed by a word or acronym describing one of the main techniques illustrated by an algorithm. Similarly, detector names begin with D_ followed by a word or acronym. For example, the embedder in the first system is named E_BLIND (it is an implementation of blind embedding), and the detector is named D_LC (it is an implementation of linear correlation detection).

Each system used in an investigation consists of an embedder and a detector. In many cases, one or the other of these is shared with several other systems. For example, in Chapter 3, the D_LC detector is paired with the E_BLIND embedder in System 1 and with the E_FIXED_LC embedder in System 2. In subsequent chapters, this same detector appears again in a number of other systems. Each individual embedder and detector is described in detail in the first system in which it is used.

In the following, we list each of the 19 systems described in the text, along with the number of the page on which its description begins, as well as a brief review of the points it is meant to illustrate and how it works. The source code for these systems is provided in Appendix C.

System 1: E_BLIND/D_LC 70

Blind Embedding and Linear Correlation Detection: The blind embedder E_BLIND simply adds a pattern to an image. A reference pattern is scaled by a strength parameter, α, prior to being added to the image. Its sign is dictated by the message being encoded.

The D_LC linear correlation detector calculates the correlation between the received image and the reference pattern. If the magnitude of the correlation is higher than a threshold, the watermark is declared to be present. The message is encoded in the sign of the correlation.
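The description of System 1 can be made concrete with a short sketch. This is a minimal illustration in Python, not the book's C source. It assumes that images and patterns are flat lists of numbers and that correlation is the average of element-wise products. The function names, the mapping of message bit 1 to a positive sign, and the parameters `alpha` and `threshold` are choices made for this example.

```python
def linear_correlation(work, pattern):
    """Average of element-wise products of two equal-length sequences."""
    return sum(w * p for w, p in zip(work, pattern)) / len(work)


def e_blind(cover, reference, message_bit, alpha=1.0):
    """Blind embedding: add the reference pattern, scaled by alpha.

    The sign of the added pattern carries the message
    (assumed here: bit 1 -> +reference, bit 0 -> -reference).
    """
    sign = 1.0 if message_bit else -1.0
    return [c + sign * alpha * r for c, r in zip(cover, reference)]


def d_lc(work, reference, threshold):
    """Linear correlation detection.

    Returns None when the correlation magnitude does not exceed the
    threshold (no watermark declared), otherwise the decoded bit.
    """
    z = linear_correlation(work, reference)
    if abs(z) <= threshold:
        return None
    return 1 if z > 0 else 0


# Toy data (hypothetical): a flat "image" and a zero-mean +/-1 pattern.
cover = [100.0] * 8
reference = [1.0, -1.0] * 4
marked_one = e_blind(cover, reference, 1, alpha=2.0)
marked_zero = e_blind(cover, reference, 0, alpha=2.0)
```

With this toy data the flat cover is uncorrelated with the reference pattern, so `d_lc(cover, reference, 1.0)` declares no watermark, while the two marked lists decode to 1 and 0. A real image generally has some nonzero correlation with the reference pattern, so the threshold and `alpha` would need to be chosen with that in mind.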


System 2: E_FIXED_LC/D_LC 77

Fixed Linear Correlation Embedder and Linear Correlation Detection: This system uses the same D_LC linear correlation detector as System 1, but introduces a new embedding algorithm that implements a type of informed embedding. Interpreting the cover Work as channel noise that is known, the E_FIXED_LC embedder adjusts the strength of the watermark to compensate for this noise, to ensure that the watermarked Work has a specified linear correlation with the reference pattern.
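One way to read "adjusts the strength of the watermark to compensate for this noise" is to solve for the strength that produces the desired correlation. The sketch below assumes the same averaged-product correlation as a stand-in for the detector's statistic. Because that correlation is linear, the marked Work's correlation is the cover's correlation plus the strength times the pattern's self-correlation, which gives the strength in closed form. The function names, the `target` parameter, and the sign convention for bits are assumptions for illustration, not the book's implementation.

```python
def linear_correlation(work, pattern):
    """Average of element-wise products of two equal-length sequences."""
    return sum(w * p for w, p in zip(work, pattern)) / len(work)


def e_fixed_lc(cover, reference, message_bit, target):
    """Informed embedding with a fixed target correlation.

    The message pattern is +reference for bit 1 and -reference for
    bit 0 (assumed convention). The strength alpha is solved from
        lc(cover + alpha * wm, wm) = lc(cover, wm) + alpha * lc(wm, wm)
    so that the marked work correlates with the message pattern at
    exactly `target`. Returns the marked work and the alpha used.
    """
    sign = 1.0 if message_bit else -1.0
    wm = [sign * r for r in reference]
    alpha = (target - linear_correlation(cover, wm)) / linear_correlation(wm, wm)
    marked = [c + alpha * w for c, w in zip(cover, wm)]
    return marked, alpha


# Toy data (hypothetical): the cover already correlates at -2.0 with
# the reference, so different bits need different strengths.
cover = [10.0, 12.0, 9.0, 15.0]
reference = [1.0, -1.0, 1.0, -1.0]
marked_one, alpha_one = e_fixed_lc(cover, reference, 1, target=3.0)
marked_zero, alpha_zero = e_fixed_lc(cover, reference, 0, target=3.0)
```

In this toy case the cover leans toward the bit 0 direction, so embedding bit 1 needs a larger strength than embedding bit 0, yet both marked Works end up at the same correlation magnitude. That is the sense in which the cover Work is treated as known noise.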

System 3: E_BLK_BLIND/D_BLK_CC 89

Block-Based, Blind Embedding, and Correlation Coefficient Detection: This system illustrates the division of watermarking into media space and marking space by use of an extraction function. It also introduces the use of the correlation coefficient as a detection measure.

The E_BLK_BLIND embedder performs three basic steps. First, a 64-dimensional vector, v_o, is extracted from the unwatermarked image by averaging 8 × 8 blocks. Second, a reference mark, w_r, is scaled and either added to or subtracted from v_o. This yields a marked vector, v_w. Finally, the difference between v_o and v_w is added to each block in the image, thus ensuring that the extraction process (block averaging), when applied to the resulting image, will yield v_w.

The D_BLK_CC detector extracts a vector from an image by averaging 8 × 8 pixel blocks. It then compares the resulting 64-dimensional vector, v, against a reference mark using the correlation coefficient.
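The three embedding steps and the detector can be sketched as follows. This is an illustrative Python version under stated assumptions: the image is a list of rows whose width and height are multiples of 8, the detector thresholds the magnitude of the correlation coefficient and reads the bit from its sign (borrowed from System 1, not stated in the summary above), and all function and parameter names are hypothetical.

```python
import math


def extract_mark(image):
    """Average all 8 x 8 blocks of the image into one 64-vector."""
    height, width = len(image), len(image[0])
    sums = [0.0] * 64
    for y in range(height):
        for x in range(width):
            sums[(y % 8) * 8 + (x % 8)] += image[y][x]
    n_blocks = (height // 8) * (width // 8)
    return [s / n_blocks for s in sums]


def e_blk_blind(image, ref_mark, message_bit, alpha):
    """Embed in marking space, then project the change back to the image."""
    sign = 1.0 if message_bit else -1.0
    v_o = extract_mark(image)                                     # step 1
    v_w = [v + sign * alpha * r for v, r in zip(v_o, ref_mark)]   # step 2
    delta = [b - a for a, b in zip(v_o, v_w)]
    # Step 3: add the same 64-element difference to every 8 x 8 block.
    return [[pixel + delta[(y % 8) * 8 + (x % 8)]
             for x, pixel in enumerate(row)]
            for y, row in enumerate(image)]


def correlation_coefficient(v, w):
    """Normalized correlation of the two vectors after mean removal."""
    mean_v, mean_w = sum(v) / len(v), sum(w) / len(w)
    dv = [a - mean_v for a in v]
    dw = [b - mean_w for b in w]
    denom = math.sqrt(sum(a * a for a in dv) * sum(b * b for b in dw))
    return sum(a * b for a, b in zip(dv, dw)) / denom if denom > 0 else 0.0


def d_blk_cc(image, ref_mark, threshold):
    """Return None (no mark) or the bit given by the coefficient's sign."""
    z = correlation_coefficient(extract_mark(image), ref_mark)
    if abs(z) <= threshold:
        return None
    return 1 if z > 0 else 0


# Toy data (hypothetical): a flat 16 x 16 image and a +/-1 reference mark.
image = [[128.0] * 16 for _ in range(16)]
ref_mark = [1.0 if k % 2 == 0 else -1.0 for k in range(64)]
marked = e_blk_blind(image, ref_mark, 1, alpha=2.0)
```

Because every block receives the same difference vector, block averaging on `marked` returns v_w, which is the property the third step is meant to guarantee.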

System 4: E_SIMPLE_8/D_SIMPLE_8

8-Bit Blind Embedder, 8-Bit Detector: The E_SIMPLE_8 embedder is a version of the E_BLIND embedder modified to embed 8-bit messages. It first constructs a message pattern by adding or subtracting each of eight reference patterns. Each reference pattern denotes 1 bit, and the value of the bit determines whether it is added or subtracted. It then multiplies the message pattern by a scaling factor and adds it to the image.

The D_SIMPLE_8 detector correlates the received image against each of the eight reference patterns and uses the sign of each correlation to determine the most likely value for the corresponding bit. This yields the decoded message. The detector does not distinguish between marked and unwatermarked images.
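A small sketch of this 8-bit scheme follows. To keep the example deterministic it swaps in rows of a Hadamard matrix as the eight reference patterns, which is an assumption of this sketch and not something the text specifies. The function names and the rule that a 1 bit adds its pattern are likewise hypothetical.

```python
def hadamard_row(index, length):
    """Row `index` of a +/-1 Hadamard matrix; `length` must be a power of 2."""
    return [-1 if bin(index & j).count("1") % 2 else 1 for j in range(length)]


def e_simple_8(cover, patterns, bits, alpha):
    """Add (bit 1) or subtract (bit 0) each pattern, scale, and add to cover."""
    message_pattern = [0] * len(cover)
    for bit, pattern in zip(bits, patterns):
        sign = 1 if bit else -1
        message_pattern = [m + sign * w
                           for m, w in zip(message_pattern, pattern)]
    return [c + alpha * m for c, m in zip(cover, message_pattern)]


def d_simple_8(image, patterns):
    """Decode each bit from the sign of the correlation with its pattern."""
    return [1 if sum(c * w for c, w in zip(image, p)) > 0 else 0
            for p in patterns]


patterns = [hadamard_row(i, 16) for i in range(1, 9)]  # rows 1..8, zero-mean
cover = [100] * 16
message = [1, 0, 1, 1, 0, 0, 1, 0]
marked = e_simple_8(cover, patterns, message, alpha=2)
decoded = d_simple_8(marked, patterns)
```

Because the detector only inspects signs, it returns eight bits for any input, including the unmarked cover. This mirrors the remark that it does not distinguish marked from unwatermarked images.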

System 5: E_TRELLIS_8/D_TRELLIS_8

Trellis-Coding Embedder, Viterbi Detector: This system embeds 8-bit messages using trellis-coded modulation. In the E_TRELLIS_8 embedder, the 8-bit message is redundantly encoded as a sequence of symbols drawn from an alphabet of 16 symbols. A message pattern is then constructed by adding together reference patterns representing the symbols in the sequence. The pattern is then embedded with blind embedding.

The D_TRELLIS_8 detector uses a Viterbi decoder to determine the most likely 8-bit message. It does not distinguish between watermarked and unwatermarked images.

The D_BLK_8 detector averages 8 × 8 blocks and uses a Viterbi decoder to identify the most likely 8-bit message. It then reencodes that 8-bit message to find the most likely message mark, and tests for that message mark using the correlation coefficient.

System 7: E_BLK_FIXED_CC/D_BLK_CC

Block-Based Watermarks with Fixed Normalized Correlation Embedding: This is a first attempt at informed embedding for normalized correlation detection. Like the E_FIXED_LC embedder, the E_BLK_FIXED_CC embedder aims to ensure a specified detection value. However, experiments with this system show that its robustness is not as high as might be hoped.

The E_BLK_FIXED_CC embedder is based on the E_BLK_BLIND embedder, performing the same basic three steps of extracting a vector from the unwatermarked image, modifying that vector to embed the mark, and then modifying the image so that it will yield the new extracted vector. However, rather than modify the extracted vector by blindly adding or subtracting a reference mark, the E_BLK_FIXED_CC embedder finds the closest point in 64-space that will yield a specified correlation coefficient with the reference mark. The D_BLK_CC detector used here is the same as in the E_BLK_BLIND/D_BLK_CC system.

System 8: E_BLK_FIXED_R/D_BLK_CC

Block-Based Watermarks with Fixed Robustness Embedding: This system fixes the difficulty with the E_BLK_FIXED_CC/D_BLK_CC system by trying to obtain a fixed estimate of robustness, rather than a fixed detection value.

After extracting a vector from the unwatermarked image, the E_BLK_FIXED_R embedder finds the closest point in 64-space that is likely to lie within the detection region even after a specified amount of noise has been added. The D_BLK_CC detector used here is the same as in the E_BLK_BLIND/D_BLK_CC system.

System 9: E_LATTICE/D_LATTICE

Lattice-Coded Watermarks: This illustrates a method of watermarking with dirty-paper codes that can yield much higher data payloads than are practical with the E_DIRTY_PAPER/D_DIRTY_PAPER system. Here, the set of code vectors is not random. Rather, each code vector is a point on a lattice. Each message is represented by all points on a sublattice.

The embedder takes a 345-bit message and applies an error correction code to obtain a sequence of 1,380 bits. It then identifies the sublattice that corresponds to this sequence of bits and quantizes the cover image to find the closest point in that sublattice. Finally, it modifies the image to obtain a watermarked image close to this lattice point.

The detector quantizes its input image to obtain the closest point on the entire lattice. It then identifies the sublattice that contains this point, which corresponds to a sequence of 1,380 bits. Finally, it decodes this bit sequence to obtain a 345-bit message. It makes no attempt to determine whether or not a watermark is present, but simply returns a random message when presented with an unwatermarked image.

System 10: E_E8LATTICE/D_E8LATTICE

E8 Lattice-Coded Watermarks: This system illustrates the benefits of using an E8 lattice over the orthogonal lattice used in System 9. Experimental results compare the performance of System 10 and System 9 and demonstrate that the E8 lattice has superior performance.

System 11: E_BLIND/D_WHITE

Blind Embedding and Whitened Linear Correlation Detection: This system explores the effects of applying a whitening filter in linear correlation detection. It uses the E_BLIND embedding algorithm introduced in System 1.

The D_WHITE detector applies a whitening filter to the image and the watermark reference pattern before computing the linear correlation between them. The whitening filter is an 11 × 11 kernel derived from a simple model of the distribution of unwatermarked images as an elliptical Gaussian.


System 12: E_BLK_BLIND/D_WHITE_BLK_CC

Block-Based Blind Embedding and Whitened Correlation Coefficient Detection: This system explores the effects of whitening on correlation coefficient detection. It uses the E_BLK_BLIND embedding algorithm introduced in System 3.

The D_WHITE_BLK_CC detector first extracts a 64-dimensional vector from the image by averaging 8 × 8 blocks. It then filters the result with the same whitening filter used in D_WHITE. This is roughly equivalent to filtering the image before extracting the vector. Finally, it computes the correlation coefficient between the filtered, extracted vector and a filtered version of a reference mark.

System 13: E_PERC_GSCALE

Perceptually Limited Embedding and Linear Correlation Detection: This system begins an exploration of the use of perceptual models in watermark embedding. It uses the D_LC detector introduced in System 1.

The E_PERC_GSCALE embedder is similar to the E_BLIND embedder in that, ultimately, it scales the reference mark and adds it to the image. However, in E_PERC_GSCALE the scaling is automatically chosen to obtain a specified perceptual distance, as measured by Watson’s perceptual model.

System 14: E_PERC_SHAPE

Perceptually Shaped Embedding and Linear Correlation Detection: This system is similar to System 13, but before computing the scaling factor for the entire reference pattern, the E_PERC_SHAPE embedder first perceptually shapes the pattern.

The perceptual shaping is performed in three steps. First, the embedder converts the reference pattern into the block DCT domain (the domain in which Watson’s model is defined). Next, it scales each term of the transformed reference pattern by a corresponding slack value obtained by applying Watson’s model to the cover image. This amplifies the pattern in areas where the image can easily hide noise, and attenuates it in areas where noise would be visible. Finally, the resultant shaped pattern is converted back into the spatial domain. The shaped pattern is then scaled and added to the image in the same manner as in E_PERC_GSCALE.

System 15: E_PERC_OPT

Optimally Scaled Embedding and Linear Correlation Detection: This system is essentially the same as System 14. The only difference is that perceptual shaping is performed using an “optimal” algorithm, instead of simply scaling each term of the reference pattern’s block DCT. This shaping is optimal in the sense that the resulting pattern yields the highest possible correlation with the reference pattern for a given perceptual distance (as measured by Watson’s model).

System 16: E_MOD/D_LC

Watermark Embedding Using Modulo Addition: This is a simple example of a system that produces erasable watermarks. It uses the D_LC detector introduced in System 1.

The E_MOD embedder is essentially the same as the E_BLIND embedder, in that it scales a reference pattern and adds it to the image. The difference is that the E_MOD embedder uses modulo 256 addition. This means that rather than being clipped to a range of 0 to 255, the pixel values wrap around. Therefore, for example, 253 + 4 becomes 1. Because of this wraparound, it is possible for someone who knows the watermark pattern and embedding strength to perfectly invert the embedding process, erasing the watermark and obtaining a bit-for-bit copy of the original.
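The wraparound and its exact inversion can be illustrated in a few lines of Python. The function names and the integer-valued pattern and strength are assumptions of this sketch.

```python
def e_mod(cover, pattern, alpha):
    """Add alpha * pattern to 8-bit pixels with wraparound instead of clipping."""
    return [(c + alpha * w) % 256 for c, w in zip(cover, pattern)]


def erase(marked, pattern, alpha):
    """Invert e_mod exactly, given the same pattern and strength."""
    return [(p - alpha * w) % 256 for p, w in zip(marked, pattern)]


cover = [253, 2, 128, 255]
pattern = [1, -1, 1, -1]
marked = e_mod(cover, pattern, alpha=4)      # first pixel: 253 + 4 wraps to 1
restored = erase(marked, pattern, alpha=4)   # identical to `cover`
```

Clipping would have mapped 253 + 4 to 255 and lost the information needed to recover 253. The modulo operation keeps the mapping one-to-one, which is why the original can be restored exactly.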

System 17: E_DCTQ/D_DCTQ

Semi-fragile Watermarking: This system illustrates a carefully targeted semi-fragile watermark intended for authenticating images. The watermarks are designed to be robust against JPEG compression down to a specified quality factor, but fragile against most other processes (including more severe JPEG compression).

The E_DCTQ embedder first converts the image into the block DCT domain used by JPEG. It then quantizes several high-frequency coefficients in each block to either an even or odd multiple of a quantization step size. Each quantized coefficient encodes either a 0, if it is quantized to an even multiple, or a 1, if quantized to an odd multiple. The pattern of 1s and 0s embedded depends on a key that is shared with the detector. The quantization step sizes are chosen according to the expected effect of JPEG compression at the worst quality factor the watermark should survive.

The D_DCTQ detector converts the image into the block DCT domain and identifies the closest quantization multiples for each of the high-frequency coefficients used during embedding. From these, it obtains a pattern of bits, which it compares against the pattern embedded. If enough bits match, the detector declares that the watermark is present.

The D_DCTQ detector can be modified to yield localized information about where an image has been corrupted. This is done by checking the number of correct bits in each block independently. Any block with enough correctly embedded bits is deemed authentic.
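The even/odd quantization rule can be sketched on individual coefficient values as follows. The block DCT itself, the choice of coefficients, and the key-dependent bit pattern are left out; the function names, the step size of 8, and the `min_matches` parameter are hypothetical.

```python
import math


def embed_bit(coefficient, bit, step):
    """Move `coefficient` to the nearest multiple of `step` with parity `bit`."""
    k = math.floor((coefficient / step - bit) / 2 + 0.5)
    return (2 * k + bit) * step


def detect_bit(coefficient, step):
    """Parity of the nearest multiple of `step`: even -> 0, odd -> 1."""
    return math.floor(coefficient / step + 0.5) % 2


def is_authentic(coefficients, expected_bits, step, min_matches):
    """Declare the mark present when enough detected bits match the key bits."""
    matches = sum(detect_bit(c, step) == b
                  for c, b in zip(coefficients, expected_bits))
    return matches >= min_matches


one = embed_bit(37.0, 1, 8.0)    # nearest odd multiple of 8
zero = embed_bit(37.0, 0, 8.0)   # nearest even multiple of 8
```

A distortion smaller than half a step leaves the detected bit unchanged, while a larger one can flip it. This is the sense in which the step size sets the level of processing the mark survives.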


System 18: E_SFSIG/D_SFSIG

Semi-fragile Signature: This extends the E_DCTQ/D_DCTQ system to provide detection of distortions that only affect the low-frequency terms of the block DCT. Here, the embedded bit pattern is a semi-fragile signature derived from the low-frequency terms of the block DCT.

The E_SFSIG embedder computes a bit pattern by comparing the magnitudes of corresponding low-frequency coefficients in randomly selected pairs of blocks. Because quantization usually does not affect the relative magnitudes of different values, most bits of this signature should be unaffected by JPEG (which quantizes images in the block DCT domain). The signature is embedded in the high-frequency coefficients of the blocks using the same method used in E_DCTQ.

The D_SFSIG detector computes a signature in the same way as E_SFSIG and compares it against the watermark found in the high-frequency coefficients. If enough bits match, the watermark is deemed present.
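The signature computation can be sketched as below, assuming for simplicity one low-frequency coefficient per block. The keyed pair selection with Python's `random.Random`, the rule that a larger first magnitude gives a 1, and uniform quantization as a stand-in for JPEG are all assumptions of this illustration.

```python
import random


def signature(coefficients, key, n_bits):
    """One bit per keyed pair: 1 if the first magnitude is larger, else 0."""
    rng = random.Random(key)                     # key shared with the detector
    bits = []
    for _ in range(n_bits):
        i, j = rng.sample(range(len(coefficients)), 2)   # two distinct blocks
        bits.append(1 if abs(coefficients[i]) > abs(coefficients[j]) else 0)
    return bits


def quantize(coefficients, step):
    """Uniform quantization, a crude stand-in for JPEG's effect."""
    return [step * round(c / step) for c in coefficients]


low_freq = [10.0, -50.0, 90.0, -130.0, 170.0, 210.0]
before = signature(low_freq, key=7, n_bits=16)
after = signature(quantize(low_freq, 8.0), key=7, n_bits=16)
```

In this toy data the magnitudes stay in the same order after quantization, so the two signatures agree. Coefficients that are close together can collapse to the same quantized value, which is why the text says only that most bits should be unaffected.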

System 19: E_PXL/D_PXL

Pixel-by-Pixel Localized Authentication: This system illustrates a method of authenticating images with pixel-by-pixel localization. That is, the detector determines whether each individual pixel is authentic.

The E_PXL embedder embeds a predefined binary pattern, usually a tiled logo that can be easily recognized by human observers. Each bit is embedded in one pixel according to a secret mapping of pixel values into bit values (known to both embedder and detector). The pixel is moved to the closest value that maps to the desired bit value. Error diffusion is used to minimize the perceptual impact.

The D_PXL detector simply maps each pixel value to a bit value according to the secret mapping. Regions of the image modified since the watermark was embedded result in essentially random bit patterns, whereas unmodified regions result in the embedded pattern. By examining the detected bit pattern, it is easy to see where the image has been modified.
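A minimal sketch of the pixel-to-bit mapping follows. It omits error diffusion entirely, and the way the secret table is derived from a key (a seeded `random.Random`), the function names, and the tie-breaking toward the lower pixel value are assumptions made for illustration.

```python
import random


def make_mapping(key):
    """Secret table assigning a bit to each of the 256 possible pixel values."""
    rng = random.Random(key)
    return [rng.randrange(2) for _ in range(256)]


def embed_pixel(pixel, bit, mapping):
    """Move `pixel` to the closest value in 0..255 that maps to `bit`."""
    for distance in range(256):
        for candidate in (pixel - distance, pixel + distance):
            if 0 <= candidate <= 255 and mapping[candidate] == bit:
                return candidate
    raise ValueError("mapping never produces the requested bit")


def d_pxl(pixels, mapping):
    """Read back one bit per pixel using the shared mapping."""
    return [mapping[p] for p in pixels]


mapping = make_mapping(key=2024)
logo_bits = [1, 0, 0, 1, 1, 0]
cover = [12, 200, 37, 37, 255, 0]
marked = [embed_pixel(p, b, mapping) for p, b in zip(cover, logo_bits)]
```

Reading `marked` through the same table reproduces `logo_bits`. A pixel altered afterward lands on an essentially arbitrary table entry, which is what makes tampered regions show up as noise in the detected pattern.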

System 20: SE_LTSOLVER

Linear System Solver for Matrices Satisfying Robust Soliton Distribution: This system describes a method for solving a system of linear equations, Ax = y, when the Hamming weights of the columns of the matrix A follow a robust soliton distribution. It is intended to be used as part of a practical implementation of wet paper codes with non-shared selection rules.

The SE_LTSOLVER accepts as input the linear system matrix, A, and the right-hand side, y, and outputs the solution to the system if it exists, or a message that the solution cannot be found. The solution proceeds by repeatedly swapping the rows and columns of the matrix until an upper triangular matrix is obtained (if the system has a solution). The solution is then found by back substitution, as in classical Gaussian elimination, and by re-permuting the solution vector.

System 21: SD_SPA

Detector of LSB Embedding: This is a steganalysis system that detects images with messages embedded using LSB embedding. It uses sample pairs analysis to estimate the number of flipped LSBs in an image and thereby detect LSB steganography.

It works by first dividing all pixels in the image into pairs and then assigning them to several categories. The cardinalities of the categories are used to form a quadratic equation for the unknown relative number of flipped LSBs. The input is a grayscale image; the output is the estimate of the relative message length in bits per pixel.
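For context, the kind of embedding this detector targets can be sketched as follows. This is not sample pairs analysis; it is only the simplest form of LSB embedding, in which message bits replace the least significant bits of the first pixels in order. The function names and the sequential embedding path are assumptions of this illustration.

```python
def lsb_embed(pixels, bits):
    """Replace the least significant bit of the first len(bits) pixels."""
    if len(bits) > len(pixels):
        raise ValueError("message is longer than the cover")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit   # clear the LSB, then set it to bit
    return stego


def lsb_extract(pixels, n_bits):
    """Read the least significant bit of the first n_bits pixels."""
    return [p & 1 for p in pixels[:n_bits]]


cover = [200, 201, 13, 14, 255, 0]
message = [1, 1, 0, 1]
stego = lsb_embed(cover, message)
```

Each pixel changes by at most 1, and only where its LSB differs from the message bit. The fraction of LSBs flipped in this way is the quantity the detector sets out to estimate.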

System 22: SD_DEN_FEATURES

Blind Steganalysis in Spatial Domain based on de-noising and a feature vector: This system extracts 27 features from a grayscale image for the purpose of blind steganalysis, primarily in the spatial domain.

The SD_DEN_FEATURES system first applies a denoising filter to the image and then extracts the noise residual, which is subsequently transformed to the wavelet domain. Statistical moments of the coefficients from the three highest-frequency subbands are then calculated as features for steganalysis. Classification can be performed using a variety of machine learning tools.


The watermark on the $20 bill (Figure 1.1), just like most paper watermarks today, has two properties that relate to the subject of the present book. First, the watermark is hidden from view during normal use, only becoming visible as a result of a special viewing process (in this case, holding the bill up to the light). Second, the watermark carries information about the object in which it is hidden (in this case, the watermark indicates the authenticity of the bill).

In addition to paper, watermarking can be applied to other physical objects and to electronic signals. Fabrics, garment labels, and product packaging are examples of physical objects that can be watermarked using special invisible dyes and inks [344, 348]. Electronic representations of music, photographs, and video are common types of signals that can be watermarked.

Consider another example that also involves imperceptible marking of paper but is fundamentally different on a philosophical level. Imagine a spy called Alice who needs to communicate a very important finding to her superiors. Alice begins by writing a letter describing her wonderful recent family vacation. After writing the letter, Alice replaces the ink in her pen with milk and writes a top secret message between the inked lines of her letter. When the milk dries, this secret message becomes imperceptible to the human eye. Heating up the paper above a candle will make the secret message visible. This is an example of steganography. In contrast to watermarking, the hidden message is unrelated to the content of the letter, which only serves as a decoy or cover to hide the very fact that a secret message is being sent.


We define watermarking as the practice of imperceptibly altering a Work

to embed a message about that Work.3

We define steganography as the practice of undetectably altering a Work

to embed a secret message.

Even though the objectives of watermarking and steganography are quite different, both applications share certain high-level elements. Both systems consist of an embedder and a detector, as illustrated in Figure 1.2. The embedder takes two inputs. One is the payload we want to embed (e.g., either the watermark or the secret message), and the other is the cover Work in which we want to embed the payload. The output of the embedder is typically transmitted or recorded. Later, that Work (or some other Work that has not been through the embedder) is presented as an input to the detector. Most detectors try to determine whether a payload is present, and if so, output the message encoded by it.

FIGURE 1.2

A generic watermarking (steganography) system. (Diagram labels: Cover Work, Payload, Embedder, Watermarked Work (Stego Work), payload.)

In the late 1990s there was an explosion of interest in digital systems for the watermarking of various content. The main focus has been on photographs, audio, and video, but other content (such as binary images [453], text [49, 50, 271], line drawings [380], three-dimensional models [36, 312, 462], animation parameters [177], executable code [385], and integrated circuits [215, 249]) has also been marked. The proposed applications of these methods are many and varied, and include identification of the copyright owner, indication to recording equipment that the marked content should not be recorded, verification that content has not been modified since the mark was embedded, and the monitoring of broadcast channels looking for marked content.

1 We use the term steganology to refer to both steganography and steganalysis, just as cryptology refers to both cryptography and cryptanalysis. The term steganology is not commonly used but is more precise than using steganography. However, we will often use steganography and steganology interchangeably.

2 This definition of the term Work is consistent with the language used in the United States copyright laws [416]. Other terms that have been used can be found in the discussion of this term in the Glossary.

3 Some researchers do not consider imperceptibility a defining characteristic of digital watermarking. This leads to the field of perceptible watermarking [52, 164, 286, 294, 295], which

Interest in steganology increased significantly after the terrorist attacks on September 11, 2001, when it became clear that means for concealing the communication itself are likely to be used for criminal activities.4 The first steganalytic methods focused on the most common type of hiding, called Least Significant Bit embedding [142, 444], in bitmap and GIF images. Later, substantial effort was directed to the most common image format, JPEG [132, 144], and to audio files [443]. Accurate methods for detecting hidden messages prompted further research in steganography for multimedia files [147, 442].

4 Interestingly, USA Today reported on this possibility several months before the September 11, 2001 attacks [1]. However, there has been little evidence to substantiate these claims.


1.1 INFORMATION HIDING, STEGANOGRAPHY, AND WATERMARKING

Information hiding, steganography, and watermarking are three closely related fields that have a great deal of overlap and share many technical approaches. However, there are fundamental philosophical differences that affect the requirements, and thus the design, of a technical solution. In this section, we discuss these differences.

Information hiding (or data hiding) is a general term encompassing a wide range of problems beyond that of embedding messages in content. The term hiding here can refer to either making the information imperceptible (as in watermarking) or keeping the existence of the information secret. Some examples of research in this field can be found in the International Workshops on Information Hiding, which have included papers on such topics as maintaining anonymity while using a network [232] and keeping part of a database secret from unauthorized users [298].

The inventor of the word steganography is Trithemius, the author of the early publications on cryptography: Polygraphia and Steganographia. The technical term itself is derived from the Greek words steganos, which means “covered,” and graphia, which means “writing.” Steganography is the art of concealed communication. The very existence of a message is secret. Besides invisible ink, an oft-cited example of steganography is an ancient story from Herodotus [192], who tells of a slave sent by his master, Histiæus, to the Ionian city of Miletus with a secret message tattooed on his scalp. After tattooing, the slave grew his hair back in order to conceal the message. He then journeyed to Miletus and, upon arriving, shaved his head to reveal the message to the city’s regent, Aristagoras. The message encouraged Aristagoras to start a revolt against the Persian king. In this scenario, the message is of primary value to Histiæus and the slave is simply the carrier of the message.

We can use this example to highlight the difference between steganography and watermarking. Imagine that the message on the slave’s head read, “This slave belongs to Histiæus.” In that this message refers to the slave (cover Work), this would meet our definition of a watermark. Maybe the only reason to conceal the message would be cosmetic. However, if someone else claimed possession of the slave, Histiæus could shave the slave’s head and prove ownership. In this scenario, the slave (cover Work) is of primary value to Histiæus, and the message provides useful information about the cover Work.

Systems for inserting messages in Works can thus be divided into watermarking systems, in which the message is related to the cover Work, and nonwatermarking systems, in which the message is unrelated to the cover Work. They can also be independently divided into steganographic systems, in which the very existence of the message is kept secret, and nonsteganographic systems, in which the existence of the message need not be secret. This results in four categories of information-hiding systems, which are summarized in Table 1.1. An example of each of the four categories helps to clarify their definitions.

Table 1.1 Four categories of information hiding. Each category is described with an example in the text. (CW refers to cover Work.)

1. In 1981, photographic reprints of confidential British cabinet documents were being printed in newspapers. Rumor has it that to determine the source of the leak, Margaret Thatcher arranged to distribute uniquely identifiable copies of documents to each of her ministers. Each copy had a different word spacing that was used to encode the identity of the recipient. In this way, the source of the leaks could be identified [20]. This is an example of covert watermarking. The watermarks encoded information related to the recipient of each copy of the documents, and were covert in that the ministers were kept unaware of their existence so that the source of the leak could be identified.

2. The possibility of steganographically embedded data unrelated to the cover Work (i.e., messages hidden in otherwise innocuous transmissions) has always been a concern to the military. Simmons provides a fascinating description of covert channels [376], in which he discusses the technical issues surrounding verification of the SALT-II treaty between the United States and the Soviet Union. The SALT-II treaty allowed both countries to have many missile silos but only a limited number of missiles. To verify compliance with the treaty, each country would install sensors, provided by the other country, in their silos. Each sensor would tell the other country whether or not its silo was occupied, but nothing else. The concern was that the respective countries might design the sensor to communicate additional information, such as the location of its silo, hidden inside the legitimate message.

3. An example of an overt watermark (i.e., the presence of the watermark is known) can be seen at the web site of the Hermitage Museum in St. Petersburg, Russia.5 The museum presents a large number of high-quality digital copies of its famous collection on its web site. Each image has been watermarked to identify the Hermitage as its owner, and a message at the bottom of each web page indicates this fact, along with the warning that the images may not be reproduced. Knowledge that an invisible watermark is embedded in each image helps deter piracy.

4. Overt, embedded communication refers to the known transmission of auxiliary, hidden information that is unrelated to the signal in which it is embedded. It was common practice in radio in the late 1940s to embed a time code in the broadcast at a specified frequency (800 Hz, for example) [342]. This time code was embedded at periodic intervals, say every 15 minutes. The code was inaudibly hidden in the broadcast, but it was not a watermark because the message (the current time) was unrelated to the content of the broadcast. Further, it was not an example of steganography because the presence of an embedded time code can only be useful if its existence is known.

By distinguishing between embedded data that relates to the cover Work and hidden data that does not, we can anticipate the different applications and requirements of the data-hiding method. However, the actual techniques may be very similar, or in some cases identical. Thus, although this book focuses on watermarking and steganographic techniques, most of these techniques are applicable to other areas of information hiding.

1.2 HISTORY OF WATERMARKING

Although the art of papermaking was invented in China over one thousand years earlier, paper watermarks did not appear until about 1282, in Italy.6 The marks were made by adding thin wire patterns to the paper molds. The paper would be slightly thinner where the wire was and hence more transparent.

The meaning and purpose of the earliest watermarks are uncertain. They may have been used for practical functions such as identifying the molds on which sheets of papers were made, or as trademarks to identify the paper maker. On the other hand, they may have represented mystical signs, or might simply have served as decoration.

By the eighteenth century, watermarks on paper made in Europe and America had become more clearly utilitarian. They were used as trademarks, to record the date the paper was manufactured, and to indicate the sizes of original sheets. It was also about this time that watermarks began to be used as anticounterfeiting measures on money and other documents.

5 See http://www.hermitagemuseum.org.

6 Much of our description of paper watermarking is obtained from Hunter [204].

The term watermark seems to have been coined near the end of the eighteenth century and may have been derived from the German term wassermarke [378] (though it could also be that the German word is derived from the English [204]). The term is actually a misnomer, in that water is not especially important in the creation of the mark. It was probably given because the marks resemble the effects of water on paper.

About the time the term watermark was coined, counterfeiters began developing methods of forging watermarks used to protect paper money. In 1779, Gentleman’s Magazine [285] reported that a man named John Mathison

had discovered a method of counterfeiting the water-mark of the bank paper, which was before thought the principal security against frauds. This discovery he made an offer to reveal, and of teaching the world the method of detecting the fraud, on condition of pardon, which, however, was no weight with the bank.

John Mathison was hanged.

Counterfeiting prompted advances in watermarking technology. William Congreve, an Englishman, invented a technique for making color watermarks by inserting dyed material into the middle of the paper during papermaking. The resulting marks must have been extremely difficult to forge, because the Bank of England itself declined to use them on the grounds that they were too difficult to make. A more practical technology was invented by another Englishman, William Henry Smith. This replaced the fine wire patterns used to make earlier marks with a sort of shallow relief sculpture, pressed into the paper mold. The resulting variation on the surface of the mold produced beautiful watermarks with varying shades of gray. This is the basic technique used today for the face of President Jackson on the $20 bill.

Examples of our more general notion of watermarks (imperceptible messages about the objects in which they are embedded) probably date back to the earliest civilizations. David Kahn, in his classic book The Codebreakers, provides interesting historical notes [214]. An especially relevant story describes a message hidden in the book Hypnerotomachia Poliphili, anonymously published in 1499. The first letters of each chapter spell out “Poliam Frater Franciscus Columna Peramavit,” assumed to mean “Father Francesco Columna loves Polia.”7

7 This translation is not universally accepted. Burke [58] notes that the two words of the title of the book are made up by the author. Burke goes on to claim that the word Poliphili, assumed to mean “lover of Polia,” might also mean “the Antiquarian”; in which case, the secret message might be better translated as “Father Francesco Columna was deeply devoted

Four hundred years later, we find the first example of a technology similar to the digital methods discussed in this book. In 1954, Emil Hembrooke of the Muzak Corporation filed a patent for “watermarking” musical Works. An identification code was inserted in music by intermittently applying a narrow notch filter centered at 1 kHz. The absence of energy at this frequency indicated that the notch filter had been applied, and the duration of the absence was used to code either a dot or a dash. The identification signal used Morse code. The 1961 U.S. Patent describing this invention states [185]:

The present invention makes possible the positive identification of the origin of a musical presentation and thereby constitutes an effective means of preventing such piracy, i.e. it may be likened to a watermark in paper.

This system was used by Muzak until around 1984 [432]. It is interesting to speculate that this invention was misunderstood and became the source of persistent rumors that Muzak was delivering subliminal advertising messages to its listeners.

It is difficult to determine when digital watermarking was first discussed. In 1979, Szepanski [398] described a machine-detectable pattern that could be placed on documents for anti-counterfeiting purposes. Nine years later, Holt et al. [197] described a method for embedding an identification code in an audio signal. However, it was Komatsu and Tominaga [238], in 1988, who appear to have first used the term digital watermark. Still, it was probably not until the early 1990s that the term digital watermarking really came into vogue.

About 1995, interest in digital watermarking began to mushroom. Figure 1.3 is a histogram of the number of papers published on the topic [63]. The first Information Hiding Workshop (IHW) [20], which included digital watermarking as one of its primary topics, was held in 1996. The SPIE began devoting a conference specifically to Security and Watermarking of Multimedia Contents [450, 451], beginning in 1999.

In addition, about this time, several organizations began considering watermarking technology for inclusion in various standards. The Copy Protection Technical Working Group (CPTWG) [34] tested watermarking systems for protection of video on DVD disks. The Secure Digital Music Initiative (SDMI) [364] made watermarking a central component of their system for protecting music. Two projects sponsored by the European Union, VIVA [110] and Talisman [180], tested watermarking for broadcast monitoring. The International Organization for Standardization (ISO) took an interest in the technology in the context of designing advanced MPEG standards.

In the late 1990s several companies were established to market watermarking products. Technology from the Verance Corporation was adopted into the first phase of SDMI and was used by Internet music distributors such as Liquid Audio. In the area of image watermarking, Digimarc bundled its watermark embedders and detectors with Adobe’s Photoshop. More recently, a number of companies have used watermarking technologies for a variety of applications, which are discussed in detail in Chapter 2.

FIGURE 1.3

Annual number of papers published on watermarking and steganography by the IEEE. (Horizontal axis: years 1991 through 2006.)

1.3 HISTORY OF STEGANOGRAPHY

The first written evidence about steganography being used to send messages is the Herodotus [192] story about slaves and their shaved heads already mentioned in Section 1.1. Herodotus also documented the story of Demeratus, who alerted Sparta about the planned invasion of Greece by the Persian Great King Xerxes. Demeratus scraped the wax off the surface of a wooden writing tablet and scratched his warning into the wood. The tablet was then coated with a fresh layer of wax to appear as a blank writing tablet that could be safely carried to Sparta without arousing suspicion.

Aeneas the Tactician [399] proposed many steganographic techniques that could be considered “state of the art” of his time, such as hiding messages in women’s earrings or messages carried by pigeons. He also described several methods for hiding messages in text by modifying the height of letter strokes or marking letters in a text using small holes.

Linguistic steganography, also called acrostic, was one of the most popular ancient steganographic methods. Secret messages were encoded as initial letters of sentences or successive tercets in a poem. One of the most famous examples is Amorosa visione by Giovanni Boccaccio [446].
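As a small illustration of the acrostic idea, the following Python sketch reads a hidden message from the initial letters of successive lines. The function name `read_acrostic` and the sample cover text are invented for this illustration and are not drawn from any historical source.

```python
def read_acrostic(cover_text: str) -> str:
    """Return the message spelled by the first letter of each non-empty line."""
    letters = []
    for line in cover_text.splitlines():
        stripped = line.strip()
        if stripped:  # skip blank lines, e.g. between stanzas
            letters.append(stripped[0].upper())
    return "".join(letters)


# Hypothetical cover text: each line's initial letter carries one message letter.
cover = """Having walked all day, we rested.
Evening came on slowly.
Lanterns were lit along the road.
People gathered in the square."""

print(read_acrostic(cover))  # prints "HELP"
```

Because the positions of the message letters follow a regular, publicly guessable rule, anyone who suspects an acrostic can read it; no secret key is involved.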

A more advanced version of linguistic steganography originally conceived in China and reinvented by Cardan (1501–1576) is the famous Cardan’s Grille. The letters of the secret message do not form a regular structure but a random pattern. The message is read simply by placing a mask over the text. The mask is an early example of a secret (stego) key that had to be shared between communicating parties. Acrostic was also used in World War I by both the Germans and Allies.

A precursor of modern steganographic methods was described by Francis Bacon [27]. Bacon used italic or normal fonts to encode binary representations of letters in his works. Five letters of the cover Work could hold five bits and thus one letter of the alphabet. What made this method relatively inconspicuous was the variability of sixteenth-century typography.
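The five-bits-per-letter idea can be sketched in a few lines of Python. This is a simplified illustration, not Bacon's exact scheme: it assumes a 26-letter alphabet numbered A = 00000 through Z = 11001 (Bacon's original table used a 24-letter alphabet), and it uses uppercase and lowercase as a stand-in for the two typefaces, since plain text carries no font information. The function names `embed` and `extract` are hypothetical.

```python
import string

# Assumption: 26-letter alphabet, each letter mapped to its 5-bit index.
ALPHABET = string.ascii_uppercase


def embed(message: str, cover: str) -> str:
    """Hide `message` in the letter case of `cover` (upper = 1, lower = 0)."""
    bits = "".join(format(ALPHABET.index(ch), "05b")
                   for ch in message.upper() if ch in ALPHABET)
    out = []
    i = 0  # index of the next bit to embed
    for ch in cover:
        if ch.isalpha() and i < len(bits):
            out.append(ch.upper() if bits[i] == "1" else ch.lower())
            i += 1
        else:
            out.append(ch)  # non-letters and surplus letters pass through
    if i < len(bits):
        raise ValueError("cover text has too few letters for this message")
    return "".join(out)


def extract(stego: str, n_letters: int) -> str:
    """Recover `n_letters` message letters from the case pattern of `stego`."""
    bits = "".join("1" if ch.isupper() else "0" for ch in stego if ch.isalpha())
    groups = [bits[k:k + 5] for k in range(0, 5 * n_letters, 5)]
    return "".join(ALPHABET[int(g, 2)] for g in groups)


stego = embed("HI", "the quick brown fox")
print(stego)              # thE QUiCk brown fox
print(extract(stego, 2))  # HI
```

Each message letter consumes five cover letters, so the two-letter message above uses the first ten letters of the cover and leaves the rest untouched. The receiver must know the message length (here passed as `n_letters`), which is a choice made for this sketch.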

A modern version of this steganographic technique was described by Brassil et al. [51]. They used the fact that while shifting lines of text up or down by 1/300 of an inch is not visually perceptible, these small changes are robust enough to survive photocopying.

Another idea that played an important role in several wars in the nineteenth and twentieth centuries was originally proposed by Brewster (1857) [54]. He suggested hiding messages by shrinking them so much that they started resembling specks of dirt. The shrinking was made possible by the technology developed by French photographer Dagron during the Franco-Prussian War (1870–1871). Microscopic images could be hidden in nostrils, ears, or under fingernails [386]. In World War I, Germans used such “microdots” and hid them in corners of postcards slit open with a knife and resealed with starch. The modern twentieth-century microdots could hold up to one page of text and even contain photographs. The Allies discovered the use of microdots in 1941.

A more recent and quite ingenious use of steganography helped Commander Jeremiah Denton convey the truth about his North Vietnamese captors. When paraded in front of the news media as part of staged propaganda, Denton blinked his eyes in Morse code spelling out T-O-R-T-U-R-E.

Similar to watermarking, the boom of steganography coincides with the appearance of the Internet. The rapid spread of computer networks and shift to digitization of media created a very favorable environment for covert steganographic communication. Recently, steganography has been suspected as a possible means of information exchange and planning of terrorist attacks. It is only natural that such technology could be used for planning criminal activities. Moreover, as of writing this book in mid-2006, there are over 300 steganographic products available for download on the Internet. Some of these tools offer strong encryption methods that encrypt the secret messages to provide an additional layer of security in case the steganographic scheme is broken. Examples of such programs are Steganos (http://www.steganos.com/) and Stealthencrypt (http://www.stealthencrypt.com/). For more current lists of steganographic programs, see http://www.stegoarchive.com/.

Advances in steganography have spurred the complementary field of steganalysis, which started developing more rapidly after the terrorist attacks of September 11, 2001. Steganalysis is concerned with developing methods for detecting the presence of secret messages and eventually extracting them. Steganography is considered broken even when the mere presence of the secret message is detected. Indeed, the fact that we know that certain parties are communicating secretly is often a very important piece of information.

1.4 IMPORTANCE OF DIGITAL WATERMARKING

The sudden increase in watermarking interest is most likely due to the increase in concern over copyright protection of content. The Internet had become user friendly with the introduction of Marc Andreessen’s Mosaic web browser in November 1993 [11], and it quickly became clear that people wanted to download pictures, music, and videos. The Internet is an excellent distribution system for digital media because it is inexpensive, eliminates warehousing and stock, and delivers almost instantaneously. However, content owners (especially large Hollywood studios and music labels) also see a high risk of piracy.

This risk of piracy is exacerbated by the proliferation of high-capacity digital recording devices. When the only way the average customer could record a song or a movie was on analog tape, pirated copies were usually of a lower quality than the originals, and the quality of second-generation pirated copies (i.e., copies of a copy) was generally very poor. However, with digital recording devices, songs and movies can be recorded with little, if any, degradation in quality. Using these recording devices and using the Internet for distribution, would-be pirates can easily record and distribute copyright-protected material without appropriate compensation being paid to the actual copyright owners. Thus, content owners are eagerly seeking technologies that promise to protect their rights.

The first technology content owners turn to is cryptography. Cryptography is probably the most common method of protecting digital content. It is certainly one of the best developed as a science. The content is encrypted prior to delivery, and a decryption key is provided only to those who have purchased legitimate copies of the content. The encrypted file can then be made available via the Internet, but would be useless to a pirate without an appropriate key. Unfortunately, encryption cannot help the seller monitor how a legitimate customer handles the content after decryption. A pirate can actually purchase the product, use the decryption key to obtain an unprotected copy of the

Title	Digital Watermarking and Steganography
Authors	Ingemar J. Cox, Matthew L. Miller, Jeffrey A. Bloom, Jessica Fridrich, Ton Kalker
Institution	Virginia Polytechnic University
Subject	Computer Security
Type	book
Year of publication	2008
City	Burlington
