DIGITAL WATERMARKING ROBUST TO
PRINT-AND-SCAN PROCESS
Zhou Zhicheng
NATIONAL UNIVERSITY OF SINGAPORE
2004
DIGITAL WATERMARKING ROBUST TO
PRINT-AND-SCAN PROCESS
Zhou Zhicheng
(B.Eng.(Hon.) in Computer Science, USTC)
A THESIS SUBMITTED
FOR THE DEGREE OF MASTER OF SCIENCE
DEPARTMENT OF COMPUTER SCIENCE
SCHOOL OF COMPUTING
NATIONAL UNIVERSITY OF SINGAPORE
2004
Acknowledgement
The writing of this thesis has benefited from the deep concern and valuable help of my advisors, Dr Qibin Sun and A/P Ee-chien Chang, whose kind encouragement, generous patience, and pertinent criticism have guided me throughout the research work and the writing of this thesis. Their broad knowledge, deep insight, serious attitude, and research enthusiasm benefited me greatly in conducting my research and will surely guide me in future challenges.
I also want to express my sincere gratitude to my friends Dajun He, Shuiming Ye, Zhishou Zhang, Zixiang Yang, Xinglei Zhu, and Rong Zhang for their continuous encouragement, valuable discussions, and kind advice. I will always remember the time we spent together in joy and harmony, like a big family.
Finally, allow me to express my deep thanks to my family for their consistent encouragement and support. Their deep love is the driving force behind my progress. Special thanks go to my wife for sharing my joys and sorrows all the time.
Publications
Shuiming Ye, Zhicheng Zhou, Qibin Sun, and Qi Tian, A Quantization-Based Image Authentication System, Fourth International Conference on Information, Communications & Signal Processing, Fourth IEEE Pacific-Rim Conference On Multimedia, Singapore, December 2003.
Qibin Sun, Dajun He, Zhicheng Zhou, and Shuiming Ye, Feature Selection for Semi-fragile Signature-based Authentication Systems, International Conference on Information Technology: Research and Education, Newark, New Jersey, August 2003.
Table of Contents
Acknowledgement
Publications
Summary
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Digital Watermarking
1.1.1 Overview of Digital Watermarking
1.1.2 Models of Digital Watermarking
1.2 Print-and-Scan Process
1.3 Motivation
1.4 Organization of thesis
Chapter 2 Printer and Scanner
2.1 Working Principles of Printer
2.2 Working Principles of Scanner
2.3 Related Work on Watermarking
2.4 Summary
Chapter 3 Noise of Print-and-Scan Process
3.1 Introduction
3.2 Properties in Pixel Domain
3.3 Properties in DCT Domain
3.4 Summary
Chapter 4 Watermarking Algorithm
4.1 Original DEW
4.1.1 Watermark Embedding
4.1.2 Watermark Extracting
4.2 Our Watermarking Scheme
4.2.1 Watermark Embedding
4.2.2 Watermark Extracting
Chapter 5 Experiments and Discussions
5.1 Side Effect of RST Distortion
5.2 Optimal Parameters for Watermarking
5.2.1 Energy Threshold for Watermark Extracting
5.2.2 Quality Factor for Watermark Embedding
5.2.3 Energy threshold for watermark embedding
5.2.4 Cutting Index Cmin
5.2.5 Number of blocks in one GOB
5.3 Performance Comparison
5.3.1 Fidelity
5.3.2 Robustness
5.4 Summary
Chapter 6 Conclusion and Future Work
6.1 Conclusion
6.2 Future Work
Summary
After decades of development, research on digital watermarking has advanced greatly in many ways. Many digital watermarking systems are robust to ordinary image processing operations such as lossy compression, cropping, scaling, and rotation. However, digital watermarking that is robust to D/A and A/D conversion, such as the print-and-scan process, has not yet been fully investigated. In general, print-and-scan distortion includes two types of distortion: rotation, scaling, and translation (RST) distortion and pixel value distortion. However, much existing research focuses only on the former because pixel value distortion is difficult to model accurately.
In this thesis, we explore the properties of pixel value distortion and present a watermarking scheme robust to this distortion. From our investigation of the noise properties of the print-and-scan process, we find that smooth blocks are less distorted during this process. A watermark embedded in these blocks is therefore likely to be more robust to print-and-scan distortion. However, protecting the texture blocks from distortion is also necessary, because the watermark extractor in this blind watermarking scheme also uses these texture blocks. For this reason, we design a noise-adding operation to compensate for the print-and-scan distortion of the texture blocks. In addition, the watermark extractor also affects the performance of the whole system, so we use a multiple-vote operation to enhance the robustness of the watermark extractor. The rules of the multiple-vote operation are specially designed based on the properties of print-and-scan distortion.
The experimental results show that this watermarking scheme is robust to print-and-scan distortion for a variety of images with different texture content. Furthermore, the comparison between this scheme and the differential energy watermarking (DEW) scheme shows that our specially designed operations are effective in improving watermark robustness to print-and-scan distortion.
List of Figures
Figure 1.1 Classification of Digital Watermarking
Figure 1.2 Watermarking Model I: Watermark as Noise
Figure 1.3 Watermarking Model II: Watermark as Noise and Side Information
Figure 1.4 Watermarking Model III: Watermark as Transmitted Messages
Figure 2.1 Basic Components of Laser Printers
Figure 2.2 Dot Gain Effect of Printers
Figure 2.3 Gamma Correction. TOP: Gamma Error; BELOW: Gamma Correction
Figure 2.4 Basic Components of Flatbed Scanners
Figure 3.1 The Print-and-Scan Copy of 'Lena' with Supplemental Marks
Figure 3.2 'Lena' and 'Baboon'
Figure 3.3 Projection Map Between Original and Scanned Images
Figure 3.4 Distortion Map Between Original and Scanned Images
Figure 3.5 Projection Maps of Different Copies of Scanned 'Lena'
Figure 3.6 Pixel Distortions Between Original 'Lena' and Gaussian-Noised 'Lena'
Figure 3.7 Zig-zag Scan Order of DCT Coefficients
Figure 3.8 Absolute Distortions in Low & High Frequency
Figure 3.9 Distortions of DC Coefficients. TOP: Increased; BOTTOM: Decreased
Figure 3.10 The Blocks Whose Energies are Decreased by Print-and-Scan
Figure 3.11 Magnitudes of Different Types of Distortions
Figure 4.1 DEW Watermark Embedding Scheme
Figure 4.2 Shuffled Images: 'Lena' (LEFT) & 'Baboon' (RIGHT)
Figure 4.3 Creation of GOB in DEW
Figure 4.4 DEW Watermark Extracting Scheme
Figure 4.5 Modified Watermark Embedding Scheme
Figure 4.6 Slices of the Shuffled Images: 'Lena' (LEFT) & 'Baboon' (RIGHT)
Figure 4.7 PSNR of Different Print-and-Scan Images
Figure 4.8 Print-and-Scan Copies of 'Lena': Original (LEFT); Noise Adding (RIGHT)
Figure 4.9 Modified Watermark Extracting Scheme
Figure 5.1 Bit Error Rates for Different RST
Figure 5.2 Watermarked 'Lena' and 'Baboon'
Figure 5.3 Bit Error Rates for Different G
Figure 5.4 PSNR of Watermarked Images for Different Q
Figure 5.5 Watermarked Images for Q=45
Figure 5.6 Watermarked Images for Q=65
Figure 5.7 Bit Error Rates for Different Q
Figure 5.8 Bit Error Rates for Different E
Figure 5.9 PSNR of Watermarked Images for Different Cmin
Figure 5.10 Bit Error Rates for Different Cmin
Figure 5.11 Watermarked Images for Different n. UP: M=8; BELOW: M=0.12
Figure 5.12 PSNR of Watermarked Images for Different n
Figure 5.13 Bit Error Rates for Different n
Figure 5.14 PSNRs for All Test Images
Figure 5.15 Watermarked 'Flower' of DEW (LEFT) and Our Scheme (RIGHT)
Figure 5.16 Robustness Comparison for DEW and our Scheme on 'Lena'
Figure 5.17 Robustness Comparison for DEW and our Scheme on 'Baboon'
Figure 5.18 Robustness Comparison for DEW and our Scheme on 'Fruit'
Figure 5.19 Robustness Comparison for DEW and our Scheme on 'Flower'
Figure 5.20 Robustness Comparison for DEW and our Scheme on 'Girl'
Figure 5.21 Robustness Comparison for DEW and our Scheme on 'Women'
List of Tables
Table 3.1 Means of Different High Frequency Distortions
Table 5.1 Parameters for Experiment of RST Side Effect
Table 5.2 Bit Error Rate for Different RST
Table 5.3 Parameters for Experiment 1
Table 5.4 Parameters for Experiment 2
Table 5.5 Parameters for Experiment 3
Table 5.6 Parameters for Experiment 4
Table 5.7 Bit Error Rates for Different Cmin
Table 5.8 Parameters for Experiment 5
Table 5.9 PSNR of Images in Different n
Table 5.10 Parameters for Experiment 6
Chapter 1
Introduction
The great development of computer networks in the past decades, including LANs, WANs, the Internet, and wireless networks, has enabled fast and convenient communication between people. At the same time, duplicating and editing multimedia objects with high quality and fidelity has become easier as digital multimedia technologies advance. However, this ease of distributing and modifying media objects introduces new problems for copyright protection.
Cryptography was the first technology applied to protect transmitted digital multimedia content. The basic idea behind cryptography is to encrypt the plaintext into ciphertext and then send it to the receiver. Only a receiver holding the corresponding decryption function can recover the plaintext from the ciphertext. However, this framework protects the information only during transmission. After the multimedia object is decrypted, it may still be freely distributed by the receiver, so the object is no longer under protection. Digital watermarking is a technology that compensates for this shortcoming of the cryptographic framework. Its general idea is to embed identification information into the multimedia content and bind the two tightly together. By this means, the distributed multimedia content always carries the embedded information.
Digital watermarking and cryptography have very different requirements because they have different goals. For cryptography, robustness refers to security, which measures how well the algorithm protects the encrypted information from attackers who have no decryption keys. Robustness in digital watermarking has a different meaning: protecting the embedded watermark from being removed while the image content remains unchanged. In other words, attackers of a digital watermarking system have no interest in knowing the content of the watermark; they want to make it fail to work properly.
The robustness of watermarking systems has been extensively investigated for all kinds of multimedia objects, such as audio, image, and video. For image watermarking, many schemes guarantee that the watermark is robust to ordinary image processing operations such as scaling, cropping, rotation, transformation, or lossy compression. However, little work has been done on the robustness of image watermarking to the print-and-scan process. In some cases, print-and-scan is also regarded as an acceptable image processing operation, as it does not change the image content greatly. This thesis tries to understand the print-and-scan process and aims to design a watermarking scheme robust to the distortion that comes from this process.
This chapter begins with an overview of digital watermarking and then introduces the mathematical models of watermarking. After that, the related research on digital watermarking robust to the print-and-scan process is introduced. Finally, the organization of the whole thesis is given.
1.1 Digital Watermarking
1.1.1 Overview of Digital Watermarking
The history of watermarking can be traced back several hundred years, when it was used mainly for trademarks or decoration [1]. At that time, the watermark was created by chemical or physical methods during paper manufacturing. It was not until 1954 that Emil Hembrooke invented the first digital presentation of the watermark [2], and Komatsu and Tominaga used the term digital watermarking for the first time in 1988 [1]. In the 1990s, interest in this research topic began to grow rapidly because the easy distribution of media objects over computer networks introduced new problems for copyright protection. After a decade of development, the application of digital watermarking is no longer restricted to copyright protection. More and more applications have been proposed, such as ownership protection, data monitoring and tracking, data authentication, and copy control.
Although digital watermarking uses the term "watermarking", it shares only the property of invisibility with traditional watermarking. Digital watermarking has more requirements than traditional watermarking because it is used digitally. In fact, digital watermarking uses techniques similar to those of information hiding [3], which was used for secret communication between armies in wars thousands of years ago. Although information hiding, steganography, and digital watermarking are similar techniques for embedding information into multimedia objects, they have different goals and applications, resulting in different requirements. For example, robustness is commonly the main focus of digital watermarking, capacity is the main focus of information hiding, and secret communication is the main focus of steganography. References [1, 3, 4] give detailed explanations and comparisons of these terms. Note that the requirements of a watermarking system depend on the application, and some digital watermarking applications may take factors other than robustness into consideration. In this thesis, we mainly focus on robustness while keeping appropriate watermark capacity and visual image quality.
There are many ways to categorize digital watermarking technologies. Figure 1.1 gives a coarse classification. Different types of media require different digital watermarking technologies. For instance, real-time video and audio need fast watermark extraction, while images place fewer demands on extraction speed. The authors of [5] and [6] give a detailed comparison of the technologies for these media types.
Although watermarking technologies can be classified by perceptibility, perceptible watermarking is used only in image and video applications and never in audio. The reason lies in the different perception properties of the Human Visual System (HVS) and the Human Auditory System (HAS). The HVS samples the multimedia object in a scalable manner, perceives the information globally, and neglects local visual distortion. The HAS, in contrast, samples every local signal instantaneously at a fine scale, so it is very sensitive to even local audible distortion. Perceptible watermarking, or visible watermarking, simply overlays a visual pattern on the image or video as the watermark. This pattern is generally a trademark or a copyright logo that claims copyright ownership. The disadvantage of perceptible watermarking is that it greatly distorts the original image or video content. For this reason, imperceptible techniques have become more and more popular. Imperceptible watermarking embeds the watermark in locations to which the HVS or HAS is not sensitive, so the modification introduced by embedding is unlikely to be perceived.
Spatial domain and frequency domain watermarking schemes differ in their embedding domains. In general, frequency domain watermarking is more robust than spatial domain watermarking. Embedding a watermark bit in the frequency domain is similar to spreading the energy of this bit over many pixels in the spatial domain. A local spatial distortion then damages the watermark bit only partially, and the information about this bit hidden in other locations is probably enough for correct watermark extraction. We restrict our scheme to imperceptible image watermarking in the frequency domain.
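The spreading effect of frequency domain embedding can be illustrated with a minimal sketch. It assumes an 8x8 block and an orthonormal DCT-II, both chosen here only for illustration; the function names, the coefficient position (2, 1), and the change `delta` are hypothetical and are not taken from the scheme in this thesis.

```python
import numpy as np

def dct_matrix(n: int = 8) -> np.ndarray:
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)   # frequency index (rows)
    i = np.arange(n).reshape(1, -1)   # sample index (columns)
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    d[0, :] = np.sqrt(1.0 / n)        # DC row has a different scale factor
    return d

def spatial_footprint(u: int, v: int, delta: float, n: int = 8) -> np.ndarray:
    """Pixel-domain change caused by adding delta to DCT coefficient (u, v)."""
    d = dct_matrix(n)
    coeff_change = np.zeros((n, n))
    coeff_change[u, v] = delta
    # Inverse 2-D DCT for an orthonormal basis: block = D^T @ coeffs @ D
    return d.T @ coeff_change @ d

# Illustrative values only: modify one mid-to-low frequency coefficient.
footprint = spatial_footprint(u=2, v=1, delta=16.0)
touched = np.count_nonzero(np.abs(footprint) > 1e-9)  # pixels that changed
energy = np.sum(footprint ** 2)                       # preserved by orthonormality
print(touched, energy, np.max(np.abs(footprint)))
```

In this sketch a change to a single coefficient alters every pixel of the block by a small amount, while the total energy of the change is preserved. Damage to a few pixels therefore removes only part of the embedded change.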
For different applications, the requirements of digital watermarking are different. The most commonly discussed properties are fidelity, capacity, robustness, and security. Fidelity refers to the perceptual similarity between the watermarked version and the original version of the host data; capacity refers to the amount of information that can be embedded within the host data without unacceptable distortion; robustness refers to correct watermark detection even after acceptable signal processing operations are applied to the watermarked media; and security refers to resistance to malicious attacks such as unauthorized removal, embedding, and detection. Although the tradeoff between these properties varies across applications, robustness is a fairly common requirement for most watermarking applications. In this thesis we are mainly concerned with robustness, as well as capacity and fidelity.
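Fidelity and robustness are usually reported as numbers. The list of figures indicates that PSNR and bit error rate are used for this purpose later in the thesis; the sketch below shows the standard definitions of both, assuming 8-bit grayscale images (peak value 255). The function names are illustrative.

```python
import numpy as np

def psnr(original, watermarked, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    a = np.asarray(original, dtype=np.float64)
    b = np.asarray(watermarked, dtype=np.float64)
    mse = np.mean((a - b) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

def bit_error_rate(embedded_bits, extracted_bits) -> float:
    """Fraction of watermark bits that were extracted incorrectly."""
    a = np.asarray(embedded_bits)
    b = np.asarray(extracted_bits)
    if a.shape != b.shape:
        raise ValueError("bit sequences must have the same length")
    return float(np.mean(a != b))

# Illustrative usage with made-up values.
host = np.zeros((8, 8))
marked = host + 1.0  # every pixel changed by one gray level
print(psnr(host, marked))
print(bit_error_rate([1, 0, 1, 1], [1, 1, 1, 0]))
```

A higher PSNR indicates better fidelity, and a lower bit error rate after an attack indicates better robustness.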
Figure 1.1 Classification of Digital Watermarking
After decades of research on digital watermarking, more and more applications have been proposed and investigated. In general, these applications include ownership protection, data monitoring and tracking, data authentication, fingerprinting, copy/access control, and media indexing [1, 2, 7]. Ownership protection, or copyright protection, aims to prove the ownership of a multimedia object by embedding the owner's information into it. How to identify the true watermark when multiple watermarks are embedded in the media is a problem for this application. The purpose of data monitoring and tracking is to find out whether the data is transmitted or broadcast as required. For example, the total time an advertisement is broadcast by a radio or television station must be summed for billing purposes. Data authentication determines whether the host media has been tampered with. In authentication, the main concern is to prevent a tampered media object with a forged watermark from passing the authentication. Fingerprinting embeds different watermarks in the copies of a media object and distributes them to different customers. When a customer distributes his or her copy illegally, the owner of the multimedia object can identify the pirate. Access control aims to restrict a user's access to the protected data according to his or her privileges. For example, the access control detector in a Video Cassette Recorder (VCR) can prevent the VCR from recording a watermarked video when the embedded watermark declares this video as "never-copy". Indexing embeds distinct labels into video content to make these media objects easier to find.
1.1.2 Models of Digital Watermarking
In general, digital watermarking is modeled mathematically as a kind of digital communication [1, 8, 9, 10]. The watermarking system itself is a communication channel, and the watermark is the information transmitted through this channel. To understand the prototype of watermarking systems well, a discussion of their common models is necessary.
Cox gives three communication models for digital watermarking in [1]. In these models, the cover work plays different roles: pure noise, side information, or transmitted message. The watermark embedder and the watermark detector play the roles of modulator and demodulator, respectively.
Figure 1.2 Watermarking Model I: Watermark as Noise
In the first model (Figure 1.2), the media object, i.e., the cover work C_0, is regarded as noise in the communication channel. The watermarked media C_w is then the sum of the encrypted watermark m and the cover work C_0. The signal processing imposed on the watermarked media C_w is considered as additional noise n added to the communication channel. The watermark detector tries to eliminate the noise and obtain an estimate m_n of the input message m. If the cover work is known to the detector, the detector is called an informed detector; otherwise it is called a blind detector. A watermarking system with a blind detector is called a blind watermarking system.
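The first model can be sketched in a few lines. This is only an illustration under stated assumptions: the message is a single bit, it is encoded as a key-dependent pseudo-random pattern of +1 and -1 values, the channel noise n is Gaussian, and detection is by correlation. None of these choices is taken from the thesis, and all function names and parameter values are hypothetical.

```python
import numpy as np

def make_pattern(shape, key: int) -> np.ndarray:
    """Pseudo-random +/-1 reference pattern derived from a secret key (assumed)."""
    rng = np.random.default_rng(key)
    return rng.choice([-1.0, 1.0], size=shape)

def embed(cover: np.ndarray, bit: int, pattern: np.ndarray, strength: float = 3.0) -> np.ndarray:
    """Model I embedding: C_w = C_0 + (+/- strength) * pattern."""
    sign = 1.0 if bit == 1 else -1.0
    return cover + sign * strength * pattern

def channel(watermarked: np.ndarray, noise_std: float, rng) -> np.ndarray:
    """Processing after embedding, modeled as additive noise n."""
    return watermarked + rng.normal(0.0, noise_std, size=watermarked.shape)

def detect_blind(received: np.ndarray, pattern: np.ndarray) -> int:
    """Blind detector: the cover work C_0 is unknown and acts as noise."""
    centered = received - received.mean()
    return 1 if np.sum(centered * pattern) > 0 else 0

def detect_informed(received: np.ndarray, cover: np.ndarray, pattern: np.ndarray) -> int:
    """Informed detector: subtract the known cover work before correlating."""
    return 1 if np.sum((received - cover) * pattern) > 0 else 0

rng = np.random.default_rng(0)
# Random gray levels stand in for a real cover image C_0.
cover = rng.integers(0, 256, size=(256, 256)).astype(np.float64)
pattern = make_pattern(cover.shape, key=1234)
received = channel(embed(cover, 1, pattern), noise_std=5.0, rng=rng)
print(detect_blind(received, pattern), detect_informed(received, cover, pattern))
```

The informed detector removes the cover work exactly, so only the channel noise disturbs the correlation. The blind detector must treat the cover work itself as noise, which is why it needs either a stronger watermark or more samples to reach the same reliability.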
Figure 1.3 Watermarking Model II: Watermark as Noise and Side Information
In the second model (Figure 1.3), the cover work C_0 is regarded as side information known to the transmitter of a communication channel. The embedding and detecting procedures of this model are similar to those of the first one. As in the first model, the cover work C_0 is still considered as noise, but its noise effect can be minimized because the encoder also takes it as side information. The encoder can anticipate this noise effect and compensate for it on the embedder side before the cover work is added to the channel [1, 8].
Figure 1.4 Watermarking Model III: Watermark as Transmitted Messages
In the third model (Figure 1.4), the cover work is no longer considered as noise in the channel, but as another message to be transmitted through the channel along with the watermark message. This watermarking model multiplexes the watermark and the cover work into the same transmission channel and de-multiplexes them at the receiver. A traditional communication model has only one de-multiplexing module, which separates the messages by different values of a specific parameter. The common parameters used in multiplexed communication are time, frequency, or code sequence. The watermarking model, in contrast, uses two independent modules, i.e., human perception and the watermark detector, to de-multiplex the transmitted information.
Besides these communication models, Cox suggests a geometric model for digital watermarking [1], in which the watermark and the cover work are regarded as vectors in high-dimensional spaces. The watermarked object is the sum of the watermark vector and the cover work vector. That is, watermarking can be considered as moving a vector into a specific region of the space. In this thesis, we model watermarking as a multiplexed communication.
1.2 Print-and-Scan Process
As we are concerned with digital watermarking robust to the print-and-scan process, we need to know what the print-and-scan process is and what kind of noise it adds to the image.
The whole print-and-scan process consists of printing a digital image onto paper and scanning this paper to obtain a digital image again. Intuitively, this process is a D/A conversion followed by an A/D conversion, and its noise comes mainly from the conversion between digital and analog formats. Nevertheless, because many uncontrolled and unexpected distortions are introduced during print-and-scan, the noise model is more complex than that of a simple D/A and A/D conversion.
In general, the print-and-scan process includes two kinds of distortion: geometric distortion and pixel value distortion [17].
Geometric distortion is a severe attack on the synchronization of a blind digital watermarking system. Typical geometric distortion includes rotation, scaling, and translation (RST) applied globally to the whole image. These operations do not change the image content, but only the image's orientation, scale, or position. Besides these global geometric distortions, Kuhn and Petitcolas proposed the StirMark benchmark, which attacks the watermark with local geometric distortions while keeping the image content free of globally visible alterations [11]. Much research has been performed on global RST distortions, and the techniques generally fall into two categories. The first category inverts the geometric distortion and then extracts the watermark [12, 13, 14]. Pereira et al. proposed embedding a template into the middle-frequency FFT spectrum of the image as a reference, then inverting the geometric transformation according to the change in this template and extracting the watermark [13]. Kutter proposed repeating the watermark bits at different locations in a pattern, then localizing this pattern with a correlation function, identifying and inverting the affine transformation, and extracting the watermark [14]. The second category finds a transform space that is invariant to RST distortion and embeds the watermark directly in this space for better synchronization [15, 16]. Ruanaidh and Pun proposed embedding the watermark in the Fourier-Mellin transform domain [15]. To obtain this transform space for an image, a Fourier transform (FFT), a log-polar mapping (LPM), and a second Fourier transform (FFT) are applied in sequence. Nevertheless, the basic idea of this technique rests on the continuous Fourier transform, which is hard to compute for discrete images. Lin et al. proposed another method that yields a weaker but simpler invariant domain for watermarking [16]. Like Pun's scheme, this scheme applies the FFT and LPM in the first stage, but then takes a projection onto the angle axis in log-polar coordinates rather than another FFT. In this domain, the image is invariant to translation and scaling, while rotation becomes a cyclic shift. Although the image in this domain is not invariant to rotation, the watermark detector in this scheme knows the original watermark, and an exhaustive search can find the shift that compensates for the rotation.
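The invariance that both schemes rely on can be written out in a few lines. The derivation below is a standard textbook sketch in the continuous setting, not a reproduction of the formulas in [15] or [16]; the symbols x_0, y_0, sigma, alpha, rho, and theta are notation introduced here for illustration.

```latex
% Step 1: the Fourier magnitude is invariant to translation.
\[
  f'(x, y) = f(x - x_0,\, y - y_0)
  \;\Longrightarrow\;
  \lvert F'(u, v) \rvert = \lvert F(u, v) \rvert .
\]
% Step 2: enlarging the image by a factor \sigma and rotating it by \alpha
% shrinks the magnitude spectrum M(r, \theta) radially by \sigma and rotates
% it by \alpha (polar frequency coordinates r, \theta).
\[
  M'(r, \theta) = \sigma^{2}\, M(\sigma r,\; \theta - \alpha) .
\]
% Step 3: the log-polar mapping \rho = \ln r turns both effects into shifts.
\[
  M'(e^{\rho}, \theta) = \sigma^{2}\, M\!\left(e^{\rho + \ln \sigma},\; \theta - \alpha\right) .
\]
% Step 4: a second Fourier magnitude over (\rho, \theta) removes the shifts
% (Fourier-Mellin). Summing over \rho instead removes only the scale shift
% and leaves a cyclic shift in \theta.
```

Up to the constant amplitude factor, the result of the last step no longer depends on the translation, the scale factor, or (for the Fourier-Mellin variant) the rotation angle.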
In contrast to the plentiful research on RST, less research has investigated the properties of pixel value distortion and watermarking robust to it. Lin gave a general model for the pixel value distortion of the print-and-scan process in [17]. Although the detailed formulas of this model are not given, all types of noise involved in the print-and-scan process are taken into consideration. Nevertheless, this model needs calibration for each setting of printer and scanner, so it does not apply generally to all printers and scanners. Zhu et al. proposed using the pixel distortion of the print process as a signature for authenticating printed documents, because the distortions of the same pattern in different print processes are unique [18]. This reveals that the print-and-scan distortion is time-variant. The security of their scheme is based on the fact that the resolution of the microscope used for watermark extraction can be very high, while the printer cannot provide such resolution to counterfeit the unique printed pattern. This asymmetry guarantees that extracting the print signature is easy while reproducing it is almost impossible. In general, the pixel value distortion of print-and-scan is a nonlinear, content-dependent, and time-variant function. Reducing or inverting its effect at a fine granularity, as was done for geometric distortion, is difficult. Nevertheless, since human eyes recognize an original copy and a print-and-scan copy of the same image as similar, it is intuitive that some feature space within an image can be robust to print-and-scan. In other words, in the print-and-scan context, the feature space should be sought at a coarse granularity.
1.3 Motivation
There are many applications for watermarking robust to the print-and-scan process. Document authentication needs this technique in some cases, such as the authentication of passports, cheques, e-tickets, or ID cards. These types of images are simple in content, providing more space for embedding. Copyright protection for documents also needs this technique. For example, photographers publishing high-quality pictures in magazines might want to claim their ownership. Likewise, publishers who purchase the full copyright of photos might want to protect their copyright and guarantee that these photos are published only with their permission. However, few techniques currently support these applications.
On the other hand, research on this topic still needs further investigation. Currently, research related to print-and-scan distortion focuses mainly on rotation, scaling, and translation (RST) distortion, not on pixel value distortion. However, a watermarking system supporting the above applications cannot ignore this type of distortion. Research on this topic is therefore important and valuable.
In this thesis, we intend to understand the properties of pixel value distortion and to design a blind watermarking system that is robust to this print-and-scan distortion. As RST distortion is not our concern, we assume it is removed by other means before watermark extraction. For the design of the watermarking system, our strategy is to find a feature space in images that is stable under the print-and-scan process and to design a method to blindly embed the watermark in this space. To focus our research on print-and-scan distortion, we consider watermarking for grayscale images only.
1.4 Organization of thesis
The thesis is organized as follows. Chapter 2 gives a detailed introduction to the working principles of printers and scanners. In addition, the printer model and the scanner model are discussed in that chapter. In Chapter 3, we discuss the properties of the print-and-scan noise, based on the printer and scanner models. The results of Chapter 2 and Chapter 3 provide the basis for designing a robust watermarking system. Chapter 4 discusses our watermarking system, which is a variation of Differential Energy Watermarking (DEW). Chapter 5 presents the experimental results that evaluate the robustness of our watermarking system to print-and-scan distortion. The results include the robustness of our watermarking system for different images and a comparison of robustness between the original DEW and our system.
Chapter 2
Printer and Scanner
To design a watermarking scheme robust to the print-and-scan process, we need to know the working principles and the mathematical models of printers and scanners. In this chapter, we first introduce the working principles of printers and scanners, then their mathematical models, and finally the previous research related to our desired watermarking scheme.
Printers and scanners are image output and input systems, respectively. The optical principles of these imaging systems are presented in [19], and their structural components and functions are described in [20, 21]. To understand how they work, we need to know their internal components and how these components work together. Based on the distortions introduced by the different components within printers and scanners, we can derive the distortion models.
2.1 Working Principles of Printer
Printers can be categorized in several ways. For example, impact printers work by striking a head or a needle against an ink ribbon to make a mark on the paper, while non-impact printers exploit other techniques, such as physical principles from optics and electronics, to avoid these mechanical strikes [22]. Dot-matrix printers, daisy-wheel printers, and line printers are impact printers, which are noisier than non-impact printers such as inkjet printers and laser printers. Printers can also be classified as contone printers and halftone printers [19]. Contone printers can print continuous-tone images directly, but they are more expensive. Halftone printers convert a continuous-tone image into a binary image (i.e., a halftone image) and then print this binary image. Halftone printers are more popular because they are cheaper than contone printers while providing similar quality. Because of their popularity, we choose non-impact halftone printers for our experiments. In particular, we use laser halftone printers.
Figure 2.1 illustrates the major components of a laser printer and its six-step print process [20]. First, the photoreceptor drum is uniformly charged by the corona wires (1) and then partially discharged by the laser beam (2). The laser beam is generated according to the image or document being printed, and the partially discharged area on the drum represents the printing area of the binary image. After that, the toner hopper coats the drum surface with positively charged toner, which adheres only to the discharged area (3). The drum rolls over a sheet of paper that is negatively charged by another set of corona wires below it (4). As this charge is stronger than the negative charge on the drum in step (1), the toner is pulled away from the drum and adheres to the paper. The paper carrying the black toner is then sent to a fuser, which melts the toner and fuses it tightly to the paper (5). In the last step, the charge on the surface of the drum is cleared by a discharge lamp (6), and the drum is ready for the next print. The optical and electronic devices, including the mirror, the laser generator, and the corona wires, are delicate equipment. They do not themselves generate noise during printing, but they are very sensitive to small movements. Even a very small vibration of the mechanical devices can introduce distortion into the printed image. This mechanical distortion is random noise independent of the printed image.
Figure 2.1 Basic Components of Laser Printers
Besides the hardware distortion from mechanical devices, a printer also introduces software noise through the digital-to-analog conversion. The laser printer uses a digital halftoning technique to convert continuous-tone images into halftone images, which are represented by black and white dots only, before they are sent for printing. The laser beam is modulated by this binary image, not by the original continuous-tone image. The image quality of a halftone image is similar to that of its original because the Human Visual System (HVS) tends to spatially integrate the intensities of neighboring pixels within an image. Halftoning represents the gray intensity of a pixel in the original image by a group of small dots. The gray intensity decides how the dark dots and white dots are distributed in this group. Details about choosing the pattern of black and white dots can be found in [23]. When viewing a halftone image, the HVS integrates all the dots within this small group to recover the original gray-level information of the pixel. Thus we perceive an average reflectance value $R$, which can be modeled by the Murray-Davies equation:
$$R = F \cdot R_B + (1 - F) \cdot R_W \qquad (2.1)$$
where $F$ is the weighting factor, i.e. the fraction of the area covered by black dots, and $R_B$ and $R_W$ are the reflectance values of the black dots and the white paper respectively. The function in (2.1) is also called the tone transfer function. For pixels having the same gray intensity, $F$ varies with the surrounding pixels of different gray levels. The halftoning noise is therefore dependent on the image content, unlike the independent noise of the mechanical devices.
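The relation between a gray level, the dot pattern that represents it, and the perceived reflectance can be illustrated with a minimal sketch. It assumes the standard Murray-Davies form $R = F \cdot R_B + (1 - F) \cdot R_W$; the 4x4 ordered-dither (Bayer) matrix and the reflectance values 0.05 and 0.85 are illustrative assumptions, not the halftoning method of [23] or measured values.

```python
# 4x4 Bayer threshold matrix (an illustrative choice, not the method of [23]).
BAYER_4X4 = [
    [0, 8, 2, 10],
    [12, 4, 14, 6],
    [3, 11, 1, 9],
    [15, 7, 13, 5],
]


def halftone_cell(gray):
    """Render one gray level in [0, 1] (0 = black, 1 = white) as a 4x4 binary cell.

    Returns a 4x4 list in which 1 marks a black dot and 0 marks bare paper.
    """
    darkness = 1.0 - gray
    return [[1 if darkness > (t + 0.5) / 16.0 else 0 for t in row]
            for row in BAYER_4X4]


def dot_fraction(cell):
    """Fraction F of the cell area covered by black dots."""
    total = sum(len(row) for row in cell)
    return sum(sum(row) for row in cell) / total


def murray_davies(F, r_black=0.05, r_white=0.85):
    """Average reflectance R = F*R_B + (1 - F)*R_W (reflectances are assumed values)."""
    return F * r_black + (1.0 - F) * r_white


if __name__ == "__main__":
    for gray in (0.0, 0.25, 0.5, 0.75, 1.0):
        F = dot_fraction(halftone_cell(gray))
        print(f"gray={gray:.2f}  F={F:.4f}  R={murray_davies(F):.4f}")
```

In this idealized setting $R$ is exactly linear in $F$; a real printer departs from this line, which is the subject of the dot gain discussion.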
Figure 2.2 Dot Gain Effect of Printers
If the printing process were perfect, the tone transfer function $R$ in (2.1) would be a linear function of the weighting factor $F$. In practice, however, the tone transfer function is nonlinear, and the printed image is darker than expected (Figure 2.3, top). This is caused by another distortion called dot gain, which is defined as the increase in dot size when the image is printed (Figure 2.2). In general, there are two types of dot gain: mechanical dot gain and optical dot gain. Mechanical dot gain (Figure 2.2, middle) comes from the physical spreading of the toner as it is fused into the paper. Optical dot gain (Figure 2.2, right) refers to the optical growth of the dot size, which comes from the trapping of reflected light beneath the printed dot. Light projected directly onto the dot (e.g. light A) is absorbed entirely and produces no reflection. However, light projected near the dot may also be absorbed. For instance, light B in Figure 2.2 enters the paper, scatters laterally, and emerges under the dot. Thus, the dot has a higher probability of absorbing light than one would expect from its physical size; its optical size is larger than its physical size. Compensating for dot gain is similar to gamma correction in a CRT display device, because the transfer function of dot gain resembles the voltage-to-light curve of a CRT, which is a power-law transformation [24]. The power-law transformation has the common form [25]:
$$s = c\,r^{\alpha} \qquad (2.2)$$
where $c$ and $\alpha$ are positive constants. For common gamma errors of CRTs, $\alpha > 1$ (Figure 1.1, Plot 2), which darkens the image and loses details. To invert the gamma error, gamma correction computes $r' = r^{1/\alpha}$ before the transformation is applied, so that $s = c\,r'^{\alpha} = c\,r^{\frac{1}{\alpha} \cdot \alpha} = c\,r$; the transfer function becomes linear after gamma correction. Figure 1.1 illustrates the effect of gamma and gamma correction applied to the example image. In such cases, the tone transfer equation taking dot gain into account becomes
$$R^{1/\alpha} = F \cdot R_B^{1/\alpha} + (1 - F) \cdot R_W^{1/\alpha} \qquad (2.3)$$
which is called the Yule-Nielsen equation. Although this model seems well suited to correcting dot gain, each printer must be calibrated to find its own suitable parameter $\alpha$. The noise introduced by gamma correction is not only content-dependent but also nonlinear.
Figure 2.3 Gamma Correction. Top: Gamma Error; Bottom: Gamma Correction
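The darkening caused by a power-law transfer, its inversion by gamma correction, and the effect of the Yule-Nielsen parameter can be checked numerically. The sketch below assumes intensities normalized to [0, 1], uses the standard Yule-Nielsen form $R^{1/\alpha} = F R_B^{1/\alpha} + (1 - F) R_W^{1/\alpha}$, and picks the exponent 2.2 and the reflectance values purely for illustration.

```python
def power_law(r, c=1.0, alpha=2.2):
    """Power-law transfer s = c * r**alpha for a normalized intensity r in [0, 1]."""
    return c * r ** alpha


def gamma_correct(r, alpha=2.2):
    """Pre-distort r with the inverse exponent so that power_law(gamma_correct(r)) = c * r."""
    return r ** (1.0 / alpha)


def yule_nielsen(F, alpha, r_black=0.05, r_white=0.85):
    """Reflectance R solving R**(1/alpha) = F*R_B**(1/alpha) + (1 - F)*R_W**(1/alpha)."""
    inner = F * r_black ** (1.0 / alpha) + (1.0 - F) * r_white ** (1.0 / alpha)
    return inner ** alpha


if __name__ == "__main__":
    r = 0.5
    print("uncorrected:", power_law(r))                 # darker than r
    print("corrected:  ", power_law(gamma_correct(r)))  # approximately r again
    # alpha = 1 reduces to the linear Murray-Davies case; alpha > 1 predicts a darker print.
    for alpha in (1.0, 1.5, 2.0):
        print(f"alpha={alpha}: R={yule_nielsen(0.5, alpha):.4f}")
```

Because the right value of the parameter differs from printer to printer, such a correction only works after calibration, which is the limitation noted in the text.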
In [26], a noise model is proposed that takes the noise of halftoning, dot gain and mechanical dithering into consideration. However, this noise model is complicated and must be calibrated before it is applied to a printer. The calibration requires the original image to be available, which conflicts with our goal of designing a blind watermarking system.
2.2 Working Principles of Scanner
There are also many types of scanners, such as flying-spot scanners, drum scanners, and flatbed scanners, which differ in their mechanical sub-systems [27]. Most desktop scanners use charge-coupled device (CCD) arrays, photomultiplier tubes (PMT) or contact image sensors (CIS) as the light sensors. As CCD-based flatbed scanners are popular, we only discuss the principles and models of this kind of scanner.
The basic components of a flatbed scanner are the charge-coupled device (CCD), the lamp, and the analog-to-digital converter (ADC) (Figure 2.4). During the scanning process, a scan head inside the scanner moves slowly across the flatbed from one side to the other. The moving scan head samples the image to reconstruct its digital form. The scan head is made up of a cathode fluorescent lamp, mirrors, a lens and a CCD array. The lamp illuminates the scanned image, and the reflected light is led to the CCD array by the mirrors and the lens. A CCD is a collection of tiny diodes that convert photons (light) into electrons (electrical charge). The brighter the light that hits a single diode, the greater the electrical charge generated at that diode. After the CCD array absorbs the light reflected by the image, it generates the corresponding electrical voltage signals and sends them to the ADC, which converts the analog signals into digital signals. The image processor then processes the received digital signals to reconstruct the image and sends it to the PC for further processing.
Figure 2.4 Basic Components of Flatbed Scanners
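The signal path from reflected light to digital code can be sketched in a few lines. This is a deliberately simplified illustration: the linear diode response, the `gain` and `dark_current` parameters, and the 8-bit depth are assumptions made for the example, not properties of any particular scanner.

```python
def ccd_charge(light, gain=1.0, dark_current=0.0):
    """Charge collected by one diode; grows with the incident light level in [0, 1].

    A linear response is assumed here purely for illustration.
    """
    return gain * light + dark_current


def adc(voltage, bits=8, v_max=1.0):
    """Quantize an analog voltage in [0, v_max] to an integer code of the given bit depth."""
    levels = (1 << bits) - 1
    clipped = min(max(voltage, 0.0), v_max)
    return int(clipped / v_max * levels + 0.5)


def scan_line(reflectances, bits=8):
    """Convert one line of reflected light levels into digital pixel codes."""
    return [adc(ccd_charge(r), bits) for r in reflectances]


if __name__ == "__main__":
    print(scan_line([0.0, 0.25, 0.5, 0.75, 1.0]))
```

Even in this idealized chain the rounding in `adc` discards information, which is one source of the sampling distortion attributed to the ADC.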
From the scanning process, we can see that scanner noise comes from the following sources: vibration or movement of the mechanical sub-system, dirt or bias in the optical sub-system, unbalanced illumination from the lamp, sensitivity distortion of the CCD, and sampling distortion of the ADC. [19] gives a scanner sensor model for the response at a single spatial location:
$$t_i^s = \int f(\lambda)\, d(\lambda)\, r(\lambda)\, l_s(\lambda)\, \mathrm{d}\lambda + \varepsilon_i \qquad (2.4)$$
where $f(\lambda)$ is the spectral transmittance of the color filters, $d(\lambda)$ is the sensitivity of the detector (i.e. the CCD) used in the measurement, $l_s(\lambda)$ is the spectral radiance of the illuminant (i.e. the lamp), $r(\lambda)$ is the spectral reflectance of the area being scanned, $\varepsilon_i$ is the measurement noise, and $t_i^s$ denotes the value obtained from the $i$th channel. This model describes the light and color parameters in a high degree of detail. In [27], the author gives a coarser model for the CCD array output:
$$g(x, y) = \left[ f(x, y) * h(x, y) \right] \cdot s(x, y) \qquad (2.5)$$
where $f(x, y)$ is the input image, $h(x, y)$ is the sensor aperture response, and $s(x, y)$ is the sampling function. That is, $g(x, y)$ is the convolution of $f(x, y)$ and $h(x, y)$, which is then multiplied by the sampling function $s(x, y)$. The scanning process is thus similar to a convolution in which the high-frequency components of the image are erased; that is, the scanner produces a blurred copy of the printed document. These models indicate that the scanner noise is content-dependent and nonlinear, which is similar to the printer noise.
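The coarser CCD model, a convolution of the input image with the aperture response followed by multiplication with a sampling function, can be sketched directly. The 3x3 box aperture and the regular sampling grid with spacing `step` are illustrative assumptions; the actual aperture response of a scanner is not specified here.

```python
def convolve2d_same(image, kernel):
    """2-D convolution f * h with zero padding; the output has the same size as `image`."""
    rows, cols = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    r_off, c_off = k_rows // 2, k_cols // 2
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            acc = 0.0
            for i in range(k_rows):
                for j in range(k_cols):
                    yy, xx = y - (i - r_off), x - (j - c_off)
                    if 0 <= yy < rows and 0 <= xx < cols:
                        acc += kernel[i][j] * image[yy][xx]
            out[y][x] = acc
    return out


def sampling_grid(rows, cols, step):
    """Sampling function s(x, y): 1 on every `step`-th row and column, 0 elsewhere."""
    return [[1.0 if y % step == 0 and x % step == 0 else 0.0 for x in range(cols)]
            for y in range(rows)]


def scan(image, kernel, step=2):
    """g(x, y) = [f(x, y) * h(x, y)] . s(x, y)."""
    blurred = convolve2d_same(image, kernel)
    grid = sampling_grid(len(image), len(image[0]), step)
    return [[b * s for b, s in zip(b_row, s_row)]
            for b_row, s_row in zip(blurred, grid)]


if __name__ == "__main__":
    # A single bright pixel in a 5x5 image, blurred by a 3x3 box aperture.
    image = [[0.0] * 5 for _ in range(5)]
    image[2][2] = 1.0
    box = [[1.0 / 9.0] * 3 for _ in range(3)]
    for row in scan(image, box, step=2):
        print(["%.3f" % v for v in row])
```

The single bright pixel is spread over its neighborhood before sampling, so the sharp detail of the input cannot be recovered exactly from the output.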
2.3 Related Work on Watermarking
Due to the variations in the print-and-scan process, the research in the literature only partially solves the problems of watermarking that is robust to the print-and-scan process. As previous research on rotation, scaling and translation (RST) has been introduced in Chapter 1, in this section we review the related work exploring the pixel value distortion of the print-and-scan process.
Cox [10] proposes a watermarking scheme exploiting the spread-spectrum idea used in digital communication. The watermark bits are embedded into the coefficients $c_i$ in the frequency domain with a given embedding strength. To enhance the robustness of the watermarking scheme, the $c_i$ are selected from the n most significant coefficients of the frequency domain. The watermark is then extracted by referring to the original image. This requirement is impractical for many applications. Techniques similar to Cox's scheme are proposed in [
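The reliance on the original image in spread-spectrum extraction can be made concrete with a small sketch. The multiplicative rule $c'_i = c_i (1 + a\,w_i)$, the Gaussian watermark sequence, and the normalized-correlation detector are the commonly cited form of this kind of scheme and are assumptions here; the coefficient values are random stand-ins rather than real transform coefficients.

```python
import math
import random


def embed(coeffs, watermark, strength=0.1):
    """Multiplicative embedding c'_i = c_i * (1 + strength * w_i) (assumed form)."""
    return [c * (1.0 + strength * w) for c, w in zip(coeffs, watermark)]


def extract(received, original, strength=0.1):
    """Non-blind extraction: invert the embedding using the ORIGINAL coefficients."""
    return [(r / c - 1.0) / strength for r, c in zip(received, original)]


def similarity(extracted, watermark):
    """Normalized correlation <X*, X> / sqrt(<X*, X*>) between extracted and candidate marks."""
    dot = sum(e * w for e, w in zip(extracted, watermark))
    norm = math.sqrt(sum(e * e for e in extracted))
    return dot / norm if norm > 0 else 0.0


if __name__ == "__main__":
    rng = random.Random(0)
    n = 1000
    # Hypothetical stand-ins for the n most significant frequency coefficients.
    coeffs = [rng.uniform(50.0, 500.0) for _ in range(n)]
    watermark = [rng.gauss(0.0, 1.0) for _ in range(n)]
    unrelated = [rng.gauss(0.0, 1.0) for _ in range(n)]

    marked = embed(coeffs, watermark)
    recovered = extract(marked, coeffs)
    print("embedded mark: ", similarity(recovered, watermark))
    print("unrelated mark:", similarity(recovered, unrelated))
```

Note that `extract` cannot be evaluated without `original`, which is exactly why such a scheme is not blind.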
In Lin's print-and-scan model, $I(x, y)$ is the intensity value at position $(x, y)$ of the original image $I$, and $I'(x, y)$ is the intensity value of the distorted image. $\tau_{ps}(x, y)$ is the global point spread function, which is the convolution of the printer spread function $\tau_p(x, y)$ and the scanner spread function $\tau_s(x, y)$. $\tau_h(x, y)$ is a high-pass filter representing the high noise variance near the edges, and $N_1$ is white Gaussian random noise. Lin claims that $\tau_h(x, y)$ is not symmetric in the vertical and horizontal directions, since the scan head moved by a stepper motor introduces directional jitter noise. $s(x, y)$ is the sampling function, and $K(I)$ is the responsive function:
$$K(I) = \alpha \cdot (I - \beta_x)^{\gamma} + \beta_k + N(I) \qquad (2.10)$$
This function models the noise of the A/D and D/A conversions and the gamma adjustments in the printer and scanner. This model is similar to the printer and scanner models described in Sections 2.1 and 2.2. Lin proposes to estimate the parameters $\alpha$, $\gamma$, $\beta_x$ and $\beta_k$ in equation (2.10) by minimizing the mean square error (MSE). The watermarking scheme in Lin's paper is designed mainly for robustness to RST distortion. It is similar to O'Ruanaidh's scheme in [15], which embeds the watermark in the Fourier-Mellin transform domain of the image, but differs in using the discrete Fourier transform instead of the continuous Fourier transform when calculating the Fourier-Mellin transform. However, this method also needs the original image as a reference to calibrate the estimated parameters of the print-and-scan model. Lin's experimental results focus mainly on the rotation, scaling and cropping (RSC) distortion of print-and-scan, and lack an evaluation of the system's robustness to pixel value distortion.
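The responsive function of equation (2.10) and an MSE-based estimate of its parameters can be sketched as follows. The sketch uses the noise-free part $\alpha (I - \beta_x)^{\gamma} + \beta_k$, assumes intensities normalized to [0, 1], and substitutes an exhaustive grid search for whatever optimizer was actually used; the grid values are hypothetical.

```python
from itertools import product


def response(I, alpha, gamma, beta_x, beta_k):
    """Noise-free part of K(I) = alpha * (I - beta_x)**gamma + beta_k."""
    base = max(I - beta_x, 0.0)  # clamp so that fractional powers stay real
    return alpha * base ** gamma + beta_k


def mean_square_error(params, originals, observed):
    """MSE between the modeled response of the original pixels and the scanned pixels."""
    errors = [(response(i, *params) - o) ** 2 for i, o in zip(originals, observed)]
    return sum(errors) / len(errors)


def fit_by_grid_search(originals, observed, grid):
    """Return the (alpha, gamma, beta_x, beta_k) tuple from the grid with the lowest MSE."""
    candidates = product(grid["alpha"], grid["gamma"], grid["beta_x"], grid["beta_k"])
    return min(candidates, key=lambda p: mean_square_error(p, originals, observed))


if __name__ == "__main__":
    grid = {
        "alpha": [0.8, 1.0, 1.2],
        "gamma": [1.0, 1.5, 2.2],
        "beta_x": [0.0, 0.05],
        "beta_k": [0.0, 0.02],
    }
    originals = [i / 10.0 for i in range(1, 10)]
    # Simulated "scanned" values generated from known parameters.
    observed = [response(i, 1.2, 2.2, 0.05, 0.02) for i in originals]
    print(fit_by_grid_search(originals, observed, grid))
```

The fit takes both `originals` and `observed` as input, which illustrates why this calibration is incompatible with a blind detector.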
In [18], Zhu et al. design an authentication system that utilizes the random property of the printing process. Though this watermarking system is not designed for robustness to print-and-scan, it demonstrates the time-variant property of the print-and-scan process. The authors claim that the randomness in the printing process is non-repeatable even for the same content, and propose to use this unique randomness as a robust feature for embedding a watermark for document authentication. To embed the watermark, a predefined print pattern is first printed on the document; a digital microscope is then used to extract the print signature from this printed pattern; the print feature is combined with the critical information about the document to generate a specific digital signature (specific to the print and to the document); finally, this digital signature is printed as a barcode on the document. To authenticate the document, the feature extraction is repeated on the printed pattern to obtain a newly extracted feature, which is combined with the critical information to generate a new digital signature. This new digital signature is then compared with the old one printed as the barcode to reach the final authentication decision. The security of their framework is provided by the asymmetry between examining and reproducing the printed pattern. It is easy to obtain the shape profile (i.e. the extracted signature) of the printed pattern with a cheap high-resolution microscope, but a printer able to reproduce the same printed pattern would need an extremely high resolution, which cannot be achieved even by the best printers at present. Therefore, a security leak seems impossible in a practical sense. However, this method is very sensitive because a high-resolution microscope is used to obtain the extracted signature. Any unintentional distortion of the printed pattern on an authenticated document would lead to authentication failure. For example, if an e-ticket is folded just across the printed pattern, the newly extracted feature is probably different from the original one and the authentication fails. Furthermore, the printed pattern needs to be protected during watermark embedding, since the second print, which generates the barcode of the digital signature, would probably distort it. This paper reveals the time-variant property of the print-and-scan noise, which is not captured by the printer and scanner models introduced in the previous sections of this chapter.
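The embed-and-verify flow described above, and its sensitivity to any change in the extracted feature, can be illustrated with a short sketch. An HMAC with a shared key stands in for the digital signature purely so that the example runs with the Python standard library; the signature algorithm, the feature encoding, and the names `print_feature` and `critical_info` are assumptions, not details taken from [18].

```python
import hashlib
import hmac


def sign(print_feature: bytes, critical_info: bytes, key: bytes) -> str:
    """Combine the print feature with the document information and sign the result.

    HMAC-SHA256 is a stand-in for the digital signature of the real system.
    """
    message = len(print_feature).to_bytes(4, "big") + print_feature + critical_info
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def authenticate(new_feature: bytes, critical_info: bytes, barcode: str, key: bytes) -> bool:
    """Re-extract the feature, regenerate the signature, and compare it with the barcode."""
    return hmac.compare_digest(sign(new_feature, critical_info, key), barcode)


if __name__ == "__main__":
    key = b"hypothetical-signing-key"
    info = b"ticket 42, seat 7A"
    feature = bytes([12, 200, 31, 77])    # hypothetical quantized shape profile
    barcode = sign(feature, info, key)    # printed on the document at embedding time

    print(authenticate(feature, info, barcode, key))                     # unchanged pattern
    print(authenticate(bytes([12, 201, 31, 77]), info, barcode, key))    # slightly distorted
```

A one-unit change in a single feature value is enough to make verification fail, which mirrors the fragility of the method under folding or other unintentional distortion.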
2.4 Summary
In this chapter, we introduce the working principles of laser printers and flatbed CCD-based scanners and highlight their mathematical models suggested by different researchers. We also discuss some other digital watermarking schemes related to the print-and-scan process.
From the above discussion, we can see that the noise model of the print-and-scan process is complex. The noise is content-dependent, nonlinear and time-variant. It is impossible to remove this noise completely if the original image is not available. Even a print-and-scan model can only estimate the noise, and it needs the original image for calibration before it is put into use. Therefore, to design a blind watermarking system, we cannot use these models directly; at the least, a strategy of removing the noise or calibrating the models is unacceptable. We choose another strategy: looking for a feature space within the image that is stable under print-and-scan noise and then embedding the watermark into this space. Such a feature space should suffer the least distortion from the print-and-scan process, so that a watermark embedded in it is well preserved and robust to print-and-scan distortion.