MINISTRY OF EDUCATION AND TRAINING
HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY

Nguyen Huy Truong

RESEARCH ON DEVELOPMENT OF METHODS OF GRAPH THEORY AND AUTOMATA IN STEGANOGRAPHY AND SEARCHABLE ENCRYPTION

DOCTORAL DISSERTATION IN MATHEMATICS AND INFORMATICS

Hanoi - 2020
Major: Mathematics and Informatics
Major code: 9460117

SUPERVISORS:
1. Assoc. Prof. Dr. Sc. Phan Thi Ha Duong
2. Dr. Vu Thanh Nam
I am extremely grateful to Assoc. Prof. Dr. Sc. Phan Thi Ha Duong.
I want to thank Dr. Vu Thanh Nam.
I would also like to extend my deepest gratitude to the late Assoc. Prof. Dr. Phan Trung Huy.
I would like to thank my co-workers from the School of Applied Mathematics and Informatics, Hanoi University of Science and Technology, for all their help.
I also wish to thank the members of the Seminar on Mathematical Foundations for Computer Science at the Institute of Mathematics, Vietnam Academy of Science and Technology, for their valuable comments and helpful advice.
I give thanks to the PhD students of the late Assoc. Prof. Dr. Phan Trung Huy for sharing and exchanging information on steganography and searchable encryption.
Finally, I must also thank my family for supporting all my work.
LIST OF SYMBOLS
LIST OF ABBREVIATIONS
LIST OF FIGURES
LIST OF TABLES
INTRODUCTION
CHAPTER 1 PRELIMINARIES
1.1 Basic Structures
1.1.1 Strings
1.1.2 Graph
1.1.3 Deterministic Finite Automata
1.1.4 The Galois Field $GF(p^m)$
1.2 Digital Image Steganography
1.3 Exact Pattern Matching
1.4 Longest Common Subsequence
1.5 Searchable Encryption
CHAPTER 2 DIGITAL IMAGE STEGANOGRAPHY BASED ON THE GALOIS FIELD USING GRAPH THEORY AND AUTOMATA
2.1 Introduction
2.2 The Digital Image Steganography Problem
2.3 A New Digital Image Steganography Approach
2.3.1 Mathematical Basis based on The Galois Field
2.3.2 Digital Image Steganography Based on The Galois Field $GF(p^m)$ Using Graph Theory and Automata
2.4 The Near Optimal and Optimal Data Hiding Schemes for Gray and Palette Images
2.5 Experimental Results
2.6 Conclusions
CHAPTER 3 AN AUTOMATA APPROACH TO EXACT PATTERN MATCHING
3.1 Introduction
3.2 The New Algorithm - The MRc Algorithm
3.3 Analysis of The MRc Algorithm
3.4 Experimental Results
3.5 Conclusions
CHAPTER 4 AUTOMATA TECHNIQUE FOR THE LONGEST COMMON SUBSEQUENCE PROBLEM
4.1 Introduction
4.2 Mathematical Basis
4.3 Automata Models for Solving The LCS Problem
4.4 Experimental Results
4.5 Conclusions
CHAPTER 5 CRYPTOGRAPHY BASED ON STEGANOGRAPHY AND AUTOMATA METHODS FOR SEARCHABLE ENCRYPTION
5.1 Introduction
5.2 A Novel Cryptosystem Based on The Data Hiding Scheme (2, 9, 8)
5.3 Automata Technique for Exact Pattern Matching on Encrypted Data
5.4 Automata Technique for Approximate Pattern Matching on Encrypted Data
5.5 Conclusions
CONCLUSION
LIST OF PUBLICATIONS
BIBLIOGRAPHY
LIST OF SYMBOLS
$\Sigma$  An alphabet
$\Sigma^*$  The set of all strings on $\Sigma$
$\epsilon$  The empty string
$|S|$  The number of elements of a set $S$
$|u|$  The length of a string $u$
$GF(p^m)$  The Galois field constructed from the polynomial ring $Z_p[x]$, where $p$ is prime and $m$ is a positive integer
$(GF^n(p^m), +, \cdot)$  A vector space over the field $GF(p^m)$
$LCS(p, x)$  A longest common subsequence of $p$ and $x$
$lcs(p, x)$  The length of a $LCS(p, x)$
$LeftID(u)$  The least element, the leftmost location of $u$
$Rm_p(u)$  The last component of $LeftID(u)$ in $p$
$(I, M, K, Em, Ex)$  A data hiding scheme
$I$  A set of all image blocks with the same size and image format
$M$  A finite set of secret elements
$Em$  An embedding function that embeds a secret element in an image block
$Ex$  An extracting function that extracts an embedded secret element from an image block
$q_{colour}$  The number of different ways to change the colour of each pixel in an arbitrary image block
$Adjacent(cp, a)$  An adjacent vertex of $cp$
$c$ block  A string of length $c$
$Pos_p(z)$  The last position of appearance of $z$ in $p$
$M_p$  An automaton accepting the pattern $p$
$Config(p)$  The set of all the configurations of $p$
$W_p(u)$  The weight of $u$ in $p$
$W_p(C)$  The weight of $C$
$WConfig(p)$  The set of the weights of all the configurations of $p$
$W_p^i(a)$  The weight of $a$ at the location $i$ in $p$
$Wm_p(a)$  The heaviest weight of $a$ in $p$
$W(a)$  The weight of $a$ in $p$
LIST OF ABBREVIATIONS
AOSO Average Optimal Shift Or
BNDM Backward Nondeterministic Dawg Matching
EBOM Extended Backward Oracle Matching
FOPA Fastest Optimal Parity Assignment
HCIH High Capacity of Information Hiding
MSDR Maximal Secret Data Ratio
PSNR Peak Signal to Noise Ratio
SAE Searchable Asymmetric Encryption
SSE Searchable Symmetric Encryption
TVSBS Thathoo Virmani Sai Balakrishnan Sekar
LIST OF FIGURES
Figure 1.1 A simple graph
Figure 1.2 A spanning tree of the graph given in Figure 1.1
Figure 1.3 The transition diagram of A in Example 1.3
Figure 1.4 The basic diagram of digital image steganography
Figure 1.5 The degree of appearance of the pattern p
Figure 2.1 The nine commonly used 8-bit gray cover images sized 512 × 512 pixels
Figure 2.2 The nine commonly used 8-bit palette cover images sized 512 × 512 pixels
Figure 2.3 The binary cover image sized 2592 × 1456 pixels
Figure 3.1 Sliding window mechanism
Figure 3.2 The basic idea of the proposed approach
Figure 3.3 The transition diagram of the automaton $M_p$, p = abcba
LIST OF TABLES
Table 1.1 An adjacency list representation of the simple graph given in Figure 1.1
Table 1.2 The performing steps of the BF algorithm
Table 1.3 The dynamic programming matrix L
Table 2.1 Elements of the Galois field $GF(2^2)$ represented by binary strings and decimal numbers
Table 2.2 Operations $+$ and $\cdot$ on the Galois field $GF(2^2)$
Table 2.3 The representation of E and the arc weights of G for the gray image
Table 2.4 The payload, ER and PSNR for the optimal data hiding scheme $(1, 2^n - 1, n)$ for palette images with $q_{colour} = 1$
Table 2.5 The payload, ER and PSNR for the near optimal data hiding scheme (2, 9, 8) for gray images with $q_{colour} = 3$
Table 2.6 The payload, ER and PSNR for the near optimal data hiding scheme (2, 9, 8) for palette images with $q_{colour} = 3$
Table 2.7 The comparisons of embedding and extracting time between the chapter's and Chang et al.'s approach for the same optimal data hiding scheme $(1, N, \lfloor \log_2(N + 1) \rfloor)$, where $N = 2^n - 1$, for the binary image with $q_{colour} = 1$. Time is given in seconds
Table 3.1 The performing steps of the MR1 algorithm
Table 3.2 Experimental results on rand4 problem
Table 3.3 Experimental results on rand8 problem
Table 3.4 Experimental results on rand16 problem
Table 3.5 Experimental results on rand32 problem
Table 3.6 Experimental results on rand64 problem
Table 3.7 Experimental results on rand128 problem
Table 3.8 Experimental results on rand256 problem
Table 3.9 Experimental results on a genome sequence (with $|\Sigma| = 4$)
Table 3.10 Experimental results on a protein sequence (with $|\Sigma| = 20$)
Table 4.1 The $Ref_p$ of p = bacdabcad
Table 4.2 The comparisons of the lcs(p, x) computation time for n = 50666
Table 4.3 The comparisons of the lcs(p, x) computation time for n = 102398
INTRODUCTION
In modern life, as the use of computers and the Internet becomes more and more essential, digital data (information) can be copied as well as accessed illegally. As a result, information security becomes increasingly important. There are two popular methods to provide security: cryptography and data hiding [2, 5, 6, 20, 56, 62, 81]. Cryptography is used to encrypt data in order to make the data unreadable by a third party [5]. Data hiding is used to embed data in digital media. Based on the purpose of the application, data hiding is generally divided into steganography, which hides the existence of data to protect the embedded data, and watermarking, which protects the copyright ownership and authentication of the digital media carrying the embedded data.
Steganography can be used as an alternative to cryptography. However, steganography becomes weak if attackers detect the existence of hidden data. Hence, integrating cryptography with steganography is a third choice for data security [2, 5, 6, 12, 19, 61, 62, 81, 86, 93].
With the rapid development of applications based on Internet infrastructure, cloud computing has become one of the hottest topics in the information technology area. Indeed, it is a computing system based on the Internet that provides on-demand services ranging from application and system software to storage and data processing. For example, when cloud users use the storage service, they can upload information to the servers and then access it online. Meanwhile, enterprises can avoid spending large sums on owning and maintaining a system consisting of hardware and software. Although cloud computing brings many benefits for individuals and organizations, cloud security is still an open problem, since cloud providers can abuse users' information and cloud users lose control of it. Thus, guaranteeing the privacy of tenants' information without negating the benefits of cloud computing seems necessary [28, 38, 40, 41, 60, 95, 102]. In order to protect cloud users' privacy, sensitive data need to be encrypted before being outsourced to servers. Unfortunately, encryption makes searching on ciphertext much more difficult for the servers than searching on plaintext. To solve this problem, many searchable encryption techniques have been presented since 2000. Searchable encryption not only stores users' encrypted data securely but also allows information search over ciphertext [26, 28, 29, 38, 40, 60, 71, 85, 102].
Searchable encryption for exact pattern matching is a new class of searchable encryption techniques. The solutions for this class have been presented based on algorithms for [26] or approaches to [41, 89] exact pattern matching.
As in retrieving information from plaintexts, the development of searchable encryption with approximate string matching capability is necessary, where the search string can be a keyword determined, encrypted and stored in cloud servers, or an arbitrary pattern [28, 40, 71].
From the above problems, together with the high efficiency of techniques using graphs and automata proposed by P. T. Huy et al. for dealing with the problems of exact pattern matching (2002), longest common subsequence (2002) and steganography (2011, 2012 and 2013), as well as the potential applications of graph theory and automata approaches suggested by the late Assoc. Prof. Phan Trung Huy in steganography and searchable encryption, and under the direction of the supervisors, the dissertation title assigned is "Research on development of methods of graph theory and automata in steganography and searchable encryption".
The purpose of the dissertation is to develop new, high-quality solutions using graph theory and automata, to suggest their applications, and to apply them to steganography and searchable encryption.
Based on the published results and the suggestions presented by the late Assoc. Prof. Phan Trung Huy in steganography and searchable encryption, the dissertation focuses on the following four problems in these fields:
- Digital image steganography;
- Exact pattern matching;
- Longest common subsequence;
- Searchable encryption.
The first problem is newly stated in Chapter 2; the three remaining problems are recalled and clarified in Chapter 1. In addition, the background related to these problems is presented and analysed in the corresponding chapters of the dissertation.
For the first three problems, the dissertation's work is to find new and efficient solutions using graph theory and automata. These solutions are then applied to solve the last problem.
The dissertation is structured as follows. Apart from the Introduction at the beginning and the Conclusion at the end, the main content is divided into five chapters.
Chapter 1. Preliminaries. This chapter recalls the basic knowledge used throughout the dissertation (strings, graphs, deterministic finite automata, digital images, the basic model of digital image steganography, some parameters to determine the quality of digital image steganography, the exact pattern matching problem, the longest common subsequence problem, and searchable encryption), and re-presents important concepts and results that are used and developed further in the remaining chapters (adjacency lists, breadth first search, the Galois field, the fastest optimal parity assignment method, the module method and the concept of the maximal secret data ratio, the concept of the degree of fuzziness (appearance), the Knapsack Shaking approach, and the definition of a cryptosystem).
Chapter 2. Digital image steganography based on the Galois field using graph theory and automata. Firstly, starting from some proposed concepts of optimal and near optimal secret data hiding schemes, this chapter states the problem of interest in digital image steganography. Secondly, the chapter proposes a new approach based on the Galois field using graph theory and automata to design a general form of steganography in binary, gray and palette images, shows sufficient conditions for the existence of, and proves the existence of, some optimal and near optimal secret data hiding schemes, applies the proposed schemes to the process of hiding a finite sequence of secret data in an image, and gives security analyses. Finally, the chapter presents experimental results to show the efficiency of the proposed results.
Chapter 3. An automata approach to exact pattern matching. This chapter proposes a flexible approach using automata to design an algorithm for exact pattern matching that is effective in practice. For given cases of patterns and alphabets, the efficiency of the proposed algorithm is shown by theoretical analyses and experimental results.
Chapter 4. Automata technique for the longest common subsequence problem. This chapter proposes two efficient algorithms, one sequential and one parallel, for computing the length of a longest common subsequence of two strings in practice, using the automata technique. The theoretical analysis of the parallel algorithm and the experimental results confirm that the use of the automata technique in designing algorithms for solving the longest common subsequence problem is the best choice.
Chapter 5. Cryptography based on steganography and automata methods for searchable encryption. This chapter first proposes a novel cryptosystem with high security, based on a data hiding scheme proposed in Chapter 2. Additionally, unlike existing hybrid techniques of cryptography and steganography, the ciphertexts do not depend on the input image size, and encoding and embedding are done at once. The chapter then applies the automata-based results of Chapters 3 and 4 to construct two algorithms for exact and approximate pattern matching on secret data encrypted by the proposed cryptosystem. These algorithms have $O(n)$ time complexity in the worst case, under the assumption that the approximate algorithm uses $\lceil (1 - \epsilon) m \rceil$ processors, where $\epsilon$, $m$ and $n$ are the error of the string similarity measure proposed in this chapter and the lengths of the pattern and the secret data, respectively. In searchable encryption, the cryptosystem can be used to encode and decode secret data on the users' side, and the pattern matching algorithms can be used to perform pattern search on the cloud providers' side.
The contents of the dissertation are based on the paper [T1] published in 2019, the paper [T4] accepted for publication in 2020 in KSII Transactions on Internet and Information Systems (ISI), and the papers [T2, T3] published in Journal of Computer Science and Cybernetics in 2019. The main results of the dissertation have been presented at:
- Seminar on Mathematical Foundations for Computer Science at Institute of Mathematics, Vietnam Academy of Science and Technology,
- The 9th Vietnam Mathematical Congress, Nha Trang, August 14-18, 2018,
- Seminar at School of Applied Mathematics and Informatics, Hanoi University of Science and Technology.
CHAPTER 1 PRELIMINARIES
This chapter recalls the terminology, concepts, algorithms and results needed to present the dissertation's new results clearly and logically, and to help readers follow the content of the dissertation easily. The background knowledge re-presented here consists of basic structures (Section 1.1: strings (Subsection 1.1.1), graph (Subsection 1.1.2), deterministic finite automata (Subsection 1.1.3), and the Galois field $GF(p^m)$ (Subsection 1.1.4)), digital image steganography (Section 1.2), exact pattern matching (Section 1.3), longest common subsequence (Section 1.4) and searchable encryption (Section 1.5).
1.1 Basic Structures
1.1.1 Strings
A string $x$ on an alphabet $\Sigma$ is written as
$$x = x[1]x[2]\ldots x[n], \quad x[i] \in \Sigma, \quad 1 \le i \le n,$$
where $n$ is a positive integer.
A special string is the empty string, which has no letters and is denoted by $\epsilon$. The length of the string $x$ is the number of letters in it, denoted by $|x|$. Then $|\epsilon| = 0$.
Notice that for the string $x = x[1]x[2]\ldots x[n]$, we can also write $x = x[1..n]$ for short. The set of all strings on the alphabet $\Sigma$ is denoted by $\Sigma^*$. The operator on strings is concatenation, which writes strings one after another as a compound. The concatenation of the two strings $u_1$ and $u_2$ is denoted by $u_1 u_2$.
Let $x$ be a string. A string $p$ is called a substring of the string $x$ if $x = u_1 p u_2$ for some strings $u_1$ and $u_2$. In case $u_1 = \epsilon$ (resp. $u_2 = \epsilon$), the string $p$ is called a prefix (resp. suffix) of the string $x$. The prefix (resp. suffix) $p$ is called proper if $p \ne x$. Note that the prefix or the suffix can be empty.
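The string notions above can be illustrated with a minimal Python sketch. The helper names (`is_substring`, `is_prefix`, `is_suffix`, `proper_prefixes`) are hypothetical and introduced only for this illustration; they rely on Python's built-in string operations rather than on anything defined in the dissertation.

```python
def is_substring(p: str, x: str) -> bool:
    """p is a substring of x if x = u1 + p + u2 for some strings u1, u2."""
    return p in x


def is_prefix(p: str, x: str) -> bool:
    """p is a prefix of x if x = p + u2 (the case u1 = empty string)."""
    return x.startswith(p)


def is_suffix(p: str, x: str) -> bool:
    """p is a suffix of x if x = u1 + p (the case u2 = empty string)."""
    return x.endswith(p)


def proper_prefixes(x: str) -> list:
    """All prefixes p of x with p != x, including the empty string."""
    return [x[:k] for k in range(len(x))]


if __name__ == "__main__":
    print(is_substring("bc", "abcd"))   # substring in the middle of the string
    print(is_prefix("", "abcd"))        # the empty string is a prefix of any string
    print(proper_prefixes("abc"))
```

Note that Python indexes strings from 0, whereas the text indexes letters from 1, so `x[i]` in the text corresponds to `x[i - 1]` in the sketch.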
1.1.2 Graph
Besides some basic concepts in graph theory, this subsection recalls how to represent a graph by adjacency lists, and breadth first search [82]. These are used in Chapter 2.
A finite undirected graph (hereafter called a graph for short) $G = (V, E)$ consists of a nonempty finite set of vertices $V$ and a finite set of edges $E$, where each edge has either one or two vertices associated with it. A graph with weights assigned to its edges is called a weighted graph.
An edge connecting a vertex to itself is called a loop. Multiple edges are edges connecting the same vertices. A graph having no loops and no multiple edges is called a simple graph.
In a simple graph, the edge associated with an unordered pair of vertices $\{i, j\}$ is called the edge $\{i, j\}$.
Two vertices $i$ and $j$ in a graph $G$ are called adjacent if they are vertices of an edge of $G$.
A graph without multiple edges can be described by using adjacency lists, which specify the adjacent vertices of each vertex of the graph.
Example 1.1 Using adjacency lists, the simple graph given in Figure 1.1 can be described as in Table 1.1.
Breadth First Search:
Input: A connected simple graph $G$ with vertices ordered as $i_1, i_2, \ldots, i_n$
Output: A spanning tree $T$
Set T to be a tree consisting only of $i_1$;
Set L to be an empty list;
Put $i_1$ in L;
While (L is not empty)
{
    Remove the first vertex i from L;
    For each adjacent vertex j of i
        If (j is not in L and T)
        {
            Add j to the end of L;
            Add j and the edge {i, j} to T;
        }
}
Return T;
Figure 1.2 A spanning tree of the graph given in Figure 1.1
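The breadth first search procedure above can be sketched in Python as follows. This is only an illustrative sketch: the function name `bfs_spanning_tree` and the small example graph are assumptions made here (the graph of Figure 1.1 is not reproduced), and the graph is given directly as adjacency lists in a dictionary.

```python
from collections import deque


def bfs_spanning_tree(adj, root):
    """Return (vertices, edges) of a BFS spanning tree of a connected simple graph.

    adj maps each vertex to the list of its adjacent vertices (adjacency lists).
    """
    tree_vertices = [root]
    tree_edges = []
    seen = {root}              # vertices already placed in the tree T
    queue = deque([root])      # the list L of the pseudocode
    while queue:
        i = queue.popleft()    # remove the first vertex of L
        for j in adj[i]:
            if j not in seen:  # j is neither in L nor in T
                seen.add(j)
                queue.append(j)
                tree_vertices.append(j)
                tree_edges.append((i, j))
    return tree_vertices, tree_edges


# Hypothetical example graph, not the graph of Figure 1.1.
example_adj = {1: [2, 3], 2: [1, 3, 4], 3: [1, 2], 4: [2]}

if __name__ == "__main__":
    print(bfs_spanning_tree(example_adj, 1))
```

A spanning tree of a connected graph with $n$ vertices has $n - 1$ edges, which gives a quick sanity check on the output.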
A graph with directed edges (or arcs) is called a directed graph. Each arc is associated with an ordered pair of vertices. In a simple directed graph, the arc associated with the ordered pair $(i, j)$ is called the arc $(i, j)$. The vertex $i$ is said to be adjacent to the vertex $j$, and the vertex $j$ is said to be adjacent from the vertex $i$.
1.1.3 Deterministic Finite Automata
The study of the construction and use of deterministic finite automata is one of the objectives of the dissertation. Hence, this subsection clarifies this model of computation [44, 82].
Definition 1.1 ([44]) Let $\Sigma$ be an alphabet. A deterministic finite automaton (hereafter called an automaton for short) $A = (\Sigma, Q, q_0, \delta, F)$ over $\Sigma$ consists of:
- A finite set $Q$ of elements called states,
- An initial state $q_0$, one of the states in $Q$,
- A set $F$ of final states. The set $F$ is a subset of $Q$,
- A state transition function (or simply, transition function), denoted by $\delta$, that takes as arguments a state and a letter, and returns a state, so that $\delta: Q \times \Sigma \to Q$.
The transition function $\delta$ can be extended so that it takes a state and a string, and returns a state. Formally, this extended transition function can be defined recursively by $\delta: Q \times \Sigma^* \to Q$ such that for all $q \in Q$, $s \in \Sigma^*$, $a \in \Sigma$, $\delta(q, as) = \delta(\delta(q, a), s)$ and $\delta(q, \epsilon) = q$.
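The recursive definition of the extended transition function can be written almost verbatim in Python. The sketch below is an illustration under assumptions: the automaton (states `q0`, `q1`, `q2` over the letters `a`, `b`, accepting strings that end in `ab`) is hypothetical and is not the automaton of Example 1.3, and the names `extended_delta` and `accepts` are introduced here only for the example. Acceptance is taken in the usual sense that the state reached from the initial state is final.

```python
def extended_delta(delta, q, s):
    """Extended transition function: delta(q, as) = delta(delta(q, a), s), delta(q, "") = q."""
    if s == "":
        return q
    return extended_delta(delta, delta[(q, s[0])], s[1:])


def accepts(delta, q0, final_states, s):
    """A string is accepted if it takes the initial state to a final state."""
    return extended_delta(delta, q0, s) in final_states


# Hypothetical automaton over {a, b} accepting the strings that end in "ab".
example_delta = {
    ("q0", "a"): "q1", ("q0", "b"): "q0",
    ("q1", "a"): "q1", ("q1", "b"): "q2",
    ("q2", "a"): "q1", ("q2", "b"): "q0",
}

if __name__ == "__main__":
    print(accepts(example_delta, "q0", {"q2"}, "aab"))
    print(accepts(example_delta, "q0", {"q2"}, "aba"))
```

The recursion mirrors the definition for clarity; for long input strings an iterative loop over the letters would avoid Python's recursion limit.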
An alternative and simple way of presenting an automaton is to use the notation "transition diagram". A transition diagram of an automaton $A = (\Sigma, Q, q_0, \delta, F)$ is a directed graph given as follows [44].
a) Each state of $Q$ is a vertex.
b) Let $q' = \delta(q, a)$, where $q$ is a state of $Q$ and $a$ is a letter of $\Sigma$. Then the transition diagram has an arc $(q, q')$ labeled $a$. If there are several letters that cause transitions from $q$ to $q'$, then the arc $(q, q')$ is labeled by a list of these letters.
c) There is an arrow into the initial state $q_0$. This arrow does not originate at any vertex.
d) States not in $F$ have a single circle. Vertices corresponding to final states are marked by a double circle.
Example 1.3 Consider an automaton $A = (\Sigma, Q, q_0, \delta, F)$ over $\Sigma = \{a, b\}$, where $Q = \{q_0, q_1, q_2\}$, $F = \{q_2\}$, and $\delta$ is given by the following table. Then the transition diagram of $A$ is shown in Figure 1.3.
Figure 1.3 The transition diagram of A in Example 1.3
Definition 1.2 ([82]) A string $p$ is said to be accepted by the automaton $A = (\Sigma, Q, q_0, \delta, F)$ if it takes the initial state $q_0$ to a final state, which means that $\delta(q_0, p)$ is in $F$.
1.1.4 The Galois Field $GF(p^m)$
This subsection describes how to construct a finite field with $p^m$ elements, called the Galois field $GF(p^m)$, where $p$ is prime and $m \ge 1$ is an integer [88]. The algebraic structure will be used in Chapter 2.
Let $p$ be a prime number. Define $Z_p[x]$ to be the set of all polynomials with the variable $x$ whose coefficients belong to the field $Z_p$. Addition and multiplication in $Z_p[x]$ are defined in the usual way, with the coefficients reduced modulo $p$ at the end.
For $f(x) \in Z_p[x]$, the degree of $f(x)$, denoted by $\deg(f)$, is the largest exponent of $x$ in $f(x)$.
A polynomial $f(x) \in Z_p[x]$ is called irreducible if there do not exist polynomials $f_1(x), f_2(x) \in Z_p[x]$ such that
$$f(x) = f_1(x) f_2(x),$$
where $\deg(f_1) > 0$ and $\deg(f_2) > 0$.
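As an illustration of arithmetic in $Z_p[x]$, the following Python sketch implements multiplication, remainder, and a brute-force irreducibility test, together with multiplication modulo a fixed polynomial, which is the operation the field construction in the next paragraph relies on. All names (`trim`, `poly_mul`, `poly_mod`, `is_irreducible`, `gf_mul`) and the representation (coefficient lists, lowest degree first) are assumptions made for this sketch; the brute-force test is only practical for small $p$ and small degrees, and it is not claimed to be the method used in the dissertation.

```python
from itertools import product


def trim(a):
    """Drop trailing zero coefficients (coefficients are stored lowest degree first)."""
    a = list(a)
    while a and a[-1] == 0:
        a.pop()
    return a


def poly_mul(a, b, p):
    """Product of two polynomials in Z_p[x]."""
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return trim(out)


def poly_mod(a, f, p):
    """Remainder of a divided by the nonzero polynomial f in Z_p[x] (p prime)."""
    a = trim([c % p for c in a])
    f = trim([c % p for c in f])
    inv_lead = pow(f[-1], -1, p)          # modular inverse, Python 3.8+
    while len(a) >= len(f):
        factor = (a[-1] * inv_lead) % p
        shift = len(a) - len(f)
        for i, fi in enumerate(f):
            a[shift + i] = (a[shift + i] - factor * fi) % p
        a = trim(a)
    return a


def is_irreducible(f, p):
    """Brute force: f is reducible iff some monic g with 1 <= deg(g) <= deg(f)//2 divides it."""
    f = trim([c % p for c in f])
    m = len(f) - 1
    if m < 1:
        return False
    for d in range(1, m // 2 + 1):
        for low in product(range(p), repeat=d):
            if not poly_mod(f, list(low) + [1], p):
                return False
    return True


def gf_mul(a, b, f, p):
    """Multiplication in Z_p[x] followed by reduction modulo f(x)."""
    return poly_mod(poly_mul(a, b, p), f, p)


if __name__ == "__main__":
    f = [1, 1, 1]                          # x^2 + x + 1 over Z_2
    print(is_irreducible(f, 2))
    print(gf_mul([0, 1], [0, 1], f, 2))    # x * x reduced modulo f
```

With $p = 2$ and $f(x) = x^2 + x + 1$, the four polynomials of degree at most 1 form the elements of a field with $2^2$ elements under `gf_mul` and coefficient-wise addition modulo 2.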
Let $f(x) \in Z_p[x]$ be an irreducible polynomial with $\deg(f) = m \ge 1$. Define $Z_p[x]/(f(x))$ to be the set of $p^m$ polynomials of degree at most $m - 1$ in $Z_p[x]$. Addition and multiplication in $Z_p[x]/(f(x))$ are given as in $Z_p[x]$, followed by a reduction modulo $f(x)$. Then $Z_p[x]/(f(x))$ with these operations is a field having $p^m$ elements, called the Galois field $GF(p^m)$. Note that for $p$ prime and $m \ge 1$, the Galois field $GF(p^m)$ is unique.
1.2 Digital Image Steganography
The problem of interest in Chapter 2 is digital image steganography. This section recalls the concept of digital images, the basic model of digital image steganography, and some parameters to determine the efficiency of digital image steganography, and lastly re-presents results that are used and developed further in Chapter 2, such as the fastest optimal parity assignment (FOPA) method, the module method and the concept of the maximal secret data ratio (MSDR) [18, 20, 21, 39, 49, 50, 51, 53, 61, 63, 65, 76, 78, 104].
A digital image is a matrix of pixels. Each pixel is represented by a non-negative integer in the form of a string of binary bits. This value indicates the colour of the pixel [39].
Based on the way the colours of pixels are represented, digital images can be divided into the following types [78].
1. Binary image: Each pixel is represented by one bit. In this image type, the colour of a pixel is white (value "1") or black (value "0").
2. Gray image: Each pixel is typically represented by eight bits (called an 8-bit gray image). Then the colour of any pixel is a shade of gray, from black, corresponding to colour value "0", to white, corresponding to colour value "255".
3. Red green blue image: Each pixel is usually represented by a string of 24 bits (called a 24-bit RGB image), where the first 8 bits, the next 8 bits and the last 8 bits correspond to shades of red, green and blue, specifying the red, green and blue colour components of the pixel, respectively. Then the colour of the pixel is a combination of these three components.
4. Palette image: The colour of each pixel is not given directly by the number representing the pixel, as it is for RGB images. Instead, this number is an index of the colour of the pixel in the colour table (the palette), an ordered set of values (strings of 24 bits, as in RGB images) which represent all colours used in the image and which is contained in the file with the image. The size of the palette is determined by the length of the bit string representing a pixel, which is limited to 8 bits. For a string of 8 bits, such palette images are called 8-bit palette images.
The objective of digital image steganography is to protect data by hiding the data in a digital image well enough so that unauthorized users will not even be aware of their existence [18, 21]. Figure 1.4 shows the basic model of digital image steganography, where the cover image is a digital image used as a carrier to embed secret data into, and the stego image is the digital image obtained after embedding secret data into the cover image by the function block Embed with the secret key on the Sender side. For steganography generally,
a Payload. Corresponding to a certain Payload, to measure the embedding capacity of the cover image, the embedding rate (ER) is used and defined as follows [104].
Figure 1.4 The basic diagram of digital image steganography
The peak signal to noise ratio (PSNR) is used to evaluate the quality of the stego image. Based on the value of PSNR, we can know the degree of similarity between the cover image and the stego image. If the PSNR value is high, then the quality of the stego image is high. Conversely, if the PSNR value is low, the quality of the stego image is low. In general, for the digital image, PSNR is defined by the
where $B(i, j)$, $G(i, j)$, $R(i, j)$, $B'(i, j)$, $G'(i, j)$ and $R'(i, j)$ are the colour values of the Blue, Green and Red components of the pixel at position $(i, j)$ in the cover image and the stego image, respectively. For human eyes, the threshold value of PSNR is 30 dB [20, 53, 65, 104]: when the PSNR value is higher than 30 dB, it is hard to distinguish between the cover image and its stego image.
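The two quality measures discussed above can be sketched in Python. Because the exact formulas are not reproduced here, the sketch assumes the commonly used definitions: the embedding rate as payload bits per pixel, and PSNR for 24-bit RGB images as $10 \log_{10}(255^2 / \mathrm{MSE})$, with the mean squared error averaged over all three colour components of all pixels. These conventions, and the names `embedding_rate` and `psnr_rgb`, are assumptions of this illustration and may differ in detail from the definitions in [104].

```python
import math


def embedding_rate(payload_bits, height, width):
    """Embedding rate in bits per pixel (assumed definition: payload / number of pixels)."""
    return payload_bits / (height * width)


def psnr_rgb(cover, stego, max_value=255):
    """PSNR in dB between two equally sized RGB images.

    Each image is a list of rows, each row a list of (R, G, B) tuples.
    Assumes MSE is averaged over the 3 * height * width colour samples.
    """
    height, width = len(cover), len(cover[0])
    squared_error = 0
    for row_c, row_s in zip(cover, stego):
        for pix_c, pix_s in zip(row_c, row_s):
            squared_error += sum((a - b) ** 2 for a, b in zip(pix_c, pix_s))
    if squared_error == 0:
        return math.inf                    # identical images
    mse = squared_error / (3 * height * width)
    return 10 * math.log10(max_value ** 2 / mse)


# Tiny hypothetical 2 x 2 images differing in one colour component by 1.
example_cover = [[(10, 20, 30), (40, 50, 60)], [(70, 80, 90), (100, 110, 120)]]
example_stego = [[(10, 20, 31), (40, 50, 60)], [(70, 80, 90), (100, 110, 120)]]

if __name__ == "__main__":
    print(psnr_rgb(example_cover, example_stego))
    print(embedding_rate(8, 3, 3))
```

A stego image that changes few pixel values by small amounts yields a large PSNR, well above the 30 dB threshold mentioned in the text.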
Let $G$ be a palette image and $P = \{c_1, c_2, \ldots, c_n\}$ be its palette, where $c_i$ is the colour of a pixel of $G$ corresponding to the colour index $i$. Each colour $c$ in $P$ is considered as a vector consisting of red, green and blue components. Suppose $d$ is a distance function on $P$. The FOPA method [50] tries to get functions Next: $P \to P$ and Val: $P \to Z_2$ such that the following two conditions are satisfied for all $c \in P$.
1. $d(c, \mathrm{Next}(c)) = \min_{v \ne c,\, v \in P} d(c, v)$,
2. $\mathrm{Val}(c) = \mathrm{Val}(\mathrm{Next}(c)) + 1$ on the field $Z_2$.
Call $G_P = (V_P, E_P)$ the weighted complete undirected graph of the palette image $G$, where $V_P = P$ and the weight of the edge $\{c, c'\}$ is $d(c, c')$. The function Nearest: $P \to P$ is given by $\mathrm{Nearest}(c) = c'$, where $d(c, c') = \min_{v \ne c,\, v \in P} d(c, v)$. A rho forest $F = (V, E)$ is a directed graph with vertices weighted by the function Val, where $V = V_P$, $E$ is the set of all arcs $(v, \mathrm{Next}(v))$, and the vertex $v$ has the weight $\mathrm{Val}(v)$ for all $v \in V$. The construction of an algorithm determining $F$ is the essence of the FOPA method.
Algorithm for FOPA:
Input: A weighted complete undirected graph $G_P$, the function Nearest
Output: A rho forest $F = (V, E)$
Choose a vertex $c \in P$, set $V = \{c\}$, and set $C = P \setminus \{c\}$;
Set $\mathrm{Val}(c) = 0$; // Or 1 randomly
While ($C$ is not empty) // Update F
{
    a) Take one element $v \in C$;
    b) Initialize $v_0 = v$, set $\mathrm{Val}(v_0) = 0$ (or 1 randomly); by a finite loop, find a longest sequence of $k + 1$ different elements in $P$ consecutively, $v_0, v_1, \ldots, v_k$, such that $\mathrm{Nearest}(v_i) = v_{i+1}$ for $i = 0, \ldots, k - 1$, $v_i \in C$, $v_k \in C$ or $v_k \in V$, and set $\mathrm{Next}(v_i) = v_{i+1}$, $i = 0, \ldots, k - 1$;
    b1) Case $v_k \in C$: Set $\mathrm{Val}(v_i) = 1 + \mathrm{Val}(v_{i-1})$, $i = 1, \ldots, k$, and $\mathrm{Next}(v_k) = v_{k-1}$;
        Set $V = V \cup \{v_0, v_1, \ldots, v_k\}$ and $C = C \setminus \{v_0, v_1, \ldots, v_k\}$;
    b2) Case $v_k \in V$: Set $\mathrm{Val}(v_i) = 1 + \mathrm{Val}(v_{i+1})$, $i = k - 1, \ldots, 1, 0$;
        Set $V = V \cup \{v_0, v_1, \ldots, v_{k-1}\}$ and $C = C \setminus \{v_0, v_1, \ldots, v_{k-1}\}$;
}
Return $F$;
End
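The idea of the algorithm above can be sketched in Python as follows. This is a simplified variant written for illustration, not the algorithm of [50] itself: it starts with no assigned colours instead of seeding $V$ with an arbitrary colour, it breaks ties in Nearest by palette order, and it assumes a symmetric distance function and a palette of at least two distinct colours. The names `fopa_sketch`, `squared_distance` and `example_palette` are hypothetical.

```python
def squared_distance(c1, c2):
    """Squared Euclidean distance between two RGB colours (a symmetric distance)."""
    return sum((a - b) ** 2 for a, b in zip(c1, c2))


def fopa_sketch(palette, dist):
    """Return dictionaries (nxt, val) with, for every colour c,
    dist(c, nxt[c]) minimal over the other colours and val[c] = val[nxt[c]] + 1 (mod 2)."""
    def nearest(c):
        return min((v for v in palette if v != c), key=lambda v: dist(c, v))

    nxt, val = {}, {}
    for v in palette:
        if v in val:
            continue                       # already placed in the forest
        chain = [v]
        while True:
            w = nearest(chain[-1])
            if w in val or w in chain:
                break                      # reached the forest, or the chain closed on itself
            chain.append(w)
        if w in val:
            # Chain ends at an assigned colour: propagate parities backwards.
            succ = w
            for c in reversed(chain):
                nxt[c] = succ
                val[c] = (val[succ] + 1) % 2
                succ = c
        else:
            # Chain closed on itself: alternate parities forwards, last points back.
            val[chain[0]] = 0
            for i in range(1, len(chain)):
                nxt[chain[i - 1]] = chain[i]
                val[chain[i]] = (val[chain[i - 1]] + 1) % 2
            nxt[chain[-1]] = chain[-2]
    return nxt, val


# Hypothetical four-colour palette.
example_palette = [(0, 0, 0), (10, 0, 0), (11, 0, 0), (100, 0, 0)]

if __name__ == "__main__":
    nxt, val = fopa_sketch(example_palette, squared_distance)
    for c in example_palette:
        print(c, "->", nxt[c], "Val =", val[c])
```

With a symmetric distance, the distances along a Nearest chain never increase, which is why pointing the last colour of a closed chain back to its predecessor still satisfies the minimal-distance condition.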
Definition 1.3 ([51]) Let $M$ be a module over the ring $Z_m$, $k > 0$ be a natural number, and $U$ be a subset of $M \setminus \{0\}$. Call $U$ a $k$-base of $M$ if for any $v$ in $M \setminus \{0\}$, there exist $t$ elements $v_1, v_2, \ldots, v_t \in U$, $t \le k$, together with $a_1, a_2, \ldots, a_t \in Z_m$ such that $v = v_1 a_1 + v_2 a_2 + \cdots + v_t a_t$.
Let $G$ be a digital image, and call $C_G$ the set of all colours of pixels in $G$. Consider the case where $m = 2$ and $G$ is a binary image. Then $C_G = \{0, 1\}$, and for a positive integer $n$, the set $M = Z_2^n = \{(x_1, x_2, \ldots, x_n) \mid x_i \in Z_2, i = 1, \ldots, n\}$ with element addition and scalar multiplication defined as usual is a module over the ring $Z_2$ [49]. For $k = 1$, the set $U = M \setminus \{0\}$ is the unique 1-base of $M$ [51]. Two functions Next: $C_G \to C_G$ and Val: $C_G \to Z_2$, satisfying the condition $\mathrm{Val}(c) = \mathrm{Val}(\mathrm{Next}(c)) + 1$ on the ring $Z_2$, are defined in [49]. Suppose that for $N \ge |U|$, $I = \{I_1, I_2, \ldots, I_N\}$ is an arbitrary image block of $G$, $K = \{K_1, K_2, \ldots, K_N \mid K_i \in Z_2, i = 1, \ldots, N\}$ is a secret key, $d$ is any element in $M$, and $h$ is a surjective function from $I$ to $U$. In the module method, $d$ is considered as secret data, embedded in and extracted from the image block $I$ with the key $K$ by the blocks Embed and Extract as follows [49, 51].
The block Embed (embedding $d$ in $I$):
Definition 1.4 ([49]) $\mathrm{MSDR}_k(N)$ is the largest number of embedded bits of secret data in an image block of $N$ pixels by changing colours of at most $k$ pixels in the image block, where $k$, $N$ are positive integers.
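Definition 1.4, together with the closed form (1.3) that follows it, can be checked numerically. The sketch below assumes formula (1.3) as stated in the next paragraph: it counts the ways to change at most $k$ of $N$ pixels when each pixel can be changed in $q_{colour}$ ways, and takes the floor of the base-2 logarithm. The function name `msdr` is hypothetical.

```python
from math import comb


def msdr(k, n, q_colour):
    """floor(log2(1 + q*C(n,1) + q^2*C(n,2) + ... + q^k*C(n,k))) with q = q_colour."""
    total = sum(q_colour ** i * comb(n, i) for i in range(k + 1))
    # For a positive integer, bit_length() - 1 equals floor(log2(total)) exactly.
    return total.bit_length() - 1


if __name__ == "__main__":
    print(msdr(2, 9, 3))   # k = 2, N = 9, q_colour = 3
    print(msdr(1, 7, 1))   # k = 1, N = 2^3 - 1, q_colour = 1
```

For $k = 2$, $N = 9$, $q_{colour} = 3$ the sum is $1 + 27 + 324 = 352$, so the result is 8, in line with the data hiding scheme (2, 9, 8) named in the list of tables; for $k = 1$, $N = 2^n - 1$, $q_{colour} = 1$ the sum is $2^n$, giving $n$ bits.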
Given a positive integer $q_{colour}$, call $q_{colour}$ the number of different ways to change the colour of each pixel in an arbitrary image block of $N$ pixels. According to [49],
$$\mathrm{MSDR}_k(N) = \lfloor \log_2(1 + q_{colour} C_N^1 + q_{colour}^2 C_N^2 + \cdots + q_{colour}^k C_N^k) \rfloor. \quad (1.3)$$
1.3 Exact Pattern Matching
This section restates the exact pattern matching problem, and recalls the concept of the degree of fuzziness (appearance) used in Chapter 3 [24, 52, 68].
Let $x$ be a string of length $n$. Denote the substring $x[i]x[i + 1]\ldots x[j]$ of $x$ by $x[i..j]$ for all $1 \le i \le j \le n$, and the $i$th element of $x$ by $x[i]$; $i$ is called a position in $x$. Let $p$ be a substring of length $m$ of $x$, where $m$ is a positive integer. Then there exists $i$ with $1 \le i \le n - m + 1$ such that $p = x[i..i + m - 1]$. We say that $i$ is an occurrence of $p$ in $x$, or that $p$ occurs in $x$ at position $i$.
Definition 1.5 ([68]) Let $p$ be a pattern of length $m$ and $x$ be a text of length $n$ over the alphabet $\Sigma$. Then the exact pattern matching problem is to find all occurrences of the pattern $p$ in $x$.
The following example uses the Brute Force (BF) algorithm [24] to demonstrate the most basic way of solving this problem.
Table 1.2 The performing steps of the BF algorithm
Example 1.4 Given a pattern $p = fah$ and a text $x = dfahfkfaha$. Then there are two occurrences of $p$ in $x$, at positions 2 and 7: dfahfkfaha. The BF algorithm performs the steps presented in Table 1.2, where the bold letters correspond to the mismatches and the underlined letters represent the matches when comparing the letters of the pattern and the text. Many letters that have already been scanned are scanned again by the BF algorithm, because each time either a mismatch or a match occurs, the pattern is moved to the right by only one position.
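The behaviour described in Example 1.4 can be reproduced with a minimal Python sketch of the brute force method: the pattern is aligned at every position of the text in turn and compared letter by letter, shifting by one position each time. The function name `brute_force_search` is introduced here for illustration; positions are reported 1-based, as in the text.

```python
def brute_force_search(p, x):
    """Return the 1-based positions of all occurrences of the pattern p in the text x."""
    m, n = len(p), len(x)
    occurrences = []
    for i in range(n - m + 1):            # every alignment of the pattern
        j = 0
        while j < m and x[i + j] == p[j]:  # compare letter by letter
            j += 1
        if j == m:                         # all m letters matched
            occurrences.append(i + 1)
    return occurrences


if __name__ == "__main__":
    print(brute_force_search("fah", "dfahfkfaha"))
```

In the worst case this performs on the order of $m \cdot n$ letter comparisons, which is the rescanning cost that the example points out.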
Chapter 3 uses the degree of fuzziness in [52] to determine the longest prefix of the pattern in the text at any position. However, this terminology can lead to several misunderstandings for the readers. So throughout this dissertation, the degree of fuzziness will be replaced with the degree of appearance. The concept of the degree of appearance is restated as follows.
Definition 1.6 ([52]) Let p be a pattern and x be a text of length n over the alphabet $\Sigma$. Then for each $1 \le i \le n$, the degree of appearance of p in x at position i is equal to the length of a longest substring of x such that this substring is a prefix of p, where the right end letter of the substring is $x[i]$.
Notice that if the degree of appearance of p in x at an arbitrary position i equals $|p|$, then a match for p in x occurs at position $i - |p| + 1$. Figure 1.3 illustrates the concept of the degree of appearance of the pattern p in x.
The degree of appearance of p in x at the position being scanned is equal to 4
Figure 1.5 The degree of appearance of the pattern p
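Definition 1.6 can be checked directly with a naive computation. The sketch below is only an illustration of the definition (the function name `degrees_of_appearance` is an assumption); it is not the automata-based method developed in Chapter 3 and makes no attempt at efficiency.

```python
def degrees_of_appearance(p, x):
    """degrees[i - 1] is the degree of appearance of p in x at 1-based position i."""
    degrees = []
    for i in range(1, len(x) + 1):
        best = 0
        # Try every prefix length l of p that fits inside x[1..i].
        for l in range(1, min(len(p), i) + 1):
            if x[i - l:i] == p[:l]:
                best = l
        degrees.append(best)
    return degrees

p, x = "fah", "dfahfkfaha"
deg = degrees_of_appearance(p, x)
print(deg)  # [0, 1, 2, 3, 1, 0, 1, 2, 3, 0]
# A degree equal to |p| at position i signals a match starting at i - |p| + 1.
print([i - len(p) + 1 for i, d in enumerate(deg, start=1) if d == len(p)])  # [2, 7]
```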
1.4 Longest Common Subsequence
This section will recall the longest common subsequence (LCS) problem and the Knapsack Shaking approach to the problem, which is developed further in Chapter 4 [24, 47, 94, 101].
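Definition 1.7 ([101]) Let p be a string of length m and u be a string over the alphabet $\Sigma$. Then u is a subsequence of p if there exists an integer sequence $j_1, j_2, \dots, j_t$ such that $1 \le j_1 < j_2 < \dots < j_t \le m$ and $u = p[j_1]p[j_2] \dots p[j_t]$.
A string u is a longest common subsequence of two strings p and x if
(i) u is a common subsequence of p and x,
(ii) There does not exist any common subsequence v of p and x such that $|v| > |u|$.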
Denote an arbitrary longest common subsequence of p and x by LCS(p, x). The length of a LCS(p, x) is denoted by lcs(p, x).
By convention, if two strings p and x do not have any longest common subsequence, then lcs(p, x) is taken to be 0.
Example 1.5 Let p = bgcadb and x = abhcbad. Then the string bcad is a LCS(p, x) and lcs(p, x) = 4.
Let p and x be two strings of lengths m and n over the alphabet $\Sigma$, $m \le n$. The longest common subsequence problem for two strings (LCS problem) can be stated in the two following forms [24, 47].
Problem 1 Find a longest common subsequence of p and x
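Problem 2 Compute the length of a longest common subsequence of p and x.
The simple way to solve the LCS problem is to use the algorithm introduced by Wagner and Fischer in 1974 (called Algorithm WF). This algorithm defines a dynamic programming matrix L(m, n) recursively to find a LCS(p, x) and compute lcs(p, x) as
\[ L(i, j) = \begin{cases} 0 & \text{if } i = 0 \text{ or } j = 0, \\ L(i-1, j-1) + 1 & \text{if } p[i] = x[j], \\ \max\{L(i, j-1), L(i-1, j)\} & \text{otherwise,} \end{cases} \]
where $L(i, j)$ is $lcs(p[1..i], x[1..j])$ for $1 \le i \le m$, $1 \le j \le n$.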
Example 1.6 Let p = bgcadb and x = abhcbad. Using Algorithm WF, the matrix L(m, n) shown in Table 1.3 is obtained. Then lcs(p, x) = L(6, 7) = 4. In Table 1.3, by the traceback procedure, starting from value 4 back to value 1, a LCS(p, x) found is the string bcad.
Table 1.3 The dynamic programming matrix L
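The dynamic programming scheme used in Example 1.6 can be sketched as follows. This is a standard textbook formulation of the LCS recurrence with a traceback, written here under the assumption that ties in the traceback may be broken arbitrarily; the function name `lcs_wf` is hypothetical.

```python
def lcs_wf(p, x):
    """Return (lcs length, one longest common subsequence) of p and x."""
    m, n = len(p), len(x)
    # L[i][j] = lcs(p[1..i], x[1..j]); row 0 and column 0 stay 0.
    L = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if p[i - 1] == x[j - 1]:
                L[i][j] = L[i - 1][j - 1] + 1
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])
    # Traceback from L[m][n] to recover one LCS.
    out, i, j = [], m, n
    while i > 0 and j > 0:
        if p[i - 1] == x[j - 1]:
            out.append(p[i - 1])
            i, j = i - 1, j - 1
        elif L[i - 1][j] >= L[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return L[m][n], "".join(reversed(out))

print(lcs_wf("bgcadb", "abhcbad"))  # (4, 'bcad')
```

Both the table construction and the traceback take O(mn) time, which is the baseline that the Knapsack Shaking approach recalled next is compared against.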
Definition 1.10 ([47]) Let $u = p[j_1]p[j_2] \dots p[j_t]$ be a subsequence of p. Then an element of the form $(j_1, j_2, \dots, j_t)$ is called a location of u in p.
From Definition 1.10, the subsequence u has at least one location in p. If all the different locations of u are arranged in the dictionary order, then call the least element the leftmost location of u, denoted by LeftID(u). Denote the last component of LeftID(u) by $Rm_p(u)$ [47].
Example 1.7 Let p = aabcadabcd and u = abd. Then u is a subsequence of p and has eight different locations in p; in the dictionary order they are
(1, 3, 6), (1, 3, 10), (1, 8, 10), (2, 3, 6), (2, 3, 10), (2, 8, 10), (5, 8, 10), (7, 8, 10).
It follows that LeftID(u) = (1, 3, 6) and $Rm_p(u) = 6$.
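The locations of Definition 1.10 can be enumerated mechanically, which is a convenient way to check an example such as the one above. The sketch below is a brute-force enumeration for illustration only; the helper names `locations`, `left_id` and `rm` are assumptions.

```python
from itertools import combinations

def locations(p, u):
    """All locations (1-based index tuples) of u in p, in dictionary order."""
    idx = range(1, len(p) + 1)
    # combinations() yields increasing index tuples in lexicographic order.
    return [js for js in combinations(idx, len(u))
            if all(p[j - 1] == c for j, c in zip(js, u))]

def left_id(p, u):
    """LeftID(u): the least location of u in p."""
    return locations(p, u)[0]

def rm(p, u):
    """Rm_p(u): the last component of LeftID(u)."""
    return left_id(p, u)[-1]

for loc in locations("aabcadabcd", "abd"):
    print(loc)
print(left_id("aabcadabcd", "abd"), rm("aabcadabcd", "abd"))
```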
Definition 1.11 ([47]) Let p be a string of length m. Then a configuration C of p is defined as follows.
1. Either C is the empty set. Then C is called the empty configuration of p, denoted by $C_0$.
2. Or $C = \{x_1, x_2, \dots, x_t\}$ is an ordered set of t subsequences of p for $1 \le t \le m$ such that the two following conditions are satisfied.
(i) For all $1 \le i \le t$, $|x_i| = i$,
(ii) For all $x_i, x_j \in C$, if $|x_i| > |x_j|$ then $Rm_p(x_i) > Rm_p(x_j)$.
The set of all the configurations of p is denoted by Config(p).
Definition 1.12 ([47]) Let p be a string of length m over the alphabet $\Sigma$, $C \in Config(p)$ and $a \in \Sigma$. Then the state transition function $\varphi$ on Config(p), $\varphi : Config(p) \times \Sigma \to Config(p)$, is defined as follows.
1. $\varphi(C, a) = C$ if $a \notin p$.
2. $\varphi(C_0, a) = \{a\}$ if $a \in p$.
3. Set $C' = \varphi(C, a)$. Suppose $a \in p$ and $C = \{x_1, x_2, \dots, x_t\}$ for $1 \le t \le m$. Then $C'$ is determined by a loop using the loop control variable i whose value is changed from t down to 0:
a) For i = t, if the letter a appears at a location index in p such that index is greater than $Rm_p(x_t)$, then $x_{t+1} = x_t a$;
b) Loop from i = t - 1 down to 1, if the letter a appears at a location index in p such that $index \in (Rm_p(x_i), Rm_p(x_{i+1}))$, then $x_{i+1} = x_i a$;
c) For i = 0, if the letter a appears at a location index in p such that index is smaller than $Rm_p(x_1)$, then $x_1 = a$;
d) $C' = C$.
4. To accept an input string, the state transition function $\varphi$ is extended as follows: $\varphi : Config(p) \times \Sigma^* \to Config(p)$ such that for all $C \in Config(p)$, $s \in \Sigma^*$, $a \in \Sigma$, $\varphi(C, as) = \varphi(\varphi(C, a), s)$ and $\varphi(C, \varepsilon) = C$.
Example 1.8 Let p = bacdabcad and $C = \{c, ad, bab\}$. Then C is a configuration of p and $C' = \varphi(C, a) = \{a, ad, ada, baba\}$.
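The transition of Definition 1.12 can be traced with a short sketch. The code below is one possible reading of the definition, not the implementation of [47]: a configuration is stored as a Python list `[x_1, ..., x_t]`, all $Rm_p$ values are taken before the update (an assumption that matches the downward loop), and the names `rm`, `phi` and `run` are hypothetical.

```python
def rm(p, u):
    """Rm_p(u): last 1-based index of the leftmost location of u in p."""
    pos = 0
    for ch in u:
        pos = p.index(ch, pos) + 1   # greedy leftmost match
    return pos

def phi(p, config, a):
    """One step of the transition function of Definition 1.12."""
    if a not in p:
        return list(config)          # rule 1
    if not config:
        return [a]                   # rule 2
    old = list(config)
    r = [rm(p, u) for u in old]      # r[i - 1] = Rm_p(x_i) before the update
    occ = [j + 1 for j, ch in enumerate(p) if ch == a]
    new, t = list(old), len(old)
    if any(j > r[t - 1] for j in occ):             # rule 3a
        new.append(old[t - 1] + a)
    for i in range(t - 1, 0, -1):                  # rule 3b
        if any(r[i - 1] < j < r[i] for j in occ):
            new[i] = old[i - 1] + a
    if any(j < r[0] for j in occ):                 # rule 3c
        new[0] = a
    return new

def run(p, x):
    """Extended transition: feed the whole string x starting from C_0."""
    config = []
    for a in x:
        config = phi(p, config, a)
    return config

print(phi("bacdabcad", ["c", "ad", "bab"], "a"))  # ['a', 'ad', 'ada', 'baba']
print(run("bgcadb", "abhcbad")[-1])               # bcad
```

The first call reproduces Example 1.8; the second feeds the text of Example 1.5 letter by letter and reads the longest element of the final configuration.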
In 2002, P. T. Huy et al. introduced a method to solve Problem 1 by using the automaton given in the following theorem. They named their method the Knapsack Shaking approach [47].
Theorem 1.1 ([47]) Let p and x be two strings of lengths m and n over the alphabet $\Sigma$, $m \le n$. Let $A_p = (\Sigma, Q, q_0, \varphi, F)$ corresponding to p be an automaton over the alphabet $\Sigma$, where
The set of states Q = Config(p),
The initial state $q_0 = C_0$,
The transition function $\varphi$ is given as in Definition 1.12,
The set of final states $F = \{C_n\}$, where $C_n = \varphi(q_0, x)$.
Suppose $C_n = \{x_1, x_2, \dots, x_t\}$ for $1 \le t \le m$. Then
1. For every subsequence u of p and x, there exists $x_i \in C_n$, $1 \le i \le t$ such that the two following conditions are satisfied
(i) $|u| = |x_i|$,
(ii) $Rm_p(x_i) \le Rm_p(u)$.
2. A LCS(p, x) equals $x_t$.
1.5 Searchable Encryption
This section clarifies the term searchable encryption (SE) and recalls the definition of a cryptosystem. They will be studied and used in Chapter 5 [26, 40, 60, 85, 88, 102].
Consider the following problem arising in cloud security [60, 85, 102]. Cloud tenants, for example enterprises and individuals with limited resources including software and hardware, store data with sensitive information on cloud servers. Assume that these servers cannot be fully trusted. This means they may not only be curious about the users' information but also abuse the data received. Users then wish to encrypt their data before uploading them to servers. Because of the limitations of cloud users' information technology systems, users also wish that cloud providers can help them perform information search directly on ciphertexts. However, encryption makes it difficult for servers to search the encrypted data. The problem is therefore to find a solution that satisfies both wishes of cloud users when they choose a cloud storage service.
SE is a way to solve the above problem. It is a system consisting of two main components: a cryptosystem used to encode and decode on the cloud users' side, and algorithms for searching on encrypted data run on the cloud providers' side [40, 102].
In cryptography, SE can be either searchable symmetric encryption (SSE) or searchable asymmetric encryption (SAE). In SSE, only private key holders can create encrypted data and produce trapdoors for search. In SAE, users who have the public key can make ciphertexts but only private key holders can generate trapdoors [26, 102].
Since the dissertation proposes a new symmetric encryption system for SSE in Chapter 5, the correctness of this system needs to be proved. In this dissertation, the components and properties of a cryptosystem defined in [88] will be considered as a standard form to verify. This definition is recalled here.
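Definition 1.13 ([88]) A cryptosystem is a five-tuple $(\mathcal{P}, \mathcal{C}, \mathcal{K}, \mathcal{E}, \mathcal{D})$ such that the following properties are satisfied
1. $\mathcal{P}$ is a finite set of plaintexts,
2. $\mathcal{C}$ is a finite set of ciphertexts,
3. $\mathcal{K}$ is a finite set of secret keys,
4. For every $k \in \mathcal{K}$, there exists an encrypting function $e_k \in \mathcal{E}$ and a corresponding decrypting function $d_k \in \mathcal{D}$, where $e_k : \mathcal{P} \to \mathcal{C}$ and $d_k : \mathcal{C} \to \mathcal{P}$ satisfy $d_k(e_k(x)) = x$ for each $x \in \mathcal{P}$.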
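As a concrete instance of Definition 1.13, the classical shift cipher over $Z_{26}$ can be written down in a few lines. This toy example is chosen only because it is small enough to verify exhaustively; it is an assumption of this sketch and is unrelated to the symmetric system proposed in Chapter 5.

```python
P = C = K = range(26)          # plaintexts, ciphertexts and keys are all Z_26

def e(k, x):
    """Encrypting function e_k of the shift cipher."""
    return (x + k) % 26

def d(k, y):
    """Decrypting function d_k of the shift cipher."""
    return (y - k) % 26

# Property 4 of Definition 1.13: d_k(e_k(x)) = x for every key and plaintext.
assert all(d(k, e(k, x)) == x for k in K for x in P)
# Every e_k maps plaintexts into the ciphertext set.
assert all(e(k, x) in C for k in K for x in P)
```

Proving the correctness of a proposed cryptosystem amounts to establishing the same identity $d_k(e_k(x)) = x$, usually by argument rather than by enumeration.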
CHAPTER 2 DIGITAL IMAGE STEGANOGRAPHY BASED ON THE GALOIS FIELD USING GRAPH THEORY AND AUTOMATA
This chapter first proposes concepts of optimal and near optimal secret data hiding schemes. The chapter then proposes a new digital image steganography approach based on the Galois field $GF(p^m)$ using graph and automata to design the data hiding scheme of the general form $(k, N, \lfloor \log_2 p^{mn} \rfloor)$ for binary, gray and palette images with the given assumptions, where k, m, n, N are positive integers and p is prime, shows sufficient conditions for existence and proves existence of some optimal and near optimal secret data hiding schemes. These results are derived from the concept of the maximal secret data ratio of embedded bits, the module method and the FOPA method proposed by P. T. Huy et al. in 2011, 2012 and 2013, recalled in Section 1.2 of Chapter 1. An application of the schemes to the process of hiding a finite sequence of secret data in an image is also considered. Security analyses and experimental results confirm that the proposed approach can create steganographic schemes which achieve high efficiency in embedding capacity, visual quality, speed as well as security, which are key properties of steganography.
The results of Chapter 2 have been published in [T1].
in the image [17, 57, 62, 76, 100]. The chapter's work focuses on steganography in digital images in the spatial domain.
Digital image steganography studies steganographic schemes, where each scheme consists of an embedding function and an extracting function. The embedding function shows how to embed secret data in the digital image and the extracting function describes how to extract the data from the digital image carrying the embedded data [46, 87].
In digital image steganography, a few main factors must be taken into consideration when designing a new secret data hiding scheme: the embedding capacity of the cover image, the quality of the stego image and security. However, as is well known, the embedding capacity of the cover image and the quality of its stego image are in irreconcilable conflict. A balance between the two factors can be achieved according to different application requirements. In addition to the three main factors, the speed of the embedding and extracting functions also plays an important role in steganographic schemes. It is considered as a last constraint to determine the efficiency of schemes [46, 53, 65, 69, 87, 104].
The simplest and most popular spatial domain image steganography method is the least significant bit (LSB) substitution (called the LSB based method). For 24-bit RGB and 8-bit gray images, in this method the data is embedded in the cover image by changing the least significant bits of the image directly, therefore it becomes vulnerable to security attacks [18, 62, 72, 75, 76, 97, 104]. The EZ Stego method for palette images is similar to the commonly used LSB based method. However, this method does not guarantee the quality of stego images [36, 37, 97]. To alleviate this problem, in 1999, Fridrich proposed a new method based on the parity bits of colour indexes of pixels in palette cover images, called the parity assignment (PA) method. The EZ Stego method can then be considered as an example of the PA method [36, 50]. In 2000, Fridrich et al. improved the method by investigating the problem of optimal parity assignment for the palette, and this version is called the optimal parity assignment (OPA) method [37]. To easily control the quality of stego images, Huy et al. introduced another OPA method, called the FOPA method, in 2013 [50]. Unlike the colour and gray images, each pixel in binary images only requires one bit to represent colour values (black and white), therefore modifying pixels can be easily detected. So, binary image steganography is a more difficult and challenging problem. For binary images, the block based method is usually used to maintain the quality of stego images. In this method, the cover and stego images are partitioned into individual image blocks of the same size, and embedding and extracting secret data are based on the characteristic values calculated for the blocks. The WL (Wu et al., 1998), PCT (Pan et al., 2000), modified PCT (Tseng et al., 2001) and CTL (Chang et al., 2005) schemes are all well known block based schemes for binary images [21, 18, 48, 75, 92].
Given $q_{colour}$, which is the number of different ways to change the colour of each pixel in an arbitrary image block, and using the concept of the maximal secret data ratio of embedded bits proposed by Huy et al. in 2011 [49], the chapter introduces concepts of optimal and near optimal secret data hiding schemes. Actually, the optimality of steganographic schemes has been considered in [37, 46]. However, the authors used the time complexity of embedding and extracting functions, or the concept of optimal parity assignment that minimizes the energy of the parity assignment for the colour palette, to determine whether a steganographic scheme is optimal.
By the block based method, call a secret data hiding scheme a data hiding scheme (k, N, r), where k, N, r are positive integers, if the embedding function can embed r bits of secret data in each image block of N pixels by changing colours of at most k pixels in the image block. The chapter's work is concerned with the problem of designing optimal or near optimal data hiding schemes (k, N, r) for digital images (binary, gray and palette images).
Based on the module approach and the FOPA method using graph theory proposed by Huy et al. in 2011 and 2013 [49, 50], the chapter proposes a new approach based on the Galois field using graph and automata in order to solve the problem. By this approach, the chapter proposes schemes consisting of the optimal data hiding scheme $(1, 2^n - 1, n)$ for binary, gray and palette images with $q_{colour} = 1$, where n is a positive integer, the near optimal data hiding scheme (2, 9, 8) and the optimal data hiding scheme (1, 5, 4) for gray and palette images with $q_{colour} = 3$. Security analyses show that an application of these schemes to the process of hiding a finite sequence of secret data in an image can avoid detection from brute-force attacks.
The experimental results reveal that the efficiency in embedding capacity and visual quality of the near optimal data hiding scheme (2, 9, 8) for gray images with $q_{colour} = 3$ is indeed better than the efficiency of the HCIH scheme [104]. The embedding and extracting times of the proposed approach are faster than those of Chang et al.'s approach [18]. For the near optimal data hiding scheme (2, 9, 8) for palette images with $q_{colour} = 3$ and the optimal data hiding scheme $(1, 2^n - 1, n)$ for palette images with $q_{colour} = 1$, values of ER can be selected suitably to achieve acceptable quality of the stego images.
The rest of the chapter is organized as follows. Section 2.2 gives some new concepts and states the chapter's digital image steganography problem. Section 2.3 consists of two Subsections 2.3.1 and 2.3.2. Subsection 2.3.1 introduces the mathematical basis based on the Galois field $GF(p^m)$ for the digital image steganography problem, where p is prime and m is a positive integer. Subsection 2.3.2 firstly proposes a digital image steganography approach based on the Galois field $GF(p^m)$ using graph and automata to design the data hiding scheme of the general form $(k, N, \lfloor \log_2 p^{mn} \rfloor)$ for the given assumptions, where k, m, n, N are positive integers and p is prime. Secondly, the subsection gives sufficient conditions for existence of the optimal data hiding schemes. The subsection also shows that there exists the optimal data hiding scheme $(1, 2^n - 1, n)$ for binary, gray and palette images with $q_{colour} = 1$, where n is a positive integer. At the end of Subsection 2.3.2, the way of applying the data hiding scheme $(k, N, \lfloor \log_2 p^{mn} \rfloor)$ to the process of hiding a finite sequence of secret data of length $\lfloor \log_2 p^{mn} \rfloor$ bits in an image is considered. Section 2.4 proves that there exist the near optimal data hiding scheme (2, 9, 8) and the optimal data hiding scheme (1, 5, 4) for gray and palette images with $q_{colour} = 3$. Section 2.5 shows experimental results in order to evaluate the efficiency of the proposed data hiding schemes and approach. Lastly, some conclusions are drawn from the proposed approach and experimental results in Section 2.6.
2.2 The Digital Image Steganography Problem
This section gives some new concepts and states the chapter’s digital imagesteganography problem
Definition 2.1 A block based secure data hiding scheme in digital images (for short, called a data hiding scheme) is a five-tuple $(\mathcal{I}, \mathcal{M}, \mathcal{K}, Em, Ex)$, where the following conditions are satisfied
1. $\mathcal{I}$ is a set of all image blocks with the same size and image type,
2. $\mathcal{M}$ is a finite set of secret elements,
3. $\mathcal{K}$ is a finite set of secret keys,
4. Em is an embedding function to embed a secret element in an image block,
5. Ex is an extracting function to extract a secret element from an image block carrying the embedded element.
Definition 2.2 A data hiding scheme $(\mathcal{I}, \mathcal{M}, \mathcal{K}, Em, Ex)$ is called a data hiding scheme (k, N, r), where k, N, r are positive integers, if each image block in $\mathcal{I}$ has N pixels and the embedding function Em can embed r bits of secret data in an arbitrary image block by changing colours of at most k pixels in the image block.
Definition 2.3 For a given $q_{colour}$, a data hiding scheme (k, N, r) is called an optimal data hiding scheme if $r = MSDR_k(N)$ and there does not exist a positive integer $N'$ such that $N' < N$ and $r = MSDR_k(N')$. Then N is denoted by $N_{optimum}$.
Definition 2.4 For a given $q_{colour}$, a data hiding scheme (k, N, r) is called a near optimal data hiding scheme if $r = MSDR_k(N)$ and $N > N_{optimum}$.
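Definitions 2.3 and 2.4 can be checked numerically from formula (1.3). The sketch below is an illustration only; the function names `msdr` and `n_optimum` are assumptions, and `n_optimum` relies on $MSDR_k(N)$ being non-decreasing in N.

```python
from math import comb

def msdr(k, n_pixels, q_colour):
    """MSDR_k(N) = floor(log2(1 + sum_{i=1..k} q^i * C(N, i))), formula (1.3)."""
    total = 1 + sum(q_colour ** i * comb(n_pixels, i) for i in range(1, k + 1))
    return total.bit_length() - 1      # exact floor(log2(total)) for integers

def n_optimum(k, r, q_colour):
    """Smallest N with MSDR_k(N) = r, or None if no block size gives exactly r."""
    n = 1
    while msdr(k, n, q_colour) < r:
        n += 1
    return n if msdr(k, n, q_colour) == r else None

print(msdr(2, 9, 3), n_optimum(2, 8, 3))   # 8 8
print(msdr(1, 5, 3), n_optimum(1, 4, 3))   # 4 5
print(msdr(1, 7, 1), n_optimum(1, 3, 1))   # 3 7
```

Under these definitions the scheme (2, 9, 8) with $q_{colour} = 3$ reaches $MSDR_2(9) = 8$ with N = 9 above the smallest such block size 8, so it is near optimal, while (1, 5, 4) with $q_{colour} = 3$ and (1, 7, 3) with $q_{colour} = 1$ use the smallest possible block size.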
The chapter's digital image steganography problem: Design optimal or near optimal data hiding schemes (k, N, r) for digital images (binary, gray and palette images).
2.3 A New Digital Image Steganography Approach
This section introduces the mathematical basis based on the Galois field for the digital image steganography problem (Subsection 2.3.1), proposes a digital image steganography approach based on the Galois field using graph theory and automata to design the data hiding scheme of the general form $(k, N, \lfloor \log_2 p^{mn} \rfloor)$ for the given assumptions, where k, m, n, N are positive integers and p is prime, shows sufficient conditions for existence and proves existence of some optimal data hiding schemes (Subsection 2.3.2). Security analyses and an application of these data hiding schemes to the process of hiding a finite sequence of secret data in an image are considered in Subsection 2.3.2.
2.3.1 Mathematical Basis based on The Galois Field
This subsection constructs the mathematical basis based on the Galois field $GF(p^m)$ for the digital image steganography problem, where p is prime and m is a positive integer (Propositions 2.2, 2.4 and Theorem 2.1).
Given the Galois field $GF(p^m)$, recalled in Subsection 1.1.4 of Chapter 1, where p is prime and m is a positive integer. Let $GF^n(p^m) = \{(x_1, x_2, \dots, x_n) \mid x_i \in GF(p^m),\ i = 1, \dots, n\}$, where n is a positive integer, with the two operations of vector addition + and scalar multiplication defined as follows
Proof Suppose $[x] \cap [y] \ne \emptyset$, then there exists z in $[x] \cap [y]$. By Definition 2.5, z = ax = by. Since $a \in GF(p^m) \setminus \{0\}$, $x = a^{-1}by$. Thus $x \in [y]$ and therefore $[x] \subseteq [y]$. Similarly, $[y] \subseteq [x]$ and hence [x] = [y].
Proposition 2.1 The set of all classes forms a partition of the set $GF^n(p^m)$.
Proof For all $x \in GF^n(p^m)$, $x \in [x]$ by Definition 2.5. Thus the union of all classes is $GF^n(p^m)$. By Lemma 2.1, any two distinct classes are disjoint. The proof is complete.
Denote the set of all classes by $[GF^n(p^m)]$. This can be represented by $[GF^n(p^m)] = \{[x] \mid x \in GF^n(p^m)\}$. The number of elements of a set S is denoted by |S|.
Proposition 2.2 $|[GF^n(p^m)] \setminus \{0\}| = \frac{p^{mn} - 1}{p^m - 1}$.
P t
i =1 aivi, where
Proof Evidently, $|GF^n(p^m) \setminus \{0\}| = p^{mn} - 1$ $GF(p^m)$ Consider
\[ B = \Big\{ \Big[ \sum_{i=1}^{t} a'_i v'_i \Big] \ \Big|\ a'_i \in GF(p^m) \setminus \{0\},\ [v'_i] \in S,\ i = 1, \dots, t,\ t \le k \Big\}. \]
To prove that S does not depend on the choice of representatives of classes, it suffices to show that A = B. By the hypothesis $[v'_i] = [v_i]$, so $v_i = b_i v'_i$. Suppose $[x] \in A$, then $x = \sum_{i=1}^{t} a_i v_i = \sum_{i=1}^{t} a_i b_i v'_i$. Clearly, $a_i b_i \ne 0$ by the definition of the class, so $[x] \in B$. Since $b_i \ne 0$, there exists $b_i^{-1}$, thus $v'_i = b_i^{-1} v_i$. Similarly, $B \subseteq A$.
So, A = B.
Definition 2.7 Let V be a vector space over a field K, $S \subseteq V$. Then S is called a k-Generators for V, where k is a positive integer, if the two following conditions are satisfied
a) For all $v, v' \in S$, there does not exist $a \in K$ such that $v' = av$,
b) For all $v \in V \setminus \{0\}$, there exists t such that $1 \le t \le k$ and $v = \sum_{i=1}^{t} a_i v_i$, where $v_1, v_2, \dots, v_t \in S$; $a_1, a_2, \dots, a_t \in K \setminus \{0\}$.
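For the smallest field GF(2), Definition 2.7 can be tested exhaustively. The sketch below makes several assumptions that are not in the text: vectors of $GF^n(2)$ are encoded as n-bit integers so that addition is XOR, condition a) is read as applying to distinct elements v and v' (together with $0 \notin S$), and the elements $v_1, \dots, v_t$ in condition b) are taken to be distinct. The function name `is_k_generators_gf2` is hypothetical.

```python
from itertools import combinations

def is_k_generators_gf2(S, n, k):
    """Check Definition 2.7 for S over GF(2)^n, vectors encoded as n-bit ints."""
    S = list(S)
    # a) over GF(2) the only nonzero scalar is 1, so the elements must be
    #    pairwise distinct and nonzero.
    if len(set(S)) != len(S) or 0 in S:
        return False
    # b) every nonzero vector is a sum (XOR) of at most k elements of S.
    reachable = set()
    for t in range(1, k + 1):
        for combo in combinations(S, t):
            acc = 0
            for v in combo:
                acc ^= v
            reachable.add(acc)
    return all(v in reachable for v in range(1, 2 ** n))

print(is_k_generators_gf2(range(1, 8), 3, 1))   # True: all 7 nonzero vectors
print(is_k_generators_gf2([1, 2, 4], 3, 2))     # False: 7 = 1 + 2 + 4 needs 3 terms
print(is_k_generators_gf2([1, 2, 4, 7], 3, 2))  # True
```

The first call corresponds to block size $N = 2^n - 1$ with k = 1; the last shows that allowing k = 2 lets a smaller set generate the same space.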
Lemma 2.2 Let $S = \{v_1, v_2, \dots, v_t\}$ be a k-Generators for the vector space $GF^n(p^m)$. Then $S' = \{[v_1], [v_2], \dots, [v_t]\}$ is a k-[Generators] for the set $[GF^n(p^m)]$.
Proof Since S is a k-Generators for $GF^n(p^m)$, for all $v, v' \in S$ there does not exist a in $GF(p^m)$ such that $v' = av$. By Proposition 2.1 and Definition 2.5, $[v_i] \ne [v_{i'}]$ and $[v_i] \ne 0$, for all $v_i \in S$, $1 \le i \le t$. For all $[u] \in [GF^n(p^m)] \setminus \{0\}$, then $u = \sum_{i=1}^{k}$
The proof is complete.
Lemma 2.3 Let $S' = \{[v_1], [v_2], \dots, [v_t]\}$ be a k-[Generators] for the set $[GF^n(p^m)]$. Then $S = \{v_1, v_2, \dots, v_t\}$ is a k-Generators for the vector space $GF^n(p^m)$.
Proof For all $v \in GF^n(p^m) \setminus \{0\}$, then
$i = 1, \dots, k'$; $k' \le k$. For all $[v], [v'] \in S'$, there does not exist a in $GF(p^m)$ such that $v' = av$ by Proposition 2.1. It means that for all $v, v' \in S$, there does not exist a in $GF(p^m)$ such that $v' = av$. The proof is complete.
Theorem 2.1 There exists a k-Generators S for the vector space $GF^n(p^m)$ with |S| = N if and only if there exists a k-[Generators] $S'$ for the set $[GF^n(p^m)]$ with $|S'| = N$.
Proof This is deduced immediately from Lemmas 2.2 and 2.3.
Proposition 2.4 Let c be the number of k-[Generators] of N elements for the set $[GF^n(p^m)]$. Then the number of k-Generators of N elements for the vector space $GF^n(p^m)$ is $c(p^m - 1)^N$.
Proof Suppose $S'$ is a k-[Generators] for $[GF^n(p^m)]$ with $|S'| = N$. Since $S'$ does not depend on the choice of representatives of classes by Proposition 2.3, the number of ways to change representatives of all classes in $S'$ is $(p^m - 1)^N$. By the hypothesis, the number of k-[Generators] of N elements for the set $[GF^n(p^m)]$ is c, so the number of k-Generators of N elements for the vector space $GF^n(p^m)$ is $c(p^m - 1)^N$ by Lemma 2.3 and Theorem 2.1.
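The class structure used throughout this subsection can be enumerated for a prime field. The sketch below restricts itself to m = 1, so that arithmetic in GF(p) is ordinary arithmetic modulo p, and assumes that the class of a nonzero vector x is $\{ax \mid a \in GF(p) \setminus \{0\}\}$, as suggested by the proof of Lemma 2.1; the function name `classes` is hypothetical. The count of classes should then be $(p^n - 1)/(p - 1)$.

```python
from itertools import product

def classes(p, n):
    """Classes of nonzero vectors of GF(p)^n under nonzero scalar multiples (m = 1)."""
    seen, result = set(), []
    for x in product(range(p), repeat=n):
        if not any(x) or x in seen:
            continue
        # The class of x collects a*x for every nonzero scalar a.
        cls = {tuple((a * xi) % p for xi in x) for a in range(1, p)}
        seen |= cls
        result.append(cls)
    return result

for p, n in [(2, 3), (3, 2), (5, 2)]:
    found = classes(p, n)
    print(p, n, len(found), (p ** n - 1) // (p - 1))
```

Each class contains p - 1 vectors, which is also why a set of N classes admits $(p - 1)^N$ choices of representatives in the m = 1 case.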
2.3.2 Digital Image Steganography Based on The Galois Field $GF(p^m)$ Using Graph Theory and Automata
This subsection firstly proposes a digital image steganography approach based on the Galois field $GF(p^m)$ using graph and automata to design the data hiding scheme of the general form $(k, N, \lfloor \log_2 p^{mn} \rfloor)$ for the given assumptions, where k, m, n, N are positive integers and p is prime (Theorem 2.2 and Security analysis (2.12)). Secondly, the subsection gives sufficient conditions for existence of the optimal data hiding schemes
Let $\mathcal{I}$ be a set of all image blocks with the same size and image type and assume that each image block in $\mathcal{I}$ has N pixels, where N is a positive integer. For simplicity, the structure of an arbitrary image block I in $\mathcal{I}$ can be represented by
$I = \{I_1, I_2, \dots, I_N\}$,
where $I_i$ is a colour value for binary and gray images or a colour index in the palette for palette images of the ith pixel in I with $i = 1, \dots, N$. Consider C to be the set of all colour values or indexes of pixels of I.
Let $\mathcal{M}$ be a finite set of secret elements and set $\mathcal{M} = GF^n(p^m)$.
Let $\mathcal{K}$ be a finite set of secret keys. For all $K \in \mathcal{K}$, also assume that the structure of the key K is the same as the structure of the image block I. So, we can write
$K = \{K_1, K_2, \dots, K_N\}$ for $K_i \in GF(p^m)$ with $i = 1, \dots, N$.
Assume that we find a k-Generators S for $GF^n(p^m)$ with |S| = N and $S = \{v_1, v_2, \dots, v_N\}$.
Given a flip graph G, we denote by Adjacent(cp, a) an adjacent vertex of cp (Adjacent(cp, a) is adjacent from cp), where the weight a is assigned to the arc (cp, Adjacent(cp, a)).
Assume that we build a flip graph G = (V, E).
From the way to determine the arc set E in Definition 2.8, assume that
Definition 2.10 Let $\Sigma_2 = GF^n(p^m)$, $\mathcal{N} = \{1, 2, \dots, N\}$, $2^{\mathcal{N} \times (GF(p^m) \setminus \{0\})}$ - the set of all subsets of the set $\mathcal{N} \times (GF(p^m) \setminus \{0\})$
Remark 2.1 For the case v = q, then v + (-q) = 0.
Since S is a k-Generators for $GF^n(p^m)$, |S| = N, $S = \{v_1, v_2, \dots, v_N\}$, there exist $k'$, $k' \le k$, $v_{i_t} \in S$, $1 \le i_t \le N$, $a_t \in GF(p^m) \setminus \{0\}$, $t = 1, \dots, k'$ such that $v + (-q) = \sum_{t=1}^{k'} a_t v_{i_t}$.
So, $\delta_2$ given in Definition 2.10 is a function.
Definition 2.11 Let $I \in \mathcal{I}$, $M \in \mathcal{M}$ and $K \in \mathcal{K}$. The automaton A(I, M, K) is a five-tuple $(\Sigma, Q, q_0, \delta, T)$, where
2. The set of states $Q = \{q_i,\ i = 0, \dots, N + 1 \mid q_0 = \sum_{i=1}^{N} K_i v_i;\ q_i = \delta_1(q_{i-1}, (i, I_i)),\ i = 1, \dots, N;\ q_{N+1} = \delta_2(q_N, M)\}$;
3. The initial state $q_0$;
4. The set of final states $T = \{q_{N+1}\}$;
5. The transition function $\delta : Q \times \Sigma \to Q$, $\delta(q_{i-1}, I_i) = q_i$, $i = 1, \dots, N$; $\delta(q_N, M) = q_{N+1}$.
Remark 2.2 The set of states Q and the transition function $\delta$ given in Definition 2.11 are completely determined by the functions $\delta_1$, $\delta_2$, and it follows that the automaton A(I, M, K) is well defined by Definition 2.11.
Let an image block $I \in \mathcal{I}$, a secret element $M \in \mathcal{M}$, a key $K \in \mathcal{K}$. By using the automaton A(I, M, K) and the flip graph G, the two functions Em and Ex in the data hiding scheme $(\mathcal{I}, \mathcal{M}, \mathcal{K}, Em, Ex)$ are designed as follows.
The function Em (embedding M in I):
Remark 2.3 Consider $I' = Em(I, M, K)$. By (2.5), Em only changes colours of |q| pixels in I based on the flip graph G, so $I' \in \mathcal{I}$. So, Em as designed satisfies Definition 2.1.
The function Ex (extracting M from $I'$):
Proposition 2.5 For all $(I, M, K) \in \mathcal{I} \times \mathcal{M} \times \mathcal{K}$, Ex(Em(I, M, K), K) = M.
Proof Set $M' = Ex(I', K)$. By Definitions 2.9 and 2.11,
\[ M' = \sum_{i=1}^{N} (Val(I'_i) + K_i) v_i. \quad (2.9) \]
After implementing (2.3), $q = \sum_{i=1}^{N} (Val(I_i) + K_i) v_i$. By Definitions 2.10 and 2.11, after implementing (2.4) we consider two cases of q:
If $q = \emptyset$, then (2.5) is not implemented and hence I is not changed. Thus $I' = I$ and therefore
Theorem 2.2 Suppose that a k-Generators S for the vector space $GF^n(p^m)$ is found and a flip graph G is built. Then there exists the data hiding scheme $(k, N, \lfloor \log_2 p^{mn} \rfloor)$, where N = |S|.
Proof Under the assumption that a k-Generators S for $GF^n(p^m)$, |S| = N, is found and a flip graph G is built, we offer the way to construct the data hiding scheme $(\mathcal{I}, \mathcal{M}, \mathcal{K}, Em, Ex)$ based on the Galois field $GF(p^m)$ by using the flip graph G and the automaton A(I, M, K). Em changes colours of at most k pixels of I to embed M in I for all $I \in \mathcal{I}$, $M \in \mathcal{M}$ by Definition 2.10 and Statement (2.5).
Consider B to be the set of all secret data of length r bits, then $|B| = 2^r$; $|\mathcal{M}| = p^{mn}$ by $\mathcal{M} = GF^n(p^m)$. Suppose that we construct an injective function f, $f : B \to \mathcal{M}$. Then Em is used to embed $b \in B$ in I as follows
$I' = Em(I, M, K)$;
Since f is injective by our supposition, after extracting M from $I'$ by Ex, the secret data b will be determined accurately based on f.
Since B and $\mathcal{M}$ are finite sets, for the injective function f to exist we require $|B| \le |\mathcal{M}|$, that is $2^r \le p^{mn}$, so $r \le \log_2 p^{mn}$; choose $r = \lfloor \log_2 p^{mn} \rfloor$. So, for $r = \lfloor \log_2 p^{mn} \rfloor$, the r bits of the secret data b can be embedded in I. By Definition 2.2, the data hiding scheme $(\mathcal{I}, \mathcal{M}, \mathcal{K}, Em, Ex)$ is a data hiding scheme $(k, N, \lfloor \log_2 p^{mn} \rfloor)$. So, the data hiding scheme $(k, N, \lfloor \log_2 p^{mn} \rfloor)$ exists.
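For the binary case p = 2, m = 1, k = 1, the construction in the proof can be sketched in a few lines. The code below is a simplified illustration under stated assumptions, not the automaton-based procedure itself: the 1-Generators is taken to be $S = \{v_1, \dots, v_N\}$ with $v_i$ the n-bit representation of i and $N = 2^n - 1$, Val of a binary pixel is the pixel value, flipping a pixel toggles its value, and a secret element is encoded as an integer in $\{0, \dots, 2^n - 1\}$. The function names `extract` and `embed` are hypothetical; extraction follows the sum in (2.9).

```python
def extract(block, key):
    """M' = sum_i (Val(I_i) + K_i) v_i over GF(2), with v_i = binary digits of i."""
    q = 0
    for i, (pixel, k) in enumerate(zip(block, key), start=1):
        if (pixel + k) % 2:
            q ^= i
    return q

def embed(block, message, key):
    """Return a copy of block carrying message, changing at most one pixel."""
    diff = message ^ extract(block, key)   # M + (-q) in GF(2)^n
    stego = list(block)
    if diff:                               # diff equals v_j for exactly one j
        stego[diff - 1] ^= 1               # flip pixel j
    return stego

n = 3
block = [1, 0, 1, 1, 0, 0, 1]              # N = 2**n - 1 = 7 binary pixels
key = [0, 1, 1, 0, 1, 0, 0]
for message in range(2 ** n):
    stego = embed(block, message, key)
    assert extract(stego, key) == message
    assert sum(a != b for a, b in zip(block, stego)) <= 1
print("all", 2 ** n, "messages embedded with at most one changed pixel")
```

With these choices r = n = 3 bits are carried by a block of 7 pixels while at most one pixel changes, which is the shape $(1, 2^n - 1, n)$ discussed for $q_{colour} = 1$.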
Security analysis of the proposed data hiding scheme $(k, N, \lfloor \log_2 p^{mn} \rfloor)$: Assume that the parameters k, N, Em, Ex, the vector space $GF^n(p^m)$ and the flip graph G in the data hiding scheme $(k, N, \lfloor \log_2 p^{mn} \rfloor)$ are published. The secret element M is extracted from $I'$ by the extracting function Ex as follows
is $p^{mN}$ because $K \in \mathcal{K}$. Consider GF to be an arbitrary subset of $2^{\lfloor \log_2 p^{mn} \rfloor}$ elements of the set $GF^n(p^m)$, B to be the set of all secret data of length $\lfloor \log_2 p^{mn} \rfloor$ bits, it means $B = \{0, 1, \dots, 2^{\lfloor \log_2 p^{mn} \rfloor} - 1\}$ in the decimal system. Then there exists a bijective function f, $f : B \to GF$. By (2.10), to decrypt the secret element M to the secret data b, we need to
$2^{\lfloor \log_2 p^{mn} \rfloor}!$. Then for a brute force attack, an attacker has to try every possible combination of S, K and f in the given data hiding scheme. The number of combinations of S, K and f is
Proposition 2.6 For a positive integer n, there exists the optimal data hiding scheme $(1, 2^n - 1, n)$ for binary, gray and palette images with $q_{colour} = 1$.
Proof For $q_{colour} = 1$, from (2.1), p = 2, m = 1. If we build a flip graph G, then there exists the optimal data hiding scheme $(1, 2^n - 1, n)$ with $q_{colour} = 1$ by Theorem 2.3. The Galois field $GF(p^m) = GF(2)$ is the same as the field $Z_2$ (see [88]). Next, we show ways to build flip graphs G = (V, E) on the field $Z_2$ for binary, gray and palette images as follows.
For the binary image, C = {0, 1}, $cp \in C$, cp is a colour value of a pixel.
V = C and for all $v \in V$, the vertex v is assigned a weight by a function Val such that Val(v) = v;
$E = \{(cp, cp') \mid cp, cp' \in V, cp \ne cp'\}$ and every arc $(cp, cp')$ has the same weight 1.
For the gray image, C = {0, 1, ..., 255}, $cp \in C$, cp is a colour value of a pixel.
V = C and for all $v \in V$, the vertex v is assigned a weight by a function Val such that Val(v) = v mod 2;
$E = \{(255, 254), (cp, cp + 1) \mid cp \in V, 1 \le cp \le 254\}$ and every arc $(cp, cp')$ is assigned the same weight 1.
For the palette image, $C = \{0, 1, \dots, 2^t - 1\}$, t is the number of bits to represent colour indexes, $cp \in C$, cp is a colour index of a pixel. The palette $P = \{p_0, p_1, \dots, p_{2^t - 1}\}$, $p_i \in P$, $p_i$ is the colour corresponding to the colour index i, $i = 0, \dots, 2^t - 1$. To unify notations throughout this dissertation, the function Val in the FOPA method, recalled in Section 1.2 of Chapter 1, is renamed $Val_p$ here, and we set $Val(cp) = Val_p(p)$, where the colour index $cp \in C$ corresponds to the colour $p \in P$.
Consider G to be the rho forest built by the algorithm for FOPA and assign the same weight 1 to all arcs of G. However, all colours of the rho forest are replaced with their colour indexes.
By Definition 2.8, it is not difficult to verify that the graphs G for binary, gray and palette images built as above are all flip graphs on the field $Z_2$. So, there exists the optimal data hiding scheme $(1, 2^n - 1, n)$ for binary, gray and palette images with $q_{colour} = 1$.
Notice that if we set $N = 2^n - 1$, then the data hiding scheme $(1, 2^n - 1, n)$ becomes the data hiding scheme $(1, N, \lfloor \log_2(N + 1) \rfloor)$. Recall that for a positive integer N, the data hiding scheme $(1, N, \lfloor \log_2(N + 1) \rfloor)$ for binary images with $q_{colour} = 1$ is the data hiding scheme CTL [18]. So, Proposition 2.6 shows that the data hiding scheme CTL is an optimal data hiding scheme for $N = 2^n - 1$, where n is a positive integer.
Theorem 2.4 Suppose that a 2-Generators S for the vector space $GF^n(p^m)$ with
exists the optimal data hiding scheme $(2, |S|, \lfloor \log_2 p^{mn} \rfloor)$ for $q_{colour} = p^m - 1$.
Proof Under the assumption of the theorem, by Theorem 2.2, there exists the data hiding scheme $(2, |S|, \lfloor \log_2 p^{mn} \rfloor)$. According to the proposed approach, the data hiding scheme $(2, |S|, \lfloor \log_2 p^{mn} \rfloor)$ is designed based on the assumption $q_{colour} = p^m - 1$ by (2.1). Now, we prove it to be optimal for $q_{colour} = p^m - 1$.
Suppose the data hiding scheme (2, N, r) is optimal for $q_{colour} = p^m - 1$, then
\[ r = MSDR_2(N) = \lfloor \log_2(1 + q_{colour} C_N^1 + q_{colour}^2 C_N^2) \rfloor = \Big\lfloor \log_2\Big(1 + q_{colour} N + q_{colour}^2 \frac{N(N-1)}{2}\Big) \Big\rfloor. \]
For $r = \lfloor \log_2 p^{mn} \rfloor$ and $q_{colour} = p^m - 1$, from (2.19), we obtain
Given an image F used as a carrier to embed a secret data sequence into, partition F into disjoint image blocks of N pixels, $F = \{F_1, F_2, \dots, F_{t_2}\}$. Let $D = D_1 D_2 \dots D_{t_3}$ be a secret data sequence embedded in the cover image F, where $D_i$ is secret data of length $\lfloor \log_2 p^{mn} \rfloor$ bits, $i = 1, \dots, t_3$. Since each $\lfloor \log_2 p^{mn} \rfloor$ bits of secret data is only embedded in one image block of F, $t_3 \le t_2$.
Let Jump be a bijective function used to determine the order of blocks in F in the process of hiding D in F, $Jump : \{1, 2, \dots, t_2\} \to \{1, 2, \dots, t_2\}$.
Consider GF to be an arbitrary subset of $2^{\lfloor \log_2 p^{mn} \rfloor}$ elements of the set $GF^n(p^m)$, B to be the set of all secret data of length $\lfloor \log_2 p^{mn} \rfloor$ bits, it means $B = \{0, 1, \dots, 2^{\lfloor \log_2 p^{mn} \rfloor} - 1\}$ in the decimal system. Then there exists a bijective function f, $f : B \to GF$.
In real applications, when applying the data hiding scheme $(k, N, \lfloor \log_2 p^{mn} \rfloor)$ based on the proposed approach to the process of hiding D in F, use the secret key set $K = \{K_1, K_2, \dots, K_{t_1}\}$ instead of one secret key. The process of hiding D in F by using the data hiding scheme $(k, N, \lfloor \log_2 p^{mn} \rfloor)$ consists of the embedding algorithm $Em_{DF}$ and the extracting algorithm $Ex_{DF}$ proposed as follows.
The embedding algorithm $Em_{DF}$ (embedding a secret data sequence D in F):
$F' = F$; // $F'$ is called a stego image
The extracting algorithm $Ex_{DF}$ (extracting the embedded secret data sequence D from $F'$):
t = 1;
For i = 1 to $t_3$ Do
{
$M = Ex(F'_{Jump(i)}, K_t)$; // Use the automaton $A(F'_{Jump(i)}, M, K_t)$ (2.24)
$D_i = f^{-1}(M)$; // $f^{-1}$ is the inverse function of f (2.25)
}
$D = D_1 D_2 \dots D_{t_3}$;
Proposition 2.7 Let a cover image $F$, a secret data sequence $D$, a bijective function $Jump$, a bijective function $f$, a secret key set $K$ and the data hiding scheme $(k, N, \lfloor \log_2 p^{mn} \rfloor)$ based on the proposed approach be given as above. Suppose the stego image $F'$ is generated after $D$ is embedded in $F$ by the embedding algorithm $EmDF$. Then the data sequence $D'$ extracted from $F'$ by the extracting algorithm $ExDF$ is exactly the secret data sequence $D$.
Proof. By (2.21) and (2.23), $EmDF$ in (2.22) and $ExDF$ in (2.24) use the same secret key $K_t$. The bijective function $Jump$ guarantees that for all $i, j \in \{1, 2, \dots, t_3\}$, $i \ne j$, $Jump(i) \ne Jump(j)$, which means that any image block in $F$ is used at most once in the process of hiding. By Proposition 2.5, $M$ extracted by (2.24) is the same as $M$ embedded by (2.22). Then the bijective function $f$ guarantees that $D_i$ encrypted by (2.20) is the same as $D_i$ decrypted by (2.25), $i \in \{1, 2, \dots, t_3\}$. This completes the proof.
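The round-trip property stated in Proposition 2.7 can be illustrated with a minimal Python sketch. Everything here is a stand-in: `toy_em` and `toy_ex` are hypothetical placeholders that hide a 4-bit value in the low bits of an integer "block" and are not the automaton-based $Em$ and $Ex$ of the scheme; the cyclic key schedule `keys[i % len(keys)]` is likewise an assumption, since the only requirement used in the proof is that both algorithms pick the same key $K_t$ at step $i$.

```python
def em_df(blocks, data, jump, f, keys, em):
    """Embed data[i] into block jump[i]; returns the stego blocks F'."""
    stego = list(blocks)                       # F' = F
    for i, d_i in enumerate(data):
        key = keys[i % len(keys)]              # assumed key schedule
        m = f[d_i]                             # M = f(D_i)
        stego[jump[i]] = em(blocks[jump[i]], m, key)
    return stego

def ex_df(stego, count, jump, f_inv, keys, ex):
    """Extract `count` data items from the stego blocks F'."""
    data = []
    for i in range(count):
        key = keys[i % len(keys)]              # same schedule as em_df
        m = ex(stego[jump[i]], key)            # M = Ex(F'_{Jump(i)}, K_t)
        data.append(f_inv[m])                  # D_i = f^{-1}(M)
    return data

# Hypothetical toy stand-ins for Em and Ex (NOT the scheme's algorithms).
def toy_em(block, m, key):
    return (block & ~0xF) | (m ^ key)

def toy_ex(block, key):
    return (block & 0xF) ^ key

blocks = [200, 17, 96, 255, 42]                # t2 = 5 toy "blocks"
jump = [3, 0, 4, 1, 2]                         # a permutation (0-based)
f = {d: (5 * d + 3) % 16 for d in range(16)}   # bijection, since gcd(5, 16) = 1
f_inv = {v: d for d, v in f.items()}
keys = [9, 4]                                  # t1 = 2 toy keys
data = [7, 0, 15]                              # t3 = 3 <= t2

stego = em_df(blocks, data, jump, f, keys, toy_em)
recovered = ex_df(stego, len(data), jump, f_inv, keys, toy_ex)
print(recovered == data)
```

Because `jump` is a permutation, no block is written twice, and blocks that are not selected for the first $t_3$ steps are left unchanged.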
Security analysis of the process of hiding $D$ in $F$: Assume that the parameters $k$, $N$, $Em$, $Ex$, the vector space $GF^n(p^m)$ and the flip graph $G$ in the data hiding scheme $(k, N, \lfloor \log_2 p^{mn} \rfloor)$ are published. The secret element $M$ is extracted from $F'_{Jump(i)}$ by (2.24); we have
$M = Ex(F'_{Jump(i)}, K_t)$;
from Definitions 2.9 and 2.11 and by (2.9), we obtain
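of choices for the $k$-Generators $S$ is $c\,(p^m - 1)^N N!$. The numbers of choices for the key set $K$ and for the two bijective functions $Jump$ and $f$ are $p^{m t_1 N}$, $t_2!$ and $C_{p^{mn}}^{2^{\lfloor \log_2 p^{mn} \rfloor}} \cdot 2^{\lfloor \log_2 p^{mn} \rfloor}!$, respectively.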
This section shows that there exist the near optimal data hiding scheme $(2, 9, 8)$ (Theorem 2.5 and Security analyses (2.45), (2.46)) and the optimal data hiding scheme $(1, 5, 4)$ (Corollary 2.1 and Security analyses (2.47), (2.48)) for gray and palette images with $q_{colour} = 3$.
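As a quick numerical check, the bound $\lfloor \log_2(1 + q_{colour} C_N^1 + q_{colour}^2 C_N^2) \rfloor$ given earlier for $k = 2$ can be evaluated in Python. The function name `msdr` is hypothetical, and extending the sum to a general $k$ (terms $q_{colour}^i C_N^i$ for $i = 0, \dots, k$) is an assumption of this sketch; only the case $k = 2$ is taken from the text above.

```python
from math import comb

def msdr(k, n_pixels, q_colour):
    """floor(log2(sum_{i=0..k} q_colour^i * C(n_pixels, i))).

    Assumption: the k = 2 bound from the text generalises to this sum.
    """
    total = sum(q_colour ** i * comb(n_pixels, i) for i in range(k + 1))
    # For a positive integer x, floor(log2(x)) equals x.bit_length() - 1.
    return total.bit_length() - 1

print(msdr(1, 5, 3))   # sum is 1 + 3*5 = 16
print(msdr(2, 9, 3))   # sum is 1 + 3*9 + 9*36 = 352
```

Under this assumed formula and $q_{colour} = 3$, the values obtained for $(k, N) = (1, 5)$ and $(2, 9)$ are 4 and 8, which coincide with the third parameter of the two schemes named above.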
Following the construction of the Galois field $GF(p^m)$ from the polynomial ring $Z_p[x]$, where $p$ is prime and $m$ is a positive integer [88], consider the case $p = m = 2$ and use the irreducible polynomial $g(x) = x^2 + x + 1$ in $Z_2[x]$. We obtain the Galois field $GF(2^2)$ as follows:
$GF(2^2) = \{0, 1, x, x + 1\}$,
with the two operations addition $+$ and multiplication $\cdot$ defined as in $Z_2[x]$, followed by a reduction modulo $g(x)$.
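The arithmetic of $GF(2^2)$ described above can be sketched in Python. The encoding of the elements $0, 1, x, x + 1$ as the integers 0, 1, 2, 3 (bit $i$ holding the coefficient of $x^i$) and the function names `gf4_add` and `gf4_mul` are assumptions made for this illustration.

```python
G = 0b111  # g(x) = x^2 + x + 1, bit i is the coefficient of x^i

def gf4_add(a, b):
    """Addition in Z_2[x]: coefficient-wise addition mod 2, i.e. XOR."""
    return a ^ b

def gf4_mul(a, b):
    """Multiply in Z_2[x], then reduce modulo g(x)."""
    prod = 0
    for i in range(2):                 # b has degree at most 1
        if (b >> i) & 1:
            prod ^= a << i             # add a * x^i
    if prod & 0b100:                   # degree-2 term present
        prod ^= G                      # subtract g(x) once
    return prod

names = {0: "0", 1: "1", 2: "x", 3: "x + 1"}
for a in range(4):
    print([names[gf4_mul(a, b)] for b in range(4)])
```

For instance, $x \cdot x = x^2$ reduces to $x + 1$ modulo $g(x)$, and $x \cdot (x + 1) = x^2 + x$ reduces to $1$, so $x$ and $x + 1$ are inverses of each other.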