MINISTRY OF EDUCATION AND TRAINING
HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY
——————————
Nguyen Huy Truong
RESEARCH ON DEVELOPMENT OF METHODS
OF GRAPH THEORY AND AUTOMATA
IN STEGANOGRAPHY AND SEARCHABLE ENCRYPTION
DOCTORAL DISSERTATION IN MATHEMATICS AND
INFORMATICS
Major: Mathematics and Informatics Major code: 9460117
SUPERVISORS:
1. Assoc. Prof. Dr. Sc. Phan Thi Ha Duong
2. Dr. Vu Thanh Nam
DECLARATION OF AUTHORSHIP
I hereby certify that I am the author of this dissertation, and that I have completed it
under the supervision of Assoc. Prof. Dr. Sc. Phan Thi Ha Duong and Dr. Vu Thanh Nam. I also certify that the dissertation’s results have not been published by other authors.
I also wish to thank the members of the Seminar on Mathematical Foundations for Computer Science at the Institute of Mathematics, Vietnam Academy of Science and Technology, for their valuable comments and helpful advice.
I give thanks to the PhD students of the late Assoc. Prof. Dr. Phan Trung Huy for sharing
and exchanging information in steganography and searchable encryption.
Finally, I must also thank my family for supporting all my work.
TABLE OF CONTENTS
LIST OF SYMBOLS
LIST OF ABBREVIATIONS
LIST OF FIGURES
LIST OF TABLES
INTRODUCTION
CHAPTER 1 PRELIMINARIES
1.1 Basic Structures
1.1.1 Strings
1.1.2 Graph
1.1.3 Deterministic Finite Automata
1.1.4 The Galois Field $GF(p^m)$
1.2 Digital Image Steganography
1.3 Exact Pattern Matching
1.4 Longest Common Subsequence
1.5 Searchable Encryption
CHAPTER 2 DIGITAL IMAGE STEGANOGRAPHY BASED ON THE GALOIS FIELD USING GRAPH THEORY AND AUTOMATA
2.1 Introduction
2.2 The Digital Image Steganography Problem
2.3 A New Digital Image Steganography Approach
2.3.1 Mathematical Basis based on The Galois Field
2.3.2 Digital Image Steganography Based on The Galois Field $GF(p^m)$ Using Graph Theory and Automata
2.4 The Near Optimal and Optimal Data Hiding Schemes for Gray and Palette Images
2.5 Experimental Results
2.6 Conclusions
CHAPTER 3 AN AUTOMATA APPROACH TO EXACT PATTERN MATCHING
3.1 Introduction
3.2 The New Algorithm - The MRc Algorithm
3.3 Analysis of The MRc Algorithm
3.4 Experimental Results
3.5 Conclusions
CHAPTER 4 AUTOMATA TECHNIQUE FOR THE LONGEST COMMON SUBSEQUENCE PROBLEM
4.1 Introduction
4.2 Mathematical Basis
4.3 Automata Models for Solving The LCS Problem
4.4 Experimental Results
4.5 Conclusions
CHAPTER 5 CRYPTOGRAPHY BASED ON STEGANOGRAPHY AND AUTOMATA METHODS FOR SEARCHABLE ENCRYPTION
5.1 Introduction
5.2 A Novel Cryptosystem Based on The Data Hiding Scheme $(2, 9, 8)$
5.3 Automata Technique for Exact Pattern Matching on Encrypted Data
5.4 Automata Technique for Approximate Pattern Matching on Encrypted Data
5.5 Conclusions
CONCLUSION
LIST OF PUBLICATIONS
BIBLIOGRAPHY
LIST OF SYMBOLS
$\Sigma^*$  The set of all strings on $\Sigma$
$|S|$  The number of elements of a set $S$
$|u|$  The length of a string $u$
$GF(p^m)$  The Galois field constructed from the polynomial ring $\mathbb{Z}_p[x]$, where $p$ is prime and $m$ is a positive integer
$(GF^n(p^m), +, \cdot)$  A vector space over the field $GF(p^m)$
$LeftID(u)$  The least element the leftmost location of $u$
$Rm_p(u)$  The last component of $LeftID(u)$ in $p$
$(I, M, K, Em, Ex)$  A data hiding scheme
$I$  A set of all image blocks with the same size and image format
$Ex$  An extracting function that extracts an embedded secret element from an image block
$q_{colour}$  The number of different ways to change the colour of each pixel in an arbitrary image block
c block  A string of length $c$
$Pos_p(z)$  The last position of appearance of $z$ in $p$
$M_p$  An automaton accepting the pattern $p$
$WConfig(p)$  The set of the weights of all the configurations of $p$
$W_p^i(a)$  The weight of $a$ at the location $i$ in $p$
$Wm_p(a)$  The heaviest weight of $a$ in $p$
$W(a)$  The weight of $a$ in $p$
LIST OF ABBREVIATIONS
AOSO Average Optimal Shift Or
BFS Breadth First Search
BNDM Backward Nondeterministic Dawg Matching
EBOM Extended Backward Oracle Matching
FOPA Fastest Optimal Parity Assignment
HCIH High Capacity of Information Hiding
LCS Longest Common Subsequence
LSB Least Significant Bit
MSDR Maximal Secret Data Ratio
SAE Searchable Asymmetric Encryption
SSE Searchable Symmetric Encryption
TVSBS Thathoo Virmani Sai Balakrishnan Sekar
LIST OF FIGURES
Figure 1.1 A simple graph
Figure 1.2 A spanning tree of the graph given in Figure 1.1
Figure 1.3 The transition diagram of A in Example 1.3
Figure 1.4 The basic diagram of digital image steganography
Figure 1.5 The degree of appearance of the pattern p
Figure 2.1 The nine commonly used 8-bit gray cover images sized 512×512 pixels
Figure 2.2 The nine commonly used 8-bit palette cover images sized 512×512 pixels
Figure 2.3 The binary cover image sized 2592×1456 pixels
Figure 3.1 Sliding window mechanism
Figure 3.2 The basic idea of the proposed approach
Figure 3.3 The transition diagram of the automaton $M_p$, $p = abcba$
LIST OF TABLES
Table 1.1 An adjacency list representation of the simple graph given in Figure 1.1
Table 1.2 The performing steps of the BF algorithm
Table 1.3 The dynamic programming matrix L
Table 2.1 Elements of the Galois field $GF(2^2)$ represented by binary strings and decimal numbers
Table 2.2 Operations $+$ and $\cdot$ on the Galois field $GF(2^2)$
Table 2.3 The representation of E and the arc weights of G for the gray image
Table 2.4 The payload, ER and PSNR for the optimal data hiding scheme $(1, 2^n - 1, n)$ for palette images with $q_{colour} = 1$
Table 2.5 The payload, ER and PSNR for the near optimal data hiding scheme $(2, 9, 8)$ for gray images with $q_{colour} = 3$
Table 2.6 The payload, ER and PSNR for the near optimal data hiding scheme $(2, 9, 8)$ for palette images with $q_{colour} = 3$
Table 2.7 The comparisons of embedding and extracting time between the chapter’s and Chang et al.’s approaches for the same optimal data hiding scheme $(1, N, \lfloor \log_2(N + 1) \rfloor)$, where $N = 2^n - 1$, for the binary image with $q_{colour} = 1$. Time is given in seconds
Table 3.1 The performing steps of the MR1 algorithm
Table 3.2 Experimental results on rand4 problem
Table 3.3 Experimental results on rand8 problem
Table 3.4 Experimental results on rand16 problem
Table 3.5 Experimental results on rand32 problem
Table 3.6 Experimental results on rand64 problem
Table 3.7 Experimental results on rand128 problem
Table 3.8 Experimental results on rand256 problem
Table 3.9 Experimental results on a genome sequence (with $|\Sigma| = 4$)
Table 3.10 Experimental results on a protein sequence (with $|\Sigma| = 20$)
Table 4.1 The $Ref_p$ of $p = bacdabcad$
Table 4.2 The comparisons of the $lcs(p, x)$ computation time for $n = 50666$
Table 4.3 The comparisons of the $lcs(p, x)$ computation time for $n = 102398$
INTRODUCTION
In modern life, as the use of computers and the Internet becomes more and more essential,
digital data (information) can be copied as well as accessed illegally. As a result, information security becomes increasingly important. There are two popular methods to provide security, which are cryptography and data hiding [2, 5, 6, 20, 56, 62, 81].
copyright ownership and authentication of the digital media carrying the embedded data. Steganography can be used as an alternative to cryptography. However,
steganography becomes weak if attackers detect the existence of hidden data. Hence,
integrating cryptography with steganography is a third choice for data security
storage service, they can upload information to the servers and then access it online over the Internet. Meanwhile, enterprises can not spend big money on maintaining and owning a system consisting of hardware and software. Although cloud computing brings many
benefits for individuals and organizations, cloud security is still an open problem when cloud providers can abuse their information and cloud users lose control of it. Thus, guaranteeing
Searchable encryption for exact pattern matching is a new class of searchable encryption techniques. The solutions for this class have been presented based on algorithms for [26]
be a keyword determined, encrypted and stored in cloud servers or an arbitrary pattern [28, 40, 71].
From the above problems, together with the high efficiency of techniques using graph and
automata proposed by P. T. Huy et al. for dealing with problems of exact pattern matching (2002), longest common subsequence (2002) and steganography (2011, 2012 and 2013), as well as potential applications of graph theory and automata approaches suggested by the late
Assoc. Prof. Phan Trung Huy in steganography and searchable encryption, and under
the direction of supervisors, the dissertation title assigned is Research on development
of methods of graph theory and automata in steganography and searchable encryption.
The purpose of the dissertation is to research the development of new, high-quality solutions using graph theory and automata, suggesting their applications in, and applying
them to, steganography and searchable encryption.
Based on results published and suggestions presented by the late Assoc. Prof. Phan Trung
Huy in steganography and searchable encryption, the dissertation will focus on the following
four problems in these fields:
- Digital image steganography;
- Exact pattern matching;
- Longest common subsequence;
clearly and analysed very carefully in the chapters of the dissertation.
For the first three problems, the dissertation’s work is to find new and efficient solutions
using graph theory and automata. These solutions are then applied to solve the last problem.
The dissertation is structured as follows. Apart from the
Introduction at the beginning and the Conclusion at the end, the main
content is divided into five chapters.
Chapter 1 Preliminaries. This chapter recalls basic knowledge used throughout the dissertation (strings, graph, deterministic finite automata, digital images, the basic model of digital image steganography, some parameters to determine the
approach, and the definition of a cryptosystem).
Chapter 2 Digital image steganography based on the Galois field using
graph theory and automata. Firstly, from some proposed concepts of optimal and near optimal secret data hiding schemes, this chapter states the problem of interest in digital
image steganography. Secondly, the chapter proposes a new approach based on the Galois field using graph theory and automata to design a general form of steganography in binary, gray and palette images, shows sufficient conditions for existence and proves existence of some optimal and near optimal secret data hiding schemes, applies the proposed schemes
the proposed algorithm is shown by theoretical analyses and experimental results.
Chapter 4 Automata technique for the longest common subsequence problem. This chapter proposes two efficient sequential and parallel algorithms for
computing the length of a longest common subsequence of two strings in practice, using the
automata technique. Theoretical analysis of the parallel algorithm and experimental results
confirm that the use of the automata technique in designing algorithms for solving the
longest common subsequence problem is the best choice.
Chapter 5 Cryptography based on steganography and automata methods
for searchable encryption. This chapter first proposes a novel cryptosystem based on
The contents of the dissertation are written based on the paper [T1] published in 2019, the paper [T4] accepted for publication in 2020 in KSII Transactions on Internet and
Information Systems (ISI), and the papers [T2, T3] published in Journal of Computer Science and Cybernetics in 2019. The main results of the dissertation have been presented at:
- Seminar on Mathematical Foundations for Computer Science at Institute of
Mathematics, Vietnam Academy of Science and Technology,
- The 9th Vietnam Mathematical Congress, Nha Trang, August 14-18, 2018,
- Seminar at School of Applied Mathematics and Informatics, Hanoi University of Science and Technology.
CHAPTER 1 PRELIMINARIES
This chapter recalls the terminology, concepts, algorithms and results that
are needed to present the dissertation’s new results clearly and logically,
and to help readers follow the content of the dissertation easily. The background
knowledge presented here consists of basic structures (Section 1.1: strings (Subsection 1.1.1), graph (Subsection 1.1.2), deterministic finite automata (Subsection 1.1.3), and the Galois field $GF(p^m)$ (Subsection 1.1.4)), digital image steganography (Section 1.2), exact pattern matching (Section 1.3), longest common subsequence (Section 1.4) and searchable encryption (Section 1.5).
Let $x$ be a string. A string $p$ is called a substring of the string $x$ if $x = u_1 p u_2$ for some
strings $u_1$ and $u_2$. In case $u_1 = \varepsilon$ (resp. $u_2 = \varepsilon$), where $\varepsilon$ denotes the empty string, the string $p$ is called a prefix (resp. suffix)
of the string $x$. The prefix (resp. suffix) $p$ is called proper if $p \neq x$. Note that the prefix
or the suffix can be empty.
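The definitions above can be checked mechanically. The following is a minimal sketch, assuming strings are ordinary Python `str` values; the function names (`decompositions`, `is_prefix`, and so on) are illustrative and do not come from the dissertation.

```python
def decompositions(p, x):
    """Return every pair (u1, u2) such that x == u1 + p + u2."""
    return [(x[:i], x[i + len(p):])
            for i in range(len(x) - len(p) + 1)
            if x[i:i + len(p)] == p]


def is_substring(p, x):
    # p is a substring of x if at least one decomposition exists
    return bool(decompositions(p, x))


def is_prefix(p, x):
    # prefix: some decomposition has an empty u1
    return any(u1 == "" for u1, _ in decompositions(p, x))


def is_suffix(p, x):
    # suffix: some decomposition has an empty u2
    return any(u2 == "" for _, u2 in decompositions(p, x))


def is_proper(p, x):
    # a prefix or suffix p of x is proper if it differs from x
    return p != x
```

For instance, `decompositions("ab", "abcab")` lists two decompositions, one showing that `"ab"` is a prefix of `"abcab"` and one showing that it is a suffix.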
nonempty finite set of vertices V and a finite set of edges, where each edge has either one
or two vertices associated with it. A graph with weights assigned to its edges is called a weighted graph.
An edge connecting a vertex to itself is called a loop. Multiple edges are edges connecting
the same vertices. A graph having no loops and no multiple edges is called a simple graph.
adjacent vertices of any vertex of the graph.
Example 1.1 Using adjacency lists, the simple graph given in Figure 1.1 can be represented as in Table 1.1.
Figure 1.1. A simple graph
Table 1.1. An adjacency list representation of the simple graph given in Figure 1.1
Vertex Adjacent vertices
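As an illustration of the adjacency list representation, here is a minimal sketch. The graph in the usage note is hypothetical (the data of Figure 1.1 is not reproduced here), and the function name `adjacency_lists` is an assumption made for this example.

```python
def adjacency_lists(vertices, edges):
    """Build adjacency lists of a simple undirected graph.

    Loops and multiple edges are rejected, matching the definition
    of a simple graph.
    """
    adj = {v: [] for v in vertices}
    seen = set()
    for u, v in edges:
        if u == v:
            raise ValueError(f"loop at vertex {u}")
        key = frozenset((u, v))
        if key in seen:
            raise ValueError(f"multiple edge between {u} and {v}")
        seen.add(key)
        adj[u].append(v)
        adj[v].append(u)
    # sort each list so the output does not depend on edge order
    return {v: sorted(neighbours) for v, neighbours in adj.items()}
```

Calling `adjacency_lists([1, 2, 3, 4], [(1, 2), (1, 3), (2, 3), (3, 4)])` yields one row per vertex, each listing the adjacent vertices, which is the shape of the table above.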
Remove the first vertex i from L;
j and the vertex j is said to be adjacent from the vertex i.
1.1.3 Deterministic Finite Automata
such that for all $q \in Q$, $s \in \Sigma^*$, $a \in \Sigma$, $\delta(q, as) = \delta(\delta(q, a), s)$ and $\delta(q, \varepsilon) = q$, where $\varepsilon$ denotes the empty string.
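The two identities defining the extension of $\delta$ to strings translate directly into code. The sketch below assumes the transition function is stored as a Python dictionary keyed by (state, letter); the example automaton, which tracks the parity of the number of letters `a`, is hypothetical.

```python
def extended_delta(delta, q, s):
    """Extend delta from letters to strings.

    Implements delta(q, empty string) = q and
    delta(q, a + s) = delta(delta(q, a), s).
    """
    if s == "":
        return q
    return extended_delta(delta, delta[(q, s[0])], s[1:])


# Hypothetical DFA over {a, b}: the state records whether the number
# of letters 'a' read so far is even or odd.
PARITY_DELTA = {
    ("even", "a"): "odd",
    ("even", "b"): "even",
    ("odd", "a"): "even",
    ("odd", "b"): "odd",
}
```

With this table, `extended_delta(PARITY_DELTA, "even", "abab")` returns `"even"`, since the string contains two occurrences of `a`. The recursion mirrors the definition; for long inputs an iterative loop over the letters computes the same value without deep recursion.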
a) Each state of $Q$ is a vertex.
b) Let $q' = \delta(q, a)$, where $q$ is a state of $Q$ and $a$ is a letter of $\Sigma$. Then the transition
diagram has an arc $(q, q')$ labeled $a$. If there are several letters that cause transitions from
$q$ to $q'$, then the arc $(q, q')$ is labeled by a list of these letters.
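The construction of the transition diagram can be sketched as follows, assuming the transition function is given as a dictionary keyed by (state, letter) and that an arc is directed from a state to the state reached from it. The function name and the sample table are illustrative only.

```python
def transition_arcs(delta):
    """Group transitions into labelled arcs of the transition diagram.

    Returns a dict mapping each arc (source, target) to the sorted
    list of letters that cause that transition.
    """
    arcs = {}
    for (q, a), q_next in delta.items():
        arcs.setdefault((q, q_next), []).append(a)
    return {arc: sorted(labels) for arc, labels in arcs.items()}


# Hypothetical transition table: both letters lead from state 0 to
# state 1, so the diagram has a single arc (0, 1) labelled "a, b".
SAMPLE_DELTA = {
    (0, "a"): 1,
    (0, "b"): 1,
    (1, "a"): 1,
    (1, "b"): 0,
}
```

Here `transition_arcs(SAMPLE_DELTA)` produces three arcs, with the arc `(0, 1)` carrying the list `["a", "b"]`.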
will be used in Chapter 2
Let $p$ be a prime number. Define $\mathbb{Z}_p[x]$ to be the set of all polynomials with the variable
$x$, whose coefficients belong to the field $\mathbb{Z}_p$. Addition and multiplication in $\mathbb{Z}_p[x]$ are defined
$x$ in $f(x)$. A polynomial $f(x) \in \mathbb{Z}_p[x]$ is called irreducible if there do not exist
polynomials $f_1(x), f_2(x) \in \mathbb{Z}_p[x]$ such that
$f(x) = f_1(x) f_2(x)$,
where $\deg(f_1) > 0$ and $\deg(f_2) > 0$.
Let $f(x) \in \mathbb{Z}_p[x]$ be an irreducible polynomial with $\deg(f) = m \geq 1$. Define
$\mathbb{Z}_p[x]/(f(x))$ to be the set of $p^m$ polynomials of degree at most $m - 1$ in $\mathbb{Z}_p[x]$. Addition
and multiplication in $\mathbb{Z}_p[x]/(f(x))$ are given as in $\mathbb{Z}_p[x]$, followed by a reduction modulo
$f(x)$. Then $\mathbb{Z}_p[x]/(f(x))$ with these operations is a field having $p^m$ elements, called the Galois field $GF(p^m)$. Note that for $p$ prime and $m \geq 1$, the Galois field $GF(p^m)$ is unique up to isomorphism.
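To make the construction concrete, the following is a minimal sketch of arithmetic in $\mathbb{Z}_p[x]/(f(x))$. It assumes polynomials are stored as coefficient lists from the constant term upwards and that $f$ is monic; the function names are illustrative. The example field is $GF(2^2)$ with $f(x) = x^2 + x + 1$, which is irreducible over $\mathbb{Z}_2$ because it has no root in $\mathbb{Z}_2$.

```python
def poly_mod(a, f, p):
    """Reduce polynomial a modulo the monic polynomial f over Z_p.

    Coefficient lists run from the constant term upwards; the result
    always has exactly deg(f) coefficients.
    """
    a = [c % p for c in a]
    m = len(f) - 1  # degree of f
    for i in range(len(a) - 1, m - 1, -1):
        c = a[i]
        if c:
            # subtract c * x^(i - m) * f(x) to cancel the x^i term
            for j in range(m + 1):
                a[i - m + j] = (a[i - m + j] - c * f[j]) % p
    a = a[:m]
    return a + [0] * (m - len(a))


def gf_add(a, b, p):
    """Add two field elements coefficient-wise modulo p."""
    return [(x + y) % p for x, y in zip(a, b)]


def gf_mul(a, b, f, p):
    """Multiply in Z_p[x], then reduce modulo f(x)."""
    prod = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            prod[i + j] = (prod[i + j] + x * y) % p
    return poly_mod(prod, f, p)


F = [1, 1, 1]  # f(x) = 1 + x + x^2 over Z_2
X = [0, 1]     # the element x
```

In this field `gf_mul(X, X, F, 2)` gives `[1, 1]`, that is $x \cdot x = x + 1$, and `gf_mul(X, [1, 1], F, 2)` gives `[1, 0]`, so $x + 1$ is the multiplicative inverse of $x$.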
1.2 Digital Image Steganography
assignment (FOPA) method, the module method and the concept of the maximal secret data ratio (MSDR) [18, 20, 21, 39, 49, 50, 51, 53, 61, 63, 65, 76, 78, 104].
A digital image is a matrix of pixels. Each pixel is represented by a nonnegative integer
the cover image is a digital image used as a carrier to embed secret data into, the stego
image is the digital image obtained after embedding secret data into the cover image by the
function block Embed with the secret key on the Sender side. For steganography generally,
the secret data needs to be extracted fully by the block Extract with the secret key on the Receiver side [20, 61, 63, 76].
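The roles of the blocks Embed and Extract can be illustrated with a deliberately simple stand-in. The sketch below uses plain least significant bit (LSB) replacement with key-seeded pixel positions; this is an assumption chosen for brevity, not one of the schemes developed in the dissertation. The image is modelled as a flat list of nonnegative integers, and all function names are hypothetical.

```python
import random


def _positions(num_pixels, num_bits, key):
    """Pick num_bits distinct pixel positions, determined by the secret key."""
    rng = random.Random(key)
    return rng.sample(range(num_pixels), num_bits)


def embed(cover, bits, key):
    """Sender side: return a stego image carrying the secret bits."""
    stego = list(cover)
    for pos, bit in zip(_positions(len(cover), len(bits), key), bits):
        stego[pos] = (stego[pos] & ~1) | bit  # overwrite the LSB
    return stego


def extract(stego, num_bits, key):
    """Receiver side: recover the secret bits using the same key."""
    return [stego[pos] & 1 for pos in _positions(len(stego), num_bits, key)]
```

With the same key on both sides, `extract(embed(cover, bits, key), len(bits), key)` returns `bits`, and each pixel value changes by at most 1. The sketch requires that the number of secret bits does not exceed the number of pixels.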
The total number of the secret data sequence bits embedded in the cover image is called
1. $d(c, Next(c)) = \min_{v \in P,\, v \neq c} d(c, v)$,
2. $Val(c) = Val(Next(c)) + 1$ on the field $\mathbb{Z}_2$.
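The two conditions can be satisfied constructively. The following is a minimal sketch under stated assumptions, and it is not the FOPA algorithm itself: the palette is a list of at least two colours, `dist` is a symmetric distance, and $Next(c)$ is taken to be the nearest other colour with ties broken by the smallest index. With this tie-breaking every chain of nearest neighbours ends in a pair of mutually nearest colours, so a consistent parity assignment exists.

```python
def nearest(palette, dist):
    """Next[i] = index of the closest other colour (ties: smallest index)."""
    n = len(palette)
    return [min((j for j in range(n) if j != i),
                key=lambda j: (dist(palette[i], palette[j]), j))
            for i in range(n)]


def parity_assignment(palette, dist):
    """Return (nxt, val) with val[c] == val[nxt[c]] + 1 (mod 2) for all c."""
    nxt = nearest(palette, dist)
    val = [None] * len(palette)
    for start in range(len(palette)):
        # follow start -> Next(start) -> ... until a valued or repeated colour
        chain = []
        c = start
        while val[c] is None and c not in chain:
            chain.append(c)
            c = nxt[c]
        if val[c] is None:
            val[c] = 0  # chain closed on itself: fix one parity arbitrarily
        for v in reversed(chain):
            if val[v] is None:
                val[v] = val[nxt[v]] ^ 1
    return nxt, val


def squared_euclidean(c1, c2):
    return sum((a - b) ** 2 for a, b in zip(c1, c2))
```

For an RGB palette given as tuples, `parity_assignment(palette, squared_euclidean)` returns index lists `nxt` and `val` such that every colour and its nearest colour carry opposite parities.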
Call $G_P = (V_P, E_P)$ a weighted complete undirected graph of the palette image $G$, where
$V_P = P$ and the weight of the edge $\{c, c'\}$ is $d(c, c')$. The function $Nearest$, $Nearest: P \to P$,
an algorithm determining F is the essence of the FOPA method.
Algorithm for FOPA:
a) Take one element $v \in C$;
b) Initialize $v_0 = v$, set $Val(v_0) = 0$ (or 1 randomly); by a finite loop, find a longest
sequence of $k + 1$ different elements in $P$ consecutively, $v_0, v_1, \ldots, v_k$, such that
$Nearest(v_i) = v_{i+1}$ for $i = 0, \ldots, k - 1$, $v_i \in C$, $v_k \in C$ or $v_k \in V$, and set
$U = M \setminus \{0\}$ is a unique 1-base of $M$ [51]. Two functions $Next$, $Next: C_G \to C_G$, and
$Val$, $Val: C_G \to \mathbb{Z}_2$, satisfying the condition $Val(c) = Val(Next(c)) + 1$ on the ring $\mathbb{Z}_2$, are
data, embedded in and extracted from the image block I with the key K by the blocks
Embed and Extract as follows [49, 51]