MINISTRY OF EDUCATION AND TRAINING
HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY
——————————
Nguyen Huy Truong
RESEARCH ON DEVELOPMENT OF METHODS
OF GRAPH THEORY AND AUTOMATA
IN STEGANOGRAPHY AND SEARCHABLE ENCRYPTION
Major: Mathematics and Informatics
Major code: 9460117
ABSTRACT OF DOCTORAL DISSERTATION IN MATHEMATICS
AND INFORMATICS
Hanoi - 2020
The dissertation is completed at:
Hanoi University of Science and Technology
The dissertation will be defended before approval committee at
Hanoi University of Science and Technology
Time , date month year
The dissertation can be found at:
1. Ta Quang Buu Library, Hanoi University of Science and Technology
2. Vietnam National Library
LIST OF PUBLICATIONS
[T1] N. H. Truong (2019), “A New Digital Image Steganography Approach Based on The Galois Field GF(p^m) Using Graph and Automata”, KSII Transactions on Internet and Information Systems, 13(9), pp. 4788-4813. (ISI)
[T2] N. H. Truong (2019), “A New Approach to Exact Pattern Matching”, Journal of Computer Science and Cybernetics, 35(3), pp. 197-216.
[T3] N. H. Truong (2019), “Automata Technique for The LCS Problem”, Journal of Computer Science and Cybernetics, 35(1), pp. 21-37.
[T4] N. H. Truong (2019), “A Novel Cryptosystem Based on Steganography and Automata Technique for Searchable Encryption”, KSII Transactions on Internet and Information Systems (revised). (ISI)
INTRODUCTION
As the use of computers and the Internet becomes ever more essential, digital data (information) can be copied and accessed illegally. As a result, information security becomes increasingly important. There are two popular methods to provide security: cryptography and data hiding. Cryptography encrypts data to make it unreadable by a third party. Data hiding embeds data in digital media. Depending on the purpose of the application, data hiding is generally divided into steganography, which hides the existence of data to protect the embedded data, and watermarking, which protects the copyright ownership and authentication of the digital media carrying the embedded data.
Steganography can be used as an alternative to cryptography. However, steganography becomes weak if attackers detect the existence of hidden data. Hence, integrating cryptography with steganography is a third choice for data security.
With the rapid development of applications based on Internet infrastructure, cloud computing has become one of the hottest topics in information technology. It is an Internet-based computing system that provides on-demand services ranging from application and system software to storage and data processing. For example, when cloud users use the storage service, they can upload information to the servers and then access it online. Meanwhile, enterprises need not spend large sums on owning and maintaining a system of hardware and software. Although cloud computing brings many benefits to individuals and organizations, cloud security is still an open problem, since cloud providers can abuse users' information and cloud users lose control of it. Thus, guaranteeing the privacy of tenants' information without negating the benefits of cloud computing is necessary. To protect cloud users' privacy, sensitive data need to be encrypted before being outsourced to servers. Unfortunately, encryption makes searching on ciphertext much more difficult for the servers than searching on plaintext. To solve this problem, many searchable encryption techniques have been presented since 2000. Searchable encryption not only stores users' encrypted data securely but also allows information search over ciphertext.
Searchable encryption for exact pattern matching is a new class of searchable encryption techniques. The solutions for this class have been based on algorithms for, or approaches to, exact pattern matching.
As in retrieving information from plaintexts, it is necessary to develop searchable encryption with approximate string matching capability, where the search string can be either a predetermined keyword that is encrypted and stored in cloud servers or an arbitrary pattern.
From the above problems, together with the methods using graph theory and automata proposed by P. T. Huy et al. for solving the problems of exact pattern matching (2002), longest common subsequence (2002) and steganography (2011, 2012 and 2013), their potential applications in steganography and searchable encryption, and the direction of the supervisors, the assigned dissertation title is: research on development of methods of graph theory and automata in steganography and searchable encryption.
The purpose of the dissertation is to develop new, high-quality solutions using graph theory and automata, to suggest their applications, and to apply them to steganography and searchable encryption.
Based on the results and suggestions introduced by P. T. Huy et al., the dissertation focuses on the following four problems in steganography and searchable encryption:
- Digital image steganography;
- Exact pattern matching;
- Longest common subsequence;
- Searchable encryption.
For the first three problems, the dissertation's work is to find new and efficient solutions using graph theory and automata. These solutions are then applied to solve the last problem.
The dissertation is structured as follows. Apart from the Introduction at the beginning and the Conclusion at the end, its main content is divided into five chapters.
Chapter 5. Cryptography based on steganography and automata methods for searchable encryption.
The contents of the dissertation are based on the paper [T1] published in, and the revised manuscript [T4] submitted to, KSII Transactions on Internet and Information Systems (ISI), and on the papers [T2, T3] published in Journal of Computer Science and Cybernetics in 2019. The main results of the dissertation have been presented at: the Seminar on Mathematical Foundations for Computer Science at the Institute of Mathematics, Vietnam Academy of Science and Technology; the 9th Vietnam Mathematical Congress, Nha Trang, August 14-18, 2018; and the Seminar at the School of Applied Mathematics and Informatics, Hanoi University of Science and Technology.
CHAPTER 1
PRELIMINARIES
1.1 Basic Structures
1.1.3 Deterministic Finite Automata
The construction and use of deterministic finite automata is one of the objectives of the dissertation. Hence, this subsection clarifies this model of computation.
1.1.4 The Galois Field GF(p^m)
This subsection restates how to construct a finite field with $p^m$ elements, called the Galois field $GF(p^m)$, where $p$ is prime and $m \geq 1$ is an integer. This algebraic structure is used in Chapter 2.
1.2 Digital Image Steganography
The problem of interest in Chapter 2 is digital image steganography. This section recalls the concept of digital images, the basic model of digital image steganography and some parameters that determine its efficiency, and then restates the results that are developed further and used in Chapter 2: the fastest optimal parity assignment (FOPA) method, the module method and the concept of the maximal secret data ratio (MSDR).
The basic model of digital image steganography is shown in Figure 1.4.
[Figure 1.4 The basic diagram of digital image steganography (labels: Cover Image, Stego Image)]
Definition 1.4 (P. T. Huy et al., 2011) $MSDR_k(N)$ is the largest number of bits of secret data that can be embedded in an image block of $N$ pixels by changing the colours of at most $k$ pixels in the image block, where $k, N$ are positive integers.
Let the positive integer $q_{colour}$ denote the number of different ways to change the colour of each pixel in an arbitrary image block of $N$ pixels. Then
$MSDR_k(N) = \lfloor \log_2(1 + q_{colour} C_N^1 + q_{colour}^2 C_N^2 + \cdots + q_{colour}^k C_N^k) \rfloor$    (1.3)
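Formula (1.3) can be evaluated directly. The following is a minimal sketch; the function name `msdr` and its parameter names are illustrative and do not come from the dissertation. It uses exact integer arithmetic to avoid floating-point error in the logarithm.

```python
from math import comb


def msdr(k: int, n_pixels: int, q_colour: int) -> int:
    """Evaluate formula (1.3): floor(log2(sum_{i=0..k} q_colour^i * C(N, i))).

    The i = 0 term equals 1 and stands for the unchanged block.
    """
    total = sum(q_colour ** i * comb(n_pixels, i) for i in range(k + 1))
    # For a positive integer t, floor(log2(t)) == t.bit_length() - 1.
    return total.bit_length() - 1


if __name__ == "__main__":
    # Binary image (q_colour = 1), at most one changed pixel, block of 7 pixels.
    print(msdr(1, 7, 1))
    # q_colour = 3, at most two changed pixels, block of 8 pixels.
    print(msdr(2, 8, 3))
```

For example, with $q_{colour} = 1$, $k = 1$ and $N = 7$ the sum is $1 + 7 = 8$, so the value is 3 bits.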
1.3 Exact Pattern Matching
This section restates the exact pattern matching problem and recalls the concept of the degree of fuzziness (appearance) used in Chapter 3.
Definition 1.5 Let $p$ be a pattern of length $m$ and $x$ be a text of length $n$ over the alphabet $\Sigma$. Then the exact pattern matching problem is to find all occurrences of the pattern $p$ in $x$.
Definition 1.6 (P. T. Huy et al., 2002) Let $p$ be a pattern and $x$ be a text of length $n$ over the alphabet $\Sigma$. Then for each $1 \leq i \leq n$, the degree of appearance of $p$ in $x$ at position $i$ is the length of a longest substring of $x$ that is a prefix of $p$ and whose rightmost letter is $x[i]$.
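Definition 1.6 can be illustrated with a direct, quadratic-time computation. This is only a sketch of the definition, not the automata-based method of Chapter 3; the function names are illustrative. An occurrence of $p$ ends at position $i$ exactly when the degree of appearance at $i$ equals the pattern length.

```python
def degrees_of_appearance(p: str, x: str) -> list[int]:
    """Return the degree of appearance of p in x at positions 1..n (Definition 1.6)."""
    m = len(p)
    degrees = []
    for i in range(1, len(x) + 1):
        best = 0
        # Try the longest candidate substring ending at x[i] first.
        for length in range(min(i, m), 0, -1):
            if x[i - length:i] == p[:length]:
                best = length
                break
        degrees.append(best)
    return degrees


def occurrences(p: str, x: str) -> list[int]:
    """1-based start positions of p in x, read off from the degrees."""
    m = len(p)
    return [i - m + 1
            for i, d in enumerate(degrees_of_appearance(p, x), start=1)
            if d == m]


if __name__ == "__main__":
    print(degrees_of_appearance("aba", "ababa"))
    print(occurrences("aba", "ababa"))
```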
1.4 Longest Common Subsequence
This section recalls the longest common subsequence (LCS) problem and the Knapsack Shaking approach to it, which is developed further in Chapter 4.
Denote an arbitrary longest common subsequence of $p$ and $x$ by $LCS(p, x)$. The length of an $LCS(p, x)$ is denoted by $lcs(p, x)$.
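To make the notation concrete, the sketch below computes one $LCS(p, x)$ and $lcs(p, x)$ using the standard textbook dynamic programming table. It is not the Knapsack Shaking approach or the automata technique of the dissertation, and the function names are illustrative.

```python
def lcs_table(p: str, x: str) -> list[list[int]]:
    """table[i][j] = length of a longest common subsequence of p[:i] and x[:j]."""
    m, n = len(p), len(x)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if p[i - 1] == x[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table


def lcs_length(p: str, x: str) -> int:
    """lcs(p, x)."""
    return lcs_table(p, x)[len(p)][len(x)]


def lcs_string(p: str, x: str) -> str:
    """One LCS(p, x), recovered by backtracking through the table."""
    table = lcs_table(p, x)
    i, j, out = len(p), len(x), []
    while i > 0 and j > 0:
        if p[i - 1] == x[j - 1]:
            out.append(p[i - 1])
            i, j = i - 1, j - 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


if __name__ == "__main__":
    print(lcs_length("BDCABA", "ABCBDAB"), lcs_string("BDCABA", "ABCBDAB"))
```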
Let $p$ and $x$ be two strings of lengths $m$ and $n$ over the alphabet $\Sigma$, $m \leq n$. The longest common subsequence problem for two strings (LCS problem) can be stated in the following two forms.
In cryptography, searchable encryption (SE) can be either searchable symmetric encryption (SSE) or searchable asymmetric encryption (SAE). In SSE, only private key holders can create encrypted data and produce trapdoors for search. In SAE, users who have the public key can make ciphertexts, but only private key holders can generate trapdoors.
CHAPTER 2
DIGITAL IMAGE STEGANOGRAPHY BASED ON THE GALOIS FIELD USING GRAPH THEORY AND AUTOMATA
This chapter first proposes the concepts of optimal and near optimal secret data hiding schemes. It then proposes a new digital image steganography approach based on the Galois field $GF(p^m)$ using graph and automata to design the data hiding scheme of the general form $(k, N, \lfloor \log_2 p^{mn} \rfloor)$ for binary, gray and palette images under the given assumptions, where $k, m, n, N$ are positive integers and $p$ is prime. It gives sufficient conditions for the existence of such schemes and proves the existence of some optimal and near optimal secret data hiding schemes. These results are derived from the concept of the maximal secret data ratio of embedded bits, the module method and the FOPA method proposed by P. T. Huy et al. in 2011, 2012 and 2013, recalled in Section 1.2 of Chapter 1. An application of the schemes to hiding a finite sequence of secret data in an image is also considered. Security analyses and experimental results confirm that the proposed approach can create steganographic schemes that achieve high efficiency in embedding capacity, visual quality, speed and security, which are the key properties of steganography.
The results of Chapter 2 have been published in [T1]
2.1 Introduction
2.2 The Digital Image Steganography Problem
Definition 2.1 A block based secure data hiding scheme in digital images (for short, a data hiding scheme) is a five tuple $(\mathcal{I}, \mathcal{M}, \mathcal{K}, Em, Ex)$, where the following conditions are satisfied.
1. $\mathcal{I}$ is a set of all image blocks with the same size and image type,
2. $\mathcal{M}$ is a finite set of secret elements,
3. $\mathcal{K}$ is a finite set of secret keys,
4. $Em$ is an embedding function to embed a secret element in an image block, $Em : \mathcal{I} \times \mathcal{M} \times \mathcal{K} \to \mathcal{I}$,
5. $Ex$ is an extracting function to extract an embedded secret element from an image block, $Ex : \mathcal{I} \times \mathcal{K} \to \mathcal{M}$,
6. $Ex(Em(I, M, K), K) = M$, $\forall (I, M, K) \in \mathcal{I} \times \mathcal{M} \times \mathcal{K}$.
Definition 2.2 A data hiding scheme $(\mathcal{I}, \mathcal{M}, \mathcal{K}, Em, Ex)$ is called a data hiding scheme $(k, N, r)$, where $k, N, r$ are positive integers, if each image block in $\mathcal{I}$ has $N$ pixels and the embedding function $Em$ can embed $r$ bits of secret data in an arbitrary image block by changing the colours of at most $k$ pixels in the image block.
Definition 2.3 For a given $q_{colour}$, a data hiding scheme $(k, N, r)$ is called an optimal data hiding scheme if $r = MSDR_k(N)$ and there is no $N' < N$ with $r = MSDR_k(N')$. Then $N$ is denoted by $N_{optimum}$.
Definition 2.4 For a given $q_{colour}$, a data hiding scheme $(k, N, r)$ is called a near optimal data hiding scheme if $r = MSDR_k(N)$ and $N > N_{optimum}$.
The chapter's digital image steganography problem. Design optimal or near optimal data hiding schemes $(k, N, r)$ for digital images (binary, gray and palette images).
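The numeric conditions in Definitions 2.3 and 2.4 can be checked mechanically. The sketch below assumes formula (1.3) for $MSDR_k(N)$; the names `msdr`, `n_optimum` and `classify`, and the search bound `n_max`, are illustrative. It tests only the conditions on $k$, $N$, $r$ and $q_{colour}$, not whether a scheme with those parameters actually exists.

```python
from math import comb


def msdr(k: int, n_pixels: int, q_colour: int) -> int:
    """Formula (1.3), evaluated with exact integer arithmetic."""
    total = sum(q_colour ** i * comb(n_pixels, i) for i in range(k + 1))
    return total.bit_length() - 1


def n_optimum(k: int, r: int, q_colour: int, n_max: int = 10_000):
    """Smallest N with MSDR_k(N) == r, or None if r is skipped or not reached."""
    for n_pixels in range(1, n_max + 1):
        value = msdr(k, n_pixels, q_colour)
        if value == r:
            return n_pixels
        if value > r:  # MSDR_k(N) is non-decreasing in N, so r was skipped
            return None
    return None


def classify(k: int, n_pixels: int, r: int, q_colour: int) -> str:
    """Label parameters (k, N, r) according to Definitions 2.3 and 2.4."""
    if msdr(k, n_pixels, q_colour) != r:
        return "neither"
    return "optimal" if n_pixels == n_optimum(k, r, q_colour) else "near optimal"


if __name__ == "__main__":
    print(n_optimum(1, 3, 1), classify(1, 7, 3, 1))
    print(n_optimum(2, 8, 3), classify(2, 9, 8, 3))
```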
2.3 A New Digital Image Steganography Approach
2.3.1 Mathematical Basis based on The Galois Field
Let $GF^n(p^m) = \{(x_1, x_2, \ldots, x_n) \mid x_i \in GF(p^m), \forall i = 1, \ldots, n\}$, where $n$ is a positive integer, with the two operations of vector addition $+$ and scalar multiplication $\cdot$ defined as follows.
$x + y = (x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n)$,
$ax = (ax_1, ax_2, \ldots, ax_n)$, $a \in GF(p^m)$,
where $x, y \in GF^n(p^m)$ and $x = (x_1, x_2, \ldots, x_n)$, $y = (y_1, y_2, \ldots, y_n)$.
Recall that $(GF^n(p^m), +, \cdot)$ is a vector space over the field $GF(p^m)$.
Definition 2.5 The class of an element $x \in GF^n(p^m)$, denoted by $[x]$, is given by
$[x] = \{ax \mid a \in GF(p^m) \setminus \{0\}\}$.
Given a class $[x]$, $x$ is referred to as the representative of $[x]$. For simplicity, denote the class $[0]$ by $0$.
Denote the set of all classes by $[GF^n(p^m)]$. This can be represented by $[GF^n(p^m)] = \{[x] \mid x \in GF^n(p^m)\}$. The number of elements of a set
Definition 2.7 Let $V$ be a vector space over a field $K$, $S \subset V$. Then $S$ is called a k-Generators for $V$, where $k$ is a positive integer, if the following two conditions are satisfied.
a) $\forall v, v' \in S$ with $v \neq v'$, $\nexists a \in K$, $v' = av$,
b) $\forall v \in V \setminus \{0\}$, $\exists t$, $t \leq k$, $v_1, v_2, \ldots, v_t \in S$, $a_1, a_2, \ldots, a_t \in K \setminus \{0\}$, $v = \sum_{i=1}^{t} a_i v_i$.
Theorem 2.1 There exists a k-Generators $S$ for the vector space $GF^n(p^m)$ with $|S| = N$ if and only if there exists a k-[Generators] $S'$ for the set $[GF^n(p^m)]$ with $|S'| = N$.
Proposition 2.4 Let $c$ be the number of k-[Generators] of $N$ elements for the set $[GF^n(p^m)]$. Then the number of k-Generators of $N$ elements for the vector space $GF^n(p^m)$ is $c(p^m - 1)^N$.
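The two conditions of Definition 2.7 can be tested by brute force on small examples. The sketch below is restricted to the prime field case $m = 1$, where arithmetic in $GF(p)$ is arithmetic modulo $p$; this restriction, the function names and the reading of condition a) as applying to distinct elements are assumptions made for illustration.

```python
from itertools import combinations, product


def scale(a, v, p):
    """Scalar multiple a*v in GF(p)^n."""
    return tuple((a * x) % p for x in v)


def vec_sum(vectors, n, p):
    """Sum of vectors in GF(p)^n."""
    total = [0] * n
    for v in vectors:
        for i, x in enumerate(v):
            total[i] = (total[i] + x) % p
    return tuple(total)


def is_k_generators(S, k, n, p):
    """Check conditions a) and b) of Definition 2.7 for S in GF(p)^n (m = 1)."""
    S = list(S)
    # a) no element of S is a scalar multiple of another element of S
    for u, v in combinations(S, 2):
        for a in range(p):
            if scale(a, u, p) == v or scale(a, v, p) == u:
                return False
    # b) every non-zero vector is a combination of at most k elements of S
    #    with non-zero coefficients
    reachable = set()
    for t in range(1, min(k, len(S)) + 1):
        for chosen in combinations(S, t):
            for coeffs in product(range(1, p), repeat=t):
                terms = [scale(a, v, p) for a, v in zip(coeffs, chosen)]
                reachable.add(vec_sum(terms, n, p))
    zero = (0,) * n
    return all(v in reachable for v in product(range(p), repeat=n) if v != zero)


if __name__ == "__main__":
    # One representative per non-zero class of GF(3)^2: a 1-Generators of size 4.
    print(is_k_generators([(1, 0), (0, 1), (1, 1), (1, 2)], 1, 2, 3))
    # The standard basis is a 2-Generators but not a 1-Generators.
    print(is_k_generators([(1, 0), (0, 1)], 2, 2, 3))
```

In the first example, $N = 4 = (3^2 - 1)/(3 - 1)$, which matches the block size appearing in Theorem 2.3 for $p = 3$, $m = 1$, $n = 2$.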
2.3.2 Digital Image Steganography Based on The Galois Field GF(p^m) Using Graph Theory and Automata
Let $\mathcal{I}$ be a set of all image blocks with the same size and image type, and assume that each image block in $\mathcal{I}$ has $N$ pixels, where $N$ is a positive integer. For simplicity, an arbitrary image block $I$ in $\mathcal{I}$ can be represented by $I = \{I_1, I_2, \ldots, I_N\}$, where $I_i$ is the colour value (for binary and gray images) or the colour index in the palette (for palette images) of the $i$th pixel in $I$, $\forall i = 1, \ldots, N$. Let $C$ be the set of all colour values or indexes of pixels of $I$.
Let $\mathcal{M}$ be a finite set of secret elements and set $\mathcal{M} = GF^n(p^m)$. Let $\mathcal{K}$ be a finite set of secret keys. For all $K \in \mathcal{K}$, also assume that the structure of the key $K$ is the same as the structure of the image block $I$. So, we can write $K = \{K_1, K_2, \ldots, K_N\}$ with $K_i \in GF(p^m)$, $\forall i = 1, \ldots, N$.
Assume that a k-Generators $S$ for $GF^n(p^m)$ with $|S| = N$ has been found, and write $S = \{v_1, v_2, \ldots, v_N\}$.
Definition 2.8 A weighted directed graph $G = (V, E)$ is called a flip graph over the Galois field $GF(p^m)$ (for short, a flip graph) if the following two conditions are satisfied.
1. $V = C$ and every vertex $v \in V$ is assigned a weight by a function $Val$ such that $Val(v) \in GF(p^m)$,
2. For all $cp \in V$ and all $a \in GF(p^m) \setminus \{0\}$, there exists a unique arc $(cp, cp') \in E$ that is assigned the weight $a$, and it satisfies $Val(cp') = Val(cp) + a$ (in $GF(p^m)$).
Given a flip graph $G$, we denote by $Adjacent(cp, a)$ the adjacent vertex of $cp$ such that the weight $a$ is assigned to the arc $(cp, Adjacent(cp, a))$.
Assume that a flip graph $G = (V, E)$ has been built.
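One possible flip graph for gray images can be sketched as follows. The choices here are assumptions made for illustration and are not taken from the dissertation: the field is the prime field $GF(p)$ (that is, $m = 1$), the colours are the integers 0 to 255, $Val(c) = c \bmod p$, and $Adjacent(c, a)$ is the nearest colour with the required weight, which keeps the visual change small.

```python
def build_flip_graph(colours, p):
    """Return (val, adjacent) for a flip graph over GF(p) on the given colours.

    val[c] is the weight of vertex c; adjacent[(c, a)] is the end vertex of
    the unique arc leaving c with weight a, so val[adjacent[(c, a)]] equals
    (val[c] + a) mod p, as required by Definition 2.8.
    """
    colours = list(colours)
    val = {c: c % p for c in colours}
    adjacent = {}
    for c in colours:
        for a in range(1, p):
            target = (val[c] + a) % p
            candidates = [d for d in colours if val[d] == target]
            if not candidates:
                raise ValueError("not enough colours: need every weight of GF(p)")
            # Choose the closest colour; break ties towards the smaller value.
            adjacent[(c, a)] = min(candidates, key=lambda d: (abs(d - c), d))
    return val, adjacent


if __name__ == "__main__":
    val, adjacent = build_flip_graph(range(256), 3)
    print(adjacent[(0, 1)], adjacent[(0, 2)], adjacent[(255, 1)])
```

With these choices each colour has exactly $p - 1$ outgoing arcs, one per non-zero weight, so each pixel can be changed in $p - 1$ ways.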
From the way the arc set $E$ is determined in Definition 2.8, assume that
$|C| \geq p^m$ and $q_{colour} = p^m - 1$    (2.1)
Let $I \in \mathcal{I}$ be an image block, $M \in \mathcal{M}$ a secret element and $K \in \mathcal{K}$ a key. By using the automaton $A(I, M, K)$ and the flip graph $G$, the two functions $Em$ and $Ex$ in the data hiding scheme $(\mathcal{I}, \mathcal{M}, \mathcal{K}, Em, Ex)$ are designed.
Theorem 2.2 Suppose that a k-Generators $S$ for the vector space $GF^n(p^m)$ has been found and a flip graph $G$ has been built. Then there exists the data hiding scheme $(k, N, \lfloor \log_2 p^{mn} \rfloor)$, where $N = |S|$.
Security analysis of the proposed data hiding scheme $(k, N, \lfloor \log_2 p^{mn} \rfloor)$: Assume that the parameters $k$, $N$, $Em$, $Ex$, the vector space $GF^n(p^m)$ and the flip graph $G$ in the data hiding scheme $(k, N, \lfloor \log_2 p^{mn} \rfloor)$ are published.
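$c (p^m - 1)^N \, N! \, p^{mN} \, C_{p^{mn}}^{2^{\lfloor \log_2 p^{mn} \rfloor}} \, \left(2^{\lfloor \log_2 p^{mn} \rfloor}\right)!$    (2.12)
Theorem 2.3 Suppose that a flip graph $G$ has been built. Then there exists the optimal data hiding scheme $\left(1, \frac{p^{mn} - 1}{p^m - 1}, \lfloor \log_2 p^{mn} \rfloor\right)$ for $q_{colour} = p^m - 1$.
Proposition 2.6 For every positive integer $n$, there exists the optimal data hiding scheme $(1, 2^n - 1, n)$ for binary, gray and palette images with $q_{colour} = 1$.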
Notice that if we set $N = 2^n - 1$, then the data hiding scheme $(1, 2^n - 1, n)$ becomes the data hiding scheme $(1, N, \lfloor \log_2(N + 1) \rfloor)$. Recall that, for a positive integer $N$, the data hiding scheme $(1, N, \lfloor \log_2(N + 1) \rfloor)$ for binary images with $q_{colour} = 1$ is the data hiding scheme CTL (Chang et al., 2005). So, Proposition 2.6 shows that the data hiding scheme CTL is an optimal data hiding scheme for $N = 2^n - 1$, where $n$ is a positive integer.
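A scheme with parameters $(1, 2^n - 1, n)$ for binary blocks can be sketched with a well-known XOR-of-indices construction. This is an illustrative stand-in, not the dissertation's automaton $A(I, M, K)$ or the CTL algorithm itself. The pixel indices $1, \ldots, N$, read as $n$-bit vectors, are exactly the non-zero vectors of $GF^n(2)$. Combining the key with the pixels by XOR is an assumption.

```python
def extract(block, key):
    """Ex: XOR of the 1-based indices i whose keyed bit block[i-1] ^ key[i-1] is 1."""
    secret = 0
    for i, (pixel, k) in enumerate(zip(block, key), start=1):
        if pixel ^ k:
            secret ^= i
    return secret


def embed(block, message, key):
    """Em: hide an n-bit message in a block of N = 2**n - 1 binary pixels.

    At most one pixel is flipped: the one whose index equals the XOR of the
    currently extracted value and the message.
    """
    diff = extract(block, key) ^ message
    stego = list(block)
    if diff:  # diff lies in 1..N, so it is a valid 1-based pixel index
        stego[diff - 1] ^= 1
    return stego


if __name__ == "__main__":
    n = 3
    block = [1, 0, 1, 1, 0, 0, 1]   # N = 2**3 - 1 = 7 pixels
    key = [0, 1, 1, 0, 1, 0, 0]     # hypothetical secret key, same shape as the block
    message = 0b101
    stego = embed(block, message, key)
    print(stego, extract(stego, key) == message)
```

Flipping pixel $d$ changes the extracted value by XOR with $d$, so condition 6 of Definition 2.1 holds and at most $k = 1$ pixel changes.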
Theorem 2.4 Suppose that a 2-Generators $S$ for the vector space $GF^n(p^m)$ with $|S| =$ […] has been found and a flip graph $G$ has been built. Then there exists the optimal data hiding scheme $(2, |S|, \lfloor \log_2 p^{mn} \rfloor)$ for $q_{colour} = p^m - 1$.
2.4 The Near Optimal and Optimal Data Hiding Schemes for Gray and Palette Images
Consider the case $k = p = m = 2$ and $n = 4$. The data hiding scheme $(2, N, 8)$ exists if the hypothesis of Theorem 2.2 is satisfied, that is, if a 2-Generators $S$ for the vector space $GF^4(2^2)$ with $|S| = N$ has been found and a flip graph $G$ over the Galois field $GF(2^2)$ has been built.
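As a worked check of this case, assume formula (1.3) with $q_{colour} = p^m - 1 = 3$ from (2.1). Then $MSDR_2(N)$ first reaches 8 at $N = 8$:

```latex
\begin{align*}
MSDR_2(7) &= \lfloor \log_2(1 + 3\,C_7^1 + 3^2\,C_7^2) \rfloor
           = \lfloor \log_2(1 + 21 + 189) \rfloor
           = \lfloor \log_2 211 \rfloor = 7, \\
MSDR_2(8) &= \lfloor \log_2(1 + 3\,C_8^1 + 3^2\,C_8^2) \rfloor
           = \lfloor \log_2(1 + 24 + 252) \rfloor
           = \lfloor \log_2 277 \rfloor = 8.
\end{align*}
```

So under Definition 2.3 a scheme $(2, N, 8)$ can be optimal only for $N = 8$, and under Definition 2.4 it is near optimal for any larger $N$ with $MSDR_2(N) = 8$. This arithmetic says nothing about whether a 2-Generators set of a given size exists.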