MINISTRY OF EDUCATION AND TRAINING
HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY
Nguyen Huy Truong
RESEARCH ON DEVELOPMENT OF METHODS
OF GRAPH THEORY AND AUTOMATA
IN STEGANOGRAPHY AND SEARCHABLE ENCRYPTION
DOCTORAL DISSERTATION IN MATHEMATICS AND
INFORMATICS
Major: Mathematics and Informatics
DECLARATION OF AUTHORSHIP
I hereby certify that I am the author of this dissertation, and that I have completed it under the supervision of Assoc. Prof. Dr. Sc. Phan Thi Ha Duong and Dr. Vu Thanh Nam. I also certify that the dissertation's results have not been published by other authors.
ACKNOWLEDGMENTS
I am extremely grateful to Assoc. Prof. Dr. Sc. Phan Thi Ha Duong.
I want to thank Dr. Vu Thanh Nam.
I would also like to extend my deepest gratitude to Late Assoc. Prof. Dr. Phan Trung Huy.
I would like to thank my co-workers from School of Applied Mathematics and Informatics, Hanoi University of Science and Technology for all their help.
I also wish to thank members of Seminar on Mathematical Foundations for Computer Science at Institute of Mathematics, Vietnam Academy of Science and Technology for their valuable comments and helpful advice.
I give thanks to PhD students of Late Assoc. Prof. Dr. Phan Trung Huy for sharing and exchanging information in steganography and searchable encryption.
Finally, I must also thank my family for supporting all my work.
CHAPTER 2 DIGITAL IMAGE STEGANOGRAPHY BASED ON THE GALOIS FIELD USING GRAPH THEORY AND AUTOMATA
2.2.1 Mathematical Basis Based on The Galois Field
2.2.2 Digital Image Steganography Based on The Galois Field
2.4 The Near Optimal and Optimal Data Hiding Schemes for Gray and Palette Images
3.2 The New Algorithm - The MR Algorithm
4.1 Introduction
4.2 Mathematical Basis
4.3 Automata Models for Solving The LCS Problem
4.4 Experimental Results
4.5 Conclusions
CHAPTER 5 CRYPTOGRAPHY BASED ON STEGANOGRAPHY AND AUTOMATA METHODS FOR SEARCHABLE ENCRYPTION
5.2 A Novel Cryptosystem Based on The Data Hiding Scheme (2, 9, 8)
5.3 Automata Technique for Exact Pattern Matching on Encrypted Data
5.4 Automata Technique for Approximate Pattern Matching on Encrypted Data
5.5 Conclusions
CONCLUSION
The empty set
The empty string
The number of elements of a set $S$
The length of a string $u$
The Galois field constructed from the polynomial ring $Z_p[x]$, where $p$ is prime and $m$ is a positive integer
A vector space over the field $GF(p^m)$
A longest common subsequence of $p$ and $x$
The length of a $LCS(p, x)$
The least element, the leftmost location of $u$
The last component of $LeftID(u)$ in $p$
A data hiding scheme
A set of all image blocks with the same size and image format
A finite set of secret elements
A finite set of secret keys
An embedding function that embeds a secret element in an image block
An extracting function that extracts an embedded secret element from an image block
The number of different ways to change the colour of each pixel
The last position of appearance of $a$ in $p$
An automaton accepting the pattern $p$
The set of all the configurations of $p$
The weight of $w$ in $p$
Extended Backward Oracle Matching
Embedding Rate
Franek Jennings Smyth
Fastest Optimal Parity Assignment
Forward SBNDM
Hashing
High Capacity of Information Hiding
Long BNDM
Longest Common Subsequence
Least Significant Bit
Maximal Secret Data Ratio
Mean Square Error
Nondeterministic Polynomial
Optimal Parity Assignment
Parity Assignment
Pan Chen Tseng
Peak Signal to Noise Ratio
Red Green Blue
Shift Add
Searchable Asymmetric Encryption
Simplified BNDM
Searchable Encryption
Searchable Symmetric Encryption
Thathoo Virmani Sai Balakrishnan Sekar
Wagner Fischer
Wu Lee
LIST OF TABLES
Table 1.1 An adjacency list representation of the simple graph given in Figure 1.1
Table 1.2 The performing steps of the BF algorithm
Table 1.3 The dynamic programming matrix L
Table 2.1 Elements of the Galois field $GF(2^2)$ represented by binary strings and decimal numbers
Table The representation of Fl and the arc weights of G for the gray image
Table The payload, ER and PSNR for the optimal data hiding scheme $(1, 2^n - 1, n)$ for palette images with $q_{colour} = 1$
Table 2.5 The payload, ER and PSNR for the near optimal data hiding scheme (2, 9, 8) for gray images with $q_{colour} = 3$
Table 2.6 The payload, ER and PSNR for the near optimal data hiding scheme (2, 9, 8) for palette images with $q_{colour} = 3$
Table 2.7 The comparisons of embedding and extracting time between the chapter's and Chang et al.'s approach for the same optimal data hiding scheme $(1, N, \lfloor \log_2(N + 1) \rfloor)$, where $N = 2^n - 1$, for the binary image with $q_{colour} = 1$. Time is given in seconds
Table 3.1 The performing steps of the MR algorithm
Table 3.2 Experimental results on rand4 problem
Table 3.3 Experimental results on rand8 problem
Table 3.4 Experimental results on rand16 problem
Table 3.6 Experimental results on rand64 problem
Table 3.7 Experimental results on rand128 problem
Table 3.8 Experimental results on rand256 problem
Table 3.9 Experimental results on a genome sequence (with $|\Sigma| = 4$)
Table 3.10 Experimental results on a protein sequence (with $|\Sigma| = 20$)
Table 4.1 The Ref of $p = bacdabcad$
Table 4.2 The comparisons of the $lcs(p, x)$ computation time for $n = 50666$
Table 4.3 The comparisons of the $lcs(p, x)$ computation time for $n = 109396$
INTRODUCTION
In modern life, as the use of computers and the Internet becomes more and more essential, digital data (information) can be copied as well as accessed illegally. As a result, information security becomes increasingly important. There are two popular methods to provide security, which are cryptography and data hiding [2, 5, 6, 20, 56, 62, 81]. Cryptography is used to encrypt data in order to make the data unreadable by a third party [5]. Data hiding is used to embed data in digital media. Based on the purpose of the application, data hiding is generally divided into steganography, which hides the existence of data to protect the embedded data, and watermarking, which protects the copyright ownership and authentication of the digital media carrying the embedded data. Steganography can be used as an alternative to cryptography. However, steganography becomes weak if attackers detect the existence of hidden data. Hence, integrating cryptography with steganography is a third choice for data security [2, 5, 6, 12, 19, 61, 62, 81, 86, 93].
With the rapid development of applications based on Internet infrastructure, cloud computing has become one of the hottest topics in the information technology area. Indeed, it is a computing system based on the Internet that provides on-demand services ranging from application and system software to storage and data processing. For example, when cloud users use the storage service, they can upload information to the servers and then access it online over the Internet. Meanwhile, enterprises do not have to spend large sums on owning and maintaining a system consisting of hardware and software. Although cloud computing brings many benefits for individuals and organizations, cloud security is still an open problem, since cloud providers can abuse users' information and cloud users lose control of it. Thus, guaranteeing privacy of tenants' information without negating the benefits of cloud computing seems necessary [28, 38, 40, 41, 60, 94, 102]. In order to protect cloud users' privacy, sensitive data need to be encoded before outsourcing them to servers. Unfortunately, encryption makes it much more difficult for the servers to perform search on ciphertext than on plaintext. To solve this problem, many searchable encryption techniques have been presented since 2000. Searchable encryption not only stores users' encrypted data securely but also allows information search over ciphertext [26, 28, 29, 38, 40, 60, 71, 85, 102].
Searchable encryption for exact pattern matching is a new class of searchable encryption techniques. The solutions for this class have been presented based on algorithms for [26] or approaches to [41, 89] exact pattern matching.
As in retrieving information from plaintexts, the development of searchable encryption with approximate string matching capability is necessary, where the search string can be a keyword determined, encrypted and stored in cloud servers or an arbitrary pattern [28, 40, 71].
From the above problems, together with the high efficiency of techniques using graph and automata proposed by P. T. Huy et al. for dealing with problems of exact pattern matching (2002), longest common subsequence (2002) and steganography (2011, 2012 and 2013), as well as potential applications of graph theory and automata approaches suggested by Late Assoc. Prof. Phan Trung Huy in steganography and searchable encryption, and under the direction of supervisors, the dissertation title assigned is research on development of methods of graph theory and automata in steganography and searchable encryption.
The purpose of the dissertation is to research on the development of new and quality solutions using graph theory and automata, suggesting their applications in, and applying them to, steganography and searchable encryption.
Based on results published and suggestions presented by Late Assoc. Prof. Phan Trung Huy in steganography and searchable encryption, the dissertation will focus on the following four problems in these fields:
- Digital image steganography;
- Exact pattern matching;
- Longest common subsequence;
- Searchable encryption.
The first problem is stated newly in Chapter 2; the three remaining problems are recalled and clarified in Chapter 1. In addition, background related to these problems is presented clearly and analysed very carefully in the chapters of the dissertation.
For the first three problems, the dissertation's work is to find new and efficient solutions using graph theory and automata. Then they will be used and applied to solve the last problem.
The dissertation has been completed with the structure as follows. Apart from Introduction at the beginning and Conclusion at the end of the dissertation, its main content is divided into five chapters.
Chapter 1 Preliminaries. This chapter recalls basic knowledge indicated throughout the dissertation (strings, graph, deterministic finite automata, digital images, the basic model of digital image steganography, some parameters to determine the quality of digital image steganography, the exact pattern matching problem, the longest common subsequence problem, and searchable encryption), and re-presents important concepts and results used and researched on development in the remaining chapters of the dissertation (adjacency list, breadth first search, Galois field, the fastest optimal parity assignment method, the module method and the concept of the maximal secret data ratio, the concept of the degree of fuzziness (appearance), the Knapsack Shaking approach, and the definition of a cryptosystem).
Chapter 2 Digital image steganography based on the Galois field using graph theory and automata. Firstly, from some proposed concepts of optimal and near optimal secret data hiding schemes, this chapter states the interest problem in digital image steganography. Secondly, the chapter proposes a new approach based on the Galois field using graph theory and automata to design a general form of steganography in binary, gray and palette images, shows sufficient conditions for existence and proves existence of some optimal and near optimal secret data hiding schemes, applies the proposed schemes to the process of hiding a finite sequence of secret data in an image and gives security analyses. Finally, the chapter presents experimental results to show the efficiency of the proposed results.
Chapter 3 An automata approach to exact pattern matching. This chapter proposes a flexible approach using automata to design an effective algorithm for exact pattern matching in practice. In given cases of patterns and alphabets, the efficiency of the proposed algorithm is shown by theoretical analyses and experimental results.
Chapter 4 Automata technique for the longest common subsequence problem. This chapter proposes two efficient sequential and parallel algorithms for computing the length of a longest common subsequence of two strings in practice, using automata technique. Theoretical analysis of the parallel algorithm and experimental results confirm that the use of the automata technique in designing algorithms for solving the longest common subsequence problem is the best choice.
Chapter 5 Cryptography based on steganography and automata methods for searchable encryption. This chapter first proposes a novel cryptosystem based on a data hiding scheme proposed in Chapter 2 with high security. Additionally, ciphertexts do not depend on the input image size as in existing hybrid techniques of cryptography and steganography, and encoding and embedding are done at once. The chapter then applies results using automata technique of Chapters 3 and 4 to constructing two algorithms for exact and approximate pattern matching on secret data encrypted by the proposed cryptosystem. These algorithms have $O(n)$ time complexity in the worst case, together with an assumption that the approximate algorithm uses $\lceil (1 - \epsilon) m \rceil$ processors, where $\epsilon$, $m$ and $n$ are the error of the string similarity measure proposed in this chapter and the lengths of the pattern and secret data, respectively. In searchable encryption, the cryptosystem can be used to encode and decode secret data on the users' side, and the pattern matching algorithms can be used to perform pattern search on the cloud providers' side.
The contents of the dissertation are written based on the paper [T1] published in 2019, the paper [T4] accepted for publication in 2020 in KSII Transactions on Internet and Information Systems (ISI), and the papers [T2, T3] published in Journal of Computer Science and Cybernetics in 2019. The main results of the dissertation have been presented at:
- Seminar on Mathematical Foundations for Computer Science at Institute of Mathematics, Vietnam Academy of Science and Technology,
- The 9th Vietnam Mathematical Congress, Nha Trang, August 14-18, 2018,
- Seminar at School of Applied Mathematics and Informatics, Hanoi University of Science and Technology.
CHAPTER 1 PRELIMINARIES
This chapter will attempt to recall terminologies, concepts, algorithms and results which are really needed in order to present the dissertation's new results clearly and logically, as well as help readers follow the content of the dissertation easily. The background knowledge re-presented here consists of basic structures (Section 1.1: strings (Subsection 1.1.1), graph (Subsection 1.1.2), deterministic finite automata (Subsection 1.1.3), and the Galois field $GF(p^m)$ (Subsection 1.1.4)), digital image steganography (Section 1.2), exact pattern matching (Section 1.3), longest common subsequence (Section 1.4) and searchable encryption.
where $n$ is a positive integer.
A special string is the empty string having no letters, denoted by $\epsilon$. The length of the string $x$ is the number of letters in $x$, denoted by $|x|$. Then $|\epsilon| = 0$.
Notice that for the string $x = x[1]x[2] \ldots x[n]$, we can also write $x = x[1..n]$ in short.
The set of all strings on the alphabet $\Sigma$ is denoted by $\Sigma^*$. The operator of strings is concatenation that writes strings as a compound. The concatenation of the two strings $u_1$ and $u_2$ is denoted by $u_1 u_2$.
Let $x$ be a string. A string $p$ is called a substring of the string $x$ if $x = u_1 p u_2$ for some strings $u_1$ and $u_2$. In case $u_1 = \epsilon$ (resp. $u_2 = \epsilon$) the string $p$ is called a prefix (resp. suffix) of the string $x$. The prefix (resp. suffix) $p$ is called proper if $p \neq x$. Note that the prefix or the suffix can be empty.
1.1.2 Graph
Besides some basic concepts in graph theory, this subsection recalls the way of representing a graph by adjacency lists and breadth first search [82]. These are used in Chapter 2.
A finite undirected graph (hereafter, called a graph for short) $G = (V, E)$ consists of a nonempty finite set of vertices $V$ and a finite set of edges $E$, where each edge has either one or two vertices associated with it. A graph with weights assigned to its edges is called a weighted graph.
An edge connecting a vertex to itself is called a loop. Multiple edges are edges connecting the same vertices. A graph having no loops and no multiple edges is called a simple graph.
In a simple graph, the edge associated to an unordered pair of vertices $\{i, j\}$ is called the edge $\{i, j\}$.
Two vertices $i$ and $j$ in a graph $G$ are called adjacent if they are vertices of an edge of $G$.
A graph without multiple edges can be described by using adjacency lists, which specify the adjacent vertices of any vertex of the graph.
Example 1.1 Using adjacency lists, the simple graph given in Figure 1.1 can be represented as in Table 1.1.
Figure 1.1 A simple graph
Table 1.1 An adjacency list representation of the simple graph given in Figure 1.1
Breadth First Search:
Input: A connected simple graph G with vertices ordered as i_1, i_2, ..., i_n;
Output: A spanning tree T;
Set T to be a tree consisting only of i_1;
Set L to be an empty list;
Put i_1 in L;
While (L is not empty)
{
    Remove the first vertex i from L;
    For each adjacent vertex j of i
        If (j is not in L and T)
        {
            Add j to the end of L;
            Add j and the edge {i, j} to T;
        }
}
Figure 1.2 A spanning tree of the graph given in Figure 1.1
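The breadth first search procedure can be sketched in Python as follows. This is only an illustrative sketch: the function name `bfs_spanning_tree`, the dictionary-based adjacency lists and the small example graph are assumptions made here, and the example graph is not the graph of Figure 1.1.

```python
from collections import deque

def bfs_spanning_tree(adj, root):
    """Return the edges of a BFS spanning tree of a connected simple graph.

    adj maps every vertex to the list of its adjacent vertices
    (an adjacency list representation); root is the first vertex.
    """
    in_tree = {root}          # vertices already placed in the tree T
    queue = deque([root])     # the list L of vertices still to be processed
    tree_edges = []
    while queue:
        i = queue.popleft()   # remove the first vertex from L
        for j in adj[i]:
            if j not in in_tree:      # j is neither in L nor in T
                in_tree.add(j)
                queue.append(j)       # add j to the end of L
                tree_edges.append((i, j))
    return tree_edges

# Hypothetical example graph, chosen only for illustration.
adj = {1: [2, 3], 2: [1, 3, 4], 3: [1, 2], 4: [2]}
tree = bfs_spanning_tree(adj, 1)
```

A spanning tree of a connected graph with n vertices has n - 1 edges, which gives a quick sanity check on the result.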
A graph with directed edges (or arcs) is called a directed graph. Each arc is associated with an ordered pair of vertices. In a simple directed graph, the arc associated with the ordered pair $(i, j)$ is called the arc $(i, j)$. The vertex $i$ is said to be adjacent to the vertex $j$ and the vertex $j$ is said to be adjacent from the vertex $i$.
1.1.3 Deterministic Finite Automata
The study of the construction and the use of deterministic finite automata is one of the objectives of the dissertation. Hence, this subsection will clarify this model of computation [44, 82].
Definition 1.1 ([44]) Let $\Sigma$ be an alphabet. A deterministic finite automaton (hereafter, called an automaton for short) $A = (\Sigma, Q, q_0, \delta, F)$ over $\Sigma$ consists of:
• A finite set $Q$ of elements called states,
• An initial state $q_0$, one of the states in $Q$,
• A set $F$ of final states. The set $F$ is a subset of $Q$,
• A state transition function (or simply, transition function), denoted by $\delta$, that takes as arguments a state and a letter, and returns a state, so that $\delta: Q \times \Sigma \to Q$.
The transition function $\delta$ can be extended so that it takes a state and a string, and returns a state. Formally, this extended transition function $\delta: Q \times \Sigma^* \to Q$ can be defined recursively by
$$\delta(q, \epsilon) = q, \qquad \delta(q, as) = \delta(\delta(q, a), s)$$
for all $q \in Q$, $a \in \Sigma$ and $s \in \Sigma^*$.
An alternative and simple way of presenting an automaton is to use the notation "transition diagram". A transition diagram of an automaton $A = (\Sigma, Q, q_0, \delta, F)$ is a directed graph given as follows [44]:
a) Each state of $Q$ is a vertex.
b) Let $q' = \delta(q, a)$, where $q$ is a state of $Q$ and $a$ is a letter of $\Sigma$. Then the transition diagram has an arc $(q, q')$ labeled $a$. If there are several letters that cause transitions from $q$ to $q'$, then the arc $(q, q')$ is labeled by a list of these letters.
c) There is an arrow into the initial state $q_0$. This arrow does not originate at any vertex.
d) States not in $F$ have a single circle. Vertices corresponding to final states are marked by a double circle.
Example 1.3 Consider an automaton $A = (\Sigma, Q, q_0, \delta, F)$ over $\Sigma = \{a, b\}$, where Q= (mái), E = {a2}, and $\delta$ is given by the following table. Then the transition diagram of $A$ is shown in Figure 1.3.
Figure 1.3 The transition diagram of A in Example 1.3
Definition 1.2 ([82]) A string $p$ is said to be accepted by the automaton $A = (\Sigma, Q, q_0, \delta, F)$ if it takes the initial state $q_0$ to a final state, that is, $\delta(q_0, p)$ is a state in $F$.
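To make the acceptance condition concrete, the following Python sketch runs the extended transition function letter by letter and tests whether the reached state is final. The automaton used here is a hypothetical one (it accepts the strings over {a, b} that end with the letter b); it is not the automaton of Example 1.3, and the names `extended_delta` and `accepts` are assumptions of this sketch.

```python
def extended_delta(delta, q, s):
    """Apply the transition function to the string s, starting from state q."""
    for a in s:
        q = delta[(q, a)]
    return q

def accepts(delta, q0, finals, s):
    """A string is accepted if it takes the initial state to a final state."""
    return extended_delta(delta, q0, s) in finals

# Hypothetical automaton over {a, b}: state "q1" means "the last letter read was b".
delta = {
    ("q0", "a"): "q0", ("q0", "b"): "q1",
    ("q1", "a"): "q0", ("q1", "b"): "q1",
}
finals = {"q1"}
```

Storing the transition function as a dictionary keyed by (state, letter) pairs mirrors the table form in which such functions are usually given.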
1.1.4 The Galois Field $GF(p^m)$
This subsection describes how to construct a finite field with $p^m$ elements, called the Galois field $GF(p^m)$, where $p$ is prime and $m \geq 1$ is an integer [88]. This algebraic structure will be used in Chapter 2.
Let $p$ be a prime number. Define $Z_p[x]$ to be the set of all polynomials with the variable $x$ whose coefficients belong to the field $Z_p$. Addition and multiplication in $Z_p[x]$ are defined in the usual way, with the coefficients reduced modulo $p$ at the end.
For $f(x) \in Z_p[x]$, the degree of $f(x)$, denoted by $\deg(f)$, is the largest exponent of $x$ in $f(x)$. A polynomial $f(x) \in Z_p[x]$ is called irreducible if there do not exist polynomials $f_1(x), f_2(x) \in Z_p[x]$ such that
$$f(x) = f_1(x) f_2(x),$$
where $\deg(f_1) > 0$ and $\deg(f_2) > 0$.
Let $f(x) \in Z_p[x]$ be an irreducible polynomial with $\deg(f) = m \geq 1$. Define $Z_p[x]/(f(x))$ to be the set of $p^m$ polynomials of degree at most $m - 1$ in $Z_p[x]$. Addition and multiplication in $Z_p[x]/(f(x))$ are given as in $Z_p[x]$, followed by a reduction modulo $f(x)$. Then $Z_p[x]/(f(x))$ with these operations is a field having $p^m$ elements, called the Galois field $GF(p^m)$. Note that for $p$ prime and $m \geq 1$, the Galois field $GF(p^m)$ is unique.
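As a small illustration of this construction, the sketch below implements addition and multiplication in $Z_p[x]/(f(x))$ and instantiates it for $GF(2^2)$. The choice $f(x) = x^2 + x + 1$ (irreducible over $Z_2$), the coefficient-list representation and the function names are assumptions of this sketch, not taken from the dissertation.

```python
def poly_add(a, b, p):
    """Add two elements of Z_p[x]/(f(x)), given as coefficient lists (low degree first)."""
    return [(x + y) % p for x, y in zip(a, b)]

def poly_mul(a, b, f, p):
    """Multiply a and b in Z_p[x]/(f(x)).

    f is a monic polynomial of degree m = len(a), given as m + 1 coefficients
    (low degree first); a and b have exactly m coefficients each.
    """
    m = len(a)
    prod = [0] * (2 * m - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            prod[i + j] = (prod[i + j] + x * y) % p
    # Reduce modulo f: eliminate the terms x^k with k >= m, highest first.
    for k in range(2 * m - 2, m - 1, -1):
        c = prod[k]
        if c:
            prod[k] = 0
            for i in range(m):
                prod[k - m + i] = (prod[k - m + i] - c * f[i]) % p
    return prod[:m]

P = 2
F = [1, 1, 1]  # f(x) = 1 + x + x^2, assumed irreducible polynomial over Z_2
ELEMENTS = [[c0, c1] for c1 in range(P) for c0 in range(P)]  # 0, 1, x, x + 1
```

With these definitions, `poly_mul([0, 1], [0, 1], F, P)` computes $x \cdot x$ reduced modulo $f(x)$, and every nonzero element of `ELEMENTS` can be checked to have a multiplicative inverse, as expected in a field.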
1.2 Digital Image Steganography
The interest problem in Chapter 2 is digital image steganography. This section will recall the concept of digital images, the basic model of digital image steganography, some parameters to determine the efficiency of digital image steganography and lastly re-present results researched on development and used in Chapter 2 such as the fastest optimal parity assignment (FOPA) method, the module method and the concept of the maximal secret data ratio (MSDR) [18, 20, 21, 39, 49, 50, 51, 53, 61, 63, 65, 76, 78, 104].
A digital image is a matrix of pixels. Each pixel is represented by a non-negative integer in the form of a string of binary bits. This value indicates the colour of the pixel [39].
Note that, based on the way of representing the colours of pixels, digital images can be divided into the following different types [78]:
1. Binary image: Each pixel is represented by one bit. In this image type, the colour of a pixel is white, "1" value, or black, "0" value.
2. Gray image: Each pixel is typically represented by eight bits (called 8-bit gray image). Then the colour of any pixel is a shade of gray, from black corresponding to colour value "0" to white corresponding to colour value "255".
3. Red green blue image: Each pixel is usually represented by a string of 24 bits (called 24-bit RGB image), where the first 8 bits, the next 8 bits and the last 8 bits correspond to shades of red, green and blue, specifying the red, green and blue colour components of the pixel.
4. Palette image: Each pixel is represented by a colour index into a palette, a set of strings of 24 bits which represent, as in RGB images, all colours used in the image and contained in the file with the image. The size of the palette is determined by the length of a bit string representing a pixel, which is limited to 8 bits. For a string of 8 bits, the palette images are called 8-bit palette images.
The objective of digital image steganography is to protect data by hiding the data in a digital image well enough so that unauthorized users will not even be aware of their existence [18, 21]. Figure 1.4 shows the basic model of digital image steganography, where the cover image is a digital image used as a carrier to embed secret data into, and the stego image is the digital image obtained after embedding secret data into the cover image by the function block Embed with the secret key on the Sender side. For steganography generally, the secret data needs to be extracted fully by the block Extract with the secret key on the Receiver side [30, 61, 63, 76].
The total number of the secret data sequence bits embedded in the cover image is called a Payload. Corresponding to a certain Payload, to measure the embedding capacity of the cover image, the embedding rate (ER) is used and defined as follows [104]
Figure 1.4 The basic diagram of digital image steganography
The peak signal to noise ratio (PSNR) is used to evaluate the quality of the stego image. Based on the value of PSNR, we can know the degree of similarity between the cover image and the stego image. If the PSNR value is high, then the quality of the stego image is high. Conversely, if the PSNR value is low, then the quality of the stego image is low. In general, for the digital image, PSNR is defined by the following formula [20, 53]
where $B(i, j)$, $G(i, j)$, $R(i, j)$, $B'(i, j)$, $G'(i, j)$ and $R'(i, j)$ are the colour values of the Blue, Green and Red components of a pixel at position $(i, j)$ in the cover and stego image, respectively. For human eyes, the threshold value of PSNR is 30 dB [20, 53, 65, 104]; that is, if the PSNR value is higher than 30 dB, it is hard to distinguish between the cover image and its stego image.
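For illustration, the following Python sketch computes a PSNR value for two RGB images of equal size. It assumes the commonly used definition $PSNR = 10 \log_{10}(255^2 / MSE)$, with the mean square error averaged over all pixels and the three colour components; this definition, the image representation and the function name `psnr_rgb` are assumptions of the sketch and may differ in detail from the formula of [20, 53].

```python
import math

def psnr_rgb(cover, stego):
    """PSNR in dB between two equal-sized images.

    cover and stego are 2-D lists of (R, G, B) tuples with 8-bit values.
    Assumed definition: 10 * log10(255^2 / MSE), MSE averaged over 3 channels.
    """
    rows, cols = len(cover), len(cover[0])
    squared_error = 0
    for i in range(rows):
        for j in range(cols):
            for c, s in zip(cover[i][j], stego[i][j]):
                squared_error += (c - s) ** 2
    mse = squared_error / (3 * rows * cols)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(255 ** 2 / mse)

# Hypothetical 2 x 2 cover image and a stego image differing in one colour value by 1.
cover = [[(10, 20, 30), (40, 50, 60)], [(70, 80, 90), (100, 110, 120)]]
stego = [[(10, 20, 31), (40, 50, 60)], [(70, 80, 90), (100, 110, 120)]]
value = psnr_rgb(cover, stego)
```

Under this assumed definition, a single change of one colour value by 1 in such a small image already yields a PSNR well above the 30 dB threshold mentioned above.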
Let $G$ be a palette image and $P = \{c_1, c_2, \ldots, c_n\}$ be its palette, where $c_i$ is the colour of a pixel of $G$ corresponding to the colour index $i$. Each colour $c$ in $P$ is considered as a vector consisting of red, green and blue components. Suppose $d$ is a distance function on $P$. The FOPA method [50] tries to get functions $Next$, $Next: P \to P$, and $Val$, $Val: P \to Z_2$, satisfying for all $c \in P$ the following conditions:
1. $d(c, Next(c)) = \min_{c \neq v \in P} d(c, v)$,
2. $Val(c) = Val(Next(c)) + 1$ on the field $Z_2$.
Call $G_P = (V_P, E_P)$ a weighted complete undirected graph of the palette image $G$, where $V_P = P$ and the weight of the edge $\{c, c'\}$ is $d(c, c')$. The function $Nearest$, $Nearest: P \to P$, is given by $Nearest(c) = c'$ holding $d(c, c') = \min_{c \neq v \in P} d(c, v)$. A rho forest $F = (V, E)$ is a directed graph with vertices weighted by the function $Val$, where $V = V_P$, $E$ is the set of all arcs $(v, Next(v))$, and the vertex $v$ has the weight $Val(v)$ for all $v \in V$. The construction of an algorithm determining $F$ is the essence of the FOPA method.
Algorithm for FOPA:
Input: A weighted complete undirected graph G_P, the function Nearest;
Output: A rho forest F = (V, E);
Choose a vertex c ∈ P, set V = {c}, and set C = P \ {c};
Set Val(c) = 0; // Or 1 randomly
While (C is not empty) // Update F
{
    b) Initialize v_0 ∈ C, set Val(v_0) = 0 (or 1 randomly); by a finite loop, find a longest sequence of k + 1 different elements in P consecutively, v_0, v_1, ..., v_k, such that Nearest(v_i) = v_{i+1} for i = 0, ..., k - 1, v_i ∈ C, v_k ∈ C or v_k ∈ V, and set Next(v_i) = v_{i+1}, i = 0, ..., k - 1;
    b1) Case v_k ∈ C: Set Val(v_i) = 1 + Val(v_{i-1}), i = 1, ..., k, and Next(v_k) = v_{k-1};
        Set V = V ∪ {v_0, v_1, ..., v_k} and C = C \ {v_0, v_1, ..., v_k};
    b2) Case v_k ∈ V: Set Val(v_i) = 1 + Val(v_{i+1}), i = k - 1, ..., 1, 0;
        Set V = V ∪ {v_0, v_1, ..., v_{k-1}} and C = C \ {v_0, v_1, ..., v_{k-1}};
}
$U = M \setminus \{0\}$ is a unique 1-base of $M$ [51]. Two functions $Next$, $Next: C_G \to C_G$, and $Val$, $Val: C_G \to Z_2$, satisfying the condition $Val(c) = Val(Next(c)) + 1$ on the ring $Z_2$, are defined in [49]. Suppose that for $N \geq 1$, $I = \{I_1, I_2, \ldots, I_N\}$ is an arbitrary image block of $G$, $K = \{K_1, K_2, \ldots, K_N \mid K_i \in Z_2\}$, and $h$ is a surjective function from $I$ to $U$. In the module method, $d$ is considered as a secret data element, embedded in and extracted from the image block $I$ with the key $K$ by the blocks Embed and Extract as follows [49, 51]:
The block Embed (embedding $d$ in $I$):
Step 1) Compute $m = \sum_{i=1}^{N} h(I_i)(Val(I_i) + K_i)$;
Step 2) Case $d = m$: Keep $I$ intact;
Case $d \neq m$: Find $v \in U$ such that $d + (-m) = v$. Based on $v$ and $h$, determine an element $I_i$ of $I$. Then change $I_i$ to $I'_i = Next(I_i)$;
Return $I'$;
The block Extract (extracting $d$ from $I'$): $d = \sum_{i=1}^{N} h(I'_i)(Val(I'_i) + K_i)$.
Definition 1.4 ([49]) $MSDR_k(N)$ is the largest number of embedded bits of secret data in an image block of $N$ pixels by changing colours of at most $k$ pixels in the image block, where $k$, $N$ are positive integers.
Given a positive integer $q_{colour}$, call $q_{colour}$ the number of different ways to change the colour of each pixel in an arbitrary image block of $N$ pixels. According to [49],
$$MSDR_k(N) = \lfloor \log_2(1 + q_{colour} C_N^1 + q_{colour}^2 C_N^2 + \cdots + q_{colour}^k C_N^k) \rfloor.$$
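The MSDR bound is easy to evaluate numerically. The following Python sketch, assuming the formula reads $\lfloor \log_2(1 + \sum_{i=1}^{k} q_{colour}^i C_N^i) \rfloor$, computes it with exact integer arithmetic; the function name `msdr` and the sample parameters are chosen here only for illustration.

```python
from math import comb

def msdr(k, n_pixels, q_colour):
    """floor(log2(1 + sum_{i=1..k} q_colour^i * C(n_pixels, i))).

    The term for i = 0 equals 1, so the sum starts at 0. The floor of log2 is
    taken with bit_length to avoid floating point rounding.
    """
    total = sum(q_colour ** i * comb(n_pixels, i) for i in range(k + 1))
    return total.bit_length() - 1

bits_binary = msdr(1, 7, 1)   # k = 1, N = 2^3 - 1 = 7, q_colour = 1
bits_gray = msdr(2, 9, 3)     # k = 2, N = 9, q_colour = 3
```

For $k = 1$, $N = 2^n - 1$ and $q_{colour} = 1$ the sum is $2^n$, so the bound is exactly $n$ bits; for $k = 2$, $N = 9$ and $q_{colour} = 3$ the sum is $1 + 27 + 324 = 352$, whose base-2 logarithm has floor 8.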
1.3 Exact Pattern Matching
This section will restate the exact pattern matching problem, and recall the concept of the degree of fuzziness (appearance) used in Chapter 3 [24, 52, 68].
Let $x$ be a string of length $n$. Denote the substring $x[i]x[i+1] \ldots x[j]$ of $x$ by $x[i..j]$ for all $1 \leq i \leq j \leq n$, and the $i$-th element of $x$ by $x[i]$; $i$ is called a position in $x$. Let $p$ be a substring of length $m$ of $x$, where $m$ is a positive integer; then there exists $i$ for $1 \leq i \leq n - m + 1$ such that $p = x[i..i+m-1]$. We say that $i$ is an occurrence of $p$ in $x$, or that $p$ occurs in $x$ at position $i$.
Definition 1.5 ([68]) Let $p$ be a pattern of length $m$ and $x$ be a text of length $n$ over the alphabet $\Sigma$. Then the exact pattern matching problem is to find all occurrences of the pattern $p$ in $x$.
The following example uses the Brute Force (BF) algorithm [24] to demonstrate the most original way of solving this problem.
Table 1.2 The performing steps of the BF algorithm
Example 1.4 Given a pattern $p = fah$ and a text $x = dfahfkfaha$. Then there are two occurrences of $p$ in $x$, at positions 2 and 7. The BF algorithm is performed by the steps presented in Table 1.2, where the bold letters correspond to the mismatches and the underlined letters represent the matches when comparing the letters of the pattern and the text. Many letters already scanned will be scanned again by the BF algorithm, because each time either a mismatch or a match occurs, the pattern is only moved to the right by one position.
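A minimal Python sketch of the brute force idea is given below. It reads the pattern and text of Example 1.4 as fah and dfahfkfaha (an assumption about the garbled example), reports 1-based positions as in the definitions above, and uses the function name `brute_force_search` only for illustration.

```python
def brute_force_search(p, x):
    """Return all 1-based positions at which the pattern p occurs in the text x."""
    m, n = len(p), len(x)
    occurrences = []
    for i in range(n - m + 1):          # the pattern is shifted right by one each time
        j = 0
        while j < m and x[i + j] == p[j]:
            j += 1                      # compare letters left to right
        if j == m:                      # all m letters matched
            occurrences.append(i + 1)
    return occurrences

positions = brute_force_search("fah", "dfahfkfaha")
```

Because the shift is always one position, the worst-case number of letter comparisons is proportional to $m \cdot n$, which is the rescanning cost the example points out.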
Chapter 3 uses the degree of fuzziness in [52] to determine the longest prefix of the pattern in the text at any position. However, this terminology can lead to several misunderstandings for the readers, so throughout this dissertation, the degree of fuzziness will be replaced with the degree of appearance. The concept of the degree of appearance is restated as follows.
Definition 1.6 ([52]) Let $p$ be a pattern and $x$ be a text of length $n$ over the alphabet $\Sigma$. Then for each $1 \leq i \leq n$, a degree of appearance of $p$ in $x$ at position $i$ is equal to the length of a longest substring of $x$ such that this substring is a prefix of $p$, where the right end letter of the substring is $x[i]$.
Notice that obviously, if the degree of appearance of $p$ in $x$ at an arbitrary position $i$ equals $|p|$, then a match for $p$ in $x$ occurs at position $i - |p| + 1$. Figure 1.5 illustrates the concept of the degree of appearance of the pattern $p$ in $x$.
Figure 1.5 The degree of appearance of the pattern p
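The definition can be checked directly with a naive Python sketch that, for every position of the text, tries the prefixes of the pattern from longest to shortest. This is a direct transcription of the definition for illustration, not the automaton-based computation developed in Chapter 3; the function name and the sample strings are assumptions.

```python
def degrees_of_appearance(p, x):
    """Degree of appearance of p in x at every position i = 1, ..., len(x).

    The degree at position i is the length of the longest prefix of p that is
    a substring of x ending exactly at x[i] (0 if there is none).
    """
    degrees = []
    for i in range(1, len(x) + 1):
        best = 0
        for k in range(min(len(p), i), 0, -1):   # longest candidate first
            if x[i - k:i] == p[:k]:
                best = k
                break
        degrees.append(best)
    return degrees

p, x = "fah", "dfahfkfaha"
degrees = degrees_of_appearance(p, x)
# Positions where the degree equals |p| give matches starting at i - |p| + 1.
matches = [i - len(p) + 1 for i, d in enumerate(degrees, start=1) if d == len(p)]
```

The last two lines apply the remark above: a degree equal to $|p|$ at position $i$ signals an occurrence of the pattern starting at position $i - |p| + 1$.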
1.4 Longest Common Subsequence
This section will recall the longest common subsequence (LCS) problem, and the Knapsack Shaking approach addressing the problem studied on development in Chapter 4 [2, 47, 94, 101].
Definition 1.7 ([101]) Let $p$ be a string of length $m$ and $u$ be a string over the alphabet $\Sigma$. Then $u$ is a subsequence of $p$ if there exists an integer sequence $j_1, j_2, \ldots, j_t$ such that $1 \leq j_1 < j_2 < \cdots < j_t \leq m$ and $u = p[j_1]p[j_2] \ldots p[j_t]$.
(i) $u$ is a common subsequence of $p$ and $x$,
(ii) There does not exist any common subsequence $v$ of $p$ and $x$ such that $|v| > |u|$.
Denote an arbitrary longest common subsequence of $p$ and $x$ by $LCS(p, x)$. The length of a $LCS(p, x)$ is denoted by $lcs(p, x)$.
By convention, if two strings $p$ and $x$ do not have any longest common subsequences, then $lcs(p, x)$ is considered to equal 0.
Example 1.5 Let $p = bgcadb$ and $x = abhcbad$. Then the string $bcad$ is a $LCS(p, x)$ and $lcs(p, x) = 4$.
Let $p$ and $x$ be two strings of lengths $m$ and $n$ over the alphabet $\Sigma$, $m \leq n$. The longest common subsequence problem for two strings (LCS problem) can be stated in the two following forms [24, 47]:
Problem 1 Find a longest common subsequence of $p$ and $x$.
Problem 2 Compute the length of a longest common subsequence of $p$ and $x$.
The simple way to solve the LCS problem is to use the algorithm introduced by Wagner and Fischer in 1974 (called the Algorithm WF). This algorithm defines a dynamic programming matrix $L(m, n)$ recursively to find a $LCS(p, x)$ and compute $lcs(p, x)$ as follows [94]:
$$L(i, j) = \begin{cases} 0 & \text{if } i = 0 \text{ or } j = 0, \\ L(i-1, j-1) + 1 & \text{if } p[i] = x[j], \\ \max\{L(i, j-1), L(i-1, j)\} & \text{otherwise,} \end{cases}$$
where $L(i, j)$ is $lcs(p[1..i], x[1..j])$ for $1 \leq i \leq m$, $1 \leq j \leq n$.
Example 1.6 Let $p = bgcadb$ and $x = abhcbad$. Using the Algorithm WF, the matrix $L(m, n)$ is obtained below. Then $lcs(p, x) = L(6, 7) = 4$. In Table 1.3, by the traceback procedure, starting from value 4 back to value 1, a $LCS(p, x)$ found is the string $bcad$.
Table 1.3 The dynamic programming matrix L
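The recurrence and the traceback can be sketched in Python as follows. The function name `lcs_wf` and the tie-breaking rule in the traceback are choices made for this sketch; other tie-breaking rules may return a different longest common subsequence of the same length.

```python
def lcs_wf(p, x):
    """Return (lcs(p, x), one LCS(p, x)) using the dynamic programming matrix L."""
    m, n = len(p), len(x)
    # L[i][j] = lcs(p[1..i], x[1..j]); row 0 and column 0 are all zeros.
    L = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if p[i - 1] == x[j - 1]:
                L[i][j] = L[i - 1][j - 1] + 1
            else:
                L[i][j] = max(L[i][j - 1], L[i - 1][j])
    # Traceback from L[m][n] to recover one longest common subsequence.
    i, j, letters = m, n, []
    while i > 0 and j > 0:
        if p[i - 1] == x[j - 1]:
            letters.append(p[i - 1])
            i -= 1
            j -= 1
        elif L[i - 1][j] >= L[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return L[m][n], "".join(reversed(letters))

length, subsequence = lcs_wf("bgcadb", "abhcbad")
```

Filling the matrix takes time and space proportional to $m \cdot n$, which is the baseline against which faster LCS techniques are usually compared.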
of the form (j1, j2, ,Jt) is called a location of w in p
From Definition 1.10, the subsequence w has at least a location in p If all the different locations of u are arranged in the dictionary order, then call the least clement the leftmost location of u, denoted by LeftfD(u) Denote the last component of LeftfD(u) by Rmp(u) tứ].
Example 1.7 Let p = aabcadabcd and u = abd. Then u is a subsequence of p and has eight different locations in p; in dictionary order they are
(1, 3, 6), (1, 3, 10), (1, 8, 10), (2, 3, 6), (2, 3, 10), (2, 8, 10), (5, 8, 10), (7, 8, 10).
It follows that LeftID(u) = (1, 3, 6) and $Rm_p(u) = 6$.
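The notions of location, leftmost location and $Rm_p(u)$ can be checked by brute force on small strings. The sketch below is only an illustration; the helper names `locations`, `leftmost` and `rm` are hypothetical and do not come from [47].

```python
from itertools import combinations


def locations(p, u):
    """All locations of u in p as 1-based index tuples, in dictionary order."""
    return [
        tuple(j + 1 for j in idx)
        for idx in combinations(range(len(p)), len(u))  # lexicographic order
        if all(p[j] == c for j, c in zip(idx, u))
    ]


def leftmost(p, u):
    """The least location of u in p in dictionary order, or None."""
    locs = locations(p, u)
    return locs[0] if locs else None


def rm(p, u):
    """Last component of the leftmost location of u in p, or None."""
    loc = leftmost(p, u)
    return loc[-1] if loc else None


print(leftmost("aabcadabcd", "abd"), rm("aabcadabcd", "abd"))
```

Enumerating all index combinations is exponential in general; it is used here only because it mirrors the definition directly.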
Definition 1.11 ([47]) Let p be a string of length m. Then a configuration C of p is defined as follows.
1. Either C is the empty set. Then C is called the empty configuration of p, denoted by $C_0$.
2. Or $C = \{x_1, x_2, \ldots, x_t\}$ is an ordered set of t subsequences of p for $1 \le t \le m$ such that the two following conditions are satisfied:
(i) For all $1 \le i \le t$, $|x_i| = i$,
(ii) For all $x_i, x_j \in C$, if $|x_i| > |x_j|$ then $Rm_p(x_i) > Rm_p(x_j)$.
The set of all the configurations of p is denoted by Config(p).
Definition 1.12 ([47]) Let p be a string of length m over the alphabet $\Sigma$, $C \in Config(p)$ and $a \in \Sigma$. Then a state transition function $\varphi$ on $Config(p) \times \Sigma$,
$\varphi: Config(p) \times \Sigma \to Config(p)$, is defined as follows.
1. $\varphi(C, a) = C$ if $a \notin p$,
2. $\varphi(C_0, a) = \{a\}$ if $a \in p$,
3. Set $C' = \varphi(C, a)$. Suppose $a \in p$ and $C = \{x_1, x_2, \ldots, x_t\}$ for $1 \le t \le m$. Then $C'$ is determined by a loop using the loop control variable i whose value is changed from t down to 0:
a) For i = t, if the letter a appears at a location index in p such that index is greater than $Rm_p(x_t)$, then $x_{t+1} = x_t a$;
b) Loop from i = t − 1 down to 1, if the letter a appears at a location index in p such that $index \in (Rm_p(x_i), Rm_p(x_{i+1}))$, then $x_{i+1} = x_i a$;
c) For i = 0, if the letter a appears at a location index in p such that index is smaller than $Rm_p(x_1)$, then $x_1 = a$;
d) $C' = C$.
4. To accept an input string, the state transition function $\varphi$ is extended as follows:
$\varphi: Config(p) \times \Sigma^* \to Config(p)$ such that for all $C \in Config(p)$, $s \in \Sigma^*$, $a \in \Sigma$, $\varphi(C, as) = \varphi(\varphi(C, a), s)$ and $\varphi(C, \varepsilon) = C$.
Example 1.8 Let p = bacdabcad and C = {c, ad, bab}. Then C is a configuration of p and $C' = \varphi(C, a)$ = {a, ad, ada, baba}.
In 2002, P. T. Huy et al. introduced a method to solve Problem 1 by using the automaton given in the following theorem. They named their method the Knapsack Shaking approach [47].
Theorem 1.1 ([47]) Let p and x be two strings of lengths m and n over the alphabet $\Sigma$, $m \le n$. Let $A_p = (\Sigma, Q, q_0, \varphi, F)$ corresponding to p be an automaton over the alphabet $\Sigma$, where
• The set of states Q = Config(p),
• The initial state $q_0 = C_0$,
• The transition function $\varphi$ is given as in Definition 1.12,
• The set of final states $F = \{C_n\}$, where $C_n = \varphi(q_0, x)$.
Suppose $C_n = \{x_1, x_2, \ldots, x_t\}$ for $1 \le t \le m$. Then
1. For every common subsequence u of p and x, there exists $x_i \in C_n$, $1 \le i \le t$, such that the two following conditions are satisfied:
(i) $|u| = |x_i|$,
(ii) $Rm_p(x_i) \le Rm_p(u)$.
2. A LCS(p, x) equals $x_t$.
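Definition 1.12 and Theorem 1.1 can be traced on small inputs with the sketch below. It is an illustrative reading of the definitions, not the implementation of [47]: configurations are stored as Python lists, $Rm_p$ is computed by a greedy left-to-right scan, and the names `rm`, `phi` and `run` are hypothetical.

```python
def rm(p, u):
    """Rm_p(u): last index (1-based) of the leftmost location of u in p."""
    pos = 0
    for ch in u:
        pos = p.find(ch, pos) + 1  # greedy scan yields the leftmost location
        if pos == 0:
            raise ValueError(f"{u!r} is not a subsequence of {p!r}")
    return pos


def phi(p, config, a):
    """One transition phi(C, a); config is the list [x_1, ..., x_t]."""
    if a not in p:                       # case 1: a does not occur in p
        return list(config)
    if not config:                       # case 2: empty configuration
        return [a]
    occ = [j + 1 for j, ch in enumerate(p) if ch == a]  # positions of a in p
    r = [rm(p, x) for x in config]       # r[i-1] = Rm_p(x_i), old values
    t = len(config)
    new = list(config)
    if any(j > r[t - 1] for j in occ):               # step a), i = t
        new.append(config[t - 1] + a)
    for i in range(t - 1, 0, -1):                    # step b), i = t-1..1
        if any(r[i - 1] < j < r[i] for j in occ):
            new[i] = config[i - 1] + a               # x_{i+1} = x_i a
    if any(j < r[0] for j in occ):                   # step c), i = 0
        new[0] = a
    return new


def run(p, x):
    """Extended transition: feed x letter by letter from the empty configuration."""
    config = []
    for a in x:
        config = phi(p, config, a)
    return config


print(phi("bacdabcad", ["c", "ad", "bab"], "a"))  # Example 1.8
final = run("bgcadb", "abhcbad")                   # strings of Example 1.5
print(final, len(final))
```

Because the loop runs from t down to 0, each update reads only components that have not yet been modified in the same step, which is why the sketch can work from the old values `config` and `r`.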
1.5 Searchable Encryption
This section clarifies the term searchable encryption (SE) and recalls the definition of a cryptosystem. They will be studied and used in Chapter 5 [26, 40, 60, 85, 88, 102].
Consider a problem that occurs in cloud security as follows [60, 85, 102]. Cloud tenants, for example enterprises and individuals with limited resources including software and hardware, store data with sensitive information on cloud servers. Assume that these servers cannot be fully trusted. This means they may not only be curious about the users' information but also abuse the data received. Users therefore wish to encrypt their data before uploading them to servers. Because of the limitations of cloud users' information technology systems, users also wish that cloud providers can help them perform information search directly on ciphertexts. However, encryption makes it difficult for servers to search on the encrypted data. This leads to the problem of finding a solution that satisfies both wishes of cloud users when they choose a cloud storage service.
SE is a way to solve the above problem. It is indeed a system consisting of two main components: a cryptosystem used to encode and decode on the cloud users' side, and algorithms for searching on encrypted data run on the cloud providers' side [40, 1…].
In cryptography, SE can be either searchable symmetric encryption (SSE) or searchable asymmetric encryption (SAE). In SSE, only private key holders can create encrypted data and produce trapdoors for search. In SAE, users who have the public key can make ciphertexts but only private key holders can generate trapdoors [26, 102].
Since the dissertation proposes a new symmetric encryption system for SSE in Chapter 5, the correctness of this system needs to be proved. In this dissertation, the components and properties of a cryptosystem defined in [88] will be considered as the standard form to verify. This definition is recalled here.
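Definition 1.13 ([88]) A cryptosystem is a five-tuple $(\mathcal{P}, \mathcal{C}, \mathcal{K}, \mathcal{E}, \mathcal{D})$ such that the following properties are satisfied:
1. $\mathcal{P}$ is a finite set of plaintexts,
2. $\mathcal{C}$ is a finite set of ciphertexts,
3. $\mathcal{K}$ is a finite set of secret keys,
4. For every $k \in \mathcal{K}$, there exists an encrypting function $e_k \in \mathcal{E}$ and a corresponding decrypting function $d_k \in \mathcal{D}$, where $e_k: \mathcal{P} \to \mathcal{C}$ and $d_k: \mathcal{C} \to \mathcal{P}$ satisfy $d_k(e_k(x)) = x$ for each $x \in \mathcal{P}$.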
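As a toy instance of Definition 1.13, the shift cipher over $\mathbb{Z}_{26}$ can be checked exhaustively. The example is chosen here only for illustration (it is not the cryptosystem proposed in Chapter 5), and the names `encrypt`, `decrypt` and `is_cryptosystem` are hypothetical.

```python
# Toy five-tuple (P, C, K, E, D): the shift cipher over Z_26.
PLAINTEXTS = range(26)   # P
CIPHERTEXTS = range(26)  # C
KEYS = range(26)         # K


def encrypt(k, x):
    """e_k: P -> C, shift by k modulo 26."""
    return (x + k) % 26


def decrypt(k, y):
    """d_k: C -> P, the inverse shift."""
    return (y - k) % 26


def is_cryptosystem():
    """Check property 4: d_k(e_k(x)) = x for every key k and plaintext x."""
    return all(
        encrypt(k, x) in CIPHERTEXTS and decrypt(k, encrypt(k, x)) == x
        for k in KEYS
        for x in PLAINTEXTS
    )


print(is_cryptosystem())
```

The same exhaustive check is feasible only because all three sets are tiny; for a realistic system the identity $d_k(e_k(x)) = x$ has to be proved rather than enumerated.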
CHAPTER 2 DIGITAL IMAGE STEGANOGRAPHY BASED ON THE GALOIS FIELD USING GRAPH THEORY
AND AUTOMATA
This chapter first proposes concepts of optimal and near optimal secret data hiding schemes. The chapter then proposes a new digital image steganography approach based on the Galois field $GF(p^m)$ using graph theory and automata to design the data hiding scheme of the general form $(k, N, \lfloor \log_2 p^{mn} \rfloor)$ for binary, gray and palette images with the given assumptions, where k, m, n, N are positive integers and p is prime, shows sufficient conditions for existence and proves existence of some optimal and near optimal secret data hiding schemes. These results are derived from the concept of the maximal secret data ratio of embedded bits, the module method and the FOPA method proposed by P. T. Huy et al. in 2011, 2012 and 2013, recalled in Section 1.2 of Chapter 1. An application of the schemes to the process of hiding a finite sequence of secret data in an image is also considered. Security analyses and experimental results confirm that the proposed approach can create steganographic schemes which achieve high efficiency in embedding capacity, visual quality, speed as well as security, which are key properties of steganography.
The results of Chapter 2 have been published in [T1].
2.1 Introduction
In steganography, depending on the type of digital media, there are many types of steganography such as image, audio and video steganography [4, 5, 20, 61, 62, 73, 76, 96]. However, image steganography is the most widely used because digital images are often transmitted on the Internet and they have a high degree of redundancy. Furthermore, image steganography is mainly performed in the spatial domain, where steganography is achieved by changing colours of some pixels directly in the image [17, 57, 62, 76, 100]. The chapter's work focuses on steganography in digital images in the spatial domain.
Digital image steganography studies steganographic schemes, where each scheme consists of an embedding function and an extracting function. The embedding function shows how to embed secret data in the digital image and the extracting function describes how to extract the data from the digital image carrying the embedded data [46, 87].
In digital image steganography, a few main factors must be taken into consideration when we design a new secret data hiding scheme, namely embedding capacity of the cover image, quality of the stego image and security. However, as is well known, embedding capacity of the cover image and quality of its stego image are in irreconcilable conflict. A balance between the two factors can be achieved according to different application requirements. In addition to the three main factors, speed of the embedding and extracting functions also plays an important role in steganographic schemes. It is considered as a last constraint to determine efficiency of schemes [46, 53, 6ð, 69, 87, 104].
The simplest and most popular spatial domain image steganography method is the least significant bit (LSB) substitution (called the LSB based method). For 24-bit RGB and 8-bit gray images, in this method the data is embedded in the cover image by changing the least significant bits of the image directly, therefore it becomes vulnerable to security attacks [18, 62, 72, 75, 76, 97, 104]. The EZ Stego method for palette images is similar to the commonly used LSB based method. However, this method does not guarantee quality of stego images [30, 37, 97]. To alleviate this problem, in 1999, Fridrich proposed a new method based on the parity bits of colour indexes of pixels in palette cover images, called the parity assignment (PA) method. The EZ Stego method can then be considered as an example of the PA method [36, 50]. In 2000, Fridrich et al. improved the method by investigating the problem of optimal parity assignment for the palette, and this version is called the optimal parity assignment (OPA) method [87]. To easily control quality of stego images, Huy et al. introduced another OPA method, called the FOPA method, in 2013 [50].
Unlike colour and gray images, each pixel in binary images only requires one bit to represent colour values (black and white), therefore modifying pixels can be easily detected. So, binary image steganography is a more difficult and challenging problem. For binary images, the block based method is usually used to maintain quality of stego images. In this method, the cover and stego images are partitioned into individual image blocks of the same size, and embedding and extracting secret data are based on the characteristic values calculated for the blocks. The WL (Wu et al., 1998), PCT (Pan et al., 2000), modified PCT (Tseng et al., 2001) and CTL (Chang et al., 2005) schemes are all well known and block based for binary images [21, 18, 48, 75, 92].
Given $q_{colour}$, which is the number of different ways to change the colour of each pixel in an arbitrary image block, and using the concept of the maximal secret data ratio of embedded bits proposed by Huy et al. in 2011 [49], the chapter introduces concepts of optimal and near optimal secret data hiding schemes. Actually, the optimality of steganographic schemes has been considered in [37, 46]. However, the authors used the time complexity of embedding and extracting functions, or the concept of optimal parity assignment that minimizes the energy of the parity assignment for the colour palette, to determine whether a steganographic scheme is optimal.
By the block based method, call a secret data hiding scheme a data hiding scheme (k, N, r), where k, N, r are positive integers, if the embedding function can embed r bits of secret data in each image block of N pixels by changing colours of at most k pixels in the image block. The chapter's work is concerned with the problem of designing optimal or near optimal data hiding schemes (k, N, r) for digital images (binary, gray and palette images).
Based on the module approach and the FOPA method using graph theory proposed by Huy et al. in 2011 and 2013 [49, 50], the chapter proposes a new approach based on the Galois field using graph theory and automata in order to solve the problem. By this approach, the chapter proposes schemes consisting of the optimal data hiding scheme $(1, 2^n - 1, n)$ for binary, gray and palette images with $q_{colour} = 1$, where n is a positive integer, the near optimal data hiding scheme (2, 9, 8) and the optimal data hiding scheme (1, 5, 4) for gray and palette images with $q_{colour} = 3$. Security analyses show that an application of these schemes to the process of hiding a finite sequence of secret data in an image can avoid detection from brute-force attacks.
The experimental results reveal that the efficiency in embedding capacity and visual quality of the near optimal data hiding scheme (2, 9, 8) for gray images with $q_{colour} = 3$ is indeed better than the efficiency of the IICTII scheme [104]. The embedding and extracting times of the proposed approach are faster than those of the Chang et al.'s approach [18]. For the near optimal data hiding scheme (2, 9, 8) for palette images with $q_{colour} = 3$ and the optimal data hiding scheme $(1, 2^n - 1, n)$ for palette images with $q_{colour} = 1$, values of BR can be selected suitably to achieve acceptable quality of the stego images.
The rest of the chapter is organized as follows. Section 2.2 gives some new concepts and states the chapter's digital image steganography problem. Section 2.3 consists of two Subsections 2.3.1 and 2.3.2. Subsection 2.3.1 introduces mathematical basis based on the Galois field $GF(p^m)$ for the digital image steganography problem, where p is prime and m is a positive integer. Subsection 2.3.2 firstly proposes a digital image steganography approach based on the Galois field $GF(p^m)$ using graph theory and automata to design the data hiding scheme of the general form $(k, N, \lfloor \log_2 p^{mn} \rfloor)$ for the given assumptions, where k, m, n, N are positive integers and p is prime. Secondly, the subsection gives sufficient conditions for existence of the optimal data hiding schemes $(1, \frac{p^{mn}-1}{p^m-1}, \lfloor \log_2 p^{mn} \rfloor)$ and $(2, N, \lfloor \log_2 p^{mn} \rfloor)$ with $q_{colour} = p^m - 1$. Thirdly, the subsection shows that there exists the optimal data hiding scheme $(1, 2^n - 1, n)$ for binary, gray and palette images with $q_{colour} = 1$, where n is a positive integer. At the end of Subsection 2.3.2, the way of applying the data hiding scheme $(k, N, \lfloor \log_2 p^{mn} \rfloor)$ to the process of hiding a finite sequence of secret data of length $\lfloor \log_2 p^{mn} \rfloor$ bits in an image is considered. Section 2.4 proves that there exist the near optimal data hiding scheme (2, 9, 8) and the optimal data hiding scheme (1, 5, 4) for gray and palette images with $q_{colour} = 3$. Section 2.5 shows experimental results in order to evaluate the efficiency of the proposed data hiding schemes and approach. Lastly, some conclusions are drawn from the proposed approach and experimental results in Section 2.6.
2.2 The Digital Image Steganography Problem
This section gives some new concepts and states the chapter's digital image steganography problem.
Definition 2.1 A block based secret data hiding scheme in digital images (for short, called a data hiding scheme) is a five-tuple $(\mathcal{I}, \mathcal{M}, \mathcal{K}, Em, Ex)$, where the following conditions are satisfied:
1. $\mathcal{I}$ is a set of all image blocks with the same size and image type,
2. $\mathcal{M}$ is a finite set of secret elements,
3. $\mathcal{K}$ is a finite set of secret keys,
4. $Em$ is an embedding function to embed a secret element in an image block, $Em: \mathcal{I} \times \mathcal{M} \times \mathcal{K} \to \mathcal{I}$,
5. $Ex$ is an extracting function to extract an embedded secret element from an image block, $Ex: \mathcal{I} \times \mathcal{K} \to \mathcal{M}$,
6. $Ex(Em(I, M, K), K) = M$ for all $(I, M, K) \in \mathcal{I} \times \mathcal{M} \times \mathcal{K}$.
Definition 2.2 A data hiding scheme $(\mathcal{I}, \mathcal{M}, \mathcal{K}, Em, Ex)$ is called a data hiding scheme (k, N, r), where k, N, r are positive integers, if each image block in $\mathcal{I}$ has N pixels and the embedding function $Em$ can embed r bits of secret data in an arbitrary image block by changing colours of at most k pixels in the image block.
Definition 2.3 For a given $q_{colour}$, a data hiding scheme (k, N, r) is called an optimal data hiding scheme if $r = MSDR_k(N)$ and there does not exist a positive integer $N'$ such that $N' < N$, $r = MSDR_k(N')$. Then N is denoted by $N_{optimum}$.
2.3 A New Digital Image Steganography Approach
This section introduces mathematical basis based on the Galois field for the digital image steganography problem (Subsection 2.3.1), proposes a digital image steganography approach based on the Galois field using graph theory and automata to design the data hiding scheme of the general form $(k, N, \lfloor \log_2 p^{mn} \rfloor)$ for the given assumptions, where k, m, n, N are positive integers and p is prime, shows sufficient conditions for existence and proves existence of some optimal data hiding schemes (Subsection 2.3.2). Security analyses and an application of these data hiding schemes to the process of hiding a finite sequence of secret data in an image are considered in Subsection 2
2.3.1 Mathematical Basis based on The Galois Field
This subsection constructs mathematical basis based on the Galois field $GF(p^m)$ for the digital image steganography problem, where p is prime and m is a positive integer (Propositions 2.2, 2.4 and Theorem 2.1).
Given the Galois field $GF(p^m)$, recalled in Subsection 1.1.4 of Chapter 1, where p is prime and m is a positive integer. Let $GF^n(p^m) = \{(x_1, x_2, \ldots, x_n) \mid x_i \in GF(p^m), i = \overline{1, n}\}$, where n is a positive integer, with two operations of vector addition + and scalar multiplication · defined as follows:
$x + y = (x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n)$,
$a \cdot x = (a x_1, a x_2, \ldots, a x_n)$.
a is referred to as the representative of [a] For simplicity, denote the
Lemma 2.1 For all $x, y \in GF^n(p^m)$, $[x] \cap [y] = \emptyset$ or $[x] = [y]$.
Proof Suppose $[x] \cap [y] \ne \emptyset$, then there exists z in $[x] \cap [y]$. By Definition 2.5, $z = ax = by$. Since $a \in GF(p^m) \setminus \{0\}$, $x = a^{-1}by$. Thus $x \in [y]$ and therefore $[x] \subseteq [y]$. Similarly, $[y] \subseteq [x]$.
Proposition 2.1 The set of all classes forms a partition of the set $GF^n(p^m)$.
Proof For all $x \in GF^n(p^m)$, $x \in [x]$ by Definition 2.5. Thus the union of all classes is $GF^n(p^m)$. By Lemma 2.1, any two distinct classes are disjoint. The proof is complete.
Denote the set of all classes by $[GF^n(p^m)]$. This can be represented by $[GF^n(p^m)] = \{[x] \mid x \in GF^n(p^m)\}$. The number of elements of a set S is denoted by $|S|$.
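Proof Let $y, y' \in [x]$ with $x \in GF^n(p^m) \setminus \{0\}$, then $y = ax$, $y' = bx$ for $a, b \in GF(p^m) \setminus \{0\}$. Since $x \ne 0$, $y \ne y'$ whenever $a \ne b$. Clearly, $|GF(p^m) \setminus \{0\}| = p^m - 1$ (see [88]). Since $x \ne 0$, then $|[x]| = p^m - 1$. By Proposition 2.1, $|[GF^n(p^m)] \setminus \{0\}| = \frac{p^{mn} - 1}{p^m - 1}$.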
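Definition 2.6 Suppose $S \subseteq [GF^n(p^m)] \setminus \{0\}$. Then S is called a k-[Generators] for the set $[GF^n(p^m)]$, where k is a positive integer, if for all $[v] \in [GF^n(p^m)] \setminus \{0\}$, $[v] \in \{[\sum_{i=1}^{t} a_i v_i] \mid a_i \in GF(p^m) \setminus \{0\}, [v_i] \in S, i = \overline{1, t}, t \le k\}$.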
Proposition 2.3 If S is a k-[Generators] for the set $[GF^n(p^m)]$, where k is a positive integer, then S does not depend on the choice of representatives of classes.
To prove that S does not depend on the choice of representatives of classes, it suffices to show that A = B. By the hypothesis $[v_i] = [v_i']$, then $v_i = b_i v_i'$. Suppose $[x] \in A$, then $x = \sum_{i=1}^{t} a_i v_i = \sum_{i=1}^{t} a_i b_i v_i'$. Clearly, $a_i b_i \ne 0$ by the definition of the class, then $[x] \in B$ and hence $A \subseteq B$.
Conversely, since $b_i \ne 0$, there exists $b_i^{-1}$, thus $v_i' = b_i^{-1} v_i$. Similarly, $B \subseteq A$.
Definition 2.7 Let V be a vector space over a field K, $S \subseteq V$. Then S is called a k-Generators for V, where k is a positive integer, if the two following conditions are satisfied:
a) For all $v, v' \in S$, $v \ne v'$, there does not exist $a \in K$ such that $v' = av$,
b) For all $v \in V \setminus \{0\}$, there exists t such that $1 \le t \le k$ and $v = \sum_{i=1}^{t} a_i v_i$, where $a_i \in K \setminus \{0\}$, $v_i \in S$, $i = \overline{1, t}$.
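Lemma 2.2 Let $S = \{v_1, v_2, \ldots, v_N\}$ be a k-Generators for the vector space $GF^n(p^m)$. Then $S' = \{[v_1], [v_2], \ldots, [v_N]\}$ is a k-[Generators] for the set $[GF^n(p^m)]$.
Proof Since S is a k-Generators for $GF^n(p^m)$, for all $v, v' \in S$, $v \ne v'$, there does not exist a in $GF(p^m)$ such that $v' = av$. By Proposition 2.1 and Definition 2.5, $[v] \ne [v']$ and $[v_i] \ne 0$ for all $v_i \in S$, $1 \le i \le N$. For all $v \in GF^n(p^m) \setminus \{0\}$, $v = \sum_{i=1}^{t} a_i v_i$, $1 \le t \le k$, $v_i \in S$, $a_i \in GF(p^m) \setminus \{0\}$, $i = \overline{1, t}$. Thus $[v] = [\sum_{i=1}^{t} a_i v_i]$ and hence
$[v] \in \{[\sum_{i=1}^{t} a_i v_i] \mid a_i \in GF(p^m) \setminus \{0\}, [v_i] \in S', i = \overline{1, t}, t \le k\}$.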
Lemma 2.3 Let $S' = \{[v_1], [v_2], \ldots, [v_N]\}$ be a k-[Generators] for the set $[GF^n(p^m)]$. Then $S = \{v_1, v_2, \ldots, v_N\}$ is a k-Generators for the vector space $GF^n(p^m)$.
Proof For all $v \in GF^n(p^m) \setminus \{0\}$, then
$[v] \in \{[\sum_{i=1}^{t} a_i v_i] \mid a_i \in GF(p^m) \setminus \{0\}, [v_i] \in S', i = \overline{1, t}, t \le k\}$,
$1 \le t \le k$. For all $[v], [v'] \in S'$, $[v] \ne [v']$, there does not exist a in $GF(p^m)$ such that $v' = av$ by Proposition 2.1. It means that for all $v, v' \in S$, $v \ne v'$, there does not exist a in
Theorem 2.1 There exists S to be a k-Generators for the vector space $GF^n(p^m)$ with $|S| = N$ if and only if there exists $S'$ to be a k-[Generators] for the set $[GF^n(p^m)]$ with $|S'| = N$.
Proof This is deduced immediately from Lemmas 2.2 and 2.3.
Proposition 2.4 Let c be the number of k-[Generators] of N elements for the set $[GF^n(p^m)]$. Then the number of k-Generators of N elements for the vector space $GF^n(p^m)$ is $c(p^m - 1)^N$.
a finite sequence of secret data of length $\lfloor \log_2 p^{mn} \rfloor$ bits in an image (Proposition 2.7 and Security analysis (2.27)).
Let $\mathcal{I}$ be a set of all image blocks with the same size and image type and assume that each image block in $\mathcal{I}$ has N pixels, where N is a positive integer. For simplicity, the structure of an arbitrary image block I in $\mathcal{I}$ can be represented by
$I = \{I_1, I_2, \ldots, I_N\}$,
where $I_i$ is a colour value for binary and gray images or a colour index in the palette for palette images of the i-th pixel in I with $i = \overline{1, N}$. Consider C to be a set of all colour values or indexes of pixels of $\mathcal{I}$.
Let $\mathcal{M}$ be a finite set of secret elements and set $\mathcal{M} = GF^n(p^m)$.
Let $\mathcal{K}$ be a finite set of secret keys. For all $K \in \mathcal{K}$, also assume that the structure of the key K is the same as the structure of the image block I. So, we can write
Definition 2.8 A weighted directed graph G = (V, E) is called a flip graph over the Galois field $GF(p^m)$ (for short, called a flip graph) if the two following conditions are satisfied:
1. V = C and for all $v \in V$, the vertex v is assigned a weight by a function Val such that $Val(v) \in GF(p^m)$,
2. For all $cp \in V$, $a \in GF(p^m) \setminus \{0\}$, there exists a unique arc $(cp, cp')$ in E such that this arc is assigned the weight a and $Val(cp') = Val(cp) + a$ (on $GF(p^m)$).
Given a flip graph G, we denote by Adjacent(cp, a) an adjacent vertex of cp (Adjacent(cp, a) is adjacent from cp), where the weight a is assigned to the arc (cp, Adjacent(cp, a)).
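The two conditions of Definition 2.8 can be checked mechanically. The sketch below does so for the gray-image graph over $GF(2)$ that appears later in the proof of Proposition 2.6 (Val(v) = v mod 2, arcs cp → cp + 1 and 255 → 254, all of weight 1). The dictionary-based representation and the names `gray_flip_graph` and `is_flip_graph` are assumptions made for this illustration.

```python
def gray_flip_graph():
    """Gray-image graph over GF(2): vertex weights and arcs keyed by (cp, a)."""
    val = {v: v % 2 for v in range(256)}
    # Keying by (vertex, weight) makes the arc for each weight unique by construction.
    adjacent = {(cp, 1): cp + 1 for cp in range(255)}
    adjacent[(255, 1)] = 254
    return val, adjacent


def is_flip_graph(val, adjacent, add, nonzero):
    """Check condition 2: for every cp and nonzero a there is an arc
    (cp, cp') of weight a with Val(cp') = Val(cp) + a in the field."""
    return all(
        (cp, a) in adjacent and val[adjacent[(cp, a)]] == add(val[cp], a)
        for cp in val
        for a in nonzero
    )


val, adjacent = gray_flip_graph()
gf2_add = lambda u, v: u ^ v  # addition in GF(2)
print(is_flip_graph(val, adjacent, gf2_add, nonzero=[1]))
```

Changing a pixel along an arc of this graph alters its gray value by exactly one level, which is what keeps the visual distortion of each flipped pixel minimal in the $q_{colour} = 1$ case.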
Assume that we build a flip graph G = (V, E).
From the way to determine the arc set E in Definition 2.8, assume that
Definition 2.9 Let £1 = (1,2, ,.N} x Gg c GE"), (cpl ©
function such that 8): CF"(p") x E1 > GF"tp™) dotined by Thou 6 iso
ð:(g, (6 cp)) = g— Val(cp}oi (on CFM ay")
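Definition 2.10 Let $\Sigma_2 = GF^n(p^m)$ and let $2^{\{1, 2, \ldots, N\} \times GF(p^m) \setminus \{0\}}$ be the set of all subsets of the set $\{1, 2, \ldots, N\} \times GF(p^m) \setminus \{0\}$. Then $\delta_2$ is a function such that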
Ttemark 2.1, Vor the case 1 4 9, then x —(—-q) £ 0 Since S is a & Generators for
GF"), S| — MS — fopug yen], thus Uhere caist RR Sku, CS,
L Sie < Nyue ¢ OF(W)\(0}, 0 — 1, such that» ( 9} — OK, at, (on GP),
So, $\delta_2$ given in Definition 2.10 is a function.
Definition 2.11 Let $I \in \mathcal{I}$, $M \in \mathcal{M}$ and $K \in \mathcal{K}$. The automaton A(I, M, K) is a five-tuple $(\Sigma, Q, q_0, \delta, F)$, where
1. The alphabet $\Sigma = \Sigma_1 \cup \Sigma_2$,
2 The set of states Q — {ai — TFT — DM, wimg — Sila 10,4),
i—TN, an_1 — ð (0N, M)},
3 The initial state 0,
4 The set of tinal states T— {gn—1},
5 The transition function 6: Q@ x7 — Q, (qi i} — ai — 1, 8 lan, M) — gn 1
Remark 2.2 The set of states Q and the transition function $\delta$ given in Definition 2.11 are completely determined based on the functions $\delta_1$, $\delta_2$, and it follows that the automaton A(I, M, K) is constructed accurately in Definition 2.11.
Let an image block $I \in \mathcal{I}$, a secret element $M \in \mathcal{M}$ and a key $K \in \mathcal{K}$. By using the automaton A(I, M, K) and the flip graph G, the two functions $Em$ and $Ex$ in the data hiding scheme $(\mathcal{I}, \mathcal{M}, \mathcal{K}, Em, Ex)$ are designed as follows.
The function $Em$ (embedding M in I):
Remark 2.4 Consider $I' = Em(I, M, K)$. By (2.5), $Em$ only changes colours of $|q|$ pixels in I based on the flip graph G, then $I' \in \mathcal{I}$. So, the function $Em$ designed satisfies Definition 2.1.
The function $Ex$ (extracting M from $I'$):
Proposition 2.5 For all $(I, M, K) \in \mathcal{I} \times \mathcal{M} \times \mathcal{K}$, $Ex(Em(I, M, K), K) = M$.
Proof Set $I' = Em(I, M, K)$ and $M' = Ex(I', K)$. By Definitions 2.9 and 2.11, $M' = \sum_{i=1}^{N} (Val(I'_i) + K_i)v_i$. (2.9)
After implementing (2.3), $q = \sum_{i=1}^{N} (Val(I_i) + K_i)v_i$. By Definitions 2.10 and 2.11, after implementing (2.4) we consider two cases of q.
If $q = \emptyset$, then (2.5) is not implemented and hence I is not changed. Thus $I' = I$ and therefore $M' = M$.
Otherwise Af 1 q) = SE, aru;,.g is computed by (23) 1 < 4 < N= TR,
KS ku, © Svar C OF (0), tien
J is changed in positions #;,t — I? by (2.5), i, is changed to I), Val(Z].}— Val(i,} — et
by the flip graph G
Theorem 2.2 Suppose that a k-Generators S for the vector space $GF^n(p^m)$ is found and a flip graph G is built. Then there exists the data hiding scheme $(k, N, \lfloor \log_2 p^{mn} \rfloor)$, where $N = |S|$.
Proof Under the assumption that a k-Generators S for $GF^n(p^m)$, $|S| = N$, is found and a flip graph G is built, we offer the way to construct the data hiding scheme $(\mathcal{I}, \mathcal{M}, \mathcal{K}, Em, Ex)$ based on the Galois field $GF(p^m)$ by using the flip graph G and the automaton A(I, M, K). $Em$ changes colours of at most k pixels in I to embed M in I for all $I \in \mathcal{I}$, $M \in \mathcal{M}$ by Definition 2.10 and Statement (2.5).
Consider B to be the set of all secret data of length r bits, then $|B| = 2^r$. $|\mathcal{M}| = p^{mn}$ by $\mathcal{M} = GF^n(p^m)$. Suppose that we construct an injective function f, $f: B \to \mathcal{M}$. Then $Em$ is used to embed $b \in B$ in I as follows:
$M = f(b)$, $I' = Em(I, M, K)$.
Since f is injective by our supposition, after extracting M from $I'$ by $Ex$, the secret data b will be determined accurately based on f.
Since B and $\mathcal{M}$ are finite sets, for the injective function f to exist we let $|B| \le |\mathcal{M}|$, that is $2^r \le p^{mn}$, then $r \le \log_2 p^{mn}$; choose $r = \lfloor \log_2 p^{mn} \rfloor$. So, for $r = \lfloor \log_2 p^{mn} \rfloor$, the r bits of the secret data b can be embedded in I. By Definition 2.2, the data hiding scheme $(\mathcal{I}, \mathcal{M}, \mathcal{K}, Em, Ex)$ is a data hiding scheme $(k, N, \lfloor \log_2 p^{mn} \rfloor)$. So, the data hiding scheme
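The choice $r = \lfloor \log_2 p^{mn} \rfloor$ and one possible injective function f can be sketched as follows. The proof only requires that some injective f exists; the base-$p^m$ digit expansion used here is an assumption chosen for illustration, with each digit in $\{0, \ldots, p^m - 1\}$ standing for a label of an element of $GF(p^m)$. The names `capacity`, `f` and `f_inverse` are hypothetical.

```python
def capacity(p, m, n):
    """r = floor(log2(p^(m*n))), computed exactly with integer arithmetic."""
    return (p ** (m * n)).bit_length() - 1


def f(b, p, m, n):
    """Map an r-bit integer b to a vector of n digits in base q = p^m."""
    q = p ** m
    if not 0 <= b < 2 ** capacity(p, m, n):
        raise ValueError("b does not fit in r bits")
    digits = []
    for _ in range(n):
        b, d = divmod(b, q)
        digits.append(d)
    return tuple(reversed(digits))  # most significant digit first


def f_inverse(vec, p, m):
    """Recover b from its digit vector."""
    q = p ** m
    b = 0
    for d in vec:
        b = b * q + d
    return b


print(capacity(2, 2, 4), capacity(3, 1, 2))
print(f(201, 2, 2, 4), f_inverse(f(201, 2, 2, 4), 2, 2))
```

When $p^{mn}$ is not a power of two, for example p = 3, some vectors of $GF^n(p^m)$ are never used as images of f, which is exactly the loss expressed by the floor in r.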
Security analysis of the proposed data hiding scheme $(k, N, \lfloor \log_2 p^{mn} \rfloor)$: Assume that the parameters k, N, $Em$, $Ex$, the vector space $GF^n(p^m)$ and the flip graph G in the data hiding scheme $(k, N, \lfloor \log_2 p^{mn} \rfloor)$ are published. The secret element M is extracted from $I'$ by the extracting function $Ex$ as follows, from Definitions 2.9 and 2.11 and by (2.9):
$$M = \sum_{i=1}^{N} (Val(I'_i) + K_i)v_i. \quad (2.11)$$
By (2.11), to extract M accurately, we need to know the k-Generators S for $GF^n(p^m)$ and the key K. Since the number of the k-Generators found is $c(p^m - 1)^N$ by Proposition 2.4, the number of choices for the k-Generators S is $c(p^m - 1)^N N!$ (note that the order of elements in S also affects the formula (2.11)). The number of choices for the key K is $p^{mN}$ because $K \in \mathcal{K}$. Consider GF to be an arbitrary subset of $2^{\lfloor \log_2 p^{mn} \rfloor}$ elements of the set $GF^n(p^m)$, and B to be the set of all secret data of length $\lfloor \log_2 p^{mn} \rfloor$ bits, that is $B = \{0, 1, \ldots, 2^{\lfloor \log_2 p^{mn} \rfloor} - 1\}$ in the decimal system. Then there exists a bijective function f, $f: B \to GF$. By (2.10), to decrypt the secret element M to the data b, we need to know f. The number of choices for the bijective function f is $C_{p^{mn}}^{2^{\lfloor \log_2 p^{mn} \rfloor}} \cdot 2^{\lfloor \log_2 p^{mn} \rfloor}!$. Then for a brute force attack, an attacker has to try every possible combination of S, K and f in the given data hiding scheme. The number of combinations of S, K and f is
Theorem 2.3 Suppose that a flip graph G is built. Then there exists the optimal data hiding scheme $(1, \frac{p^{mn}-1}{p^m-1}, \lfloor \log_2 p^{mn} \rfloor)$ for $q_{colour} = p^m - 1$.
Proof Set $S' = [GF^n(p^m)] \setminus \{0\}$, then $S'$ is a 1-[Generators] for $[GF^n(p^m)]$ by Definition 2.6. Consider $[u] \in S'$, then $S' \setminus \{[u]\}$ is not a 1-[Generators] for $[GF^n(p^m)]$ because $[u] \notin \{[av] \mid a \in GF(p^m) \setminus \{0\}, [v] \in S' \setminus \{[u]\}\}$ by Proposition 2.1. Therefore
$S'$ is a unique 1-[Generators] for $[GF^n(p^m)]$, (2.13)
and $|S'| = \frac{p^{mn}-1}{p^m-1}$ by Proposition 2.2. By Theorem 2.1, there exists a 1-Generators S for $GF^n(p^m)$, $|S| = |S'| = \frac{p^{mn}-1}{p^m-1}$. By (2.13) and Theorem 2.1, there does not exist another 1-Generators $S_0$ for $GF^n(p^m)$, $|S_0| < |S|$, then
S is a 1-Generators for $GF^n(p^m)$ with the smallest number of elements, (2.14)
and for k = 1, $N = |S| = \frac{p^{mn}-1}{p^m-1}$, we obtain
Proposition 2.6 For n a positive integer, there exists the optimal data hiding scheme $(1, 2^n - 1, n)$ for binary, gray and palette images with $q_{colour} = 1$.
Proof For $q_{colour} = 1$, from (2.1), p = 2, m = 1. If we build a flip graph G, then there exists the optimal data hiding scheme $(1, 2^n - 1, n)$ with $q_{colour} = 1$ by Theorem 2.3. The Galois field $GF(p^m) = GF(2)$ is the same as the field $\mathbb{Z}_2$ (see [88]). Next, we show ways to build flip graphs G = (V, E) on the field $\mathbb{Z}_2$ for binary, gray and palette images as follows.
For the binary image, C = {0, 1}, $cp \in C$, cp is a colour value of a pixel.
• V = C and for all $v \in V$, the vertex v is assigned a weight by a function Val such that Val(v) = v;
• $E = \{(cp, cp') \mid cp, cp' \in V, cp \ne cp'\}$ and every arc $(cp, cp')$ has the same weight 1.
For the gray image, C = {0, 1, ..., 255}, $cp \in C$, cp is a colour value of a pixel.
• V = C and for all $v \in V$, the vertex v is assigned a weight by a function Val such that Val(v) = v mod 2;
• $E = \{(255, 254), (cp, cp + 1) \mid cp \in V, 0 \le cp \le 254\}$ and every arc $(cp, cp')$ is assigned the same weight 1.
For the palette image, $C = \{0, 1, \ldots, 2^t - 1\}$, t is the number of bits to represent colour indexes, $cp \in C$, cp is a colour index of a pixel. The palette $P = \{p_0, p_1, \ldots, p_{2^t - 1}\}$, where $p_i$ is the colour corresponding to the colour index i, $i = \overline{0, 2^t - 1}$. To unify notations throughout this dissertation, the function Val in the FOPA method, recalled in Section 1.2 of Chapter 1, is renamed $Val_P$ here. Set $Val(cp) = Val_P(p)$, where the colour index $cp \in C$ corresponds to the colour $p \in P$.
By Definition 2.8, it is not difficult to verify that the graphs G for binary, gray and palette images built as above are all flip graphs on the field $\mathbb{Z}_2$. So, there exists the optimal data hiding scheme $(1, 2^n - 1, n)$ for binary, gray and palette images with $q_{colour} = 1$.
Notice that if we set $N = 2^n - 1$, then the data hiding scheme $(1, 2^n - 1, n)$ becomes the data hiding scheme $(1, N, \lfloor \log_2(N + 1) \rfloor)$. Remember that for N a positive integer, the data hiding scheme $(1, N, \lfloor \log_2(N + 1) \rfloor)$ for binary images with $q_{colour} = 1$ is the data hiding scheme CTL [18]. So, Proposition 2.6 shows that the data hiding scheme CTL reaches an optimal data hiding scheme for $N = 2^n - 1$, where n is a positive integer.
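For the binary case p = 2, m = 1, the scheme $(1, 2^n - 1, n)$ can be sketched concretely. The sketch assumes the natural choice $v_i = i$ (the n-bit binary representation of the pixel position), so that S is the set of all nonzero vectors of $GF^n(2)$, and it stores vectors as integers with XOR as vector addition; the function names `characteristic`, `embed` and `extract` are hypothetical and the code is not the dissertation's implementation.

```python
def characteristic(block, key):
    """q = sum over i of (Val(I_i) + K_i) * v_i in GF^n(2), with v_i = i."""
    q = 0
    for i, (pixel, k) in enumerate(zip(block, key), start=1):
        if pixel ^ k:       # (Val(I_i) + K_i) is 0 or 1 in GF(2)
            q ^= i          # vector addition in GF^n(2) is bitwise XOR
    return q


def embed(block, secret, key, n):
    """Embed an n-bit secret in a block of 2^n - 1 binary pixels,
    flipping at most one pixel."""
    size = 2 ** n - 1
    if len(block) != size or len(key) != size or not 0 <= secret < 2 ** n:
        raise ValueError("block and key need 2^n - 1 entries; secret needs n bits")
    diff = secret ^ characteristic(block, key)
    stego = list(block)
    if diff:                # diff is the index of the single pixel to flip
        stego[diff - 1] ^= 1
    return stego


def extract(stego, key):
    """Recover the secret as the characteristic value of the stego block."""
    return characteristic(stego, key)


block = [1, 0, 1, 1, 0, 0, 1]
key = [0, 1, 1, 0, 0, 1, 0]
stego = embed(block, 5, key, n=3)
print(stego, extract(stego, key))
```

Flipping pixel d changes the characteristic value by exactly $v_d = d$, so choosing d as the difference between the secret and the current value lands on the secret with one change, or none when they already agree.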
According to the proposed approach, the data hiding scheme $(2, |S|, \lfloor \log_2 p^{mn} \rfloor)$ is designed based on the assumption $q_{colour} = p^m - 1$ by (2.1). Now, we prove it to be optimal for $q_{colour} = p^m - 1$.
Suppose the data hiding scheme (2, N, r) is optimal for $q_{colour}$ =
By the assumption of the theorem, |S[ — then
ena Mom ey yo tom 2
Given an image F used as a carrier to embed a secret data sequence into, partition F into disjoint image blocks of N pixels, $F = \{F_1, F_2, \ldots, F_{t_2}\}$. Let $D = D_1 D_2 \ldots D_{t_1}$ be a secret data sequence embedded in the cover image F, where $D_i$ is secret data of length $\lfloor \log_2 p^{mn} \rfloor$ bits, $i = \overline{1, t_1}$. Since each $\lfloor \log_2 p^{mn} \rfloor$ bits of secret data is only embedded in one image block of F, $t_1 \le t_2$.
Let Jump be a bijective function used to determine the order of blocks in F in the process of hiding D in F, $Jump: \{1, 2, \ldots, t_2\} \to \{1, 2, \ldots, t_2\}$.
Consider GF to be an arbitrary subset of $2^{\lfloor \log_2 p^{mn} \rfloor}$ elements of the set $GF^n(p^m)$, and B to be the set of all secret data of length $\lfloor \log_2 p^{mn} \rfloor$ bits, that is $B = \{0, 1, \ldots, 2^{\lfloor \log_2 p^{mn} \rfloor} - 1\}$ in the decimal system. Then there exists a bijective function f, $f: B \to GF$.
In real applications, when applying the data hiding scheme $(k, N, \lfloor \log_2 p^{mn} \rfloor)$ based on the proposed approach to the process of hiding D in F, use the secret key set $\mathcal{K} = \{K^1, K^2, \ldots, K^{t_1}\}$ instead of one secret key. The process of hiding D in F by using the data hiding scheme $(k, N, \lfloor \log_2 p^{mn} \rfloor)$ consists of the embedding algorithm $Em_P$ and the extracting algorithm $Ex_P$ proposed as follows.
$M = Ex(F'_{Jump(i)}, K^i)$; // Use the automaton $A(F'_{Jump(i)}, M, K^i)$ (2.24)
$D = D_1 D_2 \ldots D_{t_1}$;
Proposition 2.7 For a cover image F, a secret data sequence D, a bijective function Jump, a bijective function f, a secret key set $\mathcal{K}$ and the data hiding scheme $(k, N, \lfloor \log_2 p^{mn} \rfloor)$ based on the proposed approach given as above, suppose the stego image $F'$ is generated after D is embedded in F by the embedding algorithm $Em_P$. Then the data sequence $D'$ extracted from $F'$ by the extracting algorithm $Ex_P$ is exactly the secret data sequence D.
Proof By (2.21) and (2.23), $Em$ in (2.22) and $Ex$ in (2.24) use the same secret key $K^i$. The bijective function Jump guarantees that for all $i, j \in \{1, 2, \ldots, t_1\}$, $i \ne j$, $Jump(i) \ne Jump(j)$, which means that an arbitrary image block in F is used at most one time in the process of hiding. By Proposition 2.5, M extracted by (2.24) is the same as M embedded by (2.22). Then the bijective function f guarantees that $D_i$ encrypted by (2.20) is the same as $D_i$ decrypted by (2.25), $i \in \{1, 2, \ldots, t_1\}$. Therefore we complete
From (2.26) and (2.25), to extract accurately $D_i$, $i = \overline{1, t_1}$, we need to know the k-Generators S for $GF^n(p^m)$, the key set $\mathcal{K}$ and two bijective functions Jump, f. Since the number of the k-Generators found is $c(p^m - 1)^N$ by Proposition 2.4, the number of choices for the k-Generators S is $c(p^m - 1)^N N!$. The numbers of choices for the key set $\mathcal{K}$ and the two bijective functions Jump and f are $p^{mNt_1}$, $t_2!$ and $C_{p^{mn}}^{2^{\lfloor \log_2 p^{mn} \rfloor}} \cdot 2^{\lfloor \log_2 p^{mn} \rfloor}!$ (see the security analysis of the data hiding scheme $(k, N, \lfloor \log_2 p^{mn} \rfloor)$ as above), respectively.
Then for a brute force attack, an attacker has to try every possible combination of S, $\mathcal{K}$, Jump and f in the given process of hiding. The number of combinations of S, $\mathcal{K}$, Jump and f is
2.4 The Near Optimal and Optimal Data Hiding Schemes for
Gray and Palette Images
This section shows that there exist the near optimal data hiding scheme $(2, 9, 8)$ (Theorem 2.5 and Security analyses (2.45), (2.46)) and the optimal data hiding scheme $(1, 5, 4)$ (Corollary 2.1 and Security analyses (2.47), (2.48)) for gray and palette images with $q_{colour} = 3$.
According to the way of constructing the Galois field $GF(p^m)$ from the polynomial ring $Z_p[x]$, where $p$ is prime and $m$ is a positive integer [88], here consider the case $p = m = 2$ and use the irreducible polynomial $g(x) = x^2 + x + 1$ in $Z_2[x]$ to construct the Galois field $GF(2^2)$ from the polynomial ring $Z_2[x]$, we obtain the Galois field $GF(2^2)$ as follows
$GF(2^2) = \{0, 1, x, x + 1\}$
with two operations addition $+$ and multiplication $\cdot$ defined as in $Z_2[x]$, followed by a reduction modulo $g(x)$.
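The two operations can be made concrete by a short sketch. It assumes, only for illustration, that an element $a_1 x + a_0$ is stored as the coefficient pair `(a1, a0)`; the function names are hypothetical. Multiplication is carried out in $Z_2[x]$ and then reduced with $x^2 = x + 1$, which is the reduction modulo $g(x) = x^2 + x + 1$.

```python
# Elements of GF(2^2) as coefficient pairs (a1, a0) of a1*x + a0.
ZERO, ONE, X, X_PLUS_1 = (0, 0), (0, 1), (1, 0), (1, 1)
ELEMENTS = [ZERO, ONE, X, X_PLUS_1]

def gf4_add(a, b):
    """Coefficient-wise addition in Z_2."""
    return ((a[0] + b[0]) % 2, (a[1] + b[1]) % 2)

def gf4_mul(a, b):
    """Multiply in Z_2[x], then reduce modulo g(x) = x^2 + x + 1."""
    a1, a0 = a
    b1, b0 = b
    c2 = a1 * b1              # coefficient of x^2
    c1 = a1 * b0 + a0 * b1    # coefficient of x
    c0 = a0 * b0              # constant term
    # x^2 is congruent to x + 1, so c2 is added to both lower coefficients.
    return ((c1 + c2) % 2, (c0 + c2) % 2)

print(gf4_mul(X, X))          # x * x = x + 1
print(gf4_mul(X, X_PLUS_1))   # x * (x + 1) = 1
```

In particular $x \cdot (x + 1) = 1$, so $x$ and $x + 1$ are inverse to each other, as expected in a field.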
Notice that the polynomial $g(x)$ is irreducible in $Z_2[x]$. Indeed, if $g(x)$ has factors being different from the constant, then the factors of $g(x)$ are only polynomials of degree 1 and hence $g(x)$ has roots in $Z_2$, this can not happen because $g(0) = g(1) = 1$.
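The irreducibility argument can also be checked exhaustively. The following sketch, with hypothetical helper names, evaluates $g$ at both elements of $Z_2$ and multiplies out every pair of degree 1 polynomials over $Z_2$; polynomials are stored as coefficient lists from degree 0 upward.

```python
from itertools import product

def poly_mul_mod2(a, b):
    """Multiply two polynomials over Z_2 (coefficients from degree 0 upward)."""
    result = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            result[i + j] ^= ai & bj
    return result

def evaluate_mod2(poly, value):
    """Evaluate a polynomial at value and reduce modulo 2."""
    return sum(c * value ** i for i, c in enumerate(poly)) % 2

g = [1, 1, 1]                 # g(x) = x^2 + x + 1
linear = [[0, 1], [1, 1]]     # x and x + 1, the only degree 1 polynomials

roots = [v for v in (0, 1) if evaluate_mod2(g, v) == 0]
factorizations = [(a, b) for a, b in product(linear, repeat=2)
                  if poly_mul_mod2(a, b) == g]

print(roots, factorizations)
```

Both lists are empty: $g$ has no root in $Z_2$ and is not a product of two polynomials of degree 1.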
To save memory space, this section writes all polynomials of $GF(2^2)$ by sequences of their coefficients and then denotes the sequence of any polynomial's coefficients by a binary string and a decimal number as in Table 2.1.
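A minimal sketch of this encoding is given below. It assumes that the coefficient of the higher degree is written first, so that $a_1 x + a_0$ becomes the binary string $a_1 a_0$; this bit order is an assumption of the sketch, and the names are hypothetical.

```python
# Each element a1*x + a0 of GF(2^2) as the coefficient sequence (a1, a0).
ELEMENTS = {"0": (0, 0), "1": (0, 1), "x": (1, 0), "x + 1": (1, 1)}

def to_binary(coeffs):
    """Write the coefficient sequence as a binary string (assumed order: a1 a0)."""
    return "".join(str(c) for c in coeffs)

def to_decimal(coeffs):
    """Read the binary string as a decimal number."""
    return int(to_binary(coeffs), 2)

table = {name: (to_binary(c), to_decimal(c)) for name, c in ELEMENTS.items()}
for name, (binary, decimal) in table.items():
    print(f"{name:6} {binary} {decimal}")
```

Under this assumption the four elements $0, 1, x, x + 1$ correspond to the decimal numbers $0, 1, 2, 3$.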
From Table 2.1, to be convenient for programming, hereafter, $GF(2^2)$ can be considered in decimal system by $GF(2^2) = \{0, 1, 2, 3\}$. Then two operations in $GF(2^2)$ are presented as in Table 2.2.
Based on the binary representation of $GF(2^2)$ as in Table 2.1, consider the case $n = 4$, then every vector in the vector space $GF^4(2^2)$ over the field $GF(2^2)$ can be written as a string of length 8 bits, it means that in the decimal system $GF^4(2^2)$ can be presented by $GF^4(2^2) = \{0, 1, \ldots, 255\}$. Thus two operations of vector addition $+$ and scalar multiplication $\cdot$ on $GF^4(2^2)$ are completely determined based on the operations on the Galois field $GF(2^2)$ in Table 2.2.
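The byte representation can be sketched as follows, assuming that a vector of $GF^4(2^2)$ is packed as four 2-bit components with the first component in the highest bits (an assumption of this sketch; the function names are hypothetical). Addition in $GF(2^2)$ is a bitwise XOR, so vector addition is the XOR of two bytes, and scalar multiplication applies the field multiplication to each component.

```python
def gf4_mul_dec(a, b):
    """Multiply two elements of GF(2^2) given as decimals 0..3."""
    r = 0
    for i in range(2):
        if (b >> i) & 1:
            r ^= a << i       # carry-less multiplication in Z_2[x]
    if r & 0b100:
        r ^= 0b111            # reduce modulo g(x) = x^2 + x + 1
    return r

def vec_add(u, v):
    """Vector addition on GF^4(2^2): component-wise addition is XOR."""
    return u ^ v

def vec_scalar_mul(a, v):
    """Multiply every 2-bit component of the byte v by the scalar a."""
    result = 0
    for shift in (6, 4, 2, 0):
        component = (v >> shift) & 0b11
        result |= gf4_mul_dec(a, component) << shift
    return result

v = 0b10110100                # components (2, 3, 1, 0)
print(vec_add(v, 0b01100011)) # 215
print(vec_scalar_mul(2, v))   # 216, components (3, 1, 2, 0)
```

Every result stays in $\{0, 1, \ldots, 255\}$, so the vector space operations need only byte arithmetic and the small multiplication rule of $GF(2^2)$.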
Table 2.2 Operations $+$ and $\cdot$ on the Galois field $GF(2^2)$
By Proposition 2.2, the number of subsets of $N$ elements of the set $GF^4(2^2) \setminus \{0\}$ is $C_{255}^{N}$. Then to find a 2-Generators $S$ for the set $GF^4(2^2)$, $|S| = N$, we need to try $C_{255}^{N}$ subsets