Password recovery for encrypted ZIP archives using GPUs
Pham Hong Phong
Hanoi University of Technology
phongph@it-hut.edu.vn
Phan Duc Dung
Hanoi University of Technology
ducdung872001@gmail.com
Duong Nhat Tan
Hanoi University of Technology
dn.tan7388@gmail.com
Nguyen Huu Duc
Hanoi University of Technology
ducnh-fit@mail.hut.edu.vn
Nguyen Thanh Thuy
Hanoi University of Technology
thuynt@it-hut.edu.vn
ABSTRACT
Password protection of documents such as DOC and PDF files or RAR and ZIP archives has been shown to be weak against dictionary attacks. The time needed to recover the password of such a document depends mainly on two factors: the size of the password search space and the computing power of the underlying system. In this paper, we present an approach that uses modern multi-core graphics processing units (GPUs) as computing devices for finding lost passwords of ZIP archives. Combining the very high computing power of GPUs with state-of-the-art password structure analysis methods yields a feasible solution for recovering ZIP file passwords. We first apply password generation rules [9] to generate a reasonable password space, and then use GPUs to exhaustively verify every password in that space. The experimental results show that the password verification speed increases by a factor of about 48 to 170 (depending on the number of GPUs) compared to sequential execution on an Intel Core 2 Quad Q8400 2.66 GHz. These results demonstrate the potential applicability of GPUs in this field of cryptanalysis.
Categories and Subject Descriptors
E.3 [Data]: Data encryption—code breaking
General Terms
Security, Performance
Keywords
GPU, ZIP, password recovery
1 INTRODUCTION
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page To copy otherwise, to
republish, to post on servers or to redistribute to lists, requires prior specific
permission and/or a fee.
SoICT’10, August 27-28, 2010, Hanoi, Vietnam.
Copyright 2010 ACM 978-1-4503-0105-3/10/08 $10.00.
Figure 1: ZIP archive encryption and decryption processes
While cryptographic research aims to protect against information leakage in data storage and data communication, cryptanalysis, in contrast, tries to discover information protected by encryption. These two research branches are considered two sides of the same information security issue: progress in one promotes progress in the other. Historically, an advance of one branch over the other has sometimes had great consequences, even deciding the fate of a nation. For example, the successful cryptanalysis of the Zimmermann Telegram in the First World War brought the United States into the war, and the success in breaking German ciphers helped shorten the Second World War by a few months [5].
From the same point of view, this paper studies a cryptanalysis method for password-protected ZIP archives. Compression methods such as PKZip, Deflate and LZMA were originally designed to reduce data size, making data storage and data communication more efficient. Because information security usually goes hand in hand with data storage and information exchange, popular compression tools such as WinZip, which implement efficient compression algorithms, often integrate data encryption functions, typically based on common symmetric encryption systems such as DES or AES.
For convenience, the encryption key is generated from the sender's password by a hash function, and this key is used to encrypt the document. The password is then transferred to the recipient via a secure channel and used to generate the same key for decrypting the document. The process of encrypting and decrypting a protected ZIP file is shown in Figure 1.
In this paper, we try to recover the content of an encrypted ZIP file without knowing its protection password. For weak encryption systems such as RC4 or DES, cryptanalysis can be carried out by an exhaustive attack on the whole key space in an acceptable time, as the results of [4] and [6] have shown.
For strong encryption systems such as AES (commonly used in WinZip 9.0 and later), such an exhaustive attack is practically impossible. AES [1], the Advanced Encryption Standard, is a block cipher that works on 128-bit data blocks (4x4 bytes) with a key length of 128, 192 or 256 bits. AES can be implemented at high speed in software or hardware and does not require much memory. For AES-128 and AES-256, with key sizes of 128 and 256 bits, there are up to $2^{128}$ and $2^{256}$ keys in the key space to test. Courtois and Pieprzyk's XSL algorithm [2] is claimed to reduce the key space of AES-128 from $2^{128}$ keys to about $2^{100}$ keys, but even so, trying all the possibilities is not feasible for common computing systems. Our approach in this paper does not attack the AES key space directly. Instead, we exploit the fact that the encryption key of a protected ZIP file is generated from a user password by a hash function which is published in the ZIP file specification [8]. Although password spaces are also large, attacking a password space is much more feasible than attacking the AES key space, since dictionary attack methods can be applied. The major obstacle of this cryptanalysis method is the high computational complexity of the hash function. It is made even more difficult by the password salting techniques implemented in new versions of ZIP tools, which prevent the use of pre-computed attack methods.
To overcome these difficulties, we first employ the recent password structure analysis method of Weir, Aggarwal, Medeiros, and Glodek [9] to reduce the size of the password search space. We then use the very high computing power of modern multi-core GPUs to implement the complex hash function, concurrently verifying passwords from the password search space and generating AES encryption keys for all possibly-correct passwords. Finally, we apply plaintext recognition techniques to find the one correct answer in the set of possibly-correct passwords.
Our experimental program, written in CUDA and running on a PC with an Intel Core 2 Quad 2.66 GHz CPU and two NVIDIA GeForce GTX 295 graphics cards, achieves a password checking speed of about 5,011 passwords per second, 170 times faster than the CPU-based sequential program on the same system. Given a good password structure and a large dictionary, this result shows that the proposed algorithm can recover ZIP file passwords in a reasonable time, in contrast to the CPU-based version.
In the rest of this paper, we briefly introduce GPU and CUDA technologies for general-purpose applications, describe the details of the proposed algorithm, and present the experimental results.
2 CUDA AND GPGPU
In recent years, the computing power of graphics processors (GPUs) has increased significantly compared to that of CPUs. By June 2008, NVIDIA's GT200 GPU generation had reached 933 GFLOPS, more than 10 times the performance of a contemporary dual-core Intel Xeon 3.2 GHz processor. Figure 2 shows the massive increase in computing power of NVIDIA graphics processors compared to Intel processors.
Figure 2: NVIDIA GPU-Intel CPU performance comparison
This superiority in performance does not imply superiority in technology. GPUs and CPUs are developed in two different directions: while CPU technology speeds up a single task, GPU technology tries to increase the number of tasks that can be performed in parallel. Thus, while the number of cores in common CPUs has not yet reached 8, the number of cores in a single GPU has reached 240 and is expected to rise to about 500 in 2010. The price of this computing power is a loss of flexibility in the processing cores. Currently, all processing cores on a single GPU can only execute a single piece of code at a time, so GPUs are only suitable for data-parallel problems, in which the same program code is executed in parallel on many different data sets. Fortunately, most problems that require large computing power can be converted to a data-parallel form.
Besides improving GPU computing power, GPU manufacturers are also interested in providing better application development environments so that ordinary developers can easily program GPUs. NVIDIA CUDA [7] is a good example of such an effort. With CUDA, programmers can exploit GPU computing power not only for graphics processing applications but also for general-purpose applications. This technology is one of the important factors behind the recent GPGPU (General-Purpose computation on Graphics Processing Units) era. The following are some key features of the programming language supported by CUDA (called the CUDA language):
• The CUDA language is an extension of the C language, so it is familiar to most developers.
• CUDA code is divided into two parts: one executed on the CPU and the other executed on the GPU. The part executed on the GPU, also known as a parallel kernel, can be executed in parallel by thousands of execution threads when called. Each thread has a unique identifier used to determine its task.
• CUDA allows programmers to define an arbitrary number of parallel threads, but to avoid dependence on the hardware, threads are divided into blocks, with at most 512 threads per block on the GT200 generation. This allows a programmer to design a parallel program effectively without caring about the hardware capability.
• Memory is hierarchically organized for effective usage:
– Main memory: the memory area for CPU code. Only this code can access and modify information here.
– Global memory: the memory area that all GPU threads can access. Programmers can move data from main memory to global memory by using functions from a basic CUDA library. This memory is often used to store inputs and outputs of the parallel threads on GPUs.
– Shared memory: the memory area that only the threads of one block can access. This memory is integrated on-chip, so accessing data in it is much faster than in global memory. It is often used to store temporary data shared among the threads of a block in order to speed up memory accesses.
– Local memory: the memory area allocated to the local variables of each thread. One GPU thread cannot access the local memory of another.
With the ability to run data-parallel work on so many threads, the GPU is an appropriate choice for the problem of ZIP file cryptanalysis, where each thread can take one password from the password search space and check it. The next section explains the details of our GPU-based password recovery algorithm for protected ZIP files using CUDA.
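As a rough illustration of how each thread's unique identifier determines its task, the following Python sketch emulates the usual CUDA indexing idiom (block index times block size plus thread index). The function names and the block size of 256 are assumptions made for illustration only; the thread count of 32,768 is the value of p reported in Section 4.

```python
def launch_config(num_threads, threads_per_block):
    """Return (number of blocks, threads per block) covering num_threads."""
    blocks = -(-num_threads // threads_per_block)  # ceiling division
    return blocks, threads_per_block


def global_thread_id(block_idx, block_dim, thread_idx):
    """Emulate blockIdx.x * blockDim.x + threadIdx.x from a CUDA kernel."""
    return block_idx * block_dim + thread_idx


# Hypothetical launch: 32,768 threads in blocks of 256 threads each.
blocks, block_dim = launch_config(32768, 256)

# Thread t would check the password at index t of the current batch.
ids = [global_thread_id(b, block_dim, t)
       for b in range(blocks)
       for t in range(block_dim)]
```

With this mapping every password index of a batch is visited by exactly one thread, which is the property the password checking scheme relies on.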
3 RECOVERING THE ZIP FILE PASSWORD ON GRAPHICS PROCESSORS
As introduced in Section 1, our approach to ZIP file cryptanalysis is to attack the password space instead of directly attacking the AES key space. The whole password recovery algorithm is divided into three main steps:
1. Apply the password structure analysis algorithm from [9] to find an appropriate password structure, and then use this structure to reduce the size of the password search space.
2. Exploit the computing power of GPUs to accelerate the preliminary password checking process. The result of this process is a set of possibly-correct passwords (called candidate passwords) whose size is much smaller than that of the password search space.
3. Apply a plaintext recognition technique to verify each candidate password and find the one correct password of the ZIP file.
In this paper, we concentrate on steps one and two of the algorithm.
3.1 Strategy
According to the ZIP file specification [8], checking one password consists of the following steps:
1. Generate an AES key from the given password using the hash function described in the specification, PBKDF2(pw, salt, dkLen), where pw is the password to check, salt is a random value stored in the compressed file, and dkLen is the AES key size. The function PBKDF2 has a high computational cost: it performs the HMAC-SHA1 algorithm 1,000 times, which hinders attacks from common computing systems. The random value salt prevents pre-computed attacks.
2. Decrypt and decompress the encrypted ZIP file using the AES key obtained in the previous step. This step also produces a check value that is compared to a MAC (message authentication code) value stored in the archive to decide whether the password is correct.
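To make the cost of step 1 concrete, the sketch below spells out one output block of PBKDF2 with HMAC-SHA1 as defined in RFC 2898, using Python's standard `hmac` and `hashlib` modules, and compares it with the library routine `hashlib.pbkdf2_hmac`. It is an illustration only, not the GPU kernel of this paper; the function name `pbkdf2_sha1_block` and the sample password and salt are assumptions.

```python
import hashlib
import hmac


def pbkdf2_sha1_block(password, salt, iterations, block_index=1):
    """Compute one 20-byte PBKDF2-HMAC-SHA1 output block (RFC 2898)."""
    # U_1 = HMAC(password, salt || INT(block_index))
    u = hmac.new(password, salt + block_index.to_bytes(4, "big"),
                 hashlib.sha1).digest()
    t = u
    # U_i = HMAC(password, U_{i-1}); the block is the XOR of all U_i.
    for _ in range(iterations - 1):
        u = hmac.new(password, u, hashlib.sha1).digest()
        t = bytes(a ^ b for a, b in zip(t, u))
    return t


# Made-up inputs; 1,000 is the iteration count stated in the text.
manual = pbkdf2_sha1_block(b"password", b"salt", 1000)
reference = hashlib.pbkdf2_hmac("sha1", b"password", b"salt", 1000, 20)
```

Each candidate password therefore costs 1,000 chained HMAC-SHA1 evaluations per 20-byte output block, and the chain cannot be shortened because every evaluation depends on the previous one. The parallelism exploited on the GPU is across passwords, not inside a single derivation.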
Under this specification, if we carry out both steps completely for every password, the time needed to check the entire password space is very large, because decrypting and decompressing the whole encrypted ZIP file is extremely expensive. Instead, in step two we can decrypt and decompress only a part of the ZIP file, and then use a plaintext recognition technique to quickly check the validity of the generated AES key.
The password recovery time for a protected ZIP file depends strongly on the size of the given password space. An exhaustive approach enumerates all possible passwords whose lengths do not exceed a specific number. For example, the set of passwords over the character set {a-z,A-Z,0-9} with a maximum length of 6 contains 57,731,386,987 passwords in total (counting all lengths from 0 to 6). Due to the high computational cost of the PBKDF2 function, checking such a password search space would require huge computational resources. Instead of the exhaustive approach, we employ a recent result of password structure analysis from [9] to reduce the size of the password search space. That work uses statistical results on how users choose and remember passwords to construct password structures that describe a much smaller password space containing the passwords with the highest occurrence probabilities.
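The size of the exhaustive space can be reproduced with a few lines of Python. The sketch below is only a back-of-the-envelope check: the function name is hypothetical, the figure quoted above is obtained when lengths 0 through 6 are all counted, and the rate of 5,011 passwords per second is the GPU speed reported in Section 1.

```python
def exhaustive_space_size(alphabet_size, max_len, min_len=0):
    """Number of strings over the alphabet with min_len <= length <= max_len."""
    return sum(alphabet_size ** k for k in range(min_len, max_len + 1))


# 26 lower-case + 26 upper-case + 10 digits = 62 characters.
size = exhaustive_space_size(62, 6)

# Time to test the whole space at the reported rate of 5,011 passwords/s.
seconds = size / 5011
days = seconds / 86400
```

Under these assumptions the full search of passwords up to length 6 takes a little over 133 days even at the reported GPU speed, and each additional character multiplies the effort by 62, which motivates the reduction of the search space.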
Another strategy we consider in this paper is to take advantage of the user-friendly features of compression tools. For example, WinZip allows incorrect passwords to be detected rapidly by storing a two-byte password verification value (PVV) in the header of the ZIP file. This value is compared to a part of the output of the function PBKDF2 to quickly reject most incorrect passwords. If a password is accepted, it is not yet guaranteed to be correct; however, the number of accepted passwords is significantly smaller than the size of the initial password space. We call them candidate passwords. Since the execution of the hash function PBKDF2 accounts for most of the workload of the checking process, our strategy is to implement this hash function on GPUs so that the passwords in the given password space are checked efficiently in parallel.
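The PVV filter can be sketched in Python with the standard `hashlib.pbkdf2_hmac` function. The layout of the derived bytes used here (encryption key, then an authentication key of the same length, then the two verification bytes) is an assumption based on the publicly documented WinZip AES format and should be checked against the specification [8]; the helper names, the salt and the guesses are made up for illustration.

```python
import hashlib

ITERATIONS = 1000  # iteration count stated in the text


def derive(password, salt, key_bytes=16):
    """Derive key material; assumed layout: enc key | auth key | 2-byte PVV."""
    return hashlib.pbkdf2_hmac("sha1", password.encode("utf-8"), salt,
                               ITERATIONS, 2 * key_bytes + 2)


def passes_pvv(password, salt, pvv, key_bytes=16):
    """Return True if the password survives the two-byte quick check."""
    return derive(password, salt, key_bytes)[-2:] == pvv


# Made-up archive data: an 8-byte salt and the PVV of the true password.
salt = bytes(range(8))
pvv = derive("6class$$4", salt)[-2:]

guesses = ["6hello$$4", "6class$$4", "6world$$4"]
candidates = [g for g in guesses if passes_pvv(g, salt, pvv)]
```

If the derived bytes behave like uniformly random values, a wrong password passes this check with probability about 1/65,536, so the surviving candidates still have to be confirmed by the later decryption and plaintext recognition step.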
In the next two sub-sections, we explain in detail the password structure analysis algorithm used to reduce the password search space, and the implementation of PBKDF2 on GPUs.
3.2 Reducing password search space using the password structure analysis technique
There are two factors in evaluating the quality of a password space: the number of passwords it contains and the probability of finding the correct password in it. Let us analyse two approaches to forming a password space:
1. Full space. This space contains all passwords. It always satisfies the second criterion, since the correct password can always be found by an exhaustive search. However, the full space is normally so large that an exhaustive search becomes impractical.
2. Partial space. Instead of exhaustively searching the full space, we can choose a subset of it for the search algorithm. Since the probability that the correct password occurs in the space is an important factor for the success of the search, we should choose the subset so that this probability is as high as possible.
In the second approach, external knowledge about password structure is normally used to determine the subset. Restricting the character set and limiting the password length are two naive techniques for choosing a partial space. They are, however, not very practical, since a space large enough to contain a long correct password is normally too large to search. Instead, we can construct the password search space in order of the occurrence probabilities of passwords. The order in which passwords are generated is very important: it helps to find the correct password quickly, so that the search can end immediately without checking the rest of the password space. Many criteria can influence the order of checking passwords. For example, the correct password might contain no more than 10% upper-case characters, or no more than 20% special characters and digits. We call all such criteria password generation rules. Clearly, passwords generated by rules are of higher quality than those generated by the naive techniques. In this paper, we adopt the approach that uses password generation rules based on descending occurrence probabilities of password structures. This technique was originally proposed by Weir, Aggarwal, Medeiros, and Glodek [9].
The naive techniques treat all user passwords as equally probable. According to statistics on the occurrence probabilities of actual passwords, this assumption is not realistic. For example, the password “word12” has a higher occurrence probability than the password “P@$$W0rd!12”.
Assume that a password is a combination of alphabetic, numeric and special characters. We denote alphabetic characters by L, numeric characters by D, and special characters by S. The password “$password12” can then be structurally denoted as SLD; this is called the simple structure. If we add the number of characters of each kind to a simple structure, we obtain a base structure, e.g. S1L8D2. One important refinement of the base structure is the pre-terminal structure, which is generated from the base structure by filling in specific values for its D and S parts. For example, one pre-terminal of S1L8D2 is $L812. We calculate the probability of a pre-terminal as the product of the probability of the base structure, the occurrence probabilities of its special characters and the occurrence probabilities of its numeric characters, all of which can be pre-computed from a dictionary. The algorithm published in [9] can be briefly described as follows:
• The password generation rules are given in the form of a context-free grammar G = (V, Σ, S, P), where V and Σ are finite sets of variables and terminals, S is the start variable, and P is a finite set of productions of the form α → β, where α is a single variable and β is a sequence of variables and terminals. Table 1 gives an example of a grammar together with the probabilities of its rules.
• Pre-terminal structures are generated in order of decreasing probability by the following tree-building steps:
– Put S as the root of the tree.
– The children of the root are the pre-terminals with the highest probabilities, derived from the base structures that are immediately obtained from S. Note that a pre-terminal is generated from a base structure by substituting all occurrences of S and D with the corresponding special characters and numbers as given in the grammar.
– The tree advances towards each leaf by replacing a higher-probability special character or number in a pre-terminal with a lower-probability one.
– Figure 3 shows the tree generated in this way, which yields the pre-terminals in order of decreasing probability.
• A password is generated from a pre-terminal structure by substituting the meta character L in the structure with a meaningful word from the given dictionary.
Table 1: An example of a password grammar
Production        Probability
S → D1L5S2D1      0.75
S → S2L4D1S1      0.25
The set of passwords generated by this approach is reasonably small in comparison to those generated by the naive techniques. It is used as the input password space for the password verification process on GPUs described in the next sub-section.
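The structure notation and the ordering of pre-terminals can be illustrated with a short Python sketch. It is a simplification under stated assumptions: instead of the tree construction described above, it enumerates every filling of the D and S parts and sorts the results by probability, which is only practical for tiny grammars. The function names and all fill-in probabilities are invented for illustration; only the two productions come from Table 1.

```python
import itertools
import re


def base_structure(password):
    """Map a password to its base structure, e.g. '$password12' -> 'S1L8D2'."""
    def kind(ch):
        if ch.isalpha():
            return "L"
        if ch.isdigit():
            return "D"
        return "S"
    groups = itertools.groupby(password, key=kind)
    return "".join(f"{k}{len(list(run))}" for k, run in groups)


def pre_terminals(base_rules, fill_probs):
    """List (probability, parts) pairs in order of decreasing probability.

    L parts stay as placeholders such as 'L5'; D and S parts are filled in.
    """
    result = []
    for base, p_base in base_rules.items():
        slots = []
        for part in re.findall(r"[LDS]\d+", base):
            if part.startswith("L"):
                slots.append([(part, 1.0)])
            else:
                slots.append(list(fill_probs[part].items()))
        for combo in itertools.product(*slots):
            prob = p_base
            for _, p in combo:
                prob *= p
            result.append((prob, tuple(text for text, _ in combo)))
    # Brute-force sort; this stands in for the tree-based generation.
    result.sort(key=lambda item: -item[0])
    return result


def expand(parts, dictionary):
    """Substitute each L placeholder with dictionary words of matching length."""
    slots = []
    for part in parts:
        if part.startswith("L"):
            slots.append([w for w in dictionary if len(w) == int(part[1:])])
        else:
            slots.append([part])
    return ["".join(choice) for choice in itertools.product(*slots)]


# Grammar of Table 1; the fill-in probabilities below are invented.
rules = {"D1L5S2D1": 0.75, "S2L4D1S1": 0.25}
fills = {
    "D1": {"4": 0.6, "6": 0.4},
    "S1": {"!": 1.0},
    "S2": {"$$": 0.7, "!!": 0.3},
}
ranked = pre_terminals(rules, fills)
best_prob, best_parts = ranked[0]
```

Pre-terminals are kept as tuples of parts rather than flat strings so that a placeholder such as L8 followed by the digits 12 cannot be misread. Calling `expand(best_parts, words)` with a word list then yields the concrete passwords that would be handed to the GPU for checking.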
3.3 Verifying candidate passwords on GPUs
Assume that the input password space contains n passwords. In theory, all n passwords can be checked at the same time (to determine whether each one is a candidate password) by launching p = n GPU threads. In practice, the number of threads p is limited by hardware resources, and p is usually much smaller than n. Therefore, to check n passwords we need about n/p sequential calls, each feeding a batch of p passwords to p threads that check them in parallel, and the password search space has to be divided into corresponding batches. Figure 4 illustrates this checking scheme.
The algorithm code is divided into two parts: a sequential part executed on the CPU and a parallel part executed on the GPUs. The first part generates pre-terminal structures as described in the previous section and runs sequentially on the CPU. The computation cost depends mainly on the second part: for each pre-terminal structure obtained, we combine it with words from the given dictionary of meaningful words to generate a set of passwords, and then use the function PBKDF2 to check whether each password in the set is a candidate password.
Figure 3: The corresponding generated tree
Figure 4: Checking passwords in parallel
Because the number of words in the dictionary is very large, the number of passwords generated from a single pre-terminal is large as well, so we can take advantage of GPU computing power for this checking task. We denote the set of words in the dictionary as W and the subset of words of length k as Wk, with k >= 1; |Wk| is the number of words in Wk. The pseudocode of the algorithm is as follows:
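for each pre-terminal S {
    k = llength(S);                  /* length of the letter run L in S */
    m = |Wk|;                        /* number of k-length words */
    l = ceil(m / p);                 /* l - the number of batches */
    for j = 0 to l - 1 {
        base = j * p;                /* index of the first word of batch j */
        for id = 0 to p - 1 in parallel {
            if (base + id < m) {     /* the last batch may be partial */
                guess_password = transplant(S, Wk(base + id));
                /* TestPVV: the two-byte verification part of the output */
                TestPVV = PBKDF2(guess_password, salt, dkLen);
                if (TestPVV == PVV)
                    markCandidatePassword(guess_password);
            }
        }
    }
}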
In the above code, the function llength(S) returns the length of the run of consecutive letters denoted by the meta symbol L in the pre-terminal structure (for convenience of presentation, we assume that there is only one meta symbol L in the pre-terminal structure). This meta symbol is substituted with a k-length word from the dictionary to form a test password. Since each GPU can run at most p threads at the same time, the set of words Wk is divided into l batches, each of which is processed in parallel. Thus, for batch j, the words with indices from j * p to (j + 1) * p - 1 in Wk are merged with the structure S to form p passwords, which are tested by the PBKDF2 function; the candidate passwords are finally marked by markCandidatePassword.
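The batch loop can be emulated sequentially in Python to check the index arithmetic. This is a sketch under stated assumptions, not the CUDA implementation: the inner loop stands in for p parallel GPU threads, the PVV is assumed to be the last two bytes of the PBKDF2 output (following the publicly documented WinZip AES layout), and the helper names, salt and word list are made up.

```python
import hashlib
import math


def pvv_of(password, salt, key_bytes=16):
    """Two-byte check value; assumed to be the last 2 derived bytes."""
    out = hashlib.pbkdf2_hmac("sha1", password.encode("utf-8"), salt,
                              1000, 2 * key_bytes + 2)
    return out[-2:]


def transplant(parts, word):
    """Replace the single L placeholder (e.g. 'L5') in a pre-terminal."""
    return "".join(word if part.startswith("L") else part for part in parts)


def find_candidates(parts, words_k, salt, pvv, p):
    """Check all k-length words in batches of at most p passwords."""
    m = len(words_k)
    batches = math.ceil(m / p)
    candidates = []
    for j in range(batches):
        base = j * p
        for tid in range(p):      # stands in for p parallel GPU threads
            idx = base + tid
            if idx >= m:          # the last batch may be only partly filled
                break
            guess = transplant(parts, words_k[idx])
            if pvv_of(guess, salt) == pvv:
                candidates.append(guess)
    return candidates


# Made-up test data: 5 words checked in batches of p = 2.
salt = bytes(range(8))
target_pvv = pvv_of("6class$$4", salt)
words5 = ["apple", "beach", "class", "dream", "eagle"]
found = find_candidates(("6", "L5", "$$", "4"), words5, salt, target_pvv, p=2)
```

The bound check on `idx` matters whenever the number of words is not a multiple of p; without it, the threads of the last batch would read past the end of the word list.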
4 EXPERIMENTAL RESULTS
The algorithms described in the previous two sections have been implemented and tested on a system consisting of:
• CPU: Intel Core 2 Quad Q8400 2.66 GHz
• RAM: 8 GB
• GPU: two dual-GPU NVIDIA GeForce GTX 295 graphics cards (4 GPUs in total)
• OS: CentOS 5.3
The inputs of the program include two dictionaries:
• A specialized dictionary consisting of actual passwords. This dictionary is used to calculate the occurrence probabilities of password structures, special characters, and numeric characters. However, for reasons of information security, obtaining real passwords is not easy. In our experiment, we therefore created several base structures by hand to generate passwords and manually set their probabilities, as well as those of digits and special characters.
• The dictionary dic-0294, which contains 869,229 words used to substitute the meta symbol L in the resulting pre-terminal structures. This dictionary is considered to be the largest one currently available.
Table 2: Performance comparison of generating candidate passwords using the exhaustive search algorithm
Maximum password length    Number of passwords    CPU    1 GPU    2 GPUs    4 GPUs
Table 3: Comparison of the password checking speeds on different environments
Implementation                Running time
CPU-based algorithm           2.9h
Single GPU-based algorithm    4m
2GPU-based algorithm          2m16s
4GPU-based algorithm          1m15s
In [3], we applied the exhaustive approach to demonstrate the superior computing power of GPGPU technology for the problem of recovering ZIP file passwords. That experiment uses the set of upper-case, lower-case and numeric characters {a-z,A-Z,0-9}, and the optimal number of threads running in parallel is p = 32,768. The results showed that generating AES keys on GPUs is 48 to 170 times faster than in the sequential CPU-based program (approximately 48 times on a single GPU and 170 times on four GPUs). Table 2 presents this result.
The disadvantage of this approach is that the password search space grows exponentially with the maximum password length. Therefore, for a ZIP file password such as “6class$$4”, recovery is almost impossible with this approach.
Our experiment in this paper generates the password space based on the occurrence probabilities of pre-terminal structures. In this way we overcome the problem above while still taking advantage of the great computing power of GPUs. The experiment assumes that the correct password of the given ZIP file can be derived from one of the pre-terminal structures generated by the rules in Table 1. The correct password of the given ZIP file is “6class$$4”, which is generated from the pre-terminal structure “6L5$$4”, the fifth one in the priority tree. Table 3 compares the performance of the different implementations of the checking algorithm.
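The running times in Table 3 can be turned into speedup factors with a small calculation. The sketch below simply divides the CPU time by each GPU time; the helper name is hypothetical, and since the CPU time is given only as 2.9h, the resulting ratios are approximate.

```python
def to_seconds(hours=0, minutes=0, seconds=0):
    """Convert a duration to seconds."""
    return hours * 3600 + minutes * 60 + seconds


# Running times as listed in Table 3.
times = {
    "CPU": to_seconds(hours=2.9),
    "1 GPU": to_seconds(minutes=4),
    "2 GPUs": to_seconds(minutes=2, seconds=16),
    "4 GPUs": to_seconds(minutes=1, seconds=15),
}

# Speedup of each configuration relative to the CPU-based run.
speedups = {name: times["CPU"] / t for name, t in times.items()}
```

Read this way, the single-GPU run is roughly 43 times faster than the CPU run, two GPUs roughly 77 times, and four GPUs roughly 139 times, which is of the same order as the 48 to 170 times reported for the exhaustive search experiment.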
5 CONCLUSIONS
We have introduced the two main steps of our approach to recovering the password of a protected ZIP file: reducing the password search space using the password structure analysis technique, and verifying candidate passwords using the high computing performance of GPUs. The experimental results show that password recovery with this approach is significantly faster than with the CPU-based algorithm.
With a very large initial password space, the number of candidate passwords is still not small, even though the two-byte PVV reduces it by a factor of 65,536. Thus, our next step towards completely solving the problem of password recovery for protected ZIP files is to implement the decryption, decompression and plaintext recognition algorithms on GPUs. In addition, we will consider implementing the proposed algorithms on a GPU cluster to exploit the power of such a computing system.
Finally, the solution proposed in this paper can be adapted to cryptanalysis problems for other kinds of protected files such as DOC and PDF.
6 REFERENCES
[1] FIPS PUB 197. Advanced Encryption Standard (AES), 2001.
[2] N. T. Courtois and J. Pieprzyk. Cryptanalysis of block ciphers with overdefined systems of equations, 2002. Preprint available at http://eprint.iacr.org/2002/044/
[3] P. Dung, D. Tan, P. Phong, N. Duc, and N. Thuy. Applying CUDA computing technology in the problem of recovering ZIP file password. In FAIR09: Proceedings of the 4th National Symposium of Fundamental and Applied Information Technology Research, 2009.
[4] Electronic Frontier Foundation. Cracking DES: Secrets of Encryption Research, Wiretap Politics and Chip Design. O’Reilly & Associates, Inc., 1998.
[5] D. Kahn. The Codebreakers - The Story of Secret Writing. 1967.
[6] A. Klein. Attacks on the RC4 stream cipher. Designs, Codes and Cryptography, 48(3):269–286, 2008.
[7] NVIDIA. http://www.nvidia.com/object/cuda_home_new.html
[8] PKWARE. ZIP file format specification, 2007.
[9] M. Weir, S. Aggarwal, B. de Medeiros, and B. Glodek. Password cracking using probabilistic context-free grammars. In SP09: Proceedings of the 2009 30th IEEE Symposium on Security and Privacy, pages 391–405, Washington, DC, USA, 2009. IEEE Computer Society.