A thesis submitted to the graduate faculty in partial fulfillment of the requirements for the degree of Master of Science (Mathematics)

These formal definitions often seem far stronger than necessary for practical security — many cryptosystems used in practice do not satisfy the formal definitions. This includes typical ElGamal implementations. The attacks discussed in this paper suggest that it is worth striving to meet formal definitions of security in actual implementations.

OVERVIEW

Introduction

Public key cryptosystems, known for their mathematical elegance, are often insecure in simple implementations. Research has shown that attacks on ElGamal and RSA are particularly effective when the encrypted messages are short and unprocessed. Additionally, the security of ElGamal depends on the parameters chosen when the cryptosystem is created; these choices are often overlooked in textbooks but are crucial in practice.

Computer scientists have long sought to define what constitutes a secure cryptosystem, yet many formal definitions appear overly stringent compared to the requirements for practical security. As a result, numerous cryptosystems in use today, such as typical ElGamal implementations, do not satisfy these formal definitions. The attacks described in this paper suggest that actual implementations should strive to meet formal definitions of security.

Hybrid Cryptosystems

A cryptosystem enables secure communication between two or more parties over a public channel by using a cipher for encryption and decryption. The original message, known as plaintext, is transformed into ciphertext by an encryption rule that uses a predetermined key, concealing the message's content. To recover the plaintext from the ciphertext, a decryption rule is applied using a potentially different key.

Cryptosystems fall into two main types: symmetric key systems, which use a single shared secret key, and public key systems, also known as asymmetric systems. Public key systems offer enhanced security but are slower than symmetric systems. Symmetric systems, however, require a secure channel for key agreement. Hybrid cryptosystems combine elements of both to obtain the benefits of each.

Symmetric key systems use a single key for both encryption and decryption, so two parties must agree on this key over a secure channel before they can communicate securely. As the number of parties grows, key distribution becomes increasingly complex, which is a significant obstacle to the practical use of cryptography. Typical symmetric ciphers apply intricate transformations to conceal patterns in the original message; the key determines these transformations and serves as a guide for reversing them during decryption.

Symmetric key systems include the Data Encryption Standard (DES), the Advanced Encryption Standard (AES), and Skipjack. DES, which uses a 56-bit key, was the standard from 1976 until 1999, but it is now considered inadequate. AES is the preferred algorithm for new applications and uses keys of 128 bits or more. Skipjack, developed by the U.S. National Security Agency and declassified in 1998, uses 80-bit keys. All of these algorithms are block ciphers, meaning they encrypt and decrypt data in chunks. Many legacy systems still rely on DES, particularly those that cannot easily be updated.

Public-key cryptosystems address the key distribution problem by using distinct keys for encryption and decryption, with the encryption key publicly available. Anyone can encrypt a message, but only those with the private key can decrypt it. These systems depend on one-way trap door functions: mathematical functions that are easy to compute in one direction but hard to reverse without the secret key. Consequently, deriving the private decryption key from the public encryption key must be infeasible.

Public-key cryptography is widely used for secure email, with public keys usually published on users' websites. However, if a website is compromised, an attacker can replace the legitimate public key with their own, so key distribution remains a risk. To address this, digital signatures can be used to create a web of trusted public keys, in which users accept a new key only if it is signed by a previously trusted key.

ElGamal is a public key cryptosystem that uses modular exponentiation as its one-way trap door function; the inverse operation, the discrete logarithm, is believed to be intractable. Unlike the widely recognized RSA system, ElGamal was never patented, making it an appealing choice. Public key systems like ElGamal differ significantly from symmetric systems and generally require larger keys: the recommended minimum is 1024 bits, and even larger keys are suggested for some applications.

Symmetric systems are far faster than public key systems: AES, for example, is roughly 10000 times faster than RSA. Efficiency is important for real-time communication and for bulk encryption and decryption of large data sets.

In a hybrid cryptographic system, one party generates a random session key and encrypts it with the other party's public key. The encrypted session key is sent to the recipient, who decrypts it with their private key, so that both parties share the session key. All subsequent communication is then encrypted and decrypted with a symmetric cipher using this session key.

1.2.4 Short messages and public key systems

Public key systems are used primarily to negotiate session keys rather than to encrypt and decrypt data directly. Consequently, in a hybrid system that combines an older symmetric cipher such as DES with ElGamal, it is common for 56-bit DES keys to be transmitted by encrypting them with ElGamal.

Implementation

I implemented two attacks from [BJN00]: the basic meet-in-the-middle attack and the two-table attack, both of which conduct meet-in-the-middle searches within a portion of the message space. I also developed several variations of the basic meet-in-the-middle attack, including one that uses external storage rather than main memory. The source code for these implementations is available under the MIT license at http://www.math.iastate.edu/brycea/thesis2008.html.

All attacks were implemented in C++ using the GNU Multiple Precision Arithmetic Library (GMP), which is a highly optimized general-purpose big number library but lacks optimizations specific to cryptographic applications. Notably, GMP does not let programmers choose the type of reduction used for modular exponentiations or multiplications. This may limit performance, since tuning these internal choices could make the attacks significantly faster.

The attacks were run on ciphertexts of messages of various sizes to assess how the attacks scale and how they compare with one another. The findings are detailed in Chapter 5.

Splitting Probabilities

This paper discusses attacks that assume messages are positive integers that can be factored into two or more integers of specified sizes. The first part of this assumption matches ElGamal's requirement that messages be converted into positive integers before encryption. For instance, a DES key, represented as a 56-bit string, can also be viewed as a 56-bit unsigned integer, which may split into two 28-bit factors. Not all 56-bit integers split in this manner, but the likelihood of success for attacks based on this splitting assumption remains relatively high.

Table 1.1 presents splitting probabilities derived from the factorization of 100,000 random numbers \( m \) in the range 1 to \( 2^b - 1 \), where a split is a factorization \( m_1 m_2 = m \) with \( m_1 \leq 2^{b_1} \) and \( m_2 \leq 2^{b_2} \). The final column gives the results from [BJN00] for comparison; there are slight discrepancies between their results and mine.

The choice of b1 determines the time and space needed for the pre-computation step, which is performed only once, while b2 determines the time taken to crack a specific message. By varying b1 and b2, one can trade off pre-computation time, cracking time, memory usage, and the probability of success.

See [BJN00] for some analytic results about splitting probabilities.
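The experiment behind Table 1.1 can be imitated at toy scale. The following is a minimal Python sketch, not the thesis code (which was written in C++ with GMP); the helper names `splits` and `estimate_split_probability` are hypothetical, and divisors are checked by brute force rather than by factoring, so it is only practical for small b.

```python
import random


def splits(m, b1, b2):
    """Return True if m = m1 * m2 with m1 <= 2**b1 and m2 <= 2**b2."""
    bound1, bound2 = 1 << b1, 1 << b2
    # m2 = m / m1 <= bound2 requires m1 >= ceil(m / bound2).
    lo = max(1, -(-m // bound2))
    hi = min(bound1, m)
    return any(m % m1 == 0 for m1 in range(lo, hi + 1))


def estimate_split_probability(b, b1, b2, trials=2000, seed=0):
    """Estimate the fraction of random m in [1, 2**b - 1] that split."""
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    hits = sum(
        splits(rng.randint(1, (1 << b) - 1), b1, b2) for _ in range(trials)
    )
    return hits / trials


if __name__ == "__main__":
    # Toy sizes only; the thesis used b between 40 and 64.
    print(estimate_split_probability(20, 10, 10))
```

The estimate depends on the random sample, so it should be read as an approximation in the same spirit as the table, not as a reproduction of its values.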

Table 1.1 Experimental splitting probabilities

b    b1    b2    Probability    Probability [BJN00]
40

THE ELGAMAL CRYPTOSYSTEM

The Discrete Log Problem

The standard logarithm is the inverse of standard exponentiation; the discrete logarithm is the inverse of modular exponentiation. For a modular exponentiation y = g^x in Z_p^*, the discrete logarithm log_g(y) equals x. This discrete logarithm is defined within the cyclic group ⟨g⟩ generated by g, which may not be all of Z_p^*. When the order of g, denoted |g| = n, is large and has at least one large prime factor, the discrete logarithm problem in ⟨g⟩ is believed to be intractable.

There are three basic types of discrete log algorithms: “square-root” algorithms such as Pollard’s rho algorithm, the Pohlig-Hellman algorithm, and index calculus algorithms.

Pollard’s rho algorithm computes discrete logarithms in cyclic groups of prime order n in O(√n) time with minimal space requirements. For non-prime n, the Pohlig-Hellman algorithm is applicable if the factorization of n is known. Given the prime factorization n = p_1^{e_1} p_2^{e_2} ⋯ p_c^{e_c}, this algorithm computes partial solutions by finding discrete logs in subgroups of order p_i for i = 1 to c, often using Pollard’s rho as a subroutine. The overall running time of the Pohlig-Hellman algorithm is O(Σ_{i=1}^{c} e_i (log n + √p_i)). When n is B-smooth, meaning all of its prime factors are at most B, the running time is O(ln ln n (log n + √B)), since the average number of not necessarily distinct prime factors is ∼ ln ln n.

If n is at most 256 bits and has no prime factors of more than 16 bits, i.e. n is (2^16 − 1)-smooth, the Pohlig-Hellman algorithm is expected to require only O(2^12) operations. Combined with Pollard’s rho algorithm, it also needs very little space. However, both algorithms become ineffective when n has a large prime factor.

Index calculus algorithms do not work in general cyclic groups but are efficient in Z_p^*, where they run in sub-exponential time. For instance, the number field sieve has an expected running time of O(exp((1.923 + o(1)) (ln p)^(1/3) (ln ln p)^(2/3))). Index calculus methods cannot be applied directly to subgroups of Z_p^*, but logarithms in these subgroups can be computed by way of logarithms in Z_p^*. Consequently, Pollard's rho (for prime n) or Pohlig-Hellman (for composite n) may outperform index calculus methods, depending on the relationship between n and p.

Encryption and Decryption

The ElGamal cryptosystem is defined by a 4-tuple (p, g, x, y), where p is a large prime that determines the group Z_p^*, g is an element of order n in Z_p^*, x is a random integer in the range 1 to n−1, and y = g^x. If g is primitive then n = p−1, but this is not required, and n can be chosen much smaller than p−1 to improve efficiency. The public key is (p, g, y), and the private key is x.

To encrypt a message, it must first be converted into an integer in the range 1 to p−1, that is, an element of Z_p^*. If the message is the key for a symmetric cipher, it may already be in numerical form. Messages larger than p−1 can be divided into smaller blocks. Some textbooks suggest that messages should belong to ⟨g⟩ [MVO96] [Sti05], but practical implementations frequently ignore this guideline. If g is not a primitive root, only a subset of the p−1 elements of Z_p^* lies in ⟨g⟩, which complicates the conversion of messages into this set. Raising g to the power of the message does not help, since recovering the original message would then require a discrete logarithm.

The encryption function requires a random integer k ∈ [1, n−1] in addition to the public key. The encryption and decryption functions are:

E_k(m) = (g^k, m·y^k) and D(u, v) = u^{−x}·v,

where all operations are done mod p (in Z_p^*). The decryption function recovers the original message: u^{−x} = g^{−kx} = (g^x)^{−k} = y^{−k}, so D(E_k(m)) = u^{−x}·v = y^{−k}·m·y^k = m.
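The encryption and decryption functions can be checked on a toy example. The sketch below is an illustration in Python, not the thesis implementation; the function names are hypothetical, and the parameters p = 23, g = 2, n = 11 (2 has order 11 modulo 23) are far too small for real use. The inverse u^{−x} is computed as u^{p−1−x}, which is valid because u^{p−1} = 1 in Z_p^*.

```python
import secrets


def keygen(p, g, n):
    """Return (x, y): private x in [1, n-1] and public y = g^x mod p."""
    x = secrets.randbelow(n - 1) + 1
    return x, pow(g, x, p)


def encrypt(p, g, n, y, m):
    """E_k(m) = (g^k, m * y^k) mod p for a fresh random k in [1, n-1]."""
    k = secrets.randbelow(n - 1) + 1
    return pow(g, k, p), (m * pow(y, k, p)) % p


def decrypt(p, x, u, v):
    """D(u, v) = u^(-x) * v mod p, using u^(-x) = u^(p-1-x)."""
    return (pow(u, p - 1 - x, p) * v) % p


if __name__ == "__main__":
    p, g, n = 23, 2, 11          # toy parameters, assumed for illustration
    x, y = keygen(p, g, n)
    u, v = encrypt(p, g, n, y, 20)
    print(decrypt(p, x, u, v))   # recovers the message 20
```

Note that the message 20 is an arbitrary element of Z_p^*, not necessarily a member of ⟨g⟩, which mirrors the practice described above.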

Security

Recovering the private key x allows us to decrypt all past and future messages. The public key contains g and y = g^x, so deriving the private key from the public key amounts to solving a discrete logarithm problem. For security, both n and p must be large: n determines the running time of square-root discrete logarithm algorithms such as Pollard’s rho, while p determines the running time of index calculus methods.

The recommended minimum key size is 1024 bits, although legacy systems on limited hardware may use sizes as small as 768 bits. When n is less than p−1, n must be large enough that the O(√n) discrete logarithm algorithms take at least as long as the index calculus algorithms.

To break a single ciphertext (u, v) = (g^k, m·y^k), it suffices to find y^{−k}, since m = v·y^{−k}. Because inverses can be computed efficiently, we can focus on determining y^k. One way is to find k by computing the discrete logarithm of u = g^k to the base g, although this may not be necessary. Cracking a single ciphertext is equivalent to the Diffie-Hellman problem: determine g^{kx} = y^k from g^k and y = g^x. A discrete logarithm algorithm solves the Diffie-Hellman problem, but it is not known whether the Diffie-Hellman problem is easier than the discrete logarithm problem.

Efficiency

For an ElGamal cryptosystem (p, g, x, y) with n the order of g, encryption requires two exponentiations and one multiplication in Z_p^*. The exponent k is selected from the range 1 to n−1, so a larger n makes the exponentiations, and hence encryption, slower. The cost of the single multiplication is negligible compared with the two exponentiations.

Decryption involves a single exponentiation, an inversion, and a multiplication in Z_p^*, with the private key x, chosen between 1 and n−1, as the exponent. A smaller n therefore speeds up decryption. Other techniques, such as choosing an x with few ones in its binary representation, can reduce computation time further because of the way modular exponentiation is implemented.

Reducing p improves performance, since multiplications in Z_p^* are faster for smaller p. Implementers facing performance constraints therefore have a choice: use a smaller p with n = p−1, or keep a larger p and choose n much smaller than p−1. The latter option is frequently adopted in practical implementations, likely because the faster index calculus discrete logarithm methods depend on p rather than n.

Brute Force Attacks on a Hybrid System

Brute force attacks target cryptosystems through exhaustive search, typically by decrypting the ciphertext with every possible key. This produces many false plaintexts, with only one likely to resemble the actual plaintext. Humans can easily recognize real plaintexts, but a computer program must test the decrypted outputs automatically. This is feasible when real plaintexts contain structured data: incorrect decryptions tend to yield false plaintexts with a uniform byte distribution, while genuine plaintexts tend to have irregular distributions.

Two brute force attacks apply to the ElGamal cryptosystem itself. The first tries every private key; there are n−1 potential values for x. Given the ElGamal ciphertext of a 64-bit session key, we can attempt to decrypt it with each possible private key value. Every decryption that yields a result of at most 64 bits is a potential session key; few such decryptions are expected when n is much smaller than p−1. Even with n as small as 256 bits, however, this requires O(2^256) modular exponentiations, which is computationally infeasible.

The second attack computes all possible encryptions of all potential messages and looks for one that matches the ciphertext; this is even less efficient. With a 256-bit n and a 64-bit session key, up to 2^64 (n−1) encryptions are required, so the attack takes O(2^320) encryptions, each involving two modular exponentiations.

In a hybrid cryptosystem using ElGamal, the data is encrypted with a symmetric cipher under a session key, which makes the symmetric cipher itself a target. If valid plaintexts can be recognized, an attacker can decrypt the ciphertext by testing all 2^64 possible session keys until a recognizable plaintext is found. This requires O(2^64) symmetric cipher decryptions and plaintext tests, which is far faster than the brute force attacks on ElGamal itself.

MEET-IN-THE-MIDDLE ATTACK

Requirements and Assumptions

In this scenario, the adversary has intercepted a ciphertext (u, v) and knows the public key (g, p, y) used to encrypt it. The attack uses only the second component of the ciphertext, v = m·y^k.

The attack works well under the following conditions:

1. The original message m is at most b bits, b is small, and the adversary is aware of this limit. For example, the adversary may know that the message is a 56-bit DES key. The attack becomes infeasible for messages much larger than 64 bits; in particular, there is no hope of using this attack on encryptions of a Skipjack (80-bit) or AES (128-bit) session key.

2. m can be factored (split) into two factors of at most b1 and b2 bits respectively. The probability of different splits is discussed in Section 1.4.

3. The order n of g in Z_p^* is known. If p−1 has only one large prime factor, then it can be factored efficiently using a combination of trial division, Pollard’s rho algorithm for factoring, and primality testing. In that case, or if the factorization of p−1 is already known, n can be computed efficiently. Otherwise there is no known efficient algorithm for computing n.

4. Messages are not represented as elements of ⟨g⟩. For ElGamal to work, the messages must be represented as members of Z_p^*; the restriction to ⟨g⟩ is not necessary, although the success of this attack makes it highly recommended.

5. n ≤ (p−1)2^{−b}. This condition ensures that, given an element v^n of order dividing (p−1)/n in Z_p^*, the expected number of distinct messages m such that m^n = v^n is small.

The attack is not guaranteed to succeed every time, since the message must split in the required way. For instance, for a 56-bit message with b1 = b2 = 28, the success probability is approximately 18%. This does not significantly diminish the impact of the attack: if 18% of the credit card numbers passing through a credit processor are stolen, it is nearly as much of a disaster as if 100% were stolen.

The Attack

ElGamal's strength lies in its non-deterministic encryption: encrypting the same plaintext yields different ciphertexts because k is chosen at random. The non-determinism is carried by the factor y^k, whose order divides n. Raising v = m·y^k to the power n therefore eliminates y^k: v^n = m^n (y^k)^n = m^n (g^{xk})^n = m^n (g^n)^{kx} = m^n.

Note that there may be other messages m̃ such that m̃^n = m^n = v^n. However, under reasonable assumptions this is unlikely. This is discussed in the next section.

A brute force attack would search for a message m̃ such that m̃^n = v^n, but if the message is a 56-bit session key, this search would take over 1000 years on average, assuming each modular exponentiation takes one microsecond. To speed up the search, we use the splitting assumption and restrict it to messages that factor as m̃ = m̃_1 m̃_2, where m̃_1 ≤ 2^{b1} and m̃_2 ≤ 2^{b2}. Then v^n = m̃^n = m̃_1^n m̃_2^n, which gives v^n m̃_2^{−n} = m̃_1^n.

The attack computes m̃_1^n for m̃_1 from 1 to 2^{b1} and stores the (key, value) pairs (m̃_1^n, m̃_1) in a dictionary. It then computes v^n m̃_2^{−n} for m̃_2 from 1 to 2^{b2} and looks each value up in the dictionary. A match means that v^n m̃_2^{−n} = m̃_1^n, so m̃ = m̃_1 m̃_2 could be the original message. The dictionary depends only on the public key and b1, so it can be reused for multiple messages.

If every message is represented as a member of ⟨g⟩, then v^n = 1 for every message and this attack fails completely.
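The two phases of the attack can be sketched at toy scale. The Python below is an illustration under stated assumptions, not the thesis code: the helper names are hypothetical, a Python dict stands in for the dictionary, and `toy_parameters` searches for a small prime p = nk + 1 with a large cofactor k so that condition 5 holds for 12-bit messages.

```python
def is_prime(q):
    """Trial-division primality test; adequate for the toy sizes used here."""
    if q < 2:
        return False
    d = 2
    while d * d <= q:
        if q % d == 0:
            return False
        d += 1
    return True


def toy_parameters(n=101, min_cofactor=1 << 20):
    """Find a prime p = n*k + 1 with k >= min_cofactor and g of order n."""
    k = min_cofactor
    while not is_prime(n * k + 1):
        k += 1
    p = n * k + 1
    h = 2
    while pow(h, (p - 1) // n, p) == 1:
        h += 1
    # n is prime and g != 1 with g^n = 1, so g has order exactly n.
    return p, pow(h, (p - 1) // n, p), n


def build_table(p, n, b1):
    """Pre-computation: map m1^n mod p to the list of m1 in [1, 2^b1]."""
    table = {}
    for m1 in range(1, (1 << b1) + 1):
        table.setdefault(pow(m1, n, p), []).append(m1)
    return table


def crack(p, n, v, table, b2):
    """Return every candidate m1*m2 with v^n * m2^(-n) = m1^n (mod p)."""
    vn = pow(v, n, p)
    candidates = []
    for m2 in range(1, (1 << b2) + 1):
        inv = pow(pow(m2, n, p), p - 2, p)   # (m2^n)^(-1) by Fermat
        for m1 in table.get(vn * inv % p, []):
            candidates.append(m1 * m2)
    return candidates


if __name__ == "__main__":
    p, g, n = toy_parameters()
    y = pow(g, 37, p)                  # private key x = 37 (arbitrary)
    m = 45 * 50                        # a 12-bit message that splits
    v = m * pow(y, 58, p) % p          # second ciphertext component, k = 58
    print(set(crack(p, n, v, build_table(p, n, 6), 6)))
```

The same table is reused for any ciphertext under the same public key, which is why only `crack` takes v as an argument.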

Solution Collisions

If \( \tilde{m} \in \mathbb{Z}_p^* \), then \( \tilde{m}^n \) has order dividing \( (p-1)/n \), placing it in the subgroup of order \( (p-1)/n \). We want the expected number of messages \( \tilde{m} \) that differ from the actual message \( m \) while satisfying \( \tilde{m}^n = m^n = v^n \). Let \( X_c \) denote this count. Assume there are \( 2^b \) possible messages and that the values \( \tilde{m}^n \) for \( \tilde{m} = 1, \ldots, 2^b \) are approximately uniformly distributed in the subgroup. Then the probability that \( \tilde{m}^n = v^n \) is \( \frac{n}{p-1} \), so \( X_c \) follows a binomial distribution with probability \( \frac{n}{p-1} \) and \( 2^b - 1 \) trials, and \( E[X_c] = (2^b - 1)\frac{n}{p-1} \). This is less than 1 if \( n \leq (p-1)2^{-b} \).

The attack only finds messages that split, so the expected number of collisions it encounters is E[X_c] multiplied by the splitting probability. In practice E[X_c] ≈ n / ((p−1)2^{−b}) is so small that collisions are unlikely to occur. For instance, if p−1 is 1024 bits, n is 512 bits, and b is 64 (messages are 64 bits), then E[X_c] is approximately 2^{−448}.

Implementation

The attack is fairly straightforward, but we need to select a suitable data structure for the dictionary.

A hash table is usually the preferred data structure for fast insert and search operations, but [BJN00] proposes a sorted array instead. All (key, value) pairs are stored in the array as they are generated, and the array is sorted by key at the end. Lookups can then be performed in O(log N) time by binary search, where N is the number of entries. This implementation is referred to as mim (meet-in-the-middle). The sorting and binary search times are negligible compared with the time spent on modular exponentiations and multiplications, so the added complexity of a hash table is hard to justify.

Sorting takes O(N log N) operations, each of which is far cheaper than the modular exponentiations used to generate the array keys. The extra space required is also small: O(1) for heapsort and O(log N) for an optimized quicksort. I used the qsort and bsearch functions from the standard C library, with custom binary search routines for some variations.

Reducing the size of each dictionary entry makes it possible to crack larger messages without resorting to slow external storage. For instance, with b = 64 and b1 = b2 = 32, if each dictionary entry requires s bytes, the dictionary occupies 4s gigabytes.

When p is 1024 bits, most elements of Z_p^*, and in particular m̃_1^n, occupy 128 bytes, while each m̃_1 value is only 4 bytes. Storing entire (m̃_1^n, m̃_1) pairs in an array would require over 512GB of memory, which exceeds the capacity of most systems. To address this, [BJN00] recommends storing (hash(m̃_1^n), m̃_1) in the table, using a suitable hash function.

To look up v^n m̃_2^{−n}, we first compute hash(v^n m̃_2^{−n}) and then perform a binary search on the hash values. Because the hash function can have collisions, the search may return several candidate values for m̃_1. For each match, we recompute m̃_1^n and check whether it equals v^n m̃_2^{−n}. If the number of matches is much less than 2^{b2}, the extra exponentiations required will not make a significant contribution to the run time.
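The hashed-table lookup can be sketched as follows. This is a Python illustration with hypothetical function names, not the thesis code (which used qsort and bsearch in C/C++); it assumes the hash simply keeps the low `h_bits` bits of the value, and uses Python's `bisect` module for the binary search over a sorted list of (hash, m1) pairs.

```python
import bisect


def build_hashed_table(p, n, b1, h_bits):
    """Sorted list of (low h_bits of m1^n mod p, m1) for m1 in [1, 2^b1]."""
    mask = (1 << h_bits) - 1
    return sorted((pow(m1, n, p) & mask, m1) for m1 in range(1, (1 << b1) + 1))


def lookup(table, p, n, value, h_bits):
    """Return all m1 whose full power m1^n mod p equals value."""
    key = value & ((1 << h_bits) - 1)
    # (key, 0) sorts before every real entry with this key, since m1 >= 1.
    i = bisect.bisect_left(table, (key, 0))
    matches = []
    while i < len(table) and table[i][0] == key:
        m1 = table[i][1]
        if pow(m1, n, p) == value:   # recompute to discard hash collisions
            matches.append(m1)
        i += 1
    return matches


if __name__ == "__main__":
    p, n = 1009, 7                   # toy values; 7 divides p - 1 = 1008
    table = build_hashed_table(p, n, b1=5, h_bits=4)
    print(lookup(table, p, n, pow(13, n, p), h_bits=4))
```

With only 4 hash bits and 32 entries, several entries share each hash value, so the recomputation step is exercised on almost every lookup; with realistic sizes it should be rare, as the analysis in the text suggests.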

The expected number of matches depends on the bit size of the hash values. Suppose the hash values are h bits and the values hash(m̃_1^n) are uniformly distributed between 0 and 2^h − 1.

Then the probability that hash(v^n m̃_2^{−n}) matches a specific element of the table, for a given value of m̃_2, is 2^{−h}. With 2^{b1} entries in the table, the expected number of matches for a single value of m̃_2 is 2^{b1−h}. The total expected number of matches is therefore 2^{b2} · 2^{b1−h} = 2^{b−h}. We therefore want to choose h so that b − h is much smaller than b2. The hashmim implementation follows this approach, using a hash function that simply takes the lower 32 bits.

With 32-bit hash values, b = 46 and b2 = 23, approximately 2^14 extra exponentiations are expected, which increases the time to crack a message by only 0.2% for hashmim compared with plain mim. For b = 64 and b1 = b2 = 32, a hash of at least 40 bits would be needed to keep the overhead similarly small.
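The overhead figures can be checked with a short calculation. The worked arithmetic below assumes, as a simplification, that cracking time is dominated by the \(2^{b_2}\) exponentiations of the cracking phase, so that the relative overhead is the number of extra exponentiations divided by \(2^{b_2}\).

```latex
\begin{align*}
\text{expected false matches} &= 2^{b_2}\cdot 2^{b_1-h} = 2^{b-h},\\
b=46,\ b_2=23,\ h=32:&\quad 2^{46-32}=2^{14},\qquad
  \frac{2^{14}}{2^{23}}=2^{-9}\approx 0.2\%,\\
b=64,\ b_2=32,\ h=40:&\quad 2^{64-40}=2^{24},\qquad
  \frac{2^{24}}{2^{32}}=2^{-8}\approx 0.4\%.
\end{align*}
```

With h = 32 and b = 64, by contrast, the expected number of false matches would be \(2^{32}\), as many as the cracking-phase exponentiations themselves.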

The hashmim implementation requires over 36GB for the dictionary when b1 = 32, which is a problem for machines with limited memory such as the author's 6GB machine. An efficient external storage implementation could make it possible to crack larger messages. The diskmim implementation uses the lower 32 bits as its hash, like hashmim, but stores the dictionary in a Tokyo Cabinet B+ tree database.

Running Time and Memory Usage

The pre-computation requires 2^{b1} modular exponentiations, regardless of the dictionary's data structure. The mim and hashmim attacks also require an O(b1 · 2^{b1}) sort, but the exponentiations typically dominate because of their larger constants. The space requirement is 2^{b1} table entries; in the hash implementation each entry takes 8 bytes. If b1 exceeds 32, more space is needed per entry, but external memory will be necessary at that point anyway. For mim, each entry takes log p + b1 bits, plus the overhead of a big integer type. Diskmim also requires 2^{b1} exponentiations, but replaces the sort with 2^{b1} inserts, each requiring O(h) disk accesses, where h here denotes the height of the B+ tree. Since B+ trees are balanced, h is approximately log_t 2^{b1} = b1 log_t 2, where t is the minimum branching factor. Table construction is therefore expected to take O(b1 · 2^{b1}) operations, though the mix of disk accesses and in-memory searches within a node makes this implementation slower in practice.

After the table is built, all methods require O(2^{b2}) modular exponentiations to attack a specific message. The mim and hashmim variants each perform O(2^{b2}) binary searches of the table, each taking O(log 2^{b1}) = O(b1) time, for a total of O(b1 · 2^{b2}). Diskmim instead requires O(b1 · 2^{b2}) disk accesses, since each B+ tree search takes O(h) = O(b1) disk accesses.

Comparison to Brute Force

In hybrid systems, the most efficient brute force attack targets the symmetric cipher directly. With a b-bit session key, its expected running time is O(2^b). When b1 = b2 = b/2, the two phases of the meet-in-the-middle attack together take O(2^(b/2 + 1)).

With b = 56 and b1 = b2 = 28, a brute force attack requires an average of 2^55 symmetric cipher decryptions and plaintext tests, which takes over a year if each operation takes one nanosecond. A meet-in-the-middle attack, while successful only 18% of the time, can be completed in approximately 5 days.

TWO TABLE ATTACK

Introduction

The two table attack is a refinement of the meet-in-the-middle attack which works when the group Z_p^* contains a subgroup in which discrete logarithms can be computed efficiently. Specifically, when p−1 has a B-smooth factor s, with B relatively small (for instance, 2^10), the Pohlig-Hellman algorithm can be used to compute discrete logarithms within the subgroup of order s.

This attack uses discrete logarithms during the pre-computation phase, which replaces modular exponentiations with additions in the message cracking phase. The adversary again needs only the second part of the ciphertext, v = m·y^k, and the public key (p, g, y). The requirements and assumptions of the basic meet-in-the-middle attack still apply, with the added condition that s must exceed 2^b to keep the expected number of solution collisions small. The splitting assumption is used again, and b1 and b2 can be chosen to trade off time, space, and success probability.

The Attack

Suppose p−1 = nrs, where s is B-smooth. Using trial division we can efficiently factor out s and find an element α of Z_p^* that generates the subgroup of order s. Instead of raising v to the power n, we raise it to the power nr. For any a in Z_p^*, (a^{nr})^s = a^{p−1} = 1, so a^{nr} lies in the subgroup generated by α, where discrete logarithms to the base α can be computed efficiently. Suppose m̃ = m̃_1 m̃_2, where m̃_1 and m̃_2 have at most b1 and b2 bits respectively. If v is a ciphertext for m̃, then v = y^k m̃ = y^k m̃_1 m̃_2. Consequently v^{nr} = (y^k)^{nr} m̃_1^{nr} m̃_2^{nr} = m̃_1^{nr} m̃_2^{nr}, since the order of y^k divides n. Taking logarithms to the base α gives log(v^{nr}) = log(m̃_1^{nr}) + log(m̃_2^{nr}) (mod s). Solution collisions are addressed in the following section.

In the pre-computation phase, we build two tables: T1 contains the pairs (log m̃_1^{nr}, m̃_1) for m̃_1 from 1 to 2^{b1}, and T2 contains the pairs (log m̃_2^{nr}, m̃_2) for m̃_2 from 1 to 2^{b2}. In the cracking phase, the goal is to find pairs (t1, v1) in T1 and (t2, v2) in T2 such that log v^{nr} ≡ t1 + t2 (mod s). If such a pair is found, the product v1·v2 is a potential plaintext. Under favorable conditions the solution is likely to be unique, so that m = v1·v2.

The k-table problem is to express an integer as a sum of k integers, one taken from each of k tables. This article considers only the case k = 2. Note that a dictionary data structure is no longer needed for this approach.

The basic idea for solving the two table problem is to sort both tables by the first coordinate, T1 in ascending order and T2 in descending order, and then check whether the sum of their heads equals the target value. If it does not, we advance one of the head pointers, depending on whether the sum is greater or less than the target. Here the sums are taken modulo s: we want pairs that satisfy t1 = t − t2 mod s, where t = log v^{nr}. We therefore define a virtual table T2' from T2, containing (t − t2 mod s, v2) for each (t2, v2) in T2. The smallest element of T2' may not be in the first position, but T2' is in ascending order when read circularly. If t2 = t + 1 occurs in T2, then t − (t + 1) mod s = s − 1 is the largest element of T2'. A binary search for t + 1 in T2 therefore locates it: if t + 1 is found, we return its index. If not, the search identifies the position where t + 1 would be inserted, which gives the smallest element of T2 greater than t + 1 and hence the largest element of T2'.

To find matching elements in T1 and T2', we look for pairs (t1, v1) and (t2', v2) such that t1 = t2'. We compare the heads of T1 and T2', starting T2' from its smallest element. If the head of T1 is less than the head of T2', we advance the head of T1. If the heads are equal, we have a potential solution. If the head of T1 is greater, we advance the head of T2'.
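The search over T1 and the virtual table T2' can be sketched as below. This Python illustration uses a hypothetical function name and assumes every log value lies in the range 0 to s−1; it locates the rotation point of T2' by binary search for the first entry with t2 ≤ t (the smallest element of T2'), which is equivalent to the search for t + 1 described above, and it reports all matching pairs so that repeated log values are handled.

```python
import bisect


def two_table_matches(T1, T2, t, s):
    """Return all (v1, v2) with (t1, v1) in T1, (t2, v2) in T2 and
    t1 + t2 == t (mod s). Log values are assumed to lie in [0, s)."""
    if not T1 or not T2:
        return []
    T1 = sorted(T1)                    # ascending by log value
    T2 = sorted(T2, reverse=True)      # descending by log value
    n2 = len(T2)

    def key2(j):
        """Key of the j-th entry of the rotated virtual table T2'."""
        return (t - T2[(start + j) % n2][0]) % s

    # First entry with t2 <= t holds the smallest key of T2'; found by
    # binary search on the negated (hence ascending) t2 values.
    negated = [-t2 for t2, _ in T2]
    start = bisect.bisect_left(negated, -t) % n2

    matches = []
    i = j = 0
    while i < len(T1) and j < n2:
        t1, v1 = T1[i]
        if t1 < key2(j):
            i += 1                     # head of T1 too small
        elif t1 > key2(j):
            j += 1                     # head of T2' too small
        else:
            jj = j                     # collect the whole run of equal keys
            while jj < n2 and key2(jj) == t1:
                matches.append((v1, T2[(start + jj) % n2][1]))
                jj += 1
            i += 1
    return matches


if __name__ == "__main__":
    T1 = [(3, "a"), (10, "b"), (16, "c")]
    T2 = [(5, "x"), (12, "y"), (2, "z")]
    print(two_table_matches(T1, T2, t=5, s=17))
```

Materializing the negated keys takes linear extra space; an implementation concerned with memory could binary-search the descending table directly instead.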

Solution Collisions

Assuming there are 2^b possible messages and that the values of v^{nr} are uniformly distributed within the subgroup of order s, the expected number of solution collisions, denoted E[X_c], can be calculated accordingly.

In particular, if s > 2^b then E[X_c]

Title	Implementing Several Attacks On Plain ElGamal Encryption
Author	Bryce Allen
Advisors	Clifford Bergman, Major Professor, Paul Sacks, Sung-Yell Song
School	Iowa State University
Major	Mathematics
Type	Thesis
Year of publication	2008
City	Ames
