The Handbook of Elliptic and Hyperelliptic Curve Cryptography presents an introduction to public-key cryptography; mathematical background; algebraic background; background on p-adic numbers; background on curves and Jacobians; varieties over special fields; background on pairings; background on Weil descent; cohomological background on point counting...
Factorization and primality
Primality
In the applications we envision we must be sure that a given integer N is prime. The most obvious way is to try for all integers 1 < n ≤ √N whether N ≡ 0 (mod n), in which case one has even found a divisor of N. However, this method requires O(√N) modular reductions, which is far too many for the size of N encountered in practice.
Proving the primality or compositeness of an integer can be time-consuming if it involves searching for proper factors. The most effective primality tests, outlined in Chapter 25, decide whether a number \( N \) is prime but do not provide a divisor if it is not. Many of these algorithms are probabilistic: one of their two possible outputs is guaranteed to be correct, while the other is correct only with a certain probability. Iterating such an algorithm increases the likelihood that the probabilistic answer is accurate while keeping the running time low.
To verify primality using probabilistic algorithms, one typically begins with several iterations of an algorithm that reliably identifies composite numbers, while its declaration that a number is prime is only probabilistic. After a number of rounds, a second algorithm is employed whose answer is guaranteed to be correct when it declares a number to be prime.
The primary reason for this order is that algorithms of the first type typically have shorter running times, so composite integers are detected efficiently, whereas algorithms of the second type tend to be much slower.
Complexity of factoring
Even though we shall return to this matter in Chapter 25, we briefly recall the complexity of finding factors of composite numbers.
Divisibility of large numbers by small primes such as 2, 3, 5, 7, and 11 can be checked efficiently by brute force. For instance, even a number N with 1000 decimal digits, i.e., about 3322 bits, can be divided by all integers up to 10 million in just a few seconds on a modern personal computer. Trial division of a large number N by a smaller integer n (where N is significantly larger than n) requires at most O(lg(n) lg(N)) operations with straightforward techniques.
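The trial division described above can be sketched in a few lines of Python. This is only an illustration: the function name `trial_division` and the `bound` parameter are assumptions made for this example, and the loop tries every integer rather than only primes.

```python
def trial_division(N, bound):
    """Return the smallest divisor n of N with 2 <= n <= bound, or None.

    Candidates above sqrt(N) are skipped: if N has a proper divisor,
    it has one that is at most sqrt(N).
    """
    n = 2
    while n <= bound and n * n <= N:
        if N % n == 0:      # one modular reduction per candidate
            return n
        n += 1
    return None


# Usage: strip small factors before running a more expensive test.
print(trial_division(10403, 1000))   # 10403 = 101 * 103
```

A result of `None` only means that no divisor up to `bound` exists; it proves primality only when `bound` is at least √N.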
The elliptic curve method (ECM) of factorization given in Section 25.3.3 has expected complexity \( L_p(1/2, \sqrt{2}) \), in the usual subexponential notation, for finding the smallest prime factor p of N. It is expected that in the near future this method will be able to find 60-digit, i.e., 200-bit, factors.
Progress on the RSA challenges has spurred extensive research into number field sieve methods for integer factorization. Heuristically, the number field sieve is expected to factor any integer N in time \( L_N(1/3, (64/9)^{1/3}) \), which is subexponential in lg N.
Recent work has demonstrated the practicality of this algorithm, previously deemed infeasible, and has led to significant improvements in its implementation. Notably, the largest RSA modulus factored to date is the 200-digit RSA challenge integer, factored by a team including Franke and Kleinjung in 2005.
Discrete logarithm systems
Generic discrete logarithm systems
Let (G,⊕) be a cyclic group of prime order ℓ with generator P. The map ϕ: Z → G, n ↦ [n]P, has kernel ℓZ and therefore induces an isomorphism between (Z/ℓZ,+) and (G,⊕). The problem of computing the inverse map is called the discrete logarithm problem (DLP) to the base P: given P and Q, find k ∈ Z such that Q = [k]P. The discrete logarithm of Q to the base P is denoted log_P(Q) and is unique only modulo the group order ℓ. The complexity of this problem depends on the specific choice of G and ⊕, which is why we refer to the discrete logarithm system (G,⊕, P).
As a first example, take (G,⊕) = (Z/ℓZ,+) with generator 1 + ℓZ; the discrete logarithm of n + ℓZ is simply n. For a generator of the form a + ℓZ with an integer a not divisible by ℓ, the problem is still easy, as it only requires computing the inverse of a modulo ℓ. Chapter 10 shows that this computation has complexity polynomial in the size of the operands, i.e., in lg ℓ.
Hence, this group cannot be used in cryptographic applications.
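To see why the additive group is unsuitable, the following sketch solves the DLP in (Z/ℓZ,+) with a single modular inversion. The function name `additive_dlog` and the sample values are assumptions chosen for illustration; `pow(a, -1, ell)` is Python's built-in modular inverse (Python 3.8 or later).

```python
def additive_dlog(a, q, ell):
    """Solve n * a = q (mod ell) for n.

    Here a + ell*Z plays the role of the generator P and q + ell*Z the
    role of Q = [n]P; a must be invertible modulo ell.
    """
    return (q * pow(a, -1, ell)) % ell


ell = 2**61 - 1            # a Mersenne prime, used as the group order
a, n = 123456789, 987654321
q = (n * a) % ell          # "public" element Q = [n]P
print(additive_dlog(a, q, ell) == n)
```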
Example 1.13 Choose a prime p such that ℓ divides p − 1. Choose ζ ≠ 1 in Z/pZ with ζ^ℓ = 1 (i.e., ζ is a primitive ℓ-th root of unity). Then (G,⊕) = (⟨ζ⟩,×) and ϕ(n) = ζ^n.
In Chapter 19, we will show that this DLP is of subexponential complexity, thus harder than in the previous example but not optimal.
An obvious generalization is to work in extension fields F_q, with q = p^d and ℓ | p^d − 1 for p prime.
To represent the finite field F_{p^d} one fixes an irreducible polynomial m(X) ∈ F_p[X] of degree d and uses the isomorphism F_{p^d} ≅ F_p[X]/(m(X)).
For an introduction to finite fields and their arithmetic see Chapters 2 and 11.
DLP-based systems in the multiplicative group of finite fields are straightforward to set up: for a prime ℓ of suitable size, one can find parameters p and d such that ℓ | p^d − 1. For carefully chosen subgroups, compression techniques based on traces, such as LUC [SMSK 1995] and XTR [LEVE 2000], can be used to represent the subgroup elements. These groups additionally allow faster group operations.
The groups associated to elliptic and hyperelliptic curves of small genus that will be studied in the sequel of the book are believed to have a DLP of exponential complexity.
To implement a discrete logarithm (DL) system, one must identify groups in which the discrete logarithm problem (DLP) is hard. This requires the ability to compute the group order efficiently, so that one can find a subgroup (G, ⊕, P) of large prime order ℓ. Additionally, group elements need a compact representation, using O(lg ℓ) space.
Finally, Q ⊕ R must be efficiently computable for any inputs Q and R in the group G. Chapter 19 explores the complexity of solving the DLP; computing the order of P is a special case, since ord(P) = log_P(1), where 1 denotes the neutral element of G. Techniques for scalar multiplication are detailed in Chapter 9, while Chapters 13 and 14 focus on the groups derived from elliptic and hyperelliptic curves.
Protocols
Diffie–Hellman key exchange
The publication of Diffie and Hellman's groundbreaking paper "New Directions in Cryptography" in 1976 marks the inception of public-key cryptography. We describe the Diffie–Hellman (DH) protocol in the framework of an abstract group (G, ⊕); the authors originally suggested using the multiplicative group of a finite prime field.
Alice (A) and Bob (B) aim to establish a shared secret key using public parameters (G, ⊕, P). Once they have agreed on a joint key P_k ∈ G, they can use a key derivation function to derive a bit-string that serves as a key in a symmetric encryption system. To this end, Alice randomly selects a secret integer a_A and computes P_A = [a_A]P; likewise, Bob selects a secret integer a_B and computes P_B = [a_B]P.
Both parties publicly exchange the intermediate results P_A and P_B. If the DLP is hard, it is computationally infeasible to recover a_A from P_A or a_B from P_B. Upon receiving P_B, A computes P_k = [a_A]P_B = [a_A a_B]P, and B computes the same group element as [a_B]P_A = [a_B a_A]P. Both parties thus possess the group element P_k, which is assumed to be infeasible to derive from the public values P_A and P_B.
Clearly, this last assumption does not hold if the DLP in (G,⊕) is easy. The problem of computing [a_A a_B]P given [a_A]P and [a_B]P is called the computational Diffie–Hellman problem (CDHP).
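The exchange can be sketched as follows, assuming for illustration that G is the subgroup of prime order ℓ = 1019 in (Z/pZ)* with p = 2039, generated by 4. These toy parameters and the helper name `keygen` are assumptions of this example; they are far too small to offer any security.

```python
import secrets

# Toy public parameters (G, ⊕, P): G is the subgroup of order ell in (Z/pZ)*,
# the group law is multiplication mod p, and [n]P corresponds to g^n mod p.
p, ell, g = 2039, 1019, 4


def keygen():
    """Pick a secret scalar a in [1, ell-1] and return (a, [a]P)."""
    a = 1 + secrets.randbelow(ell - 1)
    return a, pow(g, a, p)


a_A, P_A = keygen()            # Alice's secret and public value
a_B, P_B = keygen()            # Bob's secret and public value

k_alice = pow(P_B, a_A, p)     # Alice computes [a_A]P_B
k_bob = pow(P_A, a_B, p)       # Bob computes [a_B]P_A
print(k_alice == k_bob)        # both hold P_k = [a_A a_B]P
```

In practice the shared element would be passed through a key derivation function before use, as noted in the text.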
Maurer and Wolf (1999) investigate the equivalence between the CDHP and the DLP, using elliptic curves of split group order as a key tool in their proof. They show that, given such curves, an oracle solving the CDHP can be used to solve the DLP in polynomial time. For groups associated with elliptic curves, this question is studied further by Muzereau et al. (2004).
In many discrete logarithm (DL) systems, verifying the correctness of a proposed solution to the CDHP is itself difficult. The decision Diffie–Hellman problem (DDHP) asks, given [a]P, [b]P, and [c]P, whether [c]P = [ab]P; it is no harder than the CDHP. For the DLP, by contrast, a decision version is of little interest, as one can simply test a proposed solution directly.
In a DL system with a bilinear structure, the DDHP can be solved efficiently by comparing the images of suitable group elements under the bilinear map. Groups in which the CDHP is considered hard while the DDHP is easy are called Gap-Diffie–Hellman groups.
The version presented here is not ready for implementation, as an active attacker, Eve (E), can intercept the communication between Alice and Bob. By posing as Bob to Alice and as Alice to Bob, Eve obtains joint keys with both parties: P_{k,A} with Alice and P_{k,B} with Bob. She can then decrypt messages from Alice meant for Bob and re-encrypt them using P_{k,B}, which lets her read all communication while remaining undetected. This is known as the man-in-the-middle attack.
Asymmetric Diffie–Hellman and ElGamal encryption
The Diffie–Hellman key exchange requires both parties to be online simultaneously. The following variant removes this requirement. It is asymmetric, meaning that the sender and receiver perform different steps and that two types of keys are used: a private key and a public key.
If the DLP is hard in the group (G,⊕), Alice can simply publish her public key P_A = [a_A]P in a directory. The process of generating the pair of public and private keys is known as key generation. The receiver of a message must have generated and published a public key beforehand. The literature on public-key infrastructure (PKI) addresses the problem of making this data accessible and of ensuring trust in the link between Alice and her public key.
OUTPUT: The public key P_A and private key a_A.
To eliminate human bias, random choices must be made by computing devices; humans tend, for example, to choose small numbers for easier calculations. Chapter 30 deals with random number generators.
To send a message to Alice, Bob retrieves her public key from a directory and can then execute an asymmetric Diffie–Hellman key exchange, provided there is a map ψ: G → K from the group to the keyspace K and a symmetric cipher E_κ depending on the key κ. The decryption function, i.e., the inverse of E_κ, is denoted by D_κ.
Algorithm 1.16 Asymmetric Diffie–Hellman encryption
INPUT: A message m, the public parameters (G,⊕, P) and the public key P_A ∈ G.
To decrypt, Alice computes P_k = [a_A]Q, using her private key a_A, from which she determines κ = ψ(P_k). She recovers the plaintext as m = D_κ(c).
The randomly chosen nonce k ∈_R N makes this a randomized encryption.
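A minimal sketch of asymmetric Diffie–Hellman encryption, consistent with the decryption rule above, might look as follows. Several choices here are assumptions made only for illustration: the toy group (p = 2039, ℓ = 1019, generator 4), SHA-256 as the map ψ, and a hash-based XOR stream as the symmetric cipher E_κ. The ciphertext is taken to be the pair (Q, c) with Q = [k]P.

```python
import hashlib
import secrets

p, ell, g = 2039, 1019, 4      # toy subgroup of order ell in (Z/pZ)*


def psi(elem):
    """Assumed map psi: G -> K, here SHA-256 of the decimal encoding."""
    return hashlib.sha256(str(elem).encode()).digest()


def xor_cipher(key, data):
    """Toy symmetric cipher E_kappa = D_kappa: XOR with a hash-derived stream."""
    stream, counter = b"", 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(x ^ y for x, y in zip(data, stream))


def encrypt(m, P_A):
    k = 1 + secrets.randbelow(ell - 1)    # fresh random nonce
    Q = pow(g, k, p)                      # Q = [k]P, sent along with c
    P_k = pow(P_A, k, p)                  # P_k = [k]P_A
    return Q, xor_cipher(psi(P_k), m)


def decrypt(Q, c, a_A):
    P_k = pow(Q, a_A, p)                  # P_k = [a_A]Q
    return xor_cipher(psi(P_k), c)


a_A = 1 + secrets.randbelow(ell - 1)      # Alice's private key
P_A = pow(g, a_A, p)                      # Alice's public key
Q, c = encrypt(b"attack at dawn", P_A)
print(decrypt(Q, c, a_A))
```

Because a new nonce is drawn for every call, encrypting the same message twice will usually give different ciphertexts.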
If there is an invertible map ϕ from the message space M to G one can also use ElGamal encryption.
INPUT: A message m, the public parameters (G,⊕, P) and the public key P_A ∈ G.
To decrypt, Alice uses P_k = [a_A]Q and computes m = ϕ⁻¹(R ⊖ P_k).
This encryption method can only handle messages of at most lg ℓ bits, where ℓ is the order of G. While it is technically feasible to encrypt longer messages with a mode of operation that makes multiple calls to Algorithm 1.17, this is rarely done because the scheme is slow. Instead, the transmitted message m is typically used as a secret key for subsequent symmetric encryption.
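ElGamal encryption can be illustrated with the following sketch. It assumes a toy group (the quadratic residues modulo the safe prime p = 2039, of order ℓ = 1019) and one particular invertible encoding ϕ from M = {1, …, ℓ} to G; both are assumptions of this example rather than choices made in the text. The ciphertext is the pair (Q, R) with Q = [k]P and R = ϕ(m) ⊕ P_k.

```python
import secrets

p, ell, g = 2039, 1019, 4      # p = 2*ell + 1, G = squares modulo p


def phi(m):
    """Encode m in {1, ..., ell} as an element of G.

    Since p = 3 (mod 4), exactly one of m and p - m is a square mod p.
    """
    return m if pow(m, ell, p) == 1 else p - m


def phi_inv(y):
    """Invert phi: elements above ell came from p - m."""
    return y if y <= ell else p - y


def encrypt(m, P_A):
    k = 1 + secrets.randbelow(ell - 1)    # random nonce
    Q = pow(g, k, p)                      # Q = [k]P
    P_k = pow(P_A, k, p)                  # P_k = [k]P_A
    R = (phi(m) * P_k) % p                # R = phi(m) ⊕ P_k
    return Q, R


def decrypt(Q, R, a_A):
    P_k = pow(Q, a_A, p)                  # P_k = [a_A]Q
    return phi_inv((R * pow(P_k, -1, p)) % p)   # phi^{-1}(R ⊖ P_k)


a_A = 1 + secrets.randbelow(ell - 1)
P_A = pow(g, a_A, p)
print(decrypt(*encrypt(42, P_A), a_A))
```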
Signature scheme of ElGamal-type
An electronic signature should bind the signer to the content of the signed message. A hash function (see [MEOO + 1996]) is a map h: S → T between two sets S, T, where usually |S| > |T|, e.g., the input is a bit-string of arbitrary length and the output has fixed length.
Additional properties are required from cryptographic hash functions:
• Preimage resistant: for essentially all outputs t ∈ T it is computationally infeasible to find any s ∈ S such that t = h(s).
• 2nd-preimage resistant: for any given s_1 ∈ S it is computationally infeasible to find a different s_2 ∈ S such that h(s_1) = h(s_2).
• Collision resistant: it is computationally infeasible to find any distinct inputs s_1, s_2 such that h(s_1) = h(s_2).
For practical use, signatures must have a fixed length regardless of the message size, which is why one signs the hash of the message. It is crucial that the hash function is collision resistant; otherwise, a malicious party could obtain a signature on an innocent message m_1 and use it for a different message m_2 with the same hash value, h(m_1) = h(m_2). We also apply hash functions to elements of the group G, which we represent as bit-strings, and write h(Q) for Q ∈ G.
To compute an electronic signature, Alice must have performed Algorithm 1.15 in advance.
INPUT: A message m, the public parameters (G,⊕, P) with |G| = ℓ and the private key a_A ∈ N.
In the signature scheme, it is crucial to keep the short-term secret, the random nonce k, confidential; otherwise the long-term secret, the private key a_A, can be recovered from a signature (Q, s) via a_A ≡ h(Q)⁻¹ (h(m) − sk) (mod ℓ).
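The recovery equation is plain modular arithmetic, as the following sketch shows. It assumes the signature component satisfies s ≡ k⁻¹(h(m) − a_A h(Q)) (mod ℓ), which is the relation implied by the equation above; the function name and the numeric values are made up for illustration.

```python
def recover_private_key(s, k, h_m, h_Q, ell):
    """Recover a_A from a signature value s once the nonce k has leaked.

    Uses a_A = h(Q)^{-1} * (h(m) - s*k) mod ell; h_Q must be nonzero mod ell.
    """
    return (pow(h_Q, -1, ell) * (h_m - s * k)) % ell


ell = 1019                                  # toy prime group order
a_A, k = 123, 77                            # long-term and short-term secrets
h_m, h_Q = 500, 321                         # stand-ins for h(m) and h(Q)
s = (pow(k, -1, ell) * (h_m - a_A * h_Q)) % ell
print(recover_private_key(s, k, h_m, h_Q, ell) == a_A)
```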
ElGamal signature schemes have many variants, some of which avoid inverting k modulo the group order, which is advantageous in restricted environments. This spares such devices from implementing inversion modulo the group order on top of the finite field arithmetic needed for the group operations. An overview of the different schemes can be found in [MEOO + 1996, Note 11.6]; one alternative signature is s ≡ k h(m) + a_A h(Q) (mod ℓ), using the notation above.
A signature can be verified by everybody.
INPUT: A message m, its signature (Q, s) from Algorithm 1.18, the public parameters (G,⊕, P) where |G| = ℓ, and the public key P_A.
OUTPUT: Acceptance or rejection of signature.
3. if R_1 = R_2 return “acceptance” else return “rejection”
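The algorithm is valid as a correct signature gets accepted. Namely, with R_1 = [h(Q)]P_A ⊕ [s]Q and R_2 = [h(m)]P, one has R_1 = [h(Q) a_A + sk]P = [h(m)]P = R_2, since s ≡ k⁻¹ (h(m) − a_A h(Q)) (mod ℓ).
In Line 1 one can apply simultaneous multiplication techniques (cf. Chapter 9).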
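Signing and verification can be sketched together as follows. The sketch assumes the signing relation s ≡ k⁻¹(h(m) − a_A h(Q)) (mod ℓ) and the check [h(Q)]P_A ⊕ [s]Q = [h(m)]P, which is one standard ElGamal-type variant consistent with the key-recovery equation given earlier. The toy group (p = 2039, ℓ = 1019, generator 4) and the use of SHA-256 reduced modulo ℓ as h are assumptions of this example.

```python
import hashlib
import secrets

p, ell, g = 2039, 1019, 4      # toy subgroup of order ell in (Z/pZ)*


def h(data):
    """Assumed hash into Z/ellZ; group elements are hashed via their decimal form."""
    if isinstance(data, int):
        data = str(data).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % ell


def sign(m, a_A):
    k = 1 + secrets.randbelow(ell - 1)                   # secret nonce
    Q = pow(g, k, p)                                     # Q = [k]P
    s = (pow(k, -1, ell) * (h(m) - a_A * h(Q))) % ell
    return Q, s


def verify(m, Q, s, P_A):
    R1 = (pow(P_A, h(Q), p) * pow(Q, s, p)) % p          # [h(Q)]P_A ⊕ [s]Q
    R2 = pow(g, h(m), p)                                 # [h(m)]P
    return R1 == R2


a_A = 1 + secrets.randbelow(ell - 1)
P_A = pow(g, a_A, p)
Q, s = sign(b"a signed message", a_A)
print(verify(b"a signed message", Q, s, P_A))
```

The product of two exponentiations in `verify` is where a simultaneous multiplication technique would be applied in an optimized implementation.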
Whether it suffices to transmit only part of Q depends on the properties of the group. The Digital Signature Algorithm (DSA) operates in a subgroup of the multiplicative group of a finite field, while the corresponding standard for elliptic curves is the Elliptic Curve Digital Signature Algorithm (ECDSA).
ECDSA is specified in the ANSI X9.62 standard, while the German standard GECDSA avoids inversions modulo the group order. Currently, there is no established standard for hyperelliptic curves, although an analogue of ECDSA was proposed in the literature in 2005.
Other problems
This chapter presents the foundations of cryptography based on the discrete logarithm problem, which serves as a key motivation for studying the groups discussed in this book. Additionally, we give a brief overview of the RSA cryptosystem, a public-key cryptosystem that is widely used in practice.
When Alice and Bob want to communicate securely over an insecure channel, they must ensure that an eavesdropper, Eve, cannot decipher or alter their encrypted conversation. This is achieved through cryptographic primitives that are easy for Alice and Bob to compute but computationally infeasible for Eve to break without access to their secret information. Examples of such primitives are RSA, which relies on the difficulty of the integer factorization problem, and systems based on the discrete logarithm problem, which asks for an integer k such that [k]P = Q, where P is a generator of a cyclic group and Q belongs to that group. These primitives are discussed further in Sections 1.4.3.1 and 1.5. They are applied in a prescribed way given by protocols. We will only briefly state the necessary problems and hardness assumptions in Section 1.6 but not go into the details.
This chapter covers key topics in cryptography, beginning with primality proving and integer factorization. It then turns to discrete logarithm systems, highlighting the use of elliptic and hyperelliptic curves within this framework. Finally, it discusses protocols that use these cryptographic primitives to establish shared keys, encrypt messages for a recipient, and produce electronic signatures.
In ancient times, cryptography was limited to a select group, primarily the military and secret services, who relied on couriers to distribute keys for enciphering and deciphering messages. Historical symmetric systems, such as Caesar’s cipher and the Enigma machine, laid the groundwork for modern encryption. Today, the Advanced Encryption Standard (AES) is the prevailing symmetric cipher, known for its speed and efficiency in secure communication, as long as a shared key has been established.
E-commerce relies on secure transactions within a globally connected network, which requires reliable key exchange mechanisms for parties who have not previously interacted. Public-key cryptography relaxes the stringent requirements of symmetric cryptography, allowing secure networks to be set up and scaled more easily. It also offers cryptographic services such as non-repudiable signatures, which are absent in symmetric systems. The security of public-key cryptography rests on the computational difficulty of specific mathematical problems and on their complexity classification.
Complexity theory seeks to establish formal models for the processors and algorithms utilized in everyday computing, while also categorizing these algorithms based on their memory and time consumption.
A Turing machine, a simple mathematical model, can simulate all computations performed by a computer. It consists of a finite set of states, an initial state, a finite set of symbols, and a transition function that guides its step-by-step operation. The execution time of an algorithm is defined as the number of steps taken from start to finish, while memory consumption is measured by the number of symbols written on the memory string. This book uses a more advanced model called a Random Access Machine, which closely resembles modern microprocessors, so that execution time can be determined by counting the basic operations on machine words. For further information, see [PAP 1994].
The security of protocols is closely related to the presumed difficulty of certain computational problems. In complexity theory, a problem consists of a collection of finite-length questions and corresponding answers, typically represented as strings. In our context, the input often consists of mathematical objects such as integers or group elements encoded as strings. These problems fall broadly into two types: those that ask for the computation of further objects, such as group elements, and those that ask for a binary answer, yes or no.
Definition 1.1 A problem is called a decision problem if the problem is to decide whether a statement about an input is true or false.
A problem is called a computation problem if it asks to compute an output, possibly more elaborate than true or false, on a certain set of inputs.
One can formulate a computation problem from a decision problem. Many protocols base their security on a decision problem rather than on a computation problem.
The computation problem of finding a square root of 16 differs from the decision problem of determining whether 4 is a square root of 16. The latter can be resolved by computing 4² = 16 and comparing the result with 16.
A further decision problem in this context is whether 16 is a square. Clearly, this decision problem can be answered by solving the above computation problem.
In the upcoming section, we will explore the significant issue of determining whether a specific integer is a prime number, which is closely linked to the computational challenge of factorizing that integer.
In computational models, an algorithm can be associated with a function that quantifies the resources it consumes in terms of the input length, known as the complexity parameter. For execution time, this function is called the time complexity; for memory usage, the space complexity. To express complexity independently of specific processors, it is more practical to consider the cost of an algorithm "up to a constant factor," focusing on the growth rate of the operation count rather than the exact number of operations for a given input size.
The schoolbook method for multiplying n-digit integers is an "n² algorithm": it requires at most c·n² single-digit multiplications, where c is a constant. We are not interested in the value of c but in the growth rate, which is expressed using the "big-O" notation [GAGE 1999].
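The n² behaviour can be made concrete by counting single-digit multiplications in a direct implementation. The sketch below is illustrative only; the little-endian digit-list representation and the function name are assumptions of this example.

```python
def schoolbook_multiply(a_digits, b_digits, base=10):
    """Multiply two numbers given as little-endian digit lists.

    Returns (product digits, number of single-digit multiplications).
    """
    result = [0] * (len(a_digits) + len(b_digits))
    count = 0
    for i, a in enumerate(a_digits):
        carry = 0
        for j, b in enumerate(b_digits):
            total = result[i + j] + a * b + carry   # one digit multiplication
            count += 1
            result[i + j] = total % base
            carry = total // base
        result[i + len(b_digits)] += carry          # final carry of this row
    return result, count


digits, mults = schoolbook_multiply([3, 2, 1], [6, 5, 4])   # 123 * 456
print(digits, mults)
```

For two n-digit inputs the inner statement runs exactly n² times, which is the count that the big-O statement abstracts.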
Definition 1.4 Let f and g be two real functions of s variables. The function g(N_1, …, N_s) is of order f(N_1, …, N_s), denoted by O(f(N_1, …, N_s)), if for a positive constant c one has
\[ |g(N_1, \dots, N_s)| \le c\, f(N_1, \dots, N_s) \]
whenever N_i > N for some constant N. Finitely many tuples (N_1, …, N_s) at which f or g is undefined or meaningless may be excluded. In addition to the "big-O" notation one also uses the "small-o" notation.
The function g(N_1, …, N_s) is of order o(f(N_1, …, N_s)) if one has
\[ \lim_{N_1, \dots, N_s \to \infty} \frac{g(N_1, \dots, N_s)}{f(N_1, \dots, N_s)} = 0. \]
Finally we write f(n) = Õ(g(n)) as a shorthand for f(n) = O(g(n) lg^k g(n)) for some k.
In this book, lg denotes the logarithm to base 2 and ln the natural logarithm. Since the two differ only by a constant factor, big-O expressions are always written with the binary logarithm. For logarithms to other bases, we write log_a b for the logarithm of b to base a; this is not to be confused with the discrete logarithm introduced in Section 1.5, and the context will make the intended meaning clear.
Example 1.5 Consider g(N) = 10N² + 30N + 5000. It is of order O(N²), as for c = 5040 one has g(N) ≤ cN² for all N ≥ 1. We may write g(N) = O(N²). In addition g(N) is o(N³).
Consider the computation of the n-fold multiple nm of an integer m. Adding m repeatedly takes n − 1 additions, i.e., O(n) operations; this can be reduced to O(lg n). Assuming that a doubling takes about the same time as an addition of two distinct elements, writing 4m = 2(2m) reduces the number of operations from three to two. Similarly, 5m = 2(2m) + m requires only three operations instead of four. More generally, if n is written in binary as n = ∑_{i=0}^{l−1} n_i 2^i with n_i ∈ {0, 1}, n_{l−1} = 1, and l = ⌊lg n⌋ + 1, then
\[ nm = 2(2(\cdots 2(2m + n_{l-2}m) + n_{l-3}m) + \cdots + n_1 m) + n_0 m, \]
which takes l − 1 doublings and at most l − 1 additions.
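The binary method described above can be sketched as a left-to-right double-and-add loop. The function below is a minimal illustration: it takes the addition and doubling as parameters (an assumption of this example, so that it works for any group written additively) and also reports how many group operations were used.

```python
def scalar_multiply(n, m, add, double):
    """Compute the n-fold multiple of m for n >= 1 by double-and-add.

    Scans the binary expansion of n from the most significant bit down
    and returns (result, number of doublings and additions performed).
    """
    bits = bin(n)[2:]          # binary digits n_{l-1} ... n_0, with n_{l-1} = 1
    result, ops = m, 0
    for bit in bits[1:]:
        result = double(result)
        ops += 1
        if bit == "1":
            result = add(result, m)
            ops += 1
    return result, ops


# Usage with ordinary integers as the group:
print(scalar_multiply(5, 7, lambda x, y: x + y, lambda x: 2 * x))
```

With integers, `scalar_multiply(5, 7, ...)` uses two doublings and one addition, matching the count of three operations for 5m given in the text.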
Elementary algebraic structures
Groups
Definition 2.1 Given a set S, a composition law × of S into itself is a mapping from the Cartesian product S × S to S. Common notations for the image of (x, y) under this mapping are x × y, x.y or simply xy. When the law is commutative, i.e., when the images of (x, y) and (y, x) under the composition law are the same for all x, y ∈ S, it is customary to denote it by +.
Definition 2.2 A group G is a set with a composition law × such that
• × is associative, that is for all x, y, z ∈ G we have (xy)z = x(yz)
• × has a unit element e, i.e., for all x ∈ G we have xe = ex = x
• for every x ∈ G there exists y, an inverse of x, such that xy = yx = e.
(i) A group G is said to be commutative or abelian if its composition law is commutative. The law is then usually denoted by + or ⊕ and the unit element by 0.
(ii) The unit of a group G is necessarily unique, as is the inverse of an element x, which is denoted by x⁻¹. If G is commutative the inverse of x is usually denoted by −x.
(iii) The cardinality of a group G is also called its order. The group G is finite if its order is finite.
Definition 2.4 Let G be a group. A subgroup H of G is a subset of G containing the unit element e and such that
• for all x, y ∈ H, xy ∈ H
• for all x ∈ H, x⁻¹ ∈ H.
Example 2.5 Let x ∈ G. The set {x^n | n ∈ Z} is the subgroup of G generated by x. It is denoted by ⟨x⟩.
Definition 2.6 Let G be a group. An element x ∈ G is of finite order if ⟨x⟩ is finite. In this case, the order of x is |⟨x⟩|, that is, the smallest positive integer n such that x^n = e. Otherwise, x is of infinite order.
Definition 2.7 A group G is cyclic if there is x ∈ G such that ⟨x⟩ = G. If such an element x exists, it is called a generator of G.
Remark 2.8 Every subgroup of a cyclic group G is also cyclic. More precisely, if the order of G is n, then for each divisor d of n, G contains exactly one cyclic subgroup of order d.
Definition 2.9 Let G be a group and H be a subgroup of G. For all x, y ∈ G, the relation x ∼ y if and only if x⁻¹y ∈ H, respectively x ∼ y if and only if yx⁻¹ ∈ H, is an equivalence relation. An equivalence class for this relation is denoted by xH = {xh | h ∈ H}, respectively Hx = {hx | h ∈ H}, and is called a left, respectively right, coset of H. The numbers of left and right cosets are equal, and this number is called the index of H in G, denoted [G : H].
Theorem 2.10 (Lagrange) Let G be a finite group and H be a subgroup of G. Then the order of H divides the order of G. As a consequence, the order of every element also divides the order of G.
Since all the classes modulo H have the same cardinality |H| and form a partition of G, we have the more precise result |G| = [G : H] |H|.
Definition 2.11 Let G be a group. A subgroup H is normal if for all x ∈ G, xH = Hx. In this case G/H can be endowed with a group structure such that (xH)(yH) = xyH.
The group G = (Z, +) is abelian, and for any integer n, the subgroup nZ of multiples of n is a normal subgroup of G. This allows one to form the quotient group Z/nZ, which consists of the equivalence classes {x + nZ | x ∈ Z}. Two integers x and y are congruent modulo n if they belong to the same class modulo n, that is, if x − y ∈ nZ. This is written x ≡ y (mod n).
For every integer x, there is a unique integer r in the interval [0, n − 1], which belongs to the class of x. This integer r is called the canonical representative of x and we write r = x mod n.
But other choices are possible. For example, to minimize the absolute value of the representatives, we write x mods n for the unique integer in [−⌈n/2⌉ + 1, ⌊n/2⌋] congruent to x modulo n.
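Both choices of representative are easy to compute. The sketch below assumes that the symmetric interval runs from −⌈n/2⌉ + 1 to ⌊n/2⌋, so that it contains exactly n integers; the function names are chosen for this example only.

```python
def mod_canonical(x, n):
    """Canonical representative of x modulo n, in [0, n-1]."""
    return x % n               # Python's % already returns a value in [0, n-1] for n > 0


def mods(x, n):
    """Representative of x modulo n with minimal absolute value.

    The result lies in [-ceil(n/2) + 1, floor(n/2)].
    """
    r = x % n
    return r - n if r > n // 2 else r


print(mod_canonical(-3, 5), mods(8, 5), mods(2, 4))
```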
Definition 2.12 Let G and G′ be two groups with respective laws × and ⊗ and units e and e′.
• A group homomorphism ψ between G and G′ is a map from G to G′ such that for all x, y ∈ G, ψ(x × y) = ψ(x) ⊗ ψ(y).
• The kernel of ψ is ker ψ = {x ∈ G | ψ(x) = e′}.
Remark 2.13 The kernel of ψ is never empty as it is easy to see that ψ(e) = e′. In addition, ker ψ is always a subgroup of G, which is moreover normal.
Definition 2.14 Let S be a set and G be a group. The group G acts on S if there is a map σ from G × S to S such that σ(e, t) = t and σ(x, σ(y, t)) = σ(xy, t), for all t ∈ S and for all x, y ∈ G.
Rings
Definition 2.15 A ring R is a set together with two composition laws + and × such that
• R is a commutative group with respect to +
• × is associative and has a unit element 1, which is different from 0, the unit of +
• × is distributive over +, that is for all x, y, z ∈ R, x(y + z) = xy + xz and (y + z)x = yx + zx.
(i) The ring R is said to be commutative if the law × is commutative.
(ii) A commutative ring R such that for all x, y ∈ R, the equality xy = 0 implies that x = 0 or y = 0 is called an integral domain.
Example 2.17 The set Z of integers together with the usual addition and multiplication is a ring. The set Z[X] of polynomials with coefficients in Z together with the addition and multiplication of polynomials is a ring.
Definition 2.18 Let R and R′ be two rings with the respective operations +, × and ⊕, ⊗. A ring homomorphism ψ is a map from R to R′ such that for all x, y ∈ R
• ψ(x + y) = ψ(x) ⊕ ψ(y)
• ψ(x × y) = ψ(x) ⊗ ψ(y)
• ψ(1) = 1′, where 1 and 1′ denote the units of × and ⊗.
Definition 2.19 Let R be a ring. I is an ideal of R if it is a nonempty subset of R such that
• I is a subgroup of R with respect to the law +
• for all x ∈ R and all y ∈ I, xy ∈ I and yx ∈ I.
The ideal I ⊊ R is prime if for all x, y ∈ R with xy ∈ I one obtains x ∈ I or y ∈ I.
The ideal I ⊊ R is maximal if for any ideal J of R the inclusion I ⊂ J implies J = I or J = R.
Two ideals I and J of R are coprime if I + J = {i + j | i ∈ I and j ∈ J} is equal to R.
Remark 2.20 It is easy to prove that a maximal ideal is also prime The converse is not true in general.
Definition 2.21 An ideal I of a ring R is finitely generated if there are elements a_1, …, a_n such that every x ∈ I can be written x = x_1 a_1 + ⋯ + x_n a_n with x_1, …, x_n ∈ R.
The ideal I is principal if I = aR for some a ∈ R, and R is a principal ideal domain (PID) if it is an integral domain and if every ideal of R is principal.
Example 2.22 The integer ring Z and the polynomial ring K[X] where K is a field are principal ideal domains.
Theorem 2.23 (Chinese remainder theorem) Let I_1, …, I_k be pairwise coprime ideals of R. Then R/(I_1 ∩ ⋯ ∩ I_k) is isomorphic to the product ring R/I_1 × ⋯ × R/I_k.
Corollary 2.24 Let n_1, …, n_k be pairwise coprime integers, i.e., such that gcd(n_i, n_j) = 1 for i ≠ j. Then, for any integers x_i, there exists an integer x such that x ≡ x_i (mod n_i) for i = 1, …, k. Moreover, x is unique modulo n_1 ⋯ n_k.
Remark 2.25 See Algorithm 10.52 for an efficient method to compute x given the x_i’s.
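As an illustration of the corollary, the following sketch computes x by combining the congruences one at a time. It is a generic incremental approach written for this example and is not claimed to be the method of Algorithm 10.52; the function name `crt` is likewise an assumption.

```python
def crt(residues, moduli):
    """Return x in [0, n_1*...*n_k - 1] with x = x_i (mod n_i) for all i.

    The moduli must be pairwise coprime.
    """
    x, n = 0, 1
    for x_i, n_i in zip(residues, moduli):
        # Choose t so that x + n*t = x_i (mod n_i); n is invertible mod n_i.
        t = ((x_i - x) * pow(n, -1, n_i)) % n_i
        x += n * t
        n *= n_i
    return x % n


print(crt([2, 3, 2], [3, 5, 7]))   # 23
```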
Next we define an important arithmetic invariant. Let R be a ring and let ψ be the natural ring homomorphism from Z to R. So
\[ \psi(n) = \begin{cases} \underbrace{1 + \cdots + 1}_{n \text{ times}} & \text{if } n \ge 0, \\ -(\underbrace{1 + \cdots + 1}_{-n \text{ times}}) & \text{otherwise.} \end{cases} \tag{2.1} \]
The kernel of ψ is an ideal of Z and if the multiples of 1 are all different then ker ψ = {0}.
Otherwise, for example if R is finite, some multiples of 1 must be zero. In other words, the kernel of ψ is generated by a positive integer m.
Definition 2.26 Let R be a ring and ψ defined as above. The kernel of ψ is of the form mZ, for some nonnegative integer m, which is called the characteristic of R and is denoted by char(R).
Remark 2.27 In a commutative ring R of prime characteristic p, the binomial formula simplifies to
\[ (\alpha + \beta)^{p^n} = \alpha^{p^n} + \beta^{p^n} \quad \text{for all } \alpha, \beta \in R \text{ and } n \in \mathbb{N}. \tag{2.2} \]
Definition 2.28 Let R be a ring. An element x ∈ R is said to be invertible if there is an element y satisfying xy = yx = 1. Such an inverse y is necessarily unique and is denoted by x⁻¹. An invertible element is also called a unit. The set of all the invertible elements is a group under multiplication denoted by R*.
In the ring Z/NZ, the quotient of the integer ring Z by the ideal NZ, the invertible elements correspond exactly to the canonical representatives that are coprime to the positive integer N. The inverse of an element of this ring can be computed by an extended gcd computation, as detailed in Section 10.6.
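The extended gcd computation mentioned above can be sketched as follows. This is a textbook iterative version written for illustration, not a transcription of the algorithms in Section 10.6; the function names are assumptions of this example.

```python
def extended_gcd(a, b):
    """Return (g, u, v) with u*a + v*b == g == gcd(a, b) for a, b >= 0."""
    u0, v0, u1, v1 = 1, 0, 0, 1
    while b != 0:
        q = a // b
        a, b = b, a - q * b
        u0, u1 = u1, u0 - q * u1
        v0, v1 = v1, v0 - q * v1
    return a, u0, v0


def inverse_mod(x, N):
    """Inverse of x in Z/NZ, or ValueError if gcd(x, N) != 1."""
    g, u, _ = extended_gcd(x % N, N)
    if g != 1:
        raise ValueError("element is not invertible modulo N")
    return u % N


print(inverse_mod(3, 7))   # 5, since 3 * 5 = 15 = 1 (mod 7)
```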
Definition 2.30 Let N ≥ 1 and let us denote |(Z/NZ)*| by ϕ(N). The function ϕ is called the Euler totient function and one has ϕ(N) = |{x | 1 ≤ x ≤ N, gcd(x, N) = 1}|.
From Lagrange’s Theorem 2.10, it is easy to prove the following.
Theorem 2.31 (Euler) Let N and x be integers such that x is coprime to N, then x^ϕ(N) ≡ 1 (mod N).
When the modulus \( N \) is a prime number \( p \), this result is known as Fermat's little theorem: for any integer \( x \) not divisible by \( p \), one has \( x^{p-1} \equiv 1 \pmod{p} \).
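Euler's theorem and its special case for prime moduli can be checked numerically for small values. The sketch below computes ϕ(N) directly from the counting definition, which is only practical for small N; the function name is an assumption of this example.

```python
from math import gcd


def euler_phi(N):
    """Euler totient: number of x in [1, N] with gcd(x, N) = 1."""
    return sum(1 for x in range(1, N + 1) if gcd(x, N) == 1)


N = 20
units = [x for x in range(1, N) if gcd(x, N) == 1]
# Euler: x^phi(N) = 1 (mod N) for every x coprime to N.
print(all(pow(x, euler_phi(N), N) == 1 for x in units))
# Fermat: x^(p-1) = 1 (mod p) for a prime p not dividing x.
print(all(pow(x, 12, 13) == 1 for x in range(1, 13)))
```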
The ring Z/pZ has many other marvelous properties. In particular, every nonzero element has an inverse, which means that Z/pZ is a field.
Fields
Definition 2.32 A field K is a commutative ring such that every nonzero element is invertible.
The set of rational numbers Q is a field under the usual addition and multiplication. Likewise, for any prime number p, the quotient Z/pZ with the addition and multiplication induced from the integers is a field.
A straightforward consequence of Definition 2.32 is that a field is an integral domain. Taking the quotient of Z by the kernel of ψ, as defined in (2.1), we see that K contains a subring isomorphic to Z/char(K)Z, which must therefore be an integral domain. These observations lead to the following result.
Proposition 2.34 The characteristic of a field is either 0 or a prime number p.
As a corollary, a field K contains a subfield which is isomorphic to Q or Z/pZ.
From an integral domain R, one can build a field by adjoining formal inverses of all nonzero elements of R; the result is the field of fractions of R. For example, K(X) is the field of fractions of the polynomial ring K[X]. This construction is commonly used in practice to obtain fields.
Proposition 2.35 Let R be a commutative ring and I an ideal of R. Then the quotient ring R/I is a field if and only if I is maximal.
Definition 2.36 Let K and L be fields. A homomorphism of fields is a ring homomorphism between K and L.
We remark that a homomorphism of fields is always injective, for it is immediate that its kernel is reduced to {0}.
Introduction to number theory
Extension of fields
Definition 2.44 Let K and L be fields, we say that L is an extension field of K if there exists a field homomorphism from K into L. Such an extension field is denoted by L/K.
Remark 2.45 As said before, a field homomorphism is always injective, so we shall identify K with the corresponding subfield of L when considering L/K.
Example 2.46 Let R be the field of real numbers with usual addition and multiplication. Obviously, R is an extension of Q. Now, let us describe a less trivial example. Consider the element √2 ∈ R and the subset of R of the elements of the form a + b√2 with a, b ∈ Q. If we put, for a + b√2 and a′ + b′√2, (a + b√2) + (a′ + b′√2) = (a + a′) + (b + b′)√2 and (a + b√2)(a′ + b′√2) = (aa′ + 2bb′) + (ab′ + a′b)√2, it is easy to see that we obtain a field denoted by Q(√2), which is an extension of Q.
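The arithmetic of Q(√2) can be modelled exactly with pairs of rational numbers. The class below is a small sketch under that assumption; the class name `QSqrt2` and its method names are illustrative. The `inverse` method makes explicit why every nonzero element is invertible: the denominator a² − 2b² cannot vanish for (a, b) ≠ (0, 0) because √2 is irrational.

```python
from fractions import Fraction

class QSqrt2:
    """Element a + b*sqrt(2) of Q(sqrt(2)), with a and b rational."""

    def __init__(self, a, b=0):
        self.a, self.b = Fraction(a), Fraction(b)

    def __add__(self, other):
        return QSqrt2(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # (a + b√2)(a' + b'√2) = (aa' + 2bb') + (ab' + a'b)√2
        return QSqrt2(self.a * other.a + 2 * self.b * other.b,
                      self.a * other.b + other.a * self.b)

    def inverse(self):
        # 1/(a + b√2) = (a - b√2) / (a^2 - 2b^2)
        n = self.a * self.a - 2 * self.b * self.b
        if n == 0:
            raise ZeroDivisionError("zero has no inverse")
        return QSqrt2(self.a / n, -self.b / n)

    def __eq__(self, other):
        return (self.a, self.b) == (other.a, other.b)

    def __repr__(self):
        return f"{self.a} + {self.b}*sqrt(2)"

x = QSqrt2(1, 1)                      # 1 + √2
assert x * x == QSqrt2(3, 2)          # (1 + √2)^2 = 3 + 2√2
assert x * x.inverse() == QSqrt2(1)
print(x.inverse())                    # -1 + 1*sqrt(2)
```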
Definition 2.47 Let L and L′ be two extension fields of K and σ a field isomorphism from L to L′. One says that σ is a K-isomorphism if σ(x) = x for all x ∈ K.
Definition 2.48 Let L/K be a field extension, then L can be considered as a K-vector space. The dimension of L over K is called the degree of L/K, denoted by [L : K] or deg(L/K). If the degree of L/K is finite then we say that the extension L/K is finite.
The following result is straightforward.
Proposition 2.49 Let K ⊂ L ⊂ F be a tower of extension fields, then deg(F/K) = deg(F/L) deg(L/K).
Let L/K be a field extension and x an element of L. There is a unique ring homomorphism ψ : K[X] → L such that ψ(X) = x and ψ(z) = z for all z ∈ K. Since the image of ψ is an integral domain and K[X] is a principal ideal domain, the kernel of ψ is either the zero ideal or a maximal ideal I of K[X].
Definition 2.50 Suppose that I as defined above is nonzero. As K[X] is a principal ideal domain, there exists a unique monic irreducible polynomial \( m(X) = X^d + a_{d-1}X^{d-1} + \cdots + a_0 \) such that I = m(X)K[X]. We say that x is an algebraic element of L of degree d and that m(X) is the minimal polynomial of x.
Quotienting by ker ψ, one sees that ψ gives rise to a field inclusion \( \overline{\psi} \) of K[X]/(m(X)K[X]) into L. Let K[x] = {f(x) | f(X) ∈ K[X]} be the image of ψ in L. It is an extension field of K and the monic polynomial m(X) is an invariant of the extension: in fact, if there exist y ∈ L and a K-isomorphism from K[x] to K[y] sending x to y, then by construction x and y have the same minimal polynomial.
Definition 2.51 If every element of L is algebraic over K, we say that L is an algebraic extension of K.
The extension K[x]/K is finite and its degree is d, the degree of the minimal polynomial m(X). The map ψ restricts to a bijection between the K-vector space of polynomials with coefficients in K of degree less than d and K[x]. This bijection is called a polynomial representation of K[x]/K.
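In a polynomial representation, an element of K[x] is stored as the list of its d coefficients and products are reduced with the relation m(x) = 0. The sketch below does this for K = Q, with rationals represented by `Fraction`. The function name `mul_mod` and the convention of storing only the lower coefficients of the monic polynomial m are assumptions made for this example.

```python
from fractions import Fraction

def mul_mod(f, g, m):
    """Multiply f and g in Q[X]/(m).

    f, g: coefficient lists [c_0, ..., c_{d-1}], lowest degree first.
    m:    lower coefficients [a_0, ..., a_{d-1}] of the monic polynomial
          m(X) = X^d + a_{d-1} X^{d-1} + ... + a_0.
    """
    d = len(m)
    prod = [Fraction(0)] * (2 * d - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            prod[i + j] += Fraction(fi) * Fraction(gj)
    # Reduce with X^d = -(a_{d-1} X^{d-1} + ... + a_0), highest degree first.
    for k in range(2 * d - 2, d - 1, -1):
        c = prod[k]
        prod[k] = Fraction(0)
        for i in range(d):
            prod[k - d + i] -= c * Fraction(m[i])
    return prod[:d]

# Q[x] with x^3 = 2, i.e. m(X) = X^3 - 2.
m = [-2, 0, 0]
x = [0, 1, 0]
x2 = mul_mod(x, x, m)
assert x2 == [0, 0, 1]
assert mul_mod(x2, x, m) == [2, 0, 0]      # x^3 = 2

# Q(√2) with m(X) = X^2 - 2: (1 + √2)^2 = 3 + 2√2.
assert mul_mod([1, 1], [1, 1], [-2, 0]) == [3, 2]
```

Addition in this representation is simply coefficientwise, which is why only the product needs a reduction step.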
Not all algebraic extensions are finite; however, every finite extension is algebraic. For a finite extension L/K, there exists a finite sequence of elements x₁, …, xₙ in L such that L = K[x₁, …, xₙ]. When L = K[x] for a single element x, L is called a monogenic extension of K.
Definition 2.52 Let L/K be a finite algebraic extension and let x ∈ L. The map from L to L given by multiplication by x is linear when L is considered as a K-vector space. The trace and the determinant of this endomorphism of L are called respectively the trace and norm of x and denoted by \( \mathrm{Tr}_{L/K}(x) \) and \( \mathrm{N}_{L/K}(x) \). We use the notations Tr(x) and N(x) when no confusion is likely to arise.
If x is a generating element of L/K with minimal polynomial \( m(X) = X^d + a_{d-1}X^{d-1} + \cdots + a_0 \) then \( \mathrm{Tr}(x) = -a_{d-1} \) and \( \mathrm{N}(x) = (-1)^d a_0 \).
The trace and norm are both maps from L to K. We have the basic properties:
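Lemma 2.53 Let L/K be a degree d finite algebraic extension. For x, y ∈ L and a ∈ K we have
\( \mathrm{Tr}(x + y) = \mathrm{Tr}(x) + \mathrm{Tr}(y) \), \( \mathrm{Tr}(ax) = a\,\mathrm{Tr}(x) \), \( \mathrm{Tr}(a) = da \), \( \mathrm{N}(xy) = \mathrm{N}(x)\,\mathrm{N}(y) \), \( \mathrm{N}(ax) = a^d\,\mathrm{N}(x) \), \( \mathrm{N}(a) = a^d \).
Let K ⊂ L ⊂ F be a tower of finite algebraic extensions, let x be an element of F, then
\( \mathrm{Tr}_{F/K}(x) = \mathrm{Tr}_{L/K}(\mathrm{Tr}_{F/L}(x)) \) and \( \mathrm{N}_{F/K}(x) = \mathrm{N}_{L/K}(\mathrm{N}_{F/L}(x)) \).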
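The description of trace and norm in Definition 2.52 can be made concrete for a generating element x. On the basis 1, x, …, x^{d−1} of K[x], multiplication by x is given by the companion matrix of the minimal polynomial, and its trace and determinant recover −a_{d−1} and (−1)^d a_0. The sketch below checks this over K = Q; all function names are illustrative and the determinant routine is a plain Gaussian elimination written for this example.

```python
from fractions import Fraction

def mult_matrix_of_x(m):
    """Matrix of y -> x*y on the basis 1, x, ..., x^{d-1} of K[x].

    m = [a_0, ..., a_{d-1}] are the lower coefficients of the monic
    minimal polynomial X^d + a_{d-1} X^{d-1} + ... + a_0.
    Column j holds the coordinates of x * x^j.
    """
    d = len(m)
    M = [[Fraction(0)] * d for _ in range(d)]
    for j in range(d - 1):
        M[j + 1][j] = Fraction(1)        # x * x^j = x^{j+1}
    for i in range(d):
        M[i][d - 1] = -Fraction(m[i])    # x * x^{d-1} = -(a_0 + ... + a_{d-1} x^{d-1})
    return M

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

def det(M):
    """Determinant by Gaussian elimination over Q."""
    A = [row[:] for row in M]
    n, result = len(A), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if A[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            A[c], A[pivot] = A[pivot], A[c]
            result = -result
        result *= A[c][c]
        for r in range(c + 1, n):
            factor = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= factor * A[c][k]
    return result

m = [-2, 0, 0]                # m(X) = X^3 - 2: d = 3, a_2 = 0, a_0 = -2
M = mult_matrix_of_x(m)
assert trace(M) == 0          # Tr(x) = -a_{d-1} = 0
assert det(M) == 2            # N(x) = (-1)^3 * (-2) = 2
```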
When x is not a root of any polynomial equation with coefficients in K one needs a new notion.
Definition 2.54 If the kernel of ψ is equal to {0} we say that x is a transcendental element of L. If every element of L outside K is transcendental over K, we say that L is a pure transcendental extension over K. More generally, if there exists an element of L which is not algebraic over K, then L/K is a transcendental extension.
In this case ψ extends to an inclusion of the fraction field K(X) into L by setting ψ(1/X) = 1/x; we denote by K(x) the image of ψ in L. If L is not an algebraic extension of K(x), we can set x₁ = x and choose an element x₂ of L that is transcendental over K(x₁). This gives an inclusion of K(X₁, X₂) into L. Iterating this process, we obtain n ∈ N ∪ {∞}, the largest number such that L contains a subfield K(x₁, …, xₙ) isomorphic to K(X₁, …, Xₙ). It can be shown that n is independent of the sequence of transcendental elements x₁, …, xₙ of L over K chosen.
Definition 2.55 The number n defined above is called the transcendence degree of L over K.
Every extension \( K \rightarrow L \) can be written as the composition of a pure transcendental extension \( K \rightarrow K_{\text{trans}} \) and an algebraic extension \( K_{\text{trans}} \rightarrow K_{\text{alg}} = L \).
The field of rational functions Q(X) is a pure transcendental extension of Q of transcendence degree 1. In contrast, the algebraic closure of Q, denoted by \( \overline{Q} \), is an algebraic extension of Q, although this extension is not finite. Finally, Q(X, √2)/Q is a transcendental extension that can be written as Q → Q(X) → Q(X, √2).
Algebraic closure
Let K be a field and K[x] a monogenic algebraic extension of K defined by an irreducible polynomial m(X) over K. Over K[x], this polynomial factors as a product of irreducible polynomials \( m_i(X) \) with coefficients in K[x]. Since x is a root of m(X), the polynomial X − x is one of these irreducible factors, so the degree of each \( m_i \) is less than the degree of m as soon as m has degree at least 2. If all the \( m_i(X) \) have degree 1, we say that m(X) splits completely in K[x].
If \( m(X) \) does not split completely over \( K[x] \), then it has an irreducible factor \( m_{i_1}(X) \) of degree at least 2. We can use \( m_{i_1}(X) \) to define an extension \( K[x, y] \) of \( K[x] \). Repeating this process, we construct recursively an extension field in which \( m(X) \) splits completely.
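As an illustration of this recursive construction, consider the standard example K = Q and m(X) = X³ − 2, which is not taken from the text. Adjoining the real cube root x of 2 produces one linear factor and a quadratic factor that has no real root, hence stays irreducible over Q[x] ⊂ R. A second step, adjoining a root y of that quadratic, makes m(X) split completely:

```latex
\begin{align*}
X^3 - 2 &= (X - x)\,(X^2 + xX + x^2)
  && \text{over } \mathbb{Q}[x],\ x^3 = 2,\\
X^2 + xX + x^2 &= (X - y)\,(X + x + y)
  && \text{over } \mathbb{Q}[x, y],\ y^2 + xy + x^2 = 0.
\end{align*}
```

By Proposition 2.49 the resulting field Q[x, y] has degree 2 · 3 = 6 over Q.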
Definition 2.57 The smallest extension of K over which m(X) completely splits is called the splitting field of m(X). It is unique up to a K-isomorphism.
Every polynomial with real coefficients splits completely over the complex numbers. More generally, for a given field K, we look for a maximal algebraic extension of K, one into which every algebraic extension of K can be embedded. Every polynomial in K[X] then splits completely in it. The existence of such an extension is guaranteed by a theorem proved by Steinitz in 1910.
Theorem 2.58 (Steinitz) There exists an algebraic extension of K in which every polynomial m(X) ∈ K[X] splits completely. This extension, called the algebraic closure of K and denoted by \( \overline{K} \), is unique up to a K-isomorphism.
Next, we review some basic properties of algebraic extensions in order to state the main theorem of Galois theory.
Galois theory
For most parts of the book we consider finite algebraic extension fields. Therefore we restrict the discussion of Galois theory to this important case.
Definition 2.59 An extension L over K is said to be normal if every irreducible polynomial over K that has a root in L splits completely in L.
Equivalently, every field automorphism of \( \overline{K} \) that fixes K leaves L invariant. Let K be a field, \( \overline{K} \) its algebraic closure, and σ an embedding of K into \( \overline{K} \). For an element x of \( \overline{K} \), K[x] is an algebraic monogenic extension of K defined by a polynomial m(X) of degree d. If x = x₁, x₂, …, xₛ are the distinct roots of m(X) in \( \overline{K} \), then for each i = 1, …, s there is a unique field inclusion σᵢ of K[x] into \( \overline{K} \) whose restriction to K is σ and such that σᵢ(x) = xᵢ. The σᵢ are all the inclusions of K[x] into \( \overline{K} \) whose restriction to K is σ.
The number s of these inclusions is always less than or equal to d, the degree of K[x]/K. This integer s is called the degree of separability of x₁. More generally, we have the following definition.
Definition 2.60 Let L be a finite algebraic extension of K, \( \overline{K} \) be the algebraic closure of K and σ an inclusion of K into \( \overline{K} \). Then the degree of separability of L over K, denoted by \( \deg_s(L/K) \), is the number s of different field inclusions σᵢ, i = 1, …, s, of L into \( \overline{K} \) restricting to σ over K. If \( \deg_s(L/K) = \deg(L/K) \), we say that L/K is separable.
If x ∈ L, the elements σᵢ(x) ∈ \( \overline{K} \) are called the conjugates of x.
An immediate consequence of the definition and the preceding discussion is:
Lemma 2.61 A monogenic algebraic extension K[x] defined by a minimal polynomial m(X) is separable if and only if m(X) is prime to its derivative m′(X).
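The criterion of Lemma 2.61 amounts to a gcd computation between a polynomial and its formal derivative. The sketch below implements it over the prime field F_p with polynomials stored as coefficient lists, lowest degree first; all function names are illustrative. Note that the second test polynomial, X³ − 1 = (X − 1)³ over F_3, is not irreducible. It is included only to show what a nontrivial gcd looks like, since genuinely inseparable irreducible polynomials require a non-perfect base field, which this sketch does not model.

```python
def trim(f):
    """Drop trailing zero coefficients (lowest degree first)."""
    while f and f[-1] == 0:
        f = f[:-1]
    return f

def derivative(f, p):
    """Formal derivative of f in F_p[X]."""
    return trim([(k * c) % p for k, c in enumerate(f)][1:])

def poly_mod(f, g, p):
    """Remainder of f divided by a nonzero g in F_p[X]."""
    f = trim([c % p for c in f])
    g = trim([c % p for c in g])
    inv_lead = pow(g[-1], -1, p)   # modular inverse, Python 3.8+
    while len(f) >= len(g):
        q = (f[-1] * inv_lead) % p
        shift = len(f) - len(g)
        for i, c in enumerate(g):
            f[shift + i] = (f[shift + i] - q * c) % p
        f = trim(f)
    return f

def poly_gcd(f, g, p):
    """Monic gcd of f and g (not both zero) in F_p[X]."""
    f, g = trim([c % p for c in f]), trim([c % p for c in g])
    while g:
        f, g = g, poly_mod(f, g, p)
    inv_lead = pow(f[-1], -1, p)
    return [(c * inv_lead) % p for c in f]

def is_prime_to_derivative(f, p):
    return poly_gcd(f, derivative(f, p), p) == [1]

print(is_prime_to_derivative([1, 0, 1], 3))     # X^2 + 1 over F_3: True
print(is_prime_to_derivative([2, 0, 0, 1], 3))  # X^3 - 1 over F_3: False
```

In the second case the derivative is identically zero in characteristic 3, so the gcd is the polynomial itself.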
Concerning the composition of degree of separability we have
Proposition 2.62 Let L/K and F/L be a tower of extension fields, then \( \deg_s(F/K) = \deg_s(F/L) \deg_s(L/K) \).
We have the basic fact
Fact 2.63 Let F/L and L/K be field extensions and let x ∈ F be separable over K; then it is separable over L.
It follows from the preceding results that a finite algebraic extension is separable if it can be written as a composition of monogenic separable extensions.
Proposition 2.64 A finite algebraic extension L/K is separable if and only if every x ∈ L is separable over K.
Then the criterion of Lemma 2.61 tells us that every algebraic extension over a field of characteristic 0 is separable. We have the following definition.
Definition 2.65 A field over which every algebraic extension is separable is called a perfect field.
A field is perfect if and only if every irreducible polynomial over it is prime to its derivative. We saw that every field of characteristic zero is perfect. More generally, we have
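Proposition 2.66 A field K is a perfect field if and only if one of the following conditions is realized: (i) char(K) = 0; (ii) char(K) = p and every element of K is a p-th power in K, that is, \( K^p = K \).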
As a consequence of this proposition, we shall see that every finite field is perfect. The following theorem shows that every finite algebraic separable extension is in fact monogenic.
Theorem 2.67 If L/K is a separable finite algebraic extension of K then L/K is monogenic, i.e., there exists x ∈ L such that L = K[x]; such an x is called a defining element.
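As a concrete instance of Theorem 2.67, the extension Q(√2, √3)/Q is generated by the single element x = √2 + √3. This is a standard example, not taken from the text, and it assumes the known fact that this extension has degree 4. The computation below exhibits a monic polynomial of degree 4 vanishing at x and shows that √2 and √3 both lie in Q[x]:

```latex
\begin{align*}
x &= \sqrt{2} + \sqrt{3}, \qquad x^{2} = 5 + 2\sqrt{6}, \qquad (x^{2} - 5)^{2} = 24,\\
0 &= x^{4} - 10x^{2} + 1,\\
x^{-1} &= \sqrt{3} - \sqrt{2}, \qquad
\sqrt{3} = \tfrac{1}{2}\,(x + x^{-1}), \qquad
\sqrt{2} = \tfrac{1}{2}\,(x - x^{-1}).
\end{align*}
```

Hence Q[x] = Q(√2, √3), and since the degree is 4, the polynomial X⁴ − 10X² + 1 is the minimal polynomial of the defining element x.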
Definition 2.68 An extension L/K is a Galois extension if it is normal and separable. We define the Galois group of L over K, denoted by \( G_{L/K} \) or Gal(L/K), to be the group of K-automorphisms of L. There is a natural action of \( G_{L/K} \) on L defined for \( g \in G_{L/K} \) and x ∈ L by g · x = g(x). By its very definition, this action leaves the elements of K invariant.
If H is a subgroup of \( G = G_{L/K} \), we denote by \( L^H \) the set of elements of L invariant under the action of H. It is easy to see that \( L^H \) is a subfield of L. Moreover, \( L^H \) is a Galois extension of K if and only if H is a normal subgroup of G.
Together with normality, the separability condition of a finite Galois extension L/K implies that the order of the Galois group of L/K equals the degree of the extension. This leads to the main result of Galois theory.
Theorem 2.69 Let L/K be a finite Galois extension. Then there is a one-to-one correspondence between the set of subfields of L containing K and the subgroups of \( G_{L/K} \). To a subgroup H of \( G_{L/K} \) this correspondence associates the field \( L^H \).
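To see the correspondence in a small case, take the standard example L = Q(√2, √3) over K = Q, which is not taken from the text. Its Galois group has order 4 and is generated by σ, which sends √2 to −√2 and fixes √3, and τ, which fixes √2 and sends √3 to −√3. The five subgroups and their fixed fields are:

```latex
\[
\begin{array}{lcl}
H = \{1\} & \longleftrightarrow & L^{H} = \mathbb{Q}(\sqrt{2}, \sqrt{3}) \\
H = \langle \sigma \rangle & \longleftrightarrow & L^{H} = \mathbb{Q}(\sqrt{3}) \\
H = \langle \tau \rangle & \longleftrightarrow & L^{H} = \mathbb{Q}(\sqrt{2}) \\
H = \langle \sigma\tau \rangle & \longleftrightarrow & L^{H} = \mathbb{Q}(\sqrt{6}) \\
H = G_{L/K} & \longleftrightarrow & L^{H} = \mathbb{Q}
\end{array}
\]
```

For instance, στ fixes √6 because it maps √2 · √3 to (−√2)(−√3). Since the group is abelian, every subgroup is normal, and each intermediate field is indeed a Galois extension of Q.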