5.2 Normal and Canonical Forms
Earlier we used the term “canonical form” to denote the idea that mathematically equivalent quantities are represented the same way. We also introduced the zero-equivalence problem and indicated that it was unsolvable for most types of mathematical expressions. These two concepts are intimately related. In this section, we examine the notion of canonical simplification more formally. We also indicate some classes of expression for which the zero-equivalence problem is solvable and some for which instances of the problem are known to be unsolvable.
Let $E$ denote a class of mathematical expressions. Examples include multivariate polynomials with rational coefficients and elementary expressions of a single variable. The latter correspond roughly to the functions studied in introductory calculus. We use the symbol $\sim$ to denote that two expressions are mathematically equivalent although the forms used to express them may be different.
A normal (zero-equivalence) form is a transformation $f : E \to E$ satisfying the following properties for all expressions $e \in E$:
1. $f(e) \sim e$,
2. if $e \sim 0$, then $f(e) \equiv 0$.
A canonical form is a mapping $f : E \to E$ that is normal and satisfies the additional property:
3. if $e_1 \sim e_2$, then $f(e_1) \equiv f(e_2)$.
Intuitively, a normal form for an expression is a mathematically equivalent expression with the property that any expression equal to zero is transformed to zero. The existence of a normal form algorithm implies that we can test whether two expressions are equal by determining whether their difference is zero. A canonical form algorithm always transforms two mathematically equivalent expressions to the same expression. While the notions of normal and canonical forms are indeed different, the following surprising theorem shows that any class of expressions $E$ with a normal form also has a canonical form!
Theorem 5.1 (Normal/Canonical Form Theorem). A class of expressions $E$ possesses a normal form if and only if $E$ has a canonical simplification algorithm.
Proof. It is obvious from the definitions that the existence of a canonical form implies a normal form. We show that if $E$ has a normal form, then it also has a canonical form.
$E$ consists of finite-length expressions formed using some finite number of operations. Therefore, we can impose a lexicographical (alphanumerical) ordering over all the expressions in $E$, for instance by listing shorter expressions first and expressions of equal length alphabetically. As a result, an algorithm can be constructed to generate all the elements of $E$ sequentially according to this lexicographical ordering.
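To obtain the canonical form $f(e)$ for any expression $e$, we can generate the elements of $E$ in order and apply the zero-equivalence (normal form) algorithm to the difference between each generated element and $e$. We let the canonical form be the first element of $E$ in the lexicographical ordering that is equivalent to $e$. The search terminates because $e$ itself appears in the enumeration, and two equivalent expressions receive the same canonical form because they share the same first equivalent element.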
Like many similar existence proofs in the theory of computation, this one may not be particularly satisfying. The canonical form constructed need not be “simpler” than the original expression in any mathematical sense. Moreover, the process of generating and testing the expressions in $E$ one after another is extraordinarily inefficient and, for all but the smallest expressions, computationally intractable. Nonetheless, the proof is valid and we have the surprising result that the existence of a normal simplifier implies the existence of a canonical simplifier for the same class of expressions.
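The construction in the proof can be made concrete on a toy class of expressions. The following Python sketch is an illustration under stated assumptions, not part of the original development: the class consists of parenthesis-free strings over the symbols `* + - 0 1 x`, the names `to_poly`, `difference_is_zero`, and `canonical` are hypothetical, and the zero test simply collects coefficients, standing in for whatever normal form algorithm the class provides. Candidates are generated shortest first and alphabetically within each length.

```python
import re
from itertools import count, product

ALPHABET = "*+-01x"  # toy symbol set in ASCII order (an assumption of this sketch)
VALID = re.compile(r"[01x](\*[01x])*([+-][01x](\*[01x])*)*")

def to_poly(expr):
    """Map a well-formed toy expression to {degree: coefficient}; None if malformed."""
    if not VALID.fullmatch(expr):
        return None
    poly = {}
    for sign, term in re.findall(r"([+-]?)([01x*]+)", expr):
        factors = term.split("*")
        coeff = 0 if "0" in factors else 1
        degree = factors.count("x")
        poly[degree] = poly.get(degree, 0) + (-coeff if sign == "-" else coeff)
    return {d: c for d, c in poly.items() if c != 0}

def difference_is_zero(a, b):
    """Stand-in for the normal form algorithm applied to the difference a - b."""
    pa, pb = to_poly(a), to_poly(b)
    return all(pa.get(d, 0) == pb.get(d, 0) for d in set(pa) | set(pb))

def canonical(expr):
    """Return the first expression, in generation order, that is equivalent to expr."""
    if to_poly(expr) is None:
        raise ValueError("not a well-formed toy expression")
    for length in count(1):
        for symbols in product(ALPHABET, repeat=length):
            candidate = "".join(symbols)
            if to_poly(candidate) is not None and difference_is_zero(candidate, expr):
                return candidate

print(canonical("x+1"), canonical("1+x"))  # prints: 1+x 1+x
print(canonical("x-x"))                    # prints: 0
```

Even in this tiny setting the number of candidates grows as 6^n with the length n, which illustrates why the construction is of theoretical interest only.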
The next theorem shows that once we get only a little beyond the rational functions, the zero-equivalence problem is not, in general, solvable. Consequently, the class of expressions having normal (canonical) simplification algorithms is very small!
Theorem 5.2 (Richardson’s Theorem). Let $E$ be the class of expressions in a single variable $x$ generated by:
1. the rational numbers and the two real numbers $\pi$ and $\ln 2$,
2. the operations addition/subtraction, multiplication, composition, and
3. the sine, exponential, and absolute value functions.
Then if $e$ is an expression in $E$, the problem of determining whether there exists a value for $x$ such that $e(x) = 0$ is undecidable.
The theorem is due to Daniel Richardson (1966). Its proof, which is well beyond our scope, is based on the unsolvability of Hilbert’s tenth problem, known as the Diophantine problem. This problem is concerned with whether an algorithm exists to determine if polynomials in several variables with integer coefficients have solutions that can be expressed as integers. An example of a second order Diophantine equation is
\[ 42x^2 + 8xy + 15y^2 + 23x + 17y - 4915 = 0, \]
whose only solution is $x = -11$, $y = -1$.
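The stated solution can be checked by direct search. The Python sketch below is only an illustration, and the names `lhs` and `BOUND` are hypothetical. A bounded search is adequate here because the quadratic part 42x^2 + 8xy + 15y^2 is positive definite (8^2 - 4·42·15 < 0), so the solution set is a bounded ellipse; the search radius of 100 is an assumption chosen generously. No such bound is available for Diophantine equations in general, which is consistent with the unsolvability of Hilbert's tenth problem.

```python
def lhs(x, y):
    """Left-hand side of the example Diophantine equation."""
    return 42 * x * x + 8 * x * y + 15 * y * y + 23 * x + 17 * y - 4915

BOUND = 100  # assumed search radius; the ellipse lies well inside this box

solutions = [
    (x, y)
    for x in range(-BOUND, BOUND + 1)
    for y in range(-BOUND, BOUND + 1)
    if lhs(x, y) == 0
]
print(solutions)  # prints: [(-11, -1)]
```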
In view of Richardson’s Theorem we might well ask, “What useful classes of expressions have canonical forms?” We next present three such classes.
Univariate Polynomials
The most familiar canonical representation for polynomials in one variable is expanded form. For polynomials with integer coefficients, this representation is computed as follows:
1. Multiply all products of polynomials.
2. Collect terms of the same degree.
3. Order the terms by degree.
Omitting the last step results in a normal, rather than canonical, form. We illustrate the procedure with a simple example.
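\[
\begin{aligned}
p(x) &= 2(x^2 + x - 1)(x^3 + 1) - (x^2 - x + 1)(x^3 - 1) \\
     &= (2x^5 + 2x^2 + 2x^4 + 2x - 2x^3 - 2) - (x^5 - x^2 - x^4 + x + x^3 - 1) \\
     &= x^5 + 3x^2 + 3x^4 + x - 3x^3 - 1 \\
     &= x^5 + 3x^4 - 3x^3 + 3x^2 + x - 1
\end{aligned}
\]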
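The three steps can be sketched in Python as follows. This is a minimal illustration, not a prescribed implementation: the representation of a polynomial as a list of (coefficient, degree) pairs and the function names `multiply`, `collect`, and `order` are assumptions made for the example. Stopping after `collect` gives the normal form (the third line of the derivation), while applying `order` as well gives the canonical form (the last line).

```python
def multiply(a, b):
    """Step 1: multiply two polynomials given as lists of (coefficient, degree) terms."""
    return [(ca * cb, da + db) for ca, da in a for cb, db in b]

def collect(terms):
    """Step 2: add up terms of equal degree, keeping first-appearance order."""
    totals = {}
    for coeff, degree in terms:
        totals[degree] = totals.get(degree, 0) + coeff
    return [(c, d) for d, c in totals.items() if c != 0]

def order(terms):
    """Step 3: sort terms from highest to lowest degree."""
    return sorted(terms, key=lambda term: term[1], reverse=True)

a = [(1, 2), (1, 1), (-1, 0)]   # x^2 + x - 1
b = [(1, 3), (1, 0)]            # x^3 + 1
c = [(1, 2), (-1, 1), (1, 0)]   # x^2 - x + 1
d = [(1, 3), (-1, 0)]           # x^3 - 1

# 2*(a*b) - (c*d), before any terms are combined
raw = [(2 * k, n) for k, n in multiply(a, b)] + [(-k, n) for k, n in multiply(c, d)]

normal = collect(raw)       # terms combined but not ordered
canonical = order(normal)   # terms combined and ordered by degree
print(normal)      # [(1, 5), (3, 2), (3, 4), (1, 1), (-3, 3), (-1, 0)]
print(canonical)   # [(1, 5), (3, 4), (-3, 3), (3, 2), (1, 1), (-1, 0)]
```

Any expression equal to zero collapses to the empty list after `collect`, which is what makes the unordered result a normal form.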
A second canonical representation for univariate polynomials is factored form. This is more expensive to compute than the expanded form. To obtain a canonical, rather than a normal form, a rule is needed to order the factors.
One method is to arrange them by degree, say from lowest to highest. Factors of equal degree can be ordered by the values of their coefficients working, say, from highest to lowest degree terms. With this ordering, the factored form of the polynomial $p$ given above is
\[ p(x) = (x^2 - x + 1)(x^3 + 4x^2 - 1). \]
The problem of factoring univariate polynomials is the subject of Chapter 6.
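The ordering rule for factors can be expressed as a sort key. The Python sketch below assumes the factors are already known (finding them is a separate problem) and stores each one as a list of coefficients from highest to lowest degree. The names `factor_key` and `poly_mul` are hypothetical, and breaking ties by ascending coefficient values is one possible choice that the rule leaves open.

```python
def poly_mul(a, b):
    """Multiply two polynomials stored as coefficient lists, highest degree first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ca in enumerate(a):
        for j, cb in enumerate(b):
            out[i + j] += ca * cb
    return out

def factor_key(coeffs):
    """Sort by degree first, then by coefficients from the leading term down."""
    return (len(coeffs) - 1, coeffs)

factors = [[1, 4, 0, -1], [1, -1, 1]]       # x^3 + 4x^2 - 1 and x^2 - x + 1
ordered = sorted(factors, key=factor_key)   # lowest degree first

expanded = [1]
for f in ordered:
    expanded = poly_mul(expanded, f)

print(ordered)    # [[1, -1, 1], [1, 4, 0, -1]]
print(expanded)   # [1, 3, -3, 3, 1, -1]
```

Multiplying the ordered factors back together recovers the expanded form of p, which confirms the factorization.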
Yet another canonical form, though one not particularly useful in computer algebra, is Horner’s rule. This representation produces a formula that uses both the fewest number of additions and the fewest number of multiplications to evaluate a general polynomial. For $p$, it produces
\[ p(x) = x(x(x(x(x + 3) - 3) + 3) + 1) - 1. \]
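Evaluating a polynomial by Horner's rule amounts to a single loop over the coefficients. The following Python sketch is illustrative; the function name `horner` and the convention of listing coefficients from highest to lowest degree are assumptions of the example.

```python
def horner(coeffs, x):
    """Evaluate a polynomial by Horner's rule; coeffs run from highest to lowest degree."""
    result = coeffs[0]
    for c in coeffs[1:]:
        result = result * x + c   # one multiplication and one addition per coefficient
    return result

p = [1, 3, -3, 3, 1, -1]   # x^5 + 3x^4 - 3x^3 + 3x^2 + x - 1
print(horner(p, 2))        # prints: 69
```

For a polynomial of degree n the loop performs n multiplications and n additions, matching the nested formula for p.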
Multivariate Polynomials
Two canonical representations, the expanded and recursive forms, are very useful in computer algebra. To find the expanded form, multiply out all products and collect terms of the same degree, as in the univariate case. This gives a normal form. To obtain a canonical representation, assume an ordering on the variables and arrange terms by degree using this order. For example, with the ordering $x > y > z$, the polynomial
\[ q(x, y, z) = 2x^2y^2 - 3x^2yz^2 + 6x^2z^4 + 5x - 4y^3z + y^3 - 4y^2 - z^3 + 1 \]
is in expanded form. This representation is also referred to as distributed form.
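The term ordering can be sketched in Python by storing each term as an exponent tuple. This is a hedged illustration: the dictionary layout and the name `distributed_form` are assumptions, and the sketch interprets "arrange terms by degree using this order" as a descending lexicographic comparison of the exponent tuples (x first, then y, then z), which reproduces the order in which q is written above.

```python
# q(x, y, z) stored sparsely as {(deg_x, deg_y, deg_z): coefficient}, in arbitrary order
q = {
    (0, 0, 0): 1, (0, 0, 3): -1, (1, 0, 0): 5,
    (0, 2, 0): -4, (0, 3, 0): 1, (0, 3, 1): -4,
    (2, 0, 4): 6, (2, 1, 2): -3, (2, 2, 0): 2,
}

def distributed_form(poly):
    """Drop zero terms and list the rest in descending lexicographic exponent order."""
    return sorted(((e, c) for e, c in poly.items() if c != 0), reverse=True)

for exponents, coeff in distributed_form(q):
    print(coeff, exponents)
# first line printed: 2 (2, 2, 0)   i.e. the term 2 x^2 y^2
```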
An alternative canonical representation is recursive form. Again the variables must be ordered. A multivariate polynomial in this form is written as a polynomial in the first variable with coefficients that are polynomials in the other variables. These coefficient polynomials are then, in turn, represented in recursive form. The recursive form for $q$, again assuming the ordering $x > y > z$, is
\[ q(x, y, z) = (2y^2 - 3z^2y + 6z^4)x^2 + 5x + (-4z + 1)y^3 - 4y^2 - z^3 + 1. \]
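Converting from distributed to recursive form is a matter of grouping terms by the exponent of the leading variable and recursing on what remains. The Python sketch below is one possible illustration; the nested-dictionary layout and the name `to_recursive` are assumptions, not the representation used by any particular system.

```python
def to_recursive(dist):
    """Convert {exponent tuple: coeff} into nested {degree: coefficient polynomial} dicts."""
    if all(len(exps) == 0 for exps in dist):
        return sum(dist.values())          # no variables left: a plain number
    groups = {}
    for exps, coeff in dist.items():
        # group by the degree in the first variable, keep the remaining exponents
        groups.setdefault(exps[0], {})[exps[1:]] = coeff
    return {deg: to_recursive(sub) for deg, sub in sorted(groups.items(), reverse=True)}

# q(x, y, z) in distributed form, keyed by (deg_x, deg_y, deg_z)
q = {
    (2, 2, 0): 2, (2, 1, 2): -3, (2, 0, 4): 6, (1, 0, 0): 5,
    (0, 3, 1): -4, (0, 3, 0): 1, (0, 2, 0): -4, (0, 0, 3): -1, (0, 0, 0): 1,
}

rec = to_recursive(q)
print(rec[2])   # coefficient of x^2: {2: {0: 2}, 1: {2: -3}, 0: {4: 6}}
```

Here `rec[2]` encodes 2y^2 - 3z^2 y + 6z^4, the coefficient of x^2 in the recursive form of q.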
It is historically interesting to note that some of the early computer algebra systems took diametrically opposite approaches to the issue of representing multivariate polynomials. Brown’s Altran system used an expanded representation, while Collins’ PM and SAC-I systems stored polynomials recursively.
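A polynomial of degree $d$ in each of $v$ variables has $(d + 1)^v$ coefficients (zero and non-zero) when fully expanded. For instance, the polynomial $q$ above has degree at most 4 in each of its 3 variables, so a dense representation would reserve $(4 + 1)^3 = 125$ coefficients for only 9 non-zero terms. Consequently, in order to conserve memory, a sparse representation is invariably used to store multivariate polynomials in computer algebra systems. Without this approach, computations would quickly become unwieldy and the results incomprehensible.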
As with univariate polynomials, polynomials in several variables can also be expressed canonically in factored form. While users can request this transformation, computer algebra systems never employ it as the standard internal representation since it can be costly to find.
Rational Functions
For simplicity, we direct our attention to the case where both the numerator and denominator are polynomials in only one variable. We first consider the situation where the coefficients are integers. One common simplification is to transform the ratio of the polynomials to expanded canonical form with the following procedure:
1. Expand the polynomials in the numerator and denominator.
2. Remove the gcd of these polynomials.
3. Order the terms of the numerator and denominator polynomials by degree.
4. Make the leading coefficients of both polynomials positive.
Here is a simple example:
\[
r(x) = \frac{6x^2 + 2x - 4}{-4x^2 + 2x + 6}
     = \frac{2(x + 1)(3x - 2)}{2(x + 1)(-2x + 3)}
     = \frac{3x - 2}{-2x + 3}
     = -\frac{3x - 2}{2x - 3}.
\]
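The four-step procedure can be sketched in Python using exact rational arithmetic. This is a minimal illustration under stated assumptions rather than a production algorithm: polynomials are coefficient lists from highest to lowest degree, the gcd is found with the Euclidean algorithm over the rationals, the numerator is assumed non-zero, and the function names (`trim`, `remainder`, `poly_gcd`, `exact_quotient`, `canonical_rational`) are hypothetical. The overall sign is returned separately, mirroring the minus sign in front of the final fraction.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def trim(p):
    """Drop leading zeros; coefficients are listed from highest to lowest degree."""
    p = list(p)
    while p and p[0] == 0:
        p.pop(0)
    return p

def remainder(a, b):
    """Remainder of a divided by b, with rational coefficients."""
    a = trim(a)
    while len(a) >= len(b):
        k = a[0] / b[0]
        a = trim([c - k * (b[i] if i < len(b) else 0) for i, c in enumerate(a)])
    return a

def poly_gcd(a, b):
    """Euclidean algorithm; the result is a gcd up to a constant factor."""
    a, b = trim(a), trim(b)
    while b:
        a, b = b, remainder(a, b)
    return a

def exact_quotient(a, g):
    """Quotient a / g, assuming g divides a exactly."""
    a, quotient = trim(a), []
    while len(a) >= len(g):
        k = a[0] / g[0]
        quotient.append(k)
        a = [c - k * (g[i] if i < len(g) else 0) for i, c in enumerate(a)][1:]
    return quotient

def canonical_rational(num, den):
    """Return (sign, numerator, denominator) with coprime integer polynomials."""
    num = [Fraction(c) for c in num]
    den = [Fraction(c) for c in den]
    g = poly_gcd(num, den)                                  # step 2: remove the gcd
    num, den = exact_quotient(num, g), exact_quotient(den, g)
    # clear fractions, then divide out the common integer content
    scale = reduce(lambda m, c: m * c.denominator // gcd(m, c.denominator), num + den, 1)
    content = reduce(gcd, [int(c * scale) for c in num + den])
    num = [int(c * scale) // content for c in num]
    den = [int(c * scale) // content for c in den]
    sign = 1
    if num[0] < 0:                                          # step 4: positive leading terms
        num, sign = [-c for c in num], -sign
    if den[0] < 0:
        den, sign = [-c for c in den], -sign
    return sign, num, den

print(canonical_rational([6, 2, -4], [-4, 2, 6]))   # (-1, [3, -2], [2, -3])
```

Steps 1 and 3 are implicit in the input format, since a coefficient list is already expanded and ordered by degree.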