\chapter[ Computer Algebra]{Computer Algebra\label{chapter:CompAlg}}\index{computer algebra|(}\index{symbolic!computation}
Computer systems performing various mathematical computations are
indispensable in modern science and technology. We are able to compute
the orbits of planets and stars, control nuclear reactors, and
describe and model many of the natural forces. These computations can
be \textit{numerical or symbolic}.
Although {\em numerical computations} may involve not only elementary
arithmetical operations (addition, subtraction,
multiplication, division) but also more sophisticated calculations,
like computing numerical values of mathematical functions, finding
roots of polynomials or computing numerical eigenvalues of matrices,
these operations can only be carried out {\em on numbers}.
Furthermore, in most cases these numbers are not exact. Their degree
of precision depends on the floating-point arithmetic of the given
computer hardware architecture.
Unlike numerical calculations, {\em symbolic and algebraic
computations} operate on {\em symbols} that represent mathematical
objects. These objects may be numbers such as integers, rational numbers,
real and complex numbers, but may also be polynomials, rational and
trigonometric functions, equations, algebraic structures such as groups,
rings, ideals, algebras or elements of them, or even sets, lists,
tables.
Computer systems with the ability to handle symbolic computations are
called \textit{computer algebra systems} or \textit{symbolic and
algebraic systems} or \textit{formula manipulation systems}. In most
cases, these systems are able to handle both numerical and graphical
computations. The word ``symbolic'' emphasises that,
during the problem-solving procedure, the objects are represented by
symbols, and the adjective ``algebraic'' refers to the algebraic
origin of the operations on these symbolic objects.
To characterise the notion ``computer algebra'', one can describe it as a
collection of {\em computer programs} developed basically to perform
\begin{itemize}
\item exact representations of mathematical objects
and
\item arithmetic with these objects.
\end{itemize}
On the other hand, computer algebra can be viewed as a {\em
discipline} which has been developed in order to invent, analyse and
implement efficient mathematical algorithms based on exact arithmetic
for scientific research and applications.
Since computer algebra systems are able to perform error-free
computations with arbitrary precision, first we have to clarify the
data structures assigned to the various objects. Section
\ref{adatabrazolas} deals with the problems of representing
mathematical objects. Furthermore, we describe the symbolic
algorithms which are indispensable in modern science and practice.
The problems of natural sciences are mainly expressed in terms of
mathematical equations. Research in solving symbolic linear systems
is based on the well-known elimination methods. To find the solutions
of non-linear systems, first we analyse different versions of the
\textit{Euclidean algorithm} and the method of \textit{resultants}.
In the mid-sixties of the last century, Bruno
Buchberger\nevindex{Buchberger, Bruno} presented a method in his PhD
thesis for solving multivariate polynomial equations of arbitrary degree.
This method is known as the \textit{Gr\"obner basis
method}\nevindex{Gröbner, Wolfgang Anton Maria}. At that time, the
mathematical community paid little attention to his work, but it has
since become the basis of a powerful set of tools for computing with
higher degree polynomial equations. This topic is discussed in
Sections \ref{polkozgyok} and \ref{grobner}.
The next area to be introduced is the field of \textit{symbolic
integration}. Although the nature of the problem was understood long
ago (Liouville's principle)\nevindex{Liouville, Joseph}, it was only
in $1969$ that Robert Risch\nevindex{Risch, Robert} invented an algorithm to solve the following problem:
given an elementary function $f(x)$ of a real variable $x$, decide
whether the indefinite integral $\int f(x) dx$ is also an elementary
function, and if so, compute the integral. We describe the method in
Section \ref{integration}.
At the end of this chapter, we offer a brief survey of the theoretical
and practical relations of symbolic algorithms in Section
\ref{elmgyak}, devoting an independent part to the presently available
computer algebra systems.
\vspace{-2mm}
\section{Data representation}\label{adatabrazolas}
\vspace{-1mm}
In computer algebra, one encounters mathematical objects of different
kinds. In order to be able to manipulate these objects on a computer,
one first has to represent and store them in the memory of that
computer. This can cause several theoretical and practical
difficulties. We examine these questions in this section.
Consider the \textit{integers}\index{integers}. We know from our
studies that the set of integers is countable, but
computers can only store finitely many of them. The range of values
for such a \textit{single-precision integer} is limited by the number
of distinct encodings that can be made in the computer word, which is
typically $32$ or $64$ bits in length. Hence, one cannot directly use
the computer's integers to represent the mathematical integers, but
must be prepared to write programs to handle ``arbitrarily'' large
integers represented by several computer integers. The term arbitrarily
large does not mean infinitely large, since architectural
constraints or the available memory impose a limit in any case. Moreover, one has
to construct data structures over which efficient operations can be
built. In fact, there are two standard ways of performing such a
representation.
\begin{itemize}
\item Radix notation (a generalisation of conventional decimal notation), in which $n$ is represented as
$\sum_{i=0}^{k-1} d_{i} B^{i}$,
where the digits $d_i \ (0 \leq i \leq k-1)$ are single precision integers.
These integers can be chosen from the \textit{canonical digit set} $\{0\leq d_i\leq B-1\}$ or from the
\textit{symmetrical digit set} $\{-\lfloor B/2\rfloor < d_i\leq \lceil B/2\rceil\}$.
\item Modular notation (a generalisation of residue number systems), in which $n$ is represented
by its remainders modulo sufficiently many pairwise relatively prime single-precision integers
$m_1,m_2,\ldots,m_k$.
\end{itemize}
\begin{pelda}
In the modular representation, let $B=2^{32}$, let $I_1=[0,B)$ be the interval of single-precision
integers, and choose pairwise relatively prime moduli $m_1,m_2,\ldots,m_6\in I_1$ such that
$m_1m_2\cdots m_6 > 10^{50}$.
Then, an integer from the interval $I_2=[0,m_1m_2\cdots m_6)$ can be represented by a $6$-tuple from the interval $I_1$.
For a suitable $n_1\in I_2$ and for $n_2=2110$, we obtain
\begin{eqnarray*}
&n_1\equiv 2009436698\pmod{m_1},\ &n_1\equiv 961831343\pmod{m_2} \koz,\\
&n_1\equiv 4253639097\pmod{m_3},\ &n_1\equiv 1549708\pmod{m_4} \koz ,\\
&n_1\equiv 2459482973\pmod{m_5},\ &n_1\equiv 3373507250\pmod{m_6} \koz ,
\end{eqnarray*}
furthermore,
$n_2\equiv 2110\pmod{m_i}, \ (1\leq i\leq 6)$. Hence
\begin{eqnarray*}
n_1+\ n_2&=&[2009438808,961833453,4253641207,1551818,2459485083,3373509360] \koz ,\\
n_1\cdot\ n_2&=&[778716563,2239578042,2991949111,3269883880,1188708718,1339711723] \koz ,
\end{eqnarray*}
where addition and multiplication were carried out using modular arithmetic.
\end{pelda}
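The arithmetic of the example can be sketched in Python. The moduli below are four small primes chosen by us purely for illustration (the $m_1,\ldots,m_6$ above are single-precision integers), and the reconstruction step is the Chinese remainder theorem:

```python
from math import prod

# illustrative pairwise relatively prime moduli (four distinct primes)
M = [10007, 10009, 10037, 10039]

def to_modular(n):
    """The modular representation of n: its remainders modulo the m_i."""
    return [n % m for m in M]

def from_modular(residues):
    """Chinese remainder reconstruction of the unique n in [0, prod(M))."""
    N = prod(M)
    x = 0
    for r, m in zip(residues, M):
        Ni = N // m
        x += r * Ni * pow(Ni, -1, m)   # pow(Ni, -1, m): inverse of Ni mod m
    return x % N

n1, n2 = 1234567890, 2110
s = [(a + b) % m for a, b, m in zip(to_modular(n1), to_modular(n2), M)]
p = [(a * b) % m for a, b, m in zip(to_modular(n1), to_modular(n2), M)]
assert from_modular(s) == n1 + n2      # addition done componentwise
assert from_modular(p) == n1 * n2      # multiplication done componentwise
```

The componentwise operations never leave single precision; only the final reconstruction produces a multiprecision result.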
More generally, concerning the choice of representation of other mathematical objects,
it is worth distinguishing three levels of abstraction:
\begin{enumerate}\index{data representation!in computer algebra}
\item
\textit{Object level.} This is the level where the objects are considered as formal mathematical objects.
For example $3+3$, $4\cdot 4-10$ and $6$ are all representations of the integer $6$. On the object level,
the polynomials $(x-1)^2(x+1)$ and $x^3-x^2-x+1$ are considered equal.
\item
\textit{Form level.} On this level, one has to distinguish between different representations of an object.
For example $(x-1)^2(x+1)$ and $x^3-x^2-x+1$ are considered different representations of the same
polynomial, namely the former is a product, the latter is a sum.
\item
\textit{Data structure level.} On this level, one has to consider different ways of representing an object in
a computer memory. For example, we distinguish between representations of the polynomial
$x^3-x^2-x+1$ as
\begin{itemize}
\item an array $[1,-1,-1,1]$ \koz,
\item a linked list $[1,0] \rightarrow [-1,1] \rightarrow [-1,2] \rightarrow [1,3]$ \koz .
\end{itemize}
\end{enumerate}
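The two data structures above can be mirrored in a short Python sketch (the names and the ascending coefficient order are our own choices):

```python
# x^3 - x^2 - x + 1 at the data structure level
dense = [1, -1, -1, 1]                        # index i stores the coefficient of x^i
sparse = [(1, 0), (-1, 1), (-1, 2), (1, 3)]   # list of (coefficient, exponent) pairs

def eval_dense(p, x):
    return sum(c * x**i for i, c in enumerate(p))

def eval_sparse(p, x):
    return sum(c * x**e for c, e in p)

# both data structures represent the same form-level object
assert eval_dense(dense, 5) == eval_sparse(sparse, 5) == 5**3 - 5**2 - 5 + 1
```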
In order to represent objects in a computer algebra system, one has to make choices on both the form
and the data structure level. Clearly, various representations are possible for many objects. The problem of
``how to represent an object'' becomes even more difficult when one takes into consideration
other criteria, such as memory space, computation time, or readability. Let us see an example. For the polynomial
\begin{eqnarray*}
f(x)&=&(x-1)^2 (x+1)^3 (2x+3)^4\\
&=&16x^9-80x^8+88x^7+160x^6-359x^5+x^4+390x^3-162x^2-135x+81
\end{eqnarray*}
the product form is more comprehensible, but the expanded form makes it easy to read off the coefficient of, say, $x^5$.
Two other illustrative examples are
\begin{itemize}
\item $x^{1000} -1$ and $(x-1)(x^{999}+x^{998}+ \cdots +x+1)$ \koz ,
\item $(x+1)^{1000}$ and $x^{1000}+1000x^{999}+ \cdots +1000x +1$ \koz .
\end{itemize}
It is very hard to find any good strategy to represent mathematical objects satisfying several criteria.
In practice, one object may have several different representations. This, however, gives rise
to the problem of detecting equality when different representations of the same object are encountered.
In addition, one has to be able to convert a given representation to others and simplify
the representations.
Consider the \textit{integers.}\index{integers} At the form level, one can represent the integers using base $B$ representation,
while at the data structure level they can be represented by a linked list or as an array.
\textit{Rational numbers}\index{rational numbers} can be represented by two integers, a numerator and a denominator.
Considering memory constraints, one needs to ensure that rational numbers are in lowest terms and also
that the denominator is positive (although other choices, such as positive numerator, are also possible).
This implies that a greatest common divisor computation has to be performed. Since the ring of integers
is a Euclidean domain, this can be easily computed using the Euclidean algorithm. The uniqueness of the representation follows
from the choice of the denominator's sign.
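A minimal Python sketch of this normalisation, assuming integer inputs and a non-zero denominator:

```python
from math import gcd

def normalise_rational(num, den):
    """Reduce to lowest terms and make the denominator positive."""
    if den == 0:
        raise ZeroDivisionError("denominator must be non-zero")
    g = gcd(num, den)              # math.gcd is always non-negative
    num, den = num // g, den // g
    if den < 0:                    # move the sign into the numerator
        num, den = -num, -den
    return num, den

assert normalise_rational(21, -35) == (-3, 5)
assert normalise_rational(-4, -6) == (2, 3)
```

Since the denominator's sign is fixed, equal rationals always get identical representations.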
\textit{Multivariate polynomials} (elements of $R[x_1,x_2, \ldots, x_n]$, where
$R$ is an integral domain) \index{polynomial!multivariate}
can be represented in the form $a_1 x^{e_1} + a_2 x^{e_2} + \cdots + a_n x^{e_n}$, where
$a_i \in R \setminus \{0\}$ and for $e_i = (e_{i_1}, \ldots, e_{i_n})$,
one can write $x^{e_i}$ for $x_1^{e_{i_1}} x_2^{e_{i_2}} \cdots x_n^{e_{i_n}}$.
At the form level, one can distinguish the following representations:
\begin{enumerate}\index{polynomial!representation}
\item Expanded or factored representation, where the products are multiplied out or the expression is in
product form. Compare
\begin{itemize}
\item ${x}^{2}y-{x}^{2}+y-1 \koz, $ and
\item $\left ({x}^{2}+1\right )\left (y-1\right ) \koz .$
\end{itemize}
\item Recursive or distributive representation (only for multivariate polynomials).
In the bivariate case, the polynomial
$f(x,y)$ can be viewed as an element of the domain $R[x,y]$, $(R[x])[y]$ or $(R[y])[x]$. Compare
\begin{itemize}
\item ${x}^{2}{y}^{2}+{x}^{2}+x{y}^{2}-1 \koz ,$
\item $({x}^{2}+x){y}^{2}+{x}^{2}-1 \koz ,$ and
\item $({y}^{2}+1){x}^{2}+{y}^{2}x-1\koz .$
\end{itemize}
\end{enumerate}
At the data structure level, there can be dense or sparse representation. Either all terms are considered, or
only those having non-zero coefficients. Compare
$x^4 +0 x^3+0 x^2 +0 x -1$ and $x^4 -1$.
In practice, multivariate polynomials are represented mainly in the sparse way.
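A common sparse data structure, sketched here in Python under our own naming, maps exponent vectors to non-zero coefficients; multiplication then adds exponent vectors componentwise:

```python
# exponent vector -> non-zero coefficient, e.g. x^2*y - x^2 + y - 1 in Z[x,y]
p = {(2, 1): 1, (2, 0): -1, (0, 1): 1, (0, 0): -1}

def mul(p, q):
    """Product of sparse multivariate polynomials."""
    r = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            e = tuple(i + j for i, j in zip(e1, e2))   # x^e1 * x^e2 = x^(e1+e2)
            r[e] = r.get(e, 0) + c1 * c2
    return {e: c for e, c in r.items() if c != 0}      # drop cancelled terms

# the factored form (x^2 + 1)(y - 1) expands to p
assert mul({(2, 0): 1, (0, 0): 1}, {(0, 1): 1, (0, 0): -1}) == p
```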
The traditional approach of representing \textit{power series} of the form
$\sum_{i=0}^{\infty} a_i x^i$ is to truncate at some specified point,
and then to regard them as univariate polynomials.
\index{power series}However, this is not a real representation, since
many power series can have the same representation. To overcome this
disadvantage, there exists a technique of representing power series by
a procedure generating all coefficients (rather than by any
finite list of coefficients). The generating function is a computable
function $f$ such that $f(i)=a_i$. To perform an operation with power
series, it is enough to know how to compute the coefficients of the
resulting series from the coefficients of the operands. For example,
the coefficients $h_i$ of the product of the power series $f$ and $g$
can be computed as $h_i = \sum_{k=0}^{i} f_k g_{i-k}$. In that way,
the coefficients are computed when they are needed. This technique is
called \ki{lazy evaluation.}\index{lazy evaluation}
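The lazy product of two power series can be sketched in Python by representing a series as its coefficient function; `exp_series` and the caching strategy are our illustrative choices:

```python
from fractions import Fraction
from functools import lru_cache
from math import factorial

def exp_series(i):
    # coefficient a_i of the series of e^x, namely 1/i!
    return Fraction(1, factorial(i))

def series_product(f, g):
    """h_i = sum_{k=0}^{i} f_k * g_{i-k}; each h_i is computed on demand."""
    @lru_cache(maxsize=None)       # remember coefficients already computed
    def h(i):
        return sum(f(k) * g(i - k) for k in range(i + 1))
    return h

h = series_product(exp_series, exp_series)   # the series of e^{2x}
assert h(3) == Fraction(8, 6)                # 2^3 / 3!
```

No finite truncation point is ever fixed: asking for `h(100)` later simply triggers the computation of the needed coefficients.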
Since computer algebra programs compute in a symbolic way with
arbitrary accuracy, in addition to examining \textit{time
complexity} of the algorithms it is also important to examine their
\textit{space complexity.}\footnote{ We consider the running time as
the number of operations executed, according to the
RAM-model. Considering the Turing-machine model, and using machine
words with constant length, we do not have this problem, since in this
case space is always bounded by the time.} Consider the simple
problem of solving a linear system having $n$ equations and $n$
unknowns with integer coefficients, each of which requires $\omega$ computer
words of storage. Using Gaussian elimination, it is easy to see that
each coefficient of the reduced linear system may need $2^{n-1}
\omega$ computer words of storage.\nevindex{Gauss, Johann Carl
Friedrich (1777--1855)} In other words, Gaussian elimination suffers
from exponential growth in the size of the coefficients. Note that if we applied the same method to linear systems having polynomial
coefficients, we would have exponential growth both in the size of the
numerical coefficients of the polynomials and in
the degrees of the polynomials themselves. In spite of the observed
exponential growth, the final result of the Gaussian elimination will
always be of reasonable size because by Cramer's
rule\nevindex{Cramer, Gabriel (1704--1752)} we know that each
component of the solution to such a linear system is a ratio of two
determinants, each of which requires approximately $n \omega$ computer
words. The phenomenon described above is called \ki{intermediate
expression swell.}\inddef{intermediate expression swell} This often
appears in computer algebra algorithms.
\begin{pelda}
Using only integer arithmetic we solve the following system of linear equations:
\begin{eqnarray*}
37x+22y+22z&=&1\koz ,\\
31x-14y-25z&=&97\koz ,\\
-11x+13y+15z&=&-86\koz .
\end{eqnarray*}
First, we eliminate variable $x$ from the second equation. We multiply the first row by $31$, the second by
$-37$ and take their sum. If we apply this strategy to the third equation to eliminate variable $x$, we get
the following system.
\begin{eqnarray*}
37x+22y+22z&=&1\koz ,\\
1200y+1607z&=&-3558\koz ,\\
723y+797z&=&-3171\koz .
\end{eqnarray*}
Now, we eliminate variable $y$ multiplying the second equation by $723$, the third one by $-1200$,
then taking their sum. The result is
\begin{eqnarray*}
37x+22y+22z&=&1\koz ,\\
1200y+1607z&=&-3558\koz ,\\
205461z&=&1232766\koz .
\end{eqnarray*}
Continuing this process of eliminating variables, we get the following system:
\begin{eqnarray*}
1874311479932400x&=&5622934439797200\koz ,\\
246553200y&=&-2712085200\koz ,\\
205461z&=&1232766\koz .
\end{eqnarray*}
After some simplification, we get that $x=3,y=-11,z=6$. If we apply greatest common divisor computations
in each elimination step, the coefficient growth will be less drastic.
\end{pelda}
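The elimination above can be reproduced with a small Python sketch (our cross-multiplication variant; the signs of some intermediate rows differ from the worked example, the solution does not):

```python
def cross_eliminate(A):
    """Fraction-free elimination over Z: replace row j by
    A[i][i]*row_j - A[j][i]*row_i, so only integers ever appear."""
    n = len(A)
    for i in range(n):
        for j in range(i + 1, n):
            p, q = A[i][i], A[j][i]
            A[j] = [p * A[j][k] - q * A[i][k] for k in range(len(A[j]))]
    return A

# augmented matrix of the system in the example
U = cross_eliminate([[37, 22, 22, 1],
                     [31, -14, -25, 97],
                     [-11, 13, 15, -86]])
assert (U[2][2], U[2][3]) == (205461, 1232766)   # the growth seen above

# back substitution; every division is exact for this system
z = U[2][3] // U[2][2]
y = (U[1][3] - U[1][2] * z) // U[1][1]
x = (U[0][3] - U[0][1] * y - U[0][2] * z) // U[0][0]
assert (x, y, z) == (3, -11, 6)
```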
In order to avoid the intermediate expression swell phenomenon, one uses modular techniques.
Instead of performing the operations in the base structure $R$ (e.g.\ Euclidean ring),
they are performed in some factor structure, and then, the result is transformed back to $R$
(Figure \ref{altsema}). In general, modular computations can be performed efficiently, and the
reconstruction steps can be made with some interpolation strategy.
\begin{figure}
\unitlength 1mm
\begin{center}
\begin{picture}(80,60)(0,0)
\thicklines
\put(0,40){\makebox(10,10)[cc]{problem in $R$}}
\put(0,5){\makebox(10,10)[cc]{solution}}
\put(0,0){\makebox(10,10)[cc]{in $R$}}
\put(70,40){\makebox(10,10)[cc]{problem in $R/\langle m\rangle $}}
\put(70,5){\makebox(10,10)[cc]{solution in}}
\put(70,0){\makebox(10,10)[cc]{$R/\langle m\rangle$}}
\put(33,43){\makebox(10,10)[cc]{modular}}
\put(33,37){\makebox(10,10)[cc]{reduction}}
\put(35,-1){\makebox(10,10)[cc]{reconstruction}}
\put(7,30){\makebox(10,10)[lc]{direct}}
\put(7,25){\makebox(10,10)[lc]{computations}}
\put(63,20){\makebox(10,10)[rc]{modular}}
\put(63,15){\makebox(10,10)[rc]{computations}}
\put(24,45){\vector(1,0){30}}
\put(54,7){\vector(-1,0){30}}
\put(5,40){\vector(0,-1){25}}
\put(75,40){\vector(0,-1){25}}
\end{picture}
\end{center}
\caption{The general scheme of modular computations.}\label{altsema}
\end{figure}
Note that modular algorithms are very common in computer algebra, but they do not constitute a universal technique.
\section{Common roots of polynomials}\label{polkozgyok}
Let $R$ be an integral domain and let
\begin{eqnarray}
f(x)& = &f_0+f_1x+\cdots +f_{m-1}x^{m-1}+f_mx^m\in R[x],\ f_m\neq 0\koz ,\label{fpol}\\
g(x)& = &g_0+g_1x+\cdots +g_{n-1}x^{n-1}+g_nx^n\in R[x],\ g_n\neq 0\label{gpol}
\end{eqnarray}
be arbitrary polynomials with $n,m\in \N, n+m>0$. Let us give a necessary
and sufficient condition for $f$ and $g$ sharing a common root in $R$.
\subsection{Classical and extended Euclidean algorithm}
If $T$ is a field, then $T[x]$ is a Euclidean domain. Recall that we
call an integral domain $R$ Euclidean together with the function
$\varphi:R\setminus \{0\}\rightarrow \N$ if for all $a,b\in R\ (b\neq
0)$, there exist $q,r\in R$ such that $a=qb+r$, where $r=0$ or
$\varphi(r)<\varphi(b)$; furthermore, for all $a,b\in R\setminus \{0\}$,
we have $\varphi(ab)\geq \varphi(a)$. The element $q=a\,
\textrm{quo}\, b$ is called the \ki{quotient} and $r=a\,
\textrm{rem}\, b$ is called the \ki{remainder.}
\inddef{quotient}\inddef{remainder} If we are working
in a Euclidean domain, we would like the greatest common divisor to be
unique. For this, a unique element has to be chosen from each
equivalence class obtained by multiplying by the units of the ring
$R$. (For example, in the case of integers, we always choose the
non-negative element from the classes $\{0\}, \{-1,1\},\{-2,2\},\ldots$.)
Thus, every element $a\in R$ has a unique form
\[a=\textrm{unit}(a)\cdot \textrm{normal}(a)\koz ,\] where
$\textrm{normal}(a)$ is called the \ki{normal form} of
$a$.\inddef{normal form} Let us consider a Euclidean domain $R=T[x]$
over a field $T$. Let the normal form of $a\in R$ be the corresponding
normalised monic polynomial, that is,
$\textrm{normal}(a)=a/\textrm{lc}(a)$, where $\textrm{lc}(a)$ denotes
the leading coefficient of polynomial $a$.\inddef{leading
coefficient} Let us summarise these important cases:
\begin{itemize}
\item If $R=\Z$, then $\textrm{unit}(a)=\textrm{sgn}(a)\ (a\neq 0)$ and $\varphi(a)=\textrm{normal}(a)=|a|$,
\item if $R=T[x]$ ($T$ is a field) then
$\textrm{unit}(a)=\textrm{lc}(a)$ (the leading coefficient of
polynomial $a$ with the convention $\textrm{unit}(0)=1$),
$\textrm{normal}(a)=a/\textrm{lc}(a)$ and $\varphi(a)=\deg{a}$.
\end{itemize}
The following algorithm computes the greatest common divisor of two
arbitrary elements of a Euclidean domain. Note that this is one of
the most ancient algorithms of the world, already known to Euclid
around 300 B.C. \nevindex{Euclid of Alexandria (ca. 365--300 B.C.)}
\begin{alg}{Classical-Euclidean($a,b$)}\inddef{\textsc{Classical-Euclidean}}
1 \' $c \leftarrow \textrm{normal}(a)$\\
2 \' $d \leftarrow \textrm{normal}(b)$\\
3 \' \key{while} \= $d \neq 0$\\
4 \' \> \key{do} \= $r\leftarrow c\ \textrm{rem}\ d$\\
5 \' \> \> $c\leftarrow d$\\
6 \' \> \> $d\leftarrow r$\\
7 \' \key{return} $\textrm{normal}(c)$
\end{alg}
\noindent In the ring of integers, the remainder in line 4 becomes $c-\lfloor
c/d\rfloor d$. When $R=T[x]$, where $T$ is a field, the remainder in
line 4 can be calculated by the algorithm
\textsc{Euclidean-Division-Univariate-Polynomials}($c,d$),
the analysis of which is left to Exercise
\ref{elsgy}.\index{\textsc{Euclidean-Division-Univariate-Polynomials}}
Figure \ref{kleuk} shows the operation of the
\textsc{Classical-Euclidean} algorithm in $\Z$ and $\Q[x]$.
Note that in $\Z$ the program only enters the \textbf{while} loop with
non-negative numbers and the remainder is always non-negative, so the
normalisation in line 7 is not needed.
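For $R=\Z$, the pseudocode admits a direct Python transcription (a sketch under our naming, using $\textrm{normal}(a)=|a|$):

```python
def classical_euclidean(a, b):
    # normal(a) = |a|, and c rem d is c % d in the ring of integers
    c, d = abs(a), abs(b)
    while d != 0:
        c, d = d, c % d
    return c

assert classical_euclidean(-18, 30) == 6   # matches part (a) of the figure
```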
Before examining the running time of the \textsc{Classical-Euclidean}
algorithm, we deal with an extended version of it.
\begin{algN}{Extended-Euclidean($a,b$)}\inddef{\textsc{Extended-Euclidean}}
1 \' $(r_0,u_0,v_0) \leftarrow (\textrm{normal}(a),1,0)$\\
2 \' $(r_1,u_1,v_1) \leftarrow (\textrm{normal}(b),0,1)$\\
3 \' \key{while} \= $r_1 \neq 0$ \\
4 \' \> \key{do} \= $q\leftarrow r_0\ \textrm{quo}\ r_1$\\
5 \' \> \> $r\leftarrow r_0-qr_1$\\
6 \' \> \> $u\leftarrow (u_0-qu_1)$\\
7 \' \> \> $v\leftarrow (v_0-qv_1)$\\
8 \' \> \> $(r_0,u_0,v_0)\leftarrow (r_1,u_1,v_1)$\\
9 \' \> \> $(r_1,u_1,v_1)\leftarrow (r,u,v)$\\
10 \' \key{return} $(\textrm{normal}(r_0),\ u_0/(\textrm{unit}(a)\cdot \textrm{unit}(r_0)),\ v_0/(\textrm{unit}(b)\cdot \textrm{unit}(r_0)))$
\end{algN}
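A Python sketch of the integer case follows (our naming; we assume non-zero inputs, so that $\textrm{unit}(a)=\textrm{sgn}(a)$ and $\textrm{unit}(r_0)=1$ after normalisation):

```python
def extended_euclidean(a, b):
    """Return (g, u, v) with g = gcd(a, b) = a*u + b*v over the integers."""
    r0, u0, v0 = abs(a), 1, 0          # normal(a), as in lines 1-2
    r1, u1, v1 = abs(b), 0, 1
    while r1 != 0:
        q = r0 // r1                   # r0 quo r1
        r0, u0, v0, r1, u1, v1 = r1, u1, v1, r0 - q * r1, u0 - q * u1, v0 - q * v1
    sgn = lambda t: (t > 0) - (t < 0)
    # line 10: divide u0 by unit(a) and v0 by unit(b); unit(r0) = 1 here
    return r0, u0 * sgn(a), v0 * sgn(b)

g, u, v = extended_euclidean(-18, 30)
assert g == 6 and u * -18 + v * 30 == 6
```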
\begin{figure}[!t]
\begin{center}
\begin{tabular}{cccccc}
iteration & $r$ & $c$ & $d$ \\ \hline
-- & -- & 18 & 30 \\
1 & 18 & 30 & 18 \\
2 & 12 & 18 & 12 \\
3 & 6 & 12 & 6 \\
4 & 0 & 6 & 0
\end{tabular}\\
\vspace{.2cm}\centerline{\footnotesize{(a) The operation of \textsc{Classical-Euclidean}($-18,30$).}}
\vspace{.1cm}
\begin{tabular}{cccccc}
iteration & $r$ & $c$ & $d$ \\ \hline
-- & -- & $x^4-\frac{17}{3}x^3+\frac{13}{3}x^2-\frac{23}{3}x+\frac{14}{3}$ & $x^3-\frac{20}{3}x^2+7x-2$ \\
1 & $4x^2-\frac{38}{3}x+\frac{20}{3}$ & $x^3-\frac{20}{3}x^2+7x-2$ & $4x^2-\frac{38}{3}x+\frac{20}{3}$ \\
2 & $-\frac{23}{4}x+\frac{23}{6}$ & $4x^2-\frac{38}{3}x+\frac{20}{3}$ & $-\frac{23}{4}x+\frac{23}{6}$ \\
3 & 0 & $-\frac{23}{4}x+\frac{23}{6}$ & 0
\end{tabular}\\
\vspace{.2cm}\centerline{\footnotesize{(b) The operation of}}
\centerline{\footnotesize{\textsc{Classical-Euclidean}($12x^4-68x^3+52x^2-92x+56$, $-12x^3+80x^2-84x+24$).}}
\end{center}
\caption{Illustration of the operation of the
\textsc{Classical-Euclidean} algorithm in $\Z$ and $\Q[x]$. In
case {\bf (a)}, the input is $a=-18,b=30,a,b\in \Z$. The first two
lines of the pseudocode compute the absolute values of the input
numbers. The loop between lines $3$ and $6$ is executed four times,
values $r$, $c$ and $d$ in these iterations are shown in the
table. The \textsc{Classical-Euclidean($-18$,$30$)} algorithm
outputs $6$ as result. In case {\bf (b)}, the input parameters
are $a=12x^4-68x^3+52x^2-92x+56,b=-12x^3+80x^2-84x+24\in \Q[x]$.
The first two lines compute the normal form of the polynomials, and
the \textbf{while} loop is executed three times. The output of the
algorithm is the polynomial $\textrm{normal}(c)=x-2/3$. }
\label{kleuk}
\end{figure}\abraindex{operation of the \textsc{Classical-Euclidean} algorithm}
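The run in part (b) of the figure can be reproduced by a Python sketch of \textsc{Classical-Euclidean} over $\Q[x]$, with exact rational coefficients stored in ascending order (our representation choice):

```python
from fractions import Fraction

def normal(p):
    """Divide by the leading coefficient; p[i] is the coefficient of x^i."""
    lc = Fraction(p[-1])
    return [Fraction(c) / lc for c in p]

def poly_rem(a, b):
    """Remainder of the Euclidean division of a by b over Q."""
    r = a[:]
    while len(r) >= len(b):
        q, shift = r[-1] / b[-1], len(r) - len(b)
        for i, c in enumerate(b):
            r[i + shift] -= q * c
        while r and r[-1] == 0:    # the leading term always cancels
            r.pop()
    return r

def poly_gcd(a, b):
    c, d = normal(a), normal(b)
    while d:                       # d == [] represents the zero polynomial
        c, d = d, poly_rem(c, d)
    return normal(c)

a = [56, -92, 52, -68, 12]         # 12x^4 - 68x^3 + 52x^2 - 92x + 56
b = [24, -84, 80, -12]             # -12x^3 + 80x^2 - 84x + 24
assert poly_gcd(a, b) == [Fraction(-2, 3), 1]    # normal(c) = x - 2/3
```

Using `Fraction` keeps every intermediate coefficient exact, so the growing denominators of the figure appear here as well.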
It is known that in the Euclidean domain $R$, the greatest common divisor of elements
$a,b\in R$ can be expressed in the form $\textrm{gcd}(a,b)=au+bv$ with
appropriate elements $u,v\in R$. However, this pair $u,v$ is not
unique. For if $u_0,v_0$ are appropriate, then so are $u_1=u_0+bt$ and
$v_1=v_0-at$ for all $t\in R$:
\[au_1+bv_1=a(u_0+bt)+b(v_0-at)=au_0+bv_0=\textrm{gcd}(a,b) \koz .\]
The \textsc{Classical-Euclidean} algorithm is extended in such a way that besides
the greatest common divisor it also outputs an appropriate pair $u,v\in R$ as discussed above.
Let $a,b\in R$, where $R$ is a Euclidean domain together with the
function $\varphi$. The equations \begin{equation}\label{beainv}
r_0=u_0a+v_0b\ \ \mbox{ and}\ \ \ r_1=u_1a+v_1b \end{equation} are
obviously fulfilled due to the initialisation in the first two lines
of the pseudocode \textsc{Extended-Euclidean}. We show that equations
(\ref{beainv}) are invariant under the transformations of the
\textbf{while} loop of the pseudocode. Let us presume that the
conditions (\ref{beainv}) are fulfilled before an iteration of the
loop. Then lines $4$--$5$ of the pseudocode imply \[
r=r_0-qr_1=u_0a+v_0b-q(au_1+bv_1)=a(u_0-qu_1)+b(v_0-qv_1) \koz , \]
hence, because of lines $6$--$7$, \[r=a(u_0-qu_1)+b(v_0-qv_1)=au+bv\koz.\]
Lines $8$--$9$ perform the following operations: $u_0,v_0$ take the
values of $u_1$ and $v_1$, then $u_1,v_1$ take the values of $u$ and
$v$, while $r_0,r_1$ take the values of $r_1$ and $r$. Thus, the
equalities in (\ref{beainv}) are also fulfilled after the iteration of
the \textbf{while} loop. Since $\varphi(r_1)<\varphi(r_0)$ in each
iteration of the loop, the series $\{\varphi(r_i)\}$ obtained in lines
$8$--$9$ is a strictly decreasing series of natural numbers, so sooner
or later the control steps out of the \textbf{while} loop. The
greatest common divisor is the last non-zero remainder in the series
of Euclidean divisions, that is, $r_0$ in lines $8$--$9$.
\begin{pelda}
Let us examine the series of remainders in the case of polynomials
\begin{eqnarray}
a(x)&=&63x^5+57x^4-59x^3+45x^2-8\koz ,\label{apol}\\
b(x)&=&-77x^4+66x^3+54x^2-5x+99\koz .\label{bpol}
\end{eqnarray}
\begin{eqnarray*}
r_0&=&x^5+\frac{19}{21}x^4-\frac{59}{63}x^3+\frac{5}{7}x^2-\frac{8}{63}\koz ,\\
r_1&=&x^4-\frac{6}{7}x^3-\frac{54}{77}x^2+\frac{5}{77}x-\frac{9}{7}\koz ,\\
r_2&=&\frac{6185}{4851}x^3+\frac{1016}{539}x^2+\frac{1894}{1617}x+\frac{943}{441}\koz ,
\end{eqnarray*}
\begin{eqnarray*}
r_3&=&\frac{771300096}{420796475} x^2 +\frac{224465568}{420796475} x + \frac{100658427}{38254225}\koz ,\\
r_4&=&-\frac{125209969836038125}{113868312759339264}x - \frac{3541728593586625}{101216278008301568}\koz ,\\
r_5&=&\frac{471758016363569992743605121}{180322986033315115805436875}\koz .
\end{eqnarray*}
The values of the variables $u_0,v_0$ before the execution of line $10$ are
\begin{eqnarray*}
u_0&=&\frac{113868312759339264}{125209969836038125}x^3-\frac{66263905285897833785656224}{81964993651506870820653125}x^2\\
&&-\frac{1722144452624036901282056661}{901614930166575579027184375}x+\frac{1451757987487069224981678954}
{901614930166575579027184375}\koz ,\\
v_0&=&-\frac{113868312759339264}{125209969836038125}x^4-\frac{65069381608111838878813536}{81964993651506870820653125}x^3\\
&&+\frac{178270505434627626751446079}{81964993651506870820653125}x^2+\frac{6380859223051295426146353}
{81964993651506870820653125}x\\
&&-\frac{179818001183413133012445617}{81964993651506870820653125}\koz .
\end{eqnarray*}
The return values are:
\begin{eqnarray*}
\textrm{gcd}(a,b)&=&1,\\
u&=&\frac{2580775248128}{467729710968369}x^3-\frac{3823697946464}{779549518280615}x^2\\
&&-\frac{27102209423483}{2338648554841845}x+\frac{7615669511954}{779549518280615}\koz ,\\
v&=&\frac{703847794944}{155909903656123}x^4+\frac{3072083769824}{779549518280615}x^3\\
&&-\frac{25249752472633}{2338648554841845}x^2-\frac{301255883677}{779549518280615}x+
\frac{25468935587159}{2338648554841845}\koz .
\end{eqnarray*}
\end{pelda}
\noindent
We can see that the size of the coefficients shows a drastic
growth. One might ask why we do not normalise in \textit{every}
iteration of the \textbf{while} loop. This idea leads to the
normalised version of the Euclidean algorithm for polynomials.
\begin{algN}{Extended-Euclidean-Normalised($a,b$)}\inddef{\textsc{Extended-Euclidean-Normalised}}
1 \' $e_0 \leftarrow \textrm{unit}(a)$\\
2 \' $(r_0,u_0,v_0) \leftarrow (\textrm{normal}(a),e_0^{-1},0)$\\
3 \' $e_1 \leftarrow \textrm{unit}(b)$\\
4 \' $(r_1,u_1,v_1) \leftarrow (\textrm{normal}(b),0,e_1^{-1})$\\
5 \' \key{while} \= $r_1 \neq 0$ \\
6 \' \> \key{do} \= $q\leftarrow r_0\ \textrm{quo}\ r_1$\\
7 \' \> \> $s\leftarrow r_0\ \textrm{rem}\ r_1$\\
8 \' \> \> $e\leftarrow \textrm{unit}(s)$\\
9 \' \> \> $r\leftarrow \textrm{normal}(s)$\\
10 \' \> \> $u\leftarrow (u_0-qu_1)/e$\\
11 \' \> \> $v\leftarrow (v_0-qv_1)/e$\\
12 \' \> \> $(r_0,u_0,v_0)\leftarrow (r_1,u_1,v_1)$\\
13 \' \> \> $(r_1,u_1,v_1)\leftarrow (r,u,v)$\\
14 \' \key{return} $\big(r_0,\ u_0,\ v_0\big)$
\end{algN}
\begin{pelda}
Let us look at the series of remainders and series $e$ obtained in
the \textsc{Extended-Euclidean-Normalised} algorithm in case of
the polynomials (\ref{apol}) and (\ref{bpol})
\begin{alignat*}{2}
\textstyle
r_0&=\ x^5+\frac{19}{21}x^4-\frac{59}{63}x^3+\frac{5}{7}x^2-\frac{8}{63},&\quad e_0=&63\koz ,\\
r_1&=\ x^4-\frac{6}{7}x^3-\frac{54}{77}x^2+\frac{5}{77}x-\frac{9}{7},&\quad e_1=&-77\koz ,\\
%0, -\frac{1}{77}\\
\textstyle r_2&=\ x^3+\frac{9144}{6185}x^2+\frac{5682}{6185}x+\frac{10373}{6185},&e_2=&\frac{6185}{4851}\koz ,\\
%\frac{77}{6185}, \frac{111}{6185}+\frac{63}{6185}x\\
\textstyle r_3&=\ x^2+\frac{2338183}{8034376}x+\frac{369080899}{257100032},&e_3=&\frac{771300096}{420796475}\koz ,\\
%\frac{2039213}{128550016}-\frac{748385}{110185728}x,
%\frac{1738997}{110185728}+\frac{819599}{257100032}x-\frac{204105}{36728576}x^2\\
\textstyle r_4&=\ x+\frac{166651173}{5236962760},&e_4=&-\frac{222685475860375}{258204790837504}\koz ,\\
%\frac{96949527}{13092406900}+\frac{710308907}{78554441400}x-
%\frac{77330869}{9819305175}x^2, \frac{72256009}{78554441400}+\frac{854527199}{78554441400}x-
%\frac{103551301}{26184813800}x^2-\frac{21090237}{3273101725}x^3\\
\textstyle r_5&=\ 1,&e_5=&\frac{156579848512133360531}{109703115798507270400}\koz .
%\frac{7615669511954}{779549518280615}-\frac{27102209423483}{2338648554841845}x-
%\frac{3823697946464}{779549518280615}x^2+\frac{2580775248128}{467729710968369}x^3,
%\frac{25468935587159}{2338648554841845}-\frac{301255883677}{779549518280615}x-
%\frac{25249752472633}{2338648554841845}x^2+\frac{3072083769824}{779549518280615}x^3+
%\frac{703847794944}{155909903656123}x^4
\end{alignat*}
Before the execution of line $14$ of the pseudocode, the values of the
variables $\textrm{gcd}(a,b)=r_0,u=u_0,v=v_0$ are
\begin{eqnarray*}
\textrm{gcd}(a,b)&=&1,\\
u&=&\frac{2580775248128}{467729710968369}x^3-\frac{3823697946464}{779549518280615}x^2\\
&&-\frac{27102209423483}{2338648554841845}x+\frac{7615669511954}{779549518280615}\koz ,\\
v&=&\frac{703847794944}{155909903656123}x^4+\frac{3072083769824}{779549518280615}x^3\\
&&-\frac{25249752472633}{2338648554841845}x^2-\frac{301255883677}{779549518280615}x+
\frac{25468935587159}{2338648554841845}\koz .
\end{eqnarray*}
\end{pelda}
\noindent
Looking at the size of the coefficients in $\Q[x]$, the advantage of
the normalised version is obvious, but the growth still could not be avoided.
To get a machine architecture-dependent description and
analysis of the \textsc{Extended-Euclidean-Normalised} algorithm, we
introduce the following notation. Let
\begin{eqnarray*}
\lambda(a)&=&\lfloor \log_2|a|/w\rfloor +1 \mbox{ if}\ a\in \Z\setminus \{0\},\mbox{ and}\ \lambda(0)=0\koz ,\\
\lambda(a)&=&\max\{\lambda(b),\lambda(c)\} \mbox{ if}\ a=b/c\in\Q,\ b,c\in\Z,\ \textrm{gcd}(b,c)=1\koz ,\\
\lambda(a)&=&\max\{\lambda(b),\lambda(a_0),\ldots,\lambda(a_n)\}\mbox{ if}\ a=\sum_{0\leq i\leq n}a_ix^i/b\in \Q[x]\koz ,\\
&&a_i\in \Z,\ b\in \N^+,\ \textrm{gcd}(b,a_0,\ldots,a_n)=1\koz ,
\end{eqnarray*}
where $w$ is the word length of the computer in bits. It is easy to
verify that if
$a,b\in \Z[x]$ and $c,d\in \Q$, then
\begin{eqnarray*}
\lambda(c+d)&\leq&\lambda(c)+\lambda(d)+1 \koz,\\
\lambda(a+b)&\leq&\max\{\lambda(a),\lambda(b)\}+1 \koz ,\\
\lambda(cd),\lambda(c/d)&\leq&\lambda(c)+\lambda(d) \koz ,\\
\lambda(ab)&\leq&\lambda(a)+\lambda(b)+\lambda(\min\{\deg a,\deg b\}+1) \koz .
\end{eqnarray*}
We give the following theorems without proof.
\begin{tetel}
If $a,b\in \Z$ and $\lambda(a)=m\geq n=\lambda(b)$, then the
\textsc{Classical-Euclidean} and \textsc{Extended-Euclidean}
algorithms require $O(mn)$ machine-word arithmetic operations.
\end{tetel}
\begin{tetel}
If $F$ is a field and $a,b\in F[x],\ \deg(a)=m\geq n=\deg(b)$, then
the \textsc{Classical-Euclidean}, \textsc{Extended-Euclidean} and
\textsc{Extended-Euclidean-Normalised} algorithms require $O(mn)$
elementary operations in $F$.
\end{tetel}
Can the growth of the coefficients be due to the choice of our
polynomials? Let us examine a single Euclidean division in the
\textsc{Extended-Euclidean-Normalised} algorithm. Let $a=bq+e^*r$,
where
\begin{eqnarray*}
a&=&x^m+\frac{1}{c}\sum_{i=0}^{m-1}a_ix^i\in \Q[x]\koz ,\\
b&=&x^n+\frac{1}{d}\sum_{i=0}^{n-1}b_ix^i\in \Q[x]\koz ,
\end{eqnarray*}
and $r\in \Q[x]$ are monic polynomials, $a_i,b_i\in \Z$, $e^*\in\Q$,
$c,d\in \N^+$, and consider the case $n=m-1$. Then
\begin{eqnarray}
q&=&x+\frac{a_{m-1}d-b_{n-1}c}{cd}\koz ,\nonumber\\
\lambda(q)&\leq& \lambda(a)+\lambda(b)+1\koz ,\nonumber\\
e^*r&=&a-qb=\frac{acd^2-xbcd^2-(a_{m-1}d-b_{n-1}c)bd}{cd^2}\koz ,\nonumber\\
\lambda(e^*r)&\leq& \lambda(a)+2\lambda(b)+3\koz .\label{er}
\end{eqnarray}
Note that the bound (\ref{er}) is valid for the coefficients of the
remainder polynomial $r$ as well, that is, $\lambda(r)\leq
\lambda(a)+2\lambda(b)+3$. So in case
$\lambda(a)\sim\lambda(b)$, the size of the coefficients may only grow
by a factor of around three in each Euclidean division. This estimate
seems accurate for pseudorandom polynomials; the interested reader
should consult Problem \ref{masgy}. The worst case estimate suggests that
\[\lambda(r_l)=O(3^l\cdot \max\{\lambda(a),\lambda(b)\}) \koz,\]
where $l$ denotes the running time of the
\textsc{Extended-Euclidean-Normalised} algorithm, practically, the
number of times the \textbf{while} loop is executed. Luckily, this
exponential growth is not achieved in each iteration of the loop, and
altogether the growth of the coefficients is bounded polynomially in
terms of the input. Later we will see that
the growth can be eliminated using modular techniques.
Summarising: after computing the greatest common divisor of the
polynomials $f,g\in R[x]$ ($R$ is a field), $f$ and $g$ have a common
root if and only if their greatest common divisor is not a
constant. For if $\textrm{gcd}(f,g)=d\in R[x]$ is not a constant, then the
roots of $d$ are also roots of $f$ and $g$, since $d$ divides $f$
and $g$. On the other hand, if $f$ and $g$ have a root in common, then
their greatest common divisor cannot be a constant, since the common
root is also a root of it.
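The common-root criterion is easy to experiment with. The following Python sketch (our own illustration; the list representation and function names are assumptions) runs the Euclidean algorithm in $\Q[x]$ with exact rational arithmetic, representing a polynomial as the list of its coefficients, highest degree first:

```python
from fractions import Fraction

def strip(f):
    """Drop leading zero coefficients; the zero polynomial is [0]."""
    while len(f) > 1 and f[0] == 0:
        f = f[1:]
    return f

def rem(f, g):
    """Remainder of f divided by g in Q[x] (g nonzero)."""
    f = [Fraction(c) for c in f]
    g = [Fraction(c) for c in g]
    while len(f) >= len(g) and f != [0]:
        c = f[0] / g[0]
        # subtract c * x^(deg f - deg g) * g; the leading term cancels
        pad = g + [Fraction(0)] * (len(f) - len(g))
        f = strip([a - c * b for a, b in zip(f, pad)])
    return f

def gcd_q(f, g):
    """Euclidean algorithm in Q[x]; returns the monic gcd (inputs not both zero)."""
    f, g = strip(list(f)), strip(list(g))
    while g != [0]:
        f, g = g, rem(f, g)
    return [Fraction(c) / f[0] for c in f]
```

The result is $[1]$ exactly when the inputs have no common root; the second assertion below uses the polynomials of the earlier example, which are relatively prime.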
\subsection{Primitive Euclidean algorithm}\label{pea}
If $R$ is a UFD (unique factorisation domain, where every non-zero,
non-unit element can be written as a product of irreducible elements
in a unique way up to reordering and multiplication by units) but not
necessarily a Euclidean domain, then the situation is more complicated,
since we may not have a Euclidean algorithm in $R[x]$. Luckily, several
useful methods are available, based on: (1) unique factorisation in $R[x]$;
(2) the existence of a greatest common divisor of two or more arbitrary elements.
The first possible method is to perform the calculations in the field
of fractions of $R$. The polynomial $p(x)\in R[x]$ is called a
\ki{primitive polynomial} if there is no prime in $R$ that divides all
coefficients of $p(x)$.\inddef{primitive!polynomial} A famous lemma by Gauss
\nevindex{Gauss, Johann Carl Friedrich (1777--1855)} says that the product
of primitive polynomials is also primitive, hence, for the primitive
polynomials $f,g$, $d=\textrm{gcd}(f,g)\in R[x]$ if and only if
$d=\textrm{gcd}(f,g)\in H[x]$, where $H$ denotes the field of
fractions of $R$. So we can calculate greatest common divisors in $H[x]$ instead of
$R[x]$. Unfortunately, this approach is not really effective because
arithmetic in the field of fractions $H$ is much more expensive than
in $R$.
A second possibility is an algorithm similar to the Euclidean
algorithm: in the ring of polynomials in one variable over an integral
domain, a so-called \ki{pseudo-division} can be
defined.\inddef{pseudo-division} Using the polynomials (\ref{fpol}),
(\ref{gpol}), if $m\geq n$, then there exist $q,r\in R[x]$, such that
\[g_n^{m-n+1}f = gq+r \koz ,\]
where $r=0$ or $\deg{r}<\deg{g}$. The polynomial $q$ is called the
\ki{pseudo-quotient} of $f$ and $g$ and $r$ is called the
\ki{pseudo-remainder.}\inddef{pseudo-quotient}
\inddef{pseudo-remainder} The notation is $q=\textrm{pquo}(f,g),
r=\textrm{prem}(f,g)$.
\begin{pelda}
Let
\begin{eqnarray}
f(x)&=&12x^4-68x^3+52x^2-92x+56\in \Z[x]\koz ,\label{primf}\\
g(x)&=&-12x^3+80x^2-84x+24\in \Z[x]\label{primg}\koz .
\end{eqnarray}
Then $\textrm{pquo}(f,g)=-144(x+1),\ \textrm{prem}(f,g)=1152(6x^2-19x+10)$.
\end{pelda}
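A hedged Python sketch of pseudo-division (our own rendering; the coefficient-list representation is an assumption): we premultiply $f$ by $g_n^{m-n+1}$ and perform ordinary division, the existence statement above guaranteeing that quotient and remainder have integer coefficients.

```python
from fractions import Fraction

def pseudo_divmod(f, g):
    """Pseudo-division in Z[x]: returns (q, r) with
    lc(g)**(m-n+1) * f == g*q + r and deg r < deg g.
    Polynomials are integer coefficient lists, highest degree first; m >= n."""
    m, n = len(f) - 1, len(g) - 1
    r = [Fraction(g[0] ** (m - n + 1) * c) for c in f]
    q = []
    for _ in range(m - n + 1):
        c = r[0] / g[0]
        q.append(c)
        for i in range(1, n + 1):
            r[i] -= c * g[i]
        r.pop(0)              # the leading coefficient has been cancelled
    assert all(c.denominator == 1 for c in q + r)
    return [int(c) for c in q], [int(c) for c in r]
```

On the polynomials of the example this reproduces $\textrm{pquo}(f,g)=-144(x+1)$ and $\textrm{prem}(f,g)=1152(6x^2-19x+10)$.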
On the other hand, each polynomial $f(x)\in R[x]$ can be written in a
unique form
\[f(x)=\textrm{cont}(f)\cdot \textrm{pp}(f)\]
up to a unit factor, where $\textrm{cont}(f)\in R$ and
$\textrm{pp}(f)\in R[x]$ are primitive polynomials. In this case,
$\textrm{cont}(f)$ is called the \ki{content,} $\textrm{pp}(f)$ is
called the \ki{primitive part} of
$f(x)$.\inddef{primitive!part}\inddef{content} The uniqueness of the
form can be achieved by the normalisation of units. For example, in the
case of integers, we always choose the positive ones from the
equivalence classes of $\Z$.
The following algorithm performs a series of pseudo-divisions. The
algorithm uses the function $\textrm{prem}()$, which computes the
pseudo-remainder, and it assumes that we can calculate greatest common divisors in $R$,
contents and primitive parts in $R[x]$. The input is $a,b\in R[x]$,
where $R$ is a UFD. The output is the polynomial $\textrm{gcd}(a,b)\in
R[x]$.
\vspace{-2pt}
\begin{alg}{Primitive-Euclidean($a,b$)}\inddef{\textsc{Primitive-Euclidean}}
1 \' $c \leftarrow \textrm{pp}(a)$\\
2 \' $d \leftarrow \textrm{pp}(b)$\\
3 \' \key{while} \= $d \neq 0$ \\
4 \' \> \key{do} \= $r\leftarrow \textrm{prem}(c,d)$ \\
5 \' \> \> $c\leftarrow d$ \\
6 \' \> \> $d\leftarrow \textrm{pp}(r)$\\
7 \' $\gamma\leftarrow \textrm{gcd}(\textrm{cont}(a),\textrm{cont}(b))$\\
8 \' $\delta \leftarrow \gamma c$ \\
9 \' \key{return} $\delta$
\end{alg}
\vspace{-2pt}
\noindent
The operation of the algorithm is illustrated by Figure
\ref{preuk}. The running time of the \textsc{Primitive-Euclidean}
algorithm is the same as the running time of the previous versions of
the Euclidean algorithm.
\begin{figure}
\begin{center}
\begin{tabular}{cccc}
iteration & $r$ & $c$ & $d$ \\ \hline
-- & -- & $3x^4-17x^3+13x^2-23x+14$ & $-3x^3+20x^2-21x+6$ \\
1 & $108x^2-342x+180$ & $-3x^3+20x^2-21x+6$ & $6x^2-19x+10$ \\
2 & $621x-414$ & $6x^2-19x+10$ & $3x-2$ \\
3 & 0 & $3x-2$ & 0
\end{tabular}
\end{center}
\caption{The illustration of the operation of the
\textsc{Primitive-Euclidean} algorithm with input
$a(x)=12x^4-68x^3+52x^2-92x+56,\ b(x)=-12x^3+80x^2-84x+24\in \Z[x]$.
The first two lines of the program compute the primitive parts of
the polynomials. The loop between lines $3$ and $6$ is executed
three times, the table shows the values of $r$, $c$ and $d$ in the
iterations. In line $7$, variable $\gamma$ equals
$\textrm{gcd}(4,4)=4$. The \textsc{Primitive-Euclidean}($a,b$)
algorithm returns $4\cdot (3x-2)$ as result.}\label{preuk}
\end{figure}\abraindex{operation of the \textsc{Primitive-Euclidean} algorithm}
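The algorithm translates to Python as follows (a sketch under our own representation assumptions; `prem` computes the pseudo-remainder defined above and is reproduced here so that the fragment is self-contained):

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def prem(f, g):
    """Pseudo-remainder in Z[x] (coefficient lists, highest first, deg f >= deg g)."""
    m, n = len(f) - 1, len(g) - 1
    r = [Fraction(g[0] ** (m - n + 1) * c) for c in f]
    for _ in range(m - n + 1):
        c = r[0] / g[0]
        for i in range(1, n + 1):
            r[i] -= c * g[i]
        r.pop(0)
    r = [int(c) for c in r] or [0]
    while len(r) > 1 and r[0] == 0:
        r.pop(0)
    return r

def cont(f):
    """Content: the (non-negative) gcd of the coefficients."""
    return reduce(gcd, (abs(c) for c in f), 0)

def pp(f):
    """Primitive part f / cont(f); pp(0) is taken to be 0."""
    c = cont(f)
    return [x // c for x in f] if c else [0]

def primitive_euclidean(a, b):
    """The Primitive-Euclidean algorithm in Z[x]."""
    c, d = pp(a), pp(b)
    while d != [0]:
        r = prem(c, d)
        c, d = d, pp(r)
    return [gcd(cont(a), cont(b)) * x for x in c]
```

On the inputs of the figure the function returns $4\cdot(3x-2)=12x-8$.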
The \textsc{Primitive-Euclidean} algorithm is very important because
the ring \linebreak
$R[x_1,x_2,\ldots, x_t]$ of multivariate polynomials is a UFD,
so we apply the algorithm recursively, e.g.\ in
$R[x_2,\ldots,x_t][x_1]$, using computations in the UFDs
$R[x_2,\ldots,x_t],\ldots,$ \linebreak $R[x_t]$. In other words, the recursive view
of multivariate polynomial rings leads to the recursive application of
the \textsc{Primitive-Euclidean} algorithm in a straightforward way.
We may note that, just as before, the algorithm exhibits a growth in the
coefficients.
Let us take a detailed look at the UFD $\Z[x]$. The bound on the size
of the coefficients of the greatest common divisor is given by the following theorem,
which we state without proof.
\begin{tetel}[Landau-Mignotte]\nevindex{Landau, Edmund Georg Hermann (1877--1938)}\nevindex{Mignotte, Maurice}
Let $a(x)=\sum_{i=0}^m a_ix^i$, $b(x)=\sum_{i=0}^n b_ix^i\in \Z[x]$, $a_m\neq 0\neq b_n$ and
$b(x)\mid a(x)$. Then
\[\sum_{i=0}^n|b_i|\leq 2^n\bigg| \frac{b_n}{a_m}\bigg| \sqrt{\sum_{i=0}^m a_i^2}\koz .\]
\end{tetel}
\begin{kov}\label{LM-kov}
With the notations of the previous theorem, the absolute value of any
coefficient of the polynomial $\textrm{gcd}(a,b)\in \Z[x]$ is smaller
than
\[2^{\min\{m,n\}}\cdot\textrm{gcd}(a_m,b_n)\cdot \min\bigg\{ \frac{1}{|a_m|}\sqrt{\sum_{i=0}^m a_i^2},
\frac{1}{|b_n|}\sqrt{\sum_{i=0}^n b_i^2}\bigg\}\koz .\]
\end{kov}
\begin{biz}
The greatest common divisor of $a$ and $b$ obviously divides both $a$ and $b$, and its degree is
at most the minimum of their degrees. Furthermore, the leading coefficient
of the greatest common divisor divides $a_m$ and $b_n$, so it also divides
$\textrm{gcd}(a_m,b_n)$.
\end{biz}
\begin{pelda}
Corollary \ref{LM-kov} implies that the absolute value of the
coefficients of the greatest common divisor is at most $\lfloor
32/9\sqrt{3197}\rfloor=201$ for the polynomials (\ref{apol}),
(\ref{bpol}), and at most $\lfloor 32\sqrt{886}\rfloor=952$ for the
polynomials (\ref{primf}) and (\ref{primg}).
\end{pelda}
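The bound of Corollary \ref{LM-kov} can be evaluated exactly with integer arithmetic. The following Python sketch (our own, with assumed names) reproduces the two values of the example:

```python
from math import gcd, isqrt

def lm_bound(a, b):
    """Landau-Mignotte bound on the absolute value of the coefficients of
    gcd(a, b) in Z[x]; a, b are integer coefficient lists, highest first."""
    m, n = len(a) - 1, len(b) - 1
    C = 2 ** min(m, n) * gcd(abs(a[0]), abs(b[0]))
    def part(f):
        s = sum(c * c for c in f)
        # floor(C * sqrt(s) / |lc(f)|) computed exactly: isqrt(C*C*s) // |lc(f)|
        return isqrt(C * C * s) // abs(f[0])
    return min(part(a), part(b))
```

(`math.isqrt` requires Python 3.8 or later; using it keeps the floor computation free of floating-point error.)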
\subsection{The resultant}\label{rezultans}\index{resultant method}
The following method describes the necessary and sufficient
conditions for the common roots of (\ref{fpol}) and (\ref{gpol}) in
the most general context. As a further advantage, it can be applied to
solve algebraic equation systems of higher degree.
Let $R$ be an integral domain and $H$ its field of fractions. Let us
consider the smallest extension $K$ of $H$ over which both $f(x)$ of
(\ref{fpol}) and $g(x)$ of (\ref{gpol}) split into linear factors.
Let us denote the roots (in $K$) of the polynomial $f(x)$ by
$\alpha_1,\alpha_2,\ldots,\alpha_m$, and the roots of $g(x)$ by
$\beta_1,\beta_2,\ldots,\beta_n$. Let us form the following product:
\begin{eqnarray*}
\textrm{res}(f,g)&=& f_m^ng_n^m(\alpha_1-\beta_1)(\alpha_1-\beta_2)\cdots (\alpha_1-\beta_n)\\
&&\cdot(\alpha_2-\beta_1)(\alpha_2-\beta_2)\cdots (\alpha_2-\beta_n)\\
&&\quad\vdots\\
&&\cdot(\alpha_m-\beta_1)(\alpha_m-\beta_2)\cdots (\alpha_m-\beta_n)\\
&=&f_m^ng_n^m\prod_{i=1}^m\prod_{j=1}^n (\alpha_i-\beta_j)\koz .
\end{eqnarray*}
It is obvious that $\textrm{res}(f,g)$ equals $0$ if and only if $\alpha_i=\beta_j$ for
some $i$ and $j$, that is, $f$ and $g$ have
a common root. The product $\textrm{res}(f,g)$ is called the \ki{resultant}
of the polynomials $f$ and $g$.\inddef{resultant} Note that the value
of the resultant depends on the order of $f$ and $g$, but the
resultants obtained in the two ways can only differ in sign.
\begin{eqnarray*}
\textrm{res}(g,f)&=&g_n^mf_m^n\prod_{j=1}^n\prod_{i=1}^m (\beta_j-\alpha_i)\\
&=&(-1)^{mn}f_m^ng_n^m\prod_{i=1}^m\prod_{j=1}^n (\alpha_i-\beta_j)=(-1)^{mn}\textrm{res}(f,g) \koz .
\end{eqnarray*}
Evidently, this form of the resultant cannot be applied in practice, since it
presumes that the roots are known. Let us examine the different
forms of the resultant. Since
\begin{eqnarray*}
f(x)&=&f_m(x-\alpha_1)(x-\alpha_2)\cdots (x-\alpha_m)\quad (f_m\neq 0)\koz ,\\
g(x)&=&g_n(x-\beta_1)(x-\beta_2)\cdots (x-\beta_n)\quad (g_n\neq 0)\koz ,
\end{eqnarray*}
we have
\begin{eqnarray*}
g(\alpha_i)&=&g_n(\alpha_i-\beta_1)(\alpha_i-\beta_2)\cdots (\alpha_i-\beta_n)\\
&=&g_n\prod_{j=1}^n (\alpha_i-\beta_j)\koz .
\end{eqnarray*}
Thus,
\begin{eqnarray*}
\textrm{res}(f,g)&=&f_m^n \prod_{i=1}^m \bigg( g_n\prod_{j=1}^n (\alpha_i-\beta_j)\bigg)\\
&=&f_m^n\prod_{i=1}^m g(\alpha_i)=(-1)^{mn}g_n^m\prod_{j=1}^n f(\beta_j)\koz .
\end{eqnarray*}
Although this form looks much friendlier, it still requires the roots of at least one of the polynomials. Next we examine how
the resultant may be expressed only in terms of the coefficients of the
polynomials. This leads to the Sylvester form of the
resultant.\nevindex{Sylvester, James Joseph (1814--1897)}
Let us presume that polynomial $f$ in (\ref{fpol}) and
polynomial $g$ in (\ref{gpol}) have a common root. This means that
there exists a number $\alpha\in K$ such that
\begin{eqnarray*}
f(\alpha)&=&f_m\alpha^m+f_{m-1}\alpha^{m-1}+\cdots +f_1\alpha+f_0=0\koz ,\\
g(\alpha)&=&g_n\alpha^n+g_{n-1}\alpha^{n-1}+\cdots +g_1\alpha+g_0=0\koz .
\end{eqnarray*}
Multiply these equations by the numbers
$\alpha^{n-1},\alpha^{n-2},$ \linebreak
$\ldots,\alpha,1$ and $\alpha^{m-1}$,
$\alpha^{m-2},\ldots,\alpha,1,$ respectively. We get $n$ equations
from the first one and $m$ from the second one. Consider these $m+n$
equations as a homogeneous system of linear equations in $m+n$
indeterminates. This system has the obviously non-trivial solution
$\alpha^{m+n-1}, \alpha^{m+n-2}$, $\ldots,\alpha,1$. It is a
well-known fact that a homogeneous system with as many equations as
indeterminates has non-trivial solutions if and only if its
determinant is zero. We get that $f$ and $g$ can only have common
roots if the determinant
\begin{equation} \label{rezu}
D=
\begin{array}{|ccccccc|c}
f_m & \cdots & \cdots & \cdots & f_0 & & & \uparrow \\
& \ddots & & & & \ddots & & n \\
& & f_m & \cdots & \cdots & \cdots & f_0 & \downarrow \\
g_n & \cdots & \cdots & g_0 & & & & \uparrow \\
& \ddots & & & \ddots & & & m \\
& & \ddots & & & \ddots & & \\
& & & g_n & \cdots & \cdots & g_0 & \downarrow
\end{array}
\end{equation}
equals $0$ (there are 0s everywhere outside the dotted areas). Thus, a
necessary condition for the existence of common roots is that the
determinant $D$ of order $(m+n)$ is 0. Below we prove that $D$ equals
the resultant of $f$ and $g$, hence, $D=0$ is also a sufficient condition
for common roots. The determinant (\ref{rezu}) is called the
\ki{Sylvester form} of the resultant.\index{resultant!Sylvester
form}\nevindex{Sylvester, James Joseph (1814--1897)}
\begin{tetel}
Using the above notation
\[D=f_m^n\prod_{i=1}^mg(\alpha_i)\koz .\]
\end{tetel}
\begin{biz}
We will proceed by induction on $m$. If $m=0$, then $f=f_m=f_0$, so the
right-hand side is $f_0^n$. The left-hand side is a determinant of
order $n$ with $f_0$ everywhere in the diagonal, and 0 everywhere
else. Thus, $D=f_0^n$, so the statement is true. In the following, presume
that $m>0$ and the statement is true for $m-1$. If we take the
polynomial
\[f^{*}(x)=f_m(x-\alpha_1)\cdots (x-\alpha_{m-1})=f^*_{m-1}x^{m-1}+f^*_{m-2}x^{m-2}+\cdots +f_1^*x+f^*_0\]
instead of $f$, then $f^*$ and $g$ fulfil the condition:
\begin{equation*}
D^*=
\begin{array}{|ccccccc|}
f^*_{m-1} & \cdots & \cdots & \cdots & f^*_0 & & \\
& \ddots & & & & \ddots & \\
& & f^*_{m-1} & \cdots & \cdots & \cdots & f^*_0 \\
g_n & \cdots & \cdots & g_0 & & & \\
& \ddots & & & \ddots & & \\
& & \ddots & & & \ddots & \\
& & & g_n & \cdots & \cdots & g_0
\end{array}=f_{m-1}^{*n}\prod_{i=1}^{m-1}g(\alpha_i)\koz .
\end{equation*}
Since $f=f^*\cdot(x-\alpha_m)$, the coefficients of $f$ and $f^*$ satisfy
\[f_m=f_{m-1}^*, f_{m-1}=f^*_{m-2}-f^*_{m-1}\alpha_m, \ldots, f_1=f^*_0-f_1^*\alpha_m, f_0=-f^*_0\alpha_m .\]
Thus,
\begin{equation*}
D=
\begin{array}{|ccccccc|}
f^*_{m-1} & f^*_{m-2}-f^*_{m-1}\alpha_m & \cdots & \cdots & -f^*_0\alpha_m & & \\
& \ddots & & & & \ddots & \\
& & f^*_{m-1} & \cdots & \cdots & \cdots & -f^*_0\alpha_m\\
g_n & \cdots & \cdots & g_0 & & & \\
& \ddots & & & \ddots & & \\
& & \ddots & & & \ddots & \\
& & & g_n & \cdots & \cdots & g_0
\end{array}\koz .
\end{equation*}
We transform the determinant in the following way: add $\alpha_m$
times the first column to the second column, then add $\alpha_m$
times the new second column to the third column, etc. This way the
$\alpha_m$-s disappear from the first $n$ rows, so the first $n$
rows of $D^*$ and the transformed $D$ are identical. In the last $m$
rows, subtract $\alpha_m$ times the second one from the first one, and
similarly, always subtract $\alpha_m$ times a row from the row right
above it. In the end, $D$ becomes
\begin{equation*}
D=
\begin{array}{|cccccccc|}
f^*_{m-1} & \cdots & \cdots & \cdots & f^*_0 & & & \\
& \ddots & & & & \ddots & & \\
& & f^*_{m-1} & \cdots & \cdots & \cdots & f^*_0 & \\
g_n & \cdots & \cdots & g_0 & & & & \\
& \ddots & & & \ddots & & & \\
& & \ddots & & & \ddots & & \\
& & & g_n & \cdots & \cdots & g_0 & \\
& & & & g_n & g_n\alpha_m+g_{n-1}& \cdots & g(\alpha_m)
\end{array}\koz .
\end{equation*}
Using the last row for expansion, we get $D=D^*g(\alpha_m)$, which
implies $D=f_m^n\prod_{i=1}^mg(\alpha_i)$ by the induction hypothesis.
\end{biz}
We get that $D=\textrm{res}(f,g)$, that is, polynomials $f$ and $g$
have a common root in $K$ if and only if determinant $D$
vanishes.
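For moderate degrees the Sylvester form can be evaluated directly. The Python sketch below (our own illustration, not from the text) builds the $(m+n)\times(m+n)$ matrix and computes its determinant with exact fraction arithmetic:

```python
from fractions import Fraction

def sylvester_resultant(f, g):
    """res(f, g) via the Sylvester determinant; f, g are coefficient
    lists, highest degree first."""
    m, n = len(f) - 1, len(g) - 1
    N = m + n
    rows = [[Fraction(0)] * i + [Fraction(c) for c in f] + [Fraction(0)] * (n - 1 - i)
            for i in range(n)]                      # n shifted copies of f
    rows += [[Fraction(0)] * i + [Fraction(c) for c in g] + [Fraction(0)] * (m - 1 - i)
             for i in range(m)]                     # m shifted copies of g
    det = Fraction(1)
    for k in range(N):                              # Gaussian elimination
        piv = next((j for j in range(k, N) if rows[j][k] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != k:
            rows[k], rows[piv] = rows[piv], rows[k]
            det = -det
        det *= rows[k][k]
        for j in range(k + 1, N):
            t = rows[j][k] / rows[k][k]
            rows[j] = [x - t * y for x, y in zip(rows[j], rows[k])]
    return det
```

As a check: for $g=(x-2)(x-3)$ the product form gives $\textrm{res}(f,g)=g_n^m f(2)f(3)$, which for $f=2x^3-5x^2+x+3$ is $1\cdot 15=15$.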
From an algorithmic point of view, the computation of the resultant in
Sylvester\nevindex{Sylvester, James Joseph (1814--1897)} form for higher degree
polynomials means the computation of a large determinant. The
following theorem implies that pseudo-division may simplify the
computation.
\begin{tetel}
For the polynomials $f$ of (\ref{fpol}) and $g$ of (\ref{gpol}), in
case of $m\geq n>0$
\[ \left\{ \begin{array}{ll}
\textrm{res} (f, g) = 0, & \hbox{if\ } \textrm{prem} (f, g) = 0\koz , \\
\\
g_n^{(m-n)(n-1)+d} \textrm{res} (f, g) = (-1)^{mn} \textrm{res} (g, r), &
\hbox{if\ } r = \textrm{prem} (f, g) \neq 0 \hbox{ and\ } d = \deg(r)\koz .
\end{array} \right. \]
\end{tetel}
\begin{biz}
Multiply the first row of the determinant (\ref{rezu}) by
$g_n^{m-n+1}$. Let
$q= q_{m-n} x^{m-n} + \cdots + q_0\in R[x]$ and
$r=r_d x^d + \cdots + r_0 \in R[x]$ be the uniquely determined
polynomials with
\begin{eqnarray*}
g_n^{m-n+1} (f_m x^m + \cdots + f_0) & = &
(q_{m-n} x^{m-n} + \cdots + q_0)(g_n x^n + \cdots + g_0) \\
& & \mbox{} + r_d x^d + \cdots + r_0\koz ,
\end{eqnarray*}
where $r=\textrm{prem}(f,g)$. Then multiplying row $(n+1)$ of
the resultant by $q_{m-n}$, row $(n+2)$ by $q_{m-n-1}$
etc., and subtracting them from the first row we get the determinant
\vspace*{12pt}
\[ g_n^{m-n+1} \textrm{res} (f, g) =
\left| \begin{array}{cccccccccc}
0 & \cdots & 0 & r_d & \cdots & \cdots & r_0 \\
& f_m & \cdots & \cdots & \cdots
& \cdots & \cdots & f_0 \\
& & \ddots & & & & & & \ddots \\
& & & f_m & \cdots & \cdots & \cdots
& \cdots & \cdots & f_0 \\
g_n & \cdots & \cdots & \cdots & g_0 \\
& \ddots & & & & \ddots \\
& & \ddots & & & & \ddots \\
& & & \ddots & & & & \ddots \\
& & & & \ddots & & & & \ddots \\
& & & & & g_n & \cdots & \cdots & \cdots & g_0
\end{array} \right|\koz .
\]\\
\vspace*{36pt}
\noindent Here $r_d$ is in the $(m-d+1)$th column of the first row,
and $r_0$ is in the $(m+1)$th column of the first row.
Similarly, multiply the second row by $g_n^{m-n+1}$, then
multiply rows $(n+2)$, $(n+3)$, $\ldots$ by $q_{m-n}$, $q_{m-n-1}$
etc., and subtract them from the second row. Continue the same way
for the third, $\ldots$, $n$th row. The result is
\vspace*{36pt}
\[ g_n^{n(m-n+1)} \textrm{res} (f, g) =
\left| \begin{array}{cccccccccc}
& & & r_d & \cdots & \cdots & r_0 \\
& & & & \ddots & & & \ddots \\
& & & & & \ddots & & & \ddots \\
& & & & & & r_d & \cdots & \cdots & r_0 \\
g_n & \cdots & \cdots & \cdots & g_0 \\
& \ddots & & & & \ddots \\
& & \ddots & & & & \ddots \\
& & & \ddots & & & & \ddots \\
& & & & \ddots & & & & \ddots \\
& & & & & g_n & \cdots & \cdots & \cdots& g_0
\end{array} \right|\koz .
\]
After reordering the rows
\[ g_{n}^{n(m-n+1)}\textrm{res}(f,g)=(-1)^{mn}
\left| \begin{array}{cccccccccc}
g_n & \cdots & \cdots & \cdots & g_0 \\
& \ddots & & & & \ddots \\
& & \ddots & & & & \ddots \\
& & & g_n & \cdots & \cdots & \cdots & g_0 \\
& & & & \ddots & & & & \ddots \\
& & & & & g_n & \cdots & \cdots & \cdots & g_0 \\
& & & r_d & \cdots & \cdots & r_0 \\
& & & & \ddots & & & \ddots \\
& & & & & \ddots & & & \ddots \\
& & & & & & r_d & \cdots & \cdots & r_0
\end{array} \right|\koz .
\]
Note that
\[ \left| \begin{array}{cccccccccc}
g_n & \cdots & \cdots & \cdots & g_0 \\
& \ddots & & & & \ddots \\
& & g_n & \cdots & \cdots & \cdots & g_0 \\
r_d & \cdots & \cdots & r_0 \\
& \ddots & & & \ddots \\
& & \ddots & & & \ddots \\
& & & r_d & \cdots & \cdots & r_0
\end{array} \right| = \textrm{res} (g, r)\koz ,
\]
thus,
\[ g_n^{n(m-n+1)} \textrm{res} (f, g) = (-1)^{mn} g_n^{m-d} \textrm{res} (g, r)\koz , \]
and therefore
\begin{equation}
g_n^{(m-n)(n-1)+d} \textrm{res} (f, g) = (-1)^{mn} \textrm{res} (g, r)
\label{res_rel}\koz .
\end{equation}
\end{biz}
Equation (\ref{res_rel}) describes an important relationship. Instead
of computing the possibly gigantic determinant $\textrm{res}(f,g)$, we
perform a series of pseudo-divisions and apply (\ref{res_rel}) in
each step. We calculate the resultant only when no more
pseudo-division can be done. An important consequence of the theorem
is the following corollary.
\begin{kov}
There exist polynomials $u,v\in R[x]$ such that
$\textrm{res}(f,g)=fu+gv$, with $\deg u< \deg g$, $\deg v< \deg f$.
\end{kov}
\begin{biz}
Multiply the $i$th column of the determinant form of the resultant by
$x^{m+n-i}$ and add it to the last column for $i=1,\ldots,
(m+n-1)$. Then
\[ \textrm{res}(f,g)=
\left |
\begin{array}{cccccr}
f_m & \cdots & \cdots & f_0 & \cdots & x^{n-1}f \\
& \ddots & & & \ddots & \vdots \\
& & f_m & \cdots & \cdots & f \\
g_n & \cdots & \cdots & g_0 & \cdots & x^{m-1}g \\
& \ddots & & & \ddots & \vdots \\
& & g_n & \cdots & \cdots & g \\
\end{array}
\right |\koz .
\]
Using the last column for expansion and factoring $f$ and $g$, we get
the statement with the restrictions on the degrees.
\end{biz}
\smallskip
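As a hedged sketch (our own code, not from the text), relation (\ref{res_rel}) turns into a short recursive resultant procedure: pseudo-divide, recurse, and divide out the accumulated power of $g_n$; the base case $\textrm{res}(f,g)=g_0^m$ for constant $g$ follows directly from the definition.

```python
from fractions import Fraction

def prem(f, g):
    """Pseudo-remainder in Z[x] (coefficient lists, highest first, deg f >= deg g)."""
    m, n = len(f) - 1, len(g) - 1
    r = [Fraction(g[0] ** (m - n + 1) * c) for c in f]
    for _ in range(m - n + 1):
        c = r[0] / g[0]
        for i in range(1, n + 1):
            r[i] -= c * g[i]
        r.pop(0)
    r = [int(c) for c in r] or [0]
    while len(r) > 1 and r[0] == 0:
        r.pop(0)
    return r

def resultant(f, g):
    """res(f, g) through repeated pseudo-division (f, g nonzero)."""
    m, n = len(f) - 1, len(g) - 1
    if m < n:
        return (-1) ** (m * n) * resultant(g, f)   # res(g,f) = (-1)^{mn} res(f,g)
    if n == 0:
        return Fraction(g[0]) ** m                 # res(f, g0) = g0^m
    r = prem(f, g)
    if r == [0]:
        return Fraction(0)
    d = len(r) - 1
    return ((-1) ** (m * n) * resultant(g, r)
            / Fraction(g[0]) ** ((m - n) * (n - 1) + d))
```

The two assertions below check a pair with no common root and a pair sharing the root $x=1$.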
The most important benefit of the resultant method, compared to
the previously discussed methods, is that the input polynomials may
contain symbolic coefficients as well.
\begin{pelda}
Let
\begin{eqnarray*}
f(x)&=& 2x^3-\xi x^2+x+3\in\Q[x]\koz ,\\
g(x)&=& x^2 -5x+6\in\Q[x]\koz .
\end{eqnarray*}
Then the existence of common rational roots of $f$ and $g$ cannot be
decided by variants of the Euclidean algorithm, but we can decide it with the
resultant method. Such a root exists if and only if
\[\textrm{res}(f,g)=
\left |
\begin{array}{ccccc}
2 & -\xi & 1 & 3 & \\
& 2 & -\xi & 1 & 3 \\
1 & -5 & 6 & & \\
& 1 & -5 & 6 & \\
& & 1 & -5 & 6
\end{array}
\right |=36\xi^2-429\xi+1260=3(4\xi-21)(3\xi-20)=0\koz ,
\]
that is, when $\xi=20/3$ or $\xi=21/4$.
\end{pelda}
The significance of the resultant is not only that we can decide the
existence of common roots of polynomials, but also that using it we
can reduce the solution of algebraic equation systems to solving
univariate equations.
\begin{pelda}
Let
\begin{eqnarray}
f(x,y)&=& x^2+xy+2x+y-1\in\Z[x,y]\koz ,\label{rek_res1}\\
g(x,y)&=& x^2+3x-y^2+2y-1\in\Z[x,y]\koz .\label{rek_res2}
\end{eqnarray}
Consider polynomials $f$ and $g$ as elements of $(\Z[x])[y]$. They
have a common root if and only if
\[\textrm{res}_y(f,g)=
\left |
\begin{array}{ccc}
x+1 & x^2+2x-1 & 0 \\
0 & x+1 & x^2+2x-1 \\
-1 & 2 & x^2+3x-1
\end{array}
\right |=-x^3-2x^2+3x=0\koz .
\]
Common roots in $\Z$ can exist for $x\in \{-3,0,1\}$. For each $x$, we
substitute into equations (\ref{rek_res1}) and (\ref{rek_res2})
(already in $\Z[y]$) and get that the integer solutions are
$(-3,1),(0,1),(1,-1)$.
\end{pelda}
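To reproduce this computation, one can expand the determinant over the polynomial ring directly. The Python sketch below (our own, with assumed helper names) implements cofactor expansion for a matrix whose entries are polynomials in $x$, given as coefficient lists, highest degree first:

```python
def padd(p, q):
    """Sum of two polynomials (coefficient lists, highest degree first)."""
    k = max(len(p), len(q))
    p = [0] * (k - len(p)) + list(p)
    q = [0] * (k - len(q)) + list(q)
    return [a + b for a, b in zip(p, q)]

def pmul(p, q):
    """Product of two polynomials."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def pdet(M):
    """Determinant of a matrix with polynomial entries (cofactor expansion)."""
    if len(M) == 1:
        return M[0][0]
    acc = [0]
    for j, entry in enumerate(M[0]):
        term = pmul(entry, pdet([row[:j] + row[j + 1:] for row in M[1:]]))
        acc = padd(acc, term if j % 2 == 0 else [-c for c in term])
    return acc

def strip(f):
    """Drop leading zero coefficients."""
    while len(f) > 1 and f[0] == 0:
        f = f[1:]
    return f

# The matrix of res_y(f, g) from the example; entries are polynomials in x.
S = [[[1, 1], [1, 2, -1], [0]],
     [[0],    [1, 1],     [1, 2, -1]],
     [[-1],   [2],        [1, 3, -1]]]
```

Evaluating `pdet(S)` yields $-x^3-2x^2+3x$, matching the example.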
We note that the resultant method can also be applied to solve
polynomial equations in several variables, but it is not really
effective. One problem is that computational space explosion occurs in
the computation of the determinant. Note that computing the
resultant of two univariate polynomials in determinant form using the
usual Gauss-elimination\nevindex{Gauss, Johann Carl Friedrich (1777--1855)}
requires $O((m+n)^3)$ operations, while the variants of the Euclidean
algorithm are quadratic. The other problem is that computational
complexity depends strongly on the order of the
indeterminates. \textit{Eliminating all variables together} in a
polynomial equation system is much more effective. This leads to the
introduction of multivariate resultants.
\subsection{Modular greatest common divisor}\label{modgcd-subsec}
All methods considered so far for the existence and calculation of
common roots of polynomials are characterised by an explosion of
computational space. The natural question arises: can we apply modular
techniques? Below we examine the case $a(x),b(x)\in
\Z[x]$ with $a,b\neq 0$. Let us consider the polynomials $(\ref{apol}),
(\ref{bpol})\in \Z[x]$ and let $p=13$ be a prime number. Then the series
of remainders in $\Z_p[x]$ in the \textsc{Classical-Euclidean}
algorithm is
\begin{eqnarray*}
r_0&=&11x^5+5x^4+6x^3+6x^2+5\koz ,\\
r_1&=&x^4+x^3+2x^2+8x+8\koz ,\\
r_2&=&3x^3+8x^2+12x+1\koz ,\\
r_3&=&x^2+10x+10\koz ,\\
r_4&=&7x\koz ,\\
r_5&=&10\koz .
\end{eqnarray*}
We get that polynomials $a$ and $b$ are relatively prime in
$\Z_p[x]$. The following theorem describes the connection between greatest common divisors
in $\Z[x]$ and $\Z_p[x]$.
\begin{tetel}\label{modgcd}
Let $a,b\in \Z[x], a,b\neq 0$. Let $p$ be a prime such that $p\not\,
\mid\, \textrm{lc}(a)$ and $p\not\, \mid\, \textrm{lc}(b)$. Let
furthermore $c= \textrm{gcd}(a,b)\in \Z[x]$, $a_p=a\, \textrm{rem} p$,
$b_p=b\, \textrm{rem} p$ and $c_p=c\, \textrm{rem} p$. Then\\ \indent
(1) $\deg\big(\textrm{gcd}(a_p,b_p)\big)\geq
\deg\big(\textrm{gcd}(a,b)\big),$\\ \indent
(2) if $p\not\, \mid
\textrm{res}(a/c,b/c)$, then $\textrm{gcd}(a_p,b_p)=c_p$.
\end{tetel}
\begin{biz}
(1): Since $c_p\mid a_p$ and $c_p\mid b_p$, we have $c_p\mid
\textrm{gcd}(a_p,b_p)$. So
\[\deg\big(\textrm{gcd}(a_p,b_p)\big)\geq \deg\big(\textrm{gcd}(a,b)\bmod p\big).\]
By the hypothesis $p\not\, \mid\ \textrm{lc}\big(\textrm{gcd}(a,b)\big)$,
which implies
\[\deg\big(\textrm{gcd}(a,b)\bmod p\big)=\deg\big(\textrm{gcd}(a,b)\big)\koz .\]
(2): Since $\textrm{gcd}(a/c,b/c)=1$ and $c_p$ is non-trivial,
\begin{equation}
\textrm{gcd}(a_p,b_p)=c_p\cdot \textrm{gcd}(a_p/c_p,b_p/c_p)\koz .\label{modeu}
\end{equation}
If $\textrm{gcd}(a_p,b_p)\neq c_p$, then the right-hand side of
(\ref{modeu}) is non-trivial, thus $\textrm{res}(a_p/c_p,b_p/c_p)=0$.
But the resultant is the sum of the corresponding products of the
coefficients, so $p\mid \textrm{res}(a/c,b/c)$, contradiction.
\end{biz}
\begin{kov}\label{modlnkokov}
There are at most a finite number of primes $p$ such that $p\not\,
\mid \textrm{lc}(a)$, $p\not\, \mid \textrm{lc}(b)$ and
$\deg\big(\textrm{gcd}(a_p,b_p)\big) > \deg\big(\textrm{gcd}(a,b)\big).$\hfill{\fkocka}
\end{kov}
If statement (1) of Theorem \ref{modgcd} is fulfilled with equality,
we call $p$ a ``lucky prime''.\inddef{lucky prime}
We can outline a modular algorithm for the computation of the gcd.
\begin{alg}{Modular-Gcd-Bigprime($a,b$)}\inddef{\textsc{Modular-Gcd-Bigprime}}
1 \' $M \leftarrow$ the Landau-Mignotte constant (from Corollary \ref{LM-kov})\\
2 \' $H\leftarrow \{\}$\\
3 \' \key{while} \= \textsc{true}\\
4 \' \> \key{do} \= $p \leftarrow$ a prime with $p\geq 2M$, $p\not\in H$, $p\not\, \mid \textrm{lc}(a)$ and $p\not\, \mid \textrm{lc}(b)$\\
5 \' \> \> $c_p\leftarrow \textrm{gcd}(a_p,b_p)$\\
6 \' \> \> \key{if} \= $c_p\mid a$ and $c_p\mid b$\\
7 \' \> \> \> \key{then} \= \key{return} $c_p$\\
8 \' \> \> \> \key{else} \> $H\leftarrow H\cup \{p\}$
\end{alg}
\noindent
The first line of the algorithm requires the calculation of the
Landau-Mignotte bound.\nevindex{Landau, Edmund Georg
Hermann (1877--1938)}\nevindex{Mignotte, Maurice} The fourth line requires a
``sufficiently large'' prime $p$ which does not divide the leading
coefficient of $a$ and $b$. The fifth line computes the greatest common divisor of
polynomials $a$ and $b$ modulo $p$ (for example with the
\textsc{Classical-Euclidean} algorithm in $\Z_p[x]$). We store the
coefficients of the resulting polynomials with symmetrical
representation. The sixth line examines whether $c_p\mid a$ and
$c_p\mid b$ are fulfilled, in which case $c_p$ is the required greatest common divisor. If
this is not the case, then $p$ is an ``unlucky prime'', so we choose
another prime. Since, by Corollary \ref{modlnkokov}, there are only finitely
many ``unlucky primes'', the algorithm eventually terminates.
If the primes are chosen according to a given strategy, set
$H$ is not needed.
The disadvantage of the \textsc{Modular-gcd-bigprime} algorithm is
that the Landau-Mignotte\nevindex{Mignotte,
Maurice}\nevindex{Landau, Edmund Georg Hermann (1877--1938)} constant grows
exponentially in terms of the degree of the input polynomials, so we
have to work with large primes. The question is how we could modify
the algorithm so that we can work with ``many small
primes''. Since the greatest common divisor in $\Z_p[x]$ is only unique up to a constant
factor, we have to be careful with the coefficients of the polynomials
in the new algorithm. So, before applying the Chinese remainder theorem
for the coefficients of the modular greatest common divisors taken modulo different
primes, we have to normalise the leading coefficient of
$\textrm{gcd}(a_p,b_p)$. If $a_m$ and $b_n$ are the leading
coefficients of
$a$ and $b$, then the leading coefficient of $\textrm{gcd}(a,b)$
divides $\textrm{gcd}(a_m,b_n)$. Therefore, we normalise the leading
coefficient of $\textrm{gcd}(a_p,b_p)$ to
$\textrm{gcd}(a_m,b_n) \bmod p$ in case of primitive polynomials
$a$ and $b$; and finally take the primitive part of the resulting
polynomial. Just like in the \textsc{Modular-gcd-bigprime} algorithm,
modular values are stored with symmetrical representation. These
observations lead to the following modular gcd algorithm using small
primes.
\begin{algN}{Modular-Gcd-Smallprimes($a,b$)}\inddef{\textsc{Modular-Gcd-Smallprimes}}
\vspace{-1mm}
1 \' $d\leftarrow \textrm{gcd}(\, \textrm{lc}(a),\textrm{lc}(b))$\\
2 \' $p\leftarrow$ a prime such that $p\not\, \mid d$\\
3 \' $H\leftarrow \{p\}$ \\
4 \' $P\leftarrow p$ \\
5 \' $c_p\leftarrow \textrm{gcd}(a_p,b_p)$\\
6 \' $g_p\leftarrow (d \bmod p)\cdot c_p$\\
7 \' $(n,i,j)\leftarrow (3,1,1)$ \\
8 \' \key{while} \= \textsc{true}\\
9 \' \> \key{do} \= \key{if} \= $j=1$\\
10 \' \> \> \> \key{then} \= \key{if} \= $\deg g_p=0$\\
11 \' \> \> \> \> \> \key{then} \key{return} $1$\\
12 \' \> \> \> $(g,j,P)\leftarrow (g_p,0,p)$\\
13 \' \> \> \key{if} \= $n\leq i$ \\
14 \' \> \> \> \key{then} \= $g\leftarrow \textrm{pp}(g)$\\
15 \' \> \> \> \> \key{if} \= $g\mid a$ and $g\mid b$\\
16 \' \> \> \> \> \> \key{then} \key{return} $g$\\
17 \' \> \> $p\leftarrow$ a prime such that $p\not\, \mid d$ and $p\not\in H$\\
18 \' \> \> $H\leftarrow H\cup \{p\}$\\
19 \' \> \> $c_p\leftarrow \textrm{gcd}(a_p,b_p)$\\
20 \' \> \> $g_p\leftarrow (d \bmod p)\cdot c_p$ \\
21 \' \> \> \key{if} \= $\deg g_p< \deg g$ \\
22 \' \> \> \> \key{then} \=$(i,j) \leftarrow (1,1)$
\algnewpage
23 \' \> \> \key{if} \= $j=0$ \\
24 \' \> \> \> \key{then} \= \key{if} \= $\deg g_p=\deg g$ \\
25 \' \> \> \> \> \> \key{then} \= $g_1\leftarrow$ \textsc{Coeff-Build($g,g_p,P,p$)}\\
26 \' \> \> \> \> \> \> \key{if} \= $g_1=g$\\
27 \' \> \> \> \> \> \> \> \key{then} \= $i\leftarrow i+1$ \\
28 \' \> \> \> \> \> \> \> \key{else} \> $i\leftarrow 1$\\
29 \' \> \> \> \> \> \> $P\leftarrow P\cdot p$\\
30 \' \> \> \> \> \> \> $g\leftarrow g_1$
\end{algN}
\begin{alg}{Coeff-Build($a,b,m_1,m_2$)}\inddef{\textsc{Coeff-Build}}
1 \' $p\leftarrow 0$\\
2 \' $c\leftarrow 1/m_1 \bmod m_2$\\
3 \' \key{for} \= $i\leftarrow \deg a$ \key{downto} $0$ \\
4 \' \> \key{do} \= $r\leftarrow a_i \bmod m_1$\\
5 \' \> \> $s\leftarrow c\cdot (b_i-r) \bmod m_2$\\
6 \' \> \> $p\leftarrow p+ (r+s\cdot m_1)x^i$\\
7 \' \key{return} $p$
\end{alg}
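A Python rendering of \textsc{Coeff-Build} (a sketch; the names are ours) in symmetric representation. Note how the precomputed inverse $c=1/m_1 \bmod m_2$ enters the computation of $s$, so that the result is congruent to $b_i$ modulo $m_2$:

```python
def sym(x, m):
    """Symmetric representative of x modulo m, in (-m/2, m/2]."""
    x %= m
    return x - m if x > m // 2 else x

def coeff_build(a, b, m1, m2):
    """Coefficient-wise Chinese remaindering: the polynomial congruent to
    a mod m1 and to b mod m2, in symmetric representation; a and b are
    coefficient lists of equal length, highest degree first."""
    c = pow(m1, -1, m2)               # 1/m1 mod m2 (Python 3.8+)
    out = []
    for ai, bi in zip(a, b):
        r = sym(ai, m1)
        s = sym(c * (bi - r), m2)
        out.append(r + s * m1)        # == ai (mod m1) and == bi (mod m2)
    return out
```

On the example below, $g=2x+2$ taken modulo $5$ and $g_p=-2x-1$ taken modulo $7$ combine to $12x-8$ modulo $35$.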
\noindent
We may note that the algorithm \textsc{Modular-Gcd-Smallprimes} does
not require as many small primes as the Landau-Mignotte bound tells us.\nevindex{Landau,
Edmund Georg Hermann (1877--1938)} \nevindex{Mignotte, Maurice} When the value of
polynomial $g$ does not change during a few iterations, we test in
lines $13$--$16$ if $g$ is a greatest common divisor. The number of these
iterations is stored in the variable $n$ set in line $7$. Note that the
value of $n$ could vary according to the input polynomial. The
primes used in the algorithms could preferably be chosen from an
(architecture-dependent) prestored list containing primes that fit in
a machine word, so the use of set $H$ becomes unnecessary.
Corollary \ref{modlnkokov} implies that the
\textsc{Modular-Gcd-Smallprimes} algorithm always terminates.
The \textsc{Coeff-Build} algorithm computes the solution of the
equation system obtained by taking congruence relations modulo $m_1$
and $m_2$ for the coefficients of identical degree in the input
polynomials $a$ and $b$. This is done according to the Chinese
remainder theorem. It is very important to store the results in
symmetrical modular representation form.
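The Chinese-remainder recombination performed by \textsc{Coeff-Build}, together with the symmetric modular representation, can be sketched in Python as follows (a sketch of ours, not the book's code; coefficient lists are indexed by degree, and `pow(m1, -1, m2)` computes $1/m_1 \bmod m_2$):

```python
# Sketch of Coeff-Build: combine g (mod m1) and gp (mod m2) coefficient-wise
# by the Chinese remainder theorem, storing the result in symmetric
# representation modulo m1*m2.

def symmetric(x, m):
    """Reduce x modulo m into the symmetric range around zero."""
    x %= m
    return x - m if x > m // 2 else x

def coeff_build(a, b, m1, m2):
    """For each degree i, the unique symmetric residue p_i (mod m1*m2)
    with p_i = a_i (mod m1) and p_i = b_i (mod m2)."""
    c = pow(m1, -1, m2)                  # 1/m1 mod m2 (Python 3.8+)
    p = []
    for ai, bi in zip(a, b):
        r = symmetric(ai, m1)
        s = symmetric((bi - r) * c, m2)
        p.append(symmetric(r + s * m1, m1 * m2))
    return p

# Recombining 12x - 8 (mod 385) with -x + 5 (mod 13) leaves 12x - 8
# modulo 5005, as in the second example below.
print(coeff_build([-8, 12], [5, -1], 385, 13))   # [-8, 12]
```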
\begin{pelda}
Let us examine the operation of the \textsc{Modular-Gcd-Smallprimes}
algorithm for the previously seen polynomials
(\ref{apol}), (\ref{bpol}).
For simplicity, we calculate with small primes. Recall that
\begin{eqnarray*}
a(x)&=&63x^5+57x^4-59x^3+45x^2-8\in \Z[x]\koz ,\\
b(x)&=&-77x^4+66x^3+54x^2-5x+99\in \Z[x]\koz .
\end{eqnarray*}
After the execution of the first six lines of the algorithm with $p=5$,
we have $d=7, c_p=x^2+3x+2$ and $g_p=2x^2+x-1$. Since $j=1$ due to
line $7$, lines $10$--$12$ are executed. Polynomial $g_p$ is
not zero, so $g=2x^2+x-1, j=0$, and $P=5$ will be the values after
the execution. The condition in line $13$ is not fulfilled, so we choose
another prime; $p=7$ is a bad choice (it divides $d=7$), but $p=11$ is
allowed. According to lines $19$--$20$, $c_p=1$, $g_p=-4$. Since
$\deg g_p<\deg g$, we have $j=1$ and lines $25$--$30$ are not
executed. Polynomial $g_p$ is constant, so the return value in line
$11$ is 1, which means that polynomials $a$ and $b$ are relatively
prime.
\end{pelda}
\begin{pelda}
In our second example, consider the already discussed polynomials
\begin{eqnarray*}
a(x)&=&12x^4-68x^3+52x^2-92x+56\in \Z[x]\koz ,\\
b(x)&=&-12x^3+80x^2-84x+24\in \Z[x]\koz .
\end{eqnarray*}
Let $p=5$ again. After the execution of the first six lines of the
algorithm, $d=12$, $c_p=x+1$, $g_p=2x+2$. After the execution of lines $10$--$12$, we have
$P=5, g=2x+2$. Let the next prime be $p=7$. So the new values are
$c_p=x+4, g_p=-2x-1$. Since
$\deg g_p=\deg g$, $P=35$ and the new value of
$g$ is $12x-8$ after lines $25$--$30$. The value of the variable $i$ is still 1. Let the next
prime be 11. Then $c_p=g_p=x+3$. Polynomials $g_p$ and $g$ have
the same degree, so we modify the coefficients of $g$. Then
$g_1=12x-8$ and since $g=g_1$, we get $i=2$ and $P=385$. Let the new
prime be 13. Then $c_p=x+8, g_p=-x+5$. The degrees of $g_p$ and $g$
are still equal, thus lines $25$--$30$ are executed and the variables
become $g=12x-8, P=5005, i=3$.
After the execution of lines $17$--$18$, it turns out that $g\mid a$
and $g\mid b$, so $g=12x-8$ is the greatest common divisor.
\end{pelda}
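The greatest common divisor $12x-8$ obtained in this example can be double-checked independently with exact rational arithmetic. The following sketch (ours, not the modular algorithm itself) runs the plain Euclidean algorithm over $\Q$ and then restores the integer content:

```python
from fractions import Fraction
from math import gcd
from functools import reduce

def polyrem(a, b):
    """Remainder of a on division by b; coefficient lists (index = degree)."""
    r = [Fraction(c) for c in a]
    b = [Fraction(c) for c in b]
    while len(r) >= len(b):
        c, k = r[-1] / b[-1], len(r) - len(b)
        for i in range(len(b)):
            r[k + i] -= c * b[i]
        r.pop()                        # the leading coefficient is now zero
        while r and r[-1] == 0:
            r.pop()
    return r                           # [] represents the zero polynomial

def polygcd(a, b):
    while b:
        a, b = b, polyrem(a, b)
    return a

def int_gcd_poly(a, b):
    """gcd in Z[x]: gcd of the contents times the primitive part of the
    gcd computed over Q (plain Euclid, no modular tricks)."""
    g = polygcd([Fraction(c) for c in a], [Fraction(c) for c in b])
    lcm_den = reduce(lambda x, y: x * y // gcd(x, y),
                     (c.denominator for c in g), 1)
    ints = [int(c * lcm_den) for c in g]
    cont = reduce(gcd, (abs(c) for c in ints))
    sign = 1 if ints[-1] > 0 else -1
    prim = [sign * c // cont for c in ints]
    return [gcd(reduce(gcd, map(abs, a)), reduce(gcd, map(abs, b))) * c
            for c in prim]

a = [56, -92, 52, -68, 12]             # 12x^4 - 68x^3 + 52x^2 - 92x + 56
b = [24, -84, 80, -12]                 # -12x^3 + 80x^2 - 84x + 24
print(int_gcd_poly(a, b))              # [-8, 12], i.e. 12x - 8
```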
\noindent
We give the following theorem without proof.
\begin{tetel}
The \textsc{Modular-Gcd-Smallprimes} algorithm works correctly. The
computational complexity of the algorithm is $O(m^3(\lg
m+\lambda(K))^2)$ machine word operations, where
$m=\min\{\deg a,\deg b\}$, and $K$ is the Landau-Mignotte bound for
polynomials $a$ and $b$.
\end{tetel}\nevindex{Landau, Edmund Georg Hermann (1877--1938)}\nevindex{Mignotte, Maurice}
\vspace{-6pt}
\begin{gyak}
\ujgyak Let $R$ be a commutative ring with identity element, $a=\sum_{i=0}^m a_i x^i\in R[x]$,
$b=\sum_{i=0}^n b_i x^i\in R[x]$, where $b_n$ is a unit and $m\geq
n\geq 0$. The following algorithm performs Euclidean division for $a$
and $b$ and
outputs polynomials $q,r\in R[x]$ for which $a=qb+r$ and $\deg
r<n$ or $r=0$.
\begin{alg}{Euclidean-Division-Univariate-Polynomials($a,b$)}
1 \' $r\leftarrow a$\\
2 \' \key{for} \= $i\leftarrow m-n$ \key{downto} $0$ \\
3 \' \> \key{do} \= \key{if} \= $\deg r=n+i$ \\
4 \' \> \> \> \key{then} \= $q_i\leftarrow \textrm{lc}(r)/b_n$\\
5 \' \> \> \> \> $r\leftarrow r-q_ix^i b$\\
6 \' \> \> \> \key{else} \> $q_i\leftarrow 0$\\
7 \' $q\leftarrow\sum_{i=0}^{m-n}q_ix^i$ \\
8 \' \key{return} $q$ and $r$
\end{alg}
\noindent
Prove that the algorithm uses at most
\[(2\deg b+1)(\deg q+1)=O(m^2)\]
operations in $R$.
\ujgyak What is the difference between the algorithms
\textsc{Extended-Euclidean} and \textsc{Extended-Euclidean-Normalised}
in $\Z[x]$?\gyakindex{\textsc{Extended-Euclidean}}\gyakindex{\textsc{Extended-Euclidean-Normalised}}
\ujgyak Prove that $\textrm{res}(f\cdot g,h)=\textrm{res}(f,h)\cdot \textrm{res}(g,h)$.
\ujgyak The \ki{discriminant}\gyakindex{discriminant} of polynomial $f(x)\in R[x]$ ($\deg f=m$, $\textrm{lc}(f)=f_m$) is the element
\[
\textrm{discr} f= \frac{(-1)^{\frac{m(m-1)}{2}}}{f_m}\textrm{res}(f,f')\in R\koz ,
\]
where $f'$ denotes the derivative of $f$ with respect to $x$.
Polynomial $f$ has a multiple root if and only if its discriminant is
0. Compute $\textrm{discr} f$ for general polynomials of second and
third degree.
\end{gyak}
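The division algorithm of the first exercise above can be transcribed directly; the following Python sketch (ours, with coefficient lists indexed by degree) works over $\Q$:

```python
from fractions import Fraction

def euclidean_division(a, b):
    """Return (q, r) with a = q*b + r and deg r < deg b, following the
    exercise: for i = m-n downto 0, eliminate the degree-(n+i) term of r."""
    m, n = len(a) - 1, len(b) - 1
    r = [Fraction(c) for c in a]
    q = [Fraction(0)] * (m - n + 1)
    for i in range(m - n, -1, -1):
        if len(r) - 1 == n + i and r[-1] != 0:
            q[i] = r[-1] / b[-1]              # q_i <- lc(r)/b_n
            for j in range(n + 1):            # r <- r - q_i x^i b
                r[i + j] -= q[i] * b[j]
        while len(r) > 1 and r[-1] == 0:      # drop vanished leading terms
            r.pop()
    return q, r

# (x^3 + 2x + 5) = x * (x^2 + 1) + (x + 5)
q, r = euclidean_division([5, 2, 0, 1], [1, 0, 1])
print(q, r)    # q = x, r = x + 5
```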
\section{Gr\"obner basis}\label{grobner}\index{Gr\"obner basis}\nevindex{Gr\"obner, Wolfgang Anton Maria}
Let $F$ be a field and $R=F[x_1,x_2,\ldots,x_n]$ be a multivariate polynomial ring in $n$ variables over $F$.
Let $f_1,f_2,\ldots,f_s\in R$. First we determine a necessary and sufficient condition for
the polynomials $f_1,f_2,\ldots,f_s$ having a common root in $F^n$.
We can see that the problem is a generalisation of the case $s=2$ from the previous subsection.
Let
\[I=\langle f_1,\ldots,f_s \rangle=\bigg\{ \sum_{1\leq i\leq s} q_i f_i : q_i \in R \bigg\}\]
denote the \ki{ideal generated by polynomials} $f_1,\ldots,f_s$. Then the polynomials $f_1,\ldots,f_s$ form a
\ki{basis} of ideal $I$. \inddef{basis!of the ideal} The \ki{variety} of an ideal $I$ is the set
\inddef{variety} \[V(I)=\bigg\{u\in F^n : f(u)=0\ \mbox{ for all}\ f\in I \bigg\}.\]
The knowledge of the variety $V(I)$ means that we also know the common roots of $f_1,\ldots,f_s$.
The most important questions about the variety and ideal $I$ are as follows.
\begin{itemize}
\item $V(I)\neq \emptyset\ ?$\\
\item How ``big'' is $V(I)$?\\
\item Given $f\in R$, in which case is $f\in I$?\\
\item $I=R$ ?
\end{itemize}
Fortunately, in a special basis of ideal $I$, in the so-called Gr\"obner basis,
these questions are easy to answer.
First let us study the case $n=1$. Since $F[x]$ is a Euclidean ring,
\begin{equation}
\langle f_1,\ldots,f_s \rangle=\langle\ \textrm{gcd}(f_1,\ldots,f_s)\ \rangle\koz .\label{egyvgenid}
\end{equation}
We may assume that $s=2$. Let $f,g\in F[x]$ and divide $f$ by $g$ with remainder. Then
there exist unique polynomials $q,r\in F[x]$ with $f=gq+r$ and $\deg r< \deg g$. Hence,
\[f\in \langle g\rangle \Leftrightarrow r=0\koz .\]
Moreover, $V(g)=\{u_1,\ldots,u_d\}$ if $x-u_1,\ldots,x-u_d$ are the distinct linear factors of $g\in F[x]$.
Unfortunately, equality (\ref{egyvgenid}) is not true in case of two or more variables. Indeed,
a multivariate polynomial ring over an arbitrary field is not necessarily Euclidean; therefore,
we have to find a new interpretation of division with remainder.
We proceed in this direction.
\subsection{Monomial order}\index{order!allowable}\index{monomial!order}\index{order!monomial}
Recall that a partial order $\rho\subseteq S\times S$ is a total order (or simply order) if either $a\rho b$ or
$b\rho a$ for all $a,b\in S$.
The total order $\preceq\ \subseteq \N^n\times\N^n$ is \ki{allowable} if
\smallskip
(i) $(0,\ldots,0)\preceq v$ for all $v\in \N^n$,\\
\indent
(ii) $v_1\preceq v_2\Rightarrow v_1+v\preceq v_2+v$ for all $v_1,v_2,v\in \N^n$.
\smallskip
\noindent
It is easy to prove that any allowable order on $\N^n$ is a well-order
(namely, every nonempty subset of $\N^n$ has a least element).
With the notation already
adopted consider the set
\[T=\{x_1^{i_1}\cdots x_n^{i_n}\ |\ i_1,\ldots,i_n\in \N\}\koz .\]
The elements of $T$ are called \ki{monomials}.\inddef{monomial}
Observe that $T$ is closed under multiplication in $F[x_1,\ldots,x_n]$,
constituting a commutative monoid.
The map $\N^n\rightarrow T$, $(i_1,\ldots,i_n)\mapsto x_1^{i_1}\cdots x_n^{i_n}$ is an isomorphism, therefore,
for an allowable total order $\preceq$ on $T$, we have that
\smallskip
(i) $1\preceq t$ for all $t\in T$,\\
\indent
(ii) $\forall\ t_1,t_2,t\in T\ \ t_1\prec t_2\Rightarrow t_1t\prec t_2t.$
\smallskip
\noindent
The allowable orders on $T$ are called \ki{monomial orders}.\inddef{order!monomial}
If $n=1$, the natural order is
a monomial order, and the corresponding
univariate monomials are ordered by their degree.
Let us see some standard examples of monomial orders for $n$ variables. Let
\[\alpha=x_1^{i_1}\cdots x_n^{i_n},\ \beta=x_1^{j_1}\cdots x_n^{j_n}\in T\koz ,\]
where the variables are ordered as $x_1\succ x_2\succ \cdots \succ x_{n-1}\succ x_n$.
\begin{itemize}
\item \ki{Pure lexicographic order.}\\
$\alpha\prec_{plex}\beta \Leftrightarrow$ $\exists l\in\{1,\ldots,n\}$ ($i_1=j_1,\ldots,i_{l-1}=j_{l-1}$ and $i_l<j_l$).
\item \ki{Graded lexicographic order.}\\
$\alpha\prec_{grlex}\beta \Leftrightarrow$ $i_1+\cdots+i_n<j_1+\cdots+j_n$, or ($i_1+\cdots+i_n=j_1+\cdots+j_n$ and $\alpha\prec_{plex}\beta$).
\item \ki{Graded reverse lexicographic order.}\\
$\alpha\prec_{grevlex}\beta \Leftrightarrow$ $i_1+\cdots+i_n<j_1+\cdots+j_n$, or ($i_1+\cdots+i_n=j_1+\cdots+j_n$ and $\exists l\in\{1,\ldots,n\}$ such that $i_l>j_l$ and $i_{l+1}=j_{l+1},\ldots, i_n=j_n$).
\end{itemize}
The proof that these orders are monomial orders is left as an exercise.
Observe that if $n=1$, then $\prec_{plex}=\prec_{grlex}=\prec_{grevlex}$.
The graded reverse lexicographic order is often called a total degree order and it is denoted by $\prec_{tdeg}$.
\begin{pelda}
\noindent
Let $\prec=\prec_{plex}$ and let $z\prec y\prec x$. Then
\begin{eqnarray*}
1&\prec& z\prec z^2\prec \cdots \prec y\prec yz\prec yz^2 \prec \cdots\\
&\prec& y^2 \prec y^2z\prec y^2z^2\prec \cdots \prec x\prec xz\prec xz^2\prec \cdots \\
&\prec& xy\prec xy^2\prec\cdots \prec x^2\prec \cdots\koz .
\end{eqnarray*}
Let $\prec=\prec_{tdeg}$ and again, $z\prec y\prec x$. Then
\begin{eqnarray*}
1&\prec& z\prec y\prec x\\
&\prec& z^2 \prec yz\prec xz\prec y^2\prec xy\prec x^2 \\
&\prec& z^3\prec yz^2\prec xz^2\prec y^2z\prec xyz\\
&\prec& x^2z\prec y^3\prec xy^2 \prec x^2y\prec x^3\prec \cdots\koz .
\end{eqnarray*}
\end{pelda}
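These three orders can be realised as sort keys on exponent vectors; the following sketch (ours) reproduces the $\prec_{tdeg}$ chain of the example:

```python
def plex(e):
    """Pure lexicographic: compare exponents of x1, x2, ... in turn."""
    return tuple(e)

def grlex(e):
    """Graded lexicographic: total degree first, ties broken by plex."""
    return (sum(e), tuple(e))

def grevlex(e):
    """Graded reverse lexicographic: total degree first; among equal degrees
    the monomial whose LAST differing exponent is larger is the smaller one."""
    return (sum(e), tuple(-i for i in reversed(e)))

# monomials of F[x,y,z] up to degree 2, exponents as (i_x, i_y, i_z)
mons = [(0,0,0), (2,0,0), (0,0,1), (1,1,0), (0,1,0), (1,0,0),
        (0,0,2), (0,1,1), (1,0,1), (0,2,0)]
print(sorted(mons, key=grevlex))
# ascending tdeg order: 1, z, y, x, z^2, yz, xz, y^2, xy, x^2
```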
%Further we assume that we work with a fixed monomial order.
\noindent
Let a monomial order $\prec$ be given.
Furthermore, we identify the vector $\alpha=(\alpha_1,\ldots,\alpha_n)\in \N^n$ with the monomial
$x^{\alpha}=x_1^{\alpha_1}\cdots x_n^{\alpha_n}\in R$.
Let $f=\sum_{\alpha\in \N^n} c_{\alpha}x^{\alpha}\in R$ be a non-zero polynomial, $c_{\alpha}\in F$.
Then $c_{\alpha}x^{\alpha}\ (c_{\alpha}\neq 0)$ are the \ki{terms}
of polynomial $f$, $\textrm{mdeg}(f)=\max\{\alpha\in \N^n:c_{\alpha}\neq 0\}$ is the \ki{multidegree}
of the polynomial (where the maximum is with respect to the monomial order),
$\textrm{lc}(f)=c_{\textrm{mdeg}(f)}\in F\setminus \{0\}$
is the \ki{leading coefficient} of $f$, $\textrm{lm}(f)=x^{\textrm{mdeg}(f)}\in R$ is the \ki{leading monomial}
of $f$, and
$\textrm{lt}(f)=\textrm{lc}(f)\cdot \textrm{lm}(f)\in R$ is the \ki{leading term} of $f$.
Let $\textrm{lt}(0)=\textrm{lc}(0)=\textrm{lm}(0)=0$ and $\textrm{mdeg}(0)=-\infty$.
\index{polynomial!multivariate}
\inddef{multivariate polynomial!leading term}
\inddef{multivariate polynomial!multidegree}
\inddef{multivariate polynomial!leading coefficient}
\inddef{multivariate polynomial!leading monomial}
\begin{pelda}
Consider the polynomial $f(x,y,z)=2xyz^2-3x^3+4y^4-5xy^2z\in \Q[x,y,z]$.
Let $\prec=\prec_{plex}$ and $z\prec y\prec x$. Then
\[\textrm{mdeg}(f)=(3,0,0),\ \textrm{lt}(f)=-3x^3,\ \textrm{lm}(f)=x^3,\ \textrm{lc}(f)=-3\koz . \]
If $\prec=\prec_{tdeg}$ and $z\prec y\prec x$, then
\[\textrm{mdeg}(f)=(0,4,0),\ \textrm{lt}(f)=4y^4,\ \textrm{lm}(f)=y^4,\ \textrm{lc}(f)=4\koz .\]
\end{pelda}
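Once a monomial order is fixed as a key function, the leading data of this example can be read off mechanically; a sketch of ours, with polynomials stored as dictionaries mapping exponent tuples $(i_x,i_y,i_z)$ to coefficients:

```python
def mdeg(f, key):
    """Multidegree: the order-maximal exponent vector with a nonzero coefficient."""
    return max((e for e, c in f.items() if c != 0), key=key)

def lt(f, key):
    """Leading term as a (leading monomial, leading coefficient) pair."""
    e = mdeg(f, key)
    return e, f[e]

plex = lambda e: e
tdeg = lambda e: (sum(e), tuple(-i for i in reversed(e)))

# f = 2xyz^2 - 3x^3 + 4y^4 - 5xy^2 z, exponents as (i_x, i_y, i_z)
f = {(1, 1, 2): 2, (3, 0, 0): -3, (0, 4, 0): 4, (1, 2, 1): -5}
print(lt(f, plex))   # ((3, 0, 0), -3)
print(lt(f, tdeg))   # ((0, 4, 0), 4)
```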
\vspace{-12pt}
\subsection{Multivariate division with remainder}
In this subsection, our aim is to give an algorithm for division with remainder in $R$.
Given multivariate polynomials $f, f_1,\ldots,f_s\in R$ and monomial order $\prec$,
we want to compute the polynomials $q_1,\ldots,q_s\in R$ and $r\in R$ such that
$f=q_1f_1+\cdots +q_sf_s+r$ and no monomial in $r$ is divisible by any of $\textrm{lt}(f_1),\ldots,\textrm{lt}(f_s)$.
\begin{algN}{Multivariate-Division-with-Remainder($f,f_1,\ldots,f_s$)}
\inddef{\textsc{Division-with-Remainder}!multivariate}
1 \' $r\leftarrow 0$\\
2 \' $p\leftarrow f$\\
3 \' \key{for} \= $i \leftarrow 1$ \key{to} $s$ \\
4 \' \> \key{do} \= $q_i\leftarrow 0$\\
5 \' \key{while} \= $p\neq 0$ \\
6 \' \> \key{do} \= \key{if} \= $\textrm{lt}(f_i)$ divides $\textrm{lt}(p)$ for some $i\in \{1,\ldots,s\}$ \\
7 \' \> \> \> \key{then} \= choose such an $i$ and \= $q_i\leftarrow q_i+\textrm{lt}(p)/\textrm{lt}(f_i)$\\
8 \' \> \> \> \> $p\leftarrow p-\textrm{lt}(p)/\textrm{lt}(f_i)\cdot f_i$\\
9 \' \> \> \> \key{else} \> $r\leftarrow r+\textrm{lt}(p)$\\
10 \' \> \> \> \> $p\leftarrow p-\textrm{lt}(p)$\\
11 \' \key{return} $q_1,\ldots,q_s$ and $r$
\end{algN}
\noindent
The correctness of the algorithm follows from the fact that
in every iteration of the \textbf{while} cycle of lines $5$--$10$, the following invariants hold:
\smallskip
(i) $\textrm{mdeg}(p)\preceq \textrm{mdeg}(f)$ and $f=p+q_1f_1+\cdots +q_sf_s+r$,\\
\indent
(ii) $q_i\neq 0\Rightarrow \textrm{mdeg}(q_if_i)\preceq \textrm{mdeg}(f)$ for all $1\leq i\leq s$,\\
\indent
(iii) no monomial in $r$ is divisible by any $\textrm{lt}(f_i)$.
\smallskip
\noindent
The algorithm has a weakness, namely, the multivariate division with remainder is not deterministic.
In line $7$, we can choose arbitrarily from the appropriate values of $i$.
\begin{pelda} Let $f=x^2y+xy^2+y^2\in \Q[x,y]$, $f_1=xy-1$,
$f_2=y^2-1$, the monomial order $\prec_{plex}$, $y\prec_{plex} x$, and
in line $7$, we always choose the smallest possible $i$. Then the
result of the algorithm is $q_1=x+y$, $q_2=1$, $r=x+y+1$. But if we
change the order of the polynomials $f_1$ and $f_2$ (that is, $f_1=y^2-1$
and $f_2=xy-1$), then the output of the algorithm is $q_1=x+1$, $q_2=x$
and $r=2x+1$. \end{pelda}
As we have seen in the previous example, we can make the algorithm deterministic
by always choosing the smallest possible $i$ in line $7$.
In this case, the \ki{quotients} $q_1,\ldots,q_s$ and the \ki{remainder} $r$ are unique, which we can express as
$r=f\ \textrm{rem}\ (f_1,\ldots,f_s)$.\inddef{quotient}\inddef{remainder}
Observe that if $s=1,$ then the algorithm gives the answer to the ideal membership problem: $f\in \langle f_1\rangle$
if and only if the remainder is zero.
Unfortunately, if $s\geq 2$, then this is not true anymore. For example, with the monomial order $\prec_{plex}$
\[xy^2-x\ \textrm{rem}\ (xy+1,y^2-1)=-x-y \koz ,\]
and the quotients are $q_1=y$, $q_2=0$. On the other hand, $xy^2-x= x\cdot (y^2-1)+\, 0$, which shows that
$xy^2-x\in \langle xy+1,y^2-1\rangle$.
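The algorithm \textsc{Multivariate-Division-with-Remainder}, with the deterministic smallest-$i$ choice, can be sketched as follows (our sketch; polynomials are dictionaries mapping exponent tuples to rational coefficients). It reproduces the quotients and remainder of the example above:

```python
from fractions import Fraction

def mvdiv(f, F, key):
    """f = q[0]*F[0] + ... + q[s-1]*F[s-1] + r, always choosing the smallest
    admissible i (the deterministic variant)."""
    p = {e: Fraction(c) for e, c in f.items() if c != 0}
    q, r = [dict() for _ in F], {}
    while p:
        e = max(p, key=key)                          # lt(p)
        for i, fi in enumerate(F):
            ei = max(fi, key=key)                    # lt(f_i)
            if all(x >= y for x, y in zip(e, ei)):   # lt(f_i) | lt(p)
                t = tuple(x - y for x, y in zip(e, ei))
                c = p[e] / fi[ei]
                q[i][t] = q[i].get(t, Fraction(0)) + c
                for em, cm in fi.items():            # p <- p - lt(p)/lt(f_i)*f_i
                    s = tuple(x + y for x, y in zip(t, em))
                    p[s] = p.get(s, Fraction(0)) - c * cm
                    if p[s] == 0:
                        del p[s]
                break
        else:                                        # no leading term divides
            r[e] = p.pop(e)                          # move lt(p) to r
    return q, r

plex = lambda e: e                                   # exponents (i_x, i_y)
f  = {(2, 1): 1, (1, 2): 1, (0, 2): 1}               # x^2 y + x y^2 + y^2
f1 = {(1, 1): 1, (0, 0): -1}                         # xy - 1
f2 = {(0, 2): 1, (0, 0): -1}                         # y^2 - 1
q, r = mvdiv(f, [f1, f2], plex)
print(q, r)        # q1 = x + y, q2 = 1, r = x + y + 1, as in the example
```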
\subsection{Monomial ideals and Hilbert's basis theorem}\nevindex{Hilbert, David (1862--1943)}
Our next goal is to find a special basis for an arbitrary polynomial ideal
such that the remainder on division by that basis is unique, which gives the answer to the ideal membership problem.
But does such a basis exist at all? And if it does, is it finite?
The ideal $I\subseteq R$ is a \ki{monomial ideal} if there exists a subset $A\subseteq \N^n$ such that
\[I=\langle x^A\rangle=\langle \{x^{\alpha}\in T:\alpha\in A\}\rangle\koz ,\]
that is, ideal $I$ is generated by monomials.
\begin{lemma}\inddef{monomial ideal}\label{monid}
Let $I=\langle x^A\rangle\subseteq R$ be a monomial ideal, and $\beta\in \N^n$. Then
\[x^{\beta}\in I\Leftrightarrow \exists \alpha\in A\ \ \ x^{\alpha}\mid x^{\beta}\koz .\]
\end{lemma}
\begin{biz}
The $\Leftarrow$ direction is obvious. Conversely, let $\alpha_1,\ldots,\alpha_s\in A$ and
$q_1,\ldots,q_s\in R$ such that $x^{\beta}=\sum_i q_ix^{\alpha_i}$.
Then, expanding the right-hand side, the monomial $x^{\beta}$ must occur in at least one summand $q_ix^{\alpha_i}$; therefore,
$x^{\alpha_i}\mid x^{\beta}$.
\end{biz}
\vspace{1mm}
The most important consequence of the lemma is that two monomial ideals are
identical if and only if they contain the same monomials.
\begin{lemma}[Dickson's lemma]\index{Dickson's lemma}\nevindex{Dickson, Leonard Eugene}
Every monomial ideal is finitely generated, namely,
for every $A\subseteq \N^n$, there exists a finite subset $B\subseteq A$ such that
$\langle x^A\rangle = \langle x^B\rangle$.
\end{lemma}
\vspace{1mm}
\begin{lemma}
Let $I$ be an ideal in $R=F[x_1,\ldots,x_n]$. If $G\subseteq I$ is a finite subset such that
$\langle \textrm{lt}(G)\rangle = \langle \textrm{lt}(I)\rangle$, then $\langle G\rangle = I$.
\end{lemma}
\vspace{1mm}
\begin{biz}
Let $G=\{g_1,\ldots,g_s\}$. If $f\in I$ is an arbitrary polynomial, then division with remainder gives
$f=q_1g_1+\cdots +q_sg_s+r$, with $q_1,\ldots,q_s,r\in R$, such that either $r=0$ or no term of $r$ is
divisible by the leading term of any $g_i$. But $r=f-q_1g_1-\cdots -q_sg_s\in I$, hence,
$\textrm{lt}(r)\in \textrm{lt}(I)\subseteq
\langle \textrm{lt}(g_1),\ldots,\textrm{lt}(g_s)\rangle$.
This, together with Lemma (\ref{monid}), implies that $r=0$, therefore
$f\in \langle g_1,\ldots,g_s\rangle = \langle G\rangle$.
\end{biz}
Together with Dickson's lemma applied to $\langle \textrm{lt}(I)\rangle$,
and the fact that the zero polynomial generates
the zero ideal, we obtain the following famous result.
\vspace{1mm}
\begin{tetel}[Hilbert's basis theorem]\index{Hilbert's basis}\nevindex{Dickson,
Leonard Eugene}\nevindex{Hilbert, David (1862--1943)}
Every ideal $I\subseteq R=F[x_1,\ldots,x_n]$ is finitely generated, namely,
there exists a finite subset $G\subseteq I$ such that $\langle G\rangle =I$ and
$\langle \textrm{lt}(G)\rangle = \langle \textrm{lt}(I)\rangle$.
\end{tetel}
\vspace{1mm}
\begin{kov}[ascending chain condition]\label{ideallanc}\inddef{ascending chain of ideals}
Let $I_1\subseteq I_2\subseteq \cdots$ be an ascending chain of ideals in $R$.
Then there exists an $n\in \N$ such that $I_n=I_{n+1}=\cdots$\koz .
\end{kov}
\begin{biz}
Let $I=\cup_{j\geq 1}I_j$. Then $I$ is an ideal, which is finitely generated by Hilbert's basis theorem.
Let $I=\langle g_1,\ldots,g_s\rangle$. With $n=\min\{j\geq 1:g_1,\ldots,g_s\in I_j\}$, we have
$I_n=I_{n+1}=\cdots=I$.
\end{biz}
\vspace{1mm}
A ring satisfying the ascending chain condition is called \ki{Noetherian}.
\nevindex{Noether, Amalia Emmy (1882--1935)}\inddef{Noetherian ring}
Specifically, if $F$ is a field, then $F[x_1,\ldots,x_n]$ is Noetherian.
Let $\prec$ be a monomial order on $R$ and $I\subseteq R$ an ideal.
A finite set $G\subseteq I$ is a \ki{Gröbner basis}\inddef{Gröbner basis} of ideal $I$ with respect to $\prec$
if $\langle \textrm{lt}(G)\rangle = \langle \textrm{lt}(I)\rangle$.\nevindex{Gröbner, Wolfgang Anton Maria}
Hilbert's basis theorem implies the following corollary.\nevindex{Hilbert, David (1862--1943)}
\begin{kov}
Every ideal $I$ in $R=F[x_1,\ldots,x_n]$ has a Gröbner basis.
\end{kov}\index{Gröbner basis}
It is easy to show that the remainder on division by the Gröbner basis $G$ does not depend on the order
of the elements of $G$. Therefore, we can use the notation $f\ \textrm{rem}\ G=r\in R$.
Using the Gröbner basis, we can easily answer the ideal membership problem.
\vspace{1mm}
\begin{tetel}
Let $G$ be a Gröbner basis of ideal $I\subseteq R$ with respect to a monomial order $\prec$ and let $f\in R$.
Then $f\in I\Leftrightarrow f\ \textrm{rem}\ G=0$.
\end{tetel}\index{Gröbner basis}
\vspace{1mm}
\begin{biz}
We prove that there exists a unique $r\in R$ such that (1) $f-r\in I$, (2) no term of $r$ is divisible
by any monomial of $\textrm{lt}(G)$. The existence of such an $r$ comes from division with remainder.
For the uniqueness, let $f=h_1+r_1=h_2+r_2$ for arbitrary $h_1,h_2\in I$ and
suppose that no term of $r_1$ or $r_2$ is divisible by any monomial of $\textrm{lt}(G)$. Then
$r_1-r_2=h_2-h_1\in I$; hence, if $r_1-r_2\neq 0$, then by Lemma
\ref{monid}, $\textrm{lt}(r_1-r_2)$ is divisible by $\textrm{lt}(g)$ for some $g\in G$, which is impossible since no term of $r_1$ or $r_2$ is so divisible. This means that $r_1-r_2=0$.
\end{biz}
\vspace{1mm}
Thus, if $G$ is a Gröbner basis of an ideal $I\subseteq R$, then for all $f,g,h\in R$,
\[g= f\ \textrm{rem}\ G \mbox{ and}\ h=f\ \textrm{rem}\ G \Rightarrow g=h.\]
\subsection{Buchberger's algorithm}\nevindex{Buchberger, Bruno}
Unfortunately, Hilbert's basis theorem is not constructive, since
it does not tell us how to compute a Gröbner basis from a given basis $G$ of an ideal $I$.
In the following, we investigate how the finite set $G$ can fail to be a Gröbner basis for an ideal $I$.
\nevindex{Hilbert, David (1862--1943)}
Let $g,h\in R$ be nonzero polynomials, $\alpha=(\alpha_1,\ldots,\alpha_n)=\textrm{mdeg}(g)$,
$\beta=(\beta_1,\ldots,\beta_n)=\textrm{mdeg}(h)$, and
$\gamma=(\max\{\alpha_1,\beta_1\},\ldots,\max\{\alpha_n,\beta_n\})$.
The \ki{S-polynomial}\inddef{S-polynomial} of $g$ and $h$ is
\[S(g,h)=\frac{x^{\gamma}}{\textrm{lt}(g)}g-\frac{x^{\gamma}}{\textrm{lt}(h)}h\in R \koz .\]
It is easy to see that $S(g,h)=-S(h,g)$; moreover, since
$x^{\gamma}/\textrm{lt}(g),\ x^{\gamma}/\textrm{lt}(h)\in R$, we also have $S(g,h)\in \langle g,h\rangle$.
The following theorem yields an easy method to test whether a given set $G$ is a Gröbner basis of the ideal
$\langle G\rangle$.
\vspace{1mm}
\begin{tetel}\label{spoli}
The set $G=\{g_1,\ldots,g_s\}\subseteq R$ is the Gröbner basis of the ideal $\langle G\rangle$ if and only if
\[S(g_i,g_j)\ \textrm{rem}\ (g_1,\ldots,g_s)=0\ \mbox{ for all}\ i,j\ (1\leq i<j\leq s)\koz .\]
\end{tetel}
\vspace{1mm}
\noindent
Based on this theorem, the following algorithm of Buchberger computes a Gröbner basis of the ideal $I=\langle f_1,\ldots,f_s\rangle$.
\begin{algN}{Gröbner-basis($f_1,\ldots,f_s$)}\inddef{\textsc{Gröbner-basis}}
1 \' $G\leftarrow \{f_1,\ldots,f_s\}$\\
2 \' $P\leftarrow \{(f_i,f_j)\mid 1\leq i<j\leq s\}$\\
3 \' \key{while} \= $P\neq \emptyset$ \\
4 \' \> \key{do} \= $(f,g)\leftarrow$ an arbitrary pair from $P$\\
5 \' \> \> $P\leftarrow P\setminus (f,g)$\\
6 \' \> \> $r\leftarrow S(f,g)\ \textrm{rem}\ G$\\
7 \' \> \> \key{if} \= $r\neq 0$\\
8 \' \> \> \> \key{then} \= $G\leftarrow G\cup \{r\}$\\
9 \' \> \> \> \> $P\leftarrow P\cup \{(f,r)\mid f\in G\}$\\
10 \' \key{return} $G$
\end{algN}
\noindent First we show the correctness of the \textsc{Gröbner-basis}
algorithm assuming that the procedure terminates. At any stage of the
algorithm, set $G$ is a basis of ideal $I$, since initially it is, and
at any other step only those elements are added to $G$ that are remainders
of the $S$-polynomials on division by $G$. If the algorithm
terminates, the remainders of all $S$-polynomials on division by
$G$ are zero, and by Theorem (\ref{spoli}), $G$ is a Gröbner basis.
Next we show that the algorithm terminates. Let $G$ and $G^*$ be the
sets corresponding to successive iterations of the \textbf{while}
cycle (lines $3$--$9$). Clearly, $G\subseteq G^*$ and
$\langle\textrm{lt}(G)\rangle\subseteq \langle
\textrm{lt}(G^*)\rangle$. Hence, the ideals $\langle
\textrm{lt}(G)\rangle$ in successive iteration steps form an ascending
chain, which stabilises by Corollary (\ref{ideallanc}). Thus, after a
finite number of steps, we have $\langle\textrm{lt}(G)\rangle= \langle
\textrm{lt}(G^*)\rangle$. We claim that $G=G^*$ in this case. Let
$f,g\in G$ and $r=S(f,g)\ \textrm{rem}\ G$. Then $r\in G^*$ and either
$r=0$ or $\textrm{lt}(r)\in \langle
\textrm{lt}(G^*)\rangle=\langle\textrm{lt}(G)\rangle$, and using the
definition of the remainder, we conclude that $r=0$.\index{ascending
chain of ideals}
\begin{pelda}\label{grpeld}
%Az algoritmus működését vizsgáljuk meg egy példán keresztül.
Let $F=\Q$, $\prec=\prec_{plex}$, $z\prec y\prec x$,
$f_1=x-y-z, f_2=x+y-z^2, f_3=x^2+y^2-1$.
$G=\{f_1,f_2,f_3\}$ by step one, and
$P=\{(f_1,f_2),(f_1,f_3),(f_2,f_3)\}$ by step two.
At the first iteration of the \textbf{while} cycle, let us choose the
pair $(f_1,f_2)$. Then $P=\{(f_1,f_3),(f_2,f_3)\}$,
$S(f_1,f_2)=-2y-z+z^2$ and $r=f_4=S(f_1,f_2)\ \textrm{rem}\ G
=-2y-z+z^2$. Therefore, $G=\{f_1,f_2,f_3,f_4\}$ and $P=\{(f_1,f_3),
(f_2,f_3), (f_1,f_4), (f_2,f_4), (f_3,f_4)\}$.
At the second iteration of the \textbf{while} cycle, let us choose the pair $(f_1,f_3)$.
Then $P=P\setminus \{(f_1,f_3)\}$,
$S(f_1,f_3)=-xy-xz-y^2+1$, $r=f_5=S(f_1,f_3)\ \textrm{rem}\ G =-1/2z^4-1/2z^2+1$, hence,
$G=\{f_i\mid 1\leq i\leq 5\}$ and $P=\{(f_2,f_3),(f_1,f_4),\ldots,(f_3,f_4),(f_1,f_5),\ldots,(f_4,f_5)\}$.
At the third iteration of the \textbf{while} cycle, let us choose the pair $(f_2,f_3)$.
Then $P=P\setminus \{(f_2,f_3)\}$,
$S(f_2,f_3)=xy-xz^2-y^2+1$, $r=S(f_2,f_3)\ \textrm{rem}\ G =0$.
At the fourth iteration, let us choose the pair $(f_1,f_4)$. Then
$P=P\setminus \{(f_1,f_4)\}$,
$S(f_1,f_4)=2y^2+2yz+xz-xz^2$, $r=S(f_1,f_4)\ \textrm{rem}\ G =0$.
In the same way, the remainders of the $S$-polynomials of all the
remaining pairs on division by $G$ are zero; hence, the algorithm
returns with
$G=\{x-y-z,x+y-z^2,x^2+y^2-1,-2y-z+z^2,-1/2z^4-1/2z^2+1\}$, which
constitutes a Gröbner basis. \end{pelda}
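Theorem \ref{spoli} also gives a mechanical way to verify the result of this example: compute every $S$-polynomial of the returned basis and reduce it. The following sketch (ours, with $\prec_{plex}$ hard-coded and exponent tuples $(i_x,i_y,i_z)$) performs this check:

```python
from fractions import Fraction

plex = lambda e: e                # pure lexicographic key on (i_x, i_y, i_z)

def lt(f):
    e = max(f, key=plex)
    return e, f[e]

def reduce_rem(f, G):
    """Remainder of f on division by the list G."""
    p, r = dict(f), {}
    while p:
        e, c = lt(p)
        for g in G:
            eg, cg = lt(g)
            if all(x >= y for x, y in zip(e, eg)):
                t = tuple(x - y for x, y in zip(e, eg))
                for em, cm in g.items():
                    s = tuple(x + y for x, y in zip(t, em))
                    p[s] = p.get(s, Fraction(0)) - (c / cg) * cm
                    if p[s] == 0:
                        del p[s]
                break
        else:
            r[e] = p.pop(e)
    return r

def s_poly(g, h):
    """S(g,h) = x^gamma/lt(g) * g - x^gamma/lt(h) * h."""
    (eg, cg), (eh, ch) = lt(g), lt(h)
    gamma = tuple(max(x, y) for x, y in zip(eg, eh))
    res = {}
    for poly, e, c, sgn in ((g, eg, cg, 1), (h, eh, ch, -1)):
        t = tuple(x - y for x, y in zip(gamma, e))
        for em, cm in poly.items():
            s = tuple(x + y for x, y in zip(t, em))
            res[s] = res.get(s, Fraction(0)) + sgn * cm / c
            if res[s] == 0:
                del res[s]
    return res

F = Fraction
G = [{(1,0,0): F(1), (0,1,0): F(-1), (0,0,1): F(-1)},       # x - y - z
     {(1,0,0): F(1), (0,1,0): F(1), (0,0,2): F(-1)},        # x + y - z^2
     {(2,0,0): F(1), (0,2,0): F(1), (0,0,0): F(-1)},        # x^2 + y^2 - 1
     {(0,1,0): F(-2), (0,0,1): F(-1), (0,0,2): F(1)},       # -2y - z + z^2
     {(0,0,4): F(-1,2), (0,0,2): F(-1,2), (0,0,0): F(1)}]   # -z^4/2 - z^2/2 + 1
ok = all(reduce_rem(s_poly(G[i], G[j]), G) == {}
         for i in range(len(G)) for j in range(i + 1, len(G)))
print(ok)          # True: every S-polynomial reduces to zero
```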
\subsection{Reduced Gröbner basis}\inddef{Gröbner basis!reduced}\nevindex{Gröbner, Wolfgang Anton Maria}
In general, the Gröbner basis computed by Buchberger's algorithm is neither unique nor minimal.
Fortunately, both can be achieved by a little finesse.\nevindex{Buchberger, Bruno}
\begin{lemma}
If $G$ is a Gröbner basis of $I\subseteq R$, $g\in G$ and
$\textrm{lt}(g)\in \langle \textrm{lt}(G\setminus \{g\})\rangle$, then
$G\setminus \{g\}$ is a Gröbner basis of $I$ as well.
\end{lemma}
We say that the set $G\subseteq R$ is a \ki{minimal} Gröbner basis for ideal $I=\langle G\rangle$
if it is a Gröbner basis, and for all $g\in G$,\inddef{Gröbner basis!minimal}
\begin{itemize}
\item $\textrm{lc}(g)=1$,
\item $\textrm{lt}(g)\not\in \langle \textrm{lt}(G\setminus \{g\})\rangle$.
\end{itemize}
An element $g\in G$ of a Gröbner basis $G$ is said to be \ki{reduced} with respect to $G$ if
no monomial of $g$ is in the ideal $\langle \textrm{lt}(G\setminus \{g\})\rangle$.
A minimal Gröbner basis $G$ for $I\subseteq R$ is reduced if all of its elements are reduced with respect to $G$.
\inddef{Gröbner basis!reduced}
\begin{tetel}
Every ideal has a unique reduced Gröbner basis.
\end{tetel}
\begin{pelda}
In Example \ref{grpeld}, not only $G$ but also $G'=\{x-y-z,-2y-z+z^2,-1/2z^4-1/2z^2+1\}$ is a
Gröbner basis. It is not hard to show that $G_r=\{x-1/2z^2-1/2z,y-1/2z^2+1/2z,z^4+z^2-2\}$ is a reduced
Gröbner basis.
\end{pelda}
\subsection{The complexity of computing Gröbner bases}\index{Gröbner
basis}\nevindex{Gröbner, Wolfgang Anton Maria} The last forty years
(since Buchberger's dissertation) have not been enough to entirely clear up
the algorithmic complexity of Gröbner basis computation.
Implementation experiments show that we are faced with the
intermediate expression swell phenomenon. \index{intermediate
expression swell} Starting with a few polynomials of low degree and
small coefficients, the algorithm produces a large number of polynomials
with huge degrees and enormous coefficients. Moreover, in contrast to
the variants of the Euclidean algorithm, the explosion cannot be kept
under control. In $1996$, Kühnle and Mayr gave an exponential space
algorithm for computing a reduced Gröbner basis. The polynomial ideal
membership problem over $\Q$ is {\em EXPSPACE}-complete.
Let $f,f_1,\ldots,f_s\in F[x_1,\ldots,x_n]$ be polynomials over a field $F$ with $\deg f_i\leq d$
($\prec=\prec_{tdeg}$).
If $f\in \langle f_1,f_2,\ldots f_s \rangle$, then
\[
f=f_1g_1+\cdots+f_sg_s
\]
for some polynomials $g_1,\ldots,g_s\in F[x_1,\ldots,x_n]$ whose degrees are bounded by
$\beta=\beta(n,d)=(2d)^{2^n}$. The double exponential bound is essentially unavoidable, which is shown by
several examples. Unfortunately, in case $F=\Q$, the ideal membership problem falls into this category.
Fortunately, in special cases better results are available.
If $f=1$ (Hilbert's famous Nullstellensatz), then in case $d=2$, the bound is $\beta=2^{n+1}$,
while for $d>2$, the bound is $\beta=d^n$.
But the variety $V(f_1,\ldots,f_s)$ is empty if and only if $1\in \langle f_1,f_2,\ldots f_s \rangle$,
therefore the solvability problem of a polynomial system is in {\em PSPACE}.
Several results state that under specific circumstances, the (general) ideal membership problem is
also in {\em PSPACE}.
Such a criterion is, for example, that the ideal $\langle f_1,f_2,\ldots f_s \rangle$ is zero-dimensional,
that is, its variety $V(f_1,\ldots,f_s)$ consists of finitely many points.
In spite of the exponential complexity, there are many success stories in the
application of Gröbner bases: geometric theorem proving, robot kinematics and motion planning, and
solving polynomial systems of equations are the most widespread application areas.
In the following, we enumerate some topics where the Gröbner basis strategy has been applied successfully.
\begin{itemize}
\item \textit{Equivalence of polynomial equations.} Two sets of polynomials generate the same ideal
if and only if their reduced Gröbner bases are equal with respect to an arbitrary (but fixed) monomial order.\index{polynomial equations!equivalence}
\item \textit{Solvability of polynomial equations.} The polynomial system of equations
$f_i(x_1,\ldots,x_n)=0,\ 1\leq i\leq s$ is solvable if and only if $1\not\in \langle f_1,\ldots,f_s\rangle$.
\index{polynomial equations!solvability}
\item \textit{Finitely many solutions of polynomial equations.} The polynomial system of equations
$f_i(x_1,\ldots,x_n)=0,\ 1\leq i\leq s$ has a finite number of solutions if and only if in any Gröbner basis
of $\langle f_1,\ldots,f_s\rangle$ for every variable $x_i$, there is a polynomial such that its leading term
with respect to the chosen monomial order is a power of $x_i$.
\index{polynomial equations!finitely many solutions}
\item \textit{The number of solutions.} Suppose that the system of polynomial equations
$f_i(x_1,\ldots,x_n)=0,\ 1\leq i\leq s$ has a finite number of solutions.
Then the number of solutions counted with multiplicities is equal to the cardinality
of the set of monomials that are not multiples of the leading monomials of the polynomials in the Gröbner basis,
where any monomial order can be chosen.
\index{polynomial equations!number of finite solutions}
\item \textit{Simplification of expressions.}\inddef{simplification of expressions}
\end{itemize}
We show an example for the last item.
\begin{pelda}
Let $a,b,c\in \R$ be given such that
\[a+b+c=3,\ \ a^2+b^2+c^2=9,\ \ a^3+b^3+c^3=24\koz .\]
Compute the value of $a^4+b^4+c^4$. So let $f_1=a+b+c-3$, $f_2=a^2+b^2+c^2-9$ and
$f_3=a^3+b^3+c^3-24$ be elements of $\R[a,b,c]$ and let $\prec=\prec_{plex}$, $c\prec b\prec a$.
Then the Gröbner basis of $\langle f_1,f_2,f_3\rangle$ is
\[G=\{a+b+c-3,b^2+c^2-3b-3c+bc,1-3c^2+c^3\}\koz .\]
Since $a^4+b^4+c^4\ \textrm{rem}\ G=69$, the answer to the question follows.
\end{pelda}
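The value $69$ can also be cross-checked without Gröbner bases. Writing $p_k=a^k+b^k+c^k$ and denoting by $e_1,e_2,e_3$ the elementary symmetric polynomials of $a,b,c$, Newton's identities give
\begin{eqnarray*}
e_1&=&p_1=3\koz ,\\
e_2&=&(e_1p_1-p_2)/2=(9-9)/2=0\koz ,\\
e_3&=&(p_3-e_1p_2+e_2p_1)/3=(24-27+0)/3=-1\koz ,\\
p_4&=&e_1p_3-e_2p_2+e_3p_1=3\cdot 24-0\cdot 9+(-1)\cdot 3=69\koz .
\end{eqnarray*}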
\begin{gyak}
\ujgyak Prove that the orders $\prec_{plex}, \prec_{grlex}$ and $\prec_{tdeg}$ are
monomial orders.\gyakindex{order!monomial}
\ujgyak Let $\prec$ be a monomial order on $R$, $f,g\in R\setminus \{0\}$. Prove the following:
\textit{a.} $\textrm{mdeg}(fg)=\textrm{mdeg}(f)+\textrm{mdeg}(g),$
\textit{b.} if $f+g\neq 0$, then $\textrm{mdeg}(f+g)\preceq \max\{\textrm{mdeg}(f),\textrm{mdeg}(g)\},$ where
equality holds if $\textrm{mdeg}(f)\neq \textrm{mdeg}(g)$.
\ujgyak Let $f=2x^4y^2z-3x^4yz^2+4xy^4z^2-5xy^2z^4+6x^2y^4z-7x^2yz^4\in \Q[x,y,z]$.
\textit{a.} Determine the order of the monomials in $f$ for the monomial
orders $\prec_{plex}$, $\prec_{grlex}$ and $\prec_{tdeg}$ with
$z\prec y\prec x$ in all cases.
\textit{b.} For each of the three monomial orders from ($a.$), determine
$\textrm{mdeg}(f)$, $\textrm{lc}(f)$, $\textrm{lm}(f)$ and $\textrm{lt}(f)$.
\hardgyak Prove Dickson's lemma.\nevindex{Dickson, Leonard Eugene}\gyakindex{Dickson's lemma}
\ujgyak Compute the Gröbner basis and the reduced Gröbner basis of the ideal
$I=\langle x^2+y-1,xy-x\rangle\subseteq \Q[x,y]$ using the monomial order $\prec=\prec_{plex}$, where $y\prec x$.
Which of the following polynomials belong to $I$:
$f_1=x^2+y^2-y,f_2=3xy^2-4xy+x+1$?\nevindex{Gröbner, Wolfgang Anton Maria}
\end{gyak}
\
\section{Symbolic integration}\label{integration}\index{symbolic!integration}
The problem of indefinite integration is the following: given a
function $f$, find a function $g$ the derivative of which is $f$, that
is, $g'(x)=f(x)$; for this relationship the notation $\int f(x)\dx=g(x)$
is also used. In introductory calculus courses, one tries to
solve indefinite integration problems by different methods, among which
one tries to choose in a heuristic way: substitution, trigonometric
substitution, integration by parts, etc. Only the integration of
rational functions is usually solved by an algorithmic method.
It can be shown that indefinite integration in the general case is
algorithmically unsolvable. So we only have the possibility to look
for a reasonably large part that can be solved using algorithms.
The first step is the algebraisation of the problem: we discard every
analytical concept and consider differentiation as a new (unary)
algebraic operation connected to addition and multiplication in a
given way, and we try to find the ``inverse'' of this operation.
This approach leads to the introduction of the concept of differential
algebra.\index{differential algebra}
The integration routines of computer algebra systems (e.g.\
$\textsc{Maple}^{\textregistered}$), similarly to us, first try a few heuristic methods.
Integrals of polynomials (or a bit more generally, finite
Laurent-series)\nevindex{Laurent, Pierre Alphonse (1813--1854)} are easy to
determine. This is followed by a simple table lookup process (e.g.\
in case of \textsc{Maple}, 35 basic integrals are used). One can,
of course, use integral tables entered from books as well. Next we may
look for special cases where appropriate methods are known. For
example, for integrands of the form
$$
e^{ax+b}\sin(cx+d)\cdot p(x)\koz ,
$$ where $p$ is a polynomial, the integral can be determined using
integration by parts. When the above methods fail, a form of
substitution called the ``derivative-divides'' method is tried: if
the integrand is a composite expression, then for every sub-expression
$f(x)$, we divide by the derivative of $f$, and check if $x$
vanishes from the result after the substitution $u=f(x)$.
Using these simple methods we can determine a surprisingly large
number of integrals. A further advantage is that they solve simple
problems very quickly. If they do not succeed, we try algorithmic
methods. The first one is for the integration of rational functions.
As we will see, there is a significant difference between the version
used in computer algebra systems and the version used in hand
computations, since the aims are short running times even in
complicated cases, and the simplest possible form of the result. The
Risch\nevindex{Risch, Robert} algorithm for integrating elementary
functions is based on the algorithm for the integration of rational
functions. We describe the Risch algorithm, but not in full
detail. In most cases, we only outline the proofs.
\subsection{Integration of rational functions}\index{integration!of
rational functions}
In this subsection, we introduce the notion of differential field and
differential extension field, then we describe Hermite's
method.\nevindex{Hermite, Charles (1822--1901)}
\subsubsection{Differential fields.}\index{differential field}
Let $K$ be a field of characteristic 0, with a mapping $f\mapsto f'$
of $K$ into itself satisfying:
\cd(1){$(f+g)'=f'+g'$ (additivity);}
\cd(2){$(fg)'=f'g+g'f$
(Leibniz-rule).}\index{Leibniz-rule}\nevindex{Leibniz,
Gottfried Wilhelm (1646--1716)}
\noindent The mapping $f\mapsto f'$ is called a \ki{differential
operator}, \ki{differentiation} or \ki{derivation}, and $K$ is called
a \ki{differential field}\inddef{differential
operator}\inddef{differentiation}\inddef{derivation}\inddef{differential
field}. The set $C=\{c\in K:c'=0\}$ is the \ki{field of
constants}\inddef{field of constants} or \ki{constant
subfield}\inddef{constant subfield} in $K$. If $f'=g$, we also write
$f=\int g$. Obviously, for any constant $c\in C$, we have $f+c=\int
g$. The \ki{logarithmic derivative}\inddef{logarithmic!derivative} of
an element $0\neq f\in K$ is defined as $f'/f$ (the
``derivative of $\log(f)$'').
\begin{tetel}\index{derivation!rules of}
With the notations of the previous definition, the usual rules of
derivation hold:
\cd(1){$0'=1'=(-1)'=0$;}
\cd(2){derivation is $C$-linear: $(af+bg)'=af'+bg'$ for
$f,g\in K$, $a,b\in C$;}
\cd(3){if $g\neq 0$, $f$ is arbitrary, then
$(f/g)'=(f'g-g'f)/g^2$;}
\cd(4){$(f^n)'=nf'f^{n-1}$ for $0\neq f\in K$ and $n\in\Z$;}
\cd(5){$\int fg'=fg-\int gf'$ for $f,g\in K$ (\ki{integration
by parts}).}\inddef{integration!by parts}
\end{tetel}
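The rules above can be checked mechanically. The following minimal Python sketch (an illustration only, not part of the text's formalism) represents a rational function over $\Q$ as a pair of coefficient lists and implements rule (3), the quotient rule:

```python
from fractions import Fraction

# Polynomials over Q as ascending coefficient lists, e.g. [1, 0, 1] = 1 + x^2.

def padd(f, g):
    n = max(len(f), len(g))
    f = f + [Fraction(0)] * (n - len(f))
    g = g + [Fraction(0)] * (n - len(g))
    return [a + b for a, b in zip(f, g)]

def pmul(f, g):
    res = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            res[i + j] += a * b
    return res

def pderiv(f):
    # (sum a_i x^i)' = sum i*a_i x^(i-1)
    d = [Fraction(i) * a for i, a in enumerate(f)][1:]
    return d if d else [Fraction(0)]

def rderiv(num, den):
    # Rule (3), the quotient rule: (f/g)' = (f'g - g'f)/g^2.
    top = padd(pmul(pderiv(num), den),
               [-c for c in pmul(pderiv(den), num)])
    return top, pmul(den, den)

# d/dx (x^2+1)/x: the numerator becomes x^2 - 1, the denominator x^2.
num, den = rderiv([Fraction(1), Fraction(0), Fraction(1)],
                  [Fraction(0), Fraction(1)])
```

Exact rational arithmetic (`Fraction`) stands in for the field $\Q$; the constant subfield is not modelled.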
\begin{pelda} (1) With the notations of the previous definition, the
mapping $f\mapsto 0$ on $K$ is the {\sl trivial derivation\/}, for
this we have $C=K$.
(2) Let $K=\Q(x)$. There exists a single differential operator on
$\Q(x)$ with $x'=1$: the usual differentiation. Indeed, $n'=0$ for
$n\in\N$ by induction, so the elements of $\Z$ and $\Q$ are
constants. By induction, the derivative of power functions is the
usual one; thus, by linearity, the same holds for polynomials, and by
the differentiation rule of quotients, we get the statement for all of
$\Q(x)$. It is not difficult to calculate that for the usual
differentiation, the constants are exactly the elements of $\Q$.
(3) If $K=C(x)$, where $C$ is an arbitrary field of characteristic 0,
then there exists a single differential operator on $K$ with constant
subfield $C$ and $x'=1$: it is the usual differentiation. This
statement is obtained similarly to the previous one.
\end{pelda}
If $C$ is an arbitrary field of characteristic 0, and $K=C(x)$ with
the usual differentiation, then $1/x$ is not the derivative of
anything. (The proof of the statement is very much like the proof of
the irrationality of $\sqrt 2$, but we have to work with divisibility
by $x$ rather than by 2.)
The example shows that for the integration of $1/x$ and other similar
functions, we have to extend the differential field. In order to
integrate rational functions, an extension by logarithms will be
sufficient.
\subsubsection{Extensions of differential fields.}\index{differential
field!extensions of}
Let $L$ be a differential field and $K\subset L$ a subfield of $L$. If
differentiation doesn't lead out of $K$, then we say that $K$ is a
\ki{differential subfield} of $L$, and $L$ is a \ki{differential
extension field} of $K$.\inddef{differential extension field}
\inddef{differential subfield} If for some $f,g\in L$ we have
$f'=g'/g$, that is, the derivative of $f$ is the logarithmic derivative
of $g$, then we write $f=\log g$. (We note that $\log$, just like
$\int$, is a relation rather than a function. In other words, $\log$ is
an abstract concept here and not a logarithm function to a given
base.) If we can choose $g\in K$, we say that $f$ is \ki{logarithmic over
$K$.\/} \index{differential subfield}\index{differential extension
field}\inddef{logarithmic!extension}
\begin{pelda}
(1) Let $g=x\in K=\Q(x)$, $L=\Q(x,f)$, where $f$ is a new
indeterminate, and let $f'=g'/g=1/x$, that is, $f=\log(x)$.
Then $\int 1/x\dx=\log(x)$.
(2) Analogously,
$$
\int{1\over x^2-2}={\sqrt 2\over 4}\log(x-\sqrt 2)-{\sqrt 2\over
4}\log(x+\sqrt 2)
$$
is in the differential field
$\Q(\sqrt2)\bigl(x,\log(x-\sqrt2),\log(x+\sqrt2)\bigr)$.
(3) Since
$$
\int{1\over x^3+x}=\log(x)-{1\over 2}\log(x+i)-{1\over 2}\log(x-i)
=\log(x)-{1\over 2}\log(x^2+1)\koz ,
$$
the integral can be considered as an element of
$$
\Q\bigl(x,\log(x),\log(x^2+1)\bigr)
$$
or an element of
$$
\Q(i)\bigl(x,\log(x),\log(x-i),\log(x+i)\bigr)
$$
as well. Obviously, it is more reasonable to choose the first
possibility, because in this case there is no need to extend the base
field.
\end{pelda}
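The decomposition used in part (3) can be sanity-checked numerically. The short Python snippet below (an illustration only) compares both sides of the partial fraction identity $1/(x^3+x)=1/x-\frac12/(x-i)-\frac12/(x+i)$ at a few sample points:

```python
# Numerical check of 1/(x^3+x) = 1/x - (1/2)/(x-i) - (1/2)/(x+i),
# which yields log(x) - (1/2)log(x+i) - (1/2)log(x-i) upon integration.
def lhs(x):
    return 1 / (x**3 + x)

def rhs(x):
    i = 1j
    return 1 / x - 0.5 / (x - i) - 0.5 / (x + i)

# Agreement at real and complex sample points away from the poles.
for x in (0.5, 2.0, -3.0, 1.0 + 2.0j):
    assert abs(lhs(x) - rhs(x)) < 1e-12
```

A floating-point check at sample points is of course no proof, but it quickly catches sign and coefficient errors in such decompositions.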
\subsubsection{Hermite's method}\index{Hermite-reduction}\nevindex{Hermite, Charles (1822--1901)}
Let $K$ be a field of characteristic 0, $f,g\in K[x]$ non-zero
relatively prime polynomials. To compute the integral $\int f/g$
using Hermite's method, we can find polynomials $a,b,c,d\in K[x]$ with
\begin{equation}
\int{f\over g}={c\over d}+\int{a\over b}\ ,\label{int1}
\end{equation}
where $\deg a<\deg b$ and $b$ is monic and square-free. The rational
function $c/d$ is called the \ki{rational part,} the expression $\int
a/b$ is called the \ki{logarithmic part} of the integral. The method
avoids the factorisation of $g$ into linear factors (in a factor field
or some larger field), and even its decomposition into irreducible
factors over $K$.\inddef{integral!rational
part of}\inddef{integral!logarithmic part of}
Trivially, we may assume that $g$ is monic. By Euclidean division
we have $f=pg+h$, where $\deg h<\deg g$, thus, $f/g=p+h/g.$ The
integration of the polynomial part $p$ is trivial. Let us determine
the square-free factorisation of $g$, that is, find monic and pairwise
relatively prime polynomials $g_1,\ldots,g_m\in K[x]$ such that
$g_m\neq 1$ and $g=g_1g_2^2\cdots g_m^m$. Let us construct the partial
fraction decomposition (this can be achieved by the Euclidean
algorithm):
$$
{h\over g}=\sum_{i=1}^m\sum_{j=1}^i{h_{i,j}\over g_i^j}\ ,
$$
where every $h_{i,j}$ has smaller degree than $g_i$.
The Hermite-reduction\nevindex{Hermite, Charles (1822--1901)} is the iteration of
the following step: if $j>1$, then the integral $\int h_{i,j}/g_i^j$ is
reduced to the sum of a rational function and an integral similar to
the original one, with $j$ reduced by 1. Using that $g_i$ is
square-free, we get $\textrm{gcd}(g_i,g_i')=1$, thus, we can obtain
polynomials $s,t\in K[x]$ by the application of the extended Euclidean
algorithm such that $sg_i+tg_i'=h_{i,j}$ and $\deg s,\deg t<\deg
g_i$. Hence, using integration by parts,
\begin{eqnarray*}
\int{h_{i,j}\over g_i^j}&=&\int{t\cdot g'_i\over g_i^j}+\int{s\over
g_i^{j-1}}\cr
&=&{-t\over(j-1)g_i^{j-1}}+\int{t'\over(j-1)g_i^{j-1}}+\int{s\over g_i^{j-1}}\cr
&=&{-t\over(j-1)g_i^{j-1}}+\int{s+t'/(j-1)\over g_i^{j-1}}\ .
\end{eqnarray*}
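A single reduction step can be made concrete in code. The sketch below (hypothetical helper names; polynomials over $\Q$ stored as ascending coefficient lists) solves $sg_i+tg_i'=h$ with the extended Euclidean algorithm and performs one Hermite step; on $\int 1/(x^2+1)^2$ it yields the rational part $(x/2)/(x^2+1)$ and the new integrand $(1/2)/(x^2+1)$:

```python
from fractions import Fraction

# One Hermite-reduction step (a sketch). Polynomials over Q are ascending
# coefficient lists, e.g. [1, 0, 1] = 1 + x^2.

def trim(f):
    while len(f) > 1 and f[-1] == 0:
        f = f[:-1]
    return f

def padd(f, g):
    n = max(len(f), len(g))
    f = f + [Fraction(0)] * (n - len(f))
    g = g + [Fraction(0)] * (n - len(g))
    return trim([a + b for a, b in zip(f, g)])

def pscale(f, c):
    return trim([c * a for a in f])

def pmul(f, g):
    res = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            res[i + j] += a * b
    return trim(res)

def pderiv(f):
    d = [Fraction(i) * a for i, a in enumerate(f)][1:]
    return trim(d) if d else [Fraction(0)]

def pdivmod(f, g):
    q = [Fraction(0)] * max(1, len(f) - len(g) + 1)
    r = trim(f[:])
    while len(r) >= len(g) and r != [Fraction(0)]:
        d = len(r) - len(g)
        c = r[-1] / g[-1]
        q[d] += c
        r = padd(r, pscale([Fraction(0)] * d + g, -c))
    return trim(q), r

def ext_euclid(f, g):
    # Returns (u, v) with u*f + v*g = 1 for relatively prime f, g.
    r0, r1 = f, g
    u0, u1 = [Fraction(1)], [Fraction(0)]
    v0, v1 = [Fraction(0)], [Fraction(1)]
    while r1 != [Fraction(0)]:
        q, r = pdivmod(r0, r1)
        r0, r1 = r1, r
        u0, u1 = u1, padd(u0, pscale(pmul(q, u1), Fraction(-1)))
        v0, v1 = v1, padd(v0, pscale(pmul(q, v1), Fraction(-1)))
    lc = r0[-1]                      # r0 is a nonzero constant here
    return pscale(u0, 1 / lc), pscale(v0, 1 / lc)

def hermite_step(h, gi, j):
    # From s*gi + t*gi' = h (gi square-free, so gcd(gi, gi') = 1):
    #   int h/gi^j = -t/((j-1)gi^(j-1)) + int (s + t'/(j-1))/gi^(j-1).
    gip = pderiv(gi)
    u, v = ext_euclid(gi, gip)       # u*gi + v*gi' = 1
    q, t = pdivmod(pmul(v, h), gi)   # t = v*h mod gi, so deg t < deg gi
    s = padd(pmul(u, h), pmul(q, gip))
    return (pscale(t, Fraction(-1, j - 1)),
            padd(s, pscale(pderiv(t), Fraction(1, j - 1))))

# int 1/(x^2+1)^2 = (x/2)/(x^2+1) + int (1/2)/(x^2+1):
rat, new = hermite_step([Fraction(1)], [Fraction(1), Fraction(0), Fraction(1)], 2)
```

The pair $(s,t)$ is obtained from the Bézout identity $ug_i+vg_i'=1$ as $t=vh\bmod g_i$ and $s=uh+\textrm{quo}(vh,g_i)\,g_i'$, which keeps $\deg t<\deg g_i$ as required.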
It can be shown that using fast algorithms, if $\deg f,\deg g<n$, then
the procedure above requires $O\bigl(M(n)\log n\bigr)$ operations in
the field $K$, where $M(n)$ is a bound on the cost of multiplying two
polynomials of degree at most $n$.
Hermite's method has a variant that avoids the partial fraction
decomposition of $h/g$. If $m=1$, then $g$ is square-free. If $m>1$, then let
\[
g^*=g_1g_2^2\cdots g_{m-1}^{m-1}={g\over g_m^m}\koz .
\]
Since $\textrm{gcd}(g_m,g^*g'_m)=1$, there exist polynomials $s,t\in
K[x]$ such that
\[
sg_m+tg^*g'_m=h\koz .
\]
Dividing both sides by $g=g^*g_m^m$ and integrating by parts,
\[
\int{h\over g}={-t\over(m-1)g_m^{m-1}}+\int{s+g^*t'/(m-1)\over g^*g_m^{m-1}} \koz ,
\]
thus, $m$ is reduced by one.
Note that $a$ and $c$ can be determined by the method of
undetermined coefficients (Horowitz's method).\nevindex{Horowitz,
Ellis} After \index{Horowitz's method} division, we may assume $\deg
f<\deg g$. As it can be seen from the algorithm, we can choose
$d=g_2g_3^2\cdots g_m^{m-1}$ and $b=g_1g_2\cdots g_m$. Differentiating
(\ref{int1}), we get a system of linear equations on $\deg b$
coefficients of $a$ and $\deg d$ coefficients of $c$, altogether $n$
coefficients. This method in general is not as fast as Hermite's
method.
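As a small illustration of Horowitz's method (an example not taken from the text), consider $\int 1/(x^2+1)^2$: here $g=(x^2+1)^2$, so $d=x^2+1$ and $b=x^2+1$, and the ansatz with undetermined coefficients reads

```latex
\int{1\over(x^2+1)^2}={c_1x+c_0\over x^2+1}+\int{a_1x+a_0\over x^2+1}\koz .
% Differentiating and multiplying by (x^2+1)^2:
1=c_1(x^2+1)-2x(c_1x+c_0)+(a_1x+a_0)(x^2+1)\koz .
% Comparing coefficients: a_1=0, a_0-c_1=0, a_1-2c_0=0, c_1+a_0=1,
% hence c_0=a_1=0 and c_1=a_0=1/2, giving
\int{1\over(x^2+1)^2}={x/2\over x^2+1}+\int{1/2\over x^2+1}\koz .
```

The four linear equations on the four unknown coefficients correspond to $\deg b+\deg d=n=4$ above.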
The algorithm below performs the Hermite-reduction for a rational
function $f/g$ of variable $x.$
\begin{algN}{Hermite-Reduction($f,g$)}\inddef{\textsc{Hermite-Reduction}}
1 \' $p\leftarrow \textrm{quo}(f,g)$\\
2 \' $h\leftarrow \textrm{rem}(f,g)$\\
3 \' $(g[1], \ldots, g[m])\leftarrow$ \textsc{Square-Free}$(g)$\\
4 \' construct the partial fraction decomposition of $(h/g)$, compute
numerators \\
\' $h[i,j]$ belonging to $g[i]^j$\\
5 \' $rac\leftarrow 0$\\
6 \' $int\leftarrow 0$ \\
7 \' \key{for} \= $i\leftarrow 1$ \key{to} $m$ \\
8 \' \> \key{do} \= $int\leftarrow int+h[i,1]/g[i]$\\
9 \' \> \key{for} \= $j\leftarrow 2$ \key{to} $i$ \\
10 \' \> \> \key{do} \= $n\gets j$\\
11 \' \> \> \key{while} \= $n>1$ \\
12 \' \> \> \> \key{do} \= determine $s$ and $t$ from the equation $s\cdot g[i]+t\cdot g[i]'=h[i,n]$\\
13 \' \> \> \> \> $n\gets n-1$\\
14 \' \> \> \> \> $rac \gets rac-(t/n)/g[i]^n$\\
15 \' \> \> \> \> $h[i,n]\gets s+t'/n$\\
16 \' \> \> $int \gets int+h[i,1]/g[i]$\\
17 \' $\var{red} \leftarrow rac+\int p+\int int$ \\
18 \' \key{return} $\var{red}$
\end{algN}
If for some field $K$ of characteristic 0, we want to compute the
integral $\int a/b$, where $a,b\in K[x]$ are non-zero relatively prime
polynomials with $\deg a<\deg b,$ $b$ square-free and monic, we can
proceed by decomposing the polynomial $b$ into linear factors, $b=\prod
_{k=1}^n(x-\alpha_k)$, in its splitting field $L$, then constructing the
partial fraction decomposition, $a/b=\sum_{k=1}^n c_k/(x-\alpha_k)$,
over $L$, and finally integrating to get
\[
\int{a\over b}=\sum_{k=1}^n c_k\log(x-\alpha_k)\in
L\bigl(x,\log(x-\alpha_1),\ldots,\log(x-\alpha_n)\bigr) \koz .
\]
The disadvantage of this method, as we have seen in the example of the
function $1/(x^3+x)$, is that the degree of the extension field $L$
can be too large. An extension degree as large as $n!$ can occur,
which leads to totally unmanageable cases. On the other hand, it is
not clear either if a field extension is needed at all: for example, in
case of the function $1/(x^2-2)$, can we not compute the integral
without extending the base field? The following theorem enables us to
choose the degree of the field extension as small as possible.
\begin{tetel}[Rothstein-Trager integration
algorithm]\inddef{Rothstein-Trager integration algorithm}
\nevindex{Rothstein, Michael}\nevindex{Trager, Barry Marshall}
Let $K$ be a field of characteristic 0, $a,b\in K[x]$ non-zero
relatively prime polynomials, $\deg a<\deg b,$ $b$ square-free and
monic. If $L$ is an algebraic extension of $K$, $c_1,\ldots,c_k\in
L\setminus\{0\}$ are pairwise distinct constants, and
$v_1,\ldots,v_k\in L[x]\setminus L$ are square-free and pairwise
relatively prime monic polynomials, then the following statements are
equivalent:\index{algebraic!extension}
$$
\int{a\over b}=\sum_{i=1}^k c_i\log v_i\koz ,\leqno(1)
$$
\cd(2){The polynomial $r=\textrm{res}_x(b,a-yb')\in K[y]$ can be
decomposed into linear factors over $L$, $c_1,\ldots,c_k$ are
exactly the distinct roots of $r$, and $v_i=\textrm{gcd}(b,a-c_ib')$ for
$i=1,\ldots,k$. Here, $\textrm{res}_x$ is the resultant taken in the
indeterminate $x$.}
\end{tetel}
\begin{pelda}
Let us consider again the problem of computing the integral
$\int1/(x^3+x)\dx$. In this case,
$$
r=\textrm{res}_x\bigl(x^3+x,1-y(3x^2+1)\bigr)=-4y^3+3y+1=-(2y+1)^2(y-1)\koz ,
$$
the roots of which are $c_1=1$ and $c_2=-1/2.$ Thus,
\begin{eqnarray*}
v_1&=&\textrm{gcd}\bigl(x^3+x,1-(3x^2+1)\bigr)=x \koz , \\
v_2&=&\textrm{gcd}\bigl(x^3+x,1+{1\over 2}(3x^2+1)\bigr)=x^2+1 \koz.
\end{eqnarray*}
\end{pelda}
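The computations of this example can be verified mechanically. In the Python sketch below (an illustration only), the resultant is evaluated as a Sylvester determinant over $\Q$ and compared with $-4y^3+3y+1$ at five sample points (both sides have degree at most 3 in $y$, so five agreements prove equality), and the two gcds are computed by the Euclidean algorithm:

```python
from fractions import Fraction

# Verifying: r(y) = res_x(x^3+x, 1-y(3x^2+1)) = -4y^3+3y+1,
# with v1 = gcd for the root 1 and v2 = gcd for the root -1/2.
# Polynomials over Q are ascending coefficient lists.

def trim(f):
    while len(f) > 1 and f[-1] == 0:
        f = f[:-1]
    return f

def padd(f, g):
    n = max(len(f), len(g))
    f = f + [Fraction(0)] * (n - len(f))
    g = g + [Fraction(0)] * (n - len(g))
    return trim([a + b for a, b in zip(f, g)])

def pscale(f, c):
    return trim([c * a for a in f])

def prem(f, g):
    r = trim(f[:])
    while len(r) >= len(g) and r != [Fraction(0)]:
        d = len(r) - len(g)
        c = r[-1] / g[-1]
        r = padd(r, pscale([Fraction(0)] * d + g, -c))
    return r

def pgcd(f, g):
    while g != [Fraction(0)]:
        f, g = g, prem(f, g)
    return pscale(f, 1 / f[-1])          # monic gcd

def det(m):
    # Exact determinant over Q by Gaussian elimination.
    m = [row[:] for row in m]
    n, d = len(m), Fraction(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if m[r][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            m[c], m[piv] = m[piv], m[c]
            d = -d
        d *= m[c][c]
        for r in range(c + 1, n):
            fac = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= fac * m[c][k]
    return d

def resultant(f, g):
    # res(f, g) as the determinant of the Sylvester matrix.
    f, g = trim(f), trim(g)
    n, m = len(f) - 1, len(g) - 1
    rows = [[Fraction(0)] * i + list(reversed(f)) + [Fraction(0)] * (m - 1 - i)
            for i in range(m)]
    rows += [[Fraction(0)] * i + list(reversed(g)) + [Fraction(0)] * (n - 1 - i)
             for i in range(n)]
    return det(rows)

b = [Fraction(0), Fraction(1), Fraction(0), Fraction(1)]     # x^3 + x

def a_minus_ybp(y):
    return trim([Fraction(1) - y, Fraction(0), -3 * y])      # 1 - y(3x^2+1)

for y in map(Fraction, (-2, -1, 0, 2, 3)):
    assert resultant(b, a_minus_ybp(y)) == -4 * y**3 + 3 * y + 1

v1 = pgcd(b, a_minus_ybp(Fraction(1)))       # the polynomial x
v2 = pgcd(b, a_minus_ybp(Fraction(-1, 2)))   # the polynomial x^2 + 1
```

Production systems compute the resultant with the subresultant algorithm rather than a determinant; the determinant is used here only because it is short and clearly correct.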
The algorithm, which can be easily given based on the previous theorem, can
be slightly improved: instead of computing
$v_i=\textrm{gcd}(b,a-c_ib')$ (by calculations over the field $L$),
$v_i$ can also be computed over $K$, applying the
\textsc{Extended-Euclidean-Normalised} algorithm. This was discovered
by Trager\nevindex{Trager, Barry Marshall},
and independently by Lazard\nevindex{Lazard, Daniel} and
Rioboo\nevindex{Rioboo, Renaud}. It is not difficult to show that the
running time of the complete integration algorithm obtained this way
is $O\bigl(nM(n)\lg n\bigr)$ if $\deg f,\deg g<n$. The algorithm below
computes the logarithmic part $\int a/b$ of the integral by this
improved method.
\begin{algN}{Integrate-Logarithmic-Part($a,b,x,y$)}\inddef{\textsc{Integrate-Logarithmic-Part}}
1 \' $r(y)\gets\textrm{res}_x\bigl(b(x),a(x)-y\cdot b(x)'\bigr)$, computed by the subresultant algorithm\\
2 \' let $w_i(x,y)$ be the remainder of degree $i$ in $x$ from this computation\\
3 \' $(r_1(y),\ldots,r_m(y))\gets$ \textsc{Square-Free}$(r(y))$\\
4 \' $int\gets 0$\\
5 \' \key{for} \= $i\gets 1$ \key{to} $m$ \\
6 \' \> \key{do} \= \key{if} \= $r_i(y)\neq 1$ \\
7 \' \> \> \> \key{then} \= $w(y)\gets$
the \textrm{gcd} of the coefficients of $w_i(x,y)$\\
8 \' \> \> \> \> $(l(y),s(y),t(y))\gets$ \\
\' \> \> \> \> \textsc{Extended-Euclidean-Normalised}$(w(y),r_i(y))$\\
9 \' \> \> \> \> $w_i(x,y)\gets$\textsc{Primitive-Part}$(\textrm{rem}(s(y)\cdot w_i(x,y),r_i(y)))$\\
10 \' \> \> \> \> $(r_{i,1},\ldots,r_{i,k})\gets$\textsc{Factors}$(r_i)$\\
11 \' \> \> \> \> \key{for} \= $j\gets 1$ \key{to} $k$ \\
12 \' \> \> \> \> \> \key{do} \= $d\gets \deg(r_{i,j})$\\
13 \' \> \> \> \> \> \> $c\gets$\textsc{Solve}$(r_{i,j}(y)=0,y)$\\
14 \' \> \> \> \> \> \> \key{if} \= $d=1$ \\
15 \' \> \> \> \> \> \> \> \key{then} \= $int \gets int+c\cdot\log(w_i(x,c))$\\
16 \' \> \> \> \> \> \> \> \key{else} \> \key{for} \= $n\gets 1$ \key{to} $d$ \\
17 \' \> \> \> \> \> \> \> \> \> \key{do} $int \gets int +c[n]\cdot\log(w_i(x,c[n]))$\\
18 \' \key{return} $int$
\end{algN}
\begin{pelda}
Let us consider again the problem of computing the integral
$\int1/(x^2-2)\dx$. In this case,
$$
r=\textrm{res}_x(x^2-2,1-y\cdot2x)=-8y^2+1\koz .
$$
The polynomial is irreducible in $\Q[y]$, thus, we cannot avoid the
extension of $\Q$. The roots of $r$ are $\pm1/\sqrt8$. From the
\textsc{Extended-Euclidean-Normalised}-algorithm over $\Q(y)$,
$w_1(x,y)=x-1/(2y)$, thus, the integral is
$$
\int{1\over x^2-2}\dx
={1\over\sqrt8}\log(x-\sqrt2)-{1\over\sqrt8}\log(x+\sqrt2) \koz .
$$
\end{pelda}
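A quick numerical check (not part of the algorithm) confirms the result: the derivative of the right-hand side, approximated by central differences for $x>\sqrt2$, agrees with $1/(x^2-2)$:

```python
import math

# F(x) = (1/sqrt(8))ln(x - sqrt(2)) - (1/sqrt(8))ln(x + sqrt(2))
# should satisfy F'(x) = 1/(x^2 - 2) on (sqrt(2), infinity).
def F(x):
    s8, s2 = math.sqrt(8), math.sqrt(2)
    return math.log(x - s2) / s8 - math.log(x + s2) / s8

for x in (1.6, 2.0, 5.0, 10.0):
    h = 1e-6
    deriv = (F(x + h) - F(x - h)) / (2 * h)   # central difference
    assert abs(deriv - 1 / (x * x - 2)) < 1e-7
```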
\subsection{The Risch integration algorithm}\index{Risch integration algorithm}\nevindex{Risch, Robert}
Surprisingly, the methods found for the integration of rational
functions can be generalised for the integration of expressions
containing basic functions ($\sin$, $\textrm{exp}$, etc.) and their
inverses. Computer algebra systems can compute the integral of
remarkably complicated functions, but sometimes they fail in seemingly
very simple cases, for example the expression $\int x/(1+e^x)\dx$ is
returned unevaluated, or the result contains a special
non-elementary function, for example the logarithmic
integral.\index{logarithmic integral} This is due to the fact that in
such cases, the integral cannot be given in ``closed form''.
Although the basic results for integration in ``closed form'' had
been discovered by Liouville\nevindex{Liouville, Joseph} in 1833, the
corresponding algorithmic methods were only developed by
Risch\nevindex{Risch, Robert} in 1968.
\subsubsection{Elementary functions}\index{elementary!functions}
The functions usually referred to as functions in ``closed form''
are the ones composed of rational functions, exponential functions,
logarithmic functions, trigonometric and hyperbolic functions, their
inverses and $n$-th roots (or more generally ``inverses'' of
polynomial functions, that is, solutions of polynomial equations); that
is, any nesting of the above functions is also a function in
``closed form''.
One might note that while $\int1/(1+x^2)\dx$ is usually given in the form
$\textrm{arctg}(x)$, the algorithm for the integration of rational
functions returns
$$
\int{1\over 1+x^2}\dx={i\over 2}\log(x+i)-{i\over 2}\log(x-i)
$$
as solution. Since trigonometric and hyperbolic functions and their
inverses over $\C$ can be expressed in terms of exponentials and
logarithms, we can restrict our attention to exponentials and
logarithms. Surprisingly, it also turns out that in the general case
the only new extensions needed are logarithms (besides algebraic numbers).
\subsubsection{Exponential elements.}\index{exponential element}
Let $L$ be a differential extension field\index{differential extension
field} of the differential field $K$. If for a $\theta\in L$, there
exists a $u\in K$ such that $\theta'/\theta=u'$, that is, the
logarithmic derivative\index{logarithmic!derivative} of $\theta$
equals the derivative of an element of $K$, then we say that $\theta$
is \ki{exponential} over $K$ and we write
$\theta=\textrm{exp}(u)$.\inddef{exponential element} If only the
following is true: for an
element $\theta\in L$, there is a $u\in K$ such that
$\theta'/\theta=u$, that is, the logarithmic derivative of $\theta$ is
an element of $K$, then $\theta$ is called \ki{hyperexponential} over
$K$.\inddef{hyperexponential element}
Logarithmic, exponential or hyperexponential elements may be algebraic
or transcendental over
$K$.\index{algebraic!element}\index{logarithmic!element}\inddef{transcendental!element}
\subsubsection{Elementary extensions.}\index{elementary!extensions}
Let $L$ be a differential extension field\index{differential extension
field} of the differential field $K$. If
$$
L=K(\theta_1,\theta_2,\ldots,\theta_n) \koz ,
$$
where for $j=1,2,\ldots,n$, $\theta_j$ is logarithmic, exponential or
algebraic over the field
$$
K_{j-1}=K(\theta_1,\ldots,\theta_{j-1})
$$
($K_0=K$), then $L$ is called an \ki{elementary extension} of $K$. If
for $j=1,2,\ldots,n$, $\theta_j$ is either transcendental and
logarithmic, or transcendental and exponential over $K_{j-1}$, then $L$ is
a \ki{transcendental elementary extension} of
$K$.\inddef{elementary!extension}\inddef{transcendental!elementary
extension}
\medskip
Let $\C(x)$ be the differential field of rational functions with the
usual differentiation and constant subfield $\C$. An elementary
extension of $\C(x)$ is called a field of elementary functions, a
transcendental elementary extension of $\C(x)$ is called a field of
transcendental elementary functions.\inddef{field!of elementary
functions}\inddef{field!of transcendental elementary functions}
\begin{pelda}\label{elemki1}
The function $f=\textrm{exp}(x)+\textrm{exp}(2x)+\textrm{exp}(x/2)$
can be written in the form
$f=\theta_1+\theta_2+\theta_3\in\Q(x,\theta_1,\theta_2,\theta_3)$,
where $\theta_1=\textrm{exp}(x)$, $\theta_2=\textrm{exp}(2x)$,
$\theta_3=\textrm{exp}(x/2)$. Trivially, $\theta_1$ is exponential
over $\Q(x)$, $\theta_2$ is exponential over $\Q(x,\theta_1)$ and
$\theta_3$ is exponential over $\Q(x,\theta_1,\theta_2)$. Since
$\theta_2=\theta_1^2$ and $\Q(x,\theta_1,\theta_2)=\Q(x,\theta_1)$, $f$
can be written in the simpler form
$f=\theta_1+\theta_1^2+\theta_3$. The function $\theta_3$ is not only
exponential but also algebraic over $\Q(x,\theta_1)$, since
$\theta_3^2-\theta_1=0$, that is, $\theta_3=\theta_1^{1/2}$. So
$f=\theta_1+\theta_1^2+\theta_1^{1/2}\in\Q(x,\theta_1,\theta_1^{1/2})$.
But $f$ can be put in an even simpler form:
$$
f=\theta_3^2+\theta_3^4+\theta_3\in\Q(x,\theta_3)\koz .
$$
\vspace{-4mm}
\end{pelda}
\vspace{-6mm}
\begin{pelda}\label{elemki2}
The function
$$
f=\sqrt{\log(x^2+3x+2)\bigl(\log(x+1)+\log(x+2)\bigr)}
$$
can be written in form
$f=\theta_4\in\Q(x,\theta_1,\theta_2,\theta_3,\theta_4)$, where
$\theta_1=\log(x^2+3x+2)$, $\theta_2=\log(x+1)$, $\theta_3=\log(x+2)$,
and $\theta_4$ satisfies the algebraic equation
$\theta_4^2-\theta_1(\theta_2+\theta_3)=0$. But $f$ can also be given in
the much simpler form $f=\theta_1\in\Q(x,\theta_1)$.
\end{pelda}
\begin{pelda}\label{elemki3}
The function $f=\textrm{exp}\bigl(\log(x)/2\bigr)$ can be written in
the form $f=\theta_2\in\Q(x,\theta_1,\theta_2)$, where
$\theta_1=\log(x)$ and $\theta_2=\textrm{exp}(\theta_1/2)$, so
$\theta_1$ is logarithmic over $\Q(x)$, and $\theta_2$ is exponential
over $\Q(x,\theta_1)$. But $\theta_2^2-x=0$, so $\theta_2$ is
algebraic over $\Q(x)$, and $f(x)=x^{1/2}$.
\end{pelda}
\subsubsection{The integration of elementary
functions.}\index{integration!of elementary functions}
The integral of an element of a field of elementary
functions\index{field!of elementary functions} will be completely
characterised by Liouville's Principle\index{Liouville's Principle} in
case it is an elementary function. Algebraic extensions,
however, cause
great difficulty if not only the constant field is extended.
Here we only deal with the integration of elements of fields of
transcendental elementary functions by the Risch integration
algorithm.\index{Risch integration algorithm}\nevindex{Risch, Robert}
In practice, this means an element of the field of transcendental
elementary functions
$\Q(\alpha_1,\ldots,\alpha_k)(x,\theta_1,\ldots,\theta_n)$, where
$\alpha_1,\ldots,\alpha_k$ are algebraic over $\Q$ and the integral is
an element of the field
$$
\Q(\alpha_1,\ldots,\alpha_k,\ldots,\alpha_{k+h})(x,\theta_1,\ldots,\theta_n,\ldots,\theta_{n+m})
$$ of elementary functions. In principle, it would be simpler to choose
$\C$ as constant subfield but, as we have seen in the case of rational
functions, this is impossible, for we can only compute in an exact way
in special fields like algebraic number fields\index{algebraic!number
field}; and we even have to keep the number and degrees of
$\alpha_{k+1},\ldots,\alpha_{k+h}$ as small as possible. Nevertheless,
we will deal with algebraic extensions\index{algebraic!extension} of
the constant subfield dynamically: we can imagine that the necessary
extensions have already been made, while in practice, we only perform
extensions when they become necessary.
After the conversion of trigonometric and hyperbolic functions (and
their inverses) to exponentials (and logarithms, respectively), the
integrand becomes an element of a field of elementary
functions. Examples \ref{elemki1} and \ref{elemki2} show that there
are functions that do not seem to be elements of a transcendental
elementary extension ``at first sight'', and yet they are; while
Example \ref{elemki3} shows that there are functions that seem to be
elements of such an extension ``at first sight'', and yet they are
not. The first step is to represent the integrand as an element of a
field of transcendental elementary functions using the algebraic
relationships between the different exponential and logarithmic
functions.\index{logarithmic!function} We will not consider how this
can be done. It can be verified whether we succeeded by the
following structure theorem by Risch\nevindex{Risch, Robert}. We omit
the proof of the theorem. We will need a definition.
\smallskip
An element $\theta$ is \ki{monomial} over a differential field $K$ if
$K$ and $K(\theta)$ have the same constant field, $\theta$ is
transcendental over $K$ and it is either exponential or logarithmic
over $K$.\inddef{monomial!element}
\begin{tetel}[Structure theorem]
Let $K$ be the field of constants and $K_n=K(x,\theta_1,\ldots,\theta_n)$
a differential extension field of $K(x)$, which has constant field
$K$. Let us assume that for all $j$, either $\theta_j$ is algebraic
over $K_{j-1}=K(x,\theta_1,\ldots,\theta_{j-1})$, or
$\theta_j=w_j$, with $w_j=\log(u_j)$ and $u_j\in K_{j-1}$, or
$\theta_j=u_j$, with $u_j=\textrm{exp}(w_j)$ and $w_j\in K_{j-1}$.
Then
\begin{enumerate}
\item {$g=\log(f)$, where $f\in K_n\setminus K$, is monomial over
$K_n$ if and only if there is no product\index{differential extension}
$$
f^k\cdot\prod u_j^{k_j},\quad k,k_j\in\Z, k\neq 0
$$
which is an element of $K$;}
\item
{$g=\textrm{exp}(f)$, where $f\in K_n\setminus K$, is monomial over
$K_n$ if and only if there is no
linear combination
$$
f+\sum c_jw_j,\quad c_j\in\Q
$$
which is an element of $K$.} \end{enumerate}
\noindent Product and summation is only taken for logarithmic and
exponential steps.
\end{tetel}
\noindent
The most important classic result of the entire theory is the
following theorem.
\begin{tetel}[Liouville's Principle]\inddef{Liouville's
Principle}\nevindex{Liouville, Joseph}
Let $K$ be a differential field with constant field $C$. Let $L$ be a
differential extension field of $K$ with the same constant field. Let
us assume that $g'=f\in K$. Then there exist constants
$c_1,\ldots,c_m\in C$ and elements
$v_0,v_1,\ldots,v_m\in K$ such that
$$
f=v'_0+\sum_{j=1}^m c_j{v'_j\over v_j}\koz ,
$$
that is,
$$
g=\int f=v_0+\sum_{j=1}^m c_j\log(v_j)\koz .
$$
\end{tetel}
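For instance, the integral of $1/(x^2-2)$ computed in an earlier example fits Liouville's form with $K=\Q(\sqrt2)(x)$, $C=\Q(\sqrt2)$, $v_0=0$ and two logarithmic terms:

```latex
{1\over x^2-2}=v_0'+c_1{v_1'\over v_1}+c_2{v_2'\over v_2}\koz ,
\qquad v_0=0,\quad
c_1={\sqrt2\over4},\ v_1=x-\sqrt2,\quad
c_2=-{\sqrt2\over4},\ v_2=x+\sqrt2\koz .
```

Indeed, $c_1/(x-\sqrt2)+c_2/(x+\sqrt2)=({\sqrt2/4})\cdot 2\sqrt2/(x^2-2)=1/(x^2-2)$, in agreement with that example.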
Note that the situation is similar to the case of rational functions.
\medbreak
We will not prove this theorem. Although the proof is lengthy, the
idea of the proof is easy to describe. First we show that a
transcendental exponential extension cannot be ``eliminated'', that
is, differentiating a rational function of it, the new element does not
vanish. This is due to the fact that differentiating a polynomial of
an element of the transcendental exponential extension, we get a
polynomial of the same degree, and the original polynomial does not
divide the derivative, except for the case when the original polynomial is a
monomial. Next we show that no algebraic extension is needed to
express the integral. This is essentially due to the fact that
substituting an element of an algebraic extension into its minimal
polynomial, we get zero, and differentiating this equation, we get the
derivative of the extending element as a rational function of the
element. Finally, we have to examine extensions by transcendental
logarithmic elements. We show that such an extending element can be
eliminated by differentiation if and only if it appears in a linear
polynomial with constant leading coefficient. This is due to the fact
that differentiating a polynomial of such an extending element, we get
a polynomial the degree of which is either the same or is reduced by
one, the latter case only being possible when the leading coefficient
is constant.
\subsubsection{The Risch algorithm.}\inddef{Risch algorithm}\nevindex{Risch, Robert}
Let $K$ be an algebraic number field over $\Q$,\index{algebraic!number
field} and $K_n=K(x,\theta_1,\ldots,\theta_n)$ a field of
transcendental elementary functions.\index{field!of elementary
functions} The algorithm is recursive in $n$: using the notation
$\theta=\theta_n$, we will integrate a function $f(\theta)/g(\theta)\in
K_n=K_{n-1}(\theta)$, where
$K_{n-1}=K(x,\theta_1,\ldots,\theta_{n-1})$. (The case $n=0$ is the
integration of rational functions.) We may assume that $f$ and $g$
are relatively prime and $g$ is monic. Besides differentiation with respect
to $x$, we will also use differentiation with respect to $\theta$, which
we denote by $d/d\theta$. In the following, we will only present the
algorithms.
\subsubsection{Risch algorithm: logarithmic case.}\index{Risch
algorithm!logarithmic case}
Using the notations of the previous paragraph, we first presume that
$\theta$ is transcendental and logarithmic, $\theta'=u'/u$, $u\in
K_{n-1}$. By Euclidean division,
$f(\theta)=p(\theta)g(\theta)+h(\theta)$,
hence, \index{transcendental!element}\index{logarithmic!element}
$$
\int{f(\theta)\over g(\theta)}=\int p(\theta)+\int{h(\theta)\over g(\theta)} \koz .
$$
Unlike the integration of rational functions, here the integration of
the polynomial part is more difficult. Therefore, we begin with the
integration of the rational part.
\bigbreak
\subsubsection{Logarithmic case, rational part.}
Let $g(\theta)=g_1(\theta)g_2^2(\theta)\cdots g_m^m(\theta)$ be the
square-free factorisation of $g(\theta)$. Then
$$
\textrm{gcd} \left ( g_j(\theta),{d\over d\theta}g_j(\theta) \right ) =1
$$
is obvious. It can be shown that the much stronger condition
$\textrm{gcd}\bigl(g_j(\theta),g_j(\theta)'\bigr)=1$ is also
fulfilled. By partial fraction decomposition,\index{partial fraction
decomposition}
$$
{h(\theta)\over g(\theta)}=\sum_{i=1}^m\sum_{j=1}^i{h_{i,j}(\theta)\over g_i(\theta)^j}\koz .
$$
We use Hermite-reduction\index{Hermite-reduction}: using the
extended Euclidean algorithm we get polynomials
$s(\theta),t(\theta)\in K_{n-1}[\theta]$ that satisfy
$s(\theta)g_i(\theta)+t(\theta)g_i(\theta)'=h_{i,j}(\theta)$ and
$\deg s(\theta),\deg t(\theta)<\deg g_i(\theta)$. Using integration by
parts,
\begin{eqnarray*}
\int{h_{i,j}(\theta)\over g_i^j(\theta)}
&=&\int{t(\theta)\cdot g_i(\theta)'\over g_i(\theta)^j}+\int{s(\theta)\over
g_i(\theta)^{j-1}}\cr
&=&{-t(\theta)\over(j-1)g_i(\theta)^{j-1}}
+\int{t(\theta)'\over(j-1)g_i(\theta)^{j-1}}+\int{s(\theta)\over g_i(\theta)^{j-1}}\cr
&=&{-t(\theta)\over(j-1)g_i(\theta)^{j-1}}+\int{s(\theta)+t(\theta)'/(j-1)\over g_i(\theta)^{j-1}} \koz .
\end{eqnarray*}
Continuing this procedure while $j>1$, we get
$$
\int{h(\theta)\over g(\theta)}={c(\theta)\over d(\theta)}+\int{a(\theta)\over b(\theta)}\koz ,
$$
where $a(\theta),b(\theta),c(\theta),d(\theta)\in K_{n-1}[\theta]$, $\deg a(\theta)<\deg
b(\theta)$ and $b(\theta)$ is a square-free and monic polynomial.
It can be shown that the
Rothstein-Trager method\index{Rothstein-Trager method}\nevindex{Rothstein,
Michael}\nevindex{Trager, Barry Marshall} can be applied to compute
the integral $\int a(\theta)/b(\theta)$. Let us calculate the resultant
$$
r(y)=\textrm{res}_\theta\bigl(b(\theta),a(\theta)-y\cdot
b(\theta)'\bigr)\koz .
$$
\index{resultant} It can be shown that the integral is elementary
if and only if $r(y)$ is of the form $r(y)=\o r(y)s$, where $\o r(y)\in
K[y]$ and $s\in K_{n-1}$. We compute the primitive part of $r(y)$ and
choose it as $\o r(y)$; if some coefficient of $\o r(y)$ is not
constant, then there is no elementary integral. Otherwise, let
$c_1,\ldots,c_k$ be the distinct roots of $\o r(y)$ in
its factor field and let
$$
v_i(\theta)=\textrm{gcd}\bigl(b(\theta),a(\theta)-c_ib(\theta)'\bigr)\in
K_{n-1}(c_1,\ldots,c_k)[\theta]
$$
for $i=1,\ldots,k$. It can be shown that
$$
\int{a(\theta)\over b(\theta)}=\sum_{i=1}^k c_i\log\bigl(v_i(\theta)\bigr) \koz .
$$
Let us consider a few examples.
\begin{pelda}\label{lograc1}
The integrand of the integral $\int 1/\log(x)$
is $1/\theta\in\Q(x,\theta)$, where $\theta=\log(x)$. Since
$$
r(y)=\textrm{res}_\theta(\theta,1-y/x)=1-y/x\in\Q(x)[y]
$$
is a primitive polynomial and it has a coefficient that is not
constant, the integral is not elementary.
\end{pelda}
\begin{pelda}
The integrand of the integral $\int1/\bigl(x\log(x)\bigr)$
is $1/(x\theta)\in\Q(x,\theta)$, where $\theta=\log(x)$. Here,
$$
r(y)=\textrm{res}_\theta(\theta,1/x-y/x)=1/x-y/x\in\Q(x)[y] \koz,
$$
which has primitive part $1-y$. Every coefficient of this is constant,
so the integral is elementary, $c_1=1$,
$v_1(\theta)=\textrm{gcd}(\theta,1/x-1/x)=\theta$, so
$$
\int{1\over
x\log(x)}=c_1\log\bigl(v_1(\theta)\bigr)=\log\bigl(\log(x)\bigr) \koz .
$$
\end{pelda}
\subsubsection{Logarithmic case, polynomial part}
The remaining problem is the integration of the polynomial part
$$
p(\theta)=p_k\theta^k+p_{k-1}\theta^{k-1}+\cdots+p_0\in
K_{n-1}[\theta]\koz .
$$
According to Liouville's Principle \index{Liouville's
Principle}\nevindex{Liouville, Joseph}$\int p(\theta)$ is elementary
if and only if
$$
p(\theta)=v_0(\theta)'+\sum_{j=1}^m c_j{v_j(\theta)'\over v_j(\theta)}\koz ,
$$
where $c_j\in \oK$ for $j=1,\ldots,m$ and $v_j\in\oK_{n-1}(\theta)$ for
$j=0,1,\ldots,m$; here $\oK$ is an extension of $K$ and
$\oK_{n-1}=\oK(x,\theta_1,\ldots,\theta_{n-1})$. We will show that
$\oK$ can be an algebraic extension of
$K$\index{algebraic!extension}. A similar reasoning to the proof of
Liouville's Principle shows that $v_0(\theta)\in\oK_{n-1}[\theta]$ and
$v_j(\theta)\in\oK_{n-1}$ (that is, independent of $\theta$) for
$j=1,2,\ldots,m$. Thus,
$$
p(\theta)=v_0(\theta)'+\sum_{j=1}^m c_j{v_j'\over v_j}\koz .
$$
We also get, by the reasoning used in the proof of Liouville's
Principle,\index{Liouville's Principle}\nevindex{Liouville, Joseph}
that the degree of $v_0(\theta)$ is at most $k+1$. So if
$v_0(\theta)=q_{k+1}\theta^{k+1}+q_k\theta^k+\cdots+q_0$, then
$$
p_k\theta^k+p_{k-1}\theta^{k-1}+\cdots+p_0
=(q_{k+1}\theta^{k+1}+q_k\theta^k+\cdots+q_0)'+\sum_{j=1}^m c_j{v'_j\over
v_j}\koz .
$$
Hence, we get the following system of equations:
\begin{eqnarray*}
0&=&q_{k+1}'\koz ,\cr
p_k&=&(k+1)q_{k+1}\theta'+q'_k\koz ,\cr
p_{k-1}&=&kq_k\theta'+q'_{k-1}\koz ,\cr
&\vdots\cr
p_1&=&2q_2\theta'+q'_1\koz ,\cr
p_0&=&q_1\theta'+\o q'_0\koz ,
\end{eqnarray*}
where $\o q_0=q_0+\sum_{j=1}^m c_j\log(v_j)$ in the last equation.
The solution of the first equation is simply a constant
$b_{k+1}$. Substituting this into the next equation and integrating
both sides, we get
$$
\int p_k=(k+1)b_{k+1}\cdot\theta+q_k\koz .
$$
Applying the integration procedure recursively, the integral of
$p_k\in K_{n-1}$ can be computed, but the equation can only be solved
if this integral is elementary, uses at most one logarithmic
extension, and that extension is exactly
$\theta=\log(u)$.\index{logarithmic!extension} If this is not fulfilled,
then $\int p(\theta)$ cannot be elementary. If it is fulfilled, then
$\int p_k=c_k\theta+d_k$ for some $c_k\in\oK$ and
$d_k\in\oK_{n-1}$, hence, $b_{k+1}=c_{k+1}/(k+1)\in\oK$ and
$q_k=d_k+b_k$ with an arbitrary integration constant
$b_k$. Substituting for $q_k$ into the next equation and rearranging, we get
$$
p_{k-1}-kd_k\theta'=kb_k\theta'+q'_{k-1}\koz ,
$$
so we have
$$
\int\Bigl(p_{k-1}-kd_k{u'\over u}\Bigr)=kb_k\theta+q_{k-1}
$$
after integration. The integrand on the left hand side is in $K_{n-1}$, so we
can call the integration procedure in a recursive way. Just like
above, the equation can only be solved when the integral is elementary,
uses at most one logarithmic extension, and that extension is exactly
$\theta=\log(u)$. Let us assume that this is the case and
$$
\int\Bigl(p_{k-1}-kd_k{u'\over u}\Bigr)=c_{k-1}\theta+d_{k-1}\koz ,
$$
where $c_{k-1}\in\oK$ and $d_{k-1}\in\oK_{n-1}$. Then the solution is
$b_k=c_{k-1}/k\in\oK$ and $q_{k-1}=d_{k-1}+b_{k-1}$, where $b_{k-1}$
is an arbitrary integration constant. Continuing the procedure, the
solution of the penultimate equation is $b_2=c_1/2\in\oK$
and $q_1=d_1+b_1$ with an integration constant $b_1$. Substituting for
$q_1$ into the last equation, after rearrangement and integration, we get
$$
\int\Bigl(p_0-d_1{u'\over u}\Bigr)=b_1\theta+\o q_0 \koz .
$$
This time the only condition is that the integral should be an
elementary function. If it is elementary, say
$$
\int\Bigl(p_0-d_1{u'\over u}\Bigr)=d_0\in\oK_{n-1}\koz ,
$$
then $b_1\in\oK$ is the coefficient of $\theta=\log(u)$ in $d_0$, $\o
q_0=d_0-b_1\log(u)$, and the result is
$$
\int p(\theta)=b_{k+1}\theta^{k+1}+q_k\theta^k+\cdots+q_1\theta+\o q_0 \koz .
$$
Let us consider a few examples.
\begin{pelda}
The integrand of the integral $\int\log(x)$ is
$\theta\in\Q(x,\theta)$, where $\theta=\log(x)$. If the integral is
elementary, then
$$
\int\theta=b_2\theta^2+q_1\theta+\o q_0
$$
and $0=b'_2$, $1=2b_2\theta'+q'_1$, $0=q_1\theta'+\o q'_0$. With the
unknown constant $b_2$ from the second equation, $\int
1=2b_2\theta+q_1$. Since $\int 1=x+b_1$, we get $b_2=0$,
$q_1=x+b_1$. From the third equation $-x\theta'=b_1\theta'+\o q'_0$.
Since $\theta'=1/x$, after integration $\int-1=b_1\theta+\o q_0$ and
$\int-1=-x$, we get $b_1=0$, $\o q_0=-x$, hence, $\int\log(x)=x\log(x)-x$.
\end{pelda}
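As a quick sanity check, the same antiderivative can be obtained from a general-purpose system; a minimal sympy sketch (the choice of system is ours, not the book's):

```python
import sympy as sp

x = sp.symbols('x')

# sympy reproduces the hand computation above
F = sp.integrate(sp.log(x), x)
print(F)  # x*log(x) - x
```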
\begin{pelda}
The integrand of the integral $\int\log\bigl(\log(x)\bigr)$ is
$\theta_2\in\Q(x,\theta_1,\theta_2)$, where $\theta_1=\log(x)$ and
$\theta_2=\log(\theta_1)$. If the integral is elementary, then
$$
\int\theta_2=b_2\theta_2^2+q_1\theta_2+\o q_0
$$
and $0=b'_2$, $1=2b_2\theta_2'+q'_1$, $0=q_1\theta_2'+\o q'_0$.
With the unknown constant $b_2$ from the second equation, $\int
1=2b_2\theta_2+q_1$. Since $\int 1=x+b_1$, we get $b_2=0$,
$q_1=x+b_1$. From the third equation $-x\theta_2'=b_1\theta_2'+\o
q'_0$. Since $\theta'_2=\theta'_1/\theta_1=1/\bigl(x\log(x)\bigr)$,
the equation
$$
\int{-1\over\log(x)}=b_1\theta_2+\o q_0
$$
must hold but we know from Example \ref{lograc1} that the integral on
the left hand side is not elementary.
\end{pelda}
\subsubsection{Risch algorithm: exponential case.}\index{Risch
algorithm!exponential case}
Now we assume that $\theta$ is transcendental and exponential,
$\theta'/\theta=u'$, $u\in K_{n-1}$. By Euclidean division,
$f(\theta)=q(\theta)g(\theta)+h(\theta)$, hence
$$
\int{f(\theta)\over g(\theta)}=\int q(\theta)+\int{h(\theta)\over g(\theta)} \koz .
$$
We plan to use Hermite's method\index{Hermite's
method}\nevindex{Hermite, Charles (1822--1901)} for the rational part. But we have
to face an unpleasant surprise: although for the square-free factors
$g_j(\theta)$
$$
\textrm{gcd} \left ( g_j(\theta),{d\over d\theta}g_j(\theta) \right )
= 1
$$
is obviously satisfied, the much stronger condition
$\textrm{gcd}\bigl(g_j(\theta),g_j(\theta)'\bigr)=1$ is not. For
example, if $g_j(\theta)=\theta$, then
$$
\textrm{gcd}\bigl(g_j(\theta),g_j(\theta)'\bigr)=\textrm{gcd}(\theta,u'\theta)=\theta \koz .
$$
It can be shown, however, that this unpleasant phenomenon does not
appear if $\theta\notmid g_j(\theta)$,
in which case $\textrm{gcd}\bigl(g_j(\theta),g_j(\theta)'\bigr)=1$.
Thus, it will be sufficient to eliminate $\theta$ from the
denominator. Let $g(\theta)=\theta^\ell\o
g(\theta)$, where $\theta\notmid\o g(\theta)$, and let us look
for polynomials $\o h(\theta),t(\theta)\in K_{n-1}[\theta]$ with $\o
h(\theta)\theta^\ell+t(\theta)\o g(\theta)=h(\theta)$, $\deg\o
h(\theta)<\deg\o g(\theta)$ and $\deg t(\theta)<\ell$. Dividing both
sides by $g(\theta)$, we get
$$
{f(\theta)\over g(\theta)}=q(\theta)+{t(\theta)\over\theta^\ell}+{\o h(\theta)\over\o g(\theta)}\koz .
$$
Using the notation $p(\theta)=q(\theta)+t(\theta)/\theta^\ell$,
$p(\theta)$ is a finite
Laurent-series\index{Laurent-series}\nevindex{Laurent, Pierre
Alphonse (1813--1854)} the integration of which will be no harder than the
integration of a polynomial. This is not surprising if we note
$\theta^{-1}=\textrm{exp}(-u)$. Even so, the integration of the
``polynomial part'' is more difficult here, too. We start with the
rational part.
\subsubsection{Exponential case, rational part.}
Let $\o g(\theta)=g_1(\theta)g_2^2(\theta)\cdots g_m^m(\theta)$ be the
square-free factorisation of $\o g(\theta)$. Then, since
$\theta\notmid g_j(\theta)$,
$\textrm{gcd}\bigl(g_j(\theta),g_j(\theta)'\bigr)=1$. Using partial
fraction decomposition
$$
{\o h(\theta)\over\o g(\theta)}=\sum_{i=1}^m\sum_{j=1}^i{h_{i,j}(\theta)\over g_i(\theta)^j} \koz .
$$
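The partial fraction step itself is routine in any computer algebra system. A small illustrative sympy sketch (our choice of system; here `theta` is treated as an ordinary symbol and the coefficient field plays no role):

```python
import sympy as sp

theta = sp.symbols('theta')

# decompose h/g for g = theta*(theta + 1)**2, whose square-free
# factorisation has g1 = theta and g2 = theta + 1
expr = 1/(theta*(theta + 1)**2)
decomp = sp.apart(expr, theta)
# decomp equals 1/theta - 1/(theta + 1) - 1/(theta + 1)**2
print(decomp)
```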
Hermite-reduction goes the same way as in the logarithmic
case.\index{Hermite-reduction} We get
$$
\int{\o h(\theta)\over\o g(\theta)}={c(\theta)\over d(\theta)}+\int{a(\theta)\over b(\theta)} \koz ,
$$
where $a(\theta),b(\theta),c(\theta),d(\theta)\in K_{n-1}[\theta]$,
$\deg a(\theta)<\deg
b(\theta)$ and $b(\theta)$ is a square-free and monic polynomial,
$\theta\notmid b(\theta)$.
It can be shown that the
Rothstein-Trager method\index{Rothstein-Trager method}\nevindex{Rothstein,
Michael}\nevindex{Trager, Barry Marshall} can be applied to compute
the integral $\int a(\theta)/b(\theta)$. Let us calculate the resultant
$$
r(y)=\textrm{res}_\theta\bigl(b(\theta),a(\theta)-y\cdot
b(\theta)'\bigr)\koz .
$$
\index{resultant} It can be shown that the integral is elementary
if and only if $r(y)$ is of the form $r(y)=\o r(y)s$, where $\o r(y)\in
K[y]$ and $s\in K_{n-1}$. If we compute the primitive part of $r(y)$,
choose it as $\o r(y)$, and some coefficient of $\o r(y)$ is not
constant, then there is no elementary integral. Otherwise, let
$c_1,\ldots,c_k$ be the distinct roots of $\o r(y)$ in
its factor field and let
$$
v_i(\theta)=\textrm{gcd}\bigl(b(\theta),a(\theta)-c_ib(\theta)'\bigr)\in
K_{n-1}(c_1,\ldots,c_k)[\theta]
$$
for $i=1,\ldots,k$. It can be shown that
$$
\int{a(\theta)\over b(\theta)}=- \left ( \sum_{i=1}^k c_i\deg v_i(\theta) \right ) u
+\sum_{i=1}^k c_i\log\bigl(v_i(\theta)\bigr) \koz .
$$
Let us consider a few examples.
\begin{pelda}
The integrand of the integral $\int1/\bigl(1+\textrm{exp}(x)\bigr)$ is $1/(1+\theta)\in\Q(x,\theta)$, where $\theta=\textrm{exp}(x)$. Since
$$
r(y)=\textrm{res}_\theta(\theta+1,1-y\theta)=-1-y\in\Q(x)[y]
$$
is a primitive polynomial which only has constant coefficients, the
integral is elementary, $c_1=-1$, $v_1(\theta)=\textrm{gcd}(\theta+1,1+\theta)=1+\theta,$ thus,
$$
\int{1\over 1+\textrm{exp}(x)}=-c_1x\deg
v_1(\theta)+c_1\log\bigl(v_1(\theta)\bigr)=x-\log\bigl(\textrm{exp}(x)+1\bigr) \koz .
$$
\end{pelda}
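The example above can be checked mechanically; a minimal sympy sketch (our choice of system), verifying the antiderivative by differentiation so that equivalent output forms do not matter:

```python
import sympy as sp

x = sp.symbols('x')

f = 1/(1 + sp.exp(x))
F = sp.integrate(f, x)   # equivalent to x - log(exp(x) + 1)

# verify by differentiation rather than by comparing printed forms
assert sp.simplify(sp.diff(F, x) - f) == 0
print(F)
```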
\begin{pelda}
The integrand of the integral $\int x/\bigl(1+\textrm{exp}(x)\bigr)$ is
$x/(1+\theta)\in\Q(x,\theta)$, where $\theta=\textrm{exp}(x)$. Since
$$
r(y)=\textrm{res}_\theta(\theta+1,x-y\theta)=-x-y\in\Q(x)[y]
$$
is a primitive polynomial that has a non-constant coefficient, the
integral is not elementary.
\end{pelda}
\subsubsection{Exponential case, polynomial part}
The remaining problem is the integration of the ``polynomial part''
$$
p(\theta)=\sum_{i=-\ell}^k p_i\theta^i\in K_{n-1}(\theta)\koz .
$$
According to Liouville's Principle \index{Liouville's
Principle}\nevindex{Liouville, Joseph}$\int p(\theta)$ is elementary
if and only if
$$
p(\theta)=v_0(\theta)'+\sum_{j=1}^m c_j{v_j(\theta)'\over v_j(\theta)}\koz ,
$$
where $c_j\in \oK$ and $v_j\in\oK_{n-1}(\theta)$ for $j=0,1,\ldots,m$,
$\oK$ is an extension field of $K$ and
$\oK_{n-1}=\oK(x,\theta_1,\ldots,\theta_{n-1})$.
It can be shown that $\oK$ can be an algebraic
extension\index{algebraic!extension} of $K$. A similar reasoning to
the proof of Liouville's Principle shows that we may assume without
loss of generality that $v_j(\theta)$ is either an element of $\oK_{n-1}$
(that is, independent of $\theta$), or it is monic and irreducible in
$\oK_{n-1}[\theta]$ for $j=1,2,\ldots,m$. Furthermore, it can be shown
that there can be no non-monomial factor in the denominator of
$v_0(\theta)$, since such a factor would also be present in the
derivative. Similarly, the denominator of $v_j(\theta)$
$(j=1,2,\ldots,m)$ can have no non-monomial factor either. So we get
that either $v_j(\theta)\in K_{n-1}$, or $v_j(\theta)=\theta$, since
this is the only irreducible monic monomial. But if
$v_j(\theta)=\theta$, then the corresponding term of the sum is
$c_jv_j'(\theta)/v_j(\theta) = c_j u'$, which can be incorporated into
$v_0(\theta)'$. Hence, we get that if $p(\theta)$ has an elementary
integral, then
$$
p(\theta)=\left ( \sum_{j=-\ell}^k q_j\theta^j \right )'+\sum_{j=1}^m c_j{v'_j\over v_j}\koz ,
$$
where $q_j,v_j\in\oK_{n-1}$ and $c_j\in\oK$. The summation should be
taken over the same range as in $p(\theta)$ since
$$
(q_j\theta^j)'=(q'_j+ju'q_j)\theta^j \koz .
$$
Comparing the coefficients, we get the system
\begin{eqnarray*}
p_j&=&q'_j+ju'q_j,\quad\hbox{if $-\ell\le j\le k$, $j\neq 0$}\koz ,\cr
p_0&=&\o q'_0\koz ,
\end{eqnarray*}
where $\o q_0=q_0+\sum_{j=1}^m c_j\log(v_j)$. The solution of the
equation $p_0=\o q'_0$ is simply $\o q_0=\int p_0$; if this integral
is not elementary, then $\int p(\theta)$ cannot be elementary either,
but if it is, then we have determined $\o q_0$. In the case $j\neq 0$,
we have to solve a differential equation called \ki{Risch differential
equation}\inddef{Risch differential equation}\nevindex{Risch, Robert}
to determine $q_j$. The differential equation is of the form $y'+fy=g$,
where the given functions $f$ and $g$ are elements of $K_{n-1}$, and we are
looking for solutions in $\oK_{n-1}$. At first sight it looks as if we
had replaced the problem of integration with a more difficult
problem, but the linearity of the equation and the requirement that
the solution lie in $\oK_{n-1}$ make it much easier to handle. If any Risch differential
equation\index{Risch differential
equation} fails to have a solution in $\oK_{n-1}$, then $\int p(\theta)$ is not
elementary, otherwise
$$
\int p(\theta)=\sum_{j\neq 0}q_j\theta^j+\o q_0 \koz .
$$
The Risch differential equation can be solved algorithmically, but we
will not go into details.\nevindex{Risch, Robert}\index{Risch
differential equation}
\smallskip
\noindent
Let us consider a few examples.
\begin{pelda}
The integrand of the integral $\int\textrm{exp}(-x^2)$ is
$\theta\in\Q(x,\theta)$, where $\theta=\textrm{exp}(-x^2)$.
If the integral is elementary, then $\int\theta=q_1\theta$, where
$q_1\in\C(x)$. It is not difficult to show that the differential
equation has no rational solutions, so $\int\textrm{exp}(-x^2)$ is not
elementary.
\end{pelda}
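sympy reaches the same verdict (a sketch; the choice of system is ours): the antiderivative is returned in terms of the non-elementary error function erf.

```python
import sympy as sp

x = sp.symbols('x')

# no elementary antiderivative exists; sympy answers with erf
F = sp.integrate(sp.exp(-x**2), x)
print(F)  # sqrt(pi)*erf(x)/2
```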
\begin{pelda}
The integrand of the integral $\int x^x$ is
$\textrm{exp}\bigl(x\log(x)\bigr)=\theta_2\in\Q(x,\theta_1,\theta_2)$, where
$\theta_1=\log(x)$ and $\theta_2=\textrm{exp}(x\theta_1)$. If the
integral is elementary, then $\int\theta_2=q_1\theta_2$, where
$q_1\in\C(x,\theta_1)$. Differentiating both sides,
$\theta_2=q'_1\theta_2+q_1(\theta_1+1)\theta_2$, thus,
$1=q'_1+(\theta_1+1)q_1$. Since $\theta_1$ is transcendental over
$\C(x)$, by comparing the coefficients $1=q'_1+q_1$ and $0=q_1$, which
has no solutions. Therefore, $\int x^x$ is not elementary.
\end{pelda}
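sympy cannot do better either (a sketch; our choice of system): the integral is returned unevaluated, consistent with the non-elementarity argument above.

```python
import sympy as sp

x = sp.symbols('x')

res = sp.integrate(x**x, x)
# no elementary antiderivative exists, so the integral stays unevaluated
print(res)  # Integral(x**x, x)
```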
\begin{pelda}
The integrand of the integral
$$
\int{(4x^2+4x-1)\bigl(\textrm{exp}(x^2)+1\bigr)\bigl(\textrm{exp}(x^2)-1\bigr)
\over(x+1)^2}
$$
is
$$
f(\theta)={4x^2+4x-1\over(x+1)^2}(\theta^2-1)\in\Q(x,\theta) \koz ,
$$
where
$\theta=\textrm{exp}(x^2)$. If the integral is elementary, then it is
of the form $\int f(\theta)=q_2\theta^2+\o q_0$, where
\begin{eqnarray*}
q'_2+4xq_2&=&{4x^2+4x-1\over(x+1)^2}\ ,\cr
\o q'_0&=&-{4x^2+4x-1\over(x+1)^2}\ .
\end{eqnarray*}
The second equation can be integrated and $\o q_0$ is elementary. The
solution of the first equation is $q_2=1/(1+x)$. Hence,
$$
\int f(\theta)={1\over x+1}\textrm{exp}^2(x^2)-{(2x+1)^2\over x+1}+4\log(x+1) \koz .
$$
\vspace{-4mm}
\end{pelda}
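The result of the last example is easy to verify by differentiation; a minimal sympy sketch (our choice of system):

```python
import sympy as sp

x = sp.symbols('x')

f = (4*x**2 + 4*x - 1)*(sp.exp(x**2) + 1)*(sp.exp(x**2) - 1)/(x + 1)**2
F = sp.exp(x**2)**2/(x + 1) - (2*x + 1)**2/(x + 1) + 4*sp.log(x + 1)

# F' - f simplifies to zero, confirming the integral
assert sp.simplify(sp.diff(F, x) - f) == 0
```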
\vspace{-4mm}
\begin{gyak}
\ujgyak Apply Hermite-reduction to the following function\gyakindex{Hermite-reduction} $f(x)\in \Q(x)$ :
\[
f(x)=\frac{441x^7+780x^6-286x^5+4085x^4+769x^3+3713x^2-43253x+24500}{9x^6+6x^5-65x^4+20x^3+135x^2-154x+49} \koz .
\]
\ujgyak Compute the integral $\int f$, where
\[
f(x)=\frac{36x^6+126x^5+183x^4+13807/6x^3-407x^2-3242/5x+3044/15}{(x^2+7/6x+1/3)^2(x-2/5)^3}\in \Q(x)\koz .
\]
\ujgyak Apply the Risch integration algorithm\index{Risch algorithm}
to compute the following integral:
\[
\int \frac{x(x+1)\{(x^2e^{2x^2}-\log (x+1)^2)+2xe^{3x^2}(x-(2x^3+2x^2+x+1)\log (x+1))\}}{((x+1)\log^2 (x+1)-
(x^3+x^2)e^{2x^2})^2}dx\koz .
\]
\end{gyak}
\section{Theory and practice}\label{elmgyak}
So far in the chapter we tried to illustrate the algorithm design
problems of computer algebra through the presentation of a few
important symbolic algorithms. Below, the interested reader will find
an overview of the wider universe of research on symbolic
algorithms.
\subsection{Other symbolic algorithms}
Besides the resultant method and the theory of
Gröbner-bases\index{Gröbner-basis}\nevindex{Gröbner, Wolfgang Anton
Maria} presented in this chapter, there also exist algorithms for finding
\textit{real} symbolic roots of non-linear equations and inequalities
(Collins).\nevindex{Collins, George Edwin}
There are some remarkable algorithms in the area of symbolic solution
of differential equations. There exists a decision procedure similar
to the Risch algorithm\nevindex{Risch, Robert} for the computation of
solutions in closed form of a second order homogeneous ordinary
differential equation with rational function coefficients. In the case of
higher order linear equations, Abramov's procedure\nevindex{Abramov,
Sergey Alexandrovich} gives closed rational solutions of an equation
with polynomial coefficients, while Bronstein's
algorithm\nevindex{Bronstein, Manuel} gives solutions of the form
$\textrm{exp}( \int f(x) dx )$. In the case of partial differential
equations Lie's\nevindex{Lie, Marius Sophus} symmetry methods can be
used. There also exists an algorithm for the factorisation of linear
differential operators over formal power series and rational
functions.
Procedures based on factorisation are of great importance in the
research of computer algebra algorithms. They are so important that many
date the birth of the entire research field to
Berlekamp's\nevindex{Berlekamp, Elwyn Ralph} publication of an
effective algorithm for the factorisation of polynomials of one
variable over finite fields of small characteristic $p$. Later,
Berlekamp extended his results to fields of larger characteristic. In
order to have similarly good running times, he introduced probabilistic
elements into the algorithm. Today's computer algebra systems use
Berlekamp's procedure even for large finite fields as a routine,
perhaps without most of the users knowing about the probabilistic
origin of the algorithm. The method will be presented in another
chapter of the book. We note that there are numerous algorithms for
the factorisation of polynomials over finite fields.
Not long after polynomial factorisation over finite fields was
solved, Zassenhaus\nevindex{Zassenhaus, Hans Julius (1912--1991)}, taking van der
Waerden's\nevindex{van der Waerden, Bartel Leendert} book {\em Moderne
Algebra} from 1936 as a base, used Hensel's lemma\nevindex{Hensel,
Kurt Wilhelm Sebastian (1861--1913)} for the arithmetic of $p$-adic numbers to
extend factorisation. ``Hensel-lifting'' -- as his procedure is
now called -- is a general approach for the reconstruction of factors
from their modular images. Unlike interpolation, which needs multiple
points from the image, Hensel-lifting only needs one point from the
image. The Berlekamp--Zassenhaus-algorithm\nevindex{Zassenhaus, Hans
Julius (1912--1991)}\nevindex{Berlekamp, Elwyn Ralph} for the factorisation of
polynomials with integer coefficients is of fundamental importance but
it has two hidden pitfalls. First, for a certain type of polynomial
the running time is exponential. Unfortunately, many ``bad''
polynomials appear in the case of factorisation over algebraic number
fields. Second, a representation problem appears for multivariable
polynomials, similar to the one we encountered in the
Gauss-elimination\nevindex{Gauss, Johann Carl Friedrich
(1777--1855)}\index{Gauss-elimination} of sparse matrices. The first
problem was solved by a Diophantine optimisation based on the geometry
of numbers, a so-called lattice reduction algorithm by
Lenstra-Lenstra-Lovász \cite{Lenstra-Lenstra-Lovasz82}\nevindex{Lenstra, Arjen
Klaas}\nevindex{Lenstra, Hendrik Willem, Jr.}\nevindex{Lovász, László};
it is used together with Berlekamp's\nevindex{Berlekamp, Elwyn Ralph}
method. This polynomial algorithm is completed by a procedure which
ensures that the Hensel-lifting\nevindex{Hensel, Kurt Wilhelm
Sebastian (1861--1913)} will start from a ``good'' modular image and that it
will end ``in time''. Solutions have been found for the mentioned
representation problem of the factorisation of multivariable
polynomials as well. This is the second area where randomisation
plays a crucial role in the design of effective algorithms. We note
that in practice, the
Berlekamp-Zassenhaus-Hensel-algorithm\nevindex{Zassenhaus, Hans
Julius (1912--1991)}\nevindex{Berlekamp, Elwyn Ralph}\nevindex{Hensel, Kurt Wilhelm
Sebastian (1861--1913)} proves more effective than the
Lenstra-Lenstra-Lovász-procedure.\nevindex{Lenstra, Arjen
Klaas}\nevindex{Lenstra, Hendrik Willem, Jr.}\nevindex{Lovász, László}
By contrast, the problem of polynomial factorisation can be solved
in polynomial time, while the best proved algorithmic bound for the
factorisation of the integer $N$ is $\widetilde{O}(N^{1/4})$ (Pollard
and Strassen)\nevindex{Pollard, John Michael}\nevindex{Strassen,
Volker} in the deterministic case and $L(N)^{1+o(1)}$ (Lenstra and
Pomerance)\nevindex{Lenstra, Hendrik Willem, Jr.}\nevindex{Pomerance, Carl} in
the probabilistic case, where $L(N)=e^{\sqrt{\ln N\ln\ln N}}$.
In fact, a new theory of heuristic or probabilistic methods in
computer algebra is being born to avoid computational or memory
explosion and to make algorithms with deterministically large running
times more effective. In the case of probabilistic algorithms, there is
a positive probability of inappropriate operations, which may
result in an incorrect answer (Monte Carlo algorithms) or---although
we always get the correct answer (Las Vegas algorithms)---we may not
get anything in polynomial time. Beside the above examples, nice
results have been achieved in testing polynomial identity,
irreducibility of polynomials, determining matrix normal forms
(Frobenius, Hilbert, Smith), etc.\nevindex{Frobenius, Ferdinand
Georg (1849--1917)}\nevindex{Hilbert, David (1862--1943)}\nevindex{Smith, Henry
John Stephen} Their role is likely to increase in the future.
So far in the chapter we gave an overview of the most important
symbolic algorithms. We mentioned in the introduction that most
computer algebra systems are also able to perform numeric
computations: unlike traditional systems, the precision can be set by
the user. In many cases, it is useful to combine the symbolic and numeric
computations. Let us consider for example the symbolically computed
power series solution of a differential equation. After truncation,
evaluating the power series with the usual floating point arithmetics
in certain points, we get a numerical approximation of the
solution. When the problem is an approximation of a physical problem,
the attraction of symbolic computation is often lost, simply because
it is too complicated or too slow, and it is not necessary or
useful, since we are looking for a numerical solution. In other cases,
when the problem cannot be dealt with using symbolic computation, the
only way is the numerical approximation. This may be the case when the
existing symbolic algorithm does not find a closed solution (e.g.\ the
integral of non-elementary functions, etc.), or when a symbolic
algorithm for the specified problem does not exist. Although more and
more numerical algorithms have symbolic equivalents, numerical
procedures play an important role in computer algebra. Let us think of
differentiation and integration: sometimes traditional algorithms---integral
transformation, power series approximation, perturbation
methods---can be the most effective.
In the design of computer algebra algorithms, parallel architectures
will play an increasing role in the future. Although many existing
algorithms can be parallelised easily, it is not obvious
that good sequential algorithms will perform optimally on parallel
architectures as well: the optimal performance might be achieved by a
completely different method.
\subsection{An overview of computer algebra systems}
The development of computer algebra systems is linked with the
development of computer science and algorithmic mathematics. In the
early period of computers, researchers of different fields began the
development of the first computer algebra systems to facilitate and
accelerate their symbolic computations; these systems, reconstructed
and continuously updated, are still in use today, many versions
later. \ki{General purpose} computer algebra systems\index{computer
algebra systems!general purpose} appeared in the seventies, and
provide a wide range of built-in data structures, mathematical
functions and algorithms, trying to cover a wide area of
users. Because of their large need of computational resources, their
expansion became explosive in the beginning of the eighties when
microprocessor-based workstations appeared. Better hardware
environments, more effective resource management, the use of
system-independent high-level languages and, last but not least,
socio-economic demands gradually transformed general purpose computer
algebra systems into market products, which also resulted in a
better user interface and document preparation.
Below we list the most widely known general and special purpose
computer algebra systems and
libraries.\index{computer algebra systems!special purpose}
\begin{itemize}
\item General purpose computer algebra systems:
{\sc Axiom, Derive, Form, GNU-calc, Jacal, Macsyma, Maxima, Maple, Distributed Maple, MathCAD, Matlab Symbolic
Math Toolbox, Scilab, MAS, Mathematica, Mathview, Mock-Mma, MuPAD, Reduce, Risa.}
\item Algebra and number theory: {\sc Bergman, CoCoA, Felix, Fermat, GRB, Kan, Macaulay, Magma, Numbers, Pari, Simath,
Singular}.
\item Algebraic geometry: {\sc Casa, Ganith}.
\item Group theory: {\sc Gap, LiE, Magma, Schur}.
\item Tensor analysis: {\sc Cartan, FeynCalc, GRG, GRTensor, MathTensor, RedTen, Ricci, TTC}.
\item Computer algebra libraries: {\sc Apfloat, BigNum, GNU MP, Kant, LiDiA, NTL, Saclib, Ubasic, Weyl, Zen}.
\end{itemize}
Most general purpose computer algebra systems are characterised by
\begin{itemize}
\item interactivity,
\item knowledge of mathematical facts,
\item a high-level, declarative\footnote{ Declarative programming
languages specify the desired result unlike imperative languages,
which describe how to get the result.} programming language with the
possibility of functional programming and the knowledge of
mathematical objects,
\item expansibility towards the operational system and other programs,
\item integration of symbolic and numeric computations,
\item automatic (optimised) C and Fortran code generation,
\item graphical user interface,
\item 2- and 3-dimensional graphics and animation,
\item possibility of editing text and automatic \LaTeX \ conversion,
\item on-line help.
\end{itemize}
Computer algebra systems are also called \ki{mathematical expert
systems.}\index{mathematical expert systems} Today we can see an
astonishing development of general purpose computer algebra systems,
mainly because of their knowledge and wide application area. But it would
be a mistake to underestimate special purpose systems, which play a
very important role in many research fields and which, in many cases,
are easier to use and more effective due to their system of notation
and the low-level programming language implementation of their
algorithms. It is essential that we choose the most appropriate
computer algebra system to solve a specific problem.
\begin{fld}
\ujfld{The length of coefficients in the series of remainders in a
Euclidean division}
{Generate two pseudorandom polynomials of degree $n=10$ in $\Z[x]$
with coefficients of $l=10$ decimal digits. Perform a single
Euclidean division (in $\Q[x]$) and compute the ratio of the
maximal coefficients of the remainder and the original polynomial
(determined by the function $\lambda$). Repeat the computation $t=20$
times and compute the average. What is the result? Repeat the
experiment with $l=100, 500, 1000$. \label{masgy}}
\ujfld{Simulation of the \textsc{Modular-Gcd-Smallprimes} algorithm}
{Using simulation, give an estimation for the optimal value of the
variable $n$ in the \textsc{Modular-Gcd-Smallprimes} algorithm. Use
random polynomials of different degrees and coefficient magnitudes.}
\ujfld{Modified pseudo-euclidean division}
{Let $f,g\in \Z[x],\ \deg f=m\geq n= \deg g$. Modify the
pseudo-euclidean division in such a way that in the equation
\[ g_n^sf=gq+r\]
instead of the exponent $s=m-n+1$ put the smallest value
$s\in \N$ such that $q,r\in \Z[x]$.
Replace the procedures $\textrm{pquo}()$ and $\textrm{prem}()$ in the
\textsc{Primitive-Euclidean} algorithm by the obtained procedures
$\textrm{spquo}()$ and $\textrm{sprem}()$. Compare the amount of memory
space required by the algorithms.}\felindex{\textsc{Primitive-Euclidean}}
\ujfld{Construction of reduced Gröbner basis}
{Design an algorithm that computes a
reduced Gröbner basis from a given Gröbner-basis $G$.} \felindex{Gröbner basis}
\ujfld{Implementation of Hermite-reduction}
{Implement Hermite-reduction in a chosen computer algebra
language.}\felindex{Hermite-reduction}\nevindex{Hermite, Charles (1822--1901)}
\ujfld{Integration of rational functions}
{Write a program for the integration of rational functions.}
\end{fld}
\begin{fejmegj}
The algorithms \textsc{Classical-Euclidean} and
\textsc{Extended-Euclidean} for non-negative integers are described in
\cite{CormenLe2009}.\index{\textsc{Classical-Euclidean}}
\index{\textsc{Extended-Euclidean}} A natural continuation of the
theory of resultants leads to subresultants, which can help in
reducing the growth of the coefficients in the
\textsc{Extended-Euclidean} algorithm (see e.g.\
\cite{vonzurGathen,geddes}).
Gröbner bases were introduced by B.~Buchberger in
1965 \cite{buchberger}.\nevindex{Buchberger, Bruno} Several authors
examined polynomial ideals before this. The most well-known is perhaps
Hironaka\nevindex{Hironaka, Heisuke}, who used bases of ideals of
power series to resolve singularities over $\C$. He was awarded a
Fields Medal for his work. His method was not constructive,
however. Gröbner bases have been generalised for many algebraic
structures in the last two decades.
The foundations of differential algebra were laid by J.~F.~Ritt in
1948 \cite{ritt}.\nevindex{Ritt, Joseph Fels} The square-free
factorisation algorithm used in symbolic integration can be found
for example in the books \cite{vonzurGathen,geddes}. The importance of
choosing the smallest possible extension degree in
Hermite-reduction is illustrated by Example 11.11 in \cite{geddes},
where the splitting field has very large degree but the integral
can be expressed in an extension of degree 2. The proof of the
Rothstein-Trager integration algorithm can be found in
\cite{vonzurGathen} (Theorem 22.8). We note that the algorithm was
found independently by Rothstein and Trager. The proof of the
correctness of the Lazard-Rioboo-Trager formula, the analysis of the
running time of the \textsc{Integrate-Logarithmic-Part} algorithm, an
overview of the procedures that deal with the difficulties of algebraic
extension steps, the determination of the hyperexponential integral
(if one exists) of a hyperexponential element over $C(x)$, the proof of
Liouville's principle and the proofs of the statements connected to
the Risch algorithm can be found in the book \cite{vonzurGathen}.
There are many books and publications available on computer algebra
and related topics. The interested reader will find mathematical
description in the following general works: Caviness \cite{caviness},
Davenport et al.\ \cite{davenport}, von zur Gathen et al.\
\cite{vonzurGathen}, Geddes et al.\ \cite{geddes}, Knuth
\cite{Knuth68I,Knuth69II,Knuth73III}, Mignotte \cite{mignotte}, Mishra
\cite{mishra}, Pavelle et al.\ \cite{pavelle}, Winkler
\cite{winkler}.\nevindex{Caviness, Bob Forrester}\nevindex{von zur
Gathen, Joachim} \nevindex{Geddes, Keith Oliver}\nevindex{Knuth,
Donald Ervin}\nevindex{Mignotte, Maurice} \nevindex{Mishra,
Bhubaneswar}\nevindex{Winkler, Franz}
The computer-oriented reader will find further information on computer
algebra in Christensen \cite{christ}, Gonnet and Gruntz \cite{gonnet},
Harper et al.\ \cite{harper} and on the world wide web.\nevindex{Gonnet,
Haas Gaston Henry}
A wide range of books and articles deal with applications, e.g.\
Akritas \cite{akritas}, Cohen et al.\ (eds.) \cite{cohen2,cohen1},
Grossman (ed.) \cite{grossman}, Hearn (ed.) \cite{hearn}, Kovács
\cite{kovacs2} and Odlyzko \cite{odlyzko}. \nevindex{Cohen, Arjeh
M.}\nevindex{Hearn, Anthony Clern}\nevindex{Odlyzko, Andrew Michael}
\nevindex{Kovács, Attila}
For the role of computer algebra systems in education see for example the
works of Karian \cite{karian} and Uhl \cite{uhl}.
Conference proceedings: {\sc Aaecc, Disco, Eurocal, Eurosam, Issac}
and {\sc Symsac}.
Computer algebra journals: {\em Journal of Symbolic Computation}---Academic Press,
{\em Applicable Algebra in Engineering, Communication and Computing}---Springer-Verlag,
{\em {\sc Sigsam} Bulletin}---ACM Press.
The Department of Computer Algebra of the Eötvös Loránd University,
Budapest takes works \cite{vonzurGathen,geddes,mignotte,winkler} as a base in education.
\end{fejmegj}