\chapter[Cryptology]{Cryptology}
\label{chap:cryptology}\label{sec:introduction-cryptology}\index{cryptology|(}
This chapter introduces a number of cryptographic protocols and their
underlying problems and algorithms. A typical cryptographic scenario
is shown in Figure~\ref{fig:scenario} (the design of Alice and Bob is
due to Cr\'{e}peau). Alice and Bob wish to exchange messages over an
insecure channel, such as a public telephone line or via email over a
computer network. Erich is eavesdropping on the channel. Knowing
that their data transfer is being eavesdropped, Alice and Bob encrypt
their messages using a cryptosystem.
\begin{figure}[!h]
\footnotesize\centering
\[
\begin{array}{ccc}
&
\psfig{file=figs/mielke-g.ps,height=1.3cm}
& \\
& \mbox{{\bf\small Erich}} & \\[.2cm]
\psfig{file=figs/alice-g.ps,height=1.7cm}
&
\psfig{file=figs/channel.eps,width=1.7cm}
&
\psfig{file=figs/bob-g.ps,height=1.7cm}
\end{array}
\]
\caption{A typical scenario in cryptography.
\label{fig:scenario}
}
\vspace{-6pt}
\end{figure}
In Section~\ref{sec:foundations-cryptology}, various symmetric cryptosystems
are presented. A cryptosystem is said to be symmetric if one and the same key
is used for encryption and decryption. For symmetric systems, the question of
key distribution is central: How can Alice and Bob agree on a joint secret key
if they can communicate only via an insecure channel? For example, if Alice
chooses some key and encrypts it like a message using a symmetric cryptosystem
to send it to Bob, then which key should she use to encrypt this key?
This paradoxical situation is known as
\ki{the secret-key agreement problem,}\index{secret-key agreement problem} and
it was considered unsolvable for a long time. Its surprisingly simple,
ingenious solution by Whitfield Diffie and
Martin Hellman\nevindex{Diffie, Whitfield}\nevindex{Hellman, Martin E.} in 1976 is a
milestone in the history of cryptography. They proposed a protocol that Alice
and Bob can use to exchange a few messages after which they both can easily
determine their joint secret key. The eavesdropper Erich, however, does not
have a clue about their key, even if he is able to intercept every single bit of
their transferred messages. Section~\ref{sec:diffie-hellman} presents the
Diff{}ie-Hellman secret-key agreement protocol.\index{Diffie-Hellman protocol}
It may be considered an irony of history that this protocol, which finally
solved the long-standing secret-key agreement problem that is so important in
symmetric cryptography, opened the door to \ki{public-key cryptography}\index{public-key cryptography}
in which there is no need to distribute joint secret keys via insecure channels.
In 1978, shortly after Diff{}ie and Hellman had published their pathbreaking
work in 1976, Rivest, Shamir, and Adleman developed
their famous RSA system, the first \ki{public-key cryptosystem}\index{public-key cryptosystem}
in the open literature. Section~\ref{sec:rsa-factoring} describes the RSA cryptosystem
and the related digital signature scheme. Using the latter protocol, Alice
can sign her message to Bob such that he can verify that she indeed is the
sender of the message. Digital signatures prevent Erich from forging
Alice's messages.
The security of the Diff{}ie-Hellman protocol rests on the assumption that
computing discrete logarithms is computationally intractable. That is why
modular exponentiation (the inverse function of which is the discrete
logarithm) is considered to be a candidate for a one-way function. The
security of RSA similarly rests on the assumption that a certain problem is
computationally intractable, namely on the assumption that factoring large
integers is computationally hard. However, the authorised receiver Bob is
able to efficiently decrypt the ciphertext by employing the factorisation of
some integer he has chosen in private, which is his private ``trapdoor''
information.
Section~\ref{sec:riv-rab-she} introduces a secret-key agreement protocol
developed by Rivest and Sherman, which is based on so-called strongly
noninvertible associative one-way functions. This protocol can be modified to
yield a digital signature scheme as well.
Section~\ref{sec:zero-knowledge} introduces the fascinating area of
interactive proof systems and zero-knowledge protocols that has practical
applications in cryptography, especially for authentication issues. In
particular, a zero-knowledge protocol for the graph isomorphism problem is
presented. On the other hand, this area is also central to complexity theory
and will be revisited in Chapter~\ref{chap:complexity}, again in connection to
the graph isomorphism problem.
\vspace{-2.5mm}
\section{Foundations}\label{sec:foundations-cryptology}
\vspace{-1.5mm}
\ki{Cryptography}\inddef{cryptography} is the art and science of designing secure cryptosystems,
which are used to encrypt texts and messages so that they are kept secret and
unauthorised decryption is prevented, whereas the authorised receiver is able
to efficiently decrypt the ciphertext received. This section presents two
classical symmetric cryptosystems. In subsequent sections, some important
asymmetric cryptosystems and cryptographic protocols are introduced. A
``protocol'' is a dialogue between two (or more) parties, where a ``party'' may be
either a human being or a computing machine. Cryptographic protocols can have
various cryptographic purposes. They consist of algorithms that are jointly
executed by more than one party.
\ki{Cryptanalysis}\inddef{cryptanalysis} is the art and science of (unauthorised) decryption of
ciphertexts and of breaking existing cryptosystems. \ki{Cryptology}\inddef{cryptology}
captures both these fields, cryptography and cryptanalysis. In this chapter,
we focus on \ki{cryptographic algorithms.}\inddef{cryptographic algorithms} Algorithms of cryptanalysis,
which are used to break cryptographic protocols and systems, will be mentioned
as well but will not be investigated in detail.
\subsection{Cryptography}
Figure~\ref{fig:scenario} shows a
typical scenario in cryptography: Alice and Bob communicate over an insecure
channel that is eavesdropped by Erich, and thus they encrypt their messages
using a cryptosystem.
\begin{defi}[Cryptosystem]\label{def:cryptosystem}\inddef{cryptosystem}
A \ki{cryptosystem} is a quintuple $(\mathcal{P}, \mathcal{C}, \mathcal{K},
\mathcal{E}, \mathcal{D})$ with the following properties:
\begin{enumerate}
\item $\mathcal{P}$, $\mathcal{C}$, and $\mathcal{K}$ are finite sets, where
$\mathcal{P}$ is the \ki{plaintext space,}\inddef{plaintext space} $\mathcal{C}$ is the
\ki{ciphertext space,}\inddef{ciphertext space} and $\mathcal{K}$ is the
\ki{key space.}\inddef{key space} The elements of $\mathcal{P}$ are called the
\ki{plaintexts,}\inddef{plaintext} and the elements of $\mathcal{C}$ are called the
\ki{ciphertexts.}\inddef{ciphertext} A \ki{message}\inddef{message} is a
string of plaintext symbols.
\item $\mathcal{E} = \{E_k \condition k \in \mathcal{K}\}$ is a family of
functions $E_k : \mathcal{P} \rightarrow \mathcal{C}$, which are used for
encryption. $\mathcal{D} = \{D_k \condition k \in \mathcal{K}\}$ is a
family of functions $D_k : \mathcal{C} \rightarrow \mathcal{P}$, which are
used for decryption.
\item For each key $e \in \mathcal{K}$, there exists some key $d \in
\mathcal{K}$ such that for each plaintext $p \in \mathcal{P}$,
\begin{equation}
\label{equ:correctness}
D_d (E_e (p)) = p \koz .
\end{equation}
\end{enumerate}
A cryptosystem is said to be \ki{symmetric}\inddef{cryptosystem!symmetric} (or \textit{private-key}) if
either $d = e$ or if $d$ at least can ``easily'' be determined from~$e$. A
cryptosystem is said to be \ki{asymmetric}\inddef{cryptosystem!asymmetric} (or \textit{public-key}) if $d
\neq e$ and it is ``computationally infeasible'' to determine the private key
$d$ from the corresponding public key~$e$.
\end{defi}
At times, we may use distinct key spaces for encryption and decryption, with
the above definition modified accordingly.
We now introduce some easy examples of classical symmetric cryptosystems.
Consider the alphabet $\Sigma = \{\mbox{\rm{}A}, \mbox{\rm{}B}, \ldots ,
\mbox{\rm{}Z}\}$, which will be used both for the plaintext space and for the
ciphertext space. We identify $\Sigma$ with $\integers_{26} = \{0, 1, \ldots
, 25\}$ so as to be able to perform calculations with letters as if they were
numbers. The number $0$ corresponds to the letter \mbox{\rm{}A}, the $1$
corresponds to~\mbox{\rm{}B}, and so on. This coding of plaintext or
ciphertext symbols by nonnegative integers is not part of the actual
encryption or decryption.
Messages are elements of~$\sigmastar$, where $\sigmastar$ denotes the set of
strings over~$\Sigma$. If some message $m \in \sigmastar$ is subdivided into
blocks of length $n$ and is encrypted blockwise, as is common in many
cryptosystems, then each block of $m$ is viewed as an element
of~$\integers_{26}^{n}$.
\begin{pelda}[Shift Cipher]\inddef{cipher!shift}
The first example is a monoalphabetic symmetric cryptosystem. Let
$\mathcal{K} = \mathcal{P} = \mathcal{C} = \integers_{26}$. The
\ki{shift cipher}\inddef{shift cipher} encrypts messages by shifting every plaintext symbol by the same
number $k$ of letters in the alphabet modulo~$26$. Shifting each letter in
the ciphertext back using the same key~$k$, the original plaintext is
recovered. For each key $k \in \integers_{26}$, the encryption function
$E_k$ and the decryption function $D_k$ are defined by:
\begin{eqnarray*}
E_k (m) & = & (m + k) \bmod 26 \\
D_k (c) & = & (c - k) \bmod 26 \koz ,
\end{eqnarray*}
where addition and subtraction by $k$ modulo $26$ are carried out
characterwise.
\begin{figure}[!t]
\centering
\footnotesize
\begin{tabular}{|c|@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.8mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.8mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.8mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.8mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.8mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}|}
\hline
$m$ & S & H & I & F & T & E & A & C & H & L & E & T
& T & E & R & T & O & T & H & E & L & E & F & T \\ \hline
$c$ & R & G & H & E & S & D & Z & B & G & K & D & S
& S & D & Q & S & N & S & G & D & K & D & E & S \\
\hline
\end{tabular}
\caption{Example of an encryption by the shift cipher.
\label{tab:caesar}
}
\end{figure}
Figure~\ref{tab:caesar} shows an encryption of the message $m$ by the shift
cipher with key $k = 25$. The resulting ciphertext is~$c$. Note that the
particular shift cipher with key $k = 3$ is also known as the
\ki{Caesar cipher,}\inddef{cipher!Caesar} since Julius Caesar allegedly used this cipher during his wars
to keep messages secret.\footnote{Historic remark: Gaius Julius Caesar reports
in his book \textit{De Bello Gallico} that he sent an encrypted message to Q.
Tullius Cicero (the brother of the famous speaker) during the Gallic Wars
(58--50 B.C.). The system used was monoalphabetic and replaced Latin
letters by Greek letters; however, it is not explicitly mentioned there
whether the cipher used was indeed the shift cipher with key $k=3$. This
information was given later by Suetonius.} This cipher is a very simple
substitution cipher in which each letter is substituted by a certain letter of
the alphabet.
\end{pelda}
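As an illustrative aid (not part of the classical presentation), the shift cipher can be realised in a few lines of Python, identifying the letters A, \ldots, Z with $\integers_{26}$ as above:

```python
def shift_encrypt(message: str, k: int) -> str:
    """Shift every letter of the message forward by k positions modulo 26."""
    return "".join(chr((ord(ch) - ord("A") + k) % 26 + ord("A")) for ch in message)

def shift_decrypt(ciphertext: str, k: int) -> str:
    """Shift every letter back by k positions; the inverse of shift_encrypt."""
    return shift_encrypt(ciphertext, -k)

# The encryption from Figure tab:caesar, with key k = 25:
assert shift_encrypt("SHIFTEACHLETTERTOTHELEFT", 25) == "RGHESDZBGKDSSDQSNSGDKDES"
assert shift_decrypt("RGHESDZBGKDSSDQSNSGDKDES", 25) == "SHIFTEACHLETTERTOTHELEFT"
```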
Since the key space is very small, the shift cipher can easily be broken. It
is vulnerable even to attacks in which the attacker knows only the
ciphertext, simply by checking which of the $26$ possible keys reveals a
meaningful plaintext, provided that the ciphertext is long enough to allow
unique decryption.
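Such a ciphertext-only attack is nothing but an exhaustive key search, sketched below in Python (the function name is hypothetical); a human reader, or an automated dictionary check, then picks the single meaningful candidate:

```python
def brute_force_shift(ciphertext: str) -> list[str]:
    """Decrypt the ciphertext under all 26 possible shift keys."""
    return ["".join(chr((ord(c) - ord("A") - k) % 26 + ord("A")) for c in ciphertext)
            for k in range(26)]

candidates = brute_force_shift("RGHESDZBGKDSSDQSNSGDKDES")
assert candidates[25] == "SHIFTEACHLETTERTOTHELEFT"  # key k = 25 reveals the message
```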
The shift cipher is a monoalphabetic cryptosystem, since every plaintext
letter is replaced by one and the same letter in the ciphertext. In contrast,
a polyalphabetic cryptosystem can encrypt the same plaintext symbols by
different ciphertext symbols, depending on their position in the text. Such a
polyalphabetic cryptosystem that is based on the shift cipher, yet much harder
to break, was proposed by the French diplomat Blaise de Vigen\`{e}re
(1523--1596).\nevindex{Vigen\`{e}re, Blaise, de (1523--1596)}
His system builds on previous work by the Italian mathematician
Leon Battista Alberti\nevindex{Alberti, Leon Battista (1404--1472)} (born in 1404),
the German abbot Johannes Trithemius\nevindex{Trithemius, Johannes (1462--1516)}
(born in 1462), and the Italian scientist Giovanni Porta (born in 1535).\nevindex{Porta, Giovanni (1535--1615)}
It works like the shift cipher, except that the letter that encrypts some
plaintext letter varies with its position in the text.
\begin{pelda}[Vigen\`{e}re Cipher]\inddef{cipher!Vigen\`{e}re}
\label{exa:vigenerechiffre}
This symmetric polyalphabetic cryptosystem uses a so-called
\ki{Vigen\`{e}re square,}\inddef{Vigen\`{e}re square} a matrix consisting of $26$ rows and $26$ columns, see
Figure~\ref{tab:vigenere-square}. Every row has the $26$ letters of the
alphabet, shifted from row to row by one position. That is, the single rows
can be seen as a shift cipher obtained by the keys $0, 1, \ldots , 25$. Which
row of the Vigen\`{e}re square is used for encryption of some plaintext symbol
depends on its position in the text.
\begin{figure}[!t]
\begin{center}
\footnotesize
\begin{tabular}{|@{\hspace{2pt}}r@{\hspace{1pt}}||@{\hspace*{2pt}}c@{\hspace*{2pt}}|@{\hspace*{2pt}}c@
{\hspace*{2pt}}|@{\hspace*{2pt}}c@{\hspace*{2pt}}|@{\hspace*{2pt}}c@
{\hspace*{2pt}}|@{\hspace*{2pt}}c@{\hspace*{2pt}}|@{\hspace*{2pt}}c@
{\hspace*{2pt}}|@{\hspace*{2pt}}c@{\hspace*{2pt}}|@{\hspace*{2pt}}c@
{\hspace*{2pt}}|@{\hspace*{2pt}}c@{\hspace*{2pt}}|@{\hspace*{2pt}}c@
{\hspace*{2pt}}|@{\hspace*{2pt}}c@{\hspace*{2pt}}|@{\hspace*{2pt}}c@
{\hspace*{2pt}}|@{\hspace*{2pt}}c@{\hspace*{2pt}}|@{\hspace*{2pt}}c@
{\hspace*{2pt}}|@{\hspace*{2pt}}c@{\hspace*{2pt}}|@{\hspace*{2pt}}c@
{\hspace*{2pt}}|@{\hspace*{2pt}}c@{\hspace*{2pt}}|@{\hspace*{2pt}}c@
{\hspace*{2pt}}|@{\hspace*{2pt}}c@{\hspace*{2pt}}|@{\hspace*{2pt}}c@
{\hspace*{2pt}}|@{\hspace*{2pt}}c@{\hspace*{2pt}}|@{\hspace*{2pt}}c@
{\hspace*{2pt}}|@{\hspace*{2pt}}c@{\hspace*{2pt}}|@{\hspace*{2pt}}c@
{\hspace*{2pt}}|@{\hspace*{2pt}}c@{\hspace*{2pt}}|@{\hspace*{2pt}}c@
{\hspace*{2pt}}|}
\hline
$0$ & A & B & C & D & E & F & G & \fbox{\hspace*{-1mm}{\bf{}H}\hspace*{-1mm}}
& I & J & K & L & M
& N & O & P & Q & R & S & T & U & V & W & X & Y & Z \\ \hline
$1$ & B & C & D & E & F & G & H & I & J & K & L & M & N
& O & P & Q & R & S & T & U & V & W & X & Y & Z & A \\ \hline
$2$ & C & D & E & F & G & H & I & J & K & L & M & N & O
& P & Q & R & S & T & U & V & W & X & Y & Z & A & B \\ \hline
$3$ & D & E & F & G & H & I & J & K & L & M & N & O & P
& Q & R & S & T & U & V & W & X & Y & Z & A & B & C \\ \hline
$4$ & E & F & G & H & I & J & K & L & M & N & O & P & Q
& R & S & T & U & V & W & X & Y & Z & A & B & C & D \\ \hline
$5$ & F & G & H & I & J & K & L & M & N & O & P & Q & R
& S & T & U & V & W & X & Y & Z & A & B & C & D & E \\ \hline
$6$ & G & H & I & J & K & L & M & N & O & P & Q & R & S
& T & U & V & W & X & Y & Z & A & B & C & D & E & F \\ \hline
$7$ & H & I & J & K & L & M & N & O & P & Q & R & S & T
& U & V & W & X & Y & Z & A & B & C & D & E & F & G \\ \hline
$8$ & I & J & K & L & M & N & O & P & Q & R & S & T & U
& V & W & X & Y & Z & A & B & C & D & E & F & G & H \\ \hline
$9$ & J & K & L & M & N & O & P & Q & R & S & T & U & V
& W & X & Y & Z & A & B & C & D & E & F & G & H & I \\ \hline
$10$ & K & L & M & N & O & P & Q & R & S & T & U & V & W
& X & Y & Z & A & B & C & D & E & F & G & H & I & J \\ \hline
$11$ & L & M & N & O & P & Q & R & S & T & U & V & W & X
& Y & Z & A & B & C & D & E & F & G & H & I & J & K \\ \hline
$12$ & M & N & O & P & Q & R & S & T & U & V & W & X & Y
& Z & A & B & C & D & E & F & G & H & I & J & K & L \\ \hline
$13$ & N & O & P & Q & R & S & T & U & V & W & X & Y & Z
& A & B & C & D & E & F & G & H & I & J & K & L & M \\ \hline
$14$ & O & P & Q & R & S & T & U & V & W & X & Y & Z & A
& B & C & D & E & F & G & H & I & J & K & L & M & N \\ \hline
$15$ & P & Q & R & S & T & U & V & W & X & Y & Z & A & B
& C & D & E & F & G & H & I & J & K & L & M & N & O \\ \hline
$16$ & Q & R & S & T & U & V & W & X & Y & Z & A & B & C
& D & E & F & G & H & I & J & K & L & M & N & O & P \\ \hline
$17$ & R & S & T & U & V & W & X & Y & Z & A & B & C & D
& E & F & G & H & I & J & K & L & M & N & O & P & Q \\ \hline
$18$ & S & T & U & V & W & X & Y & Z & A & B & C & D & E
& F & G & H & I & J & K & L & M & N & O & P & Q & R \\ \hline
$19$ & \fbox{\hspace*{-1mm}{\bf{}T}\hspace*{-1mm}}
& U & V & W & X & Y & Z & \fbox{\hspace*{-1mm}{\bf{}A}\hspace*{-1mm}}
& B & C & D & E & F
& G & H & I & J & K & L & M & N & O & P & Q & R & S \\ \hline
$20$ & U & V & W & X & Y & Z & A & B & C & D & E & F & G
& H & I & J & K & L & M & N & O & P & Q & R & S & T \\ \hline
$21$ & V & W & X & Y & Z & A & B & C & D & E & F & G & H
& I & J & K & L & M & N & O & P & Q & R & S & T & U \\ \hline
$22$ & W & X & Y & Z & A & B & C & D & E & F & G & H & I
& J & K & L & M & N & O & P & Q & R & S & T & U & V \\ \hline
$23$ & X & Y & Z & A & B & C & D & E & F & G & H & I & J
& K & L & M & N & O & P & Q & R & S & T & U & V & W \\ \hline
$24$ & Y & Z & A & B & C & D & E & F & G & H & I & J & K
& L & M & N & O & P & Q & R & S & T & U & V & W & X \\ \hline
$25$ & Z & A & B & C & D & E & F & G & H & I & J & K & L
& M & N & O & P & Q & R & S & T & U & V & W & X & Y \\ \hline
\end{tabular}
\caption{Vigen\`{e}re square: Plaintext {\rm{}``H''} is encrypted as
{\rm{}``A''} by key {\rm{}``T''}.
\label{tab:vigenere-square}
}
\end{center}
\end{figure}
Messages are subdivided into blocks of a fixed length $n$ and are encrypted
blockwise, i.e., $\mathcal{K} = \mathcal{P} = \mathcal{C} =
\integers_{26}^{n}$. The block length $n$ is also called the {\em period\/}
of the system. In what follows, the $i$th symbol in a string $w$ is denoted
by~$w_i$.
For each key $k \in \integers_{26}^{n}$, the encryption function $E_k$ and
the decryption function~$D_k$, both mapping from $\integers_{26}^{n}$ to
$\integers_{26}^{n}$, are defined by:
\begin{eqnarray*}
E_k (b) & = & (b + k) \bmod 26 \\
D_k (c) & = & (c - k) \bmod 26,
\end{eqnarray*}
where addition and subtraction by $k$ modulo $26$ are again carried out
characterwise. That is, the key $k \in \integers_{26}^{n}$ is written letter
by letter above the symbols of the block $b \in \integers_{26}^{n}$ to be
encrypted. If the last plaintext block has fewer than $n$ symbols, one uses
correspondingly fewer key symbols. In order to encrypt the $i$th plaintext symbol
$b_i$, which has the key symbol $k_i$ sitting on top, use the $i$th row of the
Vigen\`{e}re square as in the shift cipher.
\begin{figure}[!t]
\centering
\rm\small
\begin{tabular}{|c|c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.8mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.8mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.8mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.8mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.8mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}|}
\hline
$k$ & T & O & N & Y & T & O & N & Y & T & O & N & Y & T & O
& N & Y & T & O & N & Y & T & O & N & Y & T & O & N & Y \\\hline
$m$ & H & U & N & G & A & R & I & A & N & I & S & A & L & L
& G & R & E & E & K & T & O & G & E & R & M & A & N & S \\\hline
$c$ & A & I & A & E & T & F & V & Y & G & W & F & Y & E & Z
& T & P & X & S & X & R & H & U & R & P & F & O & A & Q \\ \hline
\end{tabular}
\caption{Example of an encryption by the Vigen\`{e}re cipher.
\label{tab:vigenere}
}
\end{figure}
For example, choose the block length $n = 4$ and the key $k =
\mbox{\rm{}TONY}$. Figure~\ref{tab:vigenere} shows the encryption of the
message~$m$, which consists of seven blocks, into the ciphertext
$c$ by the Vigen\`{e}re cipher using key~$k$.
To the first plaintext letter, {\rm{}``H''}, the key symbol {\rm{}``T''} is
assigned. The intersection of the {\rm{}``H''} column with the {\rm{}``T''}
row of the Vigen\`{e}re square yields {\rm{}``A''} as the first letter of the
ciphertext, see Figure~\ref{tab:vigenere-square}.
\end{pelda}
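The blockwise encryption just described can be sketched in Python as well (an illustration, assuming the same identification of letters with $\integers_{26}$); repeating the key letter by letter over the message also handles a last block of fewer than $n$ symbols:

```python
def vigenere_encrypt(message: str, key: str) -> str:
    """Add the key, repeated periodically, to the message characterwise mod 26."""
    return "".join(chr((ord(m) - 65 + ord(key[i % len(key)]) - 65) % 26 + 65)
                   for i, m in enumerate(message))

def vigenere_decrypt(ciphertext: str, key: str) -> str:
    """Subtract the periodically repeated key characterwise modulo 26."""
    return "".join(chr((ord(c) - 65 - (ord(key[i % len(key)]) - 65)) % 26 + 65)
                   for i, c in enumerate(ciphertext))

# The encryption from Figure tab:vigenere, with key "TONY":
assert (vigenere_encrypt("HUNGARIANISALLGREEKTOGERMANS", "TONY")
        == "AIAETFVYGWFYEZTPXSXRHURPFOAQ")
```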
There are many other classical cryptosystems, which will not be described in
detail here. There are various ways to classify cryptosystems according to
their properties or to the specific way they are designed. In
Definition~\ref{def:cryptosystem}, the distinction between \textit{symmetric}
and \textit{asymmetric} cryptosystems was explained. The two examples above
(the shift cipher and the Vigen\`{e}re cipher) demonstrated the distinction
between \textit{monoalphabetic} and \textit{polyalphabetic}
systems.\index{system!monoalphabetic}\index{system!polyalphabetic} Both are
\ki{substitution ciphers,}\inddef{cipher!substitution} which may be contrasted with
\ki{permutation ciphers}\inddef{cipher!permutation} (a.k.a.
\ki{transposition ciphers})\inddef{cipher!transposition} in which the plaintext
letters are not substituted by certain ciphertext letters but go to another
position in the text remaining otherwise unchanged.
Moreover, \ki{block ciphers}\inddef{cipher!block} such as the Vigen\`{e}re system can be
contrasted with \ki{stream ciphers,}\inddef{cipher!stream} which produce a continuous stream of key
symbols depending on the plaintext context to be encrypted. One can also
distinguish between different types of block ciphers. An important type are
the \ki{affine linear block ciphers,}\inddef{cipher!affine linear block} which are defined by affine linear
encryption functions $E_{(A,\vec{b})}$ and decryption functions
$D_{(A^{-1},\vec{b})}$, both mapping from $\integers_{m}^{n}$
to~$\integers_{m}^{n}$. That is, they are of the following form:
\begin{eqnarray}
\label{eqn:affin-linear-blockchiffre}
E_{(A,\vec{b})}(\vec{x}) & = & A \vec{x} + \vec{b} \bmod m \koz , \\
D_{(A^{-1},\vec{b})}(\vec{y}) & = & A^{-1} (\vec{y} - \vec{b}) \bmod m \koz .
\nonumber
\end{eqnarray}
Here, $(A,\vec{b})$ and $(A^{-1},\vec{b})$ are the keys used for encryption
and decryption, respectively; $A$ is an $(n \times n)$ matrix with entries
from~$\integers_m$; $A^{-1}$ is the inverse matrix for~$A$; $\vec{x}$,
$\vec{y}$, and $\vec{b}$ are vectors in $\integers_{m}^{n}$, and all
arithmetic is carried out modulo~$m$. Some mathematical explanations are in
order (see also Definition~\ref{def:group} in Subsection~\ref{sec:algebra}): An
$(n \times n)$ matrix $A$ over the ring $\integers_m$ has a multiplicative
inverse if and only if \mbox{$\gcd(\det A, m) = 1$}. The {\em inverse matrix
for $A$\/} is defined by $A^{-1} = (\det A)^{-1} A_{\tt adj}$, where $\det
A$ is the determinant of $A$, and $A_{\tt adj} = ((-1)^{i+j} \det A_{j,i})$ is
the {\em adjugate matrix for~$A$}. The {\em determinant\/} $\det A$ of $A$ is
recursively defined: For $n = 1$ and $A = (a)$, $\det A = a$; for $n > 1$ and
each $i \in \{1, 2, \ldots , n\}$, $\det A = \sum_{j=1}^{n} (-1)^{i+j} a_{i,j}
\det A_{i,j}$, where $a_{i,j}$ denotes the $(i,j)$ entry of $A$ and the $(n-1)
\times (n-1)$ matrix $A_{i,j}$ results from $A$ by cancelling the $i$th row and
the $j$th column. The determinant of a matrix and thus its inverse (if it
exists) can be computed efficiently,
see Problem \ref{ex:determinant}.
For example, the Vigen\`{e}re cipher is an affine linear cipher
whose key contains the identity matrix as its first component.
If $\vec{b}$ in~(\ref{eqn:affin-linear-blockchiffre}) is the zero vector, then
it is a {\em linear block cipher}. A classical example is the
\ki{Hill cipher,}\inddef{cipher!Hill} invented by Lester Hill in 1929.
Here, the key space is the set of
all $(n \times n)$ matrices $A$ with entries in $\integers_m$ such that
$\gcd(\det A, m) = 1$. This condition guarantees the invertibility of those
matrices that are allowed as keys, since the inverse matrix $A^{-1}$ is used
for decryption of the messages encrypted by key~$A$. For each key~$A$, the
Hill cipher is defined by the encryption function $E_A (\vec{x}) = A \vec{x}
\bmod m$ and the decryption function $D_{A^{-1}} (\vec{y}) = A^{-1} \vec{y}
\bmod m$. Thus, it is the most general linear cipher. The permutation cipher
is also linear, and hence is a special case of the Hill cipher.
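For the case $n = 2$, encryption by the Hill cipher and the computation of the inverse key matrix modulo $m$ can be sketched as follows (the key matrix below is a hypothetical example; Python's three-argument \texttt{pow} computes the modular inverse of the determinant, which exists since $\gcd(\det A, m) = 1$):

```python
def hill_encrypt(block: list[int], A: list[list[int]], m: int = 26) -> list[int]:
    """Multiply the key matrix A by the plaintext block, modulo m."""
    n = len(A)
    return [sum(A[i][j] * block[j] for j in range(n)) % m for i in range(n)]

def hill_inverse_2x2(A: list[list[int]], m: int = 26) -> list[list[int]]:
    """Inverse of a 2x2 key matrix modulo m; requires gcd(det A, m) = 1."""
    det = (A[0][0] * A[1][1] - A[0][1] * A[1][0]) % m
    d = pow(det, -1, m)  # modular inverse of the determinant (Python 3.8+)
    return [[d * A[1][1] % m, -d * A[0][1] % m],
            [-d * A[1][0] % m, d * A[0][0] % m]]

A = [[3, 3], [2, 5]]                       # det A = 9 and gcd(9, 26) = 1
c = hill_encrypt([7, 8], A)                # encrypt the block "HI" = (7, 8)
assert hill_encrypt(c, hill_inverse_2x2(A)) == [7, 8]  # decryption uses A^{-1}
```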
\subsection{Cryptanalysis}
Cryptanalysis aims at breaking existing cryptosystems and, in
particular, at determining the decryption keys. In order to
characterise the security or vulnerability of the cryptosystem
considered, one distinguishes different types of attacks according to
the information available for the attacker. For the shift cipher,
\ki{ciphertext-only} attacks\inddef{attack!ciphertext-only}
were already mentioned. They are
the weakest type of attack, and a cryptosystem that does not resist
such attacks is not of much value.
Affine linear block ciphers such as the Vigen\`{e}re and the Hill
cipher are vulnerable to attacks in which the attacker knows the
plaintext corresponding to some ciphertext obtained and is able to
conclude the keys used. These attacks are called
\ki{known-plaintext attacks.}\inddef{attack!known-plaintext}
Affine linear block ciphers are even more
vulnerable to \ki{chosen-plaintext attacks,}\inddef{attack!chosen-plaintext} in which the
attacker can choose some plaintext and is then able to see which
ciphertext corresponds to the plaintext chosen. Another type of
attack is in particular relevant for asymmetric cryptosystems: In
an \ki{encryption-key attack,}\inddef{attack!encryption-key} the attacker merely knows the
public key but does not know any ciphertext yet, and seeks to
determine the private key from this information. The difference is
that the attacker now has plenty of time to perform computations,
whereas in the other types of attacks the ciphertext was already sent
and much less time is available to decrypt it. That is why keys of
much larger size are required in public-key cryptography to guarantee
the security of the system used. Hence, asymmetric cryptosystems are
much less efficient than symmetric cryptosystems in many practical
applications.
For the attacks mentioned above, the method of frequency counts is
often useful. This method exploits the redundancy of the natural
language used. For example, in many natural languages, the letter
``E'' is, by a statistically significant margin, the most frequent. On average,
the ``E'' occurs in long, ``typical'' texts with a percentage of
$12.31\%$ in English, of $15.87\%$ in French, and even of $18.46\%$ in
German. In other languages, different letters may occur most frequently. For
example, the ``A'' is the most frequent letter in long, ``typical''
Finnish texts, with a percentage of $12.06\%$.
That the method of frequency counts is useful for attacks on
monoalphabetic cryptosystems is obvious. For example, if in a
ciphertext encrypting a long German text by the shift cipher, the
letter occurring most frequently is ``Y'', which is rather rare in
German (as well as in many other languages), then it is most likely
that ``Y'' encrypts~``E''. Thus, the key used for encryption is ``U''
($k = 20$). In addition to
counting the frequency of single letters, one can also count the
frequency with which certain pairs of letters (so-called digrams) and
triples of letters (so-called trigrams) occur, and so on. This kind
of attack also works for polyalphabetic systems, provided the period
(i.e., the block length) is known.
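For the shift cipher, the frequency-count attack described above reduces to a few lines of Python (a sketch under the heuristic assumption that the most frequent ciphertext letter encrypts ``E''):

```python
from collections import Counter

def guess_shift_key(ciphertext: str) -> int:
    """Guess the shift key, assuming the most frequent ciphertext letter
    encrypts 'E' (index 4 in the alphabet)."""
    top = Counter(ciphertext).most_common(1)[0][0]
    return (ord(top) - ord("E")) % 26

# If "Y" dominates the ciphertext, the guessed key is 20, i.e., "U":
assert guess_shift_key("YQXYYZYBY") == 20
```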
Polyalphabetic cryptosystems with an unknown period, however, provide
more security. For example, the Vigen\`{e}re cipher resisted all
attempts at breaking it for a long time. Not until 1863, about
300 years after its discovery, the German cryptanalyst Friedrich
Wilhelm Kasiski found a method of breaking the Vigen\`{e}re cipher.
He showed how to determine the period, which initially is unknown,
from repetitions of the same substring in the ciphertext.
Subsequently, the ciphertext can be decrypted by means of frequency
counts. Singh writes that the British
eccentric Charles Babbage, who was considered a genius of his time
by many, presumably had discovered Kasiski's method even earlier, around 1854,
although he did not publish his work.
The pathbreaking work of Claude Shannon
(1916--2001), the father of modern coding and information theory, is
now considered a milestone in the history of cryptography. Shannon
proved that there exist cryptosystems that guarantee perfect secrecy
in a mathematically rigorous sense. More precisely, a cryptosystem
$(\mathcal{P}, \mathcal{C}, \mathcal{K}, \mathcal{E}, \mathcal{D})$
\ki{guarantees perfect secrecy}\inddef{perfect secrecy} if and only if
$|\mathcal{P}| = |\mathcal{C}| =
|\mathcal{K}|$, the keys in $\mathcal{K}$ are uniformly distributed,
and for each plaintext $p \in \mathcal{P}$ and for each ciphertext $c
\in \mathcal{C}$ there exists {\em exactly one\/} key $k \in
\mathcal{K}$ with $E_k(p) = c$. This means that such a cryptosystem
is often not useful for practical applications, since in order to
guarantee perfect secrecy, every key must be at least as long as the
message to be encrypted and can be used only once.
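Shannon's condition can be checked directly for the simplest system of this kind, the shift cipher over $\integers_{26}$ with a one-letter message and a uniformly distributed key (a small sanity check in Python, not a proof of the general theorem):

```python
def keys_mapping(p: int, c: int) -> list[int]:
    """All shift keys k over Z_26 with (p + k) mod 26 == c."""
    return [k for k in range(26) if (p + k) % 26 == c]

# For every plaintext/ciphertext pair there is exactly one such key,
# namely k = (c - p) mod 26 -- the condition for perfect secrecy.
assert all(len(keys_mapping(p, c)) == 1 for p in range(26) for c in range(26))
```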
\subsection{Algebra, number theory, and graph theory}\label{sec:algebra}
In order to understand some of the algorithms and problems to be
presented later, some fundamental notions, definitions, and results
from algebra and, in particular, from number theory, group theory, and
graph theory are required. This concerns both the cryptosystems and
zero-knowledge protocols in Chapter~\ref{chap:cryptology} and some of
the problems to be considered in upcoming
Chapter~\ref{chap:complexity}. The present subsection may as well be
skipped for now and revisited later when the required notions and
results come up. In this section, most of the proofs are omitted.
\begin{defi}[Group, ring, and field]\label{def:group}
\inddef{group}\inddef{ring}\inddef{field}
\begin{itemize}
\item A \ki{group} $\mathfrak{G} = (S, \circ)$ is defined by some
nonempty set $S$ and a binary operation $\circ$ on~$S$ that satisfy
the following axioms:
\begin{itemize}
\item {\em Closure:\/} $(\forall x \in S)\, (\forall y \in S)\, [x \circ y
\in S]$ \koz .
\item {\em Associativity:\/} $(\forall x \in S)\, (\forall y \in S)\, (\forall
z \in S)\, [(x \circ y) \circ z = x \circ (y \circ z)]$ \koz .
\item {\em Neutral element:\/} $(\exists e \in S)\, (\forall x \in S)\, [e
\circ x = x \circ e = x]$ \koz .
\item {\em Inverse element:\/} $(\forall x \in S)\, (\exists x^{-1} \in S)\, [x
\circ x^{-1} = x^{-1} \circ x = e]$ \koz .
\end{itemize}
The element $e$ is called the \ki{neutral element of the group $\mathfrak{G}$}.\inddef{neutral element of a group}
The element $x^{-1}$ is called the
\ki{inverse element for~$x$.}\inddef{inverse element for $x$} $\mathfrak{G}$ is said to be a
\ki{monoid}\inddef{monoid} if $\mathfrak{G}$ satisfies associativity and closure under~$\circ$,
even if $\mathfrak{G}$ has no neutral element or if not every element
in $\mathfrak{G}$ has an inverse. A group $\mathfrak{G} = (S, \circ)$
(respectively, a monoid~$\mathfrak{G} = (S, \circ)$) is said to be
\ki{commutative}\inddef{monoid!commutative} (or \textit{abelian})\index{abelian monoid|see{monoid}}
if and only if $x \circ y = y \circ x$ for all $x, y \in S$. The number of elements of a finite
group $\mathfrak{G}$ is said to be \ki{the order of $\mathfrak{G}$}\inddef{order of a group}
and is denoted by~$|\mathfrak{G}|$.
\item $\mathfrak{H} = (T, \circ)$ is said to be a
\ki{subgroup of a group $\mathfrak{G} = (S, \circ)$}\inddef{subgroup of a group} (denoted by
$\mathfrak{H} \leq \mathfrak{G}$) if and only if $T \seq S$ and $\mathfrak{H}$ satisfies
the group axioms.
\item A \ki{ring}\inddef{ring} is a triple $\mathfrak{R} = (S, +, \cdot)$ such
that $(S, +)$ is an abelian group and $(S, \cdot)$ is a monoid and
the distributive laws are satisfied:
\begin{equation*}
(\forall x \in S)\, (\forall y \in S)\, (\forall
z \in S)\ :
\end{equation*}
\begin{equation*}
(x \cdot (y+z) = (x \cdot y) + (x \cdot z)) \wedge
((x+y) \cdot z = (x \cdot z) + (y \cdot z)) \koz.
\end{equation*}
A ring $\mathfrak{R} = (S, +, \cdot)$ is said to be \ki{commutative}\inddef{ring!commutative} if and
only if the monoid $(S, \cdot)$ is commutative. The neutral element of the group
$(S,+)$ is called the \ki{zero element}\inddef{zero element of a group} (the \textit{zero}, for short) of the
ring~$\mathfrak{R}$. The neutral element of the monoid $(S, \cdot)$ is called
the \ki{one element}\inddef{one element of a group} (the \textit{one}, for short) of the ring~$\mathfrak{R}$.
\item Let $\mathfrak{R} = (S, +, \cdot)$ be a ring with one. An
element $x$ of $\mathfrak{R}$ is said to be \ki{invertible}\inddef{invertible element of a ring} (or
a \textit{unity of~$\mathfrak{R}$}) if and only if it is invertible in
the monoid~$(S, \cdot)$.
\item A \ki{field}\inddef{field} is a commutative ring with one in which each
nonzero element is invertible.
\end{itemize}
\end{defi}
\begin{pelda}[Group, ring, and field]
\label{exa:gruppe}
\begin{itemize}
\item Let $k \in \N$. The set $\Z_k = \{0, 1, \ldots ,
k-1\}$ is a finite group with respect to addition modulo~$k$, with
neutral element $0$. With respect to addition and multiplication
modulo~$k$, $\Z_k$ is a commutative ring with one, see
Problem \ref{ex:field}. If
$p$ is a \textit{prime number}, then $\Z_{p}$ is a field with respect to
addition and multiplication modulo~$p$.
\item Let $\gcd(n,m)$ denote the greatest common divisor of two
integers $m$ and~$n$. For $k \in \N$, define the set
$\Z_{k}^{\ast} = \{ i \condition \mbox{$1 \leq i \leq k-1$
and $\gcd(i,k) = 1$}\}$. With respect to multiplication modulo~$k$,
$\Z_{k}^{\ast}$ is a finite group with neutral element~$1$.
\end{itemize}
\end{pelda}
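The group axioms of Definition~\ref{def:group} can be checked mechanically for small finite structures. The following Python sketch (an illustration only, not part of the algorithms of this chapter) builds $\Z_{k}^{\ast}$ for $k = 12$ and verifies the axioms by brute force:

```python
from math import gcd

def z_star(k):
    """The multiplicative group Z_k^* = {i : 1 <= i <= k-1, gcd(i,k) = 1}."""
    return [i for i in range(1, k) if gcd(i, k) == 1]

def check_group(elements, op, e):
    """Verify closure, associativity, neutral element, and inverses
    by exhaustive search (feasible only for small groups)."""
    s = set(elements)
    closed = all(op(x, y) in s for x in s for y in s)
    assoc = all(op(op(x, y), z) == op(x, op(y, z))
                for x in s for y in s for z in s)
    neutral = all(op(e, x) == x == op(x, e) for x in s)
    inverses = all(any(op(x, y) == e for y in s) for x in s)
    return closed and assoc and neutral and inverses

mul_mod_12 = lambda x, y: (x * y) % 12
print(z_star(12))                          # [1, 5, 7, 11]
print(check_group(z_star(12), mul_mod_12, 1))  # True
```

Here each element of $\Z_{12}^{\ast}$ happens to be its own inverse, e.g. $5 \cdot 5 = 25 \equiv 1 \bmod 12$.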
If the operation $\circ$ of a group $\mathfrak{G} = (S, \circ)$ is clear from
the context, we omit stating it explicitly. The group $\Z_{k}^{\ast}$
from Example~\ref{exa:gruppe} will play a particular role in
Section~\ref{sec:rsa-factoring}, where the RSA cryptosystem is introduced.
The \textit{Euler function}\index{Euler function} $\varphi$ gives the order of this group, i.e.,
$\varphi(k) = |\Z_{k}^{\ast}|$. The following properties of
$\varphi$ follow from the definition:
\begin{itemize}
\item $\varphi(m \cdot n) = \varphi(m) \cdot \varphi(n)$ for all $m, n \in
\N$ with $\gcd(m,n) = 1$, and
\item $\varphi(p) = p - 1$ for all prime numbers~$p$.
\end{itemize}
The proof of these properties is left to the reader as
Exercise \ref{ex:euler-funktion}. In particular, we will apply
the following fact in Subsection~\ref{sec:rsa}, which is a consequence of the
properties above.
\begin{allit}
\label{fac:euler-funktion}
If $n = p \cdot q$ for prime numbers $p$ and $q$, then $\varphi(n) =
(p-1)(q-1)$.
\end{allit}
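Both properties of $\varphi$ and the claim above are easy to check numerically. A small Python sketch computes $\varphi$ directly from its definition as $|\Z_{k}^{\ast}|$ (hypothetical numeric examples chosen for illustration):

```python
from math import gcd

def phi(k):
    """Euler's function: the order of the group Z_k^*."""
    return sum(1 for i in range(1, k) if gcd(i, k) == 1)

# phi is multiplicative on coprime arguments ...
assert phi(3 * 5) == phi(3) * phi(5)     # 8 = 2 * 4
# ... and phi(p) = p - 1 for primes, hence phi(p*q) = (p-1)(q-1):
p, q = 11, 13
assert phi(p * q) == (p - 1) * (q - 1)   # 120
print(phi(p * q))
```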
Euler's Theorem below is a special case (namely, for the
group~$\Z_{n}^{\ast}$) of Lagrange's Theorem, which says that for each
group element $a$ of a finite multiplicative group $\mathfrak{G}$ of order
$|\mathfrak{G}|$ and with neutral element~$e$, $a^{|\mathfrak{G}|} = e$. The
special case of Euler's\nevindex{Euler, Leonhard (1707--1783)} theorem,
where $n$ is a prime number not dividing~$a$,
is known as Fermat's Little Theorem.
\begin{tetel}[Euler]
For each $a \in \Z_{n}^{\ast}$, $a^{\varphi(n)} \equiv 1 \bmod n$.
\end{tetel}
\begin{kov}[Fermat's Little Theorem]
\label{thm:fermat}
If $p$ is a prime number and $a \in \Z_{p}^{\ast}$, then $a^{p-1}
\equiv 1 \bmod p$.
\end{kov}
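Euler's Theorem and Fermat's Little Theorem can likewise be confirmed by direct computation for small moduli; the moduli $20$ and $13$ below are hypothetical choices for illustration:

```python
from math import gcd

def phi(k):
    """Euler's function, computed directly from the definition."""
    return sum(1 for i in range(1, k) if gcd(i, k) == 1)

# Euler's Theorem: a^phi(n) = 1 (mod n) for every a in Z_n^*.
n = 20
assert all(pow(a, phi(n), n) == 1
           for a in range(1, n) if gcd(a, n) == 1)

# Fermat's Little Theorem: the special case of a prime modulus.
p = 13
assert all(pow(a, p - 1, p) == 1 for a in range(1, p))
print("both theorems hold for the chosen moduli")
```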
In Section~\ref{sec:graphisomorphie}, algorithms for the graph
isomorphism problem will be presented. This problem, which also is
related to the zero-knowledge protocols to be introduced in
Subsection~\ref{sec:zero-knowledge-for-graphiso}, can be seen as a
special case of certain group-theoretic problems. In particular,
\textit{permutation groups} are of interest here. Some examples for
illustration will be presented later.
\begin{defi}[Permutation group]\label{def:permutationgroup}
\begin{itemize}
\item A \ki{permutation}\inddef{permutation} is a bijective mapping of a set onto
itself. For each integer $n \geq 1$, let $[n] = \{1, 2, \ldots ,
n\}$. The set of all permutations of $[n]$ is denoted
by~$\mathfrak{S}_n$. For algorithmic purposes, permutations $\pi \in
\mathfrak{S}_n$ are given as pairs $(i, \pi(i))$ from $[n] \times [n]$.
\item If one defines the composition of permutations as an operation
on~$\mathfrak{S}_n$, then $\mathfrak{S}_n$ becomes a group. For two
permutations $\pi$ and $\tau$ in~$\mathfrak{S}_n$, their {\em
composition\/} $\pi \tau$ is defined to be that permutation in
$\mathfrak{S}_n$ that results from first applying $\pi$ and then
$\tau$ to the elements of $[n]$, i.e., $(\pi \tau)(i) = \tau (\pi
(i))$ for each $i \in [n]$. The neutral element of the permutation
group $\mathfrak{S}_n$ is the \ki{identical permutation,}\inddef{identical permutation}\index{permutation!identical}
which is defined by $\idJR (i) = i$ for each $i \in [n]$. The subgroup of
$\mathfrak{S}_n$ that contains $\mbox{\rm{}id}$ as its only element is denoted
by $\boldid$.
\item For any subset $\mathfrak{T}$ of~$\mathfrak{S}_{n}$, the
\ki{permutation group $\langle\mathfrak{T}\rangle$ generated by $\mathfrak{T}$}\inddef{permutation group}
is defined as the smallest subgroup of
$\mathfrak{S}_{n}$ containing~$\mathfrak{T}$. Subgroups
$\mathfrak{G}$ of $\mathfrak{S}_{n}$ are represented by their
generating sets, sometimes dubbed the \textit{generators of~$\mathfrak{G}$}.
In~$\mathfrak{G}$, the \ki{orbit of an element $i \in [n]$}\inddef{orbit of an element}
is defined as $\mathfrak{G}(i) = \{ \pi(i) \condition \pi \in \mathfrak{G}\}$.
\item For any subset $T$ of~$[n]$, let $\mathfrak{S}_{n}^{T}$ denote
the subgroup of~$\mathfrak{S}_n$ that maps every element of $T$ onto
itself. In particular, for $i \leq n$ and a subgroup $\mathfrak{G}$
of $\mathfrak{S}_n$, the \ki{(pointwise) stabiliser of $[i]$ in
$\mathfrak{G}$}\inddef{pointwise stabiliser}\inddef{stabiliser}\index{stabiliser!pointwise} is defined by
\[
\mathfrak{G}^{(i)} = \{\pi \in \mathfrak{G} \condition \mbox{$\pi(j) =
j$ for each $j \in [i]$} \} \koz .
\]
Observe that $\mathfrak{G}^{(n)} = \boldid$ and $\mathfrak{G}^{(0)} =
\mathfrak{G}$.
\item Let $\mathfrak{G}$ and $\mathfrak{H}$ be permutation groups with
$\mathfrak{H} \leq \mathfrak{G}$. For~$\tau \in \mathfrak{G}$,
$\mathfrak{H} \tau = \{ \pi \tau \condition \pi \in \mathfrak{H}\}$
is said to be a \ki{right coset of $\mathfrak{H}$ in
$\mathfrak{G}$}.\inddef{right coset} Any two right cosets of $\mathfrak{H}$ in
$\mathfrak{G}$ are either identical or disjoint. Thus, the
permutation group $\mathfrak{G}$ is partitioned by the right cosets
of $\mathfrak{H}$ in~$\mathfrak{G}$:
\begin{eqnarray}
\label{equ:right-coset}
\mathfrak{G} = \mathfrak{H} \tau_1 \cup \mathfrak{H} \tau_2 \cup
\cdots \cup \mathfrak{H} \tau_k \koz .
\end{eqnarray}
Every right coset of $\mathfrak{H}$ in $\mathfrak{G}$ has the
cardinality~$|\mathfrak{H}|$. The set $\{\tau_1, \tau_2, \ldots ,
\tau_k\}$ in~(\ref{equ:right-coset}) is called the \ki{complete right
transversal of $\mathfrak{H}$ in~$\mathfrak{G}$}.\inddef{complete right transversal}
\end{itemize}
\end{defi}
The notion of pointwise stabilisers is especially important for the
design of algorithms solving problems on permutation groups. The
crucial structure exploited there is the so-called \ki{tower of
stabilisers}\inddef{tower of stabilisers} of a permutation group~$\mathfrak{G}$:
\[
\boldid = \mathfrak{G}^{(n)} \leq \mathfrak{G}^{(n-1)} \leq \cdots \leq
\mathfrak{G}^{(1)} \leq \mathfrak{G}^{(0)} = \mathfrak{G} \koz .
\]
For each $i$ with $1 \leq i \leq n$, let $\mathfrak{T}_i$ be the
complete right transversal of $\mathfrak{G}^{(i)}$
in~$\mathfrak{G}^{(i-1)}$. Then, $\mathfrak{T} = \bigcup_{i=1}^{n}
\mathfrak{T}_i$ is said to be a \ki{strong generator of~$\mathfrak{G}$,}\inddef{strong generator of a group}
and we have $\mathfrak{G} =
\langle\mathfrak{T}\rangle$. Every $\pi \in \mathfrak{G}$ then has a
unique factorisation $\pi = \tau_1 \tau_2 \cdots \tau_n$ with $\tau_i
\in \mathfrak{T}_i$. The following basic algorithmic results about
permutation groups will be useful later in
Section~\ref{sec:graphisomorphie}.
\begin{tetel}
\label{thm:permutationgroup}
Let a permutation group $\mathfrak{G} \leq \mathfrak{S}_{n}$ be given
by a set of generators. Then, we have:
\begin{enumerate}
\item For each $i \in [n]$, the orbit $\mathfrak{G}(i)$ of $i$ in
$\mathfrak{G}$ can be computed in polynomial time.
\item The tower of stabilisers $\boldid = \mathfrak{G}^{(n)} \leq
\mathfrak{G}^{(n-1)} \leq \cdots \leq \mathfrak{G}^{(1)} \leq
\mathfrak{G}^{(0)} = \mathfrak{G}$ can be computed in time
polynomial in~$n$, i.e., for each $i$ with $1 \leq i \leq n$, the
complete right transversals $\mathfrak{T}_i$ of $\mathfrak{G}^{(i)}$
in~$\mathfrak{G}^{(i-1)}$ and thus a strong generator of
$\mathfrak{G}$ can be determined efficiently.
\end{enumerate}
\end{tetel}
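The orbit computation of part 1 is essentially a reachability search: starting from~$i$, apply the generators until no new points appear. A minimal Python sketch, using the hypothetical generator $(1\,2)(3\,4)(5)$ written as an explicit mapping:

```python
def orbit(i, generators):
    """Orbit of the point i under the group generated by the given
    permutations (each a dict j -> pi(j)).  The reachable set is
    grown by a breadth-first closure; the running time is polynomial
    in the degree n and the number of generators."""
    reached = {i}
    frontier = [i]
    while frontier:
        j = frontier.pop()
        for pi in generators:
            if pi[j] not in reached:
                reached.add(pi[j])
                frontier.append(pi[j])
    return reached

# The permutation (1 2)(3 4)(5) of [5] as a mapping:
tau = {1: 2, 2: 1, 3: 4, 4: 3, 5: 5}
print(orbit(1, [tau]))  # {1, 2}
print(orbit(5, [tau]))  # {5}
```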
The notions introduced in Definition~\ref{def:permutationgroup} for
permutation groups are now explained for concrete examples from graph
theory. In particular, we consider the automorphism group and the set
of isomorphisms between graphs. We start by introducing some useful
graph-theoretical concepts.
\vspace{-1mm}
\begin{defi}[Graph isomorphism and graph automorphism]\label{def:graphiso}
A graph $G$ consists of a finite set of vertices, $V(G),$ and a
finite set of edges, $E(G),$ that connect certain vertices with each
other. We assume that no multiple edges and no loops occur. In this
chapter, we consider only undirected graphs, i.e., the edges are not
oriented and can be seen as unordered vertex pairs. The {\em disjoint
union $G \cup H$ of two graphs $G$ and $H$} is defined as the graph
with vertex set $V(G) \cup V(H)$ and edge set $E(G) \cup E(H)$, where
we assume that the vertex sets $V(G)$ and $V(H)$ are made
disjoint (by renaming if necessary).
Let $G$ and $H$ be two graphs with the same number of vertices. An
\ki{isomorphism\inddef{graph isomorphism} between $G$ and $H$} is an edge-preserving
bijection of the vertex set of $G$ onto that of~$H.$ Under the
convention that $V(G) = \{1, 2, \ldots , n\} = V(H)$, $G$ and $H$ are
isomorphic ($G \cong H$, for short) if and only if there exists a
permutation $\pi \in \mathfrak{S}_n$ such that for all vertices $i,j \in V(G)$,
\vspace{-1mm}
\begin{eqnarray}\label{eqn:iso}
\{i,j\} \in E(G) & \Longleftrightarrow & \{\pi(i),\pi(j)\} \in E(H) \koz .
\end{eqnarray}
\vspace{-1mm}
An \ki{automorphism of $G$}\index{graph automorphism}\inddef{automorphism!of a graph}
is an edge-preserving bijection of the
vertex set of $G$ onto itself. Every graph has the trivial
automorphism~$\idJR.$ By $\iso{G,H}$ we denote the set of all
isomorphisms between $G$ and~$H,$ and by $\auto{G}$ we denote the set
of all automorphisms of~$G$. Define the problems \ki{graph isomorphism}\inddef{graph isomorphism}
($\graphiso$, for short) and \ki{graph automorphism}\inddef{graph automorphism}
($\graphauto$, for short) by:
\begin{eqnarray*}
\graphiso & = & \{ \pair{G,H} \condition
\mbox{$G$ and $H$ are isomorphic graphs} \} \koz ; \\
\graphauto & = & \{ G \condition \mbox{$G$ has a nontrivial automorphism} \} \koz .
\end{eqnarray*}
For algorithmic purposes, graphs are represented either by their
vertex and edge lists or by an adjacency matrix, which has the entry
$1$ at position $(i,j)$ if $\{i,j\}$ is an edge, and the entry $0$
otherwise. This graph representation is suitably encoded over the
alphabet $\Sigma = \{0,1\}$. In order to represent pairs of graphs,
we use a standard bijective pairing function $\pair{\cdot, \cdot}$
from $\sigmastar \times \sigmastar$ onto $\sigmastar$ that is
polynomial-time computable and has polynomial-time computable inverses.
\end{defi}
\vspace{-2mm}
\begin{pelda}[Graph isomorphism and graph automorphism]\label{exa:graphiso}
The graphs $G$ and $H$ shown in Figure~\ref{fig:graphiso} are isomorphic.
\begin{figure}[!t]
\footnotesize\centering
\input{figs/noniso.eepic}
\hspace*{1.5cm}
\input{figs/iso.eepic}
\caption{Three graphs: $G$ is isomorphic to~$H$, but not to~$F$.}
\label{fig:graphiso}
\end{figure}
An isomorphism $\pi : V(G) \rightarrow V(H)$ preserving adjacency of
the vertices according to~(\ref{eqn:iso}) is given, e.g., by $\pi =
(\mbox{\tiny $
\begin{array}{@{}c@{\hspace*{.8mm}}c@{\hspace*{.8mm}}c@{\hspace*{.8mm}}c@{\hspace*{.8mm}}c@{}}
1 & 2 & 3 & 4 & 5 \\
3 & 4 & 1 & 5 & 2
\end{array}
$})$, or in cycle notation by $\pi = (1\, 3)(2\, 4\, 5)$. There are
three further isomorphisms between $G$ and~$H$, i.e., $|\iso{G,H}| =
4$, see Exercise \ref{ex:graphiso}. However, neither $G$
nor $H$ is isomorphic to~$F$. This is immediately seen from the fact
that the sequence of {\em vertex degrees\/} (the number of edges
fanning out of each vertex) of $G$ and~$H$, respectively, differs
from the sequence of vertex degrees of~$F$: For $G$ and~$H$, this
sequence is $(2,3,3,4,4)$, whereas it is $(3,3,3,3,4)$ for~$F$. A
nontrivial automorphism $\varphi : V(G) \rightarrow V(G)$ of $G$ is
given, e.g., by $\varphi = (\mbox{\tiny $
\begin{array}{@{}c@{\hspace*{.8mm}}c@{\hspace*{.8mm}}c@{\hspace*{.8mm}}c@{\hspace*{.8mm}}c@{}}
1 & 2 & 3 & 4 & 5 \\ 2 & 1 & 4 & 3 & 5
\end{array}
$})$, or $\varphi = (1\, 2)(3\, 4)(5)$; another one by
$\tau = (\mbox{\tiny $
\begin{array}{@{}c@{\hspace*{.8mm}}c@{\hspace*{.8mm}}c@{\hspace*{.8mm}}c@{\hspace*{.8mm}}c@{}}
1 & 2 & 3 & 4 & 5 \\ 1 & 2 & 4 & 3 & 5
\end{array}
$})$, or $\tau = (1)(2)(3\, 4)(5)$. There are two more automorphisms
of~$G$, i.e., $|\auto{G}| = 4$, see Exercise \ref{ex:graphiso}.
The permutation groups $\auto{F}$, $\auto{G}$, and $\auto{H}$ are
subgroups of~$\mathfrak{S}_5$. The tower $\auto{G}^{(5)} \leq
\auto{G}^{(4)} \leq \cdots \leq \auto{G}^{(1)} \leq \auto{G}^{(0)}$ of
stabilisers of $\auto{G}$ consists of the subgroups $\auto{G}^{(5)} =
\auto{G}^{(4)} = \auto{G}^{(3)} = \boldid$, $\auto{G}^{(2)} =
\auto{G}^{(1)} = \langle\{\idJR , \tau\}\rangle$, and $\auto{G}^{(0)} =
\auto{G}$. In the automorphism group $\auto{G}$ of~$G$, the vertices
$1$ and $2$ have the orbit $\{1,2\}$, the vertices $3$ and $4$ have
the orbit $\{3,4\}$, and vertex $5$ has the orbit~$\{5\}$.
\end{pelda}
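For graphs as small as these, $\iso{G,H}$ and $\auto{G}$ can be determined by exhausting all $n!$ permutations of~$[n]$. The following Python sketch does this for a hypothetical pair of $3$-vertex paths (not the graphs of the figure):

```python
from itertools import permutations

def isomorphisms(edges_g, edges_h, n):
    """All pi in S_n with {i,j} in E(G) <=> {pi(i),pi(j)} in E(H),
    found by exhausting the n! permutations (small n only)."""
    eg = {frozenset(e) for e in edges_g}
    eh = {frozenset(e) for e in edges_h}
    found = []
    for perm in permutations(range(1, n + 1)):
        pi = dict(zip(range(1, n + 1), perm))
        if {frozenset({pi[i], pi[j]}) for i, j in eg} == eh:
            found.append(pi)
    return found

# Hypothetical example: the path 1-2-3 and the path 2-1-3.
g = [(1, 2), (2, 3)]
h = [(2, 1), (1, 3)]
print(len(isomorphisms(g, h, 3)))  # |Iso(G,H)| = 2
print(len(isomorphisms(g, g, 3)))  # |Aut(G)|   = 2
```

As the lemma predicts, the two counts agree because the paths are isomorphic.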
$\iso{G,H}$ and $\auto{G}$ have the same number of elements if and
only if $G$ and $H$ are isomorphic. To wit, if $G$ and $H$ are
isomorphic, then $|\iso{G,H}| = |\auto{G}|$ follows from $\auto{G} =
\iso{G,G}$. Otherwise, if $G \not\cong H$, then $\iso{G,H}$ is empty
but $\auto{G}$ always contains the trivial automorphism $\mbox{\rm{}id}$. This
implies (\ref{eqn:graphiso-1}) in Lemma~\ref{lem:graphiso} below,
which we will need later in Section~\ref{sec:graphisomorphie}. For
proving (\ref{eqn:graphiso-2}), suppose that $G$ and $H$ are
connected; otherwise, we simply consider instead of $G$ and $H$ the
co-graphs $\overline{G}$ and~$\overline{H}$, see
Exercise \ref{ex:cograph}. An automorphism of $G \cup H$
that switches the vertices of $G$ and~$H$ consists of an isomorphism
in $\iso{G,H}$ and an isomorphism in~$\iso{H,G}$. Thus, $|\auto{G
\cup H}| = |\auto{G}| \cdot |\auto{H}| + |\iso{G,H}|^2$, which
implies (\ref{eqn:graphiso-2}) via~(\ref{eqn:graphiso-1}).
\begin{lemma}
\label{lem:graphiso}
For any two graphs $G$ and~$H$, we have
\begin{eqnarray}
|\iso{G,H}| & = & \left\{
\begin{array}{ll}
|\auto{G}| = |\auto{H}| & \mbox{ if $G \cong H$} \\
0 & \mbox{ if $G \not\cong H$ \koz ;}
\end{array}
\right. \label{eqn:graphiso-1} \\
|\auto{G \cup H}| & = & \left\{
\begin{array}{ll}
2 \cdot |\auto{G}| \cdot |\auto{H}| & \mbox{ if $G \cong H$} \\
|\auto{G}| \cdot |\auto{H}| & \mbox{ if $G \not\cong H$ \koz .}
\end{array}
\right.
\label{eqn:graphiso-2}
\end{eqnarray}
\end{lemma}
If $G$ and $H$ are isomorphic graphs and if $\tau \in \iso{G,H}$, then
$\iso{G,H} = \auto{G} \tau$. Thus, $\iso{G,H}$ is a right coset of
$\auto{G}$ in~$\mathfrak{S}_n$. Since any two right cosets are either
identical or disjoint, $\mathfrak{S}_n$ can be partitioned into right
cosets of $\auto{G}$ according to~(\ref{equ:right-coset}):
\begin{eqnarray}
\label{equ:right-coset-graph}
\mathfrak{S}_n = \auto{G} \tau_1 \cup \auto{G} \tau_2 \cup
\cdots \cup \auto{G} \tau_k \koz ,
\end{eqnarray}
where $|\auto{G} \tau_i| = |\auto{G}|$ for all~$i$, $1 \leq i \leq k$.
The set $\{\tau_1, \tau_2, \ldots , \tau_k\}$ of permutations in
$\mathfrak{S}_n$ thus is a complete right transversal of $\auto{G}$
in~$\mathfrak{S}_n$. Denoting by $\pi(G)$ the graph $H \cong G$ that
results from applying the permutation $\pi \in \mathfrak{S}_n$ to the
vertices of~$G$, we have $\{ \tau_i(G) \condition 1 \leq i \leq k\} = \{H
\condition H \cong G\}$. Since there are exactly $n! = n (n-1) \cdots
2 \cdot 1$ permutations in~$\mathfrak{S}_n$, it follows from
(\ref{equ:right-coset-graph}) that
\[
|\{H \condition H \cong G\}| = k = \frac{|\mathfrak{S}_{n}|}{|\auto{G}|} =
\frac{n!}{|\auto{G}|} \koz .
\]
This proves the following corollary.
\begin{kov}
\label{cor:number-of-isomorphic-graphs}
If $G$ is a graph with $n$ vertices, then there are exactly
$n!/|\auto{G}|$
graphs isomorphic to~$G$.
\end{kov}
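The corollary can be verified by brute force for small graphs: generate all $n!$ images $\pi(G)$ and count the distinct ones. A Python sketch using the path on $4$ vertices (a hypothetical example with $|\auto{G}| = 2$, namely the identity and the reversal):

```python
from itertools import permutations
from math import factorial

def images(edges, n):
    """All distinct graphs pi(G) for pi in S_n, as sets of edges."""
    seen = set()
    for perm in permutations(range(1, n + 1)):
        pi = dict(zip(range(1, n + 1), perm))
        seen.add(frozenset(frozenset({pi[i], pi[j]}) for i, j in edges))
    return seen

def aut_size(edges, n):
    """|Aut(G)| by exhausting all n! permutations."""
    e = {frozenset(x) for x in edges}
    return sum(1 for perm in permutations(range(1, n + 1))
               if {frozenset({perm[i - 1], perm[j - 1]}) for i, j in e} == e)

g = [(1, 2), (2, 3), (3, 4)]   # the path 1-2-3-4
n = 4
print(len(images(g, n)))               # 12 graphs isomorphic to G
print(factorial(n) // aut_size(g, n))  # n!/|Aut(G)| = 24/2 = 12
```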
For the graph $G$ in Figure~\ref{fig:graphiso} from
Example~\ref{exa:graphiso}, there thus exist exactly $5!/4 = 30$
isomorphic graphs. The following lemma will be used later in
Section~\ref{sec:graphisomorphie}.
\begin{lemma}
\label{lem:graphiso-pair}
Let $G$ and $H$ be two graphs with $n$ vertices. Define the set
\begin{eqnarray*}
\label{eqn:graphiso-pair-def}
A(G,H) & = &
\{\pair{F,\varphi} \condition \mbox{$F \cong G$ and $\varphi \in \auto{F}$}\}
\cup
\{\pair{F,\varphi} \condition \mbox{$F \cong H$ and $\varphi \in \auto{F}$}\} \koz .
\end{eqnarray*}
Then, we have
\begin{eqnarray*}
\label{eqn:graphiso-pair}
|A(G,H)| & = & \left\{
\begin{array}{ll}
n! & \mbox{ if $G \cong H$} \\
2n! & \mbox{ if $G \not\cong H$ \koz .}
\end{array}
\right.
\end{eqnarray*}
\end{lemma}
\begin{biz}
If $F$ and $G$ are isomorphic, then $|\auto{F}| = |\auto{G}|$.
Hence, by Corollary~\ref{cor:number-of-isomorphic-graphs}, we have
\[
|\{\pair{F,\varphi} \condition \mbox{$F \cong G$ and $\varphi \in \auto{F}$}\}|
= \frac{n!}{|\auto{F}|} \cdot |\auto{F}| = n! \koz .
\]
Analogously, one can show that $|\{\pair{F,\varphi} \condition
\mbox{$F \cong H$ and $\varphi \in \auto{F}$}\}| = n!$. If $G$
and $H$ are isomorphic, then
\begin{eqnarray*}
\{\pair{F,\varphi} \condition \mbox{$F \cong G$ and $\varphi \in \auto{F}$}\}
& = &
\{\pair{F,\varphi} \condition \mbox{$F \cong H$ and $\varphi \in \auto{F}$}\} \koz .
\end{eqnarray*}
It follows that $|A(G,H)| = n!$. If $G$ and $H$ are nonisomorphic,
then $\{\pair{F,\varphi} \condition \mbox{$F \cong G$}$ \linebreak
and $\varphi \in \auto{F} \}$ and $\{\pair{F,\varphi} \condition \mbox{$F \cong H$ and
$\varphi \in \auto{F}$}\}$ are disjoint sets. Thus, $|A(G,H)| = 2n!.$
\end{biz}
\begin{gyak}
\ujgyak Figure \ref{tab:dechiffrierung-caesar-vigenere}\label{ex:dechiffrierung-caesar-vigenere} shows two ciphertexts,
$c_1$ and $c_2.$ It is known that both encrypt the same plaintext $m$ and
that one was obtained using the shift cipher, the other one using the
Vigen\`{e}re cipher. Decrypt both ciphertexts. \textit{Hint.} After decryption
you will notice that the plaintext obtained is
a true statement for one of the two ciphertexts, whereas it is a false
statement for the other ciphertext. Might the method of frequency
counts be useful here?
\begin{figure}[!t]
\centering
\rm\small
\begin{tabular}{|c|c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}c@{\hspace*{1.0mm}}|}
\hline
$c_1$ & W & K & L & V & V & H & Q & W & H & Q
& F & H & L & V & H & Q & F & U & B & S
& W & H & G & E & B & F & D & H & V & D
& U & V & N & H & B \\ \hline
$c_2$ & N & U & O & S & J & Y & A & Z & E & E
& W & R & O & S & V & H & P & X & Y & G
& N & R & J & B & P & W & N & K & S & R
& L & F & Q & E & P \\ \hline
\end{tabular}
\caption{Two ciphertexts encrypting the same plaintext, see
Exercise \ref{ex:dechiffrierung-caesar-vigenere}.
\label{tab:dechiffrierung-caesar-vigenere}
}
\end{figure}
\ujgyak Prove that $\integers$ is a ring with respect to ordinary
addition and multiplication. Is it also a field? What can be said
about the properties of the algebraic structures $(\N,+)$,
$(\N,\cdot)$, and $(\N,+,\cdot)$?
\ujgyak Prove the properties stated for Euler's $\varphi$ function:
\textit{a.} $\varphi(m \cdot n) = \varphi(m) \cdot \varphi(n)$ for all
$m, n \in \N$ with $\gcd(m,n) = 1$.
\textit{b.} $\varphi(p) = p - 1$ for all prime numbers~$p.$
Using these properties, prove Proposition~\ref{fac:euler-funktion}.\label{ex:euler-funktion}
\ujgyak Consider the graphs $F$, $G$, and $H$ from
Figure~\ref{fig:graphiso} in Example~\ref{exa:graphiso}.
\textit{a.} Determine all isomorphisms between $G$ and~$H$.
\textit{b.} Determine all automorphisms of~$F$, $G$, and~$H$.
\textit{c.} For which isomorphisms between $G$ and $H$ is $\iso{G,H}$ a right coset of
$\auto{G}$ in~$\mathfrak{S}_5$, i.e., for which $\tau \in \iso{G,H}$ is $\iso{G,H} = \auto{G} \tau$?
Determine the complete right transversals of $\auto{F}$, $\auto{G}$,
and $\auto{H}$ in~$\mathfrak{S}_5$.
\textit{d.} Determine the orbit of all vertices of $F$ in
$\auto{F}$ and the orbit of all vertices of $H$ in $\auto{H}$.
\textit{e.} Determine the tower of stabilisers of the subgroups
$\auto{F} \leq \mathfrak{S}_5$ and $\auto{H} \leq \mathfrak{S}_5$.
\textit{f.} How many graphs with $5$ vertices are isomorphic
to~$F$?\label{ex:graphiso}
\ujgyak The {\em co-graph\/} $\overline{G}$ of a graph $G$ is defined by
the vertex set $V(\overline{G}) = V(G)$ and the edge set
$E(\overline{G}) = \{ \{i,j\} \condition i, j \in V(\overline{G})
\mbox{ and } \{i,j\} \not\in E(G) \}$. Prove the following claims:
\textit{a.} $\auto{G} = \auto{\overline{G}}$.
\textit{b.} $\iso{G,H} = \iso{\overline{G},\overline{H}}$.
\textit{c.} $\overline{G}$ is connected if $G$ is not connected.\label{ex:cograph}
\end{gyak}
\section{Diff{}ie and Hellman's secret-key agreement protocol}\label{sec:diffie-hellman}
The basic number-theoretic facts presented in
Subsection~\ref{sec:algebra} will be needed in this and the subsequent
sections. In particular, recall the multiplicative group
$\integers_{k}^{\ast}$ from Example~\ref{exa:gruppe} and Euler's
$\varphi$ function. The arithmetic of residue class rings will be
explained at the end of this chapter, see
Problem \ref{ex:field}.
Figure~\ref{fig:diffie-hellman} shows the Diff{}ie-Hellman secret-key
agreement protocol, which is based on exponentiation with base $r$ and
modulus~$p$, where $p$ is a prime number and $r$ is a primitive root of~$p$.
A {\em primitive root of a number $n$\/} is any element $r \in
\integers_{n}^{\ast}$ such that $r^d \not\equiv 1 \bmod n$ for each $d$ with
$1 \leq d < \varphi(n)$. A primitive root $r$ of $n$ generates the entire
group~$\integers_{n}^{\ast}$, i.e., $\integers_{n}^{\ast} = \{r^i \condition 0
\leq i < \varphi(n)\}$. Recall that for any prime number $p$ the group
$\integers_{p}^{\ast}$ has order $\varphi(p) = p-1$; it
has exactly $\varphi(p-1)$ primitive roots, see also
Exercise \ref{ex:primitive-root}.
\begin{pelda}
\label{exa:primitive-root}
Consider $\integers_{5}^{\ast} = \{1, 2, 3, 4\}$. Since $\integers_{4}^{\ast}
= \{1, 3\}$, we have $\varphi(4) = 2$, and the two primitive roots of $5$ are
$2$ and~$3$. Both $2$ and $3$ generate all of~$\integers_{5}^{\ast}$, since:
\[
\begin{array}{llll}
2^0 = 1;\ & 2^1 = 2;\ & 2^2 = 4;\ & 2^3 \equiv 3 \bmod 5 \koz ; \\
3^0 = 1;\ & 3^1 = 3;\ & 3^2 \equiv 4 \bmod 5;\ & 3^3 \equiv 2 \bmod 5 \koz .
\end{array}
\]
\end{pelda}
Not every number has a primitive root; the smallest example is $8$. It
is known that a number $n$ has a primitive root if and only if $n$ either is
from $\{1,2,4\}$, or has the form $n = q^k$ or $n = 2q^k$ for some odd prime
number~$q$.
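These facts are easy to test by exhaustive search for small moduli. The following Python sketch finds all primitive roots of $n$ directly from the definition; the checks reproduce the primitive roots $2$ and $3$ of~$5$ from Example~\ref{exa:primitive-root}, and confirm that $8$ has none:

```python
from math import gcd

def phi(n):
    """Euler's function, from the definition."""
    return sum(1 for i in range(1, n) if gcd(i, n) == 1)

def primitive_roots(n):
    """All r in Z_n^* with r^d != 1 (mod n) for 1 <= d < phi(n)."""
    f = phi(n)
    roots = []
    for r in range(1, n):
        if gcd(r, n) != 1:
            continue
        if all(pow(r, d, n) != 1 for d in range(1, f)):
            roots.append(r)
    return roots

print(primitive_roots(5))   # [2, 3]
print(primitive_roots(8))   # []  -- 8 has no primitive root
# A prime p has exactly phi(p-1) primitive roots:
print(len(primitive_roots(13)) == phi(12))  # True
```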
\begin{defi}[Discrete logarithm]\label{def:discrete-logarithm}
Let $p$ be a prime number, and let $r$ be a primitive root of~$p$. The {\em
modular exponential function with base $r$ and modulus~$p$} is the function
${\tt exp}_{r,p}$ that maps from $\integers_{p-1}$ to~$\integers_{p}^{\ast}$,
and is defined by ${\tt exp}_{r,p}(a) = r^a \bmod p$. Its inverse function is
called the {\em discrete logarithm}, and maps for fixed $p$ and $r$ the value
${\tt exp}_{r,p}(a)$ to~$a$. We write $a = \log_r {\tt exp}_{r,p}(a) \bmod p$.
\end{defi}
\begin{figure}[!t]
\footnotesize \centering
\begin{tabular}{|c||c|c|c|}
\hline
\parbox[t]{0.6cm}{\bf Step} &
Alice &
Erich &
Bob \\ \hline\hline
{\bf 1} & \multicolumn{3}{c|}
{Alice and Bob agree upon a large prime number $p$ and a primitive root
$r$ of~$p$;} \\
& \multicolumn{3}{c|}
{both $p$ and $r$ are public} \\ \hline
{\bf 2} & \parbox[t]{4.3cm}
{chooses a large, secret number $a$ at random and computes
$\alpha = r^a \bmod p$}
& & \parbox[t]{4.3cm}
{chooses a large, secret number $b$ at random and computes
$\beta = r^b \bmod p$}
\\ \hline
{\bf 3} & &
$\alpha \Rightarrow$ & \\
& &
$\Leftarrow \beta$ & \\ \hline
{\bf 4} & \parbox[t]{4.3cm}
{computes $k_A = \beta^a \bmod p$}
& & \parbox[t]{4.3cm}
{computes $k_B = \alpha^b \bmod p$} \\ \hline
\end{tabular}
\caption{The Diff{}ie-Hellman secret-key agreement protocol.
\label{fig:diffie-hellman}
}
\end{figure}
The protocol given in Figure~\ref{fig:diffie-hellman} works, since (in the
arithmetics modulo~$p$)
\[
k_A = \beta^a = r^{ba} = r^{ab} = \alpha^b = k_B \koz ,
\]
so Alice indeed computes the same key as Bob. Computing this key is not
hard, since modular exponentiation can be performed efficiently by
the algorithm \textsc{Square-and-Multiply} below.
Erich, however, has a hard time when he attempts to determine their key, since
the discrete logarithm is considered to be a hard problem. That is why the
modular exponential function ${\tt exp}_{r,p}$ is a candidate for a {\em
one-way function}, a function that is easy to compute but hard to invert.
Note that it is not known whether or not one-way functions indeed exist; this
is one of the most challenging open research questions in cryptography. The
security of many cryptosystems rests on the assumption that one-way functions
indeed exist.
\begin{alg}{Square-and-Multiply$(a,b,m)$}\inddef{\textsc{Square-and-Multiply}}
1 \' \` $\rhd$ $m$ is the modulus, $b < m$ is the
base, and $a$ is the exponent \\
2 \' determine the binary expansion of the exponent
$a = \sum_{i=0}^{k} a_i 2^{i}$, where $a_i \in \{0,1\}$ \\
3 \' compute successively $b^{2^i}$, where $0 \leq i \leq k$, using
that $b^{2^{i+1}} = \left( b^{2^i}\right)^{2}$ \\
4 \' compute $b^a = \prod_{\stackrel{\mbox{\protect\scriptsize
$i=0$}}{a_i = 1}}^{k} b^{2^i}$ in the arithmetics modulo~$m$ \\
5 \' \` $\rhd$ as soon as a factor $b^{2^{i}}$ in the product and $b^{2^{i+1}}$ are determined, \\
\` $\rhd$ $b^{2^{i}}$ can be deleted and need not be stored \\
6 \' \key{return} $b^a$
\end{alg}
Why can the modular exponential function ${\tt exp}_{r,p}(a) = r^a \bmod p$ be
computed efficiently? Naively performed, this computation may require many
multiplications, depending on the size of the exponent~$a$. However, using
algorithm \textsc{Square-and-Multiply}
there is no need to perform $a-1$ multiplications as in the naive approach; no
more than $2 \log a$ multiplications suffice. The square-and-multiply
algorithm thus speeds modular exponentiation up by an exponential factor.
Note that in the arithmetics modulo~$m$, we have
\[
b^a = b^{\sum_{i = 0}^{k} a_i 2^{i}} =
\prod_{i = 0}^{k} \left( b^{2^i}\right)^{a_i} =
\prod_{\stackrel{\mbox{\protect\scriptsize $i = 0$}}{a_i = 1}}^{k} b^{2^i} \koz .
\]
Thus, the algorithm \textsc{Square-and-Multiply} is correct.
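The pseudocode translates directly into a few lines of Python. In this sketch the exponent is processed bit by bit from the least significant end, which computes the same product of powers $b^{2^i}$; Python's built-in three-argument \texttt{pow} is used only to cross-check the result:

```python
def square_and_multiply(a, b, m):
    """Compute b^a mod m with at most 2*log2(a) multiplications:
    repeatedly square to obtain b^(2^i) mod m, and multiply into
    the result exactly those powers whose bit a_i in the binary
    expansion of the exponent a is 1."""
    result = 1
    power = b % m                       # b^(2^0)
    while a > 0:
        if a & 1:                       # current bit a_i is 1
            result = (result * power) % m
        power = (power * power) % m     # b^(2^i) -> b^(2^(i+1))
        a >>= 1
    return result

print(square_and_multiply(17, 3, 5))                   # 3^17 mod 5 = 3
print(square_and_multiply(17, 3, 5) == pow(3, 17, 5))  # True
```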
\begin{pelda}[ Square-and-Multiply in the Diff{}ie-Hellman Protocol]
\label{exa:square-and-multiply}
Alice and Bob have chosen the prime number $p = 5$ and the primitive root $r =
3$ of~$5$. Alice picks the secret number $a = 17$. In order to send her
public number $\alpha$ to Bob, Alice wishes to compute $\alpha = 3^{17} =
129140163 \equiv 3 \bmod 5$. The binary expansion of the exponent is $17 = 1
+ 16 = 2^0 + 2^4$.
Alice successively computes the values:
\begin{center}
\begin{tabular}{c@{\hspace{1mm}}c@{\hspace{1mm}}c@{\hspace{1mm}}c@{\hspace{1mm}}c}
$3^{2^0}\hspace{-0.5mm} = 3;$ &
$3^{2^1}\hspace{-0.5mm} = 3^2 \equiv 4 \bmod 5;$ &
$3^{2^2}\hspace{-0.5mm} \equiv 4^2\hspace{-0.5mm} \equiv 1 \hspace{-0.5mm}\bmod 5;$ &
$3^{2^3}\hspace{-0.5mm} \equiv 1^2\hspace{-0.5mm} \equiv 1 \hspace{-0.5mm}\bmod 5;$ &
$3^{2^4} \hspace{-0.5mm}\equiv 1^2 \hspace{-0.5mm}\equiv 1 \bmod 5 \koz .$ \\
\end{tabular}
\end{center}
Then, she computes $3^{17} \equiv 3^{2^0} \cdot 3^{2^4} \equiv 3 \cdot 1
\equiv 3 \bmod 5$. Note that Alice does not have to multiply $16$ times but
merely performs four squarings and one multiplication to determine $\alpha = 3
\bmod 5$.
Suppose that Bob has chosen the secret exponent $b = 23$. By the same method,
he can compute his part of the key, $\beta = 3^{23} = 94143178827 \equiv 2
\bmod 5$. Now, Alice and Bob determine their joint secret key according to
the Diff{}ie-Hellman protocol from Figure~\ref{fig:diffie-hellman}; see
Exercise~\ref{ex:square-and-multiply}.
Note that the protocol is far from being secure in this case, since the prime
number $p = 5$ and the secret exponents $a = 17$ and $b = 23$ are much too
small. This toy example was chosen just to explain how the protocol works.
In practice, $a$ and $b$ should have at least $160$ bits each.
\end{pelda}
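A complete run of the protocol of Figure~\ref{fig:diffie-hellman} can be sketched in a few lines of Python. Here the secret exponents are drawn at random; the toy parameters $p = 5$, $r = 3$ match the example above, but as noted there, real deployments need far larger values:

```python
import secrets

def diffie_hellman(p, r):
    """One run of the Diffie-Hellman key agreement over Z_p^*.
    In practice p must be a large prime and r a primitive root
    of p; the key equality k_A = k_B holds for any base r."""
    a = secrets.randbelow(p - 2) + 1   # Alice's secret exponent
    b = secrets.randbelow(p - 2) + 1   # Bob's secret exponent
    alpha = pow(r, a, p)               # Alice sends alpha to Bob
    beta = pow(r, b, p)                # Bob sends beta to Alice
    k_a = pow(beta, a, p)              # Alice computes beta^a = r^(ba)
    k_b = pow(alpha, b, p)             # Bob computes alpha^b = r^(ab)
    return k_a, k_b

k_a, k_b = diffie_hellman(5, 3)        # toy parameters from the text
print(k_a == k_b)  # True: both sides derive the same key
```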
If Erich has been listening carefully, he knows the values~$p$, $r$, $\alpha$,
and $\beta$ after Alice and Bob have executed the protocol. His aim is to
determine their joint secret key $k_A = k_B$. This problem is known as the
{\em Diff{}ie-Hellman problem}. If Erich were able to determine $a = \log_r
\alpha \bmod p$ and $b = \log_r \beta \bmod p$ efficiently, he could compute
the key $k_A = \beta^a \bmod p = \alpha^b \bmod p = k_B$ just like
Alice and Bob and thus would have solved the Diff{}ie-Hellman problem. Thus,
this problem is no harder than the problem of computing discrete logarithms.
The converse question of whether or not the Diff{}ie-Hellman problem is
{\em at least\/} as hard as computing discrete logarithms (i.e., whether or
not the two problems are equally hard) remains an unproven conjecture.
Like many other cryptographic protocols, the Diff{}ie-Hellman protocol
currently has no proof of security.
However, since to date neither the discrete logarithm nor the
Diff{}ie-Hellman problem can be solved efficiently, this direct attack is not
a practical threat. On the other hand, there do exist other, indirect attacks
in which the key is not determined directly from the values $\alpha$ and
$\beta$ communicated in the Diff{}ie-Hellman protocol. For example,
Diff{}ie-Hellman is vulnerable to the ``{\em man-in-the-middle}'' attack.
Unlike the {\em passive\/} attack described above, this attack is {\em
active}, since the attacker Erich aims at actively altering the protocol to
his own advantage. He is the ``man in the middle'' between Alice and Bob, and
he intercepts Alice's message $\alpha = r^a \bmod p$ to Bob and Bob's message
$\beta = r^b \bmod p$ to Alice. Instead of $\alpha$ and~$\beta$, he forwards
his own values $\alpha_{E} = r^c \bmod p$ to Bob and $\beta_{E} = r^d \bmod p$
to Alice, where the private numbers $c$ and $d$ were chosen by Erich. Now, if
Alice computes her key $k_A = (\beta_{E})^a \bmod p$, which she falsely
presumes to share with Bob, $k_A$ in fact is a key for future communications
with Erich, who determines the same key by computing (in the arithmetic
modulo~$p$)
\[
k_E = \alpha^d = r^{a d} = r^{d a} = (\beta_{E})^a = k_A \koz .
\]
Similarly, Erich can share a key with Bob, who has not the slightest idea that
he is in fact communicating with Erich. This raises the issue of {\em
authentication}, which we will deal with in more detail later in
Section~\ref{sec:zero-knowledge} about zero-knowledge protocols.
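The man-in-the-middle attack can be replayed with the toy parameters $p = 5$ and $r = 3$ used above; Erich's exponents $c = 6$ and $d = 9$ are made-up values chosen only for this illustration.

```python
# A toy sketch of the man-in-the-middle attack on Diffie-Hellman.
# The parameters (p = 5, r = 3) are far too small for real use; they
# only illustrate the message flow described above.
p, r = 5, 3          # public prime p and primitive root r
a, b = 17, 23        # Alice's and Bob's secret exponents
c, d = 6, 9          # Erich's secret exponents (hypothetical values)

alpha = pow(r, a, p)           # Alice -> Bob (intercepted by Erich)
beta = pow(r, b, p)            # Bob -> Alice (intercepted by Erich)
alpha_E = pow(r, c, p)         # Erich -> Bob, replacing alpha
beta_E = pow(r, d, p)          # Erich -> Alice, replacing beta

k_A = pow(beta_E, a, p)        # Alice's "shared" key
k_E_alice = pow(alpha, d, p)   # Erich's key for talking to Alice
assert k_A == k_E_alice        # r^(ad) = r^(da) mod p

k_B = pow(alpha_E, b, p)       # Bob's "shared" key
k_E_bob = pow(beta, c, p)      # Erich's key for talking to Bob
assert k_B == k_E_bob          # r^(bc) = r^(cb) mod p
```

Neither Alice nor Bob can detect the substitution from the exchanged values alone, which is exactly why authentication is needed.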
\begin{gyak}
\ujgyak
\textit{a.} How many primitive roots do $\integers_{13}^{\ast}$ and
$\integers_{14}^{\ast}$ have?
\textit{b.}
Determine all primitive roots of
$\integers_{13}^{\ast}$ and $\integers_{14}^{\ast}$, and prove that they
indeed are primitive roots.
\textit{c.} Show that every primitive root of $13$ and of~$14$,
respectively, generates all of $\integers_{13}^{\ast}$ and
$\integers_{14}^{\ast}$.\label{ex:primitive-root}
\ujgyak
\textit{a.} Determine Bob's number $\beta = 3^{23} = 94143178827 \equiv 2
\bmod 5$ from Example \ref{exa:square-and-multiply} using the
algorithm \textsc{Square-and-Multiply}.\gyakindex{\textsc{Square-and-Multiply}}
\textit{b.} For $\alpha$ and $\beta$ from
Example~\ref{exa:square-and-multiply}, determine the joint secret key of
Alice and Bob according to the Diffie-Hellman protocol\gyakindex{Diffie-Hellman protocol} from
Figure \ref{fig:diffie-hellman}.\label{ex:square-and-multiply}
\end{gyak}
\section{RSA and factoring}\label{sec:rsa-factoring}
\subsection{RSA}\label{sec:rsa}
The RSA cryptosystem, which is named after its inventors Ron Rivest,\nevindex{Rivest, Ronald Lewis}
Adi Shamir,\nevindex{Shamir, Adi} and Leonard Adleman \cite{RivestShAd78},\nevindex{Adleman, Leonard}
is the first \textit{public-key}\index{public-key} cryptosystem. It is still very popular today and
is used by various cryptographic applications. Figure~\ref{fig:rsa}
shows the individual steps of the RSA protocol, which we will now describe
in more detail; see also Example~\ref{exa:rsa}.
\begin{figure}[!t]
\footnotesize
\centering
\begin{tabular}{|c||c|c|c|}
\hline
\parbox[t]{0.6cm}{\bf Step } &
Alice &
Erich &
Bob \\ \hline\hline
{\bf 1} & & & \parbox[t]{6cm}
{chooses large prime numbers $p$ and $q$ at random and computes $n = pq$ and
$\varphi(n) = (p-1)(q-1)$, his public key $(n,e)$ and his
private key~$d$, which satisfy (\ref{equ:e}) and (\ref{equ:d})} \\
\hline
{\bf 2} & &
$\Leftarrow (n, e)$ & \\ \hline
{\bf 3} & \parbox[t]{2.6cm}
{encrypts $m$ as $c = m^e \bmod n$}
& & \\ \hline
{\bf 4} & &
$c \Rightarrow$ & \\ \hline
{\bf 5} &
& & \parbox[t]{4.8cm}
{decrypts $c$ as $m = c^d \bmod n$}
\\ \hline
\end{tabular}
\caption{The RSA protocol.\abraindex{RSA protocol}
\label{fig:rsa}
}
\end{figure}
\begin{description}
\item[{\bf 1. Key generation:}] \ Bob chooses two large prime numbers at
random, $p$ and $q$ with $p \neq q$, and computes their product $n =
pq$. He then chooses an exponent $e \in \N$ satisfying
\begin{eqnarray}
\label{equ:e}
1 < e < \varphi(n) = (p-1)(q-1) & \mbox{ and } & \gcd(e,\varphi(n)) = 1
\end{eqnarray}
and computes the inverse of $e \bmod \varphi(n)$, i.e., the unique
number $d$ such that
\begin{eqnarray}
\label{equ:d}
1 < d < \varphi(n) & \mbox{ and } & e \cdot d \equiv 1 \bmod \varphi(n) \koz .
\end{eqnarray}
The pair $(n,e)$ is Bob's {\em public key}, and $d$ is Bob's {\em
private key}.
\item[{\bf 2.~Encryption:}] \ As in
Section \ref{sec:foundations-cryptology}, messages are strings over
an alphabet~$\Sigma$. Any message is subdivided into blocks of a
fixed length, which are encoded as positive integers in
$|\Sigma|$-adic representation. These integers are then encrypted.
Let $m < n$ be the number encoding some block of the message Alice
wishes to send to Bob. Alice knows Bob's public key $(n, e)$ and
encrypts $m$ as the number $c = E_{(n,e)}(m)$, where the encryption
function is defined by
\[
E_{(n,e)}(m) = m^e \bmod n \koz .
\]
\item[{\bf 3.~Decryption:}] \ Let $c$ with $0 \leq c < n$ be the number
encoding one block of the ciphertext, which is received by Bob and
also by the eavesdropper Erich. Bob decrypts $c$ by using his
private key $d$ and the following decryption function
\[
D_d (c) = c^d \bmod n \koz .
\]
\end{description}
Theorem~\ref{thm:rsa} states that the RSA protocol described above
is indeed a cryptosystem in the sense of
Definition~\ref{def:cryptosystem}. The proof of Theorem~\ref{thm:rsa}
is left to the reader as Exercise \ref{ex:rsa}.
\begin{tetel}
\label{thm:rsa}
Let $(n,e)$ be the public key and $d$ be the private key in the
RSA protocol. Then, for each message $m$ with $0 \leq m < n$,
\[
m = \left( m^e \right)^d \bmod n \koz .
\]
Hence, RSA is a public-key cryptosystem.
\end{tetel}
To make RSA encryption and (authorised) decryption efficient, the
\textsc{Square-and-Multiply} algorithm is again employed for fast
exponentiation.
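The three steps above can be sketched in a few lines of Python. This is a toy sketch, not a secure implementation (real RSA requires large random primes and message padding); note that Python's built-in \texttt{pow} performs square-and-multiply, and that \texttt{pow(e, -1, phi)} (available from Python 3.8) computes the modular inverse via the extended Euclidean algorithm.

```python
# Minimal sketch of RSA key generation, encryption, and decryption,
# following steps 1-3 above. Toy-sized primes; for illustration only.
from math import gcd

def rsa_keygen(p, q, e):
    n = p * q
    phi = (p - 1) * (q - 1)
    assert 1 < e < phi and gcd(e, phi) == 1   # condition on e
    d = pow(e, -1, phi)                       # e * d = 1 (mod phi(n))
    return (n, e), d                          # public key, private key

def encrypt(public_key, m):
    n, e = public_key
    return pow(m, e, n)     # fast (square-and-multiply) exponentiation

def decrypt(n, d, c):
    return pow(c, d, n)

public, d = rsa_keygen(67, 11, 7)   # Bob's keys from the RSA example
c = encrypt(public, 460)            # encrypt the block value 460
assert decrypt(public[0], d, c) == 460
```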
How should one choose the prime numbers $p$ and $q$ in the RSA
protocol? First of all, they must be large enough, since otherwise
Erich would be able to factor the number $n$ in Bob's public key
$(n,e)$ simply by trial division.
Knowing the prime factors $p$ and $q$ of~$n$, he could then easily
determine Bob's private key~$d$ via the extended Euclidean algorithm, since
$d$ is the unique inverse of $e \bmod \varphi(n)$, where $\varphi(n) = (p-1)(q-1)$. To keep the prime
numbers $p$ and $q$ secret, they thus must be sufficiently large. In
practice, $p$ and $q$ should have at least $80$ digits each. To this
end, one generates numbers of this size randomly and then checks using
one of the known randomised primality tests whether the chosen numbers
are primes indeed. By the Prime Number Theorem, there are about
$N/\ln N$ prime numbers not exceeding~$N$. Thus, the odds are good to
hit a prime after reasonably few trials.
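This density estimate is easy to check numerically (a back-of-the-envelope computation, not part of the protocol):

```python
# Expected number of random trials to hit a prime near N = 10^80,
# estimated via the Prime Number Theorem density 1/ln N.
from math import log

expected_trials = log(10 ** 80)   # ln(10^80) = 80 * ln 10, about 184
assert 180 < expected_trials < 190
```

Restricting the random choices to odd numbers roughly halves the expected number of trials.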
In theory, the primality of $p$ and $q$ can be decided even in \textit{deterministic}
polynomial time.\index{deterministic polynomial time} Agrawal et
al. \cite{AgrawalKayalSaxena2004,farkasg9}\nevindex{Kayal,
Neeraj}\nevindex{Agrawal, Manindra}\nevindex{Saxena, Nitin}
recently showed that the primality problem, which is defined by
\[
\primes = \{ \mbox{bin}(n) \condition \mbox{$n$ is prime}\} \koz ,
\]
is a member of the complexity class~$\p$. Their breakthrough solved a
longstanding open problem in complexity theory: Among a few other natural
problems such as the graph isomorphism problem, the primality problem was
considered to be one of the rare candidate problems that are neither in
$\p$ nor $\np$-complete.\footnote{ The complexity classes $\p$ and $\np$
will be defined in Section~\ref{sec:computational-complexity} and the notion
of $\np$-completeness will be defined in Section~\ref{sec:np-completeness}.}
For practical purposes, however, the known randomised algorithms are more
useful than the deterministic algorithm by Agrawal et al. The running time of
$\bigoh(n^{12})$ obtained in their original
paper \cite{AgrawalKayalSaxena2004,farkasg9} has meanwhile been improved to
$\bigoh(n^{6})$ by a more sophisticated analysis, but this running time is
still not as good as that of the randomised algorithms.
\begin{algN}{Miller-Rabin$(n)$}\inddef{\textsc{Miller-Rabin}}
1 \' determine the representation $n-1 = 2^k m$, where $m$ is odd \\
2 \' choose a number $z \in \{1, 2, \ldots , n-1\}$ at random under
the uniform distribution \\
3 \' compute $x = z^{m} \bmod n$ \\
4 \' \key{if} \= $(x \equiv 1 \bmod n)$ \\
5 \' \> \key{then} \= \key{return} ``$n$ is a prime number'' \\
6 \' \> \key{else} \= \key{for} \= $j \leftarrow 0$ \key{to} $k-1$ \\
7 \' \> \> \key{do} \= \key{if} \= $(x \equiv -1 \bmod n)$ \\
8 \' \> \> \> \> \key{then} \= \key{return} ``$n$ is a prime number''\\
9 \' \> \> \> \> \key{else} \> $x \leftarrow x^2 \bmod n$ \\
10 \' \> \> \> \> \> \key{return} ``$n$ is not a prime number'' \\
\end{algN}\label{fig:miller-rabin}
One of the most popular randomised primality tests is the algorithm \textsc{Miller-Rabin} developed
by Rabin~\cite{Rabin80},\nevindex{Rabin, Michael O.} which is based
on the ideas underlying the deterministic algorithm of
Miller~\cite{Miller76}.\nevindex{Miller, Gary L.} The
Miller-Rabin test is a so-called Monte Carlo algorithm, since the
``no'' answers of the algorithm are always reliable, whereas its
``yes'' answers have a certain error probability. An alternative to the
Miller-Rabin test is the primality test of Solovay and
Strassen~\cite{sol-str:j:monte-carlo-for-primality}. Both primality tests
run in time~$\bigoh(n^3).$ However, the Solovay-Strassen test is less
popular because it is not as efficient in practice and also less accurate than
the Miller-Rabin test.
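The pseudocode of \textsc{Miller-Rabin} above translates almost line by line into Python; the sketch below implements one round with a single random base and is not meant as a production-grade primality test.

```python
# One round of the Miller-Rabin test, following the pseudocode above.
# Assumes n is an odd integer with n >= 3.
import random

def miller_rabin(n):
    """Return True for 'n is a prime number', False otherwise."""
    k, m = 0, n - 1
    while m % 2 == 0:          # write n - 1 = 2^k * m with m odd
        k += 1
        m //= 2
    z = random.randint(1, n - 1)
    x = pow(z, m, n)
    if x == 1:
        return True
    for _ in range(k):         # test z^m, z^(2m), ..., z^(2^(k-1) m)
        if x == n - 1:         # x = -1 (mod n)
            return True
        x = (x * x) % n
    return False

assert miller_rabin(97)        # primes are never misclassified
```

Since ``no'' answers are always reliable, repeating the round with independent random bases drives the error probability on composite inputs down geometrically.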
The class of problems solvable via Monte Carlo algorithms with always reliable
``yes'' answers is named~$\rp$, which stands for {\em R\/}andomised {\em
P\/}olynomial Time. The complementary class, $\corp = \{L \condition
\overline{L} \in \rp\}$, contains all those problems solvable via Monte Carlo
algorithms with always reliable ``no'' answers. Formally, $\rp$ is defined
via nondeterministic polynomial-time Turing machines (NPTMs, for short; see
Section~\ref{sec:computational-complexity} and in particular
Definitions~\ref{def:syntax-tm}, \ref{def:semantik-tm},
and~\ref{def:dtime-ntime}) whose computations are viewed as random processes:
For each nondeterministic guess, the machine flips an unbiased coin and
follows each of the resulting two next configurations with probability~$1/2$
until a final configuration is reached. Depending on the number of accepting
computation paths of the given NPTM, one obtains a certain acceptance
probability for each input. Errors may occur. The definition of $\rp$
requires that the error probability must be below the threshold of $1/2$ for
an input to be accepted, and there must occur no error at all for an input to
be rejected.
\begin{defi}[Randomised polynomial time]\label{def:rp}
The class $\rp$ consists of exactly those problems $A$ for which there exists
an NPTM $M$ such that for each input $x$, if $x \in A$ then $M(x)$ accepts
with probability at least~$1/2$, and if $x \not\in A$ then $M(x)$ accepts with
probability~$0$.
\end{defi}
\begin{tetel}
\label{thm:primes-is-in-corp}
$\primes$ is in~$\corp$.
\end{tetel}
Theorem~\ref{thm:primes-is-in-corp} follows from the fact that, for example,
the Miller-Rabin test is a Monte Carlo algorithm for the primality problem.
We present a proof sketch only. We show that the Miller-Rabin test accepts
$\primes$ with one-sided error probability as in Definition~\ref{def:rp}: If
the given number $n$ (represented in binary) is a prime number then the
algorithm cannot answer erroneously that $n$ is not prime. For a
contradiction, suppose $n$ is prime but the Miller-Rabin test halts with the
output: ``$n$ is not a prime number''. Hence, $z^m \not\equiv 1 \bmod n$.
Since $x$ is squared in each iteration of the ${\tt for}$ loop, we
sequentially test the values
\[
z^{m}, z^{2m}, \ldots, z^{2^{k-1}m}
\]
modulo~$n$. By assumption, the algorithm does not declare $n$ prime for any of
these values. It follows that for each $j$ with $0 \leq j \leq k-1$,
\[
z^{2^{j}m} \not\equiv -1 \bmod n \koz .
\]
Since $n-1 = 2^k m$, Fermat's Little Theorem (see Corollary~\ref{thm:fermat})
implies $z^{2^{k}m} \equiv 1 \bmod n$. Thus, $z^{2^{k-1}m}$ is a square root
of $1$ modulo~$n$. Since $n$ is prime, there are only two square roots of $1$
modulo~$n$, namely $\pm 1 \bmod n$, see
Exercise \ref{ex:primes-is-in-corp}. Since $z^{2^{k-1}m}
\not\equiv -1 \bmod n$, we must have $z^{2^{k-1}m} \equiv 1 \bmod n$. But
then, $z^{2^{k-2}m}$ again is a square root of $1$ modulo~$n$. By the same
argument, $z^{2^{k-2}m} \equiv 1 \bmod n$. Repeating this argument again and
again, we eventually obtain $z^{m} \equiv 1 \bmod n$, a contradiction. It
follows that the Miller-Rabin test works correctly for each prime number. On
the other hand, if $n$ is not a prime number, it can be shown that the error
probability of the Miller-Rabin test does not exceed the threshold of~$1/4$.
By repeating independent trials, the error probability can be made
arbitrarily close to zero, at the cost of an increased running time, of
course, which still is polynomial in $\log n$,
where $\log n$ is the size of the input $n$ represented in binary.
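The amplification argument can be checked with exact rational arithmetic; the per-round bound of $1/4$ is the one quoted above.

```python
# t independent Miller-Rabin rounds err on a composite input with
# probability at most (1/4)^t; e.g., t = 40 already gives 2^(-80).
from fractions import Fraction

def error_bound(t):
    return Fraction(1, 4) ** t

assert error_bound(40) == Fraction(1, 2) ** 80
```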
\begin{pelda}[ RSA]
\label{exa:rsa}
Suppose Bob chooses the prime numbers $p = 67$ and $q = 11$. Thus, $n = 67
\cdot 11 = 737$, so we have $\varphi(n) = (p-1)(q-1) = 66 \cdot 10 = 660$. If
Bob now chooses the smallest exponent possible for $\varphi(n) = 660$, namely
$e = 7$, then $\pair{n,e} = \pair{737,7}$ is his public key. Using the
extended Euclidean algorithm,
Bob determines his private key $d = 283$, and we have $e \cdot d = 7 \cdot 283
= 1981 \equiv 1 \bmod 660$; see Exercise \ref{ex:rsa-example}. As
in Section~\ref{sec:foundations-cryptology}, we identify the alphabet $\Sigma
= \{\mbox{\rm{}A}, \mbox{\rm{}B}, \ldots , \mbox{\rm{}Z}\}$ with the set
$\integers_{26} = \{0, 1, \ldots , 25\}$. Messages are strings over $\Sigma$
and are encoded in blocks of fixed length as natural numbers in $26$-adic
representation. In our example, the block length is $\ell = \lfloor \log_{26}
n \rfloor = \lfloor \log_{26} 737 \rfloor = 2$.
More concretely, any block $b = b_1 b_2 \cdots b_{\ell}$ of length $\ell$ with
$b_i \in \integers_{26}$ is represented by the number
\[
m_b = \sum_{i=1}^{\ell} b_i \cdot 26^{\ell - i} \koz .
\]
Since $\ell = \lfloor \log_{26} n \rfloor$, we have
\[
0 \leq m_b \leq 25 \cdot \sum_{i=1}^{\ell} 26^{\ell - i} = 26^{\ell} - 1 < n \koz .
\]
The RSA encryption function encrypts the block $b$ (i.e., the corresponding
number~$m_b$) as $c_b = (m_b)^e \bmod n$. The ciphertext for block $b$ then
is $c_b = c_0 c_1 \cdots c_{\ell}$ with $c_i \in \integers_{26}$. Thus, RSA
maps blocks of length $\ell$ injectively to blocks of length $\ell + 1$.
Figure \ref{tab:rsa-example} shows how to subdivide a message of length $34$
into $17$ blocks of length~$2$ and how to encrypt the single blocks, which are
represented by numbers. For example, the first block, ``{\rm{}R\/S}'', is
turned into a number as follows: ``{\rm{}R}'' corresponds to
$17$ and ``{\rm{}S}'' to~$18$, and we have
\[
17 \cdot 26^{1} + 18 \cdot 26^{0} = 442 + 18 = 460 \koz .
\]
\begin{figure}[!t]
\centering
\rm\footnotesize
\begin{tabular}{|c||@{\hspace*{0.8mm}}c@{\hspace*{0.8mm}}
c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}c@{\hspace*{0.8mm}}
c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}c@{\hspace*{0.8mm}}
c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}c@{\hspace*{0.8mm}}
c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}c@{\hspace*{0.8mm}}
c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}c@{\hspace*{0.8mm}}
c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}c@{\hspace*{0.8mm}}
c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}c@{\hspace*{0.8mm}}
c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}c@{\hspace*{0.8mm}}
c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}c@{\hspace*{0.8mm}}
c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}c@{\hspace*{0.8mm}}
c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}c@{\hspace*{0.8mm}}
c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}c@{\hspace*{0.8mm}}
c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}c@{\hspace*{0.8mm}}
c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}c@{\hspace*{0.8mm}}
c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}c@{\hspace*{0.8mm}}
c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}c@{\hspace*{0.8mm}}
c@{\hspace*{0.8mm}}|}
\hline
M
& R & S & A & I & S & T & H & E & K & E & Y & T & O & P & U & B & L
& I & C & K & E & Y & C & R & Y & P & T & O & G & R & A & P & H & Y \\
\hline
$m_b$
&\multicolumn{2}{c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}}{\hspace*{-2.0mm}$460$}
&\multicolumn{2}{c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}}{\hspace*{-2.0mm}$8$}
&\multicolumn{2}{c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}}{\hspace*{-2.0mm}$487$}
&\multicolumn{2}{c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}}{\hspace*{-2.0mm}$186$}
&\multicolumn{2}{c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}}{\hspace*{-2.0mm}$264$}
&\multicolumn{2}{c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}}{\hspace*{-2.0mm}$643$}
&\multicolumn{2}{c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}}{\hspace*{-2.0mm}$379$}
&\multicolumn{2}{c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}}{\hspace*{-2.0mm}$521$}
&\multicolumn{2}{c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}}{\hspace*{-2.0mm}$294$}
&\multicolumn{2}{c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}}{\hspace*{-2.0mm}$62$}
&\multicolumn{2}{c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}}{\hspace*{-2.0mm}$128$}
&\multicolumn{2}{c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}}{\hspace*{-2.0mm}$69$}
&\multicolumn{2}{c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}}{\hspace*{-2.0mm}$639$}
&\multicolumn{2}{c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}}{\hspace*{-2.0mm}$508$}
&\multicolumn{2}{c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}}{\hspace*{-2.0mm}$173$}
&\multicolumn{2}{c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}}{\hspace*{-2.0mm}$15$}
&\multicolumn{2}{c@{\hspace*{0.8mm}}|}{\hspace*{-2.0mm}$206$} \\
\hline
$c_b$
&\multicolumn{2}{c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}}{\hspace*{-2.0mm}$697$}
&\multicolumn{2}{c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}}{\hspace*{-2.0mm}$387$}
&\multicolumn{2}{c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}}{\hspace*{-2.0mm}$229$}
&\multicolumn{2}{c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}}{\hspace*{-2.0mm}$340$}
&\multicolumn{2}{c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}}{\hspace*{-2.0mm}$165$}
&\multicolumn{2}{c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}}{\hspace*{-2.0mm}$223$}
&\multicolumn{2}{c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}}{\hspace*{-2.0mm}$586$}
&\multicolumn{2}{c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}}{\hspace*{-2.0mm}$5$}
&\multicolumn{2}{c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}}{\hspace*{-2.0mm}$189$}
&\multicolumn{2}{c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}}{\hspace*{-2.0mm}$600$}
&\multicolumn{2}{c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}}{\hspace*{-2.0mm}$325$}
&\multicolumn{2}{c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}}{\hspace*{-2.0mm}$262$}
&\multicolumn{2}{c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}}{\hspace*{-2.0mm}$100$}
&\multicolumn{2}{c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}}{\hspace*{-2.0mm}$689$}
&\multicolumn{2}{c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}}{\hspace*{-2.0mm}$354$}
&\multicolumn{2}{c@{\hspace*{0.8mm}}|@{\hspace*{0.8mm}}}{\hspace*{-2.0mm}$665$}
&\multicolumn{2}{c@{\hspace*{0.8mm}}|}{\hspace*{-2.0mm}$673$} \\
\hline
\end{tabular}
\caption{Example of an RSA encryption (M = Message).\label{tab:rsa-example}
}
\end{figure}
The resulting number $c_b$ is written again in $26$-adic representation and
can have the length $\ell + 1$: $c_b = \sum_{i=0}^{\ell} c_i \cdot 26^{\ell -
i}$, where $c_i \in \integers_{26}$, see also
Exercise \ref{ex:rsa-example}. So, the first block, $697 = 676 +
21 = 1 \cdot 26^2 + 0 \cdot 26^1 + 21 \cdot 26^0$, is encrypted to yield the
ciphertext ``{\rm{}B\/A\/V}''.
Decryption is also done blockwise. In order to decrypt the first block with
the private key $d = 283$, compute $697^{283} \bmod 737$, again using fast
exponentiation with \textsc{Square-and-Multiply}. To prevent the
numbers from becoming too large, it is recommendable to reduce modulo $n =
737$ after each multiplication. The binary expansion of the exponent is $283
= 2^0 + 2^1 + 2^3 + 2^4 + 2^8$, and we obtain
\[
697^{283} \equiv 697^{2^0} \cdot 697^{2^1} \cdot 697^{2^3} \cdot 697^{2^4}
\cdot 697^{2^8} \equiv 697 \cdot 126 \cdot 9 \cdot 81 \cdot 15 \equiv 460
\bmod 737
\]
as desired.
\end{pelda}
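The block arithmetic of this example can be replayed in a few lines of Python; the helper functions \texttt{encode} and \texttt{decode} are introduced here only for this illustration.

```python
# Reproducing the example above: the block "RS" is encoded in 26-adic
# representation, encrypted with Bob's public key (737, 7), and
# decrypted again with the private key d = 283.
A = ord('A')

def encode(block):            # 26-adic value of a letter block
    m = 0
    for ch in block:
        m = 26 * m + (ord(ch) - A)
    return m

def decode(number, length):   # inverse, padded to the given length
    letters = []
    for _ in range(length):
        number, r = divmod(number, 26)
        letters.append(chr(A + r))
    return ''.join(reversed(letters))

n, e, d = 737, 7, 283
m = encode('RS')              # 17 * 26 + 18 = 460
c = pow(m, e, n)              # 697
assert (m, c) == (460, 697)
assert decode(c, 3) == 'BAV'  # ciphertext blocks have length 3
assert pow(c, d, n) == m      # decryption recovers the block
```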
\subsection{Digital RSA signatures}
The public-key cryptosystem RSA from Figure~\ref{fig:rsa} can be modified so
as to produce digital signatures. This protocol is shown in
Figure~\ref{fig:rsa-digital-signature}. It is easy to see that this protocol
works as desired; see Exercise~\ref{ex:rsa-digital-signature}.
This digital signature protocol is vulnerable to ``{\em chosen-plaintext
attacks}'' in which the attacker can choose a plaintext and obtains the
corresponding ciphertext. This attack is described, e.g.,
in~\cite{Rothe02}.\nevindex{Rothe, Jörg}
\begin{figure}[!t]
\footnotesize \centering
\begin{tabular}{|c||c|c|c|}
\hline
\parbox[t]{0.6cm}{\bf Step} &
Alice &
Erich &
Bob \\ \hline\hline
{\bf 1} & \parbox[t]{3.8cm}
{chooses $n = pq$, her public key $(n, e)$ and her private key
$d$ just as Bob does in the RSA protocol in Figure~\ref{fig:rsa}} & & \\
\hline
{\bf 2} & &
$(n, e) \Rightarrow$ & \\ \hline
{\bf 3} & \parbox[t]{3.8cm}
{signs the message~$m$ by
\[\mbox{sig}_A (m) = m^d \bmod n\]}
& & \\ \hline
{\bf 4} & &
$\pair{m,\mbox{sig}_A (m)} \Rightarrow$ & \\ \hline
{\bf 5} &
& & \parbox[t]{3.8cm}
{checks $m \equiv \left(\mbox{sig}_A (m)\right)^e \bmod n$ to verify Alice's
signature}
\\ \hline
\end{tabular}
\caption{Digital RSA signatures.
\label{fig:rsa-digital-signature}
}
\end{figure}
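A minimal sketch of this signature protocol, reusing the toy modulus from Example~(RSA); in that example the numbers were Bob's, so here we simply assume Alice generated the same values for her own key pair.

```python
# RSA signatures as in the protocol above: Alice signs with her
# private key d, Bob verifies with her public key (n, e).
n, e, d = 737, 7, 283     # toy key pair, assumed to be Alice's here

def sign(m):              # Alice: sig_A(m) = m^d mod n
    return pow(m, d, n)

def verify(m, sig):       # Bob: check m = sig^e (mod n)
    return pow(sig, e, n) == m % n

sig = sign(460)
assert verify(460, sig)       # a genuine signature is accepted
assert not verify(461, sig)   # a tampered message is rejected
```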
\subsection{Security of RSA}
As mentioned above, the security of the RSA cryptosystem crucially depends on
the assumption that large numbers cannot be factored in a reasonable amount of
time. Despite much effort in the past, no efficient factoring algorithm has
been found until now. Thus, it is widely believed that there is no such
algorithm and the factoring problem is hard. A rigorous proof of this
hypothesis, however, has not been found either. And even if one could prove
this hypothesis, this would not imply a proof of security of RSA. Breaking
RSA is at most as hard as the factoring problem; however, the converse is not
known to hold. That is, it is not known whether these two problems are
equally hard. It may be possible to break RSA without factoring~$n.$
We omit listing potential attacks on the RSA system here. Rather, the
interested reader is referred to the comprehensive literature on this subject;
note also Problem \ref{ex:low-exponent-attack} at the end of this
chapter. We merely mention that for each of the currently known attacks on
RSA, there are suitable countermeasures, rules of thumb that either completely
prevent a certain attack or make its probability of success negligibly small.
In particular, it is important to take much care when choosing the prime
numbers $p$ and $q$, the modulus $n = pq,$
the public exponent $e$, and the private key $d.$
Finally, since the {\em factoring attacks\/} on RSA play a particularly
central role, we briefly sketch two such attacks. The first one is based on
Pollard's $(p-1)$ method~\cite{pol:j:factorization}. This method is effective
for composite numbers $n$ having a prime factor $p$ such that all prime
factors of $p-1$ are small. Under this assumption, a multiple $\nu$ of
$p-1$ can be determined without knowing~$p$. By Fermat's Little Theorem (see
Corollary~\ref{thm:fermat}), it follows that $a^{\nu} \equiv 1 \bmod p$ for
all integers $a$ coprime with~$p$. Hence, $p$ divides $a^{\nu} - 1$. If $n$
does not divide $a^{\nu} - 1$, then $\gcd(a^{\nu} - 1,n)$ is a nontrivial
divisor of~$n$. Thus, the number $n$ can be factored.
How can the multiple $\nu$ of $p-1$ be determined? Pollard's $(p-1)$
method uses as candidates for $\nu$ the products of prime powers below a
suitably chosen bound~$S$:
\[
\nu = \prod_{\mbox{\scriptsize $q$ is prime, $q^k \leq S$}} q^k \koz .
\]
If all prime powers dividing $p-1$ are less than~$S$, then $\nu$ is a multiple
of $p-1$. The algorithm determines $\gcd(a^{\nu} - 1,n)$ for a suitably
chosen base~$a$. If no nontrivial divisor of $n$ is found, the algorithm is
restarted with a new bound $S'>S$.
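Pollard's $(p-1)$ method as described above can be sketched as follows; the trial-division prime test and the fixed base $a = 2$ are simplifications chosen for readability.

```python
# Pollard's (p-1) method: nu is the product of all maximal prime
# powers q^k <= S, and gcd(a^nu - 1, n) may reveal a factor of n
# whenever some prime p | n has only small prime powers in p - 1.
from math import gcd

def pollard_p_minus_1(n, S, a=2):
    nu = 1
    for q in range(2, S + 1):
        if all(q % t for t in range(2, q)):   # q is prime
            qk = q
            while qk * q <= S:                # largest q^k <= S
                qk *= q
            nu *= qk
    g = gcd(pow(a, nu, n) - 1, n)
    return g if 1 < g < n else None           # None: retry, larger S

# 737 = 11 * 67; p = 11 has p - 1 = 10 = 2 * 5, so S = 5 suffices.
assert pollard_p_minus_1(737, 5) == 11
```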
Other factoring methods, such as the {\em quadratic sieve},\index{quadratic sieve} are described,
e.g., in~\cite{Rothe04b,Stinson95}.\nevindex{Stinson, Douglas}\nevindex{Rothe, Jörg} They use the
following simple idea. Suppose $n$ is the number to be factored. Using the
sieve, determine numbers $a$ and $b$ such that:
\begin{eqnarray}
\label{equ:sieve}
a^2 \equiv b^2 \bmod n & \mbox{and} & a \not\equiv \pm b \bmod n \koz .
\end{eqnarray}
Hence, $n$ divides $a^2 - b^2 = (a-b)(a+b)$ but neither $a-b$ nor $a+b$.
Thus, $\gcd(a-b,n)$ is a nontrivial factor of~$n$.
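In its simplest form the congruence-of-squares idea is Fermat's factoring method, which searches for a suitable $a$ directly instead of collecting relations with a sieve:

```python
# Fermat's method: find a with a^2 - n a perfect square b^2; then
# n divides (a - b)(a + b), and gcd(a - b, n) yields a factor.
from math import gcd, isqrt

def fermat_factor(n):
    a = isqrt(n) + 1
    while True:
        b = isqrt(a * a - n)
        if b * b == a * a - n:    # a^2 = b^2 (mod n) holds exactly
            return gcd(a - b, n)
        a += 1

assert fermat_factor(737) == 11   # 39^2 - 28^2 = 737 = 11 * 67
```

The sieve methods differ from this naive search only in how they construct the numbers $a$ and $b$.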
There are also sieve methods other than the quadratic sieve. These methods
are distinguished by the particular way of how to determine the numbers $a$
and $b$ such that (\ref{equ:sieve}) is satisfied. A prominent example is the
``general number field sieve'', see~\cite{len-len:b:number-field-sieve}.
\begin{gyak}
\ujgyak
\textit{a.} Prove Theorem~\ref{thm:rsa}. \textit{Hint.}
Show $\left( m^e \right)^d \equiv m \bmod p$ and $\left( m^e
\right)^d \equiv m \bmod q$ using Corollary~\ref{thm:fermat}, Fermat's
Little Theorem. Since $p$ and $q$ are prime numbers with $p \neq q$ and $n
= pq$, the claim $\left( m^e \right)^d \equiv m \bmod n$ now follows from the
Chinese remainder theorem. \label{ex:rsa}
\textit{b.} The proof sketch of Theorem~\ref{thm:primes-is-in-corp} uses the fact
that any prime number $n$ can have only two square roots of $1$ modulo~$n$,
namely $\pm 1 \bmod n$. Prove this fact.
\textit{Hint.} It may be helpful to note that $r$ is a square root of 1
modulo $n$ if and only if $n$ divides $(r-1)(r+1)$.
\label{ex:primes-is-in-corp}
\ujgyak
\textit{a.} Let $\varphi(n) = 660$ and $e = 7$ be the values from
Example~\ref{exa:rsa}. Show that the extended Euclidean algorithm indeed
provides the private key $d = 283$, the inverse of $7 \bmod 660.$
\textit{b.} Consider the plaintext in Figure \ref{tab:rsa-example} from
Example~\ref{exa:rsa} and its RSA encryption. Determine the encoding of
this ciphertext by letters of the alphabet $\Sigma = \{\mbox{\rm{}A},
\mbox{\rm{}B}, \ldots , \mbox{\rm{}Z}\}$ for each of the 17 blocks.
\textit{c.} Decrypt each of the $17$ ciphertext blocks in
Figure \ref{tab:rsa-example} and show that the original message is obtained
indeed. \label{ex:rsa-example}
\textit{d.} Prove that the RSA digital signature protocol from
Figure~\ref{fig:rsa-digital-signature} works.
\label{ex:rsa-digital-signature}
\end{gyak}
\section{The protocols of Rivest, Rabi, and Sherman}\label{sec:riv-rab-she}
Rivest, Rabi, and Sherman proposed protocols for secret-key agreement and
digital signatures. The secret-key agreement protocol given in
Figure~\ref{fig:rivest-sherman-secret-key} is due to Rivest and Sherman. Rabi
and Sherman modified this protocol to a digital signature protocol, see
Exercise \ref{ex:rabi-sherman-digital-signature}.
\begin{figure}[!t]
\centering
\begin{tabular}{|c||c|c|c|}
\hline
\parbox[t]{0.6cm}{\bf Step} &
Alice &
Erich &
Bob \\ \hline\hline
{\bf 1} & \parbox[t]{3.8cm}
{chooses two large numbers $x$ and $y$ at random, keeps $x$ secret and
computes $x \sigma y$} & & \\ \hline
{\bf 2} & &
$\pair{y, x \sigma y} \Rightarrow$ & \\ \hline
{\bf 3} &
& & \parbox[t]{3.8cm}
{chooses a large number $z$ at random, keeps $z$ secret and
computes $y \sigma z$}
\\ \hline
{\bf 4} & &
$\Leftarrow y \sigma z$
& \\ \hline
{\bf 5} & \parbox[t]{3.8cm}
{computes $k_A = x \sigma (y \sigma z)$
} & & \parbox[t]{3.8cm}
{computes $k_B = (x \sigma y) \sigma z$
} \\ \hline
\end{tabular}
\caption{The Rivest-Sherman protocol for secret-key agreement, based
on~$\sigma$.
\label{fig:rivest-sherman-secret-key}
}
\end{figure}
The Rivest-Sherman protocol is based on a \textit{total, strongly noninvertible,
associative one-way function}. Informally put, a \textit{one-way function}
is a function that is easy to compute but hard to invert. One-way functions
are central cryptographic primitives and many cryptographic protocols use them
as their key building blocks. To capture the notion of noninvertibility, a
variety of models and, depending on the model used, various candidates for
one-way functions have been proposed. In most cryptographic applications,
noninvertibility is defined in the average-case complexity model.
Unfortunately, it is not known whether such one-way functions exist; the
security of the corresponding protocols is merely based on the assumption of
their existence. Even in the less challenging worst-case model, in which
so-called ``complexity-theoretic'' one-way functions are usually defined, the
question of whether any type of one-way function exists remains an open issue
after many years of research.
A total (i.e., everywhere defined) function $\sigma$ mapping from $\N
\times \N$ to $\N$ is \ki{associative}\index{function}\inddef{function mapping!associative}
if and only if $(x \sigma y)
\sigma z = x \sigma (y \sigma z)$ holds for all $x, y, z \in \N$, where we
use the infix notation $x \sigma y$ instead of the prefix notation
$\sigma(x,y)$. This property implies that the above protocol works:
\[
k_A = x \sigma (y \sigma z) = (x \sigma y) \sigma z = k_B \koz,
\]
so Alice and Bob indeed compute the same secret key.
The notion of strong noninvertibility is not to be defined formally here.
Informally put, $\sigma$ is said to be {\em strongly noninvertible\/} if
$\sigma$ is not only a one-way function, but even if in addition to the
function value one of the corresponding arguments is given, it is not possible
to compute the other argument efficiently. This property implies that the
attacker Erich, knowing $y$ and $x \sigma y$ and $y \sigma z$, is not able to
compute the secret numbers $x$ and~$z$, from which he could easily determine
the secret key $k_A = k_B$.
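The message flow of the protocol can be sketched with a toy associative operation $x \sigma y = xy \bmod N$. This operation is easily invertible, so the example only illustrates the protocol steps, not its security; a real instantiation would need a strongly noninvertible associative one-way function.

```python
# Rivest-Sherman message flow with a toy associative operation.
# NOT secure: modular multiplication is easy to invert; this only
# demonstrates that associativity makes the keys agree.
N = 2 ** 61 - 1

def sigma(x, y):
    return (x * y) % N          # associative: (xy)z = x(yz) mod N

x, y = 123456789, 987654321     # Alice's numbers; x stays secret
z = 555555555                   # Bob's secret number

xy = sigma(x, y)                # Alice sends (y, x sigma y)
yz = sigma(y, z)                # Bob replies with y sigma z

k_A = sigma(x, yz)              # Alice: x sigma (y sigma z)
k_B = sigma(xy, z)              # Bob:   (x sigma y) sigma z
assert k_A == k_B               # associativity guarantees agreement
```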
\begin{gyak}
\ujgyak
Modify the Rivest-Sherman protocol for secret-key agreement from
Figure~\ref{fig:rivest-sherman-secret-key} to a protocol for digital
signatures.\label{ex:rabi-sherman-digital-signature}
\ujgyak
\textit{a.} Try to give a formal definition of the notion of ``strong
noninvertibility'' that is defined only informally above. Use the worst-case
complexity model.
\textit{b.} Suppose $\sigma$ is a {\em partial\/} function from $\N
\times \N$ to $\N$, i.e., $\sigma$ may be undefined for some pairs in
$\N \times \N$. Give a formal definition of ``associativity'' for
partial functions. What is wrong with the following (flawed) attempt of a
definition: ``A partial function $\sigma : \N \times \N \rightarrow
\N$ is said to be {\em associative} if and only if $x \sigma (y \sigma z)
= (x \sigma y) \sigma z$ holds for all $x,y,z \in \N$ for which each of
the four pairs $(x,y)$, $(y,z)$, $(x,y \sigma z)$, and $(x \sigma y, z)$ is
in the domain of~$\sigma$.'' \textit{Hint.} A comprehensive discussion of these notions can be found
in~\cite{HemasRotheSaxena05,hem-pas-rot:j:strong-noninvertibility,HemaspaandraRo99}.
\end{gyak}
\section{Interactive proof systems and zero-knowledge}
\label{sec:zero-knowledge}
\subsection{Interactive proof systems and Arthur-Merlin games}
\label{sec:arthur-merlin}
In Section~\ref{sec:diffie-hellman}, the ``{\em man-in-the-middle}'' attack on
the Diff{}ie-Hellman protocol was mentioned. The problem here is that Bob
has not verified the true identity of his communication partner before
executing the protocol. While he assumes he is communicating with Alice, he in
fact exchanges messages with Erich. In other words, Alice's task is to
convince Bob of her true identity without any doubt. This cryptographic task
is called {\em authentication}. Unlike digital signatures, whose purpose is
to authenticate electronically transmitted {\em documents\/} such as emails,
electronic contracts, etc., the goal now is to authenticate {\em
individuals\/} such as human or computer parties participating in a
cryptographic protocol.\footnote{ Here, an ``individual'' or a ``party'' is not
necessarily a human being; it may also be a computer program that
automatically executes a protocol with another computer program.}
In order to authenticate herself, Alice might try to prove her identity by
some secret information known to her alone, say by giving her PIN (``{\em
P\/}ersonal {\em I\/}dentification {\em N\/}umber'') or any other private
information that no one knows but her. However, there is a catch. To prove
her identity, she would have to give her secret away to Bob. But then, it no
longer is a secret! Bob, knowing her secret, might pretend to be Alice in
another protocol he executes with Chris, a third party. So the question is
how to prove knowledge of a secret without giving it away. This is what
zero-knowledge is all about. Zero-knowledge protocols are special interactive
proof systems, which were introduced by Goldwasser, Micali, and
Rackoff and, independently,
by Babai\nevindex{Babai, László} and Moran. Babai and Moran's notion
(which is essentially equivalent to the interactive proof systems proposed by
Goldwasser et al.) is known as Arthur-Merlin games, which we will now describe
informally.
Merlin and Arthur wish to jointly solve a problem~$L$, i.e., they wish to
jointly decide whether or not a given input $x$ belongs to~$L$. The mighty
wizard Merlin is represented by an $\np$ machine~$M$, and the impatient King
Arthur is represented by a randomised polynomial-time Turing machine~$A$. To
make their decision, they play the following game, where they are taking turns
to make alternating moves. Merlin's intention is always to convince Arthur
that $x$ belongs to $L$ (no matter whether or not that indeed is the case).
Thus, each of Merlin's moves consists in presenting a proof for ``$x \in L$'',
which he obtains by simulating $M(x,y)$, where $x$ is the input and $y$
describes all previous moves in this game. That is, the string $y$ encodes
all previous nondeterministic choices of $M$ and all previous random choices
of~$A$.
King Arthur, however, does not trust Merlin. Of course, he cannot check the
mighty wizard's proofs all alone; this task simply exceeds his computational
power. But he knows Merlin well enough to express some doubts. So, he
replies with a nifty challenge to Merlin by picking some details of his proofs
at random and requiring certificates for them that he can verify. In order to
satisfy Arthur, Merlin must convince him with overwhelming probability. Thus,
each of Arthur's moves consists in the simulation of $A(x,y)$, where $x$ again
is the input and $y$ describes the previous history of the game.
The idea of Arthur-Merlin games can be captured via alternating existential
and probabilistic quantifiers, where the former formalise Merlin's $\np$
computation and the latter formalise Arthur's randomised polynomial-time
computation.\footnote{ This is similar to the well-known characterisation of
the levels of the polynomial hierarchy via alternating $\exists$ and
$\forall$ quantifiers, see Section~\ref{sec:graphisomorphie} and in
particular item~\ref{thm:structure-ph:alternating-quantifiers} of
Theorem~\ref{thm:structure-ph}.} In this way, a hierarchy of complexity
classes can be defined, the so-called Arthur-Merlin hierarchy. We here
present only the class $\ma$ from this hierarchy, which corresponds to an
Arthur-Merlin game with two moves, with Merlin moving first.
\begin{defi}[MA in the Arthur-Merlin hierarchy]
\label{def:arthur-merlin-hierarchie}
The class $\ma$ contains \linebreak
exactly those problems~$L$ for which there exist an
$\np$ machine $M$ and a randomised polynomial-time Turing machine $A$
such that for each input $x$:
\begin{itemize}
\item If $x \in L$ then there exists a path $y$ of $M(x)$ such that $A(x,y)$
accepts with probability at least $3/4$ (i.e., Arthur cannot refute Merlin's
proof $y$ for ``$x \in L$'', and Merlin thus wins).
\item If $x \not\in L$ then for each path $y$ of~$M(x)$, $A(x,y)$ rejects
with probability at least $3/4$ (i.e., Arthur is not taken in by
Merlin's wrong proofs for ``$x \in L$'' and thus wins).
\end{itemize}
Analogously, the classes $\am , \mam , \ama , \ldots$ can be defined,
see Exercise \ref{ex:arthur-merlin}.
\end{defi}
In Definition~\ref{def:arthur-merlin-hierarchie}, the probability threshold of
$3/4$ for Arthur to accept or to reject, respectively, is chosen arbitrarily
and may not appear large enough at first glance.
In fact, the probability of
success can be amplified using standard techniques and can be made arbitrarily
close to one. In other words, one might have chosen even a probability
threshold as low as $1/2 + \varepsilon$, for an arbitrary fixed constant
$\varepsilon > 0$,
and would still have defined the same class. Furthermore, it is known that
for a constant number of moves, the Arthur-Merlin hierarchy collapses down
to~$\am$:
\[
\np \seq \ma \seq \am = \ama = \mam = \cdots \koz .
\]
It is an open question whether or not any of the inclusions $\np \seq \ma \seq
\am$ is a strict one.
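The error amplification mentioned above can be made concrete with a short calculation. The following sketch (illustrative only; the function name is ours) computes the exact error probability of a majority vote over an odd number of independent repetitions of a decision procedure that is correct with probability~$3/4$:

```python
from math import comb

def majority_error(p_correct, trials):
    """Exact probability that a majority vote over an odd number of
    independent trials errs, if each trial is correct with probability
    p_correct."""
    assert trials % 2 == 1
    need = trials // 2 + 1  # correct votes needed for a correct majority
    return sum(comb(trials, k) * p_correct**k * (1 - p_correct)**(trials - k)
               for k in range(need))

# One run errs with probability 1/4; a majority vote over 21 independent
# runs errs with probability below 1/100.
```

This is why the threshold $3/4$ in Definition~\ref{def:arthur-merlin-hierarchie} is arbitrary: any fixed threshold above $1/2$ can be driven arbitrarily close to $1$ by independent repetition and majority vote.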
A similar model, which can be used as an alternative to the Arthur-Merlin
games, are the {\em interactive proof systems\/} mentioned above. The two
notions use different terminology: Merlin now is called the ``prover'' and
Arthur the ``verifier''. Also, their communication is not interpreted as a
game but rather as a protocol. Another difference between the two models,
which appears to be crucial at first, is that Arthur's random bits are public
(so, in particular, Merlin knows them), whereas the random bits of the
verifier in an interactive proof system are private. However, Goldwasser and
Sipser~\cite{gol-sip:j:private} proved that, in fact, it does not matter
whether the random bits are private or public, so Arthur-Merlin games
essentially are equivalent to interactive proof systems.
If one allows a polynomial number of moves (instead of a constant number),
then one obtains the complexity class~$\ip$. Note that
interactive proof systems are
also called $\ip$ protocols. By definition, $\ip$ contains all of $\np$. In
particular, the graph isomorphism problem is in~$\ip$. We will see later that
$\ip$ also contains problems from $\conp = \{\overline{L} \condition L \in
\np\}$ that are believed not to be in $\np$. In particular, the proof of
Theorem~\ref{thm:graphiso-is-in-coam} shows that the complement of the graph
isomorphism problem is in $\am$ and thus in~$\ip$. A celebrated result by
Shamir~\cite{sha:j:ip} says that $\ip$ equals $\pspace$, the class of problems
solvable in polynomial space.
Let us now turn back to the problem of authentication mentioned above, and to
the related notion of zero-knowledge protocols. Here is the idea. Suppose
Arthur and Merlin play one of their games. So, Merlin sends hard proofs to
Arthur. Merlin alone knows how to get such proofs. Being a wise wizard, he
keeps this knowledge secret. And he uses his secret to authenticate himself
in the communication with Arthur.
Now suppose that the malicious wizard Marvin wishes to fool Arthur by
pretending to be Merlin. He disguises himself as Merlin and uses his magic to
look just like him.
How\-ever, he does not know Merlin's secret of how to produce hard proofs.
His magic is no more powerful than that of an ordinary randomised
polynomial-time Turing machine. Still, he seeks to simulate the communication
between Merlin and Arthur. An interactive proof system has the \ki{zero-knowledge property}
if the information communicated between Marvin
and Arthur cannot be distinguished from the information communicated between
Merlin and Arthur. Note that Marvin, who does not know Merlin's secret,
cannot introduce any information about this secret into the simulated $\ip$
protocol. Nonetheless, he is able to perfectly copy the original protocol, so
no one can tell a difference. Hence, the (original) protocol itself must have
the property that it does not leak any information whatsoever about Merlin's
secret. If there is nothing to put in, there can be nothing to take out.
\begin{defi}[Zero-knowledge protocol]\label{def:zero-knowledge}
Let $L$ be any set in~$\ip$, and let $(M,A)$ be an interactive proof system
for~$L$, where $M$ is an NPTM and $A$ is a randomised polynomial-time Turing
machine. The $\ip$ protocol $(M,A)$ is a {\em zero-knowledge protocol for
$L$\/} if and only if there exists a randomised polynomial-time Turing
machine $\widehat{M}$ such that $(\widehat{M},A)$ simulates the original protocol $(M,A)$
and, for each $x \in L$, the tuples $(m_1, m_2, \ldots , m_k)$ and
$(\widehat{m}_1, \widehat{m}_2, \ldots , \widehat{m}_k)$ representing the
information communicated in $(M,A)$ and in $(\widehat{M},A)$, respectively,
are identically distributed over the random choices in $(M,A)$ and in
$(\widehat{M},A)$, respectively.
\end{defi}
The notion defined above is called ``\textit{honest-verifier perfect
zero-knowledge}''\index{honest-verifier perfect zero-knowledge} in the literature,
since (a)~it is assumed that the
verifier is {\em honest\/} (which may not necessarily be true in cryptographic
applications though),
and (b)~it is required that the information communicated in the
simulated protocol {\em perfectly\/} coincides with the information
communicated in the original protocol. Assumption~(a) may be somewhat too
idealistic, and assumption~(b) may be somewhat too strict. That is why also
other variants of zero-knowledge protocols are studied, see the notes at the
end of this chapter.
\subsection{Zero-knowledge protocol for graph isomorphism}
\label{sec:zero-knowledge-for-graphiso}
Let us consider a concrete example now. As mentioned above, the graph
isomorphism problem ($\graphiso$, for short) is in~$\np$, and the
complementary problem $\overline{\graphiso}$ is in $\am$, see the proof of
Theorem~\ref{thm:graphiso-is-in-coam}. Thus, both problems are contained
in~$\ip$. We now describe a zero-knowledge protocol for $\graphiso$ that is
due to Goldreich, Micali, and Wigderson~\cite{gol-mic-wid:j:nothing}.
Figure~\ref{fig:goldreich-micali-wigderson} shows this $\ip$ protocol
between the prover Merlin and the verifier Arthur.
Although there is no efficient algorithm known for $\graphiso$, Merlin can
solve this problem, since $\graphiso$ is in~$\np$. However, there is no need
for him to do so in the protocol. He can simply generate a large graph $G_0$
with $n$ vertices and a random permutation $\pi \in \mathfrak{S}_n$. Then, he
computes the graph $G_1 = \pi(G_0)$ and makes the pair $(G_0, G_1)$ public.
The isomorphism $\pi$ between $G_0$ and $G_1$ is
kept secret as Merlin's private information.
Of course, Merlin cannot send $\pi$ to Arthur, since he does not want to give
his secret away. Rather, to prove that the two graphs, $G_0$ and~$G_1$,
indeed are isomorphic, Merlin randomly chooses an isomorphism $\rho$ under the
uniform distribution and a bit $a \in \{0,1\}$ and computes the graph $H =
\rho(G_a)$. He then sends $H$ to Arthur whose response is a challenge for
Merlin: Arthur sends a random bit~$b \in \{0,1\}$, chosen under the uniform
distribution, to Merlin and requests to see an isomorphism $\sigma$ between
$G_b$ and~$H$. Arthur accepts if and only if $\sigma$ indeed satisfies
$\sigma(G_b) = H$.
\begin{figure}[!t]
\centering
\begin{tabular}{|c||c|c|c|}
\hline
\parbox[t]{0.6cm}{\bf Step} &
Merlin &
&
Arthur \\ \hline\hline
{\bf 1} &
\parbox[t]{5cm}
{randomly chooses a permutation $\rho$ on $V(G_0)$ and a bit~$a$, computes
$H = \rho(G_a)$} & & \\ \hline
{\bf 2} & &
$H \Rightarrow$ & \\ \hline
{\bf 3} & & &
\parbox[t]{4.1cm}
{chooses a random bit $b$ and requests to see an isomorphism
between $G_b$ and~$H$}
\\ \hline
{\bf 4} & &
$\Leftarrow b$ & \\ \hline
{\bf 5} & \parbox[t]{5cm}
{computes the isomorphism $\sigma$ satisfying $\sigma(G_b) = H$ as follows:
\\
if $b = a$ then $\sigma = \rho$; \\
if $0 = b \neq a = 1$ then $\sigma = \pi \rho$; \\
if $1 = b \neq a = 0$ then $\sigma = \pi^{-1} \rho$.}
& & \\ \hline
{\bf 6} & &
$\sigma \Rightarrow$ & \\ \hline
{\bf 7} & & &
\parbox[t]{4.1cm}
{verifies that $\sigma(G_b) = H$ and accepts accordingly} \\
\hline
\end{tabular}
\caption{Goldreich, Micali, and Wigderson's zero-knowledge protocol
for~$\graphiso$.
\label{fig:goldreich-micali-wigderson}
}
\end{figure}
The protocol works, since Merlin knows his secret isomorphism $\pi$ and his
random permutation~$\rho$: It is no problem for Merlin to compute the
isomorphism $\sigma$ between $G_b$ and $H$ and thus
to authenticate himself. The
secret $\pi$ is not given away. Since $G_0$ and $G_1$ are isomorphic, Arthur
accepts with probability one. The case of two nonisomorphic graphs does not
need to be considered here, since Merlin has chosen isomorphic graphs $G_0$
and $G_1$ in the protocol; see also the proof of
Theorem~\ref{thm:graphiso-is-in-coam}.
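One round of the protocol from Figure~\ref{fig:goldreich-micali-wigderson} can be sketched in a few lines of code. The sketch below is illustrative only (graphs are represented as edge sets, permutations as dictionaries, and all names are ours); note that permutations are composed left to right, matching the convention $\sigma = \pi\rho$ used in the figure:

```python
import random

def permute(edges, perm):
    """Apply the vertex permutation perm (a dict) to a set of edges."""
    return {frozenset({perm[u], perm[v]}) for u, v in map(tuple, edges)}

def compose(first, second):
    """The permutation that applies `first`, then `second`."""
    return {v: second[first[v]] for v in first}

def invert(perm):
    return {image: v for v, image in perm.items()}

def random_perm(n):
    images = list(range(n))
    random.shuffle(images)
    return dict(enumerate(images))

def gmw_round(g0, pi, n):
    """One round of the protocol; Merlin's secret is pi with G1 = pi(G0).
    Returns True if and only if Arthur accepts."""
    g1 = permute(g0, pi)
    rho, a = random_perm(n), random.randint(0, 1)  # Merlin's random choices
    h = permute(g1 if a else g0, rho)              # commitment H = rho(G_a)
    b = random.randint(0, 1)                       # Arthur's challenge
    if b == a:
        sigma = rho
    elif b == 0:                  # b = 0, a = 1: apply pi, then rho
        sigma = compose(pi, rho)
    else:                         # b = 1, a = 0: apply pi^{-1}, then rho
        sigma = compose(invert(pi), rho)
    return permute(g1 if b else g0, sigma) == h    # Arthur verifies
```

Since Merlin knows both $\pi$ and~$\rho$, the verification succeeds in every round, as the completeness argument above shows.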
Now suppose that Marvin wishes to pretend to be Merlin when communicating with
Arthur. He does know the graphs $G_0$ and~$G_1$, but he does not know the
secret isomorphism~$\pi$. Nonetheless, he tries to convince Arthur that he
does know~$\pi$. If Arthur's random bit $b$ happens to be the same as his
bit~$a$, to which Marvin committed {\em before\/} he sees~$b$, then Marvin
wins.
However, if $b \neq a$, then computing $\sigma = \pi \rho$ or $\sigma =
\pi^{-1} \rho$ requires knowledge of~$\pi$. Since $\graphiso$ presumably is
not efficiently solvable (not even by a randomised polynomial-time
Turing machine), Marvin cannot determine the isomorphism $\pi$ for
sufficiently large graphs $G_0$ and~$G_1$. But without knowing~$\pi$, all he
can do is guess. His chances that Arthur's bit $b$ equals his own bit $a$ are at
most~$1/2$. Of course, Marvin can always guess, so his success probability is
exactly~$1/2$. If Arthur challenges him in $r$ independent rounds of this
protocol again and again, the probability of Marvin's success will be only
$2^{-r}$. Already for $r=20$, this probability is negligibly small:
Marvin's probability of success is then less than one in one million.
\begin{figure}[!t]
\centering
\begin{tabular}{|c||c|c|c|}
\hline
\parbox[t]{0.6cm}{\bf Step} &
Marvin & &
Arthur \\ \hline\hline
{\bf 1} &
\parbox[t]{5cm}
{randomly chooses a permutation $\rho$ on $V(G_0)$ and a bit~$a$, computes
$H = \rho(G_a)$} & & \\ \hline
{\bf 2} & &
$H \Rightarrow$ & \\ \hline
{\bf 3} & & &
\parbox[t]{4.1cm}
{chooses a random bit $b$ and requests to see an isomorphism
between $G_b$ and~$H$}
\\ \hline
{\bf 4} & &
$\Leftarrow b$ & \\ \hline
{\bf 5} &\parbox[t]{5cm}
{if $b \neq a$ then $\widehat{M}$ deletes all information communicated in
this round;
if $b = a$ then $\widehat{M}$ sends $\sigma = \rho$} & & \\ \hline
{\bf 6} & &
$\sigma \Rightarrow$ & \\ \hline
{\bf 7} & & &
\parbox[t]{4.1cm}
{$b = a$ implies $\sigma(G_b) = H$,
so Arthur accepts Marvin's faked identity} \\ \hline
\end{tabular}
\caption{Simulation of the zero-knowledge protocol for $\graphiso$
without knowing~$\pi$.
\label{fig:goldreich-micali-wigderson-simulator}
}
\end{figure}
It remains to show that the protocol from
Figure~\ref{fig:goldreich-micali-wigderson} is a zero-knowledge protocol.
Figure~\ref{fig:goldreich-micali-wigderson-simulator} shows a simulated
protocol with Marvin who does not know Merlin's secret $\pi$ but pretends to
know it. The information communicated in one round of the protocol has the
form of a triple: $(H, b, \sigma)$. If Marvin is lucky enough to choose a
random bit $a$ with $a = b$, he can simply send $\sigma = \rho$ and wins:
Arthur (or any third party watching the communication) will not notice the
fraud. On the other hand, if $a \neq b$, then Marvin's attempt to cheat would
be exposed. However, that is no problem for the malicious wizard: He simply
deletes this round from the protocol and repeats. Thus, he can produce a
sequence of triples of the form $(H, b, \sigma)$ that is indistinguishable
from the corresponding sequence of triples in the original protocol between
Merlin and Arthur. It follows that Goldreich, Micali, and Wigderson's
protocol for $\graphiso$ is a zero-knowledge protocol.
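Marvin's simulator from Figure~\ref{fig:goldreich-micali-wigderson-simulator} can be sketched analogously. Again, this is illustrative only (graphs as edge sets, permutations as dictionaries, names ours); the key point is that mismatched rounds are simply deleted and retried:

```python
import random

def permute(edges, perm):
    """Apply the vertex permutation perm (a dict) to a set of edges."""
    return {frozenset({perm[u], perm[v]}) for u, v in map(tuple, edges)}

def random_perm(n):
    images = list(range(n))
    random.shuffle(images)
    return dict(enumerate(images))

def simulate_round(g0, g1, n):
    """Produce one triple (H, b, sigma) without knowing pi: commit to
    H = rho(G_a), draw Arthur's challenge, and delete mismatched rounds."""
    while True:
        rho, a = random_perm(n), random.randint(0, 1)
        h = permute(g1 if a else g0, rho)
        b = random.randint(0, 1)      # Arthur's (simulated) random bit
        if b == a:                    # keep the round only if b = a
            return h, b, rho
```

Each returned triple satisfies $\sigma(G_b) = H$, just as in the real protocol; on average, half of the attempted rounds are deleted, so the simulator still runs in expected polynomial time.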
\begin{gyak}
\ujgyak \textbf{Arthur-Merlin hierarchy:}
\textit{a.} Analogously to $\ma$ from
Definition~\ref{def:arthur-merlin-hierarchie},
define the other classes $\am$, $\mam$, $\ama$, $\ldots$
of the Arthur-Merlin hierarchy.
\textit{b.} What is the inclusion structure between the classes $\ma$,
$\co\ma$, $\am$, $\co\am$, and the classes of the polynomial hierarchy
defined in Definition~\ref{def:ph} of Subsection~\ref{sec:reducibilities-hierarchies}?
\label{ex:arthur-merlin}
\vspace{-4mm}
\ujgyak \textbf{Zero-knowledge protocol for graph isomorphism:}
\textit{a.} Consider the graphs $G$ and $H$ from
Example~\ref{exa:graphiso} in Section~\ref{sec:algebra}. Execute the
zero-knowledge protocol from Figure~\ref{fig:goldreich-micali-wigderson}
with the graphs $G_0 = G$ and $G_1 = H$ and the isomorphism $\pi =
(\mbox{\tiny $
\begin{array}{@{}c@{\hspace*{.8mm}}c@{\hspace*{.8mm}}c@{\hspace*{.8mm}}c@{\hspace*{.8mm}}c@{}}
1 & 2 & 3 & 4 & 5 \\
3 & 4 & 1 & 5 & 2
\end{array}
$})$. Use an isomorphism $\rho$ of your choice, and try all possibilities for
the random bits $a$ and~$b$.
Repeat this Arthur-Merlin game with an unknown isomorphism $\rho$ chosen by
somebody else.
\textit{b.} Modify the protocol from Figure~\ref{fig:goldreich-micali-wigderson} such that, with only one round
of the protocol, Marvin's success probability is less than $2^{-25}$. \label{ex:zero-knowledge-for-graphiso}
\end{gyak}
\begin{fld}
\ujfld{Arithmetics in the ring \boldmath{$\integers_k$}}
{Let $k \in \N$ and $x, y, z \in \integers$. We say {\em $x$ is congruent to $y$
modulo~$k$\/} ($x \equiv y \bmod k$, for short) if and only if $k$ divides
the difference $y-x$. For example, $-10 \equiv 7 \bmod 17$ and $4 \equiv 0
\bmod 2$. The congruence $\equiv$ modulo $k$ defines an {\em equivalence
relation\/} on~$\integers$, i.e., it is {\em reflexive\/} ($x \equiv x
\bmod k$), {\em symmetric\/} (if $x \equiv y \bmod k$ then $y \equiv x \bmod
k$), and {\em transitive\/} ($x \equiv y \bmod k$ and $y \equiv z \bmod k$
imply $x \equiv z \bmod k$).
The set $x + k\integers = \{y \in \integers \condition y \equiv x \bmod k\}$
is the {\em remainder class of $x \bmod k$}. For example, the remainder
class of $3 \bmod 7$ is
\[
3 + 7\integers = \{3, 3 \pm 7 , 3 \pm 2 \cdot 7, \ldots \} = \{3, 10, -4, 17,
-11, \ldots\} \koz .
\]
We represent the remainder class of $x \bmod k$ by the smallest natural number
in $x + k\integers$. For instance, $3$ represents the remainder class of $3
\bmod 7$. The set of all remainder classes ${}\bmod k$ is $\integers_k = \{0,
1, \ldots , k-1\}$. On~$\integers_k$, {\em addition\/} is defined by $(x +
k\integers) + (y + k\integers) = (x+y) + k\integers$, and {\em
multiplication\/} is defined by $(x + k\integers) \cdot (y + k\integers) =
(x \cdot y) + k\integers$. For example, in the arithmetics modulo~$7$, we
have $(4 + 7\integers) + (5 + 7\integers) = (4+5) +
7\integers = 2 + 7\integers$ and $(4 + 7\integers) \cdot (5 + 7\integers) = (4
\cdot 5) + 7\integers = 6 + 7\integers$.
Prove that in the arithmetics modulo~$k$:
\begin{description}
\item[\textit{a.}] $(\integers_k, +, \cdot)$ is a commutative ring with one;
\item[\textit{b.}] $\integers_{k}^{\ast}$, which is defined in
Example~\ref{exa:gruppe}, is a multiplicative group;
\item[\textit{c.}] $(\integers_{p}, +, \cdot )$ is a
field for each prime number~$p$.
\item[\textit{d.}] Prove that the neutral element of a group and the inverse of
each group element are unique.
\item[\textit{e.}] Let $\mathfrak{R}$ be a commutative ring with one. Prove
that the set of all invertible elements in $\mathfrak{R}$ forms a
multiplicative group. Determine this group for the ring~$\integers_k$.
\end{description}\label{ex:field}
}
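The modular arithmetic above is easy to experiment with; a minimal sketch (the helper name is ours) checks the worked examples modulo~$7$:

```python
from math import gcd

def units(k):
    """The invertible elements of Z_k: exactly those x coprime to k."""
    return [x for x in range(1, k) if gcd(x, k) == 1]

# The worked examples modulo 7:
assert (4 + 5) % 7 == 2
assert (4 * 5) % 7 == 6
# Every nonzero element of Z_7 is invertible, as expected of a field:
assert units(7) == [1, 2, 3, 4, 5, 6]
```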
\vspace{-4mm}
\ujfld{Tree isomorphism}
{The graph isomorphism problem can be solved
efficiently on special graph classes, such as on the class of trees. An
(undirected) {\em tree\/} is a connected graph without cycles. (A {\em
cycle\/} is a sequence of pairwise incident edges that returns to the
point of origin.) The {\em leaves of a tree\/} are the vertices with degree
one. The {\em tree isomorphism problem} is defined by
\begin{eqnarray*}
\texttt{TI} & = & \{ \pair{G,H} \condition
\mbox{$G$ and $H$ are isomorphic trees} \} \koz .
\end{eqnarray*}
Design an efficient algorithm for this problem.
\textit{Hint.} Label the vertices of the given pair of trees successively by
suitable number sequences. Compare the resulting sequences of labels in the
single loops of the algorithm. Starting from the leaves of the trees and then
working step by step towards the center of the trees, the algorithm halts as
soon as all vertices are labelled. This algorithm can be found, e.g.,
in~\cite{Koebler92a}. \label{ex:treeiso}
}
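The labelling idea from the hint can be sketched for rooted trees (an unrooted tree is first rooted at its center); this is only an outline of one possible implementation, with names of our choosing:

```python
def ahu_code(adj, root, parent=None):
    """Canonical code of the tree adj (adjacency lists) rooted at root:
    codes are built bottom-up from the leaves, and the children's codes
    are sorted so that isomorphic subtrees receive identical codes."""
    child_codes = sorted(ahu_code(adj, child, root)
                         for child in adj[root] if child != parent)
    return "(" + "".join(child_codes) + ")"
```

Two rooted trees are then isomorphic if and only if their codes coincide, and the codes can be computed in polynomial time.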
\ujfld{Computing the determinant}
{Design an efficient algorithm in pseudocode for computing the determinant of a matrix. Implement your
algorithm in a programming language of your choice.
Can the inverse of a matrix be computed efficiently?\label{ex:determinant}
}
\ujfld{Low-exponent attack}
\vspace{-4mm}
{\begin{description}
\item[\textit{a.}] Consider the RSA cryptosystem from Figure~\ref{fig:rsa}. For
the sake of efficiency, the public exponent $e=3$ has been popular.
However, this choice is dangerous. Suppose Alice, Bob, and Chris encrypt the same message $m$ with
the same public exponent $e=3$, but perhaps with distinct moduli, $n_A$,
$n_B$, and~$n_C$. Erich intercepts the resulting three ciphertexts: $c_i =
m^3 \bmod n_i$ for $i \in \{A, B, C\}$. Then, Erich can easily decrypt the
message~$m$. How?
\textit{Hint.} Erich knows the Chinese remainder theorem, which also was useful in
Exercise \ref{ex:rsa}.
A recommended value for the public exponent is $e = 2^{16} +1$, since its
binary expansion has only two ones. Thus, the square-and-multiply algorithm
runs fast for this~$e$.
\item[\textit{b.}] The attack described above can be extended to $k$ ciphertexts
that are related with each other as follows. Let $a_i$ and $b_i$ be known,
$1 \leq i \leq k$, and suppose that $k$ messages $c_i = (a_i m + b_i)^e
\bmod n_i$ are sent and are intercepted by Erich. Further, suppose that $k
> e(e+1)/2$ and $\min(n_i) > 2^{e^2}$. How can attacker Erich now determine
the original message~$m$?
\textit{Hint.} Apply so-called lattice reduction techniques (see, e.g.,
Micciancio and Goldwasser~\cite{mic-gol:b:lattice-problems}). The attack
mentioned here is due to
H{\aa}stad~\cite{has:j:solving-low-degree-equations} and has been
strengthened later by Coppersmith~\cite{cop:j:low-exponent-rsa-attacks}.
\item[\textit{c.}] How can the attacks described above be prevented?
\end{description}\label{ex:low-exponent-attack}
}
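The computation Erich performs in part \textit{a.} of the low-exponent attack can be sketched as follows (illustrative only; all names are ours, and small coprime moduli stand in for realistic RSA moduli):

```python
def crt(remainders, moduli):
    """Chinese remainder theorem for pairwise coprime moduli."""
    n = 1
    for m in moduli:
        n *= m
    x = 0
    for r, m in zip(remainders, moduli):
        q = n // m
        x += r * q * pow(q, -1, m)  # pow(q, -1, m): inverse of q mod m
    return x % n

def icbrt(x):
    """Integer cube root by binary search."""
    lo, hi = 0, 1 << (x.bit_length() // 3 + 2)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid**3 <= x:
            lo = mid
        else:
            hi = mid - 1
    return lo

def hastad_e3(ciphertexts, moduli):
    """Recover m from c_i = m^3 mod n_i: since m < min(n_i), we have
    m^3 < n_A * n_B * n_C, so the CRT yields m^3 over the integers."""
    return icbrt(crt(ciphertexts, moduli))
```

The cube root is taken over the integers, without any modular reduction; this is exactly why sending the same message under three moduli with $e = 3$ is dangerous.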
\end{fld}
\begin{fejmegj}\label{sec:chapter-notes-cryptology}
Singh's book~\cite{Singh99} gives a nice introduction to the history
of cryptology, from its ancient roots to modern cryptosystems. For example,
you can find out there about recent discoveries related to the development of
RSA and Diff{}ie-Hellman in the nonpublic sector. Ellis, Cocks, and
Williamson from the Communications Electronics Security Group (CESG) of the
British Government Communications Headquarters (GCHQ) proposed the RSA system
from Figure~\ref{fig:rsa} and the Diff{}ie-Hellman protocol from
Figure~\ref{fig:diffie-hellman} even earlier than Rivest, Shamir, and Adleman
and at about the same time as but independent of Diff{}ie and Hellman,
respectively. RSA and Diff{}ie-Hellman are described in probably every book
about cryptography written since their invention. A more comprehensive list
of attacks against RSA than that of Section~\ref{sec:rsa-factoring} can be
found in, e.g., \cite{Boneh99,KaliskiRobshaw95,Moore92,
Rothe02,Rothe04b,sha:j:rsa-for-paranoids}.
Primality tests such as the Miller-Rabin test
% from Figure~\ref{fig:miller-rabin}
and factoring algorithms are also described in
many books, e.g.,
in~\cite{Goldreich01,Rothe04b,Salomaa96,Stinson95}.
The notion of strongly noninvertible associative one-way functions, on which
the secret-key agreement protocol from
Figure~\ref{fig:rivest-sherman-secret-key} is based, is due to Rivest and
Sherman. The modification of this protocol to a digital signature scheme is
due to Rabi and Sherman. In their paper~\cite{rab-she:j:aowf}, they also
proved that commutative, associative one-way functions exist if and only if $\p
\neq \np$. However, the one-way functions they construct are neither total
nor strongly noninvertible, even if $\p \neq \np$ is assumed. Hemaspaandra
and Rothe~\cite{HemaspaandraRo99} proved that total, strongly noninvertible,
commutative, associative one-way functions exist if and only if $\p \neq \np$. Further
investigations on this topic can be found in~\cite{Beygelzimer99,HemasRotheSaxena05,
hem-pas-rot:j:strong-noninvertibility,hom:j:low-ambiguity-aowf}.
The notion of interactive proof systems and zero-knowledge protocols is due to
Goldwasser, Micali, and Rackoff~\cite{gol-mic-rac:j:interactive-proof-systems}. One of the best and
most comprehensive sources on this field is Chapter~4 in Goldreich's
book~\cite{Goldreich01};\nevindex{Goldreich, Oded} see also the
books~\cite{Koebler93,Papadimitriou94,Rothe04b}\nevindex{Köbler, J.}\nevindex{Papadimitriou, Christos H.}
and the surveys~\cite{Goldwasser89,Goldreich88,Rothe02}.\nevindex{Goldreich, Oded}
Arthur-Merlin games were introduced by Babai and\nevindex{Babai, László}
Moran~\cite{Babai88,bab:c:trading} and have been investigated
in many subsequent papers. Variants of the notion of zero-knowledge, which differ from the notion in
Definition~\ref{def:zero-knowledge} in their technical details, are
extensively discussed in, e.g., \cite{Goldreich01} and also in, e.g.,
\cite{Goldreich88,Goldwasser89,Rothe02}.
\end{fejmegj}
\index{cryptology|)}
\chapter[ Complexity Theory]{Complexity Theory}\label{chap:complexity}\index{complexity theory|(}
In Chapter~\ref{chap:cryptology}, efficient algorithms were introduced that
are important for cryptographic protocols.
Designing efficient algorithms of course is a central task in all
areas of computer science. Unfortunately, many important problems have
resisted all attempts in the past to devise efficient algorithms solving them.
Well-known examples of such problems are the satisfiability problem for
boolean formulas and the graph isomorphism problem.
One of the most important tasks in complexity theory is to classify such
problems according to their computational complexity. Complexity theory and
algorithmics are two sides of the same coin; they complement each other
in the following sense. While in algorithmics one seeks to find the best
\textit{upper bound}\index{upper bound} for some problem, an important goal of complexity theory is to obtain
the best possible \textit{lower bound} for the same problem. If the upper and
the lower bound coincide, the problem has been classified.
The proof that some problem cannot be solved efficiently often appears to be
``negative'' and not desirable. After all, we wish to solve our problems and
we wish to solve them fast. However, there is also some ``positive'' aspect
of proving lower bounds that, in particular, is relevant in cryptography (see
Chapter~\ref{chap:cryptology}). Here, we are interested in the applications
of inefficiency: A proof that certain problems (such as the factoring problem
or the discrete logarithm) cannot be solved efficiently can support the
security of some cryptosystems that are important in practical applications.
In Section~\ref{sec:computational-complexity}, we provide the foundations of
complexity theory. In particular, the central complexity classes $\p$ and
$\np$ are defined. The question of whether or not these two classes are equal
is still open after more than three decades of intense research. Up to now,
neither a proof of the inequality $\p \neq \np$ (which is widely believed)
could be achieved, nor were we able to prove the equality of $\p$ and~$\np$.
This question led to the development of the beautiful and useful theory of
$\np$-completeness.
One of the best understood $\np$-complete problems is $\texttt{SAT}$, the
satisfiability problem of propositional logic: Given a boolean
formula~$\varphi$, does there exist a satisfying assignment for~$\varphi$,
i.e., does there exist an assignment of truth values to $\varphi$'s variables
that makes $\varphi$ true? Due to its $\np$-completeness, it is very unlikely
that there exist efficient deterministic algorithms for~$\texttt{SAT}$. In
Section~\ref{sec:sat}, we present a deterministic and a randomised algorithm
for $\texttt{SAT}$ that both run in exponential time. Even though these algorithms
are {\em asymptotically inefficient\/} (which is to say that they are useless
in practice for {\em large\/} inputs), they are useful for sufficiently small
inputs of sizes still relevant in practice. That is, they outperform the
naive deterministic exponential-time algorithm for $\texttt{SAT}$ in that they
considerably increase the input size for which the algorithm's running time
still is tolerable.
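The naive deterministic exponential-time algorithm mentioned above simply tries all $2^n$ assignments; a minimal sketch (representation and names are ours):

```python
from itertools import product

def naive_sat(clauses, n):
    """Brute-force SAT over variables x_1, ..., x_n: a clause is a set
    of literals, where +i means x_i and -i means the negation of x_i.
    Returns a satisfying assignment (as a tuple of truth values) or None."""
    for bits in product([False, True], repeat=n):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return bits
    return None

# (x1 or not x2) and (x2 or x3) is satisfiable; (x1) and (not x1) is not.
```

Any algorithm that beats this $2^n$ baseline, even by a modest exponential factor, noticeably extends the range of inputs that can be handled in practice.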
In Section~\ref{sec:graphisomorphie}, we come back to the graph isomorphism
problem, which was introduced in Section~\ref{sec:algebra} (see
Definition~\ref{def:graphiso}) and which was useful in
Section~\ref{sec:zero-knowledge-for-graphiso} with regard to the
zero-knowledge protocols. This problem is one of the few natural problems
in~$\np$, which (under the plausible assumption that $\p \neq \np$) may be
neither efficiently solvable nor be $\np$-complete. In this regard, this
problem is special among the problems in~$\np$.
Evidence for the hypothesis that
the graph isomorphism problem may be neither in $\p$ nor $\np$-complete comes
from the theory of {\em lowness}, which is introduced in
Section~\ref{sec:graphisomorphie}. In particular, we present Schöning's result
that $\graphiso$ is contained in the low hierarchy within~$\np$.
This result provides
strong evidence that $\graphiso$ is not $\np$-complete. We also show that
$\graphiso$ is contained in the complexity class $\spp$ and thus is low for
certain probabilistic complexity classes. Informally put, a set is {\em
low\/} for a complexity class $\mathcal{C}$ if it does not provide any
useful information when used as an ``oracle'' in $\mathcal{C}$ computations.
For proving the lowness of $\graphiso$, certain group-theoretic algorithms are
useful.
\section{Foundations}\label{sec:computational-complexity}
As mentioned above, complexity theory is concerned with proving lower bounds.
The difficulty in such proofs is that it is not enough to analyse the runtime
of just \textit{one} concrete algorithm for the problem considered. Rather,
one needs to show that \textit{every} algorithm solving the problem has a
runtime no better than the lower bound to be proven.
This includes also algorithms that
have not been found as yet. Hence, it is necessary to give a formal and
mathematically precise definition of the notion of algorithm.
Since the 1930s, a large variety of formal algorithm models has been
proposed. All these models are equivalent in the sense that each such model
can be transformed (via an algorithmic procedure)
into any other such model. Loosely speaking, one might
consider this transformation as some sort of compilation between distinct
programming languages. The equivalence of all these algorithm models
justifies Church's thesis, which says that each such model captures precisely
the somewhat vague notion of ``intuitive computability''. The algorithm model
that is most common in complexity theory is the Turing machine, which was
introduced in 1936 by Alan Turing (1912--1954) in his pathbreaking
work~\cite{Turing36}. The Turing machine is a very simple
abstract model of a computer. In what follows, we describe this model by
defining its syntax and its semantics, and introduce at the same time two
basic computation paradigms: determinism and nondeterminism. It makes sense to
first define the more general model of nondeterministic Turing machines.
Deterministic Turing machines then are easily seen to be a special case.
First, we give some technical details and describe how Turing machines work.
A Turing machine has $k$ infinite work tapes that are subdivided into cells.
Every cell may contain one letter of the work alphabet. If a cell does not
contain a letter, we indicate this by a special blank symbol, denoted
by~$\Box$. The computation is done on the work tapes. Initially, a
designated input tape contains the input string, and all other cells contain
the blank. If a computation halts, its result is the string contained in the
designated output tape.\footnote{ One can require, for example, that the input
tape is read-only and the output tape is write-only. Similarly,
one can specify a variety of further variations of the technical details,
but we do not pursue this any further here.} To each tape belongs a head
that accesses exactly one cell of this tape. In each step of the computation,
the head can change the symbol currently read and then move one cell to the
left or to the right, or not move at all. At the same time, the current state of the
machine, which is stored in its ``finite control,'' can change.
Figure~\ref{fig:tm} displays a Turing machine
with one input and two work tapes.
\begin{figure}[!t]
\centering
\input{figs/tm-english.eepic}
\caption{A Turing machine with one input and two work tapes.}\label{fig:tm}
\end{figure}
\begin{defi}[Syntax of Turing machines]\label{def:syntax-tm}
A \ki{nondeterministic Turing machine with {\boldmath $k$} tapes}\inddef{nondeterministic Turing machine with $k$ tapes}
(a $k$-tape NTM, for short) is a $7$-tuple $M = (\Sigma, \Gamma, Z, \delta , z_0, \Box, F)$, where
$\Sigma$ is the input alphabet, $\Gamma$ is the work alphabet, $Z$ is a finite
{\em set of states\/} disjoint with $\Gamma$, $\delta : Z \times \Gamma^k
\rightarrow \mathfrak{P}(Z \times \Gamma^k \times \{L, R, N\}^k)$ is the {\em
transition function}, $z_0 \in Z$ is the {\em initial state}, $\Box \in
\Gamma - \Sigma$ is the {\em blank symbol}, and $F \seq Z$ is the {\em set of
final states}. Here, $\mathfrak{P}(S)$ denotes the {\em power set\/} of
set~$S$, i.e., the set of all subsets of~$S$.
For readability, we write $(z,a) \mapsto (z',b,x)$ instead of $(z', b, x) \in
\delta(z,a)$ with $z,z' \in Z$, $x \in \{L,R,N\}$, and $a, b \in \Gamma$.
This transition has the following meaning. If the current state is $z$ and
the head currently reads a symbol~$a$, then:
\begin{itemize}
\item $a$ is replaced by~$b$,
\item $z'$ is the new state, and
\item the head moves according to $x \in \{L,R,N\}$, i.e., the head either
moves one cell to the left (if $x = L$), or one cell to the right (if $x =
R$), or it does not move at all (if $x = N$).
\end{itemize}
The special case of a {\em deterministic Turing machine with $k$ tapes\/}
($k$-tape DTM, for short) is obtained by requiring that the transition
function $\delta$ maps from $Z \times \Gamma^k$ to $Z \times \Gamma^k \times
\{L, R, N\}^k$.
\end{defi}
For $k = 1$, we obtain the one-tape Turing machine, abbreviated simply by NTM
and DTM, respectively. Every $k$-tape NTM or $k$-tape DTM can be simulated by
a Turing machine with only one tape, where the running time increases at most quadratically.
Turing machines can be considered both as acceptors (which accept languages)
and as transducers (which compute functions).
\begin{defi}[Semantics of Turing machines]\label{def:semantik-tm}
Let $M = (\Sigma, \Gamma, Z, \delta , z_0, \Box, F)$ be an~{NTM}. A {\em
configuration of $M$\/} is a string $k \in \Gamma^{\ast}Z\Gamma^{\ast}$,
where $k = \alpha z \beta$ is interpreted as follows:
$\alpha \beta$ is the current tape inscription,
the head reads the first symbol of~$\beta$, and
$z$ is the current state of~$M$.
On the set $\mathfrak{K}_M = \Gamma^{\ast}Z\Gamma^{\ast}$ of all
configurations of~$M$, define a binary relation $\vdash_{M}$, which describes
the transition from a configuration $k \in \mathfrak{K}_M$ into a
configuration $k' \in \mathfrak{K}_M$ according to~$\delta$. For any two
strings $\alpha = a_1 a_2 \cdots a_m$ and $\beta = b_1 b_2 \cdots b_n$
in~$\Gamma^{\ast}$, where $m \geq 0$ and $n \geq 1$, and for all $z \in Z$,
define
\[
\alpha z \beta \vdash_M \left\{
\begin{array}{ll}
a_1 a_2 \cdots a_m z' c b_2 \cdots b_n
& \mbox{if $(z, b_1) \mapsto (z', c, N)$ and $m \geq 0$ and $n \geq 1$} \\
a_1 a_2 \cdots a_m c z' b_2 \cdots b_n
& \mbox{if $(z, b_1) \mapsto (z', c, R)$ and $m \geq 0$ and $n \geq 2$} \\
a_1 a_2 \cdots a_{m-1} z' a_m c b_2 \cdots b_n
& \mbox{if $(z, b_1) \mapsto (z', c, L)$ and $m \geq 1$ and $n \geq 1$ \koz .}
\end{array}
\right.
\]
Two special cases need to be considered separately:
\begin{enumerate}
\item If $n = 1$ and $(z, b_1) \mapsto (z', c, R)$ (i.e., $M$'s head moves to
the right and reads a $\Box$ symbol), then $a_1 a_2 \cdots a_m z b_1
\vdash_M a_1 a_2 \cdots a_m c z' \Box$.
\item If $m = 0$ and $(z, b_1) \mapsto (z', c, L)$ (i.e., $M$'s head moves to
the left and reads a $\Box$ symbol), then $z b_1 b_2 \cdots b_n \vdash_M z'
\Box c b_2 \cdots b_n$.
\end{enumerate}
The {\em initial configuration of $M$ on input~$x$\/} is always $z_0 x$.
The {\em final configurations of $M$ on input~$x$\/} have the form $\alpha z
\beta$ with $z \in F$ and $\alpha, \beta \in \Gamma^{\ast}$.
Let $\vdash_M^{\ast}$ be the reflexive, transitive closure of~$\vdash_M$: For
$k, k' \in \mathfrak{K}_M$, we have $k \vdash_M^{\ast} k'$ if and only if
there is a finite sequence $k_0 , k_1 , \ldots , k_t$ of configurations in
$\mathfrak{K}_M$ such that
\[
k = k_0 \vdash_M k_1 \vdash_M \cdots \vdash_M k_t = k' \koz ,
\]
where possibly $k = k_0 = k_t = k'$. If $k_0 = z_0x$ is the initial
configuration of $M$ on input~$x$, then this sequence of configurations is a
{\em finite computation of~$M(x)$}, and we say {\em $M$ halts on input~$x$}.
The {\em language accepted by $M$\/} is defined by
\[
L(M) = \{x \in \sigmastar \condition z_0x \vdash_M^{\ast} \alpha z \beta
\mbox{ with } z \in F \mbox{ and } \alpha, \beta \in \Gamma^{\ast}\} \koz .
\]
\end{defi}
For NTMs, any configuration may be followed by more than one configuration.
Thus, they have a \textit{computation tree},\index{computation tree} whose root is labelled by the initial
configuration and whose leaves are labelled by the final configurations. Note
that trees are special graphs (recall Definition~\ref{def:graphiso} in
Section~\ref{sec:algebra} and Problem \ref{ex:treeiso}), so they
have vertices and edges. The vertices of the computation tree of $M(x)$ are the
configurations of $M$ on input~$x$. For any two configurations $k$ and $k'$
from~$\mathfrak{K}_M$, there is exactly one directed edge from $k$ to $k'$ if
and only if $k \vdash_M k'$. A path in the computation tree of $M(x)$ is a
sequence of configurations $k_0 \vdash_M k_1 \vdash_M \cdots \vdash_M k_t
\vdash_M \cdots$. The computation tree of an NTM can have infinite paths on
which a halting configuration is never reached. For DTMs, each non-halting
configuration has a unique successor configuration. Thus, the computation
tree of a DTM degenerates to a linear chain.
\begin{pelda}
\label{exa:tm}
Consider the language $L = \{a^n b^n c^n \condition n \geq 1\}$. A
Turing machine accepting $L$ is given by
\[
M = (\{a,b,c\}, \{a,b,c,\$,\Box \}, \{s_0, s_1,\ldots , s_6\}, \delta, s_0,
\Box , \{s_6\}) \koz ,
\]
where the transitions of $\delta$ are stated in Figure~\ref{tab:tm}.
Figure~\ref{tab:tm-interpretation} provides the meaning of the individual states
of $M$ and the intention behind each state.
See also Exercise \ref{ex:tm}.
\begin{figure}[!t]
\centering
\rm\small
\begin{tabular}{|r@{\hspace*{1mm}}c@{\hspace*{1mm}}l|r@{\hspace*{1mm}}c@{\hspace*{1mm}}l|r@{\hspace*{1mm}}c@{\hspace*{1mm}}l|}
\hline
$(s_0,a)$ & $\mapsto$ & $(s_1,\$,R)$ &
$(s_2,\$)$ & $\mapsto$ & $(s_2,\$,R)$ &
$(s_5,c)$ & $\mapsto$ & $(s_5,c,L)$ \\
$(s_1,a)$ & $\mapsto$ & $(s_1,a,R)$ &
$(s_3,c)$ & $\mapsto$ & $(s_3,c,R)$ &
$(s_5,\$)$ & $\mapsto$ & $(s_5,\$,L)$ \\
$(s_1,b)$ & $\mapsto$ & $(s_2,\$,R)$ &
$(s_3,\Box)$ & $\mapsto$ & $(s_4,\Box, L)$ &
$(s_5,b)$ & $\mapsto$ & $(s_5,b,L)$ \\
$(s_1,\$)$ & $\mapsto$ & $(s_1,\$,R)$ &
$(s_4,\$)$ & $\mapsto$ & $(s_4,\$,L)$ &
$(s_5,a)$ & $\mapsto$ & $(s_5,a,L)$ \\
$(s_2,b)$ & $\mapsto$ & $(s_2,b,R)$ &
$(s_4,\Box)$ & $\mapsto$ & $(s_6,\Box, R)$ &
$(s_5,\Box)$ & $\mapsto$ & $(s_0,\Box, R)$ \\
$(s_2,c)$ & $\mapsto$ & $(s_3,\$,R)$ &
$(s_4,c)$ & $\mapsto$ & $(s_5,c,L)$ &
$(s_0,\$)$ & $\mapsto$ & $(s_0,\$,R)$ \\
\hline
\end{tabular}
\caption{Transition function $\delta$ of $M$ for $L = \{a^n
b^n c^n \condition n \geq 1\}$.
\label{tab:tm}
}
\end{figure}
\begin{figure}[!t]
\centering
\rm\small
\begin{tabular}[t]{|l||l|l|}
\hline
$Z$ & \multicolumn{1}{c|}{Meaning}
& \multicolumn{1}{c|}{Intention} \\
\hline\hline
$s_0$ & initial state & start next cycle \\ \hline
$s_1$ & one $a$ stored & look for next $b$ \\ \hline
$s_2$ & one $a$ and one $b$ stored & look for next $c$ \\ \hline
$s_3$ & one $a$, one $b$, and one $c$ deleted & look for right boundary \\ \hline
$s_4$ & right boundary reached & move back and test if all $a$, $b$, and $c$
are deleted \\ \hline
$s_5$ & test not successful & move back and start next cycle \\ \hline
$s_6$ & test successful & accept \\ \hline
\end{tabular}
\caption{$M$'s states, their meaning and their intention.
\label{tab:tm-interpretation}
}
\end{figure}
\end{pelda}
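To see the machine at work, its transition table from Figure~\ref{tab:tm} can be executed directly. The following Python sketch (an illustration added here, not part of the formal definition; the underscore stands for the blank symbol~$\Box$) simulates the one-tape DTM and accepts exactly when the final state $s_6$ is reached:

```python
# Transition table of M from the figure: (state, symbol) -> (state', symbol', move)
BLANK = "_"  # represents the blank symbol
DELTA = {
    ("s0", "a"): ("s1", "$", "R"), ("s2", "$"): ("s2", "$", "R"),
    ("s1", "a"): ("s1", "a", "R"), ("s3", "c"): ("s3", "c", "R"),
    ("s1", "b"): ("s2", "$", "R"), ("s3", BLANK): ("s4", BLANK, "L"),
    ("s1", "$"): ("s1", "$", "R"), ("s4", "$"): ("s4", "$", "L"),
    ("s2", "b"): ("s2", "b", "R"), ("s4", BLANK): ("s6", BLANK, "R"),
    ("s2", "c"): ("s3", "$", "R"), ("s4", "c"): ("s5", "c", "L"),
    ("s5", "c"): ("s5", "c", "L"), ("s5", "$"): ("s5", "$", "L"),
    ("s5", "b"): ("s5", "b", "L"), ("s5", "a"): ("s5", "a", "L"),
    ("s5", BLANK): ("s0", BLANK, "R"), ("s0", "$"): ("s0", "$", "R"),
}

def accepts(x, max_steps=100_000):
    """Simulate the one-tape DTM on input x; True iff final state s6 is reached."""
    tape = {i: ch for i, ch in enumerate(x)}   # sparse tape; missing cells are blank
    state, pos = "s0", 0
    for _ in range(max_steps):
        if state == "s6":                      # final state reached: accept
            return True
        key = (state, tape.get(pos, BLANK))
        if key not in DELTA:                   # no transition applies: halt and reject
            return False
        state, symbol, move = DELTA[key]
        tape[pos] = symbol
        pos += {"L": -1, "R": 1, "N": 0}[move]
    raise RuntimeError("step limit exceeded")
```

The simulation accepts inputs such as $abc$ and $aaabbbccc$ after repeated deletion cycles, while inputs outside the language eventually reach a state with no applicable transition and are rejected.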
In order to classify problems according to their computational complexity, we
need to define complexity classes. Each such class is defined by a given
resource function and contains all problems that can be solved by a Turing
machine that requires no more of a resource (e.g., computation time or
memory space) than is
specified by the resource function. We consider only the resource time here,
i.e., the number of steps---as a function of the input size---needed to solve
(or to accept) the problem. Further, we consider only the traditional \textit{worst-case}
complexity model. That is, among all inputs of size~$n$,
those that require the maximum resource are decisive; one thus assumes the
worst case to occur. We now define deterministic and nondeterministic time
complexity classes.
\begin{defi}[Deterministic and nondeterministic time complexity]\label{def:dtime-ntime}
\begin{itemize}
\item Let $M$ be a DTM with $L(M) \seq \sigmastar$ and let $x \in \sigmastar$
be an input. Define the {\em time function of~$M(x)$}, which maps from
$\sigmastar$ to~$\N$, as follows:
\begin{eqnarray*}
\Time{M}{x} & = & \left\{
\begin{array}{ll}
k & \mbox{if $M(x)$ has exactly $k+1$ configurations} \\
\mbox{undefined} & \mbox{otherwise.}
\end{array}
\right.
\end{eqnarray*}
Define the function $\mbox{\rm{}time}_{M} : \N \rightarrow \N$ by:
\begin{eqnarray*}
\timefcn{M}{n} & = & \left\{
\begin{array}{ll}
\max_{x : |x| = n} \Time{M}{x}
& \mbox{if $\Time{M}{x}$ is defined} \\
& \mbox{\quad\quad for all $x$ with $|x| = n$}\\
\mbox{undefined} & \mbox{otherwise \koz .}
\end{array}
\right.
\end{eqnarray*}
\item Let $M$ be an NTM with $L(M) \seq \sigmastar$ and let $x \in \sigmastar$
be an input. Define the {\em time function of~$M(x)$}, which maps from
$\sigmastar$ to~$\N$, as follows:
\begin{eqnarray*}
\NTime{M}{x} & = & \left\{
\begin{array}{ll}
\min\{\Time{M}{x,\alpha} \condition \mbox{$M(x)$ accepts on path~$\alpha$}\}
& \mbox{if $x \in L(M)$} \\
\mbox{undefined} & \mbox{otherwise} \koz .
\end{array}
\right.
\end{eqnarray*}
Define the function $\mbox{\rm{}ntime}_{M} : \N \rightarrow \N$ by
\begin{eqnarray*}
\ntime{M}{n} & = & \left\{
\begin{array}{ll}
\max_{x : |x| = n} \NTime{M}{x}
& \mbox{if $\NTime{M}{x}$ is defined} \\
& \mbox{\quad\quad for all $x$ with $|x| = n$}\\
\mbox{undefined} & \mbox{otherwise} \koz .
\end{array}
\right.
\end{eqnarray*}
\item Let $t$ be a computable function that maps from $\N$ to $\N$.
Define the {\em deterministic and nondeterministic complexity classes with
time function $t$} by
\begin{eqnarray*}
\DTIME{t} & = & \left\{ A \ \
\begin{array}{|l}
\mbox{ $A = L(M)$ for some DTM $M$ and}\\
\mbox{ for all $n \in \N$, $\timefcn{M}{n} \leq t(n)$}
\end{array}
\right\} \koz ;
\\
\NTIME{t} & = & \left\{ A \ \
\begin{array}{|l}
\mbox{ $A = L(M)$ for some NTM $M$ and}\\
\mbox{ for all $n \in \N$, $\ntime{M}{n} \leq t(n)$}
\end{array}
\right\} \koz .
\end{eqnarray*}
\item Let $\polynomials$ be the set of all polynomials. Define the complexity
classes $\p$ and $\np$ as follows:
\begin{eqnarray*}
\p = \bigcup_{t \in \scriptpolynomials} \DTIME{t} & \quad \mbox{and} \quad &
\np = \bigcup_{t \in \scriptpolynomials} \NTIME{t} \koz .
\end{eqnarray*}
\end{itemize}
\end{defi}
Why are the classes $\p$ and $\np$ so important? Obviously, exponential-time
algorithms cannot be considered efficient in general. Garey and
Johnson~\cite{GareyJo79} compare the rates of growth of some particular
polynomial and exponential time functions $t(n)$ for certain input sizes
relevant in practice, see Figure~\ref{tab:polynomial-vs-exponential}. They
assume that a computer executes one million operations per second. Then all
algorithms bounded by a polynomial run in a ``reasonable'' time for inputs of
size up to $n = 60$, whereas, for example, an algorithm with time bound $t(n) =
3^n$ already takes more than $6$ years for the modest input size of $n = 30$.
For $n = 40$ it takes almost $4000$ centuries, and for $n = 50$ a truly
astronomic amount of time.
\begin{figure}[!t]
\centering
\rm\small
\begin{tabular}{|c||r|r|r|r|r|r|}
\hline
$t(n)$ & \multicolumn{1}{c|}{$n = 10$} & \multicolumn{1}{c|}{$n = 20$}
& \multicolumn{1}{c|}{$n = 30$} & \multicolumn{1}{c|}{$n = 40$}
& \multicolumn{1}{c|}{$n = 50$} & \multicolumn{1}{c|}{$n = 60$} \\
\hline \hline
$n$ & .00001 \mbox{ sec} & .00002 \mbox{ sec} & .00003 \mbox{ sec}
& .00004 \mbox{ sec} & .00005 \mbox{ sec} & .00006 \mbox{ sec} \\ \hline
$n^2$ & .0001 \mbox{ sec} & .0004 \mbox{ sec} & .0009 \mbox{ sec}
& .0016 \mbox{ sec} & .0025 \mbox{ sec} & .0036 \mbox{ sec} \\ \hline
$n^3$ & .001 \mbox{ sec} & .008 \mbox{ sec} & .027 \mbox{ sec}
& .064 \mbox{ sec} & .125 \mbox{ sec} & .256 \mbox{ sec} \\ \hline
$n^5$ & .1 \mbox{ sec} & 3.2 \mbox{ sec} & 24.3 \mbox{ sec}
& 1.7 \mbox{ min} & 5.2 \mbox{ min} & 13.0 \mbox{ min} \\
\hline \hline
$2^n$ & .001 \mbox{ sec} & 1.0 \mbox{ sec} & 17.9 \mbox{ min}
& 12.7 \mbox{ days} & 35.7 \mbox{ years} & 366 \mbox{ cent.} \\
\hline
$3^n$ & .059 \mbox{ sec} & 58 \mbox{ min} & 6.5 \mbox{ years}
& 3855 \mbox{ cent.} & $2 \cdot 10^8$ \mbox{ cent.}
& $1.3 \cdot 10^{13}$ \mbox{ cent.} \\ \hline
\end{tabular}
\caption{Comparison of some polynomial and exponential time functions.
\label{tab:polynomial-vs-exponential}
}
\end{figure}
The last decades have seen an impressive development of computer and
hardware technology. Figure~\ref{tab:computers-getting-faster} (taken
from~\cite{GareyJo79}) shows that this is not enough to provide an
essentially better runtime behaviour for exponential-time algorithms,
even assuming that the previous trend in hardware development
continues. What would happen if one had a computer that is $100$ times
or even $1000$ times as fast as current computers are? For functions
$t_i(n)$, $1 \leq i \leq 6$, let $N_i$ be the maximum size of inputs
that can be solved by a $t_i(n)$ time-bounded algorithm within one
hour. Figure~\ref{tab:computers-getting-faster} shows that a computer $1000$ times faster
than today's computers increases $N_5$ for $t_5(n) = 2^n$ by only an
additive value close to $10$. In contrast, using computers with the
same increase in speed, an $n^5$ time-bounded algorithm can handle
problem instances about four times as large as before.
\begin{figure}[!t]
\centering
\rm\small
\begin{tabular}{|l||@{}c@{}|@{}c@{}|@{}c@{}|}
\hline
\multicolumn{1}{|c||}{$t_i(n)$}
& \mbox{ Computer today }
& \mbox{ $100$ times faster }
& \mbox{ $1000$ times faster } \\ \hline \hline
$t_1(n) = n$ & $N_1$ & $100 \cdot N_1$ & $1000 \cdot N_1$ \\ \hline
$t_2(n) = n^2$ & $N_2$ & $10 \cdot N_2$ & $31.6 \cdot N_2$ \\ \hline
$t_3(n) = n^3$ & $N_3$ & $4.64 \cdot N_3$ & $10 \cdot N_3$ \\ \hline
$t_4(n) = n^5$ & $N_4$ & $2.5 \cdot N_4$ & $3.98 \cdot N_4$ \\ \hline \hline
$t_5(n) = 2^n$ & $N_5$ & $N_5 + 6.64$ & $N_5 + 9.97$ \\ \hline
$t_6(n) = 3^n$ & $N_6$ & $N_6 + 4.19$ & $N_6 + 6.29$ \\ \hline
\end{tabular}
\caption{What happens when the computers get faster?
\label{tab:computers-getting-faster}
}
\end{figure}
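The entries in Figure~\ref{tab:computers-getting-faster} follow from a short calculation: if $s$ is the speed-up factor, then an $n^k$ time-bounded algorithm handles inputs of size $s^{1/k} \cdot N$ in the same time, whereas a $c^n$ time-bounded algorithm gains only the additive term $\log_c s$. A few lines of Python (added here for illustration) reproduce the figure's entries:

```python
import math

def poly_gain(speedup, k):
    """Multiplicative input-size gain for a t(n) = n^k algorithm."""
    return speedup ** (1 / k)

def exp_gain(speedup, c):
    """Additive input-size gain for a t(n) = c^n algorithm."""
    return math.log(speedup, c)

# Multiplicative gains for polynomial bounds at a 1000-fold speed-up
print(round(poly_gain(1000, 2), 1))   # n^2 : factor 31.6 on N_2
print(round(poly_gain(1000, 5), 2))   # n^5 : factor 3.98 on N_4

# Additive gains for exponential bounds at a 1000-fold speed-up
print(round(exp_gain(1000, 2), 2))    # 2^n : N_5 + 9.97
print(round(exp_gain(1000, 3), 2))    # 3^n : N_6 + 6.29
```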
Intuitively, the complexity class $\p$ contains the efficiently solvable
problems. The complexity class $\np$ contains many problems important in
practice but currently not efficiently solvable. Examples are the
satisfiability problem and the graph isomorphism problem that will be dealt
with in more detail later in this chapter. The question of whether or not the
classes $\p$ and $\np$ are equal is still open. This famous $\p$-versus-$\np$
question gave rise to the theory of $\np$-completeness, which is briefly
introduced in Section~\ref{sec:np-completeness}.
\begin{gyak}
\ujgyak
Can Church's thesis ever be proven formally?\label{ex:church}
\ujgyak
Consider the Turing machine $M$ in Example~\ref{exa:tm}.
\textit{a.} What are the sequences of configurations of $M$ for
inputs $x = a^3b^3c^2$ and $y = a^3b^3c^3$, respectively?
\textit{b.} Prove that $M$ is correct, i.e., show that
$L(M) = \{a^n b^n c^n \condition n \geq 1\}$.
\textit{c.} Estimate the running time of~$M$.\label{ex:tm}
\ujgyak
Show that the graph isomorphism problem and the graph automorphism
problem introduced in Definition~\ref{def:graphiso} are both in~$\np$.\label{ex:gi-ga-in-np}
\end{gyak}
\section{NP-completeness}\label{sec:np-completeness}\index{NP-completeness}
The theory of $\np$-completeness provides methods to prove lower bounds for
problems in~$\np$. An $\np$ problem is said to be complete in~$\np$ if it
is among the hardest problems in this class, i.e., if it is at least as hard
as any $\np$ problem. The complexity of two given problems can be compared by
polynomial-time reductions. Among the different types of reduction one can
consider, we focus on the {\em polynomial-time many-one reducibility\/} in
this section. In Section~\ref{sec:graphisomorphie}, more general
reducibilities will be introduced, such as the {\em polynomial-time Turing
reducibility\/} and the {\em polynomial-time (strong) nondeterministic
Turing reducibility}.
\begin{defi}[Reducibility, NP-Completeness]\label{def:manyone-red}
A set $A$ is {\em reducible\/} to a set $B$ (in symbols, $A \manyone B$)
if and only if there exists a polynomial-time computable function $r$ such
that for all $x \in \sigmastar$, $x \in A \Lolra r(x) \in B$. A set $B$ is
said to be {\em $\manyonetext$-hard for~$\np$\/} if and only if $A \manyone B$
for each set $A \in \np$. A set $B$ is said to be {\em
$\manyonetext$-complete in~$\np$\/} ({\em $\np$-complete}, for short) if and
only if $B$ is $\manyonetext$-hard for $\np$ and $B \in \np$.
\end{defi}
Reductions are efficient algorithms that can be used to show that
problems are not efficiently solvable. That is, if one can
efficiently transform a hard problem into another problem via a
reduction, the hardness of the former problem is inherited by the
latter problem. At first glance, it might seem that infinitely many
efficient algorithms are required to prove some problem $X$
$\np$-hard, namely one reduction from each of the infinitely many
$\np$ problems to~$X$. However, an elementary result says that it is
enough to find just one such reduction, from {\em some\/}
$\np$-complete problem~$V$. Since the $\manyonetext$-reducibility is
transitive (see Exercise \ref{ex:manyone-is-transitive}),
the $\np$-hardness of $V$ implies the $\np$-hardness of $X$ via the
reduction $A \manyone V \manyone X$ for each $\np$ problem~$A$.
In 1971, Stephen Cook found the first such $\np$-complete problem: the
satisfiability problem of propositional logic, $\texttt{SAT}$ for short. For
many $\np$-completeness results, it is useful to start from the special
problem $\threesat$, the restriction of the satisfiability problem in
which each given Boolean formula is in conjunctive normal form and
each clause contains exactly three literals. $\threesat$ is also
$\np$-complete.
\begin{defi}[Satisfiability problem]\label{def:sat}
The Boolean constants {\em false\/} and {\em true\/} are
represented by $0$ and $1$. Let $x_1, x_2, \ldots, x_m$ be Boolean
variables, i.e., $x_i \in \{0,1\}$ for each~$i$. Variables and their
negations are called {\em literals}. A Boolean formula $\varphi$ is
{\em satisfiable\/} if and only if there is an assignment to the
variables in $\varphi$ that makes the formula true. A Boolean formula
$\varphi$ is in {\em conjunctive normal form\/} ({\em CNF}, for short)
if and only if $\varphi$ is of the form $\varphi(\anvecplus{x}{m}) =
\bigwedge_{i=1}^{n} \left(\bigvee_{j=1}^{k_i} \ell_{i,j} \right)$,
where the $\ell_{i,j}$ are literals over $\{\anvecplus{x}{m}\}$. The
disjunctions $\bigvee_{j=1}^{k_i} \ell_{i,j}$ of literals are called
the {\em clauses of~$\varphi$}. A Boolean formula $\varphi$ is in
{\em $k$-CNF} if and only if $\varphi$ is in CNF and each clause of
$\varphi$ contains exactly $k$ literals. Define the following two problems:
\begin{eqnarray*}
\texttt{SAT} & = & \{ \varphi \condition
\mbox{$\varphi$ is a satisfiable Boolean formula in CNF}\} \koz ; \\
\threesat & = & \{ \varphi \condition
\mbox{$\varphi$ is a satisfiable Boolean formula in $3$-CNF}\} \koz .
\end{eqnarray*}
\end{defi}
\begin{pelda}[Boolean formulas]\label{exa:Boolean-formula}
Consider the following two satisfiable Boolean formulas (see also
Exercise~\ref{ex:Boolean-formula}):
\begin{eqnarray*}
\varphi(w,x,y,z) & = & (x \vee y \vee \neg z) \wedge
(x \vee \neg y \vee \neg z)
\wedge (w \vee \neg y \vee z) \wedge (\neg w \vee \neg x \vee z); \\
\psi(w,x,y,z) & = & (\neg w \vee x \vee \neg y \vee z) \wedge
(x \vee y \vee \neg z)
\wedge (\neg w \vee y \vee z) \wedge (w \vee \neg x \vee \neg z) \koz .
\end{eqnarray*}
Here, $\varphi$ is in $3$-CNF, so $\varphi$ is in $\threesat$.
However, $\psi$ is not in $3$-CNF, since the first clause contains
four literals. Thus, $\psi$ is in $\texttt{SAT}$ but not in $\threesat$.
\end{pelda}
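The satisfiability of $\varphi$ and $\psi$ can be verified by brute force over all $2^4 = 16$ assignments. The following Python fragment (added here for illustration) encodes both formulas directly as Boolean functions:

```python
from itertools import product

def phi(w, x, y, z):
    # (x v y v -z) ^ (x v -y v -z) ^ (w v -y v z) ^ (-w v -x v z)
    return ((x or y or not z) and (x or not y or not z)
            and (w or not y or z) and (not w or not x or z))

def psi(w, x, y, z):
    # (-w v x v -y v z) ^ (x v y v -z) ^ (-w v y v z) ^ (w v -x v -z)
    return ((not w or x or not y or z) and (x or y or not z)
            and (not w or y or z) and (w or not x or not z))

def satisfying_assignments(f):
    """All 0/1 assignments (w, x, y, z) that make formula f true."""
    return [a for a in product([False, True], repeat=4) if f(*a)]

# Both formulas have at least one satisfying assignment.
print(len(satisfying_assignments(phi)) > 0)
print(len(satisfying_assignments(psi)) > 0)
```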
Theorem~\ref{thm:sat} states the above-mentioned result of Cook.
\begin{tetel}[Cook]\label{thm:sat}
The problems $\texttt{SAT}$ and $\threesat$ are $\np$-complete.
\end{tetel}
The proof that $\texttt{SAT}$ is $\np$-complete is omitted. The idea is to encode the
computation of an arbitrary NP machine $M$ running on input $x$ into a Boolean
formula $\varphi_{M, x}$ such that $\varphi_{M, x}$ is satisfiable if and only
if $M$ accepts~$x$.
$\texttt{SAT}$ is a good starting point for many other $\np$-completeness results. In
fact, in many cases it is very useful to start with its restriction
$\threesat$. To give an idea of how such proofs work, we now show that $\texttt{SAT}
\manyone \threesat$, which implies that $\threesat$ is $\np$-complete. To
this end, we need to find a reduction $r$ that transforms any given Boolean
formula $\varphi$ in CNF into another Boolean formula $\widehat{\varphi}$ in
$3$-CNF (i.e., with exactly three literals per clause) such that
\begin{eqnarray}
\label{equ:threesat-is-np-complete}
\mbox{$\varphi$ is satisfiable} & \Lolra &
\mbox{$\widehat{\varphi}$ is satisfiable} \koz .
\end{eqnarray}
Let $\varphi(\anvecplus{x}{n})$ be the given formula with clauses $C_1, C_2,
\ldots , C_m$. Construct the formula $\widehat{\varphi}$ from $\varphi$ as
follows. The variables of $\widehat{\varphi}$ are
\begin{itemize}
\item $\varphi$'s variables $\anvecplus{x}{n}$ and
\item for each clause $C_j$ of~$\varphi$, a number of additional variables
$\anvecplus{y^j}{k_j}$ as needed, where $k_j$ depends on the structure of
$C_j$ according to the case distinction below.
\end{itemize}
Now, define $\widehat{\varphi} = \widehat{C}_1 \wedge \widehat{C}_2 \wedge
\cdots \wedge \widehat{C}_m$, where each clause $\widehat{C}_j$ of
$\widehat{\varphi}$ is constructed from the clause $C_j$ of $\varphi$ as
follows. Suppose that $C_j = (z_1 \vee z_2 \vee \cdots \vee z_k)$, where each
$z_i$ is a literal over $\{\anvecplus{x}{n}\}$. Distinguish the following
four cases.
\begin{itemize}
\item If $k = 1$, define
\[
\widehat{C}_j = (z_1 \vee y^{j}_{1} \vee y^{j}_{2}) \wedge
(z_1 \vee \neg y^{j}_{1} \vee y^{j}_{2}) \wedge
(z_1 \vee y^{j}_{1} \vee \neg y^{j}_{2}) \wedge
(z_1 \vee \neg y^{j}_{1} \vee \neg y^{j}_{2}) \koz .
\]
\item If $k = 2$, define $\widehat{C}_j = (z_1 \vee z_2 \vee
y^{j}_{1}) \wedge (z_1 \vee z_2 \vee \neg y^{j}_{1})$.
\item If $k = 3$, define $\widehat{C}_j = C_j = (z_1 \vee z_2 \vee z_3)$,
i.e., the $j$th clause remains unchanged.
\item If $k \geq 4$, define
\begin{eqnarray*}
\widehat{C}_j & = & (z_1 \vee z_2 \vee y^{j}_{1}) \wedge
(\neg y^{j}_{1} \vee z_3 \vee y^{j}_{2}) \wedge
(\neg y^{j}_{2} \vee z_4 \vee y^{j}_{3}) \wedge \cdots \wedge \\ & &
(\neg y^{j}_{k-4} \vee z_{k-2} \vee y^{j}_{k-3}) \wedge
(\neg y^{j}_{k-3} \vee z_{k-1} \vee z_k) \koz .
\end{eqnarray*}
\end{itemize}
It remains to show that (a)~the reduction $r$ is polynomial-time computable,
and (b)~the equivalence~(\ref{equ:threesat-is-np-complete}) is true. Both
claims are easy to see; the details are left to the reader as
Exercise \ref{ex:threesat-is-np-complete}.
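A direct implementation may make the case distinction easier to follow. The Python sketch below (an illustration; the DIMACS-style encoding of a literal as $\pm i$ for $x_i$ or $\neg x_i$ is a choice made here) transforms a CNF formula into an equisatisfiable $3$-CNF formula, and a brute-force test allows one to check equivalence~(\ref{equ:threesat-is-np-complete}) on small instances:

```python
from itertools import count, product

def to_3cnf(clauses, num_vars):
    """Transform a CNF formula into an equisatisfiable 3-CNF formula,
    following the four-case construction.  A clause is a list of nonzero
    integers; literal +i / -i stands for x_i / not x_i."""
    fresh = count(num_vars + 1)        # supply of new variables y^j_i
    out = []
    for cl in clauses:
        k = len(cl)
        if k == 1:
            z, y1, y2 = cl[0], next(fresh), next(fresh)
            out += [[z, y1, y2], [z, -y1, y2], [z, y1, -y2], [z, -y1, -y2]]
        elif k == 2:
            y1 = next(fresh)
            out += [cl + [y1], cl + [-y1]]
        elif k == 3:
            out.append(cl)             # the clause remains unchanged
        else:                          # k >= 4: chain with k - 3 new variables
            ys = [next(fresh) for _ in range(k - 3)]
            out.append([cl[0], cl[1], ys[0]])
            for i in range(k - 4):
                out.append([-ys[i], cl[i + 2], ys[i + 1]])
            out.append([-ys[-1], cl[-2], cl[-1]])
    return out, next(fresh) - 1        # new clause list and total variable count

def satisfiable(clauses, num_vars):
    """Brute-force satisfiability test (exponential; for small checks only)."""
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(l) - 1] ^ (l < 0) for l in cl) for cl in clauses):
            return True
    return False
```

For example, $(x_1) \wedge (\neg x_1 \vee x_2 \vee x_3 \vee x_4)$ is mapped to a formula with exactly three literals per clause over additional variables, and the brute-force test confirms that satisfiability is preserved.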
Thousands of problems have been proven $\np$-complete by now. A comprehensive
collection can be found in the work of Garey and Johnson~\cite{GareyJo79}.
\begin{gyak}
\ujgyak
Find a satisfying assignment each for the Boolean formulas $\varphi$ and
$\psi$ from Example~\ref{exa:Boolean-formula}. \label{ex:Boolean-formula}
\ujgyak Show that the $\manyonetext$-reducibility is transitive:
$(A \manyone B \wedge B \manyone C) \Lora A \manyone C$. \label{ex:manyone-is-transitive}
\ujgyak Prove that $\texttt{SAT}$ is in~$\np$. \label{ex:sat-is-in-np}
\ujgyak Consider the reduction $\texttt{SAT} \manyone \threesat$. Prove the following:
\textit{a.} the reduction $r$ is polynomial-time computable, and
\textit{b.} the equivalence~(\ref{equ:threesat-is-np-complete}) holds.
\label{ex:threesat-is-np-complete}
\end{gyak}
\section{Algorithms for the satisfiability problem}\label{sec:sat}
By Theorem~\ref{thm:sat}, $\texttt{SAT}$ and $\threesat$ are both $\np$-complete.
Thus, if $\texttt{SAT}$ were in~$\p$, it would immediately follow that $\p = \np$,
which is considered unlikely. Thus, it is very unlikely that there is a
polynomial-time deterministic algorithm for $\texttt{SAT}$ or $\threesat$. But what
is the runtime of the best deterministic algorithms for them? And what about
randomised algorithms? Let us focus on the problem $\threesat$ in this
section.
\subsection{A deterministic algorithm}\label{sec:sat-deterministic}
The ``naive'' deterministic algorithm for $\threesat$ works as follows: Given
a Boolean formula $\varphi$ with $n$ variables, sequentially check the $2^n$
possible assignments to the variables of~$\varphi$. Accept as soon as a
satisfying assignment is found; otherwise reject. Obviously, this algorithm
runs in time~$\bigoh(2^n)$. Can this upper bound be improved?
Yes, it can. We will present a slightly better deterministic algorithm for
$\threesat$ that still runs in exponential time, namely in time
$\tilde{\bigoh}(1.9129^n)$, where the $\tilde{\mathcal{O}}$ notation neglects
polynomial factors as is common for exponential-time algorithms.\footnote{ The
result presented here is not the best result known, but see
Figure~\ref{tab:sat-algorithmen} on page~\pageref{tab:sat-algorithmen} for
further improvements.}
The point of such an improvement is that an $\tilde{\mathcal{O}}(c^n)$
algorithm, where $1 < c < 2$ is a constant, can deal with larger instances
than the naive $\tilde{\mathcal{O}}(2^n)$ algorithm in the same amount of time
before the exponential growth eventually makes the running time
infeasible. For example, if $c = \sqrt{2} \approx 1.414$, then
$\tilde{\bigoh}\left(\sqrt{2}^{\, 2n}\right) = \tilde{\bigoh}(2^n)$. Thus,
this algorithm can deal with inputs twice as large as the naive algorithm in
the same amount of time. Doubling the size of inputs that can be handled by
some algorithm can be quite important in practice.
The deterministic algorithm for $\threesat$ is based on a simple ``{\em
backtracking\/}'' idea. Backtracking is useful for problems whose solutions
consist of $n$ components, each of which offers more than one possible choice. For
example, a solution of $\threesat$ is composed of the $n$ truth values of a
satisfying assignment, and each such truth value can be either true
(represented by~$1$) or false (represented by~$0$).
The idea is the following. Starting from the initially empty solution (i.e.,
the empty truth assignment), we seek to construct by recursive calls to our
backtracking algorithm, step by step, a larger and larger partial solution
until eventually a complete solution is found, if one exists. In the
resulting recursion tree,\footnote{ The inner vertices of the recursion tree
represent the recursive calls of the algorithm, its root is the first call,
and the algorithm terminates at the leaves without any further recursive
call.} the root is marked by the empty solution, and the leaves are marked
with complete solutions of the problem. If the current branch in the
recursion tree is ``dead'' (which means that the subtree underneath it cannot
lead to a correct solution), one can prune this subtree and ``backtrack'' to
pursue another extension of the partial solution constructed so far. This
pruning may save time in the end.
\begin{alg}{Backtracking-SAT$(\varphi,\beta)$}\inddef{\textsc{Backtracking-SAT}}
1 \' \key{if} \= $(\beta$ assigns truth values to all variables of $\varphi)$ \\
2 \' \> \key{then} \= \key{return} $\varphi(\beta)$ \\
3 \' \> \key{else} \> \key{if} \= $(\beta$ makes one of the clauses of $\varphi$ false$)$ \\
4 \' \> \> \> \key{then} \= \key{return} 0 \\
5 \' \` $\rhd$ ``dead branch'' \\
6 \' \> \> \> \> \key{else} \= \key{if}
\= \textsc{Backtracking-SAT}$(\varphi,\beta 0)$ \\
7 \' \> \> \> \> \> \key{then} \= \key{return} 1 \\
8 \' \> \> \> \> \> \key{else} \> \key{return}
\textsc{Backtracking-SAT}$(\varphi,\beta 1)$
\end{alg}
The inputs of algorithm \textsc{Backtracking-SAT} are a Boolean formula $\varphi$ and a partial
assignment $\beta$ to some of the variables of $\varphi$. This algorithm returns a Boolean
value: 1, if the partial assignment $\beta$ can be extended to a satisfying
assignment to all variables of $\varphi,$ and 0 otherwise. Partial
assignments are here considered to be strings of length at most $n$ over the
alphabet $\{0,1\}$.
The first call of the algorithm is \textsc{Backtracking-SAT}$(\varphi,\lambda),$
where $\lambda$ denotes the empty assignment. If it turns out that the
partial assignment constructed so far makes one of the clauses of $\varphi$
false, it cannot be extended to a satisfying assignment of~$\varphi$. Thus,
the subtree underneath the corresponding vertex in the recursion tree can be
pruned; see also Exercise \ref{ex:backtrack}.
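The pseudocode can also be rendered in executable form. The Python sketch below (an illustration; encoding a literal as $\pm i$ for $x_i$ or $\neg x_i$ is a choice made here) mirrors the three cases of \textsc{Backtracking-SAT}, including the pruning of ``dead'' branches:

```python
def backtracking_sat(clauses, n, beta=()):
    """Return 1 if the partial assignment beta (truth values for the first
    len(beta) variables) extends to a satisfying assignment of the
    n-variable CNF formula given as a list of clauses, and 0 otherwise."""
    def lit_val(l):
        # Value of literal l under beta; None if its variable is unassigned.
        v = abs(l)
        if v > len(beta):
            return None
        return beta[v - 1] if l > 0 else 1 - beta[v - 1]

    if len(beta) == n:                       # beta assigns all variables
        return int(all(any(lit_val(l) == 1 for l in cl) for cl in clauses))
    if any(all(lit_val(l) == 0 for l in cl) for cl in clauses):
        return 0                             # "dead branch": some clause is false
    # Otherwise, extend beta by 0 first, then by 1 (the two recursive calls).
    return (backtracking_sat(clauses, n, beta + (0,))
            or backtracking_sat(clauses, n, beta + (1,)))
```

Calling `backtracking_sat(clauses, n)` with the empty tuple as initial assignment corresponds to the first call \textsc{Backtracking-SAT}$(\varphi,\lambda)$.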
To estimate the runtime of \textsc{Backtracking-SAT}, note that this algorithm can
be specified so as to select the variables in an ``intelligent'' order that
minimises the number of steps needed to evaluate the variables in any clause.
Consider an arbitrary, fixed clause $C_j$ of the given formula~$\varphi$.
Each satisfying assignment $\beta$ of $\varphi$ assigns truth values to the
three variables in~$C_j$. There are $2^3 = 8$ possibilities to assign a truth
value to these variables, and one of them can certainly be excluded: the
assignment that makes $C_j$ false. The corresponding vertex in the recursion
tree of \textsc{Backtracking-SAT}$(\varphi,\beta)$ thus leads to a ``dead'' branch,
so we prune the subtree underneath it.
Depending on the structure of~$\varphi$, there may exist further ``dead''
branches whose subtrees can also be pruned. However, since we are trying to
find an upper bound in the worst case, we do not consider these additional
``dead'' subtrees. It follows that
\[
\tilde{\bigoh}\left(\left(2^3 - 1\right)^{\frac{n}{3}}\right) =
\tilde{\bigoh}(\sqrt[3]{7}^{\, n}) \approx \tilde{\bigoh}(1.9129^n)
\]
is an upper bound for \textsc{Backtracking-SAT} in the worst case. This bound
slightly improves upon the trivial
$\tilde{\bigoh}(2^{n})$ upper bound of the ``naive''
algorithm for $\threesat$.
As mentioned above, the deterministic time complexity of $\threesat$ can be
improved even further. For example, Monien and
Speckenmeyer~\cite{Monien85} proposed a
divide-and-conquer algorithm with runtime~$\tilde{\bigoh}(1.618^n)$.
Dantsin et al.~\cite{Dantsin02} designed a deterministic
``local search with restart'' algorithm whose runtime is
$\tilde{\bigoh}(1.481^n)$, which was further improved by
Brueggemann and Kern~\cite{bru-ker:j:improved-local-search-threesat}
in 2004 to a $\tilde{\bigoh}(1.4726^n)$ bound.
There are also randomised algorithms with an even better runtime. One of
them, a ``random-walk'' algorithm due to
Schöning~\cite{Schoning99}, is presented now.
\subsection{A randomised algorithm}\label{sec:sat-randomized}
A {\em random walk\/} can be performed on a specific structure, such as in
Euclidean space, on an infinite grid, or on a given graph. Here we are
interested in random walks occurring on a graph that represents a certain
stochastic automaton. To describe such automata, we first introduce the
notion of a finite automaton.
A {\em finite automaton} can be represented by its transition graph, whose
vertices are the states of the finite automaton, and the transitions between
states are directed edges marked by the symbols of the alphabet~$\Sigma$. One
designated vertex is the {\em initial state\/} in which the computation of the
automaton starts. In each step of the computation, the automaton reads one
input symbol (proceeding from left to right) and moves to the next state along
the edge marked by the symbol read. Some vertices represent {\em final
states}. If such a vertex is reached after the entire input has been read,
the automaton accepts the input, and otherwise it rejects. In this way, a
finite automaton accepts a set of input strings, which is called its language.
A {\em stochastic automaton $\mathcal{S}$\/} is a finite automaton whose edges
are marked by probabilities in addition. If the edge from $u$ to $v$ in the
transition graph of $\mathcal{S}$ is marked by $p_{u,v}$, where $0 \leq
p_{u,v} \leq 1$, then $\mathcal{S}$ moves from state $u$ to state $v$ with
probability~$p_{u,v}$. The process of random transitions of a stochastic
automaton is called a {\em Markov chain\/} in probability theory. Of course,
the acceptance of strings by a stochastic automaton depends on the transition
probabilities.
\begin{algN}{Random-SAT$(\varphi)$}\inddef{\textsc{Random-SAT}}
1 \' \key{for} \= $i \leftarrow 1$ \key{to} $\left\lceil \left( 4/3 \right)^n \right\rceil$ \\
\' \` $\rhd$ $n$ is the number of variables in $\varphi$ \\
2 \' \> \key{do} \= randomly choose an assignment $\beta \in \{0,1\}^n$ \\
\' \> \> under the uniform distribution
\algnewpage
3 \' \> \key{for} \= $j \leftarrow 1$ \key{to} $n$ \\
4 \' \> \> \key{if} \= $(\varphi(\beta) = 1)$ \\
5 \' \> \> \> \key{then} \= \key{return}
the satisfying assignment $\beta$ to $\varphi$ \\
6 \' \> \> \> \key{else} \> choose a clause $C = (x \vee y \vee z)$
with $C(\beta) = 0$ \\
7 \' \> \> \> \> randomly choose a literal $\ell \in \{x,y,z\}$ \\
\' \> \> \> \> under the uniform distribution \\
8 \' \> \> \> \> determine the bit $\beta_{\ell} \in \{0,1\}$
in~$\beta$ assigning $\ell$ \\
9 \' \> \> \> \> flip $\beta_{\ell}$ to
$1-\beta_{\ell}$ in~$\beta$ \\
10 \' \key{return} ``$\varphi$ is not satisfiable''
\end{algN}
Here, we are not interested in recognising languages by stochastic automata,
but rather we will use them to describe a random walk by the randomised
algorithm \textsc{Random-SAT}. Given
a Boolean formula $\varphi$ with $n$ variables, \textsc{Random-SAT} tries to find a
satisfying assignment to $\varphi$'s variables, if one exists.
On input~$\varphi$, \textsc{Random-SAT} starts by guessing a random initial
assignment~$\beta$, where each bit of $\beta$ takes on the value $0$ and $1$
with probability~$1/2$. Suppose $\varphi$ is satisfiable. Let
$\tilde{\beta}$ be an arbitrary fixed assignment of~$\varphi$. Let $X$ be a
random variable that expresses the {\em Hamming distance between $\beta$
and~$\tilde{\beta}$}, i.e., $X$ gives the number of bits in which $\beta$
and $\tilde{\beta}$ differ. Clearly, $X$ can take on values $j \in \{0, 1,
\ldots , n\}$ and is distributed according to the binomial distribution with
parameters $n$ and~$1/2$. That is, the probability for $X = j$ is exactly
$\binom{n}{j} 2^{-n}$.
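For a small~$n$, this distribution can be confirmed by brute-force enumeration; the following throwaway Python check (our own illustration, with an arbitrarily fixed~$\tilde{\beta}$) counts the assignments at each Hamming distance:

```python
from itertools import product
from math import comb

n = 4
beta_tilde = (1, 0, 1, 1)          # an arbitrary fixed assignment

# Count, for each Hamming distance j, how many of the 2^n possible
# initial assignments beta are at distance j from beta_tilde.
counts = [0] * (n + 1)
for beta in product((0, 1), repeat=n):
    j = sum(b != t for b, t in zip(beta, beta_tilde))
    counts[j] += 1

# Pr[X = j] = C(n, j) * 2^(-n), so the counts must be exactly the
# binomial coefficients C(n, j).
assert counts == [comb(n, j) for j in range(n + 1)]
```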
\textsc{Random-SAT} now checks whether the initial assignment $\beta$ already
satisfies~$\varphi$, and if so, it accepts. Otherwise, if $\beta$ does not
satisfy~$\varphi$, there must exist a clause in $\varphi$ not satisfied
by~$\beta$. \textsc{Random-SAT} now picks any such clause, randomly chooses under
the uniform distribution some literal in this clause, and ``flips'' the
corresponding bit
in the current assignment~$\beta$. This procedure is repeated $n$
times. If the current assignment $\beta$ still does not satisfy~$\varphi$,
\textsc{Random-SAT} restarts with a new initial assignment, and repeats this entire
procedure $t$ times, where $t = \left\lceil \left( 4/3 \right)^n
\right\rceil$.
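The procedure just described can be sketched in Python as follows. This is a simplified illustration with our own naming; in particular, it always picks the first falsified clause, which the analysis does not require.

```python
import random
from math import ceil

def random_sat(phi, n, rng=random.Random(0)):
    """Random walk for 3-SAT in the spirit of Random-SAT: phi is a
    list of clauses, each a list of signed variable indices in 1..n."""
    def satisfied(clause, beta):
        return any(beta[abs(lit)] == (lit > 0) for lit in clause)

    t = ceil((4 / 3) ** n)                  # number of independent trials
    for _ in range(t):
        # random initial assignment, each bit uniform in {0, 1}
        beta = {v: rng.random() < 0.5 for v in range(1, n + 1)}
        for _ in range(n):                  # at most n bit flips per trial
            falsified = [c for c in phi if not satisfied(c, beta)]
            if not falsified:
                return beta                 # beta satisfies phi
            lit = rng.choice(falsified[0])  # uniform literal of one clause
            beta[abs(lit)] = not beta[abs(lit)]   # flip the chosen bit
        if all(satisfied(c, beta) for c in phi):
            return beta
    return None                             # "phi is not satisfiable"
```

Note that a `None` result is only probabilistically correct for satisfiable inputs, whereas for unsatisfiable inputs the answer is always correct.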
\begin{figure}[!t]
\centering
\input{figs/markov.eepic}
\caption{\label{fig:markov}
Transition graph of a stochastic automaton for describing \textsc{Random-SAT}.}
\end{figure}
Figure~\ref{fig:markov} shows a stochastic automaton $\mathcal{S},$ whose
edges are not marked by symbols but only by transition probabilities. The
computation of \textsc{Random-SAT} on input $\varphi$ can be viewed as a random
walk on $\mathcal{S}$ as follows. Starting from the initial state~$s$, which
will never be reached again later, \textsc{Random-SAT}$(\varphi)$ first moves to one
of the states $j \in \{0, 1, \ldots , n\}$ according to the binomial
distribution with parameters $n$ and~$1/2$. This is shown in the upper part
of Figure~\ref{fig:markov} for a formula with $n = 6$ variables. Reaching
such a state $j$ means that the randomly chosen initial assignment $\beta$ and
the fixed satisfying assignment $\tilde{\beta}$ have Hamming distance~$j$. As
long as $j \neq 0$, \textsc{Random-SAT}$(\varphi)$ changes one bit $\beta_{\ell}$ to
$1-\beta_{\ell}$ in the current assignment~$\beta$, searching for a satisfying
assignment in each iteration of the inner ${\tt for}$ loop. In the random
walk on $\mathcal{S}$, this corresponds to moving one step to the left to
state $j-1$ or moving one step to the right to state $j+1$, where only states
less than or equal to $n$ can be reached.
The fixed assignment $\tilde{\beta}$ satisfies~$\varphi$, so it sets at least
one literal in each clause of $\varphi$ to true. If we fix {\em exactly\/}
one of the literals satisfied by $\tilde{\beta}$ in each clause, say~$\ell$,
then \textsc{Random-SAT}$(\varphi)$ makes a step to the left if and only if $\ell$
was chosen by \textsc{Random-SAT}$(\varphi)$. Hence, the probability of moving from
state $j>0$ to state $j-1$ is 1/3, and the probability of moving from state
$j>0$ to state $j+1$ is 2/3.
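These transition probabilities can be explored numerically. The following Python sketch computes the probability of hitting state $0$ within a given number of steps; treating a right step from state $n$ as staying at~$n$ is our own simplifying assumption, which does not affect the hitting probability of state~$0$.

```python
def hit_probability(j, steps, n):
    """Probability that the walk (left with 1/3, right with 2/3,
    state 0 absorbing) reaches state 0 from state j within `steps` steps."""
    if j == 0:
        return 1.0          # already absorbed in the final state
    if steps == 0:
        return 0.0          # no steps left, state 0 not reached
    left = hit_probability(j - 1, steps - 1, n) / 3
    right = hit_probability(min(j + 1, n), steps - 1, n) * 2 / 3
    return left + right
```

For instance, `hit_probability(1, 1, 6)` is exactly $1/3$, and the probability of reaching state $0$ within $n$ steps decreases as the start state moves away from~$0$.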
If the state $j = 0$ is reached eventually after at most $n$ iterations of
this process, $\beta$ and $\tilde{\beta}$ have Hamming distance~$0$, so
$\beta$ satisfies~$\varphi$ and \textsc{Random-SAT}$(\varphi)$ returns $\beta$ and
halts accepting. Of course, one might also hit a satisfying assignment
(distinct from~$\tilde{\beta}$) in some state $j \neq 0$. But since this
would only increase the acceptance probability, we neglect this
possibility here.
If this process is repeated $n$ times unsuccessfully, then the
initial assignment $\beta$ was chosen so badly that \textsc{Random-SAT}
now dumps it, and restarts the above process
from scratch with a new initial assignment. The entire
procedure is repeated at most $t$ times, where $t = \left\lceil \left(
4/3 \right)^n \right\rceil$. If it is still unsuccessful after $t$
trials, \textsc{Random-SAT} rejects its input.
Since the probability of moving away from state $0$ to the right is larger
than the probability of moving toward $0$ to the left, one might be tempted to
think that the success probability of \textsc{Random-SAT} is very small. However,
one should not underestimate the chance of reaching a state close to~$0$
already with the initial step from~$s$. The closer to $0$ the random walk
starts, the higher is the probability of reaching $0$ by random steps to the
left or to the right.
We only give a rough sketch of estimating the success probability (assuming
that $\varphi$ is satisfiable) and the runtime of \textsc{Random-SAT}$(\varphi).$
For convenience, suppose that 3 divides $n.$ Let $p_i$ be the probability
for the event that \textsc{Random-SAT}$(\varphi)$ reaches the state $0$ within $n$
steps after the initial step, under the condition that it reaches some state
$i \leq n/3$ with the initial step from~$s$. For example, if the state $n/3$
is reached with the initial step and if no more than $n/3$ steps are done to
the right, then one can still reach the final state $0$ by a total of at most
$n$ steps. If one does more than $n/3$ steps to the right starting from
state~$n/3$, then the final state cannot be reached within $n$ steps. In
general, starting from state $i$ after the initial step, no more than
$(n-i)/2$ steps to the right may be done. As noted above, a step to
the right is done with probability~$2/3$, and a step to the left is done with
probability~$1/3$. It follows that
\begin{eqnarray}
\label{equ:randomsat-pi}
p_i & = & \binom{n}{\frac{n-i}{2}} \left( \frac{2}{3} \right)^{\frac{n-i}{2}}
\left( \frac{1}{3} \right)^{n-\frac{n-i}{2}} .
\end{eqnarray}
Now, let $q_i$ be the probability for the event that \textsc{Random-SAT}$(\varphi)$
reaches some state $i \leq n/3$ with the initial step. Clearly, we have
\begin{eqnarray}
\label{equ:randomsat-qi}
q_i & = & \binom{n}{i} \cdot 2^{-n} \koz .
\end{eqnarray}
Finally, let $p$ be the probability for the event that \textsc{Random-SAT}$(\varphi)$
reaches the final state $0$ within the inner ${\tt for}$ loop. Of course,
this event can also occur when starting from a state $j > n/3$. Thus,
\begin{eqnarray*}
p & \geq & \sum_{i=0}^{n/3} p_i \cdot q_i \koz .
\end{eqnarray*}
Approximating this sum by the entropy function and estimating the binomial
coefficients from~(\ref{equ:randomsat-pi}) and~(\ref{equ:randomsat-qi}) in the
single terms by Stirling's formula, we obtain the lower bound $\Omega(\left(
3/4 \right)^{n})$ for~$p$.
To reduce the error probability, \textsc{Random-SAT} performs a total of $t$
independent trials, each starting with a new initial assignment~$\beta$. For
each trial, the probability of success (i.e., the probability of finding a
satisfying assignment of $\varphi$, if one exists) is at
least~$\left(3/4\right)^n$, so the error is bounded by $1 -
\left(3/4\right)^n$. Since the trials are independent, these error
probabilities multiply, which gives an error of $\left(1 -
\left(3/4\right)^n\right)^{t} \leq \mbox{e}^{-1}$. Thus, the total
probability of success of \textsc{Random-SAT}$(\varphi)$ is at least $1 - 1/\mbox{e}
\approx 0.632$ if $\varphi$ is satisfiable. On the other hand,
\textsc{Random-SAT}$(\varphi)$ does not make any error at all if $\varphi$ is
unsatisfiable; in this case, the output is ``$\varphi$ is not satisfiable''.
The particular choice of this value of~$t$ can be explained as follows. The
runtime of a randomised algorithm performing independent trials, such as
\textsc{Random-SAT}$(\varphi)$, is roughly reciprocal to the success probability of
one trial, $p \approx \left(3/4\right)^n.$ In particular, the error
probability (i.e., the probability that in none of the $t$ trials a
satisfying assignment of $\varphi$ is found even though $\varphi$ is
satisfiable) can be estimated by $(1-p)^{t} \leq e^{-t\cdot p}.$ If a fixed
error bound~$\varepsilon$ is not to be exceeded, it is enough to choose $t$ such
that $e^{-t\cdot p} \leq \varepsilon$; equivalently, such that $t \geq
\ln(1/\varepsilon)/p$. Up to constant factors, this can be accomplished by
choosing $t = \left\lceil \left(4/3\right)^n \right\rceil$. Hence, the
runtime of the algorithm is in~$\tilde{\bigoh}\left((4/3)^n\right)$.
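The calculation behind this choice of $t$ can be verified numerically; the following Python check (an illustration only) confirms that $t \cdot p \geq 1$ and hence $(1-p)^{t} \leq e^{-t \cdot p} \leq 1/e$:

```python
from math import ceil, exp

# With p = (3/4)^n and t = ceil((4/3)^n) we have t*p >= 1, because
# (4/3)^n * (3/4)^n = 1.  Hence the per-run error (1-p)^t is bounded
# by e^(-t*p) <= 1/e for every n.
for n in range(1, 16):
    p = (3 / 4) ** n
    t = ceil((4 / 3) ** n)
    assert t * p >= 1
    assert (1 - p) ** t <= exp(-t * p) <= exp(-1)
```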
\begin{gyak}
\ujgyak Start the algorithm \textsc{Backtracking-SAT} for the Boolean formula
$\varphi = (\neg x \vee y \vee \neg z) \wedge (x \vee \neg y \vee z) \wedge
(\neg u \vee y \vee z) \wedge (u \vee \neg y \vee z)$ and construct step by
step a satisfying assignment of~$\varphi$. What does the resulting recursion
tree look like?\label{ex:backtrack}
\end{gyak}
\section{Graph isomorphism and lowness}\label{sec:graphisomorphie}
In this section, we need some of the group-theoretic and graph-theoretic
foundations presented in Section~\ref{sec:algebra}. In particular, recall the
notion of permutation group from Definition~\ref{def:permutationgroup} and the
graph isomorphism problem ($\graphiso$, for short) and the graph automorphism
problem ($\graphauto$, for short) from Definition~\ref{def:graphiso}; see also
Example~\ref{exa:graphiso} in Chapter~\ref{chap:cryptology}. We start by
providing some more background from complexity theory.
\subsection{Reducibilities and complexity hierarchies}
\label{sec:reducibilities-hierarchies}
In Section~\ref{sec:np-completeness}, we have seen that the problems $\texttt{SAT}$
and $\threesat$ are $\np$-complete. Clearly, $\p = \np$ if and only if {\em
every\/} $\np$ problem (including the $\np$-complete problems) is in~$\p$,
which in turn is true if and only if {\em some\/} $\np$-complete problem is
in~$\p$. So, no $\np$-complete problem can be in $\p$ if $\p \neq \np.$ An
interesting question is whether, under the plausible assumption that $\p \neq
\np$, every $\np$ problem is either in $\p$ or $\np$-complete. Or, assuming
$\p \neq \np$, can there exist $\np$ problems that are \emph{neither}
in $\p$ \emph{nor} $\np$-complete?
A result by Ladner~\cite{LadnerLy75} answers this question.
\begin{tetel}[Ladner]\label{thm:ladner}
If $\p \neq \np$ then there exist sets in $\np$ that are neither in
$\p$ nor $\np$-complete.
\end{tetel}
The problems constructed in the proof of Theorem~\ref{thm:ladner} are not
overly natural problems. However, there are also good candidates of
``natural'' problems that are neither in $\p$ nor $\np$-complete. One such
candidate is the graph isomorphism problem. To provide some evidence for this
claim, we now define two hierarchies of complexity classes, the {\em low
hierarchy\/} and the {\em high hierarchy}, both introduced by
Schöning~\cite{Schoning83}. First, we need to define the {\em polynomial
hierarchy}, which builds upon~$\np$. And to this end, we need a more
flexible reducibility than the (polynomial-time) many-one reducibility
$\manyonetext$ from Definition~\ref{def:manyone-red}, namely the {\em Turing
reducibility\/}~$\turingtext$. We will also define the (polynomial-time)
{\em nondeterministic Turing reducibility}, $\turingnptext$, and the
(polynomial-time) {\em strong nondeterministic Turing reducibility},
$\strongturingnptext$. These two reducibilities are important for the
polynomial hierarchy and for the high hierarchy, respectively. Turing
reducibilities are based on the notion of oracle Turing machines, which we now
define.
\begin{defi}[Oracle Turing machine]\label{def:oracle-machine}
An \ki{oracle set}\inddef{oracle set} is a set of strings. An {\em oracle Turing
machine\/}~$M$, with oracle~$B$, is a Turing machine that has
a special worktape, which is called the {\em oracle tape\/} or {\em query
tape}. In addition to other states, $M$ contains a special {\em query
state}, $s_{\tt{}?}$, and the two {\em answer states\/} $s_{\tt{}yes}$
and~$s_{\tt{}no}$. During a computation on some input, if $M$ is not in the
query state~$s_{\tt{}?}$, it works just like a regular Turing machine.
However, when $M$ enters the query state $s_{\tt{}?}$, it interrupts its
computation and queries its oracle about the string $q$ currently written on
the oracle tape. Imagine the oracle $B$ as some kind of ``black box'': $B$
answers the query of whether or not it contains $q$ within one step of $M$'s
computation, regardless of how difficult it is to decide the set~$B$. If $q
\in B$, then $M$ changes its current state into the new state $s_{\tt{}yes}$
and continues its computation. Otherwise, if $q \not\in B$, $M$ continues its
computation in the new state~$s_{\tt{}no}$. We say that the computation of
$M$ on input $x$ {\em is performed relative to the oracle~$B$}, and we
write~$M^{B}(x)$.
The language accepted by $M^B$ is denoted~$L(M^B)$. We say a language $L$ is
{\em represented by an oracle Turing machine $M$\/} if and only if $L =
L(M^{\emptyset})$. We say a class $\mathcal{C}$ of languages is {\em
relativizable\/} if and only if it can be represented by oracle Turing
machines relative to the empty oracle. For any relativizable class
$\mathcal{C}$ and for any oracle~$B$, define the class {\em $\mathcal{C}$
relative to $B$\/} by
\[
\mathcal{C}^B = \{ L(M^B) \condition \mbox{$M$ is an
oracle Turing machine representing some set in~$\mathcal{C}$}\} \koz .
\]
For any class $\mathcal{B}$ of oracle sets, define $\mathcal{C}^{\mathcal{B}}
= \bigcup_{B \in \mathcal{B}} \mathcal{C}^B$.
\end{defi}
Let NPOTM (respectively, DPOTM) be a shorthand for {\em nondeterministic\/}
(respectively, {\em deterministic\/}) {\em polynomial-time oracle Turing
machine}. For example, the following classes can be defined:
\begin{eqnarray*}
\np^{\scriptnp} & = & \bigcup_{B \in \scriptnp} \np^B = \{ L(M^B) \condition
\mbox{$M$ is an NPOTM and $B$ is in $\np$}\} \koz ; \\
\p^{\scriptnp} & = & \bigcup_{B \in \scriptnp} \p^B = \{ L(M^B) \condition
\mbox{$M$ is a DPOTM and $B$ is in $\np$}\} \koz .
\end{eqnarray*}
For the empty oracle set $\emptyset$, we obtain the unrelativized classes
$\np = \np^{\emptyset}$ and $\p = \p^{\emptyset}$, and we then write NPTM
instead of NPOTM and DPTM instead of DPOTM.
In particular, oracle Turing machines can be used for prefix search.
Let us consider an example.
\begin{pelda}[Prefix search by an oracle Turing machine]
\label{exa:prefix-search}
Suppose we wish to find the smallest solution of the graph isomorphism
problem, which is in~$\np$; see Definition~\ref{def:graphiso} in
Subsection~\ref{sec:algebra}. Let $G$ and $H$ be two given graphs with $n \geq
1$ vertices each. An isomorphism between $G$ and $H$ is called a {\em
solution of ``$\pair{G,H} \in \graphiso$''}. The set of isomorphisms
between $G$ and~$H$, $\iso{G,H}$,
contains all solutions of ``$\pair{G,H} \in \graphiso$''.
Our goal is to find the lexicographically smallest solution if $\pair{G,H} \in
\graphiso$; otherwise, we output the empty string $\lambda$ to indicate that
$\pair{G,H} \not\in \graphiso$. That is, we wish to compute the function $f$
defined by
$f(G,H) = \min \{\pi \condition \pi \in \iso{G,H}\}$
if $\pair{G,H} \in \graphiso$, and $f(G,H) = \lambda$
if $\pair{G,H} \not\in \graphiso$,
where the minimum is to be taken according to the lexicographic order on
$\mathfrak{S}_n$. More precisely, we view a permutation $\pi \in
\mathfrak{S}_n$ as the string $\pi(1) \pi(2) \cdots \pi(n)$ of length $n$ over
the alphabet $[n] = \{1, 2, \ldots ,n\}$, and we write $\pi < \sigma$ for
$\pi, \sigma \in \mathfrak{S}_n$ if and only if there is a $j \in [n]$ such
that $\pi(i) = \sigma(i)$ for all $i < j$ and $\pi(j) < \sigma(j)$.
From a permutation $\sigma \in \mathfrak{S}_n$, one obtains a {\em partial
permutation\/} by cancelling some pairs $(i, \sigma(i))$. A partial
permutation can be represented by a string over the alphabet $[n] \cup
\{\ast\}$, where $\ast$ indicates an undefined position. Let $k \leq n$. A
{\em prefix of length $k$ of $\sigma \in \mathfrak{S}_n$} is a partial
permutation of $\sigma$ containing each pair $(i, \sigma(i))$ with $i \leq k$,
but none of the pairs $(i, \sigma(i))$ with $i > k$. In particular, for
$k=0$, the empty string $\lambda$ is the (unique) length $0$ prefix
of~$\sigma$. For $k = n$, the total permutation $\sigma$ is the (unique)
length $n$ prefix of itself.
%
Suppose that $\pi$ is a prefix of length $k < n$ of $\sigma \in
\mathfrak{S}_n$ and that $w = i_1 i_2 \cdots i_{|w|}$ is a string over~$[n]$
of length $|w| \leq n-k$ with none of the $i_j$ occurring in~$\pi$. Then,
$\pi w$ denotes the partial permutation that extends $\pi$ by the pairs
$(k+1,i_1), (k+2,i_2), \ldots , (k+|w|,i_{|w|})$. If in addition $\sigma(k+j)
= i_j$ for $1 \leq j \leq |w|$, then $\pi w$ is also a prefix of~$\sigma$.
%
For our prefix search, given two graphs $G$ and~$H$, we define the set of
prefixes of the isomorphisms in~$\iso{G,H}$ by
\begin{eqnarray*}
\prefixiso & = & \left\{ \pair{G,H,\pi} \
\begin{array}{|l}
\mbox{$G$ and $H$ are graphs with $n$ vertices each and} \\
(\exists w \in [n]^{\ast})\ [w = i_1 i_2 \cdots
i_{n-|\pi|} \mbox{ and } \pi w \in \iso{G,H} ]
\end{array}
\right\}.
\end{eqnarray*}
Note that, for $n \geq 1$, the empty string $\lambda$ does not encode a
permutation in $\mathfrak{S}_n$. Furthermore, $\iso{G,H} = \emptyset$ if and
only if $\pair{G,H,\lambda} \not\in \prefixiso$, which in turn is true if and
only if $\pair{G,H} \not\in \graphiso$.
Starting from the empty string, we will construct, bit by bit, the smallest
isomorphism between the two given graphs (if there exists any).
Below, we present a DPOTM $N$ that, using the $\np$ set
$\prefixiso$ as its oracle, computes the function $f$ by prefix search; see
also Exercise \ref{ex:prefix-search}. Denoting the class of
functions computable in polynomial time by $\fp$, we thus have shown that $f$
is in~$\fp^{\scriptprefixiso}$. Since $\prefixiso$ is in $\np$ (see
Exercise \ref{ex:prefix-search}), it follows that $f$ is
in~$\fp^{\scriptnp}$.
\end{pelda}
\begin{alg}{\textsc{N-Pre-Iso}$(G,H)$}\inddef{\textsc{N-Pre-Iso}}
1 \' \key{if} \= $(\pair{G,H,\lambda} \not\in \prefixiso)$ \\
2 \' \> \key{then} \= \key{return} $\lambda$ \\
3 \' \> \key{else} \> $\pi \leftarrow \lambda$ \\
4 \' \> \> $j \leftarrow 0$ \\
5 \' \> \> \key{while} \= $j < n$ \` $\rhd$ $G$ and $H$ have $n$ vertices each. \\
6 \' \> \> \> \key{do} \= $i \leftarrow 1$ \\
7 \' \> \> \> \> \key{while} \= $\pair{G,H,\pi i} \not\in \prefixiso$ \\
8 \' \> \> \> \> \> \key{do} \= $i \leftarrow i+1$ \\
9 \' \> \> \> \> \> $\pi \leftarrow \pi i$ \\
10\' \> \> \> \> \> $j \leftarrow j+1$ \\
11\' \> \> \key{return} $\pi$ \\
\end{alg}
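The prefix search of \textsc{N-Pre-Iso} can be illustrated in Python, with the $\np$ oracle $\prefixiso$ replaced by a brute-force test over all permutations (our own stand-in for the black box; an actual oracle would not be computed this way):

```python
from itertools import permutations

def is_iso(G_edges, H_edges, n, pi):
    """pi is a tuple (pi(1), ..., pi(n)); check that pi maps the
    edge set of G exactly onto the edge set of H."""
    img = {frozenset((pi[u - 1], pi[v - 1])) for u, v in G_edges}
    return img == {frozenset(e) for e in H_edges}

def pre_iso(G_edges, H_edges, n, prefix):
    """Brute-force stand-in for the oracle Pre-Iso: does some
    isomorphism in Iso(G, H) extend the partial permutation prefix?"""
    return any(pi[:len(prefix)] == prefix and is_iso(G_edges, H_edges, n, pi)
               for pi in permutations(range(1, n + 1)))

def n_pre_iso(G_edges, H_edges, n):
    """Build the lexicographically smallest isomorphism digit by
    digit, querying the oracle as in N-Pre-Iso."""
    if not pre_iso(G_edges, H_edges, n, ()):
        return None                    # corresponds to returning lambda
    prefix = ()
    for _ in range(n):
        i = 1
        while not pre_iso(G_edges, H_edges, n, prefix + (i,)):
            i += 1                     # smallest extension still in Pre-Iso
        prefix = prefix + (i,)
    return prefix
```

For the path $1\mbox{--}2\mbox{--}3$ and the path with centre~$1$, the sketch returns the permutation $(2,1,3)$, which is indeed the lexicographically smallest isomorphism between the two graphs.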
Example~\ref{exa:prefix-search} shows that Turing machines computing
functions can also be equipped with an oracle, and that function classes such
as $\fp$ can also be relativized.
We now return to oracle machines accepting languages and use them to define
several reducibilities. All reducibilities considered here are
polynomial-time computable.
\begin{defi}[Turing reducibilities]\label{def:turing-reducibility}
Let $\Sigma = \{0,1\}$ be a binary alphabet, let $A$ and $B$ be sets of
strings over~$\Sigma$, and let $\mathcal{C}$ be any complexity class. The set
of complements of sets in $\mathcal{C}$ is defined by
$\co\mathcal{C} = \{\overline{L} \condition L \in \mathcal{C}\}$.
Define the following reducibilities:
\begin{itemize}
\item \ki{Turing reducibility:} $A \turing B \Lolra A = L(M^B)$ for some
DPOTM $M.$\inddef{Turing reducibility}
\item \ki{Nondeterministic Turing reducibility:} $A \turingnp B \Lolra A =
L(M^B)$ for some NPOTM~$M.$\inddef{nondeterministic Turing reducibility}
\item \ki{Strong nondeterministic Turing reducibility:} $A
\strongturingnp B \Lolra A \in \np^B \cap \conp^B.$\inddef{strong nondeterministic Turing reducibility}
\item Let $\leq_r$ be one of the reducibilities defined above. We call a set
$B$ {\em $\leq_r$-hard for~$\mathcal{C}$} if and only if $A \leq_r B$ for
each set $A \in \mathcal{C}$. A set $B$ is said to be \ki{$\leq_r$-complete in $\mathcal{C}$}
if and only if $B$ is $\leq_r$-hard for $\mathcal{C}$ and $B \in \mathcal{C}.$
\item $\p^{\mathcal{C}} = \{A \condition (\exists B \in \mathcal{C})\, [A
\turing B]\}$ is the \ki{closure of $\mathcal{C}$ under the
$\turingtext$-reducibility.}
\item $\np^{\mathcal{C}} = \{A \condition (\exists B \in \mathcal{C})\, [A
\turingnp B]\}$ is the \ki{closure of $\mathcal{C}$ under the
$\turingnptext$-reducibility.}
\end{itemize}
\end{defi}
Using the $\turingtext$-reducibility and the $\turingnptext$-reducibility
introduced in Definition~\ref{def:turing-reducibility}, we now define the
polynomial hierarchy, and the low and the high hierarchy within $\np$.
\begin{defi}[Polynomial hierarchy]\inddef{polynomial hierarchy}\label{def:ph}
Define the \ki{polynomial hierarchy} $\ph$ inductively as follows:
$\deltalevel{0} = \sigmalevel{0} = \pilevel{0} = \p$,\ \ $\deltalevel{i+1} =
\p^{\sigmalevel{i}}$,\ \ $\sigmalevel{i+1} = \np^{\sigmalevel{i}}$ and\ \
$\pilevel{i+1} = \co\sigmalevel{i+1}$ for $i \geq 0$, and
$\ph = \bigcup_{k \geq 0}
\sigmalevel{k}$.
\end{defi}
In particular, $\deltalevel{1} = \p^{\sigmalevel{0}} = \p^{\scriptp} = \p$ and
$\sigmalevel{1} = \np^{\sigmalevel{0}} = \np^{\scriptp} = \np$ and
$\pilevel{1} = \co\sigmalevel{1} = \conp$. The following theorem, which is
stated without proof, provides some properties of the polynomial hierarchy;
see Problem~\ref{ex:structure-ph}.
\begin{tetel}[Meyer and Stockmeyer\nevindex{Stockmeyer, Larry J.}]\label{thm:structure-ph}
For all $i \geq 1$, the following statements hold:
\begin{enumerate}
\item $\sigmalevel{i-1} \cup \pilevel{i-1} \seq \deltalevel{i} \seq
\sigmalevel{i} \cap \pilevel{i}$.
\item $\sigmalevel{i}$, $\pilevel{i}$, $\deltalevel{i}$, and $\ph$ are closed
under $\manyonetext$-reductions. $\deltalevel{i}$ is even closed under
$\turingtext$-reductions.
\item $\sigmalevel{i}$ contains exactly those sets $A$ for which there exist a
set $B$ in $\p$ and a polynomial $p$ such that for each $x \in \sigmastar$:
\[
x \in A \Lolra (\exists^p w_1)\, (\forall^p w_2)\, \cdots\, (\mathfrak{Q}^p
w_i)\, [\pair{x,\anvecplus{w}{i}} \in B],
\]
where the quantifiers $\exists^p$
and $\forall^p$ are polynomially length-bounded, and $\mathfrak{Q}^p =
\exists^p$ if $i$ is odd, and $\mathfrak{Q}^p = \forall^p$ if $i$ is even.
\label{thm:structure-ph:alternating-quantifiers}
\item If $\sigmalevel{i-1} = \sigmalevel{i}$ for some~$i$, then $\ph$
collapses to $\sigmalevel{i-1} = \pilevel{i-1} = \deltalevel{i} =
\sigmalevel{i} = \pilevel{i} = \cdots = \ph$.
\item If $\sigmalevel{i} = \pilevel{i}$ for some~$i$, then $\ph$ collapses
to $\sigmalevel{i} = \pilevel{i} = \deltalevel{i+1} = \sigmalevel{i+1} =
\pilevel{i+1} = \cdots = \ph$.
\item There are $\manyonetext$-complete problems for each of the classes
$\sigmalevel{i}$, $\pilevel{i}$, and $\deltalevel{i}$. In contrast, if
$\ph$ has a $\manyonetext$-complete problem, then $\ph$ collapses to a
finite level, i.e., $\ph = \sigmalevel{k} = \pilevel{k}$ for some~$k$.
\end{enumerate}
\end{tetel}
\begin{defi}[Low hierarchy and high hierarchy within NP]~
\label{def:low-high-hierarchies-in-np}
\noindent For $k \geq 0$, define the $k$th level of the
\begin{itemize}
\item {\em low hierarchy $\lowh = \bigcup_{k \geq 0} \low{k}$ in $\np$\/} by
$\low{k} = \{ L \in \np\condition \sigmalevelrel{k}{L} \seq
\sigmalevel{k}\}$;
\item {\em high hierarchy $\highh = \bigcup_{k \geq 0} \high{k}$ in $\np$\/}
by $\high{k} = \{ H \in \np\condition \sigmalevel{k+1} \seq
\sigmalevelrel{k}{H}\}$.
\end{itemize}
\end{defi}
Informally put, a set $L$ is in $\low{k}$ if and only if it is useless as an
oracle for a $\sigmalevel{k}$ computation. All information contained in $L
\in \low{k}$ can be computed by the $\sigmalevel{k}$ machine itself without
the help of an oracle. On the other hand, a set $H$ in $\high{k}$ is so rich
and provides such useful information that the computational power of a
$\sigmalevel{k}$ machine is increased by the maximum amount an $\np$ set can
provide: it ``jumps'' to the next level of the polynomial hierarchy. That is,
by using an oracle $H$ from $\high{k}$, a $\sigmalevel{k}$ machine can
simulate any $\sigmalevel{k+1}$ computation. Thus, $H$ is as useful for a
$\sigmalevel{k}$ machine as any $\np$-complete set.
For $k = 0$, the question of whether
or not $\sigmalevel{k} \neq \sigmalevel{k+1}$ is nothing other than the
$\p$-versus-$\np$ question.
%
Theorem~\ref{thm:low-high-characterization} lists some important properties of
these hierarchies, omitting the proofs; see~\cite{Schoning83} and
Exercise \ref{ex:structure-ph}. For the class $\co\am$ mentioned
in the first item of Theorem~\ref{thm:low-high-characterization}, the reader
is referred to the definition of the Arthur-Merlin hierarchy introduced in
Subsection~\ref{sec:arthur-merlin}; cf.\
Definition~\ref{def:arthur-merlin-hierarchie}. Ladner's theorem
(Theorem~\ref{thm:ladner}) is a special case (for $n = 0$) of
item~\ref{thm:low-high-characterization-item6} in
Theorem~\ref{thm:low-high-characterization}.
\begin{tetel}[Schöning]\label{thm:low-high-characterization}
\begin{enumerate}
\item $\low{0} = \p$ and $\low{1} = \np \cap \conp$ and $\np \cap \co\am \seq
\low{2}$.
\item $\high{0} = \{H \condition \mbox{$H$ is $\turingtext$-complete
in~$\np$} \}$.
\item $\high{1} = \{H \condition \mbox{$H$ is
$\strongturingnptext$-complete in~$\np$} \}$.
\item $\low{0} \seq \low{1} \seq \cdots \seq \low{k} \seq \cdots \seq \lowh
\seq \np$.
\item $\high{0} \seq \high{1} \seq \cdots \seq \high{k} \seq \cdots \seq
\highh \seq \np$.
\item For each $n \geq 0$, $\low{n} \cap \high{n}$ is nonempty if and only if
$\sigmalevel{n} = \sigmalevel{n+1} = \cdots = \ph$.
\item \label{thm:low-high-characterization-item6} For each $n \geq 0$, $\np$
contains sets that are neither in $\low{n}$ nor in $\high{n}$ if and only if
$\sigmalevel{n} \neq \sigmalevel{n+1}$.
\item There exist sets in $\np$ that are neither in $\lowh$ nor in $\highh$
if and only if $\ph$ is a strictly infinite hierarchy, i.e., if and only if
$\ph$ does not collapse to a finite level.
\end{enumerate}
\end{tetel}
\subsection{Graph isomorphism is in the low hierarchy}
We now turn to the result that the graph isomorphism problem ($\graphiso$) is
in $\low{2}$, the second level of the low hierarchy. This result provides
strong evidence against the $\np$-completeness of~$\graphiso$, as can be seen
as follows. If $\graphiso$ were $\np$-complete, then $\graphiso$ would be in
$\high{0} \seq \high{2}$, since by Theorem~\ref{thm:low-high-characterization}
$\high{0}$ contains exactly the $\turingtext$-complete $\np$ sets and in
particular the $\manyonetext$-complete sets in~$\np$. Again by
Theorem~\ref{thm:low-high-characterization}, we have that $\low{2} \cap
\high{2}$ is nonempty if and only if $\ph$ collapses to~$\sigmalevel{2}$,
which is considered very unlikely.
To prove the lowness of the graph isomorphism problem, we first need a
technical prerequisite, the so-called hashing lemma, stated here as
Lemma~\ref{lem:hashing}. Hashing is a method widely used in computer science
for dynamic data management. The idea is the following. Assign to every data
set some (short) key in a unique manner. The set of all potential keys,
called the universe~$U$, is usually very large. In contrast, the set $V \seq
U$ of those keys actually used is often much smaller. A {\em hash
function\/} $h : U \rightarrow T$ maps the elements of $U$ to the {\em
hash table\/} $T = \{0, 1, \ldots , k-1\}$. Hash functions are
many-to-one mappings. That is, distinct keys from $U$ can be mapped to the
same address in~$T$. However, if possible, one wishes to map two distinct
keys from $V$ to distinct addresses in~$T$. That is, one seeks to avoid
collisions on the set of keys actually used. If possible, a hash function
thus should be injective on~$V$.
Among the various known hashing techniques, we are here interested in {\em
universal hashing}, which was introduced by Carter and
Wegman~\cite{Carter79} in 1979. The idea is to randomly
choose a hash function from a suitable family of hash functions. This method
is universal in the sense that it no longer depends on a particular set
$V$ of keys actually used. Rather, it seeks to avoid collisions with high
probability on {\em all\/} sufficiently small sets~$V$. The probability here
is with respect to the random choice of the hash function.
In what follows, we assume that keys are encoded as strings over the alphabet
$\Sigma = \{0,1\}$. The set of all length $n$ strings in $\sigmastar$ is
denoted by~$\Sigma^{n}$.
\begin{defi}[Hashing]\label{def:hashing}
Let $\Sigma = \{0,1\}$, and let $m$ and $t$ be positive integers with $t > m$.
A {\em hash function\/} $h : \Sigma^{t} \rightarrow \Sigma^{m}$ is a linear
mapping determined by a Boolean $t {\times} m$ matrix $B_h = (b_{i,j})_{i,j}$,
where $b_{i,j} \in \{0,1\}$. For $x \in \Sigma^{t}$ and $1 \leq j \leq m$,
the $j$th bit of $y = h(x)$ in $\Sigma^{m}$ is given by $y_j = (b_{1,j} \wedge
x_1) \oplus (b_{2,j} \wedge x_2) \oplus \cdots \oplus (b_{t,j} \wedge x_t)$,
where $\oplus$ denotes the logical {\em exclusive-or\/} operation,
i.e.,
\[
a_1 \oplus a_2 \oplus \cdots \oplus a_n = 1 \Lolra |\{ i \condition a_i = 1\}|
\equiv 1 \bmod 2 \koz .
\]
Let $\mathcal{H}_{t,m}$ be a
family of hash functions for the parameters $t$ and~$m$:
\[
\mathcal{H}_{t,m} = \{h : \Sigma^{t} \rightarrow \Sigma^{m}
\condition \mbox{$B_h$ is a Boolean $t {\times} m$ matrix}\} \koz .
\]
On~$\mathcal{H}_{t,m}$, we assume the uniform distribution: A hash function
$h$ is chosen from $\mathcal{H}_{t,m}$ by picking the bits $b_{i,j}$ in $B_h$
independently and uniformly at random.
Let $V \seq \Sigma^{t}$. For any subfamily $\widehat{\mathcal{H}}$ of
$\mathcal{H}_{t,m}$, we say there is a {\em collision on $V$\/} if
\[
(\exists \vec{v} \in V)\, (\forall h \in \widehat{\mathcal{H}})\,
(\exists \vec{x} \in V)\, [\vec{v} \neq \vec{x} \wedge h(\vec{v}) =
h(\vec{x})] \koz .
\]
Otherwise, $\widehat{\mathcal{H}}$ is said to be {\em collision-free on $V$}.
\end{defi}
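For concreteness, the bitwise definition of $h$ can be sketched in a few lines of Python (an illustration added here; the function names are ours, not part of the text). The $j$th output bit is the parity of the number of positions $i$ with $b_{i,j} = x_i = 1$, so $h$ is linear over GF(2):

```python
import random

def random_hash(t, m, rng):
    """Choose h from H_{t,m}: pick every entry of the Boolean t x m
    matrix B_h independently and uniformly at random."""
    return [[rng.randint(0, 1) for _ in range(m)] for _ in range(t)]

def apply_hash(B, x):
    """Evaluate y = h(x): the j-th bit of y is the exclusive-or over all i
    of B[i][j] AND x[i], i.e. the parity of |{i : B[i][j] = x[i] = 1}|."""
    t, m = len(B), len(B[0])
    return tuple(sum(B[i][j] & x[i] for i in range(t)) % 2 for j in range(m))

rng = random.Random(42)
B = random_hash(5, 3, rng)        # a random h : {0,1}^5 -> {0,1}^3
y = apply_hash(B, (1, 0, 1, 1, 0))
```

Linearity, $h(x \oplus x') = h(x) \oplus h(x')$, is the reason the collision analysis below is tractable: a collision $h(v) = h(x)$ occurs exactly when $v \oplus x$ lies in the kernel of~$B_h$.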
A collision on $V$ means that none of the hash functions in a subfamily
$\widehat{\mathcal{H}}$ is injective on~$V$. The following lemma says that,
if $V$ is sufficiently small, a randomly chosen subfamily of
$\mathcal{H}_{t,m}$ is collision-free on $V$ with high probability. In contrast, if $V$ is too large,
collisions cannot be avoided. The proof of Lemma~\ref{lem:hashing} is
omitted.
\begin{lemma}[Hashing lemma]\label{lem:hashing}
Let $t, m \in \N$ be fixed parameters, where $t > m$. Let $V \seq
\Sigma^{t}$ and let $\widehat{\mathcal{H}} = (h_1, h_2, \ldots , h_{m+1})$ be
a family of hash functions randomly chosen from $\mathcal{H}_{t,m}$ under the
uniform distribution.
Let
\[
C(V) = \{\widehat{\mathcal{H}} \condition (\exists v \in V)\, (\forall h \in
\widehat{\mathcal{H}})\, (\exists x \in V)\, [v \neq x \wedge h(v) = h(x)]\}
\]
be the event that for $\widehat{\mathcal{H}}$ a collision occurs on~$V$.
Then, the following two statements hold:
\begin{enumerate}
\item If $|V| \leq 2^{m-1}$, then $C(V)$ occurs with probability at most $1/4$.
\item If $|V| > (m+1)2^{m}$, then $C(V)$ occurs with probability~$1$.
\end{enumerate}
\end{lemma}
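The first statement of the hashing lemma can also be checked empirically for tiny parameters. The following Python sketch (ours; the parameters $t = 6$ and $m = 3$ are an arbitrary small choice) estimates the probability of the event $C(V)$ over many random families $\widehat{\mathcal{H}} = (h_1, \ldots, h_{m+1})$:

```python
import random

def make_h(t, m, rng):
    """A random hash function from H_{t,m}, given by a Boolean t x m matrix."""
    B = [[rng.randint(0, 1) for _ in range(m)] for _ in range(t)]
    return lambda x: tuple(sum(B[i][j] & x[i] for i in range(t)) % 2
                           for j in range(m))

def has_collision(family, V):
    """The event C(V): some v in V collides with another x in V under
    every hash function of the family."""
    return any(all(any(h(v) == h(x) for x in V if x != v) for h in family)
               for v in V)

t, m = 6, 3
rng = random.Random(0)
V = list({tuple(rng.randint(0, 1) for _ in range(t))
          for _ in range(2 ** (m - 1))})         # |V| <= 2^(m-1)
trials = 2000
hits = sum(has_collision([make_h(t, m, rng) for _ in range(m + 1)], V)
           for _ in range(trials))
estimate = hits / trials   # by statement 1, the true probability is <= 1/4
```

In this small experiment the estimate stays well below the bound of $1/4$ from statement 1, as expected.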
In Section~\ref{sec:zero-knowledge}, the Arthur-Merlin hierarchy has been
defined, and it was mentioned that this hierarchy collapses to its second
level. Here, we are interested in the class $\co\am$, cf.\
Definition~\ref{def:arthur-merlin-hierarchie} in
Subsection~\ref{sec:arthur-merlin}.
\begin{tetel}[Schöning]
\label{thm:graphiso-is-in-coam}
$\graphiso$ is in $\low{2}$.
\end{tetel}
\begin{biz}
By Theorem~\ref{thm:low-high-characterization}, every $\np \cap \co\am$ set
is in~$\low{2}$. Thus, to prove that $\graphiso$ is in $\low{2}$, it is enough
to show that $\graphiso$ is in $\co\am$. Let $G$ and $H$ be two graphs with
$n$ vertices each. We wish to apply Lemma~\ref{lem:hashing}. A first idea
is to use
\begin{eqnarray*}
A(G,H) & = &
\{\pair{F,\varphi} \condition \mbox{$F \cong G \wedge \varphi \in \auto{F}$}\}
\cup
\{\pair{F,\varphi} \condition \mbox{$F \cong H \wedge \varphi \in \auto{F}$}\}
\end{eqnarray*}
as the set $V$ from that lemma. By Lemma~\ref{lem:graphiso-pair}, we have
$|A(G,H)| = n!$ if $G \cong H$, and $|A(G,H)| = 2n!$ if $G \not\cong H$.
The $\co\am$ machine we wish to construct for $\graphiso$ is polynomial-time
bounded. Thus, the parameters $t$ and $m$ from the hashing lemma must be
polynomial in~$n$. So, to apply Lemma~\ref{lem:hashing}, we would have to
choose a polynomial $m = m(n)$ such that
\begin{equation}
\label{equ:graphiso-is-in-coam}
n! \leq 2^{m-1} < (m+1) 2^{m} < 2 n! \koz .
\end{equation}
This would guarantee that the set $V = A(G,H)$ would be large enough to tell
two isomorphic graphs $G$ and $H$ apart from two nonisomorphic graphs $G$ and
$H$. Unfortunately, it is not possible to find a polynomial $m$ that
satisfies~(\ref{equ:graphiso-is-in-coam}). Thus, we define a different
set~$V$, which yields a gap large enough to distinguish isomorphic graphs from
nonisomorphic graphs.
Define $V = A(G,H)^n = \underbrace{A(G,H) \times A(G,H) \times \cdots
\times A(G,H)}_{\mbox{\scriptsize $n$ times}}$. Now,
(\ref{equ:graphiso-is-in-coam}) changes to
\begin{equation}
\label{equ:graphiso-is-in-coam-neu}
(n!)^n \leq 2^{m-1} < (m+1) 2^{m} < (2 n!)^n \koz ,
\end{equation}
and this inequality can be satisfied by choosing $m = m(n) = 1 + \lceil n \log
n! \rceil$, which is polynomial in~$n$, as desired.
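The arithmetic behind this choice of $m$ can be verified directly. The following Python sketch (added for illustration; the helper names are ours) computes $m = 1 + \lceil n \log n! \rceil$ with exact integer arithmetic, taking logarithms to base $2$, and checks inequality~(\ref{equ:graphiso-is-in-coam-neu}) for some small values of~$n$:

```python
from math import factorial

def m_of(n):
    """m = 1 + ceil(n * log2(n!)), computed exactly with integers:
    n * log2(n!) = log2((n!)^n), and ceil(log2(N)) = (N - 1).bit_length()."""
    return 1 + (factorial(n) ** n - 1).bit_length()

def inequality_holds(n):
    """Check (n!)^n <= 2^(m-1) < (m+1) * 2^m < (2 n!)^n."""
    fac_pow, m = factorial(n) ** n, m_of(n)
    return fac_pow <= 2 ** (m - 1) < (m + 1) * 2 ** m < (2 * factorial(n)) ** n
```

For instance, `inequality_holds(n)` is true for each $n$ with $10 \leq n \leq 14$; for very small $n$ (e.g., $n = 5$) the rightmost inequality is still violated, which does not affect the argument, since it is needed only for all sufficiently large~$n$ (the finitely many small instances can be decided by brute force).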
Construct a $\co\am$ machine $M$ for~$\graphiso$ as follows. Given the graphs
$G$ and $H$ each having $n$ vertices, $M$ first computes the parameter~$m$.
The set $V = A(G,H)^n$ contains $n$-tuples of pairs each having the form
$\pair{F,\varphi}$, where $F$ is a graph with $n$ vertices, and where
$\varphi$ is a permutation in the automorphism group~$\auto{F}$. The elements
of $V$ can be encoded as strings of length $t$ over the alphabet $\Sigma =
\{0,1\}$, for a suitable polynomial $t = t(n)$. All
computations performed so far are deterministic.
Then, $M$ performs Arthur's probabilistic move by randomly choosing a family
$\widehat{\mathcal{H}} = (h_1, h_2, \ldots , h_{m+1})$ of hash functions from
$\mathcal{H}_{t,m}$ under the uniform distribution. Each hash function $h_i
\in \widehat{\mathcal{H}}$ is represented by a
Boolean $t {\times} m$ matrix. Thus, the $m+1$ hash functions
$h_i$ in $\widehat{\mathcal{H}}$ can be represented as a string
$z_{\widehat{\mathcal{H}}} \in \sigmastar$ of length $p(n)$ for a suitable
polynomial~$p$. Modify the collision predicate $C(V)$ defined in the hashing
lemma as follows:
\[
B = \{\pair{G,H,z_{\widehat{\mathcal{H}}}} \condition (\exists v \in V)\,
(\forall i : 1 \leq i \leq m+1)\, (\exists x \in V)\, [v \neq x \wedge
h_i(v) = h_i(x)]\} \koz .
\]
Note that the $\forall$ quantifier in $B$ ranges over only polynomially many
$i$ and can thus be evaluated in deterministic polynomial time. It follows
that the two $\exists$ quantifiers in $B$ can be merged into a {\em single\/}
polynomially length-bounded $\exists$ quantifier. By
Theorem~\ref{thm:structure-ph}, $B$ is a set in $\sigmalevel{1} = \np$. Let
$N$ be an NPTM for~$B$. For the string~$z_{\widehat{\mathcal{H}}}$ that
encodes $m+1$ randomly picked hash functions from~ $\mathcal{H}_{t,m}$, $M$
now simulates the computation of~$N(G,H,z_{\widehat{\mathcal{H}}})$. This
corresponds to Merlin's move. Finally, $M$ accepts its input $\pair{G,H}$ if
and only if $N(G,H,z_{\widehat{\mathcal{H}}})$ accepts.
We now estimate the probability (taken over the random choice of the hash
functions in~$z_{\widehat{\mathcal{H}}}$) that $M$ accepts its input
$\pair{G,H}$. If $G$ and $H$ are isomorphic, then $|A(G,H)| = n!$ by
Lemma~\ref{lem:graphiso-pair}. Inequality~(\ref{equ:graphiso-is-in-coam-neu})
implies $|V| = (n!)^n \leq 2^{m-1}$. By Lemma~\ref{lem:hashing}, the
probability that $\pair{G,H,z_{\widehat{\mathcal{H}}}}$ is in $B$
(and that $M(G,H)$ thus accepts)
is at most~$1/4$. However, if $G$ and $H$ are nonisomorphic,
Lemma~\ref{lem:graphiso-pair} implies that $|A(G,H)| = 2n!$.
Inequality~(\ref{equ:graphiso-is-in-coam-neu}) now gives $|V| = (2n!)^n >
(m+1)2^{m}$. By Lemma~\ref{lem:hashing}, the probability that
$\pair{G,H,z_{\widehat{\mathcal{H}}}}$ is in $B$ and $M(G,H)$ thus accepts
is~$1$. It follows that $\graphiso$ is in $\co\am$ as desired.
\end{biz}
\subsection{Graph isomorphism is in SPP}
The probabilistic complexity class $\rp$ was introduced in
Definition~\ref{def:rp} in Subsection~\ref{sec:rsa}. In this section, two other
probabilistic complexity classes are important that we will now define: $\pp$
and~$\spp$, which stand for {\em P\/}robabilistic {\em P\/}olynomial Time and
{\em S\/}toic {\em P\/}robabilistic {\em P\/}olynomial Time, respectively.
\begin{defi}[PP and SPP]\label{def:pp-spp}
The class $\pp$ contains exactly those problems $A$ for which there exists an
NPTM $M$ such that for each input~$x$: If $x \in A$ then $M(x)$ accepts with
probability at least~$1/2$, and if $x \not\in A$ then $M(x)$ accepts with
probability less than~$1/2$.
For any NPTM $M$ running on input~$x$, let $\acc{M}(x)$ denote the number of
accepting computation paths of~$M(x)$ and let $\rej{M}(x)$ denote the number
of rejecting computation paths of~$M(x)$. Define $\gap{M}(x) = \acc{M}(x) -
\rej{M}(x)$.
The class $\spp$ contains exactly those problems $A$ for which there exists an
NPTM $M$ such that for each input~$x$: $(x \in A \Lora \gap{M}(x) = 1)$ and
$(x \not\in A \Lora \gap{M}(x) = 0)$.
\end{defi}
In other words, an $\spp$ machine is ``stoic'' in the sense that its ``gap''
(i.e., the difference between its accepting and rejecting computation paths)
can take on only two out of an exponential number of possible values, namely
$1$ and~$0$. Unlike~$\pp$, $\spp$ is a so-called ``{\em promise class},''
since an $\spp$ machine $M$ ``promises'' that $\gap{M}(x) \in \{0,1\}$ for
each~$x$.
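A standard trick (not spelled out in the text) obtains such a stoic machine from an NPTM with at most one accepting path by path balancing: every non-witness guess is split into one accepting and one rejecting path, contributing $0$ to the gap. The following Python sketch (our own toy simulation of the path tree) illustrates this:

```python
def stoic_paths(witness_test, n_bits):
    """Simulate the path tree of an NPTM: every guess w that is a witness
    contributes one accepting path; every other guess is split into one
    accepting and one rejecting path, i.e. a gap contribution of 0."""
    paths = []
    for w in range(2 ** n_bits):
        if witness_test(w):
            paths.append(+1)           # accepting path
        else:
            paths.extend([+1, -1])     # balanced pair of paths
    return paths

def gap(paths):
    """gap_M(x) = acc_M(x) - rej_M(x)."""
    return sum(paths)
```

If `witness_test` accepts at most one of the $2^{n\_bits}$ guesses, the resulting gap is $1$ or $0$, exactly as Definition~\ref{def:pp-spp} demands; this is the classic argument behind $\up \seq \spp$.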
The notion of lowness can be defined for any relativizable complexity
class~$\mathcal{C}$: A set $A$ is said to be {\em $\mathcal{C}$-low} if and
only if $\mathcal{C}^A = \mathcal{C}$. In particular, for each~$k$, the $k$th
level $\low{k}$ of the low hierarchy within $\np$ (see
Definition~\ref{def:low-high-hierarchies-in-np}) contains exactly the $\np$
sets that are $\sigmalevel{k}$-low. It is known that all sets in $\spp$ are
$\pp$-low. This and other useful properties of $\spp$ are listed in the
following theorem without proof; see
also~\cite{Fenner94,Koebler92a,Koebler92b}.
\begin{tetel}
\label{thm:spp-properties}
\begin{enumerate}
\item $\spp$ is $\pp$-low, i.e., $\pp^{\scriptspp} = \pp$.
\item $\spp$ is self-low, i.e., $\spp^{\scriptspp} = \spp$.
\item Let $A$ be a set in $\np$ via some NPTM~$N$ and let $L$ be a set in
$\spp^A$ via some NPOTM~$M$ such that, for each input~$x$, $M^{A}(x)$ asks
only queries $q$ satisfying $\acc{N}(q) \leq 1$. Then, $L$ is in~$\spp$.
\item Let $A$ be a set in $\np$ via some NPTM~$N$ and let $f$ be a function in
$\fp^A$ via some DPOTM~$M$ such that, for each input~$x$, $M^{A}(x)$ asks
only queries $q$ satisfying $\acc{N}(q) \leq 1$. Then, $f$ is
in~$\fp^{\scriptspp}$.
\label{thm:spp-properties-fp-tothe-spp}
\end{enumerate}
\end{tetel}
The following theorem says that the lexicographically smallest permutation in
a right coset (see Definition~\ref{def:permutationgroup} in
Section~\ref{sec:algebra}) can be determined efficiently. The lexicographic
order on $\mathfrak{S}_n$ is defined in Example~\ref{exa:prefix-search}.
\begin{tetel}\label{thm:least-element-in-right-coset}
Let $\mathfrak{G} \leq \mathfrak{S}_n$ be a permutation group with
$\mathfrak{G} = \langle G \rangle$ and let $\pi$ be a permutation
in~$\mathfrak{S}_n$. There is a polynomial-time algorithm that, given
$\pair{G, \pi}$, computes the lexicographically smallest permutation in the
right coset $\mathfrak{G} \pi$ of $\mathfrak{G}$ in~$\mathfrak{S}_n$.
\end{tetel}
\begin{biz}
% Figure~\ref{fig:least-element-in-right-coset} shows the
We now state our
algorithm ${\LERC}$ for computing the lexicographically smallest permutation
in the right coset $\mathfrak{G} \pi$ of $\mathfrak{G}$ in~$\mathfrak{S}_n$,
where the permutation group $\mathfrak{G}$ is given by a generator~$G$, see
Definition~\ref{def:permutationgroup} in Subsection~\ref{sec:algebra}.
\begin{alg}{LERC$(G,\pi)$}\inddef{\textsc{LERC}}
1 \' compute the tower $\mathfrak{G}^{(n)} \leq
\mathfrak{G}^{(n-1)} \leq \cdots \leq \mathfrak{G}^{(1)} \leq
\mathfrak{G}^{(0)}$ of stabilisers in $\mathfrak{G}$ \\
2 \' \> $\varphi_0 \leftarrow \pi$ \\
3 \' \> \key{for} \= $i \leftarrow 0$ \key{to} $n-1$ \\
4 \' \> \> \key{do} \= $x \leftarrow i+1$ \\
5 \' \> \> \> compute the element $y$ in the orbit $\mathfrak{G}^{(i)}(x)$ for which $\varphi_i(y)$ \\
\' \> \> \> is minimum \\
6 \' \> \> \> compute a permutation $\tau_i$ in $\mathfrak{G}^{(i)}$
such that $\tau_i (x) = y$ \\
7 \' \> \> \> $\varphi_{i+1} \leftarrow \tau_i \varphi_i$\\
8 \' \> \key{return} $\varphi_n$ \\
\end{alg}
By Theorem~\ref{thm:permutationgroup}, the tower $\boldid = \mathfrak{G}^{(n)}
\leq \mathfrak{G}^{(n-1)} \leq \cdots \leq \mathfrak{G}^{(1)} \leq
\mathfrak{G}^{(0)} = \mathfrak{G}$ of stabilisers of $\mathfrak{G}$ can be
computed in polynomial time. More precisely, for each $i$ with $1 \leq i \leq
n$, the complete right transversals $T_i$ of $\mathfrak{G}^{(i)}$
in~$\mathfrak{G}^{(i-1)}$ are determined, and thus a strong generator $S =
\bigcup_{i =1}^{n-1} T_i$ of~$\mathfrak{G}$ is obtained.
Note that $\varphi_0 = \pi$ and $\mathfrak{G}^{(n-1)} = \mathfrak{G}^{(n)} =
\boldid$. Thus, to prove that the algorithm works correctly, it is enough to
show that for each $i$ with $0 \leq i \leq n-1$, the lexicographically
smallest permutation of $\mathfrak{G}^{(i)} \varphi_i$ is contained in
$\mathfrak{G}^{(i+1)} \varphi_{i+1}$. By induction, it follows that
$\mathfrak{G}^{(n)} \varphi_n = \{\varphi_n\}$ also contains the
lexicographically smallest permutation of $\mathfrak{G} \pi =
$\mathfrak{G}^{(0)} \varphi_0$. Thus, algorithm ${\LERC}$ indeed outputs the
lexicographically smallest permutation $\varphi_n$ of~$\mathfrak{G} \pi$.
To prove the above claim, let us denote the orbit of an element $x \in [n]$ in
a permutation group $\mathfrak{H} \leq \mathfrak{S}_n$ by~$\mathfrak{H}(x)$.
Let $\tau_i$ be the permutation in~$\mathfrak{G}^{(i)}$ that maps $i+1$ onto
the element $y$ in the orbit $\mathfrak{G}^{(i)}(i+1)$ for which
$\varphi_{i}(y) = x$ is the minimal element in the set $\{\varphi_{i}(z)
\condition z \in \mathfrak{G}^{(i)}(i+1)\}$.
By Theorem~\ref{thm:permutationgroup}, the orbit $\mathfrak{G}^{(i)}(i+1)$ can
be computed in polynomial time. Since $\mathfrak{G}^{(i)}(i+1)$ contains at
most $n-i$ elements, $y$ can be determined efficiently. Our
algorithm ensures that $\varphi_{i+1} = \tau_i \varphi_i$. Since every
permutation in $\mathfrak{G}^{(i)}$ maps each element of $[i]$ onto itself,
and since $\tau_i \in \mathfrak{G}^{(i)}$, it follows that for each $j$ with
$1 \leq j \leq i$, for each $\tau \in \mathfrak{G}^{(i)}$, and for each
$\sigma \in \mathfrak{G}^{(i+1)}$,
\[
(\sigma \varphi_{i+1})(j) = \varphi_{i+1}(j) = (\tau_i \varphi_i)(j) =
\varphi_i(j) = (\tau \varphi_i)(j) \koz .
\]
In particular, it follows for the lexicographically smallest permutation $\mu$
in $\mathfrak{G}^{(i)} \varphi_i$ that every permutation from
$\mathfrak{G}^{(i+1)} \varphi_{i+1}$ must coincide with $\mu$ on the first
$i$ elements, i.e., on~$[i]$.
Moreover, for each $\sigma \in \mathfrak{G}^{(i+1)}$ and for the
element $x = \varphi_{i}(y)$ defined above, we have
\[
(\sigma \varphi_{i+1})(i+1) = \varphi_{i+1}(i+1) = (\tau_i \varphi_i)(i+1) = x \koz .
\]
Clearly, $\mathfrak{G}^{(i+1)} \varphi_{i+1} = \{ \varphi \in
\mathfrak{G}^{(i)} \varphi_{i} \condition \varphi(i+1) = x\}$. The claim now
follows from the fact that $\mu(i+1) = x$ for the lexicographically smallest
permutation $\mu$ of $\mathfrak{G}^{(i)} \varphi_i$.
Thus, ${\LERC}$ is a correct algorithm. It is easy to see that it is also
efficient.
\end{biz}
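As an illustration of what ${\LERC}$ computes, the following Python sketch (our own; it replaces the polynomial-time stabiliser tower of Theorem~\ref{thm:permutationgroup} by brute-force enumeration of $\langle G \rangle$, so it is practical only for tiny groups) follows the same greedy structure:

```python
def compose(p, q):
    """The product pq in the convention of the proof above: apply p first,
    then q, so (pq)(j) = q(p(j)); permutations are 0-indexed tuples."""
    return tuple(q[p[j]] for j in range(len(p)))

def closure(gens, n):
    """All elements of <gens> by breadth-first closure (exponential time,
    a stand-in for the polynomial-time stabiliser-tower machinery)."""
    ident = tuple(range(n))
    group, frontier = {ident}, [ident]
    while frontier:
        nxt = []
        for g in frontier:
            for s in gens:
                h = compose(g, s)
                if h not in group:
                    group.add(h)
                    nxt.append(h)
        frontier = nxt
    return group

def lerc(gens, pi):
    """Greedy computation of the lexicographically smallest element of the
    right coset <gens> pi, following the structure of LERC."""
    n = len(pi)
    stab = closure(gens, n)    # G^(0); shrinks to G^(1), G^(2), ...
    phi = pi
    for i in range(n):
        orbit = {g[i] for g in stab}               # orbit of point i+1
        y = min(orbit, key=lambda z: phi[z])       # minimise phi_i(y)
        tau = next(g for g in stab if g[i] == y)   # some tau_i mapping i+1 to y
        phi = compose(tau, phi)                    # phi_{i+1} = tau_i phi_i
        stab = {g for g in stab if g[i] == i}      # descend in the tower
    return phi
```

On any small example one can confirm that `lerc(gens, pi)` equals `min(compose(g, pi) for g in closure(gens, len(pi)))`, where `min` on tuples is exactly the lexicographic order.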
Theorem~\ref{thm:least-element-in-right-coset} can easily be extended to
Corollary~\ref{cor:least-element-in-right-coset}, see
Exercise \ref{ex:least-element-in-right-coset}.
\begin{kov}\label{cor:least-element-in-right-coset}
Let $\mathfrak{G} \leq \mathfrak{S}_n$ be a permutation group with
$\mathfrak{G} = \langle G \rangle$, and let $\pi$ and $\psi$ be two given
permutations in~$\mathfrak{S}_n$. There exists a polynomial-time algorithm
that, given $\pair{G, \pi, \psi}$, computes the lexicographically smallest
permutation of~$\psi \mathfrak{G} \pi$.
\end{kov}
We now prove Theorem~\ref{thm:graphiso-is-in-spp}, the main result of this
section.
\begin{tetel}[Arvind and Kurur]\label{thm:graphiso-is-in-spp}
$\graphiso$ is in~$\spp$.
\end{tetel}
\begin{biz}
Define the (functional) problem $\genauto$ as follows: Given a graph~$G$,
compute a strong generator of the automorphism group~$\auto{G}$; see
Definition~\ref{def:permutationgroup} and the subsequent paragraph and
Definition~\ref{def:graphiso} for these notions. By
Mathon's~\cite{Mathon79}\nevindex{Mathon, R.} result, the problems $\genauto$ and
$\graphiso$ are Turing-equivalent (see also~\cite{Koebler92b}),
i.e., $\genauto$ is in $\fp^{\graphiso}$ and $\graphiso$ is
in~$\p^{\genauto}$. Thus, it is enough to show that $\genauto$ is in
$\fp^{\scriptspp}$ because the self-lowness of $\spp$ stated in
Theorem~\ref{thm:spp-properties} implies that $\graphiso$ is in
$\p^{\genauto} \seq \spp^{\scriptspp} \seq \spp$, which will
complete the proof.
So, our goal is to find an $\fp^{\scriptspp}$ algorithm for $\genauto$.
Given a graph~$G$, this algorithm has to compute a strong generator
$S = \bigcup_{i=1}^{n-1} T_i$ for $\auto{G}$, where
\[
\boldid = \auto{G}^{(n)} \leq \auto{G}^{(n-1)} \leq \cdots \leq
\auto{G}^{(1)} \leq \auto{G}^{(0)} = \auto{G}
\]
is the tower of stabilisers of $\auto{G}$ and $T_i$, $1 \leq i \leq n$, is a
complete right transversal of $\auto{G}^{(i)}$ in~$\auto{G}^{(i-1)}$.
Starting with the trivial case, $\auto{G}^{(n)} = \boldid$, we build step by
step a strong generator for $\auto{G}^{(i)}$, where $i$ is decreasing.
Eventually, we will thus obtain a strong generator for $\auto{G}^{(0)} =
\auto{G}$. So suppose a strong generator $S_i = \bigcup_{j=i}^{n-1} T_j$ for
$\auto{G}^{(i)}$ has already been found. We now describe how to determine a
complete right transversal $T_{i-1}$ of $\auto{G}^{(i)}$ in~$\auto{G}^{(i-1)}$
by our $\fp^{\scriptspp}$ algorithm. Define the oracle set
\[
A = \left\{ \pair{G,S,i,j,\pi} \
\begin{array}{|l}
\mbox{ $S \seq \auto{G}$ and $\langle S \rangle$ is a pointwise
stabiliser of $[i]$} \\
\mbox{ in $\auto{G}$, $\pi$ is a partial permutation, which pointwise} \\
\mbox{ stabilises $[i-1]$, and $\pi(i) = j$, and there is a $\tau$ in} \\
\mbox{ $\auto{G}^{(i-1)}$ with $\tau(i)=j$ and ${\LERC}(S,\tau)$
extends~$\pi$} \\
\end{array}
\right\}.
\]
By Theorem~\ref{thm:least-element-in-right-coset}, the lexicographically
smallest permutation ${\LERC}(S,\tau)$ of the right coset $\langle S \rangle
\tau$ can be determined in polynomial time by our algorithm.
% in Figure~\ref{fig:least-element-in-right-coset}.
The partial permutation $\pi$
belongs to the input $\pair{G,S,i,j,\pi}$, since we wish to use $A$ as an
oracle in order to find the lexicographically smallest permutation by prefix
search; cf.\
% Figure~\ref{fig:prefix-search} in
Example~\ref{exa:prefix-search}.
% Figure~\ref{fig:orakel-in-spp} gives an
Consider the following NPTM $N$ for~$A$:
\begin{alg}{N$(G,S,i,j,\pi)$}\inddef{\textsc{N} algorithm}
1 \' verify that $S \seq \auto{G}^{(i)}$ \\
2 \' nondeterministically guess a permutation $\tau \in
\mathfrak{S}_n$; \hfill $\slash \slash$ $G$ has $n$ vertices \\
3 \' \key{if} \= $\tau \in \auto{G}^{(i-1)}$ and $\tau(i)=j$ and $\tau$ extends $\pi$ and $\tau = \textsc{LERC}(S,\tau)$ \\
4 \' \> \key{then} \= accept and halt \\
5 \' \> \key{else} \> reject and halt
\end{alg}
Thus, $A$ is
in~$\np$. Note that if $\tau(i)=j$ then $\sigma(i)=j$, for each permutation
$\sigma$ in the right coset $\langle S \rangle \tau$.
We now show that if $\langle S \rangle = \auto{G}^{(i)}$ then the number of
accepting computation paths of $N$ on input $\pair{G,S,i,j,\pi}$ is either $0$
or~$1$. In general, $\acc{N}\pair{G,S,i,j,\pi} \in
\left\{0,|\auto{G}^{(i)}|/|\langle S \rangle|\right\}$.
Suppose $\pair{G,S,i,j,\pi}$ is in~$A$ and $\langle S \rangle =
\auto{G}^{(i)}$. If $\tau(i)=j$ for some $\tau \in \auto{G}^{(i-1)}$ and
$j>i$, then the right coset $\langle S \rangle \tau$ contains exactly those
permutations in $\auto{G}^{(i-1)}$ that map $i$ to~$j$. Thus, the only
accepting computation path of $N\pair{G,S,i,j,\pi}$ corresponds to the unique
lexicographically smallest permutation $\tau = {\LERC}(S,\tau)$. If, on the
other hand, $\langle S \rangle$ is a strict subgroup of $\auto{G}^{(i)}$, then
$\auto{G}^{(i)}\tau$ can be written as the disjoint union of $k =
|\auto{G}^{(i)}|/|\langle S \rangle|$ right cosets of $\langle S \rangle$. In
general, $N\pair{G,S,i,j,\pi}$ thus possesses $k$ accepting computation paths
if $\pair{G,S,i,j,\pi}$ is in~$A$, and otherwise it has no accepting
computation path.
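This path-counting claim can be replayed concretely. In the following Python sketch (ours; an arbitrary small permutation group stands in for $\auto{G}^{(i-1)}$, and brute-force enumeration replaces the polynomial-time machinery), we count the permutations $\tau$ with $\tau(i) = j$ that are lexicographically smallest in their right coset $\langle S \rangle \tau$:

```python
def compose(p, q):
    """(pq)(j) = q(p(j)): apply p first; permutations are 0-indexed tuples."""
    return tuple(q[p[j]] for j in range(len(p)))

def closure(gens, n):
    """All elements of <gens> by brute-force closure (tiny groups only)."""
    ident = tuple(range(n))
    group, frontier = {ident}, [ident]
    while frontier:
        nxt = []
        for g in frontier:
            for s in gens:
                h = compose(g, s)
                if h not in group:
                    group.add(h)
                    nxt.append(h)
        frontier = nxt
    return group

def accepting_paths(H, S_group, p, j):
    """Count the tau in H with tau(p) = j (0-indexed) that equal the
    lexicographic minimum of their right coset S_group * tau; this mirrors
    acc_N for a subgroup <S> of the stabiliser of the first p points."""
    return sum(1 for tau in H
               if tau[p] == j and tau == min(compose(s, tau) for s in S_group))
```

With $H = \mathfrak{S}_4$, the stabiliser of the first point playing the role of $\auto{G}^{(1)}$, and an order-$2$ subgroup as $\langle S \rangle$, the count is $1$ or $3 = 6/2$, matching the dichotomy $\acc{N} \in \{0, |\auto{G}^{(i)}|/|\langle S \rangle|\}$.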
\begin{algN}{M-A$(G)$}\inddef{\textsc{M-A} algorithm}
1 \' set $T_i := \{\idJR\}$ for each~$i$, $0 \leq i \leq
n-2$; \hfill $\slash \slash$ $G$ has $n$ vertices \\
\' \` $\rhd$ $T_i$ will be a complete
right transversal of $\auto{G}^{(i+1)}$ in $\auto{G}^{(i)}.$ \\
3 \' set $S_i := \emptyset$ for each~$i$, $0 \leq i \leq n-2$ \\
4 \' set $S_{n-1} := \{\idJR\}$ \\
\' \` $\rhd$ $S_i$ will be a strong generator for $\auto{G}^{(i)}.$ \\
5 \' \key{for} \= $i \leftarrow n-1$ \key{downto} 1 \\
\' \` $\rhd$ $S_i$ is already found at the start \\
\' \` $\rhd$ of the $i$th iteration, and $S_{i-1}$ will now be computed. \\
6 \' \> \key{do} \= let $\pi: [i-1] \rightarrow [n]$ be the partial permutation \\
\' \> \> with $\pi(a)=a$ for each $a \in [i-1]$ \\
7 \' \` $\rhd$ For $i=1$, $\pi$ is the nowhere
defined partial permutation. \\
8 \' \> \> \key{for} \= $j \leftarrow i+1$ \key{to} $n$ \\
9 \' \> \> \> \key{do} \= $\hat{\pi} := \pi j$, i.e., $\hat{\pi}$ extends
$\pi$ by the pair $(i,j)$ with $\hat{\pi}(i)=j$ \\
10 \' \> \> \> \> \key{if} \= $(\pair{G,S_{i},i,j,\hat{\pi}} \in A)$ \\
11 \' \> \> \> \> \> \key{then} \= \` $\rhd$ Construct the smallest permutation \\
\' \` in $\auto{G}^{(i-1)}$ mapping $i$ to $j$ by prefix search. \\
12 \' \> \> \> \> \> \> \key{for} \= $k \leftarrow i+1$ \key{to} $n$ \\
13 \' \> \> \> \> \> \> \> \key{do} find the element $\ell$ not in the image\\
\' \> \> \> \> \> \> \> of
$\hat{\pi}$ with $\pair{G,S_{i},i,j,\hat{\pi}\ell} \in A$ \\
14 \' \> \> \> \> \> \> $\hat{\pi} := \hat{\pi}\ell$ \\
15 \' \` $\rhd$ Now, $\hat{\pi}$ is a total permutation in $\mathfrak{S}_n$
\algnewpage
16 \' \> \> \> \> \> $T_{i-1} \leftarrow T_{i-1} \cup \{\hat{\pi}\}$ \\
\' \` $\rhd$ Now, $T_{i-1}$ is a complete right transversal of $\auto{G}^{(i)}$
in~$\auto{G}^{(i-1)}$ \\
17 \' \> $S_{i-1} := S_i \cup T_{i-1}.$ \\
18 \' \key{return} $S_{0}$ \` $\rhd$ $S_{0}$ is a strong generator for $\auto{G} = \auto{G}^{(0)}$ \\
\end{algN}
% Figure~\ref{fig:fp-tothe-spp-alorithm-for-auto} shows the
The above algorithm is an $\fp^A$ algorithm
$M^A$ for $\genauto$. The DPOTM $M$ makes only queries $q =
\pair{G,S_i,i,j,\pi}$ to its oracle $A$ for which $\langle S_i \rangle =
\auto{G}^{(i)}$. Thus, $\acc{N}(q) \leq 1$ for each query $q$ actually asked.
By item~\ref{thm:spp-properties-fp-tothe-spp} of
Theorem~\ref{thm:spp-properties}, it follows that $\genauto$ is
in~$\fp^{\scriptspp}$.
The claim that the output $S_{0}$ of $M^A(G)$ indeed is a strong generator for
$\auto{G} = \auto{G}^{(0)}$ can be shown by downward induction on~$i$.
The induction base is $i = n-1$, and $S_{n-1} = \{\idJR\}$
of course generates $\auto{G}^{(n-1)} = \boldid$.
For the induction step, assume that prior to the $i$th iteration a
strong generator $S_i$ for $\auto{G}^{(i)}$ has already been found. We now
show that after the $i$th iteration the set $S_{i-1} = S_i \cup T_{i-1}$ is a
strong generator for~$\auto{G}^{(i-1)}$. For each $j$ with $i+1 \leq j \leq
n$, the oracle query ``$\pair{G,S_{i},i,j,\hat{\pi}} \in A$?'' checks whether
there is a permutation in $\auto{G}^{(i-1)}$ mapping $i$ to~$j$. By prefix
search, which is performed by making suitable oracle queries to $A$ again, the
lexicographically smallest permutation $\hat{\pi}$ in $\auto{G}^{(i-1)}$ with
$\hat{\pi}(i)=j$ is constructed. Note that, as claimed above, only queries
$q$ satisfying $\acc{N}(q) \leq 1$ are made to~$A$, since $S_i$ is a strong
generator for $\auto{G}^{(i)}$, so $\langle S_i \rangle = \auto{G}^{(i)}$. By
construction, after the $i$th iteration, $T_{i-1}$ is a complete right
transversal of $\auto{G}^{(i)}$ in~$\auto{G}^{(i-1)}$. It follows that
$S_{i-1} = S_i \cup T_{i-1}$ is a strong generator for~$\auto{G}^{(i-1)}$.
Eventually, after $n-1$ iterations, a strong generator $S_{0}$ for $\auto{G}
=\auto{G}^{(0)}$ is found.
\end{biz}
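The prefix search performed by $M$ with its oracle $A$ can also be sketched in Python (our illustration; an arbitrary small permutation group stands in for $\auto{G}^{(i-1)}$, and the NP oracle is simulated by brute force rather than by the machine~$N$):

```python
def compose(p, q):
    """(pq)(j) = q(p(j)): apply p first; permutations are 0-indexed tuples."""
    return tuple(q[p[j]] for j in range(len(p)))

def closure(gens, n):
    """All elements of <gens> by brute-force closure (tiny groups only)."""
    ident = tuple(range(n))
    group, frontier = {ident}, [ident]
    while frontier:
        nxt = []
        for g in frontier:
            for s in gens:
                h = compose(g, s)
                if h not in group:
                    group.add(h)
                    nxt.append(h)
        frontier = nxt
    return group

def oracle(H, S_group, prefix):
    """Brute-force stand-in for the NP set A: is there a tau in H that
    extends the partial permutation `prefix` and is the lexicographic
    minimum of its right coset S_group * tau?"""
    return any(tau[:len(prefix)] == prefix
               and tau == min(compose(s, tau) for s in S_group)
               for tau in H)

def prefix_search(H, S_group, p, j, n):
    """Build, one value at a time, the lexicographically smallest tau in H
    with tau(p) = j, assuming all elements of H fix positions 0..p-1.
    Returns None if no such tau exists (0-indexed positions)."""
    prefix = tuple(range(p)) + (j,)
    if not oracle(H, S_group, prefix):
        return None
    while len(prefix) < n:
        # extend by the smallest value keeping the oracle answer positive
        prefix = next(prefix + (l,) for l in range(n)
                      if l not in prefix and oracle(H, S_group, prefix + (l,)))
    return prefix
```

Each call to `oracle` plays the role of one query ``$\pair{G,S_{i},i,j,\hat{\pi}} \in A$?'' in the algorithm \textsc{M-A} above.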
From Theorem~\ref{thm:graphiso-is-in-spp} and the first two items of
Theorem~\ref{thm:spp-properties}, we obtain
Corollary~\ref{cor:graphiso-is-in-spp}.
\begin{kov}
\label{cor:graphiso-is-in-spp}
$\graphiso$ is low for both $\spp$ and~$\pp$, i.e., $\spp^{\graphiso} = \spp$
and~$\pp^{\graphiso} = \pp$.
\end{kov}
\begin{gyak}
\ujgyak
By Definition~\ref{def:turing-reducibility}, $A \turing B$ if and only
if $A \in \p^B$. Show that $A \turing B$ if and only if $\p^A \seq \p^B$.
\label{ex:turing-reducibility}
\ujgyak Show that the set $\prefixiso$ defined in
Example~\ref{exa:prefix-search} is in $\np$. Moreover, prove that the
machine $N$
% shown in Figure~\ref{fig:prefix-search}
defined in Example~\ref{exa:prefix-search} runs in polynomial time,
i.e., show that $N$ is a DPOTM.\label{ex:prefix-search}
\end{gyak}
\newpage
\begin{fld}
\ujfld{Strong NPOTM}
{A \ki{strong NPOTM}\inddef{strong NPOTM} is an NPOTM with three types of final states,
i.e., the set $F$ of final states of $M$ is partitioned into~$F_a$
(accepting states), $F_r$ (rejecting states), and $F_?$ (``don't know''
states) such that the following holds: If $x \in A$ then $M^B(x)$ has at
least one computation path halting in an accepting state from $F_a$ but no
computation path halting in a rejecting state from~$F_r$. If $x \not\in A$
then $M^B(x)$ has at least one computation path halting in a rejecting state
from~$F_r$ but no computation path halting in an accepting state from~$F_a$.
In both cases $M^B(x)$ may have computation paths halting in ``don't know''
states from~$F_?$. In other words, strong NPOTMs are machines that never
lie. Prove the following two assertions:
\textbf{(a)} $A \strongturingnp B$ if and only if there exists a strong NPOTM $M$ with $A = L(M^B)$.
\textbf{(b)} $A \strongturingnp B$ if and only if $\np^A \seq \np^B$.
\textit{Hint.} Look at Exercise \ref{ex:turing-reducibility}.\label{ex:strong-np-turing-reducibility}
}
\ujfld{Proofs}
{Prove the assertions from Theorems~\ref{thm:structure-ph}
and~\ref{thm:low-high-characterization}. (Some are far from being trivial!)
\label{ex:structure-ph}
}
\vspace{-4mm}
\ujfld{Modification of the proofs}
{Modify the proof of Theorem~\ref{thm:least-element-in-right-coset} so as
to obtain Corollary~\ref{cor:least-element-in-right-coset}.
\label{ex:least-element-in-right-coset}
}
\end{fld}
\begin{fejmegj}
Parts of Chapters~\ref{chap:cryptology} and~\ref{chap:complexity} are
based on the book~\cite{Rothe04b}\nevindex{Rothe, Jörg} that provides the proofs
omitted here, such as those of Theorems~\ref{thm:structure-ph},
\ref{thm:low-high-characterization}, and~\ref{thm:spp-properties} and of
Lemma~\ref{lem:hashing}, and many more details.
More background on complexity theory can be found in the
books~\cite{Hemaspaandra02,Papadimitriou94,Wagner86,Wechsung00}.
\nevindex{Wechsung, Gerd}\nevindex{Wagner, Klaus W.}\nevindex{Papadimitriou, Christos H.}
\nevindex{Hemaspaandra, Lane A.} A valuable source on the theory of $\np$-completeness is still the
classic~\cite{GareyJo79} by Garey\nevindex{Garey, Michael R.} and Johnson.
\nevindex{Johnson, David S.} The $\turingtext$-reducibility was introduced by
Cook~\cite{Cook71}, and the $\manyonetext$-reducibility by
Karp~\cite{Karp72}. A deep and profound study of
polynomial-time reducibilities is due to Ladner, Lynch, and
Selman~\cite{LadnerLy75}.\nevindex{Ladner, Richard E.}\nevindex{Lynch, Nancy Ann}\nevindex{Selman, Alan L.}
Exercise~\ref{ex:turing-reducibility} and
Problem~\ref{ex:strong-np-turing-reducibility} are due to
Selman~\cite{Selman78}.
Dantsin et al.~\cite{Dantsin02}\nevindex{Dantsin, E.} obtained
% the currently best
an upper bound of
$\tilde{\bigoh}(1.481^n)$ for the deterministic time complexity
of $\threesat$, which was further improved by Brueggemann and
Kern~\cite{bru-ker:j:improved-local-search-threesat}
to~$\tilde{\bigoh}(1.4726^n)$.
The randomised algorithm presented in
Subsection~\ref{sec:sat-randomized} is due to
Schöning~\cite{Schoning99};\nevindex{Schöning, Uwe} it is based on a ``limited
local search with restart''. For $\ksat{k}$ with $k \geq 4$, the algorithm by
Paturi et al.~\cite{Paturi98}\nevindex{Paturi, R.} is slightly
better than Schöning's algorithm.
% The currently best randomised algorithm for
% $\ksat{k}$ with $3 \leq k \leq 4$ has a runtime of~$\tilde{\bigoh}(1.324^n)$.
Iwama\nevindex{Iwama, K.} and
Tamaki\nevindex{Tamaki, S.} \cite{iwa-tak}
combined the ideas
of Schöning~\cite{Schoning99} and Paturi et
al.~\cite{Paturi98} to obtain a bound of $\tilde{\bigoh}(1.324^n)$
for $\ksat{k}$ with $k \in \{3, 4\}$. For $\ksat{k}$ with
$k \geq 5$, their algorithm is not better than that by Paturi et
al.~\cite{Paturi98}.
Figure~\ref{tab:sat-algorithmen} gives an overview of some algorithms for the
satisfiability problem.
% The currently best results are boldfaced.
\begin{figure}[!t]
\centering
\small
\begin{tabular}{|l||l||l|l|l|l|}
\hline
Algorithm & Type & $\threesat$ & $\ksat{4}$ & $\ksat{5}$ &
$\ksat{6}$ \\
\hline \hline
Backtracking & det.
& $\tilde{\bigoh}(1.913^n)$
& $\tilde{\bigoh}(1.968^n)$
& $\tilde{\bigoh}(1.987^n)$
& $\tilde{\bigoh}(1.995^n)$
\\ \hline
Monien and & det.
& $\tilde{\bigoh}(1.618^n)$
& $\tilde{\bigoh}(1.839^n)$
& $\tilde{\bigoh}(1.928^n)$
& $\tilde{\bigoh}(1.966^n)$ \\
Speckenmeyer~\cite{Monien85} & & & & &
\\ \hline
Dantsin et al.~\cite{Dantsin02} & det.
& $\tilde{\bigoh}(1.481^n)$
& $\tilde{\bigoh}(1.6^n)$
& $\tilde{\bigoh}(1.667^n)$
& $\tilde{\bigoh}(1.75^n)$
\\ \hline
Brueggemann and Kern~\cite{bru-ker:j:improved-local-search-threesat} & det.
& $\tilde{\bigoh}(1.4726^n)$
& ---
& ---
& ---
\\ \hline \hline
Paturi et al.~\cite{Paturi98} & prob.
& $\tilde{\bigoh}(1.362^n)$
& $\tilde{\bigoh}(1.476^n)$
& $\tilde{\bigoh}(1.569^n)$
& $\tilde{\bigoh}(1.637^n)$
\\ \hline
Schöning~\cite{Schoning99} & prob.
& $\tilde{\bigoh}(1.334^n)$ & $\tilde{\bigoh}(1.5^n)$
& $\tilde{\bigoh}(1.6^n)$ & $\tilde{\bigoh}(1.667^n)$
\\ \hline
Iwama and Tamaki~\cite{iwa-tak} & prob.
& $\tilde{\bigoh}(1.324^n)$
& $\tilde{\bigoh}(1.474^n)$
& --- & ---
\\ \hline
\end{tabular}
\caption{\label{tab:sat-algorithmen} Runtimes of some algorithms for the
satisfiability problem.}
\end{figure}
For a thorough, comprehensive treatment of the graph isomorphism problem,
particularly of its complexity-theoretic aspects, the
reader is referred to the book by Köbler,\nevindex{Köbler, J.} Schöning,\nevindex{Schöning, Uwe} and
Torán~\cite{Koebler93}.\nevindex{Torán, J.}
Hoffman~\cite{Hoffman82}\nevindex{Hoffman, C.} investigates
group-theoretic algorithms for the graph isomorphism problem and related problems.
The polynomial hierarchy was introduced by Meyer\nevindex{Meyer, A.} and
Stockmeyer~\cite{Meyer72,Stockmeyer77}.\nevindex{Stockmeyer, Larry J.} In
particular, Theorem~\ref{thm:structure-ph} is due to them.
Schöning~\cite{Schoning83} introduced the low hierarchy and the high hierarchy
within $\np$. The results stated in
Theorem~\ref{thm:low-high-characterization} are due to
him~\cite{Schoning83}. He also proved that $\graphiso$ is in
$\low{2}$, see~\cite{Schoning87}. Köbler et
al. \cite{Koebler92a,Koebler92b}\nevindex{Köbler, J.} obtained the first
lowness results of $\graphiso$ for probabilistic classes such as~$\pp$. These
results were improved by Arvind and Kurur~\cite{Arvind02}
who proved that $\graphiso$ is even in~$\spp$. The class $\spp$ generalises
Valiant's class~$\up$, see~\cite{Valiant76}. So-called
``promise classes'' such as
$\up$ and $\spp$ have been thoroughly studied in a number of papers; see,
e.g., \cite{Arvind02,Borchert00,Fenner94,Koebler92a,Koebler92b,Rao94,Rothe95}.
\nevindex{Arvind, V.}\nevindex{Kurur, P.}
Lemma \ref{lem:hashing} is due to Carter\nevindex{Carter, J. L.} and
Wegman\nevindex{Wegman, M. N.} \cite{Carter79}.\nevindex{Borchert, B.}
The author is grateful to Uwe Schöning\nevindex{Schöning, Uwe} for his helpful comments on an earlier
version of this chapter and for sending the slides of one of his talks, which
helped simplify the probability analysis for the algorithm \textsc{Random-SAT}
sketched in Subsection~\ref{sec:sat}; a more comprehensive analysis can be found
in Schöning's book~\cite{Schoning01}\nevindex{Schöning, Uwe}. Thanks are also due to Dietrich
Stoyan,\nevindex{Stoyan, Dietrich}
Robert Stoyan,\nevindex{Stoyan, Robert}
Sigurd Assing,\nevindex{Assing, Sigurd}
Gábor Erdélyi\nevindex{Erdélyi, Gábor}
and Holger Spakowski\nevindex{Spakowski, Holger} for proofreading previous
versions of Chapters~\ref{chap:cryptology} and~\ref{chap:complexity}.
This work was supported in part by the Deutsche
Forschungsgemeinschaft (DFG) under grants RO~1202/9-1 and RO~1202/9-3
and by the Alexander von Humboldt Foundation in the TransCoop program.
\end{fejmegj}
\index{complexity theory|)}