Saturday, May 26, 2012

How RSA Works With Examples

NOTE: I have moved my blog (including an update of this post): See http://doctrina.org/How-RSA-Works-With-Examples.html

In this post, I am going to explain exactly how RSA public key encryption works. One of the three seminal events in 20th century cryptography, RSA opened the world to a host of cryptographic protocols (like digital signatures, cryptographic voting etc). All discussions on this topic (including this one) are very mathematical, but the difference here is that I am going to go out of my way to explain each concept with a concrete example. The reader who has only a beginner level of mathematical knowledge should be able to understand exactly how RSA works after reading this post along with the examples.

Background Mathematics

The Set Of Integers Modulo P

The set:

\begin{equation}\label{bg:intmod} \mathbb{Z}_p = \{0,1,2,...,p-1\}\end{equation}
is called the set of integers modulo p (or mod p for short). It is the set that contains the integers from $0$ up to $p-1$.

Example: $\mathbb{Z}_{10} =\{0,1,2,3,4,5,6,7,8,9\}$

Integer Remainder After Dividing

When we first learned about numbers at school, we had no notion of real numbers, only integers. Therefore we were told that 5 divided by 2 was equal to 2 remainder 1, and not $2\frac{1}{2}$. It turns out that this type of math is absolutely vital to RSA, and is one of the reasons that RSA is secure. A very formal way of stating a remainder after dividing by another number is an equivalence relationship:

\begin{equation} \label{bg:mod} \forall x,y,z \in \mathbb{Z}, x \equiv y \bmod z \Longrightarrow \exists k \in \mathbb{Z} : x = k\cdot z + y\end{equation}
Equation $\ref{bg:mod}$ states that if $x$ is equivalent to the remainder (in this case $y$) after dividing by an integer (in this case $z$), then $x$ can be written like so: $x = k\cdot z + y$ where $k$ is some integer.

Example: If $y=4$ and $z=10$, then the following values of $x$ will satisfy the above equation: $x=4, x=14, x=24,...$. In fact, there are infinitely many values that $x$ can take on to satisfy the above equation (that is why I used the equivalence relationship $\equiv$ instead of equals). Therefore, $x$ can be written like so: $x = k\cdot 10 + 4$, where $k$ can be any of the infinitely many integers.
There are two important things to note:
  1. The remainder $y$ stays constant, whatever value $x$ takes on to satisfy equation $\ref{bg:mod}$.
  2. Due to the above fact, $y \in \mathbb{Z}_z$ ($y$ is in the set of integers modulo $z$)
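
The example above can be checked with Python's % operator, which for a positive modulus returns the remainder in $\mathbb{Z}_z$. This is only a small sketch; the variable names are my own and simply mirror the symbols in the equation.

```python
# Every x of the form k*10 + 4 leaves the same remainder when divided by 10.
z = 10
y = 4
xs = [k * z + y for k in range(5)]      # 4, 14, 24, 34, 44
remainders = [x % z for x in xs]        # the remainder is y every time

print(xs)
print(remainders)
```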

Multiplicative Inverse And The Greatest Common Divisor

A multiplicative inverse for $x$ is written as $x^{-1}$ and is defined like so:
\begin{equation}x\cdot x^{-1} = 1\end{equation}
The greatest common divisor (gcd) between two numbers is the largest integer that will divide both numbers. For example, $\gcd(4,10) = 2$.

The interesting thing is that if two numbers have a gcd of 1, then the smaller of the two numbers has a multiplicative inverse in the modulo of the larger number. It is expressed in the following equation: 

\begin{equation}\label{bg:gcd}x \in \mathbb{Z}_p, x^{-1} \in \mathbb{Z}_p  \Longleftrightarrow \gcd(x,p) = 1\end{equation}
The above just says that an inverse exists if and only if the greatest common divisor is 1. An example should set things straight...

Example: Let's work in the set $\mathbb{Z}_9$, then $4 \in \mathbb{Z}_9$ and $\gcd(4,9)=1$. Therefore $4$ has a multiplicative inverse (written $4^{-1}$) in $\bmod 9$, which is $7$. And indeed, $4\cdot 7 = 28 = 1 \bmod 9$. But not all numbers have inverses. For instance, $3 \in \mathbb{Z}_9$ but $3^{-1}$ does not exist! This is because $\gcd(3,9) = 3 \neq 1$.
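
The same example can be tried out in Python. This is a minimal sketch: the helper name inverse_mod is my own, and it leans on the built-in pow with an exponent of -1, which needs Python 3.8 or later.

```python
from math import gcd

def inverse_mod(x, p):
    """Return the inverse of x modulo p, or None if gcd(x, p) != 1."""
    if gcd(x, p) != 1:
        return None
    return pow(x, -1, p)  # modular inverse; needs Python 3.8 or later

print(inverse_mod(4, 9))  # 7, because 4 * 7 = 28 = 3 * 9 + 1
print(inverse_mod(3, 9))  # None, because gcd(3, 9) = 3
```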

Euler's Totient

Euler's Totient is the number of elements that have an inverse in a set of modulo integers. The totient is denoted using the Greek symbol phi $\phi$. From $\ref{bg:gcd}$ above, we can see that the totient is just a count of the number of elements that have their $\gcd$ with the modulus equal to 1. Now for any prime number $p$, every number from $1$ up to $p-1$ has a $\gcd$ of 1 with $p$. This brings us to an important equation regarding the totient and prime numbers:

\begin{equation}\label{bg:totient} p \in \mathbb{P}, \phi(p) = p-1 \end{equation}
Example: $\phi(7) = \left|\{1,2,3,4,5,6\}\right| = 6$
(Note: In set theory, |{...}| just means the number of elements in {...}, called the cardinality for those who are interested)
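
For small numbers the totient can be counted directly from its definition. The function below is a brute force sketch of my own for $n \geq 2$; it is far too slow for the numbers RSA uses, but it makes the definition concrete.

```python
from math import gcd

def totient(n):
    """Count the elements of {1, ..., n-1} that share no factor with n."""
    return sum(1 for x in range(1, n) if gcd(x, n) == 1)

print(totient(7))   # 6, matching phi(p) = p - 1 for the prime 7
print(totient(9))   # 6, the invertible elements are 1, 2, 4, 5, 7, 8
```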

RSA

With the above background, we have enough tools to describe RSA and show how it works. RSA is actually a set of two algorithms:
  1. A key generation algorithm.
  2. A function $F$, that takes as input a point $x$ and a key $k$ and produces either an encrypted result or plaintext, depending on the input and the key.

Key Generation

The key generation algorithm is the most complex part of RSA. The aim of the key generation algorithm is to generate both the public and the private RSA keys. Sounds simple enough! Unfortunately, weak key generation makes RSA very vulnerable to attack. So it has to be done correctly. Here is what has to happen in order to generate secure RSA keys:
  1. Large Prime Number Generation: Two large prime numbers $p$ and $q$ need to be generated. These numbers are very large: At least 512 bits each, but 1024 bits each is considered safe.
  2. Modulus: From the two large numbers, a modulus $n$ is generated by multiplying $p$ and $q$.
  3. Totient: The totient of $n$, $\phi(n)$, is calculated.
  4. Public Key: A prime number is chosen from the range $[3,\phi(n))$ that has a greatest common divisor of $1$ with $\phi(n)$.
  5. Private Key: Because the prime in step 4 has a gcd of 1 with $\phi(n)$, we are able to determine its inverse with respect to $\bmod \phi(n)$.
After the five steps above, we will have our keys. Let's go over each step.

Large Prime Number Generation

It is vital for RSA security that two very large prime numbers be generated that are quite far apart. Generating composite numbers, or even prime numbers that are close together makes RSA totally insecure.

How does one generate large prime numbers? The answer is to pick a large random number (a very large random number) and test it for primality. If that number fails the prime test, then add 1 and start over again until we have a number that passes a prime test. The problem is now: How do we test a number in order to determine if it is prime?

The answer: An incredibly fast prime number tester called the Rabin-Miller primality tester is able to accomplish this. Give it a very large number, and it is able to very quickly determine with a high probability whether its input is prime. But there is a catch (and readers may have spotted the catch in the last sentence): The Rabin-Miller test is a probabilistic test, not a definite test. Given the fact that RSA absolutely relies upon generating large prime numbers, why would anyone want to use a probabilistic test? The answer: With Rabin-Miller, we can make the result as accurate as we want. In other words, Rabin-Miller is set up with a parameter that lets us decide how probable it is that a number that passes is prime. Normally, the test is performed by iterating $64$ times, which produces a result on a number that has at most a $\frac{1}{2^{128}}$ chance of not being prime. The probability of a number passing the Rabin-Miller test and not being prime is so low that it is okay to use it with RSA. In fact, $\frac{1}{2^{128}}$ is such a small number that I would suspect that nobody would ever get a false positive.

So with Rabin-Miller, we generate two large prime numbers: $p$ and $q$. 
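
To make the procedure concrete, here is a rough sketch of the Rabin-Miller test and of the pick-and-step search described above. The function names, the short list of small primes and the default of 64 rounds are my own choices for illustration, and the search steps by 2 instead of 1 so that it only visits odd candidates. Treat it as a teaching aid and not as production key generation code.

```python
import secrets

def is_probable_prime(n, rounds=64):
    """Rabin-Miller test: False means composite, True means probably prime."""
    if n < 2:
        return False
    for small in (2, 3, 5, 7, 11, 13):
        if n % small == 0:
            return n == small
    # Write n - 1 as 2^s * d with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = secrets.randbelow(n - 3) + 2      # random base in [2, n - 2]
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False                      # the base a proves n is composite
    return True

def random_prime(bits):
    """Pick a random odd number with the top bit set, then step until prime."""
    candidate = secrets.randbits(bits) | (1 << (bits - 1)) | 1
    while not is_probable_prime(candidate):
        candidate += 2                        # stay on odd numbers
    return candidate

p = random_prime(512)
q = random_prime(512)
print(p.bit_length(), q.bit_length())
```

Each round lets a composite number slip through with probability at most $\frac{1}{4}$, which is where the $\frac{1}{2^{128}}$ figure for $64$ rounds comes from.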

Modulus

Once we have our two prime numbers, we can generate a modulus very easily:
\begin{equation}\label{rsa:modulus}n=p\cdot q\end{equation}
RSA's main security foundation relies upon the fact that given two large prime numbers, a composite number (in this case $n$) can very easily be computed by multiplying the two primes together. But, given just $n$, there is no known algorithm to efficiently determine $n$'s prime factors. In fact, it is considered a hard problem. I am going to bold this next statement for effect: The foundation of RSA's security relies upon the fact that given a composite number, it is considered a hard problem to determine its prime factors.

The bolded statement above has never been proved. That is why I used the term "considered a hard problem" and not "is a hard problem". This is a little bit disturbing: Basing the security of one of the most used cryptographic primitives on something that is not provably difficult. The only solace one can take is that throughout history, numerous people have tried, but failed to find a solution to this.

Totient

With the prime factors of $n$, the totient can be very quickly calculated:

\begin{equation}\label{RSA:totient}\phi(n) = (p-1)\cdot (q-1)\end{equation}
This follows from equation $\ref{bg:totient}$ above. It is derived like so: $\phi(n) = \phi(p\cdot q) = \phi(p) \cdot \phi(q) = (p-1)\cdot (q-1)$, where the second step holds because $p$ and $q$ are distinct primes. The reason why RSA becomes vulnerable if one can determine the prime factors of the modulus is that one can then easily determine the totient.
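
The formula can be sanity checked against a brute force count for primes that are small enough. In this sketch the primes $11$ and $13$ are picked purely for illustration.

```python
from math import gcd

p, q = 11, 13          # small primes chosen only for illustration
n = p * q

# Count the invertible elements by brute force, then compare with the formula.
brute_force = sum(1 for x in range(1, n) if gcd(x, n) == 1)
shortcut = (p - 1) * (q - 1)

print(brute_force, shortcut)
```

The brute force count needs about $n$ steps, which is hopeless for a real modulus, while the formula is instant for anyone who knows $p$ and $q$.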

Public Key

Next, the public key is determined. Normally expressed as $e$, it is a prime number chosen in the range $[3,\phi(n))$. The discerning reader may think that $3$ is a little small, and yes, I agree, if $3$ is chosen, it could lead to security flaws. So in practice, the public key is normally set at $65537$. Note that because the public key is prime, it has a high chance of a gcd equal to $1$ with $\phi(n)$. If this is not the case, then we must use another prime number that is not $65537$, but this will only occur if $65537$ is a factor of $\phi(n)$, something that is quite unlikely, but must still be checked for.

An interesting observation: If in practice, the number above is set at $65537$, then it is not picked at random; surely this is a problem? Actually, no, it isn't. As the name implies, this key is public, and therefore is shared with everyone. As long as the private key cannot be deduced from the public key, we are happy. The reason why the public key is not randomly chosen in practice is because it is desirable not to have a large number. This is because it is more efficient to encrypt with smaller numbers than larger numbers.

The public key is actually a key pair of the exponent $e$ and the modulus $n$ and is presented as follows:
$(e,n)$

Private Key

Because the public key has a gcd of $1$ with $\phi(n)$, the multiplicative inverse of the public key with respect to $\phi(n)$ can be efficiently and quickly determined using the Extended Euclidean Algorithm. This multiplicative inverse is the private key. The common notation for expressing the private key is $d$. So in effect, we have the following equation (one of the most important equations in RSA):
\begin{equation}\label{RSA:ed} e\cdot d = 1 \bmod \phi(n) \end{equation}
Just like the public key, the private key is also a key pair of the exponent $d$ and modulus $n$:
$(d,n)$
One of the absolute fundamental security assumptions behind RSA is that given a public key, one cannot efficiently determine the private key. I intend to write a future blog post explaining why RSA works, so I am not going to explain this now.
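
For readers who want to see the Extended Euclidean Algorithm at work, here is a short sketch. The function names are mine, and the values $e=7$ and $\phi(n)=120$ in the usage line are small illustrative numbers, not realistic key sizes.

```python
def extended_gcd(a, b):
    """Return (g, s, t) such that g = gcd(a, b) and s*a + t*b = g."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_s, s = s, old_s - quotient * s
        old_t, t = t, old_t - quotient * t
    return old_r, old_s, old_t

def private_exponent(e, phi):
    """Return d with e*d = 1 (mod phi); fail if e and phi share a factor."""
    g, s, _ = extended_gcd(e, phi)
    if g != 1:
        raise ValueError("e has no inverse modulo phi")
    return s % phi                # bring the coefficient into the range [0, phi)

d = private_exponent(7, 120)
print(d, (7 * d) % 120)
```

The algorithm finds integers $s$ and $t$ with $s\cdot e + t\cdot \phi(n) = 1$. Reducing that equation $\bmod \phi(n)$ leaves $s\cdot e = 1 \bmod \phi(n)$, so $s$ (taken $\bmod \phi(n)$) is the private key $d$.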

Function Evaluation

This is the process of transforming a plaintext message into ciphertext, or vice-versa. The RSA function, for message $m$ and key $k$ is evaluated as follows:
\begin{equation}F(m,k) = m^k \bmod n\end{equation}
There are obviously two cases:
  1. Encrypting with the public key, and then decrypting with the private key.
  2. Encrypting with the private key, and then decrypting with the public key.
The two cases above are mirrors. I will explain the first case; the second follows from the first.
Encryption: $F(m,e) = m^e \bmod n = c$, where $m$ is the message, $e$ is the public key and $c$ is the ciphertext.
Decryption: $F(c,d) = c^d \bmod n = m$.
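
Computing $m^k$ in full and only then reducing it $\bmod n$ would be hopeless for large keys, so implementations reduce after every multiplication. Below is a sketch of the standard square-and-multiply technique; the function name rsa_function is my own, and Python's built-in pow(m, k, n) does the same job.

```python
def rsa_function(m, k, n):
    """Evaluate F(m, k) = m^k mod n with square-and-multiply."""
    result = 1 % n
    base = m % n
    while k > 0:
        if k & 1:                     # the lowest remaining bit of the key is set
            result = (result * base) % n
        base = (base * base) % n      # square for the next bit
        k >>= 1
    return result

# The hand-written loop agrees with the built-in three-argument pow.
assert rsa_function(4, 13, 497) == pow(4, 13, 497)
```

The loop runs once per bit of the key, which is one reason why a short public key such as $65537$ makes encryption cheap.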

And there you have it: RSA!

Final Example: RSA From Scratch

This is the part that everyone has been waiting for: an example of RSA from the ground up.

Calculation of Modulus And Totient

Let's choose two primes: $p=11$ and $q=13$. Hence the modulus is $n = p \times q = 143$. The totient of $n$ is $\phi(n) = (p-1)\cdot (q-1) = 120$.

Key Generation

For the public key, a random prime number that has a greatest common divisor (gcd) of 1 with $\phi(n)$ and is less than $\phi(n)$ is chosen. Let's choose $7$ (note: both $3$ and $5$ do not have a gcd of 1 with $\phi(n)$). So $e=7$, and to determine $d$, the secret key, we need to find the inverse of $7$ with respect to $\bmod \phi(n)$. This can be done very easily and quickly with the Extended Euclidean Algorithm, and hence $d=103$. This can be easily verified: $e\cdot d = 1 \bmod \phi(n)$ and $7\cdot 103 = 721 = 1 \bmod 120$.
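
The choice of $e$ and $d$ can be replayed in a few lines. As a simplification of my own, this sketch tries small primes in order instead of picking one at random, and it uses the built-in pow with an exponent of -1 (Python 3.8 or later) in place of a hand-written Extended Euclidean Algorithm.

```python
from math import gcd

phi = 120                                   # (11 - 1) * (13 - 1)

# Try small odd primes in order and keep the first one coprime to phi.
for candidate in (3, 5, 7, 11, 13):
    print(candidate, gcd(candidate, phi))
    if gcd(candidate, phi) == 1:
        e = candidate
        break

d = pow(e, -1, phi)                         # needs Python 3.8 or later
print(e, d, (e * d) % phi)
```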

Encryption/Decryption

Let's choose our plaintext message, $m$, to be $9$:
Encryption: $m^e \bmod n = 9^7 \bmod 143 = 48 = c$
Decryption: $c^d \bmod n = 48^{103} \bmod 143 = 9 = m$
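
The two lines above can be reproduced with Python's three-argument pow. This is a small sketch using the toy keys from this example; the final check over every message in $\mathbb{Z}_n$ is my own addition.

```python
n, e, d = 143, 7, 103        # the toy keys from this example

m = 9
c = pow(m, e, n)             # encrypt with the public key
recovered = pow(c, d, n)     # decrypt with the private key

print(c, recovered)

# Every message in Z_n survives the round trip with these keys.
assert all(pow(pow(x, e, n), d, n) == x for x in range(n))
```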

A Real World Example

Now for a real world example, let's encrypt the message "attack at dawn". The first thing that must be done is to convert the message into a numeric format. Each letter is represented by an ASCII code, therefore it can be accomplished quite easily. I am not going to dive into converting strings to numbers or vice-versa, but just note that it can be done very easily. How I will do it here is to convert the string to a bit array, and then the bit array to a large number. This can very easily be reversed to get back the original string given the large number. Using this method, "attack at dawn" becomes $1976620216402300889624482718775150$.
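
One way to do this conversion in Python is sketched below. The helper names are mine, and reading the bytes in big-endian order is an assumption about the method used here, although it does reproduce the number quoted above.

```python
def text_to_int(text):
    """Read the ASCII bytes of text as one big-endian integer."""
    return int.from_bytes(text.encode("ascii"), "big")

def int_to_text(number):
    """Reverse text_to_int by turning the integer back into bytes."""
    length = (number.bit_length() + 7) // 8
    return number.to_bytes(length, "big").decode("ascii")

m = text_to_int("attack at dawn")
print(m)
print(int_to_text(m))
```

Note that the resulting number must be smaller than the modulus $n$, otherwise decryption cannot recover it.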

Key Generation

Now to pick two large primes, $p$ and $q$. These numbers must be random and not too close to each other. Here are the numbers that I generated:

p
12131072439211271897323671531612440428472427633701410925634549312301964373042085619324197365322416866541017057361365214171711713797974299334871062829803541

q
12027524255478748885956220793734512128733387803682075433653899983955179850988797899869146900809131611153346817050832096022160146366346391812470987105415233

With these two large numbers, we can calculate $n$ and $\phi(n)$:

n 145906768007583323230186939349070635292401872375357164399581871019873438799005358938369571402670149802121818086292467422828157022922076746906543401224889672472407926969987100581290103199317858753663710862357656510507883714297115637342788911463535102712032765166518411726859837988672111837205085526346618740053

$\phi(n)$ 145906768007583323230186939349070635292401872375357164399581871019873438799005358938369571402670149802121818086292467422828157022922076746906543401224889648313811232279966317301397777852365301547848273478871297222058587457152891606459269718119268971163555070802643999529549644116811947516513938184296683521280

e - the public key
$65537$ has a gcd of 1 with $\phi(n)$, so let's use it as the public key. To calculate the private key, use the Extended Euclidean Algorithm to find the multiplicative inverse with respect to $\phi(n)$.

d - the private key
89489425009274444368228545921773093919669586065884257445497854456487674839629818390934941973262879616797970608917283679875499331574161113854088813275488110588247193077582527278437906504015680623423550067240042466665654232383502922215493623289472138866445818789127946123407807725702626644091036502372545139713

Encryption/Decryption

Encryption: $1976620216402300889624482718775150^e \bmod n$
35052111338673026690212423937053328511880760811579981620642802346685810623109850235943049080973386241113784040794704193978215378499765413083646438784740952306932534945195080183861574225226218879827232453912820596886440377536082465681750074417459151485407445862511023472235560823053497791518928820272257787786
Decryption:
35052111338673026690212423937053328511880760811579981620642802346685810623109850235943049080973386241113784040794704193978215378499765413083646438784740952306932534945195080183861574225226218879827232453912820596886440377536082465681750074417459151485407445862511023472235560823053497791518928820272257787786$^d \bmod n$
1976620216402300889624482718775150 (which is our plaintext "attack at dawn")
This real world example shows how large the numbers used in the real world are.

Conclusion

RSA is the single most useful tool for building cryptographic protocols (in my humble opinion). In this post, I have shown how RSA works. I will follow this up with another post explaining why it works.

Sunday, May 20, 2012

The 3 seminal events in cryptography


Cryptography is the art/science of keeping a secret. This need has been present since before recorded history, and we have some very famous early examples such as the Caesar cipher. Yet these early examples were very easy to break. It took until the twentieth century for cryptography to be rigorously defined and useful to a wider audience, and this was largely done in three seminal papers:
  1. Communication Theory of Secret Systems by Claude Shannon
  2. New Directions In Cryptography by Whitfield Diffie and Martin Hellman
  3. A Method for Obtaining Digital Signatures and Public-Key Cryptosystems by Ronald Rivest, Adi Shamir and Leonard Adleman
For this explanation, it helps to give a quick definition of the cryptographic process. A message (called the plain text) must be encrypted to cipher text. It is done so via an algorithm that takes two inputs - the plain text to be encrypted, and a secret key.

Communication Theory of Secret Systems by Claude Shannon

How does one define security of a cipher? This question had not been rigorously answered until the above paper in 1949 by Claude Shannon was published. Claude Shannon has often been called the father of the digital world. One of the founding fathers of this amazing age that we live in, there is hardly anything that we take for granted in the electronic age that Claude Shannon did not have a hand in designing.

Thank goodness Shannon had some spare time to devote to cryptography. His paper defines a secure cipher as not letting anyone learn anything about the plain text given just the cipher text. He thus defines a perfect cipher as the following: Given a cipher text, the probability that it resulted from any plain text is equal. Okay, that does not sound so special, but that is because I dumbed it down a bit. It actually means that when you are given some cipher text, you have absolutely no idea what plain text was used as input to produce it. Seeing as all plain texts have the same probability of producing the cipher text, it could have been any plain text - we just don't know!

Unfortunately, Shannon also proved that in order to have this perfect security, one needs a key at least as long as the message. This makes perfect security impracticable: Given a message that is a few gigabytes in size, the key for producing a perfect cipher must also be a few gigabytes large! So modern cryptography tries to relax the conditions of perfect security by defining something called semantic security. Semantic security says that no efficient algorithm should be able to determine what plaintext produced the cipher text. More importantly, by relaxing Shannon's security definition, we are able to use keys that are tiny in comparison. Because it is not perfect by Shannon's definition, an algorithm can be produced that is able to break the cipher. But as long as we make it so that the effort required is tantamount to waiting until the end of the universe, we are safe.

Semantic security defines our modern cryptographic age. To be sure, a cipher that is semantically secure can still suffer from chosen plaintext attacks and chosen ciphertext attacks, but we have a metric which we can use to measure how effective our security is. And indeed, modern cryptographic primitives like AES (Advanced Encryption Standard) are, as far as we know, semantically secure. And so we use them all the time.

New Directions In Cryptography by Whitfield Diffie and Martin Hellman

The next big thing that happened has everything to do with secret keys. A cipher such as AES is semantically secure, and so using it should be secure for two parties. But how can two parties agree upon a secret key to use with encryption without letting anyone know? In the Internet age, this problem is even more acute than one may suspect: Given two people that are separated by vast geographical distances and connected by the Internet who wish to communicate securely, how do they agree upon a secret key without anyone else being able to discover this secret key? The Internet is a very insecure medium by virtue of anyone being able to connect to it. So agreeing upon a secret key in a secure manner over an insecure medium is not a trivial problem.

Diffie-Hellman to the rescue! Their seminal paper allows us to perform the amazing feat of securely agreeing upon a secret over an insecure medium. I'm going to save the explanation of how Diffie-Hellman works for another post (it requires some math background), but suffice it to say that Diffie-Hellman makes it possible for secure communication to happen anywhere over any connection.

A Method for Obtaining Digital Signatures and Public-Key Cryptosystems by Ronald Rivest, Adi Shamir and Leonard Adleman

The last of the seminal papers was a really huge idea. Up until now, people had used the same key to both encrypt and decrypt their messages. One could think about encryption as a door with a lock: The same key used to lock the door is also used to unlock the door. The above paper changed all that!

Commonly known as RSA (taken from the first letter of the surnames of the three inventors), it ushered in the era of public key encryption, and allowed cryptography to do more than just keep secrets - it allowed cryptography to also uniquely identify a party thus making digital signatures possible.

Public key cryptography works by having two keys: A public key, and a private key. The names are very descriptive: A public key is kept public and is shared with the world, a private key is kept private and may not be shared with anyone. If one encrypts with one key, then one must use the other key to decrypt. So if two users wish to communicate securely, then one of the users just asks for the other user's public key. The message is then encrypted using the public key, but the user with the private key is the only entity able to decrypt it. So it is useless to anyone else.

Just as interesting is if something is encrypted using the private key. Only the public key can be used to decrypt. This may seem counter-intuitive at first, but think about what problem this solves. If I have your public key, and you give me something encrypted using your private key, then I am able to ensure that you are the person - and not anyone masquerading as you - who encrypted the message. This is because you are the only person with the private key. So your public key will only work with things that are encrypted with your private key, something which only you have access to. And this is the essence of a digital signature.
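
The idea can be sketched with a toy key pair. The numbers below ($n=143$, $e=7$, $d=103$, built from the primes $11$ and $13$) and the function names are illustrative assumptions of mine, and the sketch signs a bare number; real signature schemes hash and pad the message first.

```python
n, e, d = 143, 7, 103            # toy key pair; (e, n) is public, d is private

def sign(message, d, n):
    """Only the holder of the private exponent d can compute this value."""
    return pow(message, d, n)

def verify(message, signature, e, n):
    """Anyone with the public key (e, n) can check the signature."""
    return pow(signature, e, n) == message % n

message = 9
signature = sign(message, d, n)
print(verify(message, signature, e, n))        # True
print(verify(message + 1, signature, e, n))    # False, the message was altered
```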