Introduction to Cryptography
Before we get into the the world of cryptography, I want you to close your eyes, empty your mind, and think of cryptography for a few seconds. Let me know what comes to your mind. Lines of code? Complex differential equations? Image of Alan Turing breaking an Enigma machine? Or a teenager hacking a website? Does cryptography seem mysterious and incomprehensible to you? Well, I hope you will feel more comfortable with talking about cryptography after reading this article.
This article is organized in the following manner. A brief introduction to the history of cryptography is presented, followed by detailed discussions of the building blocks of cryptography. During the journey, you will encounter some keywords such as stream ciphers, block ciphers, message integrity, public key exchange, and etc. Don't get intimidated by those words. I will use as many examples as possible to illustrate the key concepts. Number 1 rule of learning: learn from examples.
What is Cryptography and What It Does
Cryptography studies how to ensure secure communication in the presence of third parties called adversaries. Roughly speaking, cryptography is about encryption, the process of converting plain information (called plaintext) into unreadable text (called ciphertext), and decryption, the process of converting ciphertext back to plaintext. A cipher is a pair of algorithms that create the encryption and the reversing decryption.
Before modern era, cryptography mainly focused on message confidentiality (i.e. encryption) - conversion of a comprehensible form into an incomprehensible form. Even a third party intercepted the ciphertext, it knows nothing about the plaintext. In recent decades, the cryptography field has expanded to include message integrity checking, sender/receiver identity authentication, digital signature, among others.
Some classic ciphers before the computer era includes transposition ciphers, which rearrange the order of letters in a message (e.g., 'love Siberian cat' becomes 'eovl nabeiriS atc' in a trivially simple rearrangement scheme), and substitution ciphers, which systematically replace letters or groups of letters with other letters or groups of letters (e.g., 'cryptokitties' becomes'fubswrnlwwlhv' by replacing each letter in the plaintext by a letter 3 positions further down the Latin alphabet).
In the computer era, especially since the 1970s, complex and secure algorithms have been developed by scientists and practitioners. Some well known algorithms include Data Encryption Standard (DES), Advanced Encryption Standard (AES, replacing DES), RSA algorithm, and etc. Usually, these algorithm use a key to encrypt a message. There are very few algorithms that are proven to be unconditionally secure. The one-time pad (will be discussed later) is one. Many algorithms we use today are only secure if certain mathematical problems are intractable, such as the integer factorization or the discrete logarithm problems. For example, the infeasibility of factoring extremely large integers is the basis for believing that RSA is secure. As computing power increases, which increases the scope of brutal force attack, some currently secure algorithms won't be technically secure without further improvement such as increasing the key lengths. In the presence of rapid quantum-computing development, post-quantum cryptographic system designs are becoming more and more challenging.
Symmetric-Key Cryptography
In this article, we are focusing on modern era cryptography which usually involves using a key to encrypt a message. Depending on whether the same key is used for decryption, cryptography can be further divided into two categories: symmetric-key cryptography (same key is used for both encryption and decryption), public-key cryptography (a public key is used for encryption, while a different private key is used for decryption). Symmetric-key algorithms are discussed first, then the public-key ones.
A Simple Cipher Example - The One-Time Pad
As mentioned above, a cipher is a pair of algorithms that create the encryption and the reversing decryption. The encryption output (the ciphertext) is often randomized since the key used in the encryption process is often randomly generated, while the decryption output (the plaintext) should be deterministic, meaning the same key and the same ciphertext should generate a unique output.
The one-time pad cipher was invented by Vernam in 1917. For the one-time pad cipher, the message (the plaintext, e.g., n bit 0 and 1 string) has the same length as the ciphertext. And the key is a random bit string as long as the message. Before we proceed, I would like to introduce a common bit operation - xor, with the following operations: 1 xor 0 = 1, 1 xor 1 = 0, 0 xor 0 = 0. If two bits are the same, then the xor result is 0, otherwise is 1. Xor operation symbol is a plus sign enclosed by a circle. See the following example:
Assuming Bob wants to send Alice the message M which is a 8 bit string of 10011001, a random key K being 11001110, after the xor operation, the ciphertext C is 01010111. After receiving the ciphertext C, to recover the original message M, Alice just needs to do another xor operation: C xor K = M xor K xor M = M. Bingo! She recovers the original message M. Encryption and decryption are super fast using one-time pad. However, the fact that the message and the key are of the same lengths are very problematic which makes one-time pad very difficult to use in practice.
Although impractical to use, one-time pad has a very interesting feature - perfect secrecy. In Shannon's 1949 paper Communication Theory of Secrecy Systems, he rigorously studied the cipher security, and the basic idea is that if all an adversary (bad guy) can see is the ciphertext (CT), he should know nothing about the plaintext (PT). Given a cipher with encryption (E) and decryption (D) mechanisms, the message space M (all possible messages), the key space K (all possible keys), the ciphertext space C (all possible ciphertexts), he defined a cipher with perfect secrecy:
His declaration goes like this: For any two messages m0 and m1 with the same lengths, for a random key k, for any ciphertext c, if the probability of picking a random key, and encrypt m0 to c, meaning E(m0, k) = c is the same as the probability of E(m1, k) = c, then if an adversary intercepts the ciphertext c, he can't tell whether the message is m0 or m1, meaning he learns nothing about the plaintext solely from the ciphertext. It also means one-time pad is resistant again ciphertext only attack, but other attacks may be possible.
And guess what? One-time pad has perfect secrecy! And the proof is actually quite straightforward. To begin with, for any m and c:
Then the question is how many keys are there in the key space which satisfy E(m, k) = c? The answer is 1. Why? Because E(m, k) = c means c = m xor k, which means k is c xor m. So given m and c, k is unique. That means P[E(m, k) = c] is constant. In that case, P[E(m0, k) = c] = P[E(m1, k) = c]. Problem solved. Some readers might wonder if there exist ciphers with perfect secrecy but much shorter keys. The bad news is no. After proving that one-time pad has perfect secrecy, Shannon continued to prove that for ciphers to have perfect secrecy, the key must be at least as long as the message. And one-time pad is the optimal case for ciphers with perfect secrecy.
In short, the idea behind one-time pad is good, but it is quite hard to use in reality.
Stream Cipher
Now we know that one-time pad is impractical in reality. Can we make it practical with some modification? Yes. And here comes the stream cipher. The idea is to replace the original random key k with a "pseudorandom" key. To achieve that, we need another tool called pseudorandom generator (PRG). Before we proceed, let me introduce a notation here: {0,1}^n means all n-bit strings with 0 and 1. A PRG is basically a function G which transfers a bit string in the seed space {0, 1}^s into a bit string in {0, 1}^n, where n >> s (meaning n much larger than s). The output of PRG should be unpredictable and should look random. The following image shows how PRG makes one-time pad practical mathematically and visually:
Stream Cipher Real World Example 1 - RC4
I will briefly show you two real-world examples of stream ciphers. The first one relatively old but really famous and widely used before - RC4. It was designed back in 1987. I will give you a high level description and talk about some weaknesses and leave it at that. RC4 takes in variable seed sizes. Here I use 128-bit seed, which would be use as the key for the stream cipher. The first thing RC4 does is to expand the 128-bit secret key into 2048 bits, which are gonna be used as the internal state for the generator. Once it's done this expansion, it basically executes a very simple loop, where every iteration of this loop outputs one byte of output.
RC4 is actually fairly popular. It is used in the HTTPS protocol. These days, Google uses RC4 in its HTTPS. However, over the years, some weaknesses have been discovered which make RC4 vulnerable. The first weakness is the bias in initial output with the probability of second byte equal to 0 twice as larger as 1/256, which means that if you use the RC4 output to encrypt a message the second byte is likely to not be encrypted at all. By the way, there is nothing special about the second byte. It turns out the first and the third bytes are also biased. And in fact it's now recommended that if you're gonna use RC4, what you should do is ignore basically the first 256 bytes of the output and just start using the output of the generator starting from byte 257. The second attack that was discovered is that in fact if you look at a very long output of RC4 it so happens that you're more likely to get the sequence 00. If RC4 is completely random the probability of seeing adjacent two bytes being zero, zero would be exactly 1/256 squared. It turns out RC4 is a little biased and the bias is 1/256 cubed. Given the above vulnerabilities, it is usually recommended that RC4 should not be used in new systems.
Stream Cipher Real World Example 2 - Salsa20 (one of eStream ciphers)
As mentioned above, RC4 is a relatively weak cipher. Let's move on to better examples, which in particular have better pesudo-random generators (PRG) coming from what's called the eStream project. The eStream project concluded in 2008, and they qualify five different stream ciphers. All these stream ciphers are a little different from what we are used to. All these stream ciphers as normal have a seed, but in addition they also have a second input called a nonce. See the following image:
Different from the traditional PRG which takes in only a seed as the secret key, the eStream PRG takes in additional input called a nonce. A nonce is a value which never repeats itself as long as the key is the same. The bottom line here is that the pair (k, r) is never used more than once. But you can reuse the key k here, as long as the nonce is different for the same key, the pair (k, r) is still unique. So the presence of a nonce saves us the trouble of generating a new key k each time.
Out of the five stream ciphers included in the eStream project, I am only gonna introduce one of them - Salsa20. Unlike RC4 which is designed solely for software, Salsa20 is designed for both software implementations and hardware implementations. Let me explain how it works. Salsa20 takes either 128 or 256-bit keys. I will use a 128-bit (8 bytes) key k here. The nonce r happens to be 64 bit (8 bytes). There is additional input called block number i which is 64-bit (8 bytes) long. First of all, the expansion function generates four constants, whose values depending on the size of the secret key (128 bit here). Then the expansion function combines generated four constants, the key, the nonce, and the block number, forming a 64-byte data block. The block looks like the following:
Each time the expansion function is called, the block number will increase by 1.
The core of the Salsa20 algorithm is a invertible hash function h which takes a 64-byte data block as input and output another 64-byte data block. The hash function does this 20 times (where 20 in Salsa20 comes from). The 64-byte output after 20 times is added with the original 64-byte input, and that gives the final result. The addition of the two 64-byte output is done in a unique way where the addition is performed word ( a word is 4 byte long) by word. The sum of two words a and b is equal to (a+b) mod 2^32. The result is a valid 4-byte long word. The work flow of Salsa20 algorithm is shown as follows:
The whole process can be denoted as big H(k, r, i). As the counter i increases, more and more 64-byte data blocks are generated and concatenated, as long as we need it to be. The Salsa20 stream cipher can be described by the following graph:
This paragraph is about crytoanalysis of the stream ciphers. Feel free to skip it if you are not interested. Since you have grasped the main idea of stream ciphers, let's discuss the security of stream ciphers. As mentioned above, in order for a cipher to have perfect secrecy, a key must be at least as long as the message. For the stream ciphers, the key usually much shorter than the message, which means stream ciphers don't have perfect secrecy. But that is not the end of the world. Instead, we focuses on another security measure which makes stream ciphers practical - semantic security. The detailed definition is quite abstract and obscure. And I will not bother you with that. I will give you a high-level description and leave the details to curious readers. To say an encryption algorithm is semantically secure, we must ensure that for all explicit messages m0, m1, the adversaries can't distinguish the distributions of the encryptions of m0 and m1. To make it more clear, if the adversaries know what m0 and m1 are, by looking at the encryptions of m0 and m1, they should have no idea which encryption comes from which. Not surprisingly, stream ciphers derived from secure PRGs are semantically secure.
Block Cipher
Now that we know how stream ciphers work, let's move on to a more powerful cipher - block cipher. A block cipher is made up of two algorithms, encryption algorithm E and decryption algorithm D. Unlike a stream cipher which can take arbitrarily long inputs, a block cipher takes an N bit plain text as input, and it outputs exactly the same number of bits as outputs. So it maps N bits on inputs to exactly N bits of outputs. There are two canonical examples of block ciphers. The first one is called DES. In DES the block size, namely the number of input bits, is 64. So DES will map 64-bit blocks to 64-bit blocks and it does it using a key that's 56 bits long. Another block cipher, which is more recent, is called AES. AES has slightly different parameters, with the block size being 128 bits.So, AES will map a 128 bits of input to 128 bits of output. And it actually has three possible sizes of keys. The summary of DES and AES are shown below:
Block ciphers are typically built by iteration. They take in as input a key k, for example in the case of AES the key could be 128 bits long, and the first thing that happens to the key is that it gets expanded into a sequence of keys k1 to kn called round keys. Now, the way the cipher uses these round keys is by iteratively encrypting the message again and again using what's called a round function. So here we have this function R that take two inputs. It takes as input the round key, and it takes as input the current state of the message. So here we have our input message. Say, for AES, the message would be 128 bits exactly, because each block in AES is exactly 128 bits. And then the first thing that happens is we apply the first round function using the key k1 to the message. And we get some new message out, as a result. Then we take this message m1, we apply the next round function to it using a different key, using the key k2. Then we get the next round message out. And so on and so forth until all the rounds have been applied and then the final output is actually the result of the cipher. Different ciphers have different number of rounds, as well as different round functions. For example, for DES the number of rounds is 16, while for AES the number of rounds is 10.
To be continued in a separate article:
Message Integrity Code (MAC)
Authenticated Encryption
Public-Key Cryptography
Follow me on Twitter: @pangdufao
Infrastructure (we're hiring!)
7 年This is really useful for understanding the research and science behind block chain