We know that the problem of sending information securely, without it being read by others, has been with us for a long time. Herodotus tells us about incidents in the 5th c. BC when Persia was at war with the Greeks. Two techniques were mentioned. One was writing the message on a writing tablet, then adding a wax layer on top to hide it. Since writing tablets normally had a wax layer, the tablet looked OK, and the message got through. This is really more of an example of *steganography*, which comes from the Greek *steganos* (covered) and *graphein* (to write). Steganography is hiding a message in such a way that the observer does not know there is a message at all. Later examples include microdots (minute film hidden in the period of a sentence), and in the digital age, hiding a message in the data of a picture like a JPEG.

The problem is that once the observer knows about it, it is easy to defeat the secrecy and grab the message. WWII intelligence agencies learned all about microdots and how to find them, and once you know where to look there is no secrecy at all. What you want is a way to stop someone from reading your message even if they physically have it in their possession, and that is known as encryption, from the Greek *kryptos* (hidden). Encryption uses a cipher to turn your message from one that can be read by anyone into a message that should, ideally, be unreadable to anyone who does not know how to *decrypt* it. An early example was written of in Julius Caesar’s *Gallic Wars*, and is therefore known as a *Caesar cipher*. This cipher moved each letter of the alphabet a fixed number of spaces. So if you moved everything one letter, “HAL” becomes “IBM”. ROT13, which moves each letter 13 places, is a common Caesar cipher. This is of course very easy to decrypt, since there are only 25 possible shifts to test once you know the method. To make a more secure system of encryption, people next moved to a more random and less systematic method, creating what we call *substitution ciphers*. Here there is no pattern for how the letters are substituted for each other. In the U.S. we see these often in newspapers as “brain teaser” puzzles, and they are not too hard. The Arab scholar Al-Kindi showed the way in the 9th century by showing that language is subject to statistical analysis. In English, for example, the most common letter is “e”, the second most common letter is “t”, and so on. The top of this list is “e,t,a,o,i,n,s,h,r,d,l,u”. So you take the enciphered text, look for the most common letter, assume it is “e”, and you are off to the races.
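To make the Caesar cipher concrete, here is a minimal Python sketch. The function names (`caesar_shift`, `brute_force`, `letter_counts`) are made up for this illustration, and it assumes a plain 26-letter uppercase alphabet. It shows the shift itself, why trying every shift is trivial, and the letter counting that frequency analysis starts from.

```python
import string
from collections import Counter

ALPHABET = string.ascii_uppercase  # assumption: plain 26-letter English alphabet


def caesar_shift(text: str, shift: int) -> str:
    """Shift each letter A-Z by `shift` places, wrapping around; other characters pass through."""
    result = []
    for ch in text.upper():
        if ch in ALPHABET:
            result.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            result.append(ch)
    return "".join(result)


def brute_force(ciphertext: str) -> list[str]:
    """Return all 26 candidates; entry k is the ciphertext with a shift of k undone."""
    return [caesar_shift(ciphertext, -k) for k in range(26)]


def letter_counts(ciphertext: str) -> list[tuple[str, int]]:
    """Count letters, most common first: the starting point for frequency analysis."""
    return Counter(ch for ch in ciphertext.upper() if ch in ALPHABET).most_common()


print(caesar_shift("HAL", 1))     # IBM
print(caesar_shift("HELLO", 13))  # URYYB (ROT13)
print(brute_force("IBM")[1])      # HAL
```

Because 13 is half of 26, applying ROT13 twice gets you back to the original text, which is why the same function both encrypts and decrypts in that one case.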

The next step was taken by the Italian *Bellaso*, and later re-discovered by the Frenchman *Vigenere* who now gets all of the credit, so it is called the Vigenere square. (Sic transit gloria mundi, poor Bellaso). This uses a key word or phrase to essentially change the substitution cipher for each letter, which initially was very hard to break, but Charles Babbage (yes, the same Babbage of Difference Engine fame) showed that even this could be defeated by statistical analysis. But then Joseph Mauborgne showed that you could make a completely secure cipher using a one-time pad. This is a pad on which each sheet has a completely random key for creating your Vigenere square. You make two copies, one for encoding, and a duplicate for decoding. Done properly, there is no known way to defeat this type of encryption, but there are problems. First, you have to create all of these pads and ship them to all of the people who need to communicate with you. Second, if even one of these pads is ever intercepted in any way, you no longer have any security. Third, it is very laborious, particularly if you need to send a lot of messages. For these reasons no nation has ever adopted one-time pads for the bulk of its security needs.
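The key-word idea can be sketched in a few lines of Python. This is only an illustration under some assumptions: the names `vigenere` and `random_pad` are invented here, and the message and key are assumed to contain only the letters A to Z. Each key letter picks a different Caesar shift for the letter it lines up with. A one-time pad, in this picture, is the same operation with a random key as long as the message that is never reused.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase


def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Shift each letter by the matching key letter (A=0, B=1, ...), repeating the key as needed.

    Assumes text and key contain only the letters A-Z (no spaces or punctuation).
    """
    sign = -1 if decrypt else 1
    key = key.upper()
    out = []
    for i, ch in enumerate(text.upper()):
        shift = ALPHABET.index(key[i % len(key)])
        out.append(ALPHABET[(ALPHABET.index(ch) + sign * shift) % 26])
    return "".join(out)


def random_pad(length: int) -> str:
    """Make a random key of the given length, standing in for one sheet of a one-time pad."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


print(vigenere("ATTACKATDAWN", "LEMON"))                # LXFOPVEFRNHR
print(vigenere("LXFOPVEFRNHR", "LEMON", decrypt=True))  # ATTACKATDAWN

pad = random_pad(len("ATTACKATDAWN"))  # key as long as the message, used once
ciphertext = vigenere("ATTACKATDAWN", pad)
print(vigenere(ciphertext, pad, decrypt=True))          # ATTACKATDAWN
```

The short repeating key "LEMON" is what leaves a statistical pattern for someone like Babbage to find. With a random pad as long as the message, nothing repeats.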

The next step involves mechanical systems of encryption. The first ones were just simple pairs of disks with different diameters. You could rotate one disk to line up the A with a different letter on the second disk, and then begin encrypting. An example known to old timers in the U.S. is the Captain Midnight Secret Decoder Ring. If you think about it, this is just a simple Caesar cipher, although more efficient than doing it all with pencil and paper. But just after WWI, a German inventor named Arthur Scherbius took the basic idea and solved a lot of the problems to create the Enigma machine. This machine changed the settings after each letter was encrypted, making it all a lot more complicated and hence more secure. The German government adopted this, and believed it to be completely unbreakable. But in fact Polish cryptanalysts figured out how to crack the encryption and passed their results on to Britain and France, and Britain created a mammoth operation at Bletchley Park that decrypted German messages all through the war. While there was sloppiness in the German implementation, even if this had been eliminated the Allies still could have decrypted the messages (though with more difficulty), because a mechanical system like the Enigma machine has a built-in flaw: no mechanical system can be truly random, and if it isn’t random, there will be a crack in the wall that a skillful cryptanalyst can exploit. The Poles, and then the British, realized that the key lay in mathematics, and recruited a large number of mathematicians to work on the cryptanalysis of these messages.

While the Enigma machine was the main one used by the Nazis, there was an even more secure encryption called the *Lorenz cipher*, and to decrypt these messages the British created Colossus, often described as the first modern computer, beating ENIAC by several years. Colossus could attempt to find the key by checking many possible combinations very quickly. This was the beginning of computerized decryption, and shortly thereafter computerized encryption was also attempted by several people. But this faced very active opposition by the NSA in the U.S., which after WWII was the dominant country in both computers and cryptanalysis. And this is an important point. If the NSA could simply throw computing power at any encryption and break it, they would never have behaved the way they did, and still do to this day. It is the very fact that they cannot do so that leads them to weaken the standards and oppose research.

By the 1960s it was clear that computers could create encryption schemes that could not be broken so long as the users did not make a mistake. But the big problem was distributing the keys. The key used to create the cipher is essential, and getting it to the people who need to use it without anyone else getting it is a big problem. Whitfield Diffie and Martin Hellman, working with Ralph Merkle, created what Hellman has suggested should be called the Diffie-Hellman-Merkle key exchange algorithm, which showed that it was possible to securely exchange keys even through a public medium. Diffie later had the insight that the key could be asymmetric, meaning that the key used to encrypt the message could be different from the key used to decrypt the message. This would enable Alice to encrypt a message and send it to Bob (in discussion of crypto it is always Alice and Bob who are communicating; see Wikipedia) using Bob’s public encrypting key, and Bob could then decrypt it using his private decrypting key which only he knows. Diffie thought this was theoretically possible, and then a team at MIT actually found a mathematical function to do this. The team was Ronald Rivest, Adi Shamir, and Leonard Adleman, and by their initials this became known as RSA encryption, and it is still basically the standard in use today. The way it works, without going into extremely deep mathematics, is by using a one-way function, which is a mathematical function that can operate on a number, but when you get the result there is no practical way to go back and see what the initial number was. So using a public key with a one-way function, Alice can post this key on a public site, print it in a newspaper, put it on handbills and tack it up all over town, or whatever. Anyone can use it to encrypt a message to her, but this key can never decrypt the message. Only her private key can decrypt.
These two keys are generated together as a *key pair*, based on taking two very large prime numbers, a dash of randomness, and some interesting mathematics. If you really want to look at the math, start with the Wikipedia page for the RSA Algorithm.
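For a feel of how a key pair fits together, here is a toy Python sketch of textbook RSA. It uses two tiny primes (61 and 53, chosen only so the numbers stay readable) and the common small exponent 17. These values and the function names are assumptions for illustration. Real RSA uses primes hundreds of digits long plus a padding scheme, so treat this as a sketch of the idea and nothing more. It needs Python 3.8 or later for the modular inverse form of `pow`.

```python
# Toy key generation: real keys use very large random primes.
p, q = 61, 53              # two (tiny) primes, kept secret
n = p * q                  # 3233, the modulus, shared by both keys
phi = (p - 1) * (q - 1)    # 3120, only computable if you know p and q
e = 17                     # public exponent, must share no factor with phi
d = pow(e, -1, phi)        # private exponent: e * d leaves remainder 1 mod phi

public_key = (e, n)        # safe to publish anywhere
private_key = (d, n)       # never leaves the owner's hands


def encrypt(message_number: int, key: tuple[int, int]) -> int:
    """Encrypt a number smaller than n with the public key."""
    exponent, modulus = key
    return pow(message_number, exponent, modulus)


def decrypt(cipher_number: int, key: tuple[int, int]) -> int:
    """Recover the original number with the private key."""
    exponent, modulus = key
    return pow(cipher_number, exponent, modulus)


message = 65
cipher = encrypt(message, public_key)
print(cipher)
print(decrypt(cipher, private_key))  # 65
```

Anyone holding `public_key` can run `encrypt`, but undoing it requires `d`, and finding `d` from `e` and `n` means factoring `n` back into `p` and `q`. That is easy for 3233 and believed to be impractical when the primes are large enough.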

So the key to modern encryption is that it is an example of applied mathematics. Every message you write can be encoded using ASCII or some similar encoding scheme into a series of binary digits (zeros and ones). So that means that any message is equivalent to a number, and any number can be operated on using mathematics. And using mathematics we can determine just how secure it is, and that is why we can have confidence that encryption can be made secure even from government decryption. They may threaten you with jail if you don’t reveal the key (in civilized countries), or even threaten you and your family with torture (in totalitarian dictatorships), but they cannot break the encryption if you don’t help them at some point.
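The claim that any message is equivalent to a number is easy to check for yourself. This small Python sketch (the function names are just for illustration, and it assumes plain ASCII text) turns a message into one integer and back again.

```python
def message_to_int(message: str) -> int:
    """Read the ASCII bytes of the message as one big base-256 number."""
    return int.from_bytes(message.encode("ascii"), byteorder="big")


def int_to_message(number: int) -> str:
    """Turn the number back into bytes and decode them as ASCII text."""
    length = (number.bit_length() + 7) // 8  # bytes needed to hold the number
    return number.to_bytes(length, byteorder="big").decode("ascii")


print(message_to_int("Hi"))   # 18537, i.e. 72 * 256 + 105
print(bin(message_to_int("Hi")))
print(int_to_message(18537))  # Hi
```

One caveat of this simple scheme is that a message starting with a zero byte would lose that byte on the way back, so real systems record the length or add padding.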

Again, the bottom line that everyone needs to understand is that if you use this properly, it cannot be decrypted using *brute force* in any reasonable time. It is not hard to encrypt data using a key strong enough that it would take every computer known in the entire world a billion years working day and night to crack the cipher and decrypt the message. And the NSA knows this, which is why they tried very hard to stop this technology getting out, and the U.S. government even put Phil Zimmermann, author of PGP, under criminal investigation for “exporting munitions” when his code got out of the U.S. (BTW, the case was dropped and he was never charged). And to this day, the NSA rarely tries to brute force any encrypted data, since it is hopeless. What they try to do is get the keys (often by legal compulsion), or find a way to weaken the keys, as they are widely reported to have done with Dual_EC_DRBG, a random number generator based on elliptic curves.
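You can check the brute force arithmetic yourself with a few lines of Python. The guessing rate below (a billion billion keys per second) is a made-up, deliberately generous assumption and not a measurement of any real system, and the function name is invented for this sketch. It assumes the attacker finds the key after searching half of the possible keys on average.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_brute_force(key_bits: int, guesses_per_second: float) -> float:
    """Average years to find a key by trying half of all possible keys."""
    keyspace = 2 ** key_bits
    return (keyspace / 2) / guesses_per_second / SECONDS_PER_YEAR


RATE = 1e18  # hypothetical: a billion billion guesses per second

for bits in (56, 128, 256):
    print(bits, "bit key:", years_to_brute_force(bits, RATE), "years")
```

At that rate a 56-bit key falls in a fraction of a second, while a 128-bit key takes on the order of trillions of years, and every extra bit doubles the work. That is why the practical attacks go after the keys and not the mathematics.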

*Listen to the audio version of this post on Hacker Public Radio!*