Around 1990, the World Wide Web as we know it was born. From the beginning, it has used the HyperText Transfer Protocol (HTTP) to move information around the world. That’s why web addresses begin with “http”.
Plain old HTTP is not secure because it transports information in plain text. This means that anyone who intercepts the traffic can read it. That includes not only the hacker who’s monitoring the coffee shop’s WiFi, but your internet service provider (ISP) as well, much as a switchboard operator can listen in on phone calls.
But people soon decided they wanted to use the internet for sensitive data (like credit card numbers), so we had to figure out a way to make HTTP secure so that no one could see your credit card number as it zoomed between your browser and the web server.
So in 1994, Netscape Communications enhanced HTTP with some encryption. Essentially, they married a new encryption protocol named Secure Sockets Layer (SSL) to the original HTTP. This became known as “HTTP over SSL” or “HTTP Secure”, otherwise known as HTTPS.
Today, more than 50% of all websites use HTTPS. That number has grown rapidly in the years since Edward Snowden’s 2013 disclosures about large-scale NSA surveillance of internet traffic.
The widely stated goal is to move the entire web to HTTPS, so that all website traffic is encrypted by default.
How HTTPS Works
HTTPS keeps your stuff secret by encrypting it as it moves between your browser and the website’s server. This ensures that anyone listening in on the conversation can’t read anything. This could include your ISP, a hacker, snooping governments, or anyone else who manages to position themselves between you and the web server.
For a long time, SSL was the standard protocol used by HTTPS. Its successor is called Transport Layer Security (TLS). TLS has replaced SSL, whose versions are now deprecated, but the two do the same job and the names are often used interchangeably. I’ll refer to it from now on as SSL/TLS, but technically I’m talking about the newer TLS.
Essentially, you need three things to encrypt data:
- The data you want to encrypt
- A unique encryption key (just a long string of random text)
- An encryption algorithm (a math function that “garbles” the data)
You plug the data and the key into the algorithm and what comes out the other side is ciphertext: the encrypted form of your data, which looks like gibberish.
To decrypt the ciphertext on the other end, you run the process in reverse with the same key, which restores the original form of the data. It’s the secrecy of the encryption key that makes the whole process work. Only the intended recipients of the data should have it, or else the purpose is defeated.
When you use the same encryption key on both ends it’s called symmetric encryption. This is what your home WiFi uses. You have just one key, or “password”, which you plug into both your wireless router and your laptop. Easy peasy.
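To make the three ingredients concrete, here is a minimal sketch in Python using only the standard library. It is a toy cipher for illustration, not something to protect real data with: the names `keystream` and `xor_cipher` are made up for this example, and real systems use vetted algorithms such as AES.

```python
import hashlib
import secrets


def keystream(key: bytes, length: int) -> bytes:
    """Stretch the key into `length` pseudo-random bytes (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """The 'algorithm': XOR the data with a keystream derived from the key.

    XOR is its own inverse, so the same function encrypts and decrypts.
    """
    stream = keystream(key, len(data))
    return bytes(d ^ s for d, s in zip(data, stream))


key = secrets.token_bytes(32)                    # the secret key (random bytes)
plaintext = b"card number 4111 1111 1111 1111"   # the data (made-up example)

ciphertext = xor_cipher(plaintext, key)   # gibberish to anyone without the key
recovered = xor_cipher(ciphertext, key)   # the same key reverses the process
```

Because both sides use the same `key`, this is symmetric encryption. Anyone who learns the key can decrypt everything, which is why keeping it secret matters so much.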
How an SSL connection is established
An SSL connection between a client and server is set up by a handshake, the goals of which are:
- To satisfy the client that it is talking to the right server (and optionally vice versa)
- For the parties to have agreed on a “cipher suite”, which includes which encryption algorithm they will use to exchange data
- For the parties to have agreed on any necessary keys for this algorithm
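You can watch the outcome of these three goals from Python’s standard `ssl` module. The sketch below is one way to do it, not the only one: `describe_connection` is a helper name invented for this example, and calling it needs network access to a host of your choosing.

```python
import socket
import ssl


def describe_connection(host: str, port: int = 443) -> dict:
    """Run a handshake with `host` and report what the two sides agreed on."""
    context = ssl.create_default_context()  # verifies the server certificate
    with socket.create_connection((host, port), timeout=10) as sock:
        # wrap_socket performs the handshake; it raises an ssl.SSLError
        # (e.g. SSLCertVerificationError) if the server cannot prove
        # that it is really `host`.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return {
                "protocol": tls.version(),          # agreed protocol version
                "cipher_suite": tls.cipher()[0],    # agreed cipher suite name
                "certificate_subject": tls.getpeercert().get("subject"),
            }


# No network needed for this part: the cipher suites this client is
# willing to offer to a server during the handshake.
context = ssl.create_default_context()
offered = [c["name"] for c in context.get_ciphers()]
```

For example, `describe_connection("example.org")` returns a dictionary with the negotiated protocol version, the cipher suite, and the subject of the certificate the server presented.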
Once the connection is established, both parties can use the agreed algorithm and keys to securely send messages to each other. We will break the handshake up into 3 main phases – Hello, Certificate Exchange and Key Exchange.
- Hello – The handshake begins with the client sending a ClientHello message. This contains all the information the server needs in order to connect to the client via SSL, including the various cipher suites and maximum SSL version that it supports. The server responds with a ServerHello, which contains similar information required by the client, including a decision based on the client’s preferences about which cipher suite and version of SSL will be used.
- Certificate Exchange – Now that contact has been established, the server has to prove its identity to the client. This is achieved using its SSL certificate, which is a little like its passport. An SSL certificate contains various pieces of data, including the name of the owner, the property (e.g. domain) it is attached to, the certificate’s public key, the digital signature and the certificate’s validity dates. The client checks that it either implicitly trusts the certificate, or that the certificate is verified and trusted by one of several Certificate Authorities (CAs) that the client also implicitly trusts. Much more about this shortly. Note that the server is also allowed to require a certificate to prove the client’s identity, but this typically only happens in very sensitive applications.
- Key Exchange – The encryption of the actual message data exchanged by the client and server will be done using a symmetric algorithm, the exact nature of which was already agreed during the Hello phase. A symmetric algorithm uses a single key for both encryption and decryption, in contrast to asymmetric algorithms that require a public/private key pair. Both parties need to agree on this single, symmetric key, a process that is accomplished securely using asymmetric encryption and the server’s public/private keys.
The client generates a random key to be used for the main, symmetric algorithm. It encrypts it using an algorithm also agreed upon during the Hello phase, and the server’s public key (found on its SSL certificate). It sends this encrypted key to the server, where it is decrypted using the server’s private key, and the interesting parts of the handshake are complete. (This is the classic RSA-style key exchange. Newer versions of TLS agree on the key with a Diffie-Hellman exchange instead, but the goal is the same.) The parties are sufficiently happy that they are talking to the right person, and have secretly agreed on a key to symmetrically encrypt the data that they are about to send each other. HTTP requests and responses can now be sent by forming a plaintext message and then encrypting and sending it. The other party is the only one who knows how to decrypt this message, so man-in-the-middle attackers are unable to read any requests that they intercept, or to modify them without being detected.
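The key exchange described above can be sketched with a few lines of arithmetic. This is textbook RSA with a deliberately tiny key pair and no padding, so it only illustrates the idea and is not secure. The primes and variable names are choices made for this example, and a real server would have a key pair generated by a cryptographic library.

```python
import secrets

# --- Server: a toy public/private key pair (far too small for real use) ---
p = 2**61 - 1                        # two known primes, chosen for the example
q = 2**89 - 1
n = p * q                            # public modulus
e = 65537                            # public exponent
d = pow(e, -1, (p - 1) * (q - 1))    # private exponent (needs Python 3.8+)

# --- Client: pick a random 128-bit symmetric key ---
session_key = secrets.token_bytes(16)

# Client encrypts the key with the server's PUBLIC key (n, e) and sends it.
encrypted_key = pow(int.from_bytes(session_key, "big"), e, n)

# --- Server: recover the key with its PRIVATE exponent d ---
decrypted_key = pow(encrypted_key, d, n).to_bytes(16, "big")

# Both sides now hold the same symmetric key; an eavesdropper only ever
# saw encrypted_key, which is useless without d.
```

After this step, `session_key` on the client and `decrypted_key` on the server are identical, and the rest of the conversation can use fast symmetric encryption.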