Cryptography

The following overview of cryptography was written by @neko3 and provides a good explanation of many of the core topics in the field.

If you attended the session and/or do not need the explanation, scroll to the bottom to find links to other resources and challenges for use in today's session!

Cryptography is the underlying building block for security. It keeps our data secure while it zaps over the internet and while it rests on someone's cloud service, and it protects our privacy. But it can also be used by malicious people to hide their actions, or to deny us access to our own data (see ransomware). This is why it is really important to understand how various cryptographic schemes work and how to spot weak crypto.

Encodings

While not technically crypto schemes, data encodings are often related to cryptography. They are a means of obfuscating, or transforming, data without a secret. Think of them like a mapping. The most common encoding is base64, heavily used in websites. Encoding binary data in base64 is done by splitting it into 6-bit groups and mapping each group to one of 64 possible output characters. Groups are processed four at a time, since four 6-bit groups cover exactly three bytes (24 bits).

e.g. Hello World is 01001000 01100101 01101100 01101100 01101111 00100000 01010111 01101111 01110010 01101100 01100100 in binary.

We re-distribute the binary to be groups of 6:

010010 000110 010101 101100 011011 000110 111100 100000 010101 110110 111101 110010 011011 000110 0100

Take 4 groups, and map:

010010 maps to S
000110 maps to G
010101 maps to V
101100 maps to s
...

The resulting encoding is SGVsbG8gV29ybGQ=. The = character at the end is called padding. It exists because our grouping did not end up with a number of 6-bit groups that is a multiple of 4 (the last set of groups was 011011 000110 0100[00], where the bracketed zeros are filler bits that complete the third group). We are missing one full group, which is why we have one = padding character.
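The steps above can be sketched in Python. This is a minimal illustration, not how a library does it; the function name b64_encode_by_hand is made up for this example, and the result is compared against the standard library's base64 module.

```python
import base64

# The standard base64 alphabet: index 0..63 maps to one output character.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def b64_encode_by_hand(data: bytes) -> str:
    """Hypothetical helper that follows the manual steps from the text."""
    # Write every byte as 8 bits, giving one long bit string.
    bits = "".join(f"{byte:08b}" for byte in data)
    # Add filler zero bits so the length is a multiple of 6.
    bits += "0" * (-len(bits) % 6)
    # Map each 6-bit group to one of the 64 alphabet characters.
    chars = [ALPHABET[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6)]
    # Add '=' padding so the number of output characters is a multiple of 4.
    chars += "=" * (-len(chars) % 4)
    return "".join(chars)


encoded = b64_encode_by_hand(b"Hello World")
print(encoded)  # SGVsbG8gV29ybGQ=
print(encoded == base64.b64encode(b"Hello World").decode())  # True
```

In practice you would call base64.b64encode and base64.b64decode directly; the by-hand version only exists to show where the 6-bit groups and the = padding come from.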

Classical/Ancient/Old Ciphers

People have tried to transmit information covertly since Antiquity, especially for military purposes. Those ciphers are now very simple to understand and break, but it is interesting to look at how it all started.

The most famous cipher would be Caesar's cipher, from Roman times. It is a fairly simple cipher, where the letters of the alphabet are shifted by a fixed number of positions. It is a type of substitution cipher. For example, for a shift of 3, a becomes d, b becomes e and so on. The secret component of the cipher is the amount of shift.

e.g. Hello World becomes Khoor Zruog with a shift of 3.
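A minimal Python sketch of the shift, assuming only the ASCII letters a-z and A-Z are shifted and everything else (spaces, digits, punctuation) is left alone. The function name caesar is chosen for this example.

```python
def caesar(text: str, shift: int) -> str:
    """Shift ASCII letters by `shift` positions, wrapping around the alphabet."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            # Non-letters are copied unchanged.
            out.append(ch)
    return "".join(out)


print(caesar("Hello World", 3))   # Khoor Zruog
print(caesar("Khoor Zruog", -3))  # Hello World

# Only 26 shifts exist, so trying all of them breaks the cipher.
candidates = [caesar("Khoor Zruog", -s) for s in range(26)]
print("Hello World" in candidates)  # True
```

Decryption is just the same function with the negated shift, and the last two lines show why the cipher offers no real security: the whole key space can be tried by hand.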

The Caesar cipher is the basic idea behind more complex ciphers, such as Vigenere, whereby instead of a single fixed shift, a key is chosen and each letter of the key decides the shift for the corresponding plaintext letter, with the key repeated as needed.

e.g. Hello World encrypted with the key AFNOM becomes Hjyza Wtezp.

H shifted by 0 (A) is H
e shifted by 5 (F) is j
...
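A sketch of the Vigenere scheme in Python, assuming the convention that key letter A means a shift of 0, B a shift of 1, and so on (this is the convention that reproduces Hjyza Wtezp for the key AFNOM). It also assumes that non-letters are copied unchanged and do not consume a key letter; the function name vigenere is made up for this example.

```python
from itertools import cycle


def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Shift each letter by the matching key letter (A=0, B=1, ...)."""
    # Repeat the key's shifts for as long as the text needs them.
    shifts = cycle(ord(k) - ord("A") for k in key.upper())
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            shift = next(shifts)
            if decrypt:
                shift = -shift
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            # Spaces and punctuation pass through and use no key letter.
            out.append(ch)
    return "".join(out)


print(vigenere("Hello World", "AFNOM"))                # Hjyza Wtezp
print(vigenere("Hjyza Wtezp", "AFNOM", decrypt=True))  # Hello World
```

With a one-letter key this collapses back into a Caesar cipher, which is a quick way to see how the two are related.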

The ROT13 cipher is also a Caesar cipher descendant, whereby the letters are shifted by the value 13.

e.g. Hello World becomes Uryyb Jbeyq
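Python ships a ROT13 text transform in its standard library, so a short sketch is enough here. Because 13 is half of 26, applying the transform twice returns the original text.

```python
import codecs

encoded = codecs.encode("Hello World", "rot_13")
print(encoded)                           # Uryyb Jbeyq
print(codecs.encode(encoded, "rot_13"))  # Hello World
```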

None of these ciphers provide security nowadays, but they are fun ways of obfuscating data.

Private Key Crypto

Also known as symmetric key cryptography, these schemes use only a single private key, shared between two parties, to secure communication. If the key is leaked/stolen (i.e. an attacker knows it), all security is lost.

Symmetric key crypto is fast and does not require a lot of computational resources, which is why it is the preferred method of implementing security in embedded devices. But, because comms are compromised if the shared key is recovered, key management is a big accompanying topic.

When speaking of symmetric key crypto, two classes of ciphers are distinguished:

block ciphers -- these break down the plaintext into chunks (blocks) and encrypt each block
stream ciphers -- from a key, a pseudo-random stream of numbers (bytes) is generated; these bytes are then combined (usually XORed) with the plaintext, one byte at a time, to create the encrypted output (the ciphertext)

The basic idea is to take (the/some) plaintext, and combine it in some mathematical way with the key, to obtain the ciphertext.
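The stream cipher idea can be sketched as follows. The keystream generator here (SHA-256 over the key plus a counter) is an assumption chosen purely for illustration; it is not one of the ciphers named in this document and should not be used to protect real data. The names keystream and xor_stream are made up for this example.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: hash the key with an increasing counter (illustration only)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_stream(key: bytes, data: bytes) -> bytes:
    """XOR each data byte with the matching keystream byte."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))


key = b"shared secret"
ciphertext = xor_stream(key, b"Hello World")
print(ciphertext.hex())
print(xor_stream(key, ciphertext))  # b'Hello World'
```

XOR is its own inverse, so the same function both encrypts and decrypts as long as both sides derive the same keystream from the shared key. Reusing a keystream for two messages leaks the XOR of the two plaintexts, which is one reason real stream ciphers also take a nonce.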

(Figure: symmetric key)

Some well known block ciphers:

Some well known stream ciphers:

A5/1 (used in 2G)
RC4 (formerly used in TLS, now prohibited there)
Salsa20

Public Key Crypto

Also known as asymmetric key cryptography, these schemes use pairs of keys:

a public key, which can be widely distributed and does not need to be kept secret
a private key, which only the user should know, and must be kept secret

There is a mathematical relation between the two keys. The idea is you encrypt with someone's public key, and only they could decrypt the message, as only they should possess the private key to match the public one.
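RSA, which appears in the resources at the end, is one scheme with such a mathematical relation between the keys. The sketch below is textbook RSA with tiny, hard-coded primes, chosen here as an assumption so the numbers stay readable; real RSA uses primes hundreds of digits long plus a padding scheme. It needs Python 3.8 or later for the modular inverse via pow.

```python
# Toy parameters (assumed for illustration; far too small to be secure).
p, q = 61, 53
n = p * q                # public modulus
phi = (p - 1) * (q - 1)  # used only while generating the keys
e = 17                   # public exponent, coprime with phi
d = pow(e, -1, phi)      # private exponent: e * d == 1 (mod phi)

public_key = (e, n)
private_key = (d, n)


def encrypt(message: int, key: tuple) -> int:
    exponent, modulus = key
    return pow(message, exponent, modulus)


def decrypt(ciphertext: int, key: tuple) -> int:
    exponent, modulus = key
    return pow(ciphertext, exponent, modulus)


message = 65
ciphertext = encrypt(message, public_key)
print(ciphertext)                        # 2790
print(decrypt(ciphertext, private_key))  # 65
```

Anyone can compute the ciphertext from the public pair (e, n), but recovering d requires knowing phi, and therefore the factors p and q of n. With numbers this small, factoring n takes no effort at all, which is why key size matters.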

Public key cryptography is more computationally expensive, and it is often used to establish a secure communication channel on which symmetric keys can be exchanged.

In fact, public key cryptography is the base upon which all our secure internet communication is built (e.g. Hypertext Transfer Protocol Secure (HTTPS), which uses Transport Layer Security (TLS) to encrypt traffic).

(Figure: asymmetric key)

Some well known public key crypto algorithms:

Hashes

While symmetric and public key crypto schemes allow you to both encrypt and decrypt a message (so the operation is reversible), hashes are one-way functions that map data of an arbitrary length to a fixed-size value. Hash functions are designed so that reversing them is computationally infeasible. All you can do is guess the input and verify if the computed hash is the one you expect.
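A short sketch of the fixed-size property, using SHA-256 from Python's hashlib as an example choice of hash function. The helper name sha256_hex is made up for this example.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Inputs of very different sizes all produce 64 hex characters (256 bits).
for message in (b"", b"Hello World", b"A" * 1_000_000):
    digest = sha256_hex(message)
    print(len(message), len(digest), digest[:16] + "...")

# A one-character change in the input gives an unrelated-looking digest.
print(sha256_hex(b"Hello World") == sha256_hex(b"Hello world"))  # False
```

Checking a guess works exactly as described: hash the candidate input and compare the result with the digest you already have.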

Because hashes have a fixed length output, and the input can be of any size, many different inputs must map to the same output. This is where the collision resistance property of a hash function becomes very important. A collision, in a nutshell, means that two different inputs hash to the same output. Hash functions aim to make collisions infeasible to find, therefore hash length is an important factor.

e.g. if I had a hash function with a 1 byte output, the hash could take only one of 256 values. Finding 2 inputs that hash to the same value would be easy!
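The 1 byte example can be tried out directly. The toy hash below is an assumption made for this sketch: it simply keeps the first byte of a SHA-256 digest, so it has only 256 possible outputs. The names tiny_hash and find_collision are made up for this example.

```python
import hashlib
from itertools import count


def tiny_hash(data: bytes) -> int:
    """Toy 1-byte hash: the first byte of SHA-256 (illustration only)."""
    return hashlib.sha256(data).digest()[0]


def find_collision():
    """Hash 0, 1, 2, ... until two inputs share the same 1-byte output."""
    seen = {}
    for i in count():
        message = str(i).encode()
        value = tiny_hash(message)
        if value in seen:
            return seen[value], message, value
        seen[value] = message


first, second, value = find_collision()
print(first, second, value)
print(tiny_hash(first) == tiny_hash(second))  # True
```

With 256 possible outputs, a collision is guaranteed after at most 257 distinct inputs, and usually shows up far sooner. Each extra output bit doubles the number of possible values, which is why real hash functions use outputs of 256 bits or more.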

Hash functions are often used to protect passwords. Instead of storing or transmitting the actual password, its hash is used. Therefore, if the hash is intercepted or leaked, an attacker would have to try and bruteforce it. A common method of bruteforcing passwords is to use a dictionary attack, whereby words from a list are hashed and the output is compared to the hash you are trying to break. You can get creative with this, as you can try different ways of constructing the passwords. E.g. people may use one word and their birthday, or use 1337 speak when constructing their password.
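A minimal sketch of such a dictionary attack. Everything specific here is an assumption for illustration: the three-word list, the target password, the choice of unsalted SHA-256, and the helper names (sha256_hex, leet, candidates, dictionary_attack). Dedicated cracking tools do the same thing with far larger word lists and rule sets.

```python
import hashlib


def sha256_hex(word: str) -> str:
    return hashlib.sha256(word.encode()).hexdigest()


def leet(word: str) -> str:
    """Very small 1337-speak substitution table (assumed for this example)."""
    return word.translate(str.maketrans("aeios", "43105"))


def candidates(words, years):
    """Yield each word, simple variants of it, and each variant plus a year."""
    for word in words:
        for variant in (word, word.capitalize(), leet(word)):
            yield variant
            for year in years:
                yield f"{variant}{year}"


def dictionary_attack(target_hash, words, years):
    """Return the first candidate whose hash matches, or None."""
    for guess in candidates(words, years):
        if sha256_hex(guess) == target_hash:
            return guess
    return None


wordlist = ["password", "dragon", "monkey"]  # made-up sample list
target = sha256_hex("dr4g0n1994")            # pretend this hash was intercepted
print(dictionary_attack(target, wordlist, range(1980, 2011)))  # dr4g0n1994
```

The attack only succeeds when the password can be built from the list and the chosen rules, which is why long, random passwords hold up better against it.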

Two great tools to help in hash cracking are

You would also need a good word list, the most famous one being rockyou.

Some well known hash algorithms:

RIPEMD-160
MD5
SHA-1
SHA-2
SHA-3 (the new standard)
BLAKE3 (very new)

Session material + resources

Primers + explanations

AES
RSA (with some implementations and demos) - you might have to scroll down a bit for the theory
Some common RSA attacks

Challenges

If you like OverTheWire-style challenges, have a go at Krypton.
Another good set of challenges is CryptoHack.
If you prefer a more tutorialised, guided walk through cryptography and cryptanalysis, check out CryptoPals.
For more CTF-style challenges, check out PicoCTF.
As ever, our CTF has some crypto challenges in the bank.

Handy tools

Happy hacking :)

AFNOM Sessions