This is a repo providing solutions written in Go to the Matasano Cryptopals Challenges.
This challenge asks us to decode a hex-encoded string into a byte slice and re-encode it as base64.
This challenge asks us to xor two hex-encoded strings together. This can be accomplished by decoding the strings into byte slices and xoring them together.
Here is a tiny truth table for XOR :)
A | B | Output |
---|---|---|
0 | 0 | 0 |
0 | 1 | 1 |
1 | 0 | 1 |
1 | 1 | 0 |
This challenge asks us to decode a hex-encoded string xored with a single
character, which should look something like hello ^ XXXXX
. Since we know that
the value of the byte type can only be 0..255 we can brute force the ciphertext
against each value to derive the plaintext and key.
However, we want to return the correct plaintext, not 256 potential candidates, so we will need additional infrastructure for this challenge. For each decrypted candidate we can perform frequency analysis. We can construct a map of the frequency of each letter as it appears in English texts, keeping global state of which decrypted string has the highest score. This allows us to return only the highest scoring string rather than all potential candidates, which will prove essential in later challenges.
This challenge asks us to open a file (conveniently located at testdata/4.txt
)
and apply our logic from challenge 3. We will need to brute force each string
and return the highest scoring plaintext.
This challenge asks us to look into decrypt a ciphertext encrypted with
repeating-key XOR. We can modify our existing XOR code to instead do x[i]
⊕
y[i%len(y)]
(modulo length of y).
This challenge is definitely the hardest to understand without a background in cryptography. To make it easier we can break it into multiple sections.
Firstly, we will need to understand the concept of hamming distance at the bit level. In essence, we are counting the number of differing bits between two strings of equal length. This can be accomplished by xoring the strings together and counting the number of bits that are set to one.
As mentioned in the example, we can validate that our code works correctly by
testing it against the strings this is a test
and wokka wokka!!!
; it should
come out to 37.
The concept of hamming distance can be applied to help us solve this challenge when we understand how it interacts with English ASCII. The entire English alphabet (uppercase and lowercase) is represented in 52 out of the 256 possible values in ASCII, and it helps that the letters are conveniently located right next to each other and thus have a normalized average hamming distance far lower than legitimately random data.
A nice thing to note is that when XOR encrypted with the same key, two strings have the same hamming distance as their plaintext forms; it is only a matter of us finding the key length through brute force; the noticeably lower normalized hamming distance will be our key length :)
The example algorithm as described on the challenge page advises us to attempt
to guess the key length from 2 to 40, which entails taking the first two chunks
of the text of size KEYSIZE (the first block would be slice[:keysize]
, the
second would be slice[keysize:keysize*2]
). However, this sample size is too
small to be accurate, which is why they advise taking up to the first four
chunks of the ciphertext and finding the mean among those. In my testing, I
found this to yield the wrong key size still (although it could have just been
buggy code), but I was still determined to figure it out without magic
numbers.1
In my research, I happened upon
this site discussing this topic in detail
as well as another set of cryptopals solutions written in
Python
which prompted me to begin working on code that would iterate through the
length of the ciphertext and compare each chunk (slice[size*(i+1):size*(i+2)]
)
against the first chunk. This worked as expected, which allowed us to determine
the key size.
Armed with the key size, we attempt to split the ciphertext into chunks of
len(keysize)
. We transpose the blocks (create a slice of byte slices) by
placing the first byte of each chunk in one slice, the second byte in the
second, and so on. We can then attempt to brute force each byte slice with
scoring + single character XOR; we are then able to reconstruct the key and
ultimately solve the challenge.
This challenge asks us to decrypt a ciphertext encrypted in ECB mode. We use the
crypto/aes package, decrypting the
ciphertext 16 bytes at a time with Block.Decrypt
.
This challenge takes advantage of the fact that AES-ECB encrypts the same plaintext block into the same ciphertext (lack of diffusion). We track the encrypted blocks in a map so we can easily lookup the existence of the ciphertext and determine whether a certain string was encrypted with ECB mode.
This challenge asks us to pad a message to a specific block size with PKCS#7. In essence, this means padding to a specified uint8 length while ensuring the value of the added bytes is equivalent to the number of bytes added.
This challenge asks us to decrypt a AES-CBC encrypted ciphertext. CBC is a
confidentiality-only mode whose encryption can be formalized as
C[i] = E(P[i] ^ C[i-1])
, with decryption being P[i] = D(C[i]) ^ C[i-1]
.
This challenge asks us to construct an oracle that encrypts a given input with
CBC mode 50% of the time and ECB mode the other half of the time. The key should
be securely generated with the operating system's CSPRNG (see crypto/rand
) and
it should prepend 5-10 random bytes and append 5-10 random bytes to the
plaintext before encryption.
We will use our previous function to detect ECB mode by crafting 3 contiguous blocks and sending it to the oracle. 3 blocks ensures that no matter how many bytes are prepended/appended, we will always encrypt two blocks of identical plaintext.
https://book-of-gehn.github.io/articles/2018/06/10/Breaking-ECB.html
This challenge asks us to construct a function that will take an arbitrary input for email (the below map)
{
"email": "[email protected]",
"uid": "10",
"role": "user",
}
and encode it into "URL encoded form" like [email protected]&role=user&uid=10
.
This input is then encrypted with ECB mode and the oracle is provided to the
attacker. We aren't able to directly encode the bytes '&' and '=', so we have to
play some clever tricks with padding. We know that the block size is 16 bytes,
so we just have to separate the key=
and the literal value into separate
blocks. Using a 4 byte email allows us to get exactly this.
email=AAAA&role= user&uid=10
We can then craft a large email so that the word admin is not in the first block, and then replace the first block. The plaintext should ultimately look like the below.
email=AAAA&role= admin&role=user& uid=10
This challenge asks us to extend the previous append-only oracle in Challenge 12 by prepending a consistent, but random number of random bytes. We will need to pad the prepended bytes to a full block and then solve like Challenge 12.
For instance, if we had the setup
PPPP PPPP PPAT TACK ERCO NTRO LLED <- appended bytes go here
we should pad the prepended bytes to a full block, like so:
PPPP PPPP PPXX ATTA CKER CONT ROLL ED <- appended bytes go here
^^ padding
then we should craft an oracle that will append the number of padding bytes consistently when run and solve like 12, so we can have something like the below:
PPPP PPPP PPXX <- delete these bytes by slicing
ATTA CKER CONT ROLL ED <- appended bytes go here
We then solve like normal.
This challenge just asks us to modify our unpadding PKCS#7 function to take the last byte and verify that the added bytes are of the correct number and value.
This challenge takes advantage of the lack of authentication of CBC mode to flip a couple of bits in the ciphertext to get our desired result in the plaintext.
We should think back to our CBC implementation and remember that the result of
the AES decryption pass for each block is the plaintext xored by the previous
ciphertext block. If we can change the previous ciphertext block we can create a
block that looks like C[i] ^ P[i+1] ^ DESIRED_BYTE
. We can just craft a
plaintext block that looks like XadminXtrue
, replacing the desired bytes with
the capital X and changing the ciphertext byte in the block immediately prior so
at decryption time it will look like
C[i-1] (which is really P[i] ^ C[i-1] ^ DESIRED_BYTE) ^ P[i]
, giving us
DESIRED_BYTE
in the final decrypted plaintext.
Back from a hiatus :) I'll try to write my own AES implementation in the near future. Now, onto the challenge.
We are given the IV and a padding oracle that tells us whether some padding was
valid. If we look at CBC decryption, we see the ciphertext of the previous block
is xored with the decryption of the current block. Thus, changing a bit of the
ciphertext allows us to influence the decryption of the current block. We know
that for any given m ^ m = 0
, so if we modify the last byte, we xor our guess
m
with 0x1
; if m
is correct, then the padding will likely* be valid. Then
we modify the next one for a padding of 2, and so on. But as padding gets longer
we run into the risk of a collision, and to rectify this you could also
temporarily modify the previous one and see if the padding remains valid. I
haven't implemented that here, but it would be a trivial addition.
Footnotes
-
I looked at Filippo Valsorda's solutions @ mostly-harmless/ as well as his livestream but it turns out he just guessed the magic number; this isn't good enough for me. ↩