You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: _posts/2021-1-09-cryptopals-companion.md
+1-189
Original file line number
Diff line number
Diff line change
@@ -4,192 +4,4 @@ title: "A companion to the Monsanto Cryptopals challenges"
4
4
author: rctcwyvrn
5
5
---
6
6
7
-
Preparations before we start
8
-
* This companion in general will not provide any direct solutions, I instead mostly give leading questions. If you need solutions, plenty of solutions and writeups are just a google away, though remember that looking at the answers at the back of the book leads to understanding less. Feel free to DM me on discord or in the `crypto` channel if you're stuck
9
-
* Almost all ctfers use python for solving challenges across basically all categories, so that’s what I would recommend
* I’ll be assuming you’ll be writing in python for this guide
15
-
* Make sure you read the homepage, everything it says is true
16
-
* You don’t need (and aren’t expected to) know any cryptography beforehand
17
-
* They do unfortunately skip out on defining some terms, but that’s what this companion is for
18
-
* You only need simple math
19
-
* The hard math is left to the professionals and we just copy their equations :)
20
-
* No math shows up until set 5 anyway
21
-
* These challenges are attacks, not puzzles with tricks. The challenge authors make an effort to make the challenges clear and "doable", instead of obfuscated like you'd see in competitions. This companion guide also helps clear up the occasional ambiguity and confusing wording
22
-
* I like to put an extra little disclaimer whenever I recommend someone take a look at cryptopals, which is that after solving these challenges most people feel a strong feeling of dread. Dread when they realize how fragile cryptography can be and how most people writing code that uses cryptography are **completely unaware** that these fragile points exist
23
-
* The general structure of the sets is
24
-
* Set 1 is preparation and “the qualifier set”
25
-
* Sets 2-4 are various attacks on symmetric ciphers and a few hashes in set 4
26
-
* Sets 5-6 are attacks on various asymmetric primitives, all using primes and numbers
27
-
* In case you’re wondering how well cryptopals will prepare you for CTFs, take a look at CSAW CTF 2020. Three challenges are basically directly taken from cryptopals and can be solved by copying over the code
28
-
* Authy is a length extension sha attack, covered in challenge 29
29
-
* modus operandi is a ciphertext detection challenge, covered in challenge 11
30
-
* adversarial is a fixed nonce CTR, covered in challenges 6 and 20
31
-
32
-
Some quick definitions
33
-
---
34
-
**Encryption**
35
-
* Encryption is turning a message (like “cryptography is cool”) and changing into a message that appears to be a garbled mess (like “falkjdior30vjs”) based on some “key”
36
-
* The method of encrypting the message and what the key is vastly depends on the algorithm, for example in rsa the key is a number, in caesar ciphers the key is a table, and in aes the key is 16 bytes
37
-
***Plaintext** is used to refer to the message before encryption and **ciphertext** is used for the garbled message after encryption
38
-
39
-
Set 1
40
-
---
41
-
Intro
42
-
* A lot of this set comes down to how comfortable you are writing code in python
43
-
* There is a little hitch in that byte representations in python changed a lot between python2 and python3, meaning stackoverflow answers and such will be outdated and might not work, feel free to bug me on discord if you’re having trouble
44
-
45
-
Challenge 1: Some useful functions and notes about python bytes
46
-
*`int(x, 16)` will parse the string x as a hex string, and return a number
47
-
* The `Crypto.Util.number.long_to_bytes` and `Crypto.Util.number.bytes_to_long` functions from pycryptodome convert between numbers and bytes
48
-
* Bytes in python are really just an array of ints between 0 and 255
49
-
*`b”hi”` is really just `[104, 105]`
50
-
* the `bytes()` constructor takes an int array and returns a bytes object
51
-
* using a for loop or a list comprehension like `[x for x in my_bytes]` will return the int array
52
-
* you can also index into bytes objects to get the integers, `b”hi”[0] == 104`
53
-
* Remember that bytes are just numbers with base 255
54
-
*`ord()` takes a single character and returns its integer value
55
-
*`chr()` does the opposite, integer to char
56
-
* the `base64` package has functions to go from bytes <-> base64 strings, namely `b64decode` and `b64encode`
57
-
58
-
Challenge 2: XOR in python
59
-
* The xor operator in python is `^` and it takes two integers and returns the result of xor
60
-
* A note about XOR in case you aren’t familiar: The operation cancels out with itself if you do it twice
61
-
* Ie: `A xor B xor B == A`
62
-
* If you aren’t sure why this is the case, try writing out the table for XOR
63
-
* This is why “encryption” and “decryption” are actually the same action when using XOR, you end up doing the exact same thing
64
-
* How do we use `^` with bytes objects? Remember that we can index into bytes objects to get the underlying integers, so try a for loop and then combining the results afterwards with `bytes()` back into a bytes object
65
-
66
-
Challenge 3
67
-
* This general idea comes up sometimes in cryptography attacks, but the idea is that we know the original must have been some english sentence, so we can bruteforce and see which one is most like an english sentence
68
-
* This idea also can be generalized to cases where we know some "structure" that the original message must be in and use that information to inform our bruteforce
69
-
* Note: Do not be lazy and do it by eye, the “scoring” function will be needed in the next challenge
70
-
* The `sorted()` function will probably come in handy
* Basically the same as challenge 3 but at a larger scale, same principle applies
75
-
* We can assume that anything that isn’t encrypted with a single character XOR will result in a garbled mess when we perform an XOR on it
76
-
77
-
Challenge 5
78
-
* Just coding, good luck!
79
-
* The modulo operator `%` will probably come in handy here
80
-
81
-
Challenge 6
82
-
* Hamming distance function tips
83
-
* To format an integer as a binary string, use `“{0:b}”.format(x)`
84
-
* Use in combination with `bytes_to_long` to convert the bytes to the bit string
85
-
* If the lengths differ, the difference in length should be added to the distance
86
-
* Solve tips
87
-
* Try to reuse as much code from earlier challenges as possible
88
-
* Transposing means taking `[[a,b,c], [1,2,3]]` and making it into `[[a,1], [b,2], [c,3]]`
89
-
* An aside
90
-
* I’ve actually used my code for this challenge quite a few times. You might be wondering what real world algorithm uses a repeating key xor and the answer is none, but many ciphers do actually use XORs. This challenge may seem like a toy challenge, but when we get to challenge 20 in set 3, you’ll see how this attack actually works **very well** on a popular block cipher mode.
91
-
92
-
Some definitions
93
-
***AES** (Advanced Encryption Standard) is a **block cipher**, it takes a block of 16 bytes and garbles it based on a 16 byte key
94
-
* However this presents a problem, what do you do if your message is longer than 16 bytes, for example a 32 byte message?
95
-
* If you’re curious about what AES does under the hood (not needed for these challenges, but fun to learn about) https://www.youtube.com/watch?v=O4xNJsjtN6E
96
-
***ECB** stands for Electronic Code Book, and is a **block cipher mode of operation**. Block cipher modes are ways of making block ciphers (which can only encrypt a single block) usable for larger messages
97
-
* ECB is the most straightforward, cut the message into blocks and encrypt each of them separately.
98
-
* This turns out to be a terrible way of doing things, and you’ll see why in the next few challenges
* You should make both encryption and decryption functions, you’ll need it in a bit
106
-
107
-
Challenge 8
108
-
* There is a hidden assumption in this challenge that I think really should have been mentioned. **The challenge assumes that the plaintext must have a repeated block**.
109
-
* The important point here is that ECB is the only cipher mode that has this property, that the same plaintext blocks will result in the same ciphertext blocks
110
-
* This weakness of ECB turns out to cause many many problems, which you’ll see in set 2
111
-
* The worst part is ECB is seen as the “default” setting for block ciphers, despite it being almost always a terrible idea
112
-
113
-
114
-
Set 2
115
-
---
116
-
Challenge 9
117
-
* Fairly straightforward, just some more programming
118
-
119
-
120
-
Challenge 10: A new block cipher mode, CBC
121
-
***CBC** or Cipher block chaining is another block cipher mode
122
-
* The goal of CBC is to not be deterministic like ECB, ie two identical plaintext blocks should encrypt to _different_ ciphertext blocks
123
-
* How does it accomplish this?
124
-
* CBC requires an extra **initialization vector** or IV for short, which is just random bytes the size of a block
125
-
* For each plaintext block we want to xor our plaintext with something first, and then encrypt it with our cipher
126
-
* For the first block we xor it with the IV before encrypting
127
-
* For every other block we xor it with the _last ciphertext block_
128
-
* Essentially “chaining” the different blocks together, so the resulting ciphertext of a block depends on the “sum” of all the previous blocks and the IV
129
-
* https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation#Cipher_block_chaining_(CBC) has some very useful diagrams
130
-
* Implementing it is fairly straightforward once you understand whats going on
131
-
* Decryption is just running this in reverse, block cipher decryption first and then xor
132
-
* Note: ECB mode with one block is the same as just running the cipher, so you can reuse that as long as you’re encrypting/decrypting only one block
* This is a bit of a weird challenge, I’m not really sure what it’s accomplishing?
138
-
* I don’t think there’s a good way to detect CBC mode, so I just detected ECB mode and then guessed CBC otherwise
139
-
* Again, it only works if you send a plaintext with repeated blocks
140
-
* Idk, maybe skip this one, it kinda comes back in challenge 12 but as an “optional” thing
141
-
142
-
Challenge 12
143
-
* Now don’t skip this one though, because it’s really cool (and also shows up in real CTFs!)
144
-
* The explanation is pretty good for how to get the first byte of the unknown-string, but the next step and automating it can be a bit tricky
145
-
* For the next byte you would first send 14 A’s + the first byte of the unknown string
146
-
* The first block would then be aes(“A”s + first byte + second byte)
147
-
* Then you want to start sending 13 A’s + first byte + guesses for the second byte
148
-
* And then like last time stop when you find a match
149
-
* The congratulations at the bottom is pretty accurate for CTFs too, every once in awhile I see another one of these pop up
150
-
*[Here’s the last one I remember](https://ctftime.org/writeup/18675)
151
-
152
-
Challenge 13: Cut and paste
153
-
* This one is fun to figure out, it’s the first time the authors really let you out and try to figure out how to do what it wants
154
-
* The title is a big hint, if we could copy paste things, what _would_ we want to cut and replace?
155
-
* Given that this is ECB mode, what exactly can we cut and replace? (bits? bytes? characters? blocks? messages?)
156
-
157
-
Challenge 14
158
-
* Some clarification on this challenge, I originally thought the random prefix would be generated each time, a new random prefix of random length each time you touch the oracle
159
-
* I honestly don’t really know how one would go about solving that, so I and everyone else who has write ups for cryptopals instead assumed that the random prefix would be generated once and be reused for all later oracle calls
160
-
* So now the trouble is really just one thing, how long is that random prefix? How can you figure that out?
161
-
162
-
Challenge 15
163
-
* More programming stuff, this is just setup for challenge 17, which is gonna be a big one
164
-
165
-
Challenge 16
166
-
* Another fun one to figure out
167
-
* Try to think about what happens in CBC decryption with the user data block and the block after it
168
-
* How can we completely replace the second ciphertext block in a way to make the third block decrypt to what we want, namely “;admin=true;”? What happens to the second plaintext block in that case?
169
-
170
-
171
-
Set 3
172
-
---
173
-
Intro
174
-
* For me, this is the set where things really started to get fun
175
-
* This note at the start made me really hyped and I hope it makes you excited too
176
-
* “We've also reached a point in the crypto challenges where all the challenges, with one possible exception, are valuable in breaking real-world crypto.”
177
-
178
-
Challenge 17
179
-
* This is a great challenge
180
-
* CBC is (unfortunately) still a relatively popular block cipher mode, and is seen as "the good alternative" to ECB
181
-
* This directly builds off of what you learned about CBC decryption in challenge 16, how bit flips in one ciphertext block affect the resulting decryption in the next block
182
-
* This shows off how useful these side channel leaks are. A **side channel leak** is when a system unintentionally reveals information about it's inner workings
183
-
* For this challenge, the "decryption server" is leaking the fact that the padding is invalid
184
-
* For later challenges you'll see that even leaking how long the code takes to run can be enough to cryptography
185
-
* Some other side channels you won't see in cryptopals:
186
-
* The amount of power the CPU is using can be used to retrieve AES keys from smart cards
187
-
* Using information about the CPU cache to break OpenSSL's carefully implemented RSA algorithm https://web.eecs.umich.edu/~genkin/cachebleed/index.html
188
-
* CBC padding oracles also show up from time to time in CTFs, though more rarely due to how famous this attack is
189
-
* Here is a good explanation of the attack: https://robertheaton.com/2013/07/29/padding-oracle-attack/
190
-
* From my experience this challenge is not necessarily conceptually difficult, but can be a bit tricky to get right
191
-
* Start with just finding one byte, then work towards automating it for the rest of the bytes in the block, and then finally for multiple blocks
192
-
* Made sure the padding functions from set 2 are fully correct
193
-
* A plaintext that is all padding should be accepted
194
-
* A plaintext with no padding should be rejected
195
-
* Aside: Anyone else feeling that dread after realizing that all it takes to break a perfectly secure algorithm like AES and a reasonably sensible mode like CBC is something as small as having two different error messages for wrong password vs bad padding? Because I definitely was
7
+
Now being maintained at [https://crypto.maplebacon.org/cryptopals/intro.html](https://crypto.maplebacon.org/cryptopals/intro.html)
0 commit comments