Class 5: RSA
Overview
RSA is our first asymmetric key encryption algorithm, meaning the key used for encryption is different than the key used for decryption. In order to encrypt/decrypt with RSA and pull off the various attacks on this algorithm, we need to be comfortable with modular arithmetic in python first.
Modular Arithmetic
A Note on Notation
ℤ refers to the set of integers.
ℝ refers to the set of real numbers.
Quotient Remainder Theorem
For every integer a and every positive integer b, there exists unique q,r ∈ ℤ such that
a = b⋅q + r 0 ≤ r < b. So when we take a (mod b), we get r.
Arithmetic Properties
Addition of two integers (mod n)
a + b (mod n) ≡ (a (mod n) + b (mod n)) (mod n)
Multiplication of two integers (mod n)
a ⋅ b (mod n) ≡ (a (mod n) ⋅ b (mod n)) (mod n)
Subtraction (or the Additive Inverse)
When we are dealing with real numbers, and we compute a - b, we are also doing
a + (-b). But negative numbers aren’t in our range from 0 ≤ x < n, where
n is our modulus. If 0 ≤ b < n, then -b = n - b, and instead of
subtracting we can add (-b).
If we take n = 30, and we take powers of 29 (mod n), what pattern do you see?
Multiplicative Inverse, GCD, and Euler’s Totient Function
The Greatest Common Divisor (GCD)
The GCD(a, b) a, b ∈ ℤ is the largest integer that divides both a and b.
To calculate the GCD in python we can use
#this is a must have when doing a Crypto Challenge
from Crypto.Util.number import *
a = 6
b = 2
print(GCD(a, b))
> 2
The calculation of the GCD using this function is done in O(log(N)) time, where N is the size of the number (in bits). This fact is extremely powerful since RSA uses massive numbers, and we have a very finite amount of time to crack these CTF problems.
Here’s a fact about the GCD of two numbers.
For a, b ∈ ℤ we can find x,y ∈ ℤ such that
x⋅a + y⋅b = GCD(a, b)
This fact matters to calculating the multiplicative inverse of a mod n. This inverse, which we will call s, is a number such that
s⋅a ≡ 1 (mod n).
Now, suppose we have integers a and b, and the GCD(a, b) = 1.
We can then find x and y that satisfies
x⋅a + y⋅b = 1
Now, what’s the inverse of a mod b?
We take both sides of our equation mod b.
x⋅a (mod b) + y⋅b (mod b) ≡ 1 (mod b)
x⋅a (mod b) + 0 ≡ 1 (mod b)
x⋅a ≡ 1 (mod b)
So x is our inverse!
Okay, so how do we compute it in python?
from Crypto.Util.number import *
a = 7 # the number we want to find the inverse of
b = 30 # our modulus
print(inverse(7, 30))
> 13
a = 6
b = 12
print(inverse(6, 12))
> 1 # ??? Why is this the case???
The inverse only exists if GCD(a, b) == 1.
What is Euler’s Totient Function?
Euler’s Totient Function, denoted by Φ(n), is how many numbers less than n that are relatively prime to n.
What does relatively prime mean?
Two numbers are relatively prime if they share no common factors, so their GCD is 1
In RSA, we will mainly be concerned about Φ(n) where n is a prime, and the there are n - 1 integers less than n that do not share a factor with n.
So Φ(n) = n - 1
Also Φ(p⋅q) = (p - 1)⋅(q - 1), when p and q are prime.
Euler’s Theorem
For any integer a, aΦ(n) (mod n) = 1
^^^This is huge, it’s the reason that RSA works!
Modular Arithmetic Challenge: Crack the Affine Cipher
The Affine Cipher is a function that maps plaintext characters to ciphertext characters by taking a⋅x + b (mod n), where the GCD(a, n) = 1, for each plaintext character x. n is the size of the alphabet you are using. So if you are using lowercase letters as your alphabet n = 26 and a = 0, b = 1, etc.
Here’s a flag encrypted with the Affine Cipher ciphertext.txt
The RSA Algorithm
- Choose two large primes p and q, the choice of primes is important
- Calculate N = p⋅q and Φ = (p - 1)(q - 1)
- Choose an exponent e, such that GCD(e, Φ) = 1. This is usually 65537
- Calculate your decryption exponent d = e-1 (mod Φ)
- Your public key is (N, e) which you publish to the world
- d is your private key, never publish that (or any of p, q, and Φ)!
To encrypt your plaintext, p, you first convert p to an integer. Then the
ciphertext, c = pe (mod n).
To decrypt a ciphertext c, when you have d, p = cd (mod n), then convert p to a string.
Now send and decrypt over slack, with this key!
print(key.e)
>65537
print(key.n)
>119546758637783075126820852310213183956028506278962788973539167421549816563740172946126536868890831621992587470355427694792841749824799477393973846219705980373048973271453203719080512002915967988824996746065131197955540220010933957159761229668255226945159229475684131563512438308228380645250009471577548499359
print(key.p)
>9893370682145942967026953572091664340426935959057317402278437220088945657441992951731943880502868000786284031860973857263807191155029855499978170975388237
print(key.q)
>12083521630653439426834660791911602530882615669910573863030084061323641858216244992435837350517030764744721186399209888543665171595454805925781521368734107
Factorization Attacks
Moduli with Shared Factors
If we are given multiple moduli, it’s a good idea to check the GCD of the modulus, to see common factors. Since GCD is O(log(N)) complexity checking for common factors is MUCH faster than manual factoring!
Related Challenge:
from Crypto.Util.number import *
from secret import flag
modulus_list = [143786356117385195355522728814418684024129402954309769186869633376407480449846714776247533950484109173163811708549269029920405450237443197994941951104068001708682945191370596050916441792714228818475059839352105948003874426539429621408867171203559281132589926504992702401428910240117807627890055235377744541913L,
73988726804584255779346831019194873108586184186524793132656027600961771331094234332693404730437468912329694216269372797532334390363774803642809945268154324370355113538927414351037561899998734391507272602074924837440885467211134022878597523920836541794820777951492188067045604789153534513271406458984968338509L,
95666403279611361071535593067846981517930129087906362381453835849857496766736720885263927273295086034390557353492037703154353541274448884795437287235560639118986397838850340017834752502157881329960725771502503917735194236743345777337851076649842634506339513864285786698870866229339372558162315435127197444193L,
119235191922699211973494433973985286182951917872084464216722572875998345005104112625024274855529546680909781406076412741844254205002739352725207590519921992295941563460138887173402493503653397592300336588721082590464192875253265214253650991510709511154297580284525736720396804660126786258245028204861220690641L]
e = [114194L, 130478L, 122694L, 79874L]
message = bytes_to_long(flag)
ciphertext = [pow(message, e[i], modulus_list[i]) for i in range(4)]
ciphertext = [long_to_bytes(ciphertext[i]) for i in range(4)]
ciphertext = [ciphertext[i].encode("hex") for i in range(4)]
obj1 = open("ciphertext.txt",'w')
obj1.write(str(ciphertext))
Fermat’s Factorization Algorithm
Fermat’s Factorization algorithm works decently for moduli that are composed of close primes.
#gmpy2 is pretty cool for RSA
#replacing the math library functions with gmpy2 functions makes it work
#for larger numbers
import gmpy2
import math
def fermat(n):
x = math.ceil(math.sqrt(n))
y = x**2 - n
while not math.sqrt(y).is_integer():
x += 1
y = x**2 - n
return x + math.sqrt(y), x - math.sqrt(y)
Mathematical Attacks
Unknown N
You can’t crack RSA if you don’t know the modulus, or the exponent … right??
HINTS
GCD is really cool, really fast, really powerful.
Quotient remainder theorem is even cooler.
print(pt1)
>'RIP Jimmy Johns, you will be missed'
print(pt2)
>'RSAyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy'
print(pow(pt1, e, N))
> 37136243312715194902825584928406235525283657682725727003721902911865685912699457626719275624510200759546830788626751812490141990006185681291468792820791075087727683924995491234509081293131153108046492280906609660584225561342162738122131612475572515001081095045610089483419744190651486880693292444259126406227
print(pow(pt2, e, N))
> 92027579278568818255866803849226126628098931474043114856529263276899076120611074244237998936731276676799838109285541616227085388842702382089431528878948787830838701877990669907443292252823133095465973136139733409428610250070275480695447688980536659354404589985676637375480254742562779684329642271419871247070
print(pow(bytes_to_long(flag), e, N))
> 89896527191363342552285596608328113231337146238965456851235282108365490243543109572523985741155824590398146282394405103311919137006124962265604354222740776721552722681294226978405640606005849483021299844407142233753224229166231183549470003898396317815700071757576809381179354037743498756899618484854570966413
Common Modulus
I encrypted my flag twice with different encryption exponents, because it’s more secure that way… I think.
Hints
There’s a neat fact about the GCD of two numbers in the notes
Do you remember the properties of exponents?
#ct1 is encrypted with e1 and ct2 is encrypted with e2
ct1 = 86486974605436968323225664985717643937987648757532515394390546902377379148034323381259080363032530579939303607082465244569495111587800086263169077916240903627802377995094121218002515685344843494874850133287531008426469547647946934275225168919273373760375157066647910419425494640591728148959728201587576063033
ct2 = 13046866468973144131848665810111029526334655294930955471433017698268816761298194393342098962617466436503366899286273740670935308591431539568926586925720313096145172525548956205569769512971414246385086867194340769520382101558334593411134344494478733657007131885588111064288264224746453182012169946178930796791
e1 = 13
e2 = 15
N = 103109065902334620226101162008793963504256027939117020091876799039690801944735604259018655534860183205031069083254290258577291605287053538752280231959857465853228851714786887294961873006234153079187216285516823832102424110934062954272346111907571393964363630079343598511602013316604641904852018969178919051627
Challenges
- Intro to RSA
- Sign
- Small N
- Dr RSA and the Multiprimes of Madness
- Mersenne-d It
- Magnolia