Thanks to Alex Bakewell of SophosLabs for his help with this article.
April 2019 was a good month for bold Belgians!
Professsional Belgian cyclist Victor Campanaerts broke the world hour record, covering an amazing, unassisted, undrafted 55km in a velodrome (55,089 metres, in fact) in 60 minutes.
The previous record, set by Sir Bradley Wiggins in 2015, had stood for nearly four years.
But professional Belgian programmer Bernard Fabrot conquered an even more durable challenge.
He cracked a computational puzzle that was set back way in 1999, by none other than Professor Ron Rivest of MIT, who’s the R in the well-known public key encryption algorithm RSA.
Fabrot’s achievement is particularly interesting because Rivest specially designed the puzzle in the hope it would take 35 years to solve, assuming you started as soon as it was published.
In the end, Fabrot required 3.5 years of computer running time, thus outpacing Rivest’s estimate by a factor of 10.
The puzzle is what’s known as a “time-lock problem” – a time-consuming calculation that can only be accelerated by tuning your algorithm or by building faster computer hardware.
Time-lock puzzles are interesting, and important, because they can’t be short-circuited simply by splitting the problem into pieces and throwing more computers at it.
Time-lock puzzles are inherently sequential, typically requiring a number of loops through an algorithm where the input to each iteration of the loop can only be acquired by reading in the output of the previous iteration.
The idea is to put everyone in the same boat: you can be the biggest, richest, most energy-slurping cloud computing company in the world, but all those servers, CPUs and CPU cores won’t let you buy your way to victory.
Most problems can be split into parts, and least some of those parts can overlap.
If you were asked to count all the elephants in Africa, for example, you could fly to Cape Town and then zigzag your way north until you reached Alexandria in Egypt, noting all the elephants as you went along.
But this is an inherently parallelisable task, because – if you ignore complexities such as elephants you’ve already counted wandering across national borders, for example – the number of elephants in, say, Zambia, can be counted at the same time as the number in neighbouring Angola.
Simply put, you can set a different person to counting in each country, without worrying about the order in which they start – and in the biggest countries, you could subdivide the task again, say by using one person in each province, and so on.
The long and winding critical path
Ron Rivest’s 1999 puzzle can’t be divvied up like our elephant counting exercise – at heart, it’s just a single tight loop with the iteration count devised to last about 35 years, based on estimates of how computing power would increase over the period.
Rivest even worked in an annual upgrade, assuming puzzle solvers would suspended calculations for a moment once a year to update their computers to the latest and fastest on the market.
Why 35 years?
The puzzle was announced to commemorate the 35th anniversary of MIT’s Laboratory for Computer Science, which opened in the early 1960s, and was supposed to take a further 35 years to work out, in time for the 70th anniversary.
In the end, it took only 20 years before the puzzle was completed, by the aforementioned Bernard Fabrot.
As we said above, he needed 3.5 years of elapsed time from start to finish, thus finishing nearly 15 years early despite starting more than 15 years late.
The algorithm itself is easy to implement.
But the hard part in completing the solution is never losing track of where you are.
There aren’t any “tricks”, other than regular, precise backups of the computational state, to let you recover if your computer crashes or an error creeps in.
Without well-managed backup, any glitch or outage means going right back to square one.
There’s also the challenge, of course, of not losing faith along the way and packing it in.
Here’s the puzzle, as Rivest set it:
The problem is to compute 2^(2^t) (mod n) for specified values of t and n. Here n is the product of two large primes, and t is chosen to set the desired level of difficulty of the puzzle.
2^t is Ron Rivest’s text notation for 2 raised to the power of t, or 2t, and the notation
mod n means to take the remainder, or modulus, that is left over after dividing the original number by
For example, 3 goes into 6 exactly twice, so
6 / 3 = 2 remainder 0; but if you divide 7 by 3, you get 2 with 1 left over, so that
7 / 3 = 2 remainder 1.
t to a value of just under 80 million million and constructed a value of n with 2048 binary digits (512 hexadecimal digits or 256 bytes), like this:
t = 79,685,186,856,218 n = 0x32052C40E056ED2C9141FC76C060FA685F60C45095EB69930CBE4B2C81B19C33 55FA9149150D7082284CC3903C12B98DACC7E2FC7C16907F8E946AEFB5FD1240 77E05D944B6738334E71A9BD37E1C08F2DF3D119EB95182B0F3E87B341A217BB 433F2114FEAE1555CFB974DA3D56D4AD7C1D83FD789F34143CDD3D502C104639 EE68DDC8D56D5BC6EAAC7ED16C1F5FF02159B5D52AF94979A18A60EFCABE109E E2E90C14B6FC1225B754644D989FC1B9F87552B255997CEE22429CF49E3599DA 4B3F6D5535B83072A1D4357AE1ABFF8455B80C438EC33A5C7C6CB1ACE22C62FE 67B3040029B3C37E5EC682363A77D42FB223E194878E146D06739EC4E598A9A1
The idea is that there is no shortcut by which you can compute the final answer, unless you know the two prime numbers that were multiplied together to calculate the value of
Ron Rivest, and now Bernard Fabrot, know the prime factors of
n, let’s call them
q, but no one else does.
(Rivest needed a shortcut for his own use so that he could work out the answer in advance in order to decide when a competitor really had solved it.)
So to solve this puzzle without
q you just have to keep on multiplying over and over and over until you get to the end of the calculation.
Each multiplication in the loop uses the previous output as its input, so you can’t split the computation up between multiple computers, CPUs or even CPU cores.
How to code it
Let’s take a stab at the problem, using Python.
In Python, the operator
** means ‘raise to the power of’ or ‘exponentiate’, and
% means ‘find the remainder after division by’ or ‘modulo’, and we’ll assume that the function
seq(n) loops through the numbers
3, all the way to
exp = 2 ** 79685186856218 val = 2 ** exp val = val % 0x32052...98A9A1
exp in line 1 is a number with nearly 20 million million decimal digits.
So you need 10 terabytes just to store that value – and yet in the next line of code we aim to raise 2 to the power of that already alarmingly large number.
If we implement the
** operation as repeated multiplication, with
2**n calculated as an n-step loop of multiplying 2 by 2 by 2 … n times, you will see just how intractable this calculation looks:
exp = 1 for i in seq(796851868562180): exp = exp * 2 val = 1 for i in seq(exp): val = val * 2 val = val % 0x32052...98A9A1
That’s 80 million million loops in the first
for loop, just to find out how many gazillion bazillion loops we have to do in the real calculation in the second
Exponentiation by squaring
Fortunately, there’s a slicker way to do the calculation.
In the looped expression
val = val * 2 we boost our exponent by 1 every time, calculating 21, 22, 23, 24, 25 and so on as the loop proceeds.
But if we keep squaring
val instead of doubling it, like this…
val = val * val
…then we double the exponent each time instead of incrementing it, so we loop through 21, 22, 24, 28, 216 and so on.
So we can calculate 2(2t) with just t loops instead of 2t loops, like this:
val = 2 for i in seq(796851868562180): val = val * val # Now find the remainder val = val % 0x32052...98A9A1
It’s too big!
This code is still no good, even though we now have 80 million million loops in total, intead of a ridiculous 280,000,000.
val just gets too large to handle as we go along.
Even doing just a few loops in line 2, you can see that the time taken for each extra iteration pretty much doubles at every step, because we’re multiplying numbers that have twice as many digits each time:
millisecs | value computed 0 | 2^(2^ 1) 0 | 2^(2^ 2) 0 | 2^(2^ 3) . . . . . . . 0 | 2^(2^16) 1 | 2^(2^17) 2 | 2^(2^18) 5 | 2^(2^19) 11 | 2^(2^20) 25 | 2^(2^21) 49 | 2^(2^22) 97 | 2^(2^23) 195 | 2^(2^24) 406 | 2^(2^25) 833 | 2^(2^26) 1690 | 2^(2^27) 3513 | 2^(2^28) 7182 | 2^(2^29) 14832 | 2^(2^30)
Fortunately, in modular arithmetic, you can either take the remainder at the end, as above, or repeatedly, at every step along the way.
Only the remainder needs to be fed back from the bottom of the loop to the next multiplication.
So, you can shift the
% operation inside the
for loop (in Python the indentation shows what loop level the code is at):
val = 2 for i in seq(796851868562180): val = val * val # Calculate the remainder each time round # inside the loop, not once at the end val = val % 0x32052...98A9A1
The remainders are all a fixed length – in this case, they can’t be larger than
n-1, and given that
n is 2048 bits long, that means each stage involves multiplying 2048 bits by 2048 bits, and then dividing the product back down to a remainder of 2048 bits.
The accumulated answer
val is a 2048-bit number at the end of each iteration and thus 2048 bits at the start of the next, so each extra turn round the loop adds the same amount of time, so now we are flying:
millisecs | value computed 0 | 2^(2^ 1) 0 | 2^(2^ 2) 0 | 2^(2^ 3) 0 | 2^(2^ 4) 0 | 2^(2^ 5) 0 | 2^(2^ 6) . . . . . . . 0 | 2^(2^27) 0 | 2^(2^28) 0 | 2^(2^29) 0 | 2^(2^30)
On my Mac, doing 1,000,000 iterations takes just over 16 seconds, so I can do about 62,500 iterations a second.
At that rate, it would take me
80,000,000,000,000 / 62,500 seconds to complete the puzzle, which is just under 15,000 days, or about 40 years.
A top-end Mac and an optimised square-and-divide program might very reasonably double both my CPU speed and the computational efficiency for a 4× overall speedup, but I’d still be looking at 10 years to work it out.
Well done, Bernard
Fabrot did it in 3.5 years, and he didn’t have today’s hardware when he started.
So it was quite a result – and an exciting victory considering that he used commodity hardware.
Following Fabrot’s announcement, a cryptocurrency startup jumped on the bandwagon by publicly claiming that it solved this puzzle in two months using specialised hardware known as a Field Programmable Gate Array (FPGA).
But even though the link on its website explicitly uses the past tense in boasting that the group ‘solved it in 2 months’, the link is a dummy that doesn’t go anywhere, and the Github account supposedly hosting the source code still says ‘Code coming soon’.
The glory goes to Fabrot, who showed that quiet competence is its own reward…
…and is also an important reminder that when cryptographers warn us that attacks only ever get faster, they’re not wrong!