YouTube learns about integer overflow, Gangnam Style!

An integer overflow bug in YouTube!

Who would have thought?

Certainly not Google, which assumed a 32-bit number would be enough to keep track of how many times a video had been played.

To explain.

In a four-digit decimal number, the biggest value you can represent is achieved by putting the biggest digit in every position, so you max out at 9999.

Similarly, in a four-bit binary number, the biggest possible value you can store is 11112 (the little subscripted 2 means base 2, or binary), which is 15 in decimal.

Generalising, N bits can store 2N different values.

So, if you start counting at zero, you will reach 2N–1 before you exhaust all possible 2N values.

Therefore the biggest number you can fit into 32 bits is 232–1, or 4,294,967,295.

That’s one less than 4G, or approximately 4 billion.

Signed versus unsigned

Actually, the YouTube video counter uses what’s called a signed 32-bit number.

Loosely speaking, the most significant bit of the 32 is used to denote whether the number is positive or negative.

So, in 32 bits of signed number, there is one bit for the sign, and 31 bits available for counting, meaning that you can represent 232 different numbers ranging from -231 up to 231–1.

In other words, if you try to count past 231–1, the arithmetic goes haywire because the addition produces a carry, which propagates into the bit that determines the sign of the number.

→ In fact, because negative numbers are usually made not just by setting the special “is this number negative” bit but also by flipping all the others (the so-called complement system), when you add 1 to the biggest positive number, you wrap round to the biggest negative number and start going up from there. So, when you get an integer overflow, adding 1 does not give the next number in sequence.

In other words, if any YouTube video should ever rack up 2,147,483,647 views, it will have hit its limit.

If one more person watches it, the count will abruptly become negative. (Why did Google use signed numbers? How do you watch a video a negative number of times?)

Arbitrary limits considered harmful

Is a 2G limit in this context actually a problem?

Google’s 17-character restriction on Android passwords is a ridiculous limit; Microsoft’s 16 character Outlook.com passwords are even worse; but as arbitrary limits go, two billion YouTube views feels as though it ought to be more than enough.

Can you imagine how many real people would actually have to watch a video to rack up viewing figures that high?

It’s easy just to say, as Google now admits, “We never thought a video would be watched in numbers greater than a 32-bit integer.”

The problem, though, is that even really big numbers sometimes stop being big enough, no matter how excessive they may once have seemed.

Even daily human tasks can clock up surprisingly large counts: as we reported recently, the South African Navy has now fired its 300-year-old Noon Guns in Cape Town on more than 65535 days in the past two-and-a-bit centuries of unbroken noonday gunning.

Incidentally, 65535 is the biggest value you can store in a 16-bit number, and formed the infamous 64KB limit of early home computers.

But we’re now well past the 4GB limit of 32-bit addresses, and even the 64GB limit (36-bit addressing) that is possible on later-model 32-bit Intel CPUs.

Today’s commodity 64-bit CPUs can already support up to 48-bit addresses (256TB); are ready to move to 52 bits (4PB) when we need that much; and can, in theory, be expanded all the way to 64 bits (16EB).

Counting on YouTube

So you probably won’t be surprised to hear that the 2 gigaview limit on YouTube has now been surpassed.

Admittedly, our headline was a bit of a spoiler: the double-gigabusting video is Oppan Gangnam Style by Park Jae-sang, though you probably know him as Psy.

We’re not going to embed the video here, even though Google has apparently patched its counting mechanism to cope with the South Korean rapster’s enormous popularity.

We’ll just assume you’ve already watched it.