12 Days VI: http://www.johntitor.com

「Steins;Gate: Rubble and the Assistant」/「huke」

Even though the track record for anime adaptations of Nitro+ games has been abysmal, I couldn’t help but hope that Steins;Gate would turn out well. After all, it’s the best visual novel ever or something according to Japan. And if it didn’t pan out, I guess I’d just fall back on the visual novel with huke’s pretty art.

And as I was watching it, I thought it was pretty funny and all. I was still trying to figure out why Japan loved this thing so much. It wasn’t until someone mentioned something about 2ch jokes that I think I got it.

So I’d known that Steins;Gate was about time travel, based on the synopsis. What I didn’t know was that the story was basically built around the John Titor story. And the connection isn’t superficial either. It isn’t just that there’s a guy calling himself John Titor posting from the future on 2ch. A lot of details, like the IBM 5100 and the mechanics of time travel and divergence numbers, are taken straight from the story. You can think of Steins;Gate as one giant John Titor reference.

This is fascinating to me, because even though I’m not intimately familiar with the details of the John Titor story, I’ve been on the Internet long enough to remember this whole thing. It’s kind of weird to see something that was sort of the early-web version of an internet meme be used as the skeleton of a more fleshed-out story. I guess something like Densha Otoko would qualify as well, but I think there’s a lot less you can do with it to make something unique like Steins;Gate.

This whole aspect of Steins;Gate got me because I am a nerd and nerds friggin love references. I mean, look at any Shaft anime. But yes, Steins;Gate has tons of references, and where there aren’t any, it just creates its own memes. For the most part, these references are internet memes. If it’s not part of the John Titor mythos, they’ll be working in a 2ch meme. And then there’s stuff like EL PSY CONGROO and tutturu~, which seem like they were designed to be spammed across the internet.

The characters themselves are exactly the same kinds of people. After all, they encounter John Titor on 2ch. What kind of people read and post on 2ch? Daru and Okarin are really obvious otaku. But Mayuri seems relatively normal, except that she works at a maid cafe in Akiba and buys doujin at Toranoana. Okay. Even the most respectable and well-adjusted of the cast, Kurisu-TINA, is a closet VIPPER.

So I guess what’s special about Steins;Gate isn’t just that it’s a good story about a bunch of friends who fall into some time travel conspiracy. It’s that it’s a good story about a bunch of friends, just like the people watching it, who fall into a time travel conspiracy based on an internet urban legend.

The Future Gadget Lab vs. Kolmogorov complexity

「Steins;Gate」/「Isaribi」

In Steins;Gate 12, the Future Gadget Lab is faced with a problem. They need to send a copy of a person’s memory, or about 3.24 TB, to the past, but they can only send 36 bytes. What is a MADDO SCIENTISTO to do? What they end up doing is sending that data to the Large Hadron Collider to compress with a black hole. Problem solved! Or is it?

Obviously, using a black hole to compress data is silly, since a black hole would probably just destroy the data. In fact, it’s probably unnecessary, since if it were physically possible to achieve that kind of compression, you’d be able to do it on any computer because of the Church-Turing thesis. The only thing that would change is how quickly the algorithm would run. Luckily, in theoretical computer science, when dealing with the question of whether something is possible or not, we don’t care how long it’ll take. So, is there a compression algorithm that can give us the desired result?

Basically, what we’re doing when we’re compressing stuff is trying to rewrite a bunch of zeroes and ones, or strings, into a smaller bunch of zeroes and ones that mean the same thing. In general, this is impossible to do, since there are fewer strings of zeroes and ones when you have fewer digits, so you can’t arbitrarily stuff, say, eight bits of information into seven bits. What you can do is create a way to describe the zeroes and ones that is smaller than what you started out with.
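Just to make the counting concrete, here’s a quick Python sketch of the pigeonhole argument (a toy example of my own, nothing from the show):

# There are 2**n strings of exactly n bits, but only 2**n - 1 binary
# descriptions shorter than n bits (1 + 2 + 4 + ... + 2**(n-1)), so any
# lossless scheme has to leave at least one n-bit string uncompressed.
n = 8
strings_of_length_n = 2 ** n
shorter_descriptions = sum(2 ** i for i in range(n))
print(strings_of_length_n, shorter_descriptions)  # 256 255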

For instance, we can describe strings of zeroes and ones in plain English. It’s shorter to write “one hundred zeroes followed by one hundred ones” than to write

00000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000
11111111111111111111111111111111111111111111111111
11111111111111111111111111111111111111111111111111
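If you wanted to mechanize the plain-English trick, run-length encoding is the obvious analogue. A minimal sketch, with an rle helper I made up for illustration:

from itertools import groupby

def rle(bits):
    # Describe a bit string by its runs, e.g. "100x0,100x1" for
    # one hundred zeroes followed by one hundred ones.
    return ",".join(f"{len(list(run))}x{bit}" for bit, run in groupby(bits))

s = "0" * 100 + "1" * 100
print(rle(s))       # 100x0,100x1
print(len(rle(s)))  # 11 characters for a 200-bit string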

The obvious problem with our compression scheme so far is that some strings are not very easy to describe, and it might take longer to describe them in English than to just write out the bits themselves.

Again, it appears to be difficult to create a compression scheme that works for any old string. It’s much easier to create such an algorithm when you know something about the kind of string you want to compress, like if it’s a certain size or if it has any patterns or something. In the case of the FG lab, we know they won’t need to worry about compressing strings of length 288 or less. Unfortunately, there isn’t much more we know about the strings they want to compress other than the fact that they’re huge.
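You can see this with an off-the-shelf compressor: Python’s zlib does great on patterned input and actually makes random input slightly bigger. (The sizes in the comments are ballpark figures, not exact.)

import os
import zlib

patterned = b"01" * 2048         # 4096 bytes with an obvious pattern
random_bytes = os.urandom(4096)  # 4096 bytes with (almost surely) no pattern

print(len(zlib.compress(patterned)))     # a few dozen bytes
print(len(zlib.compress(random_bytes)))  # slightly more than 4096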

Actually, there’s another problem, which is that they need a way to decompress the string they’ve sent. If I give you a description of the string in English, it’s not going to do you any good unless you already know English. So in addition to sending a compressed string, they need to send information about the compression scheme they used, all in 36 bytes.

In information theory, the length of the shortest description of a string $x$ is its Kolmogorov complexity, which we’ll denote $C(x)$. Formally, $C(x)$ is defined in terms of encodings of Turing machines that generate $x$, but it’s pretty easy to argue that the choice of how we describe $x$ doesn’t matter up to a constant term.
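The standard statement of that fact is the invariance theorem: for any two universal Turing machines $U$ and $V$, there is a constant $c_{U,V}$ depending only on the choice of machines such that
$$|C_U(x) - C_V(x)| \leq c_{U,V}$$
for every string $x$.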

We know that $C(x) \leq |x| + O(1)$ (where $|x|$ is the length of $x$), since we can always just write out $x$. We also know that for every length $n$, we can always find a string $x$ such that $C(x) \geq |x|$, since there are $2^n$ strings of length $n$ but only $2^n - 1$ shorter descriptions of those strings. These strings that can’t be compressed are defined to be random.
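Where does that count of shorter descriptions come from? It’s just a geometric sum:
$$\sum_{i=0}^{n-1} 2^i = 2^n - 1 < 2^n,$$
so by pigeonhole, at least one string of length $n$ has no description shorter than itself.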

We can quantify this idea of compressibility by saying that a string $x$ is $c$-compressible if $C(x) \leq |x| - c$. And so, we can argue about how likely it is that we have a string that can be compressed to the degree that we want. For strings of length $n$, we have that at least $2^n - 2^{n-c+1} + 1$ of them are incompressible by $c$.
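To see where that bound comes from: there are only $2^{n-c+1} - 1$ descriptions of length at most $n - c$, so at most that many strings of length $n$ can be $c$-compressible, which leaves
$$2^n - (2^{n-c+1} - 1) = 2^n - 2^{n-c+1} + 1$$
strings that are incompressible by $c$.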

We have some numbers, so let’s play around with them. Remember from earlier that the approximate size of Okarin’s memories is 3.24 TB, or $3.24 \times 2^{40} \times 8$ bits, while the limit we can send is 36 $\times$ 8 = 288 bits. If we want to squeeze Okarin’s memories into a text message, we’ll have
$$c = 3.24 \times 2^{40} \times 8 - 288 = 2.849934139166592 \times 10^{13}.$$
WolframAlpha helpfully tells me that this is slightly more than 1.4 times the number of red blood cells in the human body.

We can calculate the probability that we have a compressible string by figuring out the probability that a string is incompressible, which we can do with our lower bound on the number of incompressible strings. Taking $n = 3.24 \times 2^{40} \times 8$, the probability is given by
$$1 - \frac{2^n - 2^{n-c+1} + 1}{2^n} = \frac{2^{n-c+1} - 1}{2^n} = \frac{1}{2^{c-1}} - \frac{1}{2^n}.$$
Plugging our numbers in, we get
$$\frac{1}{2^{3.24 \times 2^{40} \times 8 - 288 - 1}} - \frac{1}{2^{3.24 \times 2^{40} \times 8}}.$$

And unfortunately, after this step, WolframAlpha kind of refused to try to do the computation. I gave it a shot on my laptop in Python, but all that happened was CPU usage went up to 100% for a few minutes and it started eating all of my memory. I kind of wanted to keep using my computer instead of waiting it out for a MemoryError exception, so I interrupted it. Doing some further estimation, I threw $2^{2^{40}}$ at WolframAlpha, since it’s in roughly the same neighbourhood in magnitude as the number we’re trying to get. Apparently, it’s a number that’s about 300 billion digits long, so one in that is the kind of odds you’re looking at.
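In hindsight, there’s a way to estimate the size without making Python build the actual bignum: take logarithms. A quick sketch, using the same numbers as above:

import math

# The dominant term of the probability is 1 / 2**(c - 1), and the number
# of decimal digits of 2**(c - 1) is about (c - 1) * log10(2), so there's
# no need to construct the actual number.
n = 3.24 * 2**40 * 8            # bits in ~3.24 TB of memories
c = n - 288                     # the compression we'd need to hit 36 bytes
print((c - 1) * math.log10(2))  # ~8.58e12 digits in the denominator

That’s trillions of digits, actually even more digits than $2^{2^{40}}$ has, so if anything the ballpark estimate above was generous.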

I guess this is another way of saying there’s no chance in hell that what the FG Lab tried to do would ever work. And remember that we’re not talking about any particular compression algorithms or technologies here. We can’t come up with a better algorithm or scheme. We can’t use faster computers. We can’t use quantum computers (Holevo’s theorem says that we can’t, in general, use fewer qubits than classical bits for the same string). From an information-theoretic standpoint, it is impossible.

But I’m pretty sure everyone already could’ve guessed that anyway.