The 4th annual π day anime and mathematics post: Traveling teacher time

Korosensei is fast, but can he do better?

Korosensei is fast, but can he do better?


Korosensei can move at Mach 20 and takes advantage of this fact to make frequent trips around the world. Obviously, in the interest of efficiency, he often takes the opportunity to make multiple stops per trip. But what is the most efficient way for him to do so?

We can think about this using graphs. Suppose all the possible destinations are vertices and edges with costs connect the vertices, representing the cost of traveling from some destination $u$ to another destination $v$. At first glance, it might make sense to use physical distance in our case, but that is not necessarily a measure of the “best” way to get there. For instance, taking a bus or train between two cities may get you to the same place, but the actual cost depends on the path that’s taken and you may choose to include other criteria like the cost of tickets and other stuff like that.

So the problem is this: Korosensei has a list of places he wants to hit up during lunch hour. What is the most cost-effective way of getting to each of those places exactly once and get back to school? This is what’s known as the Traveling Salesman problem (TSP).

As it turns out, this problem, like many graph theory problems, is NP-complete, meaning it’s quite computationally hard and taxing for computers to solve. Formally, if we’re given a graph $G = (V,E)$ and a cost function $d: V \times V \to \mathbb R$, the problem is to find a cycle that visits every vertex exactly once with smallest possible cost. Usually, if we have $n$ vertices, then $G$ is the complete graph on $n$ vertices, that is, a graph where every vertex has an edge to every other vertex. Otherwise, it may not be possible to find such a cycle (and the problem of finding such a cycle is also an NP-complete problem, Hamiltonian cycle).

It’s called the Traveling Salesman problem because the idea is that if you’re a traveling salesman, you’d want to find that least-cost cycle because it’d be the most cost-effective way to hit up all the cities you need to go to in order to sell whatever it is that you’re selling. Of course, the hardness of the problem means that our poor salesman or Korosensei are stuck having to check the cost of every cycle. How many are there? Well, since we’re dealing with a complete graph, this means we can get to any vertex from any vertex so we can take any permutation of our vertices and that’s a cycle, and there are $n!$ permutations. This means that as $n$ gets larger, the difficulty of solving this problem grows exponentially.

One way we can attempt to make NP-complete problems feasible to solve is to loosen our demands a bit. Right now, we want the absolute least-cost cycle, but what if it’s not necessary for us to get the least cost cycle? What if we’re okay with getting something that’s within a factor of 2 or 3 or $f(n)$ for some function $f$ that depends on the size of the input? These are called approximation algorithms, in that they approximate the optimal solution by some factor. Unfortunately this doesn’t work for TSP.

Why? Because if I have a feasible algorithm for approximating TSP, then I’ve got an efficient algorithm for solving Hamiltonian cycle, which I mentioned was also NP-complete. Suppose I take any old graph $H$ with $n$ vertices and turn it into a complete graph with all my original edges having cost 1 and any new edges I add having cost $\rho \cdot n + 1$. If I do have some approximation algorithm that gives me a solution within a factor of $\rho$, I’ll run it and get a cycle for Korosensei to use. But, based on the cost of the cycle I get, I can tell if my original graph had a Hamiltonian cycle: if it has cost $\leq n$, then that means it found a Hamiltonian cycle using all of the edges in my original graph. Otherwise, we get a cycle with cost at least
$$(\rho \cdot n + 1) + (n-1) = \rho \cdot n + n > \rho \cdot n$$
meaning it must’ve used one of the new expensive edges I added and so we know the original graph didn’t have a Hamiltonian cycle.

Now this is a bit discouraging, but Korosensei would encourage us to keep on thinking about how to assassinate the hell out of this problem. In the most general case, the cost function $d$ has no constraints and the way that I’ve initially motivated Korosensei’s problem, $d$ can be arbitrary, with costs adjusted to his needs. However, some of the attempts to make TSP more computationally feasible have to do with making some reasonable restrictions on our cost function. This is another fairly standard approach to making computationally hard problems easier to deal with: figure out some cases of the problem that are still useful but might help to simplify or restrict the problem a bit.

For example, in most cases, it’s a reasonable expectation that if you’re going from $u$ to $v$, it’s best to go there directly rather than stopping at $w$ on your way there. This is what we call the triangle inequality:
$$d(u,v) \leq d(u,w) + d(w,v).$$
It’s called the triangle inequality because of what I just described: draw a triangle out of $u$, $v$, and $w$ and the distance between any two should be shorter than going through the third point.

I say should because for an arbitrary cost function, this isn’t always the case. One example is flight pricing. Sometimes, it’s more expensive to fly from $u$ to $w$ than it is to fly from $u$ to $v$, even if the flight stops over at $w$. Anyhow, Korosensei doesn’t have to deal with silly things like flight pricing and so we impose this reasonable triangle inequality condition on his cost function. Does it help? It turns out it does and we can get an approximation that’ll be at most twice the cost of the optimal solution.

How? Well, first we find a minimum spanning tree. A spanning tree of a graph is a tree (a graph that contains no cycles) that connects all of the vertices of a graph together. The minimum spanning tree is the spanning tree with the least weight out of all the spanning trees. The nice thing about MSTs is that we know how to do this efficiently.

Once we have our MST $T$ of our graph $G$, we double up every edge in $T$ to get a cycle $T’$. This cycle what we want yet, since it travels along all the same edges and goes to each vertex twice. But, we can get the cycle we really want by removing all of the second times we visit a vertex and we can do this because the triangle inequality tells us that by taking a more direct route, the cost of the cycle won’t increase. So now if we have our cycle $C$, then we have
$$cost(C) \leq cost(T’) \leq 2 \cdot cost(T).$$

Now, to compare to the cost of our actual optimal solution $C^*$, we notice that taking out some edge of $C^*$ makes it a spanning tree. It’s not necessarily minimal, so we’ve got
$$cost(T) \leq cost(C^* \setminus \{ e \}) \leq cost(C^*).$$
Putting that all together, we’ve got
$$cost(C) \leq cost(T’) \leq 2 \cdot cost(T) \leq 2 \cdot cost(C^*)$$
and there’s our 2-approximation.

Given how fast Korosensei is, this is probably good enough for him and is a decent tradeoff between the time it takes to solve the problem and his actual travel time.

The fourth annual π/ホワイト day animaths appreciation post

Qualia 8

Let us celebrate this day with Nadeko.

Okay, now let’s figure out what the hell Alice is trying to claim. Alice can solve equations by turning them into pictures and solve them instinctively like that. What implications does this have?

First, we probably want to talk about what the P/NP problem is. In computational complexity, we talk about the complexity of problems and we measure this as the amount of resources we need to solve a problem. The two most common measures are for time and space. That is, roughly the amount of time it takes to solve a problem or the amount of space (so, like memory) that’s required to solve a problem. This is expressed as a function of the size of the input.

Formally, all of this stuff is defined in terms of Turing machines, but we don’t need that level of specificity. Intuitively, we can think of P as the class of problems which we can solve in a reasonable amount of time (or “efficiently”) and NP as the class of problems for which we can verify the correctness of a given solution in a reasonable amount of time. By reasonable, we mean that the amount of time it takes scales reasonably depending on the size of the input or problem we want to solve.

The P/NP problem asks whether P is equal to NP. If you think about it for a bit, it seems kind of intuitive that those two classes shouldn’t be equal. It’s a lot easier to figure out whether an answer to a problem is correct than it is to find a correct answer. And most computer scientists also agree that that’s probably the case. The hard part is actually proving it, which no one has been able to do since 1971, when U of T’s Stephen Cook first formulated the problem.

The classical NP-complete problem is the satisfiability problem. If I have a boolean formula, is there an assignment of true/false values that makes the formula evaluate to true? Testing this is easy: if you give me an assignment that you claim works, I can check it in the amount of time it takes to look at each variable. Trying to come up with an assignment is a lot trickier. The easiest thing to do is try trial and error, but that could take an exponential amount of time. Of course, there are a lot of fancy tricks we’ve come up with for this problem, but we still have no way of solving this problem in a way that we can say is theoretically efficient (whether it’s the case for practical uses is a different matter).

What’s baffling about P/NP is that the idea is rather simple to prove or disprove. We have these notions of hardness and completeness which allow us to reduce problems to other problems. In particular, the class of NP-complete problems are those problems which are called NP-hard and contained in the class NP. These problems have the property of being able to reduce to any other NP-complete problem.

A reduction basically says that two problems are about as computationally hard as each other. The idea is that if I have some problem I want to solve but I don’t know how to do it, I can transform it (with a reasonable amount of time) into some other problem that I hypothetically do know how to solve. The reasoning is that if I can do that, then I know that the problem I wanted to solve takes about as much resources to solve as the problem I already know how to solve. And it’s pretty remarkable, then, that every NP-complete problem has this property of being able to reduce to any other one.

So the way to prove that P=NP is to figure out a way to solve any NP-complete problem in polynomial time. Since every NP-complete problem reduces to every other one, you just have to choose your favourite (read: easiest) NP-complete problem out of the several thousand that we know of and figure out an efficient algorithm for it. Because of the reducibility property, if you solve one efficiently, you’ve basically solved them all efficiently. This means that all problems in NP can be solved efficiently and it turns out P=NP. And yet, no one has been able to come up with an efficient algorithm for any NP-complete problems.

On the other hand, while at this point it’s pretty obvious that P≠NP, we have no idea how to prove it. The field of computational complexity and all the tools that we’ve developed over the last few decades are all interesting and useful, but they were all in an attempt to show that P≠NP and so far, nothing we’ve thrown at it has stuck. In fact, we’ve ended up creating a lot more questions that seem to have interesting structures and intuitive answers but we still have no way of proving a lot of these conjectures.

Of course, in the last decade or two, there’s been a lot of focus on quantum computing and with the rise of quantum computing, quantum complexity theory. It’s commonly believed that quantum computing gives us exponential speedup and allows us to solve NP-complete problems in polynomial time.

Well, it doesn’t.

The idea is that since you have a bunch of states in superposition, you also have all possible outcomes stored in those states. Unfortunately, it doesn’t work like that and there are a bunch of complications that arise in trying to extract outcomes of computations stored in qubits that render this whole point moot.

There’s nothing inherent in quantum computing that brings us closer to actually resolving questions that we have about the complexity hierarchy we’ve defined. In fact, quantum computations can be simulated on classical computers in exponential time, so it’s not like there’s nothing we don’t already know about computation, we can just certain kinds of computations faster.

And just like the other tools we’ve developed in computational complexity to try and resolve this question, quantum computing throws another wrench into the problem by introducing quantum analogues of classical complexity classes. Some of these, like QIP, the class of problems solvable by quantum interactive protocols, turn out to have no additional power (it was proved in 2010 by Jain, Ji, Upadhyay, and Watrous that QIP = IP). And some, like BQP, the class of problems solvable in bounded-error polynomial time, have the same problem where we’re not entirely sure how it relates to the other classes.

So knowing all of this, what does Alice’s claim mean exactly? Unsurprisingly, it doesn’t mean anything. If Alice can solve equations intuitively by drawing pictures, then she can probably already solve PSPACE-complete problems (since there are equation problems which are known to be in PSPACE), which means that it doesn’t even matter whether or not she has access to a quantum computer, since she can already solve problems harder than the ones in the quantum analogues of P and NP.

The lesson here is that little girls would not necessarily make very good mathematicians even if they can compute computationally intractable problems (this is basically what Gödel’s incompleteness theorems say too).

World End Economica episode 1: Quantitative Analysis 101

A long time ago, I was searching for news about the newest Kishida Kyodan and the Akeboshi Rockets album, POPSENSE. POPSENSE came with a bunch of these funny little blue-haired faces, one of which was the album cover. One of these faces was noticeably different, having black hair and red eyes. It turns out that was supposed to be Hagana, one of the main characters from World End Economica and it turned out that the OP track was on this album.

That’s how I found out about World End Economica, which was a weird-sounding title and surely couldn’t have had anything to do with economics. But I was wrong, because it so happened that it was written by a certain Isuna Hasekura, who was responsible for everyone’s favourite wolf goddess medieval mercantile adventure. This was exciting to learn about until I realized that it was never going to be translated because it was a doujin visual novel. I shouted into the wind on twitter and magically received a response:

So now it’s been almost a year later and finally available for purchase with my hard earned yencoins so I can actually read it. Is there anything to it beyond economics on the moon? Why, yes, in fact, there is.

World End Economica is about quantitative analysis.

Our story begins in the grand moon city with a young vagrant, Yoshiharu, who’s pretty good at trading and has an intuition for it. At this point I’m wondering how the hell human civilization managed to build a financial centre on the moon but it’s still possible for someone to succeed at trading manually. And the story answered with Hagana, an incredibly socially awkward girl whose only skill is being amazing at math and is distraught because math is actually totally useless in the real world.

You might be able to see where this is going. Quantitative analysis is what we call the application of mathematics to finance. The idea is to take into account all of the data on the market and somehow model it so that you can predict and optimize when you buy and sell various stocks, how much of each stock, and at what rate. And of course, when you’re dealing with this much data and this many calculations, you’ll need to get a computer to do all of this for you and you can get computers to automate trading for you. This automation is what’s called algorithmic trading, where you essentially rely on computers and algorithms to figure out and do stuff automatically.

Because of how well it’s able to optimize profits and how quickly it gets data and processes it, it’s how a significant amount of trading goes on today. As a result, the market becomes a giant feedback loop of inputs going into these algorithms. The algorithms process this stuff and make decisions and take actions and generates a whole new set of data to be fed back into the algorithms again.

Now, this is great for all the mathematicians and computer scientists out there because all of a sudden, there’s another career track that’s opened up that’s willing lure us away from academia with large bags of money. But then the question is that once algorithms are doing all the trading, what do the traders do? This is something that I don’t have an answer to because I don’t really know that much about finance.

This also turns out to be a question that our MC asks himself too. He’s managed to help this girl discover that her talents aren’t a waste and that she can use them to help people. But in the course of using and developing her skills, she’s basically making bank and obsoleted him out of nowhere. So, is there a role for people in trading and finance if computers can do it all?

And that could be a pretty frightening question if you think about it. We’re used to robots replacing people for menial labour because they’re stronger or more efficient and better at doing rote tasks. But now, we’re able to replace traders with computers. And if you think about that for a bit, you realize that these algorithms fighting it out on the market are responsible for a huge chunk of wealth in the world right now.

And that brings us to the other question, which is can we trust the computer? Obviously, someone has to know how to transform the processes and data so that it can be shoved into a computer, but once we have something complex enough, it kind of morphs into a magical box. There’s no way to verify that this thing you’re feeding a ton of data is doing the right thing. So when your gut and the box are in conflict, which do you trust when you have tons of peoples’ money on the line?

Unfortunately, this is only episode 1 so I have to wait an unspecified amount of time before Spicy Tails decides to translate and release the next installment to find out where this is going. So far, it’s been intriguing enough for me to stick around, even though the art and music are fairly sparse and the editing could use some work. I guess the answer to getting my attention is to write a story about math.

The (1st annual?) π day anime and mathematics lecture

Happy $\pi$ day. We’ll begin with an obligatory video.

One of the reasons I enjoyed Summer Wars so much is because the main character’s superpower is math. Well, okay, you say, he’s really good at math, but so what? A lot of people complain about the implausibility of OZ, but those of us with a basic understanding of cryptography and number theory will have been drawn to Kenji’s quick problem solving work with an eyebrow raised. So let’s talk about why Kenji is a wizard.

Kenji doing some friggin mathematics

Kenji doing some friggin mathematics

We’ll start with modular arithmetic, which Kenji mentions to Natsuki on the train ride to Ueda. When we divide numbers, we often end up with remainders. Suppose we divide some integer $k$ by $N$ and we get a remainder of $r$. Then we say that $k$ and $r$ are equivalent $\bmod{N}$ and we denote that by $k = r \bmod{N}$. Because it’s how division works, for any integer $k, r$ will be some number from $0$ to $N-1$. It turns out a lot of arithmetic operations work the same way in modular arithmetic: adding, subtracting, and multiplying numbers and then taking the modulus of the result will give you the same number as adding, subtracting, and multiplying the moduli of the numbers you started out with.

However, division doesn’t work as we would expect it to. So we have to think about division (or the equivalent operation) differently. Instead of thinking of division as splitting a group of stuff into smaller groups, we’ll think of it as multiplying by an inverse. What’s an inverse? Well, we can try thinking of it in terms of addition. It’s pretty intuitive that subtraction is the opposite of addition. If we have some integer $k$, then the additive inverse of $k$ is $-k$. When we add $k$ and $-k$, we get $0$, the additive identity. The identity is just the special number that we can add to anything and get that same thing back unchanged ($n+0 = n$). In the same way, if we multiply $k$ by its inverse, $k^{-1}$, then we’ll get $1$, since $k \times 1$ is just $k$ again. What this means is that the inverse of $k \bmod{N}$ is just some other number $j$ from $0$ to $N-1$ such that $j\cdot k = 1 \bmod{N}$ and it’s just multiplication again.

Now, the problem with this is that it’s not guaranteed that there’s always an inverse hanging around in $\bmod{N}$. In particular, if $k$ and $N$ share any divisors, then $k$ won’t have an inverse $\bmod{N}$. This is interesting because it also tells us that if we consider integers mod a prime number $P$, then every integer $\bmod{P}$ has an inverse, since $P$ doesn’t share any divisors with any integers from $0$ to $P-1$. We call these things that have inverses units. So if we have a unit $k$, then $k^m$ is also a unit, for any integer $m$. We even have a funny function $\phi$ defined such that $\phi(n)$ is the number of units in $\bmod{n}$.

Love Machine solicits help

Love Machine solicits help

So taking everything we’ve learned, we can set up a cryptosystem! The one we’ll be looking at is called RSA, after the guys who invented it. We have Bob who wants to securely send a message $M$ to Alice. Alice chooses two prime numbers $p$ and $q$ and figures out $m = pq$. She also goes and figures out $\phi(m)$, which happens to be $(p-1)(q-1)$. Finally, she picks some integer $k$, a unit in $\bmod{\phi(m)}$. She lets everyone know $m$ and $k$, but she keeps $p$, $q$, and $\phi(m)$ secret.

So Bob wants to send $M$, which is just his message conveniently in number form. He makes $M$ into a number between $0$ and $m$, and if $M$ is too big, he can just break it up into chunks. Bob figures out the smallest $b$ such that $b = a^k \bmod{m}$ and sends $b$ over to Alice. Now, since Alice has $k$ and $\phi(m)$, she can also find $k^{-1}$ pretty easily. Once she has that, she can get the original message by figuring out $b^{k^{-1}} = (M^k)^{k{-1}} = M \bmod{m}$, since $kk^{-1} = 1 \bmod \phi(m)$.

The interesting thing here is that all of the information is out there for someone to encrypt a message to send to Alice, but no one is able to decrypt it. Well, they’re able to decrypt it if they know what $p$ and $q$ are, since once they’ve got that, they can get $\phi(m)$. But it turns out getting $p$ and $q$ from $m$ (which Alice just throws up on the interwebs) is really hard. And it really works for reals, because RSA is pretty widely deployed for things like keeping your credit card information safe while you send it through the tubes to Amazon.

A conveniently displayed ciphertext

A conveniently displayed ciphertext

Let’s go back and think about units some more. Of course, there are only $N$ numbers in the integers $\bmod{N}$, so there’s a point at which $k^m$ is just $1$ again and starts over. If $k^m = 1 \bmod{N}$, we say that $m$ is the order of $k$. But why do we care about finding the order of $k$?

It turns out finding the order of elements is very, very similar to factoring an integer into primes and other related problems, like discrete logarithms. If we can find orders of elements, it won’t be too hard to figure out how to factor a number. In this case, the eavesdropper wants to figure out what $p$ and $q$ are, so they’ll want to factor $m$. And it turns out a lot of other public-key cryptosystems (like elliptic curves) are based on the difficulty of factoring.

How hard could it be? Well, we could just check every possibility, which doesn’t seem that bad for a number like 48, but once we get into numbers that are hundreds of digits long, that might start to suck. It turns out the fastest known algorithms for order finding take approximately $e^{O(\log N \log \log N)^{\frac{1}{2}}}$ steps. Current key lengths for RSA are at least 1024 bits, which would give us about 4.4 x 1029 operations. Assuming three trillion operations per second (3 GHz), it’d take a PC about 4.7 billion years. Sure, you could just throw more powerful computers at it, but they’d just double the key size and suddenly, you’d need to do 1044 operations.

It's a lot easier to write math than typeset it

It's a lot easier to write math than typeset it

Well, that’s not entirely true. One of the breakthroughs in quantum computing was coming up with a fast algorithm for factoring. It turns out quantum order finding takes $O((\log N)^2 \log \log N \log \log \log N)$ steps, which, for a 1024-bit key is just over 60 operations. Doubling the key-size to 2048 bits only increases the number of operations by just over 20. Unfortunately (or fortunately, because we’d be pretty screwed if someone could easily break RSA right now), we haven’t built any quantum computers that large yet, nor are we capable of doing so anytime soon.

tl;dr – Kenji is a quantum computer.