16 comments

6

I don't feel like this is too uncommon in research fields. You don't know until you test it. Though they might have been chasing in circles to be going for so long.

0

If you've ever had to convert an algorithm to the satisfiability problem, you'd know it's not that intuitive most of the time. That being said it did take 7 years to solve the max subarray problem.

3

Finally! However, there are a couple of interesting approximating solutions, for instance: Polylogarithmic Approximation for Edit Distance and the Asymmetric Query Complexity

1

quality submission OP, thanks. Upvoat wasn't praise enough, very interesting article I must have missed.

0

Wait, so did they prove NP = P to be false?

0

I don't understand the NP=P thing. What is it?

0

It takes 2 years of college comp Sci to understand.

1

no. They just proved that this problem also belongs in that category.

0

Paywall.

Everything must be accessible through the submission for everyone globally.

0

Dunno, it worked (and continues to work) for me.

0

Weird... now it's allowing me to view it after revisiting.

0

I get an annoying popup I can't click out of when I visit that page, so for anyone else who might be having that problem, the full article text is below.

For 40 years, computer scientists have tried in vain to find a faster way to do an important calculation known as “edit distance.” Thanks to groundbreaking work from two researchers at MIT, they now know the reason they’ve continually failed is because a faster method is actually impossible to create.

“Edit distance” is a straightforward and satisfyingly tangible idea. Imagine you have two sequences of numbers and want to know how many steps it takes to transform one into the other. In this transformation you can add a digit, remove a digit, or substitute one digit for another. For example, you have the data strings “1234” and “2345.” To turn the first into the second takes two steps — remove the 1, add a 5.

The number of steps required to make this transformation is the edit distance. It’s a cool concept and also a useful one. Biologists use edit distance all the time when comparing genomes of different organisms. A genome is essentially a long string of data — the sequence of A, G, T, and C that make up our DNA. By calculating the edit distance between the genomes of two organisms, biologists can estimate how far back in evolutionary time the organisms diverged from each other.

Edit distance is useful, but also very time-consuming to calculate. The current method for measuring it is known as the Wagner-Fischer algorithm. It was developed 40 years ago and works essentially by placing data on a grid. One string of data goes along the top row, the other goes down the left-hand column, and the algorithm fills in the grid diagonally, counting changes as it goes.

The Wagner-Fischer algorithm is a computationally-intensive method that operates in what computer scientists call “quadratic time.” In plain terms, this means that when the length of the data strings goes up a little, the number of steps needed to compare them goes up a lot. For example, if you have two sequences, each with 10 digits, it takes 100 operations (10-squared) to compute the edit distance. But if you have two sequences, each with 100 digits, it takes you 10,000 operations to compare them (100-squared). The number of digits went up a little (just 90). The number of operations went up a lot (9,900). View Story The race to preserve disappearing data

From the film industry to science, archivists are struggling to preserve the bits and bytes of our lives.

Like Airbnb, but for algorithms?

The fact that edit distance is only computable in quadratic time has big implications for genomics. The human genome contains about 3 billion base pairs. To compute the edit distance between two human genomes takes 3 billion-squared operations (to appreciate how big that is, write a 9 followed by 18 zeroes). That’s a number of operations, explains Piotr Indyk of MIT, that is “computationally infeasible.” Which is to say that our best computers chugging away for a really long time still won’t generate an answer.

Because it’s computationally infeasible to compute the edit distance between two human genomes, biologists have to rely on approximations. They’d love a faster method, but Indyk and his coauthor, MIT graduate student Arturs Backurs, recently demonstrated a faster method is impossible to create. And by impossible they don’t mean “very hard” or “our technology has to improve first.” They mean something like, “by the laws of the mathematics, it can’t be done.”

Indyk and Backurs presented their work at the annual Symposium on the Theory of Computing conference in Portland, Ore., in June. The details of how they demonstrated this impossibility are hard to describe, but their approach is easy enough to apprehend. In computer science, the most famous open question is the P equal to NP problem. It’s a mega-sized issue. Whoever proves it first would receive millions of dollars in prize money and be all over the international news. Most if not nearly all computer scientists believe the P equal to NP problem is false. Indyk and Backurs developed a clever strategy by which they showed that in order for there to be a faster way of calculating edit distance, a stronger variant of the P equal to NP problem must be true. And since most everyone is convinced that this variant of the P equal to NP problem is false, it follows there’s almost certainly no way to really improve on the Wagner-Fischer algorithm.

Indyk and Backurs’ result has been greeted with something like relief among computer scientists, who can now stop beating their heads against a wall in search of a faster method that doesn’t exist.

“I remember wondering as a student, 15 years ago, whether you could substantially beat the quadratic-time algorithm for [edit distance]. Thanks to Backurs and Indyk, we now know for the first time that you can’t,” says Scott Aaronson, a computer scientist at MIT.

For Indyk, too, his recent work acts as a permission slip to move onto other questions. Like hundreds of other computer scientists, he has spent years searching fruitlessly for a faster way to compute edit distance. Now, he says, “I will stop trying to solve the problem.”