> Most "definitely harder than NP" problems require a nontrivial background in theoretical computer science or mathematics to understand.
One of the simplest examples of this is automatic program synthesis.
Searching for the optimal (e.g., shortest) program that satisfies a given set of input/output pairs (a function) is an undecidable problem (worse than exponential time), similar to the halting problem. The branching factor of that search is the size of the instruction set, and on top of that we still don't know how many cycles each candidate needs (i.e., is it even practical to run?).
In applications like genetic programming, we deal with the halting problem by specifying a maximum number of cycles that the program interpreter will run for each candidate. The ability of the synthesized programs to halt in a bounded time becomes part of the selection pressure: candidates that return poor or no results within the constraints are quickly eliminated. This can be taken further by making the candidates evolve their own resource limits as part of their genome.
Put differently, we can approximately solve the halting problem for some smaller problems with clever search techniques. I don't think talking about these things in absolute terms is very useful. Everything depends on the problem and the customer's expectations.
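As a rough sketch of the step-budget idea above — the toy INC/DEC/JMP/HALT instruction set and single-accumulator interpreter are invented here for illustration, not taken from any particular GP system:

```python
STEP_BUDGET = 1_000

def run(program, x, max_steps=STEP_BUDGET):
    acc, pc, steps = x, 0, 0
    while pc < len(program) and steps < max_steps:
        op, arg = program[pc]
        steps += 1
        if op == "INC":
            acc, pc = acc + arg, pc + 1
        elif op == "DEC":
            acc, pc = acc - arg, pc + 1
        elif op == "JMP":
            pc = arg
        elif op == "HALT":
            return acc, True
    return acc, pc >= len(program)   # counts as halted only if it ran off the end

def fitness(program, examples):
    # Candidates that fail to halt within the budget, or return wrong answers,
    # simply earn no credit and get selected away.
    return sum(1 for x, want in examples if run(program, x) == (want, True))

looping = [("INC", 1), ("JMP", 0)]   # never halts within the budget -> fitness 0
adder3  = [("INC", 3), ("HALT", 0)]  # computes x + 3 and halts
print(fitness(looping, [(1, 2), (2, 3)]), fitness(adder3, [(1, 4), (2, 5)]))
```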
I don't know if I would call that "approximately solving the halting problem", and the use of the halting problem is already short-hand for much more general and less binary results. In the 1965 [1] paper by Hartmanis and Stearns that gave birth to computational complexity theory, they generalise the halting theorem to effectively say that it's generally impossible to tell what a program would do (specifically, whether it halts) in its Nth step without simulating it for N steps. The halting theorem is just the special case where N is any N.
What you're doing isn't approximating or getting around anything. You're simply following the "generalised halting theorem": you're interested in what the program does in N steps, and you're finding that out by simulating it for N steps. You're not making any approximation that shortcuts any computational complexity results. (Such approximations exist in some cases, but that's not what you're doing here)
[1]: On the computational complexity of algorithms: https://www.ams.org/journals/tran/1965-117-00/S0002-9947-196...
> You're not making any approximation that shortcuts any computational complexity results.
I completely agree in the default/initial case.
> (Such approximations exist in some cases, but that's not what you're doing here)
However, the entire point of genetic programming is that once you find something that is "on the map" at all, you can begin to exploit the region around that program with some reasonably accurate expectations regarding how certain behaviors will unfold in the local space. The larger the population the more stable this tends to become on average.
So, for pure exploration in GP, you do indeed suffer the full weight of the halting theorem. As you transition into the exploitation regime, this becomes more of a grey area.
> So, for pure exploration in GP, you do indeed suffer the full weight of the halting theorem. As you transition into the exploitation regime, this becomes more of a grey area.
I don't understand what the grey area is. The time hierarchy theorem (that we can consider the "generalised halting theorem") is a theorem; it is absolute. What we have here isn't something that uses some approximation that is true with some probability that we could maybe call a "grey area", but something that bears the full brunt of the theorem.
That there are subsets of problems in some complexity class X that can be solved faster than X's upper limit is the very point of the complexity hierarchy. Yes, there are a subset of problems in EXPTIME that can be solved in polynomial time, hence the existence of the P class, but that doesn't mean that EXPTIME is a "grey area". If you're solving some problem quickly, then we know that what you've solved was not one of the hard problems to begin with. If you're solving some problems in polynomial time, then those problems are in P.
For example, heuristic SAT solvers solve many SAT problems quickly. But their authors don't consider that evidence that P = NP, and understand that the instances that their solvers solve are "easy" (which is not to say they're not useful). In other words, what they're saying is that many useful SAT problems are easy, not that they're able to solve the hard problems efficiently.
I think the idea in the original post was to adjust the goalposts of the original halting problem to get something easier to solve. Instead of looking for programs that "eventually" halt while reproducing the required outputs, one can look for programs that halt "in some reasonable amount of time." The time-bounded halting problem is easier to solve than the original (it is decidable). As one increases the amount of time one views as "reasonable," one gets a sequence of time-bounded halting problems that can be viewed as successively good approximations of the original. There are many similarly "approximate" or "relaxed" versions of the halting problem that are "good enough" in real life and are perfectly decidable.
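A minimal sketch of that time-bounded variant, with a "program" modelled as a Python generator (one yield per step) — the modelling choice is mine, purely for illustration:

```python
def halts_within(program, max_steps):
    machine = iter(program)
    for _ in range(max_steps):
        try:
            next(machine)          # simulate one step
        except StopIteration:
            return True            # halted within the budget
    return False                   # a definite "no" for the time-bounded problem,
                                   # but inconclusive for the unbounded one

def count_to(n):
    i = 0
    while i < n:
        yield
        i += 1

def forever():
    while True:
        yield

print(halts_within(count_to(10), 1000), halts_within(forever(), 1000))  # True False
```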
> If you're solving some problem quickly, then we know that what you've solved was not one of the hard problems to begin with. If you're solving some problems in polynomial time, then those problems are in P.
I thought complexity classes applied to the problem definition, not the specific instances of the problem? An easy SAT problem instance doesn't belong in any complexity class, the set of SAT problem instances belongs to NP.
Yes, we must parameterise problems to meaningfully talk about classes [1], but interesting algorithms are interesting because they tractably give an answer to some large set of inputs. Their authors may not always know how to meaningfully parameterise the full set of inputs that they solve efficiently, but if they're interesting, then there's some implied set that could be parameterised.
But this is less handwavy than it may sound, because SAT solvers don't actually solve "the SAT problem" efficiently. Rather, they make guesses about true/false assignments to particular SAT variables, and the algorithm knows when a guess is successful and when it isn't, because when it makes a wrong guess it backtracks (i.e. it knowingly takes a step that could be said to be "exponential"). Over the entire set of SAT problems of a particular length, every guess will still be right some of the time and wrong (leading to a backtrack) some of the time. Yet the utility of these solvers comes from them making good guesses more than half the time. That means that there must be something special about the set of problems that they can solve with a smaller-than-expected number of backtracks.
Put another way, every solver algorithm could be presented with specific instances, of any size, where the number of backtracks is small or large.
[1]: This makes the notion of circuit complexity challenging, hence the notion of "uniformity": https://en.wikipedia.org/wiki/Circuit_complexity#Uniformity
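To make the "guess, then backtrack on a wrong guess" description concrete, here is a toy DPLL-style search — a sketch, nowhere near a modern CDCL solver. Clauses are lists of non-zero integers, with a negative integer meaning a negated variable (roughly DIMACS style):

```python
def dpll(clauses, assignment=None):
    assignment = dict(assignment or {})

    # Simplify every clause under the current partial assignment.
    simplified = []
    for clause in clauses:
        if any(assignment.get(abs(lit)) == (lit > 0) for lit in clause):
            continue                      # clause already satisfied
        rest = [lit for lit in clause if abs(lit) not in assignment]
        if not rest:
            return None                   # conflict: the caller must backtrack
        simplified.append(rest)
    if not simplified:
        return assignment                 # all clauses satisfied

    var = abs(simplified[0][0])           # guess a value for some unassigned variable
    for guess in (True, False):           # a wrong guess shows up as None and is undone
        result = dpll(simplified, {**assignment, var: guess})
        if result is not None:
            return result
    return None

# (x1 or x2) and (not x1 or x2) and (not x2 or x3) and (not x3 or not x1)
print(dpll([[1, 2], [-1, 2], [-2, 3], [-3, -1]]))   # e.g. {1: False, 2: True, 3: True}
```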
> I don't think talking about these things in absolute terms is very useful. Everything depends on the problem and the customer's expectations.
Agreed. In the real world, approximate solutions to hard problems are often good enough.
For instance, the TSP might be a hard problem, but solving it exactly is pointless since the distances between nodes might be subject to measurement uncertainty anyway, and actual travel times might not be fixed, instead following some distribution.
Having said that, it’s still worth studying the theoretical aspects of these problems.
The example given doesn't seem right to me.
> There is one problem, though, that I find easily explainable. Place a token at the bottom left corner of a grid that extends infinitely up and right, call that point (0, 0). You're given list of valid displacement moves for the token, like (+1, +0), (-20, +13), (-5, -6), etc, and a target point like (700, 1). You may make any sequence of moves in any order, as long as no move ever puts the token off the grid. Does any sequence of moves bring you to the target?
If someone gives you such a sequence, it seems trivial to verify it in linear time. Even for arbitrary dimensions, such a witness can be verified in linear time.
To be in NP, a witness must be verifiable in polynomial time with respect to the size of the original input. In this problem (VAS Reachability), the solution can be `2^2^2^...^K` steps long. Even if that's linear with respect to the witness, it's not polynomial with respect to the set of moves + goal.
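For concreteness, the linear-pass check being discussed might look like the sketch below (the move set and encoding are illustrative choices of mine); the difficulty is entirely in how long `witness` can be relative to `moves` and `target`:

```python
def verify(moves, target, witness):
    position = tuple(0 for _ in target)
    allowed = set(moves)
    for move in witness:
        if move not in allowed:
            return False
        position = tuple(p + m for p, m in zip(position, move))
        if any(c < 0 for c in position):
            return False                   # the token left the grid
    return position == tuple(target)

moves = [(1, 0), (-2, 1)]
print(verify(moves, (0, 3), [(1, 0)] * 6 + [(-2, 1)] * 3))   # True
```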
Hmm.. I'd love to see a more formal statement of this, because it feels unintuitive.
Notably the question "given a number as input, output as many 1's as that number" is exponential in the input size. Is this problem therefore also strictly NP-hard?
> Hmm.. I'd love to see a more formal statement of this, because it feels unintuitive.
The problem is called "Vector Addition System Reachability", and the proof that it's Ackermann-complete is here: https://arxiv.org/pdf/2104.13866 (It's actually for "Vector Addition Systems with states", but the two are equivalent formalisms. They're also equivalent to Petri nets, which is what got me interested in this problem in the first place!)
> Notably the question "given a number as input, output as many 1's as that number" is exponential in the input size. Is this problem therefore also strictly NP-hard?
(Speaking off the cuff, so there's probably a mistake or two here, computability is hard and subtle!)
Compression features in a lot of NP-hard problems! For example, it's possible to represent some graphs as "boolean circuits", eg a function `f(a, b)` that's true iff nodes `a,b` have an edge between them. This can use exponentially less space, which means a normal NP-complete problem on the graph becomes NEXPTIME-complete if the graph is encoded as a circuit.
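A tiny sketch of that succinct-encoding trick; the particular edge relation is just a toy choice:

```python
NUM_NODES = 2 ** 64      # far too many vertices to ever list explicitly

def edge(a: int, b: int) -> bool:
    # connect a and b iff their indices differ in exactly one bit
    return 0 <= a < NUM_NODES and 0 <= b < NUM_NODES and bin(a ^ b).count("1") == 1

# Listing this graph's edges explicitly would take exponentially more space
# than the few lines defining the relation.
print(edge(5, 7), edge(5, 6))   # True (101 vs 111), False (101 vs 110)
```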
IMO this is cheating, which is why I don't like it as an example.
"Given K as input, output K 1's" is not a decision problem because it's not a boolean "yes/no". "Given K as input, does it output K ones" is a decision problem. But IIRC then for `K=3` your input is `(3, 111)` so it's still linear on the input size. I think.
My point here more than anything else is that I find this formulation unsatisfying because it is "easy" to verify that we have a witness, but is exponential only because the size of the witness is exponential in size.
What I'd like as a minimum example for "give me something that is NP-hard but not NP-complete" is a problem whose input size is O(N), whose witnesses are O(N), but which requires O(e^N) time to validate that the witness is correct. I don't actually know that this is possible.
I disagree with the formulation of "decision problem" here. The problem, properly phrased as a decision problem, is "For a given K, does there exist a string with K 1s".
It is straightforward to answer in the affirmative and to produce an algorithm that produces that output (though it takes exponential time). But validating that a solution is in fact a valid solution also takes exponential time, if we are treating the size of the input as the basis for validating the witness.
In the definition of NP, you get the input in unary, so that gets rid of that exponential. The issue with VAS is that the number of moves required could be extremely large.
There's a similar situation in chess (at least if generalized to N by N). You could assert Black has a forced win from a given position, but verifying the claim requires checking the entire game tree, rather than a succinct certificate. Generalized chess is in fact PSPACE-complete, which is generally believed to be outside of NP.
It needs to be a decision problem (or easily recast as a decision problem). "given a number as input, output as many 1's as that number" doesn't have a yes or no answer. You could try to ask a related question like "given a number as input and a list of 1s, are there as many 1s as the number?" but that has a very large input.
Straightforward to pose as a decision problem -- you're given a sequence of {0,1} representing a base-2 input (N), output a string of {0, 1} such that there are N 1's. You could make it "N 1s followed by N 0s" to avoid being too trivial, but even in the original formulation there are plenty of "wrong" answers for each value of N, and determining whether a witness is correct requires O(2^b) time, where b is the number of bits in N.
That's not a decision problem either. You need to cast it in the form of determining membership in some set of strings (the "language"). In this case the natural choice of language is the set of all strings of the form "{encoding of n}{1111....1 n times}", which is decidable in O(log(n)+n) steps, i.e. linear time.
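A sketch of such a linear-time decider; the exact "{binary n}:{n ones}" format with a ':' separator is my own choice, since the comment leaves the encoding open:

```python
def in_language(s: str) -> bool:
    encoding, sep, ones = s.partition(":")
    if not sep or not encoding or set(encoding) - {"0", "1"} or set(ones) - {"1"}:
        return False
    return len(ones) == int(encoding, 2)   # O(log n + n) work in total

print(in_language("101:11111"), in_language("101:111"))   # True False
```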
Linear time in the length of the sequence, yes. But is the sequence length linear in dimension size, or number of moves given? That's what is interesting.
Linear in the size of the witness, however many bits it takes to express it.
This applies to any computable problem though, no? At minimum the verifier has to read the witness. If we ignore PCPs and such. The point here is that the witness grows very fast in terms of vector dimensionality and/or move set size.
It doesn’t need to be linear though. Polynomial is enough.
But I guess it can be proven that the shortest possible sequence grows faster than polynomial.
Proving it's nonpolynomial is pretty easy: just make the goal vector `(10, 10, 10)` and the displacement vectors `{(1, 0, 0), (-10, 1, 0), (-10, -10, 1)}`. It takes ~1000 steps to reach the goal, and if we add one more dimension we need 10x more steps.
So it grows, at least, exponentially. Back in 1970ish someone found a problem that's at least double-exponential, proving it was 2-EXPTIME-hard. It was conjectured to be 2-EXPTIME-complete for the longest time, but turned out to be significantly harder.
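A quick sketch of that arithmetic, generalised to k dimensions, assuming the obvious "refill the earlier coordinates" strategy is essentially what any run has to do — so it illustrates the growth rate rather than proving a lower bound:

```python
def steps_for_dimension(k, target=10):
    # Goal (10, ..., 10); moves are e_1 plus, for each i > 0, a move that adds 1
    # to coordinate i while subtracting 10 from every earlier coordinate.
    # cost[i] = moves needed to raise coordinate i by one, counting the moves
    # needed to refill the 10 units it drains from each earlier coordinate.
    cost = [1]
    for _ in range(1, k):
        cost.append(1 + sum(10 * c for c in cost))
    return sum(target * c for c in cost)   # every coordinate must end at `target`

for k in range(1, 6):
    print(k, steps_for_dimension(k))       # roughly 10x more steps per extra dimension
```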
I think they proved it grows with Ackermann function.
Sounds implausible… Maybe the number of computational steps to find the shortest sequence grows with the Ackermann function. But the length of the shortest sequence itself?
I think you are right. I just read this article linked in the OP: https://www.quantamagazine.org/an-easy-sounding-problem-yiel...
A sequence is easy to verify. Choosing the sequence not so much.
Roughly put, that is the certificate definition of being in NP.
The goal here was to show that it was strictly NP-hard, i.e. harder than any problem in NP.
Harder to solve, not necessarily harder to verify? If I am understanding things right.
The crazy thing about the definition of NP-completeness is that Cook's theorem says that all problems in NP can be reduced in polynomial time to an NP-complete problem. So if a witness to a problem can be verified in polynomial time, it is by definition in NP and can be reduced to an NP-complete problem.
If I can verify a solution to this problem by finding a path in polynomial time, it is by definition in NP. The goal here was to present an example of a problem known to not be in NP.
> The crazy thing about the definition of NP-completeness is that Cook's theorem says that all problems in NP can be reduced in polynomial time to an NP-complete problem.
What were you trying to say here? Cook's theorem says that SAT is NP-complete. "All problems in NP can be reduced in polynomial time to an NP-complete problem" is just a part of the definition of NP-completeness.
I agree with the issue ("What's bigger than a dog? THE MOON"), but I don't agree with the need to provide an example that is provably distinct from NP. We're fine providing NP-complete problems without proving that they're distinct from P.
There are lots of accessible problems that belong to spaces that probably aren't NP. One that should be familiar to most people studying CS theory is "do these two regular expressions match the same set of strings?".
---
For the Diophantine vector reachability problem... I don't really like the Quanta presentation. ("An easy-sounding problem yields numbers too big for our universe.")
First, the problem, if you describe it accurately, doesn't sound easy at all. Diophantine problems are never easy. That's why we have real numbers.
Second, the article suggests that the problem should be introduced to children by casting it in terms of several rules of exchange ("For Alice, the transition vector (−1, −1, 1, 0) would represent the exchange of an apple and a banana for a cantaloupe."). But that would make the problem trivial: you start at the origin; without a rule of "exchange" in which the other party gives you as much as you want of something for free, you're never going to leave it.
And third, many easy problems generate numbers too big for our universe. That's not unusual at all. Compare the question "how many digits of pi do you need before the potential error in calculating the volume of a sphere the size of the visible universe is less than the volume of one hydrogen atom?". Can you imagine using more digits than that? You just involved a number too big for the universe.
The article passes over this point itself:
> It’s physically impossible to write down all the digits of 2↑↑6 — a liability of living in such a small universe.
> Second, the article suggests that the problem should be introduced to children by casting it in terms of several rules of exchange ("For Alice, the transition vector (−1, −1, 1, 0) would represent the exchange of an apple and a banana for a cantaloupe."). But that would make the problem trivial: you start at the origin; without a rule of "exchange" in which the other party gives you as much as you want of something for free, you're never going to leave it.
I'm not sure what point you're trying to make here? That specific transition vector would be one such vector among several in a given ruleset. You're correct that for any ruleset, if I don't have at least one vector that can get me away from the origin, I'm stuck there, but then I also have to consider that I still might not be able to get to where I want to go. If I have the ruleset [(2, 0), (-4, 2)] I still can't get to (3, 5).
Something I find fascinating is that we know that P != EXPTIME, and that P <= NP <= EXPTIME, but have managed to prove neither P != NP nor NP != EXPTIME. NP has to be somewhere between them but we have no idea where.
"NP has to be somewhere between them but we have no idea where" – I guess that this state of affairs won't change much even if we prove P != NP?
I think that's unclear. A constructive proof of P != NP might yield insights into the "gap" between them.
Technically correct is the best kind of correct, and The Halting Problem is a fun head scratcher for a green computer scientist. I'm glad it was part of my introduction to NP hardness.
That said, you do make a great point OP, and I'll think about it every time the Halting Problem comes up.
This seems weird to me. I would think the classic way to introduce these topics is to first talk about decidability, introduce the halting problem through that, then introduce complexity. That's certainly how I was introduced to this material.
I wouldn't think that you'd learn about the halting problem as only an example of something harder than NP hard.
I agree. NP, NP-complete, and NP-hard are all talking about how long things take. If something is uncomputable, then how long it takes doesn't seem very relevant or interesting.
"If Jimmy earns 5 cents a day and a watermelon costs $10, how long will it take for Jimmy to earn enough money to buy the color blue?"
Maybe the connection is that it feels like the halting problem could be solved if you had an infinite amount of time?
The halting problem obviously can be solved if you have an infinite amount of time, as long as terminating is guaranteed to take place within a finite amount of time. If it's possible to terminate after an infinite amount of time, you'll need a more infinite amount of time to check for halting.
The million dollar question here is... a literal million dollar question.
The natural way to go up in complexity from NP is along the polynomial hierarchy, but since it could be the case that P=NP, those problems aren't provably harder. For all we know it could even be the case that P=PSPACE.
You could invoke the nondeterministic variant of the time-hierarchy theorem.
To be honest, checking if there is a path between two nodes is a better example of NP-Hard, because it's obvious why you can't verify a solution in polynomial time. Sure the problem isn't decidable, but it's hard to give problems that are decidable and explain why the proof can't be verified in P time. Only problems that involve playing a game optimally (with more than one player) that can have cycles come to mind. These are the "easiest" to grasp.
Isn't this NP-complete? The "solution" here would be the steps to take in the path which can be found by brute-force
Wikipedia:
> 2. When the answer is "yes", this can be demonstrated through the existence of a short (polynomial length) solution.
> 3. The correctness of each solution can be verified quickly (namely, in polynomial time) and a brute-force search algorithm can find a solution by trying all possible solutions.
Halting problem is not even decidable. To mention it in the context of NP is a categorical faux pas.
> NP-hard is the set all problems at least as hard as NP-complete
I'm a bit confused by this. I thought NP-complete was the intersection of NP-hard and NP. Maybe this is stating the same?
NP-complete is indeed the intersection of NP-hard and NP.
If you can solve any NP-hard problem then you can solve any NP-complete problem (because you can convert any instance of an NP-complete problem to an NP-hard problem), so NP-hard problems are "at least as hard" as NP-complete problems.
(But an NP-hard problem might not be in NP, ie given a purported solution to an NP-hard problem you might not be able to verify it in polynomial time)
This [1] diagram from Wikipedia represents what you are saying in a visual way. NP-hard and NP-complete are defined as basically an equivalence class modulo an algorithm in P which transforms from one problem to the other. In more human terms, both NP-hard and NP-complete problems are reducible with a polynomial algorithm to another problem from their respective class, making them at least as hard.
The difference between the two classes, again in human terms, is that NP-complete problems are decision problems "Does X have property Y?", while NP-hard problems can include more types - search problems, optimization, etc, making the class NP-hard strictly larger than NP-complete in set terms.
The statement in the article means that NP-hard problems require solving an NP-complete problem plus a P problem - hence being at least as hard.
The difference between NP-Hard and NP-Complete is that NP-Complete problems are both NP-Hard, and in NP. As opposed to NP-Hard problems that are not NP-Complete, meaning they are not themselves in NP.
Because a language being NP-hard implies it is at least as hard as the hardest NP problems. For any language that is NP-hard, if one had a turing machine that decided the language, one could construct a polynomial time transformation of any language in NP to the NP-hard language to decide it using the NP-hard turing machine.
Membership in NP is a statement about how easy a problem is (specifically, that you can verify an answer in no worse than polynomial time). NP-hard is about how hard a problem is (at a minimum).
I wonder if there would be less confusion over this if NP had been called "NP-easy" in the first place. But Wikipedia says that by now that term has already been taken.
P is "very easy": you can solve these in polynomial time. NP is "easy": you can verify a solution in polynomial time. Are there any problems you can verify but not solve in polynomial time? Nobody knows! (But ~everyone thinks so.)
NP-hard is "hard": these take at least as much work to solve as the hardest problems in NP, and for the ones outside NP you can't even verify a solution in polynomial time.
The article talks about a 2-dimensional grid which starts at (0,0) (bottom left) and extends infinitely to the right and the top, so all (x,y) for non-negative integers x,y exist. But x or y negative does not exist. Given a list of possible jumps e.g. (+1,+10) or (-20,+13), and a target destination, e.g. (700,1). Does there exist a series of valid jumps that takes you from (0,0) to (700,1) without ever going off grid (i.e. into negative territory)?
This problem might or might not be NP-Harder. However, now extend the problem to higher dimensional grids. At some number of dimensions, the problem becomes definitely NP-Harder (i.e. NP-hard, decidable, but not in NP)
Given a dimension, n; a collection of n-dimensional vectors of integers v_1,..,v_k; and a target n-dimensional vector u: Is there a sequence of vectors from the collection which sums to u, such that no partial sum (of an initial segment of the sequence) has any negative components?
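To pin the definitions down, here is a bounded brute-force search that works for toy instances — emphatically not a decision procedure, since real instances can need witnesses far longer than any fixed budget, which is exactly what makes the problem hard:

```python
from collections import deque

def reachable(moves, target, max_states=100_000):
    start = tuple(0 for _ in target)
    target = tuple(target)
    seen = {start}
    frontier = deque([start])
    while frontier:
        current = frontier.popleft()
        if current == target:
            return True
        for move in moves:
            nxt = tuple(c + m for c, m in zip(current, move))
            if any(x < 0 for x in nxt) or nxt in seen:
                continue               # off the grid, or already explored
            if len(seen) >= max_states:
                return None            # budget exhausted: inconclusive
            seen.add(nxt)
            frontier.append(nxt)
    return False                       # explored everything reachable: unreachable

print(reachable([(1, 0), (-2, 1)], (0, 3)))   # True, via six +1 moves then three (-2, +1)
```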
The problem is that topics like descriptive complexity theory are typically reserved for graduate level courses, and what the author has access to is basically a survey.
There are entire hierarchies that describe what is "NP-Harder", specifically the second level of the polynomial hierarchy, where NP == Σ_1^p.
IMHO one of the most useful parts of the -complete concepts is as a meet-me where one can find possible reductions that may work in your use case, but even with just P and NP you can find useful mappings like:
P = FO(LFP) = FO(n^O(1)) = SO(Horn)
NP = PCP(0, poly(n)) = PCP(log n, log n) = PCP(log n, 1) = Σ^1 = SO(∃)
The arithmetic and exponential hierarchies, which the author mentions, are actually nearly as big a jump from the polynomial hierarchy as HALT is.
The reason that PSPACE (the class of decision problems solvable by a Turing machine in polynomial space) is invoked is that PSPACE contains PH, the Polynomial-Time Hierarchy [1], which is what the author seems to have been looking for. The PSPACE-complete problems [2] are also a great meet-me space for either looking for some special-case reduction, or understanding that complexity from your preferred abstract model.
As space* is always more valuable than time (because you have to read, which takes time), it is common to have that transition. But then things get crazy too.
HALT and NP are very closely related in a practical sense because of the Entscheidungsproblem, which is the decision-problem issue.
But you have to think about the general case, not a restricted problem.
You can verify any NP problem in poly time, just as you can tell that a machine halts, but you cannot pre-guess that.
A possibly useful lens for this is the generalization of HALT, Rice's Theorem[3]
> Rice's theorem states that all non-trivial semantic properties of programs are undecidable.
> Given a program P which takes a natural number n and returns a natural number P(n), the following questions are undecidable:
* Does P terminate on a given n? (This is the halting problem.)
* Does P terminate on 0?
* Does P terminate on all n (i.e., is P total)?
* Does P terminate and return 0 on every input?
* Does P terminate and return 0 on some input?
* Does P terminate and return the same value for all inputs?
* Is P equivalent to a given program Q?
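For readers who haven't seen why these questions are undecidable, here is a minimal sketch of the standard diagonalization; `decides_halting` is hypothetical and exists only to be refuted:

```python
def decides_halting(program_source: str, argument: str) -> bool:
    """Hypothetically: True iff the program denoted by program_source halts on argument."""
    raise NotImplementedError("assumed for contradiction")

def troublemaker(program_source: str) -> None:
    # Do the opposite of whatever the decider predicts about a program run on
    # its own source code.
    if decides_halting(program_source, program_source):
        while True:
            pass       # decider said "halts", so loop forever
    # decider said "loops forever", so halt immediately

# Running troublemaker on its own source makes decides_halting wrong either
# way, so no total, correct decider exists -- and per Rice's theorem the same
# argument applies to every other non-trivial semantic property.
```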
If you quit thinking about NP as something that can be brute-forced (yes, NP can be simulated on a TM) and instead think about a decider program deciding without running the actual program, this will help connect the dots.
Understanding that this decision has to happen without access to the semantic properties (runtime behavior), only to the syntax of a program, will open up lots of concepts for you.
If moving past the oversimplified "survey" level explanations is of interest there are many resources, but here is one playlist that some trusted people I know claimed was useful [4]. One of the more common descriptive complexity books is "Descriptive Complexity" by Neil Immerman, and even complexityzoo references [1] can help.
To be clear, I can understand why these oversimplified introduction explanations can be frustrating, and the curriculum could and should be improved.
Is the problem mentioned in the article equivalent to deciding whether there exists a solution to a system of linear equations in the positive integers?
I think no, because the vector additions are considered in sequence, and at no point are you allowed to leave the positive quadrant.
Well, for a computer that is a finite state machine there are only finitely many states. So, in finite time the machine will either (a) halt or (b) return to an earlier state and, thus, be in an infinite loop. So, in this case we can tell if the "program will stop" and, thus, solve "the halting problem".
Uh, we also assume that before we start, we can look at the design of "the machine" and know how many states there can be, and from the speed of the machine, how many new states are visited each second. So, we will know how long we have to wait before we see either (a) or (b).
Nobody is talking about a finite state machine in complexity. Its time complexity is n, and its space complexity is 0. The Halting Problem specifically pertains to Turing Machines.
What I wrote is correct but does not address the traditional Turing Machine halting problem or solve it.
The traditional halting problem is theoretical, i.e., not close to anything practical, and so is what I wrote.
And I did not claim that what I wrote is a contribution to "complexity theory".
The Turing Machine Halting Problem has long been standard computer science; from that it's common to conclude we can't tell if a program will stop. But that conclusion needs some theoretical assumptions, and my version is simpler and shows that we have to be careful drawing conclusions from the standard treatment.
What you wrote is pretty much trivial, and not related to the topic of the article.
It's also not practical, since the space needed to represent even a small program as an FSM is very large. Just try to write an FSM that reads binary coded 16 bit numbers, and has to find the maximum value. Now imagine sorting just 100 of them.
> What you wrote is ... not related to the topic of the article.
The article was "The Halting Problem is a terrible example of NP-Harder", and I wrote:
> So, in this case we can tell if the "program will stop" and, thus, solve "the halting problem".
> "Trivial"?
I never claimed the post was a significant contribution to computer science or complexity theory, and at most only a tiny fraction of posts at HN are such. Also, my post does not address the complexity theory question of P = NP.
If what I wrote was "new, correct, and significant", then I should have sent it to a good computer science journal. I've published enough papers in good math and computer science journals to notice that the checks take a long time to arrive. I'm not in academics and never wanted to be.
HN is not a journal of new, correct, and significant work in computer science.
From responses in this thread, the post has proven to be at least curious and interesting to some of the HN audience.
But the halting problem is specifically for any kind of program. Otherwise you can just say that every codebase is smaller than X petabytes anyway so it’s always decidable.
Architectures like x86 can only address a finite amount of RAM, since they have a fixed word size (e.g. 64 bits). However, their memory is still unlimited, since they can read and write to arbitrary IO devices (HDDs, SSDs, S3, etc.); though those operations aren't constant time, they're O(sqrt(n)), since they require more and more layers of indirection (e.g. using an SSD to store the S3 URL of the address of ....)
* It makes computability and complexity dependent on individual machines (now a program may halt on machine A, but not on machine B). For various reasons, we don't want that.
* The entire state space of a single machine consists of all the registers, memory cells, etc. But keeping track of all states that have been visited before requires exponentially more space than the space that is actually available on the machine (because you're computing the powerset). So the machine itself can't solve its halting problem, only a much more powerful one can.
Very often, impossibility results in infinitary mathematics translate back to "there's no reasonable way to do this" in actual practice where things are finite.
You wouldn't necessarily need to keep track of all states that have been visited before, to my understanding, just one extra state that you move along at half the speed (so it'll eventually also enter the loop, and then the state you're progressing at full speed will loop around and hit it). So 2X the memory and 1.5X the time/compute of the program on its own.
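That is essentially Floyd's tortoise-and-hare cycle detection. A sketch for a deterministic machine presented as a pure step function — the toy `step`/`is_final` machines below are stand-ins, not any real interpreter:

```python
def halts(step, is_final, start):
    slow = fast = start
    while True:
        if is_final(fast):
            return True
        fast = step(fast)
        if is_final(fast):
            return True
        fast = step(fast)              # fast moves two steps per iteration...
        slow = step(slow)              # ...slow moves one (i.e. half the speed)
        if slow == fast:
            return False               # fast caught slow inside a cycle: no halt

print(halts(lambda s: s - 1, lambda s: s == 0, 5),          # True: 5,4,3,2,1,0
      halts(lambda s: (2 * s) % 9, lambda s: s == 0, 1))    # False: 1,2,4,8,7,5,1,...
```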
True, but practically I don't think you'd ever need to - should be able to narrow things down with insights/heuristics (and even with the simple algorithm alone, most programs would loop or halt far sooner).
Often seems to be presented as if humans can make some deductions or introspections that machines/AI would be incapable of seeing, or that the machine would have to rely on brute force for, but I don't think there's actually any result to that effect.
Even simpler, you're just trying to figure out if it halts or not, not the exact length of the cycle, so you can do it with only a single counter for the total number of states.
Say you have a machine with 128 bits of state: if it's going to halt, it will do so within 2^128 (roughly 340 undecillion) steps. If it's still going after 2^128 steps, then it'll keep going forever.
So, to see if a 128 bit machine halts, you only need an extra 128 bits to count up 1-by-1 to 2^128.
Sure, you also need a few thousand times the lifetime of the known universe just to count that high, but you know, It's O(2^128) instructions which means it's O(1).
Good luck doing that on any machine with any real amount of memory of course. Really easy to write "just count to 2^128", but not so easy to do it.
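A sketch of that counting argument, again with placeholder `step`/`is_final` machines; with 128 bits of state the loop below is hopeless in practice, which is the point:

```python
def halts_by_counting(step, is_final, start, state_bits):
    # A deterministic machine with B bits of state has at most 2**B distinct
    # configurations, so running longer than that means a configuration
    # repeated and the machine cycles forever.
    state = start
    for _ in range(2 ** state_bits):
        if is_final(state):
            return True
        state = step(state)
    return False   # pigeonhole: some state repeated, so it never halts

# Tiny toy machine with 4 bits of state: repeatedly add 4 mod 16, halt on 0.
print(halts_by_counting(lambda s: (s + 4) % 16, lambda s: s == 0, 1, 4),   # False: cycles 1,5,9,13
      halts_by_counting(lambda s: (s + 4) % 16, lambda s: s == 0, 4, 4))   # True: 4,8,12,0
```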
That's great and all, but nobody is talking about physical computers here.
Moreover, it's a useless observation in practice as well because the full exponentially-large state space necessary to represent such systems is not a useful formalism to approach computers with for any purpose, be it brutally practical or entirely theoretical. See https://news.ycombinator.com/item?id=23431703 for some previous comments I made on this, and some rough numbers base on computers that were already small 5 years ago and seem even smaller today. It is not useful to say "hey, that thing you're using has a finite state space and that finite state space is only 10^2,585,827,973 large!" It's not like you're going to run out.
> Big O notation also suffers from this - it's almost useless for real world problems.
This is quite a bold claim. Yeah, "in the real world" you often need to take the constant factors into consideration, but to go as far as saying that Big O is useless? It's very useful to figure out what would likely work (or not) for your problem, given that you know the data size - I used it for exactly that, many many times.
Let's say you have 2 algorithms, one constant time O(1) and one exponential time 2^n. It seems the constant-time one is better. But in fact it always runs for 42 years, while the exponential one only starts to run longer than a millisecond for sizes bigger than the number of particles in the universe.
Big O says how fast complexity grows, but it doesn't say what that complexity exactly is.
Does something like that happen in the real world? Yes, but of course not in such drastic ways.
As a concept it is pretty useful for me in a handwavy astrophysics math way because I otherwise wouldn't necessarily know how to make my naive solution fit real world constraints.
If you are writing everyday programs, your constant factors are going to be very comparable and proportional to the size of your program, unless you have hardcoded numerical constants somewhere in your program like "run this loop 1 trillion times".
> While the exponential one only starts to run longer than a millisecond for sizes bigger than the number of particles in the universe.
You don't really know how this works, do you? I can guarantee that there are more than 1M particles in the universe. And if you iterate 2^1M times, doing nothing, on a current computer... that's basically indistinguishable from an infinite loop.
If we're talking about "fantasy world" vs "real world", the "always run 42 years algorithm" surely is closer to fantasy world than most reasonable Big-O analyses...
If you need to pull an example from fantasy world to illustrate your point about Big-O notation not being useful "in the real world" you're probably committing the same alleged mistakes as computational theorists...
A very exaggerated example to use exp time… Whatever your constant, for exp time, it’s going to run for times longer than the life of the universe on probably any n > 10… Since you know, log time is practically constant
Most of the time the constants are going to be similar. And by similar I mean < 3 orders of magnitude say. So 2^9 or adding 9 to N eats that up pretty quick.
Simpler algorithms are usually faster for small N, and N is usually small. The Big O assumption is a fantasy world where N is always large, and where constant factors that slow down an algorithm can be ignored.
For example, it is common in sorting algorithms to do an n² algorithm like bubble sort when the list to sort is small (e.g. <50), whereas when it's larger, we do an n log n algorithm like merge sort.
The issue is that merge sort does recursion, which comes with some extra cost, so that an n² algorithm beats an n log n algorithm, provided n is small.
You can (and probably should) add a threshold to recursive algorithms like mergesort so that they don't end up doing recursive calls on very small arrays.
For arrays smaller than the threshold, insertion is faster than bubble sort. And if you really don't want recursion for large arrays to begin with, you have heapsort or Shell sort.
I don't think there is any practical case where using bubble sort makes sense, except "efficiency doesn't matter for my use case and this is the code I already have".
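A sketch of that hybrid approach; the cutoff of 32 is an arbitrary illustrative value rather than a tuned constant, and the merge step uses `heapq.merge` for brevity:

```python
import heapq
import random

CUTOFF = 32

def insertion_sort(a, lo, hi):
    for i in range(lo + 1, hi):
        x, j = a[i], i - 1
        while j >= lo and a[j] > x:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = x

def hybrid_mergesort(a, lo=0, hi=None):
    if hi is None:
        hi = len(a)
    if hi - lo <= CUTOFF:
        insertion_sort(a, lo, hi)      # the simple O(n^2) method wins on tiny ranges
        return
    mid = (lo + hi) // 2
    hybrid_mergesort(a, lo, mid)
    hybrid_mergesort(a, mid, hi)
    a[lo:hi] = heapq.merge(a[lo:mid], a[mid:hi])   # linear-time merge of two sorted runs

data = [random.randrange(1000) for _ in range(500)]
hybrid_mergesort(data)
assert data == sorted(data)
```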
As in one of the volumes of Knuth's The Art of Computer Programming, heapsort is in-place and achieves the Gleason bound, so it can be regarded as not beaten by quicksort.
> Better - estimate complexity so things like "this algorithm will take 1234 + 2n + 500m operations " for our use case
I feel like that is never the level of detail you want.
Either that is way too much detail and big-oh notation abstracts away the meaningless stuff.
Or, if this level of detail matters, then you really need to go much further. Think about cache hit ratios, memory layout issues, different speed of different operations, etc.
Calculating number of operations is a weird middle ground that seems pretty useless.
Not quite bubble sort, but many recursive sorting implementations will switch to an O(n^2) algorithm like insertion sort once they reach a small number of elements. Simple but poorly-scaling methods usually have a place as base cases of hybrid algorithms.
In other news, there's a reason no BLAS implementation multiplies matrices with Strassen's algorithm, and it's not that the people writing those are too dumb.
Still, in quicksort, big O analysis is what tells you that you can further improve efficiency by leaving small subarrays unsorted and then performing a single insertion sort pass over the whole array at the very end, rather than one insertion sort call for each individual small subarray. This result is far from obvious without big O analysis. So still useful, even there.
For any function f(n) the problem "compute 2^f(n)" is going to be Ω(f(n)), because the output is f(n) bits; so merely writing down the answer takes Ω(f(n)) steps.
Note that the number n is the input here, which is k = log(n) bits long. So in terms of the input length k, the runtime is actually Ω(f(n)) = Ω(f(2^k)).
This is indeed as trivially true as you say, but this is not a decision problem, which is what the article is discussing. That said, you can use diagonalization over turing machines to construct a language that is indeed strictly more difficult.
> Most "definitely harder than NP" problems require a nontrivial background in theoretical computer science or mathematics to understand.
One of the simplest examples of this is automatic program synthesis.
Searching for the optimal (e.g., shortest) program that satisfies a given set of inputs/outputs (function) is an undecidable (worse than exponential time) problem similar to the halting problem. The exponent in this case would have been the size of the instruction set, but we still don't know how many cycles we need (i.e., is it even practical to run?).
In applications like genetic programming, we deal with the halting problem by specifying a maximum # of cycles that the program interpreter will run for each time. The ability of the synthesized programs to halt in a bounded time becomes part of selection pressure. Candidates that return poor or no results within the constraints are quickly eliminated. This can be further compounded by making the candidates evolve their own resource limits as part of their genome.
Put differently, we can approximately solve the halting problem for some smaller problems with clever search techniques. I don't think talking about these things in absolute terms is very useful. Everything depends on the problem and the customer's expectations.
I don't know if I would call that "approximately solving the halting problem", and the use of the halting problem is already short-hand for much more general and less binary results. In the 1965 [1] paper by Hartmanis and Stearns that gave birth to computational complexity theory, they generalise the halting theorem to effectively say that it's generally impossible to tell what a program would do (specifically, whether it halts) in it's Nth step without simulating it for N steps. The halting theorem is just the special case where N is any N.
What you're doing isn't approximating or getting around anything. You're simply following the "generalised halting theorem": you're interested in what the program does in N steps, and you're finding that out by simulating it for N steps. You're not making any approximation that shortcuts any computational complexity results. (Such approximations exist in some cases, but that's not what you're doing here)
[1]: On the computational complexity of algorithms: https://www.ams.org/journals/tran/1965-117-00/S0002-9947-196...
> You're not making any approximation that shortcuts any computational complexity results.
I completely agree in the default/initial case.
> (Such approximations exist in some cases, but that's not what you're doing here)
However, the entire point of genetic programming is that once you find something that is "on the map" at all, you can begin to exploit the region around that program with some reasonably accurate expectations regarding how certain behaviors will unfold in the local space. The larger the population the more stable this tends to become on average.
So, for pure exploration in GP, you do indeed suffer the full weight of the halting theorem. As you transition into the exploitation regime, this becomes more of a grey area.
> So, for pure exploration in GP, you do indeed suffer the full weight of the halting theorem. As you transition into the exploitation regime, this becomes more of a grey area.
I don't understand what the grey area is. The time hierarchy theorem (that we can consider the "generalised halting theorem") is a theorem; it is absolute. What we have here isn't something that uses some approximation that is true with some probability that we could maybe call a "grey area", but something that bears the full brunt of the theorem.
That there are subsets of problems in some complexity class X that can be solved faster than X's upper limit is the very point of the complexity hierarchy. Yes, there are a subset of problems in EXPTIME that can be solved in polynomial time, hence the existence of the P class, but that doesn't mean that EXPTIME is a "grey area". If you're solving some problem quickly, then we know that what you've solved was not one of the hard problems to begin with. If you're solving some problems in polynomial time, then those problems are in P.
For example, heuristic SAT solvers solve many SAT problems quickly. But their authors don't consider that evidence that P = NP, and understand that the instances that their solvers solve are "easy" (which is not to say they're not useful). In other words, what they're saying is that many useful SAT problems are easy, not that they're able to solve the hard problems efficiently.
I think the idea in the original post was to adjust the goalposts of the original halting problem to get something easier to solve. Instead of looking for programs that "eventually" halt while reproducing the required outputs, one can look for programs that halt "in some reasonable amount of time." The time-bounded halting problem is easier to solve than the original (it is decidable). As one increases the amount of time one views as "reasonable," one gets a sequence of time-bounded halting problems that can be viewed as successively good approximations of the original. There are many similarly "approximate" or "relaxed" versions of the halting problem that are "good enough" in real life and are perfectly decidable.
> If you're solving some problem quickly, then we know that what you've solved was not one of the hard problems to begin with. If you're solving some problems in polynomial time, then those problems are in P.
I thought complexity classes applied to the problem definition, not the specific instances of the problem? An easy SAT problem instance doesn't belong in any complexity class, the set of SAT problem instances belongs to NP.
Yes, we must parameterise problems to meaningfully talk about classes [1], but interesting algorithms are interesting because they tractably give an answer to some large set of inputs. Their authors may not always know how to meaningfully parameterise the full set of inputs that they solve efficiently, but if they're interesting, then there's some implied set that could be parameterised.
But this is less handwavy than this may sound because SAT solvers don't actually solve "the SAT problem" efficiently. Rather, they make guesses about true/false assignments to particular SAT variables, and the algorithm knows when a guess is successful and when it isn't, because when it makes a wrong guess it backtracks (i.e. it knowingly takes a step that could be said to be "exponential"). Over the entire set of SAT problems of a particular length, every guess will still be right some of the time and wrong (leading to a backtrack) some of the time. Yet the utility of these solvers comes from them making good guesses more than half the time. That means that there must be something special about the set of problems that they can solve with a smaller-than-expected number of backtracks.
Put another way, every solver algorithm could be presented with specific instances, of any size, where the number of backtracks is small or large.
[1]: This makes the notion of circuit complexity a challenging, hence the notion of "uniformity": https://en.wikipedia.org/wiki/Circuit_complexity#Uniformity
> I don't think talking about these things in absolute terms is very useful. Everything depends on the problem and the customer's expectations.
Agreed. In the real world, approximate solutions to hard problems are often good enough.
For instance, the TSP might be a hard problem, but solving it exactly is pointless since the distances between nodes might be subject to measurement uncertainty anyway, and actual travel times might not be fixed as they might fall under a distribution.
Having said that, it’s still worth studying the theoretical aspects of these problems.
The example given doesn't seem right to me.
> There is one problem, though, that I find easily explainable. Place a token at the bottom left corner of a grid that extends infinitely up and right, call that point (0, 0). You're given list of valid displacement moves for the token, like (+1, +0), (-20, +13), (-5, -6), etc, and a target point like (700, 1). You may make any sequence of moves in any order, as long as no move ever puts the token off the grid. Does any sequence of moves bring you to the target?
If someone gives you such a sequence, it seems trivial to verify it in linear time. Even for arbitrary dimensions, and such witness can be verified in linear time.
To be in NP, witness must be verifiable in polynomial time with respect to the size of the original input. In this problem (VAS Reachability), the solution can be `2^2^2^...^K` steps long. Even if that's linear with respect to the witness, it's not polynomial with respect to the set of moves + goal.
Hmm.. I'd love to see a more formal statement of this, because it feels unintuitive.
Notably the question "given a number as input, output as many 1's as that number" is exponential in the input size. Is this problem therefore also strictly NP-hard?
> Hmm.. I'd love to see a more formal statement of this, because it feels unintuitive.
The problem is called "Vector Addition System Reachability", and the proof that it's Ackermann-complete is here: https://arxiv.org/pdf/2104.13866 (It's actually for "Vector Addition Systems with states, but the two are equivalent formalisms. They're also equivalent to Petri nets, which is what got me interested in this problem in the first place!)
> Notably the question "given a number as input, output as many 1's as that number" is exponential in the input size. Is this problem therefore also strictly NP-hard?
(Speaking off the cuff, so there's probably a mistake or two here, computability is hard and subtle!)
Compression features in a lot of NP-hard problems! For example, it's possible to represent some graphs as "boolean circuits", eg a function `f(a, b)` that's true iff nodes `a,b` have an edge between them. This can use exponentially less space, which means a normal NP-complete problem on the graph becomes NEXPTIME-complete if the graph is encoded as a circuit.
IMO this is cheating, which is why I don't like it as an example.
"Given K as input, output K 1's" is not a decision problem because it's not a boolean "yes/no". "Given K as input, does it output K ones" is a decision problem. But IIRC then for `K=3` your input is `(3, 111)` so it's still linear on the input size. I think.
My point here more than anything else is that I find this formulation unsatisfying because it is "easy" to verify that we have a witness, but is exponential only because the size of the witness is exponential in size.
What I'd like as a minimum example for "give me something that is NP-hard but not NP-complete" Is a problem whose input size is O(N), whose witnesses are O(N), but which requires O(e^N) time to validate that the witness is correct. I don't actually know that this is possible.
I disagree with the formulation of "decision problem" here. The problem, properly phrased as a decision problem, is "For a given K, does there exist a string with K 1s".
While it is straightforward to answer in the affirmative, and to produce an algorithm that produces that output (though it takes exponential time). To validate that a solution is in fact a valid solution also takes exponential time, if we are treating the size of the input as the base for the validation of the witness.
In the definition of NP, you get the input in unary, so that gets rid of that exponential. The issue with VAS is that the number of moves required could be extremely large.
There's a similar situation in chess (at least if generalized to N by N). You could assert Black has a forced win from a given position, but verifying the claim requires checking the entire game tree, rather than a succinct certificate. Generalized chess is in fact PSPACE-complete, which is generally believed to be outside of NP.
It needs to be a decision problem (or easily recast as a decision problem). "given a number as input, output as many 1's as that number" doesn't have a yes or no answer. You could try to ask a related question like "given a number as input and a list of 1s, are there as many 1s as the number?" but that has a very large input.
Straightforward to pose as a decision problem -- you're given a sequence of {0,1} representing a base-2 input (N), output a string of {0, 1} such that there are N 1's. You could make it "N 1s followed by N 0s" to avoid being too trivial, but even in the original formulation there are plenty of "wrong" answers for each value of N, and determining whether a witness is correct requires O(2^b) time, where b is the number of bits in N.
That's not a decision problem either. You need to cast it in the form of determining membership in some set of strings (the "language"). In this case the natural choice of language is the set of all strings of the form "{encoding of n}{1111....1 n times}", which is decidable in O(log(n)+n) steps, i.e. linear time.
Linear time in the length of the sequence, yes. But is the sequence length linear in dimension size, or number of moves given? Thats what is interesting.
Linear in the size of the witness, however many bits it takes to express it.
This applies to any computable problem though, no? At minimum the verifier has to read the witness. If we ignore PCPs and such. The point here is that the witness grows very fast in terms of vector dimensionality and/or move set size.
It doesn’t need to be linear though. Polynomial is enough.
But I guess it can be proven that the shortest possible sequence grows faster than polynomial.
Proving it's nonpolynomial is pretty easy: just make the goal vector `(10, 10, 10)` and the displacement vectors `{(1, 0, 0), (-10, 1, 0), (-10, -10, 1)}`. It takes ~1000 steps to reach the goal, and if we add one more dimension we need 10x more steps.
So it grows, at least, exponentially. Back in 1970ish someone found a problem that's at least double-exponential, proving it was 2-EXPTIME-hard. It was conjectured to be 2-EXPTIME-complete for the longest time, but turned out to be significantly harder.
I think they proved it grows with Ackermann function.
Sounds implausible… Number of computational steps to find the shortest sequence maybe grows with Ackermann function. But length of the shortest sequence?
I think you are right. I just read this article linked in the OP: https://www.quantamagazine.org/an-easy-sounding-problem-yiel...
A sequence is easy to verify. Choosing the sequence not so much.
Roughly put that is the certificate definition of being in NP.
The goal here was to show that it was strictly NP-hard, i.e. harder than any problem in NP.
Harder to solve, not necessarily harder to verify?
If I am understanding things right.
The crazy thing about the definition of NP-completeness is that Cook's theorem says that all problems in NP can be reduced in polynomial time to an NP-complete problem. So if a witness to a problem can be verified in polynomial time, it is by definition in NP and can be reduced to an NP-complete problem.
If I can verify a solution to this problem by finding a path in polynomial time, it is by definition in NP. The goal here was to present an example of a problem known to not be in NP.
> The crazy thing about the definition of NP-completeness is that Cook's theorem says that all problems in NP can be reduced in polynomial time to an NP-complete problem.
What were you trying to say here? Cook's theorem says that SAT is NP-complete. "All problems in NP can be reduced in polynomial time to an NP-complete problem" is just a part of the definition of NP-completeness.
[dead]
I agree with the issue ("What's bigger than a dog? THE MOON"), but I don't agree with the need to provide an example that is provably distinct from NP. We're fine providing NP-complete problems without proving that they're distinct from P.
There are lots of accessible problems that belong to spaces that probably aren't NP. One that should be familiar to most people studying CS theory is "do these two regular expressions match the same set of strings?".
---
For the Diophantine vector reachability problem... I don't really like the Quanta presentation. ("An easy-sounding problem yields numbers too big for our universe.")
First, the problem, if you describe it accurately, doesn't sound easy at all. Diophantine problems are never easy. That's why we have real numbers.
Second, the article suggests that the problem should be introduced to children by casting in in terms of several rules of exchange ("For Alice, the transition vector (−1, −1, 1, 0) would represent the exchange of an apple and a banana for a cantaloupe."). But that would make the problem trivial: you start at the origin; without a rule of "exchange" in which the other party gives you as much as you want of something for free, you're never going to leave it.
And third, many easy problems generate numbers too big for our universe. That's not unusual at all. Compare the question "how many digits of pi do you need before the potential error in calculating the volume of a sphere the size of the visible universe is less than the volume of one hydrogen atom?". Can you imagine using more digits than that? You just involved a number too big for the universe.
The article passes over this point itself:
> It’s physically impossible to write down all the digits of 2↑↑6 — a liability of living in such a small universe.
> Second, the article suggests that the problem should be introduced to children by casting in in terms of several rules of exchange ("For Alice, the transition vector (−1, −1, 1, 0) would represent the exchange of an apple and a banana for a cantaloupe."). But that would make the problem trivial: you start at the origin; without a rule of "exchange" in which the other party gives you as much as you want of something for free, you're never going to leave it.
I'm not sure what point you're trying to make here? That specific transition vector would be one such vector among several in a given ruleset. You're correct that for any ruleset, if I don't have at least one vector that can get me away from the origin, I'm stuck there, but then I also have to consider that I still might not be able to get to where I want to go. If I have the ruleset [(2, 0), (-4, 2)] I still can't get to (3, 5).
Something I find fascinating is that we know that P != EXPTIME, and that P <= NP <= EXPTIME, but have managed to prove neither P != NP nor NP != EXPTIME. NP has to be somewhere between them but we have no idea where.
"NP has to be somewhere between them but we have no idea where" – I guess that this state of affairs won't change much even if we prove P != NP?
I think that's unclear. A constructive proof of P != NP might yield insights into the "gap" between them.
Technically correct is the best kind of correct, and The Halting Problem is a fun head scratcher for a green computer scientist. I'm glad it was part of my introduction to NP hardness.
That said, you do make a great point OP, and I'll think about it every time the Halting Problem comes up.
This seems weird to me. I would think the classic way to introduce these topics is to first talk about decidability, introduce the halting problem through that, then introduce complexity. That's certainly how I was introduced to this material.
I wouldn't think that you'd learn about the halting problem as only an example of something harder than NP hard.
I agree. NP, NP-complete, and NP-hard are all talking about how long things take. If something is uncomputable, then how long it takes doesn't seem very relevant or interesting.
"If Jimmy earns 5 cents a day and a watermelon costs $10, how long will it take for Jimmy to earn enough money to buy the color blue?"
Maybe the connection is that it feels like the halting problem could be solved if you had an infinite amount of time?
The halting problem obviously can be solved if you have an infinite amount of time, as long as terminating is guaranteed to take place within a finite amount of time. If it's possible to terminate after an infinite amount of time, you'll need a more infinite amount of time to check for halting.
The million dollar question here is... a literal million dollar question.
The natural way to go up in complexity from NP is along the polynomial hierarchy, but since it could be the case that P=NP, those problems aren't provably harder. For all we know it could even be the case that P=PSPACE.
You could invoke the nondeterministic variant of the time-hierarchy theorem.
To be honest, checking if there is a path between two nodes is a better example of NP-Hard, because it's obvious why you can't verify a solution in polynomial time. Sure, the problem isn't decidable, but it's hard to give problems that are decidable and explain why the proof can't be verified in P time. Only problems that involve optimally playing a game (with more than one player) that can have cycles come to mind. These are the "easiest" to grasp.
Isn't this NP-complete? The "solution" here would be the steps to take in the path which can be found by brute-force
Wikipedia:
> 2. When the answer is "yes", this can be demonstrated through the existence of a short (polynomial length) solution.
> 3. The correctness of each solution can be verified quickly (namely, in polynomial time) and a brute-force search algorithm can find a solution by trying all possible solutions.
The halting problem is not even decidable. To mention it in the context of NP is a categorical faux pas.
> NP-hard is the set all problems at least as hard as NP-complete
I'm a bit confused by this. I thought NP-complete was the intersection of NP-hard and NP. Maybe this is stating the same thing?
NP-complete is indeed the intersection of NP-hard and NP.
If you can solve any NP-hard problem then you can solve any NP-complete problem (because you can convert any instance of an NP-complete problem to an NP-hard problem), so NP-hard problems are "at least as hard" as NP-complete problems.
(But an NP-hard problem might not be in NP, ie given a purported solution to an NP-hard problem you might not be able to verify it in polynomial time)
This [1] diagram from Wikipedia represents what you are saying in a visual way. NP-hard and NP-complete are defined as basically an equivalence class modulo an algorithm in P which transforms from one problem to the other. In more human terms, both NP-hard and NP-complete problems are reducible with a polynomial algorithm to another problem from their respective class, making them at least as hard.
The difference between the two classes, again in human terms, is that NP-complete problems are decision problems "Does X have property Y?", while NP-hard problems can include more types - search problems, optimization, etc, making the class NP-hard strictly larger than NP-complete in set terms.
The statement in the article means that NP-hard problems require solving an NP-complete problem plus a P problem - hence being at least as hard.
[1] https://en.m.wikipedia.org/wiki/File:P_np_np-complete_np-har...
I don't believe this is accurate.
The difference between NP-Hard and NP-Complete is that NP-Complete problems are both NP-Hard, and in NP. As opposed to NP-Hard problems that are not NP-Complete, meaning they are not themselves in NP.
If NP-hard isn't even a subset of NP, why is it called "NP-hard"?
Because a language being NP-hard implies it is at least as hard as the hardest NP problems. For any language that is NP-hard, if one had a turing machine that decided the language, one could construct a polynomial time transformation of any language in NP to the NP-hard language to decide it using the NP-hard turing machine.
Membership in NP is a statement about how easy a problem is (specifically, that you can verify an answer in no worse than polynomial time). NP-hard is about how hard a problem is (at a minimum).
I wonder if there would be less confusion over this if NP had been called "NP-easy" in the first place. But Wikipedia says that by now that term has already been taken.
P is "very easy": you can solve these in polynomial time. NP is "easy": you can verify a solution in polynomial time. Are there any problems you can verify but not solve in polynomial time? Nobody knows! (But ~everyone thinks so.)
NP-hard is "hard": it takes polynomial time or more to even verify these.
Because naming things is one of the two difficult problems in computer science.
I am confused about the precise statement of the problem that is claimed to be provably NP-hard, decidable and not in NP. Any clarification?
The article talks about a 2-dimensional grid which starts at (0,0) (bottom left) and extends infinitely to the right and the top, so all (x,y) for non-negative integers x,y exist. But points with negative x or y do not exist. Given a list of possible jumps, e.g. (+1,+10) or (-20,+13), and a target destination, e.g. (700,1): does there exist a series of valid jumps that takes you from (0,0) to (700,1) without ever going off grid (i.e. into negative territory)?
This problem might or might not be NP-Harder. However, now extend the problem to higher dimensional grids. At some number of dimensions, the problem becomes definitely NP-Harder (i.e. NP-hard, decidable, but not in NP)
Which number of dimensions, though? Is there an effective bound? Otherwise it is hard to suggest the problem as an example.
The number of dimensions is one of the variables in the problem.
Given a dimension, n; a collection of n-dimensional vectors of integers v_1,..,v_k; and a target n-dimensional vector u: Is there a sequence of vectors from the collection which sums to u, such that no partial sum (of an initial segment of the sequence) has any negative components?
The problem is that topics like descriptive complexity theory are typically reserved for graduate level courses, and what the author has access to is basically a survey.
There are entire hierarchies that describe what is "NP-Harder", specifically the second level of the polynomial hierarchy (NP being Σ_1^p, the first level).
IMHO one of the most useful parts of the -complete concepts is as a meet-me space where one can find possible reductions that may work in your use case, but even with just P and NP you can find useful mappings.
The arithmetic and exponential hierarchies, which the author mentions, are actually nearly as big a jump from the polynomial hierarchy as HALT is.
The reason that PSPACE (the class of decision problems solvable by a Turing machine in polynomial space) is invoked is because PSPACE contains PH, the Polynomial-Time Hierarchy[1], which is what the author seems to have been looking for. The PSPACE-complete problems[2] are also a great meet-me space for either looking for some special-case reduction, or understanding that complexity from your preferred abstract model.
As space is always more valuable than time (you have to read the space, which takes time), it is common to have that transition. But then things get crazy too.
HALT and NP are very closely related in a practical sense because of the Entscheidungsproblem, i.e. the decision problem.
But you have to think about the general case, not a restricted problem.
You can verify a solution to any NP problem in poly time, just as you can tell that a machine halts once it has halted, but you cannot pre-guess that.
A possibly useful lens for this is the generalization of HALT, Rice's Theorem[3]
> Rice's theorem states that all non-trivial semantic properties of programs are undecidable.
> Given a program P which takes a natural number n and returns a natural number P(n), the following questions are undecidable:
> * Does P terminate on a given n? (This is the halting problem.)
> * Does P terminate on 0?
> * Does P terminate on all n (i.e., is P total)?
> * Does P terminate and return 0 on every input?
> * Does P terminate and return 0 on some input?
> * Does P terminate and return the same value for all inputs?
> * Is P equivalent to a given program Q?
If you quit thinking about NP as something that can be brute forced (yes, NP can be simulated on a TM) and instead think about a decider program that has to decide without running the actual program, this will help connect the dots.
Understanding that this decision has to happen without access to the semantic properties (runtime behavior), only to the syntax of a program, will open up lots of concepts for you.
If moving past the oversimplified "survey" level explanations is of interest, there are many resources, but here is one playlist that some trusted people I know claimed was useful[4]. One of the more common descriptive complexity books is "Descriptive Complexity" by Neil Immerman, and even complexityzoo references[1] can help.
To be clear, I can understand why these oversimplified introduction explanations can be frustrating, and the curriculum could and should be improved.
[1] https://complexityzoo.net/Complexity_Zoo:P#ph
[2] https://en.m.wikipedia.org/wiki/List_of_PSPACE-complete_prob...
[3] https://en.m.wikipedia.org/wiki/Rice%27s_theorem
[4] https://www.youtube.com/playlist?list=PLm3J0oaFux3b8Gg1DdaJO...
Is the problem mentioned in the article equivalent to deciding whether there exists a solution to a system of linear equations in the positive integers?
I think no, because the vector additions are considered in sequence, and at no point are you allowed to leave the positive quadrant.
[Edit] Yeah, it's more than just solving positive integer linear systems: https://en.m.wikipedia.org/wiki/Vector_addition_system
IIRC without that restriction the problem is "just" NP-complete.
Well, for a computer that is a finite state machine there are only finitely many states. So, in finite time the machine will either (a) halt or (b) return to an earlier state and, thus, be in an infinite loop. So, in this case we can tell if the "program will stop" and, thus, solve "the halting problem".
Uh, we also assume that before we start, we can look at the design of "the machine" and know how many states there can be, and from the speed of the machine, how many new states are visited each second. So, we will know how long we have to wait before we see either (a) or (b).
Nobody is talking about a finite state machine in complexity theory. Its time complexity is n, and its space complexity is 0. The Halting Problem specifically pertains to Turing Machines.
What I wrote is correct but does not address the traditional Turing Machine halting problem or solve it.
The traditional halting problem is theoretical, i.e., not close to anything practical, and so is what I wrote.
And I did not claim that what I wrote is a contribution to "complexity theory".
The Turing Machine Halting Problem has long been standard computer science; from that it's common to conclude we can't tell if a program will stop. But that conclusion needs some theoretical assumptions, and my version is simpler and shows that we have to be careful drawing conclusions from the standard treatment.
> What I wrote is correct
What you wrote is pretty much trivial, and not related to the topic of the article.
It's also not practical, since the space needed to represent even a small program as an FSM is very large. Just try to write an FSM that reads binary coded 16 bit numbers, and has to find the maximum value. Now imagine sorting just 100 of them.
> What you wrote is ... not related to the topic of the article.
The article was "The Halting Problem is a terrible example of NP-Harder", and I wrote:
> So, in this case we can tell if the "program will stop" and, thus, solve "the halting problem".
> "Trivial"?
I never claimed the post was a significant contribution to computer science or complexity theory, and at most only a tiny fraction of posts at HN are such. Also, my post does not address the complexity theory question of P = NP.
If what I wrote was "new, correct, and significant", then I should have sent it to a good computer science journal. I've published enough papers in good math and computer science journals to notice that the checks take a long time to arrive. I'm not in academics and never wanted to be.
HN is not a journal of new, correct, and significant work in computer science.
From responses in this thread, the post has proven to be at least curious and interesting to some of the HN audience.
But the halting problem is specifically for any kind of program. Otherwise you can just say that every codebase is smaller than X petabytes anyway so it’s always decidable.
No, the halting problem doesn't hold when you have finite memory (as per OP's point).
As OP says, after finitely many iterations (at most 2^n many), you have to be back at a previously seen state, and can terminate.
Architectures like x86 can only address a finite amount of RAM, since they have a fixed word size (e.g. 64 bits). However, their memory is still unlimited, since they can read and write to arbitrary IO devices (HDDs, SSDs, S3, etc.); though those operations aren't constant time, they're O(sqrt(n)), since they require more and more layers of indirection (e.g. using an SSD to store the S3 URL of the address of ....)
The halting problem requires both
1) unlimited length of programs whose halting behaviour is to be decided (so it cannot be limited to programs of at most 10 petabytes)
2) for those programs to have unlimited memory available
You're arguing about point 2) in response to a post about point 1).
Well, the post I responded to was a response to a post about point 2. I simply reiterated OP's point.
This approach suffers from two major problems:
* It makes computability and complexity dependent on individual machines (now a program may halt on machine A, but not on machine B). For various reasons, we don't want that.
* The entire state space of a single machine consists of all the registers, memory cells, etc. But keeping track of all states that have been visited before requires exponentially more space than the space that is actually available on the machine (because you're computing the powerset). So the machine itself can't solve its halting problem, only a much more powerful one can.
Very often, impossibility results in infinitary mathematics translate back to "there's no reasonable way to do this" in actual practice where things are finite.
You wouldn't necessarily need to keep track of all states that have been visited before, to my understanding, just one extra state that you move along at half the speed (so it'll eventually also enter the loop, and then the state you're progressing at full speed will loop around and hit it). So 2X the memory and 1.5X the time/compute of the program on its own.
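A minimal sketch of that idea, assuming the program is modeled as a deterministic step function over comparable states (all names here are hypothetical):

    def halts(step, start, is_halting):
        # Tortoise-and-hare: the hare runs at full speed, the tortoise at
        # half speed, so only two states ever need to be stored.
        tortoise = hare = start
        while True:
            if is_halting(hare):
                return True
            hare = step(hare)
            if is_halting(hare):
                return True
            hare = step(hare)
            tortoise = step(tortoise)
            if tortoise == hare:
                return False  # they met inside a cycle: the program never halts

    print(halts(lambda s: s + 1, 0, lambda s: s == 10))       # True: halts after 10 steps
    print(halts(lambda s: (s + 1) % 5, 0, lambda s: s == 7))  # False: cycles through 0..4 forever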
This is interesting and I wasn't aware of this algorithm, thanks.
I still maintain that it's practically infeasible to exhaust the entire state space of any modern machine.
True, but practically I don't think you'd ever need to - should be able to narrow things down with insights/heuristics (and even with the simple algorithm alone, most programs would loop or halt far sooner).
Often seems to be presented as if humans can make some deductions or introspections that machines/AI would be incapable of seeing, or that the machine would have to rely on brute force for, but I don't think there's actually any result to that effect.
Even simpler, you're just trying to figure out if it halts or not, not the exact length of the cycle, so you can do it with only a single counter for the total number of states.
Say you have a machine with 128 bits of state. If it halts within 2^128 (roughly 340 undecillion) steps, that means it halts. If it's still going after 2^128 steps, then it'll keep going forever.
So, to see if a 128 bit machine halts, you only need an extra 128 bits to count up 1-by-1 to 2^128.
Sure, you also need a few thousand times the lifetime of the known universe just to count that high, but you know, it's O(2^128) instructions, which means it's O(1).
Good luck doing that on any machine with any real amount of memory of course. Really easy to write "just count to 2^128", but not so easy to do it.
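As a sketch, under the same deterministic step-function modeling as above (names made up; state_bits bounds the machine's entire state):

    def halts_by_counting(step, state, is_halting, state_bits):
        # At most 2**state_bits distinct configurations exist, so if the
        # machine hasn't halted after that many steps, some configuration
        # has repeated and it never will halt.
        for _ in range(2 ** state_bits):
            if is_halting(state):
                return True
            state = step(state)
        return False

    print(halts_by_counting(lambda s: (s + 1) % 5, 0, lambda s: s == 3, state_bits=3))  # True
    print(halts_by_counting(lambda s: (s + 1) % 5, 0, lambda s: s == 7, state_bits=3))  # False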
That's great and all, but nobody is talking about physical computers here.
Moreover, it's a useless observation in practice as well because the full exponentially-large state space necessary to represent such systems is not a useful formalism to approach computers with for any purpose, be it brutally practical or entirely theoretical. See https://news.ycombinator.com/item?id=23431703 for some previous comments I made on this, and some rough numbers based on computers that were already small 5 years ago and seem even smaller today. It is not useful to say "hey, that thing you're using has a finite state space and that finite state space is only 10^2,585,827,973 large!" It's not like you're going to run out.
More information on the algorithm to do so: https://en.wikipedia.org/wiki/Cycle_detection
Yes, lots of these problems assume fantasy infinite world.
Big O notation also suffers from this - it's almost useless for real world problems.
> Big O notation also suffers from this - it's almost useless for real world problems.
This is quite a bold claim. Yeah "in the real world" you often need to take into consideration the constant factors... but to go this far to say that Big O is useless? It's very useful to figure out what would likely work (or not) for your problem, given that you know the data size - I used it for exactly that, many many times.
Let's say you have 2 algorithms, one constant time O(1) and one exponential time O(2^n). It seems the constant-time one is better. But in fact it always runs for 42 years. While the exponential one starts to run longer than a millisecond for sizes bigger than the number of particles in the universe.
Big O says how fast complexity grows, but it doesn't say what that complexity exactly is.
Does something like that happen in the real world? Yes, but of course not in such drastic ways.
The fact that Big O notation is sometimes misleading or not helpful is not evidence that it is not generally useful.
https://www.tumblr.com/accidentallyquadratic
As a concept it is pretty useful for me in a handwavy astrophysics math way because I otherwise wouldn't necessarily know how to make my naive solution fit real world constraints.
If you are writing everyday programs, your constant factors are going to be very comparable and proportional to the size of your program, unless you have hardcoded numerical constants somewhere in your program like "run this loop 1 trillion times".
> While the exponential one starts to run longer than a millisecond for sizes bigger than the number of particles in the universe.
You don't really know how this works, do you? I can guarantee that there are more than 1M particles in the universe. And if you iterate 2^1M times, doing nothing, on a current computer... that's basically indistinguishable from an infinite loop.
If we're talking about "fantasy world" vs "real world", the "always run 42 years algorithm" surely is closer to fantasy world than most reasonable Big-O analyses...
If you need to pull an example from fantasy world to illustrate your point about Big-O notation not being useful "in the real world" you're probably committing the same alleged mistakes as computational theorists...
A very exaggerated example to use exp time… Whatever your constant, for exp time, it’s going to run for times longer than the life of the universe on probably any n > 10… Since you know, log time is practically constant
Most of the time the constants are going to be similar. And by similar I mean < 3 orders of magnitude say. So 2^9 or adding 9 to N eats that up pretty quick.
You find Big O notation useless for real world problems?
I find it useful to characterise algorithmic performance, e.g. O(1), O(n) O(n^2), etc.
What do you find useless about it - and do you know of something that works better?
Simpler algorithms are usually faster for small N, and N is usually small. Big O assumes a fantasy world where N is always large and constant factors that slow down an algorithm can be ignored.
https://news.ycombinator.com/item?id=26296339
Small N right?
Yeah, constant factors are not represented and often make a big real world difference. Good point.
>Simpler algorithms are usually faster for small N, and N is usually small.
This mentality is how we ended up with cpu and memory hogging Electron apps...
That's not an accurate description.
For example, it is common in sorting algorithms to do an n² algorithm like bubble sort when the list to sort is small (e.g. <50), whereas when it's larger, we do an n log n algorithm like merge sort.
The issue is that merge sort does recursion, which comes with some extra cost, so that an n² algorithm beats an n log n algorithm, provided n is small.
It has nothing to do with your criticism.
You can (and probably should) add a threshold to recursive algorithms like mergesort so that they don't end up doing recursive calls on very small arrays.
For arrays smaller than the threshold, insertion is faster than bubble sort. And if you really don't want recursion for large arrays to begin with, you have heapsort or Shell sort.
I don't think there is any practical case where using bubble sort makes sense, except "efficiency doesn't matter for my use case and this is the code I already have".
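For what it's worth, here's roughly what that threshold trick looks like in Python (a sketch, not a tuned implementation; the threshold value is just a guess):

    INSERTION_THRESHOLD = 32  # typical values are somewhere in the 16-64 range

    def insertion_sort(a, lo, hi):
        # Sort a[lo:hi] in place; quadratic, but very fast on tiny ranges.
        for i in range(lo + 1, hi):
            x = a[i]
            j = i - 1
            while j >= lo and a[j] > x:
                a[j + 1] = a[j]
                j -= 1
            a[j + 1] = x

    def merge_sort(a, lo=0, hi=None):
        # Top-down merge sort that hands small subarrays to insertion sort.
        if hi is None:
            hi = len(a)
        if hi - lo <= INSERTION_THRESHOLD:
            insertion_sort(a, lo, hi)
            return
        mid = (lo + hi) // 2
        merge_sort(a, lo, mid)
        merge_sort(a, mid, hi)
        merged = []
        i, j = lo, mid
        while i < mid and j < hi:
            if a[i] <= a[j]:
                merged.append(a[i])
                i += 1
            else:
                merged.append(a[j])
                j += 1
        merged.extend(a[i:mid])
        merged.extend(a[j:hi])
        a[lo:hi] = merged

    import random
    data = [random.randint(0, 999) for _ in range(10_000)]
    merge_sort(data)
    assert data == sorted(data)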
As in one of the volumes of Knuth's The Art of Computer Programming, heapsort is in place and achieves the Gleason bound, so it can be regarded as not beaten by quicksort.
Electron
a) isn’t really slow for what it does
b) introduces many layers of abstraction, leading to larger memory consumption compared to „native“ apps
c) is certainly not algorithmically unoptimized, it runs on a software stack that has been tuned like few others using billions of dollars
You just loosely associated words and concepts that occupy a similar emotional niche.
Better - estimate complexity, so you get things like "this algorithm will take 1234 + 2n + 500m operations" for our use case
Much better - research (not always possible)
Best - benchmark it using realistic data from your use case
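For example, even something as small as this (hypothetical candidates and made-up sizes; substitute your own functions and data) tells you more about your actual workload than the asymptotic class does:

    import random
    import timeit

    def linear_scan(xs, target):
        return target in xs

    def binary_search(xs, target):
        # xs must be sorted; classic lower-bound search.
        lo, hi = 0, len(xs)
        while lo < hi:
            mid = (lo + hi) // 2
            if xs[mid] < target:
                lo = mid + 1
            else:
                hi = mid
        return lo < len(xs) and xs[lo] == target

    data = sorted(random.sample(range(1_000_000), 10_000))
    queries = [random.randrange(1_000_000) for _ in range(200)]

    for fn in (linear_scan, binary_search):
        t = timeit.timeit(lambda: [fn(data, q) for q in queries], number=5)
        print(f"{fn.__name__}: {t:.3f}s")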
> Better - estimate complexity, so you get things like "this algorithm will take 1234 + 2n + 500m operations" for our use case
I feel like that is never the level of detail you want.
Either that is way too much detail and big-oh notation abstracts away the meaningless stuff.
Or, if this level of detail matters, then you really need to go much further. Think about cache hit ratios, memory layout issues, different speed of different operations, etc.
Calculating number of operations is a weird middle ground that seems pretty useless.
>Big O notation also suffers from this - it's almost useless for real world problems.
It's not the only way to optimize things but there's a reason no one sorts with bubble sort.
Yes, bubble sort will usually take more time, but big O notation does not say that quicksort will be better for your real-world problem.
Complexity estimations or just benchmarks are much better at that.
You should never limit yourself to big O notation when comparing algorithms.
>You should never limit yourself to big O notation when comparing algorithms.
Sure, but it's still an incredibly useful tool for considering how your algo will scale.
Not quite bubble sort, but many recursive sorting implementations will switch to an O(n^2) algorithm like insertion sort once they reach a small number of elements. Simple but poorly-scaling methods usually have a place as base cases of hybrid algorithms.
In other news, there's a reason no BLAS implementation multiplies matrices with Strassen's algorithm, and it's not that the people writing those are too dumb.
Still, in quicksort, big O notation analysis is what tells you that you can further improve efficiency by leaving small subarrays unsorted and then performing a single call to insertion on the whole array at the very end, rather than one insertion call to sort each individual small subarray. This result is far from obvious without big O analysis. So still useful, even there.
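A rough sketch of that variant (the cutoff value is a guess, and this is illustrative rather than a production sort):

    CUTOFF = 16  # subarrays at or below this size are left unsorted for the final pass

    def quicksort_partial(a, lo, hi):
        # Lomuto-partition quicksort that simply skips small subarrays.
        if hi - lo <= CUTOFF:
            return
        pivot = a[hi - 1]
        i = lo
        for j in range(lo, hi - 1):
            if a[j] < pivot:
                a[i], a[j] = a[j], a[i]
                i += 1
        a[i], a[hi - 1] = a[hi - 1], a[i]
        quicksort_partial(a, lo, i)
        quicksort_partial(a, i + 1, hi)

    def quicksort(a):
        quicksort_partial(a, 0, len(a))
        # Every element is now within CUTOFF positions of its final place,
        # so one insertion-sort pass over the whole array is O(n * CUTOFF),
        # not O(n^2).
        for i in range(1, len(a)):
            x = a[i]
            j = i - 1
            while j >= 0 and a[j] > x:
                a[j + 1] = a[j]
                j -= 1
            a[j + 1] = x

    import random
    data = [random.random() for _ in range(10_000)]
    quicksort(data)
    assert data == sorted(data)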
Please read accidentally quadratic from time to time...
Which is evidenced everyday when optimizing database performance by creating indexes /s
For any function f(n) the problem "compute 2^f(n)" is going to be Ω(f(n)), because the output is f(n) bits; so merely writing down the answer takes Ω(f(n)) steps.
Note that the number n is the input here, which is k = log(n) bits long. So in terms of the input length, the runtime is actually Ω(f(n)) = Ω(f(2^k)).
This is indeed as trivially true as you say, but this is not a decision problem, which is what the article is discussing. That said, you can use diagonalization over turing machines to construct a language that is indeed strictly more difficult.
Well, since you said "for any function f" ... It's not true for, say, constant functions.
How's that so?
Calculating 2^C for any fixed C is an asymptotically constant-time operation.