TDD is an incremental search from a simple start point towards a solution. Put in those terms, it'll work when the path from start to solution is smoothly differentiable.
I had a similar reaction to Jeffries' Gilded Rose solution in Ruby [1]. After 13 blog posts he ended up with something [2] a lot less elegant than Victor Shepelev who also used TDD but came up with a one-shot solution [3]. Just like Norvig with his Sudoku solver, Shepelev codifies the rules as a set of relationships. He writes:
When rereading the requirements, we might notice that all conditions can be described as a simple dependency (name, sell_in) => change of quality.
The most natural representation of this in the modern Ruby would be pattern matching.
The smart way to write a Sudoku solver these days is to formulate the problem for an SMT solver, which is specifically intended for that kind of constraint problem. The SMT solver already has tests, so you don't need to write tests.
When I wrote my first Sudoku solver I started out with a solver that solved easy problems where, at each step, there was some square with only one possible solution. That didn't work for harder problems where you have to guess. Eventually I realized that you could just pick any square, try all the numbers, and recursively solve the remaining problem. This works no matter which square you pick but it is faster if you start with the square that has the minimum number of choices available.
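That strategy (pick the most constrained empty square, try every candidate, recurse) fits in a few lines. A minimal Python sketch, with a representation of my own choosing: a 9x9 list of lists where 0 marks an empty square:

```python
def candidates(board, r, c):
    """Digits not yet used in the row, column, or 3x3 box of (r, c)."""
    used = set(board[r]) | {board[i][c] for i in range(9)}
    br, bc = 3 * (r // 3), 3 * (c // 3)
    used |= {board[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return [d for d in range(1, 10) if d not in used]

def solve(board):
    """Backtracking search over the most constrained square; mutates board in place."""
    best = None
    for r in range(9):
        for c in range(9):
            if board[r][c] == 0:
                cands = candidates(board, r, c)
                if best is None or len(cands) < len(best[2]):
                    best = (r, c, cands)
    if best is None:
        return True                      # no empty squares left: solved
    r, c, cands = best
    for d in cands:                      # try each candidate, recurse
        board[r][c] = d
        if solve(board):
            return True
    board[r][c] = 0                      # dead end: undo and backtrack
    return False

puzzle = [[0] * 9 for _ in range(9)]     # an empty grid is just a maximally underspecified puzzle
solve(puzzle)                            # fills in a complete valid grid
```

A square with zero candidates makes the `for` loop fall through immediately, so dead ends are pruned for free.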
If you write a solver haphazardly like that you're very likely to wind up with two or more copies of the board because you're not sure what kind of queries you'll need to do. If you understand the problem, you won't.
The role of the tests is complex here. In principle they could help you refactor your data structures, in practice the sheer bulk of them could reify the data structures you have and add to the burden of refactoring them. A code-golfed version of the solver could be smaller than the tests.
Writing a chess program I found testing was a challenge. Move generators are easy to test, evaluation functions are easy to test. A decent alpha-beta search with transposition tables, killer heuristic and such is not so easy to test in terms of unit tests. Before I'd have the program play anyone I'd run a set of integration tests based on
I think of a system I inherited that had a part prone to race conditions. Part of vanquishing them was writing a "super hammer" test that would run a few thousand threads for about 40 seconds. Maybe it still has race conditions, but they won't be triggered often.
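A hammer test along those lines is mostly brute force; here's a minimal Python sketch. The `Counter`, thread count, and iteration count are my own stand-ins, scaled well down from "a few thousand threads":

```python
import threading

class Counter:
    """Stand-in for the race-prone component; the lock is what's being defended."""
    def __init__(self):
        self._lock = threading.Lock()
        self.value = 0

    def incr(self):
        with self._lock:                   # without this lock, updates can be lost
            self.value += 1

def hammer(fn, n_threads=50, iterations=2000):
    """Call fn from many threads at once to shake out race conditions by volume."""
    def worker():
        for _ in range(iterations):
            fn()
    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

counter = Counter()
hammer(counter.incr)
assert counter.value == 50 * 2000          # any lost update means a race got through
```

As the parent comment says, passing proves nothing conclusively; it just makes surviving races rare enough to live with.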
Long-running tests like that aren't really unit tests but for some problems they're the right tool. In general though long-running tests are a big problem because slow builds are a problem.
> but it is faster if you start with the square that has the minimum number of choices available
Alternatively, find the digit that, in its row/column/square/whatever, has the minimum number of choices available, and try each of them.
Ideally, combine the two and use Knuth's dancing links algorithm (https://en.wikipedia.org/wiki/Dancing_Links). At every step it picks whichever of those two approaches has the lower number of possibilities, typically getting a lower branching factor without spending much more time per step.
In Sudoku I usually find it's easier to just keep track of guesses on the stack, by calling `solve` recursively when you make a guess. That gives nice memory re-use characteristics, without the overhead of link pairs that might end up blowing out your level 1 data cache.
But you certainly want to be guessing on the most constrained spot you can.
Also, bitfields are a nice compact way of representing the possible values of a square: your CPU can AND or OR all the possible values in parallel, and you can determine the number of possibilities with a modern CPU's popcount instruction.
I have only once written a Sudoku solver.
It brute-forced the whole problem space in a second or two.
This works because it's very easy to prune invalid branches according to the rules of the game.
It just tried putting "1" in the top-left-most empty square and checking if the configuration was valid. If not, try "2". Recurse until you find a branch that completes.
Is this the smart way? IDK, but I was quite pleased with it.
I found it more elegant than e.g. "find some square with only one possible solution"
I also wrote a brute-force, kinda dumb, depth-first Sudoku solver. I think it took me a few hours in C or C++. At every node in the search tree, I just checked to make sure no constraints were violated. If I remember correctly, it would run in less than 30 seconds and it would always find a solution if one existed. (It might have been a lot less than 30 seconds. I only remember thinking, "That was quick".)
I wrote a Sudoku solver that created a 9x9 array of arrays, filled with the numbers 1-9 for "blank" fields and just the given number for defined fields. Then, for each square with only one number, I removed that number from all the arrays in its row, column and subsquare. That typically left me with a solved array, though it could still contain all possible results for "guessing" fields. It was written in PHP (the thing I was learning at the time) and ran too fast for me to bother measuring, less than a second. Since I'd "solved all sudokus" they stopped being fun.
Yep, pretty much the same approach and experience, only mine was in C#, as I was learning the language then. "Brute force" by definition will find a solution, if one exists.
> incremental search from a simple start point towards a solution
Good point. More broadly than just TDD, we might characterize an "easy" problem as one where the surface from conception to completion is broadly visible and there's a logical set of steps (i.e. "differentiable"). Or maybe "easy" is when I can see the full set of steps immediately, and "tractable" is when I've got a good guess.
But the set of steps you can take is highly dependent on your knowledge and skillset. For example, if I know about completing the square, then deriving the quadratic equation is a basic algebra exercise. If I don't, then I might struggle for a long time before I manage to discover that key insight.
I'm sure there are coding methodologies which can lead to better or worse results, but they augment knowledge or skill, they don't replace it.
Isn't there a less-conceptual (but still conceptual) problem that correctness of software is commonly abrupt rather than continuous? You don't get a series of almost-right programs gradually approximating the right program, you have a correct program and variations on it may fail completely.
Of course, whether this is literally true depends on what sort of algorithmic problem you're approaching.
But there must be many real-world problems in which the very-nearly-correct program produces completely wrong behavior that doesn't resemble the correct behavior at all. In those circumstances, you couldn't expect to find the correct program empirically through iterative improvements.
Edit: Maybe the incremental test-driven approach would work in cases where you have an externally given specification or design that already breaks the algorithmic part up into smaller, easier, and more verifiable pieces.
Obvious counters aside (like syntax issues or whatever) I have almost the opposite intuition. Most of my programs start out as partial solutions to a problem I don't fully understand, and it is only through interaction with the environment (users, or other machines sometimes) that the edge-cases and incorrect assumptions become clear. These programs have a lifecycle of refinement to deployment to analysis to refinement and repeat. At each step they are workable or almost-so, and over time the solutions start to map the domain more correctly. Sometimes (like in business) the domain is evolving simultaneously!
(can give examples if anyone's interested but this is getting long already)
I imagine this wouldn't work so well for hard algorithmic stuff where there are mathematical properties you need to be aware of and maintain. But I find most problems I solve are more organic - people are quite resilient to fuzzy boundaries, so people-facing stuff tends to have that property too. There's a large fuzzy space of "workable solutions", so to speak, and navigating that space is kind of inevitable if you want a quality solution.
Perhaps I'm just not intelligent enough to one-shot that kind of stuff =P
>I imagine this wouldn't work so well for hard algorithmic stuff where there are mathematical properties you need to be aware of and maintain
Mathematical properties are often even more ideal candidates for being encoded into either types or property tests.
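For instance, a property-based check can be hand-rolled with the stdlib in a few lines. Libraries like Hypothesis do this properly (with shrinking and smarter generators); the sorting example here is just an illustration of encoding properties rather than examples:

```python
import random

def check_sort_properties(sort_fn, trials=500, seed=0):
    """Assert mathematical properties of a sort over random inputs:
    the output is ordered, and it is a permutation of the input."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.randint(-100, 100) for _ in range(rng.randint(0, 20))]
        ys = sort_fn(xs)
        assert all(a <= b for a, b in zip(ys, ys[1:])), f"not ordered: {ys}"
        assert sorted(xs) == sorted(ys), f"not a permutation: {xs} -> {ys}"

check_sort_properties(sorted)   # the built-in passes, of course
```

The point is that "ordered" and "is a permutation" are the specification; no hand-picked example can drift out of date the way `assert sort([3,1,2]) == [1,2,3]` can.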
Business-oriented code is usually where most people see TDD (as it is normally taught) fall down: where you need "some kind of dashboard with x, y and z" but the exact details aren't precisely nailed down. However, if you do it right, these scenarios work extremely well with snapshot-test-driven development. Most people just don't do it or don't know how.
Okay interesting, I've never heard of snapshot testing. I'll have to play with it some time.
I agree that mathematical problems are much easier to test, but I think only once you know the mathematics. Like I think it's possible that TDD fell flat for the sudoku solver because the dude just didn't know what properties he wanted. In that situation writing tests is like casting bones.
But I'm not convinced one way or the other... for me tests have always been most useful for regression and basic quality checks. Which is very useful! Means you never (hopefully anyway) take a step backwards as you evolve a program.
Aren't snapshot tests regression tests? "Snapshot test driven development" to me implies that you would generate the snapshot you want (somehow) and write code until the output matched the snapshot.
1. Write the test scenario: the setup and the call whose output will be captured (no snapshot exists yet).
2. Write code that gets the test to pass, generating a snapshot.
3. Iterate on the code until the snapshot looks right (maybe bringing stakeholders in the loop to ask "does this dashboard look right?").
4. Lock the snapshot down and commit.
5. Refactor (same as with vanilla TDD).
Arguably the "test" is fully written by stage 1; the only part missing is the correct generated artefact. I don't really give a fuck about semantics of the word "test" though, the process is what matters.
The snapshots and test steps and metadata can also be compiled together to generate documentation with the right framework - documentation as a side effect of TDD.
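The mechanism itself is simple enough to hand-roll; a minimal sketch below. Real frameworks (Jest's snapshots in the JS world, or pytest plugins like syrupy) do this with diffing and update flags; the directory name and function here are my own stand-ins:

```python
import json
import pathlib

SNAPSHOT_DIR = pathlib.Path("snapshots")   # hypothetical location for stored snapshots

def assert_matches_snapshot(name, value, update=False):
    """First run writes the snapshot; later runs fail if the output drifts."""
    SNAPSHOT_DIR.mkdir(exist_ok=True)
    path = SNAPSHOT_DIR / f"{name}.json"
    rendered = json.dumps(value, indent=2, sort_keys=True)
    if update or not path.exists():
        path.write_text(rendered)          # steps 2-3: generate, eyeball, iterate
        return
    assert path.read_text() == rendered, f"snapshot {name!r} changed"
```

Reviewing the generated file with stakeholders and then committing it is the "lock it down" step; from then on it behaves as a regression test.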
That sounds pretty cool, did you have any framework in mind in that last paragraph? Sounds miles better than what I'm currently doing, which is occasionally click through features I know are prone to bugs and manually check them... and manually writing documentation too.
> Isn't there a less-conceptual (but still conceptual) problem that correctness of software is commonly abrupt rather than continuous? You don't get a series of almost-right programs gradually approximating the right program, you have a correct program and variations on it may fail completely.
I consider it plausible that such a topology could exist (at least for many situations). The problem, rather, is that such a topology would likely behave very differently from users' expectations.
Working large systems overwhelmingly started out as working small systems, with working systems all in-between.
This is not an endorsement of TDD, but it shows that there is a correctness path from small to large, without usually needing to take large leaps in between, and taking such a path tends to be the most successful strategy.
> Wouldn't it be nice if you just had a general method you could apply to solve everything?
Would it? Why?
> But the reality is there's no single approach we can take.
> I know, I know, this is all so deeply... unsatisfying.
Why is it so unsatisfying?
This is the only point in the post I disagree with: the idea that a general approach to solving problems is necessary for problem solving to be satisfying. What is so disconcerting about a problem that seems to be unsolvable? Why do we need to solve everything immediately?
Maybe it's important not only to accept, but to embrace, the fact that there is a sort of magic to the process of creativity that we can't capture in a bottle. But for that to happen we need to divorce ourselves from the necessity of productivity. That way we are more likely to be able to just sit with a problem, without beating ourselves up over not being able to solve it and without feeling like the problem owes us a resolution in return for our efforts.
It doesn't prevent anyone however from pursuing better tools and more general approaches to systematizing thought and problem solving. We can do both -- tackling a problem creatively or systematically -- as long as we feel like we are able to choose doing it. I think if we feel like we must find these general methods, they will likely elude us indefinitely.
This sounds very similar to the challenge that companies had implementing Six Sigma (or other statistically based quality programs). The tools significantly helped the average person improve the quality of their processes as determined by a variety of metrics. However, they were never going to tell you the solution or improvement to implement. They could give hints as to where to look for a solution, but someone still had to dream up the solution. For this reason, brainstorming was always one of the most critical tools in the Six Sigma toolkit.
I was fascinated by the "Sudoku Affair", found myself speculating on the internal mindset of TDD advocates, and ended up with an unsatisfying conclusion that you can't systematize thought. Not my best writing but thought I'd share nonetheless.
I think there’s an analogy in the dumb management dream that there’s a process by which you can replace these expensive skilled workers with cheap unskilled idiots and still get great results.
Another variation: the magic process that lets you get great results from people who don’t care.
Excellence doesn’t come solely from process. The benchmark of great process with unskilled workers who don’t care is fast food, and that product is consistently mediocre. Consistency is good, but consistent mediocrity is an unworthy ambition in many fields.
NASA is massively process driven but they get people to space and back, and the people in their process are highly skilled and care deeply.
Larry Wall, inventor of Perl, used to say (and probably still does) that complexity has to go somewhere. If you’re solving a complex problem then either it’s a simple program with complex tools, or a complex program with simple tools.
That’s really stuck with me. The art of library, framework, language, API design is to provide a set of tools — if you make simple tools then complex problems become complex programs. And if you offer complex tools, complex problems can have simple solutions. And people will gripe at you for all the punctuation for decades. :)
But the complexity exists and can’t be wished away. Which is kinda your point too. Thanks for writing!
> Consistency is good but consistent mediocrity is an unworthy ambition in many fields.
Sorry to take only a small part of an otherwise great comment but is this actually true? It seems to me that there is a great many fields in which consistency is more important than excellence, especially if the striving for excellence produces great misses as well sometimes. In a well-designed system with some allowed tolerance, as long as it's good enough you are fine. Take the electricity grid for example: There's no prizes for maintaining the frequency to within a nano-Hertz of the spec. There are very large fines for being outside the spec (+/-0.050 Hertz for the EU grid). Being consistently within spec is much more valuable than occasionally performing much better than the spec.
It is only in extreme winner-takes-all fields like sports, spacefaring and entrepreneuring that being the absolute best is what you want. In most other fields being consistently decent beats out varying excellence. I definitely wouldn't want my dentist to take a risky moonshot in pursuit of excellence, for example.
"I definitely wouldn't want my dentist to take a risky moonshot in pursuit of excellence". Excellent comment.
I was aware as I was writing it that consistent mediocrity is indeed a profitable target. Perhaps we're quibbling over "mediocrity"? Staying within the spec seems enough of the challenge for a grid, I'm not sure that I'd define excellence as narrower and narrower variation around the spec there.
I was making a moral judgement in "unworthy". I like cars to come off the factory floor consistently good. I own a Tesla, this statement has high salience for me. That seems like a challenge. I wouldn't respect the Lada factory for consistently turning out cars that break down or fall apart, just as I don't respect McDonalds for consistently delivering underwhelming food. I acknowledge the consistency, I acknowledge it's profitable, but they're not achieving consistent greatness.
> there’s a process by which you can replace these expensive skilled workers with cheap unskilled idiots and still get great results.
> Another variation: the magic process that lets you get great results from people who don’t care.
Isn't that pretty much what the US military does? They take in "dumb" teens + docs + processes and get whatever it is they need out of this? It's also cheap (compared w/ industry). And by the end of the exercise the "dumb" teens become skilled in their own niche, some of them highly skilled (see nuclear specialists, ATCs, pilots, navigators, managers, procurement, etc.)
The military processes are at the base of lots of organisational stuff we have in software dev and business as well (agile, scrum, ooda, etc)
Where do you get the idea that this is cheap? Militaries are somewhat famous for being incredible money sinks. They need incredibly specific skills that are not really taught in the civilian world. This means the military needs to train everyone at its own cost. So while the input might be "cheap, unskilled idiots", there is then a significant expense to turn them into expensive skilled non-idiots before they are ready for duty.
As someone who worked inside the military for 14 years, it is also quite a stretch to say they consistently get "great" results tbh.
I feel like you're saying that education (as practiced by the US military) is a somewhat reliable process for taking unskilled people and making them skilled. I agree.
The US Military have admissions criteria. They bounce people out in Basic Training and in every other part of training. Not everyone gets to be an ATC, a pilot, or a nuclear specialist. Air Traffic Control turns out not to be a process you can put an unskilled person into. It requires training and careful integration, and once you have a skilled person with many hours invested in them, perhaps you can let them direct traffic.
Training isn't a magic process that takes unskilled people who don't care and delivers great results every time. The equivalent in our world would be "I'll hire cheap people who don't know how to program, put them through a Bootcamp process, and then I'll have great programmers." That didn't work.
I am trying to say that every version of programming where there's been The Process You Just Have To Follow (from Jackson Structured Design onwards) has failed significantly and hasn't been a substitute for hiring smart people who care. If someone came to me with a business plan that was "hire mediocre people who don't care, and we'll achieve great results because of My Process", I'd be veeeeery skeptical indeed.
"Scrum" isn't really "Agile", though. Maybe someone developed it through a truly Agile process, but then they froze it and held it up as an unchanged ideal and started down the oh-so-appealing road of telling anyone that finds it doesn't work for them that it's because they weren't doing it right.
I find myself having to distinguish between what I call "Real Agile" and Scrum quite a bit, because Scrum is exactly the sort of thing that Real Agile was a reaction against.
I've raided Scrum for ideas in my agile processes, but I would never rigidly do exactly it.
Royce's original "Waterfall" paper[1] is significantly more iterative & agile than Scrum! It's outdated since at the time (1970) programming was done mostly on paper & then run later on a mainframe or special-purpose computer, but many of the core ideas still hold true for modern programming environments.
Of course many of the organizations that claimed to be implementing Waterfall omitted the iterative steps, then wondered why it didn't work. That also happens with "Agile" processes. Sticking to the waterfall analogy, it's like if you stopped the evaporation & precipitation cycle that refills the upstream aquifer & wondered why the waterfall stopped flowing!
Agile is a set of considerations to ponder if you think you want to operate in a manager-less environment. The supplemental 12 Principles[1] goes into a bit more detail, explaining that developers need to take on jobs like communicating with the business people and amongst themselves to supplant what a manager would normally take care of.
Scrum, on the other hand, is a "process" to manage teams. It even assigns what it calls a "Scrum Master" to act as a manager. Literally the opposite of Agile. Scrum does propose that if you follow it, it can help lead you to eventually reach a state of Agile, which seems to be the source of the association, but when have you actually ever seen that happen?
I am aware that Scrum essentially defines itself as "agile". However, the Agile Manifesto defines a very, very specific sort of thing as "Agile", which is what I'm calling "real agile", and Scrum is not it.
Scrum could be it, if it presented itself as "here's a set of things to consider as tools you could deploy, but hey, do whatever works". But it doesn't. It is every bit as prescriptive as any of the methodologies that real Agile is a revolt against and can and does have all of the pathologies when it is applied in places where it doesn't make sense, or even just excessively rigidly. Scrum as it is practiced in the real world responds to "it's not working" with "do it more accurately, then!", not "oh, well, fix it up as you see fit". That's why so many of us here have such a visceral distaste of it. Many of us have enough experience and run-ins with "Scrum" to know that anyone trying to claim "Oh, but it 'really' wants you to be Agile and change it to work however you need to" is in practice just motte-and-bailey. That's not how it works in the wild.
You need to figure this out sooner or later or you're going to be deeply and repeatedly taken advantage of in life: Just because someone puts a label on something doesn't mean that label is accurate. Scrum isn't Agile and I don't care how many times someone grabs a label printer, prints out the word "Agile", and slaps it on Scrum. It's plainly obviously not an Agile methodology and never was.
Agile isn't a methodology; it's a meta-methodology, which is why it's so hard to productize.
It is not even that. It is basically just a roundabout statement of "We believe software developers should be in control of the entire software development process". Or even more succinctly, "No managers". The Twelve Principles highlights the things developers need to consider when there isn't a manager around to do that work for them.
Scrum, while having little to do with Agile in and of itself, suggests that if you follow it, developers can start to learn how to operate on their own. This seems to be why it is commonly associated with Agile.
> Agile isn't a methodology; it's a meta-methodology, which is why it's so hard to productize
More specifically, it's a meta-methodology that specifically rejects methodology as a one-size-fits-all (or even custom but top-down-imposed) product, and explicitly holds that methodology is an emergent product of continuous optimization within, and specific to, the team doing the work.
Scrum is training wheels for agile. If your team were truly terrible and previously had 6 monthly iterations or something you'll be seeing some "agile" benefits which may seem amazing compared to the bullshit you put up with before.
Training wheels do ultimately need to come off, though.
If you were using kanban, TDD, pairing, CI, close customer feedback, multiple daily releases and all that good stuff, enforcing Scrum is little different to putting training wheels on a tour de france team's bikes, patting the cyclists on the head telling them that if they use the wheels correctly it'll give them a speed boost.
I'd be somewhat more charitable and say that it's an attempt to break down a big, complicated problem into chunks that are more easily digestible. The problem I think Jeffries ran into is that Sudoku and its search space are essentially atomic: breaking it down further isn't helpful, even though it seems simple to break it down into row, column, and box and work from there.
There's the principle "never let them see you sweat" and if you're trying to convince people of an idea like TDD you never want to be seen floundering. You don't publish anything about your sudoku solver until you've succeeded at it. Otherwise you're just proving "TDD sux" which, for all "X sux", TDD sux more than X.
This is a rare type of article - a concrete analysis of different approaches to programming (that are arguably themselves reflections of different cognitive styles), that outlines the shortcomings of one approach in a specific domain, without generalising too much.
You see similar in other arenas, too. And sometimes, as annoying as it is, popular techniques really do have better success than not. Even more annoying, unpopular techniques can often have better success rate than we care to acknowledge.
The examples in my mind are: outlining, task breakdown, object modeling, and rote repetition.
And this is exactly why the idea that AI is going to make human thought and skill redundant is crazy. LLMs can never get to a point where you can whip them up to solve any general intellectual challenge. Even creating CRUD apps (that are well-behaved, secure, performant, scalable, and maintainable) can't really be totally systematized.
I respect that he's open about his work and struggles, and it's cool that he's programming at 86, but it does seem like his approach makes things harder for him rather than easier.
For example with the bowling score calculator, it's great to start with some tests, but I think he then marched towards a specific OOP formulation which obscured rather than clarified the problem.
It's making the same reasoning mistake that a lot of discussion of the Entscheidungsproblem makes: the problem asks for a generic algorithm that answers, for all programs P and all tasks T, whether P solves T; the discussion then assumes that you can't decide whether P solves T for any particular P or T.
With that in mind, let's look at the crux of the argument: "If we can't decide if a program P solves a task T, then we certainly can't solve the even harder problem of finding a program P that solves a given task."
That's simply not true. We can decide if a program P solves a task T, for a specific program. Moreover, that means that for large classes of tasks, we actually can throw mud at the wall and see if it sticks - as long as it's decidable if the particular program P solves the particular task T.
And for any problems you can exhaustively test, hey, you really can rely entirely on TDD and hill-climbing the problem space. Hence, bowling scores being easier than Sudoku solvers.
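Bowling is the canonical case: a scorer (the common textbook formulation below, not any particular author's) is small enough that a handful of whole-game tests pin it down almost exhaustively:

```python
def score(rolls):
    """Score a complete ten-pin game from its list of per-roll pin counts."""
    total, i = 0, 0
    for _ in range(10):                      # exactly ten frames
        if rolls[i] == 10:                   # strike: 10 plus the next two rolls
            total += 10 + rolls[i + 1] + rolls[i + 2]
            i += 1
        elif rolls[i] + rolls[i + 1] == 10:  # spare: 10 plus the next roll
            total += 10 + rolls[i + 2]
            i += 2
        else:                                # open frame: just the pins knocked down
            total += rolls[i] + rolls[i + 1]
            i += 2
    return total

assert score([0] * 20) == 0          # gutter game
assert score([5] * 21) == 150        # all spares of 5
assert score([10] * 12) == 300       # perfect game
```

Those three games plus a few open-frame cases cover every rule interaction, which is why the bowling kata hill-climbs so nicely under TDD.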
As soon as you leave the (almost) exhaustively testable space, things become harder. And it's worth keeping in mind that TDD originates from a payroll system, something that's more amenable to exhaustive testing ("do all of our employees get the right amount, did HR/finance stop yelling, and is our CFO not getting jailed for tax evasion") than to a systematic approach. (Government plus corporate bureaucracy means that there are absolutely no deep structures to find. It's all about the special cases.)
You can still do "TDD" at a higher level. You just need to accept that TDD is simply a primitive form of formally establishing your constraints, that some constraints are hard to establish purely through testing, and that many formalisms exist for expressing higher-level constraints.
This is at the core of the Haskell/Rust quip that "once the type checker/borrow checker is quiet, you can be confident the solution works".
Maybe constraint-driven design would've been a better name.
I won't argue with the idea that you can test whether a particular program produces a result (we aren't so interested in programs that run forever, in any case). Your observation about the origins of TDD makes sense in this regard: there may be domains where the TDD approach is more useful, but it's not a generic panacea, and assuming it is can lead to sadness.
My point was more about arbitrary programs and tasks. In general, I don't think there's any particular process you can follow that, in effect, reduces programming or math to a checklist. I'm sure there are techniques that can help structure many problems but they're necessarily fairly general e.g. "think about the problem, take a nap, think about it some more".
What I was hoping to highlight was more the danger of trying to hill-climb without having the right set of skills, thus assuming you can solve a problem by following a particular pattern.
To give an example, I remember trying to derive the quadratic equation when I was young, and no amount of algebraic repositioning was going to help me unless I knew how to complete the square! But once that type of trick is in your toolbox, you can begin to solve all sorts of other problems.
Even the bowling calculator can become hard if you don't know what you're doing, not because TDD is bad or a wrong way to do things, but because you don't know how to reason about your program. Even bowling has 12^13 outcomes, with a bunch of silly special cases if you approach it the wrong way. Thus a truly naive approach might lead you down a weird direction.
I mean, you can. "Have an initial idea. Define a utility function. Apply gradient descent".
It's just that all three steps are really really hard.
TDD's insight is that gradient descent is relatively easy if the utility function is one-dimensional and monotonic. (Bonus points: it still works with the set of initial ideas being empty.)
The "tricks in your toolbox" are ultimately all about simplifying the utility function from "exhaustive mapping of the problem domain to the solution domain proves valid" to something simpler. In your example, you mapped the problem from "solve an order-two polynomial" to "complete the square, take the square root, solve an order-one polynomial".
You _could_ apply these tricks mechanically (hey, that's what symbolic algebraic systems do in your example), but it would require an initial formal specification of the problem - that's ultimately what Norvig does for his Sudoku approach - and, for a general approach, a way to reason over formal specifications.
It always boils down to "how well do you understand the problem, and how well can you describe it formally". TDD works best for "not at all, not at all".
I'm reminded of Rich Hickey's Hammock Driven Development talk, whose thesis for design is basically:
1. Think about the problem and write it all down
2. Research what other people have done to solve this problem
3. Think about it some more and write all that down too
4. Sleep on it
I think the conclusion is the same: Good design does not naturally arise from good programming, and background/domain knowledge is essential
That’s pretty much just Feynman’s algorithm but with more steps:
1. Write down the problem.
2. Think real hard.
3. Write down the solution.
Understand the problem...
That is actually his step 1. Step 0 might be write down the problem as stated - but in his writing step 1 is really, "structure the problem." Of course you might have structured it wrong, that is in part where step 2 comes in.
But Uncle Bob said that if I follow his TDD process by rote then my code will automatically attain good design! /s
My view of Uncle Bob is that his advice is an attempt to improve the average situation, even if taken literally as gospel.
It’s better that everybody do rote TDD than no tests at all. It’s better to have all methods be exactly 3 lines of code than to have all methods be 3 pages long.
Uncle Bob preaches a better baseline; experience will teach all the nuance.
Disclaimer: I don’t do TDD. But it’s a nice method of work to know.
This was an interesting read but the source article is fascinating https://explaining.software/archive/the-sudoku-affair/
TDD is an incremental search from a simple starting point towards a solution. Put in those terms, it'll work when the path from start to solution is smoothly differentiable.
I had a similar reaction to Jeffries' Gilded Rose solution in Ruby [1]. After 13 blog posts he ended up with something [2] a lot less elegant than Victor Shepelev who also used TDD but came up with a one-shot solution [3]. Just like Norvig with his Sudoku solver, Shepelev codifies the rules as a set of relationships. He writes:
1. https://ronjeffries.com/articles/020-01ff/gilded-rose-1/
2. https://github.com/RonJeffries/gold-1/blob/master/wrappers.r...
3. https://zverok.substack.com/i/149071314/the-implementation
The smart way to write a Sudoku solver these days is to formulate the problem for an SMT solver, which is specifically intended to solve that kind of problem. The SMT solver already has tests, so you don't need to write tests.
When I wrote my first Sudoku solver I started out with a solver that solved easy problems where, at each step, there was some square with only one possible solution. That didn't work for harder problems where you have to guess. Eventually I realized that you could just pick any square, try all the numbers, and recursively solve the remaining problem. This works no matter which square you pick but it is faster if you start with the square that has the minimum number of choices available.
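A minimal Python sketch of the approach described above: pick the empty square with the fewest candidates, try each, and recurse (a 9x9 list of lists with 0 for empty squares is assumed):

```python
# Recursive backtracking with the minimum-choices heuristic described
# above. Board is a 9x9 list of lists; 0 marks an empty square.

def candidates(board, r, c):
    """Digits not yet used in row r, column c, or the 3x3 box."""
    used = set(board[r]) | {board[i][c] for i in range(9)}
    br, bc = 3 * (r // 3), 3 * (c // 3)
    used |= {board[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return [d for d in range(1, 10) if d not in used]

def solve(board):
    """Solve in place; return True on success, False if unsolvable."""
    # Find the empty square with the minimum number of choices.
    best = None
    for r in range(9):
        for c in range(9):
            if board[r][c] == 0:
                cands = candidates(board, r, c)
                if best is None or len(cands) < len(best[2]):
                    best = (r, c, cands)
    if best is None:
        return True  # no empty squares left: solved
    r, c, cands = best
    for d in cands:
        board[r][c] = d
        if solve(board):
            return True
    board[r][c] = 0  # undo and backtrack
    return False
```

Picking any empty square still works, as the comment notes; the `best` scan just shrinks the branching factor.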
If you write a solver haphazardly like that you're very likely to wind up with two or more copies of the board because you're not sure what kind of queries you'll need to do. If you understand the problem, you won't.
The role of the tests is complex here. In principle they could help you refactor your data structures, in practice the sheer bulk of them could reify the data structures you have and add to the burden of refactoring them. A code-golfed version of the solver could be smaller than the tests.
Writing a chess program I found testing was a challenge. Move generators are easy to test, evaluation functions are easy to test. A decent alpha-beta search with transposition tables, killer heuristic and such is not so easy to test in terms of unit tests. Before I'd have the program play anyone I'd run a set of integration tests based on
https://www.chessprogramming.org/Bratko-Kopec_Test
I think of a system I inherited that had a part prone to race conditions; part of vanquishing the race conditions was writing a "super hammer" test that would run a few thousand threads for about 40 seconds. Maybe it still has race conditions, but they won't be triggered often.
Long-running tests like that aren't really unit tests but for some problems they're the right tool. In general though long-running tests are a big problem because slow builds are a problem.
> but it is faster if you start with the square that has the minimum number of choices available
Alternatively, find the digit that, in its row/column/square/whatever, has the minimum number of choices available, and try each of them.
Ideally, combine the two and use Knuth’s dancing links algorithm (https://en.wikipedia.org/wiki/Dancing_Links). It, at every step, picks the one of those two approaches that has the lower number of possibilities, typically getting a lower branching factor, without spending much more time at each step.
In Sudoku I usually find it's easier to just keep track of guesses on the stack, by calling `solve` recursively when you make a guess. That gives nice memory re-use characteristics without the overhead of link pairs that might end up blowing out your level 1 data cache.
But you certainly want to be guessing on the most constrained spot you can.
Also, bitfields are a nice compact way of representing the possible values of a square: your CPU can AND or OR all the possible values in parallel, and determine the number of possibilities with a modern CPU's popcount instruction.
This is how I implemented it: https://codeberg.org/uecker/toys/src/branch/main/sudoku.c
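In Python the same trick looks roughly like this (the `popcount` helper stands in for the CPU instruction; one 9-bit mask per cell is assumed):

```python
# Sketch of the bitfield idea: bit d of a 9-bit mask means digit d+1
# is still possible for a cell. Row/column/box constraints combine
# with a single OR; popcount gives the number of remaining choices.

FULL = 0b111111111  # all nine digits possible

def used_mask(digits):
    """OR together the bits for every digit already placed."""
    m = 0
    for d in digits:
        if d:
            m |= 1 << (d - 1)
    return m

def choices(row_used, col_used, box_used):
    """Mask of digits still legal for a cell, given the three masks."""
    return FULL & ~(row_used | col_used | box_used)

def popcount(mask):
    """Number of set bits (what the CPU instruction computes directly)."""
    return bin(mask).count("1")
```

For example, with 1-3 used in the row, 4-5 in the column and 6 in the box, `popcount(choices(...))` reports three remaining candidates (7, 8, 9).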
I have only once written a Sudoku solver. It brute-forced the whole problem space in a second or two.
This works because it's very easy to prune invalid branches according to the rules of the game.
It just tried putting "1" in the top-left-most empty square and checking if the configuration was valid. If not, try "2". Recurse until you find a branch that completes.
Is this the smart way? IDK, but I was quite pleased with it. I found it more elegant than e.g. "find some square with only one possible solution"
I also wrote a brute force, kinda dumb, depth-first Sudoku solver. I think it took me a few hours in C or C++. At every node in the search tree, I just checked to make sure no constraints were violated. If I remember correctly, it would run in less than 30 seconds and it would always find a solution if a solution existed. (It might have been a lot less than 30 seconds. I only remember thinking, "That was quick".)
I wrote a Sudoku solver that created a 9x9 array of arrays, filled with the numbers 1-9 for "blank" fields and only the given number in defined fields. Then for each square with only one number, I removed that number from all the arrays in its row, column and subsquare. That typically left me with a solved array, though it would still contain all possible results for "guessing" fields. It was written in PHP (the thing I was learning at the time) and ran too fast for me to bother measuring, less than a second. Since I'd "solved all Sudokus" they stopped being fun.
Yep, pretty much the same approach and experience, only mine was in C# as I was learning the language then. "Brute force" by definition will find a solution, if it exists.
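A rough Python sketch of that propagation scheme, with each cell held as a set of candidates (the representation is assumed, not taken from the original PHP):

```python
# Rough sketch of the propagation described above: every blank cell
# starts as the full candidate set, and each solved cell removes its
# digit from the candidates of its row, column and subsquare peers.

def peers(r, c):
    """Coordinates sharing a row, column or 3x3 box with (r, c)."""
    br, bc = 3 * (r // 3), 3 * (c // 3)
    cells = {(r, j) for j in range(9)} | {(i, c) for i in range(9)}
    cells |= {(i, j) for i in range(br, br + 3) for j in range(bc, bc + 3)}
    cells.discard((r, c))
    return cells

def propagate(grid):
    """grid: 9x9 of candidate sets; singletons repeatedly prune peers."""
    changed = True
    while changed:
        changed = False
        for r in range(9):
            for c in range(9):
                if len(grid[r][c]) == 1:
                    d = next(iter(grid[r][c]))
                    for (i, j) in peers(r, c):
                        if d in grid[i][j] and len(grid[i][j]) > 1:
                            grid[i][j].discard(d)
                            changed = True
    return grid
```

As the comment says, on typical puzzles this alone converges; on hard ones the leftover multi-candidate cells are exactly the "guessing" fields.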
> incremental search from a simple start point towards a solutions
Good point. More broadly than just TDD, we might characterize an "easy" problem as one where the surface from conception to completion is broadly visible and there's a logical set of steps (i.e. "differentiable"). Or maybe "easy" is I can see the full set of steps immediately, and "tractable" is that I've got a good guess.
But the set of steps you can take is highly dependent on your knowledge and skillset. For example, if I know about completing the square, then deriving the quadratic equation is a basic algebra exercise. If I don't, then I might struggle for a long time before I manage to discover that key insight.
I'm sure there are coding methodologies which can lead to better or worse results, but they augment knowledge or skill, they don't replace it.
Isn't there a less-conceptual (but still conceptual) problem that correctness of software is commonly abrupt rather than continuous? You don't get a series of almost-right programs gradually approximating the right program, you have a correct program and variations on it may fail completely.
Of course, whether this is literally true depends on what sort of algorithmic problem you're approaching.
But there must be many real-world problems in which the very-nearly-correct program produces completely wrong behavior that doesn't resemble the correct behavior at all. In those circumstances, you couldn't expect to find the correct program empirically through iterative improvements.
Edit: Maybe the incremental test-driven approach would work in cases where you have an externally given specification or design that already breaks the algorithmic part up into smaller, easier, and more verifiable pieces.
Obvious counters aside (like syntax issues or whatever) I have almost the opposite intuition. Most of my programs start out as partial solutions to a problem I don't fully understand, and it is only through interaction with the environment (users, or other machines sometimes) that the edge-cases and incorrect assumptions become clear. These programs have a lifecycle of refinement to deployment to analysis to refinement and repeat. At each step they are workable or almost-so, and over time the solutions start to map the domain more correctly. Sometimes (like in business) the domain is evolving simultaneously!
(can give examples if anyone's interested but this is getting long already)
I imagine this wouldn't work so well for hard algorithmic stuff where there are mathematical properties you need to be aware of and maintain. But I find most problems I solve are more organic - people are quite resilient to fuzzy boundaries, so people-facing stuff tends to have that property too. There's a large fuzzy space of "workable solutions", so to speak, and navigating that space is kind of inevitable if you want a quality solution.
Perhaps I'm just not intelligent enough to one-shot that kind of stuff =P
>I imagine this wouldn't work so well for hard algorithmic stuff where there are mathematical properties you need to be aware of and maintain
Mathematical properties are often even more ideal candidates for being encoded into either types or property tests.
Business-oriented code is usually where most people see TDD (how it is normally taught) fall down - where you need "some kind of dashboard with x, y and z" but the exact details aren't precisely nailed down. However, if you can do it right, these scenarios work extremely well with snapshot test driven development. Most people just don't do it or don't know how.
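For the "types or property tests" point above, a dependency-free sketch of what a property test looks like (`check_property` and `random_list` are hypothetical helpers, not any particular framework):

```python
# Dependency-free sketch of a property test: assert an algebraic
# property over many random inputs rather than a few hand-picked
# examples.

import random

def check_property(prop, gen, trials=500):
    """Evaluate prop(x) for `trials` generated inputs."""
    for _ in range(trials):
        x = gen()
        assert prop(x), f"property failed for {x!r}"

def random_list():
    return [random.randint(-100, 100) for _ in range(random.randint(0, 20))]

# Reversing twice is the identity; sorting is idempotent.
check_property(lambda xs: list(reversed(list(reversed(xs)))) == xs, random_list)
check_property(lambda xs: sorted(sorted(xs)) == sorted(xs), random_list)
```

Dedicated libraries add shrinking of failing inputs, but the core idea is just this loop.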
Okay interesting, I've never heard of snapshot testing. I'll have to play with it some time.
I agree that mathematical problems are much easier to test, but I think only once you know the mathematics. Like I think it's possible that TDD fell flat for the sudoku solver because the dude just didn't know what properties he wanted. In that situation writing tests is like casting bones.
But I'm not convinced one way or the other... for me tests have always been most useful for regression and basic quality checks. Which is very useful! Means you never (hopefully anyway) take a step backwards as you evolve a program.
Aren't snapshot tests regression tests? "Snapshot test driven development" to me implies that you would generate the snapshot you want (somehow) and write code until the output matched the snapshot.
Snapshot test driven development is:
1. Write test.
2. Write code that gets the test to pass, generating a snapshot.
3. Iterate on the code until the snapshot looks right (maybe bringing stakeholders in the loop to ask "does this dashboard look right?").
4. Lock the snapshot down and commit.
5. Refactor (same as with vanilla TDD).
Arguably the "test" is fully written by step 1; the only part missing is the correct generated artefact. I don't really give a fuck about the semantics of the word "test" though, the process is what matters.
The snapshots and test steps and metadata can also be compiled together to generate documentation with the right framework - documentation as a side effect of TDD.
That sounds pretty cool, did you have any framework in mind in that last paragraph? Sounds miles better than what I'm currently doing, which is occasionally click through features I know are prone to bugs and manually check them... and manually writing documentation too.
hitchstory
I always use the following analogy:
If your customer orders a feature and you implement all the code but the button to call the feature is missing, then you delivered nothing.
If you just add the button to the program, but implement nothing else, you delivered the feature. It's just still buggy.
> Isn't there a less-conceptual (but still conceptual) problem that correctness of software is commonly abrupt rather than continuous? You don't get a series of almost-right programs gradually approximating the right program, you have a correct program and variations on it may fail completely.
I consider it plausible that such a topology could exist (at least for many situations). The problem, rather, is that such a topology would likely behave very differently from users' expectations.
Yes. The difference between "A program that does what you want" and "A program that crashes on startup" can be one character.
Ya but the real fun ones are the "a program does what you want 99.993% of the time"
The annoying situation being that sometimes that 0.007% requires a major rearchitecture!
Working large systems overwhelmingly started out as working small systems, with working systems all in-between.
This is not an endorsement of TDD, but shows that there is a correctness path from small to large, without usually needing to take large leaps in-between, and taking such a path tends to be the most successful strategy.
> Wouldn't it be nice if you just had a general method you could apply to solve everything?
Would it? Why?
> But the reality is there's no single approach we can take.
> I know, I know, this is all so deeply... unsatisfying.
Why is it so unsatisfying?
This is the only part of the post I disagree with: the idea that a general approach to solving problems is necessary for problem solving to be satisfying. What is so disconcerting about a problem that seems to be unsolvable? Why do we need to solve everything immediately?
Maybe it's important not only to accept, but to embrace the fact that there is a sort of magic to the process of creativity that we can't capture in a bottle. But for that to happen we need to divorce ourselves from the necessity of productivity. This way we are more likely to be able to just sit with the problem, without beating ourselves up over not being able to solve it and without feeling like the problem owes us a resolution in return for our efforts.
It doesn't prevent anyone, however, from pursuing better tools and more general approaches to systematizing thought and problem solving. We can do both -- tackle a problem creatively or systematically -- as long as we feel like we are able to choose. I think if we feel like we must find these general methods, they will likely elude us indefinitely.
This sounds very similar to the challenge that companies had implementing Six Sigma (or other statistically-based quality programs). The tools significantly helped the average person improve the quality of their processes as determined by a variety of metrics. However, they were never going to tell you the solution or improvement to implement. They could give hints as to where to look for a solution, but someone still had to dream up the solution. For this reason, brainstorming was always one of the most critical tools in the Six Sigma toolkit.
I was fascinated by the "Sudoku Affair", found myself speculating on the internal mindset of TDD advocates, and ended up with an unsatisfying conclusion that you can't systematize thought. Not my best writing but thought I'd share nonetheless.
I think there’s an analogy in the dumb management dream that there’s a process by which you can replace these expensive skilled workers with cheap unskilled idiots and still get great results.
Another variation: the magic process that lets you get great results from people who don’t care.
Excellence doesn’t come solely from process. The benchmark of great process with unskilled workers who don’t care is fast food, and the product is consistently mediocre. Consistency is good, but consistent mediocrity is an unworthy ambition in many fields.
NASA is massively process driven but they get people to space and back, and the people in their process are highly skilled and care deeply.
Larry Wall, inventor of Perl, used to say (and probably still does) that complexity has to go somewhere. If you’re solving a complex problem then either it’s a simple program with complex tools, or a complex program with simple tools.
That’s really stuck with me. The art of library, framework, language, API design is to provide a set of tools — if you make simple tools then complex problems become complex programs. And if you offer complex tools, complex problems can have simple solutions. And people will gripe at you for all the punctuation for decades. :)
But the complexity exists and can’t be wished away. Which is kinda your point too. Thanks for writing!
> Consistency is good but consistent mediocrity is an unworthy ambition in many fields.
Sorry to take only a small part of an otherwise great comment, but is this actually true? It seems to me that there are a great many fields in which consistency is more important than excellence, especially if the striving for excellence sometimes produces great misses as well. In a well-designed system with some allowed tolerance, as long as it's good enough you are fine. Take the electricity grid for example: there are no prizes for maintaining the frequency to within a nano-Hertz of the spec. There are very large fines for being outside the spec (+/-0.050 Hertz for the EU grid). Being consistently within spec is much more valuable than occasionally performing much better than the spec.
It is only in extreme winner-takes-all fields like sports, spacefaring and entrepreneuring that being the absolute best is what you want. In most other fields being consistently decent beats out varying excellence. I definitely wouldn't want my dentist to take a risky moonshot in pursuit of excellence, for example.
"I definitely wouldn't want my dentist to take a risky moonshot in pursuit of excellence". Excellent comment.
I was aware as I was writing it that consistent mediocrity is indeed a profitable target. Perhaps we're quibbling over "mediocrity"? Staying within the spec seems enough of the challenge for a grid, I'm not sure that I'd define excellence as narrower and narrower variation around the spec there.
I was making a moral judgement in "unworthy". I like cars to come off the factory floor consistently good. I own a Tesla, this statement has high salience for me. That seems like a challenge. I wouldn't respect the Lada factory for consistently turning out cars that break down or fall apart, just as I don't respect McDonalds for consistently delivering underwhelming food. I acknowledge the consistency, I acknowledge it's profitable, but they're not achieving consistent greatness.
> there’s a process by which you can replace these expensive skilled workers with cheap unskilled idiots and still get great results.
> Another variation: the magic process that lets you get great results from people who don’t care.
Isn't that pretty much what the US military does? They take in "dumb" teens + docs + processes and get whatever it is they need out of this? It's also cheap (compared w/ industry). And by the end of the exercise the "dumb" teens become skilled in their own niche, some of them highly skilled (see nuclear specialists, ATCs, pilots, navigators, managers, procurement, etc.)
The military processes are at the base of lots of organisational stuff we have in software dev and business as well (agile, scrum, ooda, etc)
Where do you get the idea that this is cheap? Militaries are somewhat famous for being incredible money sinks. They need incredibly specific skills that are not really taught in the civilian world. This means the military needs to train everyone at its own cost. So while the input might be "cheap, unskilled idiots", there is then a significant expense to turn them into expensive skilled non-idiots before they are ready for duty.
As someone who worked inside the military for 14 years, it is also quite a stretch to say they consistently get "great" results tbh.
I feel like you're saying that education (as practiced by the US military) is a somewhat reliable process for taking unskilled people and making them skilled. I agree.
The US Military have admissions criteria. They bounce people out in Basic Training and in every other part of training. Not everyone gets to be an ATC, a pilot, or a nuclear specialist. Air Traffic Control turns out not to be a process you can put an unskilled person into. It requires training and careful integration, and once you have a skilled person with many hours invested in them, perhaps you can let them direct traffic.
Training isn't a magic process that takes unskilled people who don't care and delivers great results every time. The equivalent in our world would be "I'll hire cheap people who don't know how to program, put them through a Bootcamp process, and then I'll have great programmers." That didn't work.
I am trying to say that every version of programming where there's been The Process You Just Have To Follow (from Jackson Structured Design onwards) has failed significantly, and hasn't been a substitute for hiring smart people who care. If someone came to me with a business plan that was "hire mediocre people who don't care, and we'll achieve great results because of My Process", I'd be veeeeery skeptical indeed.
> The military processes are at the base of lots of organisational stuff we have in software dev and business as well (agile, scrum [...])
The Agile Manifesto was exactly the counter-manifesto to this, and thus any methodology that calls itself agile (e.g. Scrum) is bound by it:
Agile Manifesto
> https://agilemanifesto.org/
Principles behind the Agile Manifesto
> https://agilemanifesto.org/principles.html
"Scrum" isn't really "Agile", though. Maybe someone developed it through a truly Agile process, but then they froze it and held it up as an unchanged ideal and started down the oh-so-appealing road of telling anyone that finds it doesn't work for them that it's because they weren't doing it right.
I find myself having to distinguish between what I call "Real Agile" and Scrum quite a bit, because Scrum is exactly the sort of thing that Real Agile was a reaction against.
I've raided Scrum for ideas in my agile processes, but I would never rigidly do exactly it.
Royce's original "Waterfall" paper[1] is significantly more iterative & agile than Scrum! It's outdated since at the time (1970) programming was done mostly on paper & then run later on a mainframe or special-purpose computer, but many of the core ideas still hold true for modern programming environments.
Of course many of the organizations that claimed to be implementing Waterfall omitted the iterative steps, then wondered why it didn't work. That also happens with "Agile" processes. Sticking to the waterfall analogy, it's like if you stopped the evaporation & precipitation cycle that refills the upstream aquifer & wondered why the waterfall stopped flowing!
[1] https://www.praxisframework.org/files/royce1970.pdf
> "Scrum" isn't really "Agile", though.
Wikipedia:
> https://en.wikipedia.org/wiki/Scrum_(software_development)
"Scrum is an agile team collaboration framework commonly used in software development and other industries."
In other words: people who claim to do Scrum, but in a non-agile way are simply scammers.
Agile is a set of considerations to ponder if you think you want to operate in a manager-less environment. The supplemental 12 Principles[1] goes into a bit more detail, explaining that developers need to take on jobs like communicating with the business people and amongst themselves to supplant what a manager would normally take care of.
Scrum, on the other hand, is a "process" to manage teams. It even assigns what it calls a "Scrum Master" to act as a manager. Literally the opposite of Agile. Scrum does propose that if you follow it, it can help lead you to eventually reach a state of Agile, which seems to be the source of the association, but when have you actually ever seen that happen?
[1] https://agilemanifesto.org/principles.html
I am aware that Scrum essentially defines itself as "agile". However, the Agile Manifesto defines a very, very specific sort of thing as "Agile", which is what I'm calling "real agile", and Scrum is not it.
Scrum could be it, if it presented itself as "here's a set of things to consider as tools you could deploy, but hey, do whatever works". But it doesn't. It is every bit as prescriptive as any of the methodologies that real Agile is a revolt against and can and does have all of the pathologies when it is applied in places where it doesn't make sense, or even just excessively rigidly. Scrum as it is practiced in the real world responds to "it's not working" with "do it more accurately, then!", not "oh, well, fix it up as you see fit". That's why so many of us here have such a visceral distaste of it. Many of us have enough experience and run-ins with "Scrum" to know that anyone trying to claim "Oh, but it 'really' wants you to be Agile and change it to work however you need to" is in practice just motte-and-bailey. That's not how it works in the wild.
You need to figure this out sooner or later or you're going to be deeply and repeatedly taken advantage of in life: Just because someone puts a label on something doesn't mean that label is accurate. Scrum isn't Agile and I don't care how many times someone grabs a label printer, prints out the word "Agile", and slaps it on Scrum. It's plainly obviously not an Agile methodology and never was.
Agile isn't a methodology; it's a meta-methodology, which is why it's so hard to productize.
> it's a meta-methodology
It is not even that. It is basically just a roundabout statement of "We believe software developers should be in control of the entire software development process". Or even more succinctly, "No managers". The Twelve Principles highlights the things developers need to consider when there isn't a manager around to do that work for them.
Scrum, while having little to do with Agile in and of itself, suggests that if you follow it, developers can start to learn how to operate on their own. This seems to be why it is commonly associated with Agile.
> Agile isn't a methodology; it's a meta-methodology, which is why it's so hard to productize
More specifically, it's a meta-methodology that specifically rejects methodology as a one-size-fits-all (or even custom but top-down-imposable) product, and explicitly holds that methodology is an emergent product of continuous optimization within, and specific to, the team doing the work.
Scrum is training wheels for agile. If your team were truly terrible and previously had 6 monthly iterations or something you'll be seeing some "agile" benefits which may seem amazing compared to the bullshit you put up with before.
Training wheels do ultimately need to come off, though.
If you were using kanban, TDD, pairing, CI, close customer feedback, multiple daily releases and all that good stuff, enforcing Scrum is little different to putting training wheels on a tour de france team's bikes, patting the cyclists on the head telling them that if they use the wheels correctly it'll give them a speed boost.
I'd be somewhat more charitable and say that it's an attempt to break down a big, complicated problem into chunks that are more easily digestible. The problem I think Jeffries ran into is that Sudoku and its search space are essentially atomic: breaking it down further isn't helpful, even though it seems simple to break it down into row, column, and box and work from there.
There's the principle "never let them see you sweat" and if you're trying to convince people of an idea like TDD you never want to be seen floundering. You don't publish anything about your sudoku solver until you've succeeded at it. Otherwise you're just proving "TDD sux" which, for all "X sux", TDD sux more than X.
This is a rare type of article - a concrete analysis of different approaches to programming (that are arguably themselves reflections of different cognitive styles), that outlines the shortcomings of one approach in a specific domain, without generalising too much.
You see similar in other arenas, too. And sometimes, as annoying as it is, popular techniques really do have better success than not. Even more annoying, unpopular techniques can often have a better success rate than we care to acknowledge.
The examples in my mind are: outlining, task breakdown, object modeling, and rote repetition.
And this is exactly why thinking that AI is going to make human thought and skill redundant is crazy. LLMs can never get to a point where you can whip them up to solve any general intellectual challenge. Even creating CRUD apps (that are well-behaved, secure, performant, scalable, and maintainable) can't really be totally systematized.
Wow, I just read the original Sudoku post by Norvig yesterday for the first time, and now today this shows up here!
It was bound to happen to someone.
Wow, the Ron Jeffries articles are sort of embarrassing, and he doesn't even realize it. This is why dogma never works.
I respect he's open about his work and struggles, and it's cool he's programming at 86, but it does seem like his approach makes it harder for him rather than easier.
For example with the bowling score calculator, it's great to start with some tests, but I think he then marched towards a specific OOP formulation which obscured rather than clarified the problem.
It's making the same reasoning mistake that a lot of discussion of the Entscheidungsproblem makes: the problem asks for a generic algorithm that answers, for all programs P and all tasks T, whether P solves T; the discussion assumes that you can't decide whether P solves T for any particular P or T.
With that in mind, let's look at the crux of the argument: "If we can't decide if a program P solves a task T, then we certainly can't solve the even harder problem of finding a program P that solves a given task."
That's simply not true. We can decide if a program P solves a task T, for a specific program. Moreover, that means that for large classes of tasks, we actually can throw mud at the wall and see if it sticks - as long as it's decidable if the particular program P solves the particular task T.
And for any problems you can exhaustively test, hey, you really can rely entirely on TDD and hill-climbing the problem space. Hence, bowling scores being easier than Sudoku solvers.
As soon as you leave the (almost) exhaustively testable space, things become harder. And it's worth keeping in mind that TDD originates from a payroll system - something that's more amenable to exhaustive testing ("do all of our employees get the right amount, did HR/finance stop yelling, and is our CFO not getting jailed for tax evasion") than to a systematic approach. (Government plus corporate bureaucracy means that there are absolutely no deep structures to find. It's all about the special cases.)
You can still do "TDD" at a higher level. You just need to accept that TDD is simply a primitive form of formally establishing your constraints, and that some constraints are hard to establish purely through testing. But that there exist many formalism to allow you to express higher level constraints.
This is at the core of the Haskell/Rust quip that "once the type checker/borrow checker is quiet, you can be confident the solution works".
Maybe constraint-driven design would've been a better name.
I won't argue with the claim that you can test whether a particular program produces a result (we aren't much interested in programs that run forever, in any case). Your observation about the origins of TDD makes sense in this regard: there may be domains where the TDD approach is more useful, but it's not a generic panacea, and assuming it is can lead to sadness.
My point was more about arbitrary programs and tasks. In general, I don't think there's any particular process you can follow that, in effect, reduces programming or math to a checklist. I'm sure there are techniques that can help structure many problems but they're necessarily fairly general e.g. "think about the problem, take a nap, think about it some more".
What I was hoping to highlight was more the danger of trying to hill-climb without having the right set of skills, thus assuming you can solve a problem by following a particular pattern.
To give an example, I remember trying to derive the quadratic formula when I was young, and no amount of algebraic repositioning was going to help me unless I knew how to complete the square! But once that type of trick is in your toolbox, you can begin to solve all sorts of other problems.
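For what it's worth, the trick in question can be written out in a few lines - completing the square is exactly the step that turns the general quadratic into something you can take a square root of:

```latex
\begin{aligned}
ax^2 + bx + c &= 0 \\
x^2 + \tfrac{b}{a}x &= -\tfrac{c}{a} \\
\left(x + \tfrac{b}{2a}\right)^2 &= \tfrac{b^2}{4a^2} - \tfrac{c}{a}
  = \tfrac{b^2 - 4ac}{4a^2} \\
x &= \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}
\end{aligned}
```

Without knowing the second-to-third step, no amount of rearranging the first line gets you to the last one.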
Even the bowling calculator can become hard if you don't know what you're doing - not because TDD is bad or a wrong way to do things, but because you don't know how to reason about your program. Even bowling has 12^13 outcomes, with a bunch of silly special cases if you approach it the wrong way. Thus a truly naive approach might lead you down a weird direction.
> reduces programming or math to a checklist
I mean, you can. "Have an initial idea. Define a utility function. Apply gradient descent".
It's just that all three steps are really really hard.
TDD's insight is that gradient descent is relatively easy if the utility function is one-dimensional and monotonic. (Bonus points: it still works even when the set of initial ideas is empty.)
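The three-step checklist can even be written down literally for a toy problem (a sketch only - the function and numbers are made up for illustration, and real programming problems don't have differentiable utility functions):

```python
# The "checklist" made literal: (1) initial idea, (2) utility function,
# (3) gradient descent. Each step is trivial here and hard in real life.
def gradient_descent(grad, x0, lr=0.1, steps=200):
    """Minimize a 1-D utility by repeatedly stepping against its gradient."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)  # move downhill
    return x

# Step 2: toy utility (x - 3)^2, so its gradient is 2(x - 3); minimum at x = 3.
grad = lambda x: 2 * (x - 3)

# Step 1: the "initial idea" is just a starting point.
result = gradient_descent(grad, x0=0.0)
print(round(result, 3))  # -> 3.0
```

TDD's one-dimensional, monotonic utility is the analogue of this well-behaved toy: "number of failing tests" only ever needs to go down.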
The "tricks in your toolbox" are ultimately all about simplifying the utility function, from "exhaustive mapping of problem domain to solution domain proves valid" to something simpler. In your example, you mapped the problem from "solve an order-two polynomial" to "complete the square, take the square root, solve an order-one polynomial".
You _could_ apply these tricks mechanically (hey, that's what symbolic algebraic systems do in your example), but it would require an initial formal specification of the problem - that's ultimately what Norvig does for his Sudoku approach - and, for a general approach, a way to reason over formal specifications.
It always boils down to "how well do you understand the problem, and how well can you describe it formally". TDD works best for "not at all, not at all".