Testing Thought

The Definitions I’m Using for This Essay

Thought: any mental event: perception, memory, association, image, reaction.

Idea: a thought that asserts something about the world, a system, or what should be done. An opinion, conviction, or principle. A plan, purpose, or goal.

Many thoughts are musings, distractions, ruminations, associations, entertainment - they pass through and are mostly forgotten; your brain deems them relevant for just a moment. However, when a thought or idea keeps emerging in your head, independent of its relevance to recent events, it is effectively attached to something unresolved - it should be developed, synthesised, reasoned about, or expressed somewhere. You need perspective on it. If you keep turning a thought over in your mind, adding to it, removing from it, compounding it into a narrative without ever writing it down or building any material trace of it - never attempting to properly verify it and remove bias from it - you are just engaging in a kind of self-flattery when the narrative pleases you, or self-torture when it doesn’t. When an underlying thought keeps emerging and develops into variations of an idea that never motivates a change in behaviour, the original content has been transformed but not internalised. It hasn’t mattered enough to you, because it either doesn’t ring true or it makes you uncomfortable. Many people respond by waiting to collect more data.

The problem is that ideas have a kind of truth-weight: you either find them emotionally aligned, or hold analytical conviction in them, before verifying anything. And something that feels true feels especially good when it is yours - you feel productive, and maybe even smart. That feeling is not evidence of anything, though. The only way to find out whether an idea has weight is to push it into contact with something that can reject it - a sharper version of yourself, another mind, a system that depends on it being right. Without that contact, conviction is just attachment.

There is a method for finding out which of your ideas hold real weight. It is the same method, in different guises, across philosophy, science and many other disciplines (notably for me, software engineering and cyber security). Each domain runs an archetypal loop: produce a candidate, expose it to a stricter referee, analyse the result, update based on what comes back (with each referee being itself a candidate with its own referee). The process of production is the discipline of putting concepts in front of judges who do not share your priors. You do this so you can make something reproducible that generates value when it is applied.

At each complete version, you can repeat, fork, reduce, expand or re-purpose - or use it as inspiration, example or reference.
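The loop described above (produce a candidate, expose it to a referee, analyse the verdict, update) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `Verdict`, `Referee`, `refine`, `no_absolutes` and `soften` are hypothetical names invented for this sketch, and the toy referee merely rejects claims that contain the word "always".

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical names throughout: these types and functions illustrate
# the loop in the essay and are not an established API.

@dataclass
class Verdict:
    accepted: bool
    reason: str  # what the referee objected to, empty if nothing

Referee = Callable[[str], Verdict]

def refine(candidate: str,
           referee: Referee,
           update: Callable[[str, Verdict], str],
           max_rounds: int = 5) -> tuple[str, list[Verdict]]:
    """Expose the candidate to the referee and revise it until it is
    accepted or the round budget runs out. Returns the last candidate
    and the full history of verdicts."""
    history: list[Verdict] = []
    for _ in range(max_rounds):
        verdict = referee(candidate)            # expose to a stricter judge
        history.append(verdict)                 # keep a material trace
        if verdict.accepted:
            break
        candidate = update(candidate, verdict)  # revise on what came back
    return candidate, history

def no_absolutes(claim: str) -> Verdict:
    """Toy referee: rejects any claim that uses the word 'always'."""
    if "always" in claim:
        return Verdict(False, "absolute claim: 'always'")
    return Verdict(True, "")

def soften(claim: str, verdict: Verdict) -> str:
    """Toy update: weakens the claim the referee objected to."""
    return claim.replace("always", "usually")

final, history = refine("Writing always exposes gaps", no_absolutes, soften)
```

Here the first verdict is a rejection and the second an acceptance, so `history` holds two entries and `final` is the softened claim. The referee is passed in as a plain function so that it can be swapped for a stricter one at the next stage.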

The Stages of Production Are Progressive Evaluation Pressure

Each stage is a stricter test than the one before.

Stage	What It Tests
Conception	Nothing.
Research	Plausibility: whether its presuppositions hold when compared with existing and adjacent knowledge.
Writing / Editing	Legibility: whether the idea survives being made explicit and communicated.
Publishing / Presenting	Plausibility to others: whether it survives contact with other people’s models.
Demonstrating / Teaching	Utility: whether it can bear load in someone else’s hands.

The stages are not a one-way conveyor belt. You cannot prove you have accounted for what you do not know - the set of possible counterexamples and unconsidered angles is not enumerable in advance. Every stage’s verdict is therefore provisional. A piece of evidence, a counterexample, a reading you had not anticipated can arrive at any moment and send the idea back to any prior stage - not just the one immediately before. When writing surfaces a missing premise, go back to research. If publication exposes a counterargument you cannot answer, go back to writing. When demonstration fails, the failure is surely somewhere upstream. Each stage is dialectical with every one before it - the point is iterating toward something reproducible. The reason to illustrate it as stages is that the test at each gate is distinct, not that the path through them is linear. The stages give you direction and a destination; in practice, it’s navigation.
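As a rough sketch of that navigation, the stages can be modelled as an ordered enumeration in which a pass advances one step and a failure can route the idea back to any earlier stage. The names `Stage` and `next_stage`, and the sequence of events, are assumptions made up for this illustration.

```python
from enum import IntEnum
from typing import Optional

class Stage(IntEnum):
    CONCEPTION = 0
    RESEARCH = 1
    WRITING = 2
    PUBLISHING = 3
    DEMONSTRATING = 4

def next_stage(current: Stage, passed: bool,
               send_back_to: Optional[Stage] = None) -> Stage:
    """Advance one stage on a pass. On a failure, return to the stage
    named in send_back_to, or to the previous stage if none is given."""
    if passed:
        return Stage(min(current + 1, Stage.DEMONSTRATING))
    if send_back_to is None:
        send_back_to = Stage(max(current - 1, Stage.CONCEPTION))
    if send_back_to > current:
        raise ValueError("a failed test cannot send the idea forward")
    return send_back_to

# A hypothetical path for one idea: (passed, where a failure sends it).
events = [
    (True, None),             # conception tests nothing, move to research
    (True, None),             # research holds, move to writing
    (False, Stage.RESEARCH),  # writing surfaces a missing premise
    (True, None),             # research again, back to writing
    (True, None),             # writing holds, move to publishing
    (False, Stage.WRITING),   # a counterargument you cannot answer
]

path = [Stage.CONCEPTION]
for passed, back in events:
    path.append(next_stage(path[-1], passed, back))

print(" -> ".join(stage.name for stage in path))
```

The path visits research and writing twice before it ever reaches demonstration, which is the point: the ordering gives a direction, not a guarantee of one-way travel.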

Let’s take a research proposal as an example idea.

Conception to Research:

You look up the relevant literature, compare your idea against existing work, check whether your presuppositions and intuitions still hold, and decide whether the claim is plausible given adjacent knowledge - that is, whether the boundary of your idea interfaces with the rest of the world.

Most people stop here. The idea slots into something already known, and that’s validating. But it isn’t verification. Plausibility against existing knowledge is the cheapest test in the stack - it confirms only that the idea is not obviously contradicted by what is already established. It says nothing about whether the idea adds anything, whether it holds under pressure, or whether it survives being made explicit.

Research to Writing / Editing:

You produce representations of the idea - the abstract, the argument, the chain of reasoning - in a form that has to hold together as prose. The act of writing forces internal contradictions to surface. Gaps that intuition was bridging become missing paragraphs. Claims that felt obvious in your head turn out to need three paragraphs of support, or none at all because they were never claims, just associations. The idea either survives being made explicit or it does not. The goal is to communicate your point and your reasoning. Then you edit: you improve clarity and rhetoric, and work out what is essential and what is an aside or a footnote - the goal being to debloat and sharpen.

Writing to Publishing / Presenting:

Publishing exposes the idea to readers who do not share your priors. They will not finish your sentences. They will not extend charitable interpretation to sloppy claims. They will object, ignore, or misread you in a way that reveals the idea was less clear than you thought. Their reactions are evaluation signal: where does the argument fail to land for a mind that is not yours? Silence is signal too - a piece that gets read and produces nothing usually wasn’t saying anything.

Publishing / Presenting to Demonstration / Teaching:

Teaching is the strictest stage because someone else’s outcome now depends on the idea being right. A reader who disagrees with your essay loses nothing. A user who builds on your software, your framework, your method pays the cost when the idea fails. This is where flattering ambiguities get burned out. Either the idea bears load when someone else puts their weight on it, or it does not.

Publishing and demonstration share a property the earlier stages do not: they produce public artifacts. The argument fixed in prose, the system running in production - these can be re-read, re-run, contested, attacked, or reproduced. That is the point of getting the idea out at all. Without an artifact, the idea reverts to the unverifiable state of thought. With one, it becomes available to three further referees: peers who re-derive the argument and either find or fail to find the same conclusions; adversaries who probe for failure modes the author never considered; and the future version of yourself, who has lost the priors that were quietly filling gaps and now reads the work as a stranger. The point of producing the artifact is to admit those tests. An idea that exists only in your head admits none of them.

Thoughts and ideas held purely in your head are unverifiable. The idea cannot fail because it does not commit to anything. Writing is the first real test - putting the idea in linear prose forces internal contradictions to surface that intuition was hiding. Publishing exposes the idea to other minds and models that can raise concerns and counterexamples. Demonstration is the strictest stage, because now someone else’s outcome, if they adopt your idea’s application, depends on the idea being true to some degree. You are demonstrating a manifestation of the idea as evidence that other people can reproduce and use it.

Each stage strips away a layer of self-validation. By the time your idea is in production, it has either been falsified four ways or survived four tests.

The Scientific Method Is One Version of This

The Scientific Method handles taking a narrow class of ideas to production - those that can be operationalised into a falsifiable prediction. The core practice of carrying out a research proposal - understanding the hypothesis and the existing literature, running experiments, recording and analysing observations, opening up to refutation - is rigorous within its domain because the core referee does not lie: reproducible measurement is not flattering you, not bored of you, not ignoring you for political reasons. The scientific method is a commitment to investigating the falsity of an idea: claims about the world must be testable against the world. Experiment design, statistical inference, and peer review are machinery built around that commitment.

Most ideas are not of this kind. Claims about what is worth doing, what is true about people, what a system should look like - these cannot be operationalised cleanly. But they can still be tested and refined by being deployed. The structure is the same: a claim, exposure to something that can reject it, an update on the result.

The mistake is treating “not formally falsifiable” as “not testable”. Plausibility and relevance can be checked. Utility can be tested. The underlying discipline - caring whether you were right - is the same.

Conviction Must Scale With Reach

Early stages tolerate exploration. You can be wrong in your notebook. You can be wrong in a draft. The cost of being wrong is bounded by how many models depend on you being right - and at that stage, the answer is one.

A published essay is read by people who will form opinions on the basis of your reasoning. A shipped product is used by people who will rearrange their life and work around it. A piece of infrastructure is depended on by systems that were not built with your failure modes in mind. Each step outward multiplies the consequences of a wrong claim.

The bar for conviction has to rise to match. At the notebook stage the standard is “is this interesting and relevant?” At the publication stage it’s “is this true enough to commit to?” At the demonstration and teaching stage it’s “is this true enough that I am willing to be responsible for what someone builds on top of it?”

Demanding production-grade conviction at the notebook stage is costly, because it kills exploration before any idea is rough-shaped enough to keep testing. The discipline is matching the bar to the stage. Do not aim for perfection; aim for iteration.

Stake

The frictionless way to hold an idea is anonymously. You can publish under a pseudonym, share without claiming, float it as a thought experiment - and if the idea fails, you lose nothing. This isn’t prudence. Without your name on the work, the gap between what you claimed and what turned out to be true costs you nothing to leave open, so you won’t close it and you will not learn. You’re avoiding the full feedback signal - the feedback is never addressed to where the submission truly came from. This is why, if you really care about an idea and its reliability, even just for yourself, you must stake something on it.

Stake is not about a bet you win or lose. It is the mechanism that gives feedback purchase. Unstaked criticism slides past, because nothing about you depends on the claim being true. Staked criticism cannot: it lands on something you have committed to, and you have to do something with it - revise, defend, or retract. All three are iteration. None of them happen without the commitment. Putting your name on the work is the smallest version of that commitment, and the one most people refuse.

Internalise this:

Feedback is force - criticism, a counterargument, a user complaint, a failed prediction. It arrives with some pressure behind it.
Stake is the surface that force can grip on. When you have committed to a claim publicly, criticism has somewhere to land - it lands on the gap between what you said and what is true, and you have to do something about that gap.
Without stake, the same criticism just slides off. Someone tells you your anonymous draft is wrong; you shrug, because nothing of yours was on the line. The force was real but it had no surface to act on, so it produced no movement.

On Perfectionism & Procrastination

Perfectionism is editing mistaken for iteration. You polish the same draft for the tenth time because you believe the next pass will close the gap between what you have and what you imagined. It will not. Editing without feedback is unbounded variation - the changes have nothing to select against, so they accumulate without converging. What makes a draft feel finished is not another pass; it is contact with a reader who can tell you what is unclear, what is wrong, and what you over-explained. The confidence you are trying to manufacture by editing comes from elsewhere - from the next stage, which you are avoiding. Put simply, you are producing variation without ever committing to a selection and getting real feedback on it.

Procrastination is the symptom of an unscoped idea. You cannot start because there is no start - the work feels like one undifferentiated block of “do the thing”, which is impossible to begin. The cause is usually upstream: you have not written the idea down enough to discover that it is actually four claims resting on three presuppositions, half of which you have not verified. Scope produces steps; steps are what you can begin. The fix is not willpower. It is going back one stage and doing the work the idea has been skipping.

Both failure modes are refusals to advance a stage. The perfectionist will not publish, and the procrastinator will not start. The cause is the same - the idea never gets the test that would actually develop it.
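The claim that editing without feedback is variation with nothing to select against can be made concrete with a toy simulation. This is a deliberately crude sketch, not a model of real writing: the draft is a single number, each editing pass nudges it at random, and the hypothetical `with_feedback` flag stands in for a reader who can say whether a change moved the draft closer to a target the author cannot see directly. The function name, the target value and the seed are all assumptions.

```python
import random

def edit_passes(start: float, target: float, passes: int,
                with_feedback: bool, seed: int = 0) -> float:
    """Apply random edits to a 'draft' (here just a number).

    With feedback, an edit is kept only if it moves the draft closer
    to the target. Without feedback, every edit is kept."""
    rng = random.Random(seed)
    draft = start
    for _ in range(passes):
        candidate = draft + rng.uniform(-1.0, 1.0)  # one more pass of variation
        closer = abs(candidate - target) < abs(draft - target)
        if not with_feedback or closer:
            draft = candidate
    return draft

polished = edit_passes(0.0, 10.0, 200, with_feedback=False)
tested = edit_passes(0.0, 10.0, 200, with_feedback=True)

print(f"without feedback: distance to target {abs(polished - 10.0):.2f}")
print(f"with feedback:    distance to target {abs(tested - 10.0):.2f}")
```

With feedback, the distance to the target can never grow, because a change that makes things worse is discarded. Without it, nothing constrains where the draft ends up after two hundred passes.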

Feedback Must Be Analysed, Not Just Received

Once an idea is in contact with the world, signal arrives. Not all of it is informative.

Audiences reject ideas for the wrong reasons. They accept ideas for the wrong reasons. A piece of software can be loved for a coincidental property and ignored for its core insight. A claim can survive review because the reviewer was not paying attention. A market can reward something because of pricing, not because of fit. So make sure you are asking for feedback in the right environment: ask people who can be honest and who are your target audience. Failing that, ask an outsider who understands your target audience and is highly competent in their own field. Get a few samples.

The naive move is to update directly on whatever lands - cheers as confirmation, silence as failure. The disciplined move is to ask what each piece of feedback actually tells you. Did this person object because the argument is wrong, or because the conclusion is uncomfortable? Did this user drop off because the idea fails, or because the onboarding does? The fidelity of the update depends on the fidelity of the analysis. Approximation toward truth is still possible - it just cannot afford the step where you skip asking what the data is actually saying.

Production Is Reality Contact

You manifest and realise approximate truths through this process. What you produce is either useful to other people or it is not. The feedback through contact with reality is what distinguishes a load-bearing idea from a self-flattering, implausible, or inexplicable one.

Every domain that takes ideas seriously has built a process for this. The scientist runs an experiment. The software engineer ships to production. The security practitioner submits to adversaries and welcomes them. The writer publishes and orates. The referees and the tolerances differ, but the structure and underlying progression do not. Take the thing in your head, make it exist in a form that can be consumed and rejected by others, learn from the rejection, iterate, release to the public.

It is also what a real adventure and the building of character involve. The goal is to produce value in the world and develop yourself and your skill as you go. You are the inheritor of many good ideas, and of people who risked putting themselves out there and held themselves to a similar standard.

What this converges toward is not certainty. It is approximation - good ideas, then load-bearing ideas, and sometimes, after enough contact has held, ideas that approximate a truth. The process is the thing you can execute today.

Jacob Sussmilch
