The standard complaint is that non-technical founders and management can’t evaluate technical specialists. The problem isn’t that they haven’t shipped code - it’s that they’ve never been a beginner at something difficult long enough to know what mastery costs. Without that experience, every specialist looks roughly the same: someone making confident claims about a domain the founder doesn’t understand. The founder hires on charisma, fires on output legibility, and never closes the loop because they can’t tell competent work from theatrical work.
For some, the fix is to find a technical co-founder. The actual fix, if they are going to work with any specialist at all, is narrower: the founder needs competence in at least one domain whose artefacts overlap with those of the specialists they’re evaluating. Chart literacy is the most cross-cutting of these. Almost every specialty eventually defends itself in numbers - engineers in latency graphs, designers in funnel data, ops in throughput metrics, sales in cohort retention. A founder who can read a chart honestly - who knows what a confidence interval is, who notices when the y-axis starts at 80%, who asks for the denominator - has a tool that travels across most of the company. The marketing-analytics founder is technical in a real sense, and that technicality buys evaluation power well beyond marketing.
But there’s a ceiling. Domain-specific competence breeds domain-specific evaluation ability. It does not breed domain-general humility. The engineer-founder who micromanages designers because “I ship products” is the cautionary case - fluent in one technical register, certain that fluency generalises, and wrong about it. A founder fluent in analytics can evaluate analytics work and the work of anyone whose output is data. They still can’t tell you whether your database schema is reasonable, or whether your designer is any good. Pretending otherwise is the same failure mode in a different uniform.
What does transfer is the meta-skill of knowing how long hard things take and how easily they fail. A founder who has built one difficult thing - a model, a pipeline, a campaign that actually moved a number - usually retains a permanent allergy to “this should be simple.” That allergy is the most valuable thing a founder brings to specialist evaluation. It’s worth more than any specific technical skill, because it’s the precondition for taking a specialist seriously when they push back on a deadline.
The argument generalises. Board members, stakeholders, HR, and line managers face the same evaluation problem and usually fail it for the same reasons. The bar should be the same, in three parts:
- have you built something from scratch?
- can you read the artefacts the people you evaluate produce?
- have you tried to specialise at one thing - aiming for five years, with the hope of hitting the plateau where getting better stops being a matter of putting in hours and starts being a matter of restructuring how you think?
Many evaluators in most organisations clear none of these. That isn’t an indictment of the people; it’s an observation that the evaluation layer has been staffed on the primary qualifications of availability and presentability, and the specialists below it are being graded by people who don’t know what specialisation requires. The fix isn’t demanding that every evaluator become a specialist or work as one. It’s making sure everyone in the evaluation chain has done at least one hard thing for long enough to understand how setbacks from false priors emerge.
So, in any interview I take from here on, at some point, I’ll be interviewing the interviewer. And when someone evaluates my performance at work, I’ll be doing the same. Three questions: have you built something from scratch, can you read what I’d produce, have you specialised long enough to have plateaued at it? Three no’s, and the relevant question isn’t whether I should take the job - it’s whether anyone competent could trust being graded by you. And there’s one more reality check to put to evaluators: who evaluates your performance, and how?