Your Best Thinking Is Wasted on the Wrong Decisions

I got some feedback at my performance review last week that I’ve been turning over since. The short version: I’m good at evaluating technical decisions, but I’m not always good at making the shape of those decisions visible to the people around me. I had a rich mental model for how a major architectural choice fit together. I’d written the proposal, circulated it, gotten broad support. And I still didn’t bring people along for the ride. The evaluation was sound. The high-level communication was fine. But the vast landscape of technical detail and tradeoffs that lived beneath the proposal (the part that other engineers would need to navigate after the decision was made) stayed in my head longer than it should have. I think, if I’m honest, the organizational hunger for the adoption made it easy to skip past that work. When everyone is excited to move forward, the voice that says “we should slow down and make sure the whole team can see what I see” is easy to mistake for unnecessary caution.

This is a post about what I’ve been thinking about since then. It’s partly a framework I’m developing for myself: a way to be more deliberate about which decisions deserve what kind of effort, and how to make that reasoning legible. It’s partly a pitch that the framework might be useful to others. I’m not confident I’ve got it all right yet. But the act of writing it down is itself part of the exercise, so here we are.

The problem I’m trying to solve is this: even teams that do the right things (write ADRs, circulate proposals, have real conversations about architecture) can still misallocate their deliberation. The document gets written but the wrong things get documented. The meeting happens but it addresses the top-level question (“should we adopt X?”) while the load-bearing questions (“how exactly does X change the way we build things for the next three years?”) slide past, answered implicitly by whoever has the deepest mental model. This works right up until that person is unavailable or wrong.

You know the pattern. Someone proposes a tool in Slack. A few people react with thumbs-up. Someone else raises a concern. A thread develops. The thread gets long enough that someone suggests “let’s take this offline,” which means a meeting, which means a calendar invite, which means the decision has now acquired process. Not because anyone decided it needed process, but because the conversation exceeded the carrying capacity of a Slack thread, which is roughly four messages before someone says something that requires context nobody present has.¹

Other decisions happen the opposite way. Someone just does the thing. They open a PR, or they set up a service, or they add a dependency, and the decision is made the moment it’s merged. Nobody discussed it. Nobody objected. The absence of objection becomes, retroactively, consensus.

Both of these are fine, sometimes. The risk is that without a mental model for which approach fits which situation, the choice between them is governed by habit, or by the ambient culture of the team, or by whichever engineer happens to feel most strongly about it that day. The Slack-thread-to-meeting pipeline gets applied to linter configs. The just-do-it approach gets applied to database choices. Not because anyone chose wrong deliberately, but because the decision about how to decide never got made. It fell into place the way water finds the lowest channel. The result is a kind of ambient misallocation: time and rigor distributed without reference to the shape of the thing being decided, like applying the same amount of seasoning to every dish regardless of what you’re cooking. Sometimes you get it right by accident. Sometimes you get a tablespoon of salt on a crème brûlée.

A thing I have come to believe, more strongly since last week, is that the ability to distinguish these cases is one of the most valuable skills an engineering organization can develop. And it is teachable, which is good, because most of us develop the intuition eventually but without a conscious model for it. We learn to feel which decisions are heavy without quite being able to say why, the way a carpenter learns which wood will split before the chisel lands.² I’m trying to build the conscious model. This is what I have so far.


Concave, Convex, and the Shapes Between

Nassim Taleb introduces the concept of convexity in decision-making.³ The name sounds like it belongs in a treatise on the properties of lenses, which, in a sense, it does: it is a way of looking at choices that reveals structure you were already navigating blind.

Some decisions have asymmetric payoffs, and the direction of the asymmetry matters. The names come from the shape of the payoff curve when you graph it (outcome on one axis, value on the other). A concave curve bows downward: it’s a ceiling, and the further you move from the sweet spot the faster things get worse. A convex curve bows upward: it’s a floor, and the further you move from the starting point the faster things get better. If you imagine yourself standing on the curve, concavity means you’re on a hilltop where any step in the wrong direction accelerates your descent. Convexity means you’re in a valley where any step takes you up.⁴ The mathematics is from options pricing. The intuition is older than that.

Concave Decisions

The downside of being wrong dwarfs the upside of being right.

[Chart: concave payoff curve (decision quality → outcome). Wrong: 18-month migration, best engineer quits. Right: nobody notices (it just works). Examples: databases, auth models.]

A concave decision is one where the downside of being wrong vastly exceeds the upside of being right. Choose correctly and things work. Choose incorrectly and the consequences unfurl over months, each month revealing new dimensions you hadn’t anticipated, like opening a door and finding behind it not a room but a corridor, and behind that corridor another door, and behind that door another corridor. Your database choice is concave. Pick the right one and teams build confidently on its strengths; its query patterns shape how people think about their data, its guarantees become assumptions baked into every service. Pick the wrong one and you spend the next year migrating, during which time half the team is building the future and the other half is maintaining the past and the two halves are not having a great time.

Convex Decisions

The upside of being right compounds; the downside of being wrong is bounded.

[Chart: convex payoff curve (decision quality → outcome). Wrong: swap it out, lose a week. Right: months of dashboards, institutional knowledge. Examples: monitoring tools, linters.]

A convex decision is the opposite. The downside of being wrong is bounded: you switch, you adapt, you lose a week. The upside of being right compounds over time. Your monitoring tool choice is convex. Choose wrong and you waste a few weeks, rip it out, try something else. Choose well and you build months of dashboards and alerts and institutional knowledge that makes every future incident shorter. The worst case is a minor setback. The best case is a quiet, compounding gift.

Linear Decisions

The cost of being wrong ≈ the cost of deliberation. Just pick one.

[Chart: linear payoff curve (decision quality → outcome). Wrong: slightly worse. Right: slightly better. Both comparable to the deliberation cost. Examples: date libraries, YAML parsers.]

There is also the linear decision, where the upside and downside are roughly proportional and the cost of being wrong is comparable to the cost of the deliberation itself. Choosing a Markdown parser. Choosing an HTTP client library. These decisions have a correct answer, probably, but the difference between the correct answer and the second-best answer is smaller than the time you’d spend figuring out which is which.
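If it helps to see the asymmetry as numbers rather than pictures, here is a toy sketch. The payoff functions are invented purely for illustration; nothing about them is principled except their curvature:

```python
import math

# Toy payoff curves, purely illustrative. x is "how right the decision turned
# out to be," from -1 (badly wrong) to +1 (exactly right).

def concave(x: float) -> float:
    # Gains saturate, losses accelerate: a ceiling.
    return 1 - math.exp(-2 * x)

def convex(x: float) -> float:
    # Losses saturate, gains accelerate: a floor.
    return math.exp(2 * x) - 1

def linear(x: float) -> float:
    # Being wrong costs about what being right gains.
    return x

for name, payoff in [("concave", concave), ("convex", convex), ("linear", linear)]:
    print(f"{name:8s} right: {payoff(1.0):+6.2f}   wrong: {payoff(-1.0):+6.2f}")

# concave  right:  +0.86   wrong:  -6.39   (the downside dwarfs the upside)
# convex   right:  +6.39   wrong:  -0.86   (the upside dwarfs the downside)
# linear   right:  +1.00   wrong:  -1.00   (proportional either way)
```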

I find this taxonomy useful not because it’s novel (the underlying observation is just “some things matter more than others,” which is not exactly a revelation) but because it gives you a name for a thing you already do intuitively. I wrote recently about how naming a pattern makes it discussable. The same might apply here. Most experienced engineers already modulate their effort based on what’s at stake, but without shared language for the distinction, every conversation about “how much rigor does this need” starts from scratch. The categories are simple. Whether they’re the right categories: I think so, but I’m still testing that belief against real decisions.


Recognizing the Shape

So: how do you tell? Here is a set of heuristics I’ve been arriving at. It is not complete, and I’m not sure it can be. There are older decisions in the codebase that predate the team’s memory, and they have their own gravity. But it has helped me ask better questions about decisions I’ve already made, which is at least a start.

Reversibility

The single most important property of a decision is whether you can undo it.

A choice you can reverse in a week is fundamentally different from a choice you will live with for years. Not because the choice itself is less important in some cosmic sense, but because the cost of being wrong is bounded by the cost of reversal. If that cost is low, the expected value of extensive deliberation is also low.

Choosing an internal logging library is reversible. You wrap it in an interface, you swap the implementation, you spend a day updating import statements. Annoying, but finite.
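A minimal sketch of what that wrapping can look like, in Python for illustration; everything here (the Logger protocol, StdlibLogger, charge_card) is hypothetical, not anyone’s actual code:

```python
import logging
from typing import Protocol

class Logger(Protocol):
    """The narrow surface the rest of the codebase is allowed to depend on."""
    def info(self, message: str, **context: object) -> None: ...
    def error(self, message: str, **context: object) -> None: ...

class StdlibLogger:
    """One concrete choice. Swapping it for another library means replacing
    this class and updating imports, not touching every call site."""
    def __init__(self, name: str = "app") -> None:
        self._logger = logging.getLogger(name)

    def info(self, message: str, **context: object) -> None:
        self._logger.info("%s %s", message, context)

    def error(self, message: str, **context: object) -> None:
        self._logger.error("%s %s", message, context)

# Application code asks for the interface, never for the concrete library.
def charge_card(logger: Logger, amount_cents: int) -> None:
    logger.info("charging card", amount_cents=amount_cents)
```

The day the choice reverses, the diff is one class plus the import statements: a bounded cost, which is exactly what makes the decision convex.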

Choosing a message broker for your event-driven architecture is not reversible. Or rather, the reversal cost is high enough that it warrants real thought up front. We are adopting Kafka at Mercury right now, and this is not a Tuesday decision. It is the kind of decision where you talk to the teams who will produce events and the teams who will consume them and the team that will operate the infrastructure, and you think carefully about schema evolution and exactly-once semantics and what happens when a consumer falls behind, because the answers to those questions become architectural constraints that persist long after the meeting ends. The broker becomes part of the skeleton. Skeletons are not casually rearranged.⁵

Before any technical decision, it’s worth asking: “if this turns out to be wrong, what does the reversal look like?” If the answer is “swap out a library,” it’s convex. If the answer involves the words “migration,” “coordinated rollout,” or “we’d basically have to rewrite,” it’s concave. If the answer is “I don’t know,” that is itself information.⁶ You’ve taken on risk whose shape you can’t yet see, and it may be worth understanding the shape before you proceed.

Coupling

A decision that affects one component is convex. A decision that affects the interfaces between components is concave. Interfaces are shared; they are load-bearing walls, and you cannot remove a load-bearing wall without first understanding what it supports. “What it supports” is often more than you thought.

Your choice of internal HTTP client library: affects one team. Convex. Your choice of serialization format for inter-service communication: affects every service. Concave. Your choice of authentication model: affects every service, every client, every deployment. Deeply concave.

A useful proxy: count the number of teams that would need to coordinate if you reversed the decision. If the answer is one, it’s convex. If the answer is “all of them,” it’s concave. If the answer is “technically just one, but also sort of all of them because of that shared library everyone imports,” you’ve discovered a hidden coupling. Worth understanding that before making further decisions in its vicinity.
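If you want to make the counting concrete, it is nothing more elaborate than this (component and team names invented for illustration):

```python
# Toy blast-radius count: which teams would have to coordinate to unwind
# a decision about each component?
consumers: dict[str, set[str]] = {
    "http-client-lib":      {"payments"},
    "serialization-format": {"payments", "ledger", "risk", "onboarding"},
    "shared-utils-lib":     {"payments", "ledger", "risk", "onboarding"},  # the hidden coupling
}

def teams_to_coordinate(component: str) -> int:
    return len(consumers.get(component, set()))

# 1 team -> convex; "all of them" -> concave; a shared library that quietly
# reaches everyone -> a coupling worth mapping before deciding anything near it.
```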

The Knowledge Ratchet

Some tools reward prolonged use. The team builds expertise, creates custom dashboards, develops institutional knowledge that makes the tool more valuable over time. This accumulated knowledge is a form of capital, and switching tools means writing it off.

This is why monitoring choices are more concave than they first appear. A monitoring tool you’ve used for six months has six months of dashboards, alerts, runbooks, and “I know exactly which query to run when the payment service is slow” embedded in it. Switching means not just learning new syntax. It means rebuilding operational intuition from scratch, during which time your ability to understand your own production environment is diminished.

The pattern: tools with high knowledge accumulation start convex (low cost to try, easy to switch early) and become concave over time.⁷ There is a window in which switching is cheap, and the window closes quietly. One day you realize you are no longer evaluating whether to keep using the tool. You are calculating the cost of leaving. These are different conversations. The first is a choice. The second is a negotiation with your own sunk costs.⁸


Temporal, Kafka, and What I Got Wrong

That’s the theory. Here is what it has looked like from the inside.

When I drove the adoption of Temporal at Mercury, I think I recognized the shape correctly. We were replacing our background job system: the mechanism by which a financial technology company processes the work that happens between “the user clicked a button” and “the thing actually happened.” I’ve written about that migration before in the context of essential versus accidental complexity. The short version: what looked like added complexity (a new orchestration system, new concepts, new infrastructure) turned out to be a net reduction in the complexity that actually mattered, because it replaced a tangle of ad-hoc retry logic and implicit ownership with explicit, type-checked workflow definitions.

The push took time. There were prototypes. There were conversations with the teams whose workflows would change. There were questions about operational complexity, about failure modes, about what happens when Temporal itself is unhappy. Every system is unhappy sometimes, and the relevant question is never “will it fail” but “how does it express failure, and can we understand the expression well enough to respond.” I think the deliberation was roughly proportional to the stakes, and that proportionality is part of why the adoption went as smoothly as it did.

We are currently in the process of adopting Kafka, and this is the one I mentioned up front: the one where the feedback landed. Event-driven architecture is a concave commitment. The schema formats, the topic structure, the consumer group semantics all become constraints that downstream teams build on. (I wrote about the design principles behind these constraints last year, which is itself an attempt at making some of this thinking legible.) I recognized this. I wrote the proposal, circulated it widely, and there was genuine support and hunger for adoption across the organization.

I think that hunger is part of what got me. When the organization wants to move, and you have the technical vision, there’s a strong pull to just go. To let the momentum carry the decision forward and sort out the details in flight. I had years of accumulated thinking about how all the pieces fit together (Kafka itself, the schema registry, Debezium for change data capture, the connect framework, the timelines, the migration paths) and transmitting that model to my own team turned out to be much harder than I expected. Not the high-level case for adoption. The enormous branching tree of technical detail that would shape how we actually built things after saying yes.

Before going on parental leave, I tried to solve this the way I solve most problems: with code. I produced an ordered chain of PRs for every step of the Kafka implementation I could anticipate, at least in draft form, hoping they would be shippable, or at least useful as a map. They were useful. But on reflection, what my team needed more was not a trail of implementation artifacts. They needed the conceptual model. They needed to understand why this piece connected to that piece, what the tradeoffs were at each junction, where the real complexity lived versus where I’d already simplified it away. I had given them the answers without adequately sharing the reasoning, which is a particular kind of disservice to a team of incredibly smart, skilled engineers who were more than capable of arriving at good implementations themselves if they could see the terrain. I had optimized for the wrong thing: shipping code when I should have been shipping understanding.

The feedback was fair. The technical evaluation was sound (I still believe that), but the legibility of the decision was insufficient. Not the decision to adopt Kafka, which had broad support, but the vast web of subsequent decisions that the adoption implies. Asking people to navigate a terrain you’ve mapped but haven’t shared the map for is asking a lot, even when the map is good. I think this might be a failure mode specific to concave decisions: because the stakes are high, the person driving the adoption does a lot of thinking. Because they’ve done a lot of thinking, they develop a rich internal model. Because the model is rich and coherent to them, they underestimate the gap between their understanding and everyone else’s. The big decision was communicated well. The thousand smaller decisions that lived inside it were not, and for concave decisions, the communication is part of the decision.⁹ You haven’t finished deciding until the people affected can see what you see.

I’m still working on this. I don’t think I’ve fully internalized it yet. But it’s changed how I think about what “thorough evaluation” means for concave choices. It’s not just about getting the technical answer right. It’s about making the shape of the commitment visible to the people who will live inside it.


The Deliberation Budget

Here is what I think the practice should look like. I haven’t had time to apply it yet (the framework is newer than the decisions it’s meant to inform) but writing it down is part of how I intend to hold myself to it.

For concave decisions: slow down. Write the document. Do the evaluation. Talk to the team that tried this last year and hated it. Talk to the team that tried it and loved it. Understand why they disagree; the disagreement is where the information lives. Prototype. These decisions are load-bearing. They deserve your best thinking, because the cost of getting them right is some extra time spent, and the cost of getting them wrong is a year of rework.

Concave decisions include: databases, message brokers, programming languages for core services, authentication models, API contracts between teams, deployment architectures, workflow orchestration systems, and anything where the phrase “we’d have to migrate” appears in the reversal plan.

For convex decisions: move fast. Pick the thing that looks reasonable. Try it. Set a review point in a month. If it’s working, keep going. If it’s not, switch. The cost of choosing wrong is a few wasted weeks. The cost of deliberating is also a few wasted weeks, and at least the first option gives you real data. I’ve written before about how framing a decision as an experiment can defuse the kind of engineering debate that never actually ends but just wears everyone down. “Let’s try your approach for this one service and see what happens” only works when the decision is convex. But when it is, it’s a cheat code: you turn tension into curiosity, and nobody has to stake their ego on being right before the data is in.

Convex decisions include: logging and monitoring tools (early in adoption), internal libraries, development workflow tooling, code formatters, linters, and anything where the reversal plan is “swap it out and update the imports.”

For linear decisions: just pick one. If two options are roughly equivalent, the optimal strategy is to choose and move on. The difference between the options is smaller than the time you’d spend evaluating them.

Linear decisions include: which date library, which Markdown parser, which YAML library (they are all, in their own way, imperfect; the imperfections are different but the magnitude is the same; choose and proceed).

The Decision Record, Calibrated

For a concave decision, write a real ADR. Problem statement, options considered, trade-offs, recommendation, risks. This document will be useful in two years when someone asks “why Kafka and not Pulsar” and the answer involves reasoning worth preserving.

For a convex decision, write three sentences. “We chose Loki for log aggregation because it integrates with our existing Grafana setup and Alex had experience with it. We’ll revisit in Q3.” That’s the whole document. It contains everything anyone will need.

For a linear decision, write nothing. Nobody will ever ask why you chose this date library. If they do, “it was fine” is a complete answer.


Words That Do the Work

I’ve been writing a series of posts that keep arriving at the same conclusion from different angles: naming a thing gives people the ability to coordinate around it. I won’t belabor the point again. But I will say what it looks like for this particular thing.

“This is concave” is a small phrase. It compresses a lot: the reversal cost, the coupling, the ratchet. It tells the room to slow down. “This is convex” tells the room to keep moving. An engineer who has these words doesn’t need to argue from first principles each time. The words carry the argument.¹⁰ And because they’re just words, short and memorable and slightly odd, they spread through a team faster than any process document.

You could use different words. You could say “load-bearing” and “decorative.” You could say “one-way door” and “two-way door,” which is the Amazonian formulation of the same essential insight.¹¹ What matters is that some words exist for this distinction on your team, and that they do the work of modulating effort so that nobody has to relitigate from scratch each time.

Three questions. That’s the whole framework.

  1. If we get this wrong, what does the reversal look like?
  2. How many teams would need to coordinate to undo this?
  3. Will this accumulate knowledge that makes switching expensive over time?

If the answers are “painful,” “many,” and “yes”: concave. Slow down. The corridor ahead is long and the doors only open forward.

If the answers are “straightforward,” “just us,” and “not really”: convex. Move fast. The cost of the wrong choice is less than the cost of the deliberation.

And if you’ve been discussing it for an hour and nobody is saying anything new: it’s probably linear, and the most useful thing anyone can do is just make the call.
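For what it’s worth, the branching itself fits in a few lines. This is a toy: the “unclear” bucket is my own addition for mixed answers, and all the real work hides in answering the three questions honestly:

```python
from enum import Enum

class Shape(Enum):
    CONCAVE = "slow down: the doors only open forward"
    CONVEX = "move fast: trying is cheaper than deliberating"
    UNCLEAR = "look closer before committing"

def decision_shape(reversal_is_painful: bool,
                   many_teams_to_unwind: bool,
                   knowledge_ratchets: bool) -> Shape:
    """Toy classifier for the three questions above."""
    answers = (reversal_is_painful, many_teams_to_unwind, knowledge_ratchets)
    if all(answers):
        return Shape.CONCAVE
    if not any(answers):
        return Shape.CONVEX
    # Mixed answers usually mean a hidden coupling or a ratchet whose
    # window hasn't closed yet: worth understanding before proceeding.
    return Shape.UNCLEAR

# Linear decisions never reach this function. You detect them by noticing
# that the meeting has gone an hour without anyone saying anything new.
```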

There’s a fourth question I’m adding to my own list, the one I didn’t ask loudly enough last time:

  4. Can the people who will live with this decision see the shape of it the way I do?

If the answer is no, the decision isn’t finished yet, no matter how sound the technical evaluation is. I knew this in theory. I am learning it in practice, somewhat uncomfortably, which I suppose is how most of the important lessons arrive.


I started this post trying to build a framework and ended up writing something closer to a confession. I think that’s probably the right shape for it. The hard part is remembering that a decision you can see clearly is still invisible to the people you haven’t shown it to.

Footnotes

  1. Slack threads break down quickly. The typical pattern: someone proposes a solution, a few people react positively, then someone raises an edge case that requires explanation. By the fourth or fifth reply, the discussion has branched into multiple sub-topics and people are talking past each other. At this point someone inevitably suggests “let’s take this offline,” which kicks off the meeting-scheduling process. The medium shapes the outcome: what might have been a quick decision becomes a multi-day process not because the decision needed that much deliberation, but because the communication channel couldn’t contain it.

  2. There is a large body of research on expert intuition. Gary Klein’s Sources of Power is the canonical treatment. The finding is that experienced practitioners don’t deliberate through decisions so much as recognize them, the way a chess grandmaster sees a board position and knows the move before they can articulate why. The problem with recognition-based expertise is that it’s difficult to transfer. You can’t hand someone your pattern library. You can, however, give them categories, and categories are a decent scaffold for the intuition to grow around.

  3. Primarily in Antifragile: Things That Gain from Disorder (2012), though the mathematical treatment appears earlier in The Black Swan and in his more technical work on option pricing. The core insight (that the shape of a payoff matters more than the expected value) comes from financial derivatives theory, which is one of those fields that is either deeply elegant or deeply sinister depending on how close you stand.

  4. I realize I’ve just described concave as a hilltop and convex as a valley, which may seem backwards if you’re thinking about the words in terms of physical shapes. The terminology comes from the curve, not from where you’re standing. A concave function arches like a dome, hollow side down (that’s where the “cave” is), and you’re balanced on top of it, which is the important part. If it helps: concavity is where gravity is not your friend. Convexity is where gravity does the work for you. If this doesn’t help, the charts should make it clear.

  5. I say this with the weariness of someone who has tried. You can replace a message broker. What you cannot replace is the five years of implicit assumptions and feature creep that grew up around it like ivy on a wall. The ivy is structural now. You’ll find out which parts when you pull it away.

  6. The Rumsfeld matrix (known knowns, known unknowns, unknown unknowns) was widely mocked when Donald Rumsfeld articulated it in 2002, which is unfortunate because the underlying taxonomy (borrowed from the Johari window in psychology) is genuinely useful. Rumsfeld conspicuously omitted one of the quadrants: unknown knowns, the things you don’t know that you know. I propose that these are your unexamined assumptions, the background knowledge so deeply internalized you’ve forgotten it’s knowledge at all. In engineering decisions, unknown knowns could be things like “of course we’d never put a database on spot instances”… beliefs so obvious to you that you don’t think to write them down, right up until the new hire does exactly that. “I don’t know what the reversal looks like” is a known unknown, which is manageable: you can go find out. The dangerous case is the unknown unknown, the reversal cost you haven’t thought to ask about because you don’t yet know the dimension exists. Concave decisions are where unknown unknowns do the most damage, which is another way of saying: the less you understand the shape, the more you should want to understand it before committing.

  7. There’s a striking parallel here with Wardley mapping, Simon Wardley’s framework for mapping the evolution of technology components. In Wardley’s model, components evolve from genesis (novel, uncertain, high experimentation) through custom-built and product stages to commodity (well-understood, standardized, boring). The management approach that works at one stage is wrong for another; genesis demands exploration, commodity demands efficiency. The shape of a decision, I think, tracks a similar axis: early-stage components are convex (try things, learn, switch cheaply) and late-stage components are concave (they’re load-bearing now, the switching cost is real). The knowledge ratchet is what moves a component along that evolution curve, sometimes faster than you notice.

  8. Daniel Kahneman and Amos Tversky’s work on loss aversion is relevant here: the pain of writing off accumulated knowledge feels disproportionate to the rational cost of switching. This is how tools that should have been replaced eighteen months ago acquire a quiet permanence. Not because anyone defends them with conviction, but because the cost of leaving is always slightly more visible than the cost of staying.

  9. This echoes something I think originates with Richard Rumelt in Good Strategy Bad Strategy: a strategy that cannot be communicated is not a strategy. It’s a conviction held privately, and private convictions, however sound, do not coordinate organizations. They just wait, patiently, to be rediscovered by the next person who arrives at the same conclusion independently.

  10. This is also why senior engineer attrition is so quietly devastating. It’s not the code they wrote (code outlives its authors routinely). It’s the calibration they carried: the intuitive sense for which decisions were load-bearing and which were decorative, the ability to say “this one matters” in a way the room trusted. That judgment doesn’t get committed to version control. It walks out the door on someone’s last Friday and leaves behind a team that still has the vocabulary but has lost some of the fluency.

  11. Jeff Bezos distinguishes between “Type 1” (irreversible, consequential) and “Type 2” (reversible, low-stakes) decisions in his 2015 letter to shareholders. He argues that as organizations grow, they tend to apply heavyweight Type 1 processes to Type 2 decisions, producing “slowness, unthoughtful risk aversion, failure to experiment sufficiently, and consequently diminished invention.” The Taleb framing adds the knowledge-ratchet dimension: decisions can change type over time, which the Bezos binary doesn’t capture.