25 May 2026

The Model Is Not the Product. The Harness Is.

Why so many people are still disappointed by AI, and why the next coding skill is learning how to build the environment around the model.

[Figure: a note summarising Ryan Lopopolo's OpenAI harness engineering article and the Luma harness gap. Caption: The note that crystallised this for me: humans should steer, agents should execute, and the repo should carry the context.]

I think a lot of people are missing the point with AI.

Not because they are sceptical. Some scepticism is healthy. And not because the models are weak. The models are incredible. The problem is that most people are still using AI like a clever chat box, when the real leverage comes from building a system around it.

That system is the harness.

I have been feeling this very clearly while working on the Luma repo and trying to make it agent-first. When I use AI without a harness, I still have to carry too much of the work in my own head. I have to explain the context, paste the error, describe the runtime, remind it what matters, tell it where the files are, ask it to run checks, and then manually inspect whether the thing actually works.

That is useful, but it is not the real thing.

The real thing is when the agent can see the repo, understand the docs, run the app, inspect the logs, reproduce the bug, make the change, run the tests, check the UI, and come back with evidence.

That is when AI starts to feel less like a chatbot and more like a serious teammate.

Why people get frustrated

I see this pattern with a lot of friends.

They try AI. They ask it to do something useful. It gives a decent answer, but then it falls short. It does not know the full context. It makes a wrong assumption. It cannot check the actual app. It cannot see the logs. It cannot run the build. It does not know the design principles. It forgets some earlier decision. So the human gets frustrated and concludes that AI is overhyped.

But that is the wrong conclusion.

The better conclusion is: the model was operating without enough environment.

If you put a brilliant person in a room with no repo access, no terminal, no browser, no logs, no tests, no context, and no source of truth, they would also struggle. We should not be surprised when a model struggles in the same situation.

The model is only one part of the system.

What a harness actually gives the model

A harness is the structure around the model that lets it do reliable work.

For coding, that means things like:

A short AGENTS.md that acts as a map, not a giant instruction dump.
Repo-local docs that explain architecture, product decisions, design principles, and current plans.
Scripts that make the app easy to install, run, test, lint, seed, reset, and inspect.
A local development environment the agent can actually boot.
Logs, metrics, traces, screenshots, and browser tools the agent can query directly.
Tests and checks that define what "done" means.
Guardrails around dangerous actions.
A memory or plan system so long-running work does not depend on one fragile chat thread.
Cleanup loops that continuously remove drift and stale assumptions.
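
To make the "tests and checks that define what done means" item a little more concrete, here is a minimal sketch of a single entry point an agent could run and read back. It is an illustration under assumptions, not Luma's actual tooling: the names `CheckResult`, `run_checks`, and `is_done` are hypothetical, and a real repo would pass in its own lint, test, and build commands.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    passed: bool
    output: str  # combined stdout and stderr, kept as evidence


def run_checks(checks: dict[str, list[str]]) -> list[CheckResult]:
    """Run each named command and record whether it exited with code 0."""
    results = []
    for name, command in checks.items():
        proc = subprocess.run(command, capture_output=True, text=True)
        evidence = (proc.stdout + proc.stderr).strip()
        results.append(CheckResult(name, proc.returncode == 0, evidence))
    return results


def is_done(results: list[CheckResult]) -> bool:
    """'Done' means every check passed, not that the agent says so."""
    return all(result.passed for result in results)


# Placeholder command for illustration: swap in the repo's real steps.
EXAMPLE_CHECKS = {
    "smoke": [sys.executable, "-c", "print('ok')"],
}
```

Calling `run_checks(EXAMPLE_CHECKS)` returns one result per command, and the captured output is the evidence the agent can attach to its report instead of a bare "it works".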

This is the difference between "asking AI for help" and designing a workplace where AI can perform.

Ryan Lopopolo's OpenAI post on harness engineering put this beautifully: "Humans steer. Agents execute." That line has been stuck in my head because it names the shift. The human job is moving up a layer. It is less about typing every line of code and more about designing the environment, choosing the direction, setting the quality bar, and building feedback loops that let agents do dependable work.

Different models become different teammates

The other point people miss is that "AI" is not one thing.

Different models are useful in different ways.

Fast, mini GPT-5-class models are already good enough to feel like a diligent friend who helps you keep the work tidy: checking details, running through edge cases, making sure the boring things are done properly, and helping you get to green builds.

Models like Claude Opus 4.7 feel different. They are stronger creative and thinking partners. They are especially useful when the work needs taste, judgment, UX thinking, front-end feel, or the kind of patient exploration where you want someone to reason with you before you commit.

The skill is not picking one model and worshipping it. The skill is learning how to route work. Some tasks need speed and diligence. Some need taste and depth. Some need a background agent grinding through checks. Some need a sharper thinking partner beside you.
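
Routing can start as something very small. The sketch below assumes two hypothetical tiers, `FAST` and `DEEP`, and two made-up task flags. The rules are illustrative guesses at how the split described above might be encoded, not a recommendation for any particular model or vendor.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    FAST = "fast"  # quick and diligent: checks, edge cases, green builds
    DEEP = "deep"  # slower: taste, judgment, UX, patient exploration


@dataclass(frozen=True)
class Task:
    description: str
    needs_judgment: bool = False  # taste, UX, or design tradeoffs involved
    long_running: bool = False    # can grind unattended against checks


@dataclass(frozen=True)
class Route:
    tier: Tier
    background: bool  # True means run as a background agent


def route(task: Task) -> Route:
    """Pick a tier and a mode for a task (illustrative rules only)."""
    tier = Tier.DEEP if task.needs_judgment else Tier.FAST
    # Judgment-heavy work stays interactive so the human can steer it.
    background = task.long_running and not task.needs_judgment
    return Route(tier, background)
```

For example, `route(Task("fix lint errors", long_running=True))` goes to the fast tier in the background, while anything flagged `needs_judgment` goes to the deeper tier and stays interactive.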

Good AI work is increasingly about orchestration.

What agent-first means for a repo

For me, an agent-first repo is not a repo where agents write random code quickly.

It is a repo where the project is legible to agents.

If an agent cannot answer these questions from the repo itself, the repo is not agent-first yet:

What is this product trying to do?
How do I run it locally?
What does good UX look like here?
What are the important architectural boundaries?
Where does data live?
What commands prove that my change worked?
What should I never touch without asking?
Where are the current plans, tradeoffs, and known problems?
How do I inspect runtime behaviour when something breaks?

If the only person who knows those answers is the founder, the harness is missing.
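
One rough way to test this is to ask whether each question has a home in the repo at all. The sketch below maps a few of the questions above to files and reports which ones are missing or empty. Only `AGENTS.md` comes from this post; every other path is a hypothetical layout chosen for illustration, and a file existing obviously does not prove it answers the question well.

```python
from pathlib import Path

# Hypothetical layout: only AGENTS.md is named in the post itself.
EXPECTED = {
    "What is this product trying to do?": "docs/product.md",
    "How do I run it locally?": "docs/running-locally.md",
    "What are the important architectural boundaries?": "docs/architecture.md",
    "What commands prove that my change worked?": "docs/checks.md",
    "What should I never touch without asking?": "AGENTS.md",
}


def missing_answers(repo_root, expected=EXPECTED) -> dict[str, str]:
    """Return the questions whose expected file is absent or empty."""
    root = Path(repo_root)
    missing = {}
    for question, relative_path in expected.items():
        path = root / relative_path
        if not path.is_file() or path.stat().st_size == 0:
            missing[question] = relative_path
    return missing
```

Running `missing_answers(".")` from the repo root gives a to-do list of context that still lives only in someone's head.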

That is the part I am trying to improve in Luma. I do not want to be the one manually carrying all the coordination state. I want the repo to carry more of it. I want the docs, scripts, tests, runtime access, logs, and cleanup loops to make the agent better by default.

This changes what it means to code well

Coding better in the AI era does not only mean writing better functions.

It means designing better feedback loops.

It means making your work observable. It means writing docs that are actually useful. It means encoding taste and architecture into checks where possible. It means making local setup boring. It means keeping plans and decisions close to the code. It means giving the agent tools instead of giving it lectures.
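
As one example of encoding architecture into a check, the sketch below uses Python's standard `ast` module to flag imports that cross a layer boundary. The rule itself (a `ui` layer that must not import `db` directly) and the names `FORBIDDEN` and `boundary_violations` are assumptions for illustration, not a description of how Luma is structured.

```python
import ast

# Hypothetical rule: code in the `ui` layer may not import `db` directly.
FORBIDDEN = {"ui": {"db"}}


def imported_modules(source: str) -> set[str]:
    """Return the top-level package names imported by Python source text."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 skips relative imports such as `from . import x`.
            modules.add(node.module.split(".")[0])
    return modules


def boundary_violations(layer: str, source: str, forbidden=FORBIDDEN) -> set[str]:
    """Return the forbidden packages that source in `layer` imports."""
    return imported_modules(source) & forbidden.get(layer, set())
```

Wired into the test suite, a rule like this turns an architectural preference into something an agent can fail against and fix, rather than something it has to be told in every prompt.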

This is why I think people who want to code better need to learn this stuff now.

Not because AI will magically do everything.

Because AI is already good enough that the bottleneck is often the harness around it.

The models are not the disappointing part. The disappointing part is giving a powerful model a weak environment and expecting it to behave like a full team.

The next wave of software will not be built by people who simply "use AI".

It will be built by people who know how to design systems where humans steer, agents execute, and the work proves itself.
