You spent a week building with a coding assistant, and it felt like a superpower. You described a feature, it wrote the code, you corrected it in plain English, and a working app came together faster than you believed it could. So when you added an "ask us anything" box to the app itself, you expected the same superpower. Instead a stranger typed a question your product had no business answering, the box answered it anyway in the same confident tone, and the reply was wrong in a way nobody caught until a user pointed it out.
The two felt like the same AI. They are not, and telling them apart is the first move in building an AI product.
The AI that helped you build your app and the AI inside your app are two different systems with two different jobs: one writes code on your machine while you build, and the other runs inside your product and answers your users at run time, on every use, with nobody reviewing each reply.
The build-time AI: a tool you operate
The first AI is the coding assistant you have been steering in every drill so far, the one Working with AI as a builder covers in depth. It sits on your side of the keyboard. You hand it a task, it proposes code, and you review the result before anything ships. It bills like any subscription, and the meter stops when you close the laptop. When it gets something wrong, the mistake lands on your own machine, where you catch it before a single user exists. You have been running it since Part I, and that familiarity carries over.
That AI does not appear in your finished product. By the time a stranger opens your app, the assistant is gone and its work has become ordinary code.
The run-time AI: a part inside your product
The second AI is different in every way that matters. It is a model your product calls over an API while a real person is waiting, the same round trip you toured earlier in this level. The user acts, your app sends a request to the model, the model returns text, and that text reaches the user's screen without you in the loop. This is the AI that Where AI fits in your product calls role two and role three: the model that generates content for your user, or works behind the scenes on your data.
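As a rough picture of that round trip, here is a minimal Python sketch. It is an illustration under stated assumptions, not any provider's real API: `fake_model` is a hypothetical stand-in that returns canned text so the code runs without a network connection or an API key, and `handle_user_message` is a made-up name for the spot in your app where the call would live.

```python
from typing import Callable


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model API call (assumption)."""
    # A real version would send an HTTP request to a provider and wait.
    return f"(model reply to: {prompt})"


def handle_user_message(user_text: str, call_model: Callable[[str], str]) -> str:
    """One run-time round trip: user input in, model text out."""
    # 1. The user acts: their text arrives from your app's interface.
    prompt = f"Answer this question from a user of our app: {user_text}"
    # 2. Your app sends the request to the model and waits for the reply.
    reply = call_model(prompt)
    # 3. The reply goes straight back to the user. Nobody reviews it first.
    return reply


if __name__ == "__main__":
    print(handle_user_message("How do I reset my password?", fake_model))
```

Notice that nothing in `handle_user_message` checks the reply before returning it. Passing the model in as `call_model` is only a convenience for this sketch, so a stub can be swapped for a real call later.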
A support assistant answering a customer inside a help widget, like Intercom's Fin, is this AI. So is a feature we built into one of our own products, fuelthefam.com, that turns a photo of your fridge into a shopping list: the user snaps a picture, the app sends it to a model, and the list comes back while they wait. Nobody on our team reads each list before the user sees it. That is the defining trait of the run-time AI, and it is what makes it a different animal from the assistant that helped build it.
Why the difference changes how you build
Those differences have consequences, and the rest of this part is built on them.
It bills on every use, not once. Each call to the model costs money, so the bill scales with how many people use the feature and spikes when usage does. Per-call prices change too often to print here, so we speak in orders of magnitude and leave current numbers to the providers' pricing pages and the Playbook's cost section. What matters now is the pattern: the build-time assistant is a fixed cost you pay while working, and the run-time model is a variable cost you pay for as long as the feature is live.
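To make the fixed-versus-variable pattern concrete, here is a small arithmetic sketch. Every number in it is a made-up placeholder chosen for easy math, not a real provider price, and the names are hypothetical. Amounts are kept in whole cents so the arithmetic stays exact.

```python
# All prices are hypothetical placeholders, NOT real provider rates.
ASSISTANT_SUBSCRIPTION_CENTS = 2_000  # assumed flat monthly fee ($20.00)
COST_PER_CALL_CENTS = 1               # assumed price of one model call ($0.01)


def monthly_runtime_cost_cents(users: int, calls_per_user: int,
                               cost_per_call_cents: int = COST_PER_CALL_CENTS) -> int:
    """Variable cost: grows with every user and every call they make."""
    return users * calls_per_user * cost_per_call_cents


if __name__ == "__main__":
    for users in (100, 1_000, 10_000):
        runtime = monthly_runtime_cost_cents(users, calls_per_user=30)
        print(f"{users:>6} users: "
              f"build-time ${ASSISTANT_SUBSCRIPTION_CENTS / 100:,.2f} (fixed), "
              f"run-time ${runtime / 100:,.2f} (variable)")
```

With these placeholder numbers, 100 users making 30 calls each cost $30.00 a month and 10,000 users cost $3,000.00, while the build-time line stays at $20.00 throughout. The real figures will differ, but the shape will not: one line is flat, and the other climbs with usage.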
Its output ships unreviewed. While you build, you read every change before accepting it. At run time there is no such gate, because the model produces an answer and your user reads it in the same moment. The model returns a wrong answer in the same fluent, confident tone as a right one, so a reply that reads well is not evidence that it is correct.
It answers strangers, not you. While building, you write every prompt yourself. At run time, the input comes from whoever is using your product, including people who will try to push the model somewhere you never intended. The model produces a response to the words it is given, and at run time you do not control those words.
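One way to picture what "not controlling the words" means in code is a small gate in front of the model call. The sketch below is deliberately crude and rests on assumptions: the character cap, the topic list, and the fallback message are all hypothetical, and a keyword match like this will not stop a determined user. It only shows where such a check would sit.

```python
from typing import Callable

MAX_INPUT_CHARS = 500  # hypothetical cap on how much text a stranger can send
ALLOWED_TOPICS = ("order", "shipping", "refund", "password")  # hypothetical
FALLBACK = "Sorry, I can only help with questions about this product."


def answer_stranger(user_text: str, call_model: Callable[[str], str]) -> str:
    """Screen untrusted input before it reaches the model."""
    # Trim whitespace and cut overly long input down to the cap.
    text = user_text.strip()[:MAX_INPUT_CHARS]
    # Crude substring check: skip the model unless a known topic appears.
    if not any(topic in text.lower() for topic in ALLOWED_TOPICS):
        return FALLBACK
    return call_model(text)


if __name__ == "__main__":
    # Hypothetical stand-in for a real model call.
    def fake_model(text: str) -> str:
        return f"(model reply to: {text})"

    print(answer_stranger("Where is my order?", fake_model))
    print(answer_stranger("Write me a poem about cats", fake_model))
```

The off-topic request never reaches the model, so it costs nothing and cannot produce a confident wrong answer. The on-topic one still goes through unreviewed, which is why a gate like this narrows the problem without solving it.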
You did not add AI to your product. You put a specific model inside it, and how that model behaves is now your job.
The model is a part you specify, not a genie you summon
Put those three together and the reframe is this. "Adding AI" sounds like flipping a switch, but what you are actually doing is wiring a specific model into a specific spot in your build, giving it a specific job, and taking responsibility for what it produces. The model is a component, like the database or the auth layer you already met, with one difference: this component answers in sentences and gets things wrong in fluent prose.
That is good news, because a part is something you can specify and control. The chapters ahead do exactly that, in order. You will make your first call to a model, choose one that fits your build, write prompts that behave like instructions instead of wishes, get output your code can rely on, give the model the facts it was never trained on, and customize it without overreaching. By the end you will wire one real model-backed feature into a build of your own.
Try it now
No setup: Pick two AI features you use, one that clearly answers you inside a product (a support chat, an email draft, a meeting summary) and one coding assistant. For each, name where the AI runs and how it bills: on your machine while you work, or inside the product on every use. Then picture a product you want to build and name the spots where a model would run while a user waits, the moments Where AI fits in your product sorts into roles. Those spots are the run-time models this part teaches you to wire in.
With your tools: Open Claude Code in the project you have been building and ask it, without writing any code yet: "If I added a feature that calls a language model at run time, which file would hold that call, and what would the request and the response look like?" Read its answer to see exactly where the seam between the two AIs falls in your own codebase. If your tools are not set up yet, The Setup Clinic gets you to a working session in one sitting. In Codex or Cursor the move is the same: ask the sidebar chat where a run-time model call would live in your project, and have it describe the request and response without building them.
Chapter Summary
- The AI that helps you build your app and the AI inside your app are two different systems, and confusing them is the first mistake in building an AI product.
- The build-time AI is a coding assistant you operate: it bills as a fixed cost, you review its output, and its mistakes land on your machine before any user exists.
- The run-time AI is a model your product calls while a user waits: it bills on every use, its output ships unread, and the input comes from strangers.
- Per-call pricing changes too fast to memorize, so think in orders of magnitude and check the providers' pricing pages and the Playbook for current numbers.
- The model returns a wrong answer in the same confident voice as a right one, so fluent output is not proof of correct output.
- "Adding AI" really means wiring a specific model into a specific spot in your build and owning how it behaves, the same way you own the database or the auth layer.
- Treat the run-time model like a capable new hire, not the assistant you trusted last week: narrow its job, cap its spend, bound its reach, and verify before you rely on it.
- Next up, Make your first model call takes you from this idea to a real request and response you can run today.
Sources
- OpenAI, Anthropic, and Google developer documentation on calling models over an API, 2026.
- Intercom documentation on the Fin AI agent, 2026.