Frontend, what users see · The Builder's Stack

Double-tap a photo on Instagram and the heart flips red before your finger has left the glass, the like count climbing in the same motion, none of it waiting for a server. The request that records your like is still leaving your phone, because the layer you are touching answers first and does the bookkeeping afterward.

That layer is the frontend, everything users see and touch: the buttons, the text, the colors, the layout that rearranges when you rotate the phone, the animation confirming a tap landed.

In the journey of a request, this layer was the client, the part that runs on the user's device and turns taps into requests and responses into pixels. You have been judging frontends your whole life. What this chapter adds is the vocabulary, including the pieces AI products invented, and a way to describe a screen so a tool can build it.

HTML, CSS, and JavaScript, the materials of every screen

Every page a browser shows you mixes the same materials, and whenever you have seen HTML, CSS, and JavaScript named together, this is what they were doing.

HTML is the structure, the skeleton of the page. It declares what exists: here is a heading, here is a button, here is an image.
CSS is the appearance, the clothes on that skeleton. It declares how things look: the heading is dark green, the buttons rounded, the image full width on phones.
JavaScript is the behavior, the reactions. It declares what happens when the user acts: the button is tapped, so flip the heart red and send the like to the server.
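
The behavior half of that like button can be sketched in a few lines of TypeScript. This is a minimal illustration, not how Instagram builds it: LikeState, toggleLike, likeOptimistically, and the render and sendToServer parameters are all hypothetical names, and rolling the heart back when the request fails is an added assumption.

```typescript
// Hypothetical state for one like button; the names are illustrative only.
interface LikeState {
  liked: boolean;
  count: number;
}

// Flip the heart and adjust the count immediately, with no server involved.
function toggleLike(state: LikeState): LikeState {
  return state.liked
    ? { liked: false, count: state.count - 1 }
    : { liked: true, count: state.count + 1 };
}

// Answer the tap first, do the bookkeeping afterward.
// `render` and `sendToServer` are stand-ins supplied by the caller.
async function likeOptimistically(
  state: LikeState,
  render: (s: LikeState) => void,
  sendToServer: (s: LikeState) => Promise<void>,
): Promise<LikeState> {
  const next = toggleLike(state);
  render(next); // the screen changes before the request has finished
  try {
    await sendToServer(next);
    return next;
  } catch {
    render(state); // assumption: undo the change if the server never recorded it
    return state;
  }
}
```

The order inside likeOptimistically is the point: render runs before the request is awaited, which is why the heart can turn red while the like is still leaving the phone.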

You will not write these by hand. Your tools write them constantly and narrate as they go; the triad makes that narration readable. A tool editing an .html file or a component is changing what exists. A tool touching a .css file is changing how things look. A tool editing a function wired to an event like onClick is changing what the page does. Modern frameworks blend the materials, so one React component file can carry structure and behavior together, but the jobs inside it stay distinct.

UI is the look, UX is the experience

These two abbreviations get used interchangeably, and your tools will ask you about both, so keep them apart. UI, user interface, is the look of a screen, the part a screenshot captures: colors, typography, spacing, where the buttons sit. UX, user experience, is what using the product feels like over time: the path from sign-up to first value, the places people get stuck, how many taps the most common action takes.

The two are related, and they are not the same job. A meditation app can be gorgeous in screenshots and still take nine taps to start a session; Craigslist has stayed famously plain for decades and still moves apartments and sofas with almost no friction. When a tool asks what a screen should look like, that is a UI question. When it asks what a user should do first after signing up, that is a UX question. Answer them separately and both answers improve.

What AI products add to the screen

The classic frontend vocabulary settled long ago: buttons, forms, menus, lists, cards. AI products kept all of it and added new elements, because a generated answer behaves differently from anything those older parts were built to hold. It arrives over seconds rather than milliseconds, rests on sources, is sometimes wrong, and sometimes comes right before an action the product takes for you. Each of those traits needed a new element of its own.

The streaming block. A text area that fills word by word while the response is still being generated, so reading starts before the work finishes.
The citation chip. A small numbered marker attached to a sentence, tying the claim to the source it rests on.
The source rail. The list of sources alongside the answer, one entry per chip, so a reader can check the original.
Approve and undo controls. When the product is about to act (send the email, apply the edit, delete the rows), the screen asks first or offers a way back afterward, so the human stays in charge.
The working state. What the screen shows while the model runs: step labels and progress notes that make a multi-second wait feel honest instead of frozen.
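
To a tool, most of these elements are just data the screen keeps track of. The sketch below is one possible shape for that data, assuming a single answer on screen. Every name in it (AnswerScreen, Source, appendChunk, chipsWithoutSource) is hypothetical, and it leaves out approve and undo controls to stay short.

```typescript
// All names here are hypothetical, chosen to mirror the vocabulary above.
interface Source {
  id: number; // the number shown on the matching citation chip
  title: string;
}

interface AnswerScreen {
  workingState: string | null; // a step label while the model runs, null once text arrives
  streamedText: string; // the streaming block's contents so far
  chips: number[]; // citation chip numbers that appear in the answer
  sourceRail: Source[]; // the sources listed alongside the answer
}

// Streaming block: append each chunk as it arrives so reading can start early.
function appendChunk(screen: AnswerScreen, chunk: string): AnswerScreen {
  return {
    ...screen,
    workingState: null,
    streamedText: screen.streamedText + chunk,
  };
}

// Source rail check: every chip should have a rail entry the reader can open.
function chipsWithoutSource(screen: AnswerScreen): number[] {
  return screen.chips.filter(
    (chip) => !screen.sourceRail.some((source) => source.id === chip),
  );
}
```

Calling appendChunk once per arriving piece of text is what makes the block fill gradually, and a non-empty result from chipsWithoutSource would mean a chip points at nothing in the rail.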

One real screen carries most of this vocabulary. Ask Perplexity a question and the answer streams into the block, numbered citation chips sit at the ends of sentences, and the sources are listed alongside, one per number. The screen communicates, without explanation, which sentences rest on which sources.

Describe a screen so a tool can build it

Sometimes a tool asks what a screen should look like; more often it picks a layout and starts building, and your job is to notice and steer. Either way the decision is yours, and it does not require pixel art, hex codes, or measurements. A buildable description says what the user sees, in what order, and what each element does. Lead with who is on the screen and what they came to do, then walk the elements top to bottom, giving each one its job.

For an answer screen, it sounds like this. The user has just asked a question. From the top: their question restated in small text, then a streaming block where the answer arrives, citation chips at the end of any sentence that rests on a source, and a source rail listing each source by title, with a copy button and a follow-up field at the foot.
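
The same description can be written down as structured data, which makes the order and each element's job explicit. This is only a sketch: ScreenSpec, ElementSpec, and describeScreen are hypothetical names, and the stated purpose of the user, the copy button, and the follow-up field are assumptions filled in for illustration.

```typescript
// Hypothetical types: one entry per element, listed top to bottom.
interface ElementSpec {
  name: string;
  job: string; // what the element does for the user
}

interface ScreenSpec {
  who: string; // who is on the screen
  cameToDo: string; // what they came to do
  elements: ElementSpec[];
}

// Turn the spec back into the plain description you would hand to a tool.
function describeScreen(spec: ScreenSpec): string {
  const lines = [`${spec.who} is here to ${spec.cameToDo}. From the top:`];
  spec.elements.forEach((el, i) => {
    lines.push(`${i + 1}. ${el.name}: ${el.job}`);
  });
  return lines.join("\n");
}

const answerScreen: ScreenSpec = {
  who: "Someone who has just asked a question",
  cameToDo: "read the answer and check its sources", // assumed goal
  elements: [
    { name: "Question", job: "restates what the user asked, in small text" },
    { name: "Streaming block", job: "shows the answer as it arrives" },
    { name: "Citation chips", job: "mark the end of any sentence that rests on a source" },
    { name: "Source rail", job: "lists each source by title" },
    { name: "Copy button", job: "copies the answer" }, // assumed job
    { name: "Follow-up field", job: "takes the next question" }, // assumed job
  ],
};

console.log(describeScreen(answerScreen));
```

Nothing in the spec mentions colors or sizes. It carries who, why, order, and jobs, which is all a first draft needs.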

We spec our own screens that way. A description like this is enough for a tool to produce a first draft, and the draft will be wrong in ways you can only see once it exists. Your reaction to it makes the second pass closer, and that loop is what Working with AI as a builder turns into a method.

Try it now

No setup: Right-click anything on any page, this one included, and choose Inspect. The panel that opens is DevTools, free in every major browser: the HTML skeleton on one side, the CSS dressing your selected element on the other. Find a color value in the Styles panel and change it; the page updates instantly, and on reload the change snaps back, because you edited your browser's local copy, never the real site. Then open an AI product you use and sort one answer screen into the triad: the streaming block and the chips exist (skeleton), they are colored, rounded, and spaced (clothes), and the words arrive over time while a chip opens its source on click (behavior).

With your tools: In an empty folder, ask Claude Code for a one-page site about anything, then describe one screen change in the way this chapter taught, something like "move the heading above the image and turn the contact line into a button." Before approving, read which files it plans to touch and sort them with the triad: what exists lands in HTML or component files, how it looks lands in CSS, and what it does lands in functions wired to events. The diff stops being a wall of code and becomes a sorting exercise. If Claude Code is not installed yet, the Setup Clinic walks you through it. In Codex or Cursor the move is the same: describe the change in the sidebar and read the touched files in the diff before accepting.
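
The sorting step can be written out as a rough rule of thumb. The sketch below assumes file extensions are a good enough first guess, which is not always true: a .js file can also create elements or set styles. The names Bucket, sortFile, and sortTouchedFiles are hypothetical.

```typescript
// Hypothetical buckets matching the triad, plus two catch-alls.
type Bucket = "what exists" | "how it looks" | "what it does" | "mixed" | "unsorted";

// Guess a file's bucket from its extension. A heuristic, not a guarantee.
function sortFile(path: string): Bucket {
  const dot = path.lastIndexOf(".");
  const ext = dot === -1 ? "" : path.slice(dot).toLowerCase();
  switch (ext) {
    case ".html":
      return "what exists";
    case ".css":
      return "how it looks";
    case ".js":
    case ".ts":
      return "what it does";
    case ".jsx":
    case ".tsx":
      return "mixed"; // component files carry structure and behavior together
    default:
      return "unsorted";
  }
}

// Group every file a tool plans to touch by bucket.
function sortTouchedFiles(paths: string[]): Record<Bucket, string[]> {
  const sorted: Record<Bucket, string[]> = {
    "what exists": [],
    "how it looks": [],
    "what it does": [],
    mixed: [],
    unsorted: [],
  };
  for (const path of paths) {
    sorted[sortFile(path)].push(path);
  }
  return sorted;
}
```

For a request like moving the heading and turning the contact line into a button, you would expect most touched files to land in the first two buckets. A file in "what it does" is worth a second look before approving.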

Chapter Summary

The frontend is everything users see and touch on a screen, and it runs on the user's own device.
Every screen is built from three materials: HTML is the structure, CSS is the appearance, and JavaScript is the behavior. You will not write them by hand, but your tools narrate their work in those terms, so sorting a change into the three buckets makes the diff readable.
UI is the look a screenshot captures; UX is what the product feels like to use over time. They are related but separate jobs, so answer a question about each one on its own.
AI products kept the classic parts and added five new ones: the streaming block, the citation chip, the source rail, approve and undo controls, and the working state.
To get a screen built, describe what the user sees, in what order, and what each element does, starting with who is on the screen and what they came to do.
The first draft will be wrong in ways you can only spot once it exists, and your reaction makes the next pass closer.
Anything checked only on the frontend can be read or changed by any user, so secrets and any rule about money or data have to live on the server.
Next, the request your screen sends out lands somewhere with no pixels at all, in Backend, where the real work happens.
