When an AI engine reads your page, it usually does not read the whole thing. It pulls a few chunks, often based on similarity to the user's question, and reasons from those. A page that makes sense as a whole can lose its meaning when a model only sees one section.
This holds both for retrieval-augmented search inside an engine and for the retrieval systems some teams build over their own docs.
What goes wrong
A few patterns that break when chunked:
- Positional references: "See the previous section." The chunk does not include the previous section, so the reference is dead.
- Implicit subject: A section that talks about "the function" without naming it. The chunk does not include the heading or paragraph that named the function.
- Required context in the intro: A page that explains in the first paragraph that "all of these endpoints require admin auth." If a chunk from the middle of the page is retrieved, the admin requirement is invisible.
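The third pattern is easy to reproduce. The sketch below is a toy: the sample page, the heading-based splitter, and the word-overlap scoring are all assumptions made for illustration, and real engines typically use embeddings and their own chunking rules. It only shows how the best-matching chunk can arrive without the constraint stated in the intro.

```python
import re

# Hypothetical page: the admin requirement appears only in the intro.
PAGE = """\
# Widgets API
All of these endpoints require admin auth.

## List widgets
GET /widgets returns every widget.

## Delete a widget
DELETE /widgets/{id} removes one widget.
"""

def split_by_heading(text):
    """Split text into chunks, starting a new chunk at each heading line."""
    chunks, current = [], []
    for line in text.splitlines():
        if line.startswith("#") and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return chunks

def words(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(chunks, question):
    """Return the chunk sharing the most words with the question.

    Word overlap is a crude stand-in for similarity search.
    """
    q = words(question)
    return max(chunks, key=lambda c: len(words(c) & q))

chunks = split_by_heading(PAGE)
best = retrieve(chunks, "How do I delete a widget?")
print(best)                # the "Delete a widget" section only
print("admin" in best)     # False: the auth requirement did not come along
```

A model answering from `best` alone has no way to know that the delete endpoint requires admin auth.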
What helps
- Restate the subject in each section heading. "Authenticating to the Acme API" instead of "Authentication."
- Repeat constraints inline. If admin auth is required, say so on the endpoint heading or in the first sentence of that section.
- Avoid pronouns across sections. Use the resource name or the parameter name. Repetition is cheap and helps.
- Make examples self-contained. Each code example should include the auth header, not refer to a "globally configured client."
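Some of these habits can be checked mechanically. The following is a rough sketch of a section linter, not a complete tool: the function name `lint_section`, the phrase list, and the heuristics are all hypothetical, and they will miss cases and raise false alarms. It flags references that point outside the section, sections that open with a pronoun, and one-word headings.

```python
import re

# Phrases that point outside the current section (illustrative, not exhaustive).
DANGLING = re.compile(
    r"\b(previous section|next section"
    r"|as (?:mentioned|described|noted) (?:above|earlier)"
    r"|see (?:above|below))\b",
    re.IGNORECASE,
)

# Openers that lean on a subject named somewhere else.
VAGUE_OPENER = re.compile(r"^(it|they)\b", re.IGNORECASE)

def lint_section(heading, body):
    """Return a list of warnings for one section of a page."""
    warnings = []
    match = DANGLING.search(body)
    if match:
        warnings.append(f"positional reference: '{match.group(0)}'")
    if VAGUE_OPENER.match(body.strip()):
        warnings.append("opens with a pronoun instead of naming the subject")
    if len(heading.split()) < 2:
        warnings.append("one-word heading does not name the subject")
    return warnings

# A section that depends on its surroundings: three warnings.
bad = lint_section(
    "Authentication",
    "It uses a token. See the previous section for setup.",
)

# A section that stands on its own: no warnings.
good = lint_section(
    "Authenticating to the Acme API",
    "The Acme API requires admin auth on every request. "
    "Send the token in the Authorization header.",
)
```

A check like this will not catch a missing constraint, since it cannot know what the intro said. That part still needs a human reading each section in isolation.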
Why this is not just for AI
Chunk-friendly writing also helps humans. A reader who clicks an anchor link and lands in the middle of a page can read the section and know what it is about.
The cost is a small amount of repetition. The benefit is that any reader, human or model, can pick up the page at any point and still know what they are reading.