Site structure for AI citations — clusters, hubs, and passage retrieval
Answer engines retrieve passages, not sites. When two pages cover the same topic, the one embedded in a topic cluster — linked from a hub, cross-linked to sibling spokes — is more likely to surface than the one sitting alone. Structural isolation is a retrieval risk even on a technically sound, indexed page. The fix is intentional internal linking, not more content.
How answer engines retrieve passages, not sites
When an AI assistant answers a question, it does not evaluate your site as a whole. It retrieves candidate passages — short, extractable chunks of text — from its index and ranks them by how directly they match the query. Your homepage authority does not transfer to a product page buried three clicks deep. Each passage competes on its own extractability and the contextual signals that surround it.
Internal links are one of those contextual signals. They tell crawlers which pages exist, how frequently to revisit them, and how they relate to each other. A page that receives many internal links from related pages signals topical relevance through co-occurrence. A page with no inbound internal links sends the opposite signal: it is hard to find, infrequently recrawled, and contextually isolated — a structurally orphaned page even if it is not technically blocked.
Why an isolated page loses to a clustered page
Two pages can cover the same topic equally well in terms of writing quality and keyword match. If one sits inside a cluster (linked from a hub, linked to sibling spokes, cross-linked reciprocally) and the other sits alone, the clustered page accumulates crawl attention and topical context that the isolated page does not. This is not a ranking trick; it is how crawl queues and topical indexing work.
Orphaned pages (no inbound internal links from any live page) are structurally deprioritized in crawl queues. The crawler may have visited once and never returned. The index entry may be stale or absent. A technically correct page with zero internal links pointing to it picks up none of the crawl and context signals a cluster provides, even if it would be an excellent passage answer.
This page covers topical clustering. If you are working on the technical prerequisites — canonical hygiene, Core Web Vitals, sitemap accuracy — see Technical SEO for AI search first. That is the floor this page builds on.
The hub-and-spoke pattern
A topic cluster has one hub page and several spoke pages.
The hub page answers the topic-level question directly in its opening passage. It establishes the frame (“here is what AI citation visibility means and why it matters”) and then links to spoke pages for depth. The hub is not a table of contents; it must contain a direct, extractable answer to the top-level query.
Each spoke page answers exactly one sub-question. It links back to the hub and links to two or three sibling spokes that answer related sub-questions. This reciprocal web tells crawlers: these pages are about the same topic, they answer different aspects of it, and they all connect to a central authority.
Linking rules for a hub-and-spoke cluster
- Every spoke links to the hub. No exceptions.
- Every spoke links to two or three sibling spokes covering adjacent sub-questions.
- The hub links to all spokes. The link text names the sub-question the spoke answers.
- Links are inline where the topic is introduced, not just collected at the bottom of the page. A link buried in a “Related articles” footer carries less contextual signal than a link in a body paragraph where the topic is named.
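The first three rules can be checked mechanically once you have a map of which page links to which. The following is a minimal sketch, not a definitive tool: it assumes you already hold the link graph as a dictionary, and the function name `check_cluster` and all URLs are hypothetical. It does not check the fourth rule (inline versus footer placement) or the wording of link text, which would need the page markup.

```python
from typing import Dict, List, Set


def check_cluster(hub: str, spokes: Set[str], links: Dict[str, Set[str]]) -> List[str]:
    """Return violations of the hub-and-spoke linking rules.

    `links` maps each page URL to the set of internal URLs it links to.
    """
    problems: List[str] = []
    hub_targets = links.get(hub, set())

    # sorted() keeps the report order stable between runs.
    for spoke in sorted(spokes):
        targets = links.get(spoke, set())

        # Rule: every spoke links to the hub.
        if hub not in targets:
            problems.append(f"{spoke}: missing link to hub")

        # Rule: every spoke links to two or three sibling spokes.
        siblings = (targets & spokes) - {spoke}
        if len(siblings) < 2:
            problems.append(
                f"{spoke}: links to {len(siblings)} sibling spoke(s), expected at least 2"
            )

        # Rule: the hub links to all spokes.
        if spoke not in hub_targets:
            problems.append(f"{hub}: missing link to {spoke}")

    return problems


# Hypothetical link graph for illustration only.
example_links = {
    "/learn": {"/learn/schema", "/learn/audit"},
    "/learn/schema": {"/learn", "/learn/audit", "/learn/bing"},
    "/learn/audit": {"/learn", "/learn/schema"},
    "/learn/bing": {"/learn/schema", "/learn/audit"},
}
example_spokes = {"/learn/schema", "/learn/audit", "/learn/bing"}

for problem in check_cluster("/learn", example_spokes, example_links):
    print(problem)
```

An empty list means the three checked rules hold for the graph you supplied; it says nothing about whether the pages are crawled or retrieved.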
Prompt Goblin's own /learn cluster as a live example
The /learn section of this site is built as a hub-and-spoke cluster in progress. The hub topic is AI citation visibility. The spoke pages each answer a distinct sub-question: why schema is not enough, how Bing rank connects to citations, what the audit checklist looks like, how to fix a rank-but-not-cited gap, and so on. Each spoke links back to related spokes and to the methodology page. This page is one of those spokes. We are practicing what we measure — no claimed citation results, just honest structural work underway.
Sub-question mapping — giving each question one home
Fan-out thinking is the practice of decomposing a topic into the questions users actually ask around it. For a topic like “AI citation visibility,” the fan-out might produce: Why am I not cited after adding schema? Why do I rank but not get cited? How do I audit for citation gaps? What does hub-and-spoke linking look like in practice? Each of those is a spoke candidate.
The discipline is giving each sub-question exactly one home. When two pages both attempt to answer the same sub-question, they compete with each other, divide internal link equity, and confuse both crawlers and retrievers. The fix is to identify the authoritative page for that sub-question, consolidate the duplicate content there, and redirect the weaker version.
How to run a sub-question map
- Start with the hub topic query. Write down the exact question a user would type.
- Fan out: list every related sub-question you know users ask. Use search autocomplete, “People also ask” in Google, and any support or sales questions you receive.
- Audit your existing pages: which sub-question does each page answer? Does any sub-question appear on more than one page?
- Assign each sub-question to exactly one page. If a page exists for it, reinforce it. If no page exists, create a spoke.
- Wire the links: every spoke links to the hub and to its two or three nearest sibling spokes.
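The audit and assignment steps above can be sketched as a small lookup. This is an illustrative sketch under stated assumptions: you have already labeled each existing page with the one sub-question it primarily answers (a manual judgment this code does not make for you), and the name `audit_sub_question_map` and the example URLs are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def audit_sub_question_map(
    sub_questions: List[str], page_answers: Dict[str, str]
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Return (sub-questions with more than one home, sub-questions with no home).

    `page_answers` maps a page URL to the sub-question that page answers.
    """
    homes: Dict[str, List[str]] = defaultdict(list)
    for url, question in page_answers.items():
        homes[question].append(url)

    # More than one page per question: candidates for consolidation.
    duplicates = {q: sorted(urls) for q, urls in homes.items() if len(urls) > 1}
    # No page for a question: candidates for a new spoke.
    missing = [q for q in sub_questions if q not in homes]
    return duplicates, missing


# Hypothetical fan-out and page labels for illustration only.
fan_out = [
    "Why am I not cited after adding schema?",
    "Why do I rank but not get cited?",
    "How do I audit for citation gaps?",
]
labels = {
    "/learn/schema": "Why am I not cited after adding schema?",
    "/learn/schema-explained": "Why am I not cited after adding schema?",
    "/learn/audit": "How do I audit for citation gaps?",
}

duplicates, missing = audit_sub_question_map(fan_out, labels)
print("Two homes:", duplicates)
print("No home:", missing)
```

Each entry in the first result is a sub-question to consolidate onto one page; each entry in the second is a sub-question that needs a spoke.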
Structure smells, retrieval consequences, and fixes
| Structure smell | Retrieval consequence | Fix |
|---|---|---|
| Page has zero inbound internal links | Structurally orphaned — deprioritized in crawl queues; stale or absent index entry | Add it to the hub as a spoke link; add inline links from two sibling pages |
| Two pages answer the same sub-question | Cannibalization — link equity split; retrieval selects one unpredictably | Consolidate into the stronger page; 301-redirect the duplicate |
| Hub page is a link list with no answer content | No extractable passage for the top-level topic query; hub cannot anchor the cluster | Add a direct-answer opening paragraph before the spoke list |
| Spoke does not link back to hub | Topical relationship unresolved; spoke appears isolated despite quality content | Add inline hub link where the topic is first introduced in the spoke |
| Tag/category pages generate hundreds of thin archive URLs | Crawl budget diluted across pages with no direct-answer content; none retrievable | Noindex tag archives; consolidate topic navigation into hub pages |
| Spokes link only to hub, not to sibling spokes | Retriever sees a star pattern, not a semantic web; sibling co-relevance unreinforced | Add two or three inline links from each spoke to the most adjacent sibling spokes |
Common mistakes
Orphan pages
The most common structural failure is publishing a page and not linking to it from anywhere. This happens most often with resource pages, tool pages, and older blog posts that predate the current site structure. Audit for orphans by running a crawl of your own site and filtering for pages with zero inbound internal links.
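Filtering for zero inbound links is a simple set operation once the link graph exists. The sketch below assumes the full list of pages comes from a source other than link-following, such as a sitemap or CMS export, because a crawler that only follows links would never reach a true orphan. The function name `find_orphans`, the `entry_points` parameter, and the URLs are hypothetical.

```python
from typing import Dict, List, Set


def find_orphans(links: Dict[str, Set[str]], entry_points: Set[str]) -> List[str]:
    """Return pages that no other page links to.

    `links` has one key per known page (for example, every URL in the sitemap),
    mapped to the internal URLs that page links out to. `entry_points` are pages
    such as the homepage that are expected to have no inbound internal links.
    """
    linked_to: Set[str] = set()
    for source, targets in links.items():
        # A page linking to itself does not count as an inbound link.
        linked_to |= targets - {source}

    return sorted(set(links) - linked_to - entry_points)


# Hypothetical site graph for illustration only.
site = {
    "/": {"/learn"},
    "/learn": {"/learn/site-structure"},
    "/learn/site-structure": {"/learn"},
    "/tools/old-calculator": {"/learn"},  # links out, but nothing links in
}

print(find_orphans(site, entry_points={"/"}))
```

Pages in the result are candidates for the fix in the table above: add them to the hub as spoke links and link to them inline from sibling pages.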
Two pages answering the same question
When a site has both “What is AEO?” and “AEO explained” as separate pages, retrievers may select either one, reducing the probability of either being the dominant passage. Sub-question mapping prevents this by assigning every question to exactly one page before writing begins.
Hub pages that are link lists with no answer content
A hub page that opens with “Here are our guides on X” followed immediately by a list of links has no extractable passage for the top-level query. It cannot anchor a cluster because it answers nothing. Add a direct opening paragraph — the best 60-word answer to the topic question — before the spoke links.
Infinite tag and category sprawl
Blogging platforms and CMS tools auto-generate tag pages, category archives, author pages, and date archives. None of these contain direct-answer content. They dilute crawl budget across URLs that will never be retrieved as passages. Noindex them, or consolidate topic navigation into hub pages with real answer content.
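Before deciding what to noindex, it helps to see how many archive URLs a site actually exposes. The following sketch flags likely auto-generated archives by URL path. The path prefixes and the date pattern are assumptions modeled on common blog URL layouts, not a rule any specific CMS guarantees, so adjust them to your own site; `is_archive_url` is a hypothetical helper name.

```python
import re
from urllib.parse import urlparse

# Assumed prefixes for tag, category, and author archives; adjust per site.
ARCHIVE_PREFIXES = ("/tag/", "/category/", "/author/")
# Assumed date-archive shapes: /2024, /2024/05, /2024/05/17 (optional trailing slash).
DATE_ARCHIVE = re.compile(r"^/\d{4}(/\d{2}){0,2}/?$")


def is_archive_url(url: str) -> bool:
    """Return True if the URL path looks like an auto-generated archive page."""
    path = urlparse(url).path
    return path.startswith(ARCHIVE_PREFIXES) or bool(DATE_ARCHIVE.match(path))


# Hypothetical URLs for illustration only.
urls = [
    "https://example.com/tag/seo/",
    "https://example.com/2024/05/",
    "https://example.com/learn/site-structure",
]
archive_urls = [u for u in urls if is_archive_url(u)]
print(f"{len(archive_urls)} of {len(urls)} URLs look like archives")
```

The resulting list is a review queue, not an automatic noindex list: a category page that has been rewritten as a hub with real answer content should stay indexable.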
Frequently asked questions
Does internal linking actually affect whether AI retrieves my content?
Internal links are how crawlers discover, recrawl, and contextualize pages. A page with few or no inbound internal links is structurally deprioritized in crawl queues — it is harder to find and its topical relationships are unclear. That is a retrieval risk even when the page is technically accessible.
How many spokes should a topic cluster have?
There is no fixed number. Each spoke should answer exactly one sub-question the audience actually asks. If you run a fan-out on the topic and find seven distinct sub-questions, build seven spokes. If two sub-questions are nearly identical, collapse them into one spoke rather than risking cannibalization.
What is the difference between a hub page and a link list?
A hub page directly answers the topic-level question in its opening passage and then links to spokes for depth. A link list has no answer content — it is just navigation. Answer engines can extract a passage from a hub; they cannot extract an answer from a list of links.
Can two pages on my site target the same query without hurting retrieval?
They can coexist if each answers a genuinely distinct sub-question and cross-links to the other. If both pages open with the same direct answer to the same query, they compete. Consolidate them into one definitive page and redirect the duplicate.
Sources cited on this page
This page makes no claims that require external citation. The relationship between internal linking, crawl discovery, and topical indexing is described qualitatively, consistent with publicly documented crawler behavior. All structural guidance is framed as how crawl queues and retrieval patterns work — not as guaranteed outcomes. No fabricated statistics are used. Observations about passage retrieval are directional and based on documented indexing mechanics, not controlled experiments.
What this does not guarantee
- Schema markup and internal linking are structural hygiene signals, not citation levers. No action described on this page promises citation by an AI assistant.
- No specific citation count, retrieval frequency, rank position, or AI-response outcome is promised by implementing the hub-and-spoke pattern or any other structural recommendation here.
- Fixing structural isolation removes a known retrieval risk. It does not guarantee that an answer engine will surface your content within any particular timeframe.
- Where Prompt Goblin engagement is mentioned: the refund covers the delivered work — audits, structural fixes, internal-link maps, measurement loop. It never covers a citation number or ranking position.
Want a structural audit of your site's cluster and orphan situation? Get in touch and we will map your internal-link gaps — which pages are orphaned, which sub-questions have two homes, and where the hub-and-spoke wiring is missing.
Go deeper
- Technical SEO for AI search — crawlability, canonicals, and the technical floor
- AEO audit checklist — structure checks and the full citation-readiness framework
- Rank-but-not-cited diagnostic — when structure is fine but retrieval still misses you
- Why schema markup isn't enough — what the real retrieval levers are
- Bing rank and AI citations — the direct connection between indexing and retrieval
- How the Prompt Goblin scan works — methodology