AEO audit checklist for small agencies
An AEO (answer engine optimization) audit checks whether a client site is positioned for AI-search citation across three mechanical levers: brand-mention surface on the sources answer engines already retrieve; Bing/web rank with verified indexing and a clean crawl record; and answer-shaped content, meaning server-rendered HTML with direct answers up front, a logical heading hierarchy, and named entities. Schema, canonicals, and llms.txt are a hygiene pass, not citation levers.
How an AEO audit differs from a classic SEO audit
A classic technical SEO audit focuses on rank signals. An AEO audit adds the off-page retrieval layer — the sources an answer engine consults before it decides what to surface — and checks whether the page content is shaped for passage extraction, not just for ranking.
| Audit area | Classic SEO audit | AEO audit (adds or shifts) |
|---|---|---|
| Off-page signals | Backlink profile, domain authority | Brand mentions on sources answer engines already cite for your category |
| Indexing | Google Search Console coverage | Bing Webmaster Tools verified, sitemap submitted, IndexNow active |
| Content shape | Keyword density, word count, E-E-A-T signals | Direct answer in first 100 words, named entities explicit, server-rendered HTML |
| Schema / markup | Schema.org types, rich result eligibility | Hygiene pass only — schema does not cause citations; flag mismatched types (e.g. Product on a service page) |
| Crawlability | Robots.txt, canonical, redirect chains | Same, plus: JS-only rendering = page does not exist for most AI crawlers |
| Scoring blocked pages | Score 0 if unreadable | Flag as blind spot; never score a page you could not read |
The AEO audit checklist
Work through each group in order. The three lever groups determine citation probability; the hygiene group is a final pass. Note each item as: pass, gap, or blind spot (could not read). Blind spots are findings, not zeros.
Lever 1 — Brand-mention surface
Answer engines surface brands they have seen referenced on third-party sources. This group checks whether the client brand is present on those sources.
| Checklist item | How to check |
|---|---|
| Identify which publications or directories answer engines cite when answering queries in the client's category | Run 5–10 representative queries in Perplexity and ChatGPT; note which domains appear as citations |
| Check whether the client brand is named on those cited sources | Site-search or Google query: site:<cited-domain> "<client brand>" |
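| Check whether any brand mention ties the name to the agency descriptor and the client's canonical URL | Open each mention and confirm the brand name appears alongside the descriptor and a link to the canonical URL; bare brand mentions without context may resolve to a competitor with a similar name |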
| Count distinct third-party indexed pages that name the brand | Google: "<client brand>" -site:<client domain>; note result count as a baseline |
| Flag any brand mentions that link to the wrong domain or a competitor | Check each top-result anchor href manually; log corrections as gap items |
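The first checklist item produces a pile of citation URLs noted by hand. A minimal sketch of tallying them by domain follows, assuming the URLs were copied out of the answer-engine responses manually; the `cited_domains` helper and the example domains are hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse


def cited_domains(citation_urls):
    """Count how often each domain appears in a list of citation URLs."""
    counts = Counter()
    for url in citation_urls:
        host = urlparse(url).netloc.lower()
        # Treat www.example.com and example.com as the same source.
        if host.startswith("www."):
            host = host[4:]
        if host:  # skip strings that did not parse as absolute URLs
            counts[host] += 1
    return counts


# Hypothetical citations noted while running category queries by hand.
noted = [
    "https://www.example-directory.com/best-agencies",
    "https://example-directory.com/reviews/acme",
    "https://example-magazine.com/roundup",
]

for domain, n in cited_domains(noted).most_common():
    print(f"{domain}: cited {n} time(s)")
```

The domains at the top of the tally are the ones to check first for client brand mentions. The tally is a point-in-time snapshot of a handful of queries, not a measured citation share.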
Lever 2 — Bing/web rank and indexing
Many large language model assistants ground their retrieval on Bing-indexed content, so Google indexing alone is not sufficient.
| Checklist item | How to check |
|---|---|
| Bing Webmaster Tools account verified and sitemap submitted | Log into BWT; confirm ownership verified and sitemap URL shows no errors |
| Key pages indexed in Bing | BWT URL inspection for homepage, top service pages, and the 5 pages most relevant to target queries |
| No crawl errors on key pages | BWT > Crawl > Crawl errors; log any 4xx or redirect chains |
| IndexNow configured or manual submission in place | Check whether an IndexNow key file exists at the root; if not, log as a gap and note the setup cost is low |
| Bing rank for primary queries (directional baseline) | Search each target query in Bing; note approximate position; this is a snapshot, not a tracked metric |
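For the IndexNow item, the protocol expects a text file named after the key, hosted on the site and containing the key itself, and accepts URL submissions as a JSON POST. The sketch below assumes the client has shared their key (the value shown is a placeholder) and that the key file sits at the site root; the helper names are hypothetical, and `submit` is defined but deliberately not called, since an auditor would normally not submit URLs without the client's approval.

```python
import json
import urllib.request

# Shared IndexNow endpoint; participating engines exchange submissions.
INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def key_file_url(host, key):
    """Expected location of the key file when it is hosted at the root."""
    return f"https://{host}/{key}.txt"


def key_file_matches(body, key):
    """The key file should contain the key and nothing else."""
    return body.strip() == key


def build_payload(host, key, urls):
    """JSON body for a bulk IndexNow submission."""
    return {
        "host": host,
        "key": key,
        "keyLocation": key_file_url(host, key),
        "urlList": list(urls),
    }


def submit(payload, timeout=10):
    """POST the payload and return the HTTP status code."""
    req = urllib.request.Request(
        INDEXNOW_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status


# Placeholder host and key, for illustration only.
payload = build_payload(
    "www.example.com",
    "0123456789abcdef",
    ["https://www.example.com/services/"],
)
print(json.dumps(payload, indent=2))
```

During an audit, fetching `key_file_url(...)` and passing the response body to `key_file_matches` is enough to mark the item as pass or gap.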
Lever 3 — Content shape
Answer engines extract passages. Check whether the page content is structured so an engine can identify and pull a direct answer without guessing.
| Checklist item | How to check |
|---|---|
| Content is server-rendered HTML: the core answer is in the HTTP response body, not injected by JavaScript | curl -A "Googlebot" <url> \| grep -i "<key phrase>"; if the phrase is absent from the curl output, flag as JS-rendering gap |
| Direct answer to the target question appears in the first 100 words | Read the page; note how many words precede the answer sentence; flag if >100 |
| Heading hierarchy is logical: h1 → h2 → h3, no skipped levels | Run axe-core or inspect heading order in DevTools; log any skipped level as a gap |
| Named entities (brand, product, location, person) are called out explicitly in prose, not only in markup | Search the HTML body for entity names; markup-only entities are not reliably extracted |
| Key facts are in extractable plain text — not image text, CSS-generated content, or hidden elements | View source; check that the claimed answer text is in the DOM as plain text |
| Tables and lists are in plain HTML — not JS-rendered or image-based | Check table markup: <table>, <th>, <td>; if tables are missing from curl output, flag |
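Several of the content-shape checks above can be run against the raw HTTP response body without a browser. The following is a rough sketch using only the standard library: it takes HTML that has already been fetched (for example, the curl output), ignores anything inside script and style tags, and reports how many words precede the answer phrase and where heading levels are skipped. The class and function names are hypothetical, and the parsing is intentionally simple; it does not detect hidden elements or image text.

```python
import re
from html.parser import HTMLParser


class ShapeParser(HTMLParser):
    """Collect heading levels and visible text from raw (unrendered) HTML."""

    SKIP = {"script", "style", "template"}

    def __init__(self):
        super().__init__()
        self.levels = []
        self.text = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif re.fullmatch(r"h[1-6]", tag):
            self.levels.append(int(tag[1]))

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Text inside script/style is not content an engine can extract.
        if not self._skip_depth:
            self.text.append(data)


def skipped_headings(levels):
    """Return (previous, current) pairs where a level is skipped, e.g. h1 to h3."""
    return [(a, b) for a, b in zip(levels, levels[1:]) if b > a + 1]


def words_before(text, phrase):
    """Words preceding the first occurrence of phrase, or None if absent."""
    idx = text.lower().find(phrase.lower())
    if idx == -1:
        return None
    return len(text[:idx].split())


def audit_shape(raw_html, answer_phrase):
    parser = ShapeParser()
    parser.feed(raw_html)
    text = " ".join(" ".join(parser.text).split())  # normalize whitespace
    return {
        "words_before_answer": words_before(text, answer_phrase),
        "skipped_headings": skipped_headings(parser.levels),
    }


# Hypothetical page body, as returned by a plain HTTP fetch.
sample = """<html><head><script>var x = "AEO audit";</script></head>
<body><h1>AEO audits</h1>
<p>An AEO audit checks three levers.</p>
<h3>Details</h3></body></html>"""

report = audit_shape(sample, "An AEO audit checks")
print(report)
```

A `words_before_answer` of `None` means the phrase is missing from the raw HTML, which corresponds to the JS-rendering flag in the first row; a value above 100 corresponds to the buried-answer flag.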
Hygiene pass — schema, canonicals, robots, llms.txt
This group is parse hygiene and structural signaling — not citation levers. A page that passes every item below but fails the three lever groups above will not be cited. Run this pass last.
| Checklist item | How to check |
|---|---|
| JSON-LD schema type matches page content: service and professional-service pages use Service or ProfessionalService, not Product | View source; check @type values; flag any mismatch (e.g. Product on a service page) |
| Schema validates without errors | Google Rich Results Test or the Schema Markup Validator (validator.schema.org); errors are blockers, warnings are informational |
| Canonical tag is self-referential (or correct) on key pages | View source; check <link rel="canonical"> href; flag any canonical pointing to a different URL than the page |
| robots.txt does not block key pages or the sitemap | Fetch /robots.txt; confirm Disallow rules do not cover target pages |
| XML sitemap is present, submitted, and includes key pages | Fetch /sitemap.xml; confirm target pages are listed; cross-check with BWT submission status |
| llms.txt is present and current | Fetch /llms.txt; confirm it lists current key pages with descriptions. Note: llms.txt is a hygiene label, not a ranking or citation signal — absence does not suppress citations |
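The view-source checks in this group can also be scripted. The sketch below is one possible approach using the Python standard library: it pulls canonical hrefs and JSON-LD @type values out of already-fetched HTML and tests a robots.txt body against a target URL. The class and function names, the sample page, and the sample robots.txt are all hypothetical. It does not replace a schema validator; it only surfaces the values to compare.

```python
import json
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class HygieneParser(HTMLParser):
    """Collect canonical hrefs and raw JSON-LD blocks from page source."""

    def __init__(self):
        super().__init__()
        self.canonicals = []
        self.jsonld_blocks = []
        self._in_jsonld = False
        self._buf = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and "canonical" in (a.get("rel") or "").lower().split():
            self.canonicals.append(a.get("href"))
        elif tag == "script" and (a.get("type") or "").lower() == "application/ld+json":
            self._in_jsonld = True
            self._buf = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.jsonld_blocks.append("".join(self._buf))
            self._in_jsonld = False


def schema_types(jsonld_blocks):
    """Collect @type values, walking nested objects, lists, and @graph."""
    found = []

    def walk(node):
        if isinstance(node, dict):
            t = node.get("@type")
            if isinstance(t, str):
                found.append(t)
            elif isinstance(t, list):
                found.extend(x for x in t if isinstance(x, str))
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    for block in jsonld_blocks:
        try:
            walk(json.loads(block))
        except json.JSONDecodeError:
            found.append("<invalid JSON-LD>")
    return found


def robots_allows(robots_txt, url, agent="bingbot"):
    """True if the given robots.txt body lets the agent fetch the URL."""
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch(agent, url)


# Hypothetical service page and robots.txt.
page_url = "https://www.example.com/services/"
sample = """<html><head>
<link rel="canonical" href="https://www.example.com/services/">
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product", "name": "Audit"}
</script></head><body></body></html>"""

parser = HygieneParser()
parser.feed(sample)
types = schema_types(parser.jsonld_blocks)
print("canonical self-referential:", parser.canonicals == [page_url])
print("schema types:", types)
print("robots allows page:", robots_allows("User-agent: *\nDisallow: /private/", page_url))
```

In this sample the canonical and robots checks pass, while the Product type on a service page is the kind of mismatch the first row asks you to flag.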
How to score and prioritize findings
Score each finding on two axes: impact (how much this gap is likely suppressing citation probability) and effort (time and skill to fix). Prioritize high-impact, low-effort fixes first. Three rules keep the output honest:
- Lever gaps outrank hygiene gaps. A missing IndexNow integration (Lever 2) ranks above a schema validation warning (hygiene) regardless of effort.
- Blocked or unreadable pages are blind spots, not zeros. If a page returns a 403, blocks crawlers, or renders critical content only in JavaScript, log it as a blind spot with the specific access failure. Never assign a score you cannot support with a successful read.
- Qualitative observations are labelled as such. If you note a gap based on a point-in-time search-result check rather than measured data, say so explicitly in the findings.
| Finding type | Impact | Typical effort | Priority |
|---|---|---|---|
| JS-only rendering on key pages | High — page is invisible to most crawlers | Medium–high (requires dev) | 1 — fix first |
| No Bing WMT verification or sitemap | High — Bing cannot index reliably | Low (30 min setup) | 1 — fix first |
| Brand absent from cited sources | High — no retrieval surface | Medium–high (off-page work) | 1 — address in parallel |
| Answer buried past 100 words | Medium | Low (copy edit) | 2 — second pass |
| Missing heading hierarchy | Medium | Low (markup) | 2 — second pass |
| Schema type mismatch | Low–medium (parse confusion) | Low | 3 — hygiene pass |
| Missing llms.txt | Low (no evidence of citation impact) | Low | 3 — hygiene pass, optional |
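The two-axis scoring and the first two rules above can be expressed as a sort order. This is a sketch under stated assumptions: the `Finding` fields, the three-step impact and effort scales, and the tie-breaking order are illustrative choices, not a published scoring formula. Blind spots are returned as a separate list and never ranked.

```python
from dataclasses import dataclass

IMPACT = {"high": 3, "medium": 2, "low": 1}
EFFORT = {"low": 1, "medium": 2, "high": 3}


@dataclass
class Finding:
    name: str
    group: str               # "lever" or "hygiene"
    status: str              # "gap" or "blind_spot"
    impact: str = "medium"
    effort: str = "medium"


def prioritize(findings):
    """Return (ranked_gaps, blind_spots). Blind spots are reported, never scored."""
    gaps = [f for f in findings if f.status == "gap"]
    blind = [f for f in findings if f.status == "blind_spot"]
    ranked = sorted(
        gaps,
        key=lambda f: (
            f.group != "lever",   # lever gaps outrank hygiene gaps
            -IMPACT[f.impact],    # then higher impact first
            EFFORT[f.effort],     # then lower effort first
        ),
    )
    return ranked, blind


# Hypothetical findings from one audit.
findings = [
    Finding("Schema type mismatch", "hygiene", "gap", "low", "low"),
    Finding("JS-only rendering on key pages", "lever", "gap", "high", "high"),
    Finding("Pricing page returned 403", "lever", "blind_spot"),
    Finding("Answer buried past 100 words", "lever", "gap", "medium", "low"),
    Finding("No Bing Webmaster Tools verification", "lever", "gap", "high", "low"),
]

ranked, blind = prioritize(findings)
for i, f in enumerate(ranked, start=1):
    print(f"{i}. {f.name} ({f.group}, impact={f.impact}, effort={f.effort})")
for f in blind:
    print(f"Blind spot: {f.name}")
```

Putting the lever/hygiene flag first in the sort key is what makes a lever gap outrank a hygiene gap regardless of effort.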
Common mistakes when auditing for AEO
Auditing schema first
Schema is the fastest item to check and the lowest-leverage item to fix. Auditors who open the Rich Results Test before checking Bing indexing or brand-mention surface are optimizing the label on a box that hasn't been delivered yet. Run the lever checks first.
Promising citation counts
No structural fix guarantees a specific citation count. Citation depends on crawl timing, model update cycles, retrieval snapshot freshness, and signals outside the site owner's control. Report findings as gaps that suppress citation probability, not as items that will produce a citation count when fixed.
Treating one engine as all engines
Google AI Overviews, Perplexity, ChatGPT, and Bing Copilot each have different retrieval pipelines. A page that appears in Google AI Overviews may not appear in Perplexity, and vice versa. An honest audit notes which engine a finding applies to and avoids generalizing across all platforms from a single check.
Scoring a blocked page as zero
A page behind a login wall, a WAF block, or JS-only rendering is a page you could not read. Log the access failure as the finding. Assigning a low score based on metadata alone misrepresents the confidence level of the result.
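One way to keep blind spots out of the scores is to classify each fetch before any scoring happens. The sketch below is an assumption about how that might look: the `classify_read` helper, its labels, and the choice of which HTTP status codes count as unreadable are illustrative, not part of any tool named on this page.

```python
def classify_read(status_code, body_text, expected_phrase):
    """Return (label, reason) for one fetch attempt of a page."""
    # Access failures: the auditor could not read the page at all.
    if status_code in (401, 403, 429) or status_code >= 500:
        return "blind_spot", f"HTTP {status_code}: page could not be read"
    # Other non-200 responses are findings about the URL itself.
    if status_code != 200:
        return "gap", f"HTTP {status_code}: page not served at this URL"
    # A 200 with no core content in the raw HTML was not really read either.
    if expected_phrase.lower() not in body_text.lower():
        return "blind_spot", "expected content not in raw HTML (possibly JS-rendered)"
    return "readable", "expected content present in raw HTML"


print(classify_read(403, "", "aeo audit"))
print(classify_read(200, "<div id='root'></div>", "aeo audit"))
print(classify_read(200, "<p>An AEO audit checks three levers.</p>", "aeo audit"))
```

Only pages labelled readable go on to be scored; the reason string for everything else is what belongs in the findings.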
Frequently asked questions
How is an AEO audit different from a standard SEO audit?
A standard SEO audit focuses on rank signals: crawlability, canonicals, page speed, backlink profile. An AEO audit adds two layers those checks miss: whether the brand is named on the external sources answer engines already retrieve for your category, and whether the page content is shaped so an engine can extract a direct answer, not merely rank the page.
Should schema markup be the first thing I fix?
No. Schema is hygiene — it helps crawlers label what a page is. It does not cause citations. Fix crawlability and Bing indexing first, then work on content shape (direct answers, heading hierarchy, named entities). Schema belongs in a final hygiene pass, not at the top of the priority queue.
What do I do if a client page returns a 403 or blocks crawlers?
Flag it as a blind spot, not a zero score. A page you cannot read cannot be audited honestly. Note the blocker in your findings, attempt a stealth or render fallback if the platform supports it, and report the access gap explicitly. Never assign a score based on a failed read.
Does Prompt Goblin guarantee that fixing audit findings will produce citations?
No. Audit findings address structural inputs — crawlability, content shape, Bing indexing, brand-mention surface. These raise the probability an answer engine can retrieve and cite a page. No specific citation count, ranking position, or AI-response outcome is promised. The refund covers the delivered work, not a citation number.
Sources cited on this page
This page makes no claims that require an external quantitative source. The checklist items and prioritization framework are based on Prompt Goblin's own audit practice and qualitative observations of how answer engines retrieve content. All observations are point-in-time and labelled as such. No fabricated statistics are used.
What this does not guarantee
- Schema and markup are hygiene, not citation levers. Implementing JSON-LD, FAQPage markup, or llms.txt does not cause answer engines to cite a page. These are structural signals that help parsers label content; they do not raise citation probability on their own.
- No specific citation count, rank position, or AI-response outcome is promised by any action described on this page. Fixing gaps described here addresses structural inputs. Effects depend on crawl timing, model update cycles, and signals outside the site owner's control.
- The refund covers the work, never a citation number. We measure the gap, deliver the fixes, and track the delta. We do not guarantee a citation appears or persists in any answer engine.
- We measure delta, not ETA. There is no guaranteed timeline from structural fix to observable citation change in any answer engine.
Run the Prompt Goblin scan on a client site and get a structured report of technical and hygiene gaps — crawlability, Bing indexing, content shape, and schema issues: start a free AEO scan.