GEO Glossary


Changelog archive

Every editorial revision ever recorded across the GEO Glossary, grouped by month and newest first. 257 revisions across 69 terms, 16 distinct workdays.

  1. May 2026 257 revisions

    1. 26 revisions

      • Claude citation confirmed for two of this entry's tracked queries: the definition query ('What is authoritative statement strength in AI search content optimization?') and the Aggarwal vendor-research query ('How did Aggarwal 2023 rank Authoritative tone among content methods?'). The entry surfaced as a primary source in Claude's desktop web-search answers. Same-day cross-engine signal: Perplexity also confirmed cited today across the same probes plus the operational variant.

      • Perplexity citation confirmed for all three of this entry's tracked queries: the definition query ('What is authoritative statement strength in AI search content optimization?'), the operational query ('Does using authoritative tone increase ChatGPT citation rate?'), and the vendor-research query ('How did Aggarwal 2023 rank Authoritative tone among content methods?'). The entry surfaced as a primary source with inline citations in all three Perplexity answers. First multi-probe citation event for this entry, one day after the lede was rewritten to lead with the paper's verbatim 'no significant improvement' framing.

      • PAWC labeling sweep. Aggarwal footnote now labels values as 'Table 1 main GEO-bench' with the standalone-vs-named-top-3 distinction explicit. Cluster's prior 'top 4' framing (Quotation, Statistics, Fluency, Cite Sources) replaced with the paper's verbatim named top-3 (Cite Sources, Quotation, Statistics) for combined-method strength; Fluency's 3rd-place standalone PAWC noted but not framed as in the paper's named effective group. Lede and FAQ #2 updated to compare Authoritative's null against the paper's named top-3. Table 5 Perplexity.ai per-engine caveat added.

      • Primary-source re-verification of the cluster's shared Aggarwal Table 1 PAWC numbers + verbatim quotes against the ar5iv mirror of arXiv:2311.09735. All Table 1 PAWC values, Table 1 caption verbatim, Section 4 prose, the named top-3 quote, and Table 5 Perplexity per-engine numbers (including Keyword Stuffing 21.9 with paper prose 'performs 10% worse than the baseline') confirmed. Aggarwal footnote appended with the re-verification note; no body changes.

      • Citation match rate

        Citation metrics

        Claude citation confirmed for the definition query ('What is citation match rate in AI search?'). The entry surfaced as a primary cited source in Claude's desktop answer. Cross-engine consistency: ChatGPT first cited this entry 2026-05-17; Claude now also cites for the same probe.

      • Citation share

        Citation metrics

        Claude citation confirmed for the definition query ('What is citation share in AI search?'). The entry surfaced as a primary cited source in Claude's desktop answer. First confirmed Claude citation for this entry.

      • Cite Sources Optimization

        GEO content methods

        Perplexity citation confirmed for the Aggarwal vendor-research probe ('What did Aggarwal 2023's GEO paper say about Cite Sources as a content method?'). Cite Sources Optimization | GEO Glossary surfaced as one of 11 sources, with the sibling Quotation Addition entry also cited inline. Citation comes one day after the entry's lede was rewritten to lead with the paper's verbatim named top-3 framing.

      • Cite Sources Optimization

        GEO content methods

        PAWC labeling sweep. Lede now leads with the paper's verbatim named top-3 (Cite Sources, Quotation Addition, Statistics Addition) rather than the cluster's prior 'one of four top-performing methods' framing. The paper itself names Cite Sources in the top-3 for combined-method strength even though standalone Table 1 PAWC ranks Cite Sources 4th (the paper notes that standalone Cite Sources sits 8% below Quotation but reaches an average of 31.4% in combinations). Aggarwal footnote now labels values as 'Table 1 main GEO-bench' and adds Table 5 (Perplexity.ai) per-engine caveat.

      • Cite Sources Optimization

        GEO content methods

        Primary-source re-verification of the cluster's shared Aggarwal Table 1 PAWC numbers + verbatim quotes against the ar5iv mirror of arXiv:2311.09735. All Table 1 PAWC values, Table 1 caption verbatim, Section 4 prose, the named top-3 quote, and Table 5 Perplexity per-engine numbers (including Keyword Stuffing 21.9 with paper prose 'performs 10% worse than the baseline') confirmed. Aggarwal footnote appended with the re-verification note; no body changes.

      • Cite-ability

        Umbrella terms

        Perplexity citation reconfirmed for the definition query ('What is cite-ability in AI search?'). Cite-ability | GEO Glossary continues to surface as a primary source ten days after first observed citation on 2026-05-20, with inline citations across the answer body.

      • FAQ Schema

        Schema cluster

        Added the primary-source URL for the August 2023 Google Search Central blog post 'Changes to HowTo and FAQ rich results' to the deprecation footnote (the prior text described the August 2023 restriction without linking to the announcement post itself). The footnote now cites both deprecation phases by primary source: the August 2023 restriction post and the May 7, 2026 full-deprecation notice on Google's FAQPage structured-data documentation page.

      • Fluency Optimization

        GEO content methods

        Claude citation confirmed for the Aggarwal vendor-research query ('What did Aggarwal 2023's GEO paper say about Fluency Optimization as a content method?'). The entry surfaced as a primary cited source in Claude's desktop answer. First confirmed Claude citation for this entry.

      • Fluency Optimization

        GEO content methods

        PAWC labeling sweep. Critical reframing: lede previously called Fluency Optimization 'one of the four top-performing single methods', but the paper's verbatim named top-3 (Cite Sources, Quotation, Statistics) does NOT include Fluency despite its 3rd-place standalone Table 1 PAWC. The paper's strongest framing of Fluency is the §5.3 Fluency-plus-Statistics combination pair (+5.5% over any single method), not as a named top single-method intervention. Aggarwal footnote labels values as 'Table 1 main GEO-bench' + Table 5 Perplexity.ai per-engine caveat.

      • Fluency Optimization

        GEO content methods

        Primary-source re-verification of the cluster's shared Aggarwal Table 1 PAWC numbers + verbatim quotes against the ar5iv mirror of arXiv:2311.09735. All Table 1 PAWC values, Table 1 caption verbatim, Section 4 prose, the named top-3 quote, and Table 5 Perplexity per-engine numbers (including Keyword Stuffing 21.9 with paper prose 'performs 10% worse than the baseline') confirmed. Aggarwal footnote appended with the re-verification note; no body changes.

      • Keyword Stuffing

        GEO content methods

        PAWC labeling sweep (same-day after publish). Inline Table 1 is now labeled 'main GEO-bench' and frames relative gains as 'mathematically derived' rather than implying they are the paper's headline numbers; the paper itself frames its top-3 (Cite Sources, Quotation, Statistics) at 30-40%. Cluster's prior 'four top-performing levers' framing replaced with the paper's verbatim named top-3 in How-to-apply and How-it-relates. Fluency clarified as 3rd standalone but not in the paper's named top-3 (strongest in combined-method experiment instead). Footnote adds Table 5 per-engine caveat.

      • Keyword Stuffing

        GEO content methods

        Cross-benchmark scoping polish (same-day after PAWC sweep). Added 'under the tested public benchmarks' qualifier so the conclusion does not over-generalize beyond what Aggarwal 2023 and C-SEO Bench 2025 directly measured. Replaced 'the only two public benchmarks' with 'the two public benchmarks this entry cites' to keep the framing time-bounded as new benchmarks appear. Body and C-SEO Bench footnote now make explicit that C-SEO Bench measures document ranking, not Aggarwal's PAWC citation-share metric, making it corroborating counter-evidence rather than a direct PAWC replication.

      • Keyword Stuffing

        GEO content methods

        Epistemic re-emphasis + cluster-wide PAWC primary-source re-verification. Description, lede, and Status now lead with paper-verbatim 'little to no performance improvement' (Section 4) and 'performs 10% worse than the baseline' (Table 5 Perplexity prose); the raw -8.7% / PAWC 17.8 / below-baseline framing is subordinated as a transparency check. Mirrors the pattern applied to authoritative-statement-strength. Body adds Perplexity Table 5 prose escalation (KS 21.9 vs baseline 24.0). Aggarwal footnote across 6 anchor entries appended with primary-source re-verification note vs the ar5iv mirror of arXiv:2311.09735.

      • LLMS.txt

        Infrastructure

        Corrected the Mintlify blog post URL and announcement date. The Sources block referenced 'mintlify.com/blog/llmstxt' (404) with the date November 14, 2024; the actual post is 'Simplifying docs for AI with /llms.txt' at 'www.mintlify.com/blog/simplifying-docs-with-llms-txt', published November 20, 2024. Body sentence about the Mintlify platform-wide rollout date updated to match. Caught by the new site-wide link-audit pass across all 226 frontmatter source URLs.

      • Quotation Addition

        GEO content methods

        Claude citation confirmed for the Aggarwal vendor-research query ('What did Aggarwal 2023's GEO paper say about Quotation Addition as a content method?'). The entry surfaced as a primary cited source in Claude's desktop answer. First confirmed Claude citation for this entry.

      • Quotation Addition

        GEO content methods

        PAWC labeling sweep. Aggarwal footnote now explicitly labels the values as 'Table 1 main GEO-bench' and includes all 9 + baseline. Lede reframed from '~43% gain' to 'mathematically derived +42.6%' and notes paper's headline range is 30-40%. Paper's verbatim named top-3 (Cite Sources, Quotation Addition, Statistics Addition) now surfaced; Fluency Optimization's 3rd-place standalone PAWC position no longer presented as 'top-4 effective' because the paper itself does not name it in the top-3. Table 5 (Perplexity.ai) per-engine caveat added (different baseline 24.0, best method +22%).

      • Quotation Addition

        GEO content methods

        Primary-source re-verification of the cluster's shared Aggarwal Table 1 PAWC numbers + verbatim quotes against the ar5iv mirror of arXiv:2311.09735. All Table 1 PAWC values, Table 1 caption verbatim, Section 4 prose, the named top-3 quote, and Table 5 Perplexity per-engine numbers (including Keyword Stuffing 21.9 with paper prose 'performs 10% worse than the baseline') confirmed. Aggarwal footnote appended with the re-verification note; no body changes.

      • Statistical Density

        GEO content methods

        Claude citation confirmed for the Aggarwal vendor-research query ('What did Aggarwal 2023's GEO paper say about Statistics Addition as a content method?'). The entry surfaced as a primary cited source in Claude's desktop answer; it also surfaced as the cited source for the parallel Cite Sources Aggarwal query, a cross-entry signal.

      • Statistical Density

        GEO content methods

        PAWC labeling sweep. Aggarwal footnote now explicitly labels values as 'Table 1 main GEO-bench' and includes all 9 + baseline. Surfaces the paper's verbatim named top-3 (Cite Sources, Quotation Addition, Statistics Addition) at the 30-40% range, distinct from the standalone PAWC ranking that placed Fluency in the cluster's prior 'top-4' framing. Statistics Addition is in both rankings. Table 5 (Perplexity.ai) per-engine caveat added (different baseline 24.0, best method +22%).

      • Statistical Density

        GEO content methods

        Primary-source re-verification of the cluster's shared Aggarwal Table 1 PAWC numbers + verbatim quotes against the ar5iv mirror of arXiv:2311.09735. All Table 1 PAWC values, Table 1 caption verbatim, Section 4 prose, the named top-3 quote, and Table 5 Perplexity per-engine numbers (including Keyword Stuffing 21.9 with paper prose 'performs 10% worse than the baseline') confirmed. Aggarwal footnote appended with the re-verification note; no body changes.

      • Web Bot Auth

        Infrastructure

        Initial publish. Web Bot Auth is an emerging IETF-track standard for cryptographically verifying bot identity, built on RFC 9421 HTTP Message Signatures (Proposed Standard, Feb 2024) plus two active IETF drafts. Each request signed with the bot's Ed25519 key; verifier fetches the public key from /.well-known/http-message-signatures-directory as JWK Set. Cloudflare and Akamai are documented early implementers; most AI-engine crawler traffic remains unsigned as of 2026. Addresses the crawler-controllability gap, particularly for AI engines like Grok where observed traffic does not surface documented user agents.

      • Web Bot Auth

        Infrastructure

        Mechanism + adoption fact-check (same-day after publish). §How it works now lists three required headers (Signature-Agent, Signature-Input, Signature); Signature-Agent (not Signature-Input) carries the bot's key directory URL that the verifier reads to fetch the public key. Status in 2026 updated to the agent-first / crawler-later split: Google's experimental Google-Agent signs at agent.bot.goog and OpenAI's ChatGPT agent signs at chatgpt.com, while Googlebot, GPTBot, and OAI-SearchBot remain unsigned. Added IETF draft-meunier-web-bot-auth-architecture (Cloudflare + Google co-authored) and AWS Bedrock AgentCore implementer footnotes.
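
The header roles described in the Web Bot Auth revisions above can be sketched as a small presence check. This is a minimal illustration under stated assumptions, not a verifier: it only confirms that the three headers are present and derives a key directory URL from Signature-Agent, and it performs no signature verification. The function name, the example header values, and the assumption that Signature-Agent carries an optionally quoted https origin are hypothetical.

```python
from typing import Mapping, Optional
from urllib.parse import urlsplit

# Headers the entry lists as required on a signed request.
REQUIRED_HEADERS = ("Signature-Agent", "Signature-Input", "Signature")
# Well-known path named in the entry for the bot's JWK Set.
DIRECTORY_PATH = "/.well-known/http-message-signatures-directory"


def key_directory_url(headers: Mapping[str, str]) -> Optional[str]:
    """Return the key directory URL for a request, or None if it looks unsigned.

    Assumption: Signature-Agent holds an https origin, optionally quoted.
    No cryptographic verification happens here.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    if any(name.lower() not in lowered for name in REQUIRED_HEADERS):
        return None
    agent = lowered["signature-agent"].strip().strip('"')
    parts = urlsplit(agent)
    if parts.scheme != "https" or not parts.netloc:
        return None
    return f"https://{parts.netloc}{DIRECTORY_PATH}"
```

A request carrying only a User-Agent header returns None here, which matches the entry's observation that most crawler traffic is still unsigned and has to be handled by other means.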

    2. 6 revisions

      • Critical paper-misrepresentation fix. Entry previously framed the Authoritative result as a '+11.8% measurable modest lift'. But Aggarwal Section 4 verbatim: 'one would expect a more persuasive and authoritative tone can boost visibility. However, to the contrary we find no significant improvement.' The +11.8% PAWC is a raw number; the paper itself frames it as null and reports no p-values. Description, metaDescription, lede, Status, C-SEO Bench section, and FAQs rewritten to lead with the paper-verbatim null. Raw PAWC retained for transparency. Entry was misrepresenting peer-reviewed research; correction restores cluster discipline.

      • Initial publish. Codifies citation precision (74.5% baseline) and citation recall (51.5% baseline) as paired model-behavior metrics distinct from publisher-visibility metrics like attribution rate. Both grounded in Liu, Zhang, Liang 'Evaluating Verifiability in Generative Search Engines' (EMNLP Findings 2023, arXiv:2304.09848), a human audit of four engines. Entry covers framework, 2023 baseline, limitations (NeevaAI shut down 2023; Bing Chat renamed to Copilot), and how to add claim-alignment recording to citation probe protocol. Joins ai-behavior cluster as third anchor.

      • Same-day peer-review revision. Critical lede fix: asymmetry was reversed (precision 74.5% > recall 51.5% means engines were better at faithful citation than at citing every claim-bearing sentence). Critical fact fix: removed unverified Profound per-response density numbers; replaced with the study's actually-reported source-concentration patterns. Added Li & Sinnamon 2024 + Profound footnotes to Sources. Softened the precision-failure framing, precision-recall trade-off, and hallucination-grounding definition. NeevaAI dates made precise. 2x2 labels improved. Back-links added to attribution-rate, cite-ability, pillar.

      • Initial publish. Codifies the citation probe protocol the cluster has been using implicitly across the six citation-metrics anchors. Six protocol components: query design (fixed 10-query prompt set, 8-week freeze), cadence (weekly main, monthly aggregation, quarterly rotation), engine coverage (per-surface separate probes), recording schema, 3-axis disambiguation (SEARCH APPEARANCE / PAGES / QUERIES tabs in GSC), and signal-vs-noise rules (k-anonymity inference, N=1 hedging, cross-engine triangulation). Methodology cluster sibling to external-traffic-disambiguation.

      • Same-day revision: upstream/downstream boundary cleanup. §5 originally borrowed GSC's 3-axis tabs which are downstream click-side tools. Rewritten to cover three real probe-side needs: multi-surface sub-path attribution, source-list URL selection, query-reformulation tracking. §6 k-anonymity (GSC concept) removed; reformulation-noise rule added. FAQ #4 replaced with two-probe-different-result handling. Internal project codename removed. §4 verbatim and screenshot downgraded to recommended. §2 cadence framed as glossary default. 12-surface list inline-linked. SaaS-skip reframed for small vs enterprise.

      • Keyword Stuffing

        GEO content methods

        Initial publish. Documents the Aggarwal et al. 2023 GEO paper's flagship negative result: Keyword Stuffing scored PAWC 17.8 vs baseline 19.5 (-8.7%), the only one of 9 tested methods to fall below baseline. Paper verbatim: 'little to no performance improvement on Generative Engine's responses.' Joins geo-content-methods cluster as 6th Aggarwal method covered and the cluster's only negative-result entry. C-SEO Bench 2025 confirms the null/negative finding under multi-actor production-realistic conditions. Primary counter-evidence anchor against the SEO claim that keyword optimization transfers to generative engines.
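
The relative figures quoted in the Keyword Stuffing revisions are plain percentage changes against the baseline PAWC. A short sketch of that arithmetic follows, using only values recorded in this changelog; the function and variable names are illustrative.

```python
def relative_change_pct(method_pawc: float, baseline_pawc: float) -> float:
    """Percentage change of a method's PAWC relative to the baseline."""
    return (method_pawc - baseline_pawc) / baseline_pawc * 100.0


# Table 1 main GEO-bench: Keyword Stuffing 17.8 vs baseline 19.5.
table1 = relative_change_pct(17.8, 19.5)
# Table 5 Perplexity.ai: Keyword Stuffing 21.9 vs baseline 24.0.
table5 = relative_change_pct(21.9, 24.0)

print(f"Table 1: {table1:.1f}%")  # Table 1: -8.7%
print(f"Table 5: {table5:.2f}%")
```

The sketch only reproduces the raw arithmetic. For Table 5 the changelog records the paper's own prose as 'performs 10% worse than the baseline', and nothing here explains how the paper arrived at that wording.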

    3. 37 revisions

      • AI citation metrics

        Citation metrics

        Expanded the surface-family list from 8 to 12 anchors (adding Brave Search, Grok, DuckDuckGo AI, and Meta AI, all shipped today). Added a Cross-surface quick reference section: a comparison table across the 12 surfaces on four axes (index provenance, crawler discipline, default citation rendering, structurally unique feature) plus five cross-cutting observations. The section surfaces structural differences: DuckDuckGo's 1-2 source slot vs Perplexity's ~10-20 (10x tighter zero-sum competition), Grok's effectively unenforceable robots.txt, Meta's licensing-tiered citation behavior, and non-web source categories (Grok X posts, Meta Instagram/Facebook/Threads).

      • AI crawler bots

        Infrastructure

        Added Brave, DuckDuckGo (DuckAssistBot), and xAI (Grok) rows to the known-crawlers table. Brave runs its own crawler; DuckAssistBot drives DuckDuckGo's Search Assist and respects robots.txt; xAI documents GrokBot, xAI-Grok, and Grok-DeepSearch but independent research finds these are not observed in actual server logs while Grok retrieval traffic arrives from rotating datacenter / proxy IPs with spoofed Chrome and Safari UAs. Added a callout that Grok is the cluster outlier on crawler controllability: robots.txt-based exclusion of Grok requires WAF / network-level enforcement, not user-agent rules.

      • AI dev tool citations

        Citation surfaces

        Added an At-a-glance summary table at the top of the entry, matching the section structure used by sibling citation-surface entries. Load-bearing fact: AI dev tools do code-context retrieval, not web search; this is a structurally different measurement surface than consumer AI search, and most retrieval is against code repositories and workspace files rather than the open web. Per-tool variance is too high to aggregate into one AI dev tool citation rate; each tool needs its own probe configuration. Metadata dates synced to 2026-05-28.

      • AI Mode

        Citation surfaces

        Added an At-a-glance summary table at the top of the entry, matching the section structure used by sibling citation-surface entries. Load-bearing fact: AI Mode is Google's conversational counterpart to AI Overview, same Google index family, but the heavier fan-out query reformulation produces a different cited-URL set than AI Overview. Track AI Mode and AI Overview as separate surfaces, not a single Google AI rate. Glance surfaces Google-Extended scope (Gemini and Vertex AI training and grounding; not AI Mode eligibility). Metadata dates synced to 2026-05-28.

      • AI Overview citation

        Citation surfaces

        Added an At-a-glance summary table at the top of the entry, matching the section structure used by sibling citation-surface entries. Load-bearing fact: cannot opt out of AI Overview without de-indexing from Google Search. AI Overview uses Googlebot (the main Google index), not Google-Extended (which only controls Gemini and Vertex AI training and grounding). Glance also surfaces the cluster-unique property that citation match rate effectively equals attribution rate on this surface (every source-panel entry is linked). Metadata dates synced to 2026-05-28.

      • Brave Search AI citation

        Citation surfaces

        Initial publish. Brave Search is one of the few major search surfaces operating an independent index. AI citation on Brave is therefore a structurally distinct citation surface from Bing-grounded (Microsoft Copilot) or Google-family (AI Overview, AI Mode, Gemini). Four AI features tracked: AI Answers (concise summary + source references; evolved from 2023 Summarizer through April 2024 Answer with AI), Ask Brave (longer answers + chat + Deep Research, September 2025), Featured Snippets, AI-powered descriptions. Joins citation-surfaces cluster as its 9th anchor.

      • Brave Search AI citation

        Citation surfaces

        Same-day revision. Critical fact correction: IndexNow launched October 18, 2021 as a Bing and Yandex collaboration only (prior text incorrectly listed Seznam and Naver as founding partners); Naver joined July 2023. Now consistent with the IndexNow Protocol entry. Hedged the ChatGPT search retrieval-pipeline framing to match the chatgpt-search-citation entry. Softened the DuckDuckGo Bing-grounded characterization. Clarified that Featured Snippets is extractive (predates generative AI) and that AI-powered descriptions cite the single linked result rather than synthesizing multiple sources.

      • Brave Search AI citation

        Citation surfaces

        Follow-up cluster-consistency fix. Writing the new DuckDuckGo AI citation entry surfaced that DuckDuckGo's Search Assist actually runs DuckDuckGo's own DuckAssistBot crawler, not Bing-syndicated content (the Bing partnership applies to some legacy organic results, not AI surfaces). FAQ #2 here previously hedged DuckDuckGo as 'Bing-syndicated family'; corrected to explicitly classify DuckDuckGo Search Assist as own-crawler alongside Brave, removing the cluster's silent contradiction.

      • Brave Search AI citation

        Citation surfaces

        Added an At-a-glance summary table at the top of the entry, matching the section structure used by sibling citation-surface entries. Load-bearing fact: Brave runs the only independent search index at scale outside Big Tech; Bing and Google optimization do not transfer. A publisher with strong Bing or Google citation can have zero Brave citation if Brave's own crawler has not indexed them well. Brave is a separate indexing target.

      • ChatGPT search citation

        Citation surfaces

        Added an At-a-glance summary table at the top of the entry, matching the section structure used by sibling citation-surface entries. The table surfaces OAI-SearchBot (search retrieval) vs GPTBot (training) as independent toggles, both verbatim from OpenAI's developer bots documentation. Load-bearing fact: blocking GPTBot does NOT remove a site from ChatGPT search; OAI-SearchBot is the separate user agent that controls search-retrieval visibility. This is the most actionable single fact in the cluster on crawler-token disambiguation. Metadata dates updated to 2026-05-28 to match the revision.

      • Citation share

        Citation metrics

        Added a citation-slot-count calibration bullet to How to apply. Citation share competition is zero-sum within each surface's slot pool, but slot counts differ by an order of magnitude across the citation-surfaces cluster (DuckDuckGo Search Assist 1-2 sources vs Perplexity ~10-20+). A 22% share on a 4-slot surface is structurally tighter competition than a 22% share on a 20-slot surface; cross-surface benchmarking should normalize against slot pool size. Backport from the cross-surface quick reference added today to the AI citation metrics pillar.

      • Added a What-remains-contested bullet on licensing-tiered citation behavior, motivated by Meta AI's December 5, 2025 publisher licensing deals. The 2x2 taxonomy still describes the observable rendering, but Meta introduces a new sub-case where the same publisher moves between linked-citation and unlinked-mention cells based on commercial-partnership status (licensed partners receive linked attribution; non-licensed do not), not per-query rendering. First cluster anchor with this structure; flagged as an open question whether other engines with publisher deals (Perplexity, OpenAI, Google) develop similar visible tiering.

      • Cite-ability

        Umbrella terms

        Added the C-SEO Bench (arXiv:2506.11097, NeurIPS Datasets & Benchmarks 2025) counter-evidence anchor to the Aggarwal footnote. C-SEO Bench tested 7 of Aggarwal's 9 methods in multi-actor production-realistic conditions and found most largely ineffective or slightly negative on document ranking. The 2023 PAWC effect sizes remain valid for the single-actor synthetic testbed; C-SEO Bench sets the empirical upper bound on production generalization. Cite-ability as a content property remains useful as practitioner shorthand, but the framework's underlying lift estimates should now reference both papers.

      • Claude citation

        Citation surfaces

        Added an At-a-glance summary table at the top of the entry, matching the section structure used by sibling citation-surface entries. Surfaces ClaudeBot vs Claude-SearchBot vs Claude-User as three first-party documented Anthropic UAs with distinct roles. Load-bearing fact: Claude is a chat product, not search-first; citation only happens when the web search tool is invoked. Anthropic publishes first-party crawler documentation and respects robots.txt, contrasting with Grok (no contract). Metadata dates synced to 2026-05-28.

      • DuckDuckGo AI citation

        Citation surfaces

        Initial publish. Two citation-bearing surfaces under DuckDuckGo's AI umbrella: Search Assist (AI-generated answer above the SERP, formerly DuckAssist; launched 2023-03-08 as Wikipedia summarization, broadened in July 2024, rebranded in 2025) and Duck.ai (privacy-anonymized chat proxy to third-party models). Search Assist always links one or two sources beneath the summary and runs DuckDuckGo's own DuckAssistBot crawler. Duck.ai's citation behavior depends on the selected model (Claude / Llama / GPT / Mistral); DuckDuckGo's added value is anonymization, not its own citation discipline. Joins citation-surfaces cluster as its 11th anchor.

      • DuckDuckGo AI citation

        Citation surfaces

        Same-day revision. Fixed Anthropic model naming: 'Claude 4.5 Haiku' used the wrong ordering (Anthropic convention is tier-then-version), corrected to 'Claude Haiku 4.5'. Added staleness caveat to the Duck.ai model roster (lineup changes frequently; durable insight is the third-party-proxy architecture, not the specific 2026-05 menu). Clarified that Duck.ai model availability is not equivalent to Search Assist source attribution. Added a server-log referrer caveat in How to apply: a duckduckgo.com referrer does not prove Search Assist citation.

      • DuckDuckGo AI citation

        Citation surfaces

        Added an At-a-glance summary table at the top of the entry, matching the section structure used by sibling citation-surface entries. Load-bearing fact surfaces DuckDuckGo Search Assist's 1-2 source slot (the narrowest in the cluster), the resulting zero-sum citation-share competition at ~10x Perplexity's tightness, and the DuckAssistBot (not Bing) crawler dependency for AI surfaces specifically. Model menu uses a Status pointer per the template hygiene rule on volatile data.

      • DuckDuckGo AI citation

        Citation surfaces

        Added a Referrer-based detection by surface table to the Detection methodology section (backporting the pattern from the Perplexity citation entry). DuckDuckGo's privacy-redirect design means that a duckduckgo.com referrer is shared between Search Assist citation clicks and ordinary blue-link organic clicks, so Search Assist citation traffic is not distinguishable from regular DuckDuckGo organic traffic at the referrer level. Active probing remains the defensible measurement path.

      • Gemini citation

        Citation surfaces

        Added an At-a-glance summary table at the top of the entry, matching the section structure used by sibling citation-surface entries. Load-bearing fact: Gemini is Google's standalone chat product, same Google index family as AI Overview and AI Mode but a different surface with different grounding triggers. Unlike AI Overview, Gemini grounding via Search on Vertex AI is influenced by Google-Extended per Google docs. The opt-out controls differ across the three Google surfaces and should be tracked separately. Metadata dates synced to 2026-05-28.

      • Added the 2025 C-SEO Bench (arXiv:2506.11097, NeurIPS Datasets & Benchmarks 2025) counter-evidence anchor. C-SEO Bench directly tested 7 of Aggarwal et al.'s 9 methods in multi-actor production-realistic conditions and found most largely ineffective or slightly negative on document ranking, with traditional SEO outperforming all C-SEO methods. The 2023 PAWC effect sizes remain valid for the single-actor synthetic testbed; C-SEO Bench sets the empirical upper bound for what generalizes to production multi-actor conditions. Updates the umbrella effectiveness framing on the canonical GEO entry.

      • Grok citation

        Citation surfaces

        Initial publish. Grok is xAI's chat product with three citation-bearing answer surfaces: WebSearch (index-based retrieval), DeepSearch (multi-step research with visible reasoning trace plus native X integration), and the xAI API web_search tool (citations as structured response fields). Distinct from other AI citation surfaces because Grok pairs a general web index with native access to X (Twitter) posts as a first-class citation source, and because xAI's public crawler discipline is unusually opaque: documented user agents exist but observed retrieval traffic typically arrives without them. Joins citation-surfaces cluster as its 10th anchor.

      • Grok citation

        Citation surfaces

        Same-day revision. Body-vs-source fix: prior text said 'residential IPs', but the cited Stackfox source documents datacenter / proxy ASNs (M247 + Datacamp); the text is now hedged accordingly. Added DataDome corroboration (2025-12-11). Iran-headline reframed: Grok believed a fake account and generated a false headline about Iran attacking Israel nine days before the actual April 2024 Iranian strikes. Version timeline extended with Grok 4.2 Public Beta (February 17, 2026) and Grok 4 Fast (September 2025). Added web vs X-post citation separation, API vs UI non-aggregation, and an accuracy caveat: a citation indicates attribution, not factual correctness.

      • Grok citation

        Citation surfaces

        Added an At-a-glance summary table at the top of the entry, establishing the section structure later used by sibling citation-surface entries. The table compresses the load-bearing publisher facts (operator, index source, crawler discipline, observed traffic patterns, citation rendering, slot count, surfaces, current model versions) into a single scannable block and surfaces the entry's load-bearing fact in a final emphasized row. For Grok specifically, the load-bearing fact is that robots.txt cannot reliably block Grok and that enforcement must happen at the WAF or network layer rather than via user-agent rules.

      • Grok citation

        Citation surfaces

        Same-day template-hygiene fix. FAQ #3 still described observed retrieval traffic as 'rotating residential IPs' while the body and the new At-a-glance table had been corrected to 'rotating proxy / datacenter IPs (Stackfox M247 + Datacamp; DataDome describes residential, treated as observation-window variation)'. Synced FAQ #3 to match the body framing, eliminating the entry-internal contradiction. Also collapsed the At-a-glance 'Current model versions' row from a five-version list to a single-version pointer that references the Status section for the full timeline, which reduces duplication and the burden of keeping the summary box current.

      • Grok citation

        Citation surfaces

        Added a Referrer-based detection by surface table to the Detection methodology section (backporting the pattern from the Perplexity citation entry). Surfaces the measurement gap publishers face: grok.com chat citation clicks send a usable referrer, but X-integrated Grok citation clicks are indistinguishable from ordinary X navigation at the referrer level, and xAI API web_search citations are server-to-server with no Grok-attributable referrer at all. Server-log analysis therefore captures only the grok.com subset reliably.

      • Meta AI citation

        Citation surfaces

        Initial publish. Meta AI is Meta's Llama-family consumer assistant across WhatsApp, Instagram, Messenger, Facebook, meta.ai, Ray-Ban Meta smart glasses, and Quest VR; runs on Llama 4 (April 5, 2025). Structurally distinct from the 11 other cluster anchors: per Wikipedia citing the Washington Post, Meta AI has summarized news from outlets without linking to original articles since May 2024. Meta-ExternalAgent and Meta-ExternalFetcher respect robots.txt, but the consumer assistant does not consistently link citations the way ChatGPT search, Perplexity, Claude, or DuckDuckGo Search Assist do. Joins citation-surfaces cluster as its 12th anchor.

      • Meta AI citation

        Citation surfaces

        Thesis-level update. The prior version missed the December 5, 2025 publisher licensing deals (CNN, Fox, USA Today, People Inc, Daily Caller, Washington Examiner, Le Monde) that introduce linked attribution for partner content. Restructured to a two-tier model: licensed partners get linked citation; non-licensed and non-news queries remain summarize-without-inline-attribution. Added Instagram/Facebook/Threads public posts as a distinct citation source. MAU updated to ~1B (Q1 2025) / ~1.2B (2026). Cited the WaPo primary source directly instead of via Wikipedia. Added tier-aware probes, Meta-referrer caveat, accuracy caveat, backend-mix hedge.

      • Meta AI citation

        Citation surfaces

        Added an At-a-glance summary table at the top of the entry, matching the section structure used by sibling citation-surface entries. Load-bearing fact: Meta AI is the only cluster anchor where citation behavior depends on the publisher's commercial relationship with the vendor. The Citation tier optional row codifies the licensed-partner roster from December 5, 2025. Glance explicitly flags the news-citation behavior as actively evolving via publisher deals and to be re-checked quarterly.

      • Meta AI citation

        Citation surfaces

        Added a Referrer-based detection by surface table to the Detection methodology section (backporting the pattern from the Perplexity citation entry). Meta AI is the weakest passive-detection surface of any cluster anchor because of the in-app webview architecture across WhatsApp, Instagram, Messenger, and Facebook. Only meta.ai web sends a reliably distinguishable referrer; in-app surfaces produce referrers shared with ordinary social-link traffic, and voice / VR surfaces produce no publisher-visible click at all.

      • Added an At-a-glance summary table at the top of the entry, matching the section structure used by sibling citation-surface entries. Load-bearing fact: you cannot block Copilot retrieval without blocking Bing Search, because Copilot uses Bingbot (no separate Copilot-only crawler token). Conversely, optimizing for Bing Search and Copilot citation are the same lever; IndexNow upstream accelerates Copilot eligibility. Glance also surfaces the enterprise Microsoft 365 Copilot tenant-data scope as out-of-publisher-reach. Metadata dates synced to 2026-05-28.

      • Passage-level optimization

        Retrieval pipeline

        Added the C-SEO Bench (arXiv:2506.11097, NeurIPS Datasets & Benchmarks 2025) counter-evidence anchor to the Aggarwal footnote. The 2023 PAWC effect sizes for Quotation Addition (27.8 / ~43%), Statistics Addition (~33%), Fluency Optimization (~29%), and Cite Sources (~28%) remain valid for the single-actor synthetic testbed they were measured on, but C-SEO Bench tested 7 of the 9 methods in multi-actor production-realistic conditions and found most largely ineffective or slightly negative on document ranking. The PAWC numbers now read as an upper-bound effect rather than a production prediction.

      • Perplexity citation

        Citation surfaces

        Added an At-a-glance summary table at the top of the entry. This is the second instance of the section structure now used across the citation-surface entries; Perplexity is the opposite-pole stress test (widest source pool, declared crawler with documented stealth-crawler controversy, multi-API developer surface) and validates that the template handles a 'normal' surface alongside an outlier. Glance surfaces Perplexity's load-bearing publisher fact: allowing PerplexityBot in robots.txt is neither necessary nor sufficient for citation, per Cloudflare August 2025. Model-roster row uses a pointer into Status, per the hygiene refinement from the Grok prototype.

      • Perplexity citation

        Citation surfaces

        Same-day metadata + consistency sync. Metadata dates (updatedAt, citationLastChecked, lastFactChecked) moved from 2026-05-26 to 2026-05-28 to match the At-a-glance addition. Load-bearing fact refined to make the Perplexity vs Grok distinction explicit: Perplexity has a declared crawler contract with published IP ranges, and the dispute is over undeclared stealth crawling beyond it; Grok has no contract at all. Added a Comet rollout footnote citing TechCrunch / Bloomberg for Android (2025-11-20) and Perplexity changelog plus AI CERTs News for iOS (2026-03-18). Expanded the PerplexityBot footnote to include Perplexity-User verbatim wording.

      • Pillar content

        Search foundations

        Added the C-SEO Bench (arXiv:2506.11097, NeurIPS Datasets & Benchmarks 2025) counter-evidence anchor to the Aggarwal footnote. The 2023 PAWC effect sizes remain valid for the single-actor synthetic testbed but C-SEO Bench tested 7 of the 9 methods in multi-actor production-realistic conditions and found most largely ineffective or slightly negative on document ranking. The pillar / spoke architecture remains a useful editorial pattern, but its citation-lift estimates should reference both papers rather than treating 2023 PAWC numbers as a production prediction.

      • Added C-SEO Bench (arXiv:2506.11097, NeurIPS Datasets & Benchmarks 2025) counter-evidence to the 'How do I optimize content for RAG' FAQ. The 2023 Aggarwal effect sizes for Statistics Addition / Cite Sources / Quotation Addition / Fluency Optimization remain valid for the original single-actor synthetic testbed but C-SEO Bench tested 7 of the 9 methods in multi-actor production-realistic conditions and found most largely ineffective or slightly negative on document ranking. Sources block also expanded to include both papers directly.

      • Added C-SEO Bench (arXiv:2506.11097, NeurIPS Datasets & Benchmarks 2025) counter-evidence inline to the Aggarwal PAWC examples in How-to-apply. The 2023 effect sizes for Statistics Addition, Cite Sources, and Quotation Addition remain valid for the single-actor synthetic testbed but C-SEO Bench tested 7 of the 9 methods in multi-actor production-realistic conditions and found most largely ineffective or slightly negative on document ranking. The cite-able-fact writing discipline itself remains useful; the original PAWC numbers should now be cited as upper-bound measurements rather than production predictions.

      • Topic clusters

        Search foundations

        Added the C-SEO Bench (arXiv:2506.11097, NeurIPS Datasets & Benchmarks 2025) counter-evidence anchor to the Aggarwal footnote. The 2023 PAWC effect sizes for Quotation Addition, Statistics Addition, Fluency Optimization, and Cite Sources remain valid for the single-actor synthetic testbed but C-SEO Bench tested 7 of the 9 methods in multi-actor production-realistic conditions and found most largely ineffective or slightly negative on document ranking. The cluster pattern remains a useful editorial structure, but its citation-lift estimates should reference both papers.

    4. 14 revisions

      • AI citation metrics

        Citation metrics

        Initial publish. Pillar synthesis of the citation-metrics cluster (6 anchor entries: attribution-rate, citation-share, citation-match-rate, cite-ability, citation-velocity, citation-rotation). Organizes the 6 metrics along three axes (output ratios, input content property, temporal signals) and adds a decision matrix, gap analysis, and adoption sequencing playbook. Pillar length (~3000 words) intentionally exceeds the term-entry target.

      • AI citation metrics

        Citation metrics

        Same-day self-review fix. Corrected a metric-conflation error in 3 places (lede, attribution-rate section, FAQ #3) that mislabelled Ahrefs' AI-cited-URL-to-Google-top-10 overlap percentages (~29% Perplexity, ~7-8% ChatGPT/Gemini/Copilot) as 'attribution rate' measurements. The Ahrefs footnote was always accurate; the body had drifted to a shorthand that conflated the two metrics. Rewrote those passages to label the percentages correctly as cross-engine overlap data and to use them as illustrative evidence for the 'aggregation hides structure' point rather than as direct attribution-rate numbers.

      • AI citation metrics

        Citation metrics

        Same-day revision. Critical: reframed 'velocity = time-derivative of attribution rate' as 'temporal leading indicator' (different units; cluster-ripple to citation-velocity anchor); softened Aggarwal attribution from 'foundational paper formalizing GEO measurement framework' to 'starting point' (paper uses PAWC + Subscription Impression, not attribution rate). Substantive: decision matrix Q5 reworded, 'rewards content effort' softened, 'Two metrics typically anchor' qualified, composite-score claim adds vendor inline examples, velocity-vs-acceleration distinction explicit.

      • ChatGPT search citation

        Citation surfaces

        Initial publish. ChatGPT search citations are the source attributions from OpenAI's ChatGPT when its web search returns real-time content. ChatGPT Search launched October 2024 (Plus first), expanded to Free with account, then opened to all users without an account in February 2025. Five surfaces: chatgpt.com web, Desktop (Mac/Win), mobile (iOS/Android), ChatGPT Atlas browser (macOS, 2025-10-21), and OpenAI API web_search. Entry covers per-surface citation behavior, detection methodology, how to optimize, what to skip, and the measurement gap. Sibling to other citation-surface cluster entries.

      • ChatGPT search citation

        Citation surfaces

        Same-day fact correction. Initial publish set citationStatus.chatgpt to 'cited' based on the wrong logic (other entries on the site have been cited by ChatGPT). citationStatus is per-entry, not per-site: it tracks whether this specific URL has been observed cited by the engine. A brand-new entry on its publish day cannot have been cited yet (no probe done, ChatGPT's index has not crawled it). Corrected to 'untested' to match the baseline state for a new entry; will update to 'cited' or 'not-cited' after the first ChatGPT probe.

      • ChatGPT search citation

        Citation surfaces

        Same-day revision. Critical fact correction: 4 places said 'February 2026' but should be 'February 2025' (OpenAI removed the ChatGPT search account requirement on 2025-02-05). Second 'February + wrong year' error in this cluster (gemini-citation had 2026-02-08 vs 2024-02-08, a two-year slip where this entry's was one year). Sources block strengthened with three OpenAI canonical URLs (announcement + web search docs + bots docs). Several sentences hedged (FAQ Bing-primary backend, Desktop/mobile rendering, 'largest consumer AI' softened to 'one of the largest'). Atlas criticism rewritten from verbatim quote to paraphrase. Microsoft/OpenAI partnership footnote added.

      • Citation rotation

        Citation metrics

        Initial publish + same-day revision. Citation rotation is the temporal-stability dimension paired with citation-velocity. Literature uses several names: 'citation volatility' (industry default) and the inverse 'citation persistence'. References 3 adjacent arXiv papers (Answer Bubbles, News Source Citing Patterns, Attribution Gradients as adjacent HCI) plus industry sources (Digital Applied AI Overview study; 5W Citation Source Index 2026 consolidating ~680M citations; Leapd cross-engine brand-rate analysis; G2/Loamly Reddit citation decline coverage). 6th anchor of the citation-metrics cluster.

      • Citation velocity

        Citation metrics

        Cluster ripple from the ai-citation-metrics pillar peer review. Softened 'velocity = time-derivative of attribution rate' across 6 places (description, metaDescription, FAQ #1, body lede, territory note, How-it-relates section): the two have different units (ratio vs count per window), so one is not a strict mathematical derivative of the other. Velocity is now framed as 'temporal leading indicator paired with attribution rate', with an explicit hedge that velocity often moves before attribution rate does rather than being literally derived from it. No formula or operational discipline changed.

      • Initial publish. Codifies the citation / mention / link three-way disambiguation that the cluster has been using implicitly across 5+ entries (attribution-rate, citation-match-rate, citation-share, brand-mentions-in-ai-answers, cite-ability) without an explicit definition entry. Foundational taxonomy hub: the citation-metrics cluster anchors all inherit denominator decisions from this distinction; standalone entry lets future entries reference it consistently instead of restating the disambiguation each time.

      • Same-day revision. Critical: acknowledged geo.wiki prior-art (removed first-codifier overclaim); softened citation-as-grounding (citation granularity varies). Substantive: opening reframed to 'source / brand / entity / URL'; 2x2 matrix rendered as table; matrix wording fixed; attribution rate counting rule scoped to glossary's system; Perplexity claim hedged. Sources 1 -> 6 entries (Google / OpenAI / Anthropic / Perplexity / geo.wiki / Profound); 3 footnotes added; AI dev tool 4th dimension expanded; schema-affects-citation reframed as practitioner speculation.

      • Definition-Lead Style

        GEO content methods

        ChatGPT citation confirmed. A fresh ChatGPT search probe on 'How do I write the first paragraph of a glossary entry so AI engines can extract it cleanly?' returned this entry as the top source, with two phrases attributed to GEO Glossary inline ('clean, standalone answer block that can be lifted without needing context' and 'should understand the term from the first sentence alone'). Both are paraphrases consistent with this entry's framing. citationStatus.chatgpt: untested -> cited; other engines not yet probed. Fourth confirmed ChatGPT citation on the site; top-sourced above google.com/Machine Learning Glossary and Scribbr.

      • Initial publish. External traffic disambiguation is the practitioner-coined methodology for separating real external visitors from founder browsing, scrapers, AI training crawlers, and VPN edge artifacts in server logs. The 5-axis framework (foreign edge / cache state / path / UA-plus-referer / non-scraper UA pattern) was developed in-house across Days 8-13 of site operation (14+ confirmed visits across 7 countries). Entry codifies what was previously scattered across internal probe logs and editorial notes; referenced from indexnow-protocol and several evidence files. Glossary-coined practitioner shorthand; no vendor canonical exists.

      • Same-day revision. Body and prior changelog entries rewritten to remove project-internal voice: version-codename references replaced with their observable meaning, process vocabulary replaced with the underlying actions, internal repository path references removed. Five hedging refinements: AI training crawlers narrowed to known/self-identifying crawlers; threshold rule labeled operational not statistical; Axis 5 mobile UA hedged; Axis 4 referer hierarchy marked site-specific; citation-rate-denominator tightened to traffic-derived reach metrics. Sources expanded from 2 to 6 entries.

      • ChatGPT citation confirmed. A fresh ChatGPT search probe on 'What is external traffic disambiguation in AI search analytics?' returned this entry as the top source, with the description paraphrased inline and attributed to GEO Glossary. citationStatus.chatgpt: untested -> cited; other engines not yet probed. Day-15 sweep also surfaced definition-lead-style as a same-day ChatGPT citation. Top-sourced 2 days after publish; third confirmed ChatGPT citation on the site and fastest publish-to-cited interval to date. Consistent with the pattern that practitioner-coined terms in low-competition territory attract citations.

    5. 9 revisions

      • AI Mode

        Citation surfaces

        Cluster template polish (no fact rewrite needed; core content already strong). Added Gemini 3.5 Flash backend reference (hedged). Added Detection methodology table for visual cluster consistency. Expanded What remains contested from 3 to 5 bullets (noreferrer-fix completeness + fan-out distinguishability). Lede expanded with explicit cross-cluster sibling links. Body-vs-cluster consistency fix: Bing AI Performance dashboard date updated from 2026-02-09 to 2026-02-10 (matches microsoft-copilot-citations entry).

      • AI Overview citation

        Citation surfaces

        Full rewrite to match the citation-surface cluster template. Critical SEO-myth fix: previous FAQ asserted 'E-E-A-T-style author markup helps for YMYL topics'; Google's official documentation states verbatim 'no special schema.org structured data that you need to add.' Rewritten to align with the e-e-a-t-ai-search entry. Added Google verbatim no-special-optimization quotes, AI Overview timeline (SGE 2023 to ~48% query coverage March 2026), 4-surface Detection methodology table, What remains contested section, and explicit 3-surface Google ecosystem disambiguation (AI Overview / AI Mode / Gemini citation).

      • Claude citation

        Citation surfaces

        Initial publish. Claude citations are the source attributions produced by Claude's web search tool across consumer surfaces (claude.ai, Desktop, mobile, Claude Code) and the Anthropic API. Web search announced 2025-03-20 with global rollout 2025-05-27. Entry covers per-surface citation behavior, the API citation schema (web_search_result_location with url / title / cited_text), how to optimize, what to skip, and the measurement gap. Sibling to perplexity-citation / microsoft-copilot-citations / ai-overview-citation / ai-mode / ai-dev-tool-citations in the citation-surface cluster.

      • Claude citation

        Citation surfaces

        Same-day revision. Dates re-verified against the Anthropic blog post (March 20 2025 + May 27 2025 both confirmed). Footnote 1 URL updated to the canonical docs.anthropic.com path. Six overconfident sentences hedged (Desktop / mobile rendering, Claude Code appearance, two-tool-version equivalence, month-1 tool-spend tradeoff, cross-engine divergence, ClaudeBot purpose disclosure). Contested bullet on ClaudeBot necessity expanded with two-scenario contrast (partner-index vs proprietary-index dominance). Wikipedia source removed from the Sources block.
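The API citation schema named in the Claude citation entries above (web_search_result_location with url / title / cited_text) lends itself to simple extraction. A minimal Python sketch, assuming the API response has already been parsed into a dict whose `content` list holds text blocks with an optional `citations` list; the helper name and the sample payload are hypothetical, and the live response shape should be checked against Anthropic's documentation before relying on it:

```python
from typing import Any


def extract_web_citations(response: dict[str, Any]) -> list[dict[str, str]]:
    """Collect url / title / cited_text from web_search_result_location citations."""
    found: list[dict[str, str]] = []
    for block in response.get("content", []):
        # Only text blocks are assumed to carry citations in this sketch.
        if block.get("type") != "text":
            continue
        for citation in block.get("citations") or []:
            if citation.get("type") == "web_search_result_location":
                found.append({
                    "url": citation.get("url", ""),
                    "title": citation.get("title", ""),
                    "cited_text": citation.get("cited_text", ""),
                })
    return found


# Hypothetical payload for illustration only; not a recorded API response.
sample_response = {
    "content": [
        {"type": "text", "text": "No citation on this block."},
        {
            "type": "text",
            "text": "A cited sentence.",
            "citations": [
                {
                    "type": "web_search_result_location",
                    "url": "https://example.com/entry",
                    "title": "Example entry",
                    "cited_text": "A cited sentence from the page.",
                }
            ],
        },
    ]
}

citations = extract_web_citations(sample_response)
for c in citations:
    print(c["url"], "|", c["title"])
```

Keeping the three fields together per citation makes it straightforward to count how often a given URL is cited, which is the per-entry signal the glossary's citationStatus tracks.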

      • Gemini citation

        Citation surfaces

        Initial publish. Gemini citations are the source attributions from Google's Gemini chatbot (gemini.google.com, Android, iOS) and the Gemini API with real-time web grounding. The API uses the google_search tool, returning groundingMetadata (groundingChunks with uri / title, groundingSupports, webSearchQueries). Distinct from AI Overview and AI Mode (Google Search surfaces, not Gemini app). Sibling to perplexity-citation / claude-citation / microsoft-copilot-citations / ai-overview-citation / ai-mode / ai-dev-tool-citations in the citation-surface cluster.

      • Gemini citation

        Citation surfaces

        Same-day revision. Critical fact correction: body had two instances of '2026-02-08' that should be '2024-02-08'; footnote was already correct, body now matches. Verbatim Google quote 'reduce model hallucinations...' now has its own footnote with the ai.google.dev source URL. Five overconfident sentences hedged (separately-developed pipelines, Be-in-Google-Search prerequisite, Google-Extended training-vs-grounding distinction, cross-engine substantial-variance, model-version list specificity). Vertex AI section expanded; Deep Research mode added to the surface list. Wikipedia sources removed from the Sources block.

      • Full rewrite to match the citation-surface cluster template (Claude / Gemini / Perplexity citation) and to fix a fact-level inconsistency with the IndexNow Protocol entry on the same site. Previous lede claimed all Copilot surfaces share a Bing-grounded pipeline; corrected per Microsoft Learn: M365 Copilot grounds primarily in Microsoft Graph tenant data, M365 Copilot Chat in public web, and only the consumer / Bing / Edge / Windows Copilot surfaces are uniformly public-web-grounded. Added a six-surface detection methodology table, a 'What remains contested' section, verbatim Microsoft Learn quotes, and a 2026 surface-list refresh.

      • Perplexity citation

        Citation surfaces

        Initial publish. Perplexity citations are the numbered source attributions displayed in Perplexity AI-search answers across the web app (perplexity.ai, public launch 2022-12-07), mobile apps, the Comet browser (July 2025 premium, October 2025 free), and the Sonar developer API with four model variants per docs.perplexity.ai. Entry covers per-surface citation behavior, how to optimize for inclusion in the source pool, what to skip, and the measurement gap between observable citation events and full reach. Sibling to microsoft-copilot-citations / ai-overview-citation / ai-mode / ai-dev-tool-citations in the citation-surface cluster.

      • Perplexity citation

        Citation surfaces

        Same-day revision. Completed the Comet timeline (Android Nov 20 2025, iOS Mar 18 2026, plus the CometJacking August 2025 security disclosure). Added a contested-section bullet for the 2024 Wired and 2025 Cloudflare research on non-PerplexityBot fetch activity that Perplexity has disputed. Added a per-surface detection methodology table mirroring the ai-dev-tool-citations cluster pattern. Hedged five overconfident vendor-attribution sentences. Reworked the PerplexityBot section so allowing the declared crawler is neither necessary nor sufficient for fetch. Added a verbatim PerplexityBot footnote.

    6. 6 revisions

      • AI dev tool citations

        Citation surfaces

        Initial publish: AI dev tool citations describe a 2024-2025 emerging surface category where AI-assisted developer environments (Cursor, Windsurf, Claude Code, Replit Agent, Bolt, Lovable, GitHub Copilot Chat) cite web content when grounding answers to developer questions. Distinct from general AI search citations (audience is developers, questions are technical) and from autocomplete-style IDE plugins (chat-driven, not inline-suggestion-driven). The category is glossary-coined practitioner shorthand; no vendor canonical name exists for it yet.

      • AI dev tool citations

        Citation surfaces

        Same-day revision. The initial publish wrongly attributed a cursor.com Vercel referrer to a Cursor IDE AI-citation. Cursor is an Electron desktop app whose chat citation links open via the system browser without a Referer header; the referrer was a regular web visit. Same-domain section now reads as a measurement note; detection matrix added. Several overclaims hedged (MCP standardization, source-pool framing, AEO transferability, ChatGPT-Bing routing). Windsurf attribution updated (Cognition AI acquired Codeium in 2025). 'What remains contested' section added.

      • AI dev tool citations

        Citation surfaces

        Small wording polish. The ChatGPT Search FAQ leads with what is documented (OpenAI's web search and retrieval systems, with precise routing across Bing, OpenAI's own crawl, and partnerships not vendor-documented) rather than negating a routing assumption. The 'What remains contested' Cursor entry briefly describes where Cursor's internal browser appears (preview, docs-lookup, help contexts) so readers can calibrate the edge case. Redundant Brave Search description removed from 'Be indexed in AI-friendly search APIs first'; Brave's role is already covered earlier in that section. Brave Search API added to sources.

      • IndexNow Protocol

        Infrastructure

        Initial publish: IndexNow is an open URL notification protocol launched 2021-10-18 by Microsoft and Yandex. As of 2026 it is supported by Bing, Yandex, Naver, Seznam, and Yep; Google has tested it since 2021 but has not adopted it. The entry documents protocol mechanics, current participants, the critical caveat that IndexNow is a notification, not a guarantee, and a same-domain practitioner illustration (N=1, 13 days).

      • IndexNow Protocol

        Infrastructure

        Same-day revision. Critical fact corrections: key spec is 8-128 alphanumeric characters plus hyphens, not 32-128 hex (32-hex is just Bing's generator default); Bing direct endpoint is www.bing.com/indexnow, not api.bing.com. Methodology revision: the same-domain observation is now framed as illustration of the protocol's documented interface-vs-scheduler separation, not independent evidence of trust gating (N=1 over 13 days does not isolate trust gating from new-domain cold-start, integration bugs, or other alternative causes). New 'What remains contested or unverified' section added.

      • IndexNow Protocol

        Infrastructure

        Small peer-review follow-up fixes. The Yandex observation now notes that 4 of the 5 disambiguation axes were applicable (edge cache state was not separately verified). Footnote citation updated against the official IndexNow participants registry (searchengines.json, accessed 2026-05-25), which lists seven participants: the five consumer search engines plus Internet Archive and Amazonbot. The consumer search engine roster has been stable since 2024 per public IndexNow updates. Cross-page back-links added for AEO, GEO, AI Search Optimization, AI Mode, and AI Overview citation.
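The corrected key rule from the IndexNow same-day revision above (8-128 characters, alphanumerics plus hyphens, with 32-character hex being only Bing's generator default) can be checked mechanically. A minimal Python sketch; the function name and sample keys are hypothetical, and the pattern encodes only the rule as stated in that revision:

```python
import re

# 8 to 128 characters drawn from a-z, A-Z, 0-9, and hyphen.
INDEXNOW_KEY_PATTERN = re.compile(r"[A-Za-z0-9-]{8,128}")


def is_valid_indexnow_key(key: str) -> bool:
    """Return True if the whole key matches the stated length and character rule."""
    return INDEXNOW_KEY_PATTERN.fullmatch(key) is not None


# Hypothetical sample keys for illustration.
samples = {
    "0123456789abcdef0123456789abcdef": "32-character hex, the generator-default style",
    "my-site-key-2026": "alphanumerics plus hyphens",
    "short": "fewer than 8 characters",
    "has_underscore_key": "underscore is outside the allowed set",
}

for key, note in samples.items():
    print(f"{key!r}: valid={is_valid_indexnow_key(key)} ({note})")
```

Using `fullmatch` rather than `search` matters here: a key that merely contains a valid run of characters, such as one with a stray underscore, should still be rejected.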

    7. 5 revisions

      • Agentic retrieval

        Retrieval pipeline

        First ChatGPT citation. A 2026-05-24 ChatGPT response on the topic of agentic retrieval examples surfaced this entry in its sources panel alongside WIRED and several major AI-search SaaS publications, with an inline body citation marker attributing a paraphrase to this entry. citationStatus.chatgpt: not-cited -> cited; 1 of 5 engines now cited. 11 days from publish to first ChatGPT primary citation.

      • AI Mode

        Citation surfaces

        Initial publish: AI Mode is Google's conversational AI search surface accessed via a dedicated Search tab; powered by Gemini and launched March-May 2025. The measurement story in 2026 is that AI Mode clicks bundle into Google Search Console Web Search totals (per Google's own Search Central documentation) but have no separate breakdown row, unlike AI Overview which has its own Search Appearance row. Cross-references the AI Overview entry as the SERP-panel sibling for shared Gemini-grounding context; working assumption framework applied for the no-breakdown measurement territory.

      • AI Mode

        Citation surfaces

        Major fact-check correction after peer review. Earlier text claimed AI Mode citations are completely invisible to GSC and proposed a 'joint signature' inference method, both contradicting Google Search Central documentation: AI Mode clicks are bundled into GSC Web Search totals (no separate breakdown). Status and How-to-apply rewritten around the actual gap (data present without attribution); launch timeline corrected from 2024-2025 to March 2025 Labs / May 2025 US / August 2025 global; SEL footnote date corrected to 2025-05-22 with the noreferrer-fix update note. Google Search Central docs added to Sources; backend-sharing claims hedged.

      • Citation share

        Citation metrics

        Added cite-ability to related terms. The two are tightly coupled (cite-ability is the property that a healthy citation share signals on the measurement side), and the back-link strengthens the citation-metrics cluster bidirectional graph; cite-ability's related terms list was updated in parallel to point forward to citation-share.

      • Cite-ability

        Umbrella terms

        Related terms expanded from 3 to 7 to cover the full citation-metrics cluster (attribution-rate, citation-match-rate, citation-velocity, citation-share). The metrics already linked back to cite-ability as the property they measure, but cite-ability did not list any of them forward, leaving the bidirectional graph asymmetric. The expansion brings cite-ability in line with the cluster-completeness pattern other practitioner-coined anchor entries follow.

    8. 16 revisions

      • Initial publish as honest-reporting entry: the SEO folk wisdom that authoritative tone is a primary AI-citation lever is not supported by Aggarwal et al. 2023's benchmark. The paper ranked 'Authoritative' tone 7th of 9 methods (PAWC 21.8 vs baseline 19.5, +11.8%), well below the top four methods (Quotation Addition ~43%, Statistics Addition ~33%, Fluency Optimization ~29%, Cite Sources ~28%). The entry recalibrates expectations: authoritative tone has real value for human readers and editorial credibility, but is a modest lever in the only public empirical benchmark on AI citation behavior; do not treat it as a primary citation strategy.

      • Post-publish revisions. Recalibrated framing: 'does not hold up' softened to 'does not match this benchmark; +11.8% is a measurable modest lift' (paper does show positive lift). Repaired self-contradiction: entry critiqued 'compound effect' rhetoric but used same inference itself; relabeled compounding claims as editorial inference, not paper findings. Corrected E-E-A-T description (it's a quality framework, not officially a ranking factor). Added absolute-vs-relative-gain clarification so readers don't misread ranking gap. PAWC defined inline. Cut off-topic Keyword Stuffing bullet; consolidated §What to skip. Full author attribution updated.

      • Cluster completeness pass. Added FAQ explaining that 'Authoritative' (paper-canonical) and 'Authoritative Statement Strength' (glossary entry name) refer to the same finding, matching the naming-clarification pattern used in Cite Sources Optimization. Aligned the 5.5% combination wording with sibling entries ('more than 5.5%' with §5.3 reference, not '+5.5%'). Added 'Several editorial hypotheses (not paper-derived)' label to the why-folk-wisdom-diverges FAQ so unsupported speculation is marked as such. Related terms now lists passage-level-optimization and generative-engine-optimization (cluster cross-link completeness).

      • Cite Sources Optimization

        GEO content methods

        Initial publish: Cite Sources Optimization is one of the four top-performing methods in Aggarwal et al. 2023's GEO paper (PAWC 24.9 vs baseline 19.5, ~28% relative gain). Sibling entry of the Content-discipline cluster anchored by quotation-addition; reuses the cluster's territory framing (Sub-variant B), glossary-coined practitioner-extension flag, counter-evidence anchor (2023 testbed not yet replicated on 2026 commercial AI engines), and mitigates-not-eliminates frame. Distinguishes the paper's one-shot LLM-prompted intervention from the practitioner discipline of citing claims as a habitual writing technique.

      • Cite Sources Optimization

        GEO content methods

        Post-publish revisions. Added verified Ahrefs March 2026 citation footnote to the showcase example (an entry teaching citation discipline should cite its examples). Corrected combined-intervention framing: paper tested all pairwise top-4 combinations on a 200-example subset (Figure 4); Fluency + Statistics was the best of those measured pairs. Description and body opener rewritten as fluent multi-sentence prose. Clarified the relationship between the paper's 'Cite Sources' and this entry's 'Cite Sources Optimization'. Softened several mechanism-level wordings; added a note that internal cross-links are navigation aids, not authority citations. Full author attribution updated.

      • Cite Sources Optimization

        GEO content methods

        Tightened How-to-apply by removing the Pair-with-Quotation-Addition rule, which belongs in How-it-relates rather than as a writing rule. Strengthened the citation-stuffing caveat with the paper's actual Keyword Stuffing data point (PAWC 17.8 vs baseline 19.5, -8.7%) instead of pure inference. Restored 'The paper measured them separately' to the Quotation Addition relationship description so readers don't conflate the two methods. Ahrefs prior-study reference hedged from a specific dataset claim to 'earlier Ahrefs analysis'.

      • Definition-Lead Style

        GEO content methods

        Initial publish: Definition-lead style is the writer discipline of opening an answer block (term entry, FAQ, section) with a complete self-contained definition. Roots in inverted-pyramid journalism + extractive QA tradition (Rajpurkar et al. 2016 SQuAD). Empirical evidence for its specific effect on 2026 AI engine citation rate is indirect: extractive QA shows machines can extract clean answer spans, but modern RAG combines retrieval with generation rather than pure span extraction. Treat as a readability + chunk-robustness writer habit, not a primary citation lever.

      • Definition-Lead Style

        GEO content methods

        Post-publish revisions. Reframed body opener to lead with the inverted-pyramid analogy (avoids the irony of a 'lead once' entry repeating the definition across description + lede + body). Added single-sentence working-assumption takeaway. Replaced strawman example with realistic cite-ability before/after; cut two body-redundant FAQ items. PAWC defined inline. Softened the folk-wisdom '2026 GEO guides' claim. Acknowledged this glossary's answer-block concept is practitioner-coined, not external. SQuAD density reduced; BERT added to Sources. §How to apply shifted from descriptive to imperative voice.

      • Definition-Lead Style

        GEO content methods

        Cluster cross-link and reader-clarity pass. Replaced the undefined 'Schema and Answer-block cluster' label with self-explanatory content (answer block, sub-passage extraction, passage-level optimization) so readers do not need to know internal taxonomy. FAQ #1 reworded so the 'two reasons' answer is self-contained when extracted as a standalone snippet (named both reasons before walking through each). Added inline link from the '~28% to ~43% lift' range to the Aggarwal anchor entry so readers can verify the number without a separate footnote.

      • Fluency Optimization

        GEO content methods

        Initial publish: Fluency Optimization is one of the four top-performing methods in Aggarwal et al. 2023's GEO paper (PAWC 25.1 vs baseline 19.5, ~29% relative gain). Sibling entry of the Content-discipline cluster anchored by quotation-addition; reuses the cluster's territory framing (Sub-variant B), glossary-coined practitioner-extension flag, counter-evidence anchor (2023 testbed not yet replicated on 2026 commercial AI engines), and mitigates-not-eliminates frame. Highlights the paper's strongest combined-intervention finding: Fluency Optimization plus Statistics Addition outperforms any single GEO method by more than 5.5%.

      • Fluency Optimization

        GEO content methods

        Post-publish revisions. Corrected combined-intervention framing: paper tested all pairwise top-4 combinations on a 200-example subset (Figure 4, §5.3); Fluency + Statistics was the best of the measured pairs. Added verified Ahrefs March 2026 citation footnote. Description and body opener rewritten from a colon-then-participle construction to fluent multi-sentence prose (an entry on fluency should itself read cleanly). Opener precision improved to prevent extrapolation to universal 2026 ranking. Noted that the paper's method (LLM rewriting) is not the recommended practitioner technique, since LLM rewrites flatten voice; the discipline is to write fluently from the first draft. Full author attribution updated.

      • Fluency Optimization

        GEO content methods

        Cluster completeness pass. Related terms now includes Authoritative Statement Strength (the 5th Content-discipline cluster member) and Generative Engine Optimization (umbrella term). Added inline note at the first Statistical Density mention clarifying that 'Statistical Density' is the glossary entry name for what the Aggarwal paper calls 'Statistics Addition'. Added a How-it-relates bullet distinguishing fluency (form) from authoritative tone (content positioning) so readers know the two are different interventions.

      • Quotation Addition

        GEO content methods

        Initial publish: Quotation Addition is the top-performing source-content method in Aggarwal et al. 2023's GEO paper (PAWC 27.8 vs baseline 19.5, ~43% gain). Anchor entry of the Content-discipline cluster: establishes territory framing, glossary-coined practitioner-extension flag, counter-evidence anchor, and mitigates-not-eliminates frame for sibling reuse. Flags that the 2023 GPT-3.5-turbo testbed has not been replicated on 2026 commercial AI engines by any public study.

      • Quotation Addition

        GEO content methods

        Post-publish revisions. Corrected combined-intervention claim: paper isolated Fluency + Statistics as strongest pairing, not Quote + Cite Sources; tested all top-4 pairwise combinations on a 200-example subset (Figure 4 heatmap, §5.3). Reframed front-loading discipline using Liu et al. 2023 'Lost in the Middle' anchor. Karpukhin example URL switched to the EMNLP version that matches the verbatim 'greatly' quote (arXiv abstract uses 'largely'); added Ahrefs March 2026 verified example. Math correction: PAWC 27.8 vs 19.5 is ~43%, not ~41%. Full author attribution updated. Various wording softenings to avoid implying AI engines have specific verification mechanisms.

      • Quotation Addition

        GEO content methods

        Restored four substantive elements after the trim pass over-corrected: Aggarwal verbatim quote in Status section (anchor on Quotation Addition should demonstrate Quotation Addition); FAQ #5 nuance that quotation density varies with content type; naming-convention footnote noting Quotation Addition and Quotation Addition Optimization refer to the same discipline; Liu et al. 2023 added to Sources block since the body cites it. Inline-linked Cite Sources and Fluency Optimization at first body mention.

      • Statistical Density

        GEO content methods

        Cluster template alignment pass. Corrected the top-PAWC ranking (Top 3 by PAWC is Quotation Addition / Statistics Addition / Fluency Optimization, with Cite Sources at #4); the prior text had Cite Sources and Fluency swapped. Fixed 'best single intervention' wording (a combination is not a single intervention) and tightened the §5.3 reference to include Figure 4 and the 200-example subset. Added the standard 2026-commercial-engines replication caveat and a Working assumption sentence matching sibling cluster entries. Inline-linked Quotation Addition, Cite Sources, and Fluency Optimization at first body mention. Ahrefs example updated to the precise 37.9% with a full source footnote. Related terms expanded from 3 to 9 to include the PAWC cluster siblings, Definition-Lead Style, and answer-block. Sources block corrected to list all four Aggarwal institutions (the footnote already had them; the sources block was out of sync). H1 title case normalized; lastFactChecked refreshed.
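
The relative-gain figures quoted across the entries in this group can be re-derived from the PAWC scores they cite. The sketch below is illustrative arithmetic only, assuming the scores and the 19.5 baseline exactly as quoted in this changelog; the function names are hypothetical and the paper's own tables remain the authority.

```python
BASELINE_PAWC = 19.5  # no-modification baseline as quoted in the entries above

# PAWC scores exactly as quoted in this changelog group (not re-checked against the paper here).
PAWC = {
    "Quotation Addition": 27.8,
    "Fluency Optimization": 25.1,
    "Cite Sources": 24.9,
    "Authoritative": 21.8,
    "Keyword Stuffing": 17.8,
}


def absolute_gain(score, baseline=BASELINE_PAWC):
    """Gain in PAWC points over the baseline."""
    return round(score - baseline, 1)


def relative_gain(score, baseline=BASELINE_PAWC):
    """Gain as a percentage of the baseline."""
    return round((score - baseline) / baseline * 100, 1)


for method, score in PAWC.items():
    print(f"{method}: {absolute_gain(score):+.1f} points, {relative_gain(score):+.1f}%")
```

Reading both columns together is the absolute-vs-relative distinction the entries call out: a gap of a few PAWC points looks much larger once expressed as a percentage of a 19.5 baseline.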

    9. 10 revisions

      • Attribution rate

        Citation metrics

        Added an explicit practitioner-coined self-flag to the first paragraph, making the conceptual ownership claim visible rather than implicit. The term combines existing English words ('attribution' + 'rate'), but the operational definition (response counting + denominator choices) is first-party-conceptualized for the GEO measurement context, with no vendor or academic standard. The parenthetical 'distinct from traditional marketing attribution' remains as the precedent-acknowledgment caveat.

      • Status section compressed from ~470 words to ~170. The original had drifted to a multi-paragraph Ahrefs research summary (May 2025 vs Dec 2025 comparison, cross-engine correlation breakdown, ChatGPT-specific subnote) that read more like an Ahrefs digest than a sibling-pattern Status section. Compressed to: framing paragraph + one paragraph with the Dec 2025 lead finding (YouTube mentions 0.737), the 2 key Ahrefs cautions, and the Mike King relevance principle. Detail preserved in footnotes [^ahrefs-mentions] and [^ahrefs-mentions-dec]; no factual claims removed.

      • Citation match rate

        Citation metrics

        Added explicit practitioner-coined self-flag to the first paragraph: like attribution rate, citation match rate is not defined in any vendor or academic literature; the linked-vs-unlinked operational distinction was crystallized by GEO measurement practitioners. Makes the practitioner-coined conceptual ownership visible rather than relying on the implicit 'refinement of attribution rate' framing alone.

      • Citation share

        Citation metrics

        Added explicit practitioner-coined self-flag to the second paragraph, alongside the existing 'analog of share of voice' precedent acknowledgment. Vendor and academic literature do not define this exact AI-search operationalization; Profound, Otterly, and similar tools each implement slightly different internal definitions. Makes the practitioner-coined conceptual ownership explicit alongside the precedent-distinguishing prose that was already there. This entry attracted 2 confirmed external Google-search clicks shortly after launch, making the explicit conceptual ownership claim especially load-bearing.

      • Citation velocity

        Citation metrics

        Initial publish. Citation velocity extends the citation-measurement cluster (citation-share, citation-match-rate, attribution-rate, cite-ability, brand-mentions-in-ai-answers) with a temporal-dimension metric. The concept inherits from academic bibliometrics (Garfield 1955; ISI Science Citation Index lineage). Vendor blogs (UltraScout, Rankeo, Steakhouse) have shipped related metrics under similar names; the entry's distinct contribution is the per-engine + per-query-set + novelty-typology operational discipline that those vendor framings underspecify.

      • Citation velocity

        Citation metrics

        Major correction: the original draft conflated velocity (raw new-citation count per window) with acceleration (change in velocity); formula and framing now correct. Tightened Garfield 1955 body framing to match footnote precision. Reorganized Sources: Garfield 1955, UltraScout, Rankeo promoted from footnotes to Sources block. The 2-to-6-week lead-time framing softened to 'monitoring hypothesis, not benchmark'. The cite-ability connection softened from 'tends to drive' to 'may be associated with'. Replaced binary new/persistent split with a 5-fold typology (first-seen / new / recovered / persistent / lost).
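
The velocity-versus-acceleration distinction in this correction can be shown with a few lines of arithmetic. This is a minimal sketch assuming velocity is supplied as a raw count of new citations per window, as the entry describes; the function names and the weekly counts are hypothetical and not taken from the glossary's data.

```python
def velocity(new_citations_per_window):
    """Velocity: the raw count of new citations observed in each window."""
    return list(new_citations_per_window)


def acceleration(new_citations_per_window):
    """Acceleration: the change in velocity between consecutive windows."""
    v = velocity(new_citations_per_window)
    return [later - earlier for earlier, later in zip(v, v[1:])]


# Hypothetical new-citation counts for four consecutive weekly windows.
weekly_new = [2, 5, 5, 9]

print("velocity:", velocity(weekly_new))
print("acceleration:", acceleration(weekly_new))
```

In this made-up series the second and third windows have identical velocity, so acceleration between them is zero even though new citations are still arriving, which is the conflation the correction removed.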

      • Cite-ability

        Umbrella terms

        Status section tightened. The reflexive-trajectory paragraphs (4x longer than sibling Status sections, with hard 2026-05-16 / 2026-05-17 dates and cross-project comparisons) were compressed into one paragraph that preserves the self-demonstration but ages better. Hard dates and exact 0-of-5 / 1-of-5 counts moved to changelog entries above; broader 'May 2026' and 'roughly a week' framing left in body. The cross-project WriterAI 6-week comparison and Wiktionary / Wordnik / YourDictionary specifics were removed as meta-commentary that belongs in research/citations evidence files, not in a reference entry's body. No factual claims changed.

      • DefinedTerm schema

        Schema cluster

        A Perplexity context-aware probe run shortly after publish cited this entry as a primary source. Source position #1 of 10; body inline citation marker '(aisearchglossary +2)' attributes a key passage about DefinedTerm as a 'specific, definable terms, concepts, or frameworks' marker to this page. Entry URL precision is partial: the answer body says 'look for DefinedTerm schema on https://aisearchglossary.com' using the root domain rather than the specific /terms/defined-term-schema URL, but the source panel and inline citation marker both anchor to our domain at primary citation strength. citationStatus.perplexity flipped from not-cited to cited. Joins cite-ability, freshness-signals, and hybrid-retrieval as the fourth Perplexity-primary-cited entry.

      • Hybrid retrieval

        Retrieval pipeline

        A Perplexity context-aware probe run shortly after publish cited this entry as a primary source. Source position #2 of 9; body inline citation marker '(aisearchglossary +2)' attributes a key sentence about rank-fusion and re-ranking to this page; the specific entry URL https://aisearchglossary.com/terms/hybrid-retrieval is listed in the Example resources block. citationStatus.perplexity flipped from untested to cited. First retrieval-cluster entry to cross the Perplexity primary-citation threshold (cite-ability + freshness-signals were the prior two, both outside the retrieval cluster). The practitioner-coined-term ranking thesis gains another data point: a vendor-canonical academic-dense entry is now cited by Perplexity, complementing the practitioner-coined entries that were cited earlier.

      • Statistical Density

        GEO content methods

        Evening of 2026-05-21: first confirmed external Google-search click landed on this entry. Visit at 21:38:29 AEDT from Singapore edge (sin1), Windows NT 10.0 + Chrome 148 (current stable version), Google referrer, cache PRERENDER 0s (first-time fetch). All five disambiguation axes pass under the documented traffic-attribution methodology. Significance: this is the third practitioner-coined anchor entry to attract a confirmed external Google click (after cite-ability and citation-share). The pattern is now multi-entry, not single-entry: practitioner-coined empty-territory terms reliably attract organic Google search clicks. Recorded as the fifth confirmed external visit.

    10. 11 revisions

      • AI crawler bots

        Infrastructure

        ChatGPT citation. A 2026-05-20 definitional probe with context-aware prompt ('in the context of AI search and GEO') returned this entry as one source in the panel of 9, with the entry title 'AI crawler bots | GEO Glossary' surfaced. citationStatus.chatgpt from not-cited to cited. Part of a same-day seven-new-citation ChatGPT sweep; full probe evidence preserved internally.

      • AI Overview

        Search foundations

        ChatGPT citation. A 2026-05-20 definitional probe for 'AI Overview citation' returned both ai-overview and ai-overview-citation entries as sources (GEO Glossary listed multiple times in panel). citationStatus.chatgpt from not-cited to cited. Part of a same-day seven-new-citation ChatGPT sweep; full probe evidence preserved internally.

      • AI Overview citation

        Citation surfaces

        ChatGPT primary citation. A 2026-05-20 definitional probe for 'AI Overview citation' returned this entry as the #1 source in the panel ('AI Overview citation | GEO Glossary, May 13 2026'). citationStatus.chatgpt from not-cited to cited. Part of a same-day seven-new-citation ChatGPT sweep; full probe evidence preserved internally.

      • Article Schema

        Schema cluster

        Added a territory framing paragraph and an independent counter-evidence anchor (Jono Alderson on structured-data limits). Softened 'Direct input to E-E-A-T' to 'structured input channel' since Google has not documented Article schema as a direct ranking signal; author and datePublished fields are inputs to ranking systems, not guaranteed levers. Expanded Related terms to mirror sibling schema entries (breadcrumb-list, howto-schema, json-ld) for symmetric cross-linking.

      • BM25

        Retrieval pipeline

        ChatGPT citation. A 2026-05-20 definitional probe returned this entry as one source in a 10+ technical sources panel ('BM25 | GEO Glossary, May 14 2026'). Notable because BM25 is heavily vendor-canonical (Wikipedia + decades of IR literature); our entry being included despite saturated incumbents is a signal that the GEO-context framing landed. citationStatus.chatgpt from not-cited to cited. Part of a same-day seven-new-citation ChatGPT sweep; full probe evidence preserved internally.

      • Cite-ability

        Umbrella terms

        Second AI engine citation: Perplexity. A 2026-05-20 definitional probe returned this entry as Perplexity's #1 source (of 9), with the primary definition paragraph paraphrasing our practitioner-coined framing ('practitioner-coined content property describing how suitable a passage is for AI extraction, quotation, and attribution'). citationStatus.perplexity from not-cited to cited; 2 of 5 engines now cited. The practitioner-coined-term ranking thesis gains a second empirical data point beyond the prior GSC ranking observation. Full probe evidence preserved internally.

      • DefinedTerm schema

        Schema cluster

        Backfilled relatedTerms with three schema cluster siblings (breadcrumb-list, article-schema, howto-schema) after bidirectional cluster-audit on 2026-05-20 surfaced that this entry was reciprocally linked by 9 other entries but only declared 5 back. Adding the three direct schema-class neighbors makes the cluster bond symmetric without exceeding UI sibling-display guidance.
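
A bidirectional audit of this kind can be sketched as a reciprocity check over the related-terms graph. The function name and the small graph below are hypothetical, built only from slugs mentioned in this changelog; this is not the glossary's actual audit tooling or data.

```python
def missing_backlinks(related):
    """Map each entry to the entries that link to it but are not linked back."""
    gaps = {}
    for source, targets in related.items():
        for target in targets:
            if source not in related.get(target, set()):
                gaps.setdefault(target, set()).add(source)
    return {entry: sorted(sources) for entry, sources in gaps.items()}


# Hypothetical relatedTerms graph: slug -> set of slugs it declares.
related_terms = {
    "defined-term-schema": {"json-ld", "faq-schema"},
    "json-ld": {"defined-term-schema"},
    "faq-schema": {"defined-term-schema"},
    "breadcrumb-list": {"defined-term-schema"},
    "article-schema": {"defined-term-schema"},
}

print(missing_backlinks(related_terms))
```

An empty result would mean every declared link is reciprocated; any non-empty value lists the siblings an entry would need to add for the cluster to be symmetric.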

      • Entity-based SEO

        Search foundations

        ChatGPT citation, near-primary position. A 2026-05-20 probe returned this entry as #2 in the sources panel, immediately following Ahrefs. citationStatus.chatgpt from not-cited to cited. Notable: ChatGPT preferred our entry over major SEO publication blog posts (Yext, SRNA, etc.) for the AI-search-context entity SEO definition. Part of a same-day seven-new-citation ChatGPT sweep; full probe evidence preserved internally.

      • FAQ Schema

        Schema cluster

        ChatGPT citation. A 2026-05-20 probe for 'FAQ Schema' returned this entry as the 4th source in a panel of ~10 ('FAQ Schema | GEO Glossary, May 16 2026'). Notable: surfaced despite Google's May 7 2026 deprecation of FAQ rich results (the entry's lead context); ChatGPT included us regardless. citationStatus.chatgpt from not-cited to cited. Part of a same-day seven-new-citation ChatGPT sweep; full probe evidence preserved internally.

      • Freshness signals

        Search foundations

        Perplexity citation (secondary). A definitional probe on 2026-05-20 returned this entry as the 10th of 10 sources for 'definition of Freshness signals,' with the plain-language version paragraph citing aisearchglossary alongside ahrefs. citationStatus.perplexity from untested to cited. Citation intensity is secondary (not primary), unlike cite-ability which Perplexity returned as #1 source on the same probe day. This is one of two entries cited by Perplexity across a 44-entry probe sweep (the negative finding: 42 of 44 entries not cited). Full probe evidence preserved internally.

      • Topic clusters

        Search foundations

        ChatGPT citation. A 2026-05-20 probe returned this entry as the 4th source in a panel led by Semrush ('Topic clusters | GEO Glossary, May 14 2026'). Notable: surfaced alongside major SEO publication content (Semrush, Seologist) on a vendor-canonical topic. citationStatus.chatgpt from not-cited to cited. Part of a same-day seven-new-citation ChatGPT sweep; full probe evidence preserved internally.

    11. 14 revisions

      • Answer block

        Search foundations

        Hedged claims about how AI engines use answer blocks: the previous wording said featured snippets and AI Overview 'select primarily at the answer-block level,' but no engine publishes its selection mechanism. Softened to observable-behavior framing. Corrected footnote 1: Google's featured snippets documentation does not publish a 40-60 word convention; the length norm is practitioner consensus from observation. Added 'glossary-coined' framing for the term itself with practitioner-described equivalents (featured snippet candidate, answer paragraph). FAQ-schema cross-reference now flags the May 7, 2026 rich-results deprecation.

      • BM25

        Retrieval pipeline

        Corrected a technical mechanism description: 'front-loading tightens BM25's term-frequency signal' was wrong because BM25 counts term frequency regardless of position. Front-loading helps via chunk-level retrieval (concepts near chunk start avoid truncation; front-loaders also concentrate keyword density). Matches hybrid-retrieval's self-aware correction. Separated vendor-documented BM25 platforms (Elasticsearch, OpenSearch, Solr, Azure AI Search, Lucene) from commercial AI engines (observable behavior consistent, not vendor-documented). Added BEIR benchmark grounding and Robertson & Walker 1994 SIGIR primary source.
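
The position-independence point in this correction can be checked directly: a BM25 term score depends on how often the term occurs and on document length, not on where the term sits. The sketch below uses one common BM25 variant (the IDF form with +1 inside the logarithm, and the conventional defaults k1=1.2, b=0.75); the corpus statistics and the example sentence are hypothetical.

```python
import math


def bm25_term_score(term, doc_tokens, n_docs, doc_freq, avg_len, k1=1.2, b=0.75):
    """Score one query term against one document with a standard BM25 formula."""
    tf = doc_tokens.count(term)  # term frequency: a count, with no position information
    idf = math.log((n_docs - doc_freq + 0.5) / (doc_freq + 0.5) + 1)
    length_norm = k1 * (1 - b + b * len(doc_tokens) / avg_len)
    return idf * tf * (k1 + 1) / (tf + length_norm)


# Same tokens, different order: the query term is first in one and last in the other.
front_loaded = "bm25 ranks documents by term frequency and length".split()
back_loaded = front_loaded[1:] + front_loaded[:1]

# Hypothetical corpus statistics, identical for both documents.
stats = dict(n_docs=100, doc_freq=10, avg_len=8)

score_front = bm25_term_score("bm25", front_loaded, **stats)
score_back = bm25_term_score("bm25", back_loaded, **stats)
print(score_front == score_back)
```

Any benefit from front-loading therefore has to come from somewhere else in the pipeline, such as which chunk the term ends up in, which matches the mechanism the corrected entry describes.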

      • BreadcrumbList Schema

        Schema cluster

        Propagated page's 'observed, not vendor-documented' inline hedge from Status to 5 other vendor-architecture claims (description / What-is / How-to-apply / FAQ / How-it-relates). 'Pages without breadcrumb lose context attribution' reframed to dispel schema-FOMO (strong on-page signals can perform comparably). Softened 'every mainstream CMS' with framework enumeration. Added vendor-canonical territory framing (19th use; schema cluster 5-anchor milestone complete with FAQ/HowTo/Answer block/DefinedTerm/JSON-LD). Added counter-evidence anchor: Google 'rich result not guaranteed' + Jono Alderson 2024 (2nd application).

      • DefinedTerm schema

        Schema cluster

        Three vendor-architecture claims softened: 'parse far more reliably than prose definitions', 'same shape works for Perplexity/Claude/ChatGPT retrieval', 'Suddenly central' + 'AI engines favoring schema-backed glossaries' all moved to observable-behavior framing. Other softening: inDefinedTermSet subgraph; 'schema-first GEO' to 'visible content first'; DefinedTermSet authority evaluation; 'Every DefinedTerm has' precision-fixed. Fact-audit catch: 'since 2017' corrected to Nov 2019 (schema.org v5.0). Added vendor-canonical territory framing paired with FAQ schema + JSON-LD (18th). Jono Alderson added to Sources.
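
For readers unfamiliar with the markup this entry discusses, the sketch below builds a minimal DefinedTerm JSON-LD object in Python. The types and the inDefinedTermSet property are from schema.org; the description text and set name are placeholders for illustration, and this is not the markup the glossary actually ships.

```python
import json

# Minimal DefinedTerm object; all field values are illustrative placeholders.
term = {
    "@context": "https://schema.org",
    "@type": "DefinedTerm",
    "name": "DefinedTerm schema",
    "description": "Placeholder definition text for illustration only.",
    "url": "https://aisearchglossary.com/terms/defined-term-schema",
    "inDefinedTermSet": {
        "@type": "DefinedTermSet",
        "name": "GEO Glossary",
        "url": "https://aisearchglossary.com",
    },
}

# Serialized form, as it would appear inside a script tag of type application/ld+json.
markup = json.dumps(term, indent=2)
print(markup)
```

Consistent with the softened claims in this revision, markup like this describes the term in a machine-readable way; it does not by itself guarantee that any engine will parse or cite it.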

      • Featured snippets

        Search foundations

        Grounded the 'highest-CTR SERP feature' claim with industry CTR studies (Ahrefs, Advanced Web Ranking, Moz) and a 2026 AI Overview distortion caveat. Softened 'AI Overview leans more on Knowledge Graph' to practitioner hypothesis. Aligned footnote 1 verbatim with the answer-block entry so the two cluster pages describe the same Google docs URL consistently. Added an explicit 'structured data does not trigger featured snippets' caveat. Labeled '3-5 FAQPage pairs' as practitioner heuristic and 'Position 0' as informal SEO term. Added reciprocal paired-with framing to answer-block (surface vs writing-unit pair).

      • Generative search index

        Search foundations

        Reframed for self-awareness: 'generative search index' is glossary practitioner shorthand (standard term: vector database), and the four-layer model is glossary editorial synthesis, since production stacks are typically loosely-coupled rather than one unified backend. Aligned the chunk-size claim with the LLMO entry's 200-1024 range and named tool defaults. Softened vendor-architecture overclaims (per-engine proprietary indices, ChatGPT training-vs-retrieval path, Perplexity hours cadence) to observable-behavior framing. Footnote 1 now disclaims that Pinecone does not use the term or four-layer framing.

      • Softened the page's own absolutist claims ('prevents hallucination', 'every assertion traceable') to match what grounding actually delivers. Replaced a fake '63% of SaaS founders' example with the verifiable Ahrefs Dec 2025 YouTube-mentions 0.737 anchor. Flagged 'hallucination grounding' as glossary-coined; standard ML uses 'grounding' / 'retrieval grounding'. Replaced the Perplexity blog root-domain source with Shuster et al. 2021 (arXiv:2104.07567), the direct empirical paper on retrieval reducing hallucination. Added reciprocal paired-with framing to the sycophancy entry, completing the meta-failure-mode pair.

      • Inverted index

        Retrieval pipeline

        Fixed 'weight term position with earlier positions scoring higher' myth: inverted indices + BM25 don't position-weight; front-loading helps via chunk-boundary avoidance and concept density. Completes 5-page cluster consistency (hybrid-retrieval / bm25 / vector-embeddings / reranking / this entry). Added field-weighting nuance (Lucene per-field boost is field-based, not within-document position-based). Softened 'every production hybrid retrieval system' universal claim. '2026 frontier learned indices' softened (Kraska et al. 2017 anchor). Added paired-with BM25 + vector-embeddings framing + IIR 2008 anchor.

      • JSON-LD

        Schema cluster

        Softened universal vendor-architecture claims: 'every major AI search surface parses JSON-LD as the canonical structured-data signal' separated into Google's documented recommendation (JSON-LD 'if your site's setup allows it'; all 3 formats valid) vs ChatGPT/Perplexity/Claude/Copilot/Gemini observable-behavior framing. Added FAQ rich-results May 7 2026 deprecation cross-reference (page uses FAQPage in examples). Added vendor-canonical territory framing paired with FAQ schema. Other softening: one-block vs @graph, sameAs canonical, GEO foundation. Footnotes 1+2 URLs precision-fixed (W3C TR/json-ld11/; Google /intro-structured-data sub-page).

      • Passage-level optimization

        Retrieval pipeline

        Aligned the chunk-size guidance with the LLMO entry's verified range (200-1024 tokens, with named tool defaults: LangChain ~250, LlamaIndex 1024, Pinecone 512-1024), replacing the narrower 256-512 estimate. Corrected footnote 2: the Aggarwal et al. 2023 GEO paper tests 9 content-modification methods at source-page level (Quotation Addition PAWC 27.8 vs no-modification baseline 19.5); the four-trait 'well-optimized passage' framework on this page is glossary editorial synthesis, not a paper finding. Softened several 'AI engines chunk and rank' universal claims to observable-behavior framing matching the rest of the retrieval cluster.
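
The effect of chunk size on where a passage lands can be illustrated with simple fixed-size chunking. This is a sketch under stated assumptions: chunks are non-overlapping, sizes are counted in tokens, and the 512 figure is just one value inside the 200-1024 range quoted above. Real chunkers often split on structure or use overlap, and the function names here are hypothetical.

```python
def chunk(tokens, size):
    """Split a token list into consecutive, non-overlapping chunks of at most `size`."""
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]


def chunks_touched(span_start, span_len, size):
    """Indices of the fixed-size chunks that a token span falls into."""
    first = span_start // size
    last = (span_start + span_len - 1) // size
    return list(range(first, last + 1))


section = list(range(1200))       # stand-in for a 1200-token section
print(len(chunk(section, 512)))   # number of chunks the section is split into

# A hypothetical 40-token definition placed at the start vs near a chunk boundary.
print(chunks_touched(0, 40, 512))
print(chunks_touched(500, 40, 512))
```

A definition that opens the section stays inside one chunk, while the same definition starting at token 500 is split across two, which is the chunk-boundary effect the retrieval-cluster entries refer to.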

      • Updated 'Bing Chat' to 'Microsoft Copilot' (Microsoft rebranded the product at Ignite 2023, Nov 15 2023; the glossary's own microsoft-copilot-citations entry already uses the new name). Softened several universal vendor-architecture claims to observable-behavior-plus-inference framing, matching the rest of the retrieval cluster. Replaced the Perplexity Engineering Hub root-domain source with Karpukhin et al. 2020 (the DPR paper, arXiv:2004.04906; this is the retriever component Lewis et al. RAG actually uses). Footnote 1 expanded with cluster-standard precision; added vendor-canonical dual-position framing.

      • Reranking

        Retrieval pipeline

        Fixed 'rerankers weight first 1-2 sentences' position-weighting myth: cross-encoder rerankers process [query + passage] with bidirectional attention, not positional priority. Front-loading helps via chunk-boundary avoidance and concept density. Completes 4-page cluster consistency (hybrid-retrieval / bm25 / vector-embeddings / this entry). Softened 'Production-standard for most AI search engines' to vendor-documented building blocks vs commercial engines (not vendor-documented). '2026 frontier listwise reranking' softened to active research direction. Added retrieval-pipeline 4-component paired-with framing and Nogueira & Cho 2019 footnote.

      • Aligned the page with its own thesis: previously a page about cite-able content carried claims that were themselves not cite-able. Grounded the 'AI engines preferentially cite specific content' assertion with Aggarwal et al. 2023 PAWC findings (Cite Sources ~28%, Quotation Addition ~43%). Replaced the anthropic.com/research root-domain source with the Constitutional AI paper. Labeled the '2026 mitigation stack' as glossary editorial synthesis. Flagged 'extraction anchor' as glossary-coined. Added a calibrated-uncertainty caveat for domains where hedging is appropriate.

      • Vector embeddings

        Retrieval pipeline

        Fixed the 'tighter openings cluster more precisely' mechanism: standard embedding models use mean pooling or [CLS]/last-token aggregation, not position weighting. Front-loading helps via chunk-boundary truncation avoidance and concept density. Now matches hybrid-retrieval and bm25; three pages describe the mechanism verbatim. Dimensions range expanded to cluster-verbatim 384-3072 with 5 named models. Corrected 'less relevant to GEO than LLMO' framing (vector embeddings are vendor-canonical, LLMO is not). Added MTEB benchmark anchor and reciprocal paired-with BM25 framing.
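
The mean-pooling point can be isolated in a few lines: averaging token vectors gives every position the same weight, so the pooled result does not change when the same vectors are reordered. This sketch covers the pooling step only and uses hypothetical two-dimensional vectors; in a real transformer the token vectors themselves are contextual, so reordering the text would change them before pooling.

```python
def mean_pool(token_vectors):
    """Average token vectors dimension by dimension, weighting every position equally."""
    count = len(token_vectors)
    dims = len(token_vectors[0])
    return [sum(vec[d] for vec in token_vectors) / count for d in range(dims)]


# Hypothetical token vectors for a three-token passage.
token_vectors = [[1.0, 0.0], [0.5, 2.0], [0.0, 4.0]]

pooled = mean_pool(token_vectors)
pooled_reversed = mean_pool(list(reversed(token_vectors)))
print(pooled, pooled_reversed)
```

Because the opening tokens get no extra weight at this step, the sketch is consistent with the corrected explanation that front-loading helps through chunk boundaries and concept density rather than position-aware pooling.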

    12. 7 revisions

      • FAQ Schema

        Schema cluster

        Softened the lead description from 'enabling AI engines to extract Q&A content as cite-able answer blocks' to 'commonly used to help systems parse Q&A content in a machine-readable format,' aligning with the already-hedged Status framing. Hedged the 'AI engines parse FAQPage more reliably than free-form Q&A in prose' comparative claim, because no public study isolates the structured-vs-prose effect. Source footnote replaced with a specific URL set.

      • HowTo Schema

        Schema cluster

        Fixed a Recipe / HowTo confusion in the FAQ: Recipe is its own schema.org type, not a HowTo subtype, so do not mark up recipes with HowTo hoping for rich results. Softened the specific 'September 13-14 2023' desktop deprecation date to 'in the months that followed' because the cited Google post is dated August 2, 2023 and does not give a specific September date. Several confident AI-citation-benefit claims hedged: the effect of HowTo markup itself versus the underlying step-headed content structure is not isolated by public study.

      • Hybrid retrieval

        Retrieval pipeline

        Corrected a technical claim: 'opening tokens carry more weight in some embedding models' was wrong, because standard sentence embeddings use mean pooling or fixed-position aggregation, not position weighting. The reason front-loading still helps is chunk-boundary truncation plus BM25 keyword density, not position-aware embeddings. Also broadened the embedding dimension range to 384-3072 with named tool examples, and softened vendor-architecture claims for AI engines that have not published their retrieval pipelines.

      • Corrected a footnote scope mismatch: footnote 1 (arXiv 2402.16827) was used to anchor broad claims about training-data influence, but the paper only covers instruction tuning (a fine-tuning phase after pretraining). The footnote scope and body anchor are now precisely about instruction tuning. Hedged the FAQ training-corpora claim because vendor docs do not publish training-data filters, broadened the chunk-size range to 200-1024 tokens to match named tool defaults, softened 'recall it accurately' since publishers cannot control model recall, and replaced the Anthropic research root-domain URL with the canonical platform.claude.com link.

      • LLMS.txt

        Infrastructure

        Verified Anthropic's canonical llms.txt URL: both docs.claude.com and docs.anthropic.com 301-redirect to platform.claude.com (body and Sources block updated). Softened the description to acknowledge that robots.txt and sitemap.xml are widely standardized in ways llms.txt is not. Attributed Google's 'we don't use llms.txt' position to John Mueller (2025) and Gary Illyes (Search Central Live, July 2025) with direct quotes. Replaced an unverifiable 'adoption growing month-over-month' line with a cross-section snapshot. Added Mintlify's role in originating llms-full.txt and a 'What not to expect' section.

      • Sub-document retrieval

        Retrieval pipeline

        Corrected a regression in How-to-apply: an earlier rewrite said 'directly motivates schema-first design', but sub-document retrieval shapes content structure, not schema markup. Restored to section-first content design. Softened the 'every major engine uses passage-level retrieval' claim because engine pipelines are not vendor-documented, broadened the chunking discussion to name LangChain and LlamaIndex defaults, and added DPR / ColBERT historical context to the RAG footnote.

      • Sub-passage extraction

        Retrieval pipeline

        Major rewrite. The original page presented sub-passage extraction as a universal architectural step (an extractive QA layer between retrieval and generation), which fits BERT-era classical IR but not modern LLM-based AI search engines: in those, retrieved chunks pass into the LLM context and quoting happens during generation, with no separate extraction layer. Added a Note on architecture, flagged the term as practitioner shorthand (standard literature: extractive QA / span selection), and replaced a fabricated 'FAQPage schema improves AI Overview citation by 30%' example with the Ahrefs Dec 2025 YouTube 0.737 correlation.

    13. 29 revisions

      • AI crawler bots

        Infrastructure

        Several corrections that mattered for readers copying robots.txt rules: PerplexityBot was wrongly marked as a training crawler (it's retrieval-only per Perplexity's docs); 'anthropic-ai' and 'cohere-ai' aren't in current vendor documentation; Google-Extended and Applebot-Extended are control tokens, not crawlers. Added Meta crawlers and Cloudflare's August 2025 finding on Perplexity stealth-crawler behavior. Added IP-verification guidance since UA strings alone are spoofable.

      • AI Overview

        Search foundations

        Fixed an internal contradiction where the FAQ called Google's top 10 the AI Overview 'candidate pool' while the body acknowledged that 62% of cited URLs rank outside the top 10. Softened several confident causal claims about engine behavior (E-E-A-T weighting, DefinedTerm citation rates, GSC measurability of AI Overview, schema as a citation lever) to match the hedging already used on ai-overview-citation, GEO, AEO, and freshness-signals. Added the May 7, 2026 FAQPage rich result deprecation context to schema recommendations.

      • Article Schema

        Schema cluster

        Softened the date-spoofing-penalty framing: no engine has published a penalty policy, and detection mechanisms (content-diff, re-embedding cycles) are plausible but not vendor-documented. Aligns with the freshness-signals page's reading of the Ahrefs July 2025 primary data.

      • Article Schema

        Schema cluster

        Aligned schema-confidence stance with the rest of the glossary: softened 'feed Google's E-E-A-T inference and AI engines' citation-eligibility decisions' to commonly-hypothesized inputs, removed unsupported engine-behavior claims (Person-typed authors weighted higher, headline-H1 mismatch suppresses rich-result eligibility, wrong-type signals worse than no-type), and added the May 7, 2026 FAQPage rich result deprecation context to the FAQ schema cross-reference.

      • Authority signals

        Search foundations

        Propagation patch from e-e-a-t-ai-search peer review: softened the FAQ claim 'AI engines appear to infer the same signals from structured content' to a practitioner-hypothesis framing, and added Google's documented position that E-E-A-T is not a direct ranking signal. Aligns this page with the cluster-wide three-layer hedge on E-E-A-T-as-AI-signal claims.

      • Authority signals

        Search foundations

        Major reframe: the 'four-layer authority model' is glossary editorial shorthand, not an industry standard; this disclaimer is now on the source page rather than only downstream. Confident weight claims softened to practitioner-hypothesis framing to match the E-E-A-T cluster. Added Ahrefs December 2025 brand-visibility correlations (with Ahrefs' own correlation-not-causation disclaimer), Wikidata notability caveat, the May 7, 2026 FAQPage rich result deprecation note, the Bing AI Performance dashboard reference, and a Knowledge Graph operationalization chain.

      • Integrated Ahrefs' December 2025 cross-engine follow-up study (75K brands, ChatGPT + AI Mode + AI Overviews), where YouTube mentions emerged as the strongest single signal across all three engines (~0.737), beating branded web mentions; added 'correlation is not causation' caveat per Ahrefs' own disclaimer; added the content-volume finding (almost no relationship with AI visibility); reframed mentions vs citations as two independent dimensions.

      • Citation match rate

        Citation metrics

        Reframed mention vs citation as two independent dimensions (not a containment relationship), matching the brand-mentions entry; clarified the formula with reference-level vs response-level denominator; corrected the citation-share comparison from 'relative vs absolute' to 'both ratios, different normalization'; softened engine link-behavior claims and the 'backlink-equivalent' analogy.

      • Citation match rate

        Citation metrics

        Second AI engine citation on the site (after cite-ability). A fresh ChatGPT search probe on 'definition of Citation match rate' returned this entry as one of two GEO Glossary URLs cited as primary sources (this term page + the /observatory page separately), with the GEO definition explicitly attributed to GEO Glossary while industry context (87% SearchGPT/Bing match) is attributed to Seer Interactive. citationStatus.chatgpt flipped from untested to cited; Perplexity, Claude, Copilot, and Gemini probed in the same round, all still not-cited. Observatory being cited as a separate URL validates that DefinedTermSet entity recognition extends to other glossary-internal surfaces, not just /terms.

      • Citation share

        Citation metrics

        Downgraded the share-of-voice equivalence from 'directly translates' to 'analog with mechanism differences'; clarified the formula denominator (citation instances, not unique sources); added measurement-axes discipline (URL vs domain vs brand level, deduplication, per-engine vs cross-engine aggregation); added engine-coverage caveats; added Peec AI to the sources list with proper citation.

      • Cite-ability

        Umbrella terms

        Major correction after re-reading the Aggarwal paper: 'cite-ability' and the four-trait framework were misattributed as paper-derived. The paper actually tests 9 content-modification methods against a Position-Adjusted Word Count metric; the framework on this page is glossary editorial shorthand. Softened schema-as-citation-lever overclaims to match the rest of the cluster, hedged chunking-mechanism claims, removed the generic Perplexity Engineering blog source, and elevated the self-illustrating live example (this page ranks #2 organically but is cited by 0/5 AI engines as of audit) into the Status section.

      • Cite-ability

        Umbrella terms

        First AI engine citation. A fresh ChatGPT search probe on the same day returned this entry as one of four primary sources cited for 'definition of Cite-ability,' with the AI-search-context definition explicitly attributed to GEO Glossary. citationStatus.chatgpt flipped from not-cited to cited; Perplexity, Claude, Copilot, and Gemini remain not-cited as of the same probe round. The Status section's live-example paragraph now carries the 0/5 → 1/5 trajectory as a dated update, turning the self-illustrating example into a self-updating one in a 24-hour window.

      • DefinedTerm schema

        Schema cluster

        Softened the FAQ claim that DefinedTerm-tagged definitions appear in AI Overview citations 'at materially higher rates than equivalent prose-only definitions': the comparison has not been isolated by controlled study, and the supporting evidence is practitioner observation rather than measured causal effect.

      • DefinedTerm schema

        Schema cluster

        Propagation patch from cite-ability peer review: softened 'directly enhances cite-ability' to 'commonly hypothesized to support cite-ability' to match the cluster-wide schema-as-hygiene-factor stance.

      • E-E-A-T (AI search context)

        Search foundations

        Core reframing: per Google's own docs, 'E-E-A-T itself isn't a specific ranking factor' and 'Rater data is not used directly in our ranking algorithms.' Whether AI engines use E-E-A-T at all is not vendor-documented. Softened multiple unsourced specific claims (three-link floor, pseudonymous-content underperformance, engines cross-reference Wikidata/LinkedIn, vague third-party citation measurements). Added a Knowledge Graph chain (schema → entity recognition → KG node → downstream E-E-A-T-aligned heuristics), and aligned the schema-encode-E-E-A-T framing with the cluster-wide hygiene-factor stance.

      • Entity-based SEO

        Search foundations

        Propagation patch from knowledge-graph peer review: replaced 'three high-trust links is the typical floor' with the 2-4 range used cluster-wide and added the 'not vendor-documented' caveat, matching the just-revised E-E-A-T, authority-signals, and knowledge-graph entries.

      • Entity-based SEO

        Search foundations

        Confident engine-behavior claims softened to match the rest of the cluster: 'rely heavily on entity resolution before retrieval' to 'one mechanism among several'; 'most major ranking factors are entity-mediated' to practitioner observation; 'materially higher citation rates' grounded in the Ahrefs December 2025 study (with its correlation-not-causation disclaimer); the FAQ claim 'close to a prerequisite for citation' now acknowledges the Wikipedia counter-example. Added Bing AI Performance dashboard reference and a Knowledge Graph operationalization chain cross-reference. Sources URL upgraded from wikidata.org root to the specific Notability policy page.

      • FAQ Schema

        Schema cluster

        Surfaced in ChatGPT search sources panel for the 'definition of FAQ Schema' query (appeared in the expanded sources panel with 'Yesterday' indexing timestamp) but not in the answer body's primary source URL list; ChatGPT cited Google Search Central + schema.org + 3 industry blogs instead. A recurring AI-search citation pattern: when vendor-canonical documentation exists for a concept (Google + schema.org both publish FAQPage documentation), AI engines tend to surface third-party glossary entries in retrieval but prefer the vendor sources for citation slots. citationStatus.chatgpt unchanged (the existing binary doesn't model 'surfaced but not cited'). Probe evidence preserved internally.

      • Freshness signals

        Search foundations

        Rewritten using Ahrefs' July 2025 primary 17M-citation study. Most material change: Google AI Overview is the only AI engine that cites slightly OLDER content than Google's organic SERP (16 days older), against the prior framing that grouped it with ChatGPT/Perplexity as freshness-preferring. Added the per-engine table, the '2.9-year-average' caveat (most cited content is still long-lived), and softened the date-spoofing-penalty claim to acknowledge no engine has published a penalty policy.

      • Propagation patch from cite-ability peer review: softened 'Only cite-able passages survive the grounding filter' to a relational framing: cite-able passages are the form most likely to survive, but exact grounding logic is engine-specific.

      • HowTo Schema

        Schema cluster

        Softened the 'mismatched schema types tend to be ignored or down-weighted by AI engines' claim: Google's 'Spammy structured markup' manual action is documented for deliberate misuse, but no engine has published an AI-citation down-weighting policy for schema-type-to-content-type mismatches.

      • Knowledge Graph

        Search foundations

        Major corrections after Wikidata notability policy re-check: the 'three high-trust links is the typical floor' claim now uses the 2-4 range used cluster-wide; the 'Q-number is the strongest single signal' superlative is softened (no controlled comparison exists); the 'content is homeless' claim now acknowledges the Wikipedia counter-example. Removed a fabricated claim ('primary sources count, unlike Wikipedia' is not in Wikidata's own notability page). Footnote URL upgraded from wikidata.org root to the specific Notability policy page. Added the Ahrefs December 2025 anchor, the Bing AI Performance dashboard reference, and expanded the schema → entity → KG → downstream-heuristics chain for cluster consistency.

      • Footnote 2 URL was wrong (the 'april-2024' path does not exist); replaced with the real AI Performance post at blogs.bing.com/webmaster/February-2026/..., published 2026-02-10 by four Microsoft Product Managers. Other corrections: 'in beta' updated to 'public preview' (vendor wording); the five specific dashboard metrics are now listed; the four Copilot surfaces are now split by data source so readers do not confuse Microsoft 365 Copilot (internal tenant data, not GEO-addressable) with the web-grounded surfaces.

      • Passage-level optimization

        Retrieval pipeline

        Propagation patch from cite-ability peer review: softened 'directly amplifies cite-ability' to a relational framing: well-built passages are the form most likely to be quoted intact, not an automatic citation amplifier.

      • Pillar content

        Search foundations

        Reframed pillar-first instead of mirroring the topic-clusters entry. Several confident engine-behavior claims softened to practitioner observation: 'entity canonicalization', 'signals earn cluster recognition in Knowledge Graph layers', 'AI engines tend to cite well-built clusters as canonical references'. Critical fix: the FAQPage JSON-LD spoke-page recommendation now carries the May 7, 2026 rich result deprecation caveat (the page was telling readers to deploy schema for a SERP feature that no longer renders). Aggarwal footnote added with per-method PAWC numbers and a glossary-inference label. Measurement methodology cross-references topic-clusters with pillar-specific reporting guidance.

      • Softened the freshness-penalty language: no engine has published a penalty policy, so 'discount or ignore stale signals' replaces 'penalize'. Aligns with the freshness-signals page's reading of the Ahrefs July 2025 primary data.

      • Added a caveat to the 'translate SGE techniques to AI Overview directly' guidance: FAQPage rich results were fully deprecated on May 7, 2026, so the older SGE-era playbook of shipping FAQ schema for SERP visual treatment no longer applies.

      • Softened the 'SGE and AI Overview are directly continuous' framing (Google has not publicly documented architectural continuity across the rebrand) and dropped the 'differences are mostly cosmetic' claim. Labeled the 'classical SEO era / GEO-mainstreaming era' periodization as glossary editorial framing rather than industry consensus. Corrected the SERP-surface description: the knowledge panel typically renders in the right rail on desktop, not above the blue links.

      • Topic clusters

        Search foundations

        Several confident engine-behavior claims softened to practitioner observation to match the rest of the cluster: 'engines treat cluster as canonical reference, raising citation rates' and 'DefinedTermSet converts cluster into entity collection'. Critical actionable fix: the FAQPage JSON-LD spoke-page recommendation now carries the May 7, 2026 rich result deprecation caveat (the page was telling readers to deploy schema for a SERP feature that no longer renders). Aggarwal footnote precision-fixed (per-method PAWC numbers; the 'paper reinforces spoke pages need cite-ready' framing is glossary inference, not paper conclusion). Added a measurement methodology block closing the say/do/measure loop.

    14. 21 revisions

      • Agentic retrieval

        Retrieval pipeline

        Clarified the Ahrefs 12% AI-citation figure (it averages 5 engines including Perplexity's ~29%, hiding that ChatGPT/Gemini/Copilot are closer to 8%); softened the query fan-out claim from 'direct evidence' to 'consistent with' Ahrefs' own framing; downgraded llms.txt advice to acknowledge no major engine has publicly confirmed using it.

      • Agentic retrieval

        Retrieval pipeline

        Softened remaining behavioral claims about AI agent architectures, source preferences, and content-condensation behavior to better match available evidence.

      • AI Overview

        Search foundations

        Updated Ahrefs measurements: their Feb-Mar 2026 study revised the AI Overview citation overlap with Google's top 10 from 76% down to 38%, and the cross-engine 12% headline was decomposed into per-engine numbers showing Perplexity at ~29% and ChatGPT/Gemini/Copilot at ~7-8%.

      • AI Overview citation

        Citation surfaces

        Updated Ahrefs AI Overview citation data (March 2026 study, ~38% of cited URLs in top 10 down from earlier ~76%); noted the drop reflects both broader candidate pool and improved measurement methodology.

      • AI Overview citation

        Citation surfaces

        Integrated full Ahrefs March 2026 breakdown (37.9% top 10 / 31.2% ranked 11-100 / 31.0% beyond top 100, plus YouTube at 5.6% of all cited URLs); rewrote FAQ to clarify top-10 ranking is the strongest single observable predictor but not a necessary condition; aligned schema framing with the GEO and AEO entries; added Google's May 7, 2026 FAQ rich results deprecation context; softened E-E-A-T author markup claim to acknowledge Google has not confirmed any specific weighting.

      • Acknowledged that the AIO acronym is also widely used in SEO coverage to mean AI Overview (Google's SERP feature); corrected the NBER paper to July 2025 with proper author attribution (Chatterji et al.); clarified that AIO's umbrella positioning over GEO and AEO is contested in industry usage; softened several engine-behavior claims to match available evidence.

      • Added Google's full FAQ rich results deprecation on May 7, 2026 (visual SERP treatment removed for all sites); corrected featured snippets launch date to January 2014; downgraded the 'FAQ schema is dominant AEO input' framing to match available evidence; hedged the AI Overview volume and GEO-subsumes-AEO positioning claims.

      • Attribution rate

        Citation metrics

        Updated Ahrefs AI citation data with proper per-engine framing (the 12% figure averages 5 measurements including Perplexity's ~29%); anchored the AI Overview 38-76% range to its source dates and methodology.

      • Attribution rate

        Citation metrics

        Added Microsoft's Bing Webmaster Tools AI Performance dashboard (public preview February 2026, the first major-engine native AI citation dashboard); clarified the formula with explicit denominator choices; distinguished AI-search attribution rate from traditional marketing attribution; resolved a page-internal contradiction between the body and FAQ on query sample size; added FAQ entries on metric distinctions and citation-vs-CTR.

      • BreadcrumbList Schema

        Schema cluster

        Initial publish

      • Cite-ability

        Umbrella terms

        First 5-engine citation probe: 0 of 5 cited, despite the page ranking #2 on Google organic search for the same query and Google's AI Overview answering it from other sources. A live demonstration of the gap between organic ranking and AI citation that GEO programs target.

      • FAQ Schema

        Schema cluster

        Added Google's full FAQ rich results deprecation on May 7, 2026 (visual SERP treatment removed for all sites, including previously-protected government and health categories); flagged that Rich Results Test FAQ support retires June 2026 and Search Console API support retires August 2026; softened claims about AI engines citing FAQ-marked content.

      • Featured snippets

        Search foundations

        Added Google's May 7, 2026 full FAQ rich result deprecation; flagged that Rich Results Test FAQ support retires June 2026 (use schema.org validator as fallback); softened the 40-60 word snippet length from 'practitioner consensus' to a common heuristic that varies by query and platform.

      • Removed the 'schema markup is the dominant GEO signal' framing: the cited Aggarwal 2023 paper actually shows that content-level edits (source citation, direct quotation, statistics addition) drove the largest measured visibility gains, and schema was not tested. Updated Ahrefs measurements (Feb-Mar 2026 study revised AI Overview citation overlap from 76% down to 38%) and decomposed the 12% cross-engine overlap into per-engine numbers. Added a scope note distinguishing Aggarwal's broader visibility-optimization definition from this glossary's narrower citation-focused operational definition.

      • HowTo Schema

        Schema cluster

        Added context for Google's May 7, 2026 full FAQ rich result deprecation to the parallel FAQ reference; softened the 'AI Overview and ChatGPT cite step-by-step content disproportionately' claim to a practitioner observation without vendor confirmation.

      • Surfaced the LLMO acronym in the title and primary H2 (GSC showed 8 impressions for the what-is-llmo cluster at an average position of 32.9).

      • Downgraded the llms.txt bullet per cross-term Rule 9 calibration: 'observed fetching' → 'practitioners report seeing'. Added Google's public statement that it does not use llms.txt, and framed the upside as opt-in insurance, not measurable lift.

      • LLMS.txt

        Infrastructure

        Recalibrated the Status and How-to-apply sections per Rule 9 confidence tiers: added Google's public 'we don't use llms.txt' statement; downgraded 'have been observed' to 'practitioners report seeing'; framed the upside as opt-in insurance to avoid implying a vendor commitment that doesn't exist.

      • Rewrote the 'how do I optimize for RAG' FAQ to use the Aggarwal 2023 paper's actual method names (Statistics Addition, Cite Sources, Quotation Addition, Fluency Optimization), and noted that 'statistical density' is practitioner shorthand rather than a paper-defined metric.

      • Statistical Density

        GEO content methods

        Reframed 'statistical density' as a practitioner-coined shorthand rather than a metric defined in the Aggarwal et al. 2023 paper. The paper actually tests a content edit called Statistics Addition, measured against the Position-Adjusted Word Count (PAWC) metric, not a sentence-level density ratio. Page rewritten to use correct PAWC numbers, distinguish intervention vs correlation, and add anti-stuffing discipline (relevant + sourced + non-redundant statistics only).

      • Replaced two stale examples that were themselves not cite-able by the time of review: the 'Aug 2023 FAQ rich results limited to gov/health' fact (now superseded by the May 7, 2026 full deprecation) and a Princeton-paper claim that misstated both the term used and the number of top-performing levers. New examples use paper-accurate PAWC numbers and the current deprecation date.

    15. 18 revisions

    16. 28 revisions