Changelog archive

Every editorial revision ever recorded across the GEO Glossary, grouped by month and newest first. 384 revisions across 92 terms, 39 distinct workdays.

July 2026 22 revisions
1. Mon, Jul 13 9 revisions
  - AI crawler blocking
    Infrastructure
    First confirmed AI-search citation for this entry: Microsoft Copilot cited it at the top of its sources for the definition query, reproducing the enforcement-versus-voluntary distinction (blocking as mandatory enforcement, unlike robots.txt or AIPREF signals). One of five tested engines now cites it directly; ChatGPT, Perplexity, Claude, and Gemini did not.
  - AI Overview citation
    Citation surfaces
    Perplexity now cites this entry for the definition query, joining ChatGPT; two of five tested engines cite it directly. Claude, Copilot, and Gemini did not cite it this time.
  - Attribution rate
    Citation metrics
    Gemini now cites this entry, joining Claude and Microsoft Copilot; three of five tested engines cite it directly. Gemini and Copilot both reproduced the per-engine attribution-rate figures (Perplexity around 29 percent; ChatGPT, Gemini, and Copilot around 7 to 8 percent; Google AI Overview 38 to 76 percent). ChatGPT and Perplexity did not cite it this time.
  - Citation precision and recall
    AI behavior
    Microsoft Copilot now cites this entry, joining ChatGPT, Perplexity, and Gemini; four of five tested engines cite it directly. Multiple engines anchored to the Liu, Zhang, and Liang (EMNLP 2023) benchmark figures (74.5 percent precision, 51.5 percent recall). Claude cited a third-party source instead.
  - Citation rotation
    Citation metrics
    Broad citation gains for this entry: Perplexity, Claude, and Microsoft Copilot all now cite it directly, joining ChatGPT, and Gemini references it through the AI citation metrics overview. Four of five tested engines cite it directly and the fifth cites it partially. Copilot and ChatGPT used the term page itself, while Claude reached it through the citation-tracking page.
  - Citation velocity
    Citation metrics
    Perplexity and Gemini now cite this entry, joining Claude; three of five tested engines cite it directly and ChatGPT cites it partially. On Claude and Gemini the citation-velocity page itself was used, while ChatGPT and Perplexity reached it through sibling metric pages (AI citation metrics, citation share). Copilot did not cite it this time.
  - Context assembly
    Retrieval pipeline
    First confirmed AI-search citations for this entry: Perplexity and Gemini both cite it for the definition query, with Gemini using our definition as the opening line and rendering the select, order, and format pipeline. Two of five tested engines now cite it directly; ChatGPT, Claude, and Copilot did not.
  - Generative search index
    Retrieval pipeline
    Perplexity and Microsoft Copilot now cite this entry, joining Gemini; three of five tested engines cite it directly. Each placed our page at or near the top of its sources and used our practitioner-shorthand framing (the retrieval-corpus backend, split into passage chunks, embeddings, lexical data, and provenance). ChatGPT and Claude cited primary or academic sources instead.
  - Keyword Stuffing
    GEO content methods
    Microsoft Copilot now cites this entry, placing it second in its sources and reproducing our negative-result framing from the Aggarwal et al. study (keyword stuffing performed 8 to 10 percent worse than baseline, the only method to fall below it). One of five tested engines cites it directly; ChatGPT, Perplexity, Claude, and Gemini did not.
2. Mon, Jul 6 12 revisions
  - AI search evaluation
    Methodology
    First AI engine citation: Perplexity surfaced this entry as a primary source for 'What is AI search evaluation?', citing the pillar above the fold alongside the opening definition. First engine to cite the entry since publication; 1 of 5 tested engines now cites it. Perplexity reached this explainer directly rather than the underlying academic sources.
  - AIPREF (AI usage preferences)
    Infrastructure
    First AI engine citations: ChatGPT, Perplexity, and Claude all surfaced this entry for 'What is AIPREF (AI usage preferences)?', each drawing on our preferences-versus-authentication distinction (AIPREF declares usage preferences; it does not authenticate the requester, which is the separate Web Bot Auth effort). From 0 to 3 of 5 tested engines citing the entry.
  - Authority signals
    Search foundations
    First AI engine citation: Perplexity cited this entry for 'What is authority signals?', drawing on our citations-and-backlinks breakdown alongside established SEO glossaries. 1 of 5 tested engines now cites it.
  - Black-hat C-SEO
    GEO content methods
    Second AI engine citation: Perplexity surfaced this entry as its top source for 'What is Black-hat C-SEO?', listing it first in the answer's source list and drawing on our prompt-injection-and-deception framing. Joins Gemini; 2 of 5 tested engines now cite it.
  - Citation Footprint
    Citation metrics
    Copilot citation confirmed for 'What is Citation Footprint?', reproducing our coined framing (a cumulative, monotonic count of distinct cited URLs measuring coverage rather than intensity, distinct from citation share and citation velocity). Fourth engine to cite this coined entry; 4 of 5 tested engines now cite it.
  - Citation hallucination
    AI behavior
    First AI engine citations: ChatGPT and Perplexity both surfaced this entry for 'What is Citation hallucination?', with ChatGPT ranking it the top source and Perplexity drawing on our core distinction (a hallucinated citation points to a source that does not exist, separate from misquoting a real source or answering with none). 2 of 5 tested engines now cite it.
  - Cite-ability
    Citation metrics
    Gemini citation confirmed for 'What is cite-ability in AI search?', with our entry dominating the source panel and Gemini reproducing the four-trait framing (context-free quote, answer-first layout, factual density, echo-and-expand). Fourth engine to cite this practitioner-coined entry; 4 of 5 tested engines now cite it.
  - Deep research mode
    Retrieval pipeline
    First AI engine citations: Perplexity and Gemini both surfaced this entry as their top source for 'What is Deep research mode?', each reproducing our framing that deep research mode is an escalation of agentic retrieval applying query fan-out at much larger scale. 2 of 5 tested engines now cite it.
  - Freshness signals
    Search foundations
    ChatGPT citation confirmed for 'What is Freshness signals?', surfacing this entry as the top source and drawing on our metadata-signals breakdown (datePublished, dateModified, Last-Modified headers, version history). Joins Perplexity; 2 of 5 tested engines now cite it.
  - GEO content methods
    GEO content methods
    Perplexity citation confirmed for 'What is GEO content methods?', surfacing this pillar as its top source and reproducing our negative-result framing (most headline content tweaks were weak or null levers in Aggarwal 2023, and retrievability matters more than stylistic rewrites). Joins Gemini; 2 of 5 tested engines now cite it.
  - Position-Adjusted Word Count
    Methodology
    Perplexity citation confirmed for 'What is Position-Adjusted Word Count?', surfacing this entry as its top source, ranked above the original GEO paper it explains. Joins Claude; 2 of 5 tested engines now cite it.
  - Retrievability
    Retrieval pipeline
    First AI engine citation: Perplexity cited this entry as its second source for 'What is Retrievability?' (after Wikipedia), drawing on our Azzopardi and Vinay 2008 grounding and the framing that retrievability is an upstream lever on-page content tweaks cannot fix. 1 of 5 tested engines now cites it.
3. Wed, Jul 1 1 revision
  - AI search evaluation
    Methodology
    Folded RAG-system evaluation (RAGAS-style RAG-pipeline tooling) into the pillar rather than giving it a separate entry, since its dimensions reduce to axes already covered here: faithfulness deep-links to hallucination grounding, retrieval effectiveness to the retrieval pipeline, and the scoring is itself LLM-as-a-judge. Resolves the rag-evaluation backlog candidate as a fold, not a new term; completes the pillar's coverage of the eval landscape.
June 2026 100 revisions
1. Tue, Jun 30 3 revisions
  - AI search evaluation
    Methodology
    Deepened into the evaluation-cluster pillar: wired in the new LLM-as-a-judge entry as the scoring-mechanism spoke (the line on the judge being an evaluation condition now links to it) and added hallucination grounding as the faithfulness axis. Hub-and-spoke wiring so this entry serves as the navigable map for AI search evaluation; no underlying claims changed.
  - LLM-as-a-judge
    Methodology
    Same-day peer-review pass: corrected the Chatbot Arena attribution (Arena ranks models on crowdsourced human votes; the LLM judge scores MT-Bench, and the study used Arena's human data to validate the judge), scoped the over-80% human-agreement figure to the original study's settings, and added that the GEO benchmark's subjective-impression metric is itself G-Eval/GPT-scored, so verbosity bias may explain part of some content tactics' measured lift. Softened 'introduced' to 'named and validated' and 'cancel position bias' to 'detect and reduce'.
  - LLM-as-a-judge
    Methodology
    Initial publish: LLM-as-a-judge is using a strong model to score other models' open-ended outputs, named and systematically validated by Zheng et al. in 2023, where it matched human preferences at over 80% agreement in that study's settings. Joins the methodology cluster as the spoke that ai-search-evaluation points to when it notes the judge is an evaluation condition. Core framing: the judge is not a neutral oracle but carries documented position, verbosity, and self-enhancement biases, so a benchmark number partly reflects which model judged and how. Distinguishes judge-scored evaluation from deterministic metrics like PAWC.
2. Mon, Jun 29 13 revisions
  - AI citation metrics
    Citation metrics
    Perplexity citation confirmed for the definition query. Perplexity surfaced the AI citation metrics overview inline and in its sources-used list, reproducing the per-metric breakdown. First tracked engine to cite this pillar.
  - Brand mentions in AI answers
    Citation metrics
    Perplexity citation confirmed for the definition query; the entry surfaced inline and in Perplexity's sources. First tracked engine to cite this entry.
  - Brave Search AI citation
    Citation surfaces
    Perplexity citation confirmed for the definition query, with the entry surfaced as a top source in Perplexity's answer. Joins ChatGPT among the engines citing this entry.
  - ChatGPT search citation
    Citation surfaces
    Perplexity citation confirmed for the definition query, with the entry surfaced as the top source. Joins ChatGPT among the engines that have cited this entry.
  - Citation Footprint
    Citation metrics
    Now cited by ChatGPT, Perplexity, and Gemini for the definition query, the first citations for this coined metric. Perplexity and Gemini surfaced the citation footprint page directly; ChatGPT reached it through the terms index. Each answer carried the coinage framing and the cumulative, breadth-over-intensity distinction.
  - Citation probe protocol
    Methodology
    ChatGPT and Claude citations confirmed for the definition query, joining Perplexity and Gemini, so four of the five tracked engines now cite this entry. Each reproduced the probe-versus-protocol distinction and the fixed-panel methodology.
  - Citation share
    Citation metrics
    Gemini citation confirmed for the definition query. The metric surfaced in Gemini's answer through the related AI citation metrics and citation footprint pages, using the share-of-voice framing.
  - DuckDuckGo AI citation
    Citation surfaces
    Microsoft Copilot citation confirmed for the definition query, with the entry ranked the top source in Copilot's panel. Third tracked engine to cite this entry, joining ChatGPT and Perplexity.
  - Grok citation
    Citation surfaces
    Perplexity citation confirmed for the definition query; the entry surfaced in the Sources list of Perplexity's answer. Joins ChatGPT among the engines citing this entry.
  - Meta AI citation
    Citation surfaces
    Perplexity citation confirmed for the definition query, the first citation for this entry on any tracked engine. The entry ranked the top source in Perplexity's panel, carrying its two-tier (licensed-publisher versus general-web) framing into the answer.
  - Perplexity citation
    Citation surfaces
    Perplexity citation confirmed for the definition query, with the entry surfaced as a source in Perplexity's panel. Joins ChatGPT among the engines citing this entry.
  - Retrieval pipeline
    Retrieval pipeline
    Perplexity citation confirmed for the definition query, the first citation for this entry on any tracked engine, with the page ranked the top source.
  - Sub-document retrieval
    Retrieval pipeline
    ChatGPT and Microsoft Copilot citations confirmed for the definition query, each surfacing this entry as the top source. They join Perplexity, so three tracked engines now cite it. The page carried its passage-level retrieval framing into both answers.
3. Sun, Jun 28 1 revision
  - Knowledge cutoff
    Other
    Peer-review pass: softened the framing so the cutoff reads as a primary driver of retrieval and the citation surface, not the sole cause (retrieval also serves verification, long-tail knowledge, and user-requested sources). Reframed retrieval triggering as multi-factor (query type, uncertainty, product policy, user request), not a before/after-cutoff line. Surfaced per-engine differences into the body (Perplexity nearly always retrieves, ChatGPT conditionally, Claude answers stable knowledge without searching). Added an OpenAI model-docs anchor.
4. Sat, Jun 27 1 revision
  - Knowledge cutoff
    Other
    Initial publish: a knowledge cutoff is the fixed point after which a model's training data ends, so it has no built-in knowledge of later events. Joins the ai-behavior cluster. Core framing: the cutoff is a primary structural reason generative engines retrieve (web search / RAG) to answer beyond it, and retrieval is where citations appear, so for publishers the cutoff sits upstream of much of the citation surface rather than being only a limitation. Distinguishes parametric (training-frozen) knowledge from retrieved (fetched at answer time) knowledge.
5. Mon, Jun 22 5 revisions
  - Citation match rate
    Citation metrics
    Now cited by all five tracked engines, the first GEO Glossary entry to reach every engine we track. Perplexity and Microsoft Copilot are the two newest to cite this metric, joining ChatGPT, Claude, and Gemini. In each case it surfaced through the AI citation metrics overview that defines it, carrying its linked-versus-unlinked distinction into the answer.
  - Cite Sources Optimization
    GEO content methods
    Now cited by Claude for the first time, bringing the count to two of five tracked engines. Claude assembled its answer from a cluster of related GEO Glossary method entries (Quotation Addition, Fluency Optimization, Authoritative Statement Strength, Statistical Density, and Definition-Lead Style) rather than from a single page, citing Quotation Addition first. It shows how a method built on several techniques can surface through the entries that define each one.
  - Definition-Lead Style
    GEO content methods
    Now cited by Claude, the second of five tracked engines to cite the entry after ChatGPT. Claude drew the definition and the practical rules directly from this page, reproducing the answer-block-opening framing and the inverted-pyramid analogy it uses to explain leading with the definition.
  - Fluency Optimization
    GEO content methods
    Now cited by ChatGPT, joining Claude as the second of five tracked engines to cite the entry. ChatGPT used this page as its lead source and reproduced its Position-Adjusted Word Count framing for the method.
  - Hallucination grounding
    AI behavior
    Now cited by ChatGPT, the first of five tracked engines to cite the entry. ChatGPT surfaced this page among its sources when answering what hallucination grounding is, alongside other AI reference glossaries.
6. Sun, Jun 21 15 revisions
  - AI search evaluation
    Methodology
    Revalued the supporting Aggarwal PAWC figure to the paper's position-adjusted 'Overall' column (quotation addition 27.2 versus a 19.3 baseline); the earlier figure (27.8 vs 19.5) was the paper's plain Word Count sub-column.
  - Authoritative Statement Strength
    GEO content methods
    Corrected the Aggarwal Table 1 figures: the values previously cited as PAWC (Authoritative 21.8 vs baseline 19.5, and the rest) were the paper's plain Word Count sub-column. Updated to the paper's actual position-adjusted Word Count (the 'Overall' column: Authoritative 21.3 vs baseline 19.3, a raw +10%), which is the metric the paper's headline gains are computed on. The load-bearing finding is unchanged: the paper characterizes Authoritative tone verbatim as 'no significant improvement', a null result regardless of the raw percentage.
  - Cite Sources Optimization
    GEO content methods
    Corrected the Aggarwal Table 1 figures: the values previously cited as PAWC (Cite Sources 24.9 vs baseline 19.5, and the rest) were the paper's plain Word Count sub-column. Updated to the paper's actual position-adjusted Word Count (the 'Overall' column: Cite Sources 24.6 vs baseline 19.3, about +27%; Quotation Addition 27.2 about +41%; Keyword Stuffing 17.7 about -8%), which is the metric the paper's headline gains are computed on. The named top-3 framing and the 4th-place standalone ranking are unchanged.
  - Cite-ability
    Citation metrics
    Revalued the supporting Aggarwal PAWC figures in the footnote to the paper's position-adjusted 'Overall' column (Cite Sources 24.6, Quotation Addition 27.2, baseline 19.3); the earlier figures (24.9 / 27.8 / 19.5) were the paper's plain Word Count sub-column, not its position-adjusted metric.
  - Definition-Lead Style
    GEO content methods
    Revalued the supporting Aggarwal PAWC range to the paper's position-adjusted 'Overall' column (~27% to ~41% lift); the earlier range (~28% to ~43%) was derived from the paper's plain Word Count sub-column.
  - Fluency Optimization
    GEO content methods
    Corrected the Aggarwal Table 1 figures: the values previously cited as PAWC (Fluency Optimization 25.1 vs baseline 19.5, and the rest) were the paper's plain Word Count sub-column. Updated to the paper's actual position-adjusted Word Count (the 'Overall' column: Fluency Optimization 24.7 vs baseline 19.3, about +28%; Quotation Addition 27.2 about +41%; Keyword Stuffing 17.7 about -8%), which is the metric the paper's headline gains are computed on. Rankings unchanged (Fluency still 3rd by standalone score, still not in the paper's named top-3).
  - GEO content methods
    GEO content methods
    Corrected the Aggarwal Table 1 figures throughout the methods table and footnote: the values previously cited as PAWC (Quotation Addition 27.8 vs baseline 19.5, and the rest) were the paper's plain Word Count sub-column. Updated to the paper's actual position-adjusted Word Count (the 'Overall' column: Quotation Addition 27.2 vs baseline 19.3 about +41%, Keyword Stuffing 17.7 about -8%), which is the metric the paper's headline gains are computed on. The verdicts, the named top-3, and the null/negative findings are unchanged.
  - Keyword Stuffing
    GEO content methods
    Corrected the Aggarwal Table 1 figures: the values previously cited as PAWC (Keyword Stuffing 17.8 vs baseline 19.5, and the rest) were the paper's plain Word Count sub-column. Updated to the paper's actual position-adjusted Word Count (the 'Overall' column: Keyword Stuffing 17.7 vs baseline 19.3, mathematically about -8%), which is the metric the paper's headline gains are computed on. The negative-result finding is unchanged: Keyword Stuffing is still the only method scoring below baseline, and the paper's verbatim 'little to no performance improvement' framing stands.
  - Passage-level optimization
    Retrieval pipeline
    Revalued the Aggarwal per-method figures in the footnote to the paper's actual position-adjusted PAWC values (the 'Overall' column: Quotation Addition 27.2 vs baseline 19.3, about +41%). The earlier figures (27.8 vs 19.5) were the paper's plain Word Count sub-column, not its position-adjusted metric.
  - Pillar content
    Search foundations
    Revalued the Aggarwal per-method figures in the footnote to the paper's actual position-adjusted PAWC values (the 'Overall' column: Quotation Addition 27.2 vs baseline 19.3, about +41%). The earlier figures (27.8 vs 19.5) were the paper's plain Word Count sub-column, not its position-adjusted metric.
  - Position-Adjusted Word Count
    Methodology
    Corrected the Aggarwal Table 1 figures: the values previously given as PAWC (baseline 19.5, Quotation Addition 27.8, and so on) were the paper's plain Word Count sub-column. Updated to the paper's actual position-adjusted Word Count (the 'Overall' column: baseline 19.3, Quotation Addition 27.2 at about +41%, Keyword Stuffing 17.7 at about -8%), which is the metric the paper's headline gains are computed on. Also fixed the metric symbol to Imp_pwc and clarified that PAWC is computed over the sentences that cite a source, so it measures attributed word share within an answer rather than citation frequency or rank.
  - Quotation Addition
    GEO content methods
    Corrected the Aggarwal Table 1 figures: the values previously cited as PAWC (Quotation Addition 27.8 vs baseline 19.5, and the rest) were the paper's plain Word Count sub-column. Updated to the paper's actual position-adjusted Word Count (the 'Overall' column: Quotation Addition 27.2 vs baseline 19.3, about +41%; Keyword Stuffing 17.7, about -8%), which is the metric the paper's headline gains are computed on. The named top-3 framing and rankings are unchanged (Fluency Optimization is still 3rd by standalone score).
  - Statistical Density
    GEO content methods
    Corrected the Aggarwal Table 1 figures: the values previously cited as PAWC (Statistics Addition 25.9 vs baseline 19.5, and the rest) were the paper's plain Word Count sub-column. Updated to the paper's actual position-adjusted Word Count (the 'Overall' column: Statistics Addition 25.2 vs baseline 19.3, about +31%; Quotation Addition 27.2 about +41%; Keyword Stuffing 17.7 about -8%), which is the metric the paper's headline gains are computed on. Rankings and the named top-3 framing are unchanged (Statistics Addition still 2nd by standalone score).
  - Sycophancy vs cite-able fact
    AI behavior
    Revalued the supporting Aggarwal PAWC figures to the paper's position-adjusted 'Overall' column (Statistics Addition 25.2 vs baseline 19.3, Quotation Addition 27.2, Cite Sources 24.6); the earlier figures (25.9 / 27.8 / 24.9 vs 19.5) were the paper's plain Word Count sub-column, not its position-adjusted metric.
  - Topic clusters
    Search foundations
    Revalued the Aggarwal per-method figures in the footnote to the paper's actual position-adjusted PAWC values (the 'Overall' column: Quotation Addition 27.2 vs baseline 19.3, about +41%). The earlier figures (27.8 vs 19.5) were the paper's plain Word Count sub-column, not its position-adjusted metric.
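The percentage lifts quoted in the June 21 corrections are plain relative changes against the baseline score. A minimal sketch of that arithmetic follows, using only the 'Overall' column values quoted in the entries above. The function name `relative_lift` and the dictionary layout are illustrative assumptions, not anything the paper or the glossary defines.

```python
# Position-adjusted 'Overall' scores as quoted in the June 21 entries.
BASELINE = 19.3
OVERALL = {
    "Quotation Addition": 27.2,
    "Statistics Addition": 25.2,
    "Fluency Optimization": 24.7,
    "Cite Sources": 24.6,
    "Authoritative": 21.3,
    "Keyword Stuffing": 17.7,
}


def relative_lift(score: float, baseline: float) -> float:
    """Return the percentage change of score relative to baseline."""
    return (score - baseline) / baseline * 100.0


# Hypothetical helper mapping: method name -> derived lift in percent.
lifts = {method: relative_lift(score, BASELINE) for method, score in OVERALL.items()}

for method, lift in lifts.items():
    print(f"{method}: {lift:+.1f}%")
```

Rounded to whole percentages, these match the approximate figures in the entries (about +41% for Quotation Addition, about -8% for Keyword Stuffing). The percentages are derived values, not figures the paper prints per method.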
7. Sat, Jun 20 3 revisions
  - Passage-level optimization
    Retrieval pipeline
    Clarified that the per-method PAWC percentages in the Aggarwal footnote are derived from the paper's absolute scores against the 19.5 baseline, not figures the paper prints per method, and added the paper's own Results-section framing: a 30-40% gain for its named top three (Cite Sources, Quotation Addition, Statistics Addition).
  - Pillar content
    Search foundations
    Clarified that the per-method PAWC percentages in the Aggarwal footnote are derived from the paper's absolute scores against the 19.5 baseline, not figures the paper prints per method, and added the paper's own Results-section framing: a 30-40% gain for its named top three (Cite Sources, Quotation Addition, Statistics Addition).
  - Topic clusters
    Search foundations
    Clarified that the per-method PAWC percentages in the Aggarwal footnote are derived from the paper's absolute scores against the 19.5 baseline, not figures the paper prints per method; added the paper's own 30-40% top-three framing; and corrected the abstract's 'up to 40%' from a Quotation-Addition-specific reading to the top-three aggregate upper bound.
8. Thu, Jun 18 1 revision
  - AI citation metrics
    Citation metrics
    Added a seventh gap to 'What no single metric captures': cross-engine cited-set agreement. None of the six metrics measures whether two engines cite the same pages as each other for the same prompts, and under a fixed prompt set the engines' cited-source sets often overlap little, so a single blended 'AI visibility' number averages across systems that mostly cite different pages. Links the new engine-disjoint citation dispatch, which measures that overlap directly.
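One way to picture cross-engine cited-set agreement is as pairwise set overlap between the URLs each engine cites for the same prompts. The sketch below uses Jaccard similarity, which is an assumption made for illustration: the entry does not say which overlap measure the dispatch uses. The engine names and URLs are hypothetical placeholders, not probe data.

```python
from itertools import combinations


def jaccard(a: set[str], b: set[str]) -> float:
    """Share of URLs cited by both engines among URLs cited by either."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical cited-URL sets for one fixed prompt set (not real measurements).
cited = {
    "engine_a": {"example.com/p1", "example.com/p2", "example.org/x"},
    "engine_b": {"example.com/p2", "example.net/y"},
    "engine_c": {"example.org/z"},
}

# Overlap score for every unordered pair of engines.
pairwise = {
    (left, right): jaccard(cited[left], cited[right])
    for left, right in combinations(sorted(cited), 2)
}

for pair, score in pairwise.items():
    print(pair, f"{score:.2f}")
```

When most pairwise scores sit near zero, a single blended visibility number is averaging over engines that mostly cite different pages, which is the gap the entry describes.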
9. Tue, Jun 16 1 revision
  - Attribution rate
    Citation metrics
    First confirmed AI-search citations for this entry: Microsoft Copilot and Claude both cited it for the definition query, with Copilot describing it as 'the only authoritative source that explicitly defines this metric.' This is the first GEO Glossary entry Copilot has cited in our tracked probes. Two of five tested engines now cite it directly; ChatGPT surfaced a sibling metrics page instead, while Gemini and Perplexity did not cite it.
10. Sat, Jun 13 2 revisions
  - AI crawler blocking
    Infrastructure
    Review pass same day as publish: completed the Anthropic crawler taxonomy (was ClaudeBot only; Anthropic runs ClaudeBot for training, Claude-SearchBot for retrieval, Claude-User for user-triggered fetch, so the allow side now names Claude-SearchBot) and added the PerplexityBot-vs-Perplexity-User distinction, using Perplexity as the worked example of enforce-only-what-you-can-identify. Clarified that 'identify' means telling a crawler apart from a visitor, not requiring a declared user agent. Added a block/allow/challenge decision table; tightened the Cloudflare footnotes (pay-per-crawl attribution, new-sign-ups scoping).
  - AI crawler blocking
    Infrastructure
    Initial publish: AI crawler blocking is the enforcement layer of AI access control, the one layer that binds operators who ignore robots.txt and AIPREF (which only request). Covers the enforcement methods (WAF, bot management, rate limits, IP/ASN blocks, challenges), the identity prerequisite (you can only enforce against crawlers you can identify; spoofed-UA actors need behavioral detection, not user-agent blocklists), and the GEO tradeoff: blocking broadly removes you from AI-search citation surfaces (OpenAI: blocking OAI-SearchBot drops a site from ChatGPT search), so for a site that wants AI citation it is usually the wrong reflex.
11. Fri, Jun 12 4 revisions
  - AI search evaluation
    Methodology
    Review pass, same day as publish: the deflationary-trajectory claim is now stated as the trajectory so far (later evaluations have generally shrunk earlier effects) rather than a per-generation law; SAGEO Arena's headline conclusion is attributed to its authors in that benchmark's setting; C-SEO Bench's coverage is described precisely (nine methods, seven derived from the GEO benchmark's) with authors named. Added a compact table comparing what each of the three method families answers, their strengths, and their limits, and a note that LLM-as-judge scoring is itself an evaluation condition that moves numbers.
  - AI search evaluation
    Methodology
    Initial publish: AI search evaluation as the umbrella for the three method families measuring AI search engines (academic benchmarks, vendor-internal evals, practitioner probing), organized around one observation the newest benchmarks make explicit: evaluation conditions moderate results, and each more realistic benchmark generation has shrunk the optimization effects the previous one reported (single-pipeline GEO-bench to multi-actor C-SEO Bench to end-to-end SAGEO Arena). Joins the methodology cluster alongside the probe protocol and PAWC.
  - Robots.txt (Robots Exclusion Protocol)
    Infrastructure
    Calibrated three overbroad claims after review: 'every major engine documents robots.txt support' corrected to declared-crawler scope with xAI's Grok as the documented exception; the user-initiated exemption is now attributed to the two operators that actually document it (OpenAI, Perplexity) and flagged as the operators' own classification rather than a neutral reading of the protocol; Cloudflare's stealth-crawling finding is noted as a single incident that Perplexity disputed. Allow-rule benefits restated as citation eligibility rather than a guarantee, and Google's July 2019 standardization announcement added as a primary source.
  - Robots.txt (Robots Exclusion Protocol)
    Infrastructure
    Initial publish: robots.txt is the crawl-access file standardized as RFC 9309 (2022), and this entry's focus is what it cannot do in the AI era: compliance is voluntary by the protocol's own design; blocking crawling neither removes already-indexed URLs nor expresses usage preferences; blocking is not retroactive for model training; and the major engines' user-initiated fetchers are documented by their own operators as partially exempt. Joins the infrastructure cluster as the fetch-access layer under AI access control, with the per-engine crawler detail kept in the AI crawler bots entry.
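As an illustration of the request-only layer, the sketch below parses a small robots.txt with Python's standard `urllib.robotparser` and reports what the file asks of three declared crawlers named elsewhere in this archive. The file contents and the page URL are hypothetical. The result shows only what a compliant crawler would be asked to do; nothing here enforces it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: decline the training crawler, allow the search crawlers.
ROBOTS_TXT = """\
User-agent: ClaudeBot
Disallow: /

User-agent: Claude-SearchBot
Allow: /

User-agent: OAI-SearchBot
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

PAGE = "https://example.com/glossary/some-term"  # hypothetical URL

# What the file requests of each declared crawler (True = fetch permitted).
decisions = {
    agent: parser.can_fetch(agent, PAGE)
    for agent in ("ClaudeBot", "Claude-SearchBot", "OAI-SearchBot")
}

for agent, allowed in decisions.items():
    print(f"{agent}: {'allow' if allowed else 'disallow'}")
```

A crawler that ignores the file, or a user-initiated fetcher its operator treats as exempt, is unaffected by these rules, which is why the entry separates crawl-access requests from enforcement.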
12. Wed, Jun 10 2 revisions
  - Retrieval pipeline
    Retrieval pipeline
    Precision pass. Tightened the front-loading-position point: lexical and embedding retrieval do not rank a passage higher for appearing earlier on the page (embeddings encode word order, but that is not front-of-page reward), now welded to the chunk-survival point that page position affects which chunk a sentence lands in, not its ranking. Softened mechanism wording toward the survive-the-pipeline framing, hedged the indexing and agentic steps as common patterns rather than a universal architecture, and added a note that this describes a production pattern, not a vendor-confirmed one.
  - Retrieval pipeline
    Retrieval pipeline
    Initial publish. The cluster pillar for the retrieval pipeline: the index-retrieve-rerank-assemble-generate chain between a page and its answer, with an optional agentic loop. Leads with the lever a publisher actually has: you cannot tune any stage, only harden the passage you feed in so it survives all of them. Frames the relationship to content methods as a sequence, not a competition (retrievability gates, writing quality converts), and dispels the front-loading-wins-position myth: within-document position is not a retrieval weight, the real effect is within the context window the engine sets. Maps all 16 cluster terms by stage.
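The chunk-survival point in the Retrieval pipeline precision pass can be pictured with a toy chunker. The sketch below is an assumption for illustration only: it packs consecutive sentences greedily under a word budget, and neither the function `chunk_sentences` nor the budget of 16 words reflects any engine's documented behavior. It shows that page position decides which chunk a sentence lands in, and nothing in the chunker assigns earlier chunks a higher rank.

```python
def chunk_sentences(sentences: list[str], max_words: int) -> list[list[str]]:
    """Greedily pack consecutive sentences into chunks of at most max_words words."""
    chunks: list[list[str]] = []
    current: list[str] = []
    count = 0
    for sentence in sentences:
        words = len(sentence.split())
        # Start a new chunk when adding this sentence would exceed the budget.
        if current and count + words > max_words:
            chunks.append(current)
            current, count = [], 0
        current.append(sentence)
        count += words
    if current:
        chunks.append(current)
    return chunks


# Hypothetical page content, one string per sentence.
page = [
    "Sentence one opens the page with a short definition.",
    "Sentence two adds one supporting detail here.",
    "Sentence three starts a new idea.",
    "Sentence four closes the page out.",
]

chunks = chunk_sentences(page, max_words=16)

# Map each sentence to the index of the chunk it landed in.
chunk_of = {s: i for i, chunk in enumerate(chunks) for s in chunk}
```

Under these assumptions, a sentence that depends on its neighbors can be split away from them at a chunk boundary, which is why the entry recommends hardening each passage to stand on its own.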
13. Tue, Jun 9 4 revisions
  - Context assembly
    Retrieval pipeline
    Initial publish. Names the stage between retrieval and generation, selecting, ordering, and packing retrieved passages into the context window, as a distinct step in the retrieve-then-generate pipeline (RAG, Lewis et al. 2020). Positions it as where lost-in-the-middle (Liu et al. 2023) and context-rot effects bite: assembly order, not retrieval alone, decides whether a passage is actually used. The practitioner GEO consequence is to write self-contained passages that survive being placed at any position. Fills the context-assembly stage of the retrieval-pipeline cluster. Selected via the term-selection mechanism.
  - GEO content methods
    GEO content methods
    First citation, on Gemini. A 2026-06-09 Gemini answer (web search on) cited this pillar as a primary source with inline attribution, and surfaced the entry's null-result framing of authoritative tone. Gemini moves to cited; 1 of 5 engines now cited, four days from publish.
  - Position-Adjusted Word Count
    Methodology
    First Claude citation. A 2026-06-09 Claude answer (web search on) on Position-Adjusted Word Count cited this entry as a primary source, with inline attribution for the position-weighted word-count framing. Claude moves to cited; 1 of 5 engines now cited, four days from publish.
  - Prompt injection
    AI behavior
    Initial publish. Defines prompt injection as an attack class (adversarial text read as a command, not data) and separates direct injection (typed into the prompt) from indirect prompt injection (Greshake et al. 2023: planted in content the model retrieves). Framed defensively for AI search: the indirect variant rides the same retrieve-and-extract path GEO optimizes, so it is the security mirror of citability, not a tactic to deploy. Boundary with black-hat C-SEO drawn explicitly (this entry is the mechanism; that one is the practice domain). Selected via the term-selection mechanism.
14. Mon, Jun 8 1 revision
  - Retrievability
    Retrieval pipeline
    Initial publish. Imports the IR measure retrievability (Azzopardi & Vinay 2008) into AI search and names it as the upstream lever: whether the engine's retrieval step can find and pull a page into the answer at all, which content-method optimization sits downstream of and cannot fix. Bridges the GEO content methods pillar's conclusion (the durable lever is being retrievable and self-contained) to the retrieval-pipeline cluster. Academic origin flagged; the formula and the retrieval-bias framing are primary-source verified. Selected via the term-selection mechanism (academic-anchor; the geo-content-methods conclusion made into a term).
15. Fri, Jun 5 17 revisions
  - AI visibility
    Citation metrics
    Initial publish. Defines AI visibility as the practitioner and vendor umbrella term for brand presence across AI answers, and disambiguates it: it is not a single defined metric but a composite that bundles brand mentions, citation share, attribution rate, and often position and sentiment. Flags that vendor 'AI visibility scores' differ in what they bundle, so two tools' numbers are not comparable, and points readers to the measurable, vendor-neutral primitives.
  - AI visibility
    Citation metrics
    Peer-review pass. De-anchored the pricing from a fast-aging '$100 to $400' figure to a structural range (entry tiers in the low hundreds per month up to four figures for enterprise), with specifics moved to the footnote and dated. Added the boundary between the two umbrella entries in this cluster (AI citation metrics is the editorial organizer of the primitives; AI visibility is the market's composite name), and a note that probe-based visibility scores inherit fixed-prompt probing's blind spots. Softened 'moves the primitives' to 'can improve over time,' and made the lightweight proxy per-engine and per-prompt-set.
  - Authoritative Statement Strength
    GEO content methods
    Removed an unsourced figure from the shared Aggarwal footnote: a previously-asserted 'Average 31.4% in combinations' for Cite Sources was found unlocatable in the GEO paper on re-check (ar5iv full text + web search). The verbatim facts are unchanged: the paper names Cite Sources in its top-3 despite its 4th-place standalone score without stating why, and the only combination result it reports is Fluency plus Statistics outperforming any single method by more than 5.5%.
  - Brave Search AI citation
    Citation surfaces
    First confirmed ChatGPT citation: ChatGPT surfaced this entry as a GEO Glossary source for the 'what is Brave Search AI citation' definition query. The chatgpt status moves from untested to cited.
  - ChatGPT search citation
    Citation surfaces
    First confirmed ChatGPT citation: ChatGPT surfaced this entry among its sources (in the folded 'more sources' tray) for the 'what is ChatGPT search citation' definition query. The chatgpt status moves from untested to cited.
  - Citation Footprint
    Citation metrics
    Initial publish. A glossary-coined practitioner metric: citation footprint is the cumulative breadth of a site's AI-cited content (distinct pages ever cited, across engines, over time), kept deliberately separate from citation share (intensity at a point in time). Flags its own coinage and its main limitation, that the cumulative curve only grows and hides decay, so it should be read alongside citation velocity and rotation rather than on its own. Selected via the term-selection mechanism (the empty-coin experiment slot, paired with the citation-share control).
  - Citation precision and recall
    AI behavior
    Level reclassified intermediate->advanced under the published level rubric (prerequisite-knowledge depth): it engages primary research at the method level, aligning it with the other research-anchored AI-behavior entries (context rot, lost in the middle).
  - Citation probe protocol
    Methodology
    First confirmed Perplexity and Gemini citations: both surfaced this entry as a top source for the 'what is citation probe protocol' query, with Perplexity ranking it first and Gemini's source panel listing several GEO Glossary entries for it. The perplexity status moves from not-cited to cited and gemini from untested to cited.
  - Citation velocity
    Citation metrics
    Level reclassified advanced->intermediate under the glossary's published level rubric (prerequisite-knowledge depth; see /about): it is a citation metric readable from the foundational layer plus one new measure, not the retrieval-mechanics or primary-research grounding that defines advanced.
  - Cite Sources Optimization
    GEO content methods
    Removed an unsourced figure surfaced while building the GEO content methods pillar: a previously-asserted 'Average 31.4% in combinations' for Cite Sources (offered as the reason it appears in the paper's named top-3 over Fluency) was found unlocatable in the GEO paper on re-check via the ar5iv full text and a web search. The verbatim facts stand: the paper names Cite Sources in its top-3 despite its 4th-place standalone score, without stating why; the only combination result it reports is Fluency plus Statistics outperforming any single method by more than 5.5%.
  - Deep research mode
    Retrieval pipeline
    Initial publish. Defines deep research mode as the agentic, multi-step research feature (dozens to hundreds of searches, then a long cited report) that the major engines shipped near-simultaneously: Gemini Deep Research (Dec 2024), ChatGPT deep research and Perplexity Deep Research (Feb 2025), Grok DeepSearch (Feb 2025). Frames it as query fan-out and agentic retrieval taken to scale, and a distinct citation surface from quick-answer AI search.
  - Deep research mode
    Retrieval pipeline
    Completed the engine roster after peer review: added Microsoft's Researcher in Microsoft 365 Copilot (March 2025, distinct from the Think Deeper toggle) and Anthropic's Claude Research (April 2025), so all five tracked engines are covered, and re-scoped the December 2024 to February 2025 window as the first wave with Microsoft and Anthropic following that spring. Softened over-general claims (broadly similar rather than near-identical; products differ; Gemini Deep Research is not stated to use query fan-out, only Google's Deep Search is). Added a note that deep research is the most expensive, least-probed citation surface.
  - Generative search index
    Retrieval pipeline
    Level reclassified advanced->intermediate under the published level rubric (prerequisite-knowledge depth): it is a conceptual shorthand for the retrieval corpus that can be followed from the foundational layer, not an algorithm-level entry.
  - GEO content methods
    GEO content methods
    Initial publish. The cluster pillar for GEO content methods: a one-page evidence synthesis across the method entries. Leads with the honest finding the field rarely gives: the foundational GEO paper names only three methods as effective, rates authoritative tone and keyword stuffing as verbatim null or negative, and the 2025 multi-actor C-SEO Bench re-test found most content methods largely ineffective once adopted at scale, with a retrieval-position baseline about 7.6 times stronger. All numbers are the cluster's ar5iv-re-verified Table 1 values; the comparison table is the single source of truth for the spokes.
  - Position-Adjusted Word Count
    Methodology
    Initial publish. The metric behind nearly every GEO effect size: PAWC (Position-Adjusted Word Count) from Aggarwal et al. 2023, the position-weighted share of an AI answer's text drawn from a source. Leads with the distinction the headline numbers drop, that PAWC measures word-count share under single-actor 2023 conditions (GPT-3.5), not citation rate or ranking, with the 2025 C-SEO Bench citation-ranking contrast as the counter-anchor. Selected via the term-selection mechanism (dangling-concept scan: 11 entries re-explained PAWC inline with no page); formula primary-source verified against the ar5iv mirror of arXiv:2311.09735.
  - Statistical Density
    GEO content methods
    Level reclassified advanced->intermediate under the published level rubric (prerequisite-knowledge depth): it is a practitioner shorthand for a simple content property (adding quantitative specificity), readable from the foundational layer rather than an advanced-prerequisite topic.
  - Sub-passage extraction
    Retrieval pipeline
    Level reclassified advanced->intermediate under the published level rubric (prerequisite-knowledge depth): it is a passage-level content concept readable from the foundational layer, aligned with its sibling sub-document-retrieval rather than the retrieval-algorithm entries.
16. Thu, Jun 4 5 revisions
  - Answer Engine Optimization
    Umbrella terms
    Added the first causal-isolation evidence on how much AEO actually moves traffic: a 2026 single-domain natural experiment (Glasp.co) found ChatGPT referrals grew 5.7x overall but only about 1.82x after removing platform-level growth (untreated pages on the same domain grew 3.5x on tailwind alone), so headline AEO multiples overstate the causal effect. Framed as a single-site result, not a generalizable estimate.
  - Citation hallucination
    AI behavior
    Initial publish. Defines citation hallucination as an AI citing a source that does not exist, and separates it from the two adjacent failure modes already in the glossary: citation precision (a real source that does not support the claim) and hallucination grounding (an answer with no retrieval at all). Anchored in the Walters & Wilder 2023 fabrication-rate study and the Mata v. Avianca sanctions.
  - Citation hallucination
    AI behavior
    Added the grounded-AI-search side: a Tow Center study of eight grounded tools found wrong citations in over 60% of responses, with over half of Gemini and Grok 3 responses citing fabricated or broken URLs. Introduced the hybrid failure mode it surfaces (the source is real but the link is fabricated, broken, or points to a syndicated copy), distinct from a wholly non-existent source. Softened claims that grounded retrieval reduces the problem (it does not remove broken-link or misattribution failures), broadened the definition to cover fake cases and quotations, and corrected the Mata sanction to the two lawyers and their firm.
  - Query fan-out
    Retrieval pipeline
    Initial publish. Defines query fan-out as Google's documented AI Mode / Deep Search retrieval technique (one query split into parallel sub-queries, then synthesized), and separates that vendor-documented mechanism from the weaker practitioner claim that fan-out explains why AI engines cite pages outside Google's top 10. Fills a long-standing internal-link gap: the term was referenced across many entries with no page to point to.
  - Query fan-out
    Retrieval pipeline
    Corrected an attribution error: the 'cited from beyond the top 10' pattern is measured for the standalone assistants (ChatGPT, Gemini, Copilot, Perplexity), not for Google's AI Overview, whose citations come disproportionately from the top 10 (about 38% in Ahrefs' 2026 update, down from 76% in 2025). Turned that contrast into a finding (AI Overview hugs organic ranking while independent assistants nearly decouple), softened the AI Overview claim to a 'correlate', noted that Ahrefs' own opening sentence mis-attributes its 12% headline, and added the 15,000-prompt dataset size.
17. Wed, Jun 3 2 revisions
  - Cite-ability
    Citation metrics
    Claude citation confirmed for the definition query ('What is cite-ability in AI search?'). The entry surfaced as a cited source in Claude's web-search answer, alongside several third-party glossaries. Third engine to cite this entry (after ChatGPT and Perplexity); 3 of 5 tested engines now cite it. Notable because cite-ability is a practitioner-coined term with no upstream primary paper, so Claude cites this explanation directly rather than reaching past it to a canonical source.
  - LLM Optimization (LLMO)
    Umbrella terms
    Added a side-by-side disambiguation of LLMO against GEO, AEO, and AIO (the AI-search acronyms readers most confuse), plus two FAQ entries on whether LLMO and GEO differ in practice; the umbrella framing is hedged as contested to match the GEO and AIO entries. Also fixed two sourcing errors: the data-selection citation pointed to arXiv:2402.16827 (a pretraining-scoped survey) while the text described instruction tuning, now repointed to arXiv:2402.05123; and the LangChain default chunk size was off ~4x (RecursiveCharacterTextSplitter defaults to 4000 characters, not ~250 tokens).
18. Tue, Jun 2 14 revisions
  - AI access control
    Infrastructure
    Peer-review pass (both reviewers found the hub sound). Clarified that 'access control' is meant in a broad publisher-policy sense (mostly signals and preferences, not hard technical controls) and that the surface spans control and discovery; relabeled IndexNow as a discovery push and noted llms.txt and IndexNow are in the map for context, not because they restrict access; named the crawler types (training, retrieval, user-triggered); added that content which must not be used should not be published publicly; added a worked four-signal example; and added primary-source anchors for Web Bot Auth, IndexNow, and llms.txt.
  - AI access control
    Infrastructure
    Initial publish: AI access control is the umbrella for the signals a site uses to govern how AI systems access and use its content. Joins the infrastructure cluster as the hub mapping four distinct, commonly-conflated questions to four mechanisms: robots.txt (fetch access), llms.txt (guidance), AIPREF / Content-Usage (usage preference), and Web Bot Auth (identity), plus IndexNow on discovery. Honest framing: most are voluntary, emerging signals, not enforcement; blocking crawling, opting out of training, and verifying a bot are different actions, and the hub's value is keeping them separate rather than collapsing them into one 'AI opt-out.'
  - AI crawler bots
    Infrastructure
    Added a cross-link to the new AI access control hub entry, which positions AI crawler bots as the agents that the access-and-usage signals (robots.txt, llms.txt, AIPREF, Web Bot Auth) are aimed at, so the control surface reads as a whole rather than one signal at a time.
  - AIPREF (AI usage preferences)
    Infrastructure
    Peer-review pass. Fixed the attachment draft's title ('Associating AI Usage Preferences with Content in HTTP', not 'Attaching...') and reframed the maturity reasoning: both drafts are adopted, still-active working-group documents, and the attachment draft's lapsed latest revision is the routine six-month Internet-Draft auto-expiry, not abandonment (it carries an Aug 2026 milestone). The watch-don't-deploy conclusion stands; the reasoning is now pre-standardization, not 'expired.' Also softened framing (effort-to-standardize vs finished standard; complementary to rather than successor to robots.txt).
  - AIPREF (AI usage preferences)
    Infrastructure
    Initial publish: AIPREF is the IETF AI Preferences working group standardizing a Content-Usage signal (HTTP header or robots.txt rule) for declaring how content may be used by AI systems, using a small vocabulary (train-ai, search; values y/n). Joins the infrastructure cluster as the preference layer complementing Web Bot Auth (identity), llms.txt (guidance), and robots.txt (access). Honest maturity framing: the vocabulary draft is an active Proposed Standard, but the attachment-mechanism draft (last revised October 2025) expired in May 2026 with no successor. AIPREF declares a preference; it does not authenticate or enforce compliance.
  - Black-hat C-SEO
    GEO content methods
    Peer-review pass; both reviewers green-lit the defensive framing, no factual errors. Tightened scope and hedging: clarified that the prompt injection covered here is the ranking and citation-manipulation kind, distinct from agent-security prompt injection; narrowed the 'white-hat methods show limited effect' claim to the content-side methods C-SEO Bench tested, not all content quality, authority, or technical SEO; softened the platform-terms claim to 'broadly treated as abuse in major AI platform policies'; and linked hallucination grounding, since hidden instructions can poison retrieval context even when the source is accessible.
  - Black-hat C-SEO
    GEO content methods
    Initial publish: black-hat C-SEO is the adversarial manipulation of AI ranking/citation behavior through deception (most notably hidden prompt injection), versus white-hat C-SEO, which improves a page's genuine clarity and sourcing. Joins the GEO-content-methods cluster as the definitional boundary entry, framed defensively rather than as a how-to. Negative-result core: even the white-hat methods C-SEO Bench tested show limited measured lift, so high-risk manipulation is a weak bet, and adversarial-ranking research like StealthRank optimizes for evading detection, so detection, model updates, and platform policies all move against it.
  - Chunking
    Retrieval pipeline
    Peer-review pass. Corrected the Status framing that treated contextual chunking (Anthropic) as the single 2024-2026 response to context loss; it is now presented as one of several approaches, alongside late chunking (Jina AI: embed the whole document, then split) and semantic chunking, with the late-chunking paper added as a primary source so the section no longer rests on a single vendor. Tightened over-absolute claims (chunk as the retrieved unit, every production RAG system) to reflect that not all retrieval works by text chunking. Added a boundary note distinguishing chunking (splitting that produces units) from sub-document retrieval (selecting among the units).
  - Chunking
    Retrieval pipeline
    Initial publish: chunking is the preprocessing step that splits a document into smaller segments (chunks, typically a few hundred tokens) so each can be embedded, indexed, and retrieved on its own. Joins the retrieval-pipeline cluster as the front-of-pipeline step many existing entries reference but none defined. Frames the publisher boundary: the retrieval operator sets chunk size, split points, and whether context is prepended; the publisher controls only whether each passage stays interpretable once cut out of the page. Vendor-reported contextual-chunking lift is one corpus's result, not a universal benchmark.
  - Citation precision and recall
    AI behavior
    Added a note to the source footnote distinguishing this paper, 'Evaluating Verifiability in Generative Search Engines' (arXiv:2304.09848), from the other Liu et al. 2023 paper now in the glossary, 'Lost in the Middle' (arXiv:2307.03172). The two share the same first and senior author but are different works; both should be cited by short title rather than a bare author-year to prevent conflation.
  - Context rot
    AI behavior
    Initial publish: context rot is the empirically observed degradation in LLM output quality as input context grows longer, even on simple tasks and well below the maximum context window (Chroma 2025, 18 models). Joins the ai-behavior cluster. Distinct from context-window overflow, and disambiguated from lost in the middle: context rot is the length axis (how much context), lost in the middle is the positional axis (where in it); this glossary frames them as cousins. Publisher framing: context length is pipeline-controlled, so the response is concise high-signal passages, not verbose padding.
  - LLMS.txt
    Infrastructure
    Added a cross-link to the new AI access control hub entry, which maps llms.txt as the guidance layer (what AI is pointed to read) alongside robots.txt, AIPREF, and Web Bot Auth as the distinct, commonly-conflated signals a site uses to govern how AI systems access and use its content.
  - Lost in the Middle
    AI behavior
    Verified the venue against the ACL Anthology record (Transactions of the ACL, Vol. 12, 2024, pp. 157-173) and disambiguated the source from the other Liu et al. 2023 paper in the glossary, 'Evaluating Verifiability in Generative Search Engines' (arXiv:2304.09848), which shares the same first and senior author but is a different work; both are now cited by short title rather than a bare author-year. Also tightened the framing of front-loading as one evidence-backed mechanism among several, and clarified that BM25 and dense-similarity scoring do not weight within-body position even where production systems layer on field or heading boosts.
  - Web Bot Auth
    Infrastructure
    Added a cross-link to the new AI access control hub entry, which maps Web Bot Auth as the identity layer (who is asking) alongside robots.txt, llms.txt, and AIPREF as the distinct, commonly-conflated signals a site uses to govern how AI systems access and use its content.
19. Mon, Jun 1 6 revisions
  - Citation precision and recall
    AI behavior
    Perplexity and Gemini citations confirmed for the definition query ('What are citation precision and citation recall in AI search engines?'), both surfacing 'Citation precision and recall - GEO Glossary' as a primary source. Combined with the same-day ChatGPT citation, this entry is now cited by three engines (ChatGPT + Perplexity + Gemini).
  - Citation precision and recall
    AI behavior
    ChatGPT citation confirmed for the definition query ('What are citation precision and citation recall in AI search engines?'). The entry surfaced as a 'GEO Glossary' source in ChatGPT search, alongside the Liu et al. verifiability paper and other references. First confirmed citation for this entry on any engine.
  - Inverted index
    Retrieval pipeline
    ChatGPT citation confirmed for the definition query ('What is an inverted index in information retrieval?'). The entry surfaced as a 'GEO Glossary' source in ChatGPT search, alongside Wikipedia and GeeksforGeeks. First confirmed citation for this entry on any engine.
  - Lost in the Middle
    AI behavior
    Initial publish: Lost in the middle is the LLM tendency, documented by Liu et al. 2023, to use information at the start and end of a long context more reliably than the middle (a U-shaped accuracy curve). Joins the ai-behavior cluster as the underlying mechanism behind the cluster's many 'front-load your key content' recommendations, and explicitly distinguishes context-position effects (LLM-side, publisher cannot control) from retrieval-side position weighting (which does not exist in BM25 or embedding ranking). Counter-evidence note: the 2023 finding is partially mitigated but not eliminated by 2024-2026 long-context models.
  - Passage-level optimization
    Retrieval pipeline
    Perplexity citation confirmed for the definition query ('What is passage-level optimization for AI search?'). 'Passage-level optimization - GEO Glossary' surfaced as a primary source. First confirmed citation for this entry on any engine.
  - Sub-document retrieval
    Retrieval pipeline
    Perplexity citation confirmed for the definition query ('What is sub-document retrieval in RAG?'). 'Sub-document retrieval - GEO Glossary' surfaced as a primary source alongside RAG-technique sources. First confirmed citation for this entry on any engine.
May 2026 262 revisions
1. Sun, May 31 5 revisions
  - Authoritative Statement Strength
    GEO content methods
    ChatGPT citation confirmed for the definition query ('What is authoritative statement strength in AI search content optimization?'), surfacing as a 'GEO Glossary' source. This completes four-engine coverage: ChatGPT, Perplexity, Claude, and Gemini all cite this entry, the first term on the site confirmed cited by four distinct engines.
  - Authoritative Statement Strength
    GEO content methods
    Gemini citation confirmed for the definition query ('What is authoritative statement strength in AI search content optimization?'). Gemini surfaced the entry as a 'GEO Glossary' source and reproduced its framing (the paper-verbatim null quote plus the +11.8% raw-vs-statistically-null distinction). Third engine to cite this entry (Perplexity + Claude + Gemini), making it the glossary's most broadly-cited term.
  - C-SEO Bench
    GEO content methods
    Same-day revisions. Method names switched to the paper's own terminology (Simple Language / Citations / Quotes / Statistics rather than Aggarwal's Easy-to-Understand / Cite Sources / Quotation Addition / Statistics Addition); Aggarwal labels kept in parentheses. Added a second Discussion quote where the paper notes Aggarwal's own PAWC data, on inspection, also points to decreasing scores. Added specific evaluation LLMs (gpt-4o-mini + claude-3-5-haiku) as a concrete anchor for the more-production-realistic framing. Clarified that the paper benchmarks nine C-SEO methods plus a separate baseline (ten items total).
  - C-SEO Bench
    GEO content methods
    Initial publish: C-SEO Bench is the Puerto et al. 2025 NeurIPS Datasets & Benchmarks paper introducing the first benchmark to evaluate Conversational SEO methods across multiple domains, tasks, and competing-actor counts. Headline finding: 'most current C-SEO methods are largely ineffective,' and a traditional retrieval-ranking SEO baseline was measured 7.6× more effective than the best C-SEO method tested on the retail domain. Serves as the counter-evidence anchor for the geo-content-methods cluster, complementing the Aggarwal 2023 entries with multi-actor findings that approximate production more closely than single-actor synthetic tests.
  - Citation match rate
    Citation metrics
    Gemini citation confirmed for the definition query ('What is citation match rate in AI search?'). The entry surfaced as a 'GEO Glossary' source in Gemini's answer, cited inline in several places. Third engine to cite this entry (after ChatGPT and Claude). Notable: citation match rate is a practitioner-coined metric with no vendor-canonical primary paper, so Gemini cites this explanation rather than an upstream source.
2. Sat, May 30 26 revisions
  - Authoritative Statement Strength
    GEO content methods
    Claude citation confirmed for two of this entry's tracked queries: the definition query ('What is authoritative statement strength in AI search content optimization?') and the Aggarwal vendor-research query ('How did Aggarwal 2023 rank Authoritative tone among content methods?'). The entry surfaced as a primary source in Claude's desktop web-search answers. Same-day cross-engine signal: Perplexity citations were also confirmed today across the same probes plus the operational variant.
  - Authoritative Statement Strength
    GEO content methods
    Perplexity citation confirmed for all three of this entry's tracked queries: the definition query ('What is authoritative statement strength in AI search content optimization?'), the operational query ('Does using authoritative tone increase ChatGPT citation rate?'), and the vendor-research query ('How did Aggarwal 2023 rank Authoritative tone among content methods?'). The entry surfaced as a primary source with inline citations in all three Perplexity answers. First multi-probe citation event for this entry, one day after the lede was rewritten to lead with the paper's verbatim 'no significant improvement' framing.
  - Authoritative Statement Strength
    GEO content methods
    PAWC labeling sweep. Aggarwal footnote now labels values as 'Table 1 main GEO-bench' with the standalone-vs-named-top-3 distinction explicit. Cluster's prior 'top 4' framing (Quotation, Statistics, Fluency, Cite Sources) replaced with the paper's verbatim named top-3 (Cite Sources, Quotation, Statistics) for combined-method strength; Fluency's 3rd-place standalone PAWC is noted, but Fluency is not framed as part of the paper's named effective group. Lede and FAQ #2 updated to compare Authoritative's null against the paper's named top-3. Table 5 Perplexity.ai per-engine caveat added.
  - Authoritative Statement Strength
    GEO content methods
    Primary-source re-verification of the cluster's shared Aggarwal Table 1 PAWC numbers + verbatim quotes against the ar5iv mirror of arXiv:2311.09735. All Table 1 PAWC values, Table 1 caption verbatim, Section 4 prose, the named top-3 quote, and Table 5 Perplexity per-engine numbers (including Keyword Stuffing 21.9 with paper prose 'performs 10% worse than the baseline') confirmed. Aggarwal footnote appended with the re-verification note; no body changes.
  - Citation match rate
    Citation metrics
    Claude citation confirmed for the definition query ('What is citation match rate in AI search?'). The entry surfaced as a primary cited source in Claude's desktop answer. Cross-engine consistency: ChatGPT first cited this entry on 2026-05-17; Claude now also cites it for the same probe.
  - Citation share
    Citation metrics
    Claude citation confirmed for the definition query ('What is citation share in AI search?'). The entry surfaced as a primary cited source in Claude's desktop answer. First confirmed Claude citation for this entry.
  - Cite Sources Optimization
    GEO content methods
    Perplexity citation confirmed for the Aggarwal vendor-research probe ('What did Aggarwal 2023's GEO paper say about Cite Sources as a content method?'). 'Cite Sources Optimization | GEO Glossary' surfaced as one of 11 sources, with the sibling Quotation Addition entry also cited inline. Citation comes one day after the entry's lede was rewritten to lead with the paper's verbatim named top-3 framing.
  - Cite Sources Optimization
    GEO content methods
    PAWC labeling sweep. Lede now leads with the paper's verbatim named top-3 (Cite Sources, Quotation Addition, Statistics Addition) rather than the cluster's prior 'one of four top-performing methods' framing. The paper itself names Cite Sources in the top-3 for combined-method strength even though standalone Table 1 PAWC ranks Cite Sources 4th (the paper notes Cite Sources is standalone 8% below Quotation but boosts to Average 31.4% in combinations). Aggarwal footnote now labels values as 'Table 1 main GEO-bench' and adds Table 5 (Perplexity.ai) per-engine caveat.
  - Cite Sources Optimization
    GEO content methods
    Primary-source re-verification of the cluster's shared Aggarwal Table 1 PAWC numbers + verbatim quotes against the ar5iv mirror of arXiv:2311.09735. All Table 1 PAWC values, Table 1 caption verbatim, Section 4 prose, the named top-3 quote, and Table 5 Perplexity per-engine numbers (including Keyword Stuffing 21.9 with paper prose 'performs 10% worse than the baseline') confirmed. Aggarwal footnote appended with the re-verification note; no body changes.
  - Cite-ability
    Citation metrics
    Perplexity citation reconfirmed for the definition query ('What is cite-ability in AI search?'). 'Cite-ability | GEO Glossary' continues to surface as a primary source ten days after the first observed citation on 2026-05-20, with inline citations across the answer body.
  - FAQ Schema
    Schema cluster
    Added the primary-source URL for the August 2023 Google Search Central blog post 'Changes to HowTo and FAQ rich results' to the deprecation footnote (the prior text described the August 2023 restriction without linking to the announcement post itself). The footnote now cites both deprecation phases by primary source: the August 2023 restriction post and the May 7, 2026 full-deprecation notice on Google's FAQPage structured-data documentation page.
  - Fluency Optimization
    GEO content methods
    Claude citation confirmed for the Aggarwal vendor-research query ('What did Aggarwal 2023's GEO paper say about Fluency Optimization as a content method?'). The entry surfaced as a primary cited source in Claude's desktop answer. First confirmed Claude citation for this entry.
  - Fluency Optimization
    GEO content methods
    PAWC labeling sweep. Critical reframing: lede previously called Fluency Optimization 'one of the four top-performing single methods', but the paper's verbatim named top-3 (Cite Sources, Quotation, Statistics) does NOT include Fluency despite its 3rd-place standalone Table 1 PAWC. The paper's strongest framing of Fluency is the §5.3 Fluency-plus-Statistics combination pair (+5.5% over any single method), not as a named top single-method intervention. Aggarwal footnote labels values as 'Table 1 main GEO-bench' + Table 5 Perplexity.ai per-engine caveat.
  - Fluency Optimization
    GEO content methods
    Primary-source re-verification of the cluster's shared Aggarwal Table 1 PAWC numbers + verbatim quotes against the ar5iv mirror of arXiv:2311.09735. All Table 1 PAWC values, Table 1 caption verbatim, Section 4 prose, the named top-3 quote, and Table 5 Perplexity per-engine numbers (including Keyword Stuffing 21.9 with paper prose 'performs 10% worse than the baseline') confirmed. Aggarwal footnote appended with the re-verification note; no body changes.
  - Keyword Stuffing
    GEO content methods
    PAWC labeling sweep (same-day after publish). Inline Table 1 is now labeled 'main GEO-bench', and relative gains are framed as 'mathematically derived' rather than implied to be the paper's headline numbers; the paper itself frames its top-3 (Cite Sources, Quotation, Statistics) at 30-40%. The cluster's prior 'four top-performing levers' framing is replaced with the paper's verbatim named top-3 in How-to-apply and How-it-relates. Fluency clarified as 3rd standalone but not in the paper's named top-3 (strongest in combined-method experiment instead). Footnote adds Table 5 per-engine caveat.
  - Keyword Stuffing
    GEO content methods
    Cross-benchmark scoping polish (same-day after PAWC sweep). Added 'under the tested public benchmarks' qualifier so the conclusion does not over-generalize beyond what Aggarwal 2023 and C-SEO Bench 2025 directly measured. Replaced 'the only two public benchmarks' with 'the two public benchmarks this entry cites' to keep the framing time-bounded as new benchmarks appear. Body and C-SEO Bench footnote now make explicit that C-SEO Bench measures citation ranking, not Aggarwal's PAWC citation-share metric, making it corroborating counter-evidence rather than a direct PAWC replication.
  - Keyword Stuffing
    GEO content methods
    Epistemic re-emphasis + cluster-wide PAWC primary-source re-verification. Description, lede, and Status now lead with paper-verbatim 'little to no performance improvement' (Section 4) and 'performs 10% worse than the baseline' (Table 5 Perplexity prose); the raw -8.7% / PAWC 17.8 / below-baseline framing is subordinated as a transparency check. Mirrors the pattern applied to authoritative-statement-strength. Body adds Perplexity Table 5 prose escalation (KS 21.9 vs baseline 24.0). Aggarwal footnote across 6 anchor entries appended with primary-source re-verification note vs the ar5iv mirror of arXiv:2311.09735.
  - LLMS.txt
    Infrastructure
    Corrected the Mintlify blog post URL and announcement date. The Sources block referenced 'mintlify.com/blog/llmstxt' (404) with the date November 14, 2024; the actual post is 'Simplifying docs for AI with /llms.txt' at 'www.mintlify.com/blog/simplifying-docs-with-llms-txt', published November 20, 2024. Body sentence about the Mintlify platform-wide rollout date updated to match. Caught by the new site-wide link-audit pass across all 226 frontmatter source URLs.
  - Quotation Addition
    GEO content methods
    Claude citation confirmed for the Aggarwal vendor-research query ('What did Aggarwal 2023's GEO paper say about Quotation Addition as a content method?'). The entry surfaced as a primary cited source in Claude's desktop answer. First confirmed Claude citation for this entry.
  - Quotation Addition
    GEO content methods
    PAWC labeling sweep. Aggarwal footnote now explicitly labels the values as 'Table 1 main GEO-bench' and includes all 9 + baseline. Lede reframed from '~43% gain' to 'mathematically derived +42.6%' and notes paper's headline range is 30-40%. Paper's verbatim named top-3 (Cite Sources, Quotation Addition, Statistics Addition) now surfaced; Fluency Optimization's 3rd-place standalone PAWC position no longer presented as 'top-4 effective' because the paper itself does not name it in the top-3. Table 5 (Perplexity.ai) per-engine caveat added (different baseline 24.0, best method +22%).
  - Quotation Addition
    GEO content methods
    Primary-source re-verification of the cluster's shared Aggarwal Table 1 PAWC numbers + verbatim quotes against the ar5iv mirror of arXiv:2311.09735. All Table 1 PAWC values, Table 1 caption verbatim, Section 4 prose, the named top-3 quote, and Table 5 Perplexity per-engine numbers (including Keyword Stuffing 21.9 with paper prose 'performs 10% worse than the baseline') confirmed. Aggarwal footnote appended with the re-verification note; no body changes.
  - Statistical Density
    GEO content methods
    Claude citation confirmed for the Aggarwal vendor-research query ('What did Aggarwal 2023's GEO paper say about Statistics Addition as a content method?'). The entry surfaced as a primary cited source in Claude's desktop answer; it also surfaced as the cited source for the parallel Cite Sources Aggarwal query, a cross-entry signal.
  - Statistical Density
    GEO content methods
    PAWC labeling sweep. Aggarwal footnote now explicitly labels values as 'Table 1 main GEO-bench' and includes all 9 + baseline. Surfaces the paper's verbatim named top-3 (Cite Sources, Quotation Addition, Statistics Addition) at the 30-40% range, distinct from the standalone PAWC ranking that placed Fluency in the cluster's prior 'top-4' framing. Statistics Addition is in both rankings. Table 5 (Perplexity.ai) per-engine caveat added (different baseline 24.0, best method +22%).
  - Statistical Density
    GEO content methods
    Primary-source re-verification of the cluster's shared Aggarwal Table 1 PAWC numbers + verbatim quotes against the ar5iv mirror of arXiv:2311.09735. All Table 1 PAWC values, Table 1 caption verbatim, Section 4 prose, the named top-3 quote, and Table 5 Perplexity per-engine numbers (including Keyword Stuffing 21.9 with paper prose 'performs 10% worse than the baseline') confirmed. Aggarwal footnote appended with the re-verification note; no body changes.
  - Web Bot Auth
    Infrastructure
    Initial publish. Web Bot Auth is an emerging IETF-track standard for cryptographically verifying bot identity, built on RFC 9421 HTTP Message Signatures (Proposed Standard, Feb 2024) plus two active IETF drafts. Each request is signed with the bot's Ed25519 key; the verifier fetches the public key from /.well-known/http-message-signatures-directory as a JWK Set. Cloudflare and Akamai are documented early implementers; most AI-engine crawler traffic remains unsigned as of 2026. Addresses the crawler-controllability gap, particularly for AI engines like Grok where observed traffic does not surface documented user agents.
  - Web Bot Auth
    Infrastructure
    Mechanism + adoption fact-check (same-day after publish). §How it works now lists three required headers (Signature-Agent, Signature-Input, Signature); Signature-Agent (not Signature-Input) carries the bot's key directory URL that the verifier reads to fetch the public key. Status in 2026 updated to the agent-first / crawler-later split: Google's experimental Google-Agent signs at agent.bot.goog and OpenAI's ChatGPT agent signs at chatgpt.com, while Googlebot, GPTBot, and OAI-SearchBot remain unsigned. Added IETF draft-meunier-web-bot-auth-architecture (Cloudflare + Google co-authored) and AWS Bedrock AgentCore implementer footnotes.
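The header roles described in the two Web Bot Auth revisions above can be sketched as a pre-check that runs before any cryptographic verification. This is a minimal sketch, not the standard's reference implementation: the helper name `key_directory_url` and the domain `bot.example` are hypothetical, and it assumes that Signature-Agent carries a quoted HTTPS origin with the key directory at the well-known path named in the entry. It does not verify the signature itself.

```python
# Hypothetical helper: confirms the three headers named in the entry are
# present and derives the key directory URL from Signature-Agent.
# It does NOT perform Ed25519 signature verification.
REQUIRED_HEADERS = ("Signature-Agent", "Signature-Input", "Signature")
DIRECTORY_PATH = "/.well-known/http-message-signatures-directory"


def key_directory_url(headers):
    """Return the JWK Set URL to fetch, or None if the request is unsigned."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    if any(name.lower() not in lowered for name in REQUIRED_HEADERS):
        return None
    # Assumption: the header value is a quoted origin such as "https://bot.example".
    origin = lowered["signature-agent"].strip().strip('"').rstrip("/")
    if not origin.startswith("https://"):
        return None
    return origin + DIRECTORY_PATH


signed_request = {
    "Signature-Agent": '"https://bot.example"',
    "Signature-Input": "sig1=(placeholder)",
    "Signature": "sig1=:placeholder:",
}
print(key_directory_url(signed_request))
print(key_directory_url({"User-Agent": "GPTBot"}))
```

A request missing any of the three headers is treated as unsigned, which matches the entry's note that most crawler traffic still arrives without signatures.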
3. Fri, May 29 6 revisions
  - Authoritative Statement Strength
    GEO content methods
    Critical paper-misrepresentation fix. Entry previously framed the Authoritative result as a '+11.8% measurable modest lift'. But Aggarwal Section 4 verbatim: 'one would expect a more persuasive and authoritative tone can boost visibility. However, to the contrary we find no significant improvement.' The +11.8% PAWC is a raw number; the paper itself frames it as null and reports no p-values. Description, metaDescription, lede, Status, C-SEO Bench section, and FAQs rewritten to lead with the paper-verbatim null. Raw PAWC retained for transparency. Entry was misrepresenting peer-reviewed research; correction restores cluster discipline.
  - Citation precision and recall
    AI behavior
    Initial publish. Codifies citation precision (74.5% baseline) and citation recall (51.5% baseline) as paired model-behavior metrics distinct from publisher-visibility metrics like attribution rate. Both grounded in Liu, Zhang, Liang 'Evaluating Verifiability in Generative Search Engines' (EMNLP Findings 2023, arXiv:2304.09848), a human audit of four engines. Entry covers framework, 2023 baseline, limitations (NeevaAI shut down 2023; Bing Chat renamed to Copilot), and how to add claim-alignment recording to citation probe protocol. Joins ai-behavior cluster as third anchor.
  - Citation precision and recall
    AI behavior
    Same-day peer-review revision. Critical lede fix: asymmetry was reversed (precision 74.5% > recall 51.5% means engines were better at faithful citation than at citing every claim-bearing sentence). Critical fact fix: removed unverified Profound per-response density numbers; replaced with the study's actually-reported source-concentration patterns. Added Li & Sinnamon 2024 + Profound footnotes to Sources. Softened the precision-failure framing, precision-recall trade-off, and hallucination-grounding definition. NeevaAI dates made precise. 2x2 labels improved. Back-links added to attribution-rate, cite-ability, pillar.
  - Citation probe protocol
    Methodology
    Initial publish. Codifies the citation probe protocol the cluster has been using implicitly across the six citation-metrics anchors. Six protocol components: query design (fixed 10-query prompt set, 8-week freeze), cadence (weekly main, monthly aggregation, quarterly rotation), engine coverage (per-surface separate probes), recording schema, 3-axis disambiguation (SEARCH APPEARANCE / PAGES / QUERIES tabs in GSC), and signal-vs-noise rules (k-anonymity inference, N=1 hedging, cross-engine triangulation). Methodology cluster sibling to external-traffic-disambiguation.
  - Citation probe protocol
    Methodology
    Same-day revision: upstream/downstream boundary cleanup. §5 originally borrowed GSC's 3-axis tabs which are downstream click-side tools. Rewritten to cover three real probe-side needs: multi-surface sub-path attribution, source-list URL selection, query-reformulation tracking. §6 k-anonymity (GSC concept) removed; reformulation-noise rule added. FAQ #4 replaced with two-probe-different-result handling. Internal project codename removed. §4 verbatim and screenshot downgraded to recommended. §2 cadence framed as glossary default. 12-surface list inline-linked. SaaS-skip reframed for small vs enterprise.
  - Keyword Stuffing
    GEO content methods
    Initial publish. Documents the Aggarwal et al. 2023 GEO paper's flagship negative result: Keyword Stuffing scored PAWC 17.8 vs baseline 19.5 (a relative change of -8.7%), the only one of 9 tested methods to fall below baseline. Paper verbatim: 'little to no performance improvement on Generative Engine's responses.' Joins geo-content-methods cluster as 6th Aggarwal method covered and the cluster's only negative-result entry. C-SEO Bench 2025 confirms the null/negative finding under multi-actor production-realistic conditions. Primary counter-evidence anchor against the SEO claim that keyword optimization transfers to generative engines.
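The 'mathematically derived' relative figures quoted across the PAWC entries follow from a simple percent-change calculation against the baseline. The sketch below is illustrative only: the function name `relative_change` is hypothetical, and the inputs are the PAWC values quoted elsewhere in this changelog (Table 1 baseline 19.5, Keyword Stuffing 17.8, Quotation Addition 27.8; Table 5 baseline 24.0, Keyword Stuffing 21.9).

```python
def relative_change(score, baseline):
    """Percent change of a PAWC score relative to its baseline."""
    return (score - baseline) / baseline * 100


# PAWC values as quoted in the changelog entries, not re-read from the paper.
MAIN_BASELINE = 19.5        # Table 1, main GEO-bench
PERPLEXITY_BASELINE = 24.0  # Table 5, Perplexity.ai

print(f"Keyword Stuffing, Table 1: {relative_change(17.8, MAIN_BASELINE):+.1f}%")
print(f"Quotation Addition, Table 1: {relative_change(27.8, MAIN_BASELINE):+.1f}%")
print(f"Keyword Stuffing, Table 5: {relative_change(21.9, PERPLEXITY_BASELINE):+.2f}%")
```

The first two lines reproduce the -8.7% and +42.6% figures cited in the Keyword Stuffing and Quotation Addition entries.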
4. Thu, May 28 37 revisions
  - AI citation metrics
    Citation metrics
    Expanded the surface-family list from 8 to 12 anchors (added Brave Search, Grok, DuckDuckGo AI, and Meta AI, all shipped today). Added a Cross-surface quick reference section: comparison table across the 12 surfaces on four axes (index provenance, crawler discipline, default citation rendering, structurally unique feature) plus five cross-cutting observations. The table surfaces structural differences: DuckDuckGo's 1-2 source slot vs Perplexity's ~10-20 (10x tighter zero-sum competition), Grok's effectively unenforceable robots.txt, Meta's licensing-tiered citation behavior, non-web source categories (Grok X posts, Meta Instagram/Facebook/Threads).
  - AI crawler bots
    Infrastructure
    Added Brave, DuckDuckGo (DuckAssistBot), and xAI (Grok) rows to the known-crawlers table. Brave runs its own crawler; DuckAssistBot drives DuckDuckGo's Search Assist and respects robots.txt; xAI documents GrokBot, xAI-Grok, and Grok-DeepSearch but independent research finds these are not observed in actual server logs while Grok retrieval traffic arrives from rotating datacenter / proxy IPs with spoofed Chrome and Safari UAs. Added a callout that Grok is the cluster outlier on crawler controllability: excluding Grok requires WAF / network-level enforcement, not robots.txt user-agent rules.
  - AI dev tool citations
    Citation surfaces
    Added an At-a-glance summary table at the top of the entry, matching the section structure used by sibling citation-surface entries. Load-bearing fact: AI dev tools do code-context retrieval, not web search; this is a structurally different measurement surface than consumer AI search, and most retrieval is against code repositories and workspace files rather than the open web. Per-tool variance is too high to aggregate into one AI dev tool citation rate; each tool needs its own probe configuration. Metadata dates synced to 2026-05-28.
  - AI Mode
    Citation surfaces
    Added an At-a-glance summary table at the top of the entry, matching the section structure used by sibling citation-surface entries. Load-bearing fact: AI Mode is Google's conversational counterpart to AI Overview, same Google index family, but the heavier fan-out query reformulation produces a different cited-URL set than AI Overview. Track AI Mode and AI Overview as separate surfaces, not a single Google AI rate. Glance surfaces Google-Extended scope (Gemini and Vertex AI training and grounding; not AI Mode eligibility). Metadata dates synced to 2026-05-28.
  - AI Overview citation
    Citation surfaces
    Added an At-a-glance summary table at the top of the entry, matching the section structure used by sibling citation-surface entries. Load-bearing fact: a site cannot opt out of AI Overview without de-indexing from Google Search. AI Overview uses Googlebot (the main Google index), not Google-Extended (which only controls Gemini and Vertex AI training and grounding). Glance also surfaces the cluster-unique property that citation match rate effectively equals attribution rate on this surface (every source-panel entry is linked). Metadata dates synced to 2026-05-28.
  - Brave Search AI citation
    Citation surfaces
    Initial publish. Brave Search is one of the few major search surfaces operating an independent index. AI citation on Brave is therefore a structurally distinct citation surface from Bing-grounded (Microsoft Copilot) or Google-family (AI Overview, AI Mode, Gemini). Four AI features tracked: AI Answers (concise summary + source references; evolved from 2023 Summarizer through April 2024 Answer with AI), Ask Brave (longer answers + chat + Deep Research, September 2025), Featured Snippets, AI-powered descriptions. Joins citation-surfaces cluster as its 9th anchor.
  - Brave Search AI citation
    Citation surfaces
    Same-day revision. Critical fact correction: IndexNow launched October 18, 2021 as a Bing and Yandex collaboration only (prior text incorrectly listed Seznam and Naver as founding partners); Naver joined July 2023. Now consistent with the IndexNow Protocol entry. Hedged the ChatGPT search retrieval-pipeline framing to match the chatgpt-search-citation entry. Softened the DuckDuckGo Bing-grounded characterization. Clarified that Featured Snippets is extractive (predates generative AI) and that AI-powered descriptions cite the single linked result rather than synthesizing multiple sources.
  - Brave Search AI citation
    Citation surfaces
    Follow-up cluster-consistency fix. Writing the new DuckDuckGo AI citation entry surfaced that DuckDuckGo's Search Assist actually runs DuckDuckGo's own DuckAssistBot crawler, not Bing-syndicated content (the Bing partnership applies to some legacy organic results, not AI surfaces). FAQ #2 here previously hedged DuckDuckGo as 'Bing-syndicated family'; corrected to explicitly classify DuckDuckGo Search Assist as own-crawler alongside Brave, removing the cluster's silent contradiction.
  - Brave Search AI citation
    Citation surfaces
    Added an At-a-glance summary table at the top of the entry, matching the section structure used by sibling citation-surface entries. Load-bearing fact: Brave runs the only independent search index at scale outside Big Tech; optimization for Bing and Google does not transfer. A publisher with strong Bing or Google citation can have zero Brave citation if Brave's own crawler has not indexed them well. Brave is a separate indexing target.
  - ChatGPT search citation
    Citation surfaces
    Added an At-a-glance summary table at the top of the entry, matching the section structure used by sibling citation-surface entries. The table surfaces OAI-SearchBot (search retrieval) vs GPTBot (training) as independent toggles, both verbatim from OpenAI's developer bots documentation. Load-bearing fact: blocking GPTBot does NOT remove a site from ChatGPT search; OAI-SearchBot is the separate user agent that controls search-retrieval visibility. This is the most actionable single fact in the cluster on crawler-token disambiguation. Metadata dates updated to 2026-05-28 to match the revision.
  - Citation share
    Citation metrics
    Added a citation-slot-count calibration bullet to How to apply. Citation share competition is zero-sum within each surface's slot pool, but slot counts differ by an order of magnitude across the citation-surfaces cluster (DuckDuckGo Search Assist 1-2 sources vs Perplexity ~10-20+). A 22% share on a 4-slot surface is structurally tighter competition than a 22% share on a 20-slot surface; cross-surface benchmarking should normalize against slot pool size. Backport from the cross-surface quick reference added today to the AI citation metrics pillar.
  - Citation vs mention vs link
    Citation metrics
    Added a What-remains-contested bullet on licensing-tiered citation behavior, motivated by Meta AI's December 5, 2025 publisher licensing deals. The 2x2 taxonomy still describes the observable rendering, but Meta introduces a new sub-case where the same publisher moves between linked-citation and unlinked-mention cells based on commercial-partnership status (licensed partners receive linked attribution; non-licensed do not), not per-query rendering. First cluster anchor with this structure; flagged as an open question whether other engines with publisher deals (Perplexity, OpenAI, Google) develop similar visible tiering.
  - Cite-ability
    Citation metrics
    Added the C-SEO Bench (arXiv:2506.11097, NeurIPS Datasets & Benchmarks 2025) counter-evidence anchor to the Aggarwal footnote. C-SEO Bench tested 7 of Aggarwal's 9 methods in multi-actor production-realistic conditions and found most largely ineffective or slightly negative on citation ranking. The 2023 PAWC effect sizes remain valid for the single-actor synthetic testbed; C-SEO Bench sets the empirical upper bound on production generalization. Cite-ability as a content property remains useful as practitioner shorthand, but the framework's underlying lift estimates should now reference both papers.
  - Claude citation
    Citation surfaces
    Added an At-a-glance summary table at the top of the entry, matching the section structure used by sibling citation-surface entries. Surfaces ClaudeBot vs Claude-SearchBot vs Claude-User as three first-party documented Anthropic UAs with distinct roles. Load-bearing fact: Claude is a chat product, not search-first; citation only happens when the web search tool is invoked. Anthropic publishes first-party crawler documentation and respects robots.txt, contrasting with Grok (no contract). Metadata dates synced to 2026-05-28.
  - DuckDuckGo AI citation
    Citation surfaces
    Initial publish. Two citation-bearing surfaces under DuckDuckGo's AI umbrella: Search Assist (AI-generated answer above the SERP, formerly DuckAssist; launched 2023-03-08 as Wikipedia summarization, broadened in July 2024, rebranded in 2025) and Duck.ai (privacy-anonymized chat proxy to third-party models). Search Assist always links one or two sources beneath the summary and runs DuckDuckGo's own DuckAssistBot crawler. Duck.ai's citation behavior depends on the selected model (Claude / Llama / GPT / Mistral); DuckDuckGo's added value is anonymization, not its own citation discipline. Joins citation-surfaces cluster as its 11th anchor.
  - DuckDuckGo AI citation
    Citation surfaces
    Same-day revision. Fixed Anthropic model naming: 'Claude 4.5 Haiku' was wrong ordering (Anthropic convention is tier-then-version), corrected to 'Claude Haiku 4.5'. Added staleness caveat to the Duck.ai model roster (lineup changes frequently; durable insight is the third-party-proxy architecture, not the specific 2026-05 menu). Clarified that Duck.ai model availability is not equivalent to Search Assist source attribution. Added a server-log referrer caveat in How to apply: a duckduckgo.com referrer does not prove Search Assist citation.
  - DuckDuckGo AI citation
    Citation surfaces
    Added an At-a-glance summary table at the top of the entry, matching the section structure used by sibling citation-surface entries. Load-bearing fact: DuckDuckGo Search Assist's 1-2 source slot is the narrowest in the cluster, which makes citation-share competition zero-sum at ~10x Perplexity's tightness, and AI surfaces specifically depend on the DuckAssistBot (not Bing) crawler. Model menu uses a Status pointer per the template hygiene rule on volatile data.
  - DuckDuckGo AI citation
    Citation surfaces
    Added a Referrer-based detection by surface table to the Detection methodology section (backporting the pattern from the Perplexity citation entry). DuckDuckGo's privacy-redirect design means that a duckduckgo.com referrer is shared between Search Assist citation clicks and ordinary blue-link organic clicks, so Search Assist citation traffic is not distinguishable from regular DuckDuckGo organic traffic at the referrer level. Active probing remains the defensible measurement path.
  - Gemini citation
    Citation surfaces
    Added an At-a-glance summary table at the top of the entry, matching the section structure used by sibling citation-surface entries. Load-bearing fact: Gemini is Google's standalone chat product, same Google index family as AI Overview and AI Mode but a different surface with different grounding triggers. Unlike AI Overview, Gemini grounding via Search on Vertex AI is influenced by Google-Extended per Google docs. The opt-out controls differ across the three Google surfaces and should be tracked separately. Metadata dates synced to 2026-05-28.
  - Generative Engine Optimization
    Umbrella terms
    Added the 2025 C-SEO Bench (arXiv:2506.11097, NeurIPS Datasets & Benchmarks 2025) counter-evidence anchor. C-SEO Bench directly tested 7 of Aggarwal et al.'s 9 methods in multi-actor production-realistic conditions and found most largely ineffective or slightly negative on citation ranking, with traditional SEO outperforming all C-SEO methods. The 2023 PAWC effect sizes remain valid for the single-actor synthetic testbed; C-SEO Bench sets the empirical upper bound for what generalizes to production multi-actor conditions. Updates the umbrella effectiveness framing on the canonical GEO entry.
  - Grok citation
    Citation surfaces
    Initial publish. Grok is xAI's chat product with three citation-bearing answer surfaces: WebSearch (index-based retrieval), DeepSearch (multi-step research with visible reasoning trace plus native X integration), and the xAI API web_search tool (citations as structured response fields). Distinct from other AI citation surfaces because Grok pairs a general web index with native access to X (Twitter) posts as a first-class citation source, and because xAI's public crawler discipline is unusually opaque: documented user agents exist but observed retrieval traffic typically arrives without them. Joins citation-surfaces cluster as its 10th anchor.
  - Grok citation
    Citation surfaces
    Same-day revision. Body-vs-source fix: prior text said 'residential IPs', but the cited Stackfox source documents datacenter / proxy ASNs (M247 + Datacamp); now hedged honestly. Added DataDome corroboration (2025-12-11). Iran-headline reframed: Grok believed a fake account and generated a false headline about Iran attacking Israel nine days before the actual April 2024 Iranian strikes. Version timeline extended with Grok 4.2 Public Beta (February 17, 2026) and Grok 4 Fast (September 2025). Added web vs X-post citation separation, API vs UI non-aggregation, and an accuracy caveat: a citation indicates attribution, not factual correctness.
  - Grok citation
    Citation surfaces
    Added an At-a-glance summary table at the top of the entry, establishing the section structure later used by sibling citation-surface entries. The table compresses the load-bearing publisher facts (operator, index source, crawler discipline, observed traffic patterns, citation rendering, slot count, surfaces, current model versions) into a single scannable block and surfaces the entry's load-bearing fact in a final emphasized row. For Grok specifically, the load-bearing fact is that robots.txt cannot reliably block Grok and that enforcement must happen at the WAF or network layer rather than via user-agent rules.
  - Grok citation
    Citation surfaces
    Same-day template-hygiene fix. FAQ #3 still described observed retrieval traffic as 'rotating residential IPs' while the body and the new At-a-glance table had been corrected to 'rotating proxy / datacenter IPs (Stackfox M247 + Datacamp; DataDome describes residential, treated as observation-window variation)'. Synced FAQ #3 to match the body framing, eliminating the entry-internal contradiction. Also collapsed the At-a-glance 'Current model versions' row from a five-version list to a single-version pointer that references the Status section for the full timeline; reduces duplication and staleness maintenance burden in the summary box.
  - Grok citation
    Citation surfaces
    Added a Referrer-based detection by surface table to the Detection methodology section (backporting the pattern from the Perplexity citation entry). Surfaces the measurement gap publishers face: grok.com chat citation clicks send a usable referrer, but X-integrated Grok citation clicks are indistinguishable from ordinary X navigation at the referrer level, and xAI API web_search citations are server-to-server with no Grok-attributable referrer at all. Server-log analysis therefore captures only the grok.com subset reliably.
  - Meta AI citation
    Citation surfaces
    Initial publish. Meta AI is Meta's Llama-family consumer assistant across WhatsApp, Instagram, Messenger, Facebook, meta.ai, Ray-Ban Meta smart glasses, and Quest VR; runs on Llama 4 (April 5, 2025). Structurally distinct from the 11 other cluster anchors: per Wikipedia citing the Washington Post, Meta AI has summarized news from outlets without linking to original articles since May 2024. Meta-ExternalAgent and Meta-ExternalFetcher respect robots.txt, but the consumer assistant does not consistently link citations the way ChatGPT search, Perplexity, Claude, or DuckDuckGo Search Assist do. Joins citation-surfaces cluster as its 12th anchor.
  - Meta AI citation
    Citation surfaces
    Thesis-level update. The initial publish missed the December 5, 2025 publisher licensing deals (CNN, Fox, USA Today, People Inc, Daily Caller, Washington Examiner, Le Monde) that introduce linked attribution for partner content. Restructured to a two-tier model: licensed partners get linked citation; non-licensed and non-news queries remain summarize-without-inline-attribution. Added Instagram/Facebook/Threads public posts as a distinct citation source. MAU updated to ~1B (Q1 2025) / ~1.2B (2026). Direct-cited the WaPo primary source instead of via Wikipedia. Added tier-aware probes, Meta-referrer caveat, accuracy caveat, backend-mix hedge.
  - Meta AI citation
    Citation surfaces
    Added an At-a-glance summary table at the top of the entry, matching the section structure used by sibling citation-surface entries. Load-bearing fact: Meta AI is the only cluster anchor where citation behavior depends on the publisher's commercial relationship with the vendor. The Citation tier optional row codifies the licensed-partner roster from December 5, 2025. Glance explicitly flags the news-citation behavior as actively evolving via publisher deals and to be re-checked quarterly.
  - Meta AI citation
    Citation surfaces
    Added a Referrer-based detection by surface table to the Detection methodology section (backporting the pattern from the Perplexity citation entry). Meta AI is the weakest passive-detection surface of any cluster anchor because of the in-app webview architecture across WhatsApp, Instagram, Messenger, and Facebook. Only meta.ai web sends a reliably distinguishable referrer; in-app surfaces produce referrers shared with ordinary social-link traffic, and voice / VR surfaces produce no publisher-visible click at all.
  - Microsoft Copilot citations
    Citation surfaces
    Added an At-a-glance summary table at the top of the entry, matching the section structure used by sibling citation-surface entries. Load-bearing fact: you cannot block Copilot retrieval without blocking Bing Search, because Copilot uses Bingbot (no separate Copilot-only crawler token). Conversely, optimizing for Bing Search and optimizing for Copilot citation are the same lever; IndexNow upstream accelerates Copilot eligibility. Glance also surfaces the enterprise Microsoft 365 Copilot tenant-data scope as out-of-publisher-reach. Metadata dates synced to 2026-05-28.
  - Passage-level optimization
    Retrieval pipeline
    Added the C-SEO Bench (arXiv:2506.11097, NeurIPS Datasets & Benchmarks 2025) counter-evidence anchor to the Aggarwal footnote. The 2023 PAWC effect sizes for Quotation Addition (27.8 / ~43%), Statistics Addition (~33%), Fluency Optimization (~29%), and Cite Sources (~28%) remain valid for the single-actor synthetic testbed they were measured on, but C-SEO Bench tested 7 of the 9 methods in multi-actor production-realistic conditions and found most largely ineffective or slightly negative on citation ranking. The PAWC numbers now read as an upper-bound effect rather than a production prediction.
  - Perplexity citation
    Citation surfaces
    Added an At-a-glance summary table at the top, the second instance of the section structure now used across citation-surface entries. Perplexity is the opposite-pole stress test (widest source pool, declared crawler with a documented stealth-crawler controversy, multi-API developer surface), validating the template on a normal surface alongside an outlier. Glance surfaces the load-bearing publisher fact: allowing PerplexityBot in robots.txt is neither necessary nor sufficient for citation (Cloudflare, August 2025).
  - Perplexity citation
    Citation surfaces
    Same-day metadata + consistency sync. Metadata dates (updatedAt, citationLastChecked, lastFactChecked) moved from 2026-05-26 to 2026-05-28 to match the At-a-glance addition. Load-bearing fact refined to make the Perplexity vs Grok distinction explicit: Perplexity has a declared crawler contract with published IP ranges, and the dispute is over undeclared stealth crawling beyond it; Grok has no contract at all. Added a Comet rollout footnote citing TechCrunch / Bloomberg for Android (2025-11-20) and Perplexity changelog plus AI CERTs News for iOS (2026-03-18). Expanded the PerplexityBot footnote to include Perplexity-User verbatim wording.
  - Pillar content
    Search foundations
    Added the C-SEO Bench (arXiv:2506.11097, NeurIPS Datasets & Benchmarks 2025) counter-evidence anchor to the Aggarwal footnote. The 2023 PAWC effect sizes remain valid for the single-actor synthetic testbed but C-SEO Bench tested 7 of the 9 methods in multi-actor production-realistic conditions and found most largely ineffective or slightly negative on citation ranking. The pillar / spoke architecture remains a useful editorial pattern, but its citation-lift estimates should reference both papers rather than treating 2023 PAWC numbers as a production prediction.
  - RAG (Retrieval-Augmented Generation)
    Retrieval pipeline
    Added C-SEO Bench (arXiv:2506.11097, NeurIPS Datasets & Benchmarks 2025) counter-evidence to the 'How do I optimize content for RAG' FAQ. The 2023 Aggarwal effect sizes for Statistics Addition / Cite Sources / Quotation Addition / Fluency Optimization remain valid for the original single-actor synthetic testbed but C-SEO Bench tested 7 of the 9 methods in multi-actor production-realistic conditions and found most largely ineffective or slightly negative on citation ranking. Sources block also expanded to include both papers directly.
  - Sycophancy vs cite-able fact
    AI behavior
    Added C-SEO Bench (arXiv:2506.11097, NeurIPS Datasets & Benchmarks 2025) counter-evidence inline to the Aggarwal PAWC examples in How-to-apply. The 2023 effect sizes for Statistics Addition, Cite Sources, and Quotation Addition remain valid for the single-actor synthetic testbed but C-SEO Bench tested 7 of the 9 methods in multi-actor production-realistic conditions and found most largely ineffective or slightly negative on citation ranking. The cite-able-fact writing discipline itself remains useful; the original PAWC numbers should now be cited as upper-bound measurements rather than production predictions.
  - Topic clusters
    Search foundations
    Added the C-SEO Bench (arXiv:2506.11097, NeurIPS Datasets & Benchmarks 2025) counter-evidence anchor to the Aggarwal footnote. The 2023 PAWC effect sizes for Quotation Addition, Statistics Addition, Fluency Optimization, and Cite Sources remain valid for the single-actor synthetic testbed but C-SEO Bench tested 7 of the 9 methods in multi-actor production-realistic conditions and found most largely ineffective or slightly negative on citation ranking. The cluster pattern remains a useful editorial structure, but its citation-lift estimates should reference both papers.
5. Wed, May 27 14 revisions
  - AI citation metrics
    Citation metrics
    Initial publish. Pillar synthesis of the citation-metrics cluster (6 anchor entries: attribution-rate, citation-share, citation-match-rate, cite-ability, citation-velocity, citation-rotation). Organizes the 6 metrics along three axes (output ratios, input content property, temporal signals) and adds a decision matrix, gap analysis, and adoption sequencing playbook. Pillar length (~3000 words) intentionally exceeds the term-entry target.
  - AI citation metrics
    Citation metrics
    Same-day self-review fix. Corrected a metric-conflation error in 3 places (lede, attribution-rate section, FAQ #3) that mislabelled Ahrefs' AI-cited-URL-to-Google-top-10 overlap percentages (~29% Perplexity, ~7-8% ChatGPT/Gemini/Copilot) as 'attribution rate' measurements. The Ahrefs footnote was always accurate; the body had drifted to a shorthand that conflated the two metrics. Rewrote those passages to label the percentages correctly as cross-engine overlap data and to use them as illustrative evidence for the 'aggregation hides structure' point rather than as direct attribution-rate numbers.
  - AI citation metrics
    Citation metrics
    Same-day revision. Critical: reframed 'velocity = time-derivative of attribution rate' as 'temporal leading indicator' (different units; cluster-ripple to citation-velocity anchor); softened Aggarwal attribution from 'foundational paper formalizing GEO measurement framework' to 'starting point' (paper uses PAWC + Subscription Impression, not attribution rate). Substantive: decision matrix Q5 reworded, 'rewards content effort' softened, 'Two metrics typically anchor' qualified, composite-score claim adds vendor inline examples, velocity-vs-acceleration distinction explicit.
  - ChatGPT search citation
    Citation surfaces
    Initial publish. ChatGPT search citations are the source attributions from OpenAI's ChatGPT when its web search returns real-time content. ChatGPT Search launched October 2024 (Plus first), expanded to Free with account, then opened to all users without an account in February 2025. Five surfaces: chatgpt.com web, Desktop (Mac/Win), mobile (iOS/Android), ChatGPT Atlas browser (macOS, 2025-10-21), and OpenAI API web_search. Entry covers per-surface citation behavior, detection methodology, how to optimize, what to skip, and the measurement gap. Sibling to other citation-surface cluster entries.
  - ChatGPT search citation
    Citation surfaces
    Same-day fact correction. Initial publish set citationStatus.chatgpt to 'cited' based on the wrong logic (other entries on the site have been cited by ChatGPT). citationStatus is per-entry, not per-site: it tracks whether this specific URL has been observed cited by the engine. A brand-new entry on its publish day cannot have been cited yet (no probe done, ChatGPT's index has not crawled it). Corrected to 'untested' to match the baseline state for a new entry; will update to 'cited' or 'not-cited' after the first ChatGPT probe.
  - ChatGPT search citation
    Citation surfaces
    Same-day revision. Critical fact correction: 4 places said 'February 2026' but should be 'February 2025' (OpenAI removed the ChatGPT search account requirement on 2025-02-05). Second February wrong-year error in this cluster (gemini-citation had 2026-02-08 where 2024-02-08 was correct). Sources block strengthened with three OpenAI canonical URLs (announcement + web search docs + bots docs). Several sentences hedged (FAQ Bing-primary backend, Desktop/mobile rendering, 'largest consumer AI' softened to 'one of the largest'). Atlas criticism rewritten from verbatim quote to paraphrase. Microsoft/OpenAI partnership footnote added.
  - Citation rotation
    Citation metrics
    Initial publish + same-day revision. Citation rotation is the temporal-stability dimension paired with citation-velocity. Literature uses several names: 'citation volatility' (industry default) and the inverse 'citation persistence'. References 3 adjacent arXiv papers (Answer Bubbles, News Source Citing Patterns, Attribution Gradients as adjacent HCI) plus industry sources (Digital Applied AI Overview study; 5W Citation Source Index 2026 consolidating ~680M citations; Leapd cross-engine brand-rate analysis; G2/Loamly Reddit citation decline coverage). 6th anchor of the citation-metrics cluster.
  - Citation velocity
    Citation metrics
    Cluster ripple from the ai-citation-metrics pillar peer review. Softened 'velocity = time-derivative of attribution rate' across 6 places (description, metaDescription, FAQ #1, body lede, territory note, How-it-relates section): the two have different units (ratio vs count per window), so velocity is not a strict mathematical derivative of attribution rate. Velocity is now framed as 'temporal leading indicator paired with attribution rate', with an explicit hedge that velocity often moves before attribution rate does rather than being literally derived from it. No formula or operational discipline changed.
  - Citation vs mention vs link
    Citation metrics
    Initial publish. Codifies the citation / mention / link three-way disambiguation that the cluster has been using implicitly across 5+ entries (attribution-rate, citation-match-rate, citation-share, brand-mentions-in-ai-answers, cite-ability) without an explicit definition entry. Foundational taxonomy hub: the citation-metrics cluster anchors all inherit denominator decisions from this distinction; standalone entry lets future entries reference it consistently instead of restating the disambiguation each time.
  - Citation vs mention vs link
    Citation metrics
    Same-day revision. Critical: acknowledged geo.wiki prior-art (removed first-codifier overclaim); softened citation-as-grounding (citation granularity varies). Substantive: opening reframed to 'source / brand / entity / URL'; 2x2 matrix rendered as table; matrix wording fixed; attribution rate counting rule scoped to glossary's system; Perplexity claim hedged. Sources 1 -> 6 entries (Google / OpenAI / Anthropic / Perplexity / geo.wiki / Profound); 3 footnotes added; AI dev tool 4th dimension expanded; schema-affects-citation reframed as practitioner speculation.
  - Definition-Lead Style
    GEO content methods
    ChatGPT citation confirmed. A fresh ChatGPT search probe on 'How do I write the first paragraph of a glossary entry so AI engines can extract it cleanly?' returned this entry as the top source, with two phrases attributed to GEO Glossary inline ('clean, standalone answer block that can be lifted without needing context' and 'should understand the term from the first sentence alone'). Both are paraphrases consistent with this entry's framing. citationStatus.chatgpt: untested -> cited; other engines not yet probed. Fourth confirmed ChatGPT citation on the site; top-sourced above google.com/Machine Learning Glossary and Scribbr.
  - External traffic disambiguation
    Methodology
    Initial publish. External traffic disambiguation is the practitioner-coined methodology for separating real external visitors from founder browsing, scrapers, AI training crawlers, and VPN edge artifacts in server logs. The 5-axis framework (foreign edge / cache state / path / UA-plus-referer / non-scraper UA pattern) was developed in-house across Days 8-13 of site operation (14+ confirmed visits across 7 countries). Entry codifies what was previously scattered across internal probe logs and editorial notes; referenced from indexnow-protocol and several evidence files. Glossary-coined practitioner shorthand; no vendor canonical exists.
  - External traffic disambiguation
    Methodology
    Same-day revision. Body and prior changelog entries rewritten to remove project-internal voice: version-codename references replaced with their observable meaning, process vocabulary replaced with the underlying actions, internal repository path references removed. Five hedging refinements: AI training crawlers narrowed to known/self-identifying crawlers; threshold rule labeled operational not statistical; Axis 5 mobile UA hedged; Axis 4 referer hierarchy marked site-specific; citation-rate-denominator tightened to traffic-derived reach metrics. Sources expanded from 2 to 6 entries.
  - External traffic disambiguation
    Methodology
    ChatGPT citation confirmed. A fresh ChatGPT search probe on 'What is external traffic disambiguation in AI search analytics?' returned this entry as the top source, with the description paraphrased inline and attributed to GEO Glossary. citationStatus.chatgpt: untested -> cited; other engines not yet probed. Day-15 sweep also surfaced definition-lead-style as a same-day ChatGPT citation. Top-sourced 2 days after publish; third confirmed ChatGPT citation on the site and fastest publish-to-cited interval to date. Consistent with the pattern that practitioner-coined terms in low-competition territory attract citations.
6. Tue, May 26 9 revisions
  - AI Mode
    Citation surfaces
    Cluster template polish (no fact rewrite needed; core content already strong). Added Gemini 3.5 Flash backend reference (hedged). Added Detection methodology table for visual cluster consistency. Expanded What remains contested from 3 to 5 bullets (noreferrer-fix completeness + fan-out distinguishability). Lede expanded with explicit cross-cluster sibling links. Body-vs-cluster consistency fix: Bing AI Performance dashboard date updated from 2026-02-09 to 2026-02-10 (matches microsoft-copilot-citations entry).
  - AI Overview citation
    Citation surfaces
    Full rewrite to match the citation-surface cluster template. Critical SEO-myth fix: previous FAQ asserted 'E-E-A-T-style author markup helps for YMYL topics'; Google's official documentation states verbatim 'no special schema.org structured data that you need to add.' Rewritten to align with the e-e-a-t-ai-search entry. Added Google verbatim no-special-optimization quotes, AI Overview timeline (SGE 2023 to ~48% query coverage March 2026), 4-surface Detection methodology table, What remains contested section, and explicit 3-surface Google ecosystem disambiguation (AI Overview / AI Mode / Gemini citation).
  - Claude citation
    Citation surfaces
    Initial publish. Claude citations are the source attributions produced by Claude's web search tool across consumer surfaces (claude.ai, Desktop, mobile, Claude Code) and the Anthropic API. Web search announced 2025-03-20 with global rollout 2025-05-27. Entry covers per-surface citation behavior, the API citation schema (web_search_result_location with url / title / cited_text), how to optimize, what to skip, and the measurement gap. Sibling to perplexity-citation / microsoft-copilot-citations / ai-overview-citation / ai-mode / ai-dev-tool-citations in the citation-surface cluster.
  - Claude citation
    Citation surfaces
    Same-day revision. Dates re-verified against the Anthropic blog post (March 20 2025 + May 27 2025 both confirmed). Footnote 1 URL updated to the canonical docs.anthropic.com path. Six overconfident sentences hedged (Desktop / mobile rendering, Claude Code appearance, two-tool-version equivalence, month-1 tool-spend tradeoff, cross-engine divergence, ClaudeBot purpose disclosure). Contested bullet on ClaudeBot necessity expanded with two-scenario contrast (partner-index vs proprietary-index dominance). Wikipedia source removed from the Sources block.
  - Gemini citation
    Citation surfaces
    Initial publish. Gemini citations are the source attributions from Google's Gemini chatbot (gemini.google.com, Android, iOS) and the Gemini API with real-time web grounding. The API uses the google_search tool, returning groundingMetadata (groundingChunks with uri / title, groundingSupports, webSearchQueries). Distinct from AI Overview and AI Mode (Google Search surfaces, not Gemini app). Sibling to perplexity-citation / claude-citation / microsoft-copilot-citations / ai-overview-citation / ai-mode / ai-dev-tool-citations in the citation-surface cluster.
  - Gemini citation
    Citation surfaces
    Same-day revision. Critical fact correction: body had two instances of '2026-02-08' that should be '2024-02-08'; footnote was already correct, body now matches. Verbatim Google quote 'reduce model hallucinations...' now has its own footnote with the ai.google.dev source URL. Five overconfident sentences hedged (separately-developed pipelines, Be-in-Google-Search prerequisite, Google-Extended training-vs-grounding distinction, cross-engine substantial-variance, model-version list specificity). Vertex AI section expanded; Deep Research mode added to the surface list. Wikipedia sources removed from the Sources block.
  - Microsoft Copilot citations
    Citation surfaces
    Full rewrite to match the citation-surface cluster template (Claude / Gemini / Perplexity citation) and to fix a fact-level inconsistency with the IndexNow Protocol entry on the same site. Previous lede claimed all Copilot surfaces share a Bing-grounded pipeline; corrected per Microsoft Learn: M365 Copilot grounds primarily in Microsoft Graph tenant data, M365 Copilot Chat in public web, and only the consumer / Bing / Edge / Windows Copilot surfaces are uniformly public-web-grounded. Added a six-surface detection methodology table, a 'What remains contested' section, verbatim Microsoft Learn quotes, and a 2026 surface-list refresh.
  - Perplexity citation
    Citation surfaces
    Initial publish. Perplexity citations are the numbered source attributions displayed in Perplexity AI-search answers across the web app (perplexity.ai, public launch 2022-12-07), mobile apps, the Comet browser (July 2025 premium, October 2025 free), and the Sonar developer API with four model variants per docs.perplexity.ai. Entry covers per-surface citation behavior, how to optimize for inclusion in the source pool, what to skip, and the measurement gap between observable citation events and full reach. Sibling to microsoft-copilot-citations / ai-overview-citation / ai-mode / ai-dev-tool-citations in the citation-surface cluster.
  - Perplexity citation
    Citation surfaces
    Same-day revision. Completed the Comet timeline (Android Nov 20 2025, iOS Mar 18 2026, plus the CometJacking August 2025 security disclosure). Added a contested-section bullet for the 2024 Wired and 2025 Cloudflare research on non-PerplexityBot fetch activity that Perplexity has disputed. Added a per-surface detection methodology table mirroring the ai-dev-tool-citations cluster pattern. Hedged five overconfident vendor-attribution sentences. Reworked the PerplexityBot section so allowing the declared crawler is neither necessary nor sufficient for fetch. Added a verbatim PerplexityBot footnote.
7. Mon, May 25 6 revisions
  - AI dev tool citations
    Citation surfaces
    Initial publish: AI dev tool citations describe a 2024-2025 emerging surface category where AI-assisted developer environments (Cursor, Windsurf, Claude Code, Replit Agent, Bolt, Lovable, GitHub Copilot Chat) cite web content when grounding answers to developer questions. Distinct from general AI search citations (audience is developers, questions are technical) and from autocomplete-style IDE plugins (chat-driven, not inline-suggestion-driven). The category is glossary-coined practitioner shorthand; no vendor canonical name exists for it yet.
  - AI dev tool citations
    Citation surfaces
    Same-day revision. The initial publish wrongly attributed a cursor.com Vercel referrer to a Cursor IDE AI-citation. Cursor is an Electron desktop app whose chat citation links open via the system browser without a Referer header; the referrer was a regular web visit. Same-domain section now reads as a measurement note; detection matrix added. Several overclaims hedged (MCP standardization, source-pool framing, AEO transferability, ChatGPT-Bing routing). Windsurf attribution updated (Cognition AI acquired Codeium in 2025). 'What remains contested' section added.
  - AI dev tool citations
    Citation surfaces
    Small wording polish. The ChatGPT Search FAQ leads with what is documented (OpenAI's web search and retrieval systems, with precise routing across Bing, OpenAI's own crawl, and partnerships not vendor-documented) rather than negating a routing assumption. The 'What remains contested' Cursor entry briefly describes where Cursor's internal browser appears (preview, docs-lookup, help contexts) so readers can calibrate the edge case. Redundant Brave Search description removed from 'Be indexed in AI-friendly search APIs first'; Brave's role is already covered earlier in that section. Brave Search API added to sources.
  - IndexNow Protocol
    Infrastructure
    Initial publish: IndexNow is an open URL notification protocol launched 2021-10-18 by Microsoft and Yandex. As of 2026 it is supported by Bing, Yandex, Naver, Seznam, and Yep; Google has tested it since 2021 but has not adopted it. The entry documents protocol mechanics, current participants, the critical caveat that IndexNow is notification not guarantee, and a same-domain practitioner illustration (N=1, 13 days).
  - IndexNow Protocol
    Infrastructure
    Same-day revision. Critical fact corrections: key spec is 8-128 alphanumeric characters plus hyphens, not 32-128 hex (32-hex is just Bing's generator default); Bing direct endpoint is www.bing.com/indexnow, not api.bing.com. Methodology revision: the same-domain observation is now framed as illustration of the protocol's documented interface-vs-scheduler separation, not independent evidence of trust gating (N=1 over 13 days does not isolate trust gating from new-domain cold-start, integration bugs, or other alternative causes). New 'What remains contested or unverified' section added.
  - IndexNow Protocol
    Infrastructure
    Small peer-review follow-up fixes. The Yandex observation now notes that 4 of the 5 disambiguation axes were applicable (edge cache state was not separately verified). Footnote citation updated against the official IndexNow participants registry (searchengines.json, accessed 2026-05-25), which lists seven participants: the five consumer search engines plus Internet Archive and Amazonbot. The consumer search engine roster has been stable since 2024 per public IndexNow updates. Cross-page back-links added for AEO, GEO, AI Search Optimization, AI Mode, and AI Overview citation.
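The key-format correction recorded in the IndexNow Protocol revisions above (8-128 characters, letters and digits plus hyphens, with 32-hex being only a generator default) can be sketched as a small validator. This is a minimal illustration under that stated format, not an official IndexNow implementation; the function name `is_valid_indexnow_key` is hypothetical.

```python
import re

# Format as stated in the changelog entry: 8 to 128 characters drawn from
# ASCII letters, digits, and hyphens. The helper name is an assumption
# made for illustration only.
_KEY_PATTERN = re.compile(r"[A-Za-z0-9-]{8,128}")


def is_valid_indexnow_key(key: str) -> bool:
    """Return True when the whole string matches the stated key format."""
    return _KEY_PATTERN.fullmatch(key) is not None
```

A 32-character hex string passes this check, as does a shorter mixed-case key such as `ABCdef-123`; the earlier "32-128 hex" wording would have wrongly rejected the latter.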
8. Sun, May 24 5 revisions
  - Agentic retrieval
    Retrieval pipeline
    First ChatGPT citation. A 2026-05-24 ChatGPT response on the topic of agentic retrieval examples surfaced this entry in its sources panel alongside WIRED and several major AI-search SaaS publications, with an inline body citation marker attributing a paraphrase to this entry. citationStatus.chatgpt: not-cited -> cited; 1 of 5 engines now cited. 11 days from publish to first ChatGPT primary citation.
  - AI Mode
    Citation surfaces
    Initial publish: AI Mode is Google's conversational AI search surface accessed via a dedicated Search tab; powered by Gemini and launched March-May 2025. The measurement story in 2026 is that AI Mode clicks bundle into Google Search Console Web Search totals (per Google's own Search Central documentation) but have no separate breakdown row, unlike AI Overview which has its own Search Appearance row. Cross-references the AI Overview entry as the SERP-panel sibling for shared Gemini-grounding context; working assumption framework applied for the no-breakdown measurement territory.
  - AI Mode
    Citation surfaces
    Major fact-check correction after peer review. Earlier text claimed AI Mode citations are completely invisible to GSC and proposed a 'joint signature' inference method, both contradicting Google Search Central documentation: AI Mode clicks are bundled into GSC Web Search totals (no separate breakdown). Status and How-to-apply rewritten around the actual gap (data present without attribution); launch timeline corrected from 2024-2025 to March 2025 Labs / May 2025 US / August 2025 global; SEL footnote date corrected to 2025-05-22 with the noreferrer-fix update note. Google Search Central docs added to Sources; backend-sharing claims hedged.
  - Citation share
    Citation metrics
    Added cite-ability to related terms. The two are tightly coupled (cite-ability is the property that a healthy citation share signals on the measurement side), and the back-link strengthens the citation-metrics cluster bidirectional graph; cite-ability's related terms list was updated in parallel to point forward to citation-share.
  - Cite-ability
    Citation metrics
    Related terms expanded from 3 to 7 to cover the full citation-metrics cluster (attribution-rate, citation-match-rate, citation-velocity, citation-share). The metrics already linked back to cite-ability as the property they measure, but cite-ability did not list any of them forward, leaving the bidirectional graph asymmetric. The expansion brings cite-ability in line with the cluster-completeness pattern other practitioner-coined anchor entries follow.
9. Sat, May 23 16 revisions
  - Authoritative Statement Strength
    GEO content methods
    Initial publish as honest-reporting entry: the SEO folk wisdom that authoritative tone is a primary AI-citation lever is not supported by Aggarwal et al. 2023's benchmark. The paper ranked 'Authoritative' tone 7th of 9 methods (PAWC 21.8 vs baseline 19.5, +11.8%), well below the top four methods (Quotation Addition ~43%, Statistics Addition ~33%, Fluency Optimization ~29%, Cite Sources ~28%). The entry recalibrates expectations: authoritative tone has real value for human readers and editorial credibility, but is a modest lever in the only public empirical benchmark on AI citation behavior; do not treat it as a primary citation strategy.
  - Authoritative Statement Strength
    GEO content methods
    Post-publish revisions. Recalibrated framing: 'does not hold up' softened to 'does not match this benchmark; +11.8% is a measurable modest lift' (paper does show positive lift). Repaired self-contradiction: entry critiqued 'compound effect' rhetoric but used same inference itself; relabeled compounding claims as editorial inference, not paper findings. Corrected E-E-A-T description (it's a quality framework, not officially a ranking factor). Added absolute-vs-relative-gain clarification so readers don't misread ranking gap. PAWC defined inline. Cut off-topic Keyword Stuffing bullet; consolidated §What to skip. Full author attribution updated.
  - Authoritative Statement Strength
    GEO content methods
    Cluster completeness pass. Added FAQ explaining that 'Authoritative' (paper-canonical) and 'Authoritative Statement Strength' (glossary entry name) refer to the same finding, matching the naming-clarification pattern used in Cite Sources Optimization. Aligned the 5.5% combination wording with sibling entries ('more than 5.5%' with §5.3 reference, not '+5.5%'). Added 'Several editorial hypotheses (not paper-derived)' label to the why-folk-wisdom-diverges FAQ so unsupported speculation is marked as such. Related terms now lists passage-level-optimization and generative-engine-optimization (cluster cross-link completeness).
  - Cite Sources Optimization
    GEO content methods
    Initial publish: Cite Sources Optimization is one of the four top-performing methods in Aggarwal et al. 2023's GEO paper (PAWC 24.9 vs baseline 19.5, ~28% relative gain). Sibling entry of the Content-discipline cluster anchored by quotation-addition; reuses the cluster's territory framing (Sub-variant B), glossary-coined practitioner-extension flag, counter-evidence anchor (2023 testbed not yet replicated on 2026 commercial AI engines), and mitigates-not-eliminates frame. Distinguishes the paper's one-shot LLM-prompted intervention from the practitioner discipline of citing claims as a habitual writing technique.
  - Cite Sources Optimization
    GEO content methods
    Post-publish revisions. Added a verified Ahrefs March 2026 footnote to the showcase example (an entry teaching citation discipline should cite its examples). Corrected the combined-intervention framing: all pairwise top-4 combinations tested on a 200-example subset (Figure 4), Fluency + Statistics best. Description and opener rewritten as fluent multi-sentence prose. Clarified the paper's 'Cite Sources' vs this entry's 'Cite Sources Optimization.' Note added that internal cross-links are navigation aids, not authority citations.
  - Cite Sources Optimization
    GEO content methods
    Tightened How-to-apply by removing the Pair-with-Quotation-Addition rule, which belongs in How-it-relates rather than as a writing rule. Strengthened the citation-stuffing caveat with the paper's actual Keyword Stuffing data point (PAWC 17.8 vs baseline 19.5, -8.7%) instead of pure inference. Restored 'The paper measured them separately' to the Quotation Addition relationship description so readers don't conflate the two methods. Ahrefs prior-study reference hedged from a specific dataset claim to 'earlier Ahrefs analysis'.
  - Definition-Lead Style
    GEO content methods
    Initial publish: Definition-lead style is the writer discipline of opening an answer block (term entry, FAQ, section) with a complete self-contained definition. Roots in inverted-pyramid journalism + extractive QA tradition (Rajpurkar et al. 2016 SQuAD). Empirical evidence for its specific effect on 2026 AI engine citation rate is indirect: extractive QA shows machines can extract clean answer spans, but modern RAG combines retrieval with generation rather than pure span extraction. Treat as a readability + chunk-robustness writer habit, not a primary citation lever.
  - Definition-Lead Style
    GEO content methods
    Post-publish revisions. Reframed body opener to lead with the inverted-pyramid analogy (avoids the irony of a 'lead once' entry repeating the definition across description + lede + body). Added single-sentence working-assumption takeaway. Replaced strawman example with realistic cite-ability before/after; cut two body-redundant FAQ items. PAWC defined inline. Softened the folk-wisdom '2026 GEO guides' claim. Acknowledged this glossary's answer-block concept is practitioner-coined, not external. SQuAD density reduced; BERT added to Sources. §How to apply shifted from descriptive to imperative voice.
  - Definition-Lead Style
    GEO content methods
    Cluster cross-link and reader-clarity pass. Replaced the undefined 'Schema and Answer-block cluster' label with self-explanatory content (answer block, sub-passage extraction, passage-level optimization) so readers do not need to know internal taxonomy. FAQ #1 reworded so the 'two reasons' answer is self-contained when extracted as a standalone snippet (named both reasons before walking through each). Added inline link from the '~28% to ~43% lift' range to the Aggarwal anchor entry so readers can verify the number without a separate footnote.
  - Fluency Optimization
    GEO content methods
    Initial publish: Fluency Optimization is one of the four top-performing methods in Aggarwal et al. 2023's GEO paper (PAWC 25.1 vs baseline 19.5, ~29% relative gain). Sibling entry of the Content-discipline cluster anchored by quotation-addition; reuses cluster's territory framing (Sub-variant B), glossary-coined practitioner-extension flag, counter-evidence anchor (2023 testbed not yet replicated on 2026 commercial AI engines), and mitigates-not-eliminates frame. Highlights the paper's strongest combined intervention finding: Fluency Optimization plus Statistics Addition outperforms any single GEO method by more than 5.5%.
  - Fluency Optimization
    GEO content methods
    Post-publish revisions. Corrected the combined-intervention framing: the paper tested all pairwise top-4 combinations on a 200-example subset (Figure 4, §5.3), where Fluency + Statistics was the best measured pair. Added a verified Ahrefs March 2026 footnote. Description and opener rewritten as fluent multi-sentence prose (an entry on fluency should read cleanly). Noted the paper's method (LLM rewriting) is not the recommended practitioner technique, since LLM rewrites flatten voice; the discipline is to write fluently from the first draft.
  - Fluency Optimization
    GEO content methods
    Cluster completeness pass. Related terms now includes Authoritative Statement Strength (the 5th Content-discipline cluster member) and Generative Engine Optimization (umbrella term). Added inline note at the first Statistical Density mention clarifying that 'Statistical Density' is the glossary entry name for what the Aggarwal paper calls 'Statistics Addition'. Added a How-it-relates bullet distinguishing fluency (form) from authoritative tone (content positioning) so readers know the two are different interventions.
  - Quotation Addition
    GEO content methods
    Initial publish: Quotation Addition is the top-performing source-content method in Aggarwal et al. 2023's GEO paper (PAWC 27.8 vs baseline 19.5, ~43% gain). Anchor entry of the Content-discipline cluster: establishes territory framing, glossary-coined practitioner-extension flag, counter-evidence anchor, and mitigates-not-eliminates frame for sibling reuse. Flags that the 2023 GPT-3.5-turbo testbed has not been replicated on 2026 commercial AI engines by public study.
  - Quotation Addition
    GEO content methods
    Post-publish revisions. Corrected the combined-intervention claim: the paper isolated Fluency + Statistics as the strongest pairing, not Quote + Cite Sources (all top-4 pairwise combinations tested on a 200-example subset, Figure 4, §5.3). Reframed the front-loading discipline using Liu et al. 2023 'Lost in the Middle.' Karpukhin URL switched to the EMNLP version matching the verbatim 'greatly' quote. Math fix: PAWC 27.8 vs 19.5 is about 43%, not 41%. Softened wording implying engines have specific verification mechanisms.
  - Quotation Addition
    GEO content methods
    Restored four substantive elements after the trim pass over-corrected: Aggarwal verbatim quote in Status section (anchor on Quotation Addition should demonstrate Quotation Addition); FAQ #5 nuance that quotation density varies with content type; naming-convention footnote noting Quotation Addition and Quotation Addition Optimization refer to the same discipline; Liu et al. 2023 added to Sources block since the body cites it. Inline-linked Cite Sources and Fluency Optimization at first body mention.
  - Statistical Density
    GEO content methods
    Cluster template alignment. Corrected the top-PAWC ranking (Top 3 by PAWC is Quotation Addition / Statistics Addition / Fluency Optimization, Cite Sources #4; prior text swapped Cite Sources and Fluency). Fixed 'best single intervention' wording (a combination is not single) and tightened the §5.3 reference (Figure 4, 200-example subset). Added the 2026-commercial-engines replication caveat. Inline-linked Quotation Addition, Cite Sources, and Fluency. Ahrefs example corrected to 37.9%. relatedTerms expanded 3 to 9. Sources block synced to all four Aggarwal institutions.
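The relative-gain percentages quoted in the Fluency Optimization and Quotation Addition entries above can be rechecked with a few lines of arithmetic. The sketch below is illustrative only: the PAWC scores are the ones quoted in those entries, while the function and variable names are hypothetical and do not come from the paper or the glossary.

```python
# PAWC scores as quoted in the changelog entries above (Aggarwal et al. 2023).
BASELINE_PAWC = 19.5
METHOD_PAWC = {
    "Quotation Addition": 27.8,
    "Fluency Optimization": 25.1,
}


def relative_gain_pct(score: float, baseline: float) -> float:
    """Relative improvement over the baseline, expressed in percent."""
    return (score - baseline) / baseline * 100.0


# Hypothetical helper output: one rounded gain per method.
gains = {
    method: round(relative_gain_pct(score, BASELINE_PAWC))
    for method, score in METHOD_PAWC.items()
}

for method, gain in gains.items():
    print(f"{method}: ~{gain}% over baseline")
```

Rounding to whole percentages gives about 43 for Quotation Addition and about 29 for Fluency Optimization, which matches the "about 43%, not 41%" math fix recorded above.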
10. Thu, May 21 10 revisions
  - Attribution rate
    Citation metrics
    Added explicit practitioner-coined self-flag to the first paragraph, making the practitioner-coined conceptual ownership claim visible rather than implicit. The term combines existing English ('attribution' + 'rate') but the operational definition (response counting + denominator choices) is first-party-conceptualized for the GEO measurement context, with no vendor or academic standard. The parenthetical 'distinct from traditional marketing attribution' remains as the precedent-acknowledgment caveat.
  - Brand mentions in AI answers
    Citation metrics
    Status section compressed from ~470 words to ~170. The original had drifted to a multi-paragraph Ahrefs research summary (May 2025 vs Dec 2025 comparison, cross-engine correlation breakdown, ChatGPT-specific subnote) that read more like an Ahrefs digest than a sibling-pattern Status section. Compressed to: framing paragraph + one paragraph with the Dec 2025 lead finding (YouTube mentions 0.737), the 2 key Ahrefs cautions, and the Mike King relevance principle. Detail preserved in footnotes [^ahrefs-mentions] and [^ahrefs-mentions-dec]; no factual claims removed.
  - Citation match rate
    Citation metrics
    Added explicit practitioner-coined self-flag to the first paragraph: like attribution rate, citation match rate is not defined in any vendor or academic literature; the linked-vs-unlinked operational distinction was crystallized by GEO measurement practitioners. Makes the practitioner-coined conceptual ownership visible rather than relying on the implicit 'refinement of attribution rate' framing alone.
  - Citation share
    Citation metrics
    Added explicit practitioner-coined self-flag to the second paragraph, alongside the existing 'analog of share of voice' precedent acknowledgment. Vendor and academic literature do not define this exact AI-search operationalization; Profound, Otterly, and similar tools each implement slightly different internal definitions. Makes the practitioner-coined conceptual ownership explicit alongside the precedent-distinguishing prose that was already there. This entry attracted 2 confirmed external Google-search clicks shortly after launch, making the explicit conceptual ownership claim especially load-bearing.
  - Citation velocity
    Citation metrics
    Initial publish. Citation velocity extends the citation-measurement cluster (citation-share, citation-match-rate, attribution-rate, cite-ability, brand-mentions-in-ai-answers) with a temporal-dimension metric. The concept inherits from academic bibliometrics (Garfield 1955; ISI Science Citation Index lineage). Vendor blogs (UltraScout, Rankeo, Steakhouse) have shipped related metrics under similar names; the entry's distinct contribution is the per-engine + per-query-set + novelty-typology operational discipline that those vendor framings underspecify.
  - Citation velocity
    Citation metrics
    Major correction: the original draft conflated velocity (raw new-citation count per window) with acceleration (change in velocity); formula and framing now correct. Tightened Garfield 1955 body framing to match footnote precision. Reorganized Sources: Garfield 1955, UltraScout, Rankeo promoted from footnotes to Sources block. The 2-to-6-week lead-time framing softened to 'monitoring hypothesis, not benchmark'. The cite-ability connection softened from 'tends to drive' to 'may be associated with'. Replaced binary new/persistent split with a 5-fold typology (first-seen / new / recovered / persistent / lost).
  - Cite-ability
    Citation metrics
    Status section tightened. The reflexive-trajectory paragraphs (4x longer than sibling Status sections, with hard 2026-05-16 / 2026-05-17 dates and cross-project comparisons) were compressed into one paragraph that preserves the self-demonstration but ages better. Hard dates and exact 0-of-5 / 1-of-5 counts moved to changelog entries above; broader 'May 2026' and 'roughly a week' framing left in body. The cross-project WriterAI 6-week comparison and Wiktionary / Wordnik / YourDictionary specifics were removed as meta-commentary that belongs in research/citations evidence files, not in a reference entry's body. No factual claims changed.
  - DefinedTerm schema
    Schema cluster
    A context-aware Perplexity probe shortly after publish cited this entry as a primary source (source position #1 of 10; inline marker '(aisearchglossary +2)' on a DefinedTerm passage). URL precision is partial: the answer body cites the root domain rather than /terms/defined-term-schema, but the source panel and inline marker both anchor our domain at primary strength. citationStatus.perplexity from not-cited to cited. The fourth Perplexity-primary-cited entry (after cite-ability, freshness-signals, hybrid-retrieval).
  - Hybrid retrieval
    Retrieval pipeline
    A context-aware Perplexity probe shortly after publish cited this entry as a primary source (source position #2 of 9; inline marker '(aisearchglossary +2)' on a rank-fusion and re-ranking sentence; entry URL listed in Example resources). citationStatus.perplexity from untested to cited. First retrieval-cluster entry to cross Perplexity's primary-citation threshold (the prior two, cite-ability + freshness-signals, were outside the cluster), adding a data point that a vendor-canonical academic-dense entry is now Perplexity-cited alongside the practitioner-coined ones.
  - Statistical Density
    GEO content methods
    First confirmed external Google-search click landed on this entry (2026-05-21, Singapore edge, Google referrer; all five traffic-attribution axes pass). The third practitioner-coined anchor to attract a confirmed external Google click (after cite-ability and citation-share), making the pattern multi-entry: practitioner-coined empty-territory terms reliably attract organic Google clicks. Recorded as the fifth confirmed external visit.
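The velocity-versus-acceleration correction recorded in the Citation velocity entry above can be made concrete with a small sketch. It assumes only what that entry states: velocity is the raw count of new citations per window, and acceleration is the change in velocity between windows. The weekly counts and function names are hypothetical, not glossary data.

```python
from typing import List


def citation_velocity(new_citations_per_window: List[int]) -> List[int]:
    """Velocity: the raw count of new citations observed in each window."""
    return list(new_citations_per_window)


def citation_acceleration(velocity: List[int]) -> List[int]:
    """Acceleration: the change in velocity between consecutive windows."""
    return [curr - prev for prev, curr in zip(velocity, velocity[1:])]


# Hypothetical new-citation counts for four consecutive weekly windows.
weekly_new_citations = [2, 5, 5, 9]

velocity = citation_velocity(weekly_new_citations)
acceleration = citation_acceleration(velocity)

print("velocity:", velocity)
print("acceleration:", acceleration)
```

With these made-up counts the third window has a velocity of 5 but an acceleration of 0, which is the distinction the original draft blurred: steady citation growth and speeding-up citation growth are different readings.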
11. Wed, May 20 11 revisions
  - AI crawler bots
    Infrastructure
    ChatGPT citation. A 2026-05-20 definitional probe with context-aware prompt ('in the context of AI search and GEO') returned this entry as one source in the panel of 9, with the entry title 'AI crawler bots | GEO Glossary' surfaced. citationStatus.chatgpt from not-cited to cited. Part of a same-day seven-new-citation ChatGPT sweep; full probe evidence preserved internally.
  - AI Overview
    Search foundations
    ChatGPT citation. A 2026-05-20 definitional probe for 'AI Overview citation' returned both ai-overview and ai-overview-citation entries as sources (GEO Glossary listed multiple times in panel). citationStatus.chatgpt from not-cited to cited. Part of a same-day seven-new-citation ChatGPT sweep; full probe evidence preserved internally.
  - AI Overview citation
    Citation surfaces
    ChatGPT primary citation. A 2026-05-20 definitional probe for 'AI Overview citation' returned this entry as the #1 source in the panel ('AI Overview citation | GEO Glossary, May 13 2026'). citationStatus.chatgpt from not-cited to cited. Part of a same-day seven-new-citation ChatGPT sweep; full probe evidence preserved internally.
  - Article Schema
    Schema cluster
    Added a territory framing paragraph and an independent counter-evidence anchor (Jono Alderson on structured-data limits). Softened 'Direct input to E-E-A-T' to 'structured input channel' since Google has not documented Article schema as a direct ranking signal; author and datePublished fields are inputs to ranking systems, not guaranteed levers. Expanded Related terms to mirror sibling schema entries (breadcrumb-list, howto-schema, json-ld) for symmetric cross-linking.
  - BM25
    Retrieval pipeline
    ChatGPT citation. A 2026-05-20 definitional probe returned this entry as one source in a 10+ technical sources panel ('BM25 | GEO Glossary, May 14 2026'). Notable because BM25 is heavily vendor-canonical (Wikipedia + decades of IR literature); our entry being included despite saturated incumbents is a signal that the GEO-context framing landed. citationStatus.chatgpt from not-cited to cited. Part of a same-day seven-new-citation ChatGPT sweep; full probe evidence preserved internally.
  - Cite-ability
    Citation metrics
    Second AI engine citation: Perplexity. A 2026-05-20 definitional probe returned this entry as Perplexity's #1 source (of 9), with the primary definition paragraph paraphrasing our practitioner-coined framing ('practitioner-coined content property describing how suitable a passage is for AI extraction, quotation, and attribution'). citationStatus.perplexity from not-cited to cited; 2 of 5 engines now cite it. The practitioner-coined-term ranking thesis gains a second empirical data point beyond the prior GSC ranking observation. Full probe evidence preserved internally.
  - DefinedTerm schema
    Schema cluster
    Backfilled relatedTerms with three schema cluster siblings (breadcrumb-list, article-schema, howto-schema) after bidirectional cluster-audit on 2026-05-20 surfaced that this entry was reciprocally linked by 9 other entries but only declared 5 back. Adding the three direct schema-class neighbors makes the cluster bond symmetric without exceeding UI sibling-display guidance.
  - Entity-based SEO
    Search foundations
    ChatGPT citation, near-primary position. A 2026-05-20 probe returned this entry as #2 in the sources panel, immediately following Ahrefs. citationStatus.chatgpt from not-cited to cited. Notable: ChatGPT preferred our entry over major SEO publication blog posts (Yext, SRNA, etc.) for the AI-search-context entity SEO definition. Part of a same-day seven-new-citation ChatGPT sweep; full probe evidence preserved internally.
  - FAQ Schema
    Schema cluster
    ChatGPT citation. A 2026-05-20 probe for 'FAQ Schema' returned this entry as the 4th source in a panel of ~10 ('FAQ Schema | GEO Glossary, May 16 2026'). Notable: surfaced despite Google's May 7 2026 deprecation of FAQ rich results (the entry's lead context); ChatGPT included us regardless. citationStatus.chatgpt from not-cited to cited. Part of a same-day seven-new-citation ChatGPT sweep; full probe evidence preserved internally.
  - Freshness signals
    Search foundations
    Perplexity citation (secondary). A definitional probe on 2026-05-20 returned this entry as the 10th of 10 sources for 'definition of Freshness signals,' with the plain-language version paragraph citing aisearchglossary alongside ahrefs. citationStatus.perplexity from untested to cited. Citation intensity is secondary (not primary), unlike cite-ability which Perplexity returned as #1 source on the same probe day. This is one of two entries cited by Perplexity across a 44-entry probe sweep (the negative finding: 42 of 44 entries not cited). Full probe evidence preserved internally.
  - Topic clusters
    Search foundations
    ChatGPT citation. A 2026-05-20 probe returned this entry as the 4th source in a panel led by Semrush ('Topic clusters | GEO Glossary, May 14 2026'). Notable: surfaced alongside major SEO publication content (Semrush, Seologist) on a vendor-canonical topic. citationStatus.chatgpt from not-cited to cited. Part of a same-day seven-new-citation ChatGPT sweep; full probe evidence preserved internally.
12. Tue, May 19 14 revisions
  - Answer block
    Search foundations
    Hedged claims about how AI engines use answer blocks: the previous wording said featured snippets and AI Overview 'select primarily at the answer-block level,' but no engine publishes its selection mechanism. Softened to observable-behavior framing. Corrected footnote 1: Google's featured snippets documentation does not publish a 40-60 word convention; the length norm is practitioner consensus from observation. Added 'glossary-coined' framing for the term itself with practitioner-described equivalents (featured snippet candidate, answer paragraph). FAQ-schema cross-reference now flags the May 7, 2026 rich-results deprecation.
  - BM25
    Retrieval pipeline
    Corrected a technical mechanism description: 'front-loading tightens BM25's term-frequency signal' was wrong because BM25 counts term frequency regardless of position. Front-loading helps via chunk-level retrieval (concepts near chunk start avoid truncation; front-loaders also concentrate keyword density). Matches hybrid-retrieval's self-aware correction. Separated vendor-documented BM25 platforms (Elasticsearch, OpenSearch, Solr, Azure AI Search, Lucene) from commercial AI engines (observable behavior consistent, not vendor-documented). Added BEIR benchmark grounding and Robertson & Walker 1994 SIGIR primary source.
  - BreadcrumbList Schema
    Schema cluster
    Propagated page's 'observed, not vendor-documented' inline hedge from Status to 5 other vendor-architecture claims (description / What-is / How-to-apply / FAQ / How-it-relates). 'Pages without breadcrumb lose context attribution' reframed to dispel schema-FOMO (strong on-page signals can perform comparably). Softened 'every mainstream CMS' with framework enumeration. Added vendor-canonical territory framing (19th use; schema cluster 5-anchor milestone complete with FAQ/HowTo/Answer block/DefinedTerm/JSON-LD). Added counter-evidence anchor: Google 'rich result not guaranteed' + Jono Alderson 2024 (2nd application).
  - DefinedTerm schema
    Schema cluster
    Three vendor-architecture claims softened: 'parse far more reliably than prose definitions', 'same shape works for Perplexity/Claude/ChatGPT retrieval', 'Suddenly central' + 'AI engines favoring schema-backed glossaries' all moved to observable-behavior framing. Other softening: inDefinedTermSet subgraph; 'schema-first GEO' to 'visible content first'; DefinedTermSet authority evaluation; 'Every DefinedTerm has' precision-fixed. Fact-audit catch: 'since 2017' corrected to Nov 2019 (schema.org v5.0). Added vendor-canonical territory framing paired with FAQ schema + JSON-LD (18th). Jono Alderson added to Sources.
  - Featured snippets
    Search foundations
    Grounded the 'highest-CTR SERP feature' claim with industry CTR studies (Ahrefs, Advanced Web Ranking, Moz) and a 2026 AI Overview distortion caveat. Softened 'AI Overview leans more on Knowledge Graph' to practitioner hypothesis. Aligned footnote 1 verbatim with the answer-block entry so the two cluster pages describe the same Google docs URL consistently. Added an explicit 'structured data does not trigger featured snippets' caveat. Labeled '3-5 FAQPage pairs' as practitioner heuristic and 'Position 0' as informal SEO term. Added reciprocal paired-with framing to answer-block (surface vs writing-unit pair).
  - Generative search index
    Retrieval pipeline
    Reframed for self-awareness: 'generative search index' is glossary practitioner shorthand (standard term: vector database), and the four-layer model is glossary editorial synthesis, since production stacks are typically loosely-coupled rather than one unified backend. Aligned the chunk-size claim with the LLMO entry's 200-1024 range and named tool defaults. Softened vendor-architecture overclaims (per-engine proprietary indices, ChatGPT training-vs-retrieval path, Perplexity hours cadence) to observable-behavior framing. Footnote 1 now disclaims that Pinecone does not use the term or four-layer framing.
  - Hallucination grounding
    AI behavior
    Softened the page's own absolutist claims ('prevents hallucination', 'every assertion traceable') to match what grounding actually delivers. Replaced a fake '63% of SaaS founders' example with the verifiable Ahrefs Dec 2025 YouTube-mentions 0.737 anchor. Flagged 'hallucination grounding' as glossary-coined; standard ML uses 'grounding' / 'retrieval grounding'. Replaced the Perplexity blog root-domain source with Shuster et al. 2021 (arXiv:2104.07567), the direct empirical paper on retrieval reducing hallucination. Added reciprocal paired-with framing to the sycophancy entry, completing the meta-failure-mode pair.
  - Inverted index
    Retrieval pipeline
    Fixed 'weight term position with earlier positions scoring higher' myth: inverted indices + BM25 don't position-weight; front-loading helps via chunk-boundary avoidance and concept density. Completes 5-page cluster consistency (hybrid-retrieval / bm25 / vector-embeddings / reranking / this entry). Added field-weighting nuance (Lucene per-field boost is field-based, not within-document position-based). Softened 'every production hybrid retrieval system' universal claim. '2026 frontier learned indices' softened (Kraska et al. 2017 anchor). Added paired-with BM25 + vector-embeddings framing + IIR 2008 anchor.
  - JSON-LD
    Schema cluster
    Softened universal vendor-architecture claims: 'every major AI search surface parses JSON-LD as the canonical structured-data signal' separated into Google's documented recommendation (JSON-LD 'if your site's setup allows it'; all 3 formats valid) vs ChatGPT/Perplexity/Claude/Copilot/Gemini observable-behavior framing. Added FAQ rich-results May 7 2026 deprecation cross-reference (page uses FAQPage in examples). Added vendor-canonical territory framing paired with FAQ schema. Other softening: one-block vs @graph, sameAs canonical, GEO foundation. Footnotes 1+2 URLs precision-fixed (W3C TR/json-ld11/; Google /intro-structured-data sub-page).
  - Passage-level optimization
    Retrieval pipeline
    Aligned the chunk-size guidance with the LLMO entry's verified range (200-1024 tokens, with named tool defaults: LangChain ~250, LlamaIndex 1024, Pinecone 512-1024), replacing the narrower 256-512 estimate. Corrected footnote 2: the Aggarwal et al. 2023 GEO paper tests 9 content-modification methods at source-page level (Quotation Addition PAWC 27.8 vs no-modification baseline 19.5); the four-trait 'well-optimized passage' framework on this page is glossary editorial synthesis, not a paper finding. Softened several 'AI engines chunk and rank' universal claims to observable-behavior framing matching the rest of the retrieval cluster.
  - RAG (Retrieval-Augmented Generation)
    Retrieval pipeline
    Updated 'Bing Chat' to 'Microsoft Copilot' (Microsoft rebranded the product at Ignite 2023, Nov 15 2023; the glossary's own microsoft-copilot-citations entry already uses the new name). Softened several universal vendor-architecture claims to observable-behavior-plus-inference framing, matching the rest of the retrieval cluster. Replaced the Perplexity Engineering Hub root-domain source with Karpukhin et al. 2020 (the DPR paper, arXiv:2004.04906; this is the retriever component Lewis et al. RAG actually uses). Footnote 1 expanded with cluster-standard precision; added vendor-canonical dual-position framing.
  - Reranking
    Retrieval pipeline
    Fixed 'rerankers weight first 1-2 sentences' position-weighting myth: cross-encoder rerankers process [query + passage] with bidirectional attention, not positional priority. Front-loading helps via chunk-boundary avoidance and concept density. Completes 4-page cluster consistency (hybrid-retrieval / bm25 / vector-embeddings / this entry). Softened 'Production-standard for most AI search engines' to vendor-documented building blocks vs commercial engines (not vendor-documented). '2026 frontier listwise reranking' softened to active research direction. Added retrieval-pipeline 4-component paired-with framing and Nogueira & Cho 2019 footnote.
  - Sycophancy vs cite-able fact
    AI behavior
    Aligned the page with its own thesis: previously a page about cite-able content carried claims that were themselves not cite-able. Grounded the 'AI engines preferentially cite specific content' assertion with Aggarwal et al. 2023 PAWC findings (Cite Sources ~28%, Quotation Addition ~43%). Replaced the anthropic.com/research root-domain source with the Constitutional AI paper. Labeled the '2026 mitigation stack' as glossary editorial synthesis. Flagged 'extraction anchor' as glossary-coined. Added a calibrated-uncertainty caveat for domains where hedging is appropriate.
  - Vector embeddings
    Retrieval pipeline
    Fixed the 'tighter openings cluster more precisely' mechanism: standard embedding models use mean pooling or [CLS]/last-token aggregation, not position weighting. Front-loading helps via chunk-boundary truncation avoidance and concept density. Now matches hybrid-retrieval and bm25; three pages describe the mechanism verbatim. Dimensions range expanded to cluster-verbatim 384-3072 with 5 named models. Corrected 'less relevant to GEO than LLMO' framing (vector embeddings are vendor-canonical, LLMO is not). Added MTEB benchmark anchor and reciprocal paired-with BM25 framing.
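The BM25 correction recorded in this day's entries (term frequency is counted regardless of position) can be checked with a minimal scorer. This is a sketch under stated assumptions, not any engine's implementation: it uses one common BM25 variant with a smoothed IDF, the widely used defaults k1 = 1.5 and b = 0.75 are assumed, and the toy documents are made up for illustration.

```python
import math
from typing import List


def bm25_score(query: List[str], doc: List[str], corpus: List[List[str]],
               k1: float = 1.5, b: float = 0.75) -> float:
    """Score one tokenized document against a query with a basic BM25 variant."""
    n_docs = len(corpus)
    avg_len = sum(len(d) for d in corpus) / n_docs
    score = 0.0
    for term in query:
        tf = doc.count(term)  # how often the term occurs, not where it occurs
        df = sum(1 for d in corpus if term in d)
        idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
        length_norm = 1 - b + b * len(doc) / avg_len
        score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
    return score


# Hypothetical documents: same tokens, query term at the start vs at the end.
front_loaded = "bm25 ranks documents by term frequency and length".split()
back_loaded = "ranks documents by term frequency and length bm25".split()
unrelated = "vector embeddings capture semantic similarity".split()
corpus = [front_loaded, back_loaded, unrelated]

front_score = bm25_score(["bm25"], front_loaded, corpus)
back_score = bm25_score(["bm25"], back_loaded, corpus)
print(front_score == back_score)
```

Both documents get the same score because the formula only sees term counts and document length. Any front-loading benefit therefore has to come from elsewhere, such as the chunk-boundary and keyword-density effects the entries above describe.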
13. Mon, May 18 7 revisions
  - FAQ Schema
    Schema cluster
    Softened the lead description from 'enabling AI engines to extract Q&A content as cite-able answer blocks' to 'commonly used to help systems parse Q&A content in a machine-readable format,' aligning with the already-hedged Status framing. Hedged the 'AI engines parse FAQPage more reliably than free-form Q&A in prose' comparative claim, because no public study isolates the structured-vs-prose effect. Source footnote replaced with a specific URL set.
  - HowTo Schema
    Schema cluster
    Fixed a Recipe / HowTo confusion in the FAQ: Recipe is its own schema.org type, not a HowTo subtype, so do not mark up recipes with HowTo hoping for rich results. Softened the specific 'September 13-14 2023' desktop deprecation date to 'in the months that followed' because the cited Google post is dated August 2, 2023 and does not give a specific September date. Several confident AI-citation-benefit claims hedged: the effect of HowTo markup itself versus the underlying step-headed content structure is not isolated by public study.
  - Hybrid retrieval
    Retrieval pipeline
    Corrected a technical claim: 'opening tokens carry more weight in some embedding models' was wrong, because standard sentence embeddings use mean pooling or fixed-position aggregation, not position weighting. The reason front-loading still helps is chunk-boundary truncation plus BM25 keyword density, not position-aware embeddings. Also broadened the embedding dimension range to 384-3072 with named tool examples, and softened vendor-architecture claims for AI engines that have not published their retrieval pipelines.
  - LLM Optimization (LLMO)
    Umbrella terms
    Corrected a footnote scope mismatch: footnote 1 (arXiv 2402.16827) was used to anchor broad claims about training-data influence, but the paper only covers instruction tuning (a fine-tuning phase after pretraining). The footnote scope and body anchor are now precisely about instruction tuning. Hedged the FAQ training-corpora claim because vendor docs do not publish training-data filters, broadened the chunk-size range to 200-1024 tokens to match named tool defaults, softened 'recall it accurately' since publishers cannot control model recall, and replaced the Anthropic research root-domain URL with the canonical platform.claude.com link.
  - LLMS.txt
    Infrastructure
    Verified Anthropic's canonical llms.txt URL: both docs.claude.com and docs.anthropic.com 301-redirect to platform.claude.com (body and Sources block updated). Softened the description to acknowledge that robots.txt and sitemap.xml are widely standardized in ways llms.txt is not. Attributed Google's 'we don't use llms.txt' position to John Mueller (2025) and Gary Illyes (Search Central Live, July 2025) with direct quotes. Replaced an unverifiable 'adoption growing month-over-month' line with a cross-section snapshot. Added Mintlify's role in originating llms-full.txt and a 'What not to expect' section.
  - Sub-document retrieval
    Retrieval pipeline
    Corrected a regression in How-to-apply: an earlier rewrite said 'directly motivates schema-first design', but sub-document retrieval shapes content structure, not schema markup. Restored to section-first content design. Softened the 'every major engine uses passage-level retrieval' claim because engine pipelines are not vendor-documented, broadened the chunking discussion to name LangChain and LlamaIndex defaults, and added DPR / ColBERT historical context to the RAG footnote.
  - Sub-passage extraction
    Retrieval pipeline
    Major rewrite. The original page presented sub-passage extraction as a universal architectural step (an extractive QA layer between retrieval and generation), which fits BERT-era classical IR but not modern LLM-based AI search engines: in those, retrieved chunks pass into the LLM context and quoting happens during generation, with no separate extraction layer. Added a Note on architecture, flagged the term as practitioner shorthand (standard literature: extractive QA / span selection), and replaced a fabricated 'FAQPage schema improves AI Overview citation by 30%' example with the Ahrefs Dec 2025 YouTube 0.737 correlation.
14. Sun, May 17 29 revisions
  - AI crawler bots
    Infrastructure
    Several corrections that mattered for readers copying robots.txt rules: PerplexityBot was wrongly marked as a training crawler (it's retrieval-only per Perplexity's docs); 'anthropic-ai' and 'cohere-ai' aren't in current vendor documentation; Google-Extended and Applebot-Extended are control tokens, not crawlers. Added Meta crawlers and Cloudflare's August 2025 finding on Perplexity stealth-crawler behavior. Added IP-verification guidance since UA strings alone are spoofable.
  - AI Overview
    Search foundations
    Fixed an internal contradiction where the FAQ called Google's top 10 the AI Overview 'candidate pool' while the body acknowledged that 62% of cited URLs rank outside the top 10. Softened several confident causal claims about engine behavior (E-E-A-T weighting, DefinedTerm citation rates, GSC measurability of AI Overview, schema as a citation lever) to match the hedging already used on ai-overview-citation, GEO, AEO, and freshness-signals. Added the May 7, 2026 FAQPage rich result deprecation context to schema recommendations.
  - Article Schema
    Schema cluster
    Softened the date-spoofing-penalty framing: no engine has published a penalty policy, and detection mechanisms (content-diff, re-embedding cycles) are plausible but not vendor-documented. Aligns with the freshness-signals page's reading of the Ahrefs July 2025 primary data.
  - Article Schema
    Schema cluster
    Aligned schema-confidence stance with the rest of the glossary: softened 'feed Google's E-E-A-T inference and AI engines' citation-eligibility decisions' to commonly-hypothesized inputs, removed unsupported engine-behavior claims (Person-typed authors weighted higher, headline-H1 mismatch suppresses rich-result eligibility, wrong-type signals worse than no-type), and added the May 7, 2026 FAQPage rich result deprecation context to the FAQ schema cross-reference.
  - Authority signals
    Search foundations
    Propagation patch from e-e-a-t-ai-search peer review: softened the FAQ claim 'AI engines appear to infer the same signals from structured content' to a practitioner-hypothesis framing, and added Google's documented position that E-E-A-T is not a direct ranking signal. Aligns this page with the cluster-wide three-layer hedge on E-E-A-T-as-AI-signal claims.
  - Authority signals
    Search foundations
    Major reframe: the 'four-layer authority model' is glossary editorial shorthand, not an industry standard; this disclaimer is now on the source page rather than only downstream. Confident weight claims softened to practitioner-hypothesis framing to match the E-E-A-T cluster. Added Ahrefs December 2025 brand-visibility correlations (with Ahrefs' own correlation-not-causation disclaimer), Wikidata notability caveat, the May 7 2026 FAQPage rich result deprecation note, the Bing AI Performance dashboard reference, and a Knowledge Graph operationalization chain.
  - Brand mentions in AI answers
    Citation metrics
    Integrated Ahrefs' December 2025 cross-engine follow-up study (75K brands, ChatGPT + AI Mode + AI Overviews), where YouTube mentions emerged as the strongest single signal across all three engines (~0.737), beating branded web mentions; added 'correlation is not causation' caveat per Ahrefs' own disclaimer; added the content-volume finding (almost no relationship with AI visibility); reframed mentions vs citations as two independent dimensions.
  - Citation match rate
    Citation metrics
    Reframed mention vs citation as two independent dimensions (not a containment relationship), matching the brand-mentions entry; clarified the formula with reference-level vs response-level denominator; corrected the citation-share comparison from 'relative vs absolute' to 'both ratios, different normalization'; softened engine link-behavior claims and the 'backlink-equivalent' analogy.
  - Citation match rate
    Citation metrics
    Second AI-engine citation on the site (after cite-ability). A ChatGPT search probe on 'definition of Citation match rate' returned this entry as a primary source (plus the /observatory page separately), with the GEO definition attributed to GEO Glossary and industry context (87% SearchGPT/Bing) to Seer Interactive. citationStatus.chatgpt untested to cited; Perplexity, Claude, Copilot, Gemini probed the same round, all not-cited. Observatory cited as a separate URL validates that DefinedTermSet entity recognition extends beyond /terms.
  - Citation share
    Citation metrics
    Downgraded the share-of-voice equivalence from 'directly translates' to 'analog with mechanism differences'; clarified the formula denominator (citation instances, not unique sources); added measurement-axes discipline (URL vs domain vs brand level, deduplication, per-engine vs cross-engine aggregation); added engine-coverage caveats; added Peec AI to the sources list with proper citation.
  - Cite-ability
    Citation metrics
    Major correction after re-reading the Aggarwal paper: 'cite-ability' and the four-trait framework were misattributed as paper-derived. The paper actually tests 9 content-modification methods against a Position-Adjusted Word Count metric; the framework on this page is glossary editorial shorthand. Softened schema-as-citation-lever overclaims to match the rest of the cluster, hedged chunking-mechanism claims, removed the generic Perplexity Engineering blog source, and elevated the self-illustrating live example (this page ranks #2 organically but is cited by 0/5 AI engines as of audit) into the Status section.
  - Cite-ability
    Citation metrics
    First AI engine citation. A fresh ChatGPT search probe on the same day returned this entry as one of four primary sources cited for 'definition of Cite-ability,' with the AI-search-context definition explicitly attributed to GEO Glossary. citationStatus.chatgpt flipped from not-cited to cited; Perplexity, Claude, Copilot, and Gemini remain not-cited as of the same probe round. The Status section's live-example paragraph now carries the 0/5 → 1/5 trajectory as a dated update, turning the self-illustrating example into a self-updating one in a 24-hour window.
  - DefinedTerm schema
    Schema cluster
    Softened the FAQ claim that DefinedTerm-tagged definitions appear in AI Overview citations 'at materially higher rates than equivalent prose-only definitions': the comparison has not been isolated by controlled study, and the supporting evidence is practitioner observation rather than measured causal effect.
  - DefinedTerm schema
    Schema cluster
    Propagation patch from cite-ability peer review: softened 'directly enhances cite-ability' to 'commonly hypothesized to support cite-ability' to match the cluster-wide schema-as-hygiene-factor stance.
  - E-E-A-T (AI search context)
    Search foundations
    Core reframing: per Google's own docs, 'E-E-A-T itself isn't a specific ranking factor' and 'Rater data is not used directly in our ranking algorithms.' Whether AI engines use E-E-A-T at all is not vendor-documented. Softened multiple unsourced specific claims (three-link floor, pseudonymous-content underperformance, engines cross-reference Wikidata/LinkedIn, vague third-party citation measurements). Added a Knowledge Graph chain (schema → entity recognition → KG node → downstream E-E-A-T-aligned heuristics), and aligned the schema-encodes-E-E-A-T framing with the cluster-wide hygiene-factor stance.
  - Entity-based SEO
    Search foundations
    Propagation patch from knowledge-graph peer review: replaced 'three high-trust links is the typical floor' with the 2-4 range used cluster-wide and added the 'not vendor-documented' caveat, matching the just-revised E-E-A-T, authority-signals, and knowledge-graph entries.
  - Entity-based SEO
    Search foundations
    Confident engine-behavior claims softened to match the cluster: 'rely heavily on entity resolution before retrieval' became 'one mechanism among several'; 'most major ranking factors are entity-mediated' became a practitioner observation; 'materially higher citation rates' grounded in the Ahrefs December 2025 study (with its correlation-not-causation disclaimer); the 'close to a prerequisite' FAQ claim now acknowledges the Wikipedia counter-example. Footnote upgraded to the specific Wikidata Notability policy page.
  - FAQ Schema
    Schema cluster
    Surfaced in the ChatGPT search sources panel for 'definition of FAQ Schema' (with a 'Yesterday' indexing timestamp) but not in the answer's primary source list; ChatGPT cited Google Search Central + schema.org + 3 industry blogs instead. A recurring pattern: where vendor-canonical docs exist (Google and schema.org both publish FAQPage docs), engines surface third-party glossary entries in retrieval but prefer the vendor sources for citation. citationStatus.chatgpt unchanged (the binary does not model 'surfaced but not cited').
  - Freshness signals
    Search foundations
    Rewritten using Ahrefs' July 2025 primary 17M-citation study. Most material change: Google AI Overview is the only AI engine that cites slightly OLDER content than Google's organic SERP (16 days older), against the prior framing that grouped it with ChatGPT/Perplexity as freshness-preferring. Added the per-engine table, the '2.9-year-average' caveat (most cited content is still long-lived), and softened the date-spoofing-penalty claim to acknowledge no engine has published a penalty policy.
  - Hallucination grounding
    AI behavior
    Propagation patch from cite-ability peer review: softened 'Only cite-able passages survive the grounding filter' to a relational framing: cite-able passages are the form most likely to survive, but exact grounding logic is engine-specific.
  - HowTo Schema
    Schema cluster
    Softened the 'mismatched schema types tend to be ignored or down-weighted by AI engines' claim: Google's 'Spammy structured markup' manual action is documented for deliberate misuse, but no engine has published an AI-citation down-weighting policy for schema-type-to-content-type mismatches.
  - Knowledge Graph
    Search foundations
    Major corrections after a Wikidata notability re-check: 'three high-trust links is the floor' became the cluster-wide 2-4 range; the 'Q-number is the strongest single signal' superlative softened (no controlled comparison exists); 'content is homeless' now acknowledges the Wikipedia counter-example. Removed a fabricated claim (Wikidata's notability page does not say 'primary sources count, unlike Wikipedia'). Footnote upgraded to the specific Notability policy page. Added the Ahrefs December 2025 anchor and the Bing AI Performance reference.
  - Microsoft Copilot citations
    Citation surfaces
    Footnote 2 URL was wrong (the 'april-2024' path does not exist); replaced with the real AI Performance post at blogs.bing.com/webmaster/February-2026/..., published 2026-02-10 by four Microsoft Product Managers. Other corrections: 'in beta' updated to 'public preview' (vendor wording); the five specific dashboard metrics are now listed; the four Copilot surfaces are now split by data source so readers do not confuse Microsoft 365 Copilot (internal tenant data, not GEO-addressable) with the web-grounded surfaces.
  - Passage-level optimization
    Retrieval pipeline
    Propagation patch from cite-ability peer review: softened 'directly amplifies cite-ability' to a relational framing: well-built passages are the form most likely to be quoted intact, not an automatic citation amplifier.
  - Pillar content
    Search foundations
    Reframed pillar-first instead of mirroring topic-clusters. Several confident engine-behavior claims softened to practitioner observation ('entity canonicalization'; 'signals earn cluster recognition in Knowledge Graph layers'; 'engines cite well-built clusters as canonical'). Critical fix: the FAQPage JSON-LD spoke-page recommendation now carries the May 7, 2026 rich-result deprecation caveat. Aggarwal footnote added with per-method PAWC + glossary-inference label. Measurement methodology cross-references topic-clusters.
  - RAG (Retrieval-Augmented Generation)
    Retrieval pipeline
    Softened the freshness-penalty language: no engine has published a penalty policy, so 'discount or ignore stale signals' replaces 'penalize'. Aligns with the freshness-signals page's reading of the Ahrefs July 2025 primary data.
  - Search Generative Experience (SGE)
    Search foundations
    Added a caveat to the 'translate SGE techniques to AI Overview directly' guidance: FAQPage rich results were fully deprecated on May 7, 2026, so the older SGE-era playbook of shipping FAQ schema for SERP visual treatment no longer applies.
  - Search Generative Experience (SGE)
    Search foundations
    Softened the 'SGE and AI Overview are directly continuous' framing (Google has not publicly documented architectural continuity across the rebrand) and dropped the 'differences are mostly cosmetic' claim. Labeled the 'classical SEO era / GEO-mainstreaming era' periodization as glossary editorial framing rather than industry consensus. Corrected the SERP-surface description: the knowledge panel typically renders in the right rail on desktop, not above the blue links.
  - Topic clusters
    Search foundations
    Several confident engine-behavior claims softened to practitioner observation to match the cluster ('engines treat the cluster as canonical, raising citation rates'; 'DefinedTermSet converts the cluster into an entity collection'). Critical fix: the FAQPage JSON-LD spoke-page recommendation now carries the May 7, 2026 rich-result deprecation caveat (it was recommending schema for a SERP feature that no longer renders). Aggarwal footnote precision-fixed (per-method PAWC; the 'spoke pages need cite-ready' framing is glossary inference). Added a measurement methodology block.
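
The Citation share revision above settles the formula denominator on citation instances rather than unique sources. As a rough illustration of what that choice means in practice, the sketch below computes a domain-level share from a list of observed citations. The function name `citation_share`, the sample URLs, and the decision to group at domain level are assumptions made for illustration; none of them come from the glossary entry itself.

```python
from collections import Counter
from urllib.parse import urlparse


def citation_share(cited_urls: list[str], domain: str) -> float:
    """Share of all citation instances that point at `domain`.

    The denominator counts every citation instance, so a URL cited
    in three answers contributes three to the total, not one.
    """
    if not cited_urls:
        return 0.0
    # Group at domain level; URL-level or brand-level grouping would differ.
    per_domain = Counter(urlparse(url).netloc for url in cited_urls)
    return per_domain[domain] / len(cited_urls)


# Hypothetical sample: four citation instances across three unique URLs.
sample = [
    "https://example.com/a",
    "https://example.com/a",
    "https://example.org/b",
    "https://example.net/c",
]
print(f"{citation_share(sample, 'example.com'):.2f}")
```

With a unique-source denominator the same sample would give one of three rather than two of four, which is why the entry spells out the denominator explicitly.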
15. Sat, May 16 21 revisions
  - Agentic retrieval
    Retrieval pipeline
    Clarified the Ahrefs 12% AI-citation figure (it averages 5 engines including Perplexity's ~29%, hiding that ChatGPT/Gemini/Copilot are closer to ~8%); softened the query fan-out claim from 'direct evidence' to 'consistent with' Ahrefs' own framing; downgraded llms.txt advice to acknowledge no major engine has publicly confirmed using it.
  - Agentic retrieval
    Retrieval pipeline
    Softened remaining behavioral claims about AI agent architectures, source preferences, and content-condensation behavior to better match available evidence.
  - AI Overview
    Search foundations
    Updated Ahrefs measurements: their Feb-Mar 2026 study revised the AI Overview citation overlap with Google's top 10 from 76% down to 38%, and the cross-engine 12% headline was decomposed into per-engine numbers showing Perplexity at ~29% and ChatGPT/Gemini/Copilot at ~7-8%.
  - AI Overview citation
    Citation surfaces
    Updated Ahrefs AI Overview citation data (March 2026 study, ~38% of cited URLs in top 10 down from earlier ~76%); noted the drop reflects both broader candidate pool and improved measurement methodology.
  - AI Overview citation
    Citation surfaces
    Integrated full Ahrefs March 2026 breakdown (37.9% top 10 / 31.2% ranked 11-100 / 31.0% beyond top 100, plus YouTube at 5.6% of all cited URLs); rewrote FAQ to clarify top-10 ranking is the strongest single observable predictor but not a necessary condition; aligned schema framing with the GEO and AEO entries; added Google's May 7, 2026 FAQ rich results deprecation context; softened E-E-A-T author markup claim to acknowledge Google has not confirmed any specific weighting.
  - AI Search Optimization
    Umbrella terms
    Acknowledged that the AIO acronym is also widely used in SEO coverage to mean AI Overview (Google's SERP feature); corrected the NBER paper to July 2025 with proper author attribution (Chatterji et al.); clarified that AIO's umbrella positioning over GEO and AEO is contested in industry usage; softened several engine-behavior claims to match available evidence.
  - Answer Engine Optimization
    Umbrella terms
    Added Google's full FAQ rich results deprecation on May 7, 2026 (visual SERP treatment removed for all sites); corrected featured snippets launch date to January 2014; downgraded the 'FAQ schema is dominant AEO input' framing to match available evidence; hedged the AI Overview volume and GEO-subsumes-AEO positioning claims.
  - Attribution rate
    Citation metrics
    Updated Ahrefs AI citation data with proper per-engine framing (the 12% figure averages 5 measurements including Perplexity's ~29%); anchored the AI Overview 38-76% range to its source dates and methodology.
  - Attribution rate
    Citation metrics
    Added Microsoft's Bing Webmaster Tools AI Performance dashboard (public preview February 2026, the first major-engine native AI citation dashboard); clarified the formula with explicit denominator choices; distinguished AI-search attribution rate from traditional marketing attribution; resolved a page-internal contradiction between the body and FAQ on query sample size; added FAQ entries on metric distinctions and citation-vs-CTR.
  - BreadcrumbList Schema
    Schema cluster
    Initial publish
  - Cite-ability
    Citation metrics
    First 5-engine citation probe: 0 of 5 cited, despite the page ranking #2 on Google organic search for the same query and Google's AI Overview answering it from other sources. A live demonstration of the gap between organic ranking and AI citation that GEO programs target.
  - FAQ Schema
    Schema cluster
    Added Google's full FAQ rich results deprecation on May 7, 2026 (visual SERP treatment removed for all sites, including previously-protected government and health categories); flagged that Rich Results Test FAQ support retires June 2026 and Search Console API support retires August 2026; softened claims about AI engines citing FAQ-marked content.
  - Featured snippets
    Search foundations
    Added Google's May 7, 2026 full FAQ rich result deprecation; flagged that Rich Results Test FAQ support retires June 2026 (use schema.org validator as fallback); softened the 40-60 word snippet length from 'practitioner consensus' to a common heuristic that varies by query and platform.
  - Generative Engine Optimization
    Umbrella terms
    Removed the 'schema markup is the dominant GEO signal' framing: the cited Aggarwal 2023 paper actually shows that content-level edits (authoritative tone, source citation, direct quotation) drove the largest measured visibility gains, and schema was not tested. Updated Ahrefs measurements (Feb-Mar 2026 study revised AI Overview citation overlap from 76% down to 38%) and decomposed the 12% cross-engine overlap into per-engine numbers. Added a scope note distinguishing Aggarwal's broader visibility-optimization definition from this glossary's narrower citation-focused operational definition.
  - HowTo Schema
    Schema cluster
    Added context for Google's May 7, 2026 full FAQ rich result deprecation to the parallel FAQ reference; softened the 'AI Overview and ChatGPT cite step-by-step content disproportionately' claim to a practitioner observation without vendor confirmation.
  - LLM Optimization (LLMO)
    Umbrella terms
    Surfaced the LLMO acronym in the title and primary H2 (Google Search Console showed 8 impressions for the what-is-llmo cluster at an average position of 32.9)
  - LLM Optimization (LLMO)
    Umbrella terms
    llms.txt bullet downgraded per cross-term Rule 9 calibration: 'observed fetching' → 'practitioners report seeing'; added Google's public statement that it does not use llms.txt; framed upside as opt-in insurance not measurable lift
  - LLMS.txt
    Infrastructure
    Status + How-to-apply sections recalibrated per Rule 9 confidence tiers: added Google's public 'we don't use llms.txt' statement; 'have been observed' downgraded to 'practitioners report seeing'; framed upside as opt-in insurance to avoid implying vendor commitment that doesn't exist
  - RAG (Retrieval-Augmented Generation)
    Retrieval pipeline
    Rewrote the 'how do I optimize for RAG' FAQ to use the Aggarwal 2023 paper's actual method names (Statistics Addition, Cite Sources, Quotation Addition, Fluency Optimization), and noted that 'statistical density' is practitioner shorthand rather than a paper-defined metric.
  - Statistical Density
    GEO content methods
    Reframed 'statistical density' as a practitioner-coined shorthand rather than a metric defined in the Aggarwal et al. 2023 paper. The paper actually tests a content edit called Statistics Addition, measured against the Position-Adjusted Word Count (PAWC) metric, not a sentence-level density ratio. Page rewritten to use correct PAWC numbers, distinguish intervention vs correlation, and add anti-stuffing discipline (relevant + sourced + non-redundant statistics only).
  - Sycophancy vs cite-able fact
    AI behavior
    Replaced two stale examples that were themselves not cite-able by the time of review: the 'Aug 2023 FAQ rich results limited to gov/health' fact (now superseded by the May 7, 2026 full deprecation) and a Princeton-paper claim that misstated both the term used and the number of top-performing levers. New examples use paper-accurate PAWC numbers and the current deprecation date.
16. Thu, May 14 18 revisions
  - Article Schema
    Schema cluster
    Initial publish
  - Authority signals
    Search foundations
    Initial publish
  - BM25
    Retrieval pipeline
    Initial publish
  - Entity-based SEO
    Search foundations
    Initial publish
  - Featured snippets
    Search foundations
    Initial publish
  - Freshness signals
    Search foundations
    Initial publish
  - Generative search index
    Retrieval pipeline
    Initial publish
  - HowTo Schema
    Schema cluster
    Initial publish
  - Hybrid retrieval
    Retrieval pipeline
    Initial publish
  - Inverted index
    Retrieval pipeline
    Initial publish
  - JSON-LD
    Schema cluster
    Initial publish; foundational schema-infrastructure term
  - Microsoft Copilot citations
    Citation surfaces
    Initial publish
  - Pillar content
    Search foundations
    Initial publish
  - Reranking
    Retrieval pipeline
    Initial publish
  - Search Generative Experience (SGE)
    Search foundations
    Initial publish: historical Google AI-search term
  - Sub-passage extraction
    Retrieval pipeline
    Initial publish
  - Sycophancy vs cite-able fact
    AI behavior
    Initial publish
  - Topic clusters
    Search foundations
    Initial publish
17. Wed, May 13 28 revisions
  - Agentic retrieval
    Retrieval pipeline
    Initial publish
  - AI crawler bots
    Infrastructure
    Initial publish
  - AI Overview
    Search foundations
    Initial publish
  - AI Overview citation
    Citation surfaces
    Initial publish
  - AI Search Optimization
    Umbrella terms
    Initial publish
  - Answer block
    Search foundations
    Initial publish
  - Answer Engine Optimization
    Umbrella terms
    Initial publish
  - Attribution rate
    Citation metrics
    Initial publish: GEO measurement KPI
  - Attribution rate
    Citation metrics
    First probe round (5 engines × 3 standard queries); all not-cited, site <12h old, expected baseline
  - Brand mentions in AI answers
    Citation metrics
    Initial publish
  - Citation match rate
    Citation metrics
    Initial publish
  - Citation share
    Citation metrics
    Initial publish
  - Cite-ability
    Citation metrics
    Initial publish
  - DefinedTerm schema
    Schema cluster
    Initial publish: schema-class term with inline JSON-LD example
  - DefinedTerm schema
    Schema cluster
    First probe round (5 engines × 3 standard queries); all not-cited, site <12h old, expected baseline
  - E-E-A-T (AI search context)
    Search foundations
    Initial publish
  - FAQ Schema
    Schema cluster
    Initial publish
  - Generative Engine Optimization
    Umbrella terms
    Initial publish: foundational hub term
  - Generative Engine Optimization
    Umbrella terms
    First probe round (5 engines × 3 standard queries); all not-cited, site <12h old, expected baseline
  - Hallucination grounding
    AI behavior
    Initial publish
  - Knowledge Graph
    Search foundations
    Initial publish
  - LLM Optimization (LLMO)
    Umbrella terms
    Initial publish
  - LLMS.txt
    Infrastructure
    Initial publish
  - Passage-level optimization
    Retrieval pipeline
    Initial publish
  - RAG (Retrieval-Augmented Generation)
    Retrieval pipeline
    Initial publish
  - Statistical Density
    GEO content methods
    Initial publish
  - Sub-document retrieval
    Retrieval pipeline
    Initial publish
  - Vector embeddings
    Retrieval pipeline
    Initial publish
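
The probe-round entries above report results as a tally across five engines (for example 0/5 moving to 1/5 once citationStatus.chatgpt flips to cited). A minimal sketch of how such a tally could be derived from a per-engine status record follows. Only the `chatgpt` key and the status values 'cited' and 'not-cited' appear in the changelog; the other key names, the `ENGINES` tuple, and the `cited_tally` helper are hypothetical.

```python
# Engine keys: "chatgpt" appears in the changelog; the rest are assumed names.
ENGINES = ("chatgpt", "perplexity", "claude", "copilot", "gemini")


def cited_tally(citation_status: dict[str, str]) -> str:
    """Return an 'n/5'-style tally of engines whose status is 'cited'."""
    cited = sum(1 for engine in ENGINES if citation_status.get(engine) == "cited")
    return f"{cited}/{len(ENGINES)}"


# Hypothetical record resembling a first probe round: nothing cited yet.
baseline = {engine: "not-cited" for engine in ENGINES}
# The same record after one engine flips to cited.
after_flip = {**baseline, "chatgpt": "cited"}

print(cited_tally(baseline))
print(cited_tally(after_flip))
```

A binary status like this cannot represent the 'surfaced but not cited' case noted in the FAQ Schema entry; that would need a third value.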
