Cluster discipline

Many terms in this glossary belong to clusters of related concepts (often 5-10 entries, sometimes more as a cluster matures). The cluster groupings we currently treat as load-bearing for consistency checks are listed below. For clusters drawn from the same paper or vendor doc, every entry runs through a consistency checklist before publication or any substantive update.

Known clusters as of 2026-05-27

The list grows and shifts as new entries are added and cluster boundaries firm up. Some entries belong to more than one cluster (for example, sub-document retrieval sits in both the retrieval pipeline cluster and the content structure cluster); the consistency rules apply across each cluster an entry belongs to.

Citation metrics cluster (9 entries + 1 pillar): attribution rate, citation share, citation match rate, cite-ability, citation velocity, citation rotation, brand mentions in AI answers, citation vs mention vs link, plus the AI citation metrics pillar as the cluster's synthesis hub.
Citation surfaces cluster (9 entries + 1 historical): AI Overview, AI Overview citation, AI Mode, Microsoft Copilot citations, Perplexity citation, Claude citation, Gemini citation, ChatGPT search citation, AI dev tool citations, plus Search Generative Experience as the historical entry.
Retrieval pipeline cluster (9 entries): BM25, vector embeddings, hybrid retrieval, reranking, inverted index, sub-document retrieval, sub-passage extraction, RAG, agentic retrieval.
Schema cluster (6 entries): DefinedTerm schema, FAQ schema, Breadcrumb list, JSON-LD, Article schema, HowTo schema.
GEO content methods cluster (6 entries, Aggarwal 2023-derived): Quotation Addition, Cite Sources Optimization, Fluency Optimization, Statistical Density, Authoritative Statement Strength, Definition-Lead Style.
Optimization umbrella cluster (4 entries): Generative Engine Optimization, Answer Engine Optimization, AI Search Optimization, LLM Optimization.
Content structure cluster (5 entries, in formation): answer block, pillar content, topic clusters, featured snippets, passage-level optimization.
Authority and entity cluster (5 entries, in formation): E-E-A-T in AI search, entity-based SEO, knowledge graph, authority signals, freshness signals.
Infrastructure cluster (4 entries, in formation): AI crawler bots, IndexNow protocol, llms.txt, generative search index.

The two smallest clusters are seeds awaiting more members:

AI behaviour cluster (2 entries, seed): hallucination grounding, sycophancy vs cite-able fact.
Traffic measurement cluster (1 entry, seed): external traffic disambiguation.

Cluster sizes shift as the corpus grows; the in-formation clusters typically become stable at 5+ members. The full per-entry mapping (cluster + task + level) is tracked in plan/content-type-strategy.md section 8.

The 5 cluster consistency rules

Paper-vs-glossary naming. When an entry's glossary title diverges from the source paper's method name (for example "Cite Sources Optimization" vs the paper's "Cite Sources"), the entry's FAQ includes a one-question explainer so readers searching either name land in the same place and understand what the suffix means.
Verbatim paper claims. When the same finding is referenced across sibling entries, the wording is lifted verbatim from one canonical mention, including section / figure / subset qualifiers. The same number means the same phrasing on every page. This rule has caught at least three cluster contradictions where one entry described a paper's methodology one way and a sibling entry described it differently; the verbatim discipline forces the divergence to surface during peer review.
Measured vs hand-wave combinations. When a paper actually measured a combination of two methods, the entry calls it out as paper-measured and points to the section reference. When practitioners often pair methods without the paper testing the pairing, the entry says so in a separate "How it relates" section, not as a confident rule.
Cross-link completeness. Cluster entries link to all sibling concepts, the bridge entry, the structural siblings (answer block, passage-level optimization, etc.), and the cluster's umbrella term. If entry A lists B as related, entry B lists A back. The "Mentioned in" block at the bottom of each term page is auto-generated from those back-links; an asymmetric graph is a smell that one direction of the relationship was missed.
Working assumption sentence. For practitioner concepts where direct 2026-engine measurement does not yet exist, the Status section ends with a labeled "Working assumption" line: an honest practitioner prescription paired with an explicit note that the lift claim has not been independently measured on current commercial engines.
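
The cross-link completeness rule lends itself to a mechanical check. The following is a minimal sketch, not the site's actual tooling: it assumes each entry's related terms are available as a mapping from entry slug to a set of related slugs. The `related` mapping, the sample slugs, and both function names are hypothetical.

```python
from collections import defaultdict


def find_asymmetric_links(related):
    """Return sorted (source, target) pairs where source lists target
    as related but target does not list source back."""
    missing = []
    for source, targets in related.items():
        for target in targets:
            if source not in related.get(target, set()):
                missing.append((source, target))
    return sorted(missing)


def build_mentioned_in(related):
    """Invert the related-terms graph: for each entry, collect the
    entries that link to it (the raw data for a "Mentioned in" block)."""
    mentioned_in = defaultdict(set)
    for source, targets in related.items():
        for target in targets:
            mentioned_in[target].add(source)
    return dict(mentioned_in)


# Hypothetical sample graph; the slugs are illustrative only.
related = {
    "bm25": {"hybrid-retrieval", "inverted-index"},
    "hybrid-retrieval": {"bm25", "vector-embeddings"},
    "vector-embeddings": {"hybrid-retrieval"},
    "inverted-index": set(),  # does not link back to bm25
}

asymmetric = find_asymmetric_links(related)
mentioned_in = build_mentioned_in(related)
```

In this sample, `asymmetric` holds the single pair `("bm25", "inverted-index")`, which is the kind of one-directional relationship the rule treats as a smell.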

Cluster silent contradictions

One failure mode is worth naming separately: a single entry is revised, but the revision is not propagated to its cluster siblings. We have caught at least three instances of this during cluster review, each different in shape but the same in pattern. In every case, the review of one entry surfaced a phrasing improvement that was applied locally but not propagated, leaving the sibling entry with its previous wording even though both pages referenced the same underlying source:

Featured snippets vs answer block. The same Google Search Central documentation URL was cited in both entries but described as supporting subtly different claims about structured-response selection. Both readings were defensible from the source, but neither acknowledged the other reading existed. Resolution: both entries now reference the source with the same qualifier and explicitly link the sibling entry.
BM25 vs hybrid retrieval. The front-loading mechanism (whether and why front-loaded text gets retrieval-weighted) was described one way in BM25 and a slightly different way in hybrid retrieval. The contradiction was small enough to slip past per-entry review but visible when the two pages were read side by side. Resolution: a single mechanism description is now reused verbatim across both entries, with explicit "see also" cross-linking.
Vector embeddings vs hybrid retrieval + BM25. Front-loading framing and the embedding-dimension discussion were both inconsistent across three entries. Resolution: one canonical phrasing per concept, applied across all three, with the cluster review checklist updated to grep sibling entries for the same mechanism words before commit.

The workflow rule that resulted: whenever an entry is revised and a cluster sibling has already been peer-reviewed, the revision includes a grep of the cluster siblings for the same mechanism words, and the new page must use the cluster-consistent framing verbatim. Cluster contradictions are not just an editorial nuisance; a reader cross-checking two entries will notice them and lose trust.
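
The sibling grep check can be sketched as a small script. This is an illustration under stated assumptions, not the actual review tooling: it assumes entry texts are available as a mapping from slug to plain text, and it treats any difference in the sentences that mention a mechanism word as a divergence to review by hand. The function names, slugs, and sample texts are hypothetical placeholders, not wording from the real entries.

```python
import re


def sentences_mentioning(text, term):
    """Return the sentences in `text` that contain `term` (case-insensitive)."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if term.lower() in s.lower()]


def divergent_phrasings(entries, term):
    """Group entries by the exact sentences they use for `term`.

    Returns an empty dict when every entry that mentions the term uses
    identical sentences; otherwise returns the groups for manual review.
    """
    groups = {}
    for slug, text in entries.items():
        hits = tuple(sentences_mentioning(text, term))
        if hits:
            groups.setdefault(hits, []).append(slug)
    return groups if len(groups) > 1 else {}


# Placeholder texts only; they do not reproduce the real entries.
entries = {
    "bm25": "Intro sentence. Front-loading: placeholder wording A.",
    "hybrid-retrieval": "Intro sentence. Front-loading: placeholder wording B.",
    "reranking": "Reranking: placeholder wording C.",
}

front_loading_groups = divergent_phrasings(entries, "front-loading")
reranking_groups = divergent_phrasings(entries, "reranking")
```

Here `front_loading_groups` contains two groups (one per wording), flagging the two siblings for review, while `reranking_groups` is empty because only one entry mentions the term. Exact sentence matching is deliberately strict, which fits the verbatim rule but will also flag harmless differences.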

Why the cluster discipline is visible from the outside

Cluster work is internal craft, but readers see its surface effects:

Consistent wording across related entries (the same paper's finding reads the same way on every entry that cites it).
Symmetric Related-terms graphs (no orphan back-links).
Explicit labels separating paper findings from practitioner inference.
"Mentioned in" sections that surface the bidirectional connection without requiring manual maintenance.

Readers do not need to know the discipline exists for it to do its work. They notice when it is missing (an entry contradicts another, a related-terms link goes one way only, a paper finding is described differently in two places). The cluster work is what lets the corpus stay coherent as it grows past 60 terms.

How this connects to other parts of the site

The editorial-methodology page describes the per-entry workflow; this page describes the cross-entry consistency layer on top.
The ai-citation-metrics pillar is an example of the cluster discipline in pillar form: a synthesis page that comes out only after the 6 anchor entries are all peer-reviewed and cluster-consistent.
