Cluster discipline
How GEO Glossary keeps related entries consistent. The 5 consistency rules that apply to clusters of 5-10 related concepts, and why cross-page consistency is a load-bearing trust signal.
Many terms in this glossary belong to clusters of related concepts (often 5-10 entries, sometimes more as a cluster matures). The cluster groupings we currently treat as load-bearing for consistency checks are listed below. For clusters drawn from the same paper or vendor doc, every entry runs through a consistency checklist before publication and before any substantive update.
Known clusters as of 2026-05-27
The list grows and shifts as new entries are added and cluster boundaries firm up. Some entries belong to more than one cluster (for example, sub-document retrieval sits in both the retrieval pipeline cluster and the content structure cluster); the consistency rules apply across each cluster an entry belongs to.
- Citation metrics cluster (6 anchor entries + 1 pillar): attribution rate, citation share, citation match rate, cite-ability, citation velocity, citation rotation. Synthesised in the AI citation metrics pillar.
- Citation surfaces cluster (8 anchor entries): AI Overview citation, AI Mode, Microsoft Copilot citations, Perplexity citation, Claude citation, Gemini citation, ChatGPT search citation, AI dev tool citations.
- Retrieval pipeline cluster: BM25, vector embeddings, hybrid retrieval, reranking, inverted index, sub-document retrieval, sub-passage extraction, passage-level optimization, RAG.
- Schema cluster: DefinedTerm schema, FAQ schema, Breadcrumb list, JSON-LD, Article schema, HowTo schema.
- GEO content methods cluster (Aggarwal 2023-derived): Quotation Addition, Cite Sources Optimization, Fluency Optimization, Statistical Density, Authoritative Statement Strength.
- Optimization umbrella cluster: Generative Engine Optimization, Answer Engine Optimization, AI Search Optimization, LLM Optimization.
Other groupings (authority signals, content structure, indexing and crawling, AI failure modes, traffic-side measurement) are still in formation as their member count grows; they are tracked internally but not yet treated as load-bearing for cross-page consistency in the same way as the six above.
The 5 cluster consistency rules
- Paper-vs-glossary naming. When an entry's glossary title diverges from the source paper's method name (for example "Cite Sources Optimization" vs the paper's "Cite Sources"), the entry's FAQ includes a one-question explainer so readers searching either name land in the same place and understand what the suffix means.
- Verbatim paper claims. When the same finding is referenced across sibling entries, the wording is lifted verbatim from one canonical mention, including section / figure / subset qualifiers. The same number means the same phrasing on every page. This rule has caught at least three cluster contradictions where one entry described a paper's methodology one way and a sibling entry described it differently; the verbatim discipline forces the divergence to surface during peer review.
- Measured vs hand-wave combinations. When a paper actually measured a combination of two methods, the entry calls it out as paper-measured and points to the section reference. When practitioners often pair methods without the paper testing the pairing, the entry says so in a separate "How it relates" section, not as a confident rule.
- Cross-link completeness. Cluster entries link to all sibling concepts, the bridge entry, the structural siblings (answer block, passage-level optimization, etc.), and the cluster's umbrella term. If entry A lists B as related, entry B lists A back. The "Mentioned in" block at the bottom of each term page is auto-generated from those back-links; an asymmetric graph is a smell that one direction of the relationship was missed.
- Working assumption sentence. For practitioner concepts where direct 2026-engine measurement does not yet exist, the Status section ends with a labeled "Working assumption" line: an honest practitioner prescription paired with an explicit note that the lift claim has not been independently measured on current commercial engines.
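The cross-link completeness rule lends itself to an automated check. The following is a minimal sketch, not the site's actual tooling: it assumes the Related-terms graph is available as a plain mapping from entry slug to the set of slugs it lists as related. The function names (`find_asymmetric_links`, `mentioned_in`) and the sample slugs are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set, Tuple


def find_asymmetric_links(related: Dict[str, Set[str]]) -> List[Tuple[str, str]]:
    """Return (a, b) pairs where entry a lists b as related but b does not list a back."""
    missing = []
    for source, targets in related.items():
        for target in targets:
            if source not in related.get(target, set()):
                missing.append((source, target))
    return sorted(missing)


def mentioned_in(related: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Invert the graph: for each term, collect the entries that link to it."""
    back_links: Dict[str, Set[str]] = {}
    for source, targets in related.items():
        for target in targets:
            back_links.setdefault(target, set()).add(source)
    return back_links


# Hypothetical sample graph: "bm25" links to "reranking", but not the reverse.
related = {
    "bm25": {"hybrid-retrieval", "reranking"},
    "hybrid-retrieval": {"bm25", "vector-embeddings"},
    "vector-embeddings": {"hybrid-retrieval"},
    "reranking": set(),
}

print(find_asymmetric_links(related))  # [('bm25', 'reranking')]
```

An empty result means the graph is symmetric. Each reported pair names the entry that still needs to list its sibling back.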
Cluster silent contradictions
A failure mode worth naming separately: a single entry is revised, but the revision is not propagated to the cluster's sibling entries. We have caught at least three instances of this during cluster review, each different in shape but identical in pattern. In every case, the review of one entry surfaced a phrasing improvement that was applied locally and never propagated, leaving the sibling entry in its previous wording even though both pages referenced the same underlying source:
- Featured snippets vs answer block. The same Google Search Central documentation URL was cited in both entries but described as supporting subtly different claims about structured-response selection. Both readings were defensible from the source, but neither acknowledged the other reading existed. Resolution: both entries now reference the source with the same qualifier and explicitly link the sibling entry.
- BM25 vs hybrid retrieval. The front-loading mechanism (whether and why front-loaded text gets retrieval-weighted) was described one way in BM25 and a slightly different way in hybrid retrieval. The contradiction was small enough to slip past per-entry review but visible when the two pages were read side by side. Resolution: a single mechanism description is now reused verbatim across both entries, with explicit "see also" cross-linking.
- Vector embeddings vs hybrid retrieval and BM25. The front-loading framing and the embedding-dimension discussion were both inconsistent across the three entries. Resolution: one canonical phrasing per concept, applied across all three, with the cluster review checklist updated to grep sibling entries for the same mechanism words before commit.
The workflow rule that resulted: whenever a cluster sibling has already been peer-reviewed, a new revision greps the cluster siblings for the same mechanism words and verifies that the new page uses the cluster-consistent framing verbatim. Cluster contradictions are not just an editorial nuisance; a reader cross-checking two entries will notice them and lose trust.
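The grep step in this workflow rule can be approximated in a few lines. The sketch below rests on assumptions: entry bodies are held in memory as plain strings keyed by slug, sentences are split naively on end punctuation, and "same framing" means the sentences mentioning the mechanism word are character-for-character identical. The function names and the placeholder entry text are hypothetical, not the glossary's actual content or tooling.

```python
import re
from typing import Dict, List


def sentences_mentioning(text: str, term: str) -> List[str]:
    """Return the sentences in text that mention term (case-insensitive)."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    return [s for s in sentences if pattern.search(s)]


def divergent_framings(entries: Dict[str, str], term: str) -> Dict[str, List[str]]:
    """Map each entry that mentions term to its matching sentences.

    Returns an empty dict when every mentioning entry uses identical wording.
    """
    hits = {slug: sentences_mentioning(body, term) for slug, body in entries.items()}
    hits = {slug: found for slug, found in hits.items() if found}
    distinct_wordings = {tuple(found) for found in hits.values()}
    return hits if len(distinct_wordings) > 1 else {}


# Placeholder bodies standing in for real entry text.
entries = {
    "bm25": "Placeholder intro for BM25. Front-loading is described with wording A.",
    "hybrid-retrieval": "Placeholder intro for hybrid retrieval. Front-loading is described with wording B.",
    "reranking": "Placeholder intro for reranking.",
}

for slug, found in divergent_framings(entries, "front-loading").items():
    print(slug, found)
```

Any non-empty result is a prompt for a human to read the flagged sentences side by side. The check cannot decide which wording is canonical; it only shows that the siblings have diverged.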
Why the cluster discipline is visible from the outside
Cluster work is internal craft, but readers see its surface effects:
- Consistent wording across related entries (the same paper's finding reads the same way on every entry that cites it).
- Symmetric Related-terms graphs (no orphan back-links).
- Explicit labels separating paper findings from practitioner inference.
- "Mentioned in" sections that surface the bidirectional connection without requiring manual maintenance.
Readers do not need to know the discipline exists for it to do its work. They notice when it is missing (an entry contradicts another, a related-terms link goes one way only, a paper finding is described differently in two places). The cluster work is what lets the corpus stay coherent as it grows past 60 terms.
How this connects to other parts of the site
- The editorial-methodology page describes the per-entry workflow; this page describes the cross-entry consistency layer on top.
- The ai-citation-metrics pillar is an example of the cluster discipline in pillar form: a synthesis page that is published only after all 6 anchor entries are peer-reviewed and cluster-consistent.
Other About pages
- Editorial methodology. How every GEO Glossary entry is drafted, fact-audited, peer-reviewed, and schema-validated before publish. The 4-step workflow + the fact-check cadence.
- Citation tracking. How the citation status badges on each term page move from untested to cited or not-cited. The multi-source signals, the discipline, and the measurement gaps we openly acknowledge.
- Why this exists. Why GEO Glossary is positioned as an editorial-first, vendor-neutral, indie-maintained terminology reference for AI search in 2026 (we are not aware of another reference combining all three properties with public editorial methodology, but cannot claim to be the only one). The brand positioning, the founder lineage, and what we are explicitly not.