How is a topic cluster different from pillar content?

Pillar content is the hub page at the center of the cluster; the topic cluster is the full hub-and-spoke structure (pillar + spokes + their interlinking). The terms are often used interchangeably in marketing copy, but the pillar is one component of a cluster.

How many spoke pages per pillar?

Practitioners commonly land mature clusters in the 8–20 spoke range, but this is a heuristic; no engine has published a target, and the right number depends on how much real user-question surface the topic has. Quality matters more than count. Each spoke should answer a real user query independently, not exist only to support the pillar.

Is the topic cluster pattern still useful in AI search?

Yes, with a shifted value proposition. In classical SEO it concentrated PageRank on the pillar. In AI search the value shifts to topical clarity, internal discoverability, and passage-level usefulness; a well-built cluster may improve the likelihood that engines retrieve and cite relevant pillar or spoke pages for the topic. Whether the cluster structure itself (vs the cluster-level content quality) independently lifts citation rates has not been isolated by public study; measure directly per the methodology below rather than assuming the lift.


Topic clusters

Topic clusters are a content architecture pattern: one pillar page covers a topic broadly, multiple spoke pages drill into sub-topics, and all of them are interlinked. The pattern originated as classic SEO methodology and is increasingly adapted for AI-search topic coverage, entity clarity, and retrieval-friendly content organization.

Citation status

ChatGPT · Perplexity · Claude · Copilot · Gemini

Last checked 2026-05-20

What are topic clusters?

A topic cluster is a content architecture pattern: a long-form pillar page broadly covers a topic, and a set of narrower spoke pages each drill into one sub-topic. The pillar links out to every spoke; every spoke links back to the pillar. The structure was popularized by HubSpot's content marketing framework around 2017¹ and became standard SEO practice through the 2020s.
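
The linking rule above (the pillar links to every spoke, every spoke links back) can be checked mechanically. The following is a minimal sketch, assuming you already have a map of each page to its outgoing internal links; the function name `audit_cluster_links` and the URLs are hypothetical and used only for illustration.

```python
from typing import Dict, List, Set


def audit_cluster_links(
    pillar: str,
    spokes: List[str],
    outlinks: Dict[str, Set[str]],
) -> Dict[str, List[str]]:
    """Report spokes the pillar does not link to, and spokes that do not link back."""
    pillar_links = outlinks.get(pillar, set())
    # Spokes that the pillar fails to link out to.
    missing_from_pillar = [s for s in spokes if s not in pillar_links]
    # Spokes that fail to link back to the pillar.
    missing_backlink = [s for s in spokes if pillar not in outlinks.get(s, set())]
    return {
        "pillar_missing_link_to": missing_from_pillar,
        "spoke_missing_link_back": missing_backlink,
    }


# Hypothetical URLs, for illustration only.
links = {
    "/terms/topic-clusters": {"/terms/pillar-content", "/terms/entity-based-seo"},
    "/terms/pillar-content": {"/terms/topic-clusters"},
    "/terms/entity-based-seo": set(),  # this spoke does not link back to the pillar
}
report = audit_cluster_links(
    "/terms/topic-clusters",
    ["/terms/pillar-content", "/terms/entity-based-seo"],
    links,
)
print(report)
```

An empty list under both keys means the hub-and-spoke wiring is complete; it says nothing about content quality.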

The classical SEO rationale was PageRank concentration: many internal links flowing toward the pillar boosted its ranking authority. In the AI search era, the value proposition has shifted toward topical clarity, retrieval-friendly content organization, and entity coherence: a well-built cluster can make topical relationships clearer and may improve the likelihood that engines retrieve relevant pillar or spoke pages for queries about the topic. Practitioners commonly observe clustered content earning citations across both pillar and spokes more reliably than equivalent isolated pages, though the independent effect of cluster structure vs cluster-level content quality has not been isolated by public study, and citation behavior varies by engine, query, and source authority.

Status in 2026

Mainstream, but interpreted differently than in the 2017–2019 SEO playbooks, when HubSpot's framework first dominated practice. Spoke pages now need to be cite-able as standalone answers. Embedding-based retrieval and RAG systems typically chunk content at paragraph or sentence boundaries and score retrieved passages individually, so thin spokes that exist only as a PageRank funnel may underperform in AI-search retrieval²; exact chunking and scoring behavior varies per engine and is not vendor-documented (see the sub-document retrieval entry for parallel discussion). The 2026 best practice combines the original hub-and-spoke structure with DefinedTermSet-style schema markup for glossary-style clusters. This can make the cluster's collective scope more explicit and machine-readable at the schema layer, though whether engines treat schema-marked clusters as categorically distinct from interlinked content with the same topical coverage has not been isolated by public study.
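
To make the passage-level point concrete, here is a toy sketch of chunking a page at paragraph boundaries and scoring each passage on its own. It assumes blank-line paragraph breaks and uses plain query-term overlap as a stand-in for embedding similarity; real engines do not document their chunking or scoring, so treat every name and choice here (`chunk_paragraphs`, `score_passages`, the overlap metric) as an illustrative assumption.

```python
import re
from typing import List, Set, Tuple


def chunk_paragraphs(text: str) -> List[str]:
    """Split on blank lines (assumed chunk boundary; real engines may differ)."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def tokens(s: str) -> Set[str]:
    """Lowercase alphanumeric tokens, deduplicated."""
    return set(re.findall(r"[a-z0-9]+", s.lower()))


def score_passages(query: str, page_text: str) -> List[Tuple[float, str]]:
    """Score each passage independently by the share of query terms it contains."""
    q = tokens(query)
    scored = []
    for passage in chunk_paragraphs(page_text):
        overlap = len(q & tokens(passage)) / len(q) if q else 0.0
        scored.append((overlap, passage))
    # Highest-scoring passage first.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical page texts, for illustration only.
thin_spoke = (
    "Learn more about clusters on our pillar page.\n\n"
    "Click here for the full guide."
)
standalone_spoke = (
    "A spoke page answers one sub-topic of a topic cluster.\n\n"
    "Each spoke page links back to the pillar."
)
query = "what is a spoke page in a topic cluster"

best_thin = score_passages(query, thin_spoke)[0][0]
best_standalone = score_passages(query, standalone_spoke)[0][0]
print(best_thin, best_standalone)
```

Under this toy metric the standalone spoke's best passage outscores the thin spoke's, which is the intuition behind writing spokes as self-contained answers; it is not evidence of how any specific engine ranks passages.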

How to apply

Topic clusters are signal-stacking applied across a content family rather than a single page. Three concrete moves:

Define the spoke set before writing the pillar: list 15–30 sub-topics that fall under the pillar's domain. Each becomes a candidate spoke. A pillar without a defined spoke set tends to become a sprawling page that ranks for nothing in particular.
Make every spoke independently cite-able: each spoke should answer a real user query and stand alone in retrieval. Don't write spokes as funnel bait; write them as standalone answers that happen to live in a cluster. Where the spoke has real Q&A content, FAQPage JSON-LD remains a valid schema vocabulary for question-and-answer structure, but it no longer earns SERP rich results (Google fully deprecated FAQ rich results for all sites on May 7, 2026; see the FAQ schema entry). Ship it for the machine readability of the underlying Q&A structure, not for SERP visual treatment.
Wire the cluster with DefinedTermSet for terminology hubs: if your cluster is a glossary or jargon set, mark the pillar as DefinedTermSet and link each DefinedTerm spoke via hasDefinedTerm. This can make the cluster's collective scope more explicit and machine-readable at the schema layer; it does not guarantee that engines will treat the cluster as a recognized entity collection. Recognition still depends on consistency, source trust, and the kind of content-level signals discussed on the entity-based SEO and knowledge graph entries.
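
For the DefinedTermSet wiring, a minimal sketch of generating the pillar's JSON-LD might look like this. It uses the schema.org types and properties named above (DefinedTermSet, DefinedTerm, hasDefinedTerm) plus inDefinedTermSet for the back-reference; the helper name `build_defined_term_set` and the example.com URLs are hypothetical, and the exact set of properties to include is an editorial assumption rather than an engine requirement.

```python
import json
from typing import Any, Dict, List


def build_defined_term_set(
    set_name: str, set_url: str, terms: List[Dict[str, str]]
) -> Dict[str, Any]:
    """Build schema.org DefinedTermSet JSON-LD for a glossary-style pillar."""
    return {
        "@context": "https://schema.org",
        "@type": "DefinedTermSet",
        "@id": set_url,
        "name": set_name,
        "url": set_url,
        "hasDefinedTerm": [
            {
                "@type": "DefinedTerm",
                "@id": t["url"],
                "name": t["name"],
                "description": t["description"],
                "url": t["url"],
                # Back-reference from each term (spoke) to the set (pillar).
                "inDefinedTermSet": set_url,
            }
            for t in terms
        ],
    }


# Hypothetical glossary data, for illustration only.
jsonld = build_defined_term_set(
    "Example GEO Glossary",
    "https://example.com/terms",
    [
        {
            "name": "Topic clusters",
            "description": "A pillar page plus interlinked spoke pages.",
            "url": "https://example.com/terms/topic-clusters",
        },
        {
            "name": "Pillar content",
            "description": "The hub page at the center of a cluster.",
            "url": "https://example.com/terms/pillar-content",
        },
    ],
)
print(json.dumps(jsonld, indent=2))
```

The printed object would go inside a `<script type="application/ld+json">` tag on the pillar page. As noted above, valid markup makes scope explicit but does not guarantee engine recognition.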

What to skip: classical "PageRank funnel" thinking. The 2026 cluster goal is topical clarity and retrieval coverage, not link equity concentration. Spokes that exist only to feed the pillar tend to dilute rather than strengthen the cluster.

How to measure cluster effect on AI citation

Because the lift from cluster structure (vs cluster-level content quality) is not isolated by public study, the only reliable answer for any specific topic is to measure it directly. A practitioner protocol:

Fix a cluster-level prompt set before build-out: pick 15–30 user-question prompts that cover the pillar topic plus its main sub-topics. Lock the set; rotation breaks comparability across the measurement window.
Record citations per page-role separately: for each prompt, log whether the pillar was cited, whether any spoke was cited, whether a brand mention appeared without citation, and which engine surfaced what. Aggregating "the cluster was cited" hides whether the pillar is doing the work, the spokes are, or only one outlier spoke is.
Compute cluster-level citation share and attribution rate: roll up to the topic-cluster level rather than treating each URL independently; this is what makes the cluster framing operationally distinct from individual-page tracking (see the citation share and attribution rate entries).
Compare 4–8 weeks before vs after material build-out: ship the spoke set with a measured Day-0 baseline, then re-probe weekly. Treat the trend line as the signal, not any single probe.
For Microsoft Copilot specifically: Bing Webmaster Tools' AI Performance dashboard (public preview since 2026-02-10) surfaces per-page citation counts and grounding queries. It is the only vendor-native measurement source for any AI surface as of mid-2026.

This loop turns "cluster lifts citation" from a confident causal claim into a verifiable observation specific to your topic, audience, and engine mix.
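
The protocol above can be sketched as a small logging-and-rollup script. The record fields mirror the per-page-role items listed in the protocol. The two rollup formulas are assumptions made for this sketch, since the definitions live in the citation share and attribution rate entries: here, cluster citation share is the fraction of probes in which any cluster page was cited, and attribution rate is the fraction of probes where the brand surfaced at all (citation or mention) that carried a citation. Treating week 0 and earlier as the baseline is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Probe:
    week: int                 # weeks relative to build-out; <= 0 is baseline (assumption)
    engine: str
    prompt_id: str
    pillar_cited: bool
    spoke_cited: bool
    brand_mention_only: bool  # brand named in the answer, but no citation


def summarize(probes: List[Probe]) -> Dict[str, float]:
    """Roll probes up to cluster level, keeping pillar and spoke roles separate."""
    n = len(probes)
    if n == 0:
        return {"pillar_rate": 0.0, "spoke_rate": 0.0,
                "cluster_citation_share": 0.0, "attribution_rate": 0.0}
    pillar = sum(p.pillar_cited for p in probes)
    spoke = sum(p.spoke_cited for p in probes)
    cited = sum(p.pillar_cited or p.spoke_cited for p in probes)
    surfaced = sum(
        p.pillar_cited or p.spoke_cited or p.brand_mention_only for p in probes
    )
    return {
        "pillar_rate": pillar / n,
        "spoke_rate": spoke / n,
        # Assumed definition: share of probes citing any page in the cluster.
        "cluster_citation_share": cited / n,
        # Assumed definition: of probes where the brand surfaced, share with a citation.
        "attribution_rate": cited / surfaced if surfaced else 0.0,
    }


def before_after(probes: List[Probe]) -> Dict[str, Dict[str, float]]:
    """Compare the baseline window (week <= 0) with the post-build-out window."""
    before = [p for p in probes if p.week <= 0]
    after = [p for p in probes if p.week > 0]
    return {"before": summarize(before), "after": summarize(after)}


# Hypothetical probe log, for illustration only.
log = [
    Probe(0, "chatgpt", "p1", False, False, True),
    Probe(0, "perplexity", "p1", False, False, False),
    Probe(4, "chatgpt", "p1", False, True, False),
    Probe(4, "perplexity", "p1", True, True, False),
]
result = before_after(log)
print(result)
```

Keeping `pillar_rate` and `spoke_rate` separate is the point of the per-page-role step: a rising cluster-level share driven by a single spoke looks very different from one driven by the pillar. In practice you would also group by engine and plot weekly values rather than compare two windows.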

How it relates to other concepts

Pillar content is the hub of the hub-and-spoke pattern: the pillar is the page, the cluster is the structure.
Plausible contributor to Knowledge Graph signals when the cluster is wired with consistent Organization + DefinedTermSet schema; the independent effect of cluster structure on KG entity recognition vs equivalent un-clustered topical coverage has not been isolated.
The content-architecture layer of entity-based SEO: entity-based SEO is the "how" of marking entities; topic clusters are the "where" (the page-set scope across which the entity work compounds).
For terminology clusters specifically, DefinedTerm schema + DefinedTermSet form the schema-layer backbone.
Operates through sub-document retrieval at the engine side: clusters matter for AI search partly because retrieval scores passages individually, so a well-structured spoke is more retrievable than an equivalent paragraph buried in the pillar.
Direct content-strategy enabler of GEO at scale. Single pages compete passage-by-passage; clusters give the topic a larger surface area at the passage level.

¹ HubSpot's original framing of topic clusters and pillar pages, the framework that popularized the hub-and-spoke content pattern (2017). blog.hubspot.com/marketing/topic-clusters-seo.
² Aggarwal et al. "GEO: Generative Engine Optimization." arXiv:2311.09735, November 2023. Tests 9 LLM-prompted content-modification methods at source-page level against a Position-Adjusted Word Count (PAWC) visibility metric; top performers include Quotation Addition (PAWC 27.2 vs the no-modification baseline of 19.3, ~41% relative gain), Statistics Addition (~31%), Fluency Optimization (~28%), and Cite Sources (~27%). These per-method percentages are derived from the paper's position-adjusted PAWC scores (the "Overall" column; the un-adjusted Word sub-column reads 27.8 / 25.9 / 25.1 / 24.9) against the 19.3 baseline, while the paper's own Results section names a 30-40% gain for its top-3 (Cite Sources, Quotation Addition, Statistics Addition). The abstract's "up to 40%" is the rounded form of the best method's ~41% position-adjusted gain (Quotation Addition, 27.2 over baseline 19.3). The paper does not distinguish pillar vs spoke pages (those are SEO/GEO concepts, not paper terminology); the editorial inference that individual spoke pages (not just the pillar) need to be cite-ready is a glossary extension of the paper's source-level findings, not a paper conclusion. Counter-evidence: a 2025 follow-up benchmark³ tested 7 of these 9 methods in multi-actor production-realistic conditions and found most largely ineffective or slightly negative on citation ranking; the 2023 PAWC effect sizes remain valid for the single-actor synthetic testbed but set an empirical upper bound, not a production prediction.
³ See the C-SEO Bench glossary entry for the full paper attribution (Puerto, Gubri, Green, Oh, Yun. "C-SEO Bench: Does Conversational SEO Work?" arXiv:2506.11097, NeurIPS 2025 Datasets & Benchmarks Track), method-by-method results, multi-actor evaluation methodology, and the full verbatim findings.


Last fact-checked 2026-05-17. Spotted an error or stale claim? See editorial methodology.

Changelog (6 entries)

2026-06-21: Revalued the Aggarwal per-method figures in the footnote to the paper's actual position-adjusted PAWC values (the 'Overall' column: Quotation Addition 27.2 vs baseline 19.3, about +41%). The earlier figures (27.8 vs 19.5) were the paper's plain Word Count sub-column, not its position-adjusted metric.
2026-06-20: Clarified that the per-method PAWC percentages in the Aggarwal footnote are derived from the paper's absolute scores against the 19.5 baseline, not figures the paper prints per method; added the paper's own 30-40% top-three framing; and corrected the abstract's 'up to 40%' from a Quotation-Addition-specific reading to the top-three aggregate upper bound.
2026-05-28: Added the C-SEO Bench (arXiv:2506.11097, NeurIPS Datasets & Benchmarks 2025) counter-evidence anchor to the Aggarwal footnote. The 2023 PAWC effect sizes for Quotation Addition, Statistics Addition, Fluency Optimization, and Cite Sources remain valid for the single-actor synthetic testbed but C-SEO Bench tested 7 of the 9 methods in multi-actor production-realistic conditions and found most largely ineffective or slightly negative on citation ranking. The cluster pattern remains a useful editorial structure, but its citation-lift estimates should reference both papers.
2026-05-20: ChatGPT citation. A 2026-05-20 probe returned this entry as the 4th source in a panel led by Semrush ('Topic clusters | GEO Glossary, May 14 2026'). Notable: surfaced alongside major SEO publication content (Semrush, Seologist) on a vendor-canonical topic. citationStatus.chatgpt changed from not-cited to cited. Part of a same-day seven-new-citation ChatGPT sweep; full probe evidence preserved internally.
2026-05-17: Several confident engine-behavior claims softened to practitioner observation to match the cluster ('engines treat the cluster as canonical, raising citation rates'; 'DefinedTermSet converts the cluster into an entity collection'). Critical fix: the FAQPage JSON-LD spoke-page recommendation now carries the May 7 2026 rich-result deprecation caveat (it was recommending schema for a SERP feature that no longer renders). Aggarwal footnote precision-fixed (per-method PAWC; the 'spoke pages need cite-ready' framing is glossary inference). Added a measurement methodology block.
2026-05-14: Initial publish