Statistical density

Statistical density is the rate at which a piece of content presents verifiable statistics, dates, and numerical claims — identified in the 2023 Princeton GEO paper as one of the strongest signals correlated with AI-engine citation.

Citation status

ChatGPT, Perplexity, Claude, Copilot

Last checked 2026-05-21

What is statistical density?

Statistical density is a measured concept from Aggarwal et al. (2023, Princeton). Content with more numerical facts (statistics, percentages, dates, quantified claims) was found to be cited at meaningfully higher rates by generative engines. The hypothesis is that AI engines prefer cite-able passages because their attribution layer needs to ground claims, and numerical assertions are easier to ground than purely qualitative ones.

The paper's measurement: the ratio of numerical statements to total sentences within a passage.
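
As a rough illustration of that ratio, the Python sketch below counts the sentences that contain at least one digit and divides by the total number of sentences. The digit test and the naive sentence splitter are assumptions made here for illustration, not the paper's procedure, and the function name `statistical_density` is hypothetical.

```python
import re

# Assumption: a sentence counts as a "numerical statement" if it contains a digit.
# Spelled-out numbers ("three percent") are missed by this simple test.
NUMERIC = re.compile(r"\d")

# Naive splitter: break on whitespace that follows ., ! or ?.
# Abbreviations such as "et al. 2023" will be split incorrectly.
SENTENCE_BREAK = re.compile(r"(?<=[.!?])\s+")


def statistical_density(passage: str) -> float:
    """Return numerical sentences / total sentences for a passage."""
    sentences = [s for s in SENTENCE_BREAK.split(passage.strip()) if s]
    if not sentences:
        return 0.0
    numeric = sum(1 for s in sentences if NUMERIC.search(s))
    return numeric / len(sentences)


if __name__ == "__main__":
    sample = (
        "Revenue grew 12% in 2023. The team was pleased. "
        "Churn fell to 3.1%. More work remains."
    )
    print(statistical_density(sample))  # 2 of 4 sentences contain a number
```

A production version would likely swap in a proper sentence tokenizer and a stricter definition of what counts as a statistic.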

Status in 2026

Widely referenced but rarely measured. Practitioners interpret the finding loosely as "add more stats and citations to your content" without computing the actual density. Few tools automate the measurement. The original paper tested nine GEO methods; adding statistics was among the three with the largest measured uplifts (alongside citing sources and adding quotations).

FAQ

Does adding random statistics increase citation?
Only when the statistics are accurate and sourced. AI engines downweight statistical claims that conflict with their training data or with other retrieved sources during the same generation. An incorrect statistic with high prominence can actually hurt citation rate.
What is a good statistical density target?
The original Princeton paper found significant citation uplifts from 1-2 statistics per 200-word block. Higher densities had diminishing returns, and densities above ~5 per 200 words risk appearing 'fact-stuffed' to ranking algorithms.
Should I cite the source of each statistic?
Yes. Sourced statistics are weighted meaningfully higher than unsourced ones. A statistic with a date and source citation is among the strongest individual cite-ability signals identified in GEO research.
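
The density target in the FAQ above can be checked with a short Python sketch that normalizes a count of numeric tokens to a 200-word block. The thresholds (1-2, and roughly 5, per 200 words) come from the answer above. The regular expression, the function names, and the band labels are hypothetical choices made for illustration.

```python
import re

# Assumption: a "statistic" is any numeric token, optionally with thousands
# separators, a decimal part, or a trailing percent sign.
# Caveat: an ISO date such as 2026-05-21 is counted as three tokens.
STAT = re.compile(r"\d[\d,]*(?:\.\d+)?%?")


def stats_per_200_words(text: str) -> float:
    """Numeric tokens per 200 words of text."""
    words = text.split()
    if not words:
        return 0.0
    return len(STAT.findall(text)) * 200 / len(words)


def classify(rate: float) -> str:
    """Map a rate onto the bands described in the FAQ answer."""
    if rate < 1:
        return "below target"
    if rate <= 2:
        return "in target range"
    if rate <= 5:
        return "diminishing returns"
    return "possible fact-stuffing"


if __name__ == "__main__":
    text = "word " * 99 + "42%"  # 100 words, one statistic
    rate = stats_per_200_words(text)
    print(rate, classify(rate))
```

The check counts numeric tokens only. It says nothing about whether a statistic is accurate or sourced, which the other two answers treat as the more important conditions.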

Sources & further reading