Research · Dispatch #3
Cited more on Gemini, less on ChatGPT: a gradient, and what it is not
Our pages show up as AI sources far more often on Gemini than on ChatGPT. The tempting read is that Gemini likes our content better. After checking who else each engine could have cited, our probes suggest something narrower and still partly confounded: a willingness to credit the source that coined a term.
The third GEO Glossary dispatch. This one is about a number that flatters us, and what happened when we tried to earn the right to report it.
The pattern
Across four weekly probe rounds in June, run identically against five AI engines, we tallied which of our pages each engine cited and then asked a narrower question: of the pages an engine cited, how many were terms we coined ourselves rather than established concepts we restated? The share climbs steadily by engine.
| Engine | Our pages it cited | Of those, our own coinages | Coinage share |
|---|---|---|---|
| ChatGPT | 10 | 1 | 10% |
| Claude | 5 | 2 | 40% |
| Perplexity | 6 | 3 | 50% |
| Gemini | 8 | 5 | 63% |
(Copilot cited nothing across the four rounds, which we have interpreted separately as a likely Bing-side discoverability gap, so it is left out here. Counts are small; read this as a direction to retest, not a measurement. These per-engine counts also come from this specific four-round panel, which over-samples our coined terms; they are smaller than the cumulative cited counts on our Observatory, which use a wider window.)
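To show how thin these counts are, here is a minimal sketch that recomputes the coinage shares from the table and attaches an approximate 95% Wilson score interval to each. The interval is an illustrative addition, not part of the probe method, and it treats each cited page as an independent draw, which is an assumption. Under that assumption the ChatGPT and Gemini intervals still overlap, which is one more reason to read the gradient as a direction rather than a measurement.

```python
import math

# Counts copied from the table above: (pages cited, of which our coinages).
counts = {
    "ChatGPT": (10, 1),
    "Claude": (5, 2),
    "Perplexity": (6, 3),
    "Gemini": (8, 5),
}

def share_percent(cited, coinages):
    """Coinage share as a whole percent, rounding halves up (62.5 -> 63)."""
    return math.floor(100 * coinages / cited + 0.5)

def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

shares = {e: share_percent(c, k) for e, (c, k) in counts.items()}
intervals = {e: wilson_interval(k, c) for e, (c, k) in counts.items()}

for engine in counts:
    low, high = intervals[engine]
    print(f"{engine:<11} {shares[engine]:>3}%  ({low:.0%} to {high:.0%})")
```

Half-up rounding is used on purpose: Python's built-in `round` rounds 62.5 to 62, which would not match the 63% in the table.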
There is a clean gradient. On ChatGPT, almost everything it cites us for is an established term we explain well. On Gemini, most of what it cites us for is vocabulary we invented. The tempting sentence writes itself: Gemini likes our original work; ChatGPT does not. We almost shipped that sentence. It is wrong, or at least unearned, and the reason is worth more than the gradient.
Why the easy reading is a trap
When an engine answers "what is [a term we coined]," who else can it cite? Often almost no one. We invented the term, so we are frequently the only page that defines it. A citation in that situation is not the engine preferring our quality. It is the engine having no alternative. For coined terms, "we got cited" and "we were the only source" are nearly the same fact, and you cannot tell a quality signal apart from a monopoly by looking at the citation alone.
That confound sits underneath the whole gradient. Coined terms cluster on the engines at the bottom of the table, and coined terms are exactly the ones where we tend to be the sole source. So the gradient could be measuring nothing more than "which engines are willing to cite an obscure single-source page," which is not the flattering story at all.
So we checked who else was available
The way out is to stop looking at the citations we won and look at what the engines cited when they did not cite us, on the same prompts. If an engine had real alternatives and still chose us, that is a choice. If it never had alternatives, the citation was hollow.
Our coined terms turned out to have more competition than "we invented it" suggests. We coded, for each coined term, what the engines cited on its prompt when they did not cite us. Every coined term we got cited for had at least adjacent competing content in the same results: other sites' "AI citability playbook," an "ai-citeability" glossary entry, a GEO agency's citation playbook, arXiv papers on probing, and so on. The engines had somewhere else to go and sometimes went there. Cite-ability, our oldest coinage, was cited by three different engines across the four rounds with those alternatives sitting right there in the results. In one round ChatGPT cited a competitor's page for the exact prompt; in another it cited us. We read the aggregate, not that single switch, as the signal: a flip between rounds is also consistent with the citation volatility we keep flagging, so it shows the alternatives were live and we were citable alongside them, not that the engine deliberately upgraded to us.
That is what rescues the gradient from the worst version of the trap. For the coined terms that actually got cited, all of which had available substitutes, being cited was a selection over alternatives, not a default the engine fell into for lack of options. The monopoly confound is real for any genuinely sole-source term, where we still cannot tell deference from no-alternative; but it is not what is carrying the gradient, because the cited coinages were not sole-source.
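The coding step described above can be sketched roughly as follows. The log format, the domain names, and the term names are all hypothetical placeholders, not our real probe data. The idea is only this: for each coined term we were cited for, collect what the engines cited on the same prompt when they did not cite us, and flag the term as ambiguous if that set is empty.

```python
OUR_DOMAIN = "our-glossary.example"  # hypothetical placeholder

# Hypothetical log rows: (term, engine, round number, domains cited on that prompt).
probe_log = [
    ("coined-term-a", "ChatGPT", 1, ["competitor.example"]),
    ("coined-term-a", "ChatGPT", 2, [OUR_DOMAIN]),
    ("coined-term-a", "Gemini", 1, [OUR_DOMAIN, "papers.example"]),
    ("coined-term-b", "Gemini", 1, [OUR_DOMAIN]),
    ("coined-term-b", "Perplexity", 2, []),
]

def classify_terms(log, our_domain=OUR_DOMAIN):
    """Label each term we were cited for by what engines cited when they skipped us."""
    cited_us = set()
    rivals = {}
    for term, _engine, _round, domains in log:
        if our_domain in domains:
            cited_us.add(term)
        else:
            # Only responses that did not cite us count as evidence of alternatives.
            rivals.setdefault(term, set()).update(domains)
    return {
        term: "selection over alternatives" if rivals.get(term) else "sole-source, ambiguous"
        for term in sorted(cited_us)
    }

labels = classify_terms(probe_log)
for term, label in labels.items():
    print(term, "->", label)
```

A term labeled "sole-source, ambiguous" is exactly the case where deference and no-alternative cannot be told apart.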
The honest version: deference to the originator
With that established, the gradient says something after all, but not what the flattering sentence said. It is not that Gemini judges our writing better. In this panel, Gemini was the engine most willing to credit the source that originated a term. When a concept traces back to one page, and that page is ours because we coined it, Gemini reaches for that page as the source of record. By source of record we mean the page that looks like the earliest or clearest public definition of a term, not the highest-authority page on the broader topic. For our own coinages, that page is ours, so the preference lands on us.
The residual confound we owe the same skepticism
Here is where we have to turn the method on our own renaming, because "originator-deference" has not yet passed the test we just put the gradient through. Last month's dispatch established that Gemini grounds its answers in Google's index, and that the live index and AI citations now largely agree. But for a term we coined, the originator page and the top Google result are the same page: we invented the word, so we rank first for it. That means "Gemini cites the originator" and "Gemini cites whatever ranks first in Google" predict the same citation for every coinage. Our second piece of evidence, that Gemini cites an academic paper and skips our explainer, does not separate them either, because the paper also ranks highly. So on the Gemini end we cannot yet tell originator-deference apart from Google-rank mirroring, which we ourselves demonstrated in two consecutive dispatches. The clean test is a coined term that a competitor outranks in Google: if Gemini still cites us, it is deferring to the originator; if it follows the higher-ranked competitor, it was mirroring Google all along. We do not have that case in this panel. It is the first thing the next round will look for.
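The logic of that test is small enough to write down. This is a sketch only: the function name, the labels, and the example URLs are hypothetical, and a single citation read this way is one data point, not a verdict.

```python
def read_citation(originator, google_top, gemini_cited):
    """Say which hypothesis, if any, a single Gemini citation supports."""
    if originator == google_top:
        # Both hypotheses predict the same page, so this case separates nothing.
        return "uninformative"
    if gemini_cited == originator:
        return "originator-deference"
    if gemini_cited == google_top:
        return "google-rank mirroring"
    return "neither"

# Hypothetical pages, for illustration only.
US = "https://our-glossary.example/coined-term"
RIVAL = "https://competitor.example/coined-term"

same_page_case = read_citation(originator=US, google_top=US, gemini_cited=US)
outranked_cites_us = read_citation(originator=US, google_top=RIVAL, gemini_cited=US)
outranked_cites_rival = read_citation(originator=US, google_top=RIVAL, gemini_cited=RIVAL)

print(same_page_case, outranked_cites_us, outranked_cites_rival, sep=" | ")
```

Every coinage in the current panel falls into the first branch, which is why the panel cannot settle the question.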
The cleaner half: ChatGPT and the territory line
ChatGPT sits at the other end, and its end is both more explainable and free of that confound, because ChatGPT does not ground in Google's index. In this panel it leans toward established, discoverable sources, but with a sharp internal split. When we split the established terms by type, ChatGPT cited us on five of seven engine-surface and emerging terms but on only one of nineteen classic textbook terms: 71 percent versus 5 percent. Those emerging terms have competitors too (vendor docs, other glossaries), so citing us there was a selection, not a monopoly. That split is steeper than the coinage gradient and harder to explain away, and it lands directly on a claim we had only ever supported with Google-ranking data before: that this glossary wins in emerging and contested-with-an-angle territory and loses on settled, canonical terms where a heavyweight reference exists. Until now that territory pattern showed up only in where our pages ranked; here it shows up in what an AI engine actually cites.
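For readers who want to check the arithmetic on that split, here is a minimal sketch that recomputes the two rates and runs a two-sided Fisher exact test on the 2x2 table. The test is an illustrative addition, not something the probe method includes, and it assumes the terms are independent observations, which elicited and volatile citations only approximate.

```python
from math import comb

def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, total = a + b, a + c, a + b + c + d

    def prob(k):
        # Hypergeometric probability of k cited terms landing in the first row.
        return comb(col1, k) * comb(total - col1, row1 - k) / comb(total, row1)

    observed = prob(a)
    lowest = max(0, row1 - (total - col1))
    highest = min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    return sum(
        prob(k) for k in range(lowest, highest + 1)
        if prob(k) <= observed * (1 + 1e-9)
    )

emerging_cited, emerging_total = 5, 7   # engine-surface and emerging terms
classic_cited, classic_total = 1, 19    # classic textbook terms

emerging_rate = emerging_cited / emerging_total
classic_rate = classic_cited / classic_total
p_value = fisher_two_sided(
    emerging_cited, emerging_total - emerging_cited,
    classic_cited, classic_total - classic_cited,
)
print(f"{emerging_rate:.0%} vs {classic_rate:.0%}, p = {p_value:.4f}")
```

The p-value is only as good as the independence assumption, so it supports "harder to explain away," not "settled."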
Renamed as carefully as we can, the gradient is an originator-deference gradient: how willing an engine is to treat the coiner of a term as the citable source for it, with the honest caveat that on the Gemini end we cannot yet separate that from Gemini mirroring Google's first result. It is a signal to retest, not a settled trait of the engines, and not a grade on us.
One caution on the numbers
The levels in that table are inflated by how we built the probe panel. We over-sampled our own coined terms on purpose, because they are the most interesting to track, so the coinage share is higher everywhere than it would be on a neutral panel. What survives that bias is the gradient between engines, not the absolute percentages, because every engine saw the exact same panel. The shared panel controls the prompt mix; it does not remove small-sample volatility, so the gradient is a tracked signal to retest, not a fixed property. Read the table as "Gemini ranked higher than ChatGPT on crediting originators in this panel," never as "63% of what Gemini cites is ours."
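A toy model shows why over-sampling inflates the levels but leaves the ordering alone. It assumes each engine has one citation rate for coined terms and one for established terms, which is a simplification, and the engine names and rates below are hypothetical, not estimates from our probes.

```python
def expected_coinage_share(panel_coined_fraction, cite_rate_coined, cite_rate_established):
    """Expected share of cited pages that are coinages, under a two-rate model."""
    coined = panel_coined_fraction * cite_rate_coined
    established = (1 - panel_coined_fraction) * cite_rate_established
    return coined / (coined + established)

# Hypothetical citation rates per engine: (coined terms, established terms).
engines = {"engine_x": (0.10, 0.30), "engine_y": (0.40, 0.20)}

# A more neutral panel versus one that over-samples coinages.
results = {
    fraction: {
        name: expected_coinage_share(fraction, *rates)
        for name, rates in engines.items()
    }
    for fraction in (0.2, 0.6)
}

for fraction, row in results.items():
    print(fraction, {name: f"{share:.0%}" for name, share in row.items()})
```

In this model the share for every engine rises with the coined fraction of the panel, while the ranking between engines depends only on each engine's ratio of the two rates, so it is the same on both panels.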
How we measure "cited"
The same disclosure we attach to every dispatch, because it changes how to read the table. "Cited" means: in our weekly hand-run probes, logged out and in a private window, with a fixed prompt that asks each engine to cite its web sources, the engine named the page at least once, counting both inline citations and the folded "more sources" list. It is elicited, cumulative across rounds, and volatile. Read it as "showed up as a source under a standard prompt," not "the AI recommends this unprompted."
| Dimension | This dispatch |
|---|---|
| Rounds | 4 weekly rounds, June 2026 |
| Engines | ChatGPT, Claude, Perplexity, Gemini (Copilot excluded: cited nothing) |
| Mode | Logged out, private window, web search on, no personalization |
| Prompt | Fixed, asks the engine to cite its web sources |
| Counted as cited | Inline citation or folded "more sources" entry |
| Unit | A glossary page named at least once |
| Panel | Over-samples our coined terms; not neutral |
| Read as | Elicited, cumulative, volatile, small-n: a direction to retest |
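In code terms, the counting rule in the table amounts to a set union across rounds. The sketch below assumes a hypothetical record shape (one dict per round with `inline` and `more_sources` lists) and a placeholder host; our actual logs are hand-kept and do not necessarily look like this.

```python
from urllib.parse import urlparse

OUR_HOST = "glossary.example"  # hypothetical placeholder host

# Hypothetical probe records for one engine, one dict per weekly round.
rounds = [
    {"inline": ["https://glossary.example/term-a"], "more_sources": []},
    {"inline": [], "more_sources": ["https://glossary.example/term-a",
                                    "https://glossary.example/term-b"]},
    {"inline": ["https://other.example/guide"], "more_sources": []},
]

def pages_cited(rounds, our_host=OUR_HOST):
    """Our pages named at least once, inline or folded, across all rounds."""
    cited = set()
    for record in rounds:
        for url in record["inline"] + record["more_sources"]:
            if urlparse(url).hostname == our_host:
                cited.add(url)
    return cited

cited = pages_cited(rounds)
print(len(cited), "pages cited at least once")
```

Because the result is a set, a page named in three rounds counts once, which is what "cumulative" means in the table and why a single volatile appearance is enough to register.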
The takeaway for a publisher
In our current probes, if you coin a term or are the origin point for a concept, Gemini has been the engine most likely to credit that origin point as a source, and that credit may compound when the term clearly traces back to your page, though we still owe that reading the Google-rank test above. If your work is a high-quality explanation of an established concept, ChatGPT can cite you, but its bar is discoverability and authority, and on a truly settled term it tends to reach past you to the standard reference. The practical move is to match the engine to your territory rather than chase a single citation everywhere. And the discipline underneath it is the part we would keep: when a metric flatters you, the useful work is finding out whether it survives the alternative the metric ignored, and then whether your own explanation survives it too. Ours passed the first test and half of the second, which is why this is smaller and more carefully named than the gradient we almost shipped.