Cited more on Gemini, less on ChatGPT: a gradient, and what it is not

The third GEO Glossary dispatch. This one is about a number that flatters us, and what happened when we tried to earn the right to report it.

The pattern

Across four weekly probe rounds in June, run identically against five AI engines, we tallied which of our pages each engine cited and then asked a narrower question: of the pages an engine cited, how many were pages for terms we coined ourselves rather than for established concepts we restated? The share climbs steadily by engine.

Engine	Our pages it cited	Of those, our own coinages	Coinage share
ChatGPT	10	1	10%
Claude	5	2	40%
Perplexity	6	3	50%
Gemini	8	5	63%

(Copilot cited nothing across the four rounds, which we have interpreted separately as a likely Bing-side discoverability gap, so it is left out here. Counts are small; read this as a direction to retest, not a measurement. These per-engine counts also come from this specific four-round panel, which over-samples our coined terms; they are smaller than the cumulative cited counts on our Observatory, which use a wider window.)
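
To make the small-sample caution concrete, here is a minimal Python sketch that recomputes the coinage share from the counts in the table and attaches a Wilson score interval to each. The interval is an illustrative addition, not something the probe log records, and the names COUNTS, coinage_share and wilson_interval are hypothetical.

```python
from math import sqrt

# Counts copied from the table above: (our pages cited, of those our own coinages).
COUNTS = {
    "ChatGPT": (10, 1),
    "Claude": (5, 2),
    "Perplexity": (6, 3),
    "Gemini": (8, 5),
}


def coinage_share(cited: int, coined: int) -> float:
    """Share of an engine's cited pages that are our own coinages."""
    return coined / cited


def wilson_interval(cited: int, coined: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 is roughly 95%).

    Assumption: each cited page is treated as an independent draw, which a
    repeated weekly panel does not guarantee.
    """
    p = coined / cited
    denom = 1 + z * z / cited
    center = (p + z * z / (2 * cited)) / denom
    half = z * sqrt(p * (1 - p) / cited + z * z / (4 * cited * cited)) / denom
    return center - half, center + half


for engine, (cited, coined) in COUNTS.items():
    low, high = wilson_interval(cited, coined)
    share = coinage_share(cited, coined)
    print(f"{engine:<11}{share:>7.1%}  ({low:.1%} to {high:.1%})")
```

Under this sketch the intervals are wide, and the ChatGPT and Gemini ranges overlap. That is the sense in which the table is a direction to retest rather than a measurement.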

There is a clean gradient. On ChatGPT, almost everything it cites us for is an established term we explain well. On Gemini, most of what it cites us for is vocabulary we invented. The tempting sentence writes itself: Gemini likes our original work; ChatGPT does not. We almost shipped that sentence. It is wrong, or at least unearned, and the reason is worth more than the gradient.

Why the easy reading is a trap

When an engine answers "what is [a term we coined]," who else can it cite? Often almost no one. We invented the term, so we are frequently the only page that defines it. A citation in that situation is not the engine preferring our quality. It is the engine having no alternative. For coined terms, "we got cited" and "we were the only source" are nearly the same fact, and you cannot tell a quality signal apart from a monopoly by looking at the citation alone.

That confound sits underneath the whole gradient. Coined terms cluster on the engines at the bottom of the table, and coined terms are exactly the ones where we tend to be the sole source. So the gradient could be measuring nothing more than "which engines are willing to cite an obscure single-source page," which is not the flattering story at all.

So we checked who else was available

The way out is to stop looking at the citations we won and look at what the engines cited when they did not cite us, on the same prompts. If an engine had real alternatives and still chose us, that is a choice. If it never had alternatives, the citation was hollow.

Our coined terms turned out to have more competition than "we invented it" suggests. When we coded, for each coined term, what the engines cited on its prompt when they did not cite us, every coined term we got cited for had at least adjacent competing content in the same results: other sites' "AI citability playbook," an "ai-citeability" glossary entry, a GEO agency's citation playbook, arXiv papers on probing, and so on. The engines had somewhere else to go and sometimes went there. Cite-ability, our oldest coinage, was cited by three different engines across the four rounds with those alternatives sitting right there in the results. On one round ChatGPT cited a competitor's page for the exact prompt; on another it cited us. We read the aggregate, not that single switch, as the signal: a two-day flip is also consistent with the citation volatility we keep flagging, so it shows the alternatives were live and we were citable alongside them, not that the engine deliberately upgraded to us.

That is what rescues the gradient from the worst version of the trap. For the coined terms that actually got cited, all of which had available substitutes, being cited was a selection over alternatives, not a default the engine fell into for lack of options. The monopoly confound is real for any genuinely sole-source term, where we still cannot tell deference from no-alternative; but it is not what is carrying the gradient, because the cited coinages were not sole-source.

We ran the test, and the flattering half failed

With the monopoly confound cleared, the gradient looked like it credited the originator: Gemini, more than any other engine, reaching for the page that coined a term. That is the reading we shipped two weeks ago, with one honest caveat and a promise to test it. Last month's dispatch had established that Gemini grounds its answers in Google's index, and that the live index and AI citations now largely agree. That grounding is the confound the gradient still owed a test. For a term we coined, the originator page and the top Google result are usually the same page: we invented the word, so we rank first for it. "Gemini cites the originator" and "Gemini cites whatever Google ranks first" therefore predict the same citation for every coinage we rank first on. The clean way to separate them is a coined term that a competitor outranks in Google. If Gemini still cites us, it is crediting the originator; if it follows the higher-ranked competitor, it was mirroring Google all along.
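
The logic of that test can be written down as a small Python sketch: two competing predictions, and a check for whether a given coinage can tell them apart. Everything here is hypothetical scaffolding for illustration (the CoinageCase type, the function names, and the two example cases with made-up ranks); it is not our probe tooling.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CoinageCase:
    term: str
    our_google_rank: int             # where our page sits for the definitional query
    best_rival_rank: Optional[int]   # best-ranked competing page, None if no real rival


def originator_prediction(case: CoinageCase) -> str:
    """Hypothesis A: the engine credits whoever coined the term, which is always us."""
    return "us"


def rank_mirror_prediction(case: CoinageCase) -> str:
    """Hypothesis B: the engine cites whichever page Google ranks higher."""
    rival = case.best_rival_rank
    if rival is not None and rival < case.our_google_rank:
        return "rival"
    return "us"


def discriminates(case: CoinageCase) -> bool:
    """A case is informative only when the two hypotheses disagree."""
    return originator_prediction(case) != rank_mirror_prediction(case)


# Made-up cases, for illustration only.
uncontested = CoinageCase("hypothetical-term-a", our_google_rank=1, best_rival_rank=None)
outranked = CoinageCase("hypothetical-term-b", our_google_rank=20, best_rival_rank=3)

for case in (uncontested, outranked):
    print(case.term, "informative" if discriminates(case) else "not informative")
```

Only the outranked case is informative. Every coinage we rank first on returns the same prediction under both hypotheses, which is why those citations could never settle the question.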

The next round supplied that case, and several more, because our coinages do not all rank the same. Sorted by where our page sits in Google, the citations line up with rank, not with authorship:

Our coinage	Where we rank in Google	What Gemini did
external-traffic-disambiguation	~4, no real rival	cited us
citation precision	~8	cited us
cite-ability	~21 on the definitional query	cited a higher-ranked third party
citation rotation	~25	cited a third party
citation share	~37	cited a third party

Gemini cited us on the coinages where we rank near the top and dropped us, for third-party pages, on the ones where we rank well down the page. Cite-ability is the sharpest single case, because it is unambiguously our coinage and unambiguously outranked: several third-party pages now define the same idea above us, and Gemini cited one of them, not the page that named the concept. That switch alone refutes deference to the originator; the rest of the column shows it was not a quirk of one term.
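
As a quick check on "line up with rank", the rows of the table can be sorted and tested for a clean split, as in this Python sketch. The rows are copied from the table above with the approximate ranks taken at face value; the helper name rank_separates is hypothetical.

```python
# (coinage, approximate Google rank of our page, did Gemini cite us?)
ROWS = [
    ("external-traffic-disambiguation", 4, True),
    ("citation precision", 8, True),
    ("cite-ability", 21, False),
    ("citation rotation", 25, False),
    ("citation share", 37, False),
]


def rank_separates(rows) -> bool:
    """True when every coinage Gemini cited us for outranks every one it dropped us on."""
    cited = [rank for _, rank, cited_us in rows if cited_us]
    dropped = [rank for _, rank, cited_us in rows if not cited_us]
    if not cited or not dropped:
        return False  # nothing to compare
    return max(cited) < min(dropped)


# We coined all five terms, so authorship alone predicts "cited us" on every row.
authorship_hits = sum(1 for _, _, cited_us in ROWS if cited_us)

for term, rank, cited_us in sorted(ROWS, key=lambda row: row[1]):
    print(f"~{rank:<3} {term:<32} {'cited us' if cited_us else 'cited a third party'}")
print("rank separates the outcomes:", rank_separates(ROWS))
print(f"authorship alone matches {authorship_hits} of {len(ROWS)} rows")
```

A clean split on five rows is a pattern, not a proof; any cutoff between roughly rank 8 and rank 21 would fit these rows equally well.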

One honest limit, since it is the same skepticism the first half of this piece ran on. The top row is the weakest evidence here, not the strongest. External-traffic-disambiguation is a term we rank near the top on (around fourth) and have no real competitor for, so "Gemini cited us because we rank high" and "Gemini cited us because there was no one else to cite" cannot be separated there: it is the monopoly confound wearing the costume of a control. The weight is carried by the bottom three rows, where a higher-ranked rival existed and Gemini took it. The cleanest cell is still missing, the mirror image of cite-ability: a coinage we rank first on with a real competitor sitting just below us, to see whether Gemini stays with us when rank and authorship point the same way against a live alternative. We do not have that exact case yet. It is the next thing to probe.

So the flattering half of the gradient does not survive its own test. Gemini's high coinage share was not deference to the originator; it was Google rank in a costume. We rank first for most of our coinages because we invented them, Gemini follows Google's first result, and the two facts had been passing themselves off as a preference for our original work. The honest version of the Gemini end is the dull one: it cites the page Google already ranks first, which for an uncontested coinage happens to be ours.

Where the other engines sit, and the cleaner ChatGPT half

The remaining engines sort onto the same axis once you ask whether they follow Google's index. Perplexity behaves like Gemini: on cite-ability it also passed over our outranked page for a higher-ranked third party, which is what an index-grounded engine does. Claude is not a point on this axis at all but a different mechanism; it skipped cite-ability too, yet it tends to reach for the primary source where one exists, citing the original arXiv paper for citation precision and recall rather than our explainer. Its misses are about preferring the source of record, not about rank. That leaves ChatGPT alone at the end that does not ground in Google's index.

ChatGPT is where the one genuine originator signal lives, and it is worth stating exactly how thin it is. Cite-ability, the term the index-grounded engines dropped for whatever outranked us, was cited by ChatGPT at the very top of its sources, despite our page ranking far down in Google. Of the five engines, only ChatGPT cited it in that round. That is the cleanest sign in this whole dispatch that an engine can credit a coiner over higher-ranked rivals, and on its own it is almost nothing: one term, one capture, one round. Treat it as the single thread it is.

Keep it separate from the next finding, which is sturdier but is a different claim. ChatGPT's broader behavior in the panel is about territory, not authorship. Separating its citations of established terms, it cited us on five of seven engine-surface and emerging terms but only one of nineteen classic textbook terms: 71 percent versus 5 percent. Those emerging terms have competitors too, so citing us there was a selection, not a monopoly. That split rests on far more observations than the single cite-ability capture, and it says something narrower than crediting originators: that this glossary wins in emerging and contested-with-an-angle territory and loses on settled, canonical terms where a heavyweight reference exists. We had only ever supported that with where our pages ranked; here it shows up in what an AI engine actually cites. The territory split is the sturdy result and the originator capture is the fragile one; do not let the first vouch for the second.
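
The arithmetic behind that split, plus one way to ask how surprising it would be by chance, can be sketched in Python. The one-sided Fisher exact test is an illustrative addition, not something we ran for the dispatch, and it assumes each term is an independent observation, which a small repeated panel does not guarantee. The function names are hypothetical.

```python
from math import comb

# Counts from the paragraph above: (terms ChatGPT cited us on, terms probed).
EMERGING = (5, 7)    # engine-surface and emerging terms
CLASSIC = (1, 19)    # classic textbook terms


def share(cited: int, total: int) -> float:
    return cited / total


def fisher_one_sided(a_cited: int, a_total: int, b_cited: int, b_total: int) -> float:
    """Chance that group A gets at least a_cited of the citations if the
    citations were spread at random over all terms (hypergeometric tail)."""
    total = a_total + b_total
    cited = a_cited + b_cited
    upper = min(a_total, cited)
    tail = sum(
        comb(a_total, k) * comb(b_total, cited - k)
        for k in range(a_cited, upper + 1)
    )
    return tail / comb(total, cited)


print(f"emerging: {share(*EMERGING):.0%}, classic: {share(*CLASSIC):.0%}")
print(f"one-sided p: {fisher_one_sided(*EMERGING, *CLASSIC):.4f}")
```

A small p-value here only says the split is unlikely to be a random allocation of six citations across twenty-six terms. It says nothing about why ChatGPT leans that way, and it does not lend any weight to the single cite-ability capture.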

So the gradient is mostly a rank mirror, not an originator-deference gradient. The reading we almost shipped, and the more careful one we did ship two weeks ago, were both still too kind to us at the Gemini end. What survives is narrower: an engine that does not follow Google rank cited the coiner of a low-ranked term over its higher-ranked rivals, once; and that same engine has a steep, repeatable lean toward our emerging-territory pages over classic textbook ones. The Gemini gradient was an artifact of rank. We owed it the test, the test came back against the flattering reading, and reporting that is the whole point of measuring in public.

One caution on the numbers

The levels in the coinage-share table are inflated by how we built the probe panel. We over-sampled our own coined terms on purpose, because they are the most interesting to track, so the coinage share is higher everywhere than it would be on a neutral panel. What survives that bias is the gradient between engines, not the absolute percentages, because every engine saw the exact same panel. The shared panel controls the prompt mix; it does not remove small-sample volatility, so the gradient is a tracked signal to retest, not a fixed property. Read the table as "Gemini's coinage share was higher than ChatGPT's in this panel," never as "63% of what Gemini cites is ours."

How we measure "cited"

The same disclosure we attach to every dispatch, because it changes how to read the table. "Cited" means: in our weekly hand-run probes, logged out and in a private window, with a fixed prompt that asks each engine to cite its web sources, the engine named the page at least once, counting both inline citations and the folded "more sources" list. It is elicited, cumulative across rounds, and volatile. Read it as "showed up as a source under a standard prompt," not "the AI recommends this unprompted."

Dimension	This dispatch
Rounds	4 weekly rounds, June 2026
Engines	ChatGPT, Claude, Perplexity, Gemini (Copilot excluded: cited nothing)
Mode	Logged out, private window, web search on, no personalization
Prompt	Fixed, asks the engine to cite its web sources
Counted as cited	Inline citation or folded "more sources" entry
Unit	A glossary page named at least once
Panel	Over-samples our coined terms; not neutral
Read as	Elicited, cumulative, volatile, small-n: a direction to retest
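
The counting rule in that table can be sketched in Python as follows. This is a minimal illustration of the definition, not our logging tool: the ProbeResult record, the cited_pages function and the example.com URLs are all hypothetical placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class ProbeResult:
    """One engine's answer to the fixed prompt in one weekly round."""
    engine: str
    round_id: int
    inline_sources: set = field(default_factory=set)   # URLs cited inline
    folded_sources: set = field(default_factory=set)   # URLs in the folded "more sources" list


def cited_pages(results, engine: str) -> set:
    """Pages an engine named at least once, cumulative across rounds.

    Inline and folded entries count the same, and a page named in several
    rounds still counts once, matching the "Unit" row above.
    """
    pages = set()
    for result in results:
        if result.engine == engine:
            pages |= result.inline_sources | result.folded_sources
    return pages


# Placeholder data, for illustration only.
log = [
    ProbeResult("Gemini", 1, inline_sources={"https://example.com/glossary/term-a"}),
    ProbeResult("Gemini", 2, folded_sources={"https://example.com/glossary/term-a",
                                             "https://example.com/glossary/term-b"}),
    ProbeResult("ChatGPT", 1),
]

print(len(cited_pages(log, "Gemini")))   # term-a appears in two rounds but counts once
print(len(cited_pages(log, "ChatGPT")))
```

Because the count is a union across rounds, it can only grow as rounds are added. That is one reason per-panel counts and longer-window counts should not be compared directly.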

The takeaway for a publisher

In our current probes, the way to be cited by Gemini for a term you coined is not to be its originator; it is to be Google's first result for it. Gemini mirrors the index, so for an uncontested coinage you usually are that result, but the moment a competitor outranks you, Gemini follows them. ChatGPT is the one engine here that does not work that way: it cited the coiner of a term whose page ranks far down in Google, and it leans toward emerging and contested-with-an-angle territory over settled textbook terms where a heavyweight reference exists. The practical move is unchanged and now better grounded: match the engine to your territory, win Google rank where Gemini is the prize, and write the clearest explanation where ChatGPT is. The discipline underneath it is the part we would keep. When a metric flatters you, the work is finding out whether it survives the alternative the metric ignored. This one did not, and saying so is the point.