Research · Dispatch #2

Google caught up: the AI-citation gap looks like a reporting lag

Last month, AI engines cited a page Google's index report showed as unindexed. We said we would track whether Google caught up. It did, and the best-supported reading is a reporting lag, with Gemini as the clock that nearly proves it.

The second GEO Glossary dispatch. Last month's finding came with a promise to follow up, and following up changed it. That is the point of measuring in public: the correction is the story, and this correction has a clock we under-used the first time.

Last month, in one line

Dispatch #1 ("AI engines cited this page before Google indexed it") started from a single case: our citation precision page was cited by three AI engines on June 1 while Google Search Console still showed it as "Crawled, currently not indexed." When we built the full table, it turned out to be not a lone case but a small cluster of five recent pages in the same spot. We flagged the caveat up front: Search Console's index report lags, so this was co-occurrence, not a proven sequence, and the only way to settle it was to re-check later. This is the re-check.

Google caught up

As of June 10, Search Console reports 81 of 86 known pages indexed, up from 53 of 76 a week earlier. (The 86 includes site pages like the homepage and about pages; narrowing to term pages, all 70 of the ones we published on or before May 31 are now indexed.) The four tracked pages old enough to appear in the latest report are all indexed, and URL Inspection shows Google crawled the citation-precision page on May 30, about a day after we published it. So the gap closed, and Google had in fact crawled the page almost immediately.
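
For readers who prefer the change as a rate, here is a minimal sketch of the arithmetic behind those two counts. The function name is ours, for illustration only; the inputs are the report figures quoted above.

```python
def indexed_share(indexed: int, known: int) -> float:
    """Share of known pages the coverage report lists as indexed, in percent."""
    if known <= 0:
        raise ValueError("known must be positive")
    return 100.0 * indexed / known


earlier = indexed_share(53, 76)  # coverage report a week earlier
latest = indexed_share(81, 86)   # coverage report as of June 10
change = latest - earlier

print(f"{earlier:.1f}% -> {latest:.1f}% ({change:+.1f} points)")
# prints: 69.7% -> 94.2% (+24.4 points)
```

The denominators differ (76 versus 86 known pages), so the point change mixes newly indexed pages with newly discovered ones; it is a summary, not a per-page measurement.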

Why the report showed a gap: it lags the live index

The distinction we under-used last month: Search Console's index-coverage report is not the index Google serves from. It is a separate, lagging view of it. A page can be live in Google's serving index, and returned in search results, while the coverage report still reads "Crawled, currently not indexed."

We can now show that directly. We ran URL Inspection, Google's live per-URL tool, on our three newest pages: PAWC, GEO content methods, and one more recent entry. All three return "URL is on Google, page is indexed," even though none of them appears in the aggregate coverage report. The same Google, queried two ways at the same moment, gives two answers: the live tool says indexed, the aggregate report does not have them. The report is behind.

Gemini corroborates from the citation side. Gemini is not like the other engines: it grounds its answers on Google Search, so when it cites a page it reached that page through Google's live index. ChatGPT and Perplexity retrieve from their own indexes and live fetches, so they can cite a page whatever Google is doing; Gemini cannot. Its June 9 citation of the GEO-content-methods page is itself a sign the page was in Google's serving index that day, which is exactly what URL Inspection confirms. The live index was ahead of the report.

The same mechanism is the most likely explanation for last month's original case, citation precision, though that page does not carry its own Gemini citation to prove it. URL Inspection shows Google crawled it May 30, a day after publish; ChatGPT and Perplexity cited it by June 1; the coverage report still showed it unindexed on June 3 and confirmed it only later. Those two engines retrieve from their own substrates, so they cannot pin Google's state the way Gemini can, which is why that case stays suggestive. But the shape fits: crawled fast, served soon, report late.
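
The citation-precision timeline is easier to read as day offsets from the crawl. The sketch below only restates the dates given above; the year is a placeholder we had to supply for the date arithmetic, and it does not affect the differences.

```python
from datetime import date

YEAR = 2025  # assumption: placeholder year, not stated in the dispatch

# Events for the citation-precision page, as reported in this dispatch.
events = [
    ("Google crawled the page (URL Inspection)", date(YEAR, 5, 30)),
    ("ChatGPT and Perplexity had cited it (by this date)", date(YEAR, 6, 1)),
    ("coverage report still showed it unindexed", date(YEAR, 6, 3)),
]

crawl_day = events[0][1]
offsets = {label: (when - crawl_day).days for label, when in events}

for label, days in offsets.items():
    print(f"day +{days}: {label}")
```

This shows only the ordering and spacing of what we observed. It does not say when the page entered the serving index, which is the part this case cannot pin down.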

So the finding shrinks to its honest size: not "AI beats Google," but "Google's coverage report lags Google's own live index, by something like one to two weeks on a young site." We will not pin the exact number. Gemini's grounding is not fully public, so its citation is strong evidence rather than proof, and the coverage report can only bound the lag from one side. The clean confirmatory test is URL Inspection's live "URL is on Google" status, which timestamps the serving index directly, and which we will lean on going forward.

There is a usable takeaway for any publisher in it. Do not treat Search Console's index-coverage report as real-time. A new page shown as "Crawled, currently not indexed" may already be live in Google's index and already citable; check URL Inspection's live status before you conclude that Google, or AI search, is ignoring it.
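
If you check many URLs, the same lookup is available programmatically through the Search Console URL Inspection API. The sketch below is not our tooling: it only reads an already-fetched response and makes no network call. The helper name is hypothetical and the sample payload is made up. Treating a `PASS` verdict as the counterpart of the "URL is on Google" headline is our reading of the API, so verify it against Google's documentation before relying on it.

```python
def summarize_inspection(response: dict) -> dict:
    """Pull the index-status fields out of a URL Inspection API response.

    Assumes the response shape inspectionResult -> indexStatusResult,
    with 'verdict', 'coverageState' and 'lastCrawlTime' fields.
    """
    status = response.get("inspectionResult", {}).get("indexStatusResult", {})
    return {
        # Assumption: PASS corresponds to "URL is on Google" in the UI.
        "on_google": status.get("verdict") == "PASS",
        "coverage_state": status.get("coverageState"),
        "last_crawl_time": status.get("lastCrawlTime"),
    }


# Made-up sample payload for illustration, not a real response.
sample = {
    "inspectionResult": {
        "indexStatusResult": {
            "verdict": "PASS",
            "coverageState": "Submitted and indexed",
            "lastCrawlTime": "2025-05-30T00:00:00Z",
        }
    }
}

summary = summarize_inspection(sample)
print(summary["on_google"], summary["coverage_state"])
```

A page whose summary says `on_google` while the aggregate report still omits it is the situation described above: the report is behind, not the index.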

How we measure "cited"

One disclosure, because it changes how to read the numbers. When we say a page is "cited by AI," we mean this: in our weekly hand-run probes, with a fixed prompt that asks each engine to cite its web sources, at least one of the five engines named the page at least once, counting both inline citations and the folded "more sources" list. It is elicited, cumulative across rounds, and volatile, not a measure of spontaneous citation. Read "cited" as "has shown up as a source under a standard prompt," not "the AI recommends this unprompted."
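
Stated as a rule, the definition is small enough to write down. This is a minimal sketch under our own assumptions: the `Probe` record, its field names, and the example URLs are hypothetical and are not the format of our actual probe logs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Probe:
    """One engine's answer to the fixed prompt in one weekly round."""
    round_id: str
    engine: str
    inline_urls: frozenset = frozenset()        # citations inside the answer
    more_sources_urls: frozenset = frozenset()  # folded "more sources" list


def is_cited(page_url: str, probes: list) -> bool:
    """True if any engine, in any round, named the page at least once."""
    return any(
        page_url in p.inline_urls or page_url in p.more_sources_urls
        for p in probes
    )


# Hypothetical probe data for illustration only.
probes = [
    Probe("week-1", "engine-a", inline_urls=frozenset({"https://example.com/x"})),
    Probe("week-2", "engine-b",
          more_sources_urls=frozenset({"https://example.com/y"})),
]

print(is_cited("https://example.com/y", probes))  # True: folded list counts
print(is_cited("https://example.com/z", probes))  # False: never named
```

Because the check is an `any` over every round, a single appearance keeps a page "cited" from then on, which is why the measure is cumulative and should be read loosely.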

The newest pages: already indexed, just not in the report

The newest pages show the same story playing out live. The three pages we inspected above were published this month and cited by AI within four days of going live, and Google already has all three in its index; only the aggregate report has not listed them yet. Last month we promised to re-check whether Google would catch up. Checking the live status directly is the honest way to answer, and the answer is that it already has. The lag is a property of the report, not of Google's index.

The bigger, better-supported gap

Set the timing aside and the standing picture is the more useful result, and the one our table supports most directly. Across the mature corpus, AI citation and Google indexing now largely agree, and the off-diagonal that started this thread is empty. The asymmetry that remains runs the other way: plenty of pages are indexed by Google but cited by no AI engine. Getting into Google's index is turning out to be the easier half; earning an AI citation is the scarcer, harder state, and that holds even under the loose definition of "cited" above. That is the gap worth working on, and it is what these dispatches will keep measuring.
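
The "off-diagonal" language refers to a two-by-two table of indexed against cited. Here is a minimal sketch of how such a table can be tallied. The page records are invented to show the shape of the result and are not our data.

```python
from collections import Counter

# Invented records for illustration; not the dispatch's actual table.
pages = [
    {"url": "https://example.com/a", "indexed": True, "cited": True},
    {"url": "https://example.com/b", "indexed": True, "cited": True},
    {"url": "https://example.com/c", "indexed": True, "cited": False},
    {"url": "https://example.com/d", "indexed": True, "cited": False},
    {"url": "https://example.com/e", "indexed": True, "cited": False},
    {"url": "https://example.com/f", "indexed": False, "cited": False},
]


def cross_tab(records: list) -> Counter:
    """Count pages in each (indexed, cited) cell."""
    return Counter((r["indexed"], r["cited"]) for r in records)


table = cross_tab(pages)
cited_not_indexed = table[(False, True)]  # the cell that started this thread
indexed_not_cited = table[(True, False)]  # the asymmetry that remains

print("cited but not indexed:", cited_not_indexed)
print("indexed but not cited:", indexed_not_cited)
```

In this toy data the first cell is empty and the second is the largest, which is the pattern described above; the real counts live in our table, not in this sketch.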
