Meta AI citation
Citation status
Last checked 2026-05-28
Meta AI citation is the discrete event of a webpage being included as a linked source in Meta AI's answers across its surfaces: WhatsApp, Instagram, Messenger, Facebook, the standalone meta.ai web app, Ray-Ban Meta smart glasses, and Quest VR headsets [1]. Meta AI runs on Llama 4 (released April 5, 2025), which is available in two consumer-facing variants (Scout and Maverick) and is natively multimodal across text, image, and video [2].
Meta AI is structurally distinct from the other entries in the citation-surfaces cluster because it has a publicly documented practice (reported by the Washington Post and summarized in Wikipedia), in operation since May 2024, of summarizing news from various outlets without linking to the original articles [1]. Meta has not published a citation-by-default design commitment for the Meta AI consumer assistant comparable to DuckDuckGo Search Assist's documented one-or-two-source link convention or to the inline-source-card conventions of ChatGPT search, Perplexity, Claude, and Brave Search AI surfaces. For practitioners, this makes Meta AI the cluster's most ambiguous citation surface in 2026.
For publishers, Meta AI belongs in the per-engine breakout of the AI citation metrics pillar with a documented hedge: attribution rates against Meta AI should be expected to be structurally lower than against engines with documented citation-by-default design, and citation tracking should not be the only signal used to gauge whether Meta AI is using a publisher's content.
Status in 2026
Meta AI has expanded through several integration steps since its September 2023 announcement at Meta Connect:
- Initial assistant (2023): Meta AI announced as a Llama-powered virtual assistant integrated into Meta's consumer apps.
- Cross-app rollout (2023-2024): integration into WhatsApp (search bar, group invocation), Instagram (Direct, Search), Messenger, and Facebook (feed and search).
- Standalone meta.ai web app (early 2024): the dedicated web surface as the canonical Meta AI experience outside the social apps.
- Ray-Ban Meta integration (2023-2024): second-generation Ray-Ban Meta smart glasses ship with Meta AI; camera-based "look and ask" capability added via update in April 2024 [1].
- Llama 4 (April 5, 2025): the underlying model family upgraded to Llama 4 (Scout 17B active / 109B total / 16 experts; Maverick 17B active / 400B total / 128 experts). Multimodal across text, image, and video, with a 10 million token context window on Scout [2].
- Quest VR availability: Meta AI is accessible on Quest 2 and later VR headsets.
The cross-surface architecture matters for citation tracking because the assistant's underlying behavior (citation discipline, source-grounding mechanism, refusal patterns) is shared across surfaces, while the rendering of citations differs by modality. Linked text citations are operationally awkward on voice-only (smart glasses) and immersive (VR) surfaces, which leaves the web surface (meta.ai) as the most testable for citation-tracking probes.
Meta operates two AI-relevant crawlers that respect standard robots.txt: Meta-ExternalAgent (training crawler that collects data for Llama model training and Meta AI features) and Meta-ExternalFetcher (real-time retrieval crawler for fresh web content when Meta AI surfaces need it) [3]. These are distinct from the older facebookexternalhit crawler used for link previews when URLs are shared on Facebook, Instagram, or Messenger; facebookexternalhit is not part of the AI training or retrieval pipeline.
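The three-way crawler distinction above can be applied to server logs with a small classifier. The following is a minimal sketch, not a Meta-specified matching rule: the role labels, the function names, and the case-insensitive substring match are assumptions made for illustration, and the example user-agent strings in the usage note are hypothetical.

```python
from collections import Counter
from typing import Iterable, Optional

# Tokens come from the crawler names discussed above. The role labels and
# the lowercase substring match are assumptions of this sketch.
META_CRAWLER_ROLES = {
    "meta-externalagent": "training",
    "meta-externalfetcher": "retrieval",
    "facebookexternalhit": "link-preview",
}


def classify_meta_crawler(user_agent: str) -> Optional[str]:
    """Return the assumed role of a Meta crawler, or None for other agents."""
    ua = user_agent.lower()
    for token, role in META_CRAWLER_ROLES.items():
        if token in ua:
            return role
    return None


def count_roles(user_agents: Iterable[str]) -> Counter:
    """Tally Meta crawler hits by role, ignoring non-Meta user agents."""
    return Counter(
        role for role in map(classify_meta_crawler, user_agents) if role is not None
    )
```

Calling `count_roles` on the user-agent column of an access log gives a rough split between training, retrieval, and link-preview traffic. A user-agent string can be spoofed, so the tally is an indicator rather than proof of crawler identity.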
Detection methodology
For per-surface citation tracking on Meta AI (running the attribution rate probe discipline against Meta AI specifically), each surface requires its own detection. The meta.ai web surface is the most testable; the in-app and modality-constrained surfaces are harder to verify systematically.
| Surface | Where it appears | Citation rendering | How to probe |
|---|---|---|---|
| meta.ai web | meta.ai standalone web app | Citation behavior not documented as citation-by-default; verify per query | Send the target query in meta.ai; record whether any linked source appears |
| WhatsApp / Messenger / Instagram / Facebook in-app | Search bar or assistant invocation inside the social app | Citation rendering may differ from web; verify per surface | Invoke Meta AI in each app; record citation behavior |
| Ray-Ban Meta smart glasses | Voice-only or visual answer surface | Voice-only modality; linked citation not directly applicable | Out of scope for standard text-citation probes |
| Quest VR | Voice or immersive surface | Voice / visual modality | Out of scope for standard text-citation probes |
For citation match rate tracking, Meta AI's per-query citation slot is variable. For citation share, expect lower attribution from Meta AI surfaces than from peer engines with documented citation-by-default design.
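To make the per-surface probe discipline concrete, here is a minimal sketch of how probe results might be recorded and summarized. It assumes attribution rate is taken as the share of probes on a surface whose answer links to the publisher's domain; this entry does not define the formula, so treat that definition, the `ProbeResult` type, and the function names as hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional
from urllib.parse import urlparse


@dataclass
class ProbeResult:
    surface: str                      # e.g. "meta.ai web"
    query: str                        # the target query that was sent
    cited_urls: List[str] = field(default_factory=list)  # linked sources seen


def cites_domain(result: ProbeResult, domain: str) -> bool:
    """True if any linked source is on the domain or one of its subdomains."""
    domain = domain.lower()
    for url in result.cited_urls:
        host = (urlparse(url).hostname or "").lower()
        if host == domain or host.endswith("." + domain):
            return True
    return False


def attribution_rate(
    results: List[ProbeResult], domain: str, surface: str
) -> Optional[float]:
    """Share of probes on one surface that link to the domain (assumed definition).

    Returns None when the surface has no probes, so an untested surface
    is not reported as a zero rate.
    """
    scoped = [r for r in results if r.surface == surface]
    if not scoped:
        return None
    hits = sum(cites_domain(r, domain) for r in scoped)
    return hits / len(scoped)
```

Keeping the surface on every record lets the meta.ai web results be reported separately from any in-app observations, which matches the advice to treat the in-app surfaces as harder to verify.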
What remains contested or unverified
- Whether Meta AI uses live web search at all. Meta has not documented when Meta AI invokes Meta-ExternalFetcher for real-time retrieval and when it answers from the Llama 4 model's training data alone. Some queries appear to receive grounded answers and others appear to draw on training data, but the trigger logic is not public.
- Citation design commitment. Unlike DuckDuckGo Search Assist (always one or two linked sources), Brave AI Answers ("we always show you where the information comes from"), or Perplexity / ChatGPT search (inline citations as default), Meta has not published an equivalent vendor commitment for citation behavior in Meta AI. The documented practice (per Washington Post via Wikipedia) of news summarization without linked attribution is in tension with that consumer-trust convention.
- Per-modality citation rendering. How citations are rendered (or whether they appear at all) on the smart glasses voice surface and the Quest VR surface is not documented by Meta; per-modality behavior may differ in ways that text-citation probes cannot reveal.
- Relationship between Meta's standalone subscription app and the in-app assistants. Wikipedia references a Meta AI standalone subscription app; whether its citation behavior matches the free in-app assistants or differs in citation discipline is not documented at the per-product level.
How to apply
- Track Meta AI as a separate engine with a documented hedge in your citation tracking program. Use meta.ai web as the primary probe surface for text-citation observability; treat the in-app surfaces as inferential rather than directly probable at scale.
- Verify Meta-ExternalAgent and Meta-ExternalFetcher allow rules in robots.txt. Standard user-agent rules apply: Meta's AI-relevant crawlers respect robots.txt directives. If your editorial position is to exclude Meta AI from training and retrieval, both user agents need explicit disallow rules. This is structurally different from Grok, where user-agent rules are unlikely to be honored; see Grok citation for that contrast.
- Do not assume Meta AI citation maps to the same publisher-trust contract as ChatGPT search or Perplexity. The cluster anchors with documented citation-by-default design produce a different attribution surface than Meta AI, where citation is variable per query.
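The robots.txt check in the list above can be scripted with Python's standard `urllib.robotparser`. This is a sketch under stated assumptions: the helper name `meta_ai_access`, the example URL, and the sample robots.txt are hypothetical, and the standard-library parser only approximates how any real crawler interprets the file.

```python
from typing import Dict
from urllib.robotparser import RobotFileParser

# The two AI-relevant user agents named in this entry.
META_AI_AGENTS = ("Meta-ExternalAgent", "Meta-ExternalFetcher")


def meta_ai_access(robots_txt: str, url: str) -> Dict[str, bool]:
    """Report, per Meta AI crawler, whether robots.txt permits fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in META_AI_AGENTS}


# Hypothetical robots.txt for a publisher that excludes Meta AI from both
# training and retrieval while leaving other crawlers unrestricted.
EXAMPLE_ROBOTS = """\
User-agent: Meta-ExternalAgent
Disallow: /

User-agent: Meta-ExternalFetcher
Disallow: /

User-agent: *
Allow: /
"""

if __name__ == "__main__":
    print(meta_ai_access(EXAMPLE_ROBOTS, "https://example.com/article"))
```

Checking both agents in one call makes a partial exclusion visible: a file that disallows only Meta-ExternalAgent still reports Meta-ExternalFetcher as allowed.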
What to skip:
- Treating Meta AI as equivalent to other Llama-using surfaces (Duck.ai's Llama 4 Scout / Maverick models, third-party Llama deployments). The Meta AI consumer assistant is operationally distinct from any other product using Llama as the base model.
- Assuming a "Llama 4 = citation-bearing assistant" inference. The model can produce text without citations, and Meta AI's product surfaces have not been documented to require citations.
How it relates to other concepts
- Citation surfaces cluster sibling: parallel per-engine surface entry to Perplexity citation, ChatGPT search citation, Claude citation, Gemini citation, Microsoft Copilot citations, AI Overview citation, AI Mode, Brave Search citation, Grok citation, DuckDuckGo AI citation, and AI dev tool citations. Meta AI is structurally distinct on the citation-design-commitment axis: it is the cluster anchor without a documented citation-by-default convention.
- Per-engine measurement input to attribution rate, citation share, citation match rate, and the AI citation metrics pillar. Meta AI's variable citation behavior makes its contribution to these metrics structurally lower than peer engines with citation-by-default design.
- Crawler discipline aligned with AI crawler bots: Meta-ExternalAgent (training) and Meta-ExternalFetcher (real-time retrieval) respect robots.txt; standard user-agent allow / disallow rules apply, in contrast to Grok where user-agent rules are unlikely to be honored.
- Citation rendering aligned with Citation vs Mention vs Link: Meta AI's variable citation behavior moves specific responses across cells of the 2x2 taxonomy. The documented practice of news summarization without attribution places some Meta AI outputs in the unlinked-mention cell, which is a structurally weaker citation event for publishers than a linked-citation cell event from peer engines.
Footnotes
- [1] Wikipedia, "Meta AI." en.wikipedia.org/wiki/Meta_AI. Used here for the assistant's integration timeline across WhatsApp, Instagram, Messenger, Facebook, meta.ai, Ray-Ban Meta smart glasses, and Quest VR, and for the load-bearing observation (citing the Washington Post) that since May 2024 the chatbot has summarized news from various outlets without linking to original articles. Wikipedia is a secondary source; the underlying Washington Post reporting is the primary attribution.
- [2] Meta AI Blog, "The Llama 4 herd: The beginning of a new era of natively multimodal AI innovation." ai.meta.com/blog/llama-4-multimodal-intelligence, April 5, 2025. Primary source for Llama 4 launch date, Scout and Maverick model specifications (17B active parameters; 16 vs 128 experts; 109B vs 400B total; native multimodality across text, image, and video; 10M token Scout context window), and integration surfaces (WhatsApp, Messenger, Instagram Direct, meta.ai web).
- [3] 51Degrees research on Meta's robots and crawlers in 2026. 51degrees.com/blog/meta-crawlers-2026. Documents the distinction between Meta-ExternalAgent (training crawler), Meta-ExternalFetcher (real-time retrieval crawler), and facebookexternalhit (older link-preview crawler not used for AI). Both AI-relevant crawlers respect standard robots.txt directives. Third-party reporting rather than first-party Meta documentation; Meta has not published an equivalent first-party crawler documentation page.
Related terms
- AI Overview citation (/terms/ai-overview-citation)
- AI Mode (/terms/ai-mode)
- Perplexity citation (/terms/perplexity-citation)
- ChatGPT search citation (/terms/chatgpt-search-citation)
- Claude citation (/terms/claude-citation)
- Gemini citation (/terms/gemini-citation)
- Microsoft Copilot citations (/terms/microsoft-copilot-citations)
- Brave Search AI citation (/terms/brave-search-citation)
- Grok citation (/terms/grok-citation)
- DuckDuckGo AI citation (/terms/duckduckgo-ai-citation)
- AI dev tool citations (/terms/ai-dev-tool-citations)
- AI citation metrics (/terms/ai-citation-metrics)
- Citation vs mention vs link (/terms/citation-vs-mention-vs-link)
- AI crawler bots (/terms/ai-crawler-bots)
FAQ
- Does Meta AI cite sources when it answers?
- Inconsistently. Meta has not publicly committed to a citation-by-default design pattern for the Meta AI consumer assistant in the way DuckDuckGo Search Assist has (always one or two linked sources) or in the way ChatGPT search, Perplexity, Claude, and Brave AI surfaces typically do (inline source links with a source-card panel). Wikipedia, citing the Washington Post, has documented that since May 2024 Meta AI has summarized news from various outlets without linking to original articles, including in jurisdictions where news links are restricted on Meta platforms. The practical implication for publishers: Meta AI is the cluster's most ambiguous citation surface in 2026, and citation tracking against Meta AI should expect lower attribution rates than against engines with documented citation commitments.
- What model powers Meta AI?
- Meta AI runs on the Llama family of large language models, developed by Meta. The most recent generation as of mid-2026 is Llama 4, released April 5, 2025 in two consumer-facing variants: Llama 4 Scout (17 billion active parameters with 16 experts, 109 billion total parameters, multimodal across text, image, and video) and Llama 4 Maverick (17 billion active parameters with 128 experts, 400 billion total). Meta AI's consumer assistant uses Llama 4 across WhatsApp, Messenger, Instagram Direct, Facebook, and the meta.ai web app. Llama 4's launch announcement does not document a web search or citation capability for the assistant surfaces.
- What is Meta-ExternalAgent and how is it different from facebookexternalhit?
- Meta-ExternalAgent is Meta's AI training crawler, used to collect web data for Llama model training and Meta AI features. Meta-ExternalFetcher is the real-time retrieval crawler used when Meta AI surfaces need fresh web content. Both respect standard robots.txt directives, giving publishers user-agent-based control. facebookexternalhit is a separate, older crawler used for generating link previews when URLs are shared on Facebook, Instagram, or Messenger; it is not part of the AI training or retrieval pipeline. Treat Meta-ExternalAgent and Meta-ExternalFetcher as the AI-relevant crawlers; do not confuse them with facebookexternalhit (link previews) or with the Meta domain's other infrastructure user agents.
- Where can users interact with Meta AI, and does each surface produce citations differently?
- Meta AI is available across WhatsApp (search bar, group chat invocation), Instagram (Direct and Search), Messenger (one-on-one with the assistant), Facebook (search bar and feed integrations), the meta.ai standalone web app, Ray-Ban Meta smart glasses (voice and camera integrations), and Quest VR headsets. Citation behavior is not consistently documented across these surfaces. The web surface (meta.ai) is the most likely place to see explicit text-based citation conventions; the voice-only and visual surfaces (smart glasses, VR) inherit the assistant's underlying behavior but render answers in modalities where linked citations are operationally awkward. For per-engine citation tracking the meta.ai web surface is the most testable; for the in-app surfaces, attribution is harder to verify.
- How does Meta AI citation compare to ChatGPT search or Perplexity?
- All three are AI assistants with large reach (Meta AI was reported by Meta to be on track to ~600 million monthly active users by end of 2024). Mechanically Meta AI is the most distinct of the three on citation discipline. ChatGPT search and Perplexity have built their consumer-facing surfaces around prominent inline source citations and source-card panels, and treat the citation as part of the user-trust contract. Meta AI's documented behavior (per Wikipedia citing the Washington Post) of summarizing news content without linking to original articles is in tension with that consumer-trust convention. For practitioners running citation tracking, treat Meta AI as a separate engine with a documented hedge: per-query citation behavior is more variable, and publisher attribution from Meta AI surfaces is structurally lower than from peer surfaces with documented citation-by-default design.
Sources & further reading
- Meta AI: The Llama 4 herd (April 5, 2025 launch, Scout / Maverick specs, integration surfaces), 2025-04-05
- Wikipedia: Meta AI (assistant integration surfaces; Llama version timeline; documented news-summarization without attribution)
- Wikipedia: Llama language model (version history; February 2023 first release)
- 51Degrees: Meta crawlers and robots (Meta-ExternalAgent vs Meta-ExternalFetcher vs facebookexternalhit)