
What Kinds of Websites Do AI Search Engines Cite? We Classified 1,485 Domains

Loudmink Team

Loudmink classified the 1,485 domains behind 6,505 AI search citations collected across 103 tracked queries in June 2026. The findings: company blogs and product sites dominate every AI search engine at 62.7% to 84.9% of citations, ChatGPT cited YouTube exactly zero times while Perplexity cited Reddit exactly zero times, and Grok alone produced 48% of all citations in the dataset because it returns 5 to 6 times more sources per answer than Gemini. This article breaks down the website-type mix engine by engine and tells you what to publish, and where, based on which engines your buyers use.

The source type behind a citation matters more than the citation count, because each type demands a different play: you can publish a blog post this week, but earning a spot in an editorial roundup or a cited Reddit thread takes a different motion entirely.

The Bottom Line

  • Company blogs and product sites are the dominant citation source on every AI search engine (70.2% overall), but they are overwhelmingly other companies' sites. The winning move is becoming the company blog that gets cited for category queries, not just brand queries.
  • Two engines have absolute blind spots. ChatGPT cited zero YouTube videos and Perplexity cited zero Reddit threads across the entire dataset. A strategy built only on YouTube, or only on Reddit, is invisible to at least one major engine by construction.
  • Engines differ in citation volume, not just citation taste. Grok returned 3,138 citations against Gemini's 552 for the same queries. More citations per answer means more chances to appear, and different math for where your effort pays off.

The Overall Mix: What 6,505 Citations Point To

Across all five AI search engines, citations break down like this, as of June 2026:

Website Type                          Share of Citations
Company blogs and product sites       70.2%
Video platforms (YouTube)             9.4%
Forums and social (Reddit, Quora)     7.9%
Editorial publications                4.1%
Directories and review platforms      3.8%
Documentation and reference           3.0%
News outlets                          0.5%
Wikis                                 0.2%
Not yet classified                    0.8%

The number that surprises most people is the first one. The conventional wisdom says AI search engines favor independent sources: editorial reviews, community threads, aggregators. They do favor independence, but the pages they actually pull from are mostly published by companies. The resolution to that apparent contradiction is in the next section.

Company Blogs Dominate, Just Not Yours

AI search engines cite company-published content constantly, but almost never the content of the brand being asked about. When someone asks "best voice AI platform," the engines cite comparison pages and category guides published by software companies, just not necessarily any of the companies being compared. In our earlier research on third-party citations, a tracked brand's own domain accounted for under 10% of the citations about it. Both findings hold at once: vendor-published content wins, and your vendor-published content about yourself loses.

The practical conclusion is that "company blog" is not one category. A product page describing your own features earns mentions, not citations. A category page on the same domain comparing ten products, including competitors, with honest pricing and tradeoffs, gets treated like editorial. The 70.2% is dominated by the second kind.

What to do: Publish category-level comparison content on your own domain: "best X for Y" pages that cover the full competitive landscape with real pricing and honest assessments. As of June 2026, this is the single most cited content format in our data, and it is the one piece of the citation mix you control completely.

Each AI Search Engine Has Its Own Source Diet

No two engines cite the same mix of website types. The per-engine breakdown, citation-weighted across the same 103 queries:

Website Type                   ChatGPT   Gemini   Perplexity   Claude   Grok
Company blogs/product sites    73.3%     83.9%    69.1%        84.9%    62.7%
Video (YouTube)                0.0%      3.1%     19.9%        0.2%     12.9%
Forums/social                  3.4%      2.9%     1.8%         2.0%     13.8%
Editorial                      4.6%      5.8%     3.6%         6.0%     3.3%
Docs/reference                 9.4%      2.2%     2.1%         1.3%     2.1%
Directories/reviews            1.8%      1.8%     3.5%         3.8%     4.7%
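
As a consistency check, the per-engine shares can be recombined into the overall mix by weighting each engine's share by its citation count. The sketch below is an illustration in Python, not Loudmink's actual pipeline. The counts for ChatGPT (913), Gemini (552), Perplexity (952), and Grok (3,138) are the ones reported in this article. Claude's count is not reported, so it is inferred here as the remainder of the 6,505 total, which is an assumption.

```python
# Citation counts per engine, as reported in this article.
# Claude's count is NOT reported; it is inferred as the remainder
# of the 6,505 total (an assumption made for this sketch).
TOTAL = 6505
COUNTS = {"ChatGPT": 913, "Gemini": 552, "Perplexity": 952, "Grok": 3138}
COUNTS["Claude"] = TOTAL - sum(COUNTS.values())

# Per-engine shares in percent, copied from the table above.
SHARES = {
    "Company blogs/product sites": {
        "ChatGPT": 73.3, "Gemini": 83.9, "Perplexity": 69.1, "Claude": 84.9, "Grok": 62.7,
    },
    "Video (YouTube)": {
        "ChatGPT": 0.0, "Gemini": 3.1, "Perplexity": 19.9, "Claude": 0.2, "Grok": 12.9,
    },
    "Forums/social": {
        "ChatGPT": 3.4, "Gemini": 2.9, "Perplexity": 1.8, "Claude": 2.0, "Grok": 13.8,
    },
}


def overall_share(website_type: str) -> float:
    """Volume-weighted average of the per-engine shares, in percent."""
    per_engine = SHARES[website_type]
    weighted = sum(per_engine[engine] * COUNTS[engine] for engine in COUNTS)
    return weighted / TOTAL


for website_type in SHARES:
    print(f"{website_type}: {overall_share(website_type):.1f}%")
```

Under these assumptions the recombined figures land within about a tenth of a point of the overall table. Small gaps are expected because the inputs are already rounded. The check also makes the weighting visible: Grok's mix pulls the overall numbers toward itself because it contributes nearly half of the citations.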

ChatGPT: docs-heavy, and zero YouTube

ChatGPT is the only engine where documentation and reference content is a major citation source: 9.4% of its citations, roughly four to seven times the share on any other engine. It cited zero YouTube videos in the entire dataset. It also had the highest share of long-tail domains our classifier marked unknown (5.4% unclassified versus near zero elsewhere), which suggests ChatGPT reaches deeper into obscure sites than the other engines. If your buyers use ChatGPT, structured reference content, clear documentation, and well-organized comparison pages are the formats that match its retrieval behavior. Video investment does nothing here.

Perplexity: a fifth of citations are YouTube, zero are Reddit

YouTube made up 19.9% of Perplexity's citations, the highest video share of any engine, while Reddit made up exactly zero. That makes Perplexity the clearest case of channel asymmetry in the data: a platform that is a major source on one engine is invisible on another. Reddit and YouTube split the engine landscape between them, and Perplexity sits firmly on the YouTube side.

Claude: blogs and editorial, nothing social

Claude pulled 84.9% of its citations from company blogs and product sites, the highest of any engine, and another 6.0% from editorial publications, with near-zero Reddit and YouTube. Claude rewards clean, structured, factual prose on owned domains and established publications. If Claude matters for your audience, your investment is your own website and earned editorial coverage, not community platforms.

Gemini: blogs above all, in this window

Gemini pulled 83.9% of its citations from company blogs and product sites, second only to Claude, with editorial at 5.8% and YouTube at just 3.1%. That YouTube number is notably lower than what we measured earlier in the year, when video was one of Gemini's stronger sources, and there are two candidate explanations: this dataset spans different product categories than our longer-running B2B SaaS study, and engine behavior itself shifts, with only 38% of citations persisting from one week to the next. We will know which explanation holds as this window extends. Until then, treat any single-window number, including ours, as a snapshot rather than a law.
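
A week-to-week persistence figure like the 38% above can be computed as a set overlap. The sketch below is one plausible definition, assumed for illustration: the fraction of last week's cited URLs that are cited again this week. The article does not spell out the exact definition used in the longer-running research, and the function name and sample URLs are hypothetical.

```python
def persistence_rate(previous_week: set, current_week: set) -> float:
    """Fraction of last week's cited URLs that are cited again this week.

    Assumed definition for illustration; returns 0.0 when there is
    nothing to compare against.
    """
    if not previous_week:
        return 0.0
    return len(previous_week & current_week) / len(previous_week)


# Hypothetical URLs, not taken from the dataset.
last_week = {"a.example/guide", "b.example/compare", "c.example/docs",
             "d.example/review", "e.example/thread"}
this_week = {"a.example/guide", "b.example/compare", "f.example/new"}

print(f"{persistence_rate(last_week, this_week):.0%} of citations persisted")
```

With a rate this low, a single window can shift by several points for reasons unrelated to any change in strategy, which is why the per-engine numbers here are best read as a snapshot.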

Grok: the volume machine

Grok produced 3,138 citations, 48% of the entire dataset, against 552 from Gemini for the same queries. It is also the most community-driven engine: 13.8% forums and social (8.7% Reddit alone) and 12.9% video. Because Grok cites so many sources per answer, it offers the most surface area for any brand trying to break in, and its appetite for Reddit confirms what we found when Grok went offline and Reddit citations collapsed 91%.

Engines Differ in Volume, Not Just Taste

Citation counts per engine ranged from 552 (Gemini) to 3,138 (Grok) across identical queries, a 5.7x spread. This changes the optimization math. An engine that cites 3 sources per answer is a winner-take-most environment: you are either one of the 3 or invisible. An engine that cites 30 sources per answer offers more entry points but each citation carries less weight in the final narrative.
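
The spread and share figures follow directly from the reported counts. The sketch below reproduces them and adds a per-query average. That average equals "citations per answer" only if each tracked query produced exactly one answer per engine, which this article does not state, so treat it as an assumption.

```python
TOTAL_CITATIONS = 6505
TRACKED_QUERIES = 103
GROK, GEMINI = 3138, 552  # citation counts reported in this article

# Ratio between the highest-volume and lowest-volume engine.
spread = GROK / GEMINI

# Grok's share of every citation in the dataset.
grok_share = GROK / TOTAL_CITATIONS


def per_query(citations: int) -> float:
    """Average citations per tracked query (assumes one answer per query)."""
    return citations / TRACKED_QUERIES


print(f"Spread: {spread:.1f}x")
print(f"Grok share of dataset: {grok_share:.0%}")
print(f"Grok per query: {per_query(GROK):.1f}")
print(f"Gemini per query: {per_query(GEMINI):.1f}")
```

Under that one-answer-per-query assumption, Grok averages roughly 30 citations per query and Gemini roughly 5, which is the practical difference between an engine with many entry points and one where a handful of pages carry the answer.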

What to do: Weight your channel investment by both the engine's source taste and its citation volume. Grok's high volume plus Reddit appetite makes Reddit contributions a high-probability play for Grok coverage. Gemini's low volume plus blog concentration means a small number of well-ranked pages carry its answers, so your existing Google SEO does double duty there.
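
One way to combine taste and volume is to multiply each engine's share for a channel by that engine's citation count, which gives an estimated number of citations going to that channel. The sketch below does this for three channels using the shares and counts reported in this article. Claude's count of 950 is inferred as the remainder of the 6,505 total rather than reported, and the function names are hypothetical.

```python
# Citation counts per engine. Claude's 950 is inferred (6,505 minus the
# four reported counts), not a figure stated in the article.
COUNTS = {"ChatGPT": 913, "Gemini": 552, "Perplexity": 952, "Claude": 950, "Grok": 3138}

# Per-engine shares in percent, from the per-engine table in this article.
SHARES = {
    "video": {"ChatGPT": 0.0, "Gemini": 3.1, "Perplexity": 19.9, "Claude": 0.2, "Grok": 12.9},
    "forums_social": {"ChatGPT": 3.4, "Gemini": 2.9, "Perplexity": 1.8, "Claude": 2.0, "Grok": 13.8},
    "docs_reference": {"ChatGPT": 9.4, "Gemini": 2.2, "Perplexity": 2.1, "Claude": 1.3, "Grok": 2.1},
}


def estimated_citations(channel: str) -> dict:
    """Share times volume: estimated citations per engine for one channel."""
    return {
        engine: SHARES[channel][engine] / 100 * COUNTS[engine]
        for engine in COUNTS
    }


def rank_engines(channel: str) -> list:
    """Engines sorted by estimated citations for the channel, highest first."""
    estimates = estimated_citations(channel)
    return sorted(estimates.items(), key=lambda item: item[1], reverse=True)


for channel in SHARES:
    top_engine, top_count = rank_engines(channel)[0]
    print(f"{channel}: most citations on {top_engine} (about {top_count:.0f})")
```

Ranking by estimated count rather than by share can change the picture: Perplexity has the highest video share, but Grok's larger volume means it returns more video citations in absolute terms. Which of the two measures matters more depends on which engines your buyers actually use.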

What This Means for Your Content Mix

The data points to a three-part content strategy, in priority order. First, publish category-level comparison content on your own domain, because company-published category content is the dominant citation format on every engine. Second, match your third-party channel to your buyers' engines: Reddit for Grok and, to a lesser degree, ChatGPT coverage; YouTube for Perplexity; editorial outreach for Claude and Gemini. Third, keep documentation and reference content structured and current if ChatGPT matters to you, because no other engine rewards it as heavily.

What the data argues against: spreading effort evenly across every channel by default, and strategies built on a single third-party channel. Two engines have absolute blind spots (ChatGPT ignores YouTube, Perplexity ignores Reddit), so a YouTube-only or Reddit-only plan is structurally invisible on at least one of them.

Loudmink tracks which pages AI search engines cite for your queries and classifies who controls them. Plans from $99/mo.

How We Classified the Domains

We collected every citation URL returned by ChatGPT, Gemini, Perplexity, Claude, and Grok across 103 tracked queries spanning five product categories in June 2026: 6,505 citations pointing at 3,571 distinct URLs on 1,485 domains. Each domain was classified into one of eight website types using a classification system seeded with hand-verified labels for major platforms and extended with model-assisted labeling for the long tail. Results are citation-weighted: a domain cited 50 times counts 50 times, because we are measuring where answers come from, not how many sites exist.
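
The citation-weighted counting described above can be sketched in a few lines of Python. This is a minimal illustration, not the classification system itself: the seed labels, the type names, and the function names are all hypothetical, and the model-assisted labeling of the long tail is replaced here by a plain "unclassified" fallback.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical seed labels for major platforms. The real hand-verified
# label set is not published in this article.
SEED_LABELS = {
    "youtube.com": "video",
    "reddit.com": "forums_social",
    "quora.com": "forums_social",
    "wikipedia.org": "wiki",
}


def domain_of(url: str) -> str:
    """Lowercased hostname of a URL, without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def classify(domain: str) -> str:
    """Seed lookup that also matches subdomains; otherwise 'unclassified'."""
    for seed, label in SEED_LABELS.items():
        if domain == seed or domain.endswith("." + seed):
            return label
    return "unclassified"


def citation_weighted_shares(citation_urls: list) -> dict:
    """Share of citations per website type; each citation counts once."""
    counts = Counter(classify(domain_of(url)) for url in citation_urls)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


# Hypothetical citation list: a URL cited twice is counted twice.
sample = [
    "https://www.youtube.com/watch?v=abc",
    "https://www.youtube.com/watch?v=abc",
    "https://old.reddit.com/r/example/comments/1",
    "https://example.com/blog/best-tools",
]
print(citation_weighted_shares(sample))
```

Counting citations rather than distinct domains is the design choice that matters: a domain cited 50 times contributes 50 to its type, so the shares describe where answers come from rather than how many sites exist.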

Two honesty notes. First, 0.8% of citations point at domains we could not classify with confidence, and we report that share rather than excluding it. An earlier version of this classification mislabeled some vendor sites as editorial; we tightened the rules, re-classified every model-labeled domain, and updated the numbers here the same day. Second, this is a distribution snapshot from a single June 2026 window, not a longitudinal claim. Source mixes shift as engines update their retrieval, and our own longer-running research shows weekly churn in which specific pages get cited. The per-engine contrasts here (ChatGPT's zero YouTube, Perplexity's zero Reddit, Grok's volume) are large enough that we expect the direction to hold even as the decimals move.

Frequently Asked Questions

What type of website gets cited most by AI search engines?

Company blogs and product sites, at 70.2% of the 6,505 citations we classified in June 2026. The catch is that these are overwhelmingly category-level pages (comparisons, guides, roundups) published by companies, not product pages describing the publisher's own product. Video (9.4%), forums (7.9%), and editorial publications (4.1%) follow.

Does ChatGPT cite YouTube videos?

In our June 2026 dataset, no. ChatGPT cited zero YouTube videos across 913 citations, while YouTube made up 19.9% of Perplexity's citations. If your audience uses ChatGPT, video content does not contribute to your visibility there; structured text content and documentation do.

Does Perplexity cite Reddit?

Effectively no. Perplexity cited zero Reddit threads across 952 citations in our June 2026 data, consistent with the near-zero rates we measured in earlier research. Reddit presence pays off on Grok (8.7% of its citations) and to a lesser degree ChatGPT, not on Perplexity.

Why does Grok produce so many more citations than other engines?

Grok returns more sources per answer than any other AI search engine: 3,138 citations in our dataset versus 552 from Gemini for the same 103 queries. More citations per answer means more brands appear per response, which makes Grok the engine where new entrants have the most realistic path to showing up quickly.

How reliable is a single-window source-type study?

Reliable for distribution shape, not for precise decimals. The large contrasts (blogs dominating everywhere, ChatGPT's zero YouTube, Perplexity's zero Reddit) are robust signals. The smaller percentages will move between windows, and our longer-running research shows only 38% of specific citations persist week to week, so we treat this as a snapshot and update it as data accumulates.
