New Study: How Often Do AI Assistants Hallucinate Links? (16 Million URLs Studied)

This study examines a concrete problem: conversational AI systems often suggest URLs that fail. Ahrefs checked 16 million unique URLs tied to popular AI platforms and compared the outcomes to traditional search. The finding is stark: these tools send users to 404 pages about 2.87x more often than Google Search.

For marketers and SEO teams in India, this trend matters now. Early analytics show branded discovery routes can shift as newer search experiences surface content. A small share of traffic that breaks on landing can erode trust quickly.

This piece is a data-led report, not a panic alert. You will get the study method, per-platform results, why such errors occur, and practical steps to detect and fix problems operationally.

Key Takeaways

  • Across 16 million URLs, AI assistants drove 404s at 2.87x the rate of Google Search.
  • “Hallucinated links” here means credible-looking suggestions that return errors when clicked.
  • Indian marketers should monitor small traffic shifts—they can grow into bigger issues.
  • The report covers methodology, results by platform, and mitigation steps.
  • Act now: set up tracking and quick-remediation workflows to protect brand journeys.

Why AI-generated broken links are becoming a measurable web traffic problem

Analytics can now surface a quiet but costly problem: generated URLs leading to dead pages. The 16 million–URL study found that assistants send visitors to 404 pages 2.87x more often than Google Search.

These are generated URLs that look plausible but route users to 404 pages. That dead-end experience erodes confidence in both the referrer and the brand.

Because many visits are mid-journey research, a broken landing interrupts discovery and lowers conversion chances. Repeated failures make a website feel neglected or outdated.

Key headline finding from the 16M-URL analysis

The data is clear: the clicked-URL average 404 rate was 0.43% for generated referrals versus a 0.15% baseline from Google Search across 629M unique URLs.

  • Generated referrals appear as distinct referrers, so teams can quantify how often sessions hit 404 pages.
  • For high-growth sectors in India—SaaS, fintech, education—even small bumps in website traffic from these sources create visible impact.

The next section will explain how the URLs were analyzed and how the 404 rate was measured across clicks and citations.

How the 16 million URLs were analyzed across ChatGPT, Gemini, Copilot, Perplexity, Claude, and Mistral

Ahrefs split the study into two complementary datasets to capture both user-facing failures and raw citations.

Dataset one: web analytics of clicked URLs

This view used referrer data in web analytics to collect clicked URLs where the session source was an AI assistant. This ties the problem to real sessions rather than hypothetical mentions.

How likely-404 pages were identified

At scale, pages were flagged as likely 404 when the HTML title contained “404” or “not found.” This pragmatic rule made it possible to scan millions of unique URLs quickly.
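
As a rough illustration (a minimal sketch, not the study’s actual code), the title heuristic amounts to a small string check over each page’s HTML:

```javascript
// Minimal sketch of the title heuristic described above (illustrative only).
// Flags a page as a likely 404 when its <title> contains "404" or "not found".
function looksLike404(html) {
  const match = html.match(/<title[^>]*>([\s\S]*?)<\/title>/i);
  if (!match) return false; // no title tag: cannot apply the heuristic
  const title = match[1].toLowerCase();
  return title.includes("404") || title.includes("not found");
}
```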

Dataset two: Brand Radar of cited URLs

The Brand Radar dataset extracted cited URLs from model outputs, independent of clicks. This shows how often URLs are mentioned, including many that never earn a visit.

Validating HTTP status with a crawler database

For URLs present in the crawler database (~65% coverage), the study pulled the most recent HTTP status to compute 404 rates. Coverage matters because missing entries can hide URLs that never existed.
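
The rate itself is simple arithmetic over whatever the crawler covers. A minimal sketch, assuming a plain object mapping each URL to its last-known HTTP status (undefined where the crawler has no entry):

```javascript
// Illustrative sketch: compute a 404 rate over crawler-covered URLs only.
// statusByUrl maps URL -> last-known HTTP status, or undefined when the
// crawler has no entry (the ~35% coverage gap mentioned above).
function compute404Rate(statusByUrl) {
  const known = Object.values(statusByUrl).filter((s) => s !== undefined);
  if (known.length === 0) return 0;
  const notFound = known.filter((s) => s === 404).length;
  return notFound / known.length; // e.g. 0.0043 for a 0.43% rate
}
```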

“Combining session-based clicked urls with a crawler-backed brand radar delivers a more complete, pragmatic picture of broken referrals.”

Known limitations: title-based detection can undercount 404s when templates omit those strings. Click data misses unclicked citations, and crawler gaps can understate true hallucinations. Conversely, some 404s reflect removed but legitimate pages, so not every 404 equals a fabricated url.

With methodology established, the next section compares 404 rates by source and benchmarks them against Google baselines.

What the data shows about AI assistants’ hallucinated links and 404 rates vs Google Search

The data presents a clear contrast between referral sources and real-world landing outcomes.

Clicked and cited 404 rates by source:

| Source | Clicked 404 rate | Cited 404 rate | Context |
| --- | --- | --- | --- |
| ChatGPT | 1.01% | 2.38% | Highest clicked and cited 404 rate |
| Claude | 0.58% | — | Mid-range clicked rate |
| Copilot | 0.34% | 0.54% | Lower cited rate than ChatGPT |
| Perplexity | 0.31% | 0.87% | Tracks close to Google’s search index |
| Gemini | 0.21% | 0.86% | Also close to SERP baseline |
| Mistral | 0.12% | — | Lowest clicked rate; smaller volumes |

Benchmarking makes the gap clear. The Google referrer baseline sits at a 0.15% 404 rate across 629M URLs. Across AI assistants, the average clicked 404 rate is 0.43%, or 2.87x the Google baseline (0.43 ÷ 0.15 ≈ 2.87).

Why this matters: ChatGPT’s cited URLs return 404s at 2.38%, far above the SERP baseline of 0.84% for top Google results. Perplexity and Gemini track closer to Google’s rate, which suggests their source indexes lean on Google data rather than inventing URLs. Over time, URLs that do not actually exist reduce trust and cost conversion opportunities.

Why AI assistants hallucinate links in the first place

Behind each fabricated URL there is usually either an expired page or a clever pattern guess. Both create credible-looking referrals that then fail when clicked.

Once-valid URLs that expired

Models often surface pages that existed during training. When those pages were deleted or moved without redirects, the address now returns a 404.

Deletions, CMS migrations, and missing 301s all produce this category. For Indian teams, this means older campaign pages can suddenly harm discovery if left unredirected.

Pattern-based hallucinations

Generative systems also guess addresses using a site’s common structure. These pattern-based hallucinated URLs mirror real paths and feel trustworthy.

The more consistent your permalink and blog patterns are, the easier it is for models to invent plausible but nonexistent pages.

Real-world examples and amplification

Ryan Law documented practical cases on Ahrefs: plausible paths such as /blog/internal-links/ and /blog/newsletter/ attracted visits yet returned 404s.

“Plausible-sounding blog paths that never existed were pulling clicks and producing dead pages.”

When generated content publishes these fabricated URLs, crawlers may index them. That creates a feedback loop: the web copies the error and it spreads.

| Cause | How it looks | Fix |
| --- | --- | --- |
| Expired but once-valid | Previously live blog or product page now missing | Restore or 301 to the closest live page |
| Pattern-based guess | Looks like /blog/topic-name/ but never existed | Create the content, 301 to a category, or serve a helpful 404 |
| Published hallucination | Third-party content lists a fake URL | Request removal, add a canonical, or set redirects |

SEO implication: even wrong URLs often carry correct intent. A targeted redirect or a useful 404 can reclaim value and protect brand journeys.

Impact on SEO and website operations for teams in India

For Indian marketing teams, a tiny share of misrouted traffic can create outsized operational work. Though these referrals make up roughly 0.25% of website traffic versus about 39.35% from Google Search, they often carry high intent.

Where these broken referrals appear

Look for sudden spikes in 404 pages with nonstandard referrers. They show up in landing page reports and content journeys where users expect a specific guide or pricing page.

How to find AI-referred 404 pages in GA4

Use Explorations, pick “Session source,” and apply this regex filter: .*gpt.*|.*chatgpt.*|.*openai.*|.*perplexity.*|.*claude.*|.*gemini.*|.*copilot.*|.*mistral.*|.*bard.*
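
Before relying on the filter, it can help to sanity-check the pattern against sample referrer strings. A quick check in any JavaScript console (the sample sources below are illustrative):

```javascript
// Sanity check of the GA4 session-source regex above (samples are illustrative).
const aiReferrer = /.*gpt.*|.*chatgpt.*|.*openai.*|.*perplexity.*|.*claude.*|.*gemini.*|.*copilot.*|.*mistral.*|.*bard.*/i;

["chat.openai.com", "perplexity.ai", "gemini.google.com", "google"].forEach((source) => {
  console.log(source, "->", aiReferrer.test(source) ? "AI referrer" : "other");
});
// Expected: the first three match; plain "google" does not.
```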

Audit urls at scale with Google Sheets

Export landing URLs, then add an Apps Script function such as =GetHttpStatus(A2) to pull each page’s HTTP status. Filter the results to 404 and combine with visit counts (for example, >10 visits/month).
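
A minimal Apps Script sketch of such a GetHttpStatus() custom function; one possible implementation, and note that fetch quotas apply on large exports:

```javascript
/**
 * Paste into Extensions > Apps Script, then use =GetHttpStatus(A2) in a cell.
 * followRedirects is disabled so a 301/302 is reported as-is rather than
 * the status of its target; flip it if you prefer the final status.
 */
function GetHttpStatus(url) {
  if (!url) return "";
  try {
    const response = UrlFetchApp.fetch(url, {
      muteHttpExceptions: true, // return 4xx/5xx codes instead of throwing
      followRedirects: false,   // report the first status, not the redirect target
    });
    return response.getResponseCode();
  } catch (e) {
    return "FETCH ERROR"; // DNS failures, timeouts, blocked requests
  }
}
```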

What to fix first and a quick mitigation playbook

  • Prioritize broken urls that have meaningful traffic and business intent (pricing, docs, product pages).
  • Use 301 redirects when a close topical match exists (see the sketch after this list).
  • When no good match exists, serve a high-converting 404 page with resource links and CTAs.
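
For teams on a Node stack, the redirect step can be as small as a lookup middleware. A hedged sketch, assuming Express and hypothetical redirect targets; most sites will do this in their CMS or server config instead:

```javascript
const express = require("express");
const app = express();

// Map hallucinated or expired paths to the closest live page.
// Source paths are from the Ahrefs examples above; targets are hypothetical.
const redirects = {
  "/blog/internal-links/": "/blog/internal-links-guide/",
  "/blog/newsletter/": "/newsletter/",
};

app.use((req, res, next) => {
  const target = redirects[req.path]; // exact match, so trailing slashes matter
  if (target) return res.redirect(301, target);
  next(); // fall through to normal routing, including a helpful 404 page
});
```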

Measure changes over time. Track recurrence, update redirects, and let data guide where to spend engineering time for the best SEO impact.

Conclusion

The study delivers a clear, actionable message: generated referrals produce 404s at 2.87x the Google baseline, with ChatGPT showing the highest observed rates (1.01% clicked, 2.38% cited).

Why this happens is simple: web churn leaves expired pages, and pattern-based guesses create plausible but nonexistent addresses. Both paths send real users to dead pages.

Recommended posture: measure first, then fix. Use analytics and HTTP audits to find high-intent 404s. Prioritize traffic and business impact before creating redirects.

Action checklist: isolate non-search referrers, surface 404 landing pages, redirect or restore high-value paths, and improve 404 UX to recover trust. Teams that adopt this repeatable process will protect brand journeys as discovery channels evolve.

FAQ

What does the new study cover?

The study analyzes 16 million unique URLs to measure how often major conversational models send users to pages that return 404 or similar error states. It compares results across ChatGPT, Gemini, Copilot, Perplexity, Claude, and Mistral and benchmarks those results against Google Search referrer data.

Why are broken URLs from generated content a growing web traffic problem?

Broken URLs reduce user trust, harm brand discovery, and waste referral traffic. When generated content cites or directs users to non-existent pages, site owners miss potential conversions and see distorted analytics, making it harder to prioritize fixes.

What is meant by “hallucinated URLs” in this analysis?

The term refers to URLs that appear plausible but do not resolve to valid pages. This includes pattern-based guesses, once-valid pages removed without redirects, and improperly constructed paths that look authoritative but return 404 or “not found” responses.

How were the 16 million URLs collected and analyzed?

The research used two datasets: one based on clicked URLs and real referrers captured in web analytics, and a second based on cited URLs surfaced by a brand-centric crawl. HTTP status codes were validated with a crawler database, while page title checks (e.g., “404”, “not found”) helped identify error pages.

How did the study detect 404 pages?

Detection combined HTTP status code checks from crawler results with page title heuristics that look for phrases like “404” or “not found.” This hybrid method increases coverage but has known limits that can both undercount and overcount error pages.

What limitations should readers know about?

Coverage gaps in crawler databases, temporary server errors, and sites that return custom pages without proper HTTP status codes can skew results. The method can undercount some pattern-based failures and overcount sites that display “not found” text while returning a 200 status.

What were the key findings vs Google Search?

Clicked URL results show ChatGPT producing the highest 404 rate around 1.01%, with other models trailing. Google referrer baseline across 629 million URLs was about 0.15%. Overall, conversational models drove 404 pages at roughly 2.87 times Google’s rate in this dataset.

How did cited URL results differ from clicked URL results?

In the cited-URL dataset, ChatGPT-cited URLs returned 404 at about 2.38%, indicating a higher failure rate when models list sources versus when users click results. This suggests citation behavior can amplify exposure to broken pages.

How does Google’s SERP baseline compare?

In context, roughly 0.84% of top Google Search results returned 404 in a sampled SERP baseline. That sits between the referrer baseline and the higher rates observed in some model-cited results, showing variation by method and index coverage.

Why do some models track closer to Google’s baseline?

Tools like Perplexity and Gemini relied on broader or fresher source indexes and retrieval strategies that reduced obvious failures. Better source curation and live retrieval reduce the chance of returning expired or guessed URLs.

What causes models to generate or repeat dead URLs?

Common causes include expired pages that were deleted or migrated without redirects, pattern-based URL generation that constructs plausible but non-existent paths, and amplification when crawlers index and repeat those erroneous references.

Are there real examples that illustrate the problem?

Yes. Researchers such as Ryan Law have documented plausible-looking paths — for example, broken blog paths on reputable platforms like Ahrefs — that return 404 despite appearing valid. These show how easy it is for algorithms to surface realistic but dead addresses.

How does this issue affect SEO and site operations in India?

For teams in India handling content, discovery, and analytics, even small AI referral shares can cause disproportionate work: investigating 404 spikes, fixing redirects, and updating content. The impact depends on traffic volume and business intent tied to those referrals.

Where do these referrals show up in analytics?

They appear in web analytics as session referrers, source/medium entries, and landing-page reports. Using GA4 session source filters and regex patterns can help isolate sessions attributed to conversational models or specific query referrers.

How can teams find and audit AI-referred 404 pages at scale?

Use GA4 to filter sessions by source and landing page, export lists of URLs, then run bulk HTTP status checks. Google Sheets combined with an Apps Script that pulls status codes via fetch calls is a practical method for mid-sized audits.

How should teams prioritize fixes for broken URLs?

Prioritize based on meaningful click counts and business intent. Fix top referrers and high-conversion landing pages first. For low-value pages, consider creating useful 404 experiences or consolidating content and adding 301 redirects where appropriate.

What mitigation tactics are recommended?

Implement 301 redirects for moved or consolidated content, create helpful and conversion-friendly 404 pages, maintain a redirect map, and monitor referrer patterns. Regular crawls and server log reviews help catch expired links before they cause user friction.

How often should teams recheck status and redirects?

Schedule automated crawls and audits at least monthly for critical content and quarterly for lower-priority pages. Increase cadence when you run campaigns or notice spikes in referral errors so you can act quickly.