Robots.txt Analyzer & Crawl Budget Checker
Paste your robots.txt and instantly validate every directive. Detect accidental Googlebot blocks, missing sitemaps, conflicting allow/disallow rules, and crawl-delay issues - with a block-by-block visual breakdown.
How to use this tool
- 1. Open your robots.txt
Visit yourdomain.com/robots.txt in your browser, then select all and copy the entire content.
- 2. Paste and analyse
Paste the content into the editor below. Issues are detected instantly - no button press needed.
- 3. Review the breakdown
See all user-agent blocks, check error and warning flags, and confirm that your sitemaps are declared correctly.
What this tool checks
Full user-agent block parsing
Every User-agent block is parsed individually, showing exactly which Allow and Disallow rules apply to each bot type.
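As a rough illustration of block parsing, the sketch below groups rules by user-agent in Python. It is not this tool's actual code (which runs in the browser). The `Group` class and `parse_groups` function are hypothetical names, and the grouping rule assumed here is that consecutive User-agent lines share one block while a User-agent line that follows rules starts a new one.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Group:
    """One user-agent block: the agents it names and its rules in file order."""
    agents: List[str] = field(default_factory=list)
    rules: List[Tuple[str, str]] = field(default_factory=list)  # (directive, path)


def parse_groups(text: str) -> List[Group]:
    """Split robots.txt text into user-agent groups (hypothetical helper)."""
    groups: List[Group] = []
    current: Optional[Group] = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue  # blank or malformed line
        key, value = (part.strip() for part in line.split(":", 1))
        key = key.lower()
        if key == "user-agent":
            # Consecutive User-agent lines share one group; a User-agent
            # line that follows rules starts a new group.
            if current is None or current.rules:
                current = Group()
                groups.append(current)
            current.agents.append(value.lower())
        elif key in ("allow", "disallow") and current is not None:
            current.rules.append((key, value))
    return groups


sample = """\
User-agent: *
Disallow: /admin/

User-agent: Googlebot
User-agent: Bingbot
Allow: /admin/public/
Disallow: /admin/
"""

for group in parse_groups(sample):
    print(group.agents, group.rules)
```

Agent names are lowercased so that later checks can compare them case-insensitively; paths are kept exactly as written.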
Googlebot block detection
Detects if Googlebot or the wildcard (*) agent is fully blocked via "Disallow: /" - the most damaging robots.txt configuration possible.
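A check like this could be sketched in Python as follows. The function name `fully_blocked_agents` is hypothetical, and the sketch makes one simplifying assumption: a block counts as fully blocked only when it contains "Disallow: /" and no non-empty Allow rule.

```python
from typing import List, Set, Tuple


def _is_full_block(rules: List[Tuple[str, str]]) -> bool:
    """True when the rules contain 'Disallow: /' and no non-empty Allow."""
    has_full_block = ("disallow", "/") in rules
    has_allow = any(kind == "allow" and path for kind, path in rules)
    return has_full_block and not has_allow


def fully_blocked_agents(text: str) -> Set[str]:
    """Return the (lowercased) agents whose block disallows the whole site."""
    blocked: Set[str] = set()
    agents: List[str] = []
    rules: List[Tuple[str, str]] = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        key = key.lower()
        if key == "user-agent":
            if rules:  # a User-agent line after rules closes the previous block
                if _is_full_block(rules):
                    blocked.update(agents)
                agents, rules = [], []
            agents.append(value.lower())
        elif key in ("allow", "disallow"):
            rules.append((key, value))
    if _is_full_block(rules):  # close the final block
        blocked.update(agents)
    return blocked


sample = """\
User-agent: *
Disallow: /

User-agent: Googlebot
Disallow: /private/
"""

print(fully_blocked_agents(sample))
```

An empty "Disallow:" line is not treated as a block here, since an empty value disallows nothing.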
Conflicting rule detection
Identifies paths that are blocked globally by the wildcard block but explicitly allowed for a specific bot - useful for verifying intentional Googlebot exceptions.
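One way to picture this check is the Python sketch below. It assumes the file has already been parsed into a mapping from agent name to (directive, path) pairs. That mapping and the function name `find_overrides` are illustrative, not this tool's internals. Matching is a plain prefix comparison, so wildcard characters (* and $) in paths are not handled.

```python
from typing import Dict, List, Tuple

Rules = List[Tuple[str, str]]  # (directive, path) pairs in file order


def find_overrides(groups: Dict[str, Rules]) -> List[Tuple[str, str, str]]:
    """Return (agent, allowed_path, wildcard_disallow) for each override."""
    wildcard_blocks = [
        path for kind, path in groups.get("*", []) if kind == "disallow" and path
    ]
    found: List[Tuple[str, str, str]] = []
    for agent, rules in groups.items():
        if agent == "*":
            continue
        for kind, path in rules:
            if kind != "allow":
                continue
            for blocked in wildcard_blocks:
                # The allowed path sits inside a globally blocked prefix.
                if path.startswith(blocked):
                    found.append((agent, path, blocked))
    return found


groups = {
    "*": [("disallow", "/admin/")],
    "googlebot": [("allow", "/admin/public/"), ("disallow", "/tmp/")],
}

print(find_overrides(groups))
```

Under the Robots Exclusion Protocol a crawler follows only the most specific block that names it, so a bot with its own block does not inherit the wildcard rules. A report like this is therefore a prompt to confirm the exception is intentional rather than a sign of a parsing error.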
Sitemap validation
Checks that at least one Sitemap URL is declared. Validates that sitemap paths are absolute URLs starting with https://.
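The two sitemap checks can be sketched in Python using the standard library's `urllib.parse.urlparse`. The function name `check_sitemaps` and the wording of the messages are assumptions made for illustration.

```python
from typing import List
from urllib.parse import urlparse


def check_sitemaps(text: str) -> List[str]:
    """Return a list of issues found with Sitemap directives."""
    sitemaps: List[str] = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep and key.strip().lower() == "sitemap":
            sitemaps.append(value.strip())

    issues: List[str] = []
    if not sitemaps:
        issues.append("No Sitemap directive declared")
    for url in sitemaps:
        parsed = urlparse(url)
        # An absolute https URL has both the https scheme and a host.
        if parsed.scheme != "https" or not parsed.netloc:
            issues.append(f"Sitemap is not an absolute https:// URL: {url}")
    return issues


print(check_sitemaps("User-agent: *\nDisallow:\nSitemap: /sitemap.xml\n"))
```

Splitting on the first colon only matters here, because the sitemap URL itself contains a colon after the scheme.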
Crawl-delay warnings
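Flags Crawl-delay values over 10 seconds and reminds you that Googlebot ignores this directive. Google retired the Search Console crawl rate limiter in January 2024; its current guidance for slowing Googlebot in an emergency is to temporarily return 500, 503, or 429 responses.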
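A Crawl-delay check along these lines might look like the Python sketch below. The function name `crawl_delay_warnings`, the `max_seconds` parameter, and the message wording are hypothetical. The 10-second threshold is the one described above.

```python
from typing import List


def crawl_delay_warnings(text: str, max_seconds: float = 10.0) -> List[str]:
    """Return warnings for Crawl-delay lines, tracking the current agents."""
    warnings: List[str] = []
    agents: List[str] = []
    in_rules = False  # True once the current block has a non-agent line
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip().lower(), value.strip()
        if key == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new block
                agents, in_rules = [], False
            agents.append(value.lower())
            continue
        if key == "sitemap":
            continue  # Sitemap lines do not belong to any block
        in_rules = True
        if key != "crawl-delay":
            continue
        try:
            delay = float(value)
        except ValueError:
            warnings.append(f"Crawl-delay is not a number: {value!r}")
            continue
        label = ", ".join(agents) or "(no user-agent)"
        if delay > max_seconds:
            warnings.append(
                f"Crawl-delay {delay:g}s for {label} exceeds {max_seconds:g}s"
            )
        if "googlebot" in agents:
            warnings.append("Googlebot ignores Crawl-delay")
    return warnings


for warning in crawl_delay_warnings("User-agent: Googlebot\nCrawl-delay: 30\n"):
    print(warning)
```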
Duplicate rule detection
Highlights redundant Disallow directives that appear more than once in the same user-agent block - these waste space and may cause confusion.
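Duplicate detection can be sketched with a counter per block. The Python below is an illustration under the same grouping assumption as before (a User-agent line that follows rules starts a new block); `duplicate_disallows` is a hypothetical name.

```python
from collections import Counter
from typing import List, Tuple


def _repeated(agents: List[str], paths: List[str]) -> List[Tuple[Tuple[str, ...], str]]:
    """Return (agents, path) for every Disallow path seen more than once."""
    return [(tuple(agents), path) for path, n in Counter(paths).items() if n > 1]


def duplicate_disallows(text: str) -> List[Tuple[Tuple[str, ...], str]]:
    """Find Disallow paths repeated within the same user-agent block."""
    duplicates: List[Tuple[Tuple[str, ...], str]] = []
    agents: List[str] = []
    paths: List[str] = []
    in_rules = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip().lower(), value.strip()
        if key == "user-agent":
            if in_rules:  # close the previous block before starting a new one
                duplicates += _repeated(agents, paths)
                agents, paths, in_rules = [], [], False
            agents.append(value.lower())
        elif key in ("allow", "disallow"):
            in_rules = True
            if key == "disallow":
                paths.append(value)
    return duplicates + _repeated(agents, paths)


sample = """\
User-agent: *
Disallow: /tmp/
Disallow: /admin/
Disallow: /tmp/

User-agent: Bingbot
Disallow: /tmp/
"""

print(duplicate_disallows(sample))
```

The same path appearing in two different blocks is not reported, because each block applies to different crawlers.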
Why robots.txt gets sites deindexed
The most common SEO disaster
The most frequent robots.txt catastrophe is a developer adding "Disallow: /" to block bots during site development, then forgetting to remove it on launch. This causes an entire site to disappear from Google within days of deployment - often after a major redesign or platform migration.
Crawl budget and indexing efficiency
Crawl budget is not a fixed quota. Google describes it as a combination of how much crawling your server can handle and how much demand there is for your content, and it mostly matters for large sites. Allowing bots to crawl low-value pages (admin panels, filter URLs, session parameters) spends that capacity on URLs that should never rank, instead of on your canonical pages and new content.
How to check your file in 30 seconds
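Open yourdomain.com/robots.txt in Chrome. Select all text (Ctrl+A, or Cmd+A on macOS), copy it, and paste it into this tool. The analysis is instant. Alternatively, Google Search Console > Settings > robots.txt report shows the version Google last fetched, along with any errors it found. The standalone robots.txt Tester has been retired, so use the URL Inspection tool to test specific URLs.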
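To test specific URLs offline, Python's standard library includes `urllib.robotparser`. The sketch below feeds it a pasted file and asks whether a few URLs may be fetched; the sample rules and example.com URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Sample rules standing in for the text copied from /robots.txt.
robots_txt = """\
User-agent: *
Disallow: /admin/
Disallow: /search
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())  # parse pasted text, no network request

urls = (
    "https://example.com/",
    "https://example.com/admin/login",
    "https://example.com/search?q=shoes",
)

for url in urls:
    verdict = "allowed" if parser.can_fetch("Googlebot", url) else "blocked"
    print(f"{verdict:8} {url}")
```

Treat the result as a first pass only. Python's parser applies the first matching rule in file order, whereas Google uses the most specific (longest) matching rule, so the two can disagree when Allow and Disallow rules overlap.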
Frequently asked questions
What is robots.txt and why does it matter for SEO?
Robots.txt is a plain-text file at yourdomain.com/robots.txt that tells crawlers which URLs they may and may not crawl. Misconfigured directives can accidentally block Googlebot from critical pages or waste crawl budget on low-value pages.
Does Googlebot respect Crawl-delay?
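No - Googlebot ignores Crawl-delay. Google also retired the Search Console crawl rate limiter in January 2024, so there is no longer a setting for it; Googlebot adjusts its rate based on how your server responds. Some other crawlers, such as Bingbot, do honor Crawl-delay, but support varies, so check each search engine's documentation.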
What does "Disallow: /" do?
"Disallow: /" blocks the crawler from accessing the entire site. Under the wildcard (*) block, this prevents all search engines from crawling any page - catastrophic for live production sites.
Should I use robots.txt to block duplicate content?
Generally no - prefer canonical tags or noindex meta tags. Robots.txt only prevents crawling, not indexing: a blocked URL can still be indexed, without its content, if other pages link to it. For pages you want removed from the index, use noindex and leave them crawlable, because a crawler has to fetch the page to see the tag.
Does this tool send my robots.txt to a server?
No. All parsing and analysis happens entirely in your browser using JavaScript. No content is uploaded or logged.