Standard Allow-All robots.txt Template

A robots.txt file is a plain text file placed at the root of your domain (https://example.com/robots.txt) that follows the Robots Exclusion Protocol to communicate crawling instructions to well-behaved web crawlers. Understanding what robots.txt does and does not do is essential before configuring it: robots.txt instructs compliant crawlers what to access and what to skip, but it provides no access control — it is a public file, and any crawler that ignores it can still fetch the pages listed in Disallow rules.

The standard allow-all configuration is the right starting point for most public websites that want all their content indexed. A minimal robots.txt consists of a User-agent: * wildcard rule with no Disallow rules, paired with a Sitemap directive pointing to your XML sitemap. This explicitly declares that all crawlers are welcome and provides the sitemap URL to guide efficient crawling.
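In its most compact form, the file described above is just three directives (with example.com standing in for your own domain):

```text
User-agent: *
Disallow:

Sitemap: https://example.com/sitemap.xml
```

An empty `Disallow:` value means "nothing is disallowed", which is the canonical way to state that all paths are open to all crawlers.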

Even with an allow-all configuration, certain paths are commonly disallowed to prevent wasting crawl budget on pages that provide no search value: admin paths (/admin/, /wp-admin/), shopping cart and checkout pages (/cart, /checkout), search result pages (/search?q=), account pages (/account/, /profile/), API endpoints (/api/), and duplicate content paths (e.g., /print/ versions of pages). Excluding these from crawling ensures Googlebot spends its crawl budget on your canonical, indexable content.

The Crawl-delay directive is honored by some crawlers (Bingbot, Yandex) but ignored entirely by Googlebot. If your server struggles under aggressive crawling from secondary search engines or scraper bots, Crawl-delay: 1 (one second between requests) can help reduce load. Googlebot instead adjusts its crawl rate automatically based on how your server responds; Google has retired the Search Console crawl-rate limiter, so if Googlebot is overloading your server, temporarily returning 429 or 503 is the supported way to slow it down.
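A crawler-specific block applies the delay only to the bots that honor it. Note that a crawler obeys the most specific User-agent group that matches it and then ignores the `*` group, so repeat any Disallow rules you still want enforced:

```text
User-agent: Bingbot
Crawl-delay: 1
Disallow: /admin/

User-agent: Yandex
Crawl-delay: 1
Disallow: /admin/
```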

The Sitemap directive is one of the most valuable additions to any robots.txt. It provides a direct link to your sitemap.xml (or sitemap index file for large sites), making it easy for crawlers to discover your complete URL list without relying solely on link following. You can include multiple Sitemap directives for different sitemap files — image sitemaps, video sitemaps, and news sitemaps can all be listed separately.
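Multiple sitemaps are declared simply by repeating the directive (the file names below are illustrative, not required names):

```text
Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/sitemap-images.xml
Sitemap: https://example.com/sitemap-videos.xml
Sitemap: https://example.com/sitemap-news.xml
```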

User-agent-specific blocks allow you to give different instructions to different crawlers. Most sites maintain one wildcard block, but you may want to add crawler-specific rules for specialized bots (image crawlers, archive bots) or to block specific scrapers by name.
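As a sketch of a mixed policy, the following blocks one named crawler entirely (ia_archiver is the Internet Archive's crawler token) while leaving the wildcard rules for everyone else:

```text
# Block one named crawler completely
User-agent: ia_archiver
Disallow: /

# Everyone else: default rules
User-agent: *
Disallow: /admin/
```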

Template Preview

User-agent: *
Disallow: /admin/
Disallow: /api/
Disallow: /account/
Disallow: /cart
Disallow: /checkout
Disallow: /search
Allow: /

Sitemap: https://example.com/sitemap.xml
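Before deploying a ruleset like the one above, you can sanity-check it with Python's standard-library `urllib.robotparser`. This is a local sketch using a shortened version of the template; no network access is required:

```python
from urllib import robotparser

# Parse the template rules directly (no fetch needed)
rules = """\
User-agent: *
Disallow: /admin/
Disallow: /cart
Allow: /
""".splitlines()

rp = robotparser.RobotFileParser()
rp.parse(rules)

# Root page is allowed, admin paths are not
print(rp.can_fetch("*", "https://example.com/"))         # True
print(rp.can_fetch("*", "https://example.com/admin/x"))  # False
```

The same check works against a live site by calling `rp.set_url("https://example.com/robots.txt")` followed by `rp.read()` instead of `rp.parse()`.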

Customize this template with your own details using the free generator:

Open in Generator

FAQ

Does robots.txt affect my search engine rankings?
Robots.txt affects which pages are crawled, and crawling is a prerequisite for evaluating content. Blocking important pages can harm rankings, because Googlebot cannot access them to assess their quality. Note that a blocked URL can still appear in search results without its content if other sites link to it. Robots.txt is for keeping crawlers off genuinely unimportant pages, not for SEO manipulation. To keep a crawlable page out of the index, use a noindex meta tag (or X-Robots-Tag HTTP header) and do not also block that page in robots.txt, since a directive crawlers cannot fetch is never seen.
Should I block /wp-admin/ in robots.txt?
If you use WordPress, yes — Disallow: /wp-admin/ is standard practice. Admin paths have no search value and consuming crawl budget on them is wasteful. However, note that blocking with robots.txt does not secure your admin panel — malicious actors can still attempt to access it. Use authentication, 2FA, and IP allowlisting to actually secure /wp-admin/, independent of robots.txt.
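A common WordPress refinement is to re-allow admin-ajax.php, which front-end features call even for logged-out visitors (WordPress's own default virtual robots.txt does the same):

```text
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
```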
Can I have multiple robots.txt files for different subdomains?
Yes — robots.txt applies per-origin. https://example.com/robots.txt applies only to example.com. https://blog.example.com needs its own robots.txt at https://blog.example.com/robots.txt. This allows different crawl policies for different subdomains — for example, blocking all crawling of a staging subdomain while allowing full crawling of the production subdomain.
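The staging case mentioned above is the two-line block-everything file, served from the staging origin (staging.example.com is illustrative):

```text
# Served at https://staging.example.com/robots.txt
User-agent: *
Disallow: /
```

Since robots.txt is advisory, pair this with HTTP authentication if the staging site must actually stay private.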
