Crawlability Basics: robots.txt, sitemap.xml, and llms.txt
How to configure core discovery files so search and AI crawlers can efficiently access the right pages.
Discovery files should be explicit, minimal, and always in sync with production URLs.
robots.txt
Allow major crawlers by default and add Disallow rules only for paths you deliberately want kept out of crawls, such as admin areas or internal search results.
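A minimal robots.txt following this approach might look like the sketch below; the blocked paths and the sitemap URL are placeholders to adapt to your own site.

```
# Apply to all crawlers by default.
User-agent: *
# Keep deliberately restricted areas out of crawls.
Disallow: /admin/
Disallow: /search
# Point crawlers at the canonical sitemap.
Sitemap: https://www.example.com/sitemap.xml
```

With no Disallow rules (or an empty Disallow value), everything is allowed, so only list paths you actually intend to restrict.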
sitemap.xml
Include only canonical, indexable URLs (no redirects, noindexed pages, or parameterized duplicates), and regenerate the file on every deploy so it stays in sync with production.
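A minimal sitemap in this style is sketched below; the example.com URLs and dates are placeholders, and in practice the file would be emitted by the build or CMS on each deploy rather than edited by hand.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- Canonical URLs only: no redirects, duplicates, or noindexed pages. -->
  <url>
    <loc>https://www.example.com/</loc>
    <lastmod>2024-01-15</lastmod>
  </url>
  <url>
    <loc>https://www.example.com/docs/getting-started</loc>
    <lastmod>2024-01-15</lastmod>
  </url>
</urlset>
```

Referencing the sitemap from robots.txt (as in the earlier example) and submitting it in Google Search Console and Bing Webmaster Tools helps crawlers pick up changes quickly.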
llms.txt
Use it to state the site's scope and crawling preferences for AI systems when relevant, keeping in mind that it is an emerging convention and not all AI crawlers read it.
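Because llms.txt is not a formal standard, treat the sketch below as one common interpretation: a Markdown file at the site root with a title, a short summary, and curated sections of links to the pages most useful to AI systems. All names and URLs here are placeholders.

```markdown
# Example Project

> Short summary of what the site covers and who it is for.

## Docs

- [Getting started](https://www.example.com/docs/getting-started): installation and first steps
- [API reference](https://www.example.com/docs/api): endpoints, parameters, and examples

## Optional

- [Changelog](https://www.example.com/changelog)
```

Since support varies across AI crawlers, keep robots.txt as the authoritative place for hard access rules and use llms.txt for guidance and curation.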
Summary
Accurate discovery files reduce crawl waste and improve indexing coverage.