Today we are delighted to announce the launch of OpenRobotsTXT – a project to archive and analyse the world’s robots.txt files.
The first version of the openrobotstxt.org website is now live. This initial release is a slimmed-down site that aims to provide context for the OpenRobotsTXT crawler to begin its operation in the next few days.
(There is a Catch-22 when launching new crawlers, as the webmaster community likes to see a page that describes the crawler, to help inform consent.)

The project has been bootstrapped by a huge data export of robots.txt files collected by the Majestic crawler, MJ12bot. This export has enabled us to analyse the User Agents referenced in robots.txt files around the web. The initial release of the site focuses on this study, with a free-to-download (Creative Commons) data set detailing the User Agents discovered across the web. A minimal sketch of this kind of analysis follows below.
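To illustrate the idea, here is a minimal sketch of how User Agents might be extracted and tallied from robots.txt files. This is not the OpenRobotsTXT or MJ12bot pipeline, just an assumed approach; the example URL is a placeholder.

```python
import urllib.request
from collections import Counter

def user_agents_from_robots_txt(url):
    """Fetch a robots.txt file and return the user agents it names."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        text = resp.read().decode("utf-8", errors="replace")
    agents = []
    for line in text.splitlines():
        # Strip comments and whitespace, then look for User-agent directives.
        line = line.split("#", 1)[0].strip()
        if line.lower().startswith("user-agent:"):
            agents.append(line.split(":", 1)[1].strip())
    return agents

# Tally user agents across a (hypothetical) list of sites.
counts = Counter()
for site in ["https://example.com/robots.txt"]:
    counts.update(user_agents_from_robots_txt(site))
print(counts.most_common(10))
```

At web scale the same tallying idea applies, only with a dedicated crawler and a proper data store in place of this single-file loop.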
A range of free tools and features are planned for the OpenRobotsTXT site. Once we have launched the dedicated crawler, further updates will provide searchable archives, richer statistics, and greater insight into the world of robots.txt.
Read more at openrobotstxt.org