As it is Christmas Day, Majestic SEO is releasing data on the top one million websites as a downloadable CSV file under a Creative Commons ShareAlike license, allowing web users to create derived works and research (subject to attribution). The files are available at the end of this post.

Majestic SEO launched the Majestic Million on May the 19th. It has caused ripples of interest from time to time and has found a nice niche powering Buzz League Tables.

 

We have altered the algorithm behind the Majestic Million: the list is now generated from the referring C-subnet count rather than the domain count. This has shifted the top ten and increased the number of well-known domains in the Majestic Million.
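To illustrate what a referring C-subnet count measures, here is a minimal Python sketch. It assumes a C-subnet means the /24 block an IPv4 address sits in (its first three octets). The function name and the sample addresses are hypothetical, and this is not Majestic's actual pipeline.

```python
import ipaddress

def count_referring_c_subnets(ip_addresses):
    """Count the distinct /24 blocks among a list of IPv4 addresses."""
    subnets = set()
    for ip in ip_addresses:
        # strict=False accepts a host address and returns its enclosing /24
        subnets.add(ipaddress.ip_network(f"{ip}/24", strict=False))
    return len(subnets)

# Hypothetical linking hosts, taken from the reserved documentation ranges
referring_ips = ["192.0.2.10", "192.0.2.77", "198.51.100.5", "203.0.113.9"]

# Four addresses, but the first two share a /24, so three subnets are counted
print(count_referring_c_subnets(referring_ips))
```

Counting this way means many links from hosts on one network block weigh the same as a single link from that block.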

Today, though, we thought we would do something different. Majestic has a long history of making its data publicly accessible, and we would like to think that this has bought us a certain amount of goodwill in the wider internet community. So we have a surprise gift for the internet analytics community (and, who knows, perhaps some statisticians too): we are making a snapshot of the entire Majestic Million list available to download.

As a sanity check, we ran a couple of plots using the statistical computing package R:

A graph of referring C-subnet count against Majestic Million Rank:

 

Again, but just for the top 250:

And a graph of the referring IP address count against the C-subnet count:

We would love to hear about any conclusions you come to using the data, so what are you waiting for? The data is downloadable in Excel or TXT format below:

[Download the Excel file here. NB: 1,000,000 records in an Excel file is 60 MB. You need a modern version of Excel. Save to disk first.]

[Download the full file here. This is the 25 MB .TXT file, zipped, tab-delimited and much smaller, but it is still a million lines of data!]
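If you would rather run your own sanity check on the unzipped tab-delimited file, a sketch along the following lines may help. The column names `GlobalRank`, `RefSubNets` and `RefIPs` are assumptions, so check the header row of the file you downloaded. The ordering check also assumes the list is sorted purely by referring C-subnet count. The sample rows are invented for illustration.

```python
import csv
import io

def check_rows(tsv_text, rank_col="GlobalRank",
               subnet_col="RefSubNets", ip_col="RefIPs"):
    """Return a list of problems found in tab-delimited ranking data."""
    problems = []
    previous_subnets = None
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        rank = int(row[rank_col])
        subnets = int(row[subnet_col])
        ips = int(row[ip_col])
        # Each referring subnet holds at least one referring IP address
        if ips < subnets:
            problems.append(f"rank {rank}: fewer IPs than subnets")
        # Assumed ordering: subnet counts should never rise down the list
        if previous_subnets is not None and subnets > previous_subnets:
            problems.append(f"rank {rank}: subnet count rises")
        previous_subnets = subnets
    return problems

# Invented sample with two deliberate faults (ranks 2 and 3)
sample = (
    "GlobalRank\tDomain\tRefSubNets\tRefIPs\n"
    "1\texample.com\t300\t450\n"
    "2\texample.org\t250\t240\n"
    "3\texample.net\t260\t400\n"
)

for problem in check_rows(sample):
    print(problem)
```

For the real file, read its contents with `open(path, newline="", encoding="utf-8").read()` and pass the text to `check_rows`; the encoding is also an assumption.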

Creative Commons License
This Majestic Million Data is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.

 

If you would like to use this data or re-release it, you should reference Majestic SEO as follows, providing a link to this blog post should the medium support it:

Backlink Data sourced from the MajesticSEO.com public release of Majestic Millions Dataset – generated on 22nd December 2011
