Today we are releasing, under Creative Commons, a list of the top 100,000 websites broken down by the country in which they are hosted.
Francois, our French ambassador, was working on some research for another post and asked if we could get him a list of the top 1,000 websites hosted in France from our Majestic Million list. The main list, which updates every day, lets you sort by TLD but does not have a full location lookup, so there was no EASY way to get this.
Of course, we never do EASY stuff at Majestic, so Chris, our Network Administrator, started at number 1 in our Majestic Million list and set up a routine to look up the hosted location of each site before moving on to item 2… The plan was to carry on until we had 1,000 French-hosted sites in the list.
Then Chris left his PC running and went home…
The next morning the program was still running. We had well over 3,000 French-hosted websites in the list… and 150,000+ sites in all. So we decided to give the first 100,000 or so away for you to play with.
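For the curious, a routine like the one Chris left running might look something like the sketch below. This is not the actual code we used, only an illustration of the idea: walk the list in rank order, resolve each domain to an IP, look up a country for that IP, and stop once enough sites from the target country have turned up. The names `locate_sites`, `resolve_ip` and `country_of_ip` are made up for this example, and `country_of_ip` stands in for whatever geo-IP lookup you have available.

```python
import socket
from typing import Callable, Iterable, Iterator, Optional, Tuple


def resolve_ip(domain: str) -> Optional[str]:
    """Return an IPv4 address for the domain, or None if it does not resolve."""
    try:
        return socket.gethostbyname(domain)
    except (OSError, UnicodeError):
        # DNS failures raise OSError subclasses; odd hostnames can raise UnicodeError.
        return None


def locate_sites(
    ranked_domains: Iterable[str],
    country_of_ip: Callable[[str], Optional[str]],  # hypothetical geo-IP lookup
    target_country: str = "FR",
    target_count: int = 1000,
    resolver: Callable[[str], Optional[str]] = resolve_ip,
) -> Iterator[Tuple[int, str, Optional[str], Optional[str]]]:
    """Yield (rank, domain, ip, country) in rank order.

    Stops after `target_count` sites hosted in `target_country` have been seen.
    Domains that do not resolve are still yielded, with ip and country set to None.
    """
    found = 0
    for rank, domain in enumerate(ranked_domains, start=1):
        ip = resolver(domain)
        country = country_of_ip(ip) if ip else None
        yield rank, domain, ip, country
        if country == target_country:
            found += 1
            if found >= target_count:
                return
```

Because the function is a generator, every site it passes on the way to the target gets recorded too, which is how a hunt for 1,000 French sites can end up producing a by-country list for everything.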
The data in the CSV file
GlobalRank – This is the global rank of the website in our Majestic Million – starting at the biggest (Google.com) at position #1
TldRank – This is the rank of the site in relation to all sites in the list with the same Top Level Domain suffix (.com /.net /.fr etc.)
Domain – The domain itself
TLD – We left the Top Level Domain suffix itself as a column for easy sorting
RefSubNets – The Referring “C subnets” – explained in our glossary of terms. This is how we sort the Majestic Million list.
RefIPs – The number of unique IP numbers linking to the domain
IDN_Domain – Usually the same as the “Domain” column, but there are some occasions when this differs because these websites are not in a character set that most of us recognise. For example: http://つけまつげ.net/ does not translate well into a CSV file which is ASCII only!
IPAddress – The IP number we found the domain on
GeoCountry – The really new bit… where the domain is hosted according to our best guess.
It is not perfect, because many sites use what is called a “CDN” (content delivery network), which helps distribute large sites over several servers, potentially hosted in several countries. In this case it is pure chance which one we saw, or we saw the one that a search from the UK would have seen. However, most CDNs are all in the same country, so this error will be pretty small. Also, in some cases we were not able to resolve the IP number for whatever reason. In these instances, and in a few others, we have not reported a country code. I therefore actually put 103,000 in the list so there are a few more for you just in case.
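If you want to pull out one country's sites once you have the file, something along these lines should do it. It is a minimal sketch that assumes the CSV has a header row using the column names listed above and that GeoCountry holds a short country code such as "FR". Both are assumptions, so check them against the file itself. The function name `top_sites_for_country` and the file name in the usage comment are made up for the example.

```python
import csv
from typing import Dict, Iterable, List


def top_sites_for_country(
    lines: Iterable[str], country: str, limit: int = 1000
) -> List[Dict[str, str]]:
    """Return up to `limit` rows hosted in `country`, ordered by GlobalRank.

    `lines` is any iterable of CSV text lines, such as an open file.
    Rows with no country code (unresolved IPs and the like) are skipped.
    """
    wanted = country.strip().upper()
    matches = []
    for row in csv.DictReader(lines):
        code = (row.get("GeoCountry") or "").strip().upper()
        if code == wanted:
            matches.append(row)
    # Sort numerically so the result is in rank order whatever the file order is.
    matches.sort(key=lambda r: int(r["GlobalRank"]))
    return matches[:limit]


# Usage, with a hypothetical file name:
# with open("top_sites_by_country.csv", newline="", encoding="utf-8") as f:
#     for row in top_sites_for_country(f, "FR"):
#         print(row["GlobalRank"], row["Domain"], row["IPAddress"])
```

Skipping the blank country codes is the reason for the extra 3,000 or so rows: after filtering you should still have roughly 100,000 usable entries.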
The data is free. It’s in a CSV download…
To Download the Top Websites by Country