We thought we would have some fun. Attached is a file showing the top 50,000 backlinks that we have to Google.com’s root domain. Feel free to play with this data. Here are some interesting facts about Google’s own backlink profile in Majestic SEO’s database.
Some observations about Google’s Backlinks
1: All but 1695 were crawled by us in December or January. That’s about 97%.
2: Many of Google’s strongest links by ACRank are from widgets, in particular those pulling in newsfeeds. For example, planet.wordpress.org uses Feedproxy to 301 redirect users to inner stories with links like: http://feedproxy.google.com/~r/weblogtoolscollection/UXMP/~3/k5JSiI6IzXI/ . These of course would not help Google, as they redirect straight out of Google. But this might set up the question: “If you have a link from feedproxy.google.com (something we can all do), is this worth more than a link directly within your site?” Majestic SEO is not the place to say what that means in algorithmic terms, but it may be the best place to start if you want to set up some experiments of your own.
3: Cnet, WashingtonPost and Chicago Tribune are all linking to Google from their home pages (or at least they were when we checked).
4: If you take the file and start to analyze it, you will see that one page can typically link many times to Google in many different ways (and to many different subdomains). This demonstrates the need for care when deciding “which links count” (whatever you take the word “count” to mean).
5: I was surprised to see “mysimon.com” so near to the top of the list. I had never heard of “mysimon”.
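If you want to check observations 1 and 4 for yourself, here is a rough Python sketch. The column names `source_url` and `target_url` and the sample rows are made up for illustration; the real file may label its columns differently, so adjust to whatever header you find.

```python
from collections import Counter
from urllib.parse import urlsplit


def crawled_recently_share(total_links: int, older_links: int) -> float:
    """Percentage of links that were NOT in the 'older crawl' group."""
    return 100.0 * (total_links - older_links) / total_links


def links_per_source(rows):
    """Count how many rows (links) each source page contributes."""
    return Counter(row["source_url"] for row in rows)


def target_hosts_per_source(rows):
    """Map each source page to the set of target hostnames it links to."""
    hosts = {}
    for row in rows:
        host = urlsplit(row["target_url"]).hostname
        hosts.setdefault(row["source_url"], set()).add(host)
    return hosts


# Hypothetical rows, standing in for lines of the real CSV.
sample = [
    {"source_url": "http://example.com/", "target_url": "http://www.google.com/"},
    {"source_url": "http://example.com/", "target_url": "http://maps.google.com/"},
    {"source_url": "http://example.com/", "target_url": "http://www.google.com/search?q=seo"},
    {"source_url": "http://example.org/news", "target_url": "http://feedproxy.google.com/~r/feed"},
]

# Observation 1: all but 1,695 of 50,000 links were crawled recently.
print(round(crawled_recently_share(50_000, 1_695), 1))

# Observation 4: one page, several links, several subdomains.
print(links_per_source(sample).most_common(1))
print(target_hosts_per_source(sample)["http://example.com/"])
```

Whether three links from one page should “count” as three, one, or something in between is exactly the judgment call described in point 4; the sketch only makes the duplication visible.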
Getting this data within MajesticSEO is not hard. It is the equivalent of a “standard” report in the system, although the actual number of links you get will depend on your subscription level. So at £10 a month (about US$16) you would be able to get 10 such reports, but they would only have the top 5,000 backlinks for Google.com. You can make this report go a LOT further, though, by running the same report for the subdomain (www.google.com) or indeed the home page itself. Platinum subscriptions go to a depth of 20,000, although you can get subscriptions that go 50,000 deep in the standard reports by agreement.
Why “Standard” is better than “Advanced”
In theory, the advanced report will return ALL the links we have to Google.com. I am telling you here and now that you do not have a large enough subscription to run such a report, and doing so would require us to prepare our servers for the ensuing onslaught as we search through 25,492,660,101 links to 3,666,643,870 URLs on Google.com that we have indexed via 3,180,909 subdomains. That’s an awful lot of data! But that’s why we created the standard report in the first place. It’s SO much more efficient.
Now, this is 50,000 backlinks as we see them raw in our dataset, sorted by ACRank. Our definition of “strongest” is quite specific, so please don’t go saying “But I know of many stronger links than these”. Any data set of this size will have some unusual anomalies. Imagine the anomalies in the other 25 billion 492-odd million that we know of!
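To put those index figures in proportion, here is a small back-of-the-envelope calculation in Python. It only uses the numbers quoted above; the subdomain count is read as 3,180,909, which is an assumption about how that figure was meant to be punctuated.

```python
# Figures quoted in the post for Google.com in the index.
TOTAL_LINKS = 25_492_660_101
TOTAL_URLS = 3_666_643_870
SUBDOMAINS = 3_180_909  # assumption: the post's subdomain figure read as 3,180,909
SAMPLE_SIZE = 50_000    # size of the attached standard report

# Average number of indexed links pointing at each Google.com URL.
links_per_url = TOTAL_LINKS / TOTAL_URLS

# Average number of indexed URLs per Google.com subdomain.
urls_per_subdomain = TOTAL_URLS / SUBDOMAINS

# Fraction of all known links that the 50,000-row file represents.
sample_share = SAMPLE_SIZE / TOTAL_LINKS

print(f"links per URL:      {links_per_url:.2f}")
print(f"URLs per subdomain: {urls_per_subdomain:.1f}")
print(f"sample share:       {sample_share:.6%}")
```

The last line is the one to keep in mind: the attached file is a tiny slice of the full set, chosen by ACRank rather than at random.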
Enjoy.
Attachment: google_com_top50k_backlinks_Jan_2011.csv (Gzip format).
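If you would rather not unzip the file by hand, Python’s standard library can read a gzipped CSV directly. The sketch below is only an illustration: it assumes the file is UTF-8 and has a header row, and the demo file and its column names (`source_url`, `anchor_text`, `target_url`) are invented stand-ins, not the real layout of the attachment.

```python
import csv
import gzip
import os
import tempfile


def read_gzipped_csv(path, limit=None, encoding="utf-8"):
    """Return the header row and up to `limit` data rows from a gzipped CSV."""
    with gzip.open(path, mode="rt", encoding=encoding, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        rows = []
        for row in reader:
            if limit is not None and len(rows) >= limit:
                break
            rows.append(row)
    return header, rows


# Build a tiny stand-in file so the example runs without the real download.
demo_path = os.path.join(tempfile.mkdtemp(), "demo_backlinks.csv.gz")
with gzip.open(demo_path, mode="wt", encoding="utf-8", newline="") as handle:
    writer = csv.writer(handle)
    writer.writerow(["source_url", "anchor_text", "target_url"])  # hypothetical columns
    writer.writerow(["http://example.com/", "Google", "http://www.google.com/"])
    writer.writerow(["http://example.org/", "search", "http://www.google.com/"])

# Peek at the header and the first data row only.
header, rows = read_gzipped_csv(demo_path, limit=1)
print(header)
print(rows)
```

To try it on the download, point `read_gzipped_csv` at the path of the attached file and start with a small `limit` so you can inspect the real header before loading all 50,000 rows.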
Thanks, 50000 is more than enough to begin with.
January 25, 2011 at 9:01 am
It is amazing to think that one website has so many links, and even more impressive that you can ‘just’ go get the data. Great work and keep it up!
January 25, 2011 at 4:42 pm
I found your site from UniqueArticleWizard and just going through the list now… thanks for that… I did have trouble opening it initially.
Can you explain why you only see raw links and not anchor text?
Thanks
January 26, 2011 at 8:18 am
Hi,
Anchor text is part of the data – it’s not just backlinks.
Alex
January 26, 2011 at 12:05 pm
Hi
What free products do you recommend to unzip these files? I’ve tried a few and they all had issues.
thanks
February 1, 2011 at 10:28 am
Hi Garry,
You can use this free product to unzip files: http://www.7-zip.org/
Alex
February 1, 2011 at 1:59 pm
50,000 will keep me busy for a while. That’s a ton of data.
February 7, 2011 at 6:43 pm
If you have a subscription with 50,000 backlinks, are these the strongest 50,000 or just the first/last 50,000 in the collection? IOW, you could miss out seeing the strongest links if they were past 50k.
March 8, 2011 at 4:21 am
Hi Ash,
March 8, 2011 at 10:21 am
They are the strongest. I should clarify that they are the strongest as defined by ACRANK which – whilst not as sophisticated as the search algorithm definitions – is at least transparent!
Dixon.