We’ve just updated our index and wanted to clarify a few numbers that tend to fly around. We do have a VERY large index.

So where are we now?

We now know of about 1.8 trillion URLs on the Internet.

Of those URLs, we have crawled 202 billion unique pages of data. Nearly 10 billion of these were crawled since our last update less than a month ago.

That’s 153 million root domains and 646 million subdomains. In fact, it would be possible to cite much larger numbers than this, because many subdomains and root domains can be identified on the web but don’t actually resolve. So I’m just not counting those.

To put this into a bit of perspective, we are now crawling more than 10 terabytes of data a day, which generally works out to around 40 million URLs a day.
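As a rough back-of-the-envelope check, the two daily figures above imply an average amount of data fetched per URL. The sketch below is only illustrative: it assumes decimal terabytes (10^12 bytes) and treats "more than 10" and "around 40 million" as exact, and the function name is made up for this example.

```python
# Figures quoted in the post; decimal terabytes are an assumption.
BYTES_PER_TERABYTE = 10**12
daily_bytes = 10 * BYTES_PER_TERABYTE  # "more than 10 terabytes of data a day"
daily_urls = 40_000_000                # "around 40 million URLs a day"


def average_bytes_per_url(total_bytes: int, url_count: int) -> float:
    """Return the mean number of bytes fetched per URL (hypothetical helper)."""
    return total_bytes / url_count


avg_kb = average_bytes_per_url(daily_bytes, daily_urls) / 1000
print(f"Roughly {avg_kb:.0f} KB fetched per URL")
```

Under those assumptions the average comes out at about 250 KB per URL. Since the post says "more than" 10 terabytes, this is best read as a lower-bound estimate.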

And the Bells and Whistles?

During our last update, we started reporting EDU and GOV link counts at the domain level. What we couldn’t show then, but now can, is the EDU and GOV link counts at the URL level. That’s pretty sweet – check it out yourself in our bulk backlinks checker.

Dixon Jones

Comments

  • Majestic

    Hey,

    why not offer these additional services, you already have the data for them:

    1) Google Analytics ID spy: find out which domains share the same GA ID.
    2) Affiliate link spy: find ALL the affiliates of a given product.

    May 13, 2010 at 10:50 pm
  • Majestic

    How big is your index, compared to Google’s index?

    May 13, 2010 at 10:53 pm
  • Dixon

    Our index is deep enough to be representative, but we do not get fresh links online as quickly as Google, so we are a bit behind. We generally report MORE links than Google, but that is because Google does not give you all its data.

    Both of your suggested features are neat ideas… check this blog in the next week to see how you might follow up on them!

    May 15, 2010 at 11:30 am
