Data distribution from study

We studied results from over 6.5 million domains to find out how many external links it is “normal” for a URL to receive from a given URL on a mainstream website. The answer may surprise you.

For the most part, a URL appears to link out externally to another URL roughly once. Put another way, most pages appear to have, on average, just over one backlink from any given URL.

Let’s have a look at how we came to that figure in a little more detail!

For this study, we looked at two scores across the sample set – the External referring URL count and the External Inbound link count, as shown in the backlink breakdown section of Site Explorer Summary.

Backlink breakdown section from Majestic Site Explorer

We calculated the ratios of these two counts for every domain in the study data set. Let’s see what the numbers tell us.

Beyond average counts and into distributions

We started with around six and a half million records of the top domains in our dataset. We compared the two backlink counts showcased in backlink breakdown in Site Explorer. These are: a calculated sum of all of the backlinks pointing to each domain and a calculated sum of all of the URLs which have at least one backlink. In Majestic terms, we have a ratio for each domain that measures the impact of supplemental links on the dataset.
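As a minimal sketch of the ratio described above, the snippet below divides a domain's external backlink count by its external referring URL count. The function name and the sample counts are hypothetical and are not taken from the study or from any Majestic API.

```python
def links_per_url(external_links: int, external_urls: int) -> float:
    """Return the average number of external backlinks per external referring URL."""
    if external_urls <= 0:
        raise ValueError("a domain needs at least one referring URL")
    if external_links < external_urls:
        # Every referring URL contributes at least one link, so the ratio cannot fall below 1.
        raise ValueError("link count cannot be lower than referring URL count")
    return external_links / external_urls


# Hypothetical (backlinks, referring URLs) pairs, for illustration only.
sample_domains = {
    "example.com": (1030, 1000),
    "example.org": (250, 250),
}

ratios = {
    name: links_per_url(links, urls)
    for name, (links, urls) in sample_domains.items()
}
```

A ratio of exactly 1 means that no referring URL links to the domain more than once. Anything above 1 reflects supplemental links.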

We can then do some maths on the millions of ratios calculated to try to find out what a normal amount of supplemental backlinks is for a given domain.

Before we look at the data, let’s quickly take a step back and consider how we present this study.

The dangers of an over-reliance on averages are well documented. We therefore wanted to look at the distribution of ratios across the sites tested. A common way of illustrating this is to produce a box plot which looks at the median, and the higher and lower quartile bounds. A quartile is just a median average for the higher and lower halves of the data set.
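To illustrate why the median and quartiles are more informative than the mean for skewed data, here is a small sketch using Python's standard `statistics` module. The sample ratios are invented for illustration, and the "inclusive" quartile method is only one of several common conventions, so it may not match the one used to draw the plots in this study.

```python
import statistics


def five_number_summary(ratios):
    """Return the minimum, quartiles and maximum that a box plot is drawn from."""
    ordered = sorted(ratios)
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "min": ordered[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": ordered[-1],
    }


# Hypothetical ratios: most sit close to 1, with one extreme outlier.
sample = [1.0, 1.0, 1.0, 1.0, 1.01, 1.02, 1.05, 1.3, 40.0]

summary = five_number_summary(sample)
mean = statistics.mean(sample)
```

In this made-up sample the single outlier drags the mean above the upper quartile, while the median stays close to 1.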

Box / Scatter plot showing outliers

Unfortunately, this data is so skewed by outliers that the box in the box and scatter plot looks flat. We will zoom in, in a moment, but for now, let’s focus on some of the outliers.

Ratio     Number of Domains
>5        3014
>10       23
>25       9
>50       2
>100      1
Outliers found in study

The number of domains with a ratio of greater than five was 3014. 3014 may feel significant, but it’s important to take into account the size of the dataset. In the scale of this study, 3014 represents less than 0.05% – less than a twentieth of a single percent of the data.
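The arithmetic behind that percentage can be checked quickly. The study only states that the dataset holds "over 6.5 million" domains, so the round figure used below is an assumption. A larger true figure would make the share smaller still.

```python
# Assumption: the dataset size is rounded down to 6.5 million domains.
DATASET_SIZE = 6_500_000

# Number of domains with a ratio greater than five, as reported in the table.
outliers_above_5 = 3014

share_percent = outliers_above_5 / DATASET_SIZE * 100
print(f"{share_percent:.3f}% of domains have a ratio above 5")
```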

Having performed a quick manual analysis on some of these outliers, it appears that some of the high counts are due to a phenomenon called “TLD Correction”. For some sites, where a great many subdomains were observed, Majestic includes links to subdomains in the TLD count. These records are annotated with a “TLD Correction” redirect and tend to be quite obvious in Site Explorer. Link level and subdomain level analysis are not impacted.

Distribution of URL:Link relationships where ratio < 10

Zooming in, we see a tighter distribution of ratios, with a flat line where the box plot should be. There still seems to be a lot of noise. We can look at the ratios a different way to see where the majority of data lies.

Percentage of Ratios     Maximum Ratio (backlinks per URL)
90% of ratios            1.28
95% of ratios            1.51
99% of ratios            1.99
Illustration of ranges of the data ( minimum: 1 )

The vast majority of domains in our dataset had fewer than 2 backlinks per URL, with a substantial number having far fewer.
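One way to read figures like these is as the share of domains whose ratio sits at or below a given limit. The sketch below shows that calculation on a made-up sample that was built to mirror the three thresholds in the table. It is not the study data, and the function name is hypothetical.

```python
def share_at_or_below(ratios, limit):
    """Return the fraction of ratios that are less than or equal to limit."""
    if not ratios:
        raise ValueError("ratios must not be empty")
    return sum(1 for ratio in ratios if ratio <= limit) / len(ratios)


# Hypothetical sample of 100 domain ratios, constructed for illustration.
sample = [1.0] * 90 + [1.3] * 5 + [1.6] * 4 + [12.0]

for limit in (1.28, 1.51, 1.99):
    print(f"ratio <= {limit}: {share_at_or_below(sample, limit):.0%} of domains")
```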

This doesn’t mean it’s wrong to see more than one link to the same URL on the web. The data suggests that multiple links appear to be a relatively uncommon practice. Pages with multiple links to the same URL are vastly outweighed by all the pages with only one link.

Having established the lowest value and a sensible limit to illustrate a box plot, we can now expand out the orange bar from the above two diagrams to illustrate the lower quartile, median and higher quartile of the dataset.

The box plot shows that for 75% of domains sampled, the ratio of reported external referring URLs to external referring links on the web lies between 1:1 and about 1:1.03. Ignoring the outliers described above, a box plot normally shows the range between the lower and higher quartiles. In this instance the lower bound is 1, the same as the minimum value. This is because a URL cannot link to another URL without at least one link.

For many of the domains tested, the impact of supplemental links appears negligible. From the box plot, it would appear that for many domains, the proportion of URLs with more than one external link to a given resource is in the region of 1 in 100.

How the totals and ratios are calculated

Majestic crawls the web on a massive scale. As well as crawling the web, Majestic analyses the data returned. As well as Trust Flow and Citation Flow, Majestic produces a dizzying number of counts.

The two totals used for this study are:

  1. The total number of external referring URLs
  2. The total number of external referring links ( There will be at least one referring link for every referring URL ).

External Links and External URLs are links from sites other than the ones hosting the page in question.

The totals are calculated on a domain level basis. For each domain, first of all external backlinks are calculated at URL and link level. These numbers are then summed up to generate domain level totals.

As an extreme example: if you had one referring domain with 400 URLs, and each of these URLs had 2 backlinks to your site, this would be reported as 400 URL level relationships and 800 backlinks. While this may feel odd, it’s important to note that raw backlink counts are rarely used in isolation. In this case, the referring domain count would be very low (only 1), presenting a red flag for an experienced SEO to uncover the unusual linking pattern. This way of analysing the data also means that for this domain, the URL:Link ratio would be “1:2”, providing a very strong signal which emphasises the validity of the approach and the benefit of offering multiple data points. These valuable signals are at risk if more complex calculations are performed to try to get better-feeling numbers at URL level.
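The extreme example can be reproduced with a few lines of arithmetic. The sketch below builds a hypothetical set of 400 referring URLs with 2 links each and sums them into domain level totals. The URLs and variable names are placeholders for illustration.

```python
# One hypothetical referring domain: 400 URLs, each with 2 links to the target site.
links_per_referring_url = {
    f"https://referrer.example/page-{i}": 2 for i in range(400)
}

referring_urls = len(links_per_referring_url)      # URL level relationships
backlinks = sum(links_per_referring_url.values())  # link level total
ratio = backlinks / referring_urls                 # links per referring URL

print(f"{referring_urls} referring URLs, {backlinks} backlinks, ratio 1:{ratio:g}")
```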

It is important to note that the counts used in this study are from the Fresh Index and include low quality links, parameterised URLs and deleted links. It may be that revisions in how referring URLs and Links are qualified might impact the findings of this study.

What does this mean?

Will this study change your strategy? We would like to think you will validate the findings before acting!

Practicalities aside, we recognise that the appreciation of signals and their use to inform decision making is an important part of SEO.

How many of you enjoy greenfield projects? If you inherit work from a different specialist or agency, do you perform an assessment on the work that has gone before in order to understand what work is needed in the future?

Even though interest around disavowing links isn’t quite what it once was, we appreciate the benefit of an SEO audit.

One aspect of an audit is to prioritise work. That prioritisation could focus on areas in which a site could do better, or trigger further research where the site’s KPIs differ from the perceived norm in its niche.

Could understanding how many links a client has per URL play a part in this process?

UX is a ranking factor. UX research and consulting firm the Nielsen Norman Group advises that redundant links contribute to cognitive overload and should be reduced.

This study appears to suggest that for external links at least, this recommendation reflects common practice.

Why may a page link to the same page more than once?

This study shows that multiple links to the same resource are uncommon, but not so rare as to be unheard of. We’ve included three examples which illustrate when multiple links could be present for the same referring URL.

Reason #1: The referring page simply has a few outgoing links

In some circumstances multiple links enhance the utility of content. For example, in “Color Craft & Counterpoint: A Designer’s Life with Color Vision Deficiency”, Noah Glushien links to the same resource about colour blindness ( https://www.colour-blindness.com/general/how-it-works-science/ ) in the body of the text, and in a resources section at the bottom of the page:   

In content link to resource
The same link target in a resources section at the footer of the article.

In the content highlighted above, links are used as a form of “academic citation” within the body of the article. These editorial style links are complemented by a list of important related content emphasised at the bottom.

If you check this site in Majestic Site Explorer, you’ll find just one link at URL level. However, the other link, and associated context is available via the Raw Export feature in the “Actions” menu.

Related to reason one, it’s not too uncommon to see an image link and a text link in the same document. Product shortlisting site welikedthis.co.uk features products in what appears to be a template. The template incorporates a featured image, buy button and title all linking through to the same URL.

Example of multiple link types from one URL to another

In “Full Extract” Majestic reports these different link types as multiple backlinks between the same URLs.

Note: In some circumstances, Majestic can report multiple links when an anchor tag contains both an image and text content. This is because Majestic adjusts the link to report two backlinks from the same URL – one for the image, one for the text content.
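To see how a product template can produce several links from one page to the same target, the sketch below counts anchor tags per target URL using Python's standard `html.parser`. The markup and URLs are invented, and this is not how Majestic's crawler works. In particular, it counts one link per anchor tag and does not split an anchor containing both an image and text into two backlinks.

```python
from collections import Counter
from html.parser import HTMLParser


class LinkCounter(HTMLParser):
    """Count how many anchor tags point at each href on a page."""

    def __init__(self):
        super().__init__()
        self.targets = Counter()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.targets[href] += 1


def count_link_targets(html: str) -> Counter:
    parser = LinkCounter()
    parser.feed(html)
    parser.close()
    return parser.targets


# Hypothetical product template: image, title and buy button share one target.
page = """
<div class="product">
  <a href="https://shop.example/item"><img src="item.jpg" alt="Item"></a>
  <a href="https://shop.example/item">Item title</a>
  <a href="https://shop.example/item">Buy now</a>
</div>
"""

counts = count_link_targets(page)
```

Here the page is a single referring URL, but it contributes three links to the same target.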

Reason #3: How Majestic reports redirects and canonicals

For some links, Majestic stores more than one copy of the same link, so as to include both forms of redirects. Majestic treats canonical tags as a form of soft redirects, as illustrated below.

To bring you the most accurate results, Majestic reports on both the original and canonicalised form of the link in Site Explorer, giving you the choice of which form of the data is most important to you.

Canonical “Redirect” at domain level for twitter handle

This example is quite common. Twitter canonicalises handles to lowercase, but sites tend to use mixed-case handles to enhance clarity or brand strength.

In this example, YouTube links to twitter.com/YouTube; however, Twitter canonicalises that form to twitter.com/youtube.
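As a rough sketch of how one link can surface in two forms, the snippet below lowercases the path of a handle URL and returns both the original and the canonical form when they differ. The helper names and the `https://` scheme are assumptions made for illustration. The lowercasing rule simply mirrors the behaviour described above and is not a statement of how Majestic stores links.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_handle_url(url: str) -> str:
    """Lowercase the path of a handle URL, mimicking the canonical form described above."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(path=parts.path.lower()))


def reported_forms(url: str) -> list[str]:
    """Return the original link, plus its canonical form when the two differ."""
    canonical = canonical_handle_url(url)
    return [url] if canonical == url else [url, canonical]


forms = reported_forms("https://twitter.com/YouTube")
```

With this input, `forms` holds two entries for what a reader would consider a single link, which is one way redirects and canonicals can add to link totals.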

The contribution of redirects and canonicalised forms of the link to the totals may be worthy of further consideration and is an area where different backlink checkers may take a different approach. Areas where redirects may impact link counts include site migrations and the adoption of an “https” form of a site over an existing “http” site.

Majestic