Research into what we might consider “Authority linking” reveals that the very concept might be anathema in modern large-scale data retrieval systems. Even though linking remains a significant ranking consideration, there are paradoxes for the SEO specialist to explore. This article recaps research carried out over several years and offers new definitions of what an “Authority Link” might be.
This week, at Pubcon we discuss “Authority Linking”. This post considers how the research has changed over the years and suggests some new ways to look at what Google and (independently) Majestic might infer from “linking”.
A search (from a US-based IP address) for the phrase “authority link definition” might reasonably be expected to surface modern research on what constitutes “Authority” when links are used as a ranking factor on the Internet. In fact, the results returned are a long way from this Utopia.
The results as of October 2015 broadly break down into two groups: “Hubs” (results from sites that aim to explain many concepts and act as a bridge for users to other authority sources) and “Authorities” (sites or pages which might themselves be deemed a prevailing authority). This search result behaviour is entirely consistent with research on Hubs & Authorities carried out at Cornell University in 1999 by Jon Kleinberg(1). The results offered by Google are by no means satisfactory for the searcher. The first “Authority” result is from Moz.com, but is in fact a definition of “Page Authority”, a metric specific to Moz, not “Link Authority”. Other results from what might be considered authority sources include Julie Joyce in 2012 on Search Engine Watch, Chris Brogan of the NY Times in 2008 and Webmaster World in 2005!(2)
These results are for very aged pages in Internet terms. Has link research not moved on enough since 2005 to usurp these results in Google search?
In a further attempt to find a valid definition of “Authority”, a search was restricted to results from Google.com only. The premise for the search was that if “authority” was interpreted as “Links that Google considers authoritative”, then Google itself should be the prime source for a definitive answer. In fact the results were flooded by feedproxy (third party) results, with only one first page result written by Google itself. Google’s Link Schemes guidance(3) states:
• “Any links intended to manipulate PageRank or a site’s ranking in Google search results may be considered part of a link scheme…”
• “…and a violation of Google’s Webmaster Guidelines…”
• “This includes any behaviour that manipulates links to your site or outgoing links from your site.”
This suggests that Google chooses not to accept the notion of “Link Authority”, and perhaps implies that those who take action to develop authority links risk being penalized. At the same time, the fact that the warnings exist suggests that Google still very much uses links within its algorithms, and therefore links must remain a relevant and significant ranking factor.
This raises the question of whether research carried out since the Cornell studies (Kleinberg) has moved in a direction where the term “Link Authority” has itself lost meaning.
A Link from “Authority” Reduces that Authority
One paradox of the HITS theory is also reflected in Majestic’s algorithms. Neither Majestic nor the HITS algorithm/approach seeks to duplicate Google results, but both seek to understand and measure the concept of Authority independently of how Google might come to its own conclusion. In the HITS approach, Authorities are far less likely to link out to other authorities than earlier algorithms had proposed (Garfield’s “Impact Factor” et al)(4). Instead, Hubs are likely to link consistently to the Authority. The Trust Flow algorithm(7) (Majestic 2014) also considers a link to a web page to be a signal of Trust, but only insofar as the incoming link itself comes from a page considered Trustworthy. In all three methodologies, multiple links from a page reduce the impact of the Authority that is passed through those links.
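The hubs-and-authorities idea can be illustrated with a minimal Python sketch of the iteration Kleinberg described: a page's authority score is the sum of the hub scores of the pages linking to it, and a page's hub score is the sum of the authority scores of the pages it links to. The tiny link graph and the page names here are invented purely for illustration, and this is a teaching sketch rather than a description of how Google or Majestic actually score links.

```python
from math import sqrt


def _normalise(scores):
    """Scale scores so that their squares sum to 1 (avoids runaway growth)."""
    norm = sqrt(sum(v * v for v in scores.values())) or 1.0
    return {page: v / norm for page, v in scores.items()}


def hits(links, iterations=50):
    """links maps each page to the list of pages it links out to.

    Returns (hub_scores, authority_scores) after a fixed number of rounds.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    hub = dict.fromkeys(pages, 1.0)
    auth = dict.fromkeys(pages, 1.0)
    for _ in range(iterations):
        # Authority: sum of the hub scores of every page linking in.
        auth = {page: 0.0 for page in pages}
        for source, targets in links.items():
            for target in targets:
                auth[target] += hub[source]
        # Hub: sum of the authority scores of every page linked to.
        hub = {p: sum(auth[t] for t in links.get(p, [])) for p in pages}
        auth = _normalise(auth)
        hub = _normalise(hub)
    return hub, auth


# Hypothetical graph: three hub pages citing two candidate authorities.
example_links = {
    "hub1": ["authA", "authB"],
    "hub2": ["authA"],
    "hub3": ["authA"],
}
hub, auth = hits(example_links)
for page in sorted(auth, key=auth.get, reverse=True):
    print(page, round(auth[page], 3), round(hub[page], 3))
```

In this toy graph the page cited by all three hubs ends up with the highest authority score, while the authorities themselves, which link to nothing, have a hub score of zero.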
Popular <> Authority
One clear signal that Google does not necessarily treat its own organic results as a definition of “Authority” comes from a video by Matt Cutts (Google, 2014)(5), where he notes that an authority website, for example a legislative body controlling a vertical, may not be a popular site. Such a site would not be a suitable result for a Google search in most circumstances, and the legislative body may not receive many visitors. Nonetheless, the legislative body remains the authority in the vertical.
Modern Research is moving towards Context being increasingly measurable
There now appears to be a divergence in approach between Majestic and Google on the subject of an “Authority Link”, although conveniently there remains enough overlap in findings for Majestic to have a valid signal for identifying pages from which a citation would be valuable for a given basket of web content. Google purchased Freebase in 2010, a massive dataset that sought to record information about entities rather than web pages. They ultimately shelved it in 2015 (apparently in favour of Wikidata). Freebase moved towards identifying entities on the web. A web page itself might be an entity, or perhaps a container for a set of entities. On that page, many other objects might be identified, and a crawl of the page might count instances of those mentions and also evaluate the salience (relevance) of those entities.
The concept of entities manifests itself primarily in the development of Google’s Knowledge Graph, a database of facts, figures and information rather than a pure collection of web pages. It could reasonably be surmised that Google feels it can generally do a better job of being a “hub” than other hubs on the internet, and is using the Knowledge Graph in preference to hubs in its search results. Nevertheless, the Knowledge Graph’s entities would be a natural signal of quality, because the entities generally point (link) to web pages that are the official and unofficial manifestations of that entity. For example, if the entity is a company, the entity’s record might point to the company website and the company’s Twitter profile.
A Google Research article(6) seeks to explain how Google might identify these instances of entities when crawling a web page but also, importantly, relate those instances to the entity entry in the database itself. The article presents the following example:
In this example, many entities have been identified (Manhattan, David Boies and the SEEC), but Tyco International and PriceWaterHouse have been singled out as “Salient” in the article. The Freebase record then identifies these with a specific record about Tyco International, where further details are recorded. So in this instance, a mention of Tyco International and any link therein will be classed as “Salient”, but will be a vote for Tyco, not necessarily a vote for the page that the link points to!
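The idea of counting entity mentions on a page can be sketched very roughly in Python. This is only a naive illustration: it counts how often each entity's surface forms appear in a text and reports each entity's share of all mentions as a crude stand-in for salience. The sample sentence, the entity names and the alias lists are all invented for this sketch, and the share-of-mentions heuristic is an assumption made here, not the method used by Google or described in the research cited above.

```python
import re


def entity_mentions(text, aliases):
    """Count mentions of each entity in text.

    aliases maps an entity name to the surface forms that refer to it.
    """
    counts = {}
    for entity, forms in aliases.items():
        # Try longer forms first so "Acme Widgets" is one mention, not two.
        ordered = sorted(forms, key=len, reverse=True)
        pattern = r"\b(?:" + "|".join(re.escape(f) for f in ordered) + r")\b"
        counts[entity] = len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts


def mention_share(counts):
    """Each entity's share of all counted mentions (a crude salience proxy)."""
    total = sum(counts.values())
    return {e: (c / total if total else 0.0) for e, c in counts.items()}


# Invented sample text and entity list, purely for illustration.
sample_text = (
    "Acme Widgets reported results today. Analysts in Springfield said "
    "Acme beat forecasts, and Acme shares rose."
)
sample_aliases = {
    "Acme Widgets": ["Acme Widgets", "Acme"],
    "Springfield": ["Springfield"],
}

counts = entity_mentions(sample_text, sample_aliases)
shares = mention_share(counts)
print(counts)
print(shares)
```

A real system would also need to resolve each mention to the correct database record, which simple string matching cannot do.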
This article is not suggesting or proposing causality between an entity defined by Google and a link from a page that cites that entity in a salient manner, but it does suggest giving further consideration to the idea that these citations (links) might be considered more relevant because they come from pages containing salient and measured entities.
Regardless of Google’s interpretation of “Authority Link”, Majestic would consider a link authoritative only within the context of relevant topics.
How Majestic seeks to Measure Context
The above research from Google plays to the theory that link building should be more about building relationships than building links. If the entity is recorded in a separate database or table to the web domain to which it pertains, the entity itself gains authority through citation. Any citation that the authority gives is therefore highly valuable, providing the citation is salient (contextually relevant).
Majestic elaborates on an earlier approach to link analysis to define a salient (topical) link, first refining a training set of sites into categories (topics), then adapting an algorithm similar to the Garfield Impact algorithm to pass value ONLY based on the categories (or topics) associated with the page. Unlike the HITS algorithm, which suggests that hubs link to authorities and authorities rarely link, Majestic uses the notion that hubs are not authorities unless and until identified as such, and that any such authority is in context. Further, the links from those pages also dissipate or “flow” authority according to context.
Majestic calls this “Topical Trust Flow” and has trademarked and sought to protect this approach.
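The general idea of trust flowing outwards from categorised seed sites, and being kept separate per topic, can be sketched in Python. To be clear, this is not Majestic's algorithm, which is not published here. It is a simplified illustration under assumptions made for this sketch: each seed page starts with a trust score per topic, a page splits what it passes on equally across its outgoing links, and a hypothetical `decay` factor reduces trust at every hop. All page names, topics and numbers are invented.

```python
from collections import defaultdict


def propagate_topical_trust(links, seeds, rounds=3, decay=0.85):
    """Push per-topic trust from seed pages along links.

    links maps a page to the pages it links to.
    seeds maps a seed page to {topic: starting trust}.
    Returns page -> {topic: accumulated trust}.
    """
    trust = {page: dict(topics) for page, topics in seeds.items()}
    # Trust that arrived in the previous round and has yet to be passed on.
    frontier = {page: dict(topics) for page, topics in seeds.items()}
    for _ in range(rounds):
        incoming = defaultdict(lambda: defaultdict(float))
        for page, topics in frontier.items():
            targets = links.get(page, [])
            if not targets:
                continue
            for topic, score in topics.items():
                # More outgoing links means less trust passed per link.
                share = decay * score / len(targets)
                for target in targets:
                    incoming[target][topic] += share
        for page, topics in incoming.items():
            page_trust = trust.setdefault(page, {})
            for topic, score in topics.items():
                page_trust[topic] = page_trust.get(topic, 0.0) + score
        frontier = incoming
    return trust


# Invented example: two categorised seed sites and three other pages.
example_links = {
    "health-seed": ["clinic", "blog"],
    "sports-seed": ["blog"],
    "blog": ["shop"],
}
example_seeds = {
    "health-seed": {"Health": 1.0},
    "sports-seed": {"Sports": 1.0},
}

trust = propagate_topical_trust(example_links, example_seeds)
for page, topics in sorted(trust.items()):
    print(page, {topic: round(score, 4) for topic, score in topics.items()})
```

In this toy example the "blog" page picks up trust in both topics, but more in Sports, because the sports seed links only to it while the health seed splits its trust across two links.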
A Suggested definition of an “Authority Link”
Whilst Google may choose not to elaborate on the concept of an “authoritative link”, it has regularly acknowledged that links remain a core element of its understanding of the Internet, including in the video acknowledgements from Matt Cutts. Majestic can therefore put forward only two definitions of an authority link. With confidence, it can offer a definition of an authority link in the eyes of Majestic:
“An authority link is a legitimate citation (link) from a topically relevant web page which has, itself, considerable Topical Trust Flow in the relevant topic. That Trust is a composite of both external links to the page and internal links to the page”.
A second, less confident definition of an Authority Link from Google’s perspective can only remain conjecture, but is offered for consideration:
“Salient links to entities as recognized by Google will increase the authority of those entities, which in turn can speak (link) as an authority to salient content on the web.”
This article comments on and critically evaluates existing research rather than offering further primary research. It suggests:
• Authority <> popularity
• Google uses entities, and links to and mentions of those entities may ultimately impact rankings when those entities link out to semantically similar pages
• Topical Trust Flow is a resource independent of Google for understanding link authority in context
See These Ideas Presented Live at Pubcon 2015
If you are at Pubcon Las Vegas 2015, please join Dixon Jones either on Wednesday on the subject of Authority Linking (Non-Majestic content, main track) or on Tuesday on the subject of “Mastering Majestic” (Sponsored Track).
Here are the slides from that conference.
Statement and References
Some of the views expressed in this article are the opinions of the author acting in a personal capacity. Whilst the author acts as a legal representative of Majestic, the observations in this article in particular should not be seen as official Majestic policy or opinion.
1: Jon Kleinberg, Cornell, Hubs and Authorities
2: “Link Authority Definition” as a search term on Google (Transient link)
3: Link Schemes. Google Webmaster Help
4: Garfield’s “Impact Factor”
5: Matt Cutts (Google, 2014) (Search Engine Round Table)
6: Google Research (PDF download)
7: Topical Trust Flow (Majestic 2014)