What can backlinks tell us about the authenticity of social media handles?

Last month, Twitter rolled out changes to its account verification system. Users were able to verify their accounts by paying for a subscription to “Twitter Blue”. Following this announcement, a number of incidents of “spoof/hoax” accounts were reported. Today, we will be looking at some examples from the period and asking whether Link Intelligence data can help inform us about differences between real accounts and malicious imposters. Let’s begin by looking at some of the hoax accounts that made the headlines.

Tweet from a fake LeBron James twitter account
Image Source: Yahoo! Sports

A fake LeBron James account, which falsely claimed he was looking to leave The Lakers.

Tweet from an account impersonating Twitter
Image Source: Maldito Timo

An account pretending to be Twitter, seemingly attempting a phishing scam.

Tweet from a fake Eli Lilly and Company twitter account
Image Source: @rafaelshimunov on Twitter

And finally, an account pretending to be the pharmaceutical company Eli Lilly and Company, suggesting they were going to make insulin free.

During the chaos, the system was quite promptly rolled back. While Twitter Blue’s re-emergence has been delayed, it is reportedly planned to return on December 12th, accompanied by some manner of verification system.

In his recent brightonSEO talk, “What’s in a link?”, Dixon Jones talked about the value of backlinks for gaining greater insight about content on the web. In light of these developments with Twitter, we thought it would be a good time to ask “What can backlinks tell us about the authenticity of social media handles?”

Social media is such a huge phenomenon that it can be easy to forget that much of it is built on websites and web technology. It stands to reason, therefore, that some of the tools used to measure web visibility may also be used to measure facets of social visibility.

The massive adoption of Twitter since its launch in 2006 means that the impression it has left on the web is huge. Many businesses have adopted it as a platform for PR, many public figures use it, and a multitude of influencers and creators have followings there. The ease with which posts can be made and immediately arrive in front of their intended audience has led to Twitter becoming a prominent source of information regarding these entities. As such, it is very common for articles on the web to cite content from Twitter directly – either in discussing the content, or simply in recounting events the content is related to.

The Majestic Million - twitter.com sits at number 4
Twitter.com is in the top 5 biggest websites in the Majestic Million.

Twitter’s visibility on the web is so huge that it sits comfortably in the top 5 sites of The Majestic Million.

Top stats on majestic.com for twitter.com: Trust Flow - 100, Citation Flow - 100; Top Topical Trust Flow scores: Reference / Education, Arts / Music, Recreation / Travel, Society / Government
Site Explorer Summary data for twitter.com

The breadth of Twitter’s reach can be seen clearly in the link profile for Twitter.com. The site sports the highest possible Trust Flow score of 100, with a diverse set of Topical Trust Flow categories.

As you can see, the content found on Twitter does not just exist within the closed ecosystem of the site itself, but permeates into much of the web. As such, we can use the Majestic suite of tools to study its footprint in the greater link map of the web. What then, do these spoof accounts contribute to this footprint? And what can that contribution tell us about hoax accounts? Can we see any signals that suggest things may not be as they appear?

LeBron James

Let’s take a look at LeBron James. We understand his official handle is @KingJames. Let’s see how this handle looks in Site Explorer.

Majestic Site Explorer summary for the Twitter Handle @KingJames. Trust Flow: 46, Top Topical Trust Flow: Sports / Basketball, 98,889 External Inbound Links (Fresh Index), 3,996 Referring Domains (Fresh Index)
Site Explorer Summary for @KingJames

There are some respectable numbers here. For the page of a celebrity, this looks about how we’d expect. The handle has a pretty high Trust Flow, a dense and diverse Link Graph, and a Link Profile suggestive of a well-linked-to page. Furthermore, it has a lot of inbound links, from a large pool of referring domains. The data in Majestic suggests that this is a noteworthy handle.

Top Links for @KingJames

The Top Links report gives us some signals that suggest we’re looking at the legitimate account of LeBron James. The first two links look to be directory listings. The first, the Trendsmap link, suggests this is a large account, which is what we would expect. The second is from a source that seems to be relevant to the account we’re looking at – basketball-reference.com. We cannot entirely rely on these sources as truths, but we can see from their relative Trust Flow and Citation Flow scores that the links are certainly from strong pages.

The last backlink is quite a strong signal. This appears to be from the official LeBron James website. Once again, we cannot be sure just looking at that link that the domain in question is in fact official, but following the link through, it certainly seems to be.

If we were feeling particularly sceptical about any of the backlinks showing up, we could look at them in more detail in Site Explorer. As an example, here is the top links view for lebronjames.com:

Top Links for lebronjames.com - first link is from britannica.com
Top Links for lebronjames.com

In particular, the Britannica link supports the notion that lebronjames.com appears to be the official LeBron James site.

One last thing that would be good to look at, while we’re checking individual backlinks, is the new backlinks tab in the Site Explorer.

New Links for @KingJames
Best New Links for @KingJames

Here, we’ve selected the subdomain view. This view makes it easier for us to see the diversity of referring domains. The metrics again provide valuable signals to inform an assessment of the trustworthiness of the sources of new incoming links. As we have seen before, these links seem relevant to who we’re expecting to be behind the Twitter account. At least one of the top results seems to be a recent news article related to LeBron James, and its source seems relatively well regarded – sporting a good Trust Flow score.

So that’s how the handle for LeBron James looks. What about our imposter?

Majestic Site Explorer summary for fake twitter handle @kingjamez - 1 Inbound Link (fresh index), 0 Trust Flow and 0 Citation Flow
Site Explorer Summary for @KingJamez

Immediately, this doesn’t look to us how we would expect it to, were it to be a genuine Twitter handle. Investigating a few examples of hoax accounts, we’ve found that it is very common for these fake profiles to have had little impact on the web. As such, we haven’t found many (if any) backlinks for them.

As previously mentioned, LeBron James is a well-known celebrity. In this case, we would find it incredibly strange for his Twitter profile to have so sparse a backlink profile.

Let’s investigate further by taking a look at the handle’s top links.

Top Link for fake twitter handle @KingJamez
Top Links for @KingJamez

The one link available in the Fresh Index does not seem to be of very high quality. The Trust Flow and Citation Flow scores are very low. You can also see from the Link Context that the link is quite crowded with links to other accounts. While this might not be cause for concern if the backlink were one of many, it would again be unusual to see a link like this at the top of the link profile for such a well-known figure.

Broadening the Scope

It’s very important to note that not every account on Twitter is as well-known. We certainly wouldn’t expect every user’s handle to have as rich a link profile as LeBron James’. However, the handles of organisations, public figures and creators of all sizes will often have some presence on the web. The key is to have some expectation of what the backlink profile should look like for a particular handle.

To demonstrate, let’s take a break from hoax analysis to take a look at a more niche handle. At Majestic, we invest some resources into supporting the Birmingham tech scene. One of the local events we’ve sponsored is HackTheMidlands.

Tweet from @HackTheMidlands announcing that Majestic would be sponsoring HackTheMidlands 7
Majestic Sponsored HackTheMidlands 7

Within its community, HackTheMidlands is quite a high profile event. However, in the grand scheme of profiles on Twitter, it represents quite a niche interest. As such, this handle provides a valuable contrast to that of our global celebrity, LeBron James. Taking a look at the backlink profile for the HackTheMidlands handle, we can see how even a smaller account can have a noteworthy footprint on the web.

Majestic Site Explorer summary for the Twitter Handle @HackTheMidlands. Trust Flow: 20, Top Topical Trust Flow: Computers / Internet / Domain Names, 21 External Inbound Links (Fresh Index), 6 Referring Domains (Fresh Index), 32 Referring Domains (Historic Index)
Site Explorer Summary for @HackTheMidlands

While the top level metrics of @HackTheMidlands are much smaller than those of a celebrity account, we can see some healthy signs. The top Topical Trust Flow topics seem relevant to the handle, and the Link Graph is in line with what we’d expect for the page of an entity at the centre of a local community. Furthermore, even as an account with 590 followers at the time of writing, the @HackTheMidlands handle still sports 6 referring domains in the Fresh Index, with more in the Historic Index (noteworthy as HackTheMidlands has been around for years at this point).

Just like with the authentic LeBron James handle, looking at HackTheMidlands’ Top Links reveals relevant and robust backlinks, in line with what we’d expect for the account we’re looking at.

Top Backlinks for @HackTheMidlands
Top Links for @HackTheMidlands

Leading in the table of top links is the official website associated with the handle. We also have a link from a blog post about the Birmingham tech scene. Inspecting the Link Context for the backlink, we can even see the account listed alongside some of its peers. Finally, we can see the handle listed as part of the contact details for the event, in a link from hack.athon.uk, a community resource for information on UK Hackathons.

We’ve come into this exercise with the understanding that this is the legitimate Twitter handle of the entity it claims to be. Our hope is that the exercise demonstrates that, unlike many of the fake Twitter profiles we found, even smaller legitimate Twitter profiles will often leave noteworthy impressions on the web, which we can see in their backlink profiles.
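The idea of checking a handle against an expectation can be sketched in a few lines of Python. This is only an illustration: the Trust Flow and inbound link figures are the ones quoted in this post, but the `HandleSummary` type, the `below_expectation` helper and the threshold values are hypothetical choices made for the example, not part of any Majestic tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HandleSummary:
    handle: str
    trust_flow: int
    inbound_links: int  # external inbound links in the Fresh Index


def below_expectation(summary: HandleSummary, min_trust_flow: int, min_links: int) -> bool:
    """Return True when the handle's footprint is smaller than we expected."""
    return summary.trust_flow < min_trust_flow or summary.inbound_links < min_links


# Figures quoted in this post.
king_james = HandleSummary("@KingJames", trust_flow=46, inbound_links=98_889)
king_jamez = HandleSummary("@KingJamez", trust_flow=0, inbound_links=1)
hack_the_midlands = HandleSummary("@HackTheMidlands", trust_flow=20, inbound_links=21)

# Hypothetical expectations: (minimum Trust Flow, minimum inbound links).
GLOBAL_CELEBRITY = (30, 10_000)
LOCAL_COMMUNITY = (10, 10)

flags = {
    king_james.handle: below_expectation(king_james, *GLOBAL_CELEBRITY),
    king_jamez.handle: below_expectation(king_jamez, *GLOBAL_CELEBRITY),
    hack_the_midlands.handle: below_expectation(hack_the_midlands, *LOCAL_COMMUNITY),
}
```

With these assumed thresholds, only @KingJamez is flagged. The point is that the expectation is chosen per handle: judged against a celebrity's threshold, @HackTheMidlands would look suspicious too, which is why we compare it with what we'd expect of a local community event.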

“Twitter”

Getting back to our fake accounts, let’s look at the account impersonating Twitter itself.

Majestic Site Explorer summary for the Twitter Handle @szat_0. Trust Flow: 17, Top Topical Trust Flow: Computers / News and Media, 4 External Inbound Links (Fresh Index), 2 Referring Domains (Fresh Index)
Site Explorer Summary for @SzAt_0

A glance at its Site Explorer stats reveals that, much like the fake LeBron James account, it does not have many backlinks at all. However, compared to that example, what’s there is more telling.

Backlinks for @szat_0
Backlinks for @SzAt_0

Looking at the handle’s backlinks in more detail, we can see that the two backlinks it has both mention that the account has been suspended. Google Translate suggests that the maldita.es link is from an article titled “How scammers are taking advantage of the change in Twitter verifications to trick us and impersonate accounts”.

Earlier in this post, we discussed the signals that might indicate that the real LeBron James Twitter handle (@KingJames) is legitimate. Among them, we mentioned that the top backlinks for its page seemed relevant to the account. In this example, knowing that this handle is fraudulent, we can say that these backlinks also seem relevant to the handle in question – only now the story they tell is not that of an official social media platform’s account, but rather that of an entity aiming to scam readers.

Looking back again to Dixon Jones’ brightonSEO talk, he talks about using Backlink History to track the growth of domains. Specifically, he talks about how the history of his body of work can be seen across the Backlink History Graphs for sites he wrote for. We can apply the same concept to Twitter handles.

Backlink History Graph for @SzAt_0

This is the Backlink History Graph for the handle impersonating Twitter. Here we’re using the “Cumulative” view mode as it makes it a little easier to see the upward trend of the account. We can see that the account wasn’t visible on the web until November 10th, when the change to verification rolled out.

From this we can conclude that the account most likely wasn’t having an impact on the web at large until November 10th. We might suspect that the account did not even exist before this date – but we can’t know for sure just by looking at the Backlink History. It is possible that the account was active for longer and simply did not become noteworthy enough to be discussed outside of Twitter (in this case, until the account gained verification and started impersonating Twitter). However, much like the LeBron James impersonator’s sparse Site Explorer entry, a Backlink History Graph like this can be a great signal that things may not be as they seem.

Before we try to draw any conclusions as to whether a handle may be real or fake, we must have an idea of how we expect the handle to look. It’s likely that for a noteworthy public figure, there will be some kind of story told by their Backlink History. We would expect Twitter’s handle to have had a much larger impact on the web, much further back than November 10th. As such, the Backlink History for @SzAt_0 further implies that this is unlikely to be the official handle for Twitter.
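To make the “Cumulative” view concrete, here is a minimal Python sketch of how a running total and a first-visible date could be derived from daily counts of newly found links. The daily counts are invented for illustration and are not real Majestic figures; the function names are our own.

```python
from datetime import date
from itertools import accumulate
from typing import List, Optional


def cumulative(daily_new_links: List[int]) -> List[int]:
    """Running total of links found, as plotted in a cumulative view."""
    return list(accumulate(daily_new_links))


def first_visible(days: List[date], daily_new_links: List[int]) -> Optional[date]:
    """First day on which at least one new link was found, if any."""
    for day, count in zip(days, daily_new_links):
        if count > 0:
            return day
    return None


# Illustrative data only: one week of daily new-link counts.
days = [date(2022, 11, d) for d in range(7, 14)]
new_links = [0, 0, 0, 1, 2, 0, 1]

running_total = cumulative(new_links)
first_seen = first_visible(days, new_links)
```

In this made-up series, the running total stays at zero until November 10th and only climbs afterwards, which is the shape we see for the handle impersonating Twitter.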

Eli Lilly and Company

The case of the fake Eli Lilly and Company handle is a little more complex, as the handle gained some traction after its tweet got picked up in discussion around the web. The hoax was highly topical, highlighting an ongoing debate on the price of insulin. It is no wonder that the hoax handle has picked up some links, as shown in Site Explorer.

Majestic Site Explorer summary for the Twitter Handle @elillyandco. Trust Flow: 21, Top Topical Trust Flow Score: Health / Conditions and Diseases, Second Topical Trust Flow score: Computers / News and Media, 1,431 Inbound Links (Fresh Index), 133 Referring Domains (Fresh Index)
Site Explorer Summary for @EliLillyAndCo

The handle’s Link Graph and Link Profile tell the story of a page being linked to by a diverse handful of established sites, albeit a small handful, an impression reinforced by the very modest number of referring domains. The top links for the handle give us the context to understand why this is.

Top Links for the fake Eli Lilly and Company account. The first (from mashable.com) mentions that the account is fake. Two links mention that the account has lost its blue checkmark. Two links mention that the account's tweets are now protected.
Top Links for @EliLillyAndCo

While @EliLillyAndCo is a hoax account, it has become noteworthy in its own right. This fake handle has been referenced in content across the web. The handle was relevant both to the verification changes and to a topical debate in health care. This is reflected beyond the Top Links report and can also be seen in the Topical Trust Flow of the handle.

We mentioned earlier in this post that the impression that Twitter has on the web is huge. We’ve seen how this impression can be used to inform us about handles on the site, and in this example, we’ve now seen how a social media handle, official or not, can impact content on the web at large. There’s one last thing we’d like to look at with regards to the impact of spoof accounts and that’s the way their impact can correlate to that of the corresponding official account.

Backlink History Graph for @EliLillyAndCo

Here we can see the fake Eli Lilly and Company account’s backlink history. While this is a much higher profile example, we can see the same pattern we observed in the previous fake account – complete obscurity up until the introduction of Twitter Blue verification.

However, things get interesting when we look at this beside the backlink history for the real Eli Lilly and Company handle.

Backlink history graph for both @Lillypad and @EliLillyAndCo
Backlink History Graph for both @LillyPad and @EliLillyAndCo

We can see the official account steadily accumulating backlinks up until the fake account becomes noteworthy. At that point, together, they skyrocket in backlinks. Looking at the top backlinks for @LillyPad, we can theorise why this might be.

Top Backlinks for the official Eli Lilly and Company twitter handle. The first is an investors page on the Eli Lilly and Company website. The second two seem to both be news articles about the false account.
Top Links for @LillyPad

The official Eli Lilly and Company account is being picked up in stories about the Twitter Verification changes too! This seems like it would be a natural conclusion of the story involving Eli Lilly and Company – what’s the big deal? Well, we find this to be an interesting example of how we can observe the impact of a fake account, even outside of its own link profile.

Any map represents an abstraction, rather than a statement of fact. A link map is no different in concept. Regardless of whether a Twitter handle is authentic, a hoax, or an irrelevant coincidence, sufficient reporting of it can still impact the web map. The web map differs from actual maps in that it models an abstract information space, rather than a physical territory.

The web is designed to be reasonably tolerant of mistakes, so that an error quoted on one or more webpages will not cause the internet to collapse. This introduces an additional source of error for web maps compared with maps of physical spaces. Both suffer from errors arising from the mapping process, but web maps also reflect errors in the information space itself. The web makes it possible to link to things that don’t exist (resulting in 404 errors). It follows that a web map may include references to areas that do not exist, or that were referred to by accident.

We wanted to share this background as it is relevant to something we discovered while researching this post. We decided to evaluate whether the first found date of links to a Twitter handle could be helpful in determining if the handle was a genuine brand handle or a fake. We soon found that matters can be a little more complex than they first appear. As with so many things in SEO, the answer to the question “does first found date act as a reasonable indicator of the hoaxziness of a Twitter handle?” is…

It depends.

If we take a look at the Backlink History of the fake Eli Lilly and Company handle using the Historic Index, we see something interesting.

Historic Index Backlink History Graph for the fake Eli Lilly and Company Twitter Handle
Historic Index Backlink History Graph for @EliLillyAndCo

The handle has links that predate the changes to Twitter profile verification by quite some margin – all the way back to October of 2014! Although the number of backlinks the handle has is still quite small, this does raise the question: why were there links to this seemingly unrelated handle?

As well as the potential of an error entering the mapmaking process, resulting in a small anomaly, there are a few reasons such a twitter handle could have history, including:

• The hoax may have run for a long time before becoming newsworthy.
• Some time ago, a link was created with an incorrect target. This may have been copied a handful of times before correction.
• Misleading or incorrect information elsewhere could have resulted in a number of people independently using an incorrect URL.
• By coincidence, someone else may have innocently used a different handle for a number of years before it was repurposed.
• Some brands may change handle, resulting in a history of links on the old handle (for example, when @majesticseo became @majestic).

We aren’t in a position to conclusively explain why this handle is reported as existing for so long, but we can use data to better inform our understanding of the scale of the issue and its influence.

As you will see from the graph below, in the case of Eli Lilly and its hoax, a better indicator of authenticity than the first found date appears to be the volume of links over time. First found date may be appropriate in some circumstances; however, proper analysis will consider more than one factor before coming to a conclusion.

Historic Index Backlink History Graph for the official Eli Lilly and Company Twitter Handle
Historic Index Backlink History Graph for both @LillyPad and @EliLillyAndCo

The handle @LillyPad has been receiving many more links for much longer. While it’s possible that the handle was once used by a different account, the consistency of links over time, contrasted with the comparatively few backlinks the @EliLillyAndCo handle had, suggests that, at the very least, it is more likely that @LillyPad has been the official handle for Eli Lilly and Company since at least September 2010.
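The difference between the two signals can be shown with a small Python sketch. The monthly counts below are invented purely for illustration (only the September 2010 and October 2014 start months echo the graphs in this post), and the helper names and the `CUTOFF` month are assumptions made for the example.

```python
from typing import Dict, Optional


def first_found(monthly: Dict[str, int]) -> Optional[str]:
    """Earliest month (YYYY-MM) in which at least one new link was found."""
    months = sorted(month for month, count in monthly.items() if count > 0)
    return months[0] if months else None


def links_before(monthly: Dict[str, int], cutoff: str) -> int:
    """Total new links found in months strictly before the cutoff month."""
    return sum(count for month, count in monthly.items() if month < cutoff)


def active_months_before(monthly: Dict[str, int], cutoff: str) -> int:
    """Number of months before the cutoff with at least one new link."""
    return sum(1 for month, count in monthly.items() if month < cutoff and count > 0)


# Illustrative counts only, not real Majestic figures.
official = {"2010-09": 4, "2014-10": 9, "2018-06": 15, "2022-10": 12, "2022-11": 160}
imposter = {"2014-10": 2, "2018-06": 0, "2022-10": 0, "2022-11": 140}

CUTOFF = "2022-11"  # month in which the verification change rolled out

official_volume = links_before(official, CUTOFF)
imposter_volume = links_before(imposter, CUTOFF)
```

In this invented data both handles have a first found date years before the cutoff, so that date alone does not separate them. The volume and consistency of links before the cutoff do: the official handle picks up links in every sampled month, while the imposter has a single early blip.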

The Official Eli Lilly and Company Twitter account. Joined date is July 2010
Eli Lilly and Company has been on Twitter since July 2010

Comparing the Backlink History chart to the Eli Lilly and Company account’s “Joined” date seems to suggest consistency.

The irrelevance of first found date as a signal in this instance is further underlined by an investigation into the backlinks for @EliLillyandCo.

The Site Explorer Link Context report adds an extra dimension of analysis for this handle. Examining the report for the pages that link to the Twitter handle suggests that the links may come from low-authority pages, and that some have been detected as deleted in recent crawls.

Deleted links to the fake Eli Lilly and Company Twitter handle
Backlinks for @EliLillyandCo according to the Historic Index

There is little here to contradict the notion that @LillyPad has always been the brand’s official handle. It remains unclear why these links were present, but a deeper study is unlikely to provide much benefit. At any rate, it looks as though the linking sites have since removed these links, reducing their impact on recent crawls.

What we hope to illustrate by interrogating this example is that the link map has limits. The map can give us really rich information about the state of the web at a given time, but that web is a chaotic system of individual agents, the intent behind each of which we cannot identify for certain and who may not always execute what they intend. However, using the link map and our metrics, we can still build models which can help us to navigate the sprawl that is the web.

In Closing

With the future of verified accounts unclear, we hope to have given a new perspective with which to view accounts on Twitter.

If you’d like to see for yourself how the backlink profiles of Twitter handles look in the coming days, we’ve provided a Bookmarklet to make things smoother. If you’d like to know more, we’ve talked in the past about how Bookmarklets can be used to enhance SEO workflows.

To use it, save this bookmarklet to send a page to Majestic to your bookmarks (right click the link and choose ‘Bookmark Link’). Opening the bookmark from a Twitter profile page will then take you straight to that handle in Site Explorer.
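If you would rather script your lookups, the first step is the same as the bookmarklet’s: work out which handle a profile page belongs to. The Python sketch below is our own illustration of that step, not the bookmarklet’s actual code. The function names and the list of accepted hostnames are assumptions, and it does not filter out non-profile paths such as search pages.

```python
from typing import Optional
from urllib.parse import urlsplit

# Hostnames we treat as Twitter for this sketch (an assumption, not exhaustive).
TWITTER_HOSTS = {"twitter.com", "www.twitter.com", "mobile.twitter.com"}


def handle_from_profile_url(url: str) -> Optional[str]:
    """Return '@handle' for a Twitter profile or tweet URL, else None."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in TWITTER_HOSTS:
        return None
    # The handle is the first non-empty path segment, e.g. /KingJames/status/...
    segments = [segment for segment in parts.path.split("/") if segment]
    if not segments:
        return None
    return "@" + segments[0]


def profile_url(handle: str) -> str:
    """Build the profile page URL to look up for a given handle."""
    return "https://twitter.com/" + handle.lstrip("@")


king_james = handle_from_profile_url("https://twitter.com/KingJames")
```

The URL returned by `profile_url` is the page you would then search for in Site Explorer.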

We also provide browser plugins for both Mozilla Firefox and Google Chrome, which enable you to see a lot of the data we’ve talked about today directly from the profile page you’re looking at in your browser.

Using the Majestic browser plugin to view backlink data from the HackTheMidlands twitter page
The Majestic Browser Plugin enables you to see backlink data for a Twitter handle directly from its webpage

We would love to know if you find anything interesting, and stay vigilant out there!