So you (or your client, not pointing any fingers here) engaged in some shady black hat link building tactics in the past. Well, perhaps they weren’t black hat at the time, but they are now. Whatever, Google. The time has come to clean up a messy backlink portfolio and start fresh with white hat tactics. But how do you find the negative links of the past?!
First, download a list of all referring domains from Majestic’s Historic Index:
Right click and delete unnecessary columns in the red boxes so all of your data fits on one screen more easily:
Expand your columns of data by clicking the triangle button in the top left, then double click any line between columns (like the one in the red box between A & B).
Freeze the top row by highlighting row 1 (by clicking on “1”), click View > Freeze Panes > Freeze Top Row, and while you have the row highlighted, you may as well color it Yellow or something (back on the “Home” tab) so you can stay organized better:
Finally, add a data filter to the “Domain” column by clicking “A” to highlight the column, then Data > Filter:
Next, add a new sheet (to archive all of the negative domains we isolate). Click the sparkling spreadsheet button at the bottom of your Excel doc, or if you’re feeling frisky, use the hotkey (Shift + F11). To rename this sheet, double click the newly created tab and call it “Potentially Negative Domains”.
In a dirty backlink portfolio, it is very easy to identify potentially negative domains at a glance. Domain names with the words “directory”, “add”, “link”, “seo”, etc. are usually evil. Highly relevant directories are OK, so you’ll have to give that filter a bit of manual review.
Now the time has come to filter for the garbage. Click your triangle clicky filter button, and in the search field type “dir” – then click OK.
OH THE HORROR! Argentinian directories! Highlight all of the results with your top left clicky triangle button, control+C to copy your selection, click on your Potentially Negative Domains tab, and control+V to paste the data. Make sure to freeze your top row in this tab as well, and consider coloring the top row Orange.
Orange header bar ^ It means bad and it means business.
Hop back over to your All Referring Domains tab, clear the filter (Open filter, click select all, click OK) and start over with a new filter. Here are some good (bad) ones to look for:
-article (will return a lot of article directories if that strategy was utilized)
Continue copying and pasting all of the negative domains isolated to the Potentially Negative Domains tab. Fortunately the brains behind these tactics were incredibly unoriginal and almost every spammy site uses the same recurring keyword domains. Some results will have been returned multiple times due to keyword overlap (such as addsiteurlfreewebdirectory.com), so now we need to remove duplicates from our negative domain list.
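If you would rather script the filter pass than click through it, the idea can be sketched in a few lines of Python. This is only a rough illustration, not part of the Excel workflow: the function name and the sample domains are made up (only addsiteurlfreewebdirectory.com comes from this post), and the keyword list is just the handful mentioned so far.

```python
# Keywords mentioned in the steps above; extend the list as you find more.
KEYWORDS = ["dir", "add", "link", "seo", "article"]

def find_suspect_domains(domains, keywords=KEYWORDS):
    """Return one (keyword, domain) pair for every keyword a domain contains.

    Like the Excel filter search box, this is a case-insensitive
    "contains" match, so one domain can show up under several keywords.
    """
    hits = []
    for keyword in keywords:
        for domain in domains:
            if keyword in domain.lower():
                hits.append((keyword, domain))
    return hits

# Hypothetical sample data, not a real Majestic export.
sample_domains = [
    "addsiteurlfreewebdirectory.com",
    "example-news.com",
    "bestseolinks.example",
]

hits = find_suspect_domains(sample_domains)
for keyword, domain in hits:
    print(keyword, domain)
```

Notice that the same domain can come back under more than one keyword, which is exactly the overlap that creates duplicates in the negative list.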
Highlight column A by clicking “A”. Click the “Data” tab, and then click “Remove Duplicates”.
Click “Expand the selection” and then “Remove Duplicates…”
Note the ominous “…” …
Unselect all, and then check the box next to “Domain” only. Make sure “My data has headers” is checked as well. Click OK!
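For the script-inclined, removing duplicates on the “Domain” column amounts to keeping the first copy of each domain and dropping the rest. Here is a minimal Python sketch of that idea. The function name and sample list are hypothetical, and comparing in lowercase is my own assumption (domain names are not case sensitive), not a claim about how Excel works internally.

```python
def remove_duplicates(domains):
    """Keep the first occurrence of each domain, preserving the original order."""
    seen = set()
    unique = []
    for domain in domains:
        key = domain.strip().lower()  # assumption: treat case and stray spaces as irrelevant
        if key not in seen:
            seen.add(key)
            unique.append(domain)
    return unique

# Hypothetical pasted results with a repeat caused by keyword overlap.
pasted = [
    "addsiteurlfreewebdirectory.com",
    "spamdir.example",
    "AddSiteUrlFreeWebDirectory.com",
]

unique = remove_duplicates(pasted)
print(unique)
```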
Now you only have 1 result per negative domain. Scroll to the bottom of the tab, take note of the last populated row number, subtract 1 for the header row, divide the result by the total number of referring domains, and multiply by 100. Algebraic! Now you have the percentage of your backlink portfolio made up of negative domains. Be sure to manually review all of the sites in your negative domain list, but with the above filters, probably few if any quality sites will slip through.
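The arithmetic is simple enough to do in your head, but here it is as a small Python sketch in case you want to reuse it. The function name and the numbers are made up for illustration, and it assumes the tab has a single header row that should not be counted as a domain.

```python
def negative_share(last_populated_row, total_referring_domains, header_rows=1):
    """Return the percentage of referring domains that landed in the negative list."""
    if total_referring_domains <= 0:
        raise ValueError("total_referring_domains must be positive")
    negative_domains = last_populated_row - header_rows  # the header row is not a domain
    return 100 * negative_domains / total_referring_domains

# Hypothetical numbers: the negative tab ends at row 251 and the
# full export has 1000 referring domains.
share = negative_share(last_populated_row=251, total_referring_domains=1000)
print(f"{share:.1f}% of referring domains look negative")
```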
Keep in mind, this technique will isolate spammy domains, but simple irrelevant links may be left behind. For example, miamicarpetcleaning.com has a link here: http://www.bestcyprusproperties.com/More-Useful-Links-2/pageid-762/. For those geographically impaired, Cyprus is located in the middle of the Mediterranean Sea. Miami Carpet Cleaning has no business acquiring a link on this site. Isolating irrelevant links is tough manual work, but identifying spammy domains with the techniques described will take a large chunk out of the research.
Congratulations, you’re a data wizard now!