It's quite likely that most people haven't heard the term “SEO security audit” before – and that's totally undeserved. The recent Google updates have caused an incredible growth of interest in link audits – but if you think link audits are all you should be doing, and only after getting a warning email from Google, think again.
The reasons for SEO security audits
As well as spammy link profiles, there are many more threats that can potentially harm your website.
These range from hackers looking to plant malware on your site, to scrapers stealing your content and earning you a duplicate-content penalty, to online reputation issues and negative SEO attempts by a competitor or a hater. But ask yourself: are you helping them hurt you?
There are a host of things you should be aware of about your own site. How secure is your CMS? How safe are your plugins? How logical is your site architecture? And that's just the start.
Let’s look at these and other issues in more detail.
The spread of WordPress, Joomla, Drupal and other popular CMS applications has made it easy for people to create their own sites – and their widespread use is too much of a temptation for those willing to exploit them. Find a way to compromise a popular CMS, and you've instantly got millions of sites under your control.
2012 has seen an increase in discovered vulnerabilities in publicly available software solutions, including those used to run and manage sites, as you can see on this graph:
(Data source: http://www.cvedetails.com)
Having a vulnerability on your site means:
you can get hacked
links to other sites can be placed on your site without your knowledge
your whole site can be hijacked and redirected elsewhere
or your server may be used to host and run malware, or to display hidden cloaked iframes that steal your traffic.
Have you ever come across notices like these in any SERPs (search engine results pages)?
Whenever that happens, you lose traffic. Nobody wants to click on a potentially harmful link, and even if they do they will get another warning instead of your site, so it takes them a few extra clicks to get into it. Your rankings can drop due to this as well, and no site owner ever wants that to happen!
How do you find out if your site has been hacked? If it's connected to a Google Webmaster Tools account, Google will display a message as soon as it's aware of your site getting compromised:
Another method is to use Google search and look for something that should not be there – eg, a query like [site:yoursite.com viagra] can surface injected pharma spam:
I did a detailed post on how to get rid of malware a while ago so if that’s a problem that's happening to you right now, follow the instructions in that post.
If you suspect links to other sites have been inserted into your site without your knowledge, a tool called ScreamingFrog can help you detect all external links from your site:
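ScreamingFrog does this crawl for you, but the core idea is simple enough to sketch. Here's a minimal, illustrative Python example (the function and class names are my own, not part of any tool) that pulls external links out of a page's HTML:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

class LinkCollector(HTMLParser):
    """Collects href attributes from all <a> tags on a page."""
    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)

def external_links(html, base_url):
    """Return all links on the page that point outside base_url's domain."""
    base_domain = urlparse(base_url).netloc
    collector = LinkCollector()
    collector.feed(html)
    external = []
    for href in collector.hrefs:
        absolute = urljoin(base_url, href)  # resolve relative links
        domain = urlparse(absolute).netloc
        if domain and domain != base_domain:
            external.append(absolute)
    return external
```

Run it over every page of your site and any domain you don't recognize in the output deserves a closer look.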
Bad redirects are another issue that can cause your site to become de-indexed. When you direct a search engine bot from one URL on your site to another, only to bump it right back, it shouldn’t come as a surprise when the search engine gives up and considers your site unworthy of crawling, indexing and including in the SERPs. ScreamingFrog can help you audit your redirects:
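The underlying check is easy to reason about: follow each redirect hop and flag any chain that loops back on itself. A toy sketch in Python (the function name is mine), using a plain dictionary to stand in for your server's redirect rules:

```python
def follow_redirects(start, redirects, max_hops=10):
    """Walk a URL -> URL redirect map and return (final_url, chain).
    Raises ValueError on a loop or an overly long chain."""
    chain = [start]
    current = start
    for _ in range(max_hops):
        target = redirects.get(current)
        if target is None:   # no further redirect: we've landed
            return current, chain
        if target in chain:  # we've been here before: a loop
            raise ValueError("redirect loop: " + " -> ".join(chain + [target]))
        chain.append(target)
        current = target
    raise ValueError("too many redirects from " + start)
```

A search engine bot gives up in much the same way: after a handful of hops, or the first loop, it stops crawling the URL.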
It can also help you review your meta tags and canonical instructions. I've heard Google representatives say they now simply ignore bad canonicals, but back when the rel=”canonical” tag was first introduced I heard horror stories of sites getting almost completely de-indexed due to an incorrect implementation, eg, when all canonicals were pointing at the home page.
Another painful issue that site owners can cause themselves is a robots.txt file that contradicts the XML sitemap. Never include anything in your sitemap that you don't actually want indexed, and make sure none of the URLs listed in your sitemap file are blocked by robots.txt. Here is a post I did on some common robots.txt mistakes. One of the worst things you can do to your site is prevent it from being indexed – so be sure to double-check your robots.txt when creating and uploading a new site.
Google has a great function in Webmaster Tools for checking your robots.txt syntax (under Health > Blocked URLs). It makes sure you block whatever needs to be blocked and makes available whatever needs to be indexed:
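You can also script this check yourself. The sketch below (the function name is my own) uses Python's standard urllib.robotparser to flag sitemap URLs that your robots.txt blocks – exactly the kind of contradiction worth catching before a new site goes live:

```python
from urllib.robotparser import RobotFileParser

def blocked_sitemap_urls(robots_txt, sitemap_urls, agent="*"):
    """Return the sitemap URLs that robots.txt disallows for the given agent.
    Anything returned here is a contradiction you should resolve."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in sitemap_urls if not parser.can_fetch(agent, url)]
```

Feed it the contents of your robots.txt and the URL list from your sitemap; an empty result means the two files agree.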
Google has never been particularly good at original content source attribution (deciding who published a piece of content first and who copied it). One of the big promises of Panda was to fix that, but nothing has really changed.
You can still get outranked by other sites scraping your content, and that has led to Google using DMCA (Digital Millennium Copyright Act) complaints against sites as a ranking factor. That is, the more complaints a site receives, the worse the effect on its ability to rank.
This could have been a nice solution to the problem, but along with the increase in the number of legitimate DMCA complaints, the number of fake complaints increased as well:
Source: Google Transparency Report: Top reported domains. Also see this study by Barry Schwartz
What’s worse is, you don’t even need to be scraped to have duplicate issues. It’s enough to have a site architecture creating multiple URLs for the same content:
The above example is for a WordPress-based site, but other popular CMSs are guilty of this as well. Here, the same content is potentially indexable at no fewer than five different URLs, diluting the value of the content on the main URL and undermining its chances of ranking well.
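One way to reason about such duplicates is through URL canonicalization: if several URL variants collapse to the same canonical form, they are the same page competing with itself. A simplified, illustrative Python sketch (real canonicalization rules are more involved; the function name is my own):

```python
from urllib.parse import urlparse, urlunparse

def canonicalize(url):
    """A simplified canonical form: lowercase the host, drop the query
    string and fragment, and strip any trailing slash from the path."""
    parts = urlparse(url)
    path = parts.path.rstrip("/") or "/"
    return urlunparse((parts.scheme, parts.netloc.lower(), path, "", "", ""))
```

Group your crawled URLs by their canonical form: any group with more than one member is a set of duplicates that should be consolidated with a redirect or a rel=”canonical” tag.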
Indexable on-site search is another thing many would never consider as causing a problem. In fact, back in the day, having onsite search seemed like a great idea, didn’t it? Yet, if not handled properly, it can lead to problems you never imagined.
In the case of a default, untweaked WordPress installation, search results can create two URLs:
Regardless of whether anything is actually found for the query on your site, you get two pages with identical content. And because the default robots.txt that ships with WordPress doesn't block them, both of these URLs will be indexable. The additional danger here is that anybody can get your site indexed for any query they wish (eg, an adult phrase), effectively turning it into a bad neighborhood – and you just helped them make it possible.
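Assuming a standard WordPress setup where search results live at /?s=query (and /search/query with pretty permalinks), one common fix is to block them in robots.txt – for example:

```
User-agent: *
Disallow: /?s=
Disallow: /search/
```

Adjust the paths to match your own CMS – and remember that robots.txt stops crawling, not the indexing of URLs the search engines already know about.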
When talking about offsite issues, people typically think of links and not much else. But as with SEO security audits as a whole, there are more issues to look at than just links.
Bad neighborhood can be a serious factor undermining the strength of your site, for instance. In the past, whole TLDs (top level domains, eg, .cc) have been banned by Google for being too spammy and thrown out of the index completely.
Physically, your neighborhood is your host, unless you are running your site off a dedicated server. Hosts known to carry a lot of spammy sites are naturally less trusted. You can check your server neighborhood using WhoRush:
This is a really handy tool with more than one use: it was in WhoRush that I discovered a very unnatural-looking site network belonging to Interflora when it got penalized a few months ago. Google has always been paranoid about sites belonging to the same owner being used to boost one site's rankings. Recently they announced that yes, webmasters are allowed to interlink their own sites when it makes sense – but the last thing you want is to get caught with a poorly set up network. If you are doing something frowned on by Google, at least hide it properly.
Finally we get to links!
You can use tools to help you classify the likely suspects – but you still need to go through everything manually and make your own judgement (or have somebody else who knows what to look for do it for you, if you aren't very experienced).
This judgement, first of all, needs to be niche-specific. What passes in one niche may get you burned in another, and no generic list will cut it. Nor will “expert advice” blog posts about how a site in a totally unrelated niche was fixed do you any good.
Look at the other sites, both winners and losers, in your own industry, see what the averages are and compare them to your own site. It’s the only way to come up with something meaningful and make a proper judgement.
SEO SpyGlass is a tool that can help you evaluate how risky a link profile is: but don't use it only for your site in isolation. Run reports for other sites in your industry as well. (I’m a great believer that everything can be used for more than one purpose, and in this case the side benefit is you will likely come up with a bunch of ideas as to where to get links from for your site based on what you see – but that’s a topic for a separate post so let’s not get sidetracked too much.) [PS, Wordtracker has some excellent advice on analyzing your backlink profile]
Are you buying links? I'm not going to condemn you: there are industries where there is little chance of getting a link that’s not pure spam other than by buying it. Let’s forget for a moment about anchor text, Google paid link reports and so on. However, if you make your paid links as obvious as this you are asking for trouble:
Generally, the stronger your link profile is, the more difficult it is to hurt your site, even if somebody decides to run a negative SEO campaign against it.
I use MajesticSEO for estimating the strength of a link profile:
In the example above, it's easy to gauge the link profile strength of the two domains I am comparing. While it would be simple to overthrow the first site with just a few spammy links, it would be extremely expensive and resource-intensive to run a negative campaign against the second site.
Trust Flow and Citation Flow are great comprehensive metrics that let you estimate this at a glance. However, there are no universal criteria here and the link profile strength is also an industry-specific characteristic. I have seen sites with Citation Flow and Trust Flow that would be a dream come true in some industries – yet in their own niche, they were some of the weakest sites.
The key principle: your link profile should look natural. What is the typical anchor text distribution in your industry? What is the typical speed of link acquisition?
Does it happen steadily or seasonally?
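To answer the anchor text question with data rather than gut feeling, tally the anchors from any backlink export (SEO SpyGlass, MajesticSEO and similar tools all provide one). A small illustrative Python helper (the function name is my own):

```python
from collections import Counter

def anchor_distribution(anchors):
    """Percentage share of each anchor text in a backlink sample,
    most common first."""
    counts = Counter(anchor.strip().lower() for anchor in anchors)
    total = sum(counts.values())
    return {text: round(100.0 * n / total, 1)
            for text, n in counts.most_common()}
```

Run it on your own profile and on the winners in your niche; if a money keyword dominates your distribution while theirs is mostly brand and URL anchors, you have your answer.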
My favorite example is the site of BrightonSEO, a UK SEO conference that takes place twice a year, with tickets typically going on sale a couple of months before the event. By looking at the historic graph of their links, guess the months when it takes place :o) [And get the tips from the last one].
You would typically call this an unnatural link acquisition trend, yet in this site’s case the trend is very logical and makes perfect sense.
Below is the bare minimum checklist for an SEO security audit:
CMS and plugin vulnerabilities
Hacked pages, injected links and malware
Bad redirects
Incorrect or absent robots.txt
Contradictions between robots.txt and the XML sitemap
Incorrect canonical tags
Duplicate content created by site architecture
Indexable search results
Bad neighborhood on the server
Weak link profile
Obvious paid links
Overuse of anchor texts
Unnatural patterns in link acquisition
Obvious network setup
Of this list, only two, or at best three, items are covered in a typical link audit. More often than not, a site's problems will have little or nothing to do with links, or they will be more complex than link issues alone. An SEO security audit is the only way to uncover the root of the problem – and hence know how to fix it.