Google’s Panda update is aimed at low quality sites like content farms. So why is it hurting so many established high quality content sites? Mark Nunney explains why he thinks this is happening, what to do about it and what we can all learn from it.
This article is about quality sites getting Panda slapped. For a wider look at Panda see the Google Panda update survival guide.
Before we get to the serious stuff, enjoy this video parodying content farms' reaction to Panda:
Since Panda first hit back on 24th Feb 2011 I’ve seen a number of quality sites lose half their Google organic traffic overnight followed by a steady drift down since.
On the same sites, I’ve since seen long articles written by leading experts in their field get beaten in Google’s SERPs by spammy scraper sites.
Panda is not catching the spammy scraper sites. But that’s not what I’m investigating here. I want to know why the quality sites are getting Panda slapped in the first place.
Tried and failed
Like almost all who’ve tried, I’ve wrestled with the Panda and failed (so far).
I’ve read all the advice from Google and the insights from SEO experts. I wrote the Google Panda survival guide myself.
I found every page that might be deemed thin in content and removed it from Google’s index with noindex tags.
So all that’s left is those in-depth quality articles.
Nothing changed. So if Panda is only about quality content, it is using a narrow definition of quality.
Adsense has been removed.
Nothing changed. So the problem is not advertising.
I checked the link profiles. There were already thousands of natural links, built over many years, from blogs, expert sites and Wikipedia.
So the problem is not the absence of natural quality links or the presence of spammy low quality links.
So we can rule out the following:
• Quality content
• Link profile
How can that be, considering all the advice about quality we are receiving from Google?
I’ll tell you how soon. But first let’s look at these quality sites that have been Panda slapped …
What properties do these sites share?
They all share the following:
• Quality original content written by experts.
• Established. Some over ten years old.
• Lots and lots of natural quality links to the expert quality content.
• Owned by small companies.
• No fancy design. These sites mostly cared about the words.
• Almost no brand development.
The above combination has led to one other shared quality …
These sites were very successful at getting visits from search engines.
They got visits by ranking high for very competitive keywords.
They beat international brands and media sites with ease.
They got visits via thousands and thousands of long tail keywords because …
… if a page with thousands of words on it is successful for a popular keyword and those thousands of words are about the same subject then that page will clean up in the long tail.
So by design or luck, these sites had great SEO.
Strike ‘luck’ from that sentence. These sites did everything that Google said they were supposed to do and that wasn’t luck.
So what went wrong?
It starts with a machine …
Panda is a machine (not a cuddly animal)
The first Panda details from Google talked about the challenge they faced trying to define quality with a machine and an algorithm.
Some factors a machine can use to define quality are obvious, and these are the ones mentioned most. Eg, ‘thin’ content issues like:
• Lots of pages with little content
• High duplicate content (including boilerplate content) to original content ratios
• Lots of adverts
But clearly these factors can’t be used on their own or half the web would be Panda slapped.
Those factors would earn you Panda points (a bad thing) but they must be cross-referenced with other signals.
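To make those thin-content indicators concrete, here is a minimal sketch of how you might estimate them for your own pages. It says nothing about how Google actually measures them. The function names, the 300-word 'thin' threshold and the 80% boilerplate cutoff are all my assumptions for illustration.

```python
from collections import Counter


def split_blocks(page_text):
    """Split a page's text into non-empty blocks, one per line."""
    return [line.strip() for line in page_text.splitlines() if line.strip()]


def boilerplate_blocks(pages, min_share=0.8):
    """Return blocks that appear on at least min_share of all pages.

    min_share is an assumed cutoff, not a known Panda value.
    """
    counts = Counter()
    for text in pages.values():
        counts.update(set(split_blocks(text)))
    cutoff = min_share * len(pages)
    return {block for block, n in counts.items() if n >= cutoff}


def audit(pages, thin_words=300, min_share=0.8):
    """Report original word count and boilerplate ratio for each page.

    pages maps a URL to the visible text of that page.
    """
    boiler = boilerplate_blocks(pages, min_share)
    report = {}
    for url, text in pages.items():
        blocks = split_blocks(text)
        total = sum(len(b.split()) for b in blocks)
        original = sum(len(b.split()) for b in blocks if b not in boiler)
        report[url] = {
            "original_words": original,
            "boilerplate_ratio": (total - original) / total if total else 0.0,
            "thin": original < thin_words,  # assumed threshold
        }
    return report
```

Run `audit()` over the extracted text of your pages and sort by `boilerplate_ratio` to see which groups of pages are most likely to be collecting points.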
What about things like:
• User experience
‘Human’ factors like these are mentioned a lot by Googlers talking about Panda. But how can a machine measure these?
It can’t. Not directly.
But it can measure the following two types of metrics and be confident there is a strong relationship between them and the human factors …
• Toolbar data
• Branding data
Let’s look at each of those …
Toolbar data (and all behavior metrics)
Toolbar data (and all behavior metrics) are all the factors that Google can measure via its toolbar and whatever other data sources it might choose from its vast store of information. Factors like:
• Time spent on site when clicking through from a SERP
• Bounce rate
• Return visits
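You can't see Google's toolbar data, but you can work out the same three metrics from your own analytics. The sketch below assumes a hypothetical list of visit records with `visitor`, `pages_viewed` and `seconds` fields. Those field names are mine, so map them to whatever your analytics export provides.

```python
from collections import Counter


def behavior_metrics(visits):
    """Compute bounce rate, average visit time and return-visitor rate.

    visits is a list of dicts with the (assumed) keys
    'visitor', 'pages_viewed' and 'seconds'.
    """
    if not visits:
        return {"bounce_rate": 0.0, "avg_seconds": 0.0, "return_rate": 0.0}

    # A bounce is a visit that viewed a single page.
    bounces = sum(1 for v in visits if v["pages_viewed"] <= 1)
    total_seconds = sum(v["seconds"] for v in visits)

    # A returning visitor is one with more than one recorded visit.
    visits_per_visitor = Counter(v["visitor"] for v in visits)
    returning = sum(1 for n in visits_per_visitor.values() if n > 1)

    return {
        "bounce_rate": bounces / len(visits),
        "avg_seconds": total_seconds / len(visits),
        "return_rate": returning / len(visits_per_visitor),
    }
```

Track these numbers for visits that arrive from search, because that is the traffic this hypothesis says is being judged.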
(On those other data sources: you might believe Google that it was an accident its Street View cars downloaded the web browsing data of the wifi signals they passed. But it wasn’t a coincidence because Google is dedicated to gathering as much information as possible about everyone.)
That Google uses toolbar data is now clearly documented. For evidence, read SEO veteran Mike Grehan:
“Andrew Tomkins, Engineering Director at Google and former Chief Scientist at Yahoo, made it quite clear at SES New York in 2008 that, in his opinion, whereas anchor text had always been the 'workhorse' of search, the strongest signal now comes from the toolbar."
Also, when Panda 2.0 rolled out, Google publicly stated that user feedback signals were now being used. These signals come from the 'Block results from this domain' links that now appear next to search results if you quickly bounce back to Google after a search.
Branding data
Google engineers say they don’t favor big brands.
Eric Schmidt, Google chairman, and ex-CEO, seems to disagree. When talking about the “cesspool” (his description) that is the internet (ie, spammy results on Google), he says "Brands are the solution, not the problem".
These two viewpoints are easily reconciled. Google favors sites with a wide range of factors that big brands just happen to display. Factors like:
• Brand mentions and links in social media.
• Brand mentions and links on highly trusted sites.
• Searches with brand and domain name.
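The last of those factors is one you can estimate for your own site. This is a rough sketch that assumes you have a keyword report mapping each search phrase to its visits. The function name and the simple substring match are my assumptions, and I have no idea how Google itself counts brand searches.

```python
def branded_share(keyword_visits, brand_terms):
    """Return the share of search visits whose keyword mentions the brand.

    keyword_visits maps a search phrase to its number of visits.
    brand_terms lists brand and domain name variants to look for.
    """
    total = sum(keyword_visits.values())
    if total == 0:
        return 0.0
    terms = [t.lower() for t in brand_terms]
    branded = sum(
        visits
        for keyword, visits in keyword_visits.items()
        if any(term in keyword.lower() for term in terms)
    )
    return branded / total
```

For example, `branded_share({"mybrand widgets": 30, "free widgets": 60, "mybrand.com": 10}, ["mybrand"])` gives 0.4. A very low share on a site with lots of search traffic is the pattern I describe later in this article.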
It’s important to note that the definition of a big brand is relative to the keyword and the market it is part of.
So a big brand in a small market might not be a big company.
But a big brand in a big market will be a large company.
In practice, this means if you’re a small company it’s going to be hard to do well in a big market.
This is a bit like small shops getting kept out of the shopping mall by high rents.
The video below gives a humorous take on brands and Google from Aaron Wall …
The Panda hypothesis
So my Panda hypothesis is that the Panda algo measures quality with these three types of factors:
• Thin content
• Toolbar data
• Branding data
And some quality sites get Panda slapped because …
First the toolbar data fail …
Our quality sites are failing on the toolbar data because their SEO was too good.
Those quality sites have (had) fantastically successful SEO thanks to their natural links and the depth of their content.
This meant they got lots of visits from searchers using popular keywords.
But the pages contain long detailed experts’ content.
This is not the content those ‘populist searchers’ want.
Result: high bounce rates, low average visit times and low average return rates. Ie, poor toolbar data.
Lots of painful Panda points.
Now the branding fail …
These sites all have almost no brand development. Branding just wasn’t part of their plans.
There was no association of the brand with the content.
No marketing of the brand.
And they are small companies so their brands don’t get looked after and mentioned by national media (who can only exist if they look after their big brand advertisers).
They weren’t that sophisticated. They just produced quality content.
Or they came from a direct marketing background that taught them the message is what delivers response and not the brand.
What’s more this was working very well for them. So why change?
And it was exactly what Google recommended – quality content. So why change?
Result: low brand and domain name searches and mentions.
Lots more painful Panda points.
That’s the theory.
I’m more confident of the toolbar part of it than the branding. Both are considered by Google but I can’t be sure they are considered by Panda.
The trouble with Panda
The problem with Panda is that it’s a hack to fix a broken algorithm.
Large content farm sites containing crap were appearing at the top of Google’s SERPs. The sites were not breaking any rules.
Google’s algorithm couldn’t stop them.
Specialist quality sites like those talked about in this article were ranking top for populist searches they weren’t built for.
Google’s algo was not working well enough.
Searchers were finding inappropriate and poor sites and this is a threat to Google’s success.
Aaron Wall goes further than this and says that the content farm industry publicly made Google look stupid.
Google had to act.
But it couldn’t fix the algo. So it bolted on Panda.
Panda is not part of the main algo. Panda is run every 4-7 weeks. Panda does not change the algo so that better results appear for every keyword search.
Instead any site that falls foul of Panda gets crudely wiped out by a site-wide, all-keyword handicap. That handicap can’t be lifted until the next time Panda has been run.
And it probably won’t be lifted then because very few sites have had a Panda slap lifted.
Why have almost no sites escaped Panda?
I know of only one site that I can confidently say has escaped Panda. A few other cases are reliably reported.
This is telling us something. It suggests that you can’t just change a few things on your site, like no-indexing thin content, removing excessive advertising and tidying up design.
It’s consistent with the case for toolbar and/or branding data being part of Panda because you can’t change this data quickly.
It takes time to change your toolbar data. Especially if there is a mismatch between many of the keywords being used to reach your site and your site’s content.
You’ve either got to offer those searchers different content or go out of your way to un-optimize your site for those keywords.
It also takes time to build your brand online. That can either take years or a lot of money.
What to do (anti-Panda tactics)
The following points are for quality sites that have been Panda slapped so I’m assuming you don’t have a crappy site with scraped, dupe, useless or spammy content.
• Use a robots meta tag to noindex any large groups of thin content pages. I know other sites get away with them and yours may be very useful. But you’ve been Panda slapped so you’ve got to get rid of every possible Panda point you have.
• Look for significant content from feeds. Remove it or noindex the pages it’s on.
• Make sure you don’t have too many ads on a page.
• Reduce ‘boiler plate content’ (the same content on multiple pages).
• Identify groups of pages (two or more) that are optimized for very similar keywords. Merge the content and 301 redirect from one to the other, or change what all but one of them are optimized for.
• Use bounce rates and ‘average time on site’ on keyword reports to find the page and keyword combinations that aren’t delivering good data.
You’re looking for mismatches between keywords used to reach a page and the content of that page. When found, either:
-- Change page content to make those searchers happy, or
-- De-optimize for those keywords. Eg, optimize for others.
• Make sure your design and user experience are clear and up to date.
• Make sure users can easily see some interesting related content when they arrive on each page. This might be significant categories, links to your best content and even irresistible free marketing offers.
• Make sure you have ‘best content’ to tempt users.
• Show images and videos ‘above the fold’ (ie, they can be seen without scrolling). Give images captions.
• Show videos where you can.
• Craft your headlines so that readers want to read the following content.
• Craft your standfirst (first paragraph) so that users want to read on for details.
• Give users clear share buttons for Twitter, Facebook, StumbleUpon and Google+.
• Invite discussion with comments.
• Use your brand name in your marketing and public relations.
• Use your brand name in content headlines when this can be done smoothly. For example, your ‘free widget’ becomes the ‘Mybrand free widget’.
• Use your brand in your page title tags and description tags.
• Build a network of bloggers, media journalists and social media users. Interact with them.
• Build a branded email list for a free value-packed newsletter that’s named after your brand.
• Promote your new content on your free newsletter and to your media, blogging and social media networks.
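The bounce rate and time-on-site check in the list above is easy to script once you have exported a keyword report. Here is a minimal sketch, assuming each row holds a landing page, a keyword, visits, bounce rate and average seconds on site. The field names and the three thresholds are my own starting points, not values from Google, so tune them for your site.

```python
def find_mismatches(rows, max_bounce=0.7, min_seconds=30, min_visits=50):
    """Flag page and keyword combinations that deliver poor behavior data.

    rows is a list of dicts with the (assumed) keys 'page', 'keyword',
    'visits', 'bounce_rate' (0 to 1) and 'avg_seconds'.
    Combinations with few visits are ignored as too noisy to judge.
    """
    flagged = [
        row
        for row in rows
        if row["visits"] >= min_visits
        and row["bounce_rate"] > max_bounce
        and row["avg_seconds"] < min_seconds
    ]
    # Biggest traffic first: these do the most damage to site-wide averages.
    return sorted(flagged, key=lambda row: row["visits"], reverse=True)
```

Each flagged row is a candidate for one of the two fixes: change the page content for those searchers, or de-optimize the page for that keyword.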
Also see the Google Panda update survival guide.
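If you noindex groups of thin pages, it is worth checking that the tag really is on them. A page is excluded when its head contains a tag like `<meta name="robots" content="noindex">`. The sketch below uses Python's standard `html.parser` to look for that tag in a page's HTML. The class and function names are hypothetical, and it does not check the `X-Robots-Tag` HTTP header or robots.txt.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            # content may hold several directives, eg "noindex, follow"
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def is_noindexed(html):
    """Return True if the HTML carries a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives
```

Feed it the HTML of each page you meant to remove and list any page where `is_noindexed()` returns False.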
Lessons for us all
That Google Panda includes toolbar and branding data is a hypothesis (that’s another word for a guess).
But even if they aren’t considered at all by Panda, they are considered by the main algo.
So I still strongly recommend you consider them in your SEO planning.
To become a leader in your niche’s SERPs you have to be a leading brand in your niche. That’s branding.
To stay a leader, visitors from SERPs are going to have to want to stay on your site and come back. That’s toolbar.
Mentions and links (I’ll call these ‘sharing’) in social media (Twitter, Facebook, G+, StumbleUpon, etc) are part of both branding and toolbar …
Your brand needs to be shared (branding) and traffic needs to be seen moving to and from social media (toolbar).
Branding and toolbar data are a significant part of a New SEO.