Hiding in the Shadows of Penguin
Let me start out this post by saying that I am about to describe what I believe to be a case study in negative SEO. However I don’t have the “proof in the pudding,” “the smoking gun.” In other words I can’t tie the activity I’m about to describe to a perp (crime show talk for a “perpetrator”). This is precisely the world Google ushered in with Penguin and its penchant not merely for neutralizing link spam, but actively penalizing it. It’s a world made possible by Google’s black box decision-making process, the clever villainy of technically savvy and mercilessly opportunistic scumbags, and the apathy of people who create safe havens for them to operate (Google appears to be one of those as well).
Nevertheless, read the post and see if my conclusions are reasonable. If they aren’t, tell me why not. On the other hand, if you agree with me, agree to rant and rail against Google’s wrongheaded practice of penalizing bad links with no effort to determine the reasons they were created.
A word of caution is in order. This article is intended for SEO professionals at the intermediate to advanced level of their profession, and therefore I don’t take time to explain a lot of the basics. However if you are interested in studying a very interesting case, regardless of your experience with SEO, it might be worth reading it just to get a feel for the topic.
The Unlucky Target
The website that is being targeted belongs to one of my clients and offers workplace training in person or online. For the online training they are, in essence, an affiliate marketer, although they don’t fit the classic affiliate marketer profile. In the first place, they are bona fide experts in the field where they are offering training. They don’t seek opportunities to market just anything, but focus on training they know about and provide in person. The competition in their space is fierce. And a lot of it is spammy, even in this post-Panda, post-Penguin, post-Pirate, post-Payday-loans, post-Pigeon age (Hummingbird doesn’t really evoke a Penalty, so I guess that’s why the name of that update doesn’t start with a P – yet I digress).
This client has been hit several times by Penguin updates. They really did have a spammy back link profile, dating back years and built by a previous owner of the company. As their SEO consultants, hired after the fact, we’ve invested countless hours identifying spammy links, requesting removals, and updating their disavows. At first it was working. At least until October 17th 2014 and the release of Penguin 3.0. Here’s what happened then:
Seeing this site get slammed in October of 2014 was particularly heartbreaking since the client had invested tens of thousands of dollars in trying to comply with Google’s new age of reason. We determined that there must still be a problem in the backlink profile of the site and began taking yet another look at the site.
The Evidence Was Planted With Care, But it Was a Frame Job
If you run a cursory check on Majestic.com for domains linking to this site, you’ll see a lot of suspicious stuff like this:
Notice the referring domains indicated above in the green outline. When we first investigated these domains last year we found that they all had a common trait: they had been virtually abandoned by their owners, and they had been hacked. They also had a common manner of linking to my client’s website, which I describe below when I talk about the “Last Man Standing.”
Of the 5 top offenders shown in the screenshot, 4 are effectively offline as of this writing, taken down by their site owners, but the all-time top offender, coopercomputers.com, is still up and running. Let’s take a look at how this game is played.
Last Man Standing
The last online site showing tens of thousands of backlinks to my client’s website is coopercomputers.com. If you go to the site, it looks like your typical abandoned website: broken images, text bragging about having a fax line…this is old stuff. They even offer to create presentations by “VCR Tape.” Exciting back then, now not so much.
Digging Beneath the Surface: Source Code to the Rescue
So I examine the home page for a link to my client’s site, but nothing comes up. Nothing shows in the source code either. But obviously Majestic is picking up on tens of thousands of links, so I do a site: search using Google to see other pages Google might have crawled, and here is what I get:
There’s something here that obviously doesn’t match the profile of a small time computer guy’s website. Let’s pick one of these pages and see what we get. And here’s the top of the page. Oh yeah, this is classy stuff alright:
Again, you won’t find any obvious link to another website on this page. For that you’ll have to look at the source code of the page. So do a quick Ctrl+U (on a PC) and you’ll see a dense mass of hidden text, with link after spammy link embedded in it. Note the highlighted portion. Apart from my obscuring the client’s identity for obvious reasons, you’ll see that I have finally located their link.
This exact method is used to embed links on thousands and thousands of pages residing in their slimy glory on the CooperComputers.com website.
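If you have to do this check on more than a handful of pages, reading source by eye gets old fast. Here is a minimal Python sketch of the same idea: parse a page's HTML and list every link to a target domain, flagging whether it sits inside a hidden element. It assumes the text is hidden with an inline style (display:none or visibility:hidden); I am not claiming that is how this particular hack hides its links, and links hidden through a stylesheet class or off-screen positioning would show up as not hidden. The class name and the example.com domain are placeholders of mine.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Elements that never have a closing tag, so they must not go on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class HiddenLinkFinder(HTMLParser):
    """Collect links to target_domain and note whether each one is
    nested inside an element hidden by an inline style (assumption)."""

    def __init__(self, target_domain):
        super().__init__()
        self.target_domain = target_domain.lower()
        self.stack = []      # one bool per open element: hidden by inline style?
        self.current = None  # link being read: [href, text parts, hidden flag]
        self.links = []      # finished (href, anchor_text, hidden) tuples

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        style = (attrs.get("style") or "").replace(" ", "").lower()
        hidden = "display:none" in style or "visibility:hidden" in style
        if tag not in VOID_TAGS:
            self.stack.append(hidden)
        if tag == "a":
            host = urlparse(attrs.get("href") or "").netloc.lower()
            if host == self.target_domain or host.endswith("." + self.target_domain):
                # The link is hidden if it or any ancestor is hidden.
                self.current = [attrs["href"], [], any(self.stack)]

    def handle_data(self, data):
        if self.current is not None:
            self.current[1].append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.current is not None:
            href, parts, hidden = self.current
            anchor_text = " ".join("".join(parts).split())
            self.links.append((href, anchor_text, hidden))
            self.current = None
        if tag not in VOID_TAGS and self.stack:
            self.stack.pop()
```

To use it, create HiddenLinkFinder("example.com"), pass the page source to its feed() method, and read the links attribute. Badly nested HTML will throw off the hidden flag, so treat the output as a lead to verify by hand, not as a verdict.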
Commercial Link Text That’s Almost Comical
My client has never sold anything that could be monetized by “legoland logo florida,” but doesn’t it sound nicely like commercial anchor text? These links were never created by the owner of the website they point at, nor by any SEO consultant they have ever hired. Additionally, note the href attribute in the HTML. It points to a page that has never existed and never would, because it’s in the /wp-admin/ folder of the site. This folder is never used for public web pages, and in fact a default WordPress installation tells search engines not to crawl it with a Disallow rule in robots.txt. This would be crazy behavior if it was a legitimate link spammer (did I just combine “legitimate” with “link spammer”? Yikes!).
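The robots.txt point is easy to verify for any URL. The sketch below uses Python's standard urllib.robotparser to test whether a crawler may fetch a given address. The rules in ROBOTS_TXT are an assumption on my part, written to resemble the WordPress default; substitute the real robots.txt of the site you are checking. The function name and the example.com URLs are placeholders.

```python
from urllib.robotparser import RobotFileParser

# Assumed rules resembling a default WordPress robots.txt (illustration only).
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
"""


def is_crawlable(url, robots_txt=ROBOTS_TXT, user_agent="Googlebot"):
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A made-up spam target inside /wp-admin/ versus an ordinary public page.
    print(is_crawlable("http://example.com/wp-admin/legoland-logo-florida.html"))
    print(is_crawlable("http://example.com/courses/"))
```

Keep in mind that a Disallow rule only stops crawling of the target page. It says nothing about whether the link pointing at it gets counted, which is exactly the question I can't answer from outside the black box.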
Here’s a list of the commercial-sounding anchor text this domain has used to point thousands of links to non-existent pages on my client’s site:
- Amber Rose and Kanye West kissing
- Benelli Nova shotguns for sale
- Moshi monsters Katsuma purple
- Countdown girl Rachel Riley
- Flamenco origins
- God of war 3 ps2 download
- Simple basketball court layout
- And more
It’s time to start speculating about why these links exist on CooperComputers.com by the tens of thousands.
Can it be that the site owner created them on their own? Well, if they did, what would be the motive? None of these links generate revenue. In the case of links pointed at my client’s domain, they point to a non-existent page in a directory that’s not even crawled.
Can it be that my client had a former SEO firm that created these links? If so, what would be the motive of that firm? (Let me add an additional detail: thousands of new links at this domain are being discovered every month by Majestic, so this is not something that only happened in the past, it’s ongoing, and my client doesn’t have any other SEO consultants working for them.)
Can it be a random programming act? Anyone who’s done programming will tell you that this is unreasonable.
What are we left with? Well, notice how the links bear a resemblance to a link spam network. Here are the characteristics:
- Commercial sounding anchor text (as seen from the examples above)
- Extensive repetition of anchor text thousands of times
- Link velocity (rate at which links are created) artificially high and disproportionate to site activity
- Hidden text
- Blocks of links, as if the site were selling links to any and all bidders
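The repetition and velocity items in that list can be measured if your link tool lets you export backlinks with anchor text and a first-seen date. Here is a rough Python sketch that tallies both. The (anchor text, date) row layout is hypothetical, so adapt it to whatever your export actually contains, and the function name is mine.

```python
from collections import Counter
from datetime import date


def summarize_backlinks(rows):
    """Tally anchor-text repetition and new links per month.

    rows: iterable of (anchor_text, first_seen) pairs, where first_seen
    is a datetime.date. This layout is an assumption, not a tool's format.
    """
    rows = list(rows)  # allow generators, since we iterate twice
    anchors = Counter(anchor.strip().lower() for anchor, _ in rows)
    per_month = Counter(seen.strftime("%Y-%m") for _, seen in rows)
    return anchors, per_month


if __name__ == "__main__":
    sample = [  # made-up rows for illustration
        ("Flamenco origins", date(2015, 1, 3)),
        ("flamenco origins", date(2015, 1, 20)),
        ("Benelli Nova shotguns for sale", date(2015, 2, 1)),
    ]
    anchors, per_month = summarize_backlinks(sample)
    print(anchors.most_common(5))     # most repeated anchor texts
    print(sorted(per_month.items()))  # new links discovered per month
```

A handful of anchors repeated thousands of times, arriving in monthly bursts from a site nobody maintains, is the pattern to look for. What counts as "too many" is a judgment call; I don't know of any published threshold.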
So this bears a resemblance to link spam, but at the same time the links target non-existent pages in the wp-admin directory, perhaps in an effort to disguise them (although I’m a bit fuzzy on how this would disguise the links from site webmasters – and if you have any ideas on this last piece of the puzzle, please tell me in the comments below).
Could it be that someone hacked into a seemingly abandoned site in order to create the appearance of link spam targeted at sites such as my client’s? And was it paid for by their competitors to damage them in search? Does this type of thing really happen? If you want to see a hard example of a company offering to create just such links to destroy companies, see this recent post on the SEW blog by Marcela de Vivo, which includes a screen capture of just such a solicitation.
Checking for Signs of Malicious Hackers
So let’s look for the smoking gun. I often will run a page through sitecheck.sucuri.net if I suspect that a site has been cracked, and so I ran the home page of CooperComputers.com through Sucuri. Guess what! It came back clean (see below).
The thing is, these villains don’t really want to have their work discovered. They know that most people will just check the home page of a site and if it’s clean, they will decide the whole site is clean. Having discovered this, often hackers will leave the home page alone and go for the interior pages. Therefore I decided to run some interior pages of CooperComputers.com through Sucuri. Hot dog, but we have an immediate winner.
The first interior page I ran through Sucuri came back with exactly what I was looking for: signs of a hacker. Sucuri was even good enough to identify it as “SEO Spam.”
Is this harming my clients in search? I think there’s a strong possibility it is. Naturally we have disavowed the whole coopercomputers.com domain, however might it be that Google doesn’t care? Also might it be that Google, in seeing that links are still being created by the hundreds, is willing to disregard the disavow because it seems to be ongoing link spam activity? We might never know the answer because Google keeps its link spam logic in a tightly guarded black box (don’t get me going on the insanity of this).
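For anyone who hasn't built a disavow file before, the format is plain text with one entry per line: either a full URL or a whole domain written as domain:example.com, with lines starting with # treated as comments. The Python sketch below collapses a list of offending URLs into domain-level entries. Stripping a leading "www." is my own choice so that each site appears once, and the function name is made up for this example.

```python
from urllib.parse import urlparse


def build_disavow(urls, note=""):
    """Build disavow-file text with one domain: line per linking host."""
    hosts = set()
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]  # assumption: treat www and bare host as one site
        if host:
            hosts.add(host)
    lines = [f"# {note}"] if note else []
    lines += [f"domain:{host}" for host in sorted(hosts)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    spam_urls = [  # placeholder paths, not real pages
        "http://www.coopercomputers.com/page-one.html",
        "http://coopercomputers.com/page-two.html",
    ]
    print(build_disavow(spam_urls, note="hacked site, hidden spam links"), end="")
```

Disavowing at the domain level makes sense here because new spam pages keep appearing on the same site, and a list of individual URLs would be out of date within a month.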
Apathy Becomes the Main Obstacle to a Clean Up
I have contacted the owner of the website that has been hacked, who seems to be an older gentleman who is convinced that all of my crazy talk of negative SEO (even though I’ve described it like he was 5 years old) is designed to con him out of his last dollar.
The site is hosted by Network Solutions and I’ve reported this whole issue to their abuse@ email as hacked, but good old Network Solutions support, always eager for one less thing to do, ran a Sucuri check on the home page (as I also did, you’ll recall), and told me the site is clean. A clear reply with links and screen captures proving that this is not the case has so far been completely ignored.
If I don’t go completely crazy with this, I will modify this post with updates about our efforts to clean up this site. Let me know in the comments if you’ve seen similar activity and agree with my conclusions. If you disagree with my conclusions, feel free to tell me that as well, although please, keep things civil, shall we?
In the meantime, my client has lost half their business and fallen into red ink. They might have to close their doors as a result of this, so take this lesson very, very seriously (Hey, Googlers, is anyone listening??).
I hope by the time you read this we’ve convinced the owner of the site to take it down or clean it up, however even if they do I’ll leave this post in place in case it’s helpful to someone else encountering the same shadowy pattern.
I am noticing more and more that sites are getting hacked. I have had several clients get hacked in just this manner. I am very diligent about getting firewalls and monitoring set up on websites I manage now. I haven’t used Sucuri before, but will definitely check it out!
Sucuri is also an affordable choice for a cleanup. Some of these attacks are incredibly subtle and very difficult to clean up, even for a seasoned WordPress developer. Sucuri has always been successful in cleaning up intrusions. I’m a big fan of their services (and receive no incentive for saying so, btw).
I’m sorry to hear this happened to your client. It’s sad to think that someone (a competitor or anyone for that matter) would intentionally sabotage someone’s business this way. Hope you can help remove the negative SEO in time to save your client’s business.
Thanks for the well wishes, Krissi. We’re doing our best.
WOW and scary! I sure hope you can solve this. Never would I have thought of these things happening. Are there any wordpress apps that can be purchased and installed onto sites to help w/ this?
Hi Leona. Since this is mostly a problem that occurs off your own site and on another site, there’s very little that can be done in terms of WordPress plugins (although to keep your site from being used in an attack on someone ELSE it’s always good to have security plugins installed). Sucuri offers a couple of choices that we have used, but there are lots of other options. Otherwise the best practice is to keep regular tabs on links coming in to your site. For example, we use Cognitive SEO, which reports on new links discovered, and just this week an entirely new website surfaced with the same kinds of links on it. I hope this information helps.
Wow. I’m just starting to learn about negative SEO, so this was a big eye opener. So sorry you had to go through this, but thank you so much for sharing.
Goodness, that definitely did baffle me more than a little. Thank goodness there are people out there who understand this stuff.
This blog post definitely gave some food for thought about the changes and how it can impact businesses.
I mean, to consider that every one is trying so hard to comply and do things the right way, without getting ‘dinged’ yet still finding such a wreck to their traffic.
I know that has to be frustrating.
This is such an interesting article and case that you’ve presented. I appreciate the insights.
Been hearing a lot lately about websites being under attack and unfortunately not all of them can be recovered. We live in a world where technology advances come with many good things and many bad things at the same time.
You’re right — this is well above my comprehension level but I get enough of it to see that this is horribly wrong. I hope you’re able to get this remedied, although with the giants like Google, it’s hard to be heard. Good luck!
Thanks for the well wishes Jackie. Indeed it is (hard to be heard, that is).
I am not a social media professional and must admit I didn’t read this carefully. I have seen reports that it is happening. Sad state of affairs. Like robberies & thefts in real world.
Thanks for sharing a detailed explanation. I haven’t had this problem with any of my clients, though I do get all those ghost spam referral links and that drives me crazy. Hope you are able to get to the bottom of it.. good luck with NS too. urgh
The referrer spam is getting to be insane. Here’s a good article on filtering it that I found useful: http://www.ohow.co/what-is-referrer-spam-how-stop-it-guide/
This is way over my head. Yet, it’s good to read about what to be on the look-out for –and who to call when we need more help.
Thanks for diving in Sharon. Admittedly this post is a bit on the technical side, but I usually don’t get bogged down too much in that, so check back for other posts that are less “geek” and more “business.”
Wrote out a long comment and your system canned it spam – pretty hilarious because all I did was talk about your client’s strategy and all of his other sites.
1. A spam hack is not the same as negative SEO.
2. Links that go to pages that 404 don’t hurt you. At all.
3. Penguin doesn’t target short term, hacked links, irrelevant links. Those are easy to disregard algorithmically.
4. Your client does have long-standing unnatural links meant to boost their ranking – most coming from other sites they own. THESE ARE THE ISSUE. (samples in my post that your system deleted, but they are easy to find).
Your client has owned the site for quite some time it appears (his G+ profile + whois data). The bad links fall within his purview.
The real problem are links he indeed built that are live AND the fact that he has over 200 related domains – many live (just Google his phone number). Those hacked links you found are not likely a ripple in the pond of issues he has and it isn’t correct to tell people they are an issue and that they are negative SEO.
Sorry the system ate your comments. It’s WordPress combined with Akismet, usually a pretty good combination. I particularly regret it because I would love to have more details for some of your statements. Nevertheless thanks for taking the time not only to read the article but to respond in depth not only once, but twice. I very much value your feedback. So I really hope you don’t mind if I take a moment to disagree on a few points.
As I acknowledged in the article, I don’t have “proof” that this is negative SEO. Furthermore, even if it is, I don’t have proof that this is what is destroying the site’s authority in search. But isn’t this a part of my complaint? The “black box” approach leaves us guessing. This is perhaps my primary dissatisfaction in all of this.
From reviewing your profile I realize that you have connections at Google, which I do not, and that your background in search is very impressive. Therefore it’s only reluctantly that I tell you I’m not completely persuaded by your comments. I certainly don’t mean to be argumentative by what I’m about to write, but a lot of the statements you make are basically assertions without attribution (perhaps that was in the post that the system ate). I am NOT saying they are incorrect, but many of them don’t resonate with me.
It is true that the client’s network of websites was put together in a way that obviously violates all of the prohibitions against scraped content, artificially created content and artificial link networks (I’ve reviewed Google’s internal Search Quality Rating Guidelines from 2014, so I know what Google does and doesn’t want to see). What I can tell you is that my client purchased this network of sites in late 2011, before Panda was rolled out. He has been active in workplace safety training far longer, but not with this site or this network of sites. The originator of all of these sites created them and then sold them to my client when they were extremely strong in search results. In other words, Google “rewarded” all of these sites with great visibility and sales at one time. The tone of your comment sounds to me basically that “he had it coming,” (You said: “The real problem are links he indeed built”) unless I misread it, and I could not more strongly disagree.
There’s too much history to go into here, but my client, the exact opposite of a search professional or even an internet marketer, made a purchase based on one thing and one thing only: Google was displaying the websites he purchased very strongly in search; he was a safety guy, he thought he was buying an online safety business. Simple, no? Why should he be blamed for making a purchase based on that? And yet he has a large investment at risk because of the collapse of said business, which is directly a result of the collapse in his search visibility. Sorry I write with passion about this, but I feel for these people and it bothers me when I hear a tone that sounds as if a person in this position has done something vaguely unethical to bring disaster on themselves. Moreover, this site owner has expended vast amounts of time and money trying to remediate the problems created by the previous owner of the company, as I describe below ONLY IN PART.
With regard to all the sites that you have identified that are associated with the client’s phone number, unless we’ve missed something (always a possibility, I admit) these sites have no links pointed at the domain this blog post is about, nor links going the other way. All linking relationships with the target domain were severed ages ago, which was a huge project in and of itself. So are you asserting that Google will trace an association through a registrant’s phone number, even when no links exist? This is the first I’ve heard of that, but if it’s true I count it as extremely valuable information. Do you have a source? Do you have inside knowledge that I don’t?
When you say that the artificial links are easy to find, I’m assuming you’re using a tool like Majestic, Cognitive, ahrefs, Moz, etc., and if I’m incorrect I’d love to hear what you are using. If you are using one of those tools then I suggest caution in your statements is in order because those tools leave out a crucial part of the puzzle. To explain: we have done extensive link remediation on this site and some of that is impossible to see from running a 3rd party service, particularly the disavow. I pointed in the article to a number of sites where we contacted the site owners and convinced them to clean up the hack jobs on their sites, and many site owners we have contacted have pulled down or no-followed the links. For example you’ll see thousands of links from a company that is a legitimate partner of my client (in fact they are a national training provider) which appear decidedly “artificial” and only through hours of begging their very corporate, very offshore web developers were we able to get their web developers to no-follow everything. But we did (even a 3rd party tool will show you the no-follows). Again, your examples were eaten by the system, so maybe I’m missing something here.
A former colleague, Kyle B, who I believe you know, was heavily involved in the link remediation on this site 3 years ago, and can attest to the fact that we did an extensive audit with Link Research Tools, mounted a sustained effort FIRST to persuade the creators of the links to pull them down, THEN submitted those who wouldn’t via disavow to what was then known as Webmaster Tools. Bottom line: this client did not create these links, he did purchase a business where they existed, he has seen his traffic disappear, and subsequently has literally invested 10s of thousands of dollars in this effort to clean up the profile. Majestic, Moz, ahrefs…none of them will show you that. And if you were privy to the documentation we keep internally and to the disavow file I think you’d probably see that virtually all, if not all, of the links you are looking at have been addressed at a “best practices” level according to Google’s recommendations.
Additionally I’ve discovered that a lot of link indexing services, particularly Majestic, have old data. Many links that appear in Majestic’s index are no longer live, for example. However new spammy links are being added at a rapid rate (the link velocity in Cognitive is off the charts), and I can tell you again, as the person who has complete control over their SEO for 3 years now… it ain’t us. I have personally examined just about all of the links you are talking about, most likely, depending on the tool you use, and neither the client nor his previous webmaster built them.
Ashley, please believe me when I say that I could make a career out of chasing down new, spammy links that are being created for this client without their knowledge and against every effort we can exert.
When you say that links pointing at 404s don’t hurt a site “at all,” forgive me but can you point me in the direction of a statement from Google to that effect? If I were Google, and I were trying to sniff out artificial link building, why would I discount links to missing pages? Would that somehow mean that the links had NOT been built artificially?
Finally you have not given me any explanation for the motive of someone who would hack into a site and place these kinds of links. The effort I’m seeing here requires programming skill, an ability to breach basic levels of server security, and intentionality. I see no commercial reason for doing this other than negative SEO (and even if what I’ve described here is “barely a ripple,” and yet it’s being pursued as negative SEO, then it is negative SEO. You know, if it walks like a duck and quacks like a duck…etc.) If you do know of any other motive for this, I’m sincerely interested in hearing your thoughts on it. I don’t want to be harming my reputation by overlooking something obvious, so please clue me in.
Let’s just use a common sense approach here for a second. You say: “Your client has owned the site for quite some time it appears…The bad links fall within his purview.” Why would my client hack into another site to place links with the link text “Legoland Logo Florida”? I’m sure you can see that this makes no practical sense, even in the days when artificial link networks worked!
Please believe me when I say that I mean none of these comments to be interpreted as disrespect, and I’m sure you’ll be able, if you choose to sacrifice the time, to set me straight where I have spoken in error.
(Not-so-fun fact: You might also be interested to know that the originator of this network of sites that you have identified, having sold it to my client, went out and built a similar spammy site with scraped content and artificial linking and is crushing the search results right now. I can’t tell you how this sets my teeth on edge. If you would like to message me on LinkedIn I would absolutely love to give you specific details of a site that is doing it all wrong and is dominating a particular category of search results notwithstanding.)
Once again, my sincere thanks for taking the time.
Loving this discussion – thank you to Ashley and Ross for taking the time to really discuss the details of this issue! My own opinion is that in certain very competitive markets there are definitely some negative SEO tactics taking place that are flying under Google’s radar. But what I teach my students is that Google is actively working to identify these issues.
We may not be able to predict when Google will be able to catch negative SEO tactics – so in the meantime we need to be aware of them so we can proactively address situations that arise.
Hey there – first off, I can’t argue with the black box model. We don’t get to pick that and I get why Google operates the way that it does. But Google does offer tons of info and help to understand issues.
If you know that the website has serious liabilities (with the doorway sites, copied content, unnatural links), why try and find another issue? Why not FIX these issues?
At this point, it doesn’t matter when he purchased them. They are his liabilities now. If he was interested in creating a business that relied on any organic search and has an online model – he’s got to do a pinch of homework. Read the guidelines, understand what he is buying, etc. The guidelines have been largely unchanged over many, many years. Now, if he decided to buy websites that were clearly breaking guidelines because it appeared to be working in the short-term – that was a calculated risk he took.
Otherwise, if he knows nothing about search and he did no homework before taking out a huge loan he’s now going to default on…. I don’t see how that’s Google’s fault. It’s a bad business decision.
If he or you actually wanted to clean this up – you can. But it’s been sitting like this for years – you said he’s spending time and money fixing it, but why are there still so many live sites? That’d be the first thing to take care of. What are you waiting for?
There are links between some of the sites. Regardless, it is hard enough to build one really great website – why do you and your client think you can build dozens (or more) websites that rank well? Such old school unfortunate thinking that doesn’t benefit users at all.
As far as links, I use ahrefs and I checked the ones I was referencing to make sure they were live. I don’t use their scoring – I just look at the links and use my judgment. That’s what I mean by bad links are easy to find. Stop relying on tools to find what is easy to spot right in front of you. I know I can’t see your disavow – but if you think the links from the coopers example are a problem then I’m concerned your remediation wasn’t on target. Those links from the cooper site cannot do any damage at all for more than one reason! The new spammy links you’re chasing are likely not the issue at all. Google Penguin is after long-standing links built to manipulate ranking. Irrelevant, spammy links or links to 404 pages do NOTHING.
And as far as motive for other sites getting hacked… hackers. Logic not needed. I never said your client hacked another site. Your client’s site was likely temporarily hacked (they are on WordPress after all), so hacked content may have been added – then other sites were hacked to create links so value can be passed on. Happens all the time. It doesn’t matter anyway, those pages 404 now.
I’m not using any inside info here – just basic logic. Focus on building one really great website and cleaning up the links that have any potential to do harm (not ones like the cooper links) and your client would probably be doing just fine.
This isn’t negative SEO. I see nothing that would point to that at all. The client’s strategy is the problem.
If the originator is still spamming and winning – submit a spam report.
Thanks for the follow up comments. Naturally I realize that one blog post and a few follow up comments don’t really provide the details of years of work. Here’s just a bit of clarification:
I agree with you about the inadvisability of doorway sites. This client has always generated leads from them. I convinced him that they needed to go dark, and in fact they were pulled down for about 2 years, without any positive impact on the core site. They were only recently re-enabled because he was missing the leads. Therefore I do not think they are the root of his problem.
Links to the primary site used in the post? Can you give me examples? It would be incredibly useful.
Yes indeed. In addition to shutting all those doorway sites down for 2 years, the money has been invested in extensive content audits to find and replace duplicate content, link audits to find and laboriously request removal of spammy backlinks, repeated link audits because spammy backlinks continue to accumulate “magically,” creating unique content that is of high quality, creating and socially sharing infographics that address unique information needs in the industry, and so forth.
I have, more than once. It goes into the black box; no response, no action. I could show you this competitor’s site, with tons of pages of duplicate content, cross-linking from domains he owns, and so forth. I could then show you a graph of his visibility on SEMRush and it’s like a jet aircraft taking off.
I guess we’ll have to agree to disagree on that one. I do not disagree with Google keeping their algo opaque, by the way. However, the issue for me is wrapped up in “penalty.” I think a company who is being penalized has the right to an explanation. A penalty is, by definition, punitive. Punishment deserves explanation.
Everything you have mentioned above by way of recommendation is EXACTLY what I preach both to clients and to students that I train. All of the encouragement to just “make a great site” is part of my mantra. My posting of this article is merely to address a very real situation that people who are fighting this battle in the trenches see repeatedly, and it’s not only me.
Thanks for the discussion.