What is keyword cannibalization?

If you optimize your articles for similar terms, your rankings might suffer from keyword cannibalization: you’ll be ‘devouring’ your own chances to rank in Google! Especially when your site is growing, chances are your content will start competing with itself. Here, I’ll explain why keyword cannibalism can be detrimental to SEO, how you can recognize it and what to do about it.

What is keyword cannibalization?

Keyword cannibalization means that you have various blog posts or articles on your site that can rank for the same search query in Google, either because the topic they cover is too similar or because you optimized them for the same keyphrase. If you optimize posts or articles for similar search queries, they’re eating away at each other’s chances to rank. Usually, Google will only show 1 or 2 results from the same domain in the search results for a specific query. If you’re a high authority domain, you might get 3.

Why is keyword cannibalism bad for SEO?

If you cannibalize your own keywords, you’re competing with yourself for ranking in Google. Let’s say you have two posts on the exact same topic. In that case, Google can’t distinguish which article should rank highest for a certain query. In addition, important factors like backlinks and CTR get diluted over several posts instead of one. As a result, they’ll probably both rank lower. That’s why our SEO analysis will give a red bullet whenever you optimize a post for a focus keyword you’ve used before.
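To make the idea of a reused focus keyword concrete, here is a minimal sketch of such a check. It is not how the Yoast SEO analysis is implemented; the function names and the inventory of posts are hypothetical, and the only normalization assumed is lowercasing and collapsing whitespace.

```python
from collections import defaultdict


def normalize(keyphrase: str) -> str:
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(keyphrase.lower().split())


def find_reused_keyphrases(posts: dict) -> dict:
    """Return each keyphrase used by more than one post, with those posts' URLs."""
    urls_by_keyphrase = defaultdict(list)
    for url, keyphrase in posts.items():
        urls_by_keyphrase[normalize(keyphrase)].append(url)
    return {
        keyphrase: sorted(urls)
        for keyphrase, urls in urls_by_keyphrase.items()
        if len(urls) > 1
    }


# Hypothetical inventory: post URL -> focus keyphrase.
posts = {
    "/post-a/": "Does readability rank",
    "/post-b/": "does  readability rank",
    "/post-c/": "keyword research",
}
reused = find_reused_keyphrases(posts)
```

Every entry in `reused` is a keyphrase you would want to look at: either pick one post to own it, or merge the posts.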

But keyword cannibalism can also occur if you optimize posts for focus keywords that are not exactly, but almost the same. For instance, I wrote two posts about whether or not readability is a ranking factor. The post ‘Does readability rank?’ was optimized for [does readability rank], while the post ‘Readability ranks!’ was optimized for the focus keyword [readability ranking factor]. The posts had a slightly different angle but were still very similar. For Google, it is hard to figure out which of the two articles is the most important.

Update: Did you see the same article? That’s correct, by now we’ve fixed this cannibalization issue, but we’ve kept this example for the sake of illustration.

How to recognize it?

Checking whether or not your site suffers from keyword cannibalism is easy. You simply do a search for your site, for any specific keyword you suspect might have multiple results. In my case, I’ll google site:yoast.com readability ranks. The first two results are the articles I suspected to suffer from cannibalization.

Googling ‘site:domain.com "keyword"’ will give you an easy answer to the question of whether you’re suffering from keyword cannibalism. You can check your findings by typing the same keyword into Google (using a private browser or local search result checker like https://valentin.app/). Which of your pages do you see in the search results, and what position do they rank? Of course, if two of your pages for the same keyword are ranking #1 and #2, that’s not a problem. But do you see your articles, for example on positions 7 and 8? Then it’s time to sort things out!
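If you want to run this check for a whole list of keywords, you can generate the search URLs instead of typing them. The sketch below only builds the URL for you to open in a browser; the function name is made up for this example, and it assumes Google’s usual `/search?q=` URL format.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def site_search_url(domain: str, keyword: str) -> str:
    """Build a Google search URL for: site:<domain> "<keyword>"."""
    query = f'site:{domain} "{keyword}"'
    # urlencode takes care of escaping the colon, quotes and spaces.
    return "https://www.google.com/search?" + urlencode({"q": query})


url = site_search_url("yoast.com", "readability ranks")

# Decoding the URL again shows the query exactly as you would type it.
decoded_query = parse_qs(urlsplit(url).query)["q"][0]
```

Open the resulting URL in a private browser window, as described above, so that personalization doesn’t skew what you see.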

Solving keyword cannibalization

We have an extensive article written by Joost that explains how to find and fix cannibalization issues on your site. It clearly describes the four steps you should take to solve these kinds of issues:

  1. Audit your content
  2. Analyze content performance
  3. Decide which ones to keep
  4. Act: merge, delete, redirect

The first two steps will help you to decide which articles to keep and which ones to merge or delete. In many cases, the acting part will consist of combining and deleting articles, but also of improving the internal linking on your site:

Merge or combine articles

If two articles both attract the same audience and basically tell the same story, you should combine them. Rewrite the two posts into one amazing, kickass article. That’ll help your rankings (Google loves lengthy and well-written content) and solve your keyword cannibalization problem.

In fact, that’s exactly what we did with our two posts on readability being a ranking factor. In the end, you’ll delete one of the two articles and adapt the other one. And don’t forget: don’t just press the delete button; always make sure to redirect the post you delete to the one you keep! If that’s something you’re struggling with, Yoast SEO Premium can help: It makes creating redirects easy as pie!

Improve internal linking

You can help Google to figure out which article is most important, by setting up a decent internal linking structure. You should link from posts that are less important, to posts that are the most important to you. That way, Google can figure out (by following links) which ones you want to pop up highest in the search engines.

Your internal linking structure could solve a part of your keyword cannibalism problems. You should think about which article is most important to you and link from the less important long-tail articles, to your most important article. Read more about how to do this in my article about ranking with cornerstone content.

Keyword cannibalization and online shops

Now, if you have an online shop, you might be worried about all those product pages targeting similar keywords. For online shops, it makes sense that there are multiple pages for products that are alike. It’s very important to give site structure some thought in this case. A good strategy is to link back from every product page to your category page – the page you should optimize to rank. And keep an eye on old product pages that could potentially cannibalize more important pages, and delete and redirect those – Yoast SEO Premium could help make that easier with its handy redirect manager!

Keyword cannibalism will affect growing websites

As your site gets bigger, the chances of running into keyword cannibalism on your own website increase. You’ll be writing about your favorite subjects and without even knowing it, you’ll write articles that end up rather similar. That’s what happened to me too. Once in a while, you should check the keywords you want to rank for the most. Make sure to check whether you’re suffering from keyword cannibalism. You’ll probably need to make some changes in your site structure or to rewrite some articles every now and then.

Read more: Keyword research: the ultimate guide »

The post What is keyword cannibalization? appeared first on Yoast.

Find and fix keyword cannibalization in 4 steps

As your site grows, you’ll have more and more posts. Some of these posts are going to be about a similar topic. Even when you’ve always categorized it well, your content might be competing with itself: You’re suffering from keyword cannibalization. At the same time, some of your articles might get out of date. To prevent all of this, finding and fixing keyword cannibalization issues should be part of your content maintenance work.

Keyword cannibalization?

Keyword cannibalization – or content cannibalism – arises when your website has multiple articles with similar content about the same keyword. This issue mainly affects growing websites: More content means a higher chance of the creation of posts and pages that are very alike. For search engines, it’s difficult to distinguish between these similar articles. As a result, they might rank all articles on that topic lower.

Read more: What is keyword cannibalization »

How to identify and solve content cannibalism

In a lot of cases, solving keyword cannibalization is going to mean deleting and merging content. I’m going to run you through some of that maintenance work as we did it at Yoast, to show you how to do this. In particular, I’m going to show you my thinking around a cluster of keywords around keyword research.

Step 1: Audit your content

The first step in my process was finding all the content we had around keyword research. Now, most of that was simple: we have a keyword research tag, and most of the content was nicely tagged. This was also slightly shocking: we had quite a few posts about the topic.

A site: search in Google gave me the missing articles that Google considered to be about keyword research. I simply searched for site:yoast.com "keyword research" and Google gave me all the posts and pages on the site that mentioned the topic.

I had found a total of 18 articles that were either entirely devoted to keyword research or had large sections that mentioned it. Another 20 or so mentioned it in passing and linked to some of the other articles.

The reason I started auditing the content for this particular group of keywords is simple: I wanted to improve our rankings for the cluster of keywords related to keyword research. So I needed to analyze which of these pages were ranking, and which weren’t. This content maintenance turned out to be badly needed. It surely was time to find and fix possible cannibalization issues!

Step 2: Analyze the content performance

I went into Google Search Console and went to the Performance section. In that section I clicked the filter bar:

I clicked Query and then typed “keyword research” into the box like this:

performance filter: keyword research queries

This makes Google Search Console match all queries that contain the words keyword and research. This gives you two very important pieces of data:

  1. A list of the keywords your site had been shown in the search results for and the clicks and click-through rate (CTR) for those keywords;
  2. A list of the pages that were receiving all that traffic and how much traffic each of those pages received.

I started by looking at the total number of clicks we had received for all those queries and then looked at the individual pages. Something was immediately clear: three pages were getting 99% of the traffic. But I knew we had 18 articles that covered this topic. Obviously, it was time to clean up. Of course, we didn’t want to throw away any posts that were getting traffic that was not included in this bucket of traffic. So I had to check each post individually.

I removed the Query filter and used another option that’s in there: the Page filter. This allows you to filter by a group of URLs or a specific URL. On larger sites, you might be able to filter by groups of URLs. In this case, I looked at the data for each of those posts individually, which is best if you truly want to find and fix keyword cannibalization on your website.
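Once you have clicks per page for a set of queries, the “three pages get 99% of the traffic” observation is simple arithmetic. Here is a small sketch of it. The numbers and URLs are invented for illustration, and I’m assuming you have already copied the clicks per page out of Google Search Console into a plain dictionary.

```python
def click_share(clicks_by_page: dict) -> list:
    """Return (page, share of total clicks) pairs, biggest share first."""
    total = sum(clicks_by_page.values())
    ranked = sorted(clicks_by_page.items(), key=lambda item: item[1], reverse=True)
    if total == 0:
        return [(page, 0.0) for page, _ in ranked]
    return [(page, clicks / total) for page, clicks in ranked]


# Hypothetical clicks per page for queries containing "keyword research".
clicks = {
    "/ultimate-guide/": 700,
    "/what-is-keyword-research/": 200,
    "/keyword-research-tools/": 90,
    "/old-post-1/": 6,
    "/old-post-2/": 4,
}
shares = click_share(clicks)
top_three_share = sum(share for _, share in shares[:3])
```

Pages at the bottom of that list are candidates for merging or deleting, but only after you’ve checked with the Page filter that they don’t get traffic for other queries.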

Step 3: Decision time

As I went through each post in this content maintenance process, I decided what we were going to do: keep it, or delete it. If I decided we should delete it (which I did for the majority of the posts), I also decided which post we should redirect it to. The more basic posts I decided to redirect to our SEO for Beginners post: what is keyword research?. The posts about keyword research tools were redirected to our article that helps you select (and understand the value of) a keyword research tool. Most of the other ones I decided to redirect to our ultimate guide to keyword research.

For each of those posts, I evaluated whether they had sections that we needed to merge into another article. Some of those posts had paragraphs or even entire sections that could just be merged into another post.

I found one post that, while it didn’t rank for keyword research, still needed to be kept: it talked about long-tail keywords specifically. It had such a clear reach for those terms that deleting it would be a waste, so I decided to redirect the other articles about the topic to that specific article.

Step 4: Take action

Now it was time to take action! I had a list of action items: pieces of content to add to specific articles, after which the articles those pieces came from could be deleted. Using Yoast SEO Premium, it’s easy to 301 redirect a post or page when you delete it, so that process was fairly painless.

With that, we’d taken care of the 18 specific articles about the topic, and retained only 4. We still had a list of ~20 articles that mentioned the topic and linked to one of the other articles. We went through all of them and made sure each linked to one or more of the 4 remaining articles in the appropriate section.
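When you delete and redirect this many posts, it helps to keep the redirects as a simple list of old URL and new URL, and to check that list before you put it live. The sketch below is one way to do that, not what Yoast SEO Premium does internally. It assumes a plain dictionary of redirects with hypothetical URLs, and it makes sure every old URL points straight at a post you kept instead of at another redirected post.

```python
def flatten_redirects(redirects: dict) -> dict:
    """Point every old URL directly at its final destination."""
    flat = {}
    for source in redirects:
        seen = {source}
        target = redirects[source]
        # Follow the chain until we reach a URL that is not redirected itself.
        while target in redirects:
            if target in seen:
                raise ValueError(f"redirect loop involving {target}")
            seen.add(target)
            target = redirects[target]
        flat[source] = target
    return flat


# Hypothetical redirect list: deleted post -> post that replaces it.
redirects = {
    "/keyword-tips-2012/": "/keyword-basics/",
    "/keyword-basics/": "/what-is-keyword-research/",
}
flat = flatten_redirects(redirects)
```

After flattening, both old URLs in this example go directly to `/what-is-keyword-research/`, so visitors and search engines only have to follow one redirect.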

Fixing keyword cannibalization is hard work

If you’re thinking “That’s a lot of work”: yes, finding and fixing keyword cannibalization requires some serious effort. And we don’t write about just keyword research, so this is a process we have to do for quite a few terms, multiple times a year. This is a very repeatable content maintenance strategy though:

  1. Audit, so you know which content you have;
  2. Analyze, so you know how the content performs;
  3. Decide which content to keep and what to throw away;
  4. Act.

Now “all” you have to do is go through that process at least once a year for every important cluster of keywords you want your site to rank for.

Keep reading: Use your focus keyword only once »

The post Find and fix keyword cannibalization in 4 steps appeared first on Yoast.

Duplicate content: Causes and solutions

Search engines like Google have a problem – it’s called ‘duplicate content’. Duplicate content means that similar content appears at multiple locations (URLs) on the web, and as a result search engines don’t know which URL to show in the search results. This can hurt the ranking of a webpage, and the problem only gets worse when people start linking to the different versions of the same content. This article will help you to understand the various causes of duplicate content, and to find the solution to each of them.

What is duplicate content?

Duplicate content is content which is available on multiple URLs on the web. Because more than one URL shows the same content, search engines don’t know which URL to list higher in the search results. Therefore they might rank both URLs lower and give preference to other webpages.

In this article, we’ll mostly focus on the technical causes of duplicate content and their solutions. If you’d like to get a broader perspective on duplicate content and learn how it relates to copied or scraped content or even keyword cannibalization, we’d advise you to read this post: What is duplicate content.

Let’s illustrate this with an example

Duplicate content can be likened to being at a crossroads where road signs point in two different directions for the same destination: Which road should you take? To make matters worse, the final destination is different too, but only ever so slightly. As a reader, you don’t mind because you get the content you came for, but a search engine has to pick which page to show in the search results because, of course, it doesn’t want to show the same content twice.

Let’s say your article about ‘keyword x’ appears at http://www.example.com/keyword-x/ and the same content also appears at http://www.example.com/article-category/keyword-x/. This situation is not fictitious: it happens in lots of modern Content Management Systems. Then let’s say your article has been picked up by several bloggers and some of them link to the first URL, while others link to the second. This is when the search engine’s problem shows its true nature: it’s your problem. The duplicate content is your problem because those links both promote different URLs. If they were all linking to the same URL, your chances of ranking for ‘keyword x’ would be higher.

If you don’t know whether your rankings are suffering from duplicate content issues, these duplicate content discovery tools will help you find out!

Causes of duplicate content

There are dozens of reasons for duplicate content. Most of them are technical: it’s not very often that a human decides to put the same content in two different places without making clear which is the original – it feels unnatural to most of us. There are many technical reasons though and it mostly happens because developers don’t think like a browser or even a user, let alone a search engine spider – they think like a programmer. Take that article we mentioned earlier, that appears on http://www.example.com/keyword-x/ and http://www.example.com/article-category/keyword-x/. If you ask the developer, they will say it only exists once.

Misunderstanding the concept of a URL

No, that developer hasn’t gone mad, they are just speaking a different language. A CMS will probably power the website, and in that database there’s only one article, but the website’s software just allows for that same article in the database to be retrieved through several URLs. That’s because, in the eyes of the developer, the unique identifier for that article is the ID that article has in the database, not the URL. But for the search engine, the URL is the unique identifier for a piece of content. If you explain that to a developer, they will begin to get the problem. And after reading this article, you’ll even be able to provide them with a solution right away.

Session IDs

You often want to keep track of your visitors and allow them, for instance, to store items they want to buy in a shopping cart. In order to do that, you have to give them a ‘session.’ A session is a brief history of what the visitor did on your site and can contain things like the items in their shopping cart. To maintain that session as a visitor clicks from one page to another, the unique identifier for that session – called the Session ID – needs to be stored somewhere. The most common solution is to do that with cookies. However, search engines don’t usually store cookies.

At that point, some systems fall back to using Session IDs in the URL. This means that every internal link on the website gets that Session ID added to its URL, and because that Session ID is unique to that session, it creates a new URL, and therefore duplicate content.

URL parameters used for tracking and sorting

Another cause of duplicate content is using URL parameters that do not change the content of a page, for instance in tracking links. You see, to a search engine, http://www.example.com/keyword-x/ and http://www.example.com/keyword-x/?source=rss are not the same URL. The latter might allow you to track what source people came from, but it might also make it harder for you to rank well – very much an unwanted side effect!

This doesn’t just go for tracking parameters, of course. It goes for every parameter you can add to a URL that doesn’t change the vital piece of content, whether that parameter is for ‘changing the sorting on a set of products’ or for ‘showing another sidebar’: all of them cause duplicate content.
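To see why these URLs are really the same page, it helps to strip the parameters that don’t change the content and compare what’s left. This is a minimal sketch of that idea. Which parameters are safe to drop differs per site: `source` comes from the example above, while `sessionid` and `sort` are names I’m assuming for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: on this hypothetical site, these parameters never change the main content.
NON_CONTENT_PARAMS = {"source", "sessionid", "sort"}


def strip_non_content_params(url: str) -> str:
    """Remove tracking, session and sorting parameters from a URL."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() not in NON_CONTENT_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


clean = strip_non_content_params("http://www.example.com/keyword-x/?source=rss")
```

If two URLs are identical after this step, they show the same content and only one of them should end up in the search results.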

Scrapers and content syndication

Most of the reasons for duplicate content are either the ‘fault’ of you or your website. Sometimes, however, other websites use your content, with or without your consent. They don’t always link to your original article, and therefore the search engine doesn’t ‘get’ it and has to deal with yet another version of the same article. The more popular your site becomes, the more scrapers you’ll get, making this problem bigger and bigger.

Order of parameters

Another common cause is that a CMS doesn’t use nice clean URLs, but rather URLs like /?id=1&cat=2, where ID refers to the article and cat refers to the category. The URL /?cat=2&id=1 will render the same results in most website systems, but they’re completely different for a search engine.
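The usual fix is to always write parameters in one fixed order. Here is a small sketch that sorts them alphabetically; the function name is my own, and alphabetical order is just one possible convention.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def sort_query_params(url: str) -> str:
    """Rewrite a URL so its query parameters always appear in alphabetical order."""
    parts = urlsplit(url)
    pairs = sorted(parse_qsl(parts.query, keep_blank_values=True))
    return urlunsplit(parts._replace(query=urlencode(pairs)))


first = sort_query_params("/?id=1&cat=2")
second = sort_query_params("/?cat=2&id=1")
```

Both calls give the same URL, which is exactly what you want every internal link on the site to use.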

Comment pagination

In my beloved WordPress, but also in some other systems, there is an option to paginate your comments. This leads to the content being duplicated across the article URL, and the article URL + /comment-page-1/, /comment-page-2/ etc.
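If you want to spot these URLs in a crawl or a log file, a small pattern is enough. This sketch maps a paginated comment URL back to the article URL; it assumes the `/comment-page-N/` suffix shown above and a hypothetical article path.

```python
import re

# Matches a trailing /comment-page-<number>/ segment, with or without the final slash.
COMMENT_PAGE = re.compile(r"/comment-page-\d+/?$")


def strip_comment_page(path: str) -> str:
    """Return the article path for a paginated comment URL."""
    return COMMENT_PAGE.sub("/", path)


article = strip_comment_page("/my-article/comment-page-2/")
```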

Printer-friendly pages

If your content management system creates printer-friendly pages and you link to those from your article pages, Google will usually find them, unless you specifically block them. Now, ask yourself: Which version do you want Google to show? The one with your ads and peripheral content, or the one that only shows your article?

WWW vs. non-WWW

This is one of the oldest in the book, but sometimes search engines still get it wrong: WWW vs. non-WWW duplicate content, when both versions of your site are accessible. Another, less common situation but one I’ve seen as well is HTTP vs. HTTPS duplicate content, where the same content is served out over both.
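Whichever version you prefer, every other version should map onto it. The sketch below shows that mapping for a single URL. Choosing HTTPS with WWW here is only an example preference, the function name is made up, and it assumes plain hostnames without ports or login details.

```python
from urllib.parse import urlsplit, urlunsplit


def preferred_url(url: str, scheme: str = "https", use_www: bool = True) -> str:
    """Rewrite a URL to the one scheme and hostname variant you have chosen."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    bare_host = host[4:] if host.startswith("www.") else host
    host = "www." + bare_host if use_www else bare_host
    return urlunsplit(parts._replace(scheme=scheme, netloc=host))


variants = [
    "http://example.com/keyword-x/",
    "http://www.example.com/keyword-x/",
    "https://example.com/keyword-x/",
]
targets = {preferred_url(variant) for variant in variants}
```

All three variants end up at one and the same URL, and that’s the URL the other versions should redirect to.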

Conceptual solution: a ‘canonical’ URL

As we’ve already seen, the fact that several URLs lead to the same content is a problem, but it can be solved. One person who works at a publication will normally be able to tell you quite easily what the ‘correct’ URL for a certain article should be, but sometimes when you ask three people within the same company, you’ll get three different answers…

That’s a problem that needs addressing because, in the end, there can be only one (URL). That ‘correct’ URL for a piece of content is referred to as the Canonical URL by the search engines.

Ironic side note

Canonical is a term stemming from the Roman Catholic tradition, where a list of sacred books was created and accepted as genuine. They were known as the canonical Gospels of the New Testament. The irony is it took the Roman Catholic church about 300 years and numerous fights to come up with that canonical list, and they eventually chose four versions of the same story.

Identifying duplicate content issues

You might not know whether you have a duplicate content issue on your site or with your content. Using Google is one of the easiest ways to spot duplicate content.

There are several search operators that are very helpful in cases like these. If you wanted to find all the URLs on your site that contain your keyword X article, you’d type the following search phrase into Google:

site:example.com intitle:"Keyword X"

Google will then show you all pages on example.com that contain that keyword in their title. The more specific you make that intitle part of the query, the easier it is to weed out duplicate content. You can use the same method to identify duplicate content across the web. If the full title of your article was ‘Keyword X – why it is awesome’, you’d search for:

intitle:"Keyword X - why it is awesome"

And Google would give you all sites that match that title. Sometimes it’s worth even searching for one or two complete sentences from your article, as some scrapers might change the title. In some cases, when you do a search like that, Google might show a notice like this on the last page of results:

This is a sign that Google is already ‘de-duping’ the results. It’s still not good, so it’s worth clicking the link and looking at all the other results to see whether you can fix some of them.

Read more: DIY: duplicate content check »

Practical solutions for duplicate content

Once you’ve decided which URL is the canonical URL for your piece of content, you have to start a process of canonicalization (yeah I know, try saying that three times out loud fast). This means we have to tell search engines about the canonical version of a page and let them find it ASAP. There are four methods of solving the problem, in order of preference:

  1. Not creating duplicate content
  2. Redirecting duplicate content to the canonical URL
  3. Adding a canonical link element to the duplicate page
  4. Adding an HTML link from the duplicate page to the canonical page

Avoiding duplicate content

Some of the above causes for duplicate content have very simple fixes to them:

  • Are there Session IDs in your URLs?
    These can often just be disabled in your system’s settings.
  • Have you got duplicate printer friendly pages?
    These are completely unnecessary: you should just use a print style sheet.
  • Are you using comment pagination in WordPress?
    You should just disable this feature (under settings » discussion) on 99% of sites.
  • Are your parameters in a different order?
    Tell your programmer to build a script to always put parameters in the same order (this is often referred to as a URL factory).
  • Are there tracking links issues?
    In most cases, you can use hash tag based campaign tracking instead of parameter-based campaign tracking.
  • Have you got WWW vs. non-WWW issues?
    Pick one and stick with it by redirecting the one to the other. You can also set a preference in Google Webmaster Tools, but you’ll have to claim both versions of the domain name.

If your problem isn’t that easily fixed, it might still be worth putting in the effort. The goal should be to prevent duplicate content from appearing altogether, because it’s by far the best solution to the problem.

301 Redirecting duplicate content

In some cases, it’s impossible to entirely prevent the system you’re using from creating wrong URLs for content, but sometimes it is possible to redirect them. If this isn’t logical to you (which I can understand), do keep it in mind while talking to your developers. If you do get rid of some of the duplicate content issues, make sure that you redirect all the old duplicate content URLs to the proper canonical URLs.

Sometimes you don’t want to or can’t get rid of a duplicate version of an article, even when you know that it’s the wrong URL. To solve this particular issue, the search engines have introduced the canonical link element. It’s placed in the <head> section of your site, and it looks like this:

<link rel="canonical" href="http://example.com/wordpress/seo-plugin/" />

In the href section of the canonical link, you place the correct canonical URL for your article. When a search engine that supports canonical finds this link element, it performs a soft 301 redirect, transferring most of the link value gathered by that page to your canonical page.

This process is a bit slower than the 301 redirect though, so if you can just do a 301 redirect that would be preferable, as mentioned by Google’s John Mueller.
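To check which canonical URL a page actually declares, you can read the link element straight out of the HTML. Here is a minimal sketch using Python’s built-in HTML parser. The class and function names are my own, and the sample HTML simply reuses the example above.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values and attributes.get("href"):
            self.canonicals.append(attributes["href"])


def find_canonicals(html: str) -> list:
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


page = '<html><head><link rel="canonical" href="http://example.com/wordpress/seo-plugin/" /></head></html>'
canonicals = find_canonicals(page)
```

A page should give you exactly one result here. An empty list means there is no canonical, and more than one means the page is sending mixed signals.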

Keep reading: rel=canonical • What it is and how (not) to use it »

Linking back to the original content

If you can’t do any of the above, possibly because you don’t control the <head> section of the site your content appears on, adding a link back to the original article on top of or below the article is always a good idea. You might want to do this in your RSS feed by adding a link back to the article in it. Some scrapers will filter that link out, but others might leave it in. If Google encounters several links pointing to your original article, it will figure out soon enough that that’s the actual canonical version.

Conclusion: duplicate content is fixable, and should be fixed

Duplicate content happens everywhere. I have yet to encounter a site of more than 1,000 pages that hasn’t got at least a tiny duplicate content problem. It’s something you need to constantly keep an eye on, but it is fixable, and the rewards can be plentiful. Your quality content could soar in the rankings, just by getting rid of duplicate content from your site!

Read on: Rel=canonical: The ultimate guide »

The post Duplicate content: Causes and solutions appeared first on Yoast.

Ask Yoast: Meta descriptions and excerpts

When you’re running a large and busy website, it’s practical and time-saving if you can reuse some of your material. Both meta descriptions and excerpts use a brief passage to summarize the content of a web page. So, it could be handy to use the same text for both. But how do you do that? In this video, Joost explains the easiest way to reuse your text for both meta descriptions and excerpts, and whether Google approves of this reuse.

Renee Lodens sent us an email with the following question:

“Is there a way to bulk copy the Yoast SEO meta descriptions to the excerpt field? Also, is this considered duplicate content?”

Watch the video or read the transcript further down the page! 

Meta descriptions and excerpts

So, what to do if you want to save time and use the same passages for meta descriptions and excerpts?

“Well, let’s start with the first thing. It’s probably easier to do it the other way around. If you put the description that you want in the excerpt field, and then in the back end, in the Yoast SEO Titles & Meta section, you can use the excerpt short code for meta descriptions. We will automatically put your excerpt in your meta description. That’s easier. You can do it the other way around too, but then you’d have to code a bit.

Is this considered duplicate content? No, it’s not. Because they are different things used for different purposes. Your meta description will only show up in the metadata, which will not be shown on the page. And Google considers these two separate things.

So this might actually work well for you if you write really good short excerpts that fit well into your meta description.

Good luck!”

Ask Yoast

In the series Ask Yoast we answer SEO questions from followers. Need some advice about SEO? Let us help you out! Send your question to ask@yoast.com.

Read on: ‘How to create the right meta descriptions’ »

Metadata and SEO part 2: link rel metadata

In the first post of our metadata series, I discussed the meta tags in the <head> of your site. But there’s more metadata in the <head> that can influence the SEO of your site. In this second post, we’ll dive into link rel metadata. You can use link rel metadata to instruct browsers and Google, for example to point them to the AMP version of a page or to prevent duplicate content issues. The link rel tags come in a lot of flavors. I’d like to address the most important ones here.

Use rel=canonical to prevent duplicate content

Every website should use rel=canonical to prevent duplicate content and point Google to the original source of that content. rel=canonical is one of those metadata elements that has an immediate influence on your site’s SEO. If done wrong, it might ruin it. An example: we have seen sites that had the canonical of all pages pointed to the homepage. That is basically telling Google that for all the content on your website, you just want the homepage to rank. If done right, you could give props to another website for writing an article that you republished.
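The homepage mistake is easy to test for once you have collected the canonical of each page. This sketch assumes you already have that data as a dictionary of page URL and canonical URL; the URLs and the function name are hypothetical.

```python
def pages_canonicalized_to_home(canonical_by_page: dict, home_url: str) -> list:
    """List every page, other than the homepage itself, whose canonical is the homepage."""
    return sorted(
        page
        for page, canonical in canonical_by_page.items()
        if canonical == home_url and page != home_url
    )


# Hypothetical crawl result: page URL -> canonical URL found on that page.
crawl = {
    "https://example.com/": "https://example.com/",
    "https://example.com/about/": "https://example.com/",
    "https://example.com/blog/post/": "https://example.com/blog/post/",
}
suspects = pages_canonicalized_to_home(crawl, "https://example.com/")
```

Any page that shows up in `suspects` is telling Google to rank the homepage instead of itself, which is rarely what you want.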

If you want to read up on rel=canonical, please read this article: Rel=canonical: the ultimate guide.

Add rel=amphtml to point search engines to your AMP pages

In order to link a page to its AMP variant, use the rel=amphtml. AMP is a variation of your desktop page, designed for faster loading and better user experience on a mobile device. It was introduced by Google, and to be honest, we like it. It seriously improves the mobile user experience.

So be sure to set up an AMP site and link the AMP pages in your head section. If you have a WordPress site, adding AMP pages is a piece of cake. You can simply install the AMP plugin by Automattic and you’ll have AMP pages and the rel=amphtml links right after that.

If you’d like to read up about AMP, be sure to check our AMP archive.

dns-prefetch for faster loading

By telling the browser in advance about a number of locations where it can find certain files it needs to render a page, you simply make it easier and faster for the browser to load your page, or (elements from) a page you link to. If implemented right, DNS prefetching will make sure a browser knows the IP address of the site linked and is ready to show the requested page.

An example:
<link rel="dns-prefetch" href="https://cdn.yoast.com/">

Please note that if the website you are prefetching has performance issues, the speed gains might be little, or none. This could even depend on the time of day. Monitor your prefetch URLs from time to time.
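If a page loads files from several other hosts, you can generate these tags from the list of files instead of writing them by hand. A rough sketch, where the function name and the resource URLs are made up for the example:

```python
from html import escape
from urllib.parse import urlsplit


def dns_prefetch_tags(resource_urls: list, own_host: str) -> list:
    """Build one dns-prefetch link tag per external origin, in order of first use."""
    origins = []
    for url in resource_urls:
        parts = urlsplit(url)
        # Skip relative URLs and files served from the site's own host.
        if not parts.scheme or not parts.netloc or parts.netloc == own_host:
            continue
        origin = f"{parts.scheme}://{parts.netloc}/"
        if origin not in origins:
            origins.append(origin)
    return [
        f'<link rel="dns-prefetch" href="{escape(origin, quote=True)}">'
        for origin in origins
    ]


resources = [
    "https://cdn.yoast.com/style.css",
    "https://cdn.yoast.com/script.js",
    "https://yoast.com/logo.png",
    "/images/local.png",
]
tags = dns_prefetch_tags(resources, own_host="yoast.com")
```

Keep the list short: a handful of hosts your pages really depend on is more useful than a tag for every host you have ever linked to.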

What about rel=author?

Rel=author has no effect whatsoever at the moment. It hasn’t had any effect we know of for quite a while actually, as Joost already mentioned this in October of 2015. You never know what use Google might come up with for it, but for now, we’re not pushing it in our plugin. It was used to point to the author of the post, giving the article more or less authority depending on how well-known an author was. At the time, this was reflected in the search results pages as well (it’s not anymore). No need to include it in your template anymore.

Other rel elements include your stylesheets (make sure Google can use these) and icons for a variety of devices. The SEO impact of these is low or nonexistent.

Is there more?

So we discussed meta tags and link rel metadata in the <head>. Is there even more metadata that affects SEO? Yes, there is! In our next metadata post, I’ll explore social metadata, like OpenGraph and Twitter Cards. In addition to that, we’ll go into hreflang, an essential asset for site owners that serve more than one country or language with their website. Stand by for more!

Read more: ‘Metadata and SEO part 1: the head section’ »

Ask Yoast: duplicate content issues on my shop?

If you own an eCommerce site, you might wonder how to optimize your category pages and your product pages. Could you have the same content on your category page and your product pages? If you have the same content on multiple pages of your website, would Google know what to rank first? Or would it cause duplicate content issues? This Ask Yoast is about the optimization of category and product pages of your online shop. Hear what I have to say about this!

Jeroen Custers from Maastricht, the Netherlands, has emailed us, asking:

“My product pages and category pages have 99% the same description, except for the color. Although the category page gets all the links, one product page ranks. Does Google see my pages as duplicate content?”

Check out the video or read the answer below!

Duplicate content on your shop?

The answer is simple: yes. So what you should do is optimize your category page for the product, and only optimize the sub pages, the product pages, for the individual product colors. Then make sure that the category page gets all the links for that product. So improve your internal linking structure so that when you mention the product, you link to the category page and not to the specific color page underneath it.

If you improve that category structure in the right way, then that should fix it. If it doesn’t, then noindex the product pages and “canonical” all of them back to the category, so that Google really knows that the category is the main thing. That’s what you want people to land on. Most people want to see that you have more than one option.

If people search for the specific product and you have not noindexed it (so if you chose the first option), then Google should send them to the right page. So try that first. If that doesn’t work, noindex the product pages and then “canonicalize” them back to the category.
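
The fallback option comes down to two tags in the head of every color-specific product page. The Python sketch below shows what those tags could look like and how you could check a page for them. The shop URLs, the sample markup and the helper name are hypothetical, so treat it as an illustration rather than a recipe.

```python
from html.parser import HTMLParser


class HeadSignals(HTMLParser):
    """Records the robots meta directives and the canonical URL of a page."""

    def __init__(self):
        super().__init__()
        self.robots = []
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, follow".
            self.robots = [part.strip().lower() for part in content.split(",") if part.strip()]
        elif tag == "link" and "canonical" in (attributes.get("rel") or "").lower().split():
            self.canonical = attributes.get("href")


def robots_and_canonical(html):
    """Return (list of robots directives, canonical URL or None) for a page."""
    signals = HeadSignals()
    signals.feed(html)
    return signals.robots, signals.canonical


# Hypothetical product page for one color, pointing back to its category page.
PRODUCT_PAGE = """
<head>
  <title>Example chair, blue</title>
  <meta name="robots" content="noindex, follow">
  <link rel="canonical" href="https://shop.example.com/chairs/example-chair/">
</head>
"""

robots, canonical = robots_and_canonical(PRODUCT_PAGE)
print("noindex" in robots, canonical)
```

Running a check like this on a handful of product pages is a quick way to confirm that they all point to the same category URL.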

Good luck!

Ask Yoast

In the series Ask Yoast we answer SEO questions from followers. Need help with SEO? Let us help you out! Send your question to ask@yoast.com.

Read more: ‘Crafting the perfect shop category page’ »

Ask Yoast: Duplicate content on LinkedIn Pulse

Social media is not only an important part of your marketing strategy; it’s important for your SEO strategy as well. Pulse, LinkedIn’s publishing platform, is one of the many content publishing platforms out there. You can read stories and news from other publishers, and you can publish your own content. But can you publish the same blog post on Pulse as the one you post on your own site? Or should you post an excerpt and link back to your site? Does Google consider content on Pulse duplicate content? Joost answers this question in this Ask Yoast.

Guy Andefors from Stockholm in Sweden emailed us the following question:

“Can we safely republish an entire blog post on Pulse or should we post an excerpt and link back to our site?”

Check out the video or read the answer below!

LinkedIn Pulse

Read the transcript of the video here:

To be honest, I’d publish the post on your own blog first, make sure that it’s indexed in Google, and then post it on Pulse with a link underneath the posting: “This post originally appeared on…”, linking back to your blog post. If you do this, you should be okay.

That link isn’t a rel=canonical, but Google is smart enough to understand most of that and work its way through, so you should be okay. Google might still rank the LinkedIn version higher if your own domain is not that strong, because it might think that the post actually gets better interaction on LinkedIn. If that’s the case, you should think about using excerpts. Just try it a bit, see how it works for you. It really depends on how strong your own domain is and on what you want to achieve. If it works on LinkedIn, maybe leave it on LinkedIn and then make people click from LinkedIn to your site. That’s just as good for you, if it works.

Good luck!

Read more: ‘DIY: Duplicate content check’ »

Ask Yoast: importance of using excerpts

Want to know how to create attractive archive pages? And how to increase click-through rates to your posts or pages? Make sure to write short and appealing excerpts for every post or page. The excerpt should be a teaser to get people to read your post. In this Ask Yoast, Joost explains the importance of using excerpts.

This Ask Yoast is all about the following question:

“Why is it important to use the excerpt? Doesn’t Google consider this to be duplicate content?”

Check out the video or read the answer below!

The importance of using an excerpt

“The excerpt is that bit of the post, that will be shown on archive pages. So, if you write a specific excerpt for a post, then that excerpt is what shows on archive pages.”

The excerpt input field in WordPress

“Sometimes it’s also shown on your front page, if the front page of your site features your blog posts. The excerpt can actually be a very good teaser to get people to read your article.”

Blog post excerpt as shown on our homepage

“The excerpt is not considered to be duplicate content. In fact, having an excerpt for every post helps prevent duplicate content, which you would get if a long archive page showed larger parts of each post. So you should use the excerpt if you can. It’s a bit more work, because it means writing an excerpt for every post. But you should if you could. Good luck!”

Read more: ‘How to create the right metadescription’ »

DIY: Duplicate content check

Duplicate content is much-dreaded in the world of SEO. If your content lives on multiple pages on your site, or other websites, Google might get confused and won’t know what to rank first. You’ll want to prevent duplicate content as much as possible. So, what can you do, yourself? Here, I’ll explain how to perform a duplicate content check, which you should do from time to time to find copied content. Plus, some tips to avoid duplicate content in the first place. Let’s get started!

Adding a preventive snippet

In the ‘Search Appearance’ > ‘RSS’ section of our Yoast SEO plugin, we have predefined a snippet to add to your feed entries saying “This article first appeared on yourwebsite.com”. The link in this snippet makes sure that scrapers republishing your feed include a link to the original article. That already helps to prevent duplicate content issues, as Google will find that backlink to your website.

Nevertheless, if you write awesome content, your content will be duplicated. And that copy won’t always include a link to your website. All the more reason to do a duplicate content check on a regular basis.

CopyScape duplicate content checker

There are a lot of tools to find duplicate content. One of the best known duplicate content checkers is probably CopyScape.com. This tool works pretty easily: insert a link in the box on the homepage, and CopyScape will return a number of results, presented a bit like Google’s search result pages.

The results page of a CopyScape scan

You can click the results for more details and to see which parts of your text are duplicated. Let’s look at an example from our popular post on 6 common SEO mistakes, which was first published on 3 October 2017. CopyScape found that 170 words, or 9% of this post, were copied:

CopyScape highlights passages that are duplicate

In this case, the first paragraph from our article, discussing low site speed as a common SEO mistake, was copied and turned into a short blog post. CopyScape clearly highlights the text it found to be duplicate, which gives an idea of how severe the copying is. If it’s just a small percentage of the page, I wouldn’t worry. If it’s over 40% and makes up quite a large part of the other page, I would simply email the site owner and ask them to change the copied text.
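
If you’re curious how such a percentage comes about, you can approximate it yourself. The Python sketch below counts how many words of your text reappear in another text in runs of at least three consecutive words. This is my own rough approximation with the standard difflib module, not how CopyScape calculates its numbers, and the function name and the minimum run length are assumptions.

```python
from difflib import SequenceMatcher


def copied_share(original, other, min_run=3):
    """Percentage of the original's words that reappear in `other`
    in runs of at least `min_run` consecutive words."""
    original_words = original.lower().split()
    other_words = other.lower().split()
    if not original_words:
        return 0.0
    matcher = SequenceMatcher(None, original_words, other_words, autojunk=False)
    # Each matching block is a run of identical consecutive words.
    copied = sum(
        block.size for block in matcher.get_matching_blocks() if block.size >= min_run
    )
    return 100.0 * copied / len(original_words)


# Hypothetical texts, for illustration only.
mine = "low site speed is a common seo mistake that hurts your rankings"
theirs = "we think low site speed is a common seo mistake too"
print(round(copied_share(mine, theirs), 1))
```

Keep in mind that this compares the texts in order, so a copy that shuffles your paragraphs around will score lower than it deserves.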

Use the CopyScape duplicate content checker to find copied content from your website on other websites. Again, it’s one of many tools, but this one’s free and easy to use. Keep in mind, though, you won’t get unlimited scans for one website. If you want to dive a bit deeper into your duplicate content, CopyScape also offers a premium version for more insights.

Tip: Duplicate content on product pages

Using CopyScape, we frequently find that manufacturer descriptions used in online shops are duplicate. Usually, these are automatically imported into the shop’s content management system, and not just into yours. Be aware of this. I understand it’s quite a hassle to write unique descriptions for every product. But don’t your best-selling products, at the least, deserve as much? So start now and take it from there!

Siteliner internal duplicate content check

Siteliner is CopyScape’s brother that searches for internal duplicate content. So, this duplicate content checker will find duplicate content on your own site.

Internal duplicate content

Internal duplicate content, how does that happen, you ask? Well, a very common example of this is when a WordPress blog doesn’t use excerpts but shows the entire blog post on the blog’s homepage. That means that the blog post is available on at least two pages: the homepage and the post itself. And it’s probably on the category and tag overview pages as well. That’s four versions of the same article on your own website already.

Using excerpts (rather than showing the entire post) has the advantage that the excerpt always has a proper link to the post. This link will tell Google that the original content is not on that blog/category/tag page but in the post itself. We often recommend the use of excerpts.

Using Siteliner

The Siteliner duplicate content check will show you a lot of things, although the free version is limited to 250 pages and one scan every 30 days. Again, there is a premium version, but the free one will already give you a good impression. Just do a search and you’ll end up on the overview page. You’ll see the percentage of internal duplicate content at the top left. Don’t panic when you see high numbers, as this duplicate content check also considers excerpts duplicate content:

The Siteliner overview page

Simply click one of the links and check if it’s indeed the excerpt. The excerpt obviously links to the post, so if that’s the case, you’re covered.

Siteliner highlights the content it considers internal duplicate content and tells you where to find it

Sidenote on using duplicate content checkers

While Google understands what a sidebar is, CopyScape and Siteliner appear to include all text on a page in their percentage calculations. This means that the actual percentage of the duplicate content, when just looking at the main content of a page, might be higher. Please keep this in mind when you use one of these duplicate content checkers. Just a heads-up!
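
A quick calculation shows how much difference that can make. In this Python sketch, the 170 copied words come from the example above, but the word counts for the article and for the sidebar and footer are made-up numbers, and the function name is just for illustration.

```python
def duplicate_percentages(copied_words, main_content_words, other_page_words):
    """Return the copied share of the whole page and of the main content, in percent."""
    whole_page_words = main_content_words + other_page_words
    return (
        round(100 * copied_words / whole_page_words, 1),
        round(100 * copied_words / main_content_words, 1),
    )


# Hypothetical page: 1,500 words of article text plus 500 words of sidebar and footer.
of_page, of_main = duplicate_percentages(
    copied_words=170, main_content_words=1500, other_page_words=500
)
print(of_page, of_main)  # prints: 8.5 11.3
```

The more text your template adds around the article, the bigger the gap between the two numbers.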

Manual duplicate content check

CopyScape and Siteliner are nice, easy-to-use duplicate content checkers. However, if you want to see what’s duplicate according to Google, you could also just use Google itself.

If you have a certain page that you’d like to check, simply go to that page. Copy a text snippet, preferably from a section that you think might be attractive for others to copy. Let’s take a passage from our common SEO mistakes article: “If your page title is too long (currently 400 to 600 pixels), it will get cut off in Google. You don’t want potential visitors to be unable to read the full title in the SERPs.” (Note that Google only takes the first 32 words into account.) Insert the exact snippet in Google between double quotation marks like this:

Duplicate content check in Google

This search query returns ‘about 208 results’ according to Google, which is well over the 10 results CopyScape returned.
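
If you want to run this check for a number of passages, you can build the search links with a few lines of Python. This is a minimal sketch: the function name is made up, and it simply cuts the snippet off after the 32 words mentioned above before wrapping it in double quotation marks.

```python
from urllib.parse import quote_plus

MAX_QUERY_WORDS = 32  # the word limit mentioned above


def exact_match_search_url(snippet, max_words=MAX_QUERY_WORDS):
    """Build a Google search URL that looks for the snippet as an exact phrase."""
    # Drop quotation marks inside the snippet so they cannot end the phrase early.
    cleaned = snippet.replace('"', " ")
    words = cleaned.split()[:max_words]
    phrase = '"' + " ".join(words) + '"'
    return "https://www.google.com/search?q=" + quote_plus(phrase)


SNIPPET = (
    "If your page title is too long (currently 400 to 600 pixels), it will get "
    "cut off in Google. You don’t want potential visitors to be unable to read "
    "the full title in the SERPs."
)
print(exact_match_search_url(SNIPPET))
```

Open the resulting link in your browser to see which pages contain the exact passage.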

Check your own duplicate content

Use a duplicate content checker like CopyScape to find what has been copied from your site, and use Google to see where else on the internet this content ended up. These are simple tools that serve a higher goal: to prevent duplicate content. If you want to read more on duplicate content, start with our Duplicate content: causes and solutions article.

Read more: rel=canonical: the ultimate guide »

The post DIY: Duplicate content check appeared first on Yoast.

Ask Yoast: www and duplicate content

If content on different URLs is the same, search engines don’t know which URL to show in the search results. We call this a duplicate content issue. And it can hurt your rankings! Unfortunately, it happens more often than you’d think. Did you, for instance, ever think about the consequences of having both a www and a non-www version of your site?

At Ask Yoast, we received a question about this from Steve Blundell of Avonsci:

“Do the www and non www versions of a page create duplicate content, and if so how can I deal with it?”

Watch the answer in the video below!

www or not?

“The answer is yes, it creates duplicate content. It’s not the worst kind of duplicate content, because Google knows that these things happen, but it’s better to fix it nonetheless. The best way of fixing it is to choose one, either the www or the non-www version, and to redirect the other to it. So on Yoast.com we redirect www.yoast.com to yoast.com. We did that because we think it’s cooler and www is a bit old-fashioned. But choose whatever suits you best, redirect the other and you’re done!”
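
The redirect itself is something you set up in your web server or hosting panel, usually as a permanent (301) redirect. The logic behind it is tiny, though, as this Python sketch shows. The function name and the example path are made up, and it ignores details like port numbers.

```python
from urllib.parse import urlsplit, urlunsplit


def redirect_target(url, preferred_host="yoast.com"):
    """Return the URL to redirect to, or None if `url` is not on the alternate host."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # The alternate host is the preferred host with "www." added or removed.
    if preferred_host.startswith("www."):
        alternate_host = preferred_host[len("www."):]
    else:
        alternate_host = "www." + preferred_host
    if host != alternate_host:
        return None
    return urlunsplit(
        (parts.scheme, preferred_host, parts.path, parts.query, parts.fragment)
    )


# Hypothetical path, for illustration only.
print(redirect_target("https://www.yoast.com/some-post/?ref=feed"))
print(redirect_target("https://yoast.com/some-post/"))
```

The path and query string are kept as they are, so every www URL ends up on its exact non-www counterpart instead of on the homepage.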

Do you have a question about duplicate content, link building or copywriting? Just ask! We’ll be glad to help you out if we can. Send your SEO question to ask@yoast.com!

Read more: ‘Duplicate content: causes and solutions’ »