What is a soft 404 error and what to do about it?

If you’ve been working on your site for a while, you might have come across a soft 404 error. Google Search Console might have sent you an email notifying you of a number of soft 404 errors on your site. But what are these errors? What is ‘soft’ about a 404 error? Why do they happen and what can you do about this kind of error? Read on to find out!

What is a soft 404 error?

A soft 404 error is a confusing error, so let’s break it down. First, we’ll look at what a regular 404 is:

  • A 404 error happens when a page is not available and the server sends the correct HTTP status code, a 404 Not Found, to the browser, telling it that the page is nowhere to be found.
  • A soft 404 error happens when the server sends a 200 OK status for the requested page, but Google thinks that the page should return a 404. It may do this if the page content looks like an error page or if there’s no content at all.

So basically, pages with soft 404 errors are pages that don’t or shouldn’t exist but still exist according to the CMS. Confusing, right? If it’s confusing for you as a reader, think about how confusing it is for search engines. A soft 404 error is not a standard server status code, but a label that search engines add to help them make sense of these pages and ignore them as they see fit. These errors show up in tools like Google Search Console and you should do your best to fix them.

If the regular 404 pages return a 200 OK status, these pages still seem to exist and might end up in the search engine results page if crawled and indexed. That’s not what you want. When a search result gets a soft 404 error label it will not appear in the index.

Don’t think this never happens. It does quite a lot, actually, especially in CMSs like WordPress. There are a lot of automatically generated, totally empty and useless pages in WordPress.

Here’s a quick example. Just make a new tag in WordPress, leave it empty and visit it on your site. Open your browser’s developer tools and you’ll see that the page gets a 200 status while also displaying a big “nothing found” message.

Tadaa, you’ve created a soft 404 error.

The page is not found and yet it gives a 200 status saying everything is A-OK
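To make the idea concrete, here is a minimal Python sketch of a soft 404 check. Google doesn’t publish how it decides what counts as a soft 404, so the phrase list and the function name below are assumptions for illustration, not Google’s actual logic.

```python
# Phrases that suggest an error page. This list is an assumption for
# illustration; adjust it to the wording your own theme uses.
ERROR_PHRASES = ("nothing found", "page not found", "no results")


def looks_like_soft_404(status_code: int, body_text: str) -> bool:
    """Return True when a page answers 200 OK but reads like an error page."""
    if status_code != 200:
        return False  # a real 404 or 410 is not a *soft* 404
    text = body_text.strip().lower()
    if not text:
        return True  # 200 OK with no content at all
    return any(phrase in text for phrase in ERROR_PHRASES)
```

An empty tag archive that answers 200 with a “Nothing Found” message would be flagged by this check, while the same message served with a real 404 status would not.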

Can soft 404 errors harm SEO?

Yes, as search engines hate stumbling on dead links. They are often a sign of bad maintenance and a lack of respect for the user’s experience. In the case of soft 404 errors, these can be extra confusing for search engines because the expected result is different from the actual result. By telling the search engines that the page is real, it will get crawled and might end up in the search results.

Having faulty, empty or thin pages crawled is bad practice. Having loads of errors on your site might negatively impact your crawl efficiency, i.e. the way Google sees and crawls your site.

How can I find soft 404 errors?

You can find all the soft 404 errors on your site inside Google Search Console’s Coverage report. Here, you can click on the error marked ‘Submitted URL seems to be a Soft 404’ to see an overview of the pages that have errors.

Don’t have a Google Search Console account yet? You’re missing out on essential features that can help you improve your site. Here’s our Beginner’s guide to Google Search Console — it helps you get started.

The Coverage report in Google Search Console shows your soft 404 errors

How to fix these errors

How you fix soft 404 errors depends on the page and what you want that page to do. But when you have them, the least you should do is make sure the page with errors always sends the correct status code. Here are a couple of options:

  • If a page doesn’t exist (anymore) give it a 404 (not found) or 410 (content deleted) and make sure you have a great 404 page. Keep in mind that having loads of 404s on your site is bad practice as well.
  • If a page is available, but still gets a soft 404, Google deems this thin content and you should fix that page. Give it some solid, relevant content to show search engines that this page has value.
  • Did the page move to a new location? Redirect it with a 301 redirect.
  • You want to keep the page, but not have it indexed by Google: noindex it with Yoast SEO.
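The options above boil down to a small decision tree. The Python sketch below is one way to write it down; the function name, its parameters and the order of the checks are assumptions made for illustration, so treat it as a checklist rather than a rule.

```python
from typing import Optional


def recommended_fix(exists: bool,
                    moved_to: Optional[str] = None,
                    has_real_content: bool = True,
                    should_be_indexed: bool = True) -> str:
    """Map the state of a page flagged as soft 404 to one of the fixes above."""
    if moved_to:
        return f"301 redirect to {moved_to}"
    if not exists:
        return "serve 404 or 410"
    if not should_be_indexed:
        return "keep page, add noindex"
    if not has_real_content:
        return "keep page, add solid content"
    return "no action needed"
```

For example, `recommended_fix(exists=True, has_real_content=False)` points you to improving the content, while `recommended_fix(exists=False)` tells you to send a real 404 or 410.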

Always try to find out where the issues originate and see if you can prevent these errors from happening in the first place.

Site maintenance is important

Keeping your 404 errors in check is a recurring task for every site owner. Regularly check Google Search Console to see if you have new errors, and fix them as soon as you can.

Want to learn more about working with crawl errors like the soft 404? Here are a couple of helpful posts:

The post What is a soft 404 error and what to do about it? appeared first on Yoast.

SEO Anti-patterns: 301 redirect all your 404s to your homepage

Sometimes I encounter new “SEO hacks” that people apply, that are actually anti-patterns. One of these new anti-patterns I noticed is the pattern of 301 redirecting all your 404 pages to your homepage. Let me explain why this is a lot like cleaning up your room by throwing everything into a drawer and what the better solution would be.

The premise of this SEO hack

The premise of this hack is that 404 errors are counted by Google, and that through some magic the number of errors on your site affects your site’s overall ability to rank. The solution, that really isn’t a solution, that people come up with is then to start 301 redirecting all error pages to their homepage. Let me quote some of the reasons people give for doing this:

to siphon Google Page Rank (TM) from missing pages to the homepage

If you care about your website, you should take steps to avoid 404 errors as it affects your SEO badly.

I have a website, every time I login to Google webmaster tools, I found many new discovered 404 error links, the problem is not in 404 errors itself, but when Google see them and count them for you!

Let’s be clear: we’ll be the first to tell you that you should keep an eye on your 404 errors and try to fix them where possible. Google indeed shows a graph of your 404 errors in Google Search Console and lowering the number of 404s on your site is often a good idea. That doesn’t mean that your site shouldn’t have any 404s.

Let me go back to my analogy of throwing everything into your drawer when your dad or mom told you to clean up your room. Everything, in this case, means not just the dirty clothes, or your toys, but also that half-emptied milk carton, that half-finished sandwich, etc. You know what that makes your drawer when you clean up your room like that? A mess. And soon your whole room will start to stink because you cleaned up like that. This situation is no different.

I verified this with Google before I wrote this article, see John Mueller’s response:

As John explains: when you do this blanket redirect, all those URLs are treated as 404s. So none of them spread value. So the premises listed above are all wrong. On top of that, by 301 redirecting all your 404 pages, you throw away the opportunity to find real errors on your site and fix them.

Better solution to 404s

The better solution for this problem of having too many 404s is much more granular. You see, 404 errors can exist for lots of reasons, and each of those reasons has its own “solution”. For instance:

  • Someone linked to an article and made a mistake in their URL. If you can redirect that wrong URL to the right article: do so.
  • You’ve deleted a page: think about what should happen to its URL and act accordingly. We have an article on that.
  • Someone is testing whether your site can be hacked through a certain URL: that 404 is 100% the right thing to serve.
  • You have a lot of 404s on your site because you had a broken link in your template somewhere (all too common): fix that broken link. Then redirect all those 404s to the right page.
  • Someone is typing in random URLs on your site just to see if something exists: a 404 is right. Of course, then your 404 page could be helpful in guiding them to the right spot.
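In code, the granular approach could look something like the Python sketch below. The paths in `KNOWN_FIXES` and the function name are hypothetical; the point is only that a redirect happens when you know the right target, and everything else stays an honest 404 instead of being sent to the homepage.

```python
from typing import Dict, Optional, Tuple

# Hypothetical map from known-wrong URLs to the page the visitor meant.
KNOWN_FIXES: Dict[str, str] = {
    "/seo-basisc/": "/seo-basics/",  # someone mistyped a link to this article
}


def respond_to_missing(path: str,
                       fixes: Dict[str, str] = KNOWN_FIXES
                       ) -> Tuple[int, Optional[str]]:
    """Return (status code, redirect target) for a URL that has no page."""
    target = fixes.get(path)
    if target is not None:
        return 301, target  # we know where this visitor wanted to go
    return 404, None  # no blanket redirect: unknown URLs stay a 404
```

A probe like `/wp-admin/backdoor.php` gets `(404, None)`, which is exactly what you want, and it still shows up in your reports as an error you can look into.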

How common is this hack?

Unfortunately, all too common. I encountered at least 3 plugins with major user bases on WordPress.org that do this, and only this:

Together they account for 240,000+ sites that show this behavior and there are probably a lot more.

Stop 301 redirecting all your 404 pages

Now, don’t take this as though we’re telling you not to 301 redirect 404 errors. We’re telling you to do it granularly. There’s nothing wrong with having a few 404 errors on your site, and you should definitely keep an eye on them. The redirect manager in Yoast SEO Premium can make this really easy to do.

The post SEO Anti-patterns: 301 redirect all your 404s to your homepage appeared first on Yoast.

How to properly delete a page from your site

Whenever you delete a page (or post) from your site, you also delete one or more URLs. That old URL, when visited, will usually return a ‘404 not found’ error, which is not the best thing for Google or your users. Is that what you really wanted to happen? You could redirect that deleted page to another page or, if you really want the content gone from your site, serve a 410 header, which would actually be a better idea. This post explains the choices you have and how to implement them.

Did you know Yoast SEO Premium has an awesome redirect manager that makes the redirection of deleted posts a breeze? Try it out!

Redirect or delete a page completely?

The first thing you have to work out is whether or not the content you deleted has an equivalent somewhere else on your site. Think of it this way: if I clicked on a link to the page you deleted, would there be another page on your site that gives me the information I was looking for? If that’s true for most of those following the link, you should redirect the deleted URL to the alternative page.

In general, I’d advise you to redirect a page even when only a handful of the visitors would benefit from it. The reasoning is simple: if the other option is for all your visitors to be sent to a “content not found” page, that’s not really a great alternative either…

Create a redirect

There are several types of redirects, but a 301 redirect is what’s called a permanent redirect, and this is what you should use when you redirect that deleted page URL to another URL. Using a 301 redirect means Google and other search engines will assign the link value of the old URL to the URL you redirected your visitors to.

Deleting content completely

If there really is no alternative page on your site with that information, you need to ask yourself whether it’s better to delete it or keep it and improve it instead. But if you’re absolutely sure you want to delete it, make sure you send the proper HTTP header: a ‘410 content deleted’ header.

404 and 410 HTTP headers

The difference between a 404 and a 410 header is simple: 404 means “content not found”, 410 means “content deleted” and is, therefore, more specific. If a URL returns a 410, Google knows for sure you removed the URL on purpose and it should, therefore, remove that URL from its index much sooner.
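To show how these headers differ in practice, here is a minimal sketch of a Python WSGI app that picks between 200, 301, 410 and 404 per URL. The paths and names are hypothetical, and this is not how Yoast SEO or WordPress implement it. Note that the standard HTTP reason phrase for a 410 is “Gone”.

```python
from typing import Callable, Dict, Iterable, Set, Tuple

# Hypothetical paths, used only for illustration.
REDIRECTS: Dict[str, str] = {"/old-guide/": "/new-guide/"}
DELETED: Set[str] = {"/retired-offer/"}
LIVE: Set[str] = {"/", "/new-guide/"}


def app(environ: dict, start_response: Callable) -> Iterable[bytes]:
    """Tiny WSGI app that picks 200, 301, 410 or 404 per request path."""
    path = environ.get("PATH_INFO", "/")
    if path in REDIRECTS:
        start_response("301 Moved Permanently", [("Location", REDIRECTS[path])])
        return [b""]
    if path in DELETED:
        status, body = "410 Gone", b"This content was removed on purpose."
    elif path in LIVE:
        status, body = "200 OK", b"Page content goes here."
    else:
        status, body = "404 Not Found", b"Page not found."
    start_response(status, [("Content-Type", "text/plain; charset=utf-8")])
    return [body]


def request(path: str) -> Tuple[str, Dict[str, str]]:
    """Call the app the way a WSGI server would; return status and headers."""
    captured: dict = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    b"".join(app({"PATH_INFO": path}, start_response))
    return captured["status"], captured["headers"]
```

Calling `request("/retired-offer/")` yields a `410 Gone` status, while a URL that never existed, such as `request("/nope/")`, yields a plain `404 Not Found`.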

Our Yoast SEO Premium plugin for WordPress has a redirects module which lets you set 410 headers. The redirect manager is the perfect tool for working with redirects, automatically asking you what you want to do with a URL when you delete it or change the permalink. Of course, you can set any type of redirect.

The problem with serving 410 content deleted headers is that Google’s support for it is incomplete. Sure, it will delete pages that serve a 410 from its index faster, but Google Search Console will report 410s under “Not found” crawl errors, just like 404s. We’ve complained to Google about this several times but unfortunately, they have yet to fix it.

Collateral damage when deleting a page

When you delete one or more posts or pages from your site, there’s often collateral damage. Say you deleted all the posts on your site that have a specific tag. That tag now being empty, its archive’s URL will also give a 404. Even when you handle all the URLs of those posts you deleted properly (by redirecting or 410ing them) the tag archive will still give a 404, so you should make sure to deal with that URL too.

Even when you didn’t delete all the posts in a tag, the tag archive might now have 5 instead of 12 posts. If you display 10 posts per page in your archives, page 2 of that archive will now no longer exist, and thus give a 404 error. These aren’t the biggest problems in the world when you delete one or two posts, but if you’re dealing with a Google Panda problem and because of that deleting lots of poor content, creating a lot of 404s like this can take your site down even further, so proceed with care!
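The arithmetic behind those disappearing archive pages is simple enough to sketch in Python. The function names are made up for this example; the numbers in the usage note are the ones from the paragraph above.

```python
import math
from typing import List


def archive_page_count(post_count: int, per_page: int) -> int:
    """Number of archive pages needed to list `post_count` posts."""
    if post_count <= 0:
        return 0  # an empty archive has no pages at all
    return math.ceil(post_count / per_page)


def orphaned_pages(before: int, after: int, per_page: int) -> List[int]:
    """Archive page numbers that existed before the deletion but not after."""
    old_pages = archive_page_count(before, per_page)
    new_pages = archive_page_count(after, per_page)
    return list(range(new_pages + 1, old_pages + 1))
```

Going from 12 posts to 5 at 10 posts per page, `orphaned_pages(12, 5, 10)` returns `[2]`: page 2 of the archive is gone. Deleting every post in the tag, `orphaned_pages(12, 0, 10)`, returns `[1, 2]`, so the archive itself starts giving a 404 as well.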

Read more: Which redirect should I use? »

The post How to properly delete a page from your site appeared first on Yoast.

Website maintenance: Check and fix 404 error pages

If your website is important to your business, it’s essential to schedule time to keep it running smoothly. Therefore we regularly write about the things you should do to keep your site in shape. In this post, we’ll write about the most basic of all: checking for 404 errors.

Note: this post does not cover the required elements of a good 404 page. We do have a post on that, though: Thoughts on 404 error pages.

404 errors and broken links

One of the most annoying things that can happen to a visitor is to hit a 404 “page not found” error on your website. Search engine spiders tend to not like such errors much either. Annoyingly, search engines often encounter other types of 404s than your visitors, which is why the first section of this post is split in two:

1. Measuring visitor 404 error pages

If you use the MonsterInsights plugin, it’ll automatically tag your 404 pages for you. So then, if you go into your Google Analytics account and go to Behavior → Site Content → Content Drilldown and search for 404.html, you’ll find a ton of info about your 404s:

Google Analytics report showing 404 error pages

You’ll see URLs like this:

/404.html?page=/wordpress/plugin/local-seo/&from=https://yoast.com/articles/wordpress-seo/

This tells you two things:

  • The 404 URL was /wordpress/plugin/local-seo/ (it lacks an s after plugin)
  • It was linked to from our WordPress SEO article.

Using this info, you can fix the 404 and go into the article and fix the link.
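If you have a lot of these tracked URLs, you could pull them apart with a few lines of Python instead of reading them by eye. This is a minimal sketch that only assumes the `page` and `from` parameters shown in the example URL above; the function name is made up.

```python
from urllib.parse import parse_qs, urlsplit


def explain_404_hit(tracked_url: str) -> dict:
    """Split a tracked 404 URL into the missing page and the linking page."""
    query = parse_qs(urlsplit(tracked_url).query)
    return {
        "missing_page": query.get("page", [""])[0],
        "linked_from": query.get("from", [""])[0],
    }


hit = explain_404_hit(
    "/404.html?page=/wordpress/plugin/local-seo/"
    "&from=https://yoast.com/articles/wordpress-seo/"
)
```

Here `hit["missing_page"]` is the broken URL and `hit["linked_from"]` is the article that contains the faulty link.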

As you can see from the above screenshot, we actually get 404s too. We break things all the time because our website is a constant work in progress! Making sure that you notice it when you’re breaking things is a good way of not looking stupid for too long though.

2. Measuring bot 404 error pages

Besides the 404s your visitors run into, search engines will also encounter 404s on your site, and these can be quite different. You can find the 404s that search engine spiders encounter by logging into their respective Webmaster Tools programs. There are three webmaster tools programs that can give you indexation reports, in which they tell you which 404s they encountered:

  1. Bing Webmaster Tools under Reports & Data → Crawl Information
  2. Google Search Console under Coverage → Errors
  3. Yandex Webmaster under Indexing → Excluded Pages → HTTP Status: Not Found (404)

One of the weird things you’ll find if you’re looking into those Webmaster Tools programs is that search engine spiders can encounter 404s that normal users would never get to. This is because a search spider will crawl just about anything on most sites, so even links that are hidden will be followed.

If you’re serious about website maintenance, you might want to find these 404s before search engines encounter them. In that case, spidering your site with a tool like Screaming Frog will give you a lot of insight. These tools are built specifically to behave just like search engine spiders and will, therefore, help you find a lot of issues.
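To give a feel for what such a spider does, here is a toy Python sketch that crawls an in-memory copy of a site and reports internal links that point nowhere. It is not how Screaming Frog works internally; the `site` dictionary (path to HTML) and all names are assumptions, and a real crawler would fetch pages over HTTP and check status codes.

```python
from html.parser import HTMLParser
from typing import Dict, List, Tuple


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, including visually hidden ones."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def find_broken_links(site: Dict[str, str],
                      start: str = "/") -> List[Tuple[str, str]]:
    """Crawl `site` from `start`; return (source page, missing target) pairs."""
    broken: List[Tuple[str, str]] = []
    seen = {start}
    queue = [start]
    while queue:
        page = queue.pop(0)
        parser = LinkCollector()
        parser.feed(site[page])
        for href in parser.links:
            if not href.startswith("/"):
                continue  # this sketch only follows internal, root-relative links
            if href not in site:
                broken.append((page, href))
            elif href not in seen:
                seen.add(href)
                queue.append(href)
    return broken
```

Because the parser looks at the markup rather than at what is visible on screen, a link hidden with CSS is followed just like any other, which is why spiders find 404s that visitors never see.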

Fixing 404 errors

Now that we’ve found all these 404 errors, it’s time to fix them. If you know what caused the 404 and you can fix the link that caused it, it’s best to do that. This will be the best indication of the quality of your site for both users and search engines.

As search engines will continue to hit those URLs for quite a while, it actually makes sense to still redirect those faulty URLs to the right pages as well. To create those redirects, there are several things you can do:

  • Create them manually in your .htaccess or your NGINX server config
    While this is not for the faint of heart, it’s often one of the fastest methods available if you have the know-how and the access to do it.
  • Create them with a redirect plugin
    There are several redirect plugins on the market, the most well-known one being Redirection. This is a lot easier, but it has the disadvantage of being a lot slower: the entire WordPress install has to load before the redirect can happen. This usually adds half a second to a second to the load time for that particular redirect.
  • Create them with our Yoast SEO Premium plugin
    Our Yoast SEO Premium plugin has a redirect module that allows you to make redirects with the ease of the WordPress interface, but also allows you to save those to your .htaccess file or an NGINX include file, so they get executed with the speed of the first option above. It actually also has another few nifty options: you can get the 404 errors from Google Search Console straight in your WordPress install and redirect them straight away, and it’ll add a nice button in your WordPress toolbar if you’re on a 404 page.
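For the manual route, server-level redirects are usually just one line per URL. The Python sketch below generates such lines from a mapping of old to new paths. It is an illustration only, not the output format of any plugin: it assumes simple paths without spaces, and keep in mind that Apache’s `Redirect` directive matches by path prefix, so you may need `RedirectMatch` when you want an exact match.

```python
from typing import Dict, List


def apache_rules(redirects: Dict[str, str]) -> List[str]:
    """One mod_alias line per redirect, for an .htaccess file."""
    return [f"Redirect 301 {old} {new}" for old, new in redirects.items()]


def nginx_rules(redirects: Dict[str, str]) -> List[str]:
    """One exact-match location block per redirect, for an NGINX include file."""
    return [
        f"location = {old} {{ return 301 {new}; }}"
        for old, new in redirects.items()
    ]


# The broken URL from the example earlier in this post and its likely fix.
fixes = {"/wordpress/plugin/local-seo/": "/wordpress/plugins/local-seo/"}
```

`apache_rules(fixes)` and `nginx_rules(fixes)` each return a list with a single rule that you could paste into the matching config file after checking it.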

Check for image / embed errors

If you look at your server logs, you’ll find 404 errors of a different type too: 404s for broken images or broken video embeds. You might also have errors that don’t show up in your logs, like broken YouTube video embeds. They don’t cause the entire page not to work, but they do look sloppy. These types of errors are harder to find because webmaster tools programs don’t report them as reliably and you can’t track them with something like Google Analytics either.

The easiest method to find these broken images and embeds is using one of the aforementioned spiders. Screaming Frog, in particular, is very good at finding broken images. Another method is to check your server logs and search them for lines that combine a 404 status with “.jpg” or “.png”.
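The log search could be sketched in Python like this. It assumes your server writes the common or combined log format, where the request and the status code appear as `"GET /path HTTP/1.1" 404`; the function name and the sample log lines are made up for illustration.

```python
import re
from typing import Iterable, List

# Matches the request and status fields of a common/combined log format line.
LOG_LINE = re.compile(r'"(?:GET|HEAD) (?P<path>\S+)[^"]*" (?P<status>\d{3}) ')
IMAGE_EXTENSIONS = (".jpg", ".png")  # extend with other formats as needed


def broken_images(log_lines: Iterable[str]) -> List[str]:
    """Return each image path that was requested and answered with a 404."""
    missing: List[str] = []
    for line in log_lines:
        match = LOG_LINE.search(line)
        if not match or match.group("status") != "404":
            continue
        path = match.group("path").split("?", 1)[0]  # drop any query string
        if path.lower().endswith(IMAGE_EXTENSIONS) and path not in missing:
            missing.append(path)
    return missing


# Made-up sample lines, not real log data.
sample = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /uploads/logo.png HTTP/1.1" 404 512',
    '203.0.113.5 - - [10/Oct/2023:13:55:37 +0000] "GET /uploads/team.jpg HTTP/1.1" 200 20480',
    '203.0.113.9 - - [10/Oct/2023:13:56:01 +0000] "GET /no-such-page/ HTTP/1.1" 404 1024',
]
```

Running `broken_images(sample)` returns only `/uploads/logo.png`: the second line is a healthy image and the third is a missing page rather than a missing image.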

How often should you check for 404 errors?

You should be checking your 404s at least once every month and on a bigger site, every week. It doesn’t really depend on how many visitors you have, but much more on how much content you have and create, and how much can go wrong because of that. The first time you start looking into and trying to fix your 404 error pages you might find out that there are a lot of them and it can take quite a bit of time… Try to make it a habit so you’ll at least find the important ones quickly.

Read more: Content maintenance for SEO: research, merge and redirect »

The post Website maintenance: Check and fix 404 error pages appeared first on Yoast.