How to fix duplicate content problems
Duplicate content can lead to ranking problems on Google and other search engines. Although there is no duplicate content penalty, your web pages might not rank as well as they could if their content can also be found on other pages.
What is duplicate content?
Google’s definition of duplicate content:
“Duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar. Mostly, this is not deceptive in origin. Examples of non-malicious duplicate content could include:
- Discussion forums that can generate both regular and stripped-down pages targeted at mobile devices
- Store items shown or linked via multiple distinct URLs
- Printer-only versions of web pages
Google tries hard to index and show pages with distinct information. This filtering means, for instance, that if your site has a ‘regular’ and ‘printer’ version of each article, and neither of these is blocked with a noindex meta tag, we’ll choose one of them to list.”
If the content of a page can be found on multiple pages, Google only displays one of these pages in the search results. Unfortunately, this might not be the page that you want searchers to see.
How to find duplicate content issues on your website
If your website contains duplicate content, the wrong pages might appear in Google’s search results. For that reason, you should check your pages for duplicate content.
The easiest way to find duplicate content on your web pages is to use a Website Audit tool. It checks all pages of your website and informs you about errors that can lead to ranking problems.
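The core of such a duplicate check can be sketched in a few lines: group URLs by a hash of their normalized visible text and flag any group with more than one URL. The URLs and page texts below are hypothetical examples, and real audit tools compare much more than exact text matches.

```python
# Minimal sketch: flag pages whose visible text is identical after
# normalization. URLs and texts are hypothetical examples.
import hashlib
import re
from collections import defaultdict

def normalize(text: str) -> str:
    """Lower-case and collapse whitespace so trivially different
    copies of the same text hash to the same value."""
    return re.sub(r"\s+", " ", text.lower()).strip()

def find_duplicates(pages: dict[str, str]) -> list[list[str]]:
    """Group URLs by the hash of their normalized text and return
    the groups with more than one URL (duplicate-content candidates)."""
    groups = defaultdict(list)
    for url, text in pages.items():
        digest = hashlib.sha256(normalize(text).encode()).hexdigest()
        groups[digest].append(url)
    return [urls for urls in groups.values() if len(urls) > 1]

pages = {
    "https://example.com/article": "How to fix duplicate content problems.",
    "https://example.com/article?print=1": "How to   fix duplicate content problems.",
    "https://example.com/other": "A completely different page.",
}
print(find_duplicates(pages))
# the regular and the printer URL end up in one group
```

A production crawler would also strip HTML boilerplate and use near-duplicate measures (e.g. shingling) rather than exact hashes, but the grouping idea is the same.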
How to fix duplicate content issues
There are several things that you can do to fix duplicate content issues. You could do nothing and hope that Google picks the right page, but that is usually not recommended. It is better to do one of the following:
- Use the rel=canonical tag to inform Google about the preferred version of a page. Details about the attribute can be found in Google's documentation. For most duplicate content issues, this is the best solution.
- Redirect the duplicate URLs with a 301 redirect to the original page. In that case, the alternate pages won’t be displayed at all. This doesn’t work for pages such as printer versions, because you still want to display those pages.
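As a sketch of the first tip: on the duplicate version of a page (for example a printer version), you point rel=canonical at the regular version. The URLs here are hypothetical examples.

```html
<!-- In the <head> of the printer version, e.g. https://example.com/article?print=1 -->
<link rel="canonical" href="https://example.com/article">
```

Google then treats the regular URL as the preferred version and consolidates ranking signals onto it, while the printer version remains available to visitors.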
It’s not necessary to block duplicate pages in your robots.txt file and it’s also not necessary to use a noindex attribute on these pages. Just use the two tips above and you’ll be fine.
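For the second tip, a 301 redirect is configured on the server rather than in the page. A minimal sketch for an Apache .htaccess file, with hypothetical paths, could look like this:

```apache
# Hypothetical sketch: permanently redirect a duplicate URL
# to the original page with a 301 status code.
Redirect 301 /old-article https://example.com/article
```

Other servers use their own syntax (for example nginx's `return 301`), but the effect is the same: visitors and crawlers are sent to the original URL, and the duplicate drops out of the index over time.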
Google wants to show unique URLs
“We only index pages that can be jumped right into. […] I click on the main navigation for this particular page and then I click on this product and then I see [it] and everything works. But that might not do the trick because you need that unique URL.
It has to be something we can get right to. Not using a hashed URL and also the server needs to be able to serve that right away.
If I do this journey and then basically take this URL and copy and paste it into an incognito browser… I want people to see the content. Not the home page and not a 404 page.”
“Most of the time it is because [pre-rendering] has benefits for the user on top of just the crawlers. […]
Giving more content over is always a great thing. That doesn’t mean that you should always give us a page with a bazillion images right away because that’s just not going to be good for the users. […]
It should always be a mix between getting as much crucial content as possible, but then figuring out which content [images and other non-crucial content] you can load lazily at the end of it.
So for SEOs that would be: we know that different queries have different intents. Informational, transactional… so elements critical to that intent should really be in that initial rush [i.e. pre-rendered].”
Friday, August 13, 2021