Are your crawler reports always FULL of duplicate content warnings? Getting worried about all the looming Google penalties because duplicate content is a sure way to get your entire site banned from ranking in Google?
First, you need to step back and take a few breaths. It’s OK. Your ecommerce site is not the first one to encounter duplicate content issues and will not be the last. Google deals with thousands (millions? Billions?!) of ecommerce sites every day and they’ve pretty much worked out most of the duplicate content problems ecommerce sites have.
That said, you don’t want to just do nothing and hope for the best. It’s always a good idea to keep an eye on things and do all you can to help the crawlers find your content and identify what is important. Let’s take a look at one of the most common duplicate content issues with ecommerce sites: product sorting and pagination.
Duplicate Content Due to Sort Options
Your customers want to be able to quickly sort your products. They want to sort by price, by listing date, by review, by color, by size, by how many of their friends have purchased it, etc. Your customers want a million ways to sort your products, and because you want to provide a great UX, you've complied. But now your site has many, many URL parameter options, and your website crawls fill up with near-identical URLs that differ only in their sort parameters.
The issue with this is that Google may crawl and index all of these pages, but it is not going to rank them all. Without the rel=canonical tag, you're letting Google pick which page(s) should or shouldn't be used in the search results. Sometimes this works out well, but sometimes it doesn't. Don't leave this to chance!
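To get a feel for how many sort variants a crawl has picked up, you can group the crawled URLs by path and count the variants under each one. The sketch below is only an illustration: the URL list and the `sort_by` parameter name are made up, and your platform's parameters will likely differ.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_path(urls):
    """Group URLs by scheme + host + path, ignoring the query string."""
    groups = defaultdict(list)
    for url in urls:
        parts = urlsplit(url)
        key = f"{parts.scheme}://{parts.netloc}{parts.path}"
        groups[key].append(url)
    return dict(groups)


# Hypothetical crawl export; the parameter name "sort_by" is an assumption.
crawled = [
    "https://www.domain.com/collections/pearl",
    "https://www.domain.com/collections/pearl?sort_by=price-ascending",
    "https://www.domain.com/collections/pearl?sort_by=created-descending",
    "https://www.domain.com/collections/gold",
]

for path, variants in group_by_path(crawled).items():
    # Paths with more than one variant are candidates for a canonical tag.
    print(path, len(variants))
```

Any path that shows up with more than one variant is a page where a crawler tool is likely to flag duplicates.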
Solution: Add Canonical Tags
You don’t want all these random sort orders to show up in the search results instead of your curated product listings. You want users to land on the first page, with the default sort options, which is a page that you have created to show the best of the best. To tell crawlers which page this is, add a canonical tag that points to the ‘main’ version of the page:
<link rel="canonical" href="https://www.domain.com/collections/pearl" />
Add this to the <head> section of every sorted version of the page and you're done! Some content management systems have fields where you can add rel=canonical tags, others don't. Your development team can help you out with this as well.
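If your development team generates the tag in code, the idea can be sketched roughly as follows: strip the sort-only parameters from the requested URL and emit a canonical tag for what is left. The parameter names in `SORT_PARAMS` and the helper names are assumptions for illustration, not something your CMS necessarily provides.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed sort-only parameter names; replace with the ones your site uses.
SORT_PARAMS = {"sort_by", "order"}


def canonical_url(url, drop=SORT_PARAMS):
    """Return the URL with sort-only query parameters and any fragment removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in drop
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_tag(url):
    """Build the <link rel="canonical"> tag for the given request URL."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}" />'


print(canonical_tag("https://www.domain.com/collections/pearl?sort_by=price-ascending"))
```

Parameters that are not in the drop list are left alone, so anything that actually changes which products appear stays in the canonical URL.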
What About Canonicalizing Page 2, Page 3, etc.?
While it is a good idea to add rel=canonical to your sort option pages, it's not recommended to point your paginated product listings at page 1. As a rule of thumb, the canonical tag on page 2 should point to page 2, and so on for every other paginated page. Pointing the canonical tag on these deep pages at the main listing page (page 1) can result in those deep pages not getting crawled, which is a major problem if you want the crawlers to find and crawl all your products (which you do).
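As a rough sketch of that rule of thumb, each paginated page gets a canonical URL that points to itself. The `page` parameter name is an assumption, as is the choice to treat page 1 as the bare listing URL.

```python
from urllib.parse import urlencode


def paginated_canonical(base_url, page, page_param="page"):
    """Return a self-referencing canonical URL for a paginated listing page.

    Assumes base_url has no query string and that page 1 lives at base_url.
    """
    if page < 1:
        raise ValueError("page must be 1 or greater")
    if page == 1:
        return base_url
    return f"{base_url}?{urlencode({page_param: page})}"


for page in (1, 2, 3):
    print(paginated_canonical("https://www.domain.com/collections/pearl", page))
```

Sort parameters are left out of the result on purpose, so a sorted view of page 2 would still canonicalize to page 2 rather than to page 1.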
Using Google Search Console’s URL Parameters Tool
If you’re savvy and know what you are doing, you may have come across the URL Parameters tool in Search Console. This tool let you specify URL parameters that Google should ignore when crawling your site. Be warned, though: Google retired the tool in 2022, and even while it was available, an incorrect configuration could stop Google from crawling important areas of your site, so it was only ever a good fit if you were comfortable and fluent with your parameters. Canonical tags are the option to rely on now.
Why Am I Still Getting Duplicate Content Warnings?
If you’re using tools like Moz or Raven, you might still see all of these different sort option URLs showing up as duplicate page warnings. That does not necessarily mean that Google is going to consider them duplicates as well. Keep an eye on Search Console, on the crawlers, and on your rankings to make sure that the right pages are being crawled and indexed.
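One quick sanity check when a tool keeps flagging a sorted URL is to confirm that the page's HTML actually contains the canonical tag you expect. Here is a minimal sketch using only Python's standard library; the sample HTML is made up, and fetching the real page is left out.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> tag in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values and attributes.get("href"):
            self.canonicals.append(attributes["href"])


def find_canonicals(html):
    """Return all canonical URLs declared in the given HTML string."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


# Hypothetical page source for a sorted listing URL.
sample = (
    "<html><head>"
    '<link rel="canonical" href="https://www.domain.com/collections/pearl" />'
    "</head><body></body></html>"
)
print(find_canonicals(sample))
```

An empty list means the tag is missing, and more than one entry means the page declares conflicting canonicals; both are worth fixing before worrying about the warning itself.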
Not sure how to handle it? Have a complicated site that is a URL nightmare? We work with ecommerce websites day and night and have seen it all. Send us your problems and we’ll see if we can help.