Duplicate Content and the Canonical Tag
Duplicate content splits authority across multiple URLs and leaves Google to decide which page to display. The canonical tag indicates which version is the reference. Use it systematically on parameterized and paginated URLs, and on HTTP/HTTPS or www/non-www variants.
Duplicate content is not a penalty in itself, but it forces Google to independently choose which version to index — and it does not always choose the one you want. The canonical tag gives you back that control.
The most common sources of duplication
The majority of duplicate content is technical, not intentional. E-commerce sites are particularly exposed: sorting filters, pagination parameters, product variants, and session URLs generate dozens of identical versions of the same page.
Content syndication, article republishing across multiple domains, and printable page versions are sources of external duplication often missed during audits.
- URLs with and without www (example.com vs www.example.com).
- HTTP and HTTPS versions not redirected.
- Sorting and filter parameters in e-commerce URLs.
- Pagination pages (/page/2, /page/3) with similar content.
- Product pages accessible via multiple categories.
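To make the list above concrete, here is a minimal sketch of how such technical variants can be collapsed to one reference URL and grouped. The parameter names in `IGNORED_PARAMS`, the preferred host, and the function names are assumptions chosen for illustration; a real site would substitute its own.

```python
from collections import defaultdict
from typing import Dict, Iterable, List
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameters that change presentation but not content.
IGNORED_PARAMS = {"sort", "order", "sessionid", "utm_source", "utm_medium", "utm_campaign"}


def _strip_www(host: str) -> str:
    return host[4:] if host.startswith("www.") else host


def normalize_url(url: str, preferred_host: str = "www.example.com") -> str:
    """Map a URL variant to the form we would treat as the reference version."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Fold www / non-www variants of the same site onto the preferred host.
    if _strip_www(host) == _strip_www(preferred_host):
        host = preferred_host
    # Drop sorting, session and tracking parameters; keep the rest in a stable order.
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    )
    # Always use HTTPS and discard the fragment.
    return urlunsplit(("https", host, parts.path or "/", urlencode(query), ""))


def duplicate_groups(urls: Iterable[str]) -> Dict[str, List[str]]:
    """Group crawled URLs that collapse to the same normalized URL."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for url in urls:
        groups[normalize_url(url)].append(url)
    return {ref: variants for ref, variants in groups.items() if len(variants) > 1}
```

Running `duplicate_groups` over a crawl export gives a first estimate of how many indexed URLs are variants of the same page, before deciding between a canonical and a redirect for each group.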
The canonical tag: syntax and usage
The canonical tag is placed in the head of the non-canonical page and points to the reference URL. It can point to itself (self-referential) on main pages — a best practice recommended by Google.
A self-referential canonical on each page confirms your intent to Google and prevents a stray variant (for example, the same URL with a tracking parameter appended) from taking over if someone links to an alternative version.
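As an illustration of the syntax, the sketch below builds the `<link rel="canonical">` element that goes in the page head and reads it back from an HTML document using only the Python standard library. The helper names (`canonical_tag`, `find_canonical`, `is_self_referential`) are hypothetical and exist only for this example.

```python
from html import escape
from html.parser import HTMLParser
from typing import Optional


def canonical_tag(url: str) -> str:
    """Return the element to place inside <head>."""
    return f'<link rel="canonical" href="{escape(url, quote=True)}">'


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> encountered."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values and attributes.get("href"):
            self.canonical = attributes["href"]


def find_canonical(html: str) -> Optional[str]:
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


def is_self_referential(page_url: str, html: str) -> bool:
    """True when the page declares itself as the reference version."""
    return find_canonical(html) == page_url
```

A check such as `is_self_referential(url, html)` can be run over the main pages of a site to confirm that each one declares its own URL as canonical.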
- Cross-domain canonical: to indicate the original source of syndicated content.
- Canonical on AMP pages: point to the standard non-AMP version.
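- Canonical on pagination pages: give each page in the series its own self-referential canonical. Google advises against pointing every page to the first one, because pages 2 and beyond do not have the same content.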
- Never chain canonicals (A points to B which points to C): Google often ignores chains.
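The chaining rule lends itself to an automated check. The following sketch assumes you already have a mapping from each crawled URL to the canonical it declares (the `canonicals` dictionary is hypothetical input, for example from a crawler export) and reports every URL whose canonical target itself points somewhere else.

```python
from typing import Dict, List


def canonical_path(url: str, canonicals: Dict[str, str]) -> List[str]:
    """Follow declared canonicals from url until they stop or loop."""
    path = [url]
    seen = {url}
    while True:
        target = canonicals.get(path[-1])
        # Stop on a missing or self-referential canonical.
        if target is None or target == path[-1]:
            return path
        path.append(target)
        # Stop on a loop, keeping the repeated URL so the loop is visible.
        if target in seen:
            return path
        seen.add(target)


def find_chains(canonicals: Dict[str, str]) -> Dict[str, List[str]]:
    """Return URLs that need more than one hop to reach their final canonical."""
    chains: Dict[str, List[str]] = {}
    for url in canonicals:
        path = canonical_path(url, canonicals)
        if len(path) > 2:
            chains[url] = path
    return chains
```

Each reported URL should be repointed directly at the last URL in its path, so that every canonical resolves in a single hop.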
Canonical vs 301 redirect: when to choose which
A 301 redirect is stronger than a canonical because it eliminates the alternative URL at the server level. If two URLs are absolutely identical and one is unnecessary, prefer the redirect.
The canonical is preferable when you need to keep both URLs accessible for technical or functional reasons — for example, a printable page or a mobile version kept for a specific campaign.
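The choice between the two can be sketched as the response a server sends for the duplicate URL. In this simplified example the function name and the `keep_accessible` flag are assumptions; the canonical is expressed as an HTTP `Link` header with `rel="canonical"`, which Google supports as an alternative to the tag in the page head.

```python
from typing import Dict, Tuple


def duplicate_response(canonical_url: str, keep_accessible: bool) -> Tuple[int, Dict[str, str]]:
    """Return (status code, headers) to send for a duplicate URL."""
    if keep_accessible:
        # The duplicate stays reachable (e.g. a printable version) and
        # declares the reference URL through a canonical Link header.
        return 200, {"Link": f'<{canonical_url}>; rel="canonical"'}
    # The duplicate serves no purpose of its own: remove it with a permanent redirect.
    return 301, {"Location": canonical_url}
```

For example, `duplicate_response("https://www.example.com/article", keep_accessible=True)` would suit a printable page, while `keep_accessible=False` would suit an HTTP URL that only duplicates its HTTPS counterpart.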
On medium-sized e-commerce sites, between 10 and 35% of indexed pages are technical duplicates resolved by a combination of canonicals and robots.txt rules.
Source: sector studies 2025-2026 on e-commerce SEO audits
FAQ
Does Google always respect the canonical tag?
The canonical is a hint, not a directive. Google follows it in the vast majority of cases, but may override it if it judges that another version is a better representative than the URL you designated. Contradictory signals (internal links pointing to the wrong version, a sitemap that lists the duplicate version) reduce its effectiveness.
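The two contradictory signals mentioned above can be checked mechanically. This is a minimal sketch under stated assumptions: `canonicals` maps each URL to the canonical it declares, `sitemap_urls` is the list of URLs in the sitemap, and `links` is a list of (source, destination) internal links. All three inputs and both function names are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def conflicting_sitemap_urls(sitemap_urls: Iterable[str], canonicals: Dict[str, str]) -> List[str]:
    """Sitemap entries that declare a different URL as their canonical."""
    return sorted(url for url in sitemap_urls if canonicals.get(url, url) != url)


def conflicting_internal_links(
    links: Iterable[Tuple[str, str]], canonicals: Dict[str, str]
) -> List[Tuple[str, str]]:
    """Internal links whose destination is a non-canonical version."""
    return [(src, dst) for src, dst in links if canonicals.get(dst, dst) != dst]
```

URLs with no declared canonical are treated as canonical to themselves here, which is a simplification; a stricter audit might flag them separately.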
Does duplicate content trigger a Google penalty?
No, unless content is deliberately copied to manipulate results. Technical or accidental duplication does not trigger a penalty but dilutes your authority and can lead to the wrong canonical version being chosen.
How do I detect duplicate content on my site?
For internal duplication, a crawler such as Screaming Frog, which compares content hashes to flag exact duplicates, is one of the most effective options; Siteliner also reports duplicated content within a site. For external duplication, Copyscape can detect copies of your texts on other domains.
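The idea behind hash comparison can be illustrated in a few lines. This is a generic sketch, not the method of any particular tool: it assumes a hypothetical `pages` dictionary mapping each URL to its extracted main text, and it detects only exact duplicates after whitespace and case normalization.

```python
import hashlib
import re
from collections import defaultdict
from typing import Dict, List


def content_fingerprint(text: str) -> str:
    """Hash the text after collapsing whitespace and lowercasing."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def exact_duplicates(pages: Dict[str, str]) -> List[List[str]]:
    """Return groups of URLs whose normalized text is identical."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for url, text in pages.items():
        groups[content_fingerprint(text)].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]
```

Pages that differ by even one word produce different hashes, so near-duplicates need a similarity measure rather than a plain hash.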