
How I add canonicals into Perch CMS sites

by Simon Cox




The canonical link in a page's header lets the search engines know where the original page resides. It was originally conceived for situations where articles were duplicated, so that the copies could reference the original. Google tends to choose the oldest version of a page that it can find (though that is not the only method it uses), and any other pages with the same or very similar content are considered duplicates. Those duplicates will not do well on the Search Engine Results Pages (SERPs), and we want our pages to do well there for the traffic.

Canonicals can trip up your site's SEO

In most content management systems, developers tend to take the quick option and reference the URL the page is on. To an extent, this works very well, but duplicate pages can occur by accident rather than by design. For example, if you are using Perch and you decide to prettify your URLs by removing the .php, you will have set up .htaccess rules to remove them. But did you decide whether your URLs should end in a / or not? Search engines index URLs with and without the / as different pages, hence you can suffer from duplication.

  1. http://www.example.com/index.php
  2. http://www.example.com/index
  3. http://www.example.com/
  4. http://www.example.com
  5. http://example.com/index.php
  6. http://example.com/index
  7. http://example.com/
  8. http://example.com
  9. https://www.example.com/index.php
  10. https://www.example.com/index
  11. https://www.example.com/
  12. https://www.example.com
  13. https://example.com/index.php
  14. https://example.com/index
  15. https://example.com/
  16. https://example.com

All the above are essentially the same page of content, a home page, and the search engines have to work out which one is the original. They are getting much better at this, but that's no reason not to help them understand your website.
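To make the duplication concrete, here is a minimal Python sketch that collapses the sixteen variants listed above into one URL. The preferred form (https, the www host, no .php, no trailing index) is an assumed policy for illustration only, and `canonical_url` is a hypothetical helper, not part of Perch.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed policy (not from Perch): https, www host, no .php,
# no trailing "index", and the home page written as "/".
PREFERRED_SCHEME = "https"
PREFERRED_HOST = "www.example.com"


def canonical_url(url: str) -> str:
    """Map any variant of a page URL onto the one preferred form."""
    path = urlsplit(url).path
    if path.endswith(".php"):
        path = path[: -len(".php")]
    if path.endswith("/index"):
        path = path[: -len("index")]  # keep the trailing slash
    if not path:
        path = "/"
    return urlunsplit((PREFERRED_SCHEME, PREFERRED_HOST, path, "", ""))


# Rebuild the sixteen home page variants from the list.
variants = [
    f"{scheme}://{host}{path}"
    for scheme in ("http", "https")
    for host in ("www.example.com", "example.com")
    for path in ("/index.php", "/index", "/", "")
]

# Prints: 16 {'https://www.example.com/'}
print(len(variants), {canonical_url(u) for u in variants})
```

Whichever form you prefer matters less than picking one and using it consistently in the canonical.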


For subpages, canonicals are more critical, as the search engines are less likely to be tolerant and often they will find your site through links to a subpage rather than down through the home page. Having the canonical automatically generated means that any URL that resolves, but that you do not actually want on the site, will carry an incorrect canonical. If you remove the .php from the URLs, as I tend to do, then you may have situations where Perch is outputting links with the .php; the canonical would then include the .php and cause duplicate content issues. Footer menus are an example of where this may happen.

I like to add the canonical manually so that I know I am in control, but this can lead to issues if an editor mistypes the URL. So the technique I use grabs the list of pages from within Perch and presents it as a dropdown list for the editor to choose from.

Perch field type — Pagelist

You will need to add the Perch field type into /perch/addons/fieldtypes/. Drop the folder and its PHP file in there and you are good to go.

The Perch 2 field type Pagelist is available from the Perch CMS site. At the time of writing, there is no Perch 3 version, but the archived Perch 2 version seems to work OK.

Perch template code

The following code goes into perch/templates/pages/attributes/seo.html

<link rel="canonical" href="<perch:pages id="domain" /><perch:pages id="canonical" type="pagelist" output="pageurl" replace=".php|,/index|" label="Canonical page" help="Please select the page you wish to have as the canonical URL for this page (normally just choose this page)" required="true" />">

replace=".php|,/index|" removes the .php and the /index from the URL.
type="pagelist" provides the list of pages on your site.
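As a rough illustration of what the replace attribute does to the page URL, the Python sketch below mimics it. It assumes the attribute is a comma-separated list of search|replacement pairs applied as plain substring replacements; that is my reading of the syntax rather than something confirmed here, and `apply_replace` is a hypothetical stand-in, not Perch's own code.

```python
def apply_replace(value: str, rules: str) -> str:
    """Apply comma-separated 'search|replacement' pairs, in order.

    Assumption: plain substring replacement, as the template syntax suggests.
    """
    for rule in rules.split(","):
        search, _, replacement = rule.partition("|")
        if search:
            value = value.replace(search, replacement)
    return value


RULES = ".php|,/index|"

print(apply_replace("/about/contact.php", RULES))  # prints /about/contact
print(repr(apply_replace("/index.php", RULES)))    # prints '' (home page path is emptied)
```

If the attribute really is a plain substring replacement, as assumed here, any page whose path merely contains /index would also be altered, so that is worth checking on your own site.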

On each page in the CMS, a dropdown box appears listing the pages you have on your site. The editor can select from this list, thus avoiding manual errors, though they could still choose the wrong page, so that's worth checking!

example of dropdown list used in the Perch content management system

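The output code in the head is a canonical link element whose href is the domain followed by the URL of the selected page, minus the .php.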

And there is more...


Clive Walker asked me how I deal with pagination. Generally, I don't, as pagination is the work of the devil and advertisers. There are so many sites that make you click through a series of pages to read an article. This is just to sell advertising, not to make it easy for you to read, as usually the whole article could easily go on one page and you would scroll down to read it.

There is, however, a situation where pagination is very useful: lists of article entries, categories, topics and tags. In these situations, it is recommended that there is a view-all page and that the paginated pages are canonicalised to that. With huge lists, though, a view-all page is impractical (it would take far too long to load), and then the paginated pages can be self-canonicalised. If you want to know more, head over to Deep Crawl's information on canonicalisation and pagination.
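The rule above can be sketched in a few lines of Python. The function name, the ?page= query format and the example URLs are all hypothetical; the only point is the decision between a view-all canonical and a self-canonical.

```python
from typing import Optional


def paginated_canonical(list_url: str, page: int,
                        view_all_url: Optional[str] = None) -> str:
    """Pick the canonical URL for one page of a paginated list.

    list_url, the ?page= format and view_all_url are illustrative only.
    """
    if view_all_url is not None:
        # A practical view-all page exists: every paginated page points to it.
        return view_all_url
    if page <= 1:
        # The first page is the list URL itself.
        return list_url
    # No view-all page: each paginated page is its own canonical.
    return f"{list_url}?page={page}"


print(paginated_canonical("https://www.example.com/blog", 3,
                          view_all_url="https://www.example.com/blog/all"))
print(paginated_canonical("https://www.example.com/blog", 3))
```

The first call returns the view-all URL and the second returns the page's own URL, which matches the two cases described above.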

18 December 2017 Update for home page

I have also updated the Perch code I used, as there was an issue. The home page was outputting '/index', which canonicalised the home page to a URL that didn't exist, and that is a bad thing! I have added /index into the replace statement to strip it. Apologies to anyone who had used the code prior to today.
