SEO Crawler Detecting 2 Home pages - with and without Trailing Slash

I am having this issue with both Kirby 2 and Kirby 3 sites.

When we generate SEO reports, the SEO crawler always detects 2 pages as the home page:

https://example.com
https://example.com/

The difference is the trailing slash. It somehow thinks that there are 2 “different” pages - with exactly the same content - and then we get penalised.

Usually this means that there is a link, or some structured data, something somewhere pointing at the “example.com/” address with the trailing slash. I have now looked at the HTML output of every page of one of our sites, and there is no mention of the address with the trailing slash anywhere.

I’m starting to think that maybe this is an issue with the .htaccess file? Or perhaps something else?.. Any suggestions would be welcome!

You could fix this via your server config/.htaccess. You can find examples here on the forum.
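For instance, a minimal sketch of a trailing-slash redirect (my example, not an official Kirby rule) would go near the top of the .htaccess, before Kirby's own rewrite rules:

```apache
# Sketch: 301-redirect any URL ending in a slash to its slash-less form.
# Assumes mod_rewrite is enabled; place before Kirby's router rules.
RewriteEngine On

# Skip real directories so directory listings/assets keep working
RewriteCond %{REQUEST_FILENAME} !-d

# (.+) requires at least one character before the slash, so the bare
# root "/" is never matched or redirected
RewriteRule ^(.+)/$ /$1 [R=301,L]
```

Note that the home page itself is unaffected: at the HTTP level a request for the bare domain is always sent as a request for `/`, so `https://example.com` and `https://example.com/` are literally the same request, which is why a crawler flagging them as duplicates is suspicious.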

@luxlogica There is a very good article on this over here… https://www.danielmorell.com/guides/htaccess-seo/redirects/https-www-and-trailing-slash

Thank you both!

@jimbobrjames that is a great article, and a great resource - bookmarking it for posterity.

@texnixe I did some more searching, and did find some examples of how to get the trailing slash out via .htaccess. However, after reading the linked article above, it seems that it shouldn’t matter whether the home page has a trailing slash or not - certainly not for SEO purposes. My feeling, therefore, is to check with the devs of the SEO reporting tool, to find out WHY they are reporting this - perhaps it’s a bug at their end?

As far as I can tell, the ‘default’ .htaccess file that Kirby ships with is pretty good and works fine for everything else - we’re not having any other issues on any site, apart from this one. So I don’t want to go and change it if this is just a reporting error at their end. I’ll let you guys know once I’ve heard back from them.

Once again, many thanks!

I recommend Screaming Frog, which has a free version that will crawl up to 500 URLs on a site, but it’s actually well worth paying for. Run it against the site in question and you will see what Google (or any search engine, for that matter) sees when it crawls it. It’s also great for finding bad links in the site, as well as images that are too large or missing alt text.

That guide above is pretty good, but I’m working on tweaking the rule he suggests for removing the www so that it works on any domain. I don’t like that he hardcoded the domain name in the rules, because I run sites on a staging server and a production server, so I need the rules not to care about the domain name.

If you are on a VPS, it makes sense to move all this stuff from the .htaccess to your virtual host configuration. You can use macros for those parts that are used repeatedly, like expires, deflate, redirects or the Kirby configuration. Such macros can then be included using the use keyword.

https://httpd.apache.org/docs/2.4/mod/mod_macro.html
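As a rough illustration of what that looks like (the macro name and paths here are made up, and the rules only approximate Kirby's defaults):

```apache
# Sketch of mod_macro in a vhost config. Requires mod_macro and
# mod_rewrite to be loaded. "KirbyRules" is an illustrative name.
<Macro KirbyRules>
    RewriteEngine On
    # block direct access to Kirby's internal folders
    RewriteRule ^content/(.*) index.php [L]
    RewriteRule ^site/(.*) index.php [L]
    RewriteRule ^kirby/(.*) index.php [L]
    # send everything that isn't a real file or folder to the router
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule ^(.*) index.php [L]
</Macro>

<VirtualHost *:443>
    ServerName example.com
    DocumentRoot /var/www/example
    <Directory /var/www/example>
        Use KirbyRules
    </Directory>
</VirtualHost>

# optional: release the macro definition once all vhosts have used it
UndefMacro KirbyRules
```

The nice part is that every staging and production vhost can just say `Use KirbyRules` instead of repeating the whole block.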

@jimbobrjames As long as we can dream, the .htaccess config that we really want would:

  • remove the trailing slash from ALL URLs - because the Kirby router can handle everything else from there
  • remove the ‘www’
  • re-route every request in the domain to https
  • ideally, not have the domain hard-coded into the file, so it can be used with any project, as well as on staging and production sites without modification.

Well, we can dream, right?.. :wink:
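For what it’s worth, here is a speculative sketch of that wish list - untested against a real Kirby install, so treat it as a starting point. It stays domain-agnostic by using `%{HTTP_HOST}` instead of a hardcoded name:

```apache
# Sketch: https + no-www + no-trailing-slash, with no hardcoded domain.
# Assumes mod_rewrite; place before Kirby's own rules in .htaccess.
RewriteEngine On

# 1. force https on every request
RewriteCond %{HTTPS} !=on
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]

# 2. strip a leading "www." from whatever host was requested
#    (%1 is the captured host without the www)
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
RewriteRule ^ https://%1%{REQUEST_URI} [R=301,L]

# 3. drop trailing slashes, except for real directories and the root
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)/$ /$1 [R=301,L]
```

Two caveats: a request that violates all three rules will hop through chained 301s rather than one redirect, and behind a proxy or load balancer that terminates TLS you may need to test `X-Forwarded-Proto` instead of `%{HTTPS}`.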

I tried Screaming Frog, and it’s useful, but the reports are not very user-friendly, so we ended up having to use tools like SEMrush. They are, however, SUPER expensive, and do have their quirks, as this latest issue with the trailing slash shows (yes, it IS an issue at their end…).

@texnixe The less I have to deal with Apache mods and configs, the better…