How to disable show: and tag: links or prevent them from being indexed?

Hi there,

I use DuckDuckGo on my personal Kirby website as an internal search engine and was recently surprised to find links with this structure amongst the search results:

https://yves.io/show:links/page:3

https://yves.io/tag:services/page:3

And here come my two and a half questions:

  • How can I disable show: and tag: links or prevent them from being indexed?
  • And how did DuckDuckGo (and Google too) find these links? I can’t remember having used (meaning having linked to) them anywhere.

I tried to find an answer in the documentation and failed miserably – sorry.

Thanks, Yves

These links stem from paginated pages that were called with tag/show parameters. So you must have some code somewhere in your home controller that does that.

Hm, I’ve just checked my templates, snippets, etc. – no signs of show: or tag: there. Also some random HTML source code I’ve checked doesn’t contain these in their link. And last but not least: none of my content files contain a tag field.

Maybe the search engines are now so intelligent that they know my site is built with Kirby and these are valid URI parts? What’s next? World domination? Just kidding – they dominate it already :wink:

Yep!

In the footer of the rendered page you have a link “Weiter”! If you click there, you get the “/page:x”! This is because you have paginated your page somewhere.
For details please look at https://getkirby.com/docs/reference/objects/pages/paginate and https://getkirby.com/docs/reference/objects/pages/pagination or at https://getkirby.com/docs/cookbook/templating/pagination

Have you ever used such links in the past? I can obviously add any such parameter to your (or my) Kirby websites without running into any errors. So maybe these links got cached somehow.

So if I open one of the links you posted above, I actually get a result, until I use a page number that doesn’t exist, e.g. https://yves.io/tag:green/page:100, which finally gives me the error page because the page is out of range.

Visiting https://yves.io or https://yves.io/tag:blue or https://yves.io/show:nothing always returns the same page.
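
As far as I understand it, this happens because any name:value segment in the URL is treated as a parameter rather than as part of the page path, so the home page is still what gets resolved. A rough illustration in plain PHP (this is not Kirby's actual parser, and `splitPathAndParams` is a made-up name):

```php
<?php
// Illustrative only: separates path segments from name:value parameters.
// Kirby's real parsing lives in its own URL classes and handles more cases.
function splitPathAndParams(string $path): array
{
    $segments = [];
    $params   = [];

    foreach (explode('/', trim($path, '/')) as $segment) {
        if ($segment === '') {
            continue;
        }
        if (strpos($segment, ':') !== false) {
            // "tag:services" becomes ['tag' => 'services']
            [$name, $value] = explode(':', $segment, 2);
            $params[$name] = urldecode($value);
        } else {
            $segments[] = $segment;
        }
    }

    return ['path' => implode('/', $segments), 'params' => $params];
}

$result = splitPathAndParams('tag:services/page:3');
// $result['path'] is an empty string (the home page);
// $result['params'] holds the tag and page values.
```

With no path segments left over, the request ends up at the home page no matter which parameter names are attached.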

Maybe we need a route or a redirect to prevent this from happening.

Hi, @yves.

In our Zero One theme, we are using this solution

<?php 
    // Truthy when either parameter is present; urldecode() is not needed for this check.
    $robots = (param('tag') || param('category')) ? 'noindex,nofollow' : 'index,follow,noodp';
?>

<meta content="<?= $robots ?>" name="robots">

Just change parameters to fit your custom taxonomy.
It is just a snippet you add to the header.


@anon77445132 – Thanks for your comment. The pagination is fully (well, mostly) understood – the tag: and show: parts are the difficult parts.

@texnixe – I assume that any Kirby page can be called with tag:xyz and show:xyz, so once it somehow got into the search engine’s index it is still shown since it produces some content. So I am in favour of your caching theory. The easiest way would be to disable tags and the thingy that is responsible for the show: other thingy. BTW, are these two other thingies documented somewhere?

@mrfreedom That’s definitely an option. Although you could argue that sending users to the error page instead could make sense.

Tag and show are just parameter names, they could be anything and would lead to the same result.

So you can either do it as suggested by @mrfreedom to prevent search engines from picking up such routes, or use a similar piece of code to send users to the error page. Whatever you think is best.
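
For the error page variant, a minimal sketch could look like this. It assumes you want to allow `page` (for pagination) and reject everything else; `hasUnexpectedParams` is a hypothetical helper name, not something Kirby ships:

```php
<?php
// Hypothetical helper: true if any parameter name is outside the allow-list.
function hasUnexpectedParams(array $params, array $allowed = ['page']): bool
{
    return array_diff(array_keys($params), $allowed) !== [];
}

// In a Kirby template or controller this might be used like so.
// params() and go() are Kirby helpers; 'error' assumes the default error page id.
//
//   if (hasUnexpectedParams(params())) {
//       go('error');
//   }
```

Note that `go()` issues a redirect, so the visitor lands on the error page URL rather than seeing the error under the original address.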

@mrfreedom – this is surely a way to go. I already have some code in place to prevent paginated pages from being indexed (but followed):

<meta name="robots" content="<?php if (isset($pagination) && $pagination->hasPages()) { echo 'noindex, follow'; } else { echo 'index, follow'; } ?>">

I’ll try to add your solution to the mix – thanks a lot!


At least we now know their name. Hopefully there are no other hidden parameter names. Maybe I should block any parameter-name: except page:?

With the params() helper, you can catch all params.

That hint was it:

<meta name="robots" content="<?php if (empty(params())) { echo 'index, follow'; } else { echo 'noindex, follow'; } ?>">

Mille grazie!