Multilanguage Setup - Remove duplicate content for SEO

My client's site is available in two languages, English and German. More languages are planned, especially "variations" of German such as Swiss or Austrian with adapted contact details etc., so I have the following language setup:

c::set('languages', array(
     array(
        'code' => 'en',
        'name' => 'English',
        'locale' => 'en',
        'url' => '/en',
     ),
     array(
        'code' => 'de-de',
        'name' => 'Deutschland',
        'locale' => 'de_DE',
        'url' => '/de',
        'default' => true,
     ),
));

Fairly normal so far, I guess. However, we just noticed that the German content is available under /de/kontakt as well as /kontakt. This is horrible for SEO. When I open the website I get automatically forwarded to /de, and I stay under this folder if I'm just navigating around innocently. If I remove the de from the URL, the URLs still work, and I don't "get it back", i.e. it stays gone while I navigate through the site.

What do you think would be the easiest solution to get rid of the duplication (at least on an SEO level)? Ideally, all URLs would contain the language code de afterwards.

I thought about setting up the router to handle that, but it seems very hacky?!

I think what is missing in your language configuration is the default language. If German is supposed to be your default language, the config should look like this:

c::set('languages', array(
    'en' => array(
      'name'    => 'English',
      'code'    => 'en',
      'locale'  => 'en_US',
      'url'     => '/en',
    ),
    'de' => array(
      'name'    => 'Deutsch',
      'code'    => 'de',
      'locale'  => 'de_DE.UTF-8',
      'default' => true,
      'url'     => '/de',
    ),
));

Thank you, but my config does actually look like that; I have edited it above… Something messed up my vim and I can't copy/paste anymore :unamused:

I wonder why this happens, because it shouldn't. The rewrite works fine on the start page but not on the subpages. I wonder if we should create an issue on GitHub.

Anyway, I would rather put a permanent redirect rule into the .htaccess instead of setting up a router.
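A minimal sketch of what such a rule could look like, assuming Apache with mod_rewrite, Kirby installed at the web root, and the two prefixes /en and /de from the config above. The exclusion of /panel is an assumption about the installation, and the block would have to sit above Kirby's own rewrite rules in the .htaccess:

```apache
<IfModule mod_rewrite.c>
RewriteEngine On

# Leave URLs that already carry a language prefix alone
RewriteCond %{REQUEST_URI} !^/(en|de)(/|$)
# Assumption: the panel lives under /panel and must not get a prefix
RewriteCond %{REQUEST_URI} !^/panel(/|$)
# Leave real files and directories (assets, thumbs, ...) alone
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Permanently redirect e.g. /kontakt to /de/kontakt
RewriteRule ^(.+)$ /de/$1 [R=301,L]
</IfModule>
```

The pattern `^(.+)$` does not match the empty path, so the start page is left to the forwarding that already works.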

I’d guess that this happens around here: https://github.com/getkirby/kirby/blob/master/branches/multilang/site.php#L173-L176

I discovered that I don't even need to redirect. I am now inserting a rel="canonical" tag if the de part is present in the URL:

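<?php
// Split the current URL path into segments and take off the first one
$url_segments = str::split(url::path(), '/');
$first = array_shift($url_segments);

// Only emit the canonical tag when the path starts with the language code
if ($first == 'de'): ?>
    <link rel="canonical" href="<?= $site->url() ?>/<?= implode('/', $url_segments) ?>" />
<?php endif; ?>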

This seems to work quite well and prevents Google from treating the two versions as different pages.

[added 1 day later:]
The following pointer avoids your problem of the German content being available under /de/kontakt as well as /kontakt.
It avoids all pages like /de/kontakt!

This trick is described in the Kirby Docs under Languages > Setup!
[/added]

If your default language has

      'url'     => '/',

you may want to use

  <link rel="canonical" href="<?php echo $site->url() ?>/<?php echo $page->uri() ?>" />

without your if-statement.
This link always points to the URL of the page in your default language.

This also works if Kirby runs in a subdirectory such as http://127.0.0.1/test_kirby/.

But this is not the case here. The default language in the OP's example has

'url' => '/de',

otherwise the problem wouldn’t even exist …

With the KISS principle I say: “Keep it simple and straightforward”!

In my experience it's usually easier to get around a problem than to keep dealing with it permanently!

Therefore, I would change the URL of the default language and thus eliminate the main problem. For SEO, you can set up a redirect for the path /de if needed, e.g. temporarily.
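As a rough sketch of that redirect, assuming Apache with mod_rewrite, Kirby at the web root, and the default language already switched to 'url' => '/'. The 301 status is an assumption; R=302 could be used instead if the redirect is only meant to stay for a while:

```apache
<IfModule mod_rewrite.c>
RewriteEngine On

# Send the bare /de (with or without trailing slash) to the start page
RewriteRule ^de/?$ / [R=301,L]
# Send e.g. /de/kontakt to /kontakt
RewriteRule ^de/(.+)$ /$1 [R=301,L]
</IfModule>
```

Both rules only match when de is a complete first path segment, so other paths that merely start with those letters are not affected.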