Having different robots.txt info based on subdomain


#1

We have robots.txt files for production, and a robots.txt file for staging that denies everything. We deploy everything with git, so we have removed robots.txt from the git repo and manually added the files to the servers.

This works great until we set up a new server and completely forget about the robots.txt file. Then Google indexes the staging server, plus some other stuff we’d rather not have indexed in production, and GRRR

We have config files set up for each subdomain. Is there a way we can piggyback off these config files to serve different robots.txt data from different servers?


#2

You could provide your robots.txt file via a route in your config.php; you don’t need a physical file.

c::set('routes', [
  [
    'pattern' => 'robots.txt',
    'action'  => function() {
      // code
    }
  ]
]);

For example, create a staging.txt file with your content for the staging site.

In your config.staging.php

c::set('routes', [
  [
    'pattern' => 'robots.txt',
    'action'  => function() {

      $html = f::read(kirby()->roots()->index() . '/staging.txt');
      return new Response($html, 'text/plain', 200);
    }
  ]
]);

#3

Excellent. That’s exactly the kind of thing I was wondering about.


#4

Hmmm…I’m not getting this to work, and it’s also not giving me the standard Kirby error page, just a default browser 404. That part might be because the .txt request isn’t actually being handled by Kirby yet, but if I change the pattern to just robots as a test, it still doesn’t work and I get the standard Kirby 404 instead.

Any suggestions for troubleshooting?


#5

Hm, maybe test something simple:

c::set('routes', [
  [
    'pattern' => 'robots.txt',
    'action'  => function() {

      $robots = 'hello';
      return new Response($robots, 'text/plain', 200);
    }
  ]
]);

Or a simple reroute to test the route:

c::set('routes', [
  [
    'pattern' => 'robots.txt',
    'action'  => function() {
      go('/');
    }
  ]
]);

If that doesn’t work either:
Do you use any other routes without having them in one array?

Is there a static robots.txt in your root folder and is access to that file maybe blocked?


#6

Haha, well part of this was my fault. I put this in the overall config, and when it didn’t work I put it in the localhost config. So then when I was modifying the overall config, it was getting overridden by the localhost config that didn’t work.

So the issue here is that the robots.txt pattern doesn’t work because the request never reaches Kirby: it’s not hitting index.php. So I think what I’ll try doing is rewriting robots.txt to /robots or something and then routing that.


#7

So this was in .htaccess so Kirby would leave robots.txt alone:

# leave robots.txt alone for search engines
RewriteRule ^robots.txt robots.txt [L]

Removing that allows the route to work using robots.txt as the pattern.


#8

Ok, fine, I thought there was something like that…

Of course, you shouldn’t use a physical file called robots.txt anymore because that would take precedence over the route.


#9

Yeah, I’m going to have one called robots-disallow.txt that’s in the standard config for everything but production, and then robots-production.txt that I load for the production domain.
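
A minimal sketch of that setup, assuming Kirby 2's domain-specific config files. The option name robots.file and the hostname www.example.com are placeholders made up for illustration, not anything Kirby defines; the route stays in the main config and only the file name changes per domain:

```php
<?php
// site/config/config.php
// Default for every host (staging, localhost, new servers): deny everything.
// 'robots.file' is a hypothetical option name chosen for this sketch.
c::set('robots.file', 'robots-disallow.txt');

c::set('routes', [
  [
    'pattern' => 'robots.txt',
    'action'  => function() {
      // Resolve the file name at request time, after the domain config has loaded.
      $name = c::get('robots.file', 'robots-disallow.txt');
      $text = f::read(kirby()->roots()->index() . '/' . $name);
      return new Response($text, 'text/plain', 200);
    }
  ]
]);

// site/config/config.www.example.com.php (placeholder production hostname)
// would then contain only:
//
//   c::set('robots.file', 'robots-production.txt');
```

Overriding a single option rather than calling c::set('routes', ...) again in the production config should avoid replacing the other routes defined in the main config. A forgotten server falls back to the deny-all file.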

Thanks for your help.


#10

I have set this up myself, and it’s working great; however, there’s a small issue with it on the staging server.

The first line of the robots file is getting a < at the start. It doesn’t happen locally. Both are on PHP 7.2.

<User-agent: *

My route looks like this…

  // Robots
  array(
    'pattern' => 'robots.txt',
    'action'  => function() {
      $html = f::read(kirby()->roots()->index() . '/robotsstaging.txt');
      return new Response($html, 'text/plain', 200);
    }
  ),

@fitzage I don’t suppose you ran into this and fixed it?

If anyone else knows why this is happening, please chime in :slight_smile:


#11

Sounds like some file encoding issue. Is the text file utf-8 encoded?


#12

Yes it is. I even deleted it and recreated it to be sure, and got the same result. The same file is being used locally; I think if it were an encoding issue I would get the same thing on local.


#13

And there are no spaces or invisible characters in the file?


#14

Nope :frowning: I’ve turned on invisibles and all I can see is line breaks at line ends.


#15

Are any other routes or templates also affected? I’ve seen this before, but not in routes.


#16

I always use this:

But I don’t think it works because of a possible route collision somewhere.


#17

Thanks @jenstornell, but I actually used that first and got the same result. It’s actually why I tried this method :slight_smile:


#18

Nope… everything else is working fine. The route for this is the last one declared. There are others before it.


#19

If you set $html to a simple string like $html = 'this is a test';, does it work?
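
If the stray character still shows up with a literal string, the text file itself is ruled out and something is probably printing a < before the response is sent. A rough way to look for it, as a sketch only: this is not a confirmed cause, and which branch fires depends on whether output buffering is enabled on that server.

```php
<?php
c::set('routes', [
  [
    'pattern' => 'robots.txt',
    'action'  => function() {
      // Without output buffering, early output forces the headers out,
      // and PHP can report the file and line where that output began.
      if (headers_sent($file, $line)) {
        error_log("Output started in $file on line $line");
      }

      // With output buffering, anything printed so far sits in the buffer.
      if (ob_get_level() > 0) {
        error_log('Buffered so far: ' . var_export(ob_get_contents(), true));
      }

      return new Response('this is a test', 'text/plain', 200);
    }
  ]
]);
```

If the log points at a PHP file, it is worth checking whether that file has any character before its opening <?php tag, especially a config file that is only loaded on the staging host.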


#20

Nope… I get:

<this is a test

:frowning: