How to access site data via cronjob?

Hey hey people,

I’m looking for a way to access the content of my Kirby website within a cron job script.
Now I am not entirely sure how to make it work, as I have only been getting errors :confused:

So far I have tried something like this:

#!/usr/bin/env php
<?php

// check if we are indeed on the command line
if (php_sapi_name() !== 'cli') {
  die();
}

// load Kirby
require __DIR__ . '/../kirby/bootstrap.php';

$props = [
  'roots' => [
      'index'   => __DIR__ . '/..',
      'content' => __DIR__ . '/../content',
      'site'    => __DIR__ . '/../site',
  ],
];

// initialize Kirby and site
echo "\033[1mLoading Kirby...\033[0m\n";
$kirby = new Kirby($props);
$site  = $kirby->site();

echo "\033[1mUpdating applications...\033[0m\n";
$seminarpages = $site->find('seminare')->children();

try {
  foreach ($seminarpages as $seminar) {
      // a field object is never "empty" to empty(); use isNotEmpty() instead
      if ($seminar->seminarapplications()->isNotEmpty()) {
          $seminarapplicationcollection = $seminar->seminarapplications()->yaml();

          foreach ($seminarapplicationcollection as $key => $value) {
              if (empty($value['anmeldeid'])) {
                  echo "missing application ID found!\n";
                  dump($key);

                  // generate a unique application ID
                  $seminarapplicationcollection[$key]['anmeldeid'] = md5(uniqid('', true) . random_bytes(20));
              }
          }

          // updating pages requires authentication
          $kirby->impersonate('kirby');
          $updatedpage = $seminar->update([
              'seminarapplications' => Data::encode($seminarapplicationcollection, 'yaml')
          ]);
          echo "job's done!\n";
      }
  }
} catch (Exception $e) {
  echo $e;
}

echo "\033[32mSuccessfully updated all the applications.\033[0m\n";

This is based on the algolia-index script that’s on the public repository of the Kirby website.

One thing that's different from that repository, however, is that I unfortunately can't put the script anywhere else but the content folder. Maybe that's also what's giving me issues.
The way I have to call it is like this: mysite.com/scripts/script.php

I am updating my Algolia index that way as well, and that one seems to work just fine, so I am at a bit of a loss right now.

The error message I’m getting isn’t helping at all though…

<!DOCTYPE html>
<html>
 <head>
  <meta charset="utf-8">
  <style type="text/css">
   html, body, #partner, iframe {
                height:100%;
                width:100%;
                margin:0;
                padding:0;
                border:0;
                outline:0;
                font-size:100%;
                vertical-align:baseline;
                background:transparent;
            }
            body {
                overflow:hidden;
            }
  </style>
  <meta content="NOW" name="expires">
  <meta content="index, follow, all" name="GOOGLEBOT">
  <meta content="index, follow, all" name="robots">
  <!-- Following Meta-Tag fixes scaling-issues on mobile devices -->
  <meta content="width=device-width; initial-scale=1.0; maximum-scale=1.0; user-scalable=0;" name="viewport">
 </head>
 <body>
  <div id="partner">
  </div>
  <script type="text/javascript">
   document.write(
                    '<script type="text/javascript" language="JavaScript"'
                            + 'src="//sedoparking.com/frmpark/'
                            + window.location.host + '/'
                            + 'IONOSParkingDE'
                            + '/park.js">'
                    + '<\/script>'
            );
  </script>
 </body>
</html>

Also, one thing I am really concerned about is security…
I am not very experienced in that field, so I’d appreciate any advice you can give me.

bump

Actually I am not sure what you are trying to achieve.

A cron job is usually a program/script that runs in the background without user interaction; it is not intended to be started via a URL. Hence there is no output via echo unless you redirect it into a file. The script should be started with php /path/to/myscript.php from the command line of your shell.
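For reference, a typical crontab entry for such a script might look like this (the paths are placeholders):

# run the script every night at 3:00 and append its output to a log file
0 3 * * * php /path/to/myscript.php >> /path/to/cron.log 2>&1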

What happens if you run the script that way from the command line?

Additionally, I don't think such a script belongs in the content folder. I don't understand why you think it needs to be there (probably because you want to call it in the browser). And where exactly do you get this parking webpage (which is usually shown when a parked domain is called) as a response?

If you want to call a script via URL, you can use a route. But that’s another story.
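For illustration, registering such a route in site/config/config.php could look like this (the pattern name is just an example, not from the original posts):

<?php

return [
    'routes' => [
        [
            'pattern' => 'cron/update-applications',
            'action'  => function () {
                // the update logic from the script above could live here
                return 'cron job executed';
            }
        ]
    ]
];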

I wish I could start it from the command line.
But my customer's hosting provider (IONOS) doesn't seem to allow it.
At least I couldn't find out how to do it there.

The cron job is called by them via HTTP GET, which I already consider weird.
It's probably just pinging a certain URL.

That’s also why it’s in the content folder as of now.
My guess is I’ll have to move it over to the plugins folder later on and use routes?

But I am honestly a bit clueless as to how to make it work with them.

I got rid of the parked-page error; I had a typo in my URL that I failed to see.
Now it says that the cron job has been executed successfully.
However, nothing has happened the way I would have expected.

Also, the content of this cron job is simply a test for accessing and updating page content.
As soon as I figure out how to do that, I'll use this knowledge to update other parts of the site and also to send out e-mails (for example, reminder e-mails telling applicants that the event they applied for is happening within a week from now); a rough sketch follows below.
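As a sketch of what such a reminder could look like with Kirby's built-in $kirby->email(), assuming a 'startdate' field on each seminar page and an 'email' key in the application data (both names are made up here):

// hypothetical: remind applicants of seminars starting within the next week
foreach ($site->find('seminare')->children() as $seminar) {
    $start = strtotime($seminar->startdate()->value());

    if ($start !== false && $start > time() && $start < strtotime('+1 week')) {
        foreach ($seminar->seminarapplications()->yaml() as $application) {
            if (!empty($application['email'])) {
                $kirby->email([
                    'from'    => 'noreply@example.com',
                    'to'      => $application['email'],
                    'subject' => 'Reminder: your seminar starts next week',
                    'body'    => 'Your seminar "' . $seminar->title() . '" starts within the next week.'
                ]);
            }
        }
    }
}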

If it's called via HTTP GET, then don't do that CLI check above.

You could also use my Janitor plugin to create a job for your task. That would allow you to run it with a secret-protected URL (and a Panel button if you want that).

Using Janitor might make setting up the script easier, since you don't need a script file but create a Kirby plugin for the job instead. Also, testing it locally might be a bit simpler that way. But that's just my opinion.
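Independent of Janitor's actual API, the general idea of a secret-protected URL can be sketched as a plain route like this (the parameter name and the secret are made up):

[
    'pattern' => 'cron/update-applications',
    'action'  => function () {
        // reject any request that doesn't carry the expected secret,
        // e.g. mysite.com/cron/update-applications?secret=...
        if (get('secret') !== 'some-long-random-string') {
            return new Kirby\Http\Response('Forbidden', 'text/plain', 403);
        }

        // run the actual job here
        return 'job executed';
    }
]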


To be honest, I don't feel comfortable having these scripts reachable via HTTP GET, as I have next to no knowledge in terms of security and wouldn't know how to make them safer.
So I’ll gladly give your janitor plugin a shot.

Thanks for the suggestion!

Haven’t yet set up any regular jobs, but the way this works so far is perfect.
This’ll save me a couple of headaches!

Thanks for your help!

If I need any help with it, I’ll open up another thread.


CLI requests to Kirby don't have a request method.

You can make CLI-only routes to do something like this:

// resolve the URI from the CLI argument or the request path
$uri = $_SERVER['argv'][1] ?? urldecode(parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH));

// dispatch with a pseudo 'CLI' method when running on the command line
$this->Io((new Router($this->routes))->call($uri, Server::cli() ? 'CLI' : ($_SERVER['REQUEST_METHOD'] ?? 'GET')));

In the route method you can now use CLI.
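A route limited to that method could then look like this (a sketch, assuming the router setup above):

[
    'pattern' => 'cleanup',
    'method'  => 'CLI', // only matches when the router is called with 'CLI'
    'action'  => function () {
        // job code that should only ever run from the command line
        return 'done';
    }
]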