Using Google Cloud Storage for the content folder

Here are the results of my research into this topic:

When moving a CMS to Google Cloud Platform, there are three main products to consider. I’ve ordered them by level of abstraction, starting with the lowest:

Google Compute Engine

  • It’s a regular VM in the cloud
  • You need to update and configure the OS, runtime, etc. yourself
  • Has persistent storage built in
  • Google Cloud Storage can be mounted as a filesystem (via gcsfuse)

Running either a database-backed or a flat-file CMS on Compute Engine is very easy. You start a new machine, set up your runtime environment (e.g. PHP) and your network ports, and you’re good to go. You can set up a DBMS on the same machine, use another Compute Engine instance, or use a cloud database (like Cloud SQL). This will work with every existing CMS, but you’re not gaining anything in terms of abstraction. It’s just like running your own server: you still need to update the operating system and runtime environment and take care of security.

Google Cloud Run

  • Runs Docker containers in the cloud
  • Has no persistent filesystem storage options

Google Cloud Run is a product for running single containers in the cloud. The advantage is that you can pick a pre-configured Docker image that comes with a secure and correct runtime configuration for your CMS, so you only have to take care of installing the CMS itself. However, this approach is problematic: a CMS typically needs to write to the filesystem. Even database-backed systems use the filesystem to store configuration, cache and session data. Google Cloud Run does not offer any persistent storage options.

Google App Engine

  • Runs applications in the cloud (no container)
  • No need to configure OS or runtime
  • Has no persistent filesystem storage options
  • BUT has PHP Stream Wrappers

Google App Engine seems like the most promising candidate for running a CMS in the cloud, because it’s trying to abstract at the “app” level. With App Engine we don’t need to care about the OS: we just throw our app code at it and it’s supposed to work out of the box, unless you need persistent storage. Just like Cloud Run, App Engine has no persistent storage options. Not only is storage not persistent, you can’t even write to the local filesystem.

But App Engine does support PHP as a runtime environment, and PHP has a great feature for situations like this: stream wrappers. They let PHP filesystem functions like fopen() or fread() work on other storage systems, not only the local filesystem. And luckily, Google offers a stream wrapper implementation for reading from and writing to Google Cloud Storage buckets.
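To show the mechanism itself, here is a minimal sketch of a custom stream wrapper. It is not Google’s implementation: the class name MemoryStreamWrapper and the mem:// scheme are made up for this example, and a plain PHP array stands in for the bucket. It assumes PHP 7.4 or newer.

```php
<?php
// Hypothetical in-memory wrapper; the array stands in for a remote bucket.
final class MemoryStreamWrapper
{
    /** @var array<string, string> file contents keyed by full URL */
    public static array $files = [];

    /** @var resource|null PHP assigns the stream context here */
    public $context;

    private string $path = '';
    private int $position = 0;

    public function stream_open(string $path, string $mode, int $options, &$openedPath): bool
    {
        $this->path = $path;
        $this->position = 0;
        if ($mode[0] === 'w') {
            self::$files[$path] = '';          // create or truncate
            return true;
        }
        return isset(self::$files[$path]);     // reading needs an existing entry
    }

    public function stream_read(int $count): string
    {
        $chunk = substr(self::$files[$this->path], $this->position, $count);
        $this->position += strlen($chunk);
        return $chunk;
    }

    public function stream_write(string $data): int
    {
        self::$files[$this->path] .= $data;
        return strlen($data);
    }

    public function stream_eof(): bool
    {
        return $this->position >= strlen(self::$files[$this->path]);
    }

    public function stream_stat(): array
    {
        return ['size' => strlen(self::$files[$this->path])];
    }

    public function stream_flush(): bool
    {
        return true;
    }

    public function stream_close(): void
    {
    }
}

// After registration, the ordinary file functions accept mem:// URLs.
stream_wrapper_register('mem', MemoryStreamWrapper::class);

file_put_contents('mem://content/home.txt', 'Title: Home');

$handle = fopen('mem://content/home.txt', 'r');
$text = fread($handle, 1024);
fclose($handle);

echo $text, PHP_EOL;
```

The calling code never learns where the bytes live, which is exactly why this looked like a drop-in solution for a flat-file CMS.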

So in theory, using this stream wrapper, it should be possible to run a flat-file CMS (or even a database-backed CMS, when using an external database server) on App Engine. However, it turns out that it’s not that easy, at least not for Kirby.

Using stream wrappers, MOST filesystem functions can be made to work on GCS buckets, but not all of them. One of the unsupported ones is PHP’s realpath() function, which returns a canonicalized absolute path (resolving relative parts like ‘.’ and ‘..’). It is used heavily by Kirby CMS (31 times), and it will always return false when used with the GCS stream wrapper, which crashes the CMS.
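The behaviour is easy to reproduce without a bucket, because realpath() only ever asks the operating system about local paths and does not go through stream wrappers at all. A small sketch (the bucket name example-bucket is made up, and I’m assuming there is no local directory called “gs:” in the working directory):

```php
<?php
// A local directory resolves to a canonical absolute path.
$local = realpath(sys_get_temp_dir());

// A wrapper URL is treated as a (non-existent) local path, so this fails.
// 'example-bucket' is a hypothetical bucket name.
$remote = realpath('gs://example-bucket/content/../site');

var_dump(is_string($local));
var_dump($remote);
```

Any code that uses the result as a path without checking for false will break at this point.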

To make this work, one would have to override the realpath() function to return sane values. Unfortunately, Kirby uses both its own realpath() wrapper in the F class inside the Toolkit and the native function in a couple of other places. So changing this behavior would require changes to the Kirby codebase in multiple places, which would make updates more complicated. It’s probably not a good idea for most people.
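For illustration, a replacement that returns “sane values” could resolve the path purely as a string. This is only a sketch under my own assumptions, not something Kirby ships: normalizePath() is a made-up name, it does not check whether the path exists, it does not resolve symlinks, and it does not make relative paths absolute the way realpath() does.

```php
<?php
/**
 * Hypothetical realpath() stand-in: resolves '.' and '..' segments by
 * string manipulation only, so it never touches the filesystem.
 */
function normalizePath(string $path): string
{
    // Split off an optional scheme such as "gs://".
    $scheme = '';
    if (preg_match('#^([a-z][a-z0-9+.-]*://)(.*)$#i', $path, $matches)) {
        $scheme = $matches[1];
        $path = $matches[2];
    }

    $absolute = strncmp($path, '/', 1) === 0;

    // With a scheme, the first segment is the bucket and must never be popped.
    $floor = $scheme !== '' ? 1 : 0;

    $segments = [];
    foreach (explode('/', $path) as $segment) {
        if ($segment === '' || $segment === '.') {
            continue;                      // skip empty and current-dir segments
        }
        if ($segment === '..') {
            if (count($segments) > $floor) {
                array_pop($segments);      // step up one level
            }
            continue;
        }
        $segments[] = $segment;
    }

    return $scheme . ($absolute ? '/' : '') . implode('/', $segments);
}

echo normalizePath('gs://example-bucket/content/./1-home/../2-about'), PHP_EOL;
```

The function itself is small. The real cost is that every realpath() call site in Kirby would have to be patched to use it.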

Also, even if we could get the stream wrapper approach to work, there might be more incompatible filesystem functions. This was just the first one I came across that did not work. And even if everything worked, it would still be problematic, because GCS access is pretty slow. So we would definitely also need a non-filesystem caching solution (Redis seems like the best idea, because it’s available in Google Cloud).
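The caching idea would roughly be a read-through cache in front of the bucket. Here is a minimal sketch, assuming PHP 7.4 or newer. The class name ReadThroughCache is made up, a plain array stands in for Redis, and the loader closure stands in for the slow bucket read.

```php
<?php
/**
 * Hypothetical read-through cache. The $store array stands in for Redis;
 * a real setup would replace it with calls to a Redis client.
 */
final class ReadThroughCache
{
    /** @var array<string, string> cached contents keyed by path */
    private array $store = [];

    /** @var callable loads the contents for a path on a cache miss */
    private $loader;

    /** Number of times the slow loader had to be called. */
    public int $misses = 0;

    public function __construct(callable $loader)
    {
        $this->loader = $loader;
    }

    public function get(string $path): string
    {
        if (!array_key_exists($path, $this->store)) {
            $this->misses++;
            $this->store[$path] = ($this->loader)($path);
        }
        return $this->store[$path];
    }
}

// Stand-in for a slow read such as file_get_contents('gs://...').
$cache = new ReadThroughCache(fn (string $path): string => "contents of $path");

$first = $cache->get('content/home.txt');   // miss: calls the loader
$second = $cache->get('content/home.txt');  // hit: served from the store

echo $cache->misses, PHP_EOL;
```

This leaves out invalidation, which matters as soon as content gets edited through the panel.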
