In the cache folder, the filenames look like this:
3093650eed6b171b631b913c30961166
The data inside the files looks like this:
4f3a 3131 3a22 4361
What is the reason for that? Is it for security reasons?
Alternative
The alternative would be like this:
Filename
Underscore as a URL-safe character instead of /.
projects_project-a
Data
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width,initial-scale=1.0">
[...]
I bet there is a good reason for not going with the alternative? I’m curious what that is. If it’s for security reasons, wouldn’t it be easy to decrypt it?
I don’t know the details behind Kirby’s caching mechanism, but the filename looks like some kind of hash over the content. This should give a unique file for each different content.
Digging a bit through the code gives me these hints:
Take a look at the render function in the kirby.php file. There you can see that the cacheID is an md5 sum of the URL and page content.
After creating the cacheID, it is checked whether the site was modified after the cache was created. If the cache is still valid, the page will be loaded from the cache (get function). Otherwise a new cache file will be created (set function).
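The flow described above can be sketched roughly like this. It is only an illustration, not Kirby's actual code: `renderCached`, the in-memory `$cache` array and its `created`/`html` fields are hypothetical names, and building the cache ID from URL plus content is an assumption based on the description.

```php
<?php

// Hypothetical sketch of "load from cache if still valid, otherwise re-render".
// None of these names come from Kirby itself.
function renderCached(array &$cache, string $url, string $content, int $modified, callable $render): string
{
    // Assumption: the cache ID is an md5 hash over URL and content.
    $cacheId = md5($url . $content);

    // Cache hit: an entry exists and was created after the last modification.
    if (isset($cache[$cacheId]) && $cache[$cacheId]['created'] >= $modified) {
        return $cache[$cacheId]['html'];
    }

    // Cache miss or stale entry: render again and store the result.
    $html = $render($content);
    $cache[$cacheId] = ['created' => time(), 'html' => $html];

    return $html;
}

$cache  = [];
$calls  = 0;
$render = function (string $content) use (&$calls): string {
    $calls++;
    return '<p>' . $content . '</p>';
};

$first  = renderCached($cache, 'projects/project-a', 'Hello', time() - 60, $render);
$second = renderCached($cache, 'projects/project-a', 'Hello', time() - 60, $render);
// $calls is 1 here: the second call was served from the cache.
```

If the `$modified` timestamp passed in is newer than the stored `created` value, the entry is treated as stale and rendered again.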
The standard cache driver is file, which can be found in the toolkit at lib/cache/driver/file.php. There you can see that the set function takes some key as the filename (the md5 value from before). The file content is some serialization of whatever you decide to put in the cache.
Serializing the data is needed in this case, because the cache driver is not only meant to be used for the html output, but other data as well. So you could store an array in the cache and retrieve it later.
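To make the idea concrete, here is a minimal file cache that uses the key as the filename and serializes whatever it is given. This is a sketch under assumptions, not the toolkit's file driver: `cacheSet`, `cacheGet` and the temporary directory are made up for the example.

```php
<?php

// Hypothetical file-based cache, loosely modeled on the description above.
// This is not the toolkit's file driver; names and layout are assumptions.
function cacheSet(string $dir, string $key, $value): void
{
    // The key (e.g. an md5 hash) is used directly as the filename.
    file_put_contents($dir . '/' . $key, serialize($value));
}

function cacheGet(string $dir, string $key, $default = null)
{
    $file = $dir . '/' . $key;
    if (!is_file($file)) {
        return $default;
    }
    return unserialize(file_get_contents($file));
}

// Use a throwaway directory so the example does not touch a real cache.
$dir = sys_get_temp_dir() . '/cache-sketch-' . uniqid();
mkdir($dir);

$key = md5('projects/project-a');
cacheSet($dir, $key, ['title' => 'Project A', 'tags' => ['web', 'print']]);

$restored = cacheGet($dir, $key);
// $restored is the same array again, including the nested 'tags' list.
```

Because the value goes through `serialize()`, the same two functions work for an HTML string, an array, or an object.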
I haven’t taken a deeper look into which data is actually stored for the pages, but I think there is a good reason why it’s not only the HTML output. That’s only what I found out from a quick look at the source code, though.
No, definitely not. It wouldn’t make sense to add “security” for a cache.
That’s correct. Using a hash for the filename is very important: Replacing characters with URL-safe ones would lead to ambiguity. There could very easily be two pages with the same cache key. Hashing the original, unmodified ID avoids those collisions.
Regarding content serialization: The 4f3a 3131 3a22 4361 you saw is just a hex representation of the text. Try reopening the file as UTF-8 and you should see the source code. Kirby does not “encrypt” the cache in any way; this is just your editor displaying the file in a different format.
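You can check this yourself by converting the quoted hex groups back to bytes. The snippet below is just a quick illustration using PHP's built-in `hex2bin()`; the variable names are arbitrary.

```php
<?php

// The four hex groups quoted from the cache file.
$hex = '4f3a 3131 3a22 4361';

// Remove the spaces and convert the hex digits back to bytes.
$text = hex2bin(str_replace(' ', '', $hex));

echo $text; // O:11:"Ca
```

The output `O:11:"Ca` is readable text, and it matches how PHP's `serialize()` begins an object whose class name is 11 characters long.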
To give a bit more insight into the format: what you’ll find in the cache files is a serialized PHP value object. You can find the class for that here: kirby/vendor/getkirby/toolkit/lib/cache/value.php
I abstracted the value that is stored in the cache so that important information, such as the expiry date, can be kept with it.
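As a rough idea of what such a wrapper can look like, here is a small sketch. The class name `CacheValue`, its properties and its methods are hypothetical; the real class in value.php may be structured quite differently.

```php
<?php

// Hypothetical value wrapper; the real class lives in the toolkit's value.php
// and may look quite different.
class CacheValue
{
    public $value;
    public $created;
    public $minutes;

    public function __construct($value, int $minutes = 0)
    {
        $this->value   = $value;
        $this->created = time();
        $this->minutes = $minutes;
    }

    // Expiry timestamp, or null if the entry never expires.
    public function expires(): ?int
    {
        return $this->minutes > 0 ? $this->created + $this->minutes * 60 : null;
    }

    public function isExpired(): bool
    {
        $expires = $this->expires();
        return $expires !== null && $expires < time();
    }
}

$stored   = serialize(new CacheValue('<!DOCTYPE html>...', 10));
$restored = unserialize($stored);
// $stored starts with O:10:"CacheValue", the serialized-object prefix.
```

Storing the object instead of the bare HTML means the expiry information travels with the cached value, at the cost of the file no longer being plain HTML.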
Those are good reasons.
An md5 of projects_project-a is always 21d0279566a2e90f24ef9e71442ab81f. Therefore I don’t see why one is more unique than the other. Both are like IDs.
But from a larger perspective I can understand it: it’s less likely that a plugin developer creates a hash the same way.
A cache plugin idea
I have a new idea for a cache plugin. If you have a similar idea for a core feature, please stop me now.
The built-in cache clears itself when a page is saved, etc. My idea is a smarter cache that only clears what is needed.
Clear current
When I save a page, for example, it should clear the cache of that page, but no other cached pages.
Clear dependencies
A common problem with only clearing the current page is that I might use information from this page elsewhere, in archives or on the start page.
Therefore it should also clear dependency pages like the archive and the start page if content from this page appears there. To make this work I need to specify these dependencies. I don’t know if I can do that in a controller with, for example:
return array(
  'dependencies' => array(
    'home',
    'projects'
  )
);
For it to work, I guess the controller needs to run in the panel and needs to run before the hooks (untested).
Yes and no. If you replace characters and then run the value through a hash function, the result is indeed the same.
Kirby doesn’t do that character replacement though. The result is two different hashes for two different pages that would have the same hash after character replacement.
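A small example of the ambiguity, using two made-up page IDs. This only demonstrates the principle with PHP's `str_replace()` and `md5()`; it is not how Kirby builds its cache IDs internally.

```php
<?php

// Two hypothetical page IDs that differ only in "/" versus "_".
$a = 'projects/project-a';
$b = 'projects_project-a';

// Replacing "/" with "_" makes both IDs identical, so they would share a file.
$flatA = str_replace('/', '_', $a);
$flatB = str_replace('/', '_', $b);
var_dump($flatA === $flatB); // bool(true)

// Hashing the unmodified IDs keeps them apart.
var_dump(md5($a) === md5($b)); // bool(false)
```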
That’s not it. Plugin developers should prefix their cache IDs anyway.
You could, but I see several problems:
- Performance: If those pages again depend on other pages, Kirby needs to run a lot of controllers just to find out which caches to clear.
- Controllers are not based on pages but on templates; not all pages with the same template depend on the same pages.
- Developers would need to manually update the controllers whenever page UIDs change. This is not going to work, because clients can change page UIDs and the cache would then break completely.
There has already been a similar discussion about selective cache clearing in another topic. I think the best way this could be solved in the core is to clear only pages, not plugin caches. Clearing only some pages is going to create more issues than it solves though, so Kirby will still clear all pages.
I have changed my mind. I no longer want to build a better cache.
For some reason, I just need to remove the relevant cache entry by talking to memcached directly (bash scripts using libmemcached-tools), without going through Kirby.
I’m currently trying very hard to understand the need to keep the key name “internal” and opaque. I understand it provides some kind of automagic cache busting, but here I do want to manage how things are cleared, and I don’t see the entry point in the Kirby context.
I’d be better off using a part of the URL as a cache key so that I’d know in advance which one to clear, but currently the hashing makes that impossible.
Not being a Kirby expert, what should be done in that case?
I think Kirby 3 will have your back here. The cache system has also been updated and will be more “transparent” about what is going on.