Will find() or similar get slower with a big collection?

We had a similar discussion a while ago in Slack and I am still not sure what the correct answer to the question would be…

Let’s say I have 3,000 pages within my subfolder and in my code I am using something like:

// children() contains the 3,000 pages mentioned above
$somepage = $page->children()->find('exact-uid');
// do something with $somepage then... e.g. $somepage->update([...]) or $somepage->changeStatus()

Is Kirby going to run through all of the pages to get the result of my query, or will it go straight to the page without affecting performance? And if it does affect performance, is there an alternative, better way of doing it?

I understand that running something like findBy() will obviously get slower in this case, and for such cases there’s already @bnomei's auto_id plugin.

If you call $page->children(), Kirby will read the list of subdirectories in that page and initialize a Pages collection. But it won’t yet read all text files – just the directories and the templates of the respective pages.

If you then call find(), Kirby will return the Page object it has already created before.

So using find() is definitely better than findBy() as Kirby does not need to read all content files, but it still needs to get a list of directories.

So basically it will still run through 3,000 folders, which kind of slows things down? Or is running through the folders still very quick and not resource hungry (in comparison to going through all folders and reading their contents)?

Is there any solution to keep everything fast and less resource hungry (except adding more and more subfolders such as /date/month/)?

It should still be pretty fast, a lot quicker than findBy(). But the best way to find out would be to test it on your server directly by creating a lot of empty pages with a script. The performance depends a lot on the server hardware and also software.
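A minimal sketch of such a test script, in plain PHP and without Kirby itself. It only models the difference described above (listing folders versus also reading every content file); the function names, the temp folder and the cart.txt file name are assumptions for illustration, and real numbers will depend on your server.

```php
<?php
// Hypothetical benchmark helpers; none of these names come from Kirby.

/** Create $count page folders, each with a small content file. */
function createDummyPages(string $root, int $count): void
{
    for ($i = 0; $i < $count; $i++) {
        $dir = $root . '/' . md5((string) $i);
        mkdir($dir, 0777, true);
        file_put_contents($dir . '/cart.txt', "Title: Cart $i\n");
    }
}

/** Only list the subfolders (roughly the work behind children()). */
function listPageDirs(string $root): array
{
    return array_values(array_diff(scandir($root), ['.', '..']));
}

/** Additionally read every content file (roughly the extra work of findBy()). */
function readAllContent(string $root): array
{
    $content = [];
    foreach (listPageDirs($root) as $uid) {
        $content[$uid] = file_get_contents("$root/$uid/cart.txt");
    }
    return $content;
}

/** Delete the dummy pages again. */
function removeDummyPages(string $root): void
{
    foreach (listPageDirs($root) as $uid) {
        unlink("$root/$uid/cart.txt");
        rmdir("$root/$uid");
    }
    rmdir($root);
}

$root = sys_get_temp_dir() . '/kirby-bench-' . uniqid();
createDummyPages($root, 3000);

$start = microtime(true);
$dirs = listPageDirs($root);
$listTime = microtime(true) - $start;

$start = microtime(true);
$content = readAllContent($root);
$readTime = microtime(true) - $start;

printf("list %d dirs: %.4fs, read all content: %.4fs\n", count($dirs), $listTime, $readTime);

removeDummyPages($root);
```

To measure Kirby itself, you would point the generator at a folder inside your content directory and time the actual find() and findBy() calls instead.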

As far as I can tell, not at the moment (with the query syntax at least). Kirby needs the object structures internally. A more efficient way would be risky as there may be edge-cases in which the objects are not loaded properly and stuff fails with no apparent reason. Maybe we will be able to optimize it later, but consistency and robustness are the highest priorities.

If you really need it to be as fast as possible, you could initialize the Page object manually with new Page().

I mostly test these use cases on my local machine and haven’t had a go with Kirby v3 for these kinds of things.

The live server specs are a lot more powerful, but in short tests (previously with v2) I could still see the performance drop: with some peak traffic, all 8 cores of the test server ran at 100% and page rendering shot up from 0.2s to 5s.

The other way is to use the page factory thingy and override the page model to use SQL/SQLite, as shown by Bastian in Slack. I’m not sure if this is flexible enough for me, and while testing his “demo” it seemed not to work perfectly (maybe because of the beta stage at that point).

I’ve thought about this as well. In the new K3 you have 2 new concepts which could be of great help battling this:

I haven’t played with them yet, but I plan to test setting up a custom collection for “big folders” and cache those manually. I’m kinda sure this will work, but I’ll specifically look for drawbacks where you won’t suspect them (e.g. $site->index() or search pages)…

yeah for big collections, depending on what needs to be done, i am thinking about the auto_id plugin…

but it’s just like let’s say we have a shop like shopkit, a user comes to the shop,

  • it generates a random page which acts as shopping cart and will be updated several times during the shopping process (e.g. adding products)
  • later it will be updated to add customer data like addresses and so on
  • when payment is under way, your cart (which is a draft at first) will get its status changed to listed / unlisted…

so we basically modify one specific page (which is looked up through a session -> e.g. the session saves a random uid)

so we have

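// pseudo (s::get() is the v2 helper; in v3 the session is read via kirby()->session()->get())
$cartId = kirby()->session()->get('cart');
$cart = page('shop')->children()->find($cartId);
// I am not sure if $cart = page('shop/' . $cartId); would already save time??
// v3 also has the filter to check for drafts only.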

which will then later be accessed in all kinds of ways, like

$cart->update([...]); // add cart items, customer, invoice number, and so on
$cart->changeStatus('listed'); // e.g. mark as paid; internally a draft is an order that is not completed yet

so let’s say the shop gets lots of traffic and there are about 500 (or even more) unfinished carts: it would get more and more unresponsive and resource hungry as it goes (until, let’s say, a draft is deleted because it hasn’t been completed after a few days).

I’m not at all familiar with shopkit nor with your use case, but I’d only “save” the cart to the server somewhere when it has actually become an order (and maybe not even as a “page”, as this can be a source of performance issues later on if you have lots of them). For all the other stuff I’d use sessions and/or localStorage in the client.

We built our license and upgrade management tool for Kirby 3 with Kirby as well and used a trick for that: You can create subdirectories for the first few characters of the cart ID.

So for example 99afb6d7741c5e159d173be768b0a056 becomes 99afb/6d7741c5e159d173be768b0a056.
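A minimal sketch of that split in plain PHP. The function name shardedPath and the default prefix length of 5 are assumptions taken from the example ID above, not something Kirby provides.

```php
<?php
// Hypothetical helper, not part of Kirby: turn an ID into "prefix/rest".
function shardedPath(string $id, int $prefixLength = 5): string
{
    if ($prefixLength < 1 || strlen($id) <= $prefixLength) {
        throw new InvalidArgumentException('ID is too short to be split');
    }

    // The first characters become the parent folder, the remainder the page folder.
    return substr($id, 0, $prefixLength) . '/' . substr($id, $prefixLength);
}

echo shardedPath('99afb6d7741c5e159d173be768b0a056'), "\n";
// 99afb/6d7741c5e159d173be768b0a056
```

The resulting path could then be appended to the parent path when looking up the page, so each lookup only has to list two much smaller folders instead of one huge one.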

Otherwise I agree with @bvdputte. Volatile stuff should probably not be stored as a page.

Yes, keeping it in the session as long as possible will not slow things down. But as soon as there’s any asynchronous payment process, you have to write it to disk so that e.g. PayPal can make its callback to the server, and then we have to find the page as mentioned above.

in this case i am referencing shopkit because in that example it is written to disk all the time.

That’s definitely true. But what you could do is to keep open carts in the session and only migrate them to the file system once the payment is actually initialized.
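A rough sketch of that idea in plain PHP. Everything here is an assumption for illustration: the function names, the order.json file and the plain array standing in for the session are hypothetical and not Kirby or shopkit APIs.

```php
<?php
// Hypothetical sketch: the cart lives in the session and only touches the disk at payment time.

/** Add a product to the cart kept in the session array (no disk access). */
function addToCart(array &$session, string $product, int $quantity): void
{
    $current = $session['cart']['items'][$product] ?? 0;
    $session['cart']['items'][$product] = $current + $quantity;
}

/** Write the cart to disk once the payment is initialized; returns the new order ID. */
function persistCart(array &$session, string $ordersRoot): string
{
    $cart = $session['cart'] ?? ['items' => []];
    $id   = bin2hex(random_bytes(16)); // random 32-character ID

    $dir = $ordersRoot . '/' . $id;
    mkdir($dir, 0777, true);
    file_put_contents($dir . '/order.json', json_encode($cart));

    // From here on the session only needs to remember the order ID.
    unset($session['cart']);
    $session['order'] = $id;

    return $id;
}

$session = []; // stands in for the real session storage
addToCart($session, 'tshirt', 2);
addToCart($session, 'tshirt', 1);

$ordersRoot = sys_get_temp_dir() . '/orders-' . uniqid();
$orderId    = persistCart($session, $ordersRoot);
echo "order $orderId written\n";
```

This way abandoned carts never create a folder at all; only carts that reach the payment step end up on the file system, where the payment callback can find them by ID.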

that’s what i would do :kissing_heart:

Knowing when to invalidate the cached collection is the biggest problem, but the readme of my autoid plugin might get you started with that.

In theory, would it be possible to combine virtual pages with sessions to make use of a file-less cart and when it comes to a payment we can then copy/move the virtual page to filesystem?

https://getkirby.com/docs/guide/routing#virtual-pages

My thought would just be to have the (virtual) page ready, so I can edit it, make use of most Kirby functions and have everything set, in comparison to extracting everything from the session and dumping it into a page.

That’s a nice idea! I’m not sure if it would help with performance though.

Do virtual pages cost performance?
In comparison to saving each cart directly to disk and then modifying it further, a virtual page shouldn’t consume much more resources than just working with sessions?

Only once the virtual page is copied to disk will the performance start dropping. I guess I will make an attempt with a virtual page to see if this is an acceptable compromise (or if it even works).

No, what I meant was that if you have a hybrid approach, you would still need to check the file system. The only way around that would be a completely separate page structure for “session carts”.

Well, let’s say an order is created because the payment process has been started, but the user has abandoned the payment. Then Kirby will still have to run through those pages on disk and check them, which costs performance. I guess as long as we use flat files, at some point it’s unavoidable.

I think Kirby’s flat file system is great for building websites, even for websites with loads of info/data. You can build your website in a way that copes with that.

But as soon as you’re starting to have a lot of “transaction data” (data generated from the front-end), it’s better to put that into a database and manage those via a special page in the panel or send it to an external system. Because usually it’s that kind of data (each record becomes a directory + file) which creates huge directories which slow down index(), search() or other traversal a lot.

Kirby 3 should have database support, check it out.

Yeah, a database is an option, we have plenty at our disposal.
As mentioned, Bastian had an excellent showcase where he overrode the page model and included SQL so that the data was editable within the panel.

This shopping cart example can also be done without using a database. All we need to know is where bottlenecks can occur, or we can fix them as Lukas mentioned by splitting things up into more and more subpages. I would just have to check out some methods that actually read the whole subpages collection before doing a task… such as:

  • create page
  • update page
  • find page as we already know (replace or simplify)
  • checks (if page exists and whatsoever)

will take a flashlight and check github