Building a publications library with authors. Tags? Many to many relationships? Db?

Hello,

I am in the early stages of researching Kirby.

One of the features I would like to build is a library of publications (mostly journal articles, some books, etc.). Each publication will have information like title, authors, DOI, page numbers, abstract, etc. I think most of that information would be easily handled by Kirby’s built-in fields. The part that I am not sure about is how to handle authors since I would like to be able to filter and display publications by author and make sure there are no duplicate authors.

The built-in fields that seem most closely related are the tags, structure, or users fields. However, I’m not entirely sure that any one of these fields has exactly what I need out of the box. Each publication may have many authors, and when adding a new publication there will probably be a mix of new authors and existing authors. Also, authors are just metadata, so it doesn’t make sense to have authors be users.

I have also come across the many-to-many plugin by jonasholfeld (jonasholfeld/kirby3-many-to-many-field on GitHub), which creates many-to-many relationships between pages in Kirby and synchronizes them on both sides; the latest version also works with Kirby 4. I think that would mean making publication pages and author pages and then linking them together through this custom relationships field. However, it seems a bit excessive to have a separate page for each author in this case.

The dynamic options feature for the tags field seems quite powerful. Would it make sense to write some code that uses a JSON file to store authors, and then read and write to that JSON file so that it acts kind of like a database for the authors?
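To make the JSON-file idea concrete, here is a minimal sketch of the read/merge/write logic. It is written in Python purely to illustrate the idea outside of Kirby; the file layout (a flat JSON list of names) and the function names `merge_authors`, `load_authors` and `save_authors` are assumptions for this example, not anything Kirby provides.

```python
import json
from pathlib import Path


def merge_authors(existing: list[str], new_names: list[str]) -> list[str]:
    """Return the existing names plus any new ones, skipping duplicates.

    Duplicates are detected case-insensitively after collapsing whitespace.
    """
    authors = list(existing)
    seen = {name.casefold() for name in authors}
    for name in new_names:
        cleaned = " ".join(name.split())  # collapse stray whitespace
        if cleaned and cleaned.casefold() not in seen:
            authors.append(cleaned)
            seen.add(cleaned.casefold())
    return sorted(authors, key=str.casefold)


def load_authors(path: Path) -> list[str]:
    """Read the author list, or return an empty list if the file is missing."""
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))


def save_authors(path: Path, authors: list[str]) -> None:
    """Write the author list back as pretty-printed JSON."""
    path.write_text(json.dumps(authors, ensure_ascii=False, indent=2), encoding="utf-8")
```

Whenever a publication is saved, the flow would be `save_authors(path, merge_authors(load_authors(path), names_from_publication))`. This only catches exact duplicates apart from case and spacing, so differently written variants of the same name would still end up as separate entries.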

The last option I have come across is the Db class which could allow me to interact with a MySQL database and set up the relationships that I need inside that. I think Kirby 4 can now display Db information in the Panel, but I’m not certain about that.

I plan on building a tool to parse .ris files to automatically add publications to my library, in addition to doing it manually via the Panel. Also, down the road I would like to try and create a citation tool that works in the Writer field with a custom mark/node to cite publications from this library. In terms of scale, there will probably only be some hundreds of publication pages to start with, though that number does grow each year.
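As a rough idea of what the .ris import could involve, here is a minimal parsing sketch in Python (chosen only for illustration; the same logic carries over to PHP). It assumes the common RIS line layout of a two-character tag, two spaces, a hyphen and a value, with `TY` opening a record and `ER` closing it. The function name `parse_ris` and the set of repeatable tags are assumptions for this example, and real exports may need more care (continuation lines, unusual tags).

```python
import re

# A RIS line looks like "AU  - Smith, Jane": tag, two spaces, hyphen, value.
TAG_LINE = re.compile(r"^([A-Z][A-Z0-9])  -(?: (.*))?$")

# Tags that may occur several times per record (assumption for this sketch).
REPEATABLE = {"AU", "A1", "KW"}


def parse_ris(text: str) -> list[dict]:
    """Split RIS text into records; repeatable tags become lists of values."""
    records = []
    current = None
    for raw in text.splitlines():
        match = TAG_LINE.match(raw.rstrip())
        if not match:
            continue  # blank and continuation lines are ignored in this sketch
        tag = match.group(1)
        value = (match.group(2) or "").strip()
        if tag == "TY":
            current = {"TY": value}  # start of a new record
        elif current is None:
            continue  # tag outside of any record
        elif tag == "ER":
            records.append(current)  # end of record
            current = None
        elif tag in REPEATABLE:
            current.setdefault(tag, []).append(value)
        else:
            current[tag] = value
    return records
```

Each resulting dict could then be mapped onto the fields of a publication page, with the `AU` list feeding whichever author field is chosen.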

So, my question for you all is how would you recommend I approach this feature? Would the tags field be able to handle this? Would I need more complicated approaches like a database or the many-to-many plugin?

Thanks!

I’d use a tags field. It allows you to query options that already exist, while letting users add new authors.

That is, if authors are really just metadata, and you don’t want to store extra information about those authors. In the latter case, storing authors as pages (not users) would be the way to go, I think.

Performance issues only appear if you need a lot of data from pages in a single request. Let’s assume…

TL;DR

If you have fewer than 2000 publications and fewer than a few hundred authors, I would use pages for both of them, add the pages cache (optionally with the staticache plugin) and be done with it. You will not have any noticeable performance issues on medium-sized hosting (5-10 euros/month).
I suggest building it as simply as possible and adding caching once issues appear. Feel free to contact me again if you need help with that then.

2000 Authors

If you stored them as pages, you would need to load all of them to fetch their names from the content file to show the select list (no matter if it’s a pages/select/multi select/tags field). With all but the pages field, you could add caching to speed that up. This will, in my opinion, only affect the panel when selecting authors manually.

If you generated them dynamically from existing authors set in publications, you would read all publications to get that list and would most likely need a cache.

I suggest using pages for authors, as this makes handling them in PHP code easier than other constructs. Using the UUIDs also makes finding and updating related publications simple ($publication->authors()->toPages() method).

If you do not store any metadata with the authors, you could get away with a plain JSON file updated by hooks, but that’s more or less just a different way of using a cache. I would stick with pages.

5000 Publications

Displaying a single publication will not cause Kirby to slow down. But you will run into issues if you try to generate a list of all publications in a particular category… unless you wrap that in a cache. The default “pages cache” should do, but your mileage depends on how often you add publications. Otherwise, implementing partial caches is the way to go.

Resolving Relations

The most taxing scenario I can think of would be a list of all authors showing a count (or a link) of their publications. Such a list will load all content files, such as 2000 authors and 5000 publications, and slow Kirby noticeably unless cached.
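The expensive part of that scenario is resolving the relation in the reverse direction (from author to publications). A common way to keep it cheap is to compute the reverse lookup once and cache the result. Below is a minimal sketch of that idea in Python, outside of Kirby; the dict shape (`id`, `authors`) and the function names are assumptions for this illustration.

```python
from collections import defaultdict


def build_author_index(publications: list[dict]) -> dict[str, list[str]]:
    """Map each author to the ids of their publications in a single pass."""
    index = defaultdict(list)
    for pub in publications:
        for author in pub.get("authors", []):
            index[author].append(pub["id"])
    return dict(index)


def publication_counts(index: dict[str, list[str]]) -> list[tuple[str, int]]:
    """Return (author, count) pairs, most prolific first, then by name."""
    return sorted(
        ((author, len(ids)) for author, ids in index.items()),
        key=lambda pair: (-pair[1], pair[0].casefold()),
    )
```

The index would be rebuilt (or the cache flushed) whenever a publication is created, updated or deleted, so that the author list itself never has to touch every content file on a normal request.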

I think the most disruptive issue with using pages for authors which don’t have any content is that you cannot add new authors on the fly, but have to create a page first before you can select an author.

Thank you both for the excellent answers so far!

Would it be possible to build a custom field or something for publication pages that can create a new author page on the fly? That would solve this concern, I think.

Another thing that has occurred to me is that the way different publications report author names may not be standardized (e.g. middle initial vs. first and last name only, or some places may only report last name and first initial).
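One way to get a feel for how bad the name variation is would be to reduce each name to a crude comparison key and see which entries collide. The sketch below does this in Python, purely as an illustration; `author_key` is a hypothetical helper, and the rule it uses (last name plus first initial) is an assumption that will wrongly merge different people who share both, and will mishandle multi-word surnames written without a comma.

```python
def author_key(name: str) -> str:
    """Reduce 'Last, First M.' or 'First M. Last' to 'last, f' for comparison."""
    cleaned = " ".join(name.replace(".", " ").split())  # drop dots, tidy spaces
    if "," in cleaned:
        last, _, given = cleaned.partition(",")
    else:
        parts = cleaned.split(" ")
        last, given = parts[-1], " ".join(parts[:-1])
    last = last.strip().casefold()
    given = given.strip()
    if not given:
        return last  # only a last name was reported
    return f"{last}, {given[0].casefold()}"
```

With this, "Example, Jane A.", "Jane Example" and "J. A. Example" all map to the same key, which could be used to flag likely duplicates for manual review rather than to merge them automatically.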

I will need to look into this a bit better, but if data quality is an issue it might not be worth it to try to build a relational structure like I was originally thinking.

In which case, I would probably have to start thinking about using search to approximate the behavior I want.

In this case, are there any design/performance considerations if I wanted to show a list of all publications by authors with the first name “Bob” or something like that? At first glance it looks like Kirby’s router and support for virtual pages would make templating and displaying these searches pretty easy. Beyond that, I don’t know enough about how search algorithms or Kirby works.
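For a sense of what such a search amounts to, here is a minimal sketch in Python of a first-name filter over a list of publications. It is a plain linear scan, assuming authors are stored as free-text names in either "Last, First" or "First Last" form; the dict shape and the function names are assumptions for this illustration, not Kirby API.

```python
def given_name_tokens(author: str) -> list[str]:
    """Return the casefolded given-name words of 'Last, First' or 'First Last'."""
    cleaned = " ".join(author.replace(".", " ").split())
    if "," in cleaned:
        given = cleaned.partition(",")[2]
    else:
        given = " ".join(cleaned.split(" ")[:-1])
    return [token.casefold() for token in given.split()]


def find_by_given_name(publications: list[dict], query: str) -> list[dict]:
    """Keep publications where any author has the query as a whole given name."""
    needle = query.strip().casefold()
    return [
        pub
        for pub in publications
        if any(needle in given_name_tokens(a) for a in pub.get("authors", []))
    ]
```

The scan has to look at every publication, so its cost grows with the size of the library. Matching whole words means a query for "Bob" would not return authors called "Bobby" or "Robert".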

Yes, that’s possible.

Personally, I would use a structure for the authors (just in case you need a little extra info for them in the future) and then use the many-to-many field to select the authors. You could then have a field (or fields, in the future) in your publication blueprint for the author, for instance “Name of new author”, to add someone on the fly. After you save the publication, you could check whether that field is empty. If it is not, you can trigger a method that adds the new author to the structure field (located in site, for instance) and then empties the field in the publication again, so that it works as a field for adding entries to the structure. In the same method/hook you could also add the new author to the many-to-many field so that it is already selected. I think I would do it like that.

I wouldn’t use a structure field. This is likely to get huge and then difficult to handle. Even more difficult if you want to add entries on the fly.

Thank you all for the input!

I’ll have to do some thinking about what makes the most sense for my project. But, regardless of what I do, I think this discussion gave me a better sense of how to think about data relationships in Kirby as well as performance and UX.