I’m currently migrating my WordPress websites to Kirby. So far it’s going smoothly, but I need to export my posts, and since the official Kirby Exporter is quite old (Kirby 1 only), I searched for something else and found a PHP script made by Sally Jenkinson for a friend of hers. I edited a few things, like the exported filenames and putting each file in its own folder. Besides my own needs, the goal is to clean it up and test it enough that I can share it with others.
But I have a problem: the script takes the HTML from the wordpress.xml export file and turns it into kirbytext. Everything works fine except for clickable images. In WordPress, the code generated by the editor makes images clickable by default. It gives you this kind of HTML code:
<a href="link_to_somewhere"><img src="image_adress" /></a>
The problem is that the script (working way too well!) puts the image in as the text of the link, returning this result:
(link: link_to_somewhere text: (image: image_adress))
And of course that does not work, as the way to make a clickable image in Kirby is:
(image: image_adress link: link_to_somewhere)
I’m not very skilled in PHP and could not find a good way to fix this problem. I suppose you need to check in the DOM whether the link has an image as a child node, but my tests didn’t work. Sally’s code can be found here and the functions that convert HTML to kirbytext are in this file.
Can someone give me a hand to finish this?
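To make the goal concrete, here is a minimal sketch of the rewrite described above. It is written in Python rather than the script’s PHP, and it uses a regular expression instead of a DOM check, so treat it as an illustration of the idea and not as a patch for Sally’s code. The function name `linked_images_to_kirbytext` is made up for this example, and it assumes double-quoted `href` and `src` attributes.

```python
import re

# Matches an <a href="..."> that wraps exactly one <img src="...">,
# allowing whitespace around the image. Assumes double-quoted attributes.
LINKED_IMG = re.compile(
    r'<a\b[^>]*?\bhref="([^"]*)"[^>]*>\s*'
    r'<img\b[^>]*?\bsrc="([^"]*)"[^>]*>\s*</a>',
    re.IGNORECASE,
)


def linked_images_to_kirbytext(html: str) -> str:
    """Rewrite <a><img></a> pairs as (image: src link: href).

    Hypothetical helper: it would have to run before the generic
    link conversion, so the image is never treated as link text.
    """
    return LINKED_IMG.sub(
        lambda m: f"(image: {m.group(2)} link: {m.group(1)})", html
    )


print(linked_images_to_kirbytext(
    '<a href="link_to_somewhere"><img src="image_adress" /></a>'
))
```

Plain text links are left alone, so the existing link conversion can still handle them afterwards. Markup with single-quoted attributes or extra elements inside the link would need a real DOM check.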
I would be very interested in this; at the moment I am manually moving http://skippy.org.uk
A small update about this, since @texnixe mentioned it in this topic. It’s waaaay more complicated than I thought, for a good number of reasons. First, I could not find a way to change clickable images to the kirbytext syntax.
I thought it would be bad to leave HTML in, but that was before I started working with a complicated website that has a lot of posts containing a lot of text formatting. To sum it up, the WordPress editor sometimes does nonsense and keeps empty HTML tags (like center, bold, etc.) inside the posts, and you can only see them in the code editor. When you export to kirbytext, this can completely mess up your file and the page will not load. I found old posts where an empty bold tag added **** during the export, and even more nonsense.
So knowing that, I think the “safe” route is to not convert the HTML to kirbytext at all. Since most of the posts are being exported for archival purposes, I think it’s better to have readable, working pages than “almost but not totally exported” files.
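One way to defuse the empty-tag problem would be to strip those tags before any conversion runs. The following is only a sketch in Python, not part of the exporter: the tag list and the name `strip_empty_tags` are assumptions for illustration, and a real post may need more tags added.

```python
import re

# Inline/presentational tags that are safe to drop when they wrap
# nothing but whitespace or &nbsp;. The list is an assumption.
EMPTY_TAG = re.compile(
    r'<(b|strong|i|em|center|span)\b[^>]*>(?:\s|&nbsp;)*</\1\s*>',
    re.IGNORECASE,
)


def strip_empty_tags(html: str) -> str:
    """Remove empty formatting tags, repeating until nothing changes
    so that nested empties like <center><b></b></center> also go."""
    previous = None
    while previous != html:
        previous = html
        html = EMPTY_TAG.sub("", html)
    return html


print(strip_empty_tags("<p>Hello <b></b>world</p>"))
```

With this run first, an empty bold tag never reaches the converter, so it cannot turn into a stray ****.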
There’s also the age of the WordPress install to consider. Before WordPress supported oEmbed, you had to use a third-party embed or, for example, the iframe from YouTube. Cases like this are legion and a real mess.
And there’s another problem, with the permalinks. Most of the time people use either the basic p=1 model or the /year/month/day/topic model, and you can’t keep the latter because Kirby will treat each segment as a folder. Exporting URLs this way creates three levels of subfolders with the articles inside, and that would make using the Kirby panel a nightmare (and let’s not talk about folder numbers). What I’m trying to say is that, even if we manage to export the posts correctly, we’ll also have to work on a redirection solution for most cases.
So in the end, I think there can’t be a single perfect solution for migrating to Kirby. Each WordPress install is different, and apart from recent, well-managed installs with basic content, you’ll have to modify the exporter.
I’ll update the topic once I’m done exporting my websites and share the exporter I used. Maybe we can work on a file with different cases in it and let users choose what matches their install. Maybe we can also work on an .htaccess for URLs.
Anyway, it’s going to take a bit longer than I wanted.
Just to update a little bit on this since it’s been a while and I’ve been busy.
The automatic file and folder export is now complete and creates the exact same permalink as WordPress, except for the date segments separated by /. With a simple redirect in the .htaccess it’s working like a charm. I used:
RedirectMatch 301 /([0-9]+)/([0-9]+)/([0-9]+)/(.*)$ /$4
and it worked well enough for me. You have to edit the WordPress XML a bit to make all this work, but it’s better than struggling with special chars and URLs.
I also made some progress on the WordPress editor nonsense, fishing for empty tags and things like that, but it’s far from finished.
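If you want to check what that RedirectMatch pattern does to your old URLs before deploying it, the same regular expression can be tried offline. This is a small Python sketch and not something Apache runs; `redirect_target` is a hypothetical helper name.

```python
import re

# Same pattern as the RedirectMatch rule: three numeric segments
# (year/month/day) followed by the rest of the path.
DATE_PERMALINK = re.compile(r'/([0-9]+)/([0-9]+)/([0-9]+)/(.*)$')


def redirect_target(path: str):
    """Return the path the rule would redirect to, or None when the
    rule does not apply. The fourth group plays the role of $4."""
    match = DATE_PERMALINK.search(path)
    return f"/{match.group(4)}" if match else None


print(redirect_target("/2012/05/17/my-post/"))
print(redirect_target("/about/"))
```

Note that the pattern is not anchored at the start, so any URL containing three consecutive numeric segments is redirected. Starting the pattern with ^ would be a stricter variant if that turns out to be a problem.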
Any news on the state of the wordpress to kirby exporter?
The project was stopped as I had way too many special cases to check (more than 5k posts dating from 2009 to 2015). I took another route and exported the content as a static website.
I pushed the code on github here: https://github.com/Thomasorus/wordpresstokirby
Maybe someone will be able to work on something better than I could.
@Thomasorus: Thanks for sharing. Is there any advantage in exporting the WP site to XML first instead of querying the database?
For small websites I suppose it can be done that way. But for large WordPress installs like mine, waiting for the 5k posts to be processed is probably a bad idea.
Like I said in a previous post, the real problem is more related to permalinks. If, like me, you used the /year/month/day/title format, you will lose your permalinks with Kirby. That’s a major problem when you want to avoid dead links. Plus, WordPress supports characters (like Japanese) that Kirby does not support, and this will break some titles.
As for content, again, the WordPress visual editor messed with the HTML so much that exporting content and converting it to Kirby syntax needs a lot of manual edits.
Thanks for your reply @Thomasorus. I’ll have to export a WP site to Kirby soon. A lot of the content will have to be restructured, so it might not make that much sense to automate the process.
Fortunately, the blog part is not so huge, so importing directly from the database via a little script will work.
Probably there is no final answer to this question, but the script mentioned is a good starting point to get a lot of WP stuff into Kirby in a reasonable amount of time. At least it saves me a lot… Thank you!