Hi!
I can finally share something that might be useful!
I had to convert an old website running on The Secretary to Kirby, and instead of cleaning up the CSV
file I exported from the MySQL database, I put together a small Python script to automate that.
The script visits a bunch of sub-pages linked from an ‘index’ page of the website and converts them into Kirby subfolders, each with a txt file and images.
It works like this:

```
python scraper.py <url w/ links to visit> <subfolder-name> <page-name>
```
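For reference, the argument handling boils down to something like this (a sketch, not the actual script; `scrape` is a placeholder for the real work, shown further down):

```python
import sys

def main():
    # expects: python scraper.py <url w/ links to visit> <subfolder-name> <page-name>
    if len(sys.argv) != 4:
        sys.exit("usage: python scraper.py <url> <subfolder-name> <page-name>")
    index_url, subfolder, page_name = sys.argv[1:]
    scrape(index_url, subfolder, page_name)  # placeholder for the scraping loop

if __name__ == "__main__":
    main()
```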
You have to adapt the parsing operations to suit your use case, but in general (see the sketch below):

- use a page containing a list of links to the subpages to visit
- grab what you need from each subpage and add it to the `article` dictionary
- let the script create a subfolder for each subpage visited, with a Kirby-formatted `page.txt` file and the pictures
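To give an idea of the shape, here's a stripped-down sketch of that loop using requests and BeautifulSoup; the selectors, field names and slug logic are placeholders to adapt to your markup, not the actual script:

```python
import os
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

def scrape(index_url, subfolder, page_name):
    index = BeautifulSoup(requests.get(index_url).text, "html.parser")
    # 1. collect the links to the subpages from the 'index' page
    links = [urljoin(index_url, a["href"]) for a in index.select("a[href]")]
    for i, link in enumerate(links, start=1):
        page = BeautifulSoup(requests.get(link).text, "html.parser")
        # 2. grab what you need from the subpage into the article dictionary
        #    (placeholder selectors -- adapt to your markup)
        article = {
            "Title": page.select_one("h1").get_text(strip=True),
            "Text": page.select_one(".content").get_text(strip=True),
        }
        # 3. create a subfolder for this subpage...
        slug = article["Title"].lower().replace(" ", "-")
        folder = os.path.join(subfolder, f"{i}-{slug}")
        os.makedirs(folder, exist_ok=True)
        # ...with a Kirby-formatted txt file (fields separated by '----' lines)
        with open(os.path.join(folder, page_name + ".txt"), "w") as f:
            f.write("\n\n----\n\n".join(f"{k}: {v}" for k, v in article.items()))
        # 4. and download the pictures next to it
        for img in page.select("img[src]"):
            src = urljoin(link, img["src"])
            with open(os.path.join(folder, os.path.basename(src)), "wb") as out:
                out.write(requests.get(src).content)
```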
More info here, and you can find the script here; the code is commented too.
The code could probably be way more Pythonic, but I'm pretty satisfied for two half-afternoons of work.