My website has always been around 24 pages. Pre-Kirby it was hand-coded HTML and CSS, and all the pages were indexed by Google. But since moving to Kirby about 9 months ago I’ve noticed that only about half the pages are indexed.
In Google Search Console it says the pages are:
Discovered – currently not indexed. These pages aren’t indexed or served on Google.
Discovered – currently not indexed
The page was found by Google, but not crawled yet. Typically, Google wanted to crawl the URL but this was expected to overload the site; therefore Google rescheduled the crawl. This is why the last crawl date is empty on the report.
From doing some research, it seems that Googlebot is detecting that fetching my HTML takes longer than it would expect for a site of my size (see “Google wanted to crawl the URL but this was expected to overload the site” on Webmasters Stack Exchange).
But when I do a Google Lighthouse test on pages not indexed, they often score 100% for performance.
I wonder if this issue is linked to the topic “SEO and 403 errors”.
Has anyone else had problems with Kirby sites or pages not indexing? Or maybe it’s not linked to Kirby?
If you are using my robots-txt plugin (either by installing it yourself or bundled with a theme), it will block indexing while debug mode is enabled.
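When debug is on, the effect is a disallow-all robots.txt, something like this (a sketch of the effect, not the plugin’s literal output):

```txt
User-agent: *
Disallow: /
```

So if debug mode had been left on, search engines would be told to stay away from the whole site.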
Thank you for considering this, but no I’m not using any plugins.
It seems that plain HTML pages (with no PHP includes or CMS) are easy for Google to index. But since I recreated my site in Kirby using templates, snippets, and CMS content, Google has struggled to index my pages because they are taking longer to fetch than expected.
If you don’t want to share the link to your page here, you can PM me the link and a list of which pages are not indexed properly, and I can have a quick look.
Happy to take a look as well, though @bnomei is already kind of the performance expert here. Just wanted to add that this is definitely not something that is generally broken in Kirby, but something specific to your site, which we should figure out together.
Something else, which might be related.
Bing and DuckDuckGo used to rank my site for my keywords. But I’ve noticed that they no longer rank my site, and worse, no longer know that my site exists. I’ve searched for www.advocatedesign.co.uk and neither can find it.
In Bing Webmaster Tools I’ve inspected the URL and it comes up with:
URL cannot be indexed by Bing
The Inspected URL returned HTTP 403 error when we tried to fetch the content. Please make sure that this is intentional. If you have moved the page to a new location, please use proper redirects.
“A 403 Forbidden Error occurs when you do not have permission to access a web page or something else on a web server. It’s usually a problem with the website itself.”
Why is my server suddenly not allowing access to my website?
My .htaccess redirects have not changed. I have upgraded from PHP 7 something to PHP 8 something. And I have added responsive image code to my config.php file. But apart from that I don’t think I’ve made any big changes to my site.
Over the last few months I have repeatedly asked Google to reindex my site. It has only indexed 11 of the 24 pages. I’ve just rechecked by doing a Google search for “site:advocatedesign.co.uk” and now all the pages seem to be listed.
But Google Search Console is still only showing 11 pages as being Indexed. Perhaps there is a time lag.
That issue seems to be lessening, but I’m still concerned that other search engines do not have permission to access my website!
I ran into similar issues with Google on a client site. There were no errors on those pages; I just had to request re-indexing of some pages multiple times. Or rather, I think it simply takes time until Google re-indexes them.
I cannot reproduce any 403 on the domain you shared. Maybe Bing Webmaster Tools still has a request cached from when you were moving your site, and it first needs to refresh its cache (crawl with their bot again) to stop showing the old 403?
I’m not sure about that. The URL has been the same for about 15 years. And I’ve had more-or-less the same .htaccess redirects (from http to https, from non-www to www, and .html to no .html) for many years. I recreated my site in Kirby about 9 months ago. Bing and DDG were finding my Kirby site. Looking at my analytics, I haven’t had a visitor from either of them since March. That suggests something changed around then.
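For anyone wanting to check their own setup, redirects like the ones described are typically written along these lines (a sketch of the idea, not necessarily my exact rules):

```apacheconf
RewriteEngine On

# Force https and the www host in one 301 redirect
RewriteCond %{HTTPS} !=on [OR]
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^ https://www.advocatedesign.co.uk%{REQUEST_URI} [R=301,L]

# Redirect legacy .html URLs to their extensionless equivalents
RewriteRule ^(.+)\.html$ /$1 [R=301,L]
```

Rules like these only ever issue 301s, so they shouldn’t produce a 403 by themselves.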
Clutching at straws, but the only things I can think of are:
- moving from PHP 7 something to 8 something
- setting up responsive images and using the config.php file for the first time
- updating to the latest version of Kirby
- in cPanel turning on ImageMagick
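For reference, the responsive-image setup in Kirby’s config.php looks roughly like this (a sketch; the preset name, widths, and driver choice here are illustrative, not my exact config):

```php
<?php

// site/config/config.php – minimal sketch of responsive image presets
return [
    'thumbs' => [
        'srcsets' => [
            // hypothetical preset name and widths
            'default' => [320, 640, 960, 1280],
        ],
        // use ImageMagick instead of the default GD driver
        'driver' => 'im',
    ],
];
```

The `driver => 'im'` line is what makes the cPanel ImageMagick switch relevant: without ImageMagick enabled on the server, thumb generation with that driver would fail.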
I don’t see how any of this should affect your search engine results. To me it seems that your site is running properly. The Lighthouse results also look fine. Maybe you can provide some links to pages that are not indexed.
When running your site through Screamingfrog I don’t see any particular issues either.
Yeah, I know it all looks fine, but
- do a search in Bing or DDG for www.advocatedesign.co.uk and they can no longer find my site.
- or search for ‘site:advocatedesign.co.uk’ in Bing or DDG and it only returns one page, saying something about a redirect.
- and Bing Webmaster Tools says “URL cannot be indexed by Bing. The Inspected URL returned HTTP 403 error when we tried to fetch the content. Please make sure that this is intentional. If you have moved the page to a new location, please use proper redirects.”
I’ve asked my hosting service to look into it…
I’ve heard back from my hosting service:
Mod_security was triggered by the Bing bot and blocked it, which is why it wasn’t able to access your website and crawl it.
I have now whitelisted the rule so there should be no issues with that.
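For anyone hitting the same thing: the server-side fix is usually a ModSecurity rule exclusion along these lines, applied by the host in the Apache/vhost config (the rule ID below is a placeholder; the real one comes from the ModSecurity audit log):

```apacheconf
<IfModule mod_security2.c>
    # 999999 is a placeholder – take the real ID from the ModSecurity audit log
    SecRuleRemoveById 999999
</IfModule>
```

On shared hosting you generally can’t do this yourself, so asking support (as I did) is the right move.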
Thanks for your help everyone!
In case this is useful:
DuckDuckGo, Yahoo and Ecosia all use Bing for their search results.