I can’t get Google to crawl my site

No, free subdomains cannot be used on Cloudflare. If you’re paying for hosting, why not get a domain too? PS: it’s free with one of their plans!

1 Like

Thanks, that would make sense. In that case it would just be a matter of waiting long enough. Who knows. No online sitemap generator seems able to scan the site either.

So you have a sitemap? First, please share the URL so we can find the issue without fuss.

Here it is: https://luigigiarnera.epizy.com/

Regards.

LOL, there is no sitemap. Maybe try it yourself:

Sitemap generators and fetchers cannot access your sitemap because of this:

@jaikrishna.t, OP already knows how to make a sitemap.

3 Likes

Yes there is: it contains 25 URLs, all valid in Search Console but all excluded/not indexed by the Google crawler.

1 Like

Patience is advised if so.

Google is not blocked, as can be seen here:

image

I also see you have a robots.txt.

Can you give me the exact URL of your sitemap file?

2 Likes

I apologize but yesterday I exceeded the quota of messages allowed to new users.

I see what Google says. Weird, then. My Search Console says the exact opposite: passaporto-di-sanita-1630 hosted at ImgBB

About the sitemap: /sitemap1a.xml

About robots.txt:
User-agent: *
Allow: /
Disallow: /administrator/
Disallow: /cache/
Disallow: /cli/
Disallow: /components/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /libraries/
Disallow: /logs/
Disallow: /modules/
Disallow: /plugins/
Disallow: /tmp/
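To sanity-check those rules locally, here is a small sketch using Python’s standard-library robots.txt parser (the first few rules from the file above are pasted inline; note that Python’s parser uses first-match semantics, while Google uses longest-match, so this is only a rough check that the site root is crawlable):

```python
from urllib.robotparser import RobotFileParser

# The opening rules of the robots.txt quoted above, pasted inline.
ROBOTS_TXT = """\
User-agent: *
Allow: /
Disallow: /administrator/
Disallow: /cache/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Googlebot falls under "User-agent: *" here, so the site root is crawlable.
print(parser.can_fetch("Googlebot", "https://luigigiarnera.epizy.com/"))  # True
```

Since nothing here disallows the root for Googlebot, robots.txt is not what is blocking indexing.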

1 Like

You should always use a sitemap generator built into your website software to begin with. The point of sitemaps is to tell search engines about the parts of the site’s structure that they can’t figure out on their own. Your website software knows that structure, but external crawlers do not. If an external crawler can map your site in full, you don’t really need a sitemap at all, because Google can just crawl the site itself and learn the same thing.
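For reference, a minimal sitemap that follows the sitemaps.org protocol has this shape (the URL and date are placeholders, not taken from the site above):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2021-01-01</lastmod>
  </url>
</urlset>
```

A missing XML declaration, a wrong namespace, or unescaped characters in a `<loc>` are common reasons a generated sitemap fails validation.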

No, it doesn’t. A good crawler, like those used by search engines, behaves like a regular web browser and will have no problem with this. Only dumb scrapers, like those used by sitemap generators, have this issue.

3 Likes

@KenTamelson

I changed some things in your sitemap to make it valid,

so download it and upload it to your site:

sitemap.xml (4.8 KB)

and then visit this link, or do it via GSC (the Sitemaps section).



Here it tells you what “Excluded” actually means:

Excluded: The page is not indexed, but we think that was your intention. (For example, you might have deliberately excluded it by a noindex directive, or it might be a duplicate of a canonical page that we’ve already indexed on your site.)

src - Page Indexing report - Search Console Help

I don’t know why this is happening!

3 Likes

Thank you. The reason I didn’t start with an internal sitemap is that the plugin I normally use is commercial, i.e. paid, and the customer said he didn’t want it. So I was forced to use external tools, and then saw every online crawler fail.

The discussion about external tools being unable to generate the sitemap is only incidental, though: my question is, and remains, why the Google crawler (an excellent crawler, certainly) has still not managed to do its usual work on the site after almost three weeks.

Is it simply late? I have never experienced delays of weeks, not even on subdomains. Hence my question: are there technical reasons on the domain/hosting side that prevent crawling by Google?

Thanks so much. I immediately uploaded your modified sitemap; sorry mine wasn’t valid. It was generated with a sitemap plugin, so I don’t know how that could happen.

The link you provided says all is OK: “Sitemap Notification Received - Your Sitemap has been successfully added”.

Anyway, “my” Search Console has gone mad :see_no_evil:: error 404 on sitemap.xml (img3 hosted at ImgBB)

Sometimes it says “Couldn’t fetch” at first upload, but a page refresh usually brings up the “Success” message. Now it doesn’t… and I have never seen a 404 while the file is actually in place. In fact, when I clicked “OPEN SITEMAP” it showed the sitemap properly.

Yes, I have read what “Excluded” means: I have no “noindex” directive at all, let alone duplicates of already-indexed pages… there being zero indexed pages at the moment :cry:

Well, I took too much of your time. Let’s give Google a few days to see if it notices the new sitemap, and if it can do anything with it.

For now, thanks very much to everyone.

1 Like

One of the reasons may be that your pages are invalid (all of them).

Up here you have a piece of code that doesn’t belong there.

That part should be in the head section.
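As a hypothetical illustration (we can’t see the actual stray code here), metadata belongs inside the head, and only visible content inside the body:

```html
<!DOCTYPE html>
<html>
<head>
  <!-- meta, link, and title tags belong here -->
  <meta charset="utf-8">
  <title>Example page</title>
</head>
<body>
  <!-- only visible content goes here; a stray meta or link tag
       placed in the body can make the whole page invalid HTML -->
</body>
</html>
```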

1 Like

As far as my knowledge goes, I do not think that InfinityFree delays or prevents Google’s crawler. Have you already added the domain to Google Search Console and requested it to index your site? Google will prioritize more popular sites, as well as sites that requested indexing before yours, so it may take some time before the sitemap is noticed.

We are here to help you, you haven’t wasted our time at all :slight_smile:

1 Like

God. And where did that come from? Can’t believe it… I slipped on such a banana peel.

This thing is embarrassing… I was considering every grand explanation, and I didn’t notice such a trivial thing. Terrible.

I can only apologize twice over… I’m going to fix the mistake and take a vacation… it’s time for it.

Sorry again, thanks for everything

3 Likes

GSC often has problems even when everything is fine on your part

and this has been going on for quite some time (since they introduced the new version)

Sometimes it just reports “Couldn’t fetch”, and that actually means the bot hasn’t come to index yet and plans to come in the next two weeks or a few months.

Sometimes, if your domain is quite fresh, they don’t have DNS data yet,
so that can also cause it to report problems.

It might also help if you insert this piece of code into the head section
(of each page you want the bots to index):

<meta name="robots" content="index,follow">
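If you want to double-check that this tag actually ends up in each page’s output, a small sketch with Python’s standard-library HTML parser could look like this (the sample HTML below is a hypothetical page, not fetched from the real site):

```python
from html.parser import HTMLParser

# Hypothetical page markup, with the robots meta tag placed in the head.
HTML = """<html><head>
<meta name="robots" content="index,follow">
</head><body>Hello</body></html>"""

class RobotsMetaFinder(HTMLParser):
    """Records the content attribute of the first <meta name="robots"> tag."""
    def __init__(self):
        super().__init__()
        self.content = None

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            d = dict(attrs)
            if d.get("name", "").lower() == "robots":
                self.content = d.get("content")

finder = RobotsMetaFinder()
finder.feed(HTML)
print(finder.content)  # index,follow
```

Running this against each template’s rendered output confirms the directive survived whatever generated the page.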

1 Like

BTW, this forum supports images, so you can just drag and drop (no need for ImgBB).

3 Likes

Interesting, that’s something I wasn’t aware of. Good to know for the future.

OK the page is fixed and the meta robots directive is in place now.