(Last update May 26. 2022.)
Don’t expect that if you just setup the CF nameservers everything will be well.
Cloudflare is a tool, that you need to understand, learn and put to work for you after proper configuration.
We often have the opportunity here on the forum to see that some websites have been suspended for exceeding the limit and mostly this is caused by a bunch of bots doing a vulnerability scan and searching sensitive directories and files on your site.
Some of these bots are very reckless and do over 30 requests in one sec!
An example of what it looked like on my website:
Analytics from CF (nearly a million requests and generated 20GB of traffic)
Fortunately, a large part of my website is static content, so everything was served by CF (20GB).
Imagine what it’s like for someone who has WordPress and every time the server has to generate the pages that BOT requests.
This kind of behavior drains system resources, hogs bandwidth, and prevents your site from operating at maximum capacity.
After my reaction, the situation is like this
You first need to see the logs which are not possible
*1 if you use the free plan on Cloudflare
So it is good to install this application on Cloudflare Cloudflare Apps
Official site and help https://logflare.app/
You can find more information about it here Logflare CloudFlare App - Review | Nick Samuel
You can access the logs on their website or like me
generate a public link which you can then put in bookmarks, so you don’t have to log into their system…
Instead of that - you just watch logs directly in real-time.
Scroll until you see this section (and generate a URL)
and by what these bots do on your site (request) you generate firewall rules on your CF
In this example, you see that the bot searches the file
wlwmanifest.xml non-stop in different locations.
and bots can often try over 500 locations in a matter of seconds which of course burden your origin
(doesn’t mean you shouldn’t do it)
you can shorten the job by generating FW rules
OR in FW rules so you save a number of free rules
The example I use is just for my website !!!
because I don’t have WordPress and I can then “easily” decline some requests
If anyone wants these rules (note I don’t use WP)
you can click on CF FW rules edit expression
and paste my code inside
Rule 1 (user agents) bad bots engines
(http.user_agent contains "python-requests") or (http.user_agent contains "Python-urllib") or (http.user_agent contains "Apache-HttpClient") or (http.user_agent contains "Nuclei") or (http.user_agent contains "MJ12bot") or (http.user_agent contains "httpx") or (http.user_agent contains "MauiBot")
Rule 2 (files and paths) vuln. scan
(http.request.uri.path contains "wlwmanifest.xml") or (http.request.uri.path contains "wp-includes") or (http.request.uri.path contains "xmlrpc.php") or (http.request.uri.path contains ".env") or (http.request.uri.path contains "wp-content") or (http.request.uri.path contains "eval-stdin.php") or (http.request.uri.path contains "wp-login") or (http.request.uri.path contains "wp-admin")
The code above (about WordPress dirs) can also be shortened quite a bit by using this
(http.request.uri.path contains "/wp-")
but I intentionally did not use it because maybe someone has such a beginning of the name for file or dir
so it would be problematic.
URL normalization modifies separators, encoded elements, and literal bytes in incoming URLs
so that they conform to a consistent formatting standard.
For example, consider a firewall rule that blocks requests whose URLs match
The rule would not block a request containing an encoded element —
Normalizing incoming URLs at the edge helps simplify Cloudflare firewall rules expressions that use URLs.
For more information about chars and more see here.
If the bots pass the JS challenge you can further reduce their expiration time here
If you are not sure about an IP (or ASN - AS ) before the block be sure to look at
and some of the sites like (enter the IP in the search field )
- IP 18.104.22.168 spam report. Blacklists & IP abuse DB
And/or click on the
metadata in the log (Logflare)
and after that, you have created enough FW rules to repel most malicious bots.
Periodically do some necessary updates to existing Firewall and IP’s rules.
so those legit crawlers such as Googlebot aren’t blocked if they touch a URL that is blocked (trigger) by some other FW rules.
The list of bots is maintained by the CF and checked by IP and other validation methods,
so you have nothing extra to do other than turn it ON and allow.
Cloudflare verified Bots list
Just in case, move the rule to the first position (grab and move)
I intentionally did not share my complete configuration
because I don’t want to jeopardize my site by publicly showing someone how to circumvent all my rules
But one of the more important activities is to block hosting (WAF/tools)
most attacks/scans will come from there
(and hundreds more on the list)
Of course, some online sites (tools) that are on those servers will no longer work because your site will deny them access
so if necessary, you can temporarily turn off a block and when you finish with the desired “test” (online) re-enable the block.
At some point, if attacks go beyond normal complexity,
you need to add rate limit, session inspection, and fingerprinting. Going beyond that is most of the time absurd.
Relying on CAPTCHA or JS Challenges for complex attacks is foolish, it’s always been like that.
Everybody knows that attacks can solve those two challenges, these layers are made to add costs and complexity to attacks.
When bot operators realize that their bots are being blocked, many will simply stop trying and look for another victim instead (hopefully not you).
Good luck and thank you for your time