
Question fail2ban, can I ban visitors with no user agent?

mendip_discovery

New Pleskian
Server operating system version
Debian 11.9
Plesk version and microupdate number
Plesk Obsidian Version 18.0.61 Update #2
So I am getting constant code 200 requests to my website with no user interaction, as in they never open an image or anything, they are just scraping. This is just wasting server resources. I have recently started using the badbots filter to get rid of some of that. Then, for some odd reason, I was getting hammered by the Facebook scraper, though when I try the FB webmaster thing it says it is not crawling my site. So I added that to the bad bots to save a few more cycles. Now I am getting hits from AWS IP addresses; again, they are just asking for a page.

Code:
    Line 4122: 34.219.197.206 - - [21/May/2024:17:54:17 +0100] "GET /picture.php/20060524_TimeTrial_0080-/categories HTTP/1.0" 200 3252 "-" "i"
    Line 4140: 34.219.197.206 - - [21/May/2024:17:57:40 +0100] "GET /picture.php/20050714_IC2005_0896/categories HTTP/1.0" 200 3183 "-" "o"
    Line 4155: 34.219.197.206 - - [21/May/2024:18:01:03 +0100] "GET /picture.php/TT_26-05-2012_0138/tags/247-45 HTTP/1.0" 200 3345 "-" "d"
    Line 4161: 34.219.197.206 - - [21/May/2024:18:03:10 +0100] "GET /picture.php/TT_10-06-2012_0460/tags/279-12 HTTP/1.0" 200 3331 "-" "o"
    Line 4167: 34.219.197.206 - - [21/May/2024:18:04:55 +0100] "GET /picture.php/154523/tags/279-12 HTTP/1.0" 200 3274 "-" "p"
    Line 4168: 34.219.197.206 - - [21/May/2024:18:05:17 +0100] "GET /picture.php/NYD_01-01-2012_0027/tags/267-2 HTTP/1.0" 200 3260 "-" "x"
    Line 4170: 34.219.197.206 - - [21/May/2024:18:05:36 +0100] "GET /picture.php/TT_26-05-2012_0239/tags/267-2 HTTP/1.0" 200 3320 "-" "o"
    Line 4173: 34.219.197.206 - - [21/May/2024:18:05:47 +0100] "GET /picture.php/NYD_01-01-2012_0038/tags/273-53 HTTP/1.0" 200 3352 "-" "x"
    Line 4174: 34.219.197.206 - - [21/May/2024:18:05:59 +0100] "GET /picture.php/TT_10-06-2012_0458/tags/241-41 HTTP/1.0" 200 3322 "-" "i"
    Line 4178: 34.219.197.206 - - [21/May/2024:18:06:16 +0100] "GET /picture.php/20050227_LDT_0236/tags/273-53 HTTP/1.0" 200 3269 "-" "x"
    Line 4179: 34.219.197.206 - - [21/May/2024:18:06:23 +0100] "GET /picture.php/20050227_LDT_0053/tags/247-45 HTTP/1.0" 200 3239 "-" "n"
    Line 4180: 34.219.197.206 - - [21/May/2024:18:06:30 +0100] "GET /picture.php/TT_26-05-2012_0683/tags/267-2 HTTP/1.0" 200 3344 "-" "d"
    Line 4181: 34.219.197.206 - - [21/May/2024:18:06:36 +0100] "GET /picture.php/MC_12-10-2014_0321/categories HTTP/1.0" 200 3235 "-" "i"
    Line 4183: 34.219.197.206 - - [21/May/2024:18:06:43 +0100] "GET /picture.php/DCS_07-09-2013_0067/categories HTTP/1.0" 200 3158 "-" "n"
    Line 4184: 34.219.197.206 - - [21/May/2024:18:06:49 +0100] "GET /picture.php/20050713_IC2005_0001/tags/886-andrew_pounce HTTP/1.0" 200 3210 "-" "d"
    Line 4185: 34.219.197.206 - - [21/May/2024:18:06:56 +0100] "GET /picture.php/CE_09-04-2006_0537/tags/241-41 HTTP/1.0" 200 3320 "-" "h"
    Line 4186: 34.219.197.206 - - [21/May/2024:18:07:02 +0100] "GET /picture.php/TT_26-05-2012_0572/tags/273-53 HTTP/1.0" 200 3262 "-" "a"
    Line 4188: 34.219.197.206 - - [21/May/2024:18:07:08 +0100] "GET /picture.php/20060514_Luc-Sur-Mere_0153-/categories HTTP/1.0" 200 3204 "-" "n"
    Line 4189: 34.219.197.206 - - [21/May/2024:18:07:15 +0100] "GET /picture.php/TT_26-05-2012_0625/tags/267-2 HTTP/1.0" 200 3323 "-" "i"
    Line 4190: 34.219.197.206 - - [21/May/2024:18:07:23 +0100] "GET /picture.php/TT_30-04-2006_0035/tags/247-45 HTTP/1.0" 200 3202 "-" "a"
    Line 4191: 34.219.197.206 - - [21/May/2024:18:07:29 +0100] "GET /picture.php/CE_09-04-2006_0045/tags/241-41 HTTP/1.0" 200 3238 "-" "n"
    Line 4193: 34.219.197.206 - - [21/May/2024:18:07:35 +0100] "GET /picture.php/TT_26-05-2012_0443/tags/273-53 HTTP/1.0" 200 3353 "-" "i"
    Line 4194: 34.219.197.206 - - [21/May/2024:18:07:41 +0100] "GET /picture.php/MC_12-10-2014_0319/categories HTTP/1.0" 200 3265 "-" "n"
    Line 4195: 34.219.197.206 - - [21/May/2024:18:07:48 +0100] "GET /picture.php/20050714_IC2005_0900/categories HTTP/1.0" 200 3194 "-" "i"
    Line 4196: 34.219.197.206 - - [21/May/2024:18:07:54 +0100] "GET /picture.php/TT_26-05-2012_0354/tags/267-2 HTTP/1.0" 200 3320 "-" "n"
    Line 4197: 34.219.197.206 - - [21/May/2024:18:08:00 +0100] "GET /picture.php/TT_10-06-2012_0281/tags/279-12 HTTP/1.0" 200 3324 "-" "x"
    Line 4198: 34.219.197.206 - - [21/May/2024:18:08:07 +0100] "GET /picture.php/CE_09-04-2006_0201/tags/273-53 HTTP/1.0" 200 3307 "-" "n"
    Line 4199: 34.219.197.206 - - [21/May/2024:18:08:13 +0100] "GET /picture.php/NYD_01-01-2012_0210/tags/279-12 HTTP/1.0" 200 3280 "-" "p"
    Line 4200: 34.219.197.206 - - [21/May/2024:18:08:19 +0100] "GET /index.php/tags/367-323 HTTP/1.0" 200 4782 "-" "p"
    Line 4202: 34.219.197.206 - - [21/May/2024:18:08:25 +0100] "GET /picture.php/CE_09-04-2006_0010/tags/279-12 HTTP/1.0" 200 3336 "-" "i"
    Line 4203: 34.219.197.206 - - [21/May/2024:18:08:32 +0100] "GET /index.php/tags/378-320 HTTP/1.0" 200 4784 "-" "h"
    Line 4205: 34.219.197.206 - - [21/May/2024:18:08:38 +0100] "GET /picture.php/NYD_01-01-2012_0325/tags/279-12 HTTP/1.0" 200 3285 "-" "i"
    Line 4206: 34.219.197.206 - - [21/May/2024:18:08:44 +0100] "GET /picture.php/20050409_WW-HH_0048/category/230 HTTP/1.0" 200 3173 "-" "d"
    Line 4208: 34.219.197.206 - - [21/May/2024:18:08:51 +0100] "GET /index.php/tags/466-246 HTTP/1.0" 200 4808 "-" "i"
    Line 4209: 34.219.197.206 - - [21/May/2024:18:08:57 +0100] "GET /picture.php/TT_10-06-2012_0501/tags/241-41 HTTP/1.0" 200 3323 "-" "i"
    Line 4210: 34.219.197.206 - - [21/May/2024:18:09:03 +0100] "GET /picture.php/20050714_IC2005_0905/categories HTTP/1.0" 200 3184 "-" "x"
    Line 4211: 34.219.197.206 - - [21/May/2024:18:09:10 +0100] "GET /picture.php/MC_12-10-2014_0326/categories HTTP/1.0" 200 3236 "-" "d"
    Line 4213: 34.219.197.206 - - [21/May/2024:18:09:16 +0100] "GET /picture.php/TT_26-05-2012_0201/tags/279-12 HTTP/1.0" 200 3325 "-" "i"
    Line 4217: 34.219.197.206 - - [21/May/2024:18:09:22 +0100] "GET /picture.php/TT_26-05-2012_0749/tags/279-12 HTTP/1.0" 200 3329 "-" "i"
    Line 4218: 34.219.197.206 - - [21/May/2024:18:09:29 +0100] "GET /picture.php/CE_09-04-2006_0154/tags/267-2 HTTP/1.0" 200 3258 "-" "i"
    Line 4219: 34.219.197.206 - - [21/May/2024:18:09:35 +0100] "GET /picture.php/TT_26-05-2012_0328/tags/279-12 HTTP/1.0" 200 3363 "-" "h"
    Line 4220: 34.219.197.206 - - [21/May/2024:18:09:41 +0100] "GET /picture.php/NYD_01-01-2012_0377/tags/247-45 HTTP/1.0" 200 3272 "-" "a"
    Line 4222: 34.219.197.206 - - [21/May/2024:18:09:48 +0100] "GET /picture.php/TT_26-05-2012_0542/tags/273-53 HTTP/1.0" 200 3312 "-" "d"
    Line 4223: 34.219.197.206 - - [21/May/2024:18:09:54 +0100] "GET /picture.php/20060524_TimeTrial_0076-/categories HTTP/1.0" 200 3246 "-" "a"
    Line 4224: 34.219.197.206 - - [21/May/2024:18:10:00 +0100] "GET /picture.php/TT_10-06-2012_0487/tags/247-45 HTTP/1.0" 200 3352 "-" "h"
    Line 4225: 34.219.197.206 - - [21/May/2024:18:10:06 +0100] "GET /picture.php/WW_17-06-2012_0005/category/58 HTTP/1.0" 200 3209 "-" "i"
    Line 4226: 34.219.197.206 - - [21/May/2024:18:10:13 +0100] "GET /picture.php/20050714_IC2005_0912/categories HTTP/1.0" 200 3184 "-" "o"
    Line 4227: 34.219.197.206 - - [21/May/2024:18:10:19 +0100] "GET /picture.php/TT_26-05-2012_0009/tags/267-2 HTTP/1.0" 200 3316 "-" "h"
    Line 4228: 34.219.197.206 - - [21/May/2024:18:10:25 +0100] "GET /picture.php/MC_12-10-2014_0347/categories HTTP/1.0" 200 3238 "-" "o"
    Line 4229: 34.219.197.206 - - [21/May/2024:18:10:31 +0100] "GET /picture.php/MC_12-10-2014_0312/categories HTTP/1.0" 200 3257 "-" "i"
    Line 4230: 34.219.197.206 - - [21/May/2024:18:10:38 +0100] "GET /picture.php/TT_10-06-2012_0259/tags/267-2 HTTP/1.0" 200 3315 "-" "x"
    Line 4231: 34.219.197.206 - - [21/May/2024:18:10:44 +0100] "GET /picture.php/WW_30-09-2012_0032/category/136 HTTP/1.0" 200 3238 "-" "a"
    Line 4235: 34.219.197.206 - - [21/May/2024:18:10:52 +0100] "GET /picture.php/MC_12-10-2014_0333/categories HTTP/1.0" 200 3237 "-" "h"
    Line 4237: 34.219.197.206 - - [21/May/2024:18:10:58 +0100] "GET /picture.php/20050714_IC2005_0886/categories HTTP/1.0" 200 3184 "-" "n"
    Line 4238: 34.219.197.206 - - [21/May/2024:18:11:04 +0100] "GET /picture.php/NYD_01-01-2012_0026/tags/267-2 HTTP/1.0" 200 3319 "-" "n"
    Line 4239: 34.219.197.206 - - [21/May/2024:18:11:11 +0100] "GET /picture.php/TT_10-06-2012_0263/tags/273-53 HTTP/1.0" 200 3316 "-" "n"

As it is a photography gallery hosting some 181,376 of my photos, in the past year it has become popular with AI bots wanting images to train on, urgh. Plus, I think some script kiddies were getting interested, as a bit of code on the site would throw an error or two. I have fixed that, but it doesn't stop them from probing.

Is there a way to filter those with no User-Agent details or even malformed ones?
 
Sure. You will likely have to create your own jail and filter to do this. There is an interesting and detailed blog post about optimizing fail2ban to block bad bots. It does not cover your use case, but it might be a good starting point for reading up on fail2ban.
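
As a rough starting point for such a filter, here is a small Python sketch for trying out a pattern against access log lines before putting it into fail2ban. It assumes the log is in combined log format like the excerpt above, and it treats an empty, "-" or single-character user agent as suspect. The function name and the two 203.0.113.x sample lines are made up for illustration. fail2ban's failregex uses Python regular expressions, so the same pattern should carry over if the leading host group is replaced with <HOST>, but test it against your own logs first.

```python
import re

# Combined log format: IP, identity, user, [timestamp], "request",
# status, size, "referer", "user agent".
# The last group accepts zero or one character, so it matches "", "-" and
# single letters such as "i" or "x", but not a normal browser string.
# In a fail2ban filter, (?P<host>\S+) would be written as <HOST> instead.
SUSPECT_UA = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]?)"$'
)


def suspect_host(line):
    """Return the client IP if the user agent is missing or one character long, else None."""
    match = SUSPECT_UA.match(line.strip())
    return match.group("host") if match else None


samples = [
    # Taken from the log excerpt in this thread (without the "Line NNNN:" prefix).
    '34.219.197.206 - - [21/May/2024:17:54:17 +0100] "GET /picture.php/20060524_TimeTrial_0080-/categories HTTP/1.0" 200 3252 "-" "i"',
    # Invented lines using a documentation-only IP range.
    '203.0.113.7 - - [21/May/2024:18:00:00 +0100] "GET /index.php HTTP/1.1" 200 4782 "-" "-"',
    '203.0.113.8 - - [21/May/2024:18:00:01 +0100] "GET /index.php HTTP/1.1" 200 4782 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]

for line in samples:
    print(suspect_host(line))
```

Keep in mind that some legitimate clients (monitoring checks, simple scripts) also send no user agent, so it may be worth pairing a filter like this with a sensible maxretry and an ignore list in the jail.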

I came across this post that has some useful information to take into account too.
 