mendip_discovery
New Pleskian
- Server operating system version
- Debian 11.9
- Plesk version and microupdate number
- Plesk Obsidian Version 18.0.61 Update #2
I am getting constant requests to my website that return code 200 with no sign of user interaction: they never load an image or anything else, they just scrape pages. This is wasting server resources, so I recently started using the badbots filter to get rid of some of it. Then, for some odd reason, I was getting hammered by the Facebook scraper, although Facebook's webmaster tool says it is not crawling my site. I added that to the bad bots list to save a few more cycles. Now I am getting hits from AWS IP addresses; again, they are just requesting a page.
As it is a photography gallery hosting some 181,376 of my photos, in the past year it has become popular with AI bots wanting images to train on, urgh. I also think some script kiddies got interested because a bit of code on the site used to throw an error or two. I have fixed that, but it doesn't stop them from probing.
Is there a way to filter requests with no User-Agent details, or even malformed ones?
Code:
Line 4122: 34.219.197.206 - - [21/May/2024:17:54:17 +0100] "GET /picture.php/20060524_TimeTrial_0080-/categories HTTP/1.0" 200 3252 "-" "i"
Line 4140: 34.219.197.206 - - [21/May/2024:17:57:40 +0100] "GET /picture.php/20050714_IC2005_0896/categories HTTP/1.0" 200 3183 "-" "o"
Line 4155: 34.219.197.206 - - [21/May/2024:18:01:03 +0100] "GET /picture.php/TT_26-05-2012_0138/tags/247-45 HTTP/1.0" 200 3345 "-" "d"
Line 4161: 34.219.197.206 - - [21/May/2024:18:03:10 +0100] "GET /picture.php/TT_10-06-2012_0460/tags/279-12 HTTP/1.0" 200 3331 "-" "o"
Line 4167: 34.219.197.206 - - [21/May/2024:18:04:55 +0100] "GET /picture.php/154523/tags/279-12 HTTP/1.0" 200 3274 "-" "p"
Line 4168: 34.219.197.206 - - [21/May/2024:18:05:17 +0100] "GET /picture.php/NYD_01-01-2012_0027/tags/267-2 HTTP/1.0" 200 3260 "-" "x"
Line 4170: 34.219.197.206 - - [21/May/2024:18:05:36 +0100] "GET /picture.php/TT_26-05-2012_0239/tags/267-2 HTTP/1.0" 200 3320 "-" "o"
Line 4173: 34.219.197.206 - - [21/May/2024:18:05:47 +0100] "GET /picture.php/NYD_01-01-2012_0038/tags/273-53 HTTP/1.0" 200 3352 "-" "x"
Line 4174: 34.219.197.206 - - [21/May/2024:18:05:59 +0100] "GET /picture.php/TT_10-06-2012_0458/tags/241-41 HTTP/1.0" 200 3322 "-" "i"
Line 4178: 34.219.197.206 - - [21/May/2024:18:06:16 +0100] "GET /picture.php/20050227_LDT_0236/tags/273-53 HTTP/1.0" 200 3269 "-" "x"
Line 4179: 34.219.197.206 - - [21/May/2024:18:06:23 +0100] "GET /picture.php/20050227_LDT_0053/tags/247-45 HTTP/1.0" 200 3239 "-" "n"
Line 4180: 34.219.197.206 - - [21/May/2024:18:06:30 +0100] "GET /picture.php/TT_26-05-2012_0683/tags/267-2 HTTP/1.0" 200 3344 "-" "d"
Line 4181: 34.219.197.206 - - [21/May/2024:18:06:36 +0100] "GET /picture.php/MC_12-10-2014_0321/categories HTTP/1.0" 200 3235 "-" "i"
Line 4183: 34.219.197.206 - - [21/May/2024:18:06:43 +0100] "GET /picture.php/DCS_07-09-2013_0067/categories HTTP/1.0" 200 3158 "-" "n"
Line 4184: 34.219.197.206 - - [21/May/2024:18:06:49 +0100] "GET /picture.php/20050713_IC2005_0001/tags/886-andrew_pounce HTTP/1.0" 200 3210 "-" "d"
Line 4185: 34.219.197.206 - - [21/May/2024:18:06:56 +0100] "GET /picture.php/CE_09-04-2006_0537/tags/241-41 HTTP/1.0" 200 3320 "-" "h"
Line 4186: 34.219.197.206 - - [21/May/2024:18:07:02 +0100] "GET /picture.php/TT_26-05-2012_0572/tags/273-53 HTTP/1.0" 200 3262 "-" "a"
Line 4188: 34.219.197.206 - - [21/May/2024:18:07:08 +0100] "GET /picture.php/20060514_Luc-Sur-Mere_0153-/categories HTTP/1.0" 200 3204 "-" "n"
Line 4189: 34.219.197.206 - - [21/May/2024:18:07:15 +0100] "GET /picture.php/TT_26-05-2012_0625/tags/267-2 HTTP/1.0" 200 3323 "-" "i"
Line 4190: 34.219.197.206 - - [21/May/2024:18:07:23 +0100] "GET /picture.php/TT_30-04-2006_0035/tags/247-45 HTTP/1.0" 200 3202 "-" "a"
Line 4191: 34.219.197.206 - - [21/May/2024:18:07:29 +0100] "GET /picture.php/CE_09-04-2006_0045/tags/241-41 HTTP/1.0" 200 3238 "-" "n"
Line 4193: 34.219.197.206 - - [21/May/2024:18:07:35 +0100] "GET /picture.php/TT_26-05-2012_0443/tags/273-53 HTTP/1.0" 200 3353 "-" "i"
Line 4194: 34.219.197.206 - - [21/May/2024:18:07:41 +0100] "GET /picture.php/MC_12-10-2014_0319/categories HTTP/1.0" 200 3265 "-" "n"
Line 4195: 34.219.197.206 - - [21/May/2024:18:07:48 +0100] "GET /picture.php/20050714_IC2005_0900/categories HTTP/1.0" 200 3194 "-" "i"
Line 4196: 34.219.197.206 - - [21/May/2024:18:07:54 +0100] "GET /picture.php/TT_26-05-2012_0354/tags/267-2 HTTP/1.0" 200 3320 "-" "n"
Line 4197: 34.219.197.206 - - [21/May/2024:18:08:00 +0100] "GET /picture.php/TT_10-06-2012_0281/tags/279-12 HTTP/1.0" 200 3324 "-" "x"
Line 4198: 34.219.197.206 - - [21/May/2024:18:08:07 +0100] "GET /picture.php/CE_09-04-2006_0201/tags/273-53 HTTP/1.0" 200 3307 "-" "n"
Line 4199: 34.219.197.206 - - [21/May/2024:18:08:13 +0100] "GET /picture.php/NYD_01-01-2012_0210/tags/279-12 HTTP/1.0" 200 3280 "-" "p"
Line 4200: 34.219.197.206 - - [21/May/2024:18:08:19 +0100] "GET /index.php/tags/367-323 HTTP/1.0" 200 4782 "-" "p"
Line 4202: 34.219.197.206 - - [21/May/2024:18:08:25 +0100] "GET /picture.php/CE_09-04-2006_0010/tags/279-12 HTTP/1.0" 200 3336 "-" "i"
Line 4203: 34.219.197.206 - - [21/May/2024:18:08:32 +0100] "GET /index.php/tags/378-320 HTTP/1.0" 200 4784 "-" "h"
Line 4205: 34.219.197.206 - - [21/May/2024:18:08:38 +0100] "GET /picture.php/NYD_01-01-2012_0325/tags/279-12 HTTP/1.0" 200 3285 "-" "i"
Line 4206: 34.219.197.206 - - [21/May/2024:18:08:44 +0100] "GET /picture.php/20050409_WW-HH_0048/category/230 HTTP/1.0" 200 3173 "-" "d"
Line 4208: 34.219.197.206 - - [21/May/2024:18:08:51 +0100] "GET /index.php/tags/466-246 HTTP/1.0" 200 4808 "-" "i"
Line 4209: 34.219.197.206 - - [21/May/2024:18:08:57 +0100] "GET /picture.php/TT_10-06-2012_0501/tags/241-41 HTTP/1.0" 200 3323 "-" "i"
Line 4210: 34.219.197.206 - - [21/May/2024:18:09:03 +0100] "GET /picture.php/20050714_IC2005_0905/categories HTTP/1.0" 200 3184 "-" "x"
Line 4211: 34.219.197.206 - - [21/May/2024:18:09:10 +0100] "GET /picture.php/MC_12-10-2014_0326/categories HTTP/1.0" 200 3236 "-" "d"
Line 4213: 34.219.197.206 - - [21/May/2024:18:09:16 +0100] "GET /picture.php/TT_26-05-2012_0201/tags/279-12 HTTP/1.0" 200 3325 "-" "i"
Line 4217: 34.219.197.206 - - [21/May/2024:18:09:22 +0100] "GET /picture.php/TT_26-05-2012_0749/tags/279-12 HTTP/1.0" 200 3329 "-" "i"
Line 4218: 34.219.197.206 - - [21/May/2024:18:09:29 +0100] "GET /picture.php/CE_09-04-2006_0154/tags/267-2 HTTP/1.0" 200 3258 "-" "i"
Line 4219: 34.219.197.206 - - [21/May/2024:18:09:35 +0100] "GET /picture.php/TT_26-05-2012_0328/tags/279-12 HTTP/1.0" 200 3363 "-" "h"
Line 4220: 34.219.197.206 - - [21/May/2024:18:09:41 +0100] "GET /picture.php/NYD_01-01-2012_0377/tags/247-45 HTTP/1.0" 200 3272 "-" "a"
Line 4222: 34.219.197.206 - - [21/May/2024:18:09:48 +0100] "GET /picture.php/TT_26-05-2012_0542/tags/273-53 HTTP/1.0" 200 3312 "-" "d"
Line 4223: 34.219.197.206 - - [21/May/2024:18:09:54 +0100] "GET /picture.php/20060524_TimeTrial_0076-/categories HTTP/1.0" 200 3246 "-" "a"
Line 4224: 34.219.197.206 - - [21/May/2024:18:10:00 +0100] "GET /picture.php/TT_10-06-2012_0487/tags/247-45 HTTP/1.0" 200 3352 "-" "h"
Line 4225: 34.219.197.206 - - [21/May/2024:18:10:06 +0100] "GET /picture.php/WW_17-06-2012_0005/category/58 HTTP/1.0" 200 3209 "-" "i"
Line 4226: 34.219.197.206 - - [21/May/2024:18:10:13 +0100] "GET /picture.php/20050714_IC2005_0912/categories HTTP/1.0" 200 3184 "-" "o"
Line 4227: 34.219.197.206 - - [21/May/2024:18:10:19 +0100] "GET /picture.php/TT_26-05-2012_0009/tags/267-2 HTTP/1.0" 200 3316 "-" "h"
Line 4228: 34.219.197.206 - - [21/May/2024:18:10:25 +0100] "GET /picture.php/MC_12-10-2014_0347/categories HTTP/1.0" 200 3238 "-" "o"
Line 4229: 34.219.197.206 - - [21/May/2024:18:10:31 +0100] "GET /picture.php/MC_12-10-2014_0312/categories HTTP/1.0" 200 3257 "-" "i"
Line 4230: 34.219.197.206 - - [21/May/2024:18:10:38 +0100] "GET /picture.php/TT_10-06-2012_0259/tags/267-2 HTTP/1.0" 200 3315 "-" "x"
Line 4231: 34.219.197.206 - - [21/May/2024:18:10:44 +0100] "GET /picture.php/WW_30-09-2012_0032/category/136 HTTP/1.0" 200 3238 "-" "a"
Line 4235: 34.219.197.206 - - [21/May/2024:18:10:52 +0100] "GET /picture.php/MC_12-10-2014_0333/categories HTTP/1.0" 200 3237 "-" "h"
Line 4237: 34.219.197.206 - - [21/May/2024:18:10:58 +0100] "GET /picture.php/20050714_IC2005_0886/categories HTTP/1.0" 200 3184 "-" "n"
Line 4238: 34.219.197.206 - - [21/May/2024:18:11:04 +0100] "GET /picture.php/NYD_01-01-2012_0026/tags/267-2 HTTP/1.0" 200 3319 "-" "n"
Line 4239: 34.219.197.206 - - [21/May/2024:18:11:11 +0100] "GET /picture.php/TT_10-06-2012_0263/tags/273-53 HTTP/1.0" 200 3316 "-" "n"
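To show what I mean by "malformed", here is a rough Python sketch that scans log lines in the format above and counts requests per IP where the User-Agent is empty, "-", or very short. The names `LOG_RE`, `is_suspicious_agent`, `suspicious_hits` and the `min_length` threshold of 5 are my own assumptions for illustration, not anything from Plesk or the badbots filter, and it only reports matches; it does not block anything.

```python
import re
from collections import Counter

# Combined log format:
# ip ident user [time] "request" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def is_suspicious_agent(agent: str, min_length: int = 5) -> bool:
    """Assumed heuristic: empty, "-" or very short User-Agent strings."""
    agent = agent.strip()
    return agent in ("", "-") or len(agent) < min_length


def suspicious_hits(lines):
    """Count requests with a suspicious User-Agent, grouped by client IP."""
    counts = Counter()
    for line in lines:
        # search() rather than match() so a leading "Line 4122: " prefix
        # from a text-editor search result is skipped.
        found = LOG_RE.search(line)
        if found and is_suspicious_agent(found.group("agent")):
            counts[found.group("ip")] += 1
    return counts


# Two lines copied from the excerpt above.
sample = [
    'Line 4122: 34.219.197.206 - - [21/May/2024:17:54:17 +0100] '
    '"GET /picture.php/20060524_TimeTrial_0080-/categories HTTP/1.0" '
    '200 3252 "-" "i"',
    'Line 4140: 34.219.197.206 - - [21/May/2024:17:57:40 +0100] '
    '"GET /picture.php/20050714_IC2005_0896/categories HTTP/1.0" '
    '200 3183 "-" "o"',
]

if __name__ == "__main__":
    for ip, hits in suspicious_hits(sample).most_common():
        print(ip, hits)
```

A length check like this will also catch any legitimate client that sends a very short User-Agent, so the threshold would need tuning against real traffic before the same rule is turned into an actual block.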