
Question: How to Prevent High Connection Counts Causing High Memory Usage (Possible DDoS?)

Webmore

Hello everyone,

I know this might be a simple question and may have been asked multiple times, but I couldn’t find a clear answer.

I'm frequently receiving the following email alert from Plesk:

We have detected a critical status for one of the server parameters.
Please log in to Plesk and check the server status.
The message from Monitoring:
The memory usage status is critical!
The current value is 3.4 GiB.

When this happens, all websites on the server start running extremely slowly.

After connecting via SSH and running the following command:


ss -tan state established | grep ":80\|:443" | awk '{print $4}'| cut -d':' -f1 | sort -n | uniq -c | sort -nr

I notice that one IP always has a significantly high number of connections (e.g., 184 connections from xx.xx.xx.xx).
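The same per-IP tally that the ss pipeline produces can be computed in a short Python sketch, which makes it easy to add a threshold check before banning. The sample lines, the TEST-NET addresses, and the `count_peers` helper name are all illustrative assumptions, not output from a real server:

```python
from collections import Counter

# Sample lines in the shape of `ss -tan state established` output
# (Recv-Q, Send-Q, local address:port, peer address:port).
# Addresses are illustrative TEST-NET ranges, not real clients.
SAMPLE = """\
0 0 198.51.100.10:443 203.0.113.5:51234
0 0 198.51.100.10:443 203.0.113.5:51235
0 0 198.51.100.10:80 203.0.113.5:51236
0 0 198.51.100.10:443 192.0.2.77:40001
"""

def count_peers(ss_output, local_ports=(":80", ":443")):
    """Count established connections per peer IP, mirroring the
    awk '{print $4}' | cut -d':' -f1 | sort | uniq -c pipeline."""
    counts = Counter()
    for line in ss_output.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        local, peer = fields[2], fields[3]
        if not local.endswith(local_ports):
            continue
        counts[peer.rsplit(":", 1)[0]] += 1
    return counts

if __name__ == "__main__":
    for ip, n in count_peers(SAMPLE).most_common():
        print(n, ip)
```

A script like this could be run from cron and feed IPs over a chosen threshold into a ban action, though the threshold itself has to be tuned to your traffic.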

This looks like a DDoS attack, and my current solution is to manually ban the IP using Fail2Ban, which immediately restores normal performance. However, I want to automate or prevent this before it affects my server performance.

My questions:

  1. Is there a way to automatically block an IP that exceeds a certain number of connections in Plesk or Fail2Ban?
  2. Any other best practices to prevent these types of issues?
I’d appreciate any advice or guidance from the community!

Thanks in advance!
 
This is still relevant, although in the meantime @Kaspar suggested a valuable improvement to one of the key regex lines.
Improved key regex line:
Code:
failregex = ^<HOST> -[^"]*"(GET|POST|HEAD) \/.* HTTP\/\d(?:\.\d)" \d+ \d+ "[^"]*" "[^"]*(%(badbots)s)[^"]*"$
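Fail2Ban substitutes `<HOST>` with a host-matching group and interpolates `%(badbots)s` from the filter definition. One way to sanity-check the line outside Fail2Ban is to perform those substitutions by hand with Python's re module. The two-entry badbots excerpt and both log lines below are illustrative assumptions:

```python
import re

# Hand-substituted version of the failregex: <HOST> becomes a plain
# named host group, %(badbots)s becomes a tiny illustrative bot list.
badbots = r"GPTBot|AhrefsBot"
failregex = (
    r'^(?P<host>\S+) -[^"]*"(GET|POST|HEAD) \/.* HTTP\/\d(?:\.\d)"'
    r' \d+ \d+ "[^"]*" "[^"]*(' + badbots + r')[^"]*"$'
)

bad = ('203.0.113.5 - - [10/Feb/2025:12:00:00 +0000] '
      '"GET /page HTTP/1.1" 200 512 "-" '
      '"Mozilla/5.0 (compatible; AhrefsBot/7.0)"')
good = ('198.51.100.7 - - [10/Feb/2025:12:00:01 +0000] '
       '"GET /page HTTP/1.1" 200 512 "-" '
       '"Mozilla/5.0 (Windows NT 10.0; rv:120.0) Firefox/120.0"')

print(bool(re.match(failregex, bad)))   # bad-bot line matches
print(bool(re.match(failregex, good)))  # ordinary browser line does not
```

The real test is still `fail2ban-regex` against your actual log file, but this kind of dry run catches quoting mistakes quickly.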

If you want to annoy the bad guys more, then use this badbot list and regex section instead of the one I provided in the article:
Code:
badbots = VIZIO|meta-externalagent/1\.1|facebookexternalhit/1\.1 \(|python-httpx/|Pinterestbot/|aiohttp/|Cookiebot/|\(Amazonbot/|ClaudeBot|Optimizer|seobility|Timpibot|Go-http-client/1\.1|colly|GPTBot|AmazonBot|Bytespider|Bytedance|thesis-research-bot|fidget-spinner-bot|EmailCollector|WebEMailExtrac|TrackBack/1\.02|sogou music spider|seocompany|LieBaoFast|SEOkicks|Uptimebot|Cliqzbot|ssearch_bot|domaincrawler|AhrefsBot|DigExt|Sogou|MegaIndex\.ru|majestic12|80legs|SISTRIX|HTTrack|Semrush|MJ12|MJ12bot|MJ12Bot|Ezooms|CCBot|TalkTalk|Ahrefs|BLEXBot|Atomic_Email_Hunter/4\.0|atSpider/1\.0|autoemailspider|bwh3_user_agent|China Local Browse 2\.6|ContactBot/0\.2|ContentSmartz|DataCha0s/2\.0|DBrowse 1\.4b|DBrowse 1\.4d|Demo Bot DOT 16b|Demo Bot Z 16b|DSurf15a 01|DSurf15a 71|DSurf15a 81|DSurf15a VA|EBrowse 1\.4b|Educate Search VxB|EmailSiphon|EmailSpider|EmailWolf 1\.00|ESurf15a 15|ExtractorPro|Franklin Locator 1\.8|FSurf15a 01|Full Web Bot 0416B|Full Web Bot 0516B|Full Web Bot 2816B|Guestbook Auto Submitter|Industry Program 1\.0\.x|ISC Systems iRc Search 2\.1|IUPUI Research Bot v 1\.9a|LARBIN-EXPERIMENTAL \(efp@gmx\.net\)|LetsCrawl\.com/1\.0 \+http\://letscrawl\.com/|Lincoln State Web Browser|LMQueueBot/0\.2|LWP\:\:Simple/5\.803|Mac Finder 1\.0\.xx|MFC Foundation Class Library 4\.0|Microsoft URL Control - 6\.00\.8xxx|Missauga Locate 1\.0\.0|Missigua Locator 1\.9|Missouri College Browse|Mizzu Labs 2\.2|Mo College 1\.9|MVAClient|Mozilla/2\.0 \(compatible; NEWT ActiveX; Win32\)|Mozilla/3\.0 \(compatible; Indy Library\)|Mozilla/3\.0 \(compatible; scan4mail \(advanced version\) http\://www\.peterspages\.net/?scan4mail\)|Mozilla/4\.0 \(compatible; Advanced Email Extractor v2\.xx\)|Mozilla/4\.0 \(compatible; Iplexx Spider/1\.0 http\://www\.iplexx\.at\)|Mozilla/4\.0 \(compatible; MSIE 5\.0; Windows NT; DigExt; DTS Agent|Mozilla/4\.0 efp@gmx\.net|Mozilla/5\.0 \(Version\: xxxx Type\:xx\)|NameOfAgent \(CMS Spider\)|NASA Search 1\.0|Nsauditor/1\.x|PBrowse 1\.4b|PEval 1\.4b|Poirot|Port Huron Labs|Production Bot 0116B|Production Bot 2016B|Production Bot DOT 3016B|Program Shareware 1\.0\.2|PSurf15a 11|PSurf15a 51|PSurf15a VA|psycheclone|RSurf15a 41|RSurf15a 51|RSurf15a 81|searchbot admin@google\.com|ShablastBot 1\.0|snap\.com beta crawler v0|Snapbot/1\.0|Snapbot/1\.0 \(Snap Shots, \+http\://www\.snap\.com\)|sogou develop spider|Sogou Orion spider/3\.0\(\+http\://www\.sogou\.com/docs/help/webmasters\.htm#07\)|sogou spider|Sogou web spider/3\.0\(\+http\://www\.sogou\.com/docs/help/webmasters\.htm#07\)|sohu agent|SSurf15a 11 |TSurf15a 11|Under the Rainbow 2\.2|User-Agent\: Mozilla/4\.0 \(compatible; MSIE 6\.0; Windows NT 5\.1\)|VadixBot|WebVulnCrawl\.unknown/1\.0 libwww-perl/5\.803|Wells Search II|WEP Search 00

failregex = ^<HOST> -[^"]*"(GET|POST|HEAD) \/.* HTTP\/\d(?:\.\d)" \d+ \d+ "[^"]*" "[^"]*(%(badbots)s)[^"]*"$
            ^<HOST> .*GET .*aws(/|_|-)(credentials|secrets|keys).*
            ^<HOST> .*GET .*(credentials/aws|secretes/(aws|keys)|oauth/config|config/oauth).*
            ^<HOST> .*"GET .*(freshio|woocommerce).*frontend.*" (301|404).*
            ^<HOST> .*"GET .*contact-form-7/includes.*" (301|404).*
            ^<HOST> .*"(GET|POST) .*author=.*" 404.*
            ^<HOST> .*"(GET|POST) /.*wp-json/tdw/save_css.*" (301|404).*
            ^<HOST> .*"GET /.*/.git/config.*" 404.*
            ^<HOST> .*"GET /error-404.*" (301|302|404).*
            ^<HOST> .*"(GET|POST) .*xmlrpc\.php.*" (403|404|301).*
            ^<HOST> .*"(GET|POST) //?xmlrpc\.php.*" 200.*
            ^<HOST> .*"(HEAD|GET) /(bc|bk|home|backup|old|new|wp|blog|wordpress|app/.*) .*" 404.*
            ^<HOST> .*"(HEAD|GET) .*js/core\.js .*" 404.*
            ^<HOST> .*"(HEAD|GET) .*\?(back|SubmitCurrency|order)=.*2525252525.*252525252.*" 200.*
            ^<HOST> .*"(HEAD|GET) .*((login|admin|config|lock|simple|radio|alfa|txt|autoload_classmap|wp-includes/autoload_classmap|makeasmtp|yanz|filefuns|gel4y|\.tmb/admin|access|wp-admin/includes/xmrlpc|\.well-known/pki-validation/cloud|inicio-sesion|admin-post|sidwso|pl/payu/pay)\.php|\.env|package\.json|angular\.json|config\.py|base\.py|config/env\.json|config/dev\.json|config/settings\.js|config/config\.go|config/prod\.json|appsettings\.json|config/dev_settings\.py|config/prod_settings\.py|config/application\.yml|wp-includes/Requests/(Auth|Cookie|Exception|Proxy|Response|Transport|Utility)/|wp-includes/Requests/Exception/(HTTP|Transport)) .*" 404.*
            ^<HOST> .*"GET /((aa|ss|rr|ig|in|be|go)/|/?wp-admin/(install|setup-config)\.php|/?(blog|web|wordpress)?/wp-includes/wlwmanifest.xml|/?wp-json/wp/v2/users/|/?wp-json/oembed/1.0/embed.*|.*\+41|(blocks|a11y|media-utils|api-fetch|commands|components|patterns|core-data|editor|rich-text|preferences|block-editor|keycodes)\.js|wp-content/plugins/member-access|wp-content/plugins/xml-sitemaps|wp-content/plugins/wp-hide-dashboard|post_login|wp-content/plugins/google-sitemap-generator|(inetpub|admin|tmp|temp|old)\.war|.*db\.rar|.*sql\.tar\.gz|.*db\.tar\.gz|.*backup\.zip|.*db\.tgz|.*backup\.tgz|notip\.html|images/pt_logo\.svg|images/process\.jpg|san_filez/img/alert\.svg|files/img/blank\.gif|merchantbank/pageBank/bank).*" 404.*
            ^<HOST> .*"POST (/v[1-3]/graphql|/graphql(/v[1-3])?|/graph/api).*" (404).*
            ^<HOST> .*"GET / HTTP/.*" (200|301|503) .* "http://.*:(80|443)/" ".*

It'll ban all typical attacks used these days, but be careful with WordPress, as this leaves little room for mistakes on WordPress login-related files. You might lock yourself out if you request missing admin or login files or access xmlrpc.php.

Further, I recommend setting up the jail to ban immediately after the first violation of the rules and to ban long-term, at least 14 days.
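As a rough sketch, a "first hit, long ban" jail override could look like this. The jail name, filter name, and logpath below are placeholders and must be adapted to your own badbot jail and log locations (time abbreviations like 14d require a reasonably recent Fail2Ban):

```ini
[apache-badbots]
enabled  = true
filter   = apache-badbots
# Placeholder path; point this at your web server access logs.
logpath  = /var/www/vhosts/system/*/logs/*access*log
maxretry = 1
findtime = 1d
bantime  = 14d
```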
 

Update: Improving Fail2Ban Regex for Better Protection

After extensive testing and analysis, I found that using the following regex in Fail2Ban significantly improved security and prevented about 60% of attacks:


failregex = ^<HOST> -[^"]*"(GET|POST|HEAD) \/.* HTTP\/\d(?:\.\d)" \d+ \d+ "[^"]*" "[^"]*(%(badbots)s)[^"]*"$

However, attackers started using different techniques to bypass this rule and continue their malicious activities.

Enhancing the Regex for More Security

To increase protection and block even more malicious requests, I modified the regex to:


[Definition]
failregex = ^<HOST> - - \[.*\] "(GET|POST|HEAD) \/.* HTTP\/[0-9.]+" \d+ \d+ "-" ".*(bot|crawl|spider|scraper|scanner).*"
ignoreregex =

This stopped around 90% of attacks by catching a wider range of suspicious bots, crawlers, and scanners.

Unexpected Issue: Webmail Users Got Banned

However, I discovered a side effect: this new regex was blocking legitimate users who accessed webmail using URLs like:

  • /?_task=mail&_mbox=INBOX
  • /?_task=mail&_action=compose&_id=...
Since these webmail URLs matched some patterns used by bots, Fail2Ban mistakenly flagged and banned real users trying to access their emails.

Solution: Excluding Webmail from Fail2Ban

To fix this, I added an ignoreregex rule to ensure that webmail requests are not caught by Fail2Ban:


ignoreregex = .*webmail\..*\..*

After applying this rule, webmail users were no longer getting banned, and the security improvements remained effective.
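The exclusion can be dry-run with Python's re module as well. This assumes, as the ignoreregex does, that the matched log lines actually contain a webmail hostname somewhere (here, in the referer field); the hostname and both sample lines are illustrative:

```python
import re

ignoreregex = r".*webmail\..*\..*"

# A Roundcube-style request whose referer carries the webmail
# hostname (hostname and addresses are illustrative).
webmail_line = ('192.0.2.10 - - [10/Feb/2025:12:00:00 +0000] '
                '"GET /?_task=mail&_mbox=INBOX HTTP/1.1" 200 2048 '
                '"https://webmail.example.com/" "Mozilla/5.0"')
other_line = ('203.0.113.5 - - [10/Feb/2025:12:00:00 +0000] '
              '"GET /wp-login.php HTTP/1.1" 404 128 "-" "somebot/1.0"')

print(bool(re.search(ignoreregex, webmail_line)))  # line is exempted
print(bool(re.search(ignoreregex, other_line)))    # still eligible for a ban
```

Note that a line is only exempted if the webmail hostname actually appears in it, so it's worth checking your log format before relying on this.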


If you're running a Plesk server and using Fail2Ban, I highly recommend applying these updated regex rules to better protect against automated attacks while allowing legitimate users to access webmail without issues.
 
Good to read that your customized failregex helped you fend off most of the attacks you were dealing with. However, I would not recommend anyone else use this regex, as it has some serious flaws. It might be a perfect fit for your use case, but I doubt it is for others.

The original fail2ban badbot filter is meant to filter specific bad bots, which it does by comparing the user agent strings of requests against a filter list containing the names of known bad bots. This lets you ban requests/connections from specific bad bots while allowing legitimate bots and scrapers to do their thing. It also makes it easy to extend the list with new bad bot names.

Your failregex, however, only bans requests whose user agent string contains the words bot, crawl, spider, scraper or scanner. This leaves out a huge number of bad bots that do not have any of those words in their names, for example VIZIO, meta-externalagent, facebookexternalhit, Optimizer, seobility, Go-http-client, colly, Bytedance, EmailCollector, WebEMailExtrac, TrackBack and many, many others. Conversely, it blocks bots that are generally considered legitimate, most notably Googlebot. Your failregex also fails to take requests with a referer into account.
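The difference described above can be made concrete with a small comparison. The keyword pattern mirrors the broad regex from earlier in the thread, while the list pattern is a tiny illustrative excerpt of a name-based badbot list, not the full filter:

```python
import re

# Broad keyword matching, as in the criticized failregex.
keyword_regex = re.compile(r"bot|crawl|spider|scraper|scanner")
# Tiny excerpt of a name-based badbot list (illustrative only).
badbot_list = re.compile(r"Go-http-client|colly|Bytedance|EmailCollector")

agents = [
    "Go-http-client/1.1",                      # bad bot without any keyword
    "Mozilla/5.0 (compatible; Googlebot/2.1)", # legitimate bot containing "bot"
]

for ua in agents:
    print(ua,
          "keyword:", bool(keyword_regex.search(ua)),
          "list:", bool(badbot_list.search(ua)))
```

The keyword approach misses the first agent and flags the second, which is exactly the false-negative/false-positive trade-off the reply points out.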
 
Thank you for taking the time to review my custom Fail2Ban failregex and for your detailed feedback. I really appreciate your insights, as it's clear that you have much more experience in this area than I do.

I now see the flaws in my original approach, especially the issue with only filtering based on certain keywords like "bot, crawl, spider, scraper, scanner." As you pointed out, this left out many bad bots that don’t include these words in their User-Agent strings while also mistakenly blocking legitimate bots like Googlebot.

Based on your recommendations, I am modifying my failregex to incorporate a more robust approach that targets a broader list of known bad bots while avoiding unintended bans on legitimate scrapers. I will also look into properly handling referer-based filtering to refine the accuracy further.

Again, thanks for your guidance—I genuinely appreciate it!

Best regards,
 
Based on your recommendations, I am modifying my failregex to incorporate a more robust approach that targets a broader list of known bad bots while avoiding unintended bans on legitimate scrapers. I will also look into properly handling referer-based filtering to refine the accuracy further.
If you're searching for inspiration, have a look at the failregex @Bitpalast posted in this thread. It's by far the most comprehensive failregex for fail2ban I've come across.

Again, thanks for your guidance—I genuinely appreciate it!
Sure, you're welcome.
 