Issue Apache crashes: "server reached MaxRequestWorkers setting"

King555

Regular Pleskian
Plesk Onyx 17.5, Debian 8.1 x64.

About once a week my Apache web server crashes repeatedly throughout the day. A few minutes before every crash, this appears in the log file (/var/log/apache2/error.log):

AH00161: server reached MaxRequestWorkers setting, consider raising the MaxRequestWorkers setting

I found this page: Apache keeps going down on a Plesk server: server reached MaxRequestWorkers setting

I set both values (MaxRequestWorkers and ServerLimit) to 300 some weeks ago, but it did not help. I have now raised them to 500, still without success: the crashes continue. Yet the server load is as low as usual. The server has far more capacity than my number of website visitors requires.

This is my current /etc/apache2/mods-enabled/mpm_prefork.conf:

Code:
<IfModule mpm_prefork_module>
        StartServers              5
        MinSpareServers           5
        MaxSpareServers          10
        MaxRequestWorkers       500
        MaxConnectionsPerChild    0
        ServerLimit             500
</IfModule>
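
With mpm_prefork every worker is a separate process, so the value chosen for MaxRequestWorkers is limited by memory as well as by traffic. The following is only a rough sizing sketch in Python. The RAM figures and the per-process size are made-up illustration values, not measurements from this server, and `estimate_max_workers` is a hypothetical helper name.

```python
def estimate_max_workers(total_ram_mb, reserved_mb, avg_worker_mb):
    """Rough upper bound on prefork workers that fit into RAM.

    total_ram_mb  -- physical memory of the server (assumed value)
    reserved_mb   -- memory kept free for the database, PHP-FPM, the OS (assumed)
    avg_worker_mb -- average resident size of one Apache child (assumed;
                     it has to be measured on the real server)
    """
    if avg_worker_mb <= 0:
        raise ValueError("avg_worker_mb must be positive")
    usable = max(total_ram_mb - reserved_mb, 0)
    return int(usable // avg_worker_mb)


if __name__ == "__main__":
    # Illustration only: 16 GB RAM, 6 GB reserved, 40 MB per Apache child.
    print(estimate_max_workers(16384, 6144, 40))
```

If the estimate comes out well below the configured value, raising MaxRequestWorkers further is unlikely to help and the cause probably lies elsewhere.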

What else can I do to prevent the crashes?
 
What happens in the access log at that moment? I once had a bad bot trying to download every page of an online shop. It always requested 20 URLs at the same time and was located in the same data center as my server. As a result, my server kept running out of php-fpm children. I blocked the IP ... silence ...

So please first determine whether normal user behaviour triggers the issue or something unusual. A DoS attack, perhaps?
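
One way to look for that kind of bot is to count hits per IP and second. A minimal Python sketch follows, assuming the access logs use the common "combined" layout. The regular expression, the helper name `bursts` and the default threshold of 20 (taken from the bot described above) are illustrative choices, not anything Plesk provides.

```python
import re
from collections import Counter

# Assumed line layout ("combined" log format):
# IP - user [07/Aug/2018:07:54:01 +0200] "GET / HTTP/1.1" 200 123 "ref" "agent"
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+?) [+-]\d{4}\]')


def bursts(lines, threshold=20):
    """Return {(ip, second): hits} for every IP with >= threshold hits in one second."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            # Key is the client IP plus the timestamp truncated to the second.
            counts[(m.group(1), m.group(2))] += 1
    return {key: n for key, n in counts.items() if n >= threshold}
```

Feeding it the lines of every site's access log (for example via `open(path)`) gives one combined picture even though each domain has its own log file.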
 
It's not so easy, because I have 10 websites on that server, which all have different access logs.

But I analyzed one log file from a site which has the most visitors and has been attacked several times since its existence.

These are today's crashes:
Code:
[Tue Aug 07 07:15:00.143275 2018] [mpm_prefork:notice] [pid 19183] AH00169: caught SIGTERM, shutting down
[Tue Aug 07 07:50:28.567378 2018] [mpm_prefork:notice] [pid 21894] AH00169: caught SIGTERM, shutting down
[Tue Aug 07 08:28:24.932412 2018] [mpm_prefork:notice] [pid 24015] AH00169: caught SIGTERM, shutting down
[Tue Aug 07 09:08:56.949961 2018] [mpm_prefork:notice] [pid 26402] AH00169: caught SIGTERM, shutting down
[Tue Aug 07 09:54:39.150237 2018] [mpm_prefork:notice] [pid 28817] AH00169: caught SIGTERM, shutting down
[Tue Aug 07 10:32:40.734850 2018] [mpm_prefork:notice] [pid 31926] AH00169: caught SIGTERM, shutting down
[Tue Aug 07 11:15:38.679491 2018] [mpm_prefork:notice] [pid 2026] AH00169: caught SIGTERM, shutting down
[Tue Aug 07 12:01:19.884231 2018] [mpm_prefork:notice] [pid 5090] AH00169: caught SIGTERM, shutting down
[Tue Aug 07 12:36:48.393795 2018] [mpm_prefork:notice] [pid 7946] AH00169: caught SIGTERM, shutting down
[Tue Aug 07 13:12:06.635000 2018] [mpm_prefork:notice] [pid 10312] AH00169: caught SIGTERM, shutting down
[Tue Aug 07 13:52:38.583211 2018] [mpm_prefork:notice] [pid 13122] AH00169: caught SIGTERM, shutting down
[Tue Aug 07 14:30:33.429754 2018] [mpm_prefork:notice] [pid 15963] AH00169: caught SIGTERM, shutting down
[Tue Aug 07 15:01:02.504927 2018] [mpm_prefork:notice] [pid 19079] AH00169: caught SIGTERM, shutting down
In the access log there is only one suspicious thing around these times: the last request, or one of the last requests, before each crash has no user agent. But that's all. The IPs look "normal" and do not have many hits.
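
The AH00169 entries above record that the Apache parent process received SIGTERM, so the spacing between them may hint at whether something is stopping Apache at semi-regular intervals. A small Python sketch that computes the gaps from error log lines follows. `shutdown_gaps` is a hypothetical helper, and the timestamp format is assumed to match the excerpt above.

```python
import re
from datetime import datetime

# Matches e.g. "[Tue Aug 07 07:15:00.143275 2018] [mpm_prefork:notice] [pid 19183] AH00169: ..."
STAMP_RE = re.compile(r'^\[([^\]]+)\] \[mpm_\w+:notice\] .*AH00169')


def shutdown_gaps(lines):
    """Return the gaps in minutes between consecutive AH00169 shutdown entries."""
    stamps = []
    for line in lines:
        m = STAMP_RE.match(line)
        if m:
            stamps.append(datetime.strptime(m.group(1), "%a %b %d %H:%M:%S.%f %Y"))
    return [round((b - a).total_seconds() / 60, 1)
            for a, b in zip(stamps, stamps[1:])]
```

For the excerpt above the gaps come out at roughly 30 to 46 minutes.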

And, as I already said, the server load is quite low. See this from Plesk Watchdog for the last 24 hours:

[Attached image: stats 24h.jpg]
 
The value is "120" in my Plesk settings. But I have not changed anything in Plesk for weeks, so this cannot be the reason for the restarts. And I never activated "graceful restart".
 
It does not have to be a page with high load. It could even be the page with the lowest average load that has a single bug in, for example, a PHP application ...
Example: an AJAX call that calls PHP scripts in a loop, not one after the other as usual, but 5000 of them within milliseconds.
 
OK, thanks, I will check all log files to find the reason for the crashes.

Interestingly, the crashes always seem to occur between Monday and Tuesday morning.
 
If the timing is always nearly identical ... have a look at which scheduled tasks are running at that time!
 
For the last two weeks (or maybe even longer) I had no crashes, but this week, from Monday afternoon until this afternoon (Wednesday), I had a great many. Every crash was again preceded by the error message in the Apache error log (as mentioned in post #1).

I guess I can exclude the cronjob theory now.

Meanwhile I increased the values (in steps) to 1000:
Code:
MaxRequestWorkers 1000
ServerLimit 1000
I set the values in
/etc/apache2/mods-enabled/mpm_prefork.conf
and
/etc/apache2/mods-enabled/mpm_event.conf
(to be honest I do not know how to find out which one is used, but one of the files had to be newly created).
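
The error log entries quoted earlier in the thread are tagged `[mpm_prefork:notice]`, which suggests that prefork is the MPM actually running. Running `apache2ctl -V` should also report the server MPM. As a cross-check, here is a small Python sketch that tallies the MPM tags found in error log lines (`mpm_in_log` is a hypothetical helper name):

```python
import re
from collections import Counter

# Apache 2.4 tags each error log entry with "[module:level]", e.g. "[mpm_prefork:notice]".
MPM_RE = re.compile(r'\[(mpm_\w+):\w+\]')


def mpm_in_log(lines):
    """Count how often each MPM module name appears in the given error log lines."""
    counts = Counter()
    for line in lines:
        m = MPM_RE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts
```

If only `mpm_prefork` shows up, the settings in mpm_event.conf are most likely not the ones in effect.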

I also had errors in the files "/var/log/*php*-fpm/error.log" like this:
Code:
WARNING: [pool example.com] server reached max_children setting (10), consider raising it

I thought the two problems might be related. I increased the value within Plesk for the domain(s) in steps of 5 and finally arrived at a value that no longer produces these error messages.
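
To see which pools hit their limit and how often, the warnings can be tallied per pool. A minimal Python sketch follows, assuming the warning text has exactly the form shown above; `max_children_hits` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches e.g. "WARNING: [pool example.com] server reached max_children setting (10), ..."
WARN_RE = re.compile(r'\[pool ([^\]]+)\] server reached max_children setting \((\d+)\)')


def max_children_hits(lines):
    """Return {pool: (times_limit_reached, highest_limit_seen)}."""
    hits = Counter()
    limits = {}
    for line in lines:
        m = WARN_RE.search(line)
        if m:
            pool, limit = m.group(1), int(m.group(2))
            hits[pool] += 1
            limits[pool] = max(limits.get(pool, 0), limit)
    return {pool: (hits[pool], limits[pool]) for pool in hits}
```

A pool that keeps reaching its limit even after the limit was raised is a candidate for the site whose scripts hang or run too long.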

But the crashes still occur! And there is another problem this week: the plesk-php71-fpm service freezes constantly. It does not crash; it just stops responding, and so my entire site stops responding. The only solution is to restart the service manually via the console. Why doesn't Watchdog restart the service?

And I checked all logfiles, used this command:
Code:
cat /var/www/vhosts/*/logs/access_ssl_log | grep "07:54:" > /log.txt

Absolutely no suspicious entries at the crash times!
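
Because the MaxRequestWorkers message appears a few minutes before each crash, a window of several minutes before the shutdown may reveal more than a single minute. Below is a Python sketch under stated assumptions: the logs use the "combined" layout, the crash time is given in the same local time as the log, and lines whose request string contains escaped quotes are simply skipped. `requests_before` and `without_agent` are hypothetical helper names.

```python
import re
from datetime import datetime, timedelta

# Assumed "combined" layout: IP - user [time zone] "request" status bytes "referer" "agent"
LINE_RE = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]+?) [+-]\d{4}\] "[^"]*" \d{3} \S+ "[^"]*" "([^"]*)"'
)


def requests_before(lines, crash_time, minutes=5):
    """Return (ip, time, agent) tuples for requests in the `minutes` before crash_time."""
    start = crash_time - timedelta(minutes=minutes)
    found = []
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue
        when = datetime.strptime(m.group(2), "%d/%b/%Y:%H:%M:%S")
        if start <= when <= crash_time:
            found.append((m.group(1), when, m.group(3)))
    return found


def without_agent(entries):
    """Keep only the entries whose user agent is empty or logged as '-'."""
    return [e for e in entries if e[2] in ("", "-")]
```

Calling `requests_before(lines, datetime(2018, 8, 7, 7, 50, 28))` on the combined lines of all sites would list every request in the five minutes before the 07:50:28 shutdown.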

What else could I do?
 