
Issue: A lot of random "500 Internal Server Error" from nginx since March 10th

King555

Regular Pleskian
Server operating system version: Ubuntu 24.04.2 LTS
Plesk version and microupdate number: Plesk Obsidian 18.0.68 Update #1 Web Admin Edition
For several days now, since March 10th, my sites have been randomly going down for a few minutes with the message "500 Internal Server Error", which comes from nginx. Today the nginx service was also shut down and Watchdog released it from monitoring (which led to a timeout instead of the error message). The 500 error mostly appears on individual sites, not on all of them at once, but always on different sites (completely random).

I don't think nginx received an update in the last few weeks, but I installed all other updates that were available for Plesk and my operating system, at most 24 hours after they became available for my system.

The only log entry I could find is this:

Code:
Stopping nginx.service - Startup script for nginx service...
nginx.service: Deactivated successfully.
Stopped nginx.service - Startup script for nginx service.
nginx.service: Consumed 9h 29min 51.852s CPU time, 172.0M memory peak, 0B memory swap peak.

Has anyone else experienced this kind of problem in the last few weeks?
 
I use nginx only as a reverse proxy, my main web server is Apache. All web server specific settings are the Plesk defaults.
 
Today nginx has been crashing nonstop for a few hours. I see these error messages in /var/log/nginx/error.log:

Code:
[alert] 3824807#0: socket() failed (24: Too many open files) while requesting certificate status, responder: r10.o.lencr.org, peer: [2001:2030:21::3e73:fc59]:80, certificate: "/opt/psa/var/certificates/scf29GeRI"

[emerg] 23706#0: open() "/var/www/vhosts/system/example.com/logs/proxy_access_ssl_log" failed (30: Read-only file system)

I have no idea whether these errors have something to do with the crashes.

I also restarted the whole server, but that did not solve the problem at all.
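In case it helps anyone checking the same thing: the effective limits of the running nginx workers and the mount state of the log filesystem can be inspected roughly like this (a sketch assuming a standard systemd setup; paths may differ on your system):

Code:
# Effective open-file limit of the running nginx master process
grep "Max open files" /proc/$(pgrep -o -x nginx)/limits

# How many descriptors the nginx processes currently hold
for pid in $(pgrep -x nginx); do echo "$pid: $(ls /proc/$pid/fd | wc -l) open fds"; done

# Is the filesystem holding the vhost logs mounted read-only?
findmnt -no TARGET,OPTIONS -T /var/www/vhosts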
 
I use nginx only as a reverse proxy, my main web server is Apache. All web server specific settings are the Plesk defaults.
Then the 500 actually comes from Apache. Check the Apache error log. (/var/www/vhosts/system/yourdomain/logs/error_log)
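For example (the paths assume the standard Plesk vhost layout; replace example.com with the affected domain):

Code:
# Recent Apache errors for the affected site
tail -n 200 /var/www/vhosts/system/example.com/logs/error_log

# 500 responses as seen in the per-domain proxy log
grep ' 500 ' /var/www/vhosts/system/example.com/logs/proxy_access_ssl_log | tail -n 20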
 
I thought the error came from nginx, because nginx is mentioned in the error message in the browser. Also, nginx crashes all the time, while Apache doesn't.

And it's a dedicated (real) server.

I will check the logs.
 
I wanted to give some final feedback on this issue. It turned out that on the days when nginx crashed, a large number of bots from Singapore (up to about 8,000 simultaneous), probably crawlers, were accessing my server. Since blocking Singapore completely via the firewall, everything works perfectly again. Today I had time to analyze all the monitoring graphs, and I can say it was clearly not a resource or load problem; nginx may have crashed due to the nature of the requests. If anything, only the RAM consumption was a little higher than normal, but far below the possible values. So I have no explanation for why these crashes happened. Nevertheless, the problem has been solved.
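For reference, a block like that can be reproduced from the shell roughly as follows (a sketch only; the country list source, set name, and tool choice are examples, and Plesk Firewall users may prefer to add the ranges through its UI instead):

Code:
# Collect Singapore's IPv4 ranges into an ipset and drop them early
# (ipdeny.com is just one example source for country CIDR lists)
ipset create block_sg hash:net
wget -qO- https://www.ipdeny.com/ipblocks/data/countries/sg.zone | while read -r net; do ipset add block_sg "$net"; done
iptables -I INPUT -m set --match-set block_sg src -j DROP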
 
The problem wasn't your system resources, but the configured limits. The error you shared, "24: Too many open files", happens when the web server needs to open more file descriptors than it is currently allowed to.

Based on what you said, your resources have room for a higher limit, so you could raise it to prevent a full crash in case you face another attack in the future: https://support.plesk.com/hc/en-us/articles/12377774585495
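One generic way to raise the limit for nginx alone on a systemd-based system looks roughly like this (a sketch, not necessarily the exact steps from the article above; 8192 is just an example value):

Code:
# Drop-in override that raises the open-file limit for the nginx unit only
mkdir -p /etc/systemd/system/nginx.service.d
cat > /etc/systemd/system/nginx.service.d/limits.conf <<'EOF'
[Service]
LimitNOFILE=8192
EOF
systemctl daemon-reload
systemctl restart nginx

# Verify the new limit of the running master process
grep "Max open files" /proc/$(pgrep -o -x nginx)/limits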

It's also a good idea to generally block attacks or bots with a DDoS service or filter.
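As a simple example of such a filter, nginx's own request rate limiting can slow down overly aggressive clients. In Plesk, the shared zone would typically go into a file under /etc/nginx/conf.d/, and the per-domain directive would be added via "Additional nginx directives" (file name, zone name, and rates below are examples only):

Code:
# /etc/nginx/conf.d/ratelimit.conf -- shared zone in the http context
limit_req_zone $binary_remote_addr zone=perip:10m rate=10r/s;

# Added per domain via Plesk's "Additional nginx directives" (server context)
limit_req zone=perip burst=20 nodelay;
limit_req_status 429;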
 
Thanks, I will increase the value, although 1024 (the current default soft limit) already seems quite high. The linked article gives the number of Plesk domains × 16 as the minimum value. Including subdomains and aliases, I have 22 entries in the domain list, so 1024 is almost three times higher than that minimum (1024 / (22 × 16) ≈ 2.9).

I will now double the value to 2048.

Concerning blocking: I use fail2ban and ModSecurity, but I guess these do not block those attacks in the configuration shipped with Plesk.
 
Thanks. I already set stricter values in fail2ban, especially for the badbot filter/jail. But I'm afraid bots like the ones from March will not be detected, as they are "normal" crawlers in the first place, like the Google bot. Sometimes Google also sends too many bots at once. I guess the large increase in crawler bots is due to AI (collecting information).

By the way, concerning the nginx soft limit value: I saw that Apache is already at 8192, so running the command from the link would have decreased the value for Apache, which is why I did not do it. Maybe if I have the same problem again I will increase the value for nginx only. I'm really surprised that nginx crashes in these cases, because it is considered a web server built for high traffic, isn't it?
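For comparison, the limits both services actually run with can be read like this (assuming the usual service and process names on Ubuntu):

Code:
# Limits as configured in the systemd units
systemctl show -p LimitNOFILE nginx
systemctl show -p LimitNOFILE apache2

# Effective limits of the running master processes
grep "Max open files" /proc/$(pgrep -o -x nginx)/limits
grep "Max open files" /proc/$(pgrep -o -x apache2)/limits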
 
I updated the apache-badbot filter as described in the article, and many IPs were blocked immediately (and they keep coming). I checked some of them, and according to abuseipdb.com they are indeed bad. The list of user agents in the filter file hasn't changed, but the regex line has. I guess no bad bots were blocked before because of the outdated regex.
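A quick way to see which log lines a filter actually matches, without waiting for live bans, is fail2ban's dry-run tool (the filter and log paths below are the usual ones on Ubuntu/Plesk and may differ; the jail name is an assumption, so list the jails first if unsure):

Code:
# Dry-run the badbots filter against one domain's access log
fail2ban-regex /var/www/vhosts/system/example.com/logs/access_ssl_log /etc/fail2ban/filter.d/apache-badbots.conf

# Inspect the jails and current bans
fail2ban-client status
fail2ban-client status plesk-apache-badbot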
 
Yes, nginx is lighter than Apache by default and helps with higher traffic, but the configuration has to allow for it. You should match the nginx file limit to the one Apache has. That will prevent unnecessary crashes.

Good to hear you're getting a lot of that traffic blocked now! Please tell me if you have other questions.
 