
Issue: A lot of random "500 Internal Server Error" from nginx since March 10th

King555

Regular Pleskian
Server operating system version: Ubuntu 24.04.2 LTS
Plesk version and microupdate number: Plesk Obsidian 18.0.68 Update #1 Web Admin Edition
For several days now, since March 10th, my sites have been randomly going down for a few minutes with the message "500 Internal Server Error", which comes from nginx. Today the nginx service was also shut down and Watchdog released it from monitoring (which led to a timeout instead of the error message). The 500 error mostly appears on individual sites, not on all of them at once, but always on different sites (completely random).

I don't think nginx received an update in the last few weeks, but I installed all other updates that were available for Plesk and my operating system, at most 24 hours after they became available for my system.

The only log entry I could find is this:

Code:
Stopping nginx.service - Startup script for nginx service...
nginx.service: Deactivated successfully.
Stopped nginx.service - Startup script for nginx service.
nginx.service: Consumed 9h 29min 51.852s CPU time, 172.0M memory peak, 0B memory swap peak.

Has anyone else experienced this kind of problem in the last few weeks?
 
I use nginx only as a reverse proxy, my main web server is Apache. All web server specific settings are the Plesk defaults.
 
Today nginx has been crashing nonstop for a few hours. I see these error messages in /var/log/nginx/error.log:

Code:
[alert] 3824807#0: socket() failed (24: Too many open files) while requesting certificate status, responder: r10.o.lencr.org, peer: [2001:2030:21::3e73:fc59]:80, certificate: "/opt/psa/var/certificates/scf29GeRI"

[emerg] 23706#0: open() "/var/www/vhosts/system/example.com/logs/proxy_access_ssl_log" failed (30: Read-only file system)

I have no idea whether these errors have something to do with the crashes.

I also restarted the whole server, but that did not solve the problem at all.
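In case it helps anyone checking the same thing: the effective limits of the running nginx workers and the mount state of the log filesystem can be inspected roughly like this (a sketch assuming a standard systemd setup; paths may differ on your system):

Code:
# Effective open-file limit of the running nginx master process
grep "Max open files" /proc/$(pgrep -o -x nginx)/limits

# How many descriptors the nginx processes currently hold
for pid in $(pgrep -x nginx); do echo "$pid: $(ls /proc/$pid/fd | wc -l) open fds"; done

# Is the filesystem holding the vhost logs mounted read-only?
findmnt -no TARGET,OPTIONS -T /var/www/vhosts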
 
I use nginx only as a reverse proxy, my main web server is Apache. All web server specific settings are the Plesk defaults.
Then the 500 actually comes from Apache. Check the Apache error log. (/var/www/vhosts/system/yourdomain/logs/error_log)
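For example (the paths assume the standard Plesk vhost layout; replace example.com with the affected domain):

Code:
# Recent Apache errors for the affected site
tail -n 200 /var/www/vhosts/system/example.com/logs/error_log

# 500 responses as seen in the per-domain proxy log
grep ' 500 ' /var/www/vhosts/system/example.com/logs/proxy_access_ssl_log | tail -n 20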
 
I thought the error came from nginx, because nginx is mentioned in the error message in the browser. Also, nginx crashes all the time, while Apache doesn't.

And it's a dedicated (real) server.

I will check the logs.
 
I wanted to give some final feedback on this issue. It turned out that on the days when nginx crashed, a large number of bots from Singapore (up to about 8,000 simultaneous), probably crawlers, were accessing my server. Since blocking Singapore completely via the firewall, everything works perfectly again. Today I had time to analyze all the monitoring graphs, and I can say it was clearly not a resource or load problem; nginx may have crashed due to the nature of the requests. If anything, only the RAM consumption was a little higher than normal, but far below the possible values. So I have no explanation for why these crashes happened. Nevertheless, the problem has been solved.
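For reference, a block like that can be reproduced from the shell roughly as follows (a sketch only; the country list source, set name, and tool choice are examples, and Plesk Firewall users may prefer to add the ranges through its UI instead):

Code:
# Collect Singapore's IPv4 ranges into an ipset and drop them early
# (ipdeny.com is just one example source for country CIDR lists)
ipset create block_sg hash:net
wget -qO- https://www.ipdeny.com/ipblocks/data/countries/sg.zone | while read -r net; do ipset add block_sg "$net"; done
iptables -I INPUT -m set --match-set block_sg src -j DROP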
 
The problem wasn't your system resources, but the configured limits. The error you shared, "24: Too many open files", happens when the web server needs to open more file descriptors than it is currently allowed to.

Based on what you said, your resources have room for a higher limit, so you could raise it to prevent a full crash in case you face another attack in the future: https://support.plesk.com/hc/en-us/articles/12377774585495
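One generic way to raise the limit for nginx alone on a systemd-based system looks roughly like this (a sketch, not necessarily the exact steps from the article above; 8192 is just an example value):

Code:
# Drop-in override that raises the open-file limit for the nginx unit only
mkdir -p /etc/systemd/system/nginx.service.d
cat > /etc/systemd/system/nginx.service.d/limits.conf <<'EOF'
[Service]
LimitNOFILE=8192
EOF
systemctl daemon-reload
systemctl restart nginx

# Verify the new limit of the running master process
grep "Max open files" /proc/$(pgrep -o -x nginx)/limits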

It's also a good idea to generally block attacks or bots with a DDoS service or filter.
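As a simple example of such a filter, nginx's own request rate limiting can slow down overly aggressive clients. In Plesk, the shared zone would typically go into a file under /etc/nginx/conf.d/, and the per-domain directive would be added via "Additional nginx directives" (file name, zone name, and rates below are examples only):

Code:
# /etc/nginx/conf.d/ratelimit.conf -- shared zone in the http context
limit_req_zone $binary_remote_addr zone=perip:10m rate=10r/s;

# Added per domain via Plesk's "Additional nginx directives" (server context)
limit_req zone=perip burst=20 nodelay;
limit_req_status 429;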
 
Thanks, I will increase the value, although 1024 (the current default soft limit) already seems quite high. The linked article gives the number of Plesk domains × 16 as the minimum value. Including subdomains and aliases, I have 22 entries in the domain list, so 1024 is almost three times higher than that minimum (1024 / (22 × 16) ≈ 2.9).

I will now double the value to 2048.

Concerning blocking: I use fail2ban and ModSecurity, but I guess these do not block those attacks in the configuration shipped with Plesk.
 
Thanks. I already set stricter values in fail2ban, especially for the badbot filter/jail. But I'm afraid bots like the ones from March will not be detected, as they are "normal" crawlers in the first place, like the Google bot. Sometimes Google also sends too many bots at once. I guess the large increase in crawler bots is due to AI (collecting information).

By the way, concerning the nginx soft limit value: I saw that Apache is already at 8192, so running the command from the link would have decreased the value for Apache, which is why I did not do it. Maybe if I have the same problem again I will increase the value for nginx only. I'm really surprised that nginx crashes in these cases, because it is considered a web server built for high traffic, isn't it?
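For comparison, the limits both services actually run with can be read like this (assuming the usual service and process names on Ubuntu):

Code:
# Limits as configured in the systemd units
systemctl show -p LimitNOFILE nginx
systemctl show -p LimitNOFILE apache2

# Effective limits of the running master processes
grep "Max open files" /proc/$(pgrep -o -x nginx)/limits
grep "Max open files" /proc/$(pgrep -o -x apache2)/limits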
 
I updated the apache-badbot filter as described in the article, and many IPs were blocked immediately (and they keep coming). I checked some of them, and according to abuseipdb.com they are indeed bad. The list of user agents in the filter file hasn't changed, but the regex line has. I guess no bad bots were blocked before because of the outdated regex.
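A quick way to see which log lines a filter actually matches, without waiting for live bans, is fail2ban's dry-run tool (the filter and log paths below are the usual ones on Ubuntu/Plesk and may differ; the jail name is an assumption, so list the jails first if unsure):

Code:
# Dry-run the badbots filter against one domain's access log
fail2ban-regex /var/www/vhosts/system/example.com/logs/access_ssl_log /etc/fail2ban/filter.d/apache-badbots.conf

# Inspect the jails and current bans
fail2ban-client status
fail2ban-client status plesk-apache-badbot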
 
Yes, nginx is lighter than Apache by default and helps with higher traffic, but the configuration has to allow for it. You should match the nginx file limit to the one Apache has. That will prevent unnecessary crashes.

Good to hear you're getting a lot of that traffic blocked now! Please tell me if you have other questions.
 