Resolved The dreaded 502 Bad Gateway, nginx + php-fpm

OlgaKM

Basic Pleskian
I recently switched to nginx + php-fpm on my server. There are about 60 domains on the server, most low traffic, but some mid-to-high traffic. All seemed to be working fine for the first 12 hours, but then I started getting a 502 Bad Gateway error on one of the high traffic domains. The nginx error log is full of messages like:

Code:
2017/02/10 00:32:42 [error] 46165#0: *119100 connect() to unix:///var/www/vhosts/system/mysite.com/php-fpm.sock failed (11: Resource temporarily unavailable) while connecting to upstream, client: 46.229.168.71, server: mysite.com, request: "GET /some-page.php HTTP/1.1", upstream: "fastcgi://unix:///var/www/vhosts/system/mysite.com/php-fpm.sock:", host: "www.mysite.com"

I started googling, and this seems to be quite a common problem, but nothing I have tried so far has helped. Because only one domain is affected, and it is the highest-traffic one, my guess is that php-fpm is being overwhelmed. I created a file /var/www/vhosts/system/mysite.com/conf/php.ini where I tried playing around with the pool settings, gradually increasing them. Currently, I have the following settings, and I'm still getting the error:

Code:
[php-fpm-pool-settings]
; By default use ondemand spawning (this requires php-fpm >= 5.3.9)
pm = dynamic
pm.max_children = 1500
pm.process_idle_timeout = 10s
; Following pm.* options are used only when 'pm = dynamic'
pm.start_servers = 20
pm.min_spare_servers = 20
pm.max_spare_servers = 50

According to what I've read, this max_children value is certainly too high for my specs! However, I have also tried lower values, including 5, 40, 100, 300, and 1000, and the error persisted with each of them.
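For readers wondering what a sane ceiling looks like: a commonly cited rule of thumb (not something confirmed in this thread) is to divide the memory you can spare for PHP by the average size of one php-fpm child. Here is a minimal Python sketch of that arithmetic. The function name and all the numbers are hypothetical, and on a server with many domains the result is a budget shared by all pools, not a per-domain value.

```python
def estimate_max_children(total_ram_mb, reserved_mb, avg_child_mb):
    """Rule-of-thumb upper bound on php-fpm children that fit in memory.

    total_ram_mb: physical RAM on the server
    reserved_mb:  RAM set aside for everything else (nginx, MySQL, OS)
    avg_child_mb: average resident size of one php-fpm child
    """
    if avg_child_mb <= 0:
        raise ValueError("avg_child_mb must be positive")
    available = total_ram_mb - reserved_mb
    # Integer division: a fraction of a child does not help.
    return max(available // avg_child_mb, 0)


# Hypothetical server: 8 GB RAM, 2 GB reserved, about 40 MB per child.
print(estimate_max_children(8192, 2048, 40))  # 153
```

With numbers like these, a setting of 1500 would let the pools ask for roughly ten times the memory the machine has.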

I have read that I should try switching from a Unix socket to a TCP/IP socket, but I'm not sure of a safe way to do this on Plesk. The config file for the domain in /opt/plesk/php/7.0/etc/php-fpm.d/ says:

Code:
; Don't override following options, they are relied upon by Plesk internally

The socket is listed just below this.

In short, I am not sure what I can do next. Any advice?

Edit: It seems that the other domains on the server, which are lower traffic, have also started throwing this error. So perhaps it is not a traffic issue? Anyway, I read somewhere that the error message "pool seems busy" is posted to the error log when it is a load issue...
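To see whether the failures are confined to one domain or spread across several, the nginx error log lines can be tallied per server. Below is a minimal Python sketch that assumes the log lines look like the one quoted above. The function name and the second sample line are made up for illustration, and you would feed it lines from whichever error log your setup writes to.

```python
import re
from collections import Counter

# Matches the "Resource temporarily unavailable" socket error and captures
# the value of the "server:" field from the same log line.
SOCKET_BUSY = re.compile(
    r"connect\(\) to unix:\S+ failed "
    r"\(11: Resource temporarily unavailable\).*?server: (?P<server>[^,]+),"
)


def count_busy_socket_errors(lines):
    """Count php-fpm socket connect failures per nginx server name."""
    counts = Counter()
    for line in lines:
        match = SOCKET_BUSY.search(line)
        if match:
            counts[match.group("server")] += 1
    return counts


sample_lines = [
    '2017/02/10 00:32:42 [error] 46165#0: *119100 connect() to '
    'unix:///var/www/vhosts/system/mysite.com/php-fpm.sock failed '
    '(11: Resource temporarily unavailable) while connecting to upstream, '
    'client: 46.229.168.71, server: mysite.com, '
    'request: "GET /some-page.php HTTP/1.1"',
    # Hypothetical unrelated line; it should be ignored.
    '2017/02/10 00:32:43 [error] 46165#0: *119101 open() failed, '
    'server: other.example, request: "GET /x HTTP/1.1"',
]

for server, hits in count_busy_socket_errors(sample_lines).most_common():
    print(server, hits)
```

If the counts are spread evenly across vhosts rather than concentrated on the busiest one, that points away from a single overloaded pool.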

Edit 2: Ok, this is very strange. If I set php-fpm as the PHP handler for 1 service plan (which contains 5 domains, but these are by far the PHP-using domains with the highest traffic), everything seems to work fine. As soon as I start changing the settings for other service plans, I begin getting errors!
 
Most frequently a 502 is caused by fail2ban blocking one of these addresses:
- 127.0.0.1
- your server's public IPv4 address
Add both of them to the fail2ban whitelist to avoid that issue.
 
I finally figured it out. It was a multi-part problem (and nothing to do with fail2ban).

Firstly, when I updated the settings, sometimes the php-fpm service would hang and need to be restarted. This issue is described here:

https://support.plesk.com/hc/en-us/...m-show-502-Bad-Gateway-or-504-Gateway-Timeout

Note that if you run PHP 7.0, the correct command is:

Code:
service plesk-php70-fpm restart

The second issue was a memory leak somewhere. Because I had switched php-fpm from "ondemand" to "dynamic", the child processes stayed alive, their memory use kept growing, and this eventually caused the outage. To prevent this, I added the following setting, which makes each child process restart after it has served 1000 requests:

Code:
pm.max_requests = 1000

to the relevant /var/www/vhosts/system/<vhost>/conf/php.ini file.
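If you want to check whether the child processes really are growing before and after such a change, one option is to look at their resident memory. The sketch below is only an illustration: it parses text in the format produced by `ps -eo rss=,args=` (RSS in kilobytes, then the command line) and assumes workers show up with a title starting with "php-fpm: pool ". The function name and the sample output are hypothetical.

```python
def fpm_worker_rss_mb(ps_output):
    """Return the RSS in MB of every php-fpm pool worker found in ps output."""
    sizes = []
    for line in ps_output.splitlines():
        parts = line.split(None, 1)  # RSS column, then the full command line
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        rss_kb, args = parts
        # Skip the master process and anything that is not php-fpm.
        if args.startswith("php-fpm: pool "):
            sizes.append(int(rss_kb) / 1024)
    return sizes


# Hypothetical sample; on a real server you would capture the ps output.
sample = """\
 10240 php-fpm: master process (/opt/plesk/php/7.0/etc/php-fpm.conf)
 51200 php-fpm: pool mysite.com
204800 php-fpm: pool mysite.com
  2048 nginx: worker process
"""

sizes = fpm_worker_rss_mb(sample)
print(sizes)       # [50.0, 200.0]
print(max(sizes))  # 200.0
```

Running this a few times over several hours and watching the largest value is a rough way to see whether recycling children keeps their size in check.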
 