Resolved The dreaded 502 Bad Gateway, nginx + php-fpm

Discussion in 'Plesk 12.x for Linux' started by OlgaKM, Feb 9, 2017.

  1. OlgaKM

    OlgaKM Basic Pleskian

    I recently switched to nginx + php-fpm on my server. There are about 60 domains on the server, most low traffic, but some mid-to-high traffic. All seemed to be working fine for the first 12 hours, but then I started getting a 502 Bad Gateway error on one of the high traffic domains. The nginx error log is full of messages like:

    Code:
    2017/02/10 00:32:42 [error] 46165#0: *119100 connect() to unix:///var/www/vhosts/system/mysite.com/php-fpm.sock failed (11: Resource temporarily unavailable) while connecting to upstream, client: 46.229.168.71, server: mysite.com, request: "GET /some-page.php HTTP/1.1", upstream: "fastcgi://unix:///var/www/vhosts/system/mysite.com/php-fpm.sock:", host: "www.mysite.com"
    I started googling, and this seems to be quite a common problem, but nothing I have tried so far has helped. Since only one domain is affected, and it is the highest-traffic one, my guess is that php-fpm is being overwhelmed. I created the file /var/www/vhosts/system/mysite.com/conf/php.ini and experimented with the pool settings, gradually increasing them. Currently I have the following settings, and I'm still getting the error:

    Code:
    [php-fpm-pool-settings]
    ; By default use ondemand spawning (this requires php-fpm >= 5.3.9)
    pm = dynamic
    pm.max_children = 1500
    pm.process_idle_timeout = 10s
    ; Following pm.* options are used only when 'pm = dynamic'
    pm.start_servers = 20
    pm.min_spare_servers = 20
    pm.max_spare_servers = 50
    From what I've read, this max_children value is certainly too high for my server's specs. However, I have also tried other values, including 5, 40, 100, 300, and 1000, and the error persists.
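A common rule of thumb for sizing pm.max_children is to divide the memory you can spare for PHP by the average size of one PHP-FPM child. The sketch below only illustrates that arithmetic; the function name and all figures are hypothetical, not measurements from this server. Since each domain here appears to have its own pool (the socket and config paths are per-domain), the resulting budget would be shared across all pools, not granted to each one.

```python
def estimate_max_children(total_ram_mb, reserved_mb, avg_child_mb):
    """Rule of thumb: memory left for PHP divided by the average child size."""
    if avg_child_mb <= 0:
        raise ValueError("avg_child_mb must be positive")
    available_mb = total_ram_mb - reserved_mb
    if available_mb <= 0:
        return 0
    return int(available_mb // avg_child_mb)


# Hypothetical figures: 8 GB of RAM, 2 GB kept back for nginx, the database
# and the OS, and PHP-FPM children averaging 60 MB each.
budget = estimate_max_children(total_ram_mb=8192, reserved_mb=2048, avg_child_mb=60)
print(budget)
```

With these made-up numbers the whole server could afford roughly a hundred children in total, which shows why a per-pool value of 1500 is far beyond what the memory can back.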

    I have read that I should try switching from a Unix socket to a TCP/IP socket, but I'm not sure of a safe way to do this on Plesk. The config file for the domain in /opt/plesk/php/7.0/etc/php-fpm.d/ says:

    Code:
    ; Don't override following options, they are relied upon by Plesk internally
    The socket is listed just below this.

    In short, I am not sure what I can do next. Any advice?

    Edit: It seems that the other, lower-traffic domains on the server have also started throwing this error, so perhaps it is not a traffic issue after all. Besides, I read somewhere that php-fpm writes a "pool seems busy" message to the error log when load is the problem...
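To see which domains are actually affected, the nginx error log can be tallied by upstream socket. This is a minimal sketch that assumes log lines shaped like the excerpt quoted earlier; the function name is made up for illustration, and every sample line except the first is invented (example.org is a placeholder domain).

```python
import re
from collections import Counter

# Captures the socket path from "connect() to unix://<path> failed (11: ...)".
SOCKET_RE = re.compile(
    r"connect\(\) to unix://(\S+) failed "
    r"\(11: Resource temporarily unavailable\)"
)


def count_busy_sockets(lines):
    """Count 'Resource temporarily unavailable' errors per PHP-FPM socket."""
    counts = Counter()
    for line in lines:
        match = SOCKET_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


sample_lines = [
    '2017/02/10 00:32:42 [error] 46165#0: *119100 connect() to unix:///var/www/vhosts/system/mysite.com/php-fpm.sock failed (11: Resource temporarily unavailable) while connecting to upstream',
    # The remaining lines are hypothetical.
    '2017/02/10 00:32:43 [error] 46165#0: *119101 connect() to unix:///var/www/vhosts/system/mysite.com/php-fpm.sock failed (11: Resource temporarily unavailable) while connecting to upstream',
    '2017/02/10 00:32:44 [error] 46165#0: *119102 connect() to unix:///var/www/vhosts/system/example.org/php-fpm.sock failed (11: Resource temporarily unavailable) while connecting to upstream',
    '2017/02/10 00:33:10 [error] 46165#0: *119300 open() "/var/www/vhosts/mysite.com/httpdocs/missing.png" failed (2: No such file or directory)',
]

counts = count_busy_sockets(sample_lines)
for socket_path, hits in counts.most_common():
    print(hits, socket_path)
```

To run it against a real log, pass an open file object instead of `sample_lines`. If the errors are spread evenly across sockets rather than concentrated on one, traffic to a single site is a less likely cause.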

    Edit 2: OK, this is very strange. If I set php-fpm as the PHP handler for just one service plan (which contains 5 domains, but these are by far the highest-traffic PHP domains), everything seems to work fine. As soon as I start changing the settings for other service plans, I begin getting errors!

     
    Last edited: Feb 9, 2017
  2. Peter Debik

    Peter Debik Golden Pleskian Plesk Guru

    Most frequently a 502 is caused by fail2ban blocking one of these addresses:
    - 127.0.0.1
    - your server's public IPv4 address
    Add both of them to the fail2ban whitelist to avoid that issue.
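Whether an address is already covered by the whitelist can be checked offline. The sketch below assumes the whitelist is available as a fail2ban `ignoreip` value (a space- or comma-separated list of IPs and CIDR ranges); the function name is hypothetical, and 203.0.113.10 is a documentation-range placeholder for the server's public address.

```python
import ipaddress


def is_whitelisted(address, ignoreip):
    """Return True if address falls inside any IP or CIDR entry of ignoreip."""
    ip = ipaddress.ip_address(address)
    for entry in ignoreip.replace(",", " ").split():
        try:
            network = ipaddress.ip_network(entry, strict=False)
        except ValueError:
            continue  # not an IP or CIDR entry (e.g. a hostname); skip it
        if ip in network:
            return True
    return False


# Hypothetical whitelist that covers loopback but not the public address.
current = "127.0.0.1/8 ::1"
for addr in ("127.0.0.1", "203.0.113.10"):
    print(addr, is_whitelisted(addr, current))
```

Hostname entries are skipped here rather than resolved, so this check can report a false negative for an address that is only whitelisted by name.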
     
  3. OlgaKM

    OlgaKM Basic Pleskian

    Best Answer
    I finally figured it out. It was a multi-part problem (and nothing to do with fail2ban).

    Firstly, when I updated the settings, sometimes the php-fpm service would hang and need to be restarted. This issue is described here:

    https://support.plesk.com/hc/en-us/...m-show-502-Bad-Gateway-or-504-Gateway-Timeout

    Note that if you run PHP 7.0, the correct command is:

    Code:
    service plesk-php70-fpm restart
    The second issue was that I had a memory leak somewhere. Since I had switched php-fpm from "ondemand" to "dynamic", the leaked memory built up as the PHP-FPM child processes grew, eventually causing the outage. To prevent this, I added the following setting, which makes each child process respawn after serving that many requests:

    Code:
    pm.max_requests = 1000
    This goes in the relevant /var/www/vhosts/system/<vhost>/conf/php.ini file, in the same [php-fpm-pool-settings] section as the other pool settings.
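Why a request limit caps the damage from a leak can be shown with a toy model. This sketch assumes a constant leak per request, which real leaks rarely follow exactly; the function name and all figures are hypothetical. A `max_requests` of 0 stands for the php-fpm default of never recycling a child.

```python
def child_size_mb(base_mb, leak_kb_per_request, requests_served, max_requests=0):
    """Approximate size of one child under a constant per-request leak.

    With max_requests > 0 the child is replaced by a fresh one every
    max_requests requests, so only requests since the last respawn count.
    """
    if max_requests > 0:
        requests_served %= max_requests
    return base_mb + leak_kb_per_request * requests_served / 1024


# Hypothetical: a 30 MB child that leaks 50 KB on every request.
unlimited = child_size_mb(30, 50, requests_served=99_999)
recycled = child_size_mb(30, 50, requests_served=99_999, max_requests=1000)
print(round(unlimited), "MB without a limit")
print(round(recycled), "MB with pm.max_requests = 1000")
```

In this model the child never grows past its size after 999 requests, however long the pool runs. The leak itself is still there, so a lower limit trades more frequent respawns for a smaller peak.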
     