
Resolved The dreaded 502 Bad Gateway, nginx + php-fpm

Discussion in 'Plesk 12.x for Linux' started by OlgaKM, Feb 9, 2017.

  1. OlgaKM

    OlgaKM Basic Pleskian

    Joined: Aug 31, 2016 | Messages: 36 | Likes Received: 1 | Location: NY
    I recently switched to nginx + php-fpm on my server. There are about 60 domains on the server, most of them low-traffic, but some mid-to-high-traffic. Everything seemed to work fine for the first 12 hours, but then I started getting a 502 Bad Gateway error on one of the high-traffic domains. The nginx error log is full of messages like:

    Code:
    2017/02/10 00:32:42 [error] 46165#0: *119100 connect() to unix:///var/www/vhosts/system/mysite.com/php-fpm.sock failed (11: Resource temporarily unavailable) while connecting to upstream, client: 46.229.168.71, server: mysite.com, request: "GET /some-page.php HTTP/1.1", upstream: "fastcgi://unix:///var/www/vhosts/system/mysite.com/php-fpm.sock:", host: "www.mysite.com"
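To see which domains are producing these errors and how often, log lines like the one above can be tallied per server name. This is a minimal sketch, assuming the nginx error log format shown above; `count_failures` is a hypothetical helper name, not part of nginx or Plesk.

```python
import re
from collections import Counter

# Matches nginx "connect() to unix:<socket> failed (<errno>: <reason>)" lines
# and captures the errno plus the "server:" field that follows.
PATTERN = re.compile(
    r"connect\(\) to unix:(?P<sock>\S+) failed "
    r"\((?P<errno>\d+): (?P<reason>[^)]*)\)"
    r".*?server: (?P<server>[^,]+),"
)


def count_failures(lines):
    """Count upstream connect failures per (server, errno) pair."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[(match.group("server"), int(match.group("errno")))] += 1
    return counts
```

Feeding it the lines of the nginx error log (for example `count_failures(open(path))`, where `path` is wherever that log lives on your system) gives a quick view of whether one domain or many are affected.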
    I started googling, and this seems to be quite a common problem, but nothing I have tried so far has helped. Because only 1 domain is affected, and it is the highest-traffic one, I am guessing that php-fpm is being overwhelmed. I created a file /var/www/vhosts/system/mysite.com/conf/php.ini where I tried playing around with the pool settings, gradually increasing them. Currently, I have the following settings, and I'm still getting the error:

    Code:
    [php-fpm-pool-settings]
    ; By default use ondemand spawning (this requires php-fpm >= 5.3.9)
    pm = dynamic
    pm.max_children = 1500
    pm.process_idle_timeout = 10s
    ; Following pm.* options are used only when 'pm = dynamic'
    pm.start_servers = 20
    pm.min_spare_servers = 20
    pm.max_spare_servers = 50
    According to what I've read, a max_children value of 1500 is certainly too high for my specs! However, I also tried lower values, including 5, 40, 100, 300 and 1000, and still got the error.
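A common rule of thumb is to size pm.max_children as the memory set aside for the pool divided by the average size of one PHP-FPM child. The sketch below illustrates that arithmetic and also checks the ordering of the dynamic-mode values. The function names and any memory figures passed in are hypothetical. The ordering rules reflect my understanding of how PHP-FPM validates a `pm = dynamic` pool, so treat them as an assumption and confirm them against your own PHP-FPM error log.

```python
def worst_case_memory_mb(max_children, avg_child_mb):
    """Memory needed if every allowed child runs at the average size."""
    return max_children * avg_child_mb


def estimate_max_children(available_mb, avg_child_mb):
    """Largest child count that fits in the memory set aside for the pool."""
    if available_mb <= 0 or avg_child_mb <= 0:
        raise ValueError("memory figures must be positive")
    return max(1, int(available_mb // avg_child_mb))


def dynamic_pool_problems(max_children, start_servers, min_spare, max_spare):
    """List ordering problems in the settings of a 'pm = dynamic' pool."""
    problems = []
    if min_spare > max_children or max_spare > max_children:
        problems.append("spare servers exceed pm.max_children")
    if max_spare < min_spare:
        problems.append("pm.max_spare_servers is below pm.min_spare_servers")
    if not (min_spare <= start_servers <= max_spare):
        problems.append("pm.start_servers is outside the spare-server range")
    return problems
```

With made-up figures of 40 MB per child and 4096 MB available, `worst_case_memory_mb(1500, 40)` is 60000 MB while `estimate_max_children(4096, 40)` is 102. Note also that `dynamic_pool_problems(5, 20, 20, 50)` reports a problem: lowering pm.max_children to 5 while leaving the spare-server values at 20 and 50 gives an inconsistent pool, which may itself be a reason some of the lower values did not help.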

    I have read that I should try switching from a Unix socket to a TCP/IP socket, but I'm not sure of a safe way to do this on Plesk. The config file for the domain in /opt/plesk/php/7.0/etc/php-fpm.d/ says:

    Code:
    ; Don't override following options, they are relied upon by Plesk internally
    The socket is listed just below this.

    In short, I am not sure what I can do next. Any advice?

    Edit: It seems that the other, lower-traffic domains on the server have also started throwing this error. So perhaps it is not a traffic issue? Anyway, I read somewhere that the message "pool seems busy" is written to the error log when load is the problem...
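One way to tell an overloaded pool from one that is not listening at all is to try connecting to each domain's socket directly. This is a minimal sketch, assuming the per-domain socket path seen in the nginx log; `probe_socket` is a hypothetical helper, not a Plesk tool, and it only tests the connection without sending a FastCGI request.

```python
import errno
import socket


def probe_socket(path, timeout=1.0):
    """Try to connect to a Unix socket; return "ok" or the errno name."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(path)
        return "ok"
    except OSError as exc:
        # Typical results: ENOENT (no socket file), ECONNREFUSED (nothing
        # listening), EAGAIN/EWOULDBLOCK (errno 11 on Linux, the same code
        # as in the nginx log). Timeouts carry no errno, so fall back to
        # the exception text.
        return errno.errorcode.get(exc.errno, str(exc))
    finally:
        sock.close()
```

For example, `probe_socket("/var/www/vhosts/system/mysite.com/php-fpm.sock")` run as a user allowed to access the socket shows whether that pool accepts connections at that moment.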

    Edit 2: OK, this is very strange. If I set php-fpm as the PHP handler for just 1 service plan (which contains 5 domains, by far the highest-traffic PHP-using domains on the server), everything seems to work fine. As soon as I start changing the settings for other service plans, I begin getting errors!

     
    Last edited: Feb 9, 2017
  2. Peter Debik

    Peter Debik Golden Pleskian Plesk Guru

    Joined: Oct 15, 2015 | Messages: 1,992 | Likes Received: 405 | Location: Berlin, Germany
    A 502 is most frequently caused by fail2ban blocking one of these addresses:
    - 127.0.0.1
    - your server's public IPv4 address
    Add both of them to the fail2ban whitelist to avoid that issue.
     
  3. OlgaKM

    OlgaKM Basic Pleskian

    Best Answer
    I finally figured it out. It was a multi-part problem (and had nothing to do with fail2ban).

    Firstly, when I updated the settings, sometimes the php-fpm service would hang and need to be restarted. This issue is described here:

    https://support.plesk.com/hc/en-us/...m-show-502-Bad-Gateway-or-504-Gateway-Timeout

    Note that if you run PHP 7.0, the correct command is:

    Code:
    service plesk-php70-fpm restart
    A second issue was that I had a memory leak somewhere. Since I had switched php-fpm to "dynamic" rather than "ondemand", the leak kept building up as the PHP-FPM child processes grew in size, and this eventually caused the outage. To prevent this, I added the setting:

    Code:
    pm.max_requests = 1000
    to the relevant /var/www/vhosts/system/<vhost>/conf/php.ini, so that each child process is respawned after serving 1000 requests.
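The effect of pm.max_requests on a leaking worker can be illustrated with simple arithmetic. This is a sketch with made-up numbers: the base size and per-request leak are hypothetical, and `child_size_kb` is an illustrative model, not a measurement of PHP-FPM.

```python
def child_size_kb(requests_served, base_kb, leak_kb_per_request, max_requests=0):
    """Approximate size of one worker after handling `requests_served` requests.

    max_requests=0 models a worker that is never recycled (the PHP-FPM
    default); a positive value models a respawn every `max_requests` requests,
    which resets the worker to its base size.
    """
    if max_requests > 0:
        # Only requests handled since the last respawn contribute to the leak.
        requests_served %= max_requests
    return base_kb + leak_kb_per_request * requests_served
```

With a 30720 KB base and a 20 KB leak per request, a worker that has handled 50,000 requests without recycling sits at 1,030,720 KB, while with `max_requests=1000` it never grows past roughly 50,700 KB before being replaced.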
     