
Issue: Websites hang while loading for minutes, then return 502

stas styler

Basic Pleskian
Hi guys,

Lately I've been seeing a problem that is really frustrating me.
Some websites suddenly cannot be loaded. When I type the URL and press Enter, the browser tries to load the page for ages and then shows 502 Bad Gateway.

I've tried to find the cause and reproduce it, but with no luck.
My first quick fix was to restart the server. Then I found that if I go to the PHP settings and change the PHP version, or just click OK, the website comes back to life until the next time (it happens about once a week, sometimes every couple of days).

Restarting nginx or Apache doesn't help.

Background about our infrastructure:
We provide shared web hosting to our clients.
We host about 300 websites (piped logs are enabled).
The server is dedicated: 128 GB RAM, AMD EPYC 7401P 24-core processor (48 threads), RAID 10 with 7 TB of SSD storage, MariaDB 11, nginx as a reverse proxy in front of Apache, Plesk Onyx 17.8 (patch 14), CentOS 7.

htop doesn't show anything unusual; we barely use 20-30% of our cores and memory.
The logs show this:

Code:
(70007)The timeout specified has expired: AH01075: Error dispatching request to :, referer: https://homediet.co.il/nutritionists/317202106/

Code:
55818#0: *918084 upstream prematurely closed connection while reading response header from upstream

Code:
63988#0: *921196 connect() failed (111: Connection refused) while connecting to upstream

Many of them point to a problem with the PHP-FPM socket:
Code:
(2)No such file or directory: AH02454: FCGI: attempt to connect to Unix domain socket /var/www/vhosts/system/example.com/php-fpm.sock (*) failed
[proxy_fcgi:error] [pid 3838:tid 140126940247808] [client 203.0.113.2:56904] AH01079: failed to make connection to backend: httpd-UDS
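
To see which of these failure modes dominates, the log lines can be grouped by error type. The following is only a minimal sketch: the match strings come from the excerpts quoted above, while the category names and the helper functions classify and summarize are made up for illustration.

```python
import re
from collections import Counter

# Match strings are copied from the log excerpts in this post.
# The category names on the left are hypothetical labels for this sketch.
PATTERNS = {
    "backend_timeout": re.compile(r"AH01075"),
    "upstream_closed": re.compile(r"upstream prematurely closed connection"),
    "connection_refused": re.compile(r"\(111: Connection refused\)"),
    "socket_missing": re.compile(r"AH02454"),
    "backend_unreachable": re.compile(r"AH01079"),
}


def classify(line):
    """Return the first matching category name, or None if nothing matches."""
    for name, pattern in PATTERNS.items():
        if pattern.search(line):
            return name
    return None


def summarize(lines):
    """Count how many lines fall into each category."""
    counts = Counter()
    for line in lines:
        category = classify(line)
        if category is not None:
            counts[category] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "(70007)The timeout specified has expired: AH01075: Error dispatching request to :",
        "55818#0: *918084 upstream prematurely closed connection while reading response header from upstream",
        "63988#0: *921196 connect() failed (111: Connection refused) while connecting to upstream",
        "(2)No such file or directory: AH02454: FCGI: attempt to connect to Unix domain socket "
        "/var/www/vhosts/system/example.com/php-fpm.sock (*) failed",
        "AH01079: failed to make connection to backend: httpd-UDS",
    ]
    for category, count in summarize(sample).items():
        print(category, count)
```

In practice the lines would be read from the domain's Apache and nginx error logs instead of the inline sample. A high share of socket_missing entries would suggest the FPM pool for that domain is not running at all, rather than merely being slow.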


I tried every suggestion and solution in the existing threads; unfortunately none of them helped.
Does anyone know what is going on? I think it started happening shortly after upgrading to 17.8.
 
What does /var/log/php-fpm/error.log say?

Do you need PHP-FPM? If not, you could try FastCGI.

Is nginx running? If yes, what happens if you switch the service off?
 
nginx serves the static content while Apache runs beside it on the same server (as configured in the stock installation).

Yes, we need PHP-FPM: it has a much faster response time because it spawns child processes that wait for requests to come in.

The PHP-FPM logs show only notices about unimportant issues in some clients' websites.
 
Could this be a similar problem?
Variable not replaced - AH01079: failed to make connection to backend: httpd-UDS - i-MSCP - internet - Multi Server Control Panel

It should at least point you in the direction of the problem.
It seems they are talking about the same error, but with a different panel.
Can anyone confirm that this is a bug in Plesk, Apache, or nginx that is going to be fixed?
If so, I need a manual fix, or at least some indication that a fix is coming.

My clients' websites look as if they are down, and that is just not professional.
 
The symptoms do not point to a single, easily identifiable cause. My suggestion is to first check the /var/log/plesk-phpXX-fpm/error.log log files. Maybe you will find more specific hints there. One frequent cause of these symptoms is that the "max_children" limit is reached, so the service drops further connections. Other similar factors can play a role, too.
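
When a pool runs out of workers, PHP-FPM normally writes a warning of the form "server reached pm.max_children setting (N), consider raising it" to its error log. The sketch below counts such warnings per pool; the sample lines are illustrative and not taken from this thread, and pools_hitting_limit is a hypothetical helper name.

```python
import re
from collections import Counter

# PHP-FPM's warning when a pool has no free workers left.
MAX_CHILDREN_RE = re.compile(
    r"\[pool (?P<pool>[^\]]+)\] server reached pm\.max_children setting \((?P<limit>\d+)\)"
)


def pools_hitting_limit(lines):
    """Count max_children warnings, keyed by (pool name, configured limit)."""
    hits = Counter()
    for line in lines:
        match = MAX_CHILDREN_RE.search(line)
        if match:
            hits[(match.group("pool"), int(match.group("limit")))] += 1
    return hits


if __name__ == "__main__":
    # Illustrative lines only; real input would come from the
    # /var/log/plesk-phpXX-fpm/error.log file of the PHP version in use.
    sample = [
        "[12-Sep-2018 10:35:10] WARNING: [pool example.com] server reached "
        "pm.max_children setting (5), consider raising it",
        "[12-Sep-2018 10:35:40] WARNING: [pool example.com] server reached "
        "pm.max_children setting (5), consider raising it",
        "[12-Sep-2018 10:36:00] NOTICE: ready to handle connections",
    ]
    for (pool, limit), count in pools_hitting_limit(sample).items():
        print(f"{pool}: limit {limit} reached {count} time(s)")
```

If a pool shows up here at the same times the site hangs, raising its limit is a reasonable first step; if nothing shows up, the cause probably lies elsewhere.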
 
I guess I'll describe our problem in this thread, since it seems very similar. We run a fairly high-traffic website. For the past few months we've used the option "FastCGI application served by Apache" to run PHP 7.1 (since we've had issues with FPM in the past).
Yesterday we switched to FPM again because we assumed the issues would have been resolved by now. Unfortunately the website became unresponsive this morning. After a few minutes I restarted nginx, which resolved the problem. The logs show:

Code:
proxy_error_log
2018/09/12 10:35:46 [error] 20215#0: *18681969 upstream timed out (110: Connection timed out) while reading response header from upstream, client:

Code:
error_log
[Wed Sep 12 10:35:11.783708 2018] [proxy_fcgi:error] [pid 15655:tid 139972506019584] (70007)The timeout specified has expired: [client ] AH01075: Error dispatching request to :, referer:
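
Comparing the timestamps of the two entries shows which layer reported the failure first. This is only a rough sketch: it assumes both logs use the same clock and the default timestamp formats seen above, and the function names are made up.

```python
from datetime import datetime


def parse_nginx_ts(line):
    """nginx error log lines start with 'YYYY/MM/DD HH:MM:SS'."""
    return datetime.strptime(line[:19], "%Y/%m/%d %H:%M:%S")


def parse_apache_ts(line):
    """Apache 2.4 error log lines start with '[Wed Sep 12 10:35:11.783708 2018]'."""
    stamp = line[1:line.index("]")]
    return datetime.strptime(stamp, "%a %b %d %H:%M:%S.%f %Y")


if __name__ == "__main__":
    nginx_line = "2018/09/12 10:35:46 [error] 20215#0: *18681969 upstream timed out"
    apache_line = "[Wed Sep 12 10:35:11.783708 2018] [proxy_fcgi:error] [pid 15655:tid 139972506019584]"
    gap = parse_nginx_ts(nginx_line) - parse_apache_ts(apache_line)
    print(f"nginx logged its timeout {gap.total_seconds():.1f}s after the Apache error")
```

For the two lines quoted here, the Apache proxy_fcgi error comes first and nginx reports its own timeout roughly half a minute later, which is consistent with the PHP backend stalling before nginx gives up.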

I've attached screenshots of the Apache Server Status during:

server_status_issue.png

and after the issue:
server_status_normal.png

We've switched back to FastCGI again now to prevent this issue.
 
Make sure that your PHP-FPM limits are high enough, for example in /var/www/vhosts/system/<domain>/conf/php.ini:
Code:
[php-fpm-pool-settings]
pm.max_children = 50
pm.max_requests = 10000
pm.process_idle_timeout = 120s
These can also be set in the additional PHP-FPM directives in the GUI (including the [php-fpm-pool-settings] header, of course).
Also make sure that the scripts that are running can actually finish what they are doing. Scripts sometimes end up in infinite loops, which can cause issues similar to those described.
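
Before raising pm.max_children on a shared server, it helps to check the worst case against available memory. The arithmetic below is a back-of-the-envelope sketch: the 300 sites and 128 GB come from the first post and the limit of 50 from the settings above, while the 60 MB per worker, the 50% RAM budget, and the one-pool-per-site layout are assumptions that should be replaced with measured values.

```python
def worst_case_workers(sites, max_children_per_site):
    """Total PHP-FPM workers if every site's pool hits its limit at once."""
    return sites * max_children_per_site


def workers_within_budget(ram_budget_mb, avg_worker_mb):
    """How many workers fit into a given memory budget."""
    if avg_worker_mb <= 0:
        raise ValueError("avg_worker_mb must be positive")
    return ram_budget_mb // avg_worker_mb


if __name__ == "__main__":
    sites = 300                   # from the first post
    max_children = 50             # from the settings above
    total_ram_mb = 128 * 1024     # 128 GB, from the first post
    avg_worker_mb = 60            # ASSUMPTION: measure real worker memory use
    ram_budget_mb = total_ram_mb // 2  # ASSUMPTION: half of RAM reserved for PHP

    print("worst-case workers:", worst_case_workers(sites, max_children))
    print("workers that fit:  ", workers_within_budget(ram_budget_mb, avg_worker_mb))
```

Under these assumptions the theoretical worst case is far above what fits in memory. That is rarely a problem in practice because most pools stay nearly idle, but it is a reason to raise the limit selectively for busy sites rather than everywhere.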
 