Issue Websites stop working until I restart apache and nginx

dgarcia90 · Nov 29, 2022

Hello everyone, we have a dedicated hosting server with Plesk. We have a really weird issue where our websites stop responding until I restart apache and nginx; if I instead wait 5 - 10 minutes, they become accessible again. I haven't found a cause or pattern for this issue: it can happen once a month or once in three months, but it happened twice yesterday and again today.

I can get by with Linux, but I'm no expert when it comes to troubleshooting.

I appreciate any help you can give me.

Thanks!

Yaroslav_T · Nov 30, 2022

Hello @dgarcia90 ,
I would begin by checking the logs for the timeframe when the issue appeared.
The logs I would check for:

The logs of a few affected websites (Domains > example.com > Logs) - maybe there is a pattern, e.g. similar errors on different websites that were down.
The webserver logs /var/log/nginx/error.log and /var/log/apache2/error.log
There also can be something in the output of journalctl --unit=apache2 (or --unit=nginx)
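To narrow the logs down to the minutes around an outage, something like the following sketch may help. It assumes the nginx error log uses nginx's standard "YYYY/MM/DD HH:MM:SS" line prefix, which compares correctly as plain text; the function name log_window and the timestamps in the usage comments are only illustrations. It does not work for Apache's error log, whose timestamp format differs.

```shell
# Print the lines of an nginx-style error log whose timestamp falls inside
# a time window (both bounds inclusive).
# Usage: log_window FILE "2022/11/29 14:00:00" "2022/11/29 14:15:00"
log_window() {
    awk -v from="$2" -v to="$3" '
        { ts = $1 " " $2 }          # first two fields form the timestamp
        ts >= from && ts <= to      # string comparison; prints matching lines
    ' "$1"
}

# Example on the server (timestamps are placeholders):
#   log_window /var/log/nginx/error.log "2022/11/29 14:00:00" "2022/11/29 14:15:00"
# journalctl can apply a similar window itself:
#   journalctl --unit=apache2 --since "2022-11-29 14:00" --until "2022-11-29 14:15"
```

If the window comes back empty for every log, that is a hint in itself: the web servers may not have logged an error at all during the outage.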

dgarcia90 · Nov 30, 2022

Hey, thanks for your reply.

Unfortunately, there's nothing in those logs at the time the issue happened.

dgarcia90 · Dec 2, 2022

Hey guys, any idea what else I could check? I had another "crash" last night.

Thanks in advance
Kind regards,
Daniel García

IgorG · Dec 2, 2022

What about related error messages in /var/log/syslog ?

dgarcia90 · Dec 2, 2022

IgorG said:
What about related error messages in /var/log/syslog ?

Thanks for your reply. I've just checked that log, and besides some unauthorized login attempts on my smtp and ssh services (I have fail2ban enabled), there's nothing there that I can see that could cause this issue.

mr-wolf · Dec 2, 2022

Plesk this week, "out of the blue", added the line " ssl_dhparam /opt/psa/etc/dhparams2048.pem;" to the file /etc/nginx/conf.d/ssl.conf

Check with these commands whether that line was indeed added just before your sites went off-line:

ls -l /etc/nginx/conf.d/ssl.conf
grep param /etc/nginx/conf.d/ssl.conf
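As a rough sketch of the same check, the helper below counts how many active (not commented-out) lines start with a given directive across one or more config files. The name count_directive is made up for this example. A count above 1 is only a hint, since ssl_dhparam is allowed in both the http and server contexts; nginx -t remains the authoritative test.

```shell
# Count the lines that start with the given directive in the given nginx
# config files. Lines where the directive is commented out are not counted.
# Usage: count_directive DIRECTIVE FILE...
count_directive() {
    directive="$1"
    shift
    cat "$@" | grep -cE "^[[:space:]]*${directive}[[:space:]]"
}

# Example on the server (path taken from the post above):
#   count_directive ssl_dhparam /etc/nginx/conf.d/ssl.conf
#   nginx -t    # nginx's own config test; reports a duplicate with file and line
```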

However, it doesn't explain why the sites went off-line, because that entry is valid under normal conditions.

In my case it wasn't valid, because I have had such a line in my nginx config for ages and nginx doesn't allow a duplicate entry for that directive.
That's my punishment for being progressive.

I made a thread for that as well, but somehow @IgorG didn't comment there. ;-)
I presume he didn't want to bump the thread.

Issue - All websites stopped after Plesk changed /etc/nginx/conf.d/ssl.conf and added ssl_dhparam /opt/psa/etc/dhparams2048.pem;

This morning all websites on 1 server stopped working. Thanks to monitoring software (zabbix) I knew that this was because of an invalid nginx-config (which tests the output of "nginx -t") nginx -t gave me: nginx: [emerg] "ssl_dhparam" directive is duplicate in /etc/nginx/conf.d/ssl.conf:5...

talk.plesk.com

I'm mentioning it here because it may put you at ease if you are looking for an "unlawful" change to your server.

Plesk will rewrite many of your configs and will restart services as a result of an update.
99% of the time you will not notice it.
We just don't live in a perfect world....

When things go wrong I will call it an "update from hell"
I had one this week.
In this case none of the parties (me, nginx, or Plesk) was to blame.

dgarcia90 · Dec 2, 2022

Hello, thanks for your reply. It looks like I also have those lines in that file.

mr-wolf · Dec 2, 2022

...and when were they added??

But still, it should not be a problem. Maybe once, but not repeatedly...

dgarcia90 · Dec 2, 2022

mr-wolf said:
...and when were they added??

Hello, the last modification of that file was on May 13th, 2022.

mr-wolf · Dec 2, 2022

In that case this particular change has no bearing on your problem...
Do you still have this problem?

dgarcia90 · Dec 2, 2022

mr-wolf said:
In that case this particular change has no bearing on your problem...
Do you still have this problem?

I do have this problem, but it's completely random. It can happen once or twice a month, then there are a few months with no issue, and then it starts happening again...

WebHostingAce · Dec 2, 2022

Have you checked your CPU and Memory usage when this happened?

Bitpalast · Dec 3, 2022

@dgarcia90 The "5 - 10 minutes" is suspicious for a fail2ban ban, because anywhere between 3 and 10 minutes is the normal first-response ban time for the jails. When your server appears to be "offline", can you still access it from a different IP address, for example over your phone's cellular connection (bypassing your wifi)?

dgarcia90 · Dec 3, 2022

Peter Debik said:
@dgarcia90 The "5 - 10 minutes" is suspicious for a fail2ban ban, because anywhere between 3 and 10 minutes is the normal first-response ban time for the jails. When your server appears to be "offline", can you still access it from a different IP address, for example over your phone's cellular connection (bypassing your wifi)?

Thanks for your reply. When the issue happens and I'm able to see it (sometimes it happens at night and I cannot run any tests), I'm not able to reach any of my websites no matter where I try from. I tried my phone (4G connection), my PC, and another PC on a different network (with a different public IP), and the outcome was the same in every attempt.

Bitpalast · Dec 3, 2022

Here are some more ideas for you:

Sometimes attackers or bad bots send many requests against a website. This can have the effect that Apache ramps up its instances quickly, reaching the max of 255 instances. It will then stop accepting new requests and wait until the running requests can be served. It is also possible that PHP-FPM is loaded up with many scripts that it ought to handle, but the maximum number of allowed children is reached so that no further children can be spawned. In either case you may experience a wait of several minutes until the system becomes responsive again. The best defense against this is to have all Fail2Ban jails in place so that attackers are blocked.
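Both limits usually leave a trace in the logs when they are hit: Apache writes a "server reached MaxRequestWorkers setting" message and PHP-FPM writes "server reached pm.max_children setting". A minimal sketch for searching for them follows; the function name find_limit_warnings is invented for this example, and the PHP-FPM log path is deliberately left as a placeholder because it varies by PHP version and setup.

```shell
# Search the given log files for the warnings Apache and PHP-FPM write
# when they hit their process limits. Matching lines are printed with the
# file name in front.
# Usage: find_limit_warnings FILE...
find_limit_warnings() {
    grep -HE 'reached MaxRequestWorkers|reached pm\.max_children' "$@"
}

# Example on the server (second path is a placeholder to fill in):
#   find_limit_warnings /var/log/apache2/error.log "$PHP_FPM_LOG"
```

If a match lines up with the time of an outage, the corresponding limit is a likely suspect. No match does not rule the theory out, for example when logs have been rotated.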

It is also possible that RAM usage explodes. This can trigger swapping, which in turn can slow down a system considerably so that it feels as if it has become unresponsive. A great countermeasure is using CGroups, which are included as an extension in Plesk. You can limit RAM and CPU usage with it.
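Because the outages are rare and often happen at night, it may help to record load and swap usage continuously so there is data to look at afterwards. The sketch below reads swap usage from /proc/meminfo on Linux; the function name swap_used_kb and the sample log path in the comment are hypothetical.

```shell
# Print the amount of swap currently in use, in kB, from a
# /proc/meminfo-style file.
# Usage: swap_used_kb [FILE]   (defaults to /proc/meminfo)
swap_used_kb() {
    awk '
        $1 == "SwapTotal:" { total = $2 }
        $1 == "SwapFree:"  { free = $2 }
        END { print total - free }
    ' "${1:-/proc/meminfo}"
}

# A cron job or loop could append a timestamped sample every minute
# (the log path is only an example):
#   echo "$(date '+%F %T') load=$(cut -d' ' -f1-3 /proc/loadavg) swap_used_kb=$(swap_used_kb)" >> /var/log/outage-samples.log
```

A sharp rise in swap usage or load average just before the sites stop responding would point toward resource exhaustion.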

Kritarth · Oct 5, 2024

Websites stop working until I restart apache

What is the solution for this?

Kritarth · Oct 5, 2024

dgarcia90 How did you resolve it?

Bitpalast · Oct 6, 2024

@Kritarth What have you tried so far to find the cause of the issue?

Kritarth · Oct 6, 2024

I have checked the CPU and memory utilization, enabled debug logs, reviewed error logs, and examined the configuration for httpd. Everything seems to be fine, but the website goes down almost every week. It remains down until we manually start the Apache service, and we continue to receive alerts during this downtime, as previously mentioned.
