Server extremely slow due to PHP-FPM CPU usage

Bart Scheffer · Jan 19, 2016

Last week we transferred a website from another hosting company to our server. The website (WordPress) is quite popular and can easily reach 1,100 simultaneous visitors at peak moments. We decided to go with a 12 vCPU server with 32 GB RAM and SSD storage, running CentOS 7 with Plesk as backend. We have PHP 7.0.2 installed with FPM.

The problem: the server is extremely slow (avg 8.5 seconds). We found a WordPress cache plugin that makes the site a bit more reachable, but it does not solve the underlying problem. The server is clearly stuck in a loop somewhere, because it's constantly running at 100% CPU. This causes the site to crash a lot, and I'm getting nowhere.

I did, however, find the files where it's going wrong with "ps faxuewwww", which gave me the following results:

(This is with CGI but it's the same for FPM)

Is there somebody who knows what to do after seeing those files or where the critical problem might be? It has to be a loop but I just can't find it.

Thanks in advance!

muskratt · Jan 20, 2016

What caching plugin are you running? What do the iowait and memory values look like at that load? Any HTTPS going on? I assume you're using a local database?
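
For anyone unsure how to get the iowait figure, here is a rough Python sketch that derives it from two samples of the aggregate "cpu" line in /proc/stat. The function names (parse_cpu_line, iowait_percent, sample_iowait) are made up for illustration, and the field order is the one documented for Linux; treat it as a starting point, not a monitoring tool.

```python
import time


def parse_cpu_line(line):
    """Return the numeric counters from the aggregate 'cpu' line of /proc/stat."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(x) for x in parts[1:]]


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two 'cpu' lines."""
    b = parse_cpu_line(before)
    a = parse_cpu_line(after)
    deltas = [y - x for x, y in zip(b, a)]
    # Counters 0..7 are user, nice, system, idle, iowait, irq, softirq, steal.
    # guest/guest_nice are already counted inside user/nice, so they are skipped.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total


def sample_iowait(interval=1.0, path="/proc/stat"):
    """Read /proc/stat twice, 'interval' seconds apart, and return iowait in percent."""
    with open(path) as fh:
        before = fh.readline()
    time.sleep(interval)
    with open(path) as fh:
        after = fh.readline()
    return iowait_percent(before, after)
```

Calling sample_iowait(5) during a load peak gives a number comparable to the "wa" column in top. A high value points at disk rather than PHP code; a value near zero with 100% CPU points back at the code.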

whytefyre · Jan 20, 2016

I'm not a WordPress expert but ran into a similar problem. It may have to do with the permalink structure and the current .htaccess / nginx rules. By the way, I've found OPcache (bundled with PHP since 5.5, so it's available in 7.x) to be a lot better than other caching plugins.

SiteDart Admin · Jan 28, 2016

Unless you have time to go through all of the code and look for a coding issue manually, one of the best ways to spot troublesome code in a specific WordPress site is to run it under a profiler.
When we run into these problems, we copy the site to a vanilla web server that has whatever PHP version we need on it, plus a PHP extension called Xdebug.
Xdebug lets us dump performance data on function calls into a chart. We usually find that it's either file I/O, a blocking call like curl, or that our PHP needs to be rebalanced on the domain (servers, child processes, etc.).
We typically only rebalance FPM as a last resort. If your node is dedicated, though, you should have less concern, unless you plan on ever migrating the domain into a shared Plesk environment.
In the meantime, I would suggest looking into some basic optimizations:
Cloudflare for static content - caveats here around how SSL is handled, and specific JS may be affected if you use Rocket Loader.
MariaDB - not many caveats now that it's supported by Plesk, but you may have a collation problem if you migrate away from MariaDB down the road.
nginx - if you're not already using it, make it do as much as possible.
WordPress caching plugins - W3 Total Cache is highly recommended, and can use a memcached pool if you have one.
Memcached - if you have that much RAM, carve some out for a local memcached pool. WP will not have to rebuild quite a few things if you give it this via W3TC. Create 1 pool (slab) for each domain you need cache for.
iptables/fail2ban - sounds strange, but make sure you have set up a WP jail. Culling bad requests to make room for the good ones is a good use of time.
Wordfence - this plugin seems good, but be careful with it on large sites, especially the file scan. The file scan runs every 24h or so from the time of install (you can't schedule it unless you have the paid version), and it checksums every file in the webroot! We've been bitten when we got a large influx of traffic during one of these scan windows.
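
To judge whether a WP jail is worth the effort, it helps to count how many requests are hitting the usual brute-force targets. A minimal Python sketch, assuming the access log is in the common/combined log format and that /wp-login.php and /xmlrpc.php are the endpoints of interest (both are assumptions, as is the function name count_suspect_hits):

```python
import re
from collections import Counter

# Start of a common/combined log format line: client IP, ident, user,
# [timestamp], then the quoted request line (method and path).
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(\S+) (\S+)')

# Endpoints assumed to be typical brute-force targets on a WordPress site.
TARGETS = ("/wp-login.php", "/xmlrpc.php")


def count_suspect_hits(lines, method="POST"):
    """Count requests per client IP that use `method` on one of TARGETS."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # not a log line in the assumed format
        ip, verb, path = match.groups()
        # Ignore any query string when comparing the path.
        if verb == method and path.split("?", 1)[0] in TARGETS:
            hits[ip] += 1
    return hits
```

Feeding it an open log file, for example count_suspect_hits(open(path_to_access_log)).most_common(10), lists the ten busiest offenders. If a handful of addresses account for thousands of POSTs, a jail will free up a noticeable share of PHP workers.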

If you have already done all of those things, my best advice would be to profile it and look for slow or bad code. You will usually find 1 or 2 poorly performing plugins. We usually replace them with alternatives. If that isn't possible (sometimes it isn't because of the client), then we start considering tweaking PHP-FPM settings. FPM settings are too deep for me to post here, but a word of advice: tweak slowly and measure the results. Changing server counts, child processes or request limits can impact the server in negative ways. You will also find that settings for one domain are not optimal for another, so don't dial it in and apply it everywhere.
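
As a starting point for the "tweak slowly and measure" approach, a common rule of thumb caps pm.max_children at the RAM left over for PHP divided by the average size of one FPM child. The Python sketch below only does that arithmetic. The function name and the example figures (8 GB reserved for the OS and database, 64 MB per child) are assumptions for illustration, not measurements from this server.

```python
def estimate_max_children(total_ram_mb, reserved_mb, avg_child_mb):
    """Rule-of-thumb upper bound for pm.max_children based on memory only.

    total_ram_mb: RAM installed in the server
    reserved_mb:  RAM kept aside for the OS, database, caches, etc.
    avg_child_mb: measured average resident size of one PHP-FPM child
    """
    if avg_child_mb <= 0:
        raise ValueError("avg_child_mb must be positive")
    usable = total_ram_mb - reserved_mb
    if usable <= 0:
        return 0
    return int(usable // avg_child_mb)


# Hypothetical numbers: 32 GB server, 8 GB reserved, 64 MB per child.
print(estimate_max_children(32 * 1024, 8 * 1024, 64))
```

This is a memory ceiling, not a target. On a CPU-bound site like the one described here, far fewer workers than the memory bound allows may already saturate 12 vCPUs, so measure the average child size on the real server and raise the value in small steps.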

Have you had any luck finding the issue?

trialotto · Jan 29, 2016

Ehm, you are running a custom PHP setup.

If you ask me, you can (or should) simply kill the process by switching to FPM, and then either keep using FPM (on Apache) or return to FastCGI.

Anyway, FastCGI has no advantage over FPM; one should prefer FPM.

The interesting part is in the "test of returning to FastCGI": you should continue using FPM, but for the sake of curiosity, return to FastCGI to do a small and relevant test.

It does indeed seem that your PHP is looping OR your site is executing something bad: if a return to FastCGI again results in heavy resource (over)usage, then your site is very likely executing code that is not standard Plesk code.

That can imply a hack, bad coding on your side, or even a bad migration (the latter is the most likely cause; let's hope it is not a hack).

Furthermore, if you are on FPM and there still is a FastCGI process, well, then you have some really malicious code: in this case, I would strongly advise killing the relevant processes.

Please try the above "switch" and report some data (output of log files and a screenshot of a simple ps aux | grep php).

Regards....
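
If a screenshot is awkward, the same ps aux | grep php data can be collected as text. A small Python sketch, assuming the usual procps column layout of ps aux (USER, PID, %CPU, ..., COMMAND); the function names are made up for illustration:

```python
import subprocess


def summarize_php(ps_output):
    """Return (cpu, pid, user, command) tuples for PHP processes, busiest first."""
    rows = []
    for line in ps_output.splitlines():
        # ps aux has 11 columns; the last one (COMMAND) may contain spaces.
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header, blank or malformed line
        command = fields[10].strip()
        if "php" not in command:
            continue
        try:
            cpu = float(fields[2])
        except ValueError:
            continue
        rows.append((cpu, int(fields[1]), fields[0], command))
    rows.sort(reverse=True)
    return rows


def current_php_processes():
    """Run `ps aux` on this machine and summarize its PHP processes."""
    out = subprocess.run(["ps", "aux"], capture_output=True, text=True, check=True)
    return summarize_php(out.stdout)
```

Printing the result of current_php_processes() before and after the switch makes it easy to see whether a FastCGI (php-cgi) process is still running while the domain is set to FPM.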
