Resolved Backup job caused high CPU

raytracy

Basic Pleskian
OS CloudLinux 7.2 (Valeri Kubasov)
Plesk 12.5.30 Update #43
CageFS is installed.
LVE is turned on for every site.

My system has a heavy load issue while the backup job is running.
The backup job had been running well since the system went into production months ago, until now.

This is the CPU usage collected from the Plesk health check.
Notice that Interrupt (or Wait-IO?) goes very high when I run the backup job:
upload_2016-8-25_23-36-52.png

But I recorded only a little I/O activity at the host OS level:
upload_2016-8-25_23-40-41.png

I tried monitoring the tar process with iotop and found that it used very little I/O:
upload_2016-8-25_23-42-33.png

But the system load is extremely high and none of the sites respond:
upload_2016-8-25_23-43-57.png

LVE info shows much higher mIO than normal while the backup is running:
upload_2016-8-25_23-55-47.png
This is the normal LVE activity when no backup job is running:
upload_2016-8-25_23-57-10.png

The system load goes back to normal if I kill the tar process.

I have no idea how to troubleshoot the high CPU issue.

The only thing that is unclear to me is the pigz process, which should be reading the tar stdout stream. I cannot find the pigz process to monitor its activity while tar is running.

This issue is not related to the backup data size and happens for every site.

Any hints or advice are welcome.
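
One way to tell interrupt time from I/O wait, as asked above, is to compare two samples of the aggregate "cpu" line in /proc/stat. The following is only a minimal sketch: the sample lines are made up for illustration and are not taken from the poster's server, and the helper names are hypothetical.

```python
# Column order of the aggregate "cpu" line in /proc/stat (Linux).
COUNTERS = ["user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal"]


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into named tick counters."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return dict(zip(COUNTERS, map(int, fields[1:1 + len(COUNTERS)])))


def cpu_shares(before, after):
    """Return each counter's share (percent) of the time between two samples."""
    deltas = {name: after[name] - before[name] for name in before}
    total = sum(deltas.values())
    if total == 0:
        return {name: 0.0 for name in deltas}
    return {name: 100.0 * value / total for name, value in deltas.items()}


def read_cpu_line(path="/proc/stat"):
    """Read the live aggregate line (Linux only); the first line is 'cpu ...'."""
    with open(path) as handle:
        return handle.readline()


# Made-up samples for illustration only.
SAMPLE_BEFORE = "cpu 1000 0 500 8000 400 50 50 0 0 0"
SAMPLE_AFTER = "cpu 1100 0 600 8200 980 60 60 0 0 0"

shares = cpu_shares(parse_cpu_line(SAMPLE_BEFORE), parse_cpu_line(SAMPLE_AFTER))
print({name: round(value, 1) for name, value in shares.items()})
```

On a live system, two calls to read_cpu_line() a few seconds apart while the backup runs would replace the samples. A large iowait share with a small irq share would point at storage rather than interrupts.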
 
Some thoughts on this:
It is normal that backups create a high system load. I am unable to determine whether the load in your specific case is extremely high for the machine you are using. We are seeing loads of up to 7 on a 12-core machine during backups, but normally the load is around 1.5 to 2.5 on that machine.

A 4.7% CPU load by tar is "low". It can go up to 70%, which would be considered "high".

pigz is a parallel gzip implementation; it uses multiple CPU cores to speed up compression.

The larger the backups get, the longer the CPU load stays high.
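
To compare load figures across machines, it can help to divide the load average by the number of cores. This is a rough sketch using the numbers from this post; the function names are hypothetical, and a value below 1.0 only suggests that the run queue is not longer than the core count.

```python
import os


def load_per_core(load_average, cores):
    """Normalize a load average by the number of CPU cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return load_average / cores


def current_load_per_core():
    """1-minute load per core on this machine (Unix only, not called here)."""
    return load_per_core(os.getloadavg()[0], os.cpu_count() or 1)


# Figures quoted above for a 12-core machine during backups.
for label, load in [("peak", 7.0), ("typical low", 1.5), ("typical high", 2.5)]:
    print(f"{label}: {load_per_core(load, 12):.3f} per core")
```

By this measure even the peak of 7 on 12 cores stays below one runnable task per core.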

Suggestions:

a)
Control panel > Tools & Settings > Backup Manager > Backup Settings:
Check the "Run scheduled backup processes with low priority" or "Run all backup processes with low priority" checkboxes to reduce the CPU load.

b)
Run incremental backups instead of full backups

c)
Check your hard disks for failures. We've seen issues with high load and long runtime of the backup when the hard disk/file system contains errors.
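
For suggestion a), this is roughly what "low priority" tends to mean at the OS level on Linux. The sketch assumes the standard nice and ionice utilities; it is not a description of how Plesk implements its checkboxes, and the tar command and paths are hypothetical.

```python
import shlex


def with_low_priority(command, niceness=19):
    """Prefix a command with nice (CPU priority) and ionice idle class (I/O priority)."""
    if not 0 <= niceness <= 19:
        raise ValueError("niceness must be between 0 and 19")
    # ionice class 3 is the "idle" class; whether it has an effect
    # depends on the I/O scheduler in use.
    return ["nice", "-n", str(niceness), "ionice", "-c", "3", *command]


# Hypothetical archive command and paths, for illustration only.
example = with_low_priority(
    ["tar", "-cf", "/tmp/example-backup.tar", "/var/www/vhosts/example.com"]
)
print(shlex.join(example))
```

The function only builds the argument list; passing it to subprocess.run would start the command at the lowest CPU priority.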
 
I am now having the same issue, but in a different context. My backup runs to an NFS-mounted device. CentOS 7, Plesk 17.

While the backup is running, the CPU load gets very high at times (load average of 20 in htop, with 8 cores).

When this happens, websites go down.

htop shows:

root 14535 36.9 0.0 617724 7024 ? SNl 03:32 95:02 /usr/bin/pigz

It seems that the pigz processes are causing this.

I use incremental backups with a full backup once a week.

The full backup takes 16 hours to finish and is 460 GB in size.

I have a question about that. In psa.conf I have set the DUMP_D variable to my remote NFS device and DUMP_TMP_D to /usr/local/psa/PMM/tmp.
Are files zipped directly to my NFS device, or are they zipped on my local TMP device and then moved?

Thanks for help!
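
The process line quoted above can be read field by field. The sketch below assumes the line follows the `ps aux` column order (it looks like ps output rather than htop's own layout) and decodes the STAT flags as documented in the ps man page; the helper names are hypothetical.

```python
PS_LINE = "root 14535 36.9 0.0 617724 7024 ? SNl 03:32 95:02 /usr/bin/pigz"

# Subset of the process state codes documented in the ps man page.
STAT_MEANINGS = {
    "D": "uninterruptible sleep (usually I/O)",
    "R": "running or runnable",
    "S": "interruptible sleep",
    "N": "low priority (niced)",
    "l": "multi-threaded",
}


def parse_ps_line(line):
    """Split one `ps aux` style line into named fields (assumed column order)."""
    user, pid, cpu, mem, vsz, rss, tty, stat, start, cpu_time, command = line.split(None, 10)
    return {
        "user": user,
        "pid": int(pid),
        "cpu_percent": float(cpu),
        "stat": stat,
        "cpu_time": cpu_time,
        "command": command,
    }


def describe_stat(stat):
    """Translate the known STAT flags into words, skipping unknown ones."""
    return [STAT_MEANINGS[flag] for flag in stat if flag in STAT_MEANINGS]


info = parse_ps_line(PS_LINE)
print(info["pid"], info["cpu_percent"], describe_stat(info["stat"]))
```

If that reading is right, the N flag suggests pigz is already running niced, and the l flag shows it is multi-threaded, which fits a parallel compressor.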
 
Hi Peter
I am having a similar problem and suspect my disk partition might be the main cause (your point c).
Is there any way to test the hard disk for errors? All the articles I have read suggest that I have to unmount the disk to check it, which is a bit impractical on a production server.
 