
High CPU Wait I/O and Load Average during backup

Will-NYESDigital

Regular Pleskian
PRODUCT, VERSION, OPERATING SYSTEM, ARCHITECTURE

Parallels Plesk Panel 11.5.30 MU #44 CentOS 6.5 (Final) 64bit

PROBLEM DESCRIPTION

Two of our servers are suddenly experiencing high Wait I/O times and high load averages during the backup process. During this period Plesk grinds to a halt, sometimes crashing out completely (although SSH is still possible). We have been in talks with our server suppliers (assuming this would be node related); however, they have done a lot of testing and categorically state that the node is fine, with no other users affecting it.

STEPS TO REPRODUCE

We back up the server using the scheduled backup service, and Wait I/O immediately goes up.
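To put numbers on "Wait I/O immediately goes up", one option is to sample the iowait counter in /proc/stat while the scheduled backup runs. The following is a minimal sketch, not something from the original report: the function names, the 5-second interval, and the sample count are all illustrative assumptions.

```python
import time


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of counters."""
    parts = line.split()
    if parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in parts[1:]]


def iowait_percent(before, after):
    """Share of CPU time spent in iowait between two samples, in percent."""
    # Field order: user nice system idle iowait irq softirq steal [guest ...].
    # Guest time is already counted in user, so only the first 8 are summed.
    deltas = [b - a for a, b in zip(before[:8], after[:8])]
    total = sum(deltas)
    return 100.0 * deltas[4] / total if total else 0.0


def read_cpu_counters(path="/proc/stat"):
    """Read the current aggregate CPU counters (first line of /proc/stat)."""
    with open(path) as stat_file:
        return parse_cpu_line(stat_file.readline())


def watch(interval=5.0, samples=12):
    """Print iowait once per interval; intended to run during the backup."""
    previous = read_cpu_counters()
    for _ in range(samples):
        time.sleep(interval)
        current = read_cpu_counters()
        stamp = time.strftime("%H:%M:%S")
        print("%s iowait %.1f%%" % (stamp, iowait_percent(previous, current)))
        previous = current
```

Calling `watch()` shortly before the backup starts and leaving it running gives a timestamped record that can be compared against the backup schedule.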

ACTUAL RESULT

Plesk downtime / Website downtime

EXPECTED RESULT

No downtime, successful backup

ANY ADDITIONAL INFORMATION

Some other info:
All other processes (MySQL, Apache, nginx, etc.) are running at between 1% and 10%.

Partition "/usr" utilization 4.2% used (1.81 GB of 43.3 GB)
Partition "/var" utilization 50.6% used (61.8 GB of 122 GB)

We are struggling to identify what has changed on the server that would cause this sudden change. Any help is greatly appreciated.
 
Hi Igor,

Thanks for the response. I had read that post earlier, and we have already tackled our ISP, but they have tested and confirmed that the node is fine and that nobody is overusing it in a way that would affect us.

I did the write test and got the following results (which support the node being OK, I would guess):

Working Server

16384+0 records in
16384+0 records out
1073741824 bytes (1.1 GB) copied, 15.8477 s, 67.8 MB/s

Problem Server

16384+0 records in
16384+0 records out
1073741824 bytes (1.1 GB) copied, 17.8032 s, 60.3 MB/s
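The two results above can be compared directly from the dd summary lines. The short sketch below only re-derives figures already present in the output (block size, throughput, relative difference); the helper names are illustrative and not part of the original test.

```python
import re

# Summary lines copied from the dd output above.
WORKING = "1073741824 bytes (1.1 GB) copied, 15.8477 s, 67.8 MB/s"
PROBLEM = "1073741824 bytes (1.1 GB) copied, 17.8032 s, 60.3 MB/s"


def parse_dd_summary(line):
    """Return (bytes_copied, seconds) from a GNU dd summary line."""
    match = re.search(r"(\d+) bytes .*copied, ([\d.]+) s", line)
    if match is None:
        raise ValueError("not a dd summary line: %r" % line)
    return int(match.group(1)), float(match.group(2))


def throughput_mb_s(line):
    """Throughput in decimal megabytes per second, the unit dd reports."""
    nbytes, seconds = parse_dd_summary(line)
    return nbytes / seconds / 1e6


working = throughput_mb_s(WORKING)
problem = throughput_mb_s(PROBLEM)
slowdown = (working - problem) / working * 100

# 16384 records of equal size make up the 1073741824 bytes written.
block_size = 1073741824 // 16384

print("block size: %d bytes" % block_size)
print("working: %.1f MB/s, problem: %.1f MB/s" % (working, problem))
print("problem server is %.0f%% slower" % slowdown)
```

The difference works out to roughly 11%, which is small for a sequential write test and fits the reading that raw disk throughput on the node is not the obvious culprit.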

Do you have any other guidance that would help us investigate this issue? It only happens during the Plesk backup.

Cheers
 
Sorry to bump so soon.

We have had 2 servers go offline again due to high I/O during the Plesk backup phase. We have had confirmation that these servers are on different nodes, and we are now really struggling to get to grips with the issue.

Looking at the top command, we can see the following processes running quite high:

pigz (which is apparently the zip process for the backup)

and

flush-253:1 (which is not using much CPU but has been running for 15 hours, since the beginning of the backup)
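On kernels of this generation, a thread named flush-MAJOR:MINOR is the kernel writeback thread for the block device with those numbers, so flush-253:1 points at device 253:1. A minimal sketch for finding out which device that is, assuming sysfs is mounted at /sys (the function names are illustrative):

```python
import os
import re


def parse_flusher_name(name):
    """Split a thread name such as 'flush-253:1' into (major, minor)."""
    match = re.match(r"^flush-(\d+):(\d+)$", name)
    if match is None:
        raise ValueError("not a flusher thread name: %r" % name)
    return int(match.group(1)), int(match.group(2))


def device_name(major, minor, sys_root="/sys"):
    """Resolve major:minor to a kernel device name, or None if unknown."""
    link = os.path.join(sys_root, "dev", "block", "%d:%d" % (major, minor))
    if not os.path.exists(link):
        return None
    # The entry is a symlink into the device tree; its target's last
    # path component is the kernel name of the device (for example dm-1).
    return os.path.basename(os.path.realpath(link))


major, minor = parse_flusher_name("flush-253:1")
print("flush-253:1 -> %s" % device_name(major, minor))
```

Major number 253 is often a device-mapper (LVM) volume, but that is an assumption about this server; the lookup shows the actual device, which can then be matched to the partition the backup writes to.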
 