Plesk backup NIGHTMARE !

FirstPoint · Feb 3, 2015

Hello,

Since we migrated to our new server last Sunday, we have had a serious issue.

Our Plesk backups are configured to run every night at 3 am. The backup procedure seems to go fine, but at about 5 am the server load gets very high and httpd, mysqld, etc. stop responding. We then need to connect over SSH and kill all psa-pmm, mysql_dump, plesk_agent_manager, httpd, and mysqld processes before starting the services again. The server then works fine for a few hours, and at 3 pm the same problem starts again: very high CPU usage, and the services stop working.

When the server stops answering on httpd, the top command shows many perl processes, started by apache, with the command "sh -c", using over 100% of the CPU. When we stop pmm-ras and plesk_agent_manager and restart apache, everything works fine again.

This is a nightmare for us, as we already had the same problem on our old server. Today we disabled the automated backups, as we are getting plenty of complaints from customers.

I don't understand why the backup uses so many resources. I see many posts on the forums about this, going back several years... It seems Parallels doesn't want to finally build a working backup environment?

Please tell us what we should do. Our investigations didn't get us anywhere. I think everything goes wrong during the transfer of the backup file to the remote FTP. Here are a few details from the logs. I am showing the latest backup task.

psadump.log (last updated at 15:29:50):

== STDERR ====================
1+0 records in
1+0 records out
31457280 bytes (31 MB) copied, 0.0468695 s, 671 MB/s
Use of uninitialized value in length at /usr/local/psa/admin/bin/plesk_agent_manager line 827.

==============================

Last lines of migration.log (last updated at 10:41:07):

[2015-02-03 10:19:04.319|16476] INFO: ENV[LANG]=en_US.UTF-8
[2015-02-03 10:19:04.319|16476] INFO: Executing utility: /bin/sh -c -e /bin/tar\ --create\ --file\ /usr/local/psa/PMM/tmp/backupeFDiju/backup_1502030852.tar\ --directory\ /tmp/repo_transport_tmp_UgyWh0/\ --dereference\ --files-from\ /tmp/repo_transport_tmp_ToJjVN
[2015-02-03 10:34:09.070|16476] INFO: The utility executed with the return code 0
[2015-02-03 10:34:09.070|16476] INFO: The return code was
No output on STDOUT were performed by the utility
No output on STDERR were performed by the utility
[2015-02-03 10:34:09.070|16476] INFO: ENV[LANG]=en_US.UTF-8
[2015-02-03 10:34:09.070|16476] INFO: Executing utility: /usr/local/psa/admin/sbin/backup_sign --sign-dump /usr/local/psa/PMM/tmp/backupeFDiju/backup_1502030852.tar --session-path=/usr/local/psa/PMM/sessions/2015-02-03-085208.117
[2015-02-03 10:34:09.134|20639] INFO: backup_sign started : /usr/local/psa/admin/sbin/backup_sign --sign-dump /usr/local/psa/PMM/tmp/backupeFDiju/backup_1502030852.tar --session-path=/usr/local/psa/PMM/sessions/2015-02-03-085208.117
[2015-02-03 10:41:07.515|20639] INFO: backup_sign finished. Exit code: 0
[2015-02-03 10:41:07.551|16476] INFO: The utility executed with the return code 0
[2015-02-03 10:41:07.551|16476] INFO: Ftp init url ftps://XXXXXXXXXXX//backup_1502030852.tar

The server got to the high load at 15:15 - 15:30 approximately.
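
To see where the time goes in an excerpt like the one above, the timestamps can be diffed. This is only a rough Python sketch: it assumes the `[YYYY-MM-DD HH:MM:SS.mmm|pid]` prefix seen in the migration.log lines quoted here, and it pairs each "Executing utility" line with the next "The utility executed with the return code" line from the same pid. The function name `utility_durations` is made up for illustration.

```python
import re
from datetime import datetime

# Prefix as seen in the excerpt: [2015-02-03 10:19:04.319|16476] INFO: message
LINE_RE = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\|(\d+)\] INFO: (.*)$"
)
START = "Executing utility: "
END = "The utility executed with the return code"

def utility_durations(lines):
    """Return a list of (command, seconds) for each utility run in the log lines."""
    started = {}   # pid -> (start time, command)
    results = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # lines without a timestamp prefix are skipped
        ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
        pid, msg = m.group(2), m.group(3)
        if msg.startswith(START):
            started[pid] = (ts, msg[len(START):])
        elif msg.startswith(END) and pid in started:
            t0, cmd = started.pop(pid)
            results.append((cmd, (ts - t0).total_seconds()))
    return results
```

Applied to the lines quoted above, this gives roughly 15 minutes for the tar step and roughly 7 minutes for backup_sign. The log excerpt ends at the "Ftp init url" line, so it says nothing about how long the upload itself ran.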

Looking forward to your help! Thanks!

FirstPoint · Feb 3, 2015

I forgot to mention that this is the second task of the day. It's a new task, since we stopped the one from 4 am ourselves because the server wasn't responding anymore...

Sergey L · Feb 3, 2015

Yes, the backup task can be quite time- and resource-consuming when there is a lot of content to back up. This is why we are working on Incremental Backup (to back up only recent changes), and we have already made a number of smaller optimizations.

The remote FTP can be an issue: low performance of the remote storage will result in slow traffic and high load on your server. For example, some people previously discovered that they had bandwidth limits on their storage, and that was the issue. I would recommend trying local storage temporarily.
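
One rough way to check whether the remote storage is the bottleneck is to time a test upload from the server. The sketch below uses Python's standard `ftplib` over explicit FTPS. The host, credentials, and file names are placeholders, and it assumes the storage accepts explicit TLS (the log shows an ftps:// URL, but not which mode is in use), so treat it as a starting point rather than a ready-made tool.

```python
import os
import time
from ftplib import FTP_TLS

def throughput_mb_s(num_bytes, seconds):
    """Average transfer rate in MB/s (1 MB = 10**6 bytes)."""
    return num_bytes / seconds / 1e6

def timed_upload(host, user, password, local_path, remote_name):
    """Upload one file over explicit FTPS and return the average MB/s."""
    size = os.path.getsize(local_path)
    ftp = FTP_TLS(host, timeout=60)
    try:
        ftp.login(user, password)
        ftp.prot_p()  # encrypt the data channel as well as the control channel
        start = time.monotonic()
        with open(local_path, "rb") as f:
            ftp.storbinary(f"STOR {remote_name}", f, blocksize=1 << 16)
        elapsed = time.monotonic() - start
    finally:
        ftp.close()
    return throughput_mb_s(size, elapsed)

# Example call with placeholder values (not taken from this thread):
# print(timed_upload("ftp.example.com", "user", "secret",
#                    "/tmp/testfile.bin", "speedtest.bin"))
```

If the rate is far below what the network link should deliver, a bandwidth limit on the storage side becomes a plausible suspect.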

If you go to Tools & Settings > Backup Settings, you can see some options there that may help reduce load:
- Run tasks at lower priority (more time, but less load)
- Reduce the number of simultaneously running tasks (more time, but less load)
- Disable archiving (more space consumed, but faster; however, it may make things worse if FTP is the issue)
- Disable the check for sufficient free disk space on your server (risky if disk space is low, but faster)
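
For context, "lower priority" on Linux usually means a higher nice value for CPU plus a low or idle I/O class. The Python sketch below shows that general idea for an arbitrary command. It is not how Plesk implements the option internally, the helper names are only illustrative, and it assumes the `nice` and `ionice` utilities are installed.

```python
import subprocess

def low_priority_cmd(cmd, niceness=19, io_class=3):
    """Wrap a command so it runs with low CPU and I/O priority.

    niceness 19 is the lowest CPU priority; ionice class 3 is "idle".
    """
    return ["nice", "-n", str(niceness), "ionice", "-c", str(io_class)] + list(cmd)

def run_low_priority(cmd):
    """Run the wrapped command and return its exit code."""
    return subprocess.run(low_priority_cmd(cmd)).returncode
```

For example, `low_priority_cmd(["tar", "--create", "--file", "out.tar", "somedir"])` returns the same tar command prefixed with `nice -n 19 ionice -c 3`.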

Did you have remote storage on your old server? Are all the servers in the same DC?

Regards
