
Issue: FTP(S) Backups slow as molasses

carlheinz

New Pleskian
Server operating system version
Debian 10.12 x86_64
Plesk version and microupdate number
Plesk Obsidian 18.0.43.1
Hi!

Currently I have an issue on only one instance of my Plesk infrastructure.

I've even gone to the extremes of rebuilding the whole setup, Plesk + FTP servers.
Same issues after rebuilds.

Backups to a remote FTP(S) store take forever to complete.
A full server backup of 61.9 GB takes 12 hours.
Single-volume and multivolume backups produce the same results.

Server Information:
plesk -v
Product version: Plesk Obsidian 18.0.43.1
OS version: Debian 10.12 x86_64
Build date: 2022/04/14 18:00
Revision: 1a6b26fb2fd0ac923f3ca10bdfd13b721cb5c676

Memory Usage (with backup/transfer):
free -m
              total        used        free      shared  buff/cache   available
Mem:           8192        1079        6575         107         536        7112
Swap:             0           0           0

CPUs + Load:
cat /proc/cpuinfo | grep processor | wc -l
8

09:56:03 up 1 day, 10 min, 3 users, load average: 3.04, 4.85, 5.06

Configs:
Firewalls disabled on remote FTP/SFTP and Plesk instance.
Currently testing on same private LAN.

Backup Manager -> Remote Storage Settings -> FTP
Use Passive Mode (Y)
Use FTPS (Y)
Backup Settings (all defaults)

Only a single entry of scheduled backups in the DB:
MariaDB [psa]> select count(*) from BackupsScheduled;
+----------+
| count(*) |
+----------+
|        1 |
+----------+
1 row in set (0.000 sec)

Some manual CLI tests for sanity (disclaimer: they all perform as expected):

FTP
ftp> put backup_domainmail_2205181657.tzst
local: backup_domainmail_2205181657.tzst remote: backup_domainmail_2205181657.tzst
200 PORT command successful
150 Connecting to port 60227
226-File successfully transferred
226 287.199 seconds (measured here), 20.97 Mbytes per second
6316500889 bytes sent in 287.20 secs (20.9746 MB/s)

SFTP
sftp> put backup_domainmail_2205181657.tzst
Uploading backup_domainmail_2205181657.tzst to /data/backup_domainmail_2205181657.tzst
backup_domainmail_2205181657.tzst 100% 6024MB 73.9MB/s 01:21

RSYNC
backup_domainmail_2205181657.tzst
6,316,500,889 100% 84.27MB/s 0:01:11 (xfr#1, to-chk=0/1)

10 Minute transfer speed sample from Plesk server while backup is running
Sampling eth0 (600 seconds average)..
376333 packets sampled in 600 seconds
Traffic average for eth0

rx 488.95 kbit/s 583 packets/s
tx 13.25 Mbit/s 43 packets/s

So the maximum throughput achieved with the backup transfer built into Plesk is 13.25 Mbit/s, i.e. about 1.66 MB/s.
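
As a rough sanity check (back-of-the-envelope only, assuming the 13.25 Mbit/s average held for the whole run), that rate alone already explains a runtime in this ballpark:

# awk 'BEGIN { rate = 13.25 / 8; printf "%.2f MB/s -> %.1f hours for 61.9 GB\n", rate, (61.9 * 1024) / rate / 3600 }'
1.66 MB/s -> 10.6 hours for 61.9 GB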

The server also has ample disk space in /var/lib/psa/dumps/ for creating the dump before transfer.
The server's storage medium is a set of NVMe drives.

I've tried to be as extensive as possible in testing.
However, I'm now slowly approaching the "at a loss" stage.

Any help, feedback or test ideas would be appreciated.

Cheers,
Carl
 
Have you tried the free Backup Telemetrics extension? It can show you where the hold-up is.
 
Hi Peter!

Indeed, unfortunately it indicates that the upload is where the issue resides, but my tests suggest that this should be a non-factor, and yet it is.
I've attached a Telemetry example; note, however, that sw-tar --create is included in the upload process.

You can see this from the process breakdown.

Cheers,
Carl
 

Attachments

  • Screenshot 2022-05-19 at 16.08.00.png (173.6 KB)
  • Screenshot 2022-05-19 at 16.13.59.png (115.2 KB)
Would that not rather be a slow tar process? If tar is slow, it can either be due to a lack of CPU power, or it can mean that it has to pack a very large number of files. Sometimes users accumulate hundreds of thousands of temporary "session" files in their subscription directories. This can lead to excessive tar run times. Maybe you can check your /var/www/vhosts directories (recursively with their subdirectories) for the number of files. Example:

# find . -type d ! -path "./logs" ! -path "./logs/*" ! -path "./usr" ! -path "./usr/*" ! -path "./etc" ! -path "./etc/*" ! -path "./error_docs" ! -path "./error_docs/*" -print0 | while read -d '' -r dir; do files=("$dir"/*); printf "%5d files in directory %s\n" "${#files[@]}" "$dir"; done | sort -n
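
For a per-subscription total rather than a per-directory count, something along these lines should also work (just a rough sketch, run from inside /var/www/vhosts):

# for d in */; do printf "%8d files in %s\n" "$(find "$d" -type f | wc -l)" "$d"; done | sort -n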
 
Thank you,

I had previously run the above command, which showed that the user sites in /var/www/vhosts were well within the norm.
I also ran the same test against /var/qmail/mailnames, which produced the same results.

Then I decided to dig a little deeper.

The server that backs up really fast with sw-tar + remote-ftp:
# rsync --stats --dry-run -ax /var/www/vhosts /tmp
Number of regular files transferred: 2,812,498

# rsync --stats --dry-run -ax /var/qmail/mailnames /tmp
Number of regular files transferred: 1,224,346

The server that backs up really slowly with sw-tar + remote-ftp:
# rsync --stats --dry-run -ax /var/www/vhosts /tmp
Number of regular files transferred: 1,187,110

# rsync --stats --dry-run -ax /var/qmail/mailnames /tmp
Number of regular files transferred: 1,448,217

As we can see, the server with fewer files is the slow one...

I've attached a telemetry example of one user's backup (which forms part of the whole server backup), together with that user's file count below.
# rsync --stats --dry-run -ax /var/qmail/mailnames/<domain_example>/ /tmp
Number of regular files transferred: 35,830

Run a command (sw-tar --create) -> 29m 55s
Upload a file to remote storage (backup_domainmail_2205191213.tzst) -> 7h 6m 51s
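
As a next isolation step, I may try pushing the same archive over explicit FTPS in passive mode with curl (sketch only; the IP, path and credentials below are placeholders, and curl is of course not the exact client Plesk uses internally):

# curl -T backup_domainmail_2205191213.tzst ftp://192.168.1.10/data/ \
       --ssl-reqd --user 'user:pass' -w 'average upload speed: %{speed_upload} bytes/s\n'

If curl also crawls at around 1.6 MB/s, the problem lies in the FTPS data channel itself; if curl is fast, it points back at the Plesk transfer component.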

Unfortunately, still no resolution to the problem.

Thanks,
Carl
 

Attachments

  • Screenshot 2022-05-20 at 07.41.56.png (115.4 KB)

I like that one command :)

However, the issue is very likely to be a (simple) combination of factors, often not the most obvious ones.

First of all, there is the hint provided by the file extension : tzst ..... a tar archive compressed with zstd, resulting in the backup file.

Second, there is the compressor zstd ..... which, just like pigz (its predecessor), is presented as a savior, since - in theory - zstd should be better.

Third, there is the timing of backups ..... given the duration of any backup, several backups running at the same time will result in issues.

Fourth, there is the duration of backups ..... given the start time of any backup, the duration of a specific backup can cause problems for other backups.

Fifth, there is the choice between "incremental" and "full" backups ..... depending on the situation in question, the one outperforms the other.


Sure, all of the above are (mostly known) OBVIOUS factors determining the relevance of LESS OBVIOUS factors, such as CPU usage.

However, it is a two-edged sword : all these (obvious and/or less obvious) factors affect each other and together determine backup efficiency.


To my surprise, it is mostly forgotten that it is not only about the aforementioned factors.

If one looks carefully at the output of the command "cat /proc/mdstat" when a specific backup is slow, then one often sees RAID activity.
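
A quick way to catch that (a sketch, assuming Linux software RAID via mdadm; device names differ per server) is to look at /proc/mdstat while a slow backup is running:

# cat /proc/mdstat
# grep -E 'resync|recovery|check' /proc/mdstat && echo "RAID maintenance in progress"

Any resync, recovery or check line means the array is busy competing with the backup for I/O.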


In my humble opinion, the whole backup process is a bit odd : all that performance spent on tar, zstd and then transferring files ..... when one can rsync directly!
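
For illustration only (this is not what Plesk does, the host and paths are placeholders, and it skips databases and Plesk configuration, which the native backup does include), such a direct sync could be as simple as:

# rsync -aHx --delete /var/www/vhosts/ backupuser@backuphost:/backups/vhosts/
# rsync -aHx --delete /var/qmail/mailnames/ backupuser@backuphost:/backups/mailnames/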

In my experience, in most cases of slow backups, all of the aforementioned factors coincide and make the backup process slow ..... and, if present, any RAID activity makes performance even worse.


I must admit that Plesk native backups are convenient and interesting, but there are serious disadvantages.

I must also admit that the beautiful simplicity of scripted rsyncs of (changed) files is not that straightforward in practice, so we will have to make do with the Plesk backups.


Anyway, the simple conclusion should be that the Plesk native backups are not (and never will be) really efficient, but they are (and always will be) effective!


Kind regards.....
 

@carlheinz,

Just have a look at the timings of the backups first : make sure that there is sufficient time between them - make it 15 minutes or more! (A quick way to check for overlapping runs is sketched below.)

If changing the timings does not work, then have a look at the size of each backup : add the backup with the largest backup size at the end of the list of scheduled backups!

If changing the order of backups does not work, then have a look at the "type" of backup : choose "full" backup (read: never choose incremental, since any incremental backup will become ridiculously slow or perform badly, certainly when it concerns a considerable number of files).

If changing the backup type does not work, then have a look at RAID activity : you cannot do much about bad performance of the backup compressor (pigz or zstd), but you can do something about simultaneous occurrence of RAID activity and backups - simply schedule the backups at a later time!
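
To check the first point - whether scheduled backup runs actually overlap - one can simply look for multiple backup processes with a long elapsed time (just a sketch; pmm-ras is, as far as I know, Plesk's backup transport utility, and sw-tar is the archiver shown in the telemetry above):

# ps -eo pid,etime,args | grep -E '[p]mm-ras|[s]w-tar'

Seeing more than one sw-tar, or a transfer process that has been running for hours, is a strong hint that two scheduled backups are stepping on each other.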

In theory, the aforementioned steps should be sufficient to combat the issues that you have.

As a final remark, please note that bad backup performance can also be the result of a RAID setup failing - if one of the disks is broken or on the verge of being broken, then your backup will not perform well : it is of utmost importance to change the (broken) disk as soon as possible.
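
A quick health check along those lines (sketch only; /dev/md0 and /dev/sda are placeholders - adjust to your own array and member devices, e.g. /dev/nvme0n1 for NVMe - and smartmontools may need to be installed first):

# cat /proc/mdstat
# mdadm --detail /dev/md0
# smartctl -H /dev/sda

A "[U_]" instead of "[UU]" in /proc/mdstat, a "degraded" state in the mdadm output, or anything other than "PASSED" from smartctl points at exactly this kind of hardware problem.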

I hope the above helps a bit.

Kind regards.....
 