• If you are still using CentOS 7.9, it's time to convert to Alma 8 with the free centos2alma tool by Plesk or Plesk Migrator. Please let us know your experiences or concerns in this thread:
    CentOS2Alma discussion

n.fandrich

New Pleskian
Hello community,

I have a very annoying problem with my Plesk installation since a couple of weeks. The only thing I can do then is to stop the backup process to avoid website downtime.

During the backup I observed a few things:
  • In the process list there is a process called pzstd which is in sleeping state and utilizes around 9 % CPU. This is a extension used for file compression.
  • Disk I/O for this process is at 0 KiB/s - seems like no data is being read or written.
  • A second process called sw-tar has the status "Uninterruptible sleep"
How can it be, that these backup-related tasks are sleeping? Is this a bug related to plesk or any kind of misconfiguration?

I didn't make any changes to the backup settings before the problems started.

I'm using a VM in a shared hosting running Ubuntu 18.04. I'm aware that this version isn't supported by Plesk anymore, however, I need a full backup before I can upgrade Ubuntu.

No noticeable messages in the log files.

Any ideas regarding this?

Thanks for any help!

Kind regards
 

Attachments

  • Screenshot 2023-07-27 at 10-58-44 Prozessliste - Plesk Obsidian 18.0.52.png
    Screenshot 2023-07-27 at 10-58-44 Prozessliste - Plesk Obsidian 18.0.52.png
    142.7 KB · Views: 2
  • Screenshot 2023-07-27 at 10-58-48 Details zum Backup - Plesk Obsidian 18.0.52.png
    Screenshot 2023-07-27 at 10-58-48 Details zum Backup - Plesk Obsidian 18.0.52.png
    66.1 KB · Views: 2
I have installed Backup Telemetry and exported the output (replaced personal data with "John Doe" or XXXXX - these are just placeholders!), please have a look.

Seems to stuck at the command during the backup of the first subscription:

Code:
Run a command (sw-tar --create)

Code:
{"type":"operation","name":"filwrpr '28'","guid":"2f96d8d9-1345-4e4b-a40a-05e623370ce0","start":"2023-07-27 20:34:19.891","attributes":{"process.command_line":"'filwrpr' '28'"}}
{"type":"operation","guid":"2f96d8d9-1345-4e4b-a40a-05e623370ce0","finish":"2023-07-27 20:34:19.902","attributes":{}}
{"type":"operation","name":"backup","guid":"9d713e5d-05a5-4fb1-840a-8900c3ac65ce","start":"2023-07-27 20:34:19.938","attributes":{}}
{"type":"operation","name":"backup.check_free_space","guid":"dd0233c8-9341-486a-9395-973681c0c3c1","start":"2023-07-27 20:34:21.073","attributes":{}}
{"type":"operation","name":"backup.webspace.list_files","guid":"a407bec6-1b18-4147-baf2-469aeb6d1bc7","start":"2023-07-27 20:34:21.088","attributes":{"domain.name":"johndoe.com"}}
{"type":"operation","guid":"a407bec6-1b18-4147-baf2-469aeb6d1bc7","finish":"2023-07-27 20:34:30.166","attributes":{}}
{"type":"operation","name":"backup.webspace.list_files","guid":"35f75bd1-e29e-427f-91bc-3fe679d0585c","start":"2023-07-27 20:34:30.187","attributes":{"domain.name":"xxxxxxxxx.de"}}
{"type":"operation","guid":"35f75bd1-e29e-427f-91bc-3fe679d0585c","finish":"2023-07-27 20:34:30.713","attributes":{}}
{"type":"operation","name":"backup.webspace.list_files","guid":"178349cb-2cf1-402c-a273-3cf5685832a3","start":"2023-07-27 20:34:30.714","attributes":{"domain.name":"xxxxxxxxx.com"}}
{"type":"operation","guid":"178349cb-2cf1-402c-a273-3cf5685832a3","finish":"2023-07-27 20:34:31.182","attributes":{}}
{"type":"operation","name":"backup.webspace.list_files","guid":"ff69986f-a785-41c2-b2af-c4bc85982eba","start":"2023-07-27 20:34:31.185","attributes":{"domain.name":"xxxxxxxxx.de"}}
{"type":"operation","guid":"ff69986f-a785-41c2-b2af-c4bc85982eba","finish":"2023-07-27 20:34:31.515","attributes":{}}
{"type":"operation","name":"backup.webspace.list_files","guid":"19d9be22-c398-4ae2-a4a7-a14cf12addc5","start":"2023-07-27 20:34:31.517","attributes":{"domain.name":"xxxxxxxxx.de"}}
{"type":"operation","guid":"19d9be22-c398-4ae2-a4a7-a14cf12addc5","finish":"2023-07-27 20:34:32.027","attributes":{}}
{"type":"operation","name":"backup.webspace.list_files","guid":"ef2f81f7-b518-422a-8441-c9926553b365","start":"2023-07-27 20:34:32.029","attributes":{"domain.name":"xxxxxxxxx.de"}}
{"type":"operation","guid":"ef2f81f7-b518-422a-8441-c9926553b365","finish":"2023-07-27 20:34:32.839","attributes":{}}
{"type":"operation","name":"backup.webspace.list_files","guid":"8cc3f19b-8537-46df-82fa-8d5624ace701","start":"2023-07-27 20:34:32.840","attributes":{"domain.name":"xxxxxxxxx.de"}}
{"type":"operation","guid":"8cc3f19b-8537-46df-82fa-8d5624ace701","finish":"2023-07-27 20:34:34.159","attributes":{}}
{"type":"operation","name":"backup.webspace.list_files","guid":"9ef2f1fa-dfbe-44f9-a9f8-b23da1f09af4","start":"2023-07-27 20:34:34.161","attributes":{"domain.name":"xxxxxxxxx.at"}}
{"type":"operation","guid":"9ef2f1fa-dfbe-44f9-a9f8-b23da1f09af4","finish":"2023-07-27 20:34:34.345","attributes":{}}
{"type":"operation","name":"backup.webspace.list_files","guid":"3dfe470d-4ad7-41e4-a719-c4071e81bb92","start":"2023-07-27 20:34:34.347","attributes":{"domain.name":"xxxxxxxxx.de"}}
{"type":"operation","guid":"3dfe470d-4ad7-41e4-a719-c4071e81bb92","finish":"2023-07-27 20:34:34.798","attributes":{}}
{"type":"operation","name":"backup.webspace.list_files","guid":"9f3b01df-4e2d-46ca-b839-706469c732e9","start":"2023-07-27 20:34:34.800","attributes":{"domain.name":"xxxxxxxxx.com"}}
{"type":"operation","guid":"9f3b01df-4e2d-46ca-b839-706469c732e9","finish":"2023-07-27 20:34:35.542","attributes":{}}
{"type":"operation","name":"backup.webspace.list_files","guid":"d79e4c1f-ff4c-4a77-961f-ae99c6373653","start":"2023-07-27 20:34:35.544","attributes":{"domain.name":"xxxxxxxxx.com"}}
{"type":"operation","guid":"d79e4c1f-ff4c-4a77-961f-ae99c6373653","finish":"2023-07-27 20:34:36.205","attributes":{}}
{"type":"operation","name":"backup.webspace.list_files","guid":"125d8c9e-52de-4d0a-a450-05986f6aa184","start":"2023-07-27 20:34:36.206","attributes":{"domain.name":"xxxxxxxxx.shop"}}
{"type":"operation","guid":"125d8c9e-52de-4d0a-a450-05986f6aa184","finish":"2023-07-27 20:34:36.750","attributes":{}}
{"type":"operation","guid":"dd0233c8-9341-486a-9395-973681c0c3c1","finish":"2023-07-27 20:34:37.144","attributes":{}}
{"type":"operation","name":"backup.server","guid":"6c143c85-bab2-4532-880d-dd2e21ae8034","start":"2023-07-27 20:34:37.168","attributes":{}}
{"type":"operation","name":"backup.customer","guid":"73fb1260-c086-401c-a06d-8c3422b1b89c","start":"2023-07-27 20:34:37.172","attributes":{"client.name":"John Doe"}}
{"type":"operation","name":"backup.subscription","guid":"f3529736-41fe-4a07-9dc7-846bde4c16e3","start":"2023-07-27 20:34:37.172","attributes":{"domain.name":"johndoe.com"}}
{"type":"operation","name":"backup.scheduled_tasks","guid":"9ba1f0fe-6cc2-4c48-be2d-6b34f6c5b06f","start":"2023-07-27 20:34:37.523","attributes":{}}
{"type":"operation","name":"crontabmng 'get'","guid":"bb0b38df-f913-4361-9513-ab5327637fd3","start":"2023-07-27 20:34:37.528","attributes":{"process.command_line":"'crontabmng' 'get' 'johndoe'"}}
{"type":"operation","guid":"bb0b38df-f913-4361-9513-ab5327637fd3","finish":"2023-07-27 20:34:37.538","attributes":{}}
{"type":"operation","guid":"9ba1f0fe-6cc2-4c48-be2d-6b34f6c5b06f","finish":"2023-07-27 20:34:37.547","attributes":{}}
{"type":"operation","name":"backup.web_settings","guid":"8a734552-01c3-4260-8751-81a6741c2999","start":"2023-07-27 20:34:37.573","attributes":{}}
{"type":"operation","name":"filemng 'root'","guid":"c7aba2ea-778e-44b0-b800-2cd5c4408549","start":"2023-07-27 20:34:37.603","attributes":{"process.command_line":"'filemng' 'root' 'file_exists' '\/var\/www\/vhosts\/system\/johndoe.com\/conf\/vhost.conf' '--allow-root'"}}
{"type":"operation","guid":"c7aba2ea-778e-44b0-b800-2cd5c4408549","finish":"2023-07-27 20:34:37.614","attributes":{}}
{"type":"operation","name":"filemng 'root'","guid":"9059278f-cc9a-4b97-96c1-8c4e855f83a3","start":"2023-07-27 20:34:37.614","attributes":{"process.command_line":"'filemng' 'root' 'file_exists' '\/var\/www\/vhosts\/system\/johndoe.com\/conf\/vhost_ssl.conf' '--allow-root'"}}
{"type":"operation","guid":"9059278f-cc9a-4b97-96c1-8c4e855f83a3","finish":"2023-07-27 20:34:37.624","attributes":{}}
{"type":"operation","name":"filemng 'root'","guid":"23dbe00f-c93d-4bf1-8b0e-01b1f10e68fa","start":"2023-07-27 20:34:37.625","attributes":{"process.command_line":"'filemng' 'root' 'file_exists' '\/var\/www\/vhosts\/system\/johndoe.com\/conf\/vhost_nginx.conf' '--allow-root'"}}
{"type":"operation","guid":"23dbe00f-c93d-4bf1-8b0e-01b1f10e68fa","finish":"2023-07-27 20:34:37.634","attributes":{}}
{"type":"operation","guid":"8a734552-01c3-4260-8751-81a6741c2999","finish":"2023-07-27 20:34:37.635","attributes":{}}
{"type":"operation","name":"backup.webspace.files","guid":"bedb6d68-90e9-4349-b88a-619998ee0e02","start":"2023-07-27 20:34:37.638","attributes":{"domain.name":"johndoe.com"}}
{"type":"operation","name":"backup-archiver '--pack-incrementally'","guid":"4380a1df-3c2d-4340-ae00-b7f4c5bbccc3","start":"2023-07-27 20:34:37.643","attributes":{"process.command_line":"'\/opt\/psa\/admin\/bin\/backup-archiver' '--pack-incrementally' '--source=\/var\/www\/vhosts\/johndoe.com' '--destination=clients\/johndoe\/domains\/johndoe.com\/backup_user-data_2307272034.tar' '--listing-file=\/tmp\/bufSstYcI' '--created-index-file=\/tmp\/bifvAzdLT' '--incremental-dependencies=\/tmp\/bdfgHvAt2' '--user=johndoe' '--no-recursion' '--session-path=\/opt\/psa\/PMM\/sessions\/2023-07-27-203419.729' '--warnings=\/tmp\/bwsvqHXbb' '--compression-method=zstd' '--compression-level=no-compression' '--debug' '--verbose'"}}
{"attributes":{"process.command_line":"/opt/psa/admin/sbin/backup-archiver --pack-incrementally --source=/var/www/vhosts/johndoe.com --destination=clients/johndoe/domains/johndoe.com/backup_user-data_2307272034.tar --listing-file=/tmp/bufSstYcI --created-index-file=/tmp/bifvAzdLT --incremental-dependencies=/tmp/bdfgHvAt2 --user=johndoe --no-recursion --session-path=/opt/psa/PMM/sessions/2023-07-27-203419.729 --warnings=/tmp/bwsvqHXbb --compression-method=zstd --compression-level=no-compression --debug --verbose","process.runtime.name":"GNU C++"},"guid":"76931fac-9dab-42b3-86c2-48b87d6ae33f","name":"backup-archiver --pack-incrementally","start":"2023-07-27 20:34:37.659","type":"operation"}
{"attributes":{"process.command_line":"/usr/lib/plesk-9.0/sw-tar --create --file - --ignore-failed-read --warning no-file-changed --no-recursion --verbatim-files-from --anchored --files-from /tmp/pmm-ru-ca-includes-6jIfYk"},"guid":"9a62d718-3a5d-4578-89e4-b2d6b441e241","name":"sw-tar --create","start":"2023-07-27 20:34:43.910","type":"operation"}
 
When you observe
# ps aux | grep pmm
What is the output that you are seeing?
This is the output I'm seeing:

Code:
XXXXXXX@XXXXX:~# ps aux | grep pmm
johndoe     10937  0.9  0.0 160112 40772 ?        S    22:45   0:01 /usr/bin/python3 -Estt /opt/psa/admin/sbin/pmmcli_daemon
johndoe    11134  0.0  0.0   4624   780 ?        SN   22:46   0:00 sh -c '/opt/psa/admin/bin/pmm-ras' '--export-dump-as-file' '--session-path=/opt/psa/PMM/sessions/2023-07-28-224544.112' '--volume-names=/tmp/bvsIoEHse' '--debug' '--verbose' '--split-size=2146435072' '--remove-dump' '--dump-file-specification=ext://google-drive-backup/server/backup_2307282246.tar' '--dump-storage=/var/lib/psa/dumps'
johndoe     11135  1.1  0.0 139828 12336 ?        SN   22:46   0:01 /opt/psa/admin/bin/pmm-ras --export-dump-as-file --session-path=/opt/psa/PMM/sessions/2023-07-28-224544.112 --volume-names=/tmp/bvsIoEHse --debug --verbose --split-size=2146435072 --remove-dump --dump-file-specification=ext://google-drive-backup/server/backup_2307282246.tar --dump-storage=/var/lib/psa/dumps
janedoe+ 11194  4.6  0.0  20340  3988 ?        SN   22:46   0:03 /usr/lib/plesk-9.0/sw-tar --create --file - --ignore-failed-read --warning no-file-changed --no-recursion --verbatim-files-from --anchored --files-from /tmp/pmm-ru-ca-includes-vFsmdu
johndoe     11937  0.0  0.0  13064   956 pts/0    S+   22:47   0:00 grep --color=auto pmm
 
Please
# cd /opt/psa/PMM/sessions/2023-07-28-224544.112
then check the content of the status file and the migration.log file. Is migration.log still being appended?

Also
# strace -p 11194
Is this appending lines as it displays? and is
# strace -p 11135
appending lines? In these lines do you see any repetitive patters, e.g. repeating entries like "waiting on ..."? If so you might find out why backups seem to hang. But in general, as the cpu load is not shown as zero on three of the relevant processes I am not sure if this backup is really stuck or if it is simply taking a lot of time to complete. You can tell from the migration and status file contents from the sessions directory (as mentioned above).
 
Thanks for your reply. Last night I started a backup that ran for four hours (with progress at 0 %). I canceled it when customers complained that their website was down. I started another backup twenty minutes ago, this is the current status:

Code:
# ps aux | grep pmm
johndoe     23890  0.2  0.0 160112 40752 ?        S    10:18   0:01 /usr/bin/python3 -Estt /opt/psa/admin/sbin/pmmcli_daemon
janedoe+ 24516  5.1  0.0  21836  5440 ?        DN   10:20   0:21 /usr/lib/plesk-9.0/sw-tar --create --file - --ignore-failed-read --warning no-file-changed --no-recursion --verbatim-files-from --anchored --files-from /tmp/pmm-ru-ca-includes-wK5LAi
johndoe     25444  0.0  0.0  13064  1096 pts/0    S+   10:26   0:00 grep --color=auto pmm

Code:
# strace -p 24516
Yes, lines are appended.

The output of strace -p 23890 is:
Code:
# strace -p 23890
strace: Process 23890 attached
select(0, NULL, NULL, NULL, {tv_sec=830, tv_usec=105607}
No more lines are appended, also not after a few minutes.
 
Please
# cd /opt/psa/PMM/sessions/2023-07-28-224544.112
then check the content of the status file and the migration.log file. Is migration.log still being appended?

Also
# strace -p 11194
Is this appending lines as it displays? and is
# strace -p 11135
appending lines? In these lines do you see any repetitive patters, e.g. repeating entries like "waiting on ..."? If so you might find out why backups seem to hang. But in general, as the cpu load is not shown as zero on three of the relevant processes I am not sure if this backup is really stuck or if it is simply taking a lot of time to complete. You can tell from the migration and status file contents from the sessions directory (as mentioned above).
In the migration.log file I can find messages like "Transport: ...", "Repository [...] Initializing", "Transport: List dir" and so on. File ends with the line "pmm-ras finished. Exit code: 0". Looks fine so far, I think.

The file "dump-status.xml" tells me this:
Code:
<agent-dump-status total-accounts="13" total-domains="12" completed-accounts="13" completed-domains="12"/>
 
I canceled it when customers complained that their website was down.
That can only happen when the backup consumes so many resources that the web server runs into a timeout and cannot server websites in a timely fashion. To mitigate this and to find out whether the timeout is caused by this, disable compression of the backup. You can also lower the compression level. Less compress = less cpu usage. I also recommend to check the "Run scheduled backup processes with low priority" checkbox in backup settings and to set a high cpu priority level such as 13, 14, or 16 and an highest io priority level 7. This will give other processes more time slices and lower the chance that they time out.

<agent-dump-status total-accounts="13" total-domains="12" completed-accounts="13" completed-domains="12"/>
This is the completed state. But it does not match the strace -p 24516 which indicates that the compression utility is still doing something. Normally you can tell from the appended lines what the tool is doing. Like is it accessing files? Then the backup is still running. Is it repeating holding messages? Then it might have crashed for yet unknown reason.

When a backup has finished, some verification algorithms are run. It is possible that these are still doing their work, but normally they only last a few minutes. You should see "Validate content files" messages in the backup.log, e.g. "# tail -n 50 /var/log/plesk/PMM/<your backup directory>/backup.log | grep Validate content". You should also see lines like "backup-archiver finished. Exit code: 0" in there. Do you?

The backup processes can be seen for a while after the backup actually completed, but should be removed from the process list after some minutes. Also the backup set should become visible in the Backup Manager backups list.
 
That can only happen when the backup consumes so many resources that the web server runs into a timeout and cannot server websites in a timely fashion. To mitigate this and to find out whether the timeout is caused by this, disable compression of the backup. You can also lower the compression level. Less compress = less cpu usage. I also recommend to check the "Run scheduled backup processes with low priority" checkbox in backup settings and to set a high cpu priority level such as 13, 14, or 16 and an highest io priority level 7. This will give other processes more time slices and lower the chance that they time out.


This is the completed state. But it does not match the strace -p 24516 which indicates that the compression utility is still doing something. Normally you can tell from the appended lines what the tool is doing. Like is it accessing files? Then the backup is still running. Is it repeating holding messages? Then it might have crashed for yet unknown reason.

When a backup has finished, some verification algorithms are run. It is possible that these are still doing their work, but normally they only last a few minutes. You should see "Validate content files" messages in the backup.log, e.g. "# tail -n 50 /var/log/plesk/PMM/<your backup directory>/backup.log | grep Validate content". You should also see lines like "backup-archiver finished. Exit code: 0" in there. Do you?

The backup processes can be seen for a while after the backup actually completed, but should be removed from the process list after some minutes. Also the backup set should become visible in the Backup Manager backups list.
There are no lines like "validate" or "backup-archiver finished" in the backup.log.

I took a closer look at the file and figured out that the flow is always as follows:
  1. Plesk calculates the backup size and lists files in the subscription directories.
  2. Plesk successfully finishes these preparation steps.
  3. Next line is "Create discovered folder: xxxxxx ..."
  4. After that we have the following lines (which could be interesting I think):
Code:
[2023-07-30 17:21:59.742|10141] DEBUG [util_exec] [64c68017b550b] Starting asynchronously: /opt/psa/admin/sbin/backup-archiver --pack-incrementally --source=/var/www/vhosts/johndoe.net --destination=clients/johndoe/domains/johndoe.net/backup_user-data_2307301721.tar --listing-file=/tmp/bufyeWaFp --created-index-file=/tmp/bifIcPbF6 --incremental-dependencies=/tmp/bdf5b4SWa --user=johndoe --no-recursion --session-path=/opt/psa/PMM/sessions/2023-07-30-172143.406 --warnings=/tmp/bwsi5FAef --compression-method=zstd --compression-level=no-compression --debug --verbose
[2023-07-30 17:21:59.744|10141] DEBUG [util_exec] [64c68017b550b] Finished in 0.00168s, Error code: TRUE
[2023-07-30 17:21:59.752|10278] INFO: backup-archiver started : /opt/psa/admin/sbin/backup-archiver --pack-incrementally --source=/var/www/vhosts/johndoe.net --destination=clients/johndoe/domains/johndoe.net/backup_user-data_2307301721.tar --listing-file=/tmp/bufyeWaFp --created-index-file=/tmp/bifIcPbF6 --incremental-dependencies=/tmp/bdf5b4SWa --user=johndoe --no-recursion --session-path=/opt/psa/PMM/sessions/2023-07-30-172143.406 --warnings=/tmp/bwsi5FAef --compression-method=zstd --compression-level=no-compression --debug --verbose
[2023-07-30 17:21:59.753|10278] INFO: Repository '/var/lib/psa/dumps': Initializing...
[2023-07-30 17:21:59.753|10278] INFO: Transport: Get repository transport /var/lib/psa/dumps
[2023-07-30 17:21:59.753|10278] INFO: Transport: Init credentials for user ''
[2023-07-30 17:21:59.753|10278] INFO: Repository '/var/lib/psa/dumps': Initialized
[2023-07-30 17:21:59.753|10278] INFO: Created temp file: /tmp/pmm-repo-pi-files-4RW6w4
[2023-07-30 17:21:59.753|10278] INFO: Transport: List dir clients/johndoe/domains/johndoe.net
[2023-07-30 17:21:59.754|10278] INFO: Calculate list of files to pack
[2023-07-30 17:22:06.180|10278] INFO: Created temp file: /tmp/pmm-repo-skip-files-PuGLmq
[2023-07-30 17:22:06.180|10278] INFO: Created temp file: /tmp/pmm-repo-callback-skip-files-YJJrcM
[2023-07-30 17:22:06.180|10278] INFO: Repository '/var/www/vhosts/': Initializing...
[2023-07-30 17:22:06.180|10278] INFO: Transport: Get repository transport /var/www/vhosts/
[2023-07-30 17:22:06.180|10278] INFO: Transport: Init credentials for user ''
[2023-07-30 17:22:06.180|10278] INFO: Repository '/var/www/vhosts/': Initialized
[2023-07-30 17:22:06.180|10278] INFO: Repository '/var/lib/psa/dumps': Pack  from /var/www/vhosts/johndoe.net to clients/johndoe/domains/johndoe.net/backup_user-data_2307301721.tar. Additional files count: 0. Split size: -1
[2023-07-30 17:22:06.180|10278] INFO: Created temp file: /tmp/pmm-ru-ca-includes-YAFb27
[2023-07-30 17:22:06.228|10278] INFO: ENV[LANG]=en_US.UTF-8
[2023-07-30 17:22:06.228|10278] INFO: Executing utility: /usr/lib/plesk-9.0/sw-tar --create --file - --ignore-failed-read --warning no-file-changed --no-recursion --verbatim-files-from --anchored --files-from /tmp/pmm-ru-ca-includes-YAFb27
[2023-07-30 17:22:06.228|10278] INFO: Load settings from /opt/psa/admin/conf/panel.ini

And then nothing happens for hours until I stop the backup, also no log messages.
 
Back
Top