
Question Migrating SpamAssassin database

Fabian H

Basic Pleskian
Hi there,
I am currently migrating my Plesk server to another Plesk server (running a different OS).
Because I have used the sa-learn utility many times to learn spam, and my customers regularly mark mail as spam to train SpamAssassin, I want to migrate the SpamAssassin database to the new server too.
How can I do that?
 
As far as I know, the trained data is stored in bayes_seen, bayes_journal, and bayes_toks.
These files can be found in the .spamassassin folder:

/var/qmail/mailnames/domain/account/.spamassassin

So, when you migrate to a new server, my assumption is these files will be migrated too.
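
One way to check that assumption is to list, on both servers, which mailboxes actually have a Bayes database and compare the two lists. A minimal sketch, assuming the directory layout quoted above; `list_bayes_mailboxes` is a name made up for this example, and the mailnames root is passed as an argument:

```shell
#!/bin/sh
# Print account@domain for every mailbox that has a bayes_toks file.
# Assumed layout (from the path above): <root>/<domain>/<account>/.spamassassin/
list_bayes_mailboxes() {
    root=$1
    for db in "$root"/*/*/.spamassassin/bayes_toks; do
        [ -f "$db" ] || continue              # glob matched nothing
        dir=${db%/.spamassassin/bayes_toks}   # <root>/<domain>/<account>
        account=${dir##*/}
        domain_dir=${dir%/*}
        domain=${domain_dir##*/}
        printf '%s@%s\n' "$account" "$domain"
    done
}
```

Running `list_bayes_mailboxes /var/qmail/mailnames` on the old and the new server and diffing the output should show whether any mailbox lost its database during migration.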
 
Alright, thanks.

Another question about this:
When using the sa-learn utility, which .spamassassin folder receives the updated database? Perhaps every mailbox that has a .spamassassin folder?
 
After my migration, much more spam is arriving in my mail accounts.
I checked with the sa-learn utility and the following is what I found:

On the old (source) server:
Code:
[root@host-old:~]$sa-learn --dump magic
0.000          0          3          0  non-token data: bayes db version
0.000          0       1025          0  non-token data: nspam
0.000          0       1323          0  non-token data: nham
0.000          0     122009          0  non-token data: ntokens
0.000          0 1453152776          0  non-token data: oldest atime
0.000          0 1648325086          0  non-token data: newest atime
0.000          0          0          0  non-token data: last journal sync atime
0.000          0          0          0  non-token data: last expiry atime
0.000          0          0          0  non-token data: last expire atime delta
0.000          0          0          0  non-token data: last expire reduction count

On the new (target) server:
Code:
[root@host:~]$sa-learn --dump magic
0.000          0          3          0  non-token data: bayes db version
0.000          0        100          0  non-token data: nspam
0.000          0          0          0  non-token data: nham
0.000          0      10144          0  non-token data: ntokens
0.000          0 1648416734          0  non-token data: oldest atime
0.000          0 1649728411          0  non-token data: newest atime
0.000          0          0          0  non-token data: last journal sync atime
0.000          0          0          0  non-token data: last expiry atime
0.000          0          0          0  non-token data: last expire atime delta
0.000          0          0          0  non-token data: last expire reduction count

So, it seems like the trained data isn't migrated with the Plesk migrator.
Any ideas on how I can migrate this too?
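
To compare two such dumps at a glance, the three counters that matter here can be pulled out with awk. A minimal sketch; `bayes_summary` is a name made up for this example, and it assumes the column layout shown in the output above (value in the third column, label as the last word):

```shell
#!/bin/sh
# Read `sa-learn --dump magic` output on stdin and print the
# spam, ham and token counters on a single line.
bayes_summary() {
    awk '
        $NF == "nspam"   { nspam = $3 }
        $NF == "nham"    { nham = $3 }
        $NF == "ntokens" { ntokens = $3 }
        END { printf "nspam=%d nham=%d ntokens=%d\n", nspam, nham, ntokens }
    '
}
```

Usage would be `sa-learn --dump magic | bayes_summary` on each server. A counter missing from the input is printed as 0.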
 
The data is migrated. You just need to point sa-learn at the mailbox's .spamassassin folder. Remember that, by default, Bayes data is stored per mailbox. You were just checking the global settings. You need to use:
Code:
sa-learn --dbpath /var/qmail/mailnames/example.com/user/.spamassassin --dump magic
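
To run that check for every mailbox instead of one, the same command can be wrapped in a loop. A rough sketch, assuming the `<domain>/<account>/.spamassassin` layout used in this thread; `dump_all_mailboxes` and the `SA_LEARN` override variable are names invented for this example:

```shell
#!/bin/sh
# Run `sa-learn --dbpath <dir> --dump magic` for each mailbox that has
# a .spamassassin directory under the given mailnames root.
# SA_LEARN can be set to another command (useful for a dry run);
# it defaults to sa-learn.
dump_all_mailboxes() {
    root=$1
    for dbdir in "$root"/*/*/.spamassassin; do
        [ -d "$dbdir" ] || continue       # glob matched nothing
        printf '== %s\n' "$dbdir"
        "${SA_LEARN:-sa-learn}" --dbpath "$dbdir" --dump magic
    done
}
```

Called as `dump_all_mailboxes /var/qmail/mailnames`, it prints a header line per mailbox followed by that mailbox's dump.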
 
Thanks!
That seems correct: the nspam and nham counts in the two databases are nearly the same.
But which path is used when the --dbpath parameter is omitted?
Also, obvious spam sometimes gets a score of only 2.0, even though my customers and I train the filter every time we receive spam.

For some of my accounts, I am running /var/qmail/mailnames/$domain/$account/Maildir/.Spam/cur/ to train manually (within a script).
 
EDIT: I meant I am using
Code:
 sa-learn --spam /var/qmail/mailnames/$domain/$account/Maildir/.Spam/cur/
to train manually.
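
If, as the earlier reply suggests, sa-learn without --dbpath works on a different database than the mailbox's own, then the training command presumably needs --dbpath as well. A hedged sketch that only prints the commands so they can be reviewed first; `print_train_commands` is a made-up name, and the `Maildir/.Spam/cur` and `.spamassassin` locations are taken from the paths in this thread:

```shell
#!/bin/sh
# Print (do not run) one training command per mailbox that has a Spam folder.
# Paths are printed unquoted, so this assumes domain and account names
# contain no spaces.
print_train_commands() {
    root=$1
    for spamdir in "$root"/*/*/Maildir/.Spam/cur; do
        [ -d "$spamdir" ] || continue     # glob matched nothing
        mailbox=${spamdir%/Maildir/.Spam/cur}
        printf 'sa-learn --spam --dbpath %s %s\n' \
            "$mailbox/.spamassassin" "$spamdir"
    done
}
```

After checking the output of `print_train_commands /var/qmail/mailnames`, the lines could be run by hand or piped to `sh`.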
 
Unfortunately, SpamAssassin needs much more than Bayes training to be configured properly (which is why your spam scores are at 2.0). SpamAssassin is extremely powerful but has a very steep learning curve. I've been using it myself for almost 15 years and I'm still learning new things. When it is properly configured, most spam scores should be well above 10.0 (many are above 20.0).
 

Okay, I see!
So, can you give some recommendations on how to configure it?
 