
Resolved SpamAssassin not learning, not deleting old spam

Richard18

Basic Pleskian
Onyx 17.5.3 #12
Ubuntu 16.04.2 LTS

I've got SpamAssassin enabled globally and turned on for each mailbox on the server, but it doesn't seem to be completely working for every mailbox.

It does detect spam and rewrite the headers, but the Bayes learning part doesn't seem to work properly.

I can see that "/usr/local/psa/bin/sw-engine-pleskrun /usr/local/psa/admin/plib/DailyMaintainance/script.php" is definitely being run each morning at around 06:25, so that should kick off "ExecuteSpamTrain".

/var/qmail/mailnames/DOMAIN.com/MAILBOX2/.spamassassin/bayes_toks is being touched each morning; its modification date is always today's date.

However, I know that there are 40 messages in the .Spam folder for this mailbox, yet when I run:

Code:
sa-learn --dbpath /var/qmail/mailnames/DOMAIN.com/MAILBOX2/.spamassassin/ --dump magic

I get:
Code:
0.000          0          3          0  non-token data: bayes db version
0.000          0         35          0  non-token data: nspam
0.000          0        541          0  non-token data: nham
0.000          0      83865          0  non-token data: ntokens
0.000          0 1308737499          0  non-token data: oldest atime
0.000          0 1499320725          0  non-token data: newest atime
0.000          0 1499232343          0  non-token data: last journal sync atime
0.000          0          0          0  non-token data: last expiry atime
0.000          0          0          0  non-token data: last expire atime delta
0.000          0          0          0  non-token data: last expire reduction count

Both nspam and nham look incorrect: they don't match the number of messages in .Spam or .Inbox. The last expiry atime, last expire atime delta and last expire reduction count are also all zero.
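To compare the two numbers side by side, something like the following might help. It's only a rough sketch: `magic_value` and `count_maildir` are made-up helper names, and the `Maildir/.Spam` path is my assumption about the Plesk mailbox layout, so adjust it to whatever is on your server. Also bear in mind that nspam counts messages that were learned, so it won't necessarily equal the number of files in the folder.

Code:
```shell
#!/bin/sh
# Hypothetical helper: read `sa-learn --dump magic` output on stdin and print
# the counter whose label (last field) matches $1, e.g. nspam or nham.
# The counter value is the third column of each line.
magic_value() {
    awk -v key="$1" '$NF == key { print $3 }'
}

# Hypothetical helper: count message files in a Maildir folder (cur/ and new/).
count_maildir() {
    find "$1/cur" "$1/new" -type f 2>/dev/null | wc -l | tr -d ' '
}

# Example usage (paths are assumptions, not taken from a real server):
# sa-learn --dbpath /var/qmail/mailnames/DOMAIN.com/MAILBOX2/.spamassassin/ --dump magic | magic_value nspam
# count_maildir /var/qmail/mailnames/DOMAIN.com/MAILBOX2/Maildir/.Spam
```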

If I run the same command for another mailbox:

Code:
sa-learn --dbpath /var/qmail/mailnames/DOMAIN.com/MAILBOX1/.spamassassin/ --dump magic

I get:
Code:
0.000          0          3          0  non-token data: bayes db version
0.000          0         86          0  non-token data: nspam
0.000          0       7061          0  non-token data: nham
0.000          0     184360          0  non-token data: ntokens
0.000          0 1477216115          0  non-token data: oldest atime
0.000          0 1499327509          0  non-token data: newest atime
0.000          0 1499318751          0  non-token data: last journal sync atime
0.000          0 1499318741          0  non-token data: last expiry atime
0.000          0   22118400          0  non-token data: last expire atime delta
0.000          0       1261          0  non-token data: last expire reduction count

Here nspam and nham are correct, and newest atime, last journal sync atime and last expiry atime are all from today, unlike the other mailbox.
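The atime values in both dumps are Unix epoch seconds, so they are easier to compare once converted. A small sketch, assuming GNU date (which Ubuntu ships); `epoch_to_utc` and `seconds_to_days` are just illustrative names:

Code:
```shell
#!/bin/sh
# Convert a Unix epoch value from the dump into a readable UTC timestamp.
# Relies on GNU date's "-d @SECONDS" syntax.
epoch_to_utc() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S'
}

# Convert a duration in seconds (e.g. "last expire atime delta") to whole days.
seconds_to_days() {
    echo $(( $1 / 86400 ))
}

epoch_to_utc 1308737499    # MAILBOX2 oldest atime  -> 2011-06-22 10:11:39
epoch_to_utc 1499320725    # MAILBOX2 newest atime  -> 2017-07-06 05:58:45
seconds_to_days 22118400   # MAILBOX1 last expire atime delta -> 256
```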

Why would SpamAssassin be updating the Bayes data correctly for MAILBOX1, while MAILBOX2 seems to be either delayed or just wrong?

Also, for MAILBOX2 there are messages in the .Spam folder older than 30 days, even though SpamAssassin is set to remove spam older than 30 days.
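To see exactly which messages are past the limit, a find on the folder is one option. This is a sketch only: `old_messages` is a made-up name, the `Maildir/.Spam` path is an assumption about the Plesk layout, and it goes by file modification time, which may not be what Plesk itself uses to decide a message's age.

Code:
```shell
#!/bin/sh
# Hypothetical helper: list message files in a Maildir folder whose mtime is
# more than N days old (default 30). With find, "-mtime +30" matches files
# modified more than 30 full 24-hour periods ago.
old_messages() {
    folder=$1
    days=${2:-30}
    find "$folder/cur" "$folder/new" -type f -mtime +"$days" 2>/dev/null
}

# Example usage (path is an assumption):
# old_messages /var/qmail/mailnames/DOMAIN.com/MAILBOX2/Maildir/.Spam 30
```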

What can I do to troubleshoot?
 
Not sure if that's still applicable to Onyx, but back on Plesk 12 I found some issues in the `spamtrain` script with mailboxes that had weird characters in their names (e.g. '), which would cause it to skip a user's mailbox entirely.
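If you want to rule that out quickly, you could list the mailbox directories whose names contain anything unusual. A rough sketch: `odd_mailboxes` is a made-up name, and the "safe" character set is my own conservative guess, not something taken from the Plesk script.

Code:
```shell
#!/bin/sh
# Hypothetical helper: print the names of mailbox directories under a domain
# directory that contain characters outside letters, digits, dot, underscore
# and hyphen.
odd_mailboxes() {
    for path in "$1"/*/; do
        [ -d "$path" ] || continue
        name=$(basename "$path")
        case $name in
            *[!A-Za-z0-9._-]*) printf '%s\n' "$name" ;;
        esac
    done
}

# Example usage:
# odd_mailboxes /var/qmail/mailnames/DOMAIN.com
```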
 
Just an idea: It could be possible that there is nothing new to learn so that no action is required.
 
For what it's worth (and I should have replied to the thread a long time ago), this sorted itself out!

After reading the documentation, it seems that SpamAssassin needed 30 days' worth of data before it learned properly.

Since the mailboxes were new it just took a little while to start working!
 