SpamAssassin Training and Running System wide BAYES DB

Discussion in 'Plesk for Linux - 8.x and Older' started by SecondPhase, Dec 14, 2005.

  1. SecondPhase

    SecondPhase Guest

    Hi there, I wanted to ask a little about SpamAssassin training. When run from the command line, sa-learn only wants to work with the files in a system user's home directory and cannot deal with the virtual users that qmail uses.

    What I did, which seems to work OK (although it is a lot of work and I haven't finished yet), was to run sa-learn as root on a few archives of spam and ham, then do a sync, and copy the bayes and auto-whitelist files to each qmail user's home dir (chown popuser). I am now getting Bayes test results in the spam subject for these users.

    Does anybody know if running a system-wide Bayes DB would be better than the default way this works? Or should I not worry about it now that I'm seeded?
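    The copy step described above could be scripted roughly as follows. This is only a sketch under assumptions: the file names (bayes_toks, bayes_seen, auto-whitelist) are those of a default DBM-backed SpamAssassin setup, each mailbox is assumed to keep them in a .spamassassin subdirectory, and seed_mailboxes is a made-up helper name, not part of Plesk or SpamAssassin.

```python
import shutil
from pathlib import Path

# Assumed file names for a default DBM-backed SpamAssassin Bayes setup.
BAYES_FILES = ("bayes_toks", "bayes_seen", "auto-whitelist")


def seed_mailboxes(source_dir, mailbox_dirs, owner=None):
    """Copy trained Bayes files from source_dir into each mailbox.

    source_dir   -- directory holding the files produced by sa-learn
    mailbox_dirs -- iterable of per-user mail directories (hypothetical paths)
    owner        -- optional user name to chown the copies to, e.g. "popuser"
    Returns the list of target .spamassassin directories that were written.
    """
    source = Path(source_dir)
    seeded = []
    for mailbox in mailbox_dirs:
        # Assumption: per-user SpamAssassin state lives in <mailbox>/.spamassassin
        target = Path(mailbox) / ".spamassassin"
        target.mkdir(parents=True, exist_ok=True)
        if owner is not None:
            shutil.chown(target, user=owner)
        for name in BAYES_FILES:
            src = source / name
            if not src.exists():
                # Skip files that the training run did not produce.
                continue
            dst = target / name
            shutil.copy2(src, dst)
            if owner is not None:
                # Changing ownership normally requires root.
                shutil.chown(dst, user=owner)
        seeded.append(target)
    return seeded
```

    It would be called with root's trained directory and the list of mail directories, for example seed_mailboxes("/root/.spamassassin", dirs, owner="popuser"), where both paths depend on the local install.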

  2. eddyweddy

    eddyweddy Guest

    Hi SecondPhase,

    Now that you are "seeded", it will work as well as a site-wide setup. But if I remember correctly, Bayes entries do get aged off the system after a while. So what you are locking yourself into is a cycle of periodic updates to keep the user Bayes DBs working.

    A site-wide Bayes DB has the benefit of being easier to update manually. The problem with a system-wide Bayes DB is that user-submitted training will not be fed into it.

    You can check the thread "Semi-automated site wide Bayes training" on the EV1servers forum. It covers implementing a site-wide Bayes DB and getting the users to forward spam messages to a spam mailbox on your system, which then gets fed back into the Bayes DB.
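    Feeding a shared spam mailbox back into a site-wide DB comes down to running sa-learn against that mailbox on a schedule. A minimal sketch of building that command line is below. The sa-learn options used (--spam, --ham, --mbox, --dbpath) are standard, but sa_learn_command is a hypothetical helper and any paths passed to it are assumptions about the local setup, not taken from the linked thread.

```python
def sa_learn_command(kind, source, dbpath=None, mbox=False):
    """Build an sa-learn argument list for training on a mailbox or directory.

    kind   -- "spam" or "ham"
    source -- path to a message file, a directory of messages, or an mbox file
    dbpath -- optional Bayes DB path (bayes_path form) to train instead of
              the invoking user's default DB
    mbox   -- set True when source is an mbox file
    """
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    cmd = ["sa-learn", "--" + kind]
    if dbpath:
        cmd += ["--dbpath", dbpath]
    if mbox:
        cmd.append("--mbox")
    cmd.append(source)
    return cmd
```

    The resulting list can be handed to subprocess.run(cmd, check=True) from a cron job. Building the command separately from running it makes it easy to print and check before it touches the live DB.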

    I find that users in general do NOT do much training, so their personal Bayes filters don't get optimised. As a result, site-wide works better for me.

    In fact, one user has pointed out to me that he finds the way training is done on Plesk rather clunky. E.g., he has to log in to webmail to check his email, and if any spam has been missed, he then has to open a separate window and log in to Plesk to do the training.

    What I am trying to figure out in the long run is whether it is possible to use BOTH the site-wide Bayes DB and the personal Bayes DBs together.