
spamd SIGHUP not working, OPTIONS changed

SigmundS

Hello there, I don't know for sure if this belongs here, but you'll tell me if not ;).
I'm using Ubuntu 14.04.2 LTS with Plesk version 12.0.18 Update #38. SpamAssassin (spamd) 3.4.0 was already installed. The system has been up and running for 2 weeks now. spamd is activated and configured in Plesk. The options are "Switch on server-wide SpamAssassin spam filtering" and "Apply individual settings to spam filtering", so every user can do as he pleases. The rest is default. spamd stops working after approx. 48h. Here is what I did to get spamd working... (please read the whole text; some of what I write is for your information, because I might have missed something)

spamd stopped working, so I had a look at the logfile (/opt/psa/var/log/maillog) and found this:

Mar 7 10:54:39 MYserverID spamc[19200]: connect to spamd on ::1 failed, retrying (#1 of 3): Connection refused
Mar 7 10:54:39 MYserverID spamc[19200]: connect to spamd on 127.0.0.1 failed, retrying (#1 of 3): Connection refused
Mar 7 10:54:40 MYserverID spamc[19200]: connect to spamd on ::1 failed, retrying (#2 of 3): Connection refused
Mar 7 10:54:40 MYserverID spamc[19200]: connect to spamd on 127.0.0.1 failed, retrying (#2 of 3): Connection refused
Mar 7 10:54:41 MYserverID spamc[19200]: connect to spamd on ::1 failed, retrying (#3 of 3): Connection refused
Mar 7 10:54:41 MYserverID spamc[19200]: connect to spamd on 127.0.0.1 failed, retrying (#3 of 3): Connection refused
Mar 7 10:54:41 MYserverID spamc[19200]: connection attempt to spamd aborted after 3 retries

I logged in on the shell and tried to restart spamd (service spamassassin restart) as I always did. What I got was this:

SpamAssassin Mail Filter Daemon: No /usr/bin/perl found running; none killed

spamd wasn't running; that's why spamc couldn't find it. The PID of spamd was different from the one spamc was looking for. Let's have another look at the log after the restart.
Mar 7 19:32:41 MYserverID spamd[21617]: logger: removing stderr method
Mar 7 19:32:42 MYserverID spamd[21622]: zoom: able to use 323/360 'body_0' compiled rules (89.722%)
Mar 7 19:32:43 MYserverID spamd[21622]: spamd: server started on IO::Socket::INET6 [127.0.0.1]:783, IO::Socket::INET6 [::1]:783 (running version 3.4.0)
Mar 7 19:32:43 MYserverID spamd[21622]: spamd: server pid: 21622
Mar 7 19:32:43 MYserverID spamd[21622]: spamd: server successfully spawned child process, pid 21623
Mar 7 19:32:43 MYserverID spamd[21622]: spamd: server successfully spawned child process, pid 21624
Mar 7 19:32:43 MYserverID spamd[21622]: prefork: child states: IS
Mar 7 19:32:43 MYserverID spamd[21622]: prefork: child states: II

The restart started spamd and spawned two child processes, and spamc was happy. I had a look in the spamd.pid and the PID was 22182, but the log said 21622. In the spamd documentation I found this:

-r pidfile, --pidfile Write the process id to pidfile

Note: If spamd receives a SIGHUP, it internally reloads itself, which means that it will change its pid and might not restart at all if its environment changed (ie. if it can't change back into its own directory). If you plan to use SIGHUP, you should always start spamd with the -r switch to know its current pid.

Sounds like my problem. BOOYAAHH, I solved it without whining in some forums ;) :p. I added the option "--pidfile" to the config file in "/etc/default/spamassassin". It worked the first 24h. The next morning spamd was running and filtering and everything was fine. After the second 24h I got spam. Again. Hm, sh**. I thought I had it.
Another look in the log:

spamd: server killed by SIGTERM, shutting down

What is that?

The kill(2) system call sends a specified signal to a specified process, if permissions allow. Similarly, the kill(1) command allows a user to send signals to processes.

I didn't send anything. But who does? I haven't found that out yet. A look at the config file in "/etc/default/spamassassin" revealed that it had been altered. I added the option "--pidfile" again and had a look at the date. After 4 days spamd stopped working. So spamd runs for two days, then the config is altered, and after another two days SIGTERM seems to kill the spamd process (I am not 100% sure about the timing; the altering could happen after 25h or 36h).
I tried to change the file properties of the config in "/etc/default/spamassassin" to read-only, but to my surprise that didn't work.

So I (seem to) have a solution for my problem, but something keeps rewriting that config file after a certain time. Is there an option in Plesk that I missed? A cron job I didn't see? HELP ME PLEASE! Great, I'm whining. Sorry.

Thx 4 F1.

st.
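
Side note on the pidfile mismatch described above (spamd.pid said 22182, the log said 21622): a check like that can be scripted. The following is only a minimal sketch in Python, assuming the log format shown in this post; the function names are made up for illustration and are not part of SpamAssassin or Plesk.

```python
import re

# Matches the line spamd writes at startup, e.g. "spamd: server pid: 21622".
SERVER_PID = re.compile(r"spamd\[\d+\]: spamd: server pid: (\d+)")


def last_logged_pid(log_lines):
    """Return the PID from the most recent 'server pid' line, or None."""
    pid = None
    for line in log_lines:
        match = SERVER_PID.search(line)
        if match:
            pid = int(match.group(1))
    return pid


def pidfile_matches(pidfile_text, log_lines):
    """True when the pidfile content equals the PID spamd last logged."""
    logged = last_logged_pid(log_lines)
    return logged is not None and pidfile_text.strip() == str(logged)


log = [
    "Mar 7 19:32:43 MYserverID spamd[21622]: spamd: server pid: 21622",
]
print(pidfile_matches("22182\n", log))  # False: the pidfile is stale
print(pidfile_matches("21622\n", log))  # True
```

If the two values differ, the init script may be looking for a process that no longer exists, which would fit the "none killed" message from the restart.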
 
Hello again! No one? TL;DR ??? Who cares, maybe someone can use it some day ^^ ;).
But I found something else: a bug that seems to be my problem.

/etc/init.d/spamassassin reload fails unless /var/run/spamd/ exists for pid file

Found here: https://bugs.launchpad.net/ubuntu/+source/spamassassin/+bug/1421787

I did what was suggested there to solve the problem. Although I use a newer version, the bug still seems to be active.

Let's hope for the best.

st.
 
I think I have another problem.
I created a folder /run/spamd and put the spamd.pid in there like the LP link said. But today the file is gone, while the folder is still there. And I found out that every day at around 10:28:58 SIGTERM kills spamd.
WHY THE F*** does it do this? I'm slowly losing patience with this... (not your fault, but I hate it when I try to fix something and I don't understand how things work)
Mar 17 10:28:58 MyServerID spamd[22143]: spamd: server killed by SIGTERM, shutting down

Why does it kill spamd? Why 10:28:58 EVERY DAY? How can I find out what happens around 10:28:58?

st.
 
OK, I couldn't find a cron task matching that specific time, but I had a long and close look at the syslog files. First I thought this could be part of the problem:
Mar 18 10:28:28 MYserverID postfix/master[30899]: message repeated 4 times: [ warning: master_wakeup_timer_event: service pickup(public/pickup): Connection refused]
Mar 18 10:28:38 MYserverID postfix/master[30899]: warning: master_wakeup_timer_event: service qmgr(public/qmgr): Connection refused
Mar 18 09:23:18 MYserverID spamd[21096]: message repeated 29 times: [ prefork: child states: II]
Mar 18 10:29:21 MYserverID spamd[21096]: spamd: server killed by SIGTERM, shutting down
Mar 18 10:29:28 MYserverID postfix/master[30899]: warning: master_wakeup_timer_event: service pickup(public/pickup): Connection refused

But I found a thread (http://talk.plesk.com/threads/postfix-master-connection-refused.303699/) that says it's just part of the default master.cf of postfix.

There is no cron job close enough to that SIGTERM kill (today at 10:29:21) that could have caused it, just two default crons from Plesk itself.
Mar 18 10:25:01 MYserverID CRON[28889]: (root) CMD (test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.daily ))
Mar 18 10:25:01 MYserverID CRON[28891]: (root) CMD ([ -x /opt/psa/admin/sbin/backupmng ] && /opt/psa/admin/sbin/backupmng >/dev/null 2>&1)

So I am pretty much stuck now. I have absolutely no idea where to look next. I set up a cron job that starts spamd 5 min after SIGTERM kills it, so at least it keeps running and filtering. But I hate duct-tape solutions...

HELP!!!!!

st.
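
Side note: the manual search described above (which cron jobs ran shortly before the SIGTERM?) can be automated. This is a rough Python sketch under a few assumptions: the syslog timestamp format shown in this post, a year that has to be supplied because syslog omits it, and a 10-minute window picked arbitrarily. The function names are hypothetical.

```python
import re
from datetime import datetime, timedelta

# Classic syslog prefix: "Mar 18 10:25:01 host ..."
STAMP = re.compile(r"^(\w{3})\s+(\d{1,2})\s+(\d\d:\d\d:\d\d)\s")


def parse_stamp(line, year=2015):
    """Parse the leading syslog timestamp; the year is an assumption."""
    match = STAMP.match(line)
    if match is None:
        return None
    month, day, clock = match.groups()
    return datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")


def cron_before_sigterm(lines, window_seconds=600):
    """For each SIGTERM line, collect CRON lines logged up to window_seconds earlier."""
    stamped = [(parse_stamp(line), line) for line in lines]
    stamped = [(when, line) for when, line in stamped if when is not None]
    window = timedelta(seconds=window_seconds)
    hits = []
    for when, line in stamped:
        if "killed by SIGTERM" not in line:
            continue
        suspects = [
            other
            for seen, other in stamped
            if " CRON[" in other and timedelta(0) <= when - seen <= window
        ]
        hits.append((line, suspects))
    return hits


sample = [
    "Mar 18 10:25:01 MYserverID CRON[28889]: (root) CMD (test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.daily ))",
    "Mar 18 10:25:01 MYserverID CRON[28891]: (root) CMD ([ -x /opt/psa/admin/sbin/backupmng ] && /opt/psa/admin/sbin/backupmng >/dev/null 2>&1)",
    "Mar 18 10:29:21 MYserverID spamd[21096]: spamd: server killed by SIGTERM, shutting down",
]
for sigterm_line, suspects in cron_before_sigterm(sample):
    print(sigterm_line)
    for suspect in suspects:
        print("  candidate:", suspect)
```

This only narrows down candidates. A job started by cron.daily can run for several minutes before it signals anything, so a hit here is a hint, not proof.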
 
Hi SigmundS,

could you please check, that the postfix configuration for "qmgr" and "pickup" at "/etc/postfix/master.cf" is
Code:
qmgr fifo n - n 1 1 qmgr
pickup fifo n - - 60 1 pickup

... and not "unix" instead of "fifo"? If the general configuration is still existent and Plesk just added the correct lines under the general basic configuration, please comment the "unix" - lines out ( with an "#" at the front of the line ) and restart postfix ( service postfix restart or /etc/init.d/postfix restart ).
 

Yes I could, yes it was, yes I did.

Can't you help me with the other problem as well? That would be more to my liking ^^ ;).
So now I have fixed a problem I didn't know I had, because it didn't affect me, and I still have a problem I can't solve because I don't know where to look :(.

st.
 
Hi SigmundS,

I haven't come across the issue you described in your posts myself, and therefore haven't suggested anything for it yet... It could be that your daily log rotation for postfix is misconfigured, or that the SIGTERM kill was triggered by your "unix"/"fifo" errors.
The time you quoted for your daily cron run seems to point to a log rotation issue as well:
Mar 18 10:25:01 MYserverID CRON[28889]: (root) CMD (test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.daily ))

To investigate whether the daily log rotation might be the issue, please post the corresponding logrotate file (usually named "rsyslog" in "/etc/logrotate.d" on Ubuntu systems).
 
Hi UFHH01,
thx for the hint. I checked the syslog file again and the error is gone. Before, it came every 5 min. Since the change and restart around 19:40 it hasn't come back. So far so good. Here is the rsyslog file:

/var/log/syslog
{
rotate 7
daily
missingok
notifempty
delaycompress
compress
postrotate
reload rsyslog >/dev/null 2>&1 || true
endscript
}

/var/log/mail.info
/var/log/mail.warn
/var/log/mail.err
/var/log/mail.log
/var/log/daemon.log
/var/log/kern.log
/var/log/auth.log
/var/log/user.log
/var/log/lpr.log
/var/log/cron.log
/var/log/debug
/var/log/messages
{
rotate 4
weekly
missingok
notifempty
compress
delaycompress
sharedscripts
postrotate
reload rsyslog >/dev/null 2>&1 || true
endscript
}

I can't say if there is something wrong. What I can say is that the following files listed above don't exist:

/var/log/mail.info
/var/log/mail.warn
/var/log/daemon.log
/var/log/user.log
/var/log/lpr.log
/var/log/cron.log
/var/log/debug

Hope you'll find something.

st.
 
Hi SigmundS,

please change the lines:
Code:
postrotate
reload rsyslog >/dev/null 2>&1 || true
endscript

to:
Code:
    postrotate
        reload rsyslog >/dev/null 2>&1 || true
    endscript
    su root root

This should solve issues with wrong owner rights after the logrotation.
 
Hi UFHH01,
I changed the lines as you said. I'll have a look at the log files tomorrow to see if the SIGTERM kill still occurs.

st.

EDIT 19.03.2015 - 15:55

That's the log file from today, still with a SIGTERM kill, but in the same second the cron job "service spamassassin restart" starts. So I think this is related. The cron job was executed yesterday and today. So I'll remove the cron job to see if it happens tomorrow. Stay tuned. Here is the log, btw:

Mar 19 10:33:01 MYserverID CRON[19508]: (root) CMD (service spamassassin restart)
Mar 19 09:51:38 MYserverID spamd[18461]: message repeated 26 times: [ prefork: child states: II]
Mar 19 10:33:01 MYserverID spamd[18461]: spamd: server killed by SIGTERM, shutting down
Mar 19 10:33:03 MYserverID spamd[19514]: logger: removing stderr method
Mar 19 10:33:04 MYserverID spamd[19516]: zoom: able to use 324/360 'body_0' compiled rules (90%)
Mar 19 10:33:04 MYserverID spamd[19516]: spamd: server started on IO::Socket::INET6 [127.0.0.1]:783, IO::Socket::INET6 [::1]:783 (running version 3.4.0)
Mar 19 10:33:04 MYserverID spamd[19516]: spamd: server pid: 19516
Mar 19 10:33:04 MYserverID spamd[19516]: spamd: server successfully spawned child process, pid 19517
Mar 19 10:33:04 MYserverID spamd[19516]: spamd: server successfully spawned child process, pid 19518
Mar 19 10:33:04 MYserverID spamd[19516]: prefork: child states: IS
Mar 19 10:33:04 MYserverID spamd[19516]: prefork: child states: II

This is the exact order of the log file; funny that there is an entry from 9:51:38 between the two 10:33:01 entries.

st.
 
Hi UFHH01,
the problem seems to be solved. WOOHHOO. Finally. Today no SIGTERM kill in the log file and the spamd is still running with correct ID in pidfile and everything. I hope this is it.
Thanks for your help. :):):)

st.
 
Oh well, it's been 8 days, here we go again. You know the drill: spamd killed with SIGTERM at around 10:29. I had a look at this line:
Mar 29 09:44:25 myserverid spamd[19516]: message repeated 61 times: [ prefork: child states: II]

And found something here (http://www.linuxforen.de/forums/showthread.php?270414-spamd-prefork-child-states-II) and a solution here (http://serversupportforum.de/forum/227630-post10.html).
But this line "spamassassin_destination_recipient_limit=1" doesn't exist in my version of the main.cf.

So any new ideas?

st.
 
Hi SigmundS,

Correct, this doesn't exist, because you didn't define it. You might notice that the poster peter123 stated that he added this setting in order to hand over only one recipient at a time instead of the default maximum of 50. If a setting is not defined in postfix, the default value is used (documented at: http://www.postfix.org/postconf.5.html, "default_destination_recipient_limit (default: 50)").

Be sure to make a backup of your customized postfix main.cf, because Plesk might overwrite any changes during updates/upgrades or patches.
 
This is the only thing I found. I don't really think this has to do with this problem, but then I didn't think I would get this far with the changes you suggested either. I don't change files that Plesk generates, because if there is a solution within Plesk, that is the better choice: the changes won't get undone by future updates.
Is there a way to see where a SIGTERM comes from? Is there a log I can activate to see something like this? There must be something that sends it.

st.
 
I found something about the SIGTERM signal in the GNU C Library:

Macro: int SIGTERM
The SIGTERM signal is a generic signal used to cause program termination. Unlike SIGKILL, this signal can be blocked, handled, and ignored. It is the normal way to politely ask a program to terminate.
The shell command kill generates SIGTERM by default.

The last part sounds like someone would write a "kill spamd" in the ssh shell, but I don't think so. Ain't nobody got time for that ;). But the first part is also interesting. Can I tell spamd to simply ignore the SIGTERM???

st.
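
Side note on the glibc quote above: "can be blocked, handled, and ignored" is easy to try out. The sketch below is plain Python and only demonstrates the signal semantics on the script itself. It does not change spamd in any way; spamd is a Perl program, and the "killed by SIGTERM, shutting down" log line suggests it already handles the signal in order to shut down cleanly.

```python
import os
import signal

received = []


def on_sigterm(signum, frame):
    """Record the signal instead of letting the default action end the process."""
    received.append(signum)


# Install the handler, then send SIGTERM to ourselves.
# This is the same signal the shell command `kill <pid>` sends by default.
signal.signal(signal.SIGTERM, on_sigterm)
os.kill(os.getpid(), signal.SIGTERM)
signal.signal(signal.SIGTERM, signal.SIG_DFL)  # restore the default action

# SIGKILL, by contrast, cannot be caught: installing a handler is refused.
try:
    signal.signal(signal.SIGKILL, on_sigterm)
    sigkill_catchable = True
except OSError:
    sigkill_catchable = False

print("SIGTERM handled:", received == [signal.SIGTERM])
print("SIGKILL catchable:", sigkill_catchable)
```

Making a daemon ignore SIGTERM would probably cause more trouble than it solves, since a regular stop or restart of the service typically relies on that signal. Finding the sender is the better route.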
 
I have the same problem: spamd stops every day at the same time (I think it is related to the daily log rotation).

spamd: server killed by SIGTERM, shutting down

sometimes it is started right after shutdown and i get

logger: removing stderr method

but most of the time it is not started after shutdown.

I have tried the change UFHH01 suggested above for /etc/logrotate.d/rsyslog (adding "su root root" after the postrotate block), but to no effect.
 
I still haven't found a solution, or the reason why it still happens sometimes, but I have a script that watches spamd and restarts it if it's killed.
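
For anyone who wants to try the same workaround, here is a minimal sketch of what such a watchdog could look like in Python. It is not the poster's script. The pidfile path is an assumption (use whatever was passed to spamd with --pidfile), the restart command is the one used earlier in this thread, and the function names are made up.

```python
import os
import subprocess

PIDFILE = "/var/run/spamd.pid"  # assumed path; adjust to your --pidfile setting
RESTART = ["service", "spamassassin", "restart"]  # command used earlier in the thread


def read_pid(path):
    """Return the PID stored in the pidfile, or None if missing or malformed."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


def is_alive(pid):
    """Signal 0 only checks existence and permissions; nothing is delivered."""
    if pid is None:
        return False
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def ensure_running(pidfile=PIDFILE, restart=lambda: subprocess.call(RESTART)):
    """Restart spamd when the pidfile is absent or points at a dead process."""
    if is_alive(read_pid(pidfile)):
        return False
    restart()
    return True
```

Calling ensure_running() from a cron job every few minutes would do the watching. Note the weak spot this thread already ran into: if the pidfile is stale and its PID has been reused by an unrelated process, the check reports "alive" even though spamd is down.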
 