
Qmail-smtpd spawning many processes, using full cpu

gnosis

Guest
I have a problem with qmail-smtpd running up a bunch of processes that are consuming full cpu. I have no idea what's causing this. Here's an example top:
Code:
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
28141 qmaild    25   0  4600 1376 1168 R   27  0.1   0:28.09 qmail-smtpd
27851 qmaild    25   0  4708 1372 1168 R   26  0.1   1:23.48 qmail-smtpd
28268 qmaild    25   0  3948 1372 1168 R   24  0.1   0:05.79 qmail-smtpd
27507 qmaild    25   0  4468 1372 1168 R   23  0.1   1:56.58 qmail-smtpd
28244 qmaild    25   0  4520 1372 1168 R   20  0.1   0:06.82 qmail-smtpd
28045 qmaild    25   0  3820 1376 1168 R   20  0.1   0:48.50 qmail-smtpd
28117 qmaild    25   0  3724 1372 1168 R   20  0.1   0:33.20 qmail-smtpd
28118 qmaild    25   0  4452 1372 1168 R   20  0.1   0:33.59 qmail-smtpd
28163 qmaild    25   0  5200 1380 1168 R   20  0.1   0:25.30 qmail-smtpd
There are often a dozen of them, sometimes more. If I take steps to shut them all down, another will start at 100% cpu and then multiply until there are many, always using as much cpu as possible.

Stopping qmail service doesn't stop them. Stopping xinetd does.

This is a new install of centos 4.3 with a Plesk control panel on it. I've run several of these for years and never seen anything like this.

The mail queues remain empty and I can see no indication that the server is sending out spam. Relaying is closed. Incoming connections are steady, but relatively low...about 15-30 per minute.

The server does continue to send and receive mail properly, if slightly delayed.

I am a little worried about a compromise, however, because for the past two days, my logwatch reports have suddenly failed to include several sections that would normally be there (pam_unix, Connections (secure-log), and sshd).

Does anyone have any ideas on how I can troubleshoot this?
 
I'm facing the same problem: the server hangs and the load gets very high more than 10 times per day, especially at peak hours, with qmail running more than 30 processes.

Please help!!
 
This is insane. I have been struggling with this for 4 days. It seems that no matter what I do, qmail-smtpd wants to go apeshit, spawning dozens of processes and using 100% of my hyperthreaded P4. Even with multiple connections per second, there's no way qmail should need that much juice.

I have completely disabled Dr. Web and restored the qmail-queue file to its original state. So I don't think Dr. Web is the problem.

I have also turned off spamassassin and disabled it for all mail accounts, so it's not involved anymore either.

I've seen several mentions of this problem on various boards...but no solutions. If anyone has any thoughts, please chime in!
 
Hello,

Please check /var/qmail/control for the following 2 cert files:

dh512.pem
dh1024.pem

If you don't see them, or they exist under a different name (dhparam512.pem / dhparam1024.pem), rename them accordingly.
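
For anyone who wants to script that check, here is a minimal sketch. The function name check_dh_files is made up for illustration; the directory and the two file names are the ones given above.

```shell
#!/bin/sh
# Report whether the two DH parameter files named in this thread exist
# (and are non-empty) in a qmail control directory.
# check_dh_files is a hypothetical helper name, not part of qmail or Plesk.
check_dh_files() {
    dir="${1:-/var/qmail/control}"
    missing=0
    for f in dh512.pem dh1024.pem; do
        if [ -s "$dir/$f" ]; then
            echo "ok: $dir/$f"
        else
            echo "missing or empty: $dir/$f"
            missing=1
        fi
    done
    # Exit status 0 when both files are present, 1 otherwise.
    return "$missing"
}
```

Running `check_dh_files` with no argument looks in /var/qmail/control; pass another directory as the first argument to check a different location.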
 
Originally posted by Mirco
Hello,

Please check /var/qmail/control for the following 2 cert files:

dh512.pem
dh1024.pem

If you don't see them or with another naming (dhparam512/1024), rename them accordingly.
Holy sweet goddamn, this seems to have worked.
Code:
[root@glitternet3 control]# cp dhparam1024.pem dhparam1024.pem.bak
[root@glitternet3 control]# cp dhparam512.pem dhparam512.pem.bak
[root@glitternet3 control]# mv dhparam1024.pem dh1024.pem
[root@glitternet3 control]# mv dhparam512.pem dh512.pem
[root@glitternet3 control]# service qmail restart
Stopping : Starting qmail:                                 [FAILED]
[root@glitternet3 control]# service qmail restart
Starting qmail:                                            [FAILED]
[root@glitternet3 control]# service xinetd restart
Stopping xinetd:                                           [  OK  ]
Starting xinetd:                                           [  OK  ]
[root@glitternet3 control]# service qmail restart
Starting qmail:                                            [  OK  ]
As you can see, qmail would not restart after the change until xinetd was restarted. After the change, I watched the top readout for a couple of minutes. qmail-smtpd never spawned more than 3 processes and never used more than 6% cpu. Thank you!
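
The manual steps from that session could be wrapped up roughly like this. It is only a sketch: fix_dh_names is a made-up helper name, it assumes the misnamed files are called dhparam512.pem and dhparam1024.pem as they were on this server, and it leaves the service restarts to you.

```shell
#!/bin/sh
# Back up and rename dhparamNNN.pem to dhNNN.pem when the expected name
# is absent. fix_dh_names is a hypothetical helper, not a Plesk tool.
fix_dh_names() {
    dir="${1:-/var/qmail/control}"
    for bits in 512 1024; do
        old="$dir/dhparam$bits.pem"
        new="$dir/dh$bits.pem"
        # Only act when the expected file is missing and the misnamed one exists.
        if [ ! -e "$new" ] && [ -e "$old" ]; then
            cp -p "$old" "$old.bak"
            mv "$old" "$new"
            echo "renamed $old -> $new"
        fi
    done
}
```

After a rename, the services still need restarting; on this box that meant `service xinetd restart` followed by `service qmail restart`.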

Please, for the sake of others, and for my learning, can you tell us what you know about what those .pem files are, what this change did, why they were named wrong, and why the hell the process was going nuts? Thanks!
 
These two files hold the Diffie-Hellman parameters that qmail-smtpd uses for TLS sessions. qmail tries to open them on each incoming connection, so if they are not available it has to generate the parameters on the fly, which eats all the CPU time.
 
Thanks much for the info.

I wonder why the files were misnamed and why this happens now and then.

I have two other servers running plesk qmail and they both have the properly named files.
 
Got both files, but still facing same problem...

-su-2.05b# ls
badmailfrom concurrencyremote dh512.pem me rsa512.pem virtualdomains
clientcert.pem databytes doublebounceto rcpthosts servercert.pem
concurrencylocal dh1024.pem locals rejectnonexist smtpplugins

Anyone else?
 
This hasn't helped me... running Plesk 8.1 on CentOS 4... this is absolutely killing our server!
 
You could limit the number of incoming connections by editing /etc/xinetd.d/smtp_psa.

In particular, look for these lines and add them if they are missing. cps limits the service to 20 connections per second and pauses it for 1 second if that limit is exceeded. instances caps how many qmail-smtpd processes can run at once.

instances = 50
cps = 20 1
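
To see what a server currently has for those two settings, something along these lines should do. It is a rough sketch: show_xinetd_limits is a made-up function name, and the default path is the file mentioned above.

```shell
#!/bin/sh
# Print the first "instances" and "cps" settings found in an xinetd
# service file, or note that a setting is absent.
# show_xinetd_limits is a hypothetical helper name.
show_xinetd_limits() {
    file="${1:-/etc/xinetd.d/smtp_psa}"
    for key in instances cps; do
        # Strip leading whitespace from the first matching line, print it, stop.
        line=$(sed -n "/^[[:space:]]*$key[[:space:]]*=/{s/^[[:space:]]*//;p;q;}" "$file")
        if [ -n "$line" ]; then
            echo "$line"
        else
            echo "$key: not set"
        fi
    done
}
```

If either setting comes back as "not set", add it inside the service block and restart xinetd so the change takes effect.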

-turgut
 
This fix actually worked after a full server restart... I then had to repair a lot of MySQL tables since the heavy server load (and subsequent psa restarts by our server monitoring software) corrupted some of them.
 
phew, glad I found this thread. That fix worked perfectly for me.

Had a load of 600+ for two days since moving to centos 4.4. Moved those certs around and bingo :)

thanks dudes!
 
Thanks, this works for me too.
But now I have another problem: outgoing mail takes far too long to reach its destination, almost 4 hours.
I don't know whether fixing the runaway qmail processes could have caused this. Anyone?
 
Could be a DNS resolution problem ... have you checked your DNS is all resolving correctly?
 
Originally posted by Mirco
Hello,

Please check /var/qmail/control for the following 2 cert files:

dh512.pem
dh1024.pem

If you don't see them or with another naming (dhparam512/1024), rename them accordingly.

Thanks for this tip, it seems to have worked on 3 FreeBSD 6.x & Plesk 8.x servers. No more high load and no more runaway process spawning.

Thanks!
 
I had a similar problem but for me the fix appeared to be to do:
/etc/init.d/psa-spamassassin stop

I also renamed the PEM files but at first this didn't appear to fix the problem.

I waited a while and then did
/etc/init.d/psa-spamassassin start

and the problem did not re-appear. Thanks to the post I was at least able to isolate the problem.
 