Hi,
Today my server suddenly stopped sending email messages.
In the log I can find messages like:
Apr 2 13:18:24 xxx postfix/spawn[31138]: warning: /usr/lib/plesk-9.0/postfix-srs: process id 31139: command time limit exceeded
...
Apr 2 13:18:29 xxx postfix/smtpd[2894]: warning: connect to TCP map 127.0.0.1:12346: Connection timed out
...
Apr 2 13:18:42 xxx postfix/master[25462]: warning: service "smtp" (25) has reached its process limit "100": new clients may experience noticeable delays
Apr 2 13:18:42 xxx postfix/master[25462]: warning: to avoid this condition, increase the process count in master.cf or reduce the service time per client
Apr 2 13:18:42 xxx postfix/master[25462]: warning: see Postfix Stress-Dependent Configuration for examples of stress-adapting configuration settings
I figured out that the error occurs because the server is not closing connections to 127.0.0.1:12346, which causes it to run out of available Postfix connections.
[root@xxx]/>netstat -anvpt | grep 12346 | grep EST
tcp 0 0 127.0.0.1:12346 127.0.0.1:59268 ESTABLISHED 16874/spawn
tcp 0 0 127.0.0.1:12346 127.0.0.1:59504 ESTABLISHED 17273/spawn
tcp 0 0 127.0.0.1:60448 127.0.0.1:12346 ESTABLISHED 17934/cleanup
tcp 0 0 127.0.0.1:12346 127.0.0.1:59764 ESTABLISHED 17474/spawn
tcp 0 0 127.0.0.1:59598 127.0.0.1:12346 ESTABLISHED 17281/smtpd
tcp 0 0 127.0.0.1:12346 127.0.0.1:60898 ESTABLISHED 18529/spawn
tcp 0 0 127.0.0.1:60446 127.0.0.1:12346 ESTABLISHED 17912/smtpd
tcp 0 0 127.0.0.1:59764 127.0.0.1:12346 ESTABLISHED 17473/cleanup
...
[root@xxx]/>netstat -anvpt | grep 12346 | grep EST | wc -l
114
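To see which programs are holding these connections, a per-program tally may be more useful than the raw count. The following is only a minimal sketch: it assumes the netstat column layout shown above (state in column 6, PID/program in column 7), and the function name count_by_process is made up for illustration.

```shell
# Tally ESTABLISHED TCP connections that touch a given port, grouped by
# the owning program name. Reads "netstat -anpt"-style lines on stdin.
# The port defaults to 12346 (the postfix-srs TCP map port in this post).
count_by_process() {
    port="${1:-12346}"
    awk -v port="$port" '
        # Column 4 is the local address, column 5 the foreign address.
        $6 == "ESTABLISHED" && ($4 ~ (":" port "$") || $5 ~ (":" port "$")) {
            split($7, owner, "/")   # column 7 looks like "16874/spawn"
            count[owner[2]]++
        }
        END {
            for (name in count) print count[name], name
        }
    ' | sort -rn
}
```

It would be run as `netstat -anpt | count_by_process`, or with a different port as `netstat -anpt | count_by_process 25`.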
[root@xxx]/>ps aux| grep srs
postfix 16871 0.0 0.0 83852 7760 ? S 00:25 0:00 spawn -n 127.0.0.1:12346 -t inet user=popuser popuser argv=/usr/lib/plesk-9.0/postfix-srs
postfix 16874 0.0 0.0 83852 7848 ? S 00:25 0:00 spawn -n 127.0.0.1:12346 -t inet user=popuser popuser argv=/usr/lib/plesk-9.0/postfix-srs
popuser 16875 0.0 0.0 26564 2532 ? Ss 00:25 0:00 /usr/lib/plesk-9.0/postfix-srs
postfix 16880 0.0 0.0 83852 7716 ? S 00:25 0:00 spawn -n 127.0.0.1:12346 -t inet user=popuser popuser argv=/usr/lib/plesk-9.0/postfix-srs
postfix 16883 0.0 0.0 83852 7760 ? S 00:25 0:00 spawn -n 127.0.0.1:12346 -t inet user=popuser popuser argv=/usr/lib/plesk-9.0/postfix-srs
popuser 16884 0.0 0.0 26564 2492 ? Ss 00:25 0:00 /usr/lib/plesk-9.0/postfix-srs
postfix 17021 0.0 0.0 83852 7812 ? S 00:25 0:00 spawn -n 127.0.0.1:12346 -t inet user=popuser popuser argv=/usr/lib/plesk-9.0/postfix-srs
popuser 17022 0.0 0.0 26564 2552 ? Ss 00:25 0:00 /usr/lib/plesk-9.0/postfix-srs
postfix 17024 0.0 0.0 83852 7808 ? S 00:25 0:00 spawn -n 127.0.0.1:12346 -t inet user=popuser popuser argv=/usr/lib/plesk-9.0/postfix-srs
popuser 17025 0.0 0.0 26564 2524 ? Ss 00:25 0:00 /usr/lib/plesk-9.0/postfix-srs
postfix 17145 0.0 0.0 83852 7784 ? S 00:26 0:00 spawn -n 127.0.0.1:12346 -t inet user=popuser popuser argv=/usr/lib/plesk-9.0/postfix-srs
...
[root@xxx]/>ps aux| grep clean
postfix 16873 0.0 0.0 84168 9460 ? S 00:25 0:00 cleanup -z -t unix -u -c
postfix 17023 0.0 0.0 84132 8920 ? S 00:25 0:00 cleanup -z -t unix -u -c
postfix 17147 0.0 0.0 84132 9116 ? S 00:26 0:00 cleanup -z -t unix -u -c
...
I can't figure out why the server is not closing those connections.
With tcpdump I can see that postfix-srs responds as it should, but the connection stays open (ESTABLISHED).
The only workaround I have found is to restart Postfix or kill all the stale processes once the process limit is reached.
What could cause this service to keep its connections open instead of closing them, and how can I fix the problem?