r***@gmail.com
2016-12-09 20:42:25 UTC
Hi everyone,
I'm seeing some very strange Postfix behavior.
I have VMware running in two different datacenters.
This machine was working perfectly in DC-A.
After moving it to DC-B, Postfix takes too long to respond to HELO / EHLO commands. In some cases the client hangs and the connection freezes.
My first thought was to check DNS. I tried Google's and some other public DNS servers, but that made it even worse (hmm, this might be the point of failure). The best performance I get is with a local DNS server.
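To put numbers on the DNS suspicion, a minimal sketch like the one below could time reverse (PTR) lookups for the client IPs from the Postfix box. The function name `time_reverse_lookup` and the `resolver` parameter are my own for illustration; it uses only the system resolver through Python's standard `socket` module, so it measures whatever /etc/resolv.conf points at.

```python
import socket
import time


def time_reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return (hostname or None, elapsed seconds) for a reverse lookup of ip.

    `resolver` is a hypothetical hook so the lookup can be swapped out;
    by default the system resolver is used.
    """
    start = time.monotonic()
    try:
        # gethostbyaddr returns (hostname, aliases, addresses)
        name = resolver(ip)[0]
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError
        name = None
    return name, time.monotonic() - start
```

Calling `time_reverse_lookup("192.0.2.10")` with a real client address in place of the placeholder, once per resolver setup, would show whether lookups that fail or take several seconds line up with the slow HELO / EHLO responses.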
After that, I reinstalled Gentoo, and once everything was back up I had the same problem.
I started by sending a single email. When I try about 40 consecutive connections on different IPs (I have Postfix working on 250 IPs on this same box), it more or less stops working properly.
I'm able to connect to SMTP on port 25, but after I type HELO HOST, the connection freezes.
In MAIL.LOG, there are some unusual entries:
connect from unknown[unknown]
Dec 9 20:13:24 smtp-dev-spo postfix/smtpd[393313]: lost connection after CONNECT from unknown[unknown]
Dec 9 20:13:24 smtp-dev-spo postfix/smtpd[393313]: disconnect from unknown[unknown] commands=0/0
Dec 9 20:13:38 smtp-dev-spo postfix/anvil[396384]: statistics: max connection rate 1/60s for (smtp:unknown) at Dec 9 20:10:18
Dec 9 20:13:38 smtp-dev-spo postfix/anvil[396384]: statistics: max connection count 1 for (smtp:unknown) at Dec 9 20:10:18
The "unknown" entries caught my attention. I'm connecting from the local (public) IP.
I then tried disabling ANVIL and SCACHE, but it got even worse.
Another strange behavior: when I connect to SMTP, after waiting on the EHLO command, the server returns this:
cache btree:/var/lib/mailservers/XXXX.CCC.com/verify_cache full cleanup: retained=0 dropped=0 entries
I have 250 instances of Postfix running on the same box. It works smoothly in DC-A, but I can't get it to work in the other one.
I'm using Gentoo with Postfix + SASL + Courier authlib.
MySQL is running locally; I also switched to another, much larger server we have, and the symptom persists.
Any tips would be much appreciated.
Thanks a lot.
BR,
Rafael