Mail infrastructure is unforgiving. A web server going down for ten minutes is annoying. A mail server going down for ten minutes means queued mail, delayed delivery reports, and clients asking what happened to their email. For business-critical mail systems, a single point of failure is unacceptable.
HAProxy solves this elegantly for SMTP and SMTP submission. This article covers the actual configuration we use for active-passive mail server failover — not a tutorial-level overview, but the real thing with health checks, SMTP proxy mode, and stick tables.
The problem with a single SMTP server
A typical mail setup has a single server handling both inbound delivery (port 25) and authenticated submission from mail clients (port 587). If that server goes down, inbound mail from other mail servers queues at the senders, typically for several days, before bouncing. Submission from clients fails immediately.
For a small deployment, this is acceptable downtime. For infrastructure handling mail for hundreds of domains, it's not.
The standard approach is active-passive: a primary mail server handles all traffic, a secondary stays warm and ready, and an HAProxy instance in front routes traffic and handles the failover. Because HAProxy operates at layer 4 (TCP), it can handle SMTP without needing to understand the SMTP protocol in detail.
HAProxy in TCP mode for SMTP
SMTP is a text protocol that starts with a server greeting. This complicates TCP proxying slightly — unlike HTTP, the server speaks first. HAProxy's tcp mode handles this correctly without any special configuration.
The basic frontend and backend for port 25:
frontend smtp_inbound
    bind *:25
    mode tcp
    timeout client 1m
    default_backend mail_servers

backend mail_servers
    mode tcp
    timeout connect 10s
    timeout server 1m
    option tcp-check
    tcp-check connect port 25
    tcp-check expect string "220 "
    server mail1 192.168.1.10:25 check
    server mail2 192.168.1.11:25 check backup
The backup flag on mail2 means it only receives traffic when mail1 fails health checks. This is active-passive failover.
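The selection rule behind the backup flag is simple enough to sketch. A minimal Python model of the semantics (the Server type and function name are ours for illustration, not HAProxy internals):

```python
from dataclasses import dataclass

@dataclass
class Server:
    name: str
    up: bool            # passing health checks?
    backup: bool = False

def pick_server(servers):
    """Model HAProxy's active-passive choice: prefer any UP
    non-backup server; fall back to an UP backup; else None."""
    for s in servers:
        if s.up and not s.backup:
            return s
    for s in servers:
        if s.up and s.backup:
            return s
    return None

pool = [Server("mail1", up=True), Server("mail2", up=True, backup=True)]
assert pick_server(pool).name == "mail1"  # primary healthy: backup stays idle
pool[0].up = False                        # primary fails health checks
assert pick_server(pool).name == "mail2"  # traffic fails over to the backup
```

When mail1 passes health checks again, HAProxy routes new connections back to it, which is the behavior the stick tables below are there to soften.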
Health checks that actually work for SMTP
A TCP health check that merely verifies the port is open isn't sufficient. An SMTP server might accept connections but not be accepting mail (because a milter is rejecting everything, for example, or because a crash has left the socket open with nothing working behind it). We want health checks that verify the SMTP greeting itself.
The tcp-check expect string "220 " above checks for the SMTP greeting banner. Most SMTP servers return 220 hostname ESMTP — checking for "220 " (with a space) catches the standard banner while being flexible about what follows.
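The test HAProxy applies fits in one line. A Python sketch of the same logic (the function name is ours; note that expect string matches anywhere in the response, not just at the start, so a multiline greeting whose final line is "220 host ..." still passes):

```python
def smtp_banner_ok(response: str) -> bool:
    """Model `tcp-check expect string "220 "`: pass if the string
    "220 " (code plus space) appears anywhere in the response."""
    return "220 " in response

assert smtp_banner_ok("220 mail.example.com ESMTP Postfix\r\n")
assert smtp_banner_ok("220-welcome\r\n220 mail.example.com ESMTP\r\n")
assert not smtp_banner_ok("421 mail.example.com Service not available\r\n")
```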
For submission (port 587), the check is the same but on the submission port:
frontend smtp_submission
    bind *:587
    mode tcp
    timeout client 1m
    default_backend submission_servers

backend submission_servers
    mode tcp
    timeout connect 10s
    timeout server 5m
    option tcp-check
    tcp-check connect port 587
    tcp-check expect string "220 "
    server mail1 192.168.1.10:587 check
    server mail2 192.168.1.11:587 check backup
Note the longer timeout server for submission — authenticated mail clients can take a long time to upload a message.
Stick tables for connection persistence
SMTP sessions negotiate TLS and sometimes authentication state. HAProxy never moves an established TCP connection between servers, but a client that opens several connections in quick succession benefits from landing on the same backend every time. Stick tables give HAProxy that memory: a reconnecting client is routed to the server its source IP last used.
backend submission_servers
    mode tcp
    stick-table type ip size 10k expire 30m
    stick on src
    ...
This creates a source-IP stick table with 10,000 entries, expiring after 30 minutes of inactivity. A mail client submitting messages keeps hitting the same backend server for 30 minutes. When the primary fails and clients reconnect, they'll route to the backup for the remainder of their stick table entry.
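Conceptually, the stick table is a bounded map from source IP to server, with entries that expire after inactivity. A rough Python model of that behavior (a toy, not HAProxy's implementation; eviction details in real HAProxy differ):

```python
import time

class StickTable:
    """Toy model of `stick-table type ip size 10k expire 30m` with
    `stick on src`: remember which server a source IP last used."""
    def __init__(self, size=10_000, expire=30 * 60):
        self.size, self.expire = size, expire
        self.entries = {}  # src ip -> (server, last_seen)

    def lookup(self, src, now=None):
        now = time.time() if now is None else now
        hit = self.entries.get(src)
        if hit and now - hit[1] < self.expire:
            self.entries[src] = (hit[0], now)  # refresh on use
            return hit[0]
        self.entries.pop(src, None)            # expired or absent
        return None

    def learn(self, src, server, now=None):
        now = time.time() if now is None else now
        if len(self.entries) >= self.size:
            # table full: evict the oldest entry (simplified policy)
            oldest = min(self.entries, key=lambda k: self.entries[k][1])
            del self.entries[oldest]
        self.entries[src] = (server, now)

table = StickTable()
table.learn("203.0.113.7", "mail1", now=0)
assert table.lookup("203.0.113.7", now=60) == "mail1"         # sticky
assert table.lookup("203.0.113.7", now=60 + 31 * 60) is None  # idle 31m: expired
```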
Handling PROXY protocol
If your mail server software supports the PROXY protocol (Postfix does via its smtpd_upstream_proxy_protocol setting; Zimbra's bundled Postfix does too), you can configure HAProxy to send the original client IP to the mail server rather than HAProxy's internal IP. This matters for per-IP rate limiting and logging.
backend mail_servers
    mode tcp
    server mail1 192.168.1.10:25 check send-proxy
    server mail2 192.168.1.11:25 check backup send-proxy
On the Postfix side, you define a dedicated smtpd service in master.cf (typically on port 10025) with -o smtpd_upstream_proxy_protocol=haproxy, so that service expects the PROXY header before the SMTP session begins.
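What send-proxy actually puts on the wire is a single text line (PROXY protocol v1) prepended before any SMTP traffic. A sketch of constructing one, following the published format:

```python
def proxy_v1_header(src_ip, src_port, dst_ip, dst_port):
    """Build a PROXY protocol v1 header: protocol, source address,
    destination address, source port, destination port, CRLF."""
    return f"PROXY TCP4 {src_ip} {dst_ip} {src_port} {dst_port}\r\n"

hdr = proxy_v1_header("203.0.113.7", 52311, "192.168.1.10", 25)
assert hdr == "PROXY TCP4 203.0.113.7 192.168.1.10 52311 25\r\n"
```

The mail server reads this line, records 203.0.113.7 as the client address, and only then begins the normal SMTP dialogue with the 220 greeting.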
We don't use this in all deployments — the added complexity is worthwhile for large deployments where per-IP rate limiting is important, but overkill for smaller setups.
Monitoring and alerting
HAProxy's stats socket gives you a live view of backend health. To enable it:
global
    stats socket /var/run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
Then:
echo "show servers state" | socat stdio /var/run/haproxy/admin.sock
This returns the current state of every backend server including whether it's UP, DOWN, or DRAIN, the last check result, and how long it's been in the current state.
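For automation, that output is easy to parse: the first line is a format version, the second is a '#'-prefixed header of field names, then one row per server. A Python sketch (the sample below is abbreviated to a few columns; real output has many more, and exact field names should be checked against your HAProxy version):

```python
SAMPLE = """\
1
# be_id be_name srv_id srv_name srv_addr srv_op_state
3 mail_servers 1 mail1 192.168.1.10 2
3 mail_servers 2 mail2 192.168.1.11 0
"""

# Operational state codes as documented for `show servers state`
OP_STATE = {"0": "STOPPED", "1": "STARTING", "2": "RUNNING", "3": "STOPPING"}

def parse_servers_state(text):
    """Map server name -> operational state from `show servers state`
    output, using the header line to locate fields by name."""
    lines = text.strip().splitlines()
    fields = lines[1].lstrip("# ").split()
    out = {}
    for row in lines[2:]:
        rec = dict(zip(fields, row.split()))
        out[rec["srv_name"]] = OP_STATE.get(rec["srv_op_state"], "UNKNOWN")
    return out

assert parse_servers_state(SAMPLE) == {"mail1": "RUNNING", "mail2": "STOPPED"}
```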
We integrate this with our monitoring stack (Prometheus via the HAProxy exporter) and alert when the primary mail server fails a health check — which means we know about the failover at the same time HAProxy starts routing to the backup, not when clients start complaining.
What this looks like from the outside
From a sender's perspective, failover is invisible. Their mail server connects to port 25, gets an SMTP greeting, and delivers the message. They never know whether that connection hit mail1 or mail2.
From a client perspective, there may be a brief disconnection during failover (the existing TCP connection drops when the primary goes down), but their mail client will immediately reconnect and submit successfully to the backup. Authentication state doesn't need to be shared between servers because modern mail clients re-authenticate on each connection anyway.
The result is a mail infrastructure that tolerates the failure of any single server without user-visible downtime.
This is the kind of infrastructure detail that matters for mail systems that can't afford downtime. If you're running email infrastructure for clients who need this level of reliability, see our email hosting → or talk to us about your requirements →.