
Dovecot Auth Service Not Ready on a WHM/cPanel Server

On a stable, up-to-date WHM/cPanel server that is rarely rebooted, a reboot to disable SELinux caused Dovecot to stop working. The first major clue that something was wrong was a client reporting that webmail was down with:

503 Service Unavailable The server is temporarily busy, try again later

At first glance one would assume it’s something to do with SELinux, but why? Could it really be?

The error shows up in the cPanel error log:

# tail -f /usr/local/cpanel/logs/error_log 
dovecot auth service not ready
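To see how often the error is being logged rather than watching it scroll past, the matching lines can be counted. This is a minimal sketch: the function name `count_auth_errors` is made up for illustration, and the default path is the log file shown above.

```shell
#!/bin/sh
# Hypothetical helper: count the lines in a log file that contain the
# Dovecot auth error. Defaults to the cPanel error log from the article.
count_auth_errors() {
    log_file="${1:-/usr/local/cpanel/logs/error_log}"
    # -c prints the number of matching lines; -F matches a fixed string.
    grep -c -F 'dovecot auth service not ready' "$log_file"
}

# Usage on the server:
#   count_auth_errors
```

A count that keeps climbing after a restart suggests the restart did not help.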

Googling the message returned no exact matches, which is typically not a good sign.

Only two actions were tried before contacting WHM and logging a ticket:

  1. Rebuilding the Dovecot configuration
    • /usr/local/cpanel/scripts/builddovecotconf
  2. Restarting the Dovecot service
    • /usr/local/cpanel/scripts/restartsrv_dovecot
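The two steps above can be chained so that the restart only runs if the rebuild succeeds. This is a sketch, not a cPanel-provided tool: the function name `rebuild_and_restart` and its optional directory argument are assumptions added for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper: rebuild the Dovecot config, then restart Dovecot.
# The optional argument overrides the cPanel scripts directory.
rebuild_and_restart() {
    scripts_dir="${1:-/usr/local/cpanel/scripts}"
    # && skips the restart when the rebuild exits with a non-zero status.
    "$scripts_dir/builddovecotconf" && "$scripts_dir/restartsrv_dovecot"
}

# Usage on the server (as root):
#   rebuild_and_restart
```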

Neither action resolved the problem. Since this is a stable production server with numerous clients, it was not worth experimenting on. Dovecot should always just start, and the server had been running for more than a year, so a ticket was logged immediately.

Hats off to WHM support for resolving the issue. Along the way, several interesting commands came up during troubleshooting. This article documents some of them:

Restarting Dovecot on WHM/cPanel with verbose output

/usr/local/cpanel/scripts/restartsrv_dovecot --html --wait --verbose

Checking if Dovecot is actually running

systemctl status dovecot -l
...
├─1741 dovecot/auth
├─7862 dovecot/auth -w
...
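The process tree in that status output can be summarized with a small filter. This sketch assumes that `dovecot/auth -w` lines are auth worker processes and plain `dovecot/auth` lines are the main auth process. The function name `summarize_auth_procs` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `systemctl status dovecot -l` output on stdin
# and count auth processes. Treating "-w" as the auth worker is an assumption.
summarize_auth_procs() {
    awk '
        /dovecot\/auth -w/ { workers++; next }   # assumed auth worker
        /dovecot\/auth/    { auth++ }            # assumed main auth process
        END { printf "auth=%d workers=%d\n", auth, workers }
    '
}

# Usage on the server:
#   systemctl status dovecot -l | summarize_auth_procs
```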

Logging into an IMAP mailbox from the command line to test a WHM/cPanel server

First create a test mailbox, then:

doveadm auth login [email protected] secret

Using Nmap to determine if all important WHM ports are open

nmap -p 25,587,465,110,995,143,993,585,2096,2095 102.130.114.172 --reason

At this point the engineer noted that ports 2095 and 2096 were closed. cPanel uses both for Webmail: 2095 unencrypted and 2096 over SSL.

If they were open it would have looked like this:

2095/tcp open nbx-ser syn-ack
2096/tcp open nbx-dir syn-ack

But they were closed so it looked like this:

2095/tcp closed nbx-ser reset
2096/tcp closed nbx-dir reset
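With ten ports in the scan, it can help to print only the ones that are not open. This is a sketch that assumes the nmap port lines have the column layout shown above (port, state, service, reason). The function name `closed_ports` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read nmap output on stdin and print every TCP port
# line whose state is anything other than "open".
closed_ports() {
    # Port lines start with "<number>/tcp"; the second column is the state.
    awk '$1 ~ /^[0-9]+\/tcp$/ && $2 != "open" { print $1, $2 }'
}

# Usage:
#   nmap -p 2095,2096 <server-ip> --reason | closed_ports
```

No output means every scanned port reported as open.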

Solution

The solution was to raise the limit on Dovecot authentication processes using this setting:

Maximum Number of Authentication Processes in WHM > Mailserver Configuration
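After changing the value in WHM, the limits Dovecot actually loaded can be checked from the command line with `doveconf -n`, which prints the non-default settings. The article does not say which Dovecot directive the WHM option writes, so this sketch does not assume one. It lists every service block header and every setting ending in `_limit`. The function name `show_limits` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `doveconf -n` output on stdin and keep only the
# service block headers and any "*_limit = ..." settings.
show_limits() {
    grep -E '^service |_limit *='
}

# Usage on the server:
#   doveconf -n | show_limits
```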

Reason Why This Happened

It is difficult to say why this happened. It could be that when the server came back after the reboot, many (hundreds of) IMAP sessions that had been waiting to reconnect all hit the server at once and overwhelmed the authentication service.
