OpenSMTPd log scraping with AWK

Giving spammers a timeout

March 15, 2021

In my previous post I covered how I set up my own mail server on OpenBSD using OpenSMTPd. Promptly after setting up the server, I started noticing spammers connecting via SMTP to try and abuse the server for relaying their spam messages using presumably stolen or fake credentials.

The entries I found in /var/log/maillog all looked something like this:

Mar  9 08:55:42 mail smtpd[60467]: 9a2bd95a086bc4b0 smtp connected address=143.110.219.209 host=<unknown>
Mar  9 08:55:42 mail smtpd[60467]: 9a2bd95a086bc4b0 smtp tls ciphers=TLSv1.2:ECDHE-RSA-AES256-GCM-SHA384:256
Mar  9 08:55:42 mail smtpd[60467]: 9a2bd95a086bc4b0 smtp authentication [email protected] result=permfail
Mar  9 08:55:42 mail smtpd[60467]: 9a2bd95a086bc4b0 smtp failed-command command="AUTH LOGIN (password)" result="535 Authentication failed"
Mar  9 08:55:43 mail smtpd[60467]: 9a2bd95a086bc4b0 smtp disconnected reason=quit

So the machine with IP address 143.110.219.209 connected via SMTPS and tried to authenticate using the user name [email protected] and a password. Given that my mail server doesn’t have the user name [email protected] in its credentials database, OpenSMTPd correctly refuses the authentication attempt.

I wrote the following AWK script to process the logfile and tally up the number of failed authentication attempts for a given IP address:

#!/bin/awk -f
/smtp connected address=.*/ {
        # get offending ip address
        n = split($9, offender, "=");
        # save the IP address in a list indexed by record ID
        addresses[$6] = offender[2];
}
# this matches stuff like [email protected] --> not our domain, hard strike them
/smtp authentication user=.*@.* result=permfail/ {
        # split the user name into user and domain
        n = split($9, email, "@");
        # extract the domain
        domainpart = email[2];
        if ((domainpart != "domain1.example") || (domainpart != "domain2.example")) {
                hard_strikes[addresses[$6]]++;
        }
}
# this matches any single user name like we're also using --> only apply soft strike
# it might be my own account testing something
/smtp authentication user=[[:alnum:]\.]* result=permfail/ {
        soft_strikes[addresses[$6]]++;
}
END {
        # hard strikes are always blocked
        for (ip in hard_strikes) print ip;
        # soft strikes have a threshold applied
        for (ip in soft_strikes) if (soft_strikes[ip] > 5) print ip;
}

The first block matches any connection entry and stores the IP address in the addresses array, where the array key is the log ID like 9a2bd95a086bc4b0 in the example log above. It’s important to keep track of the ID because we can’t rely on the log output being nicely in order like the example above. Otherwise we might end up associating the wrong log messages to an IP and blocking the wrong IP. The addresses array here functions as a mapping from log ID to IP address.

The second block matches any authentication attempt of the form [email protected]. AWK splits the username field into the user and example.com parts and checks wether the domain name matches on of the configured domain names (domain1.example and domain2.example). If it doesn’t match, we know whoever tries to connect is connecting to the wrong server. The hard_strikes array is used to store the number of occurences of authentication failures, this time the array key is the actual IP address.

The third block matches any single user record, i.e. authentication without the domain part. I have to handle this case seperately as my own users authenticate in the same way and I don’t want to get blocked immediately if I enter a wrong password when setting up a new client. For these type of records there is a separate array soft_strikes.

When the script is done processing the log file I print out all the IP addresses that have collected strikes. In the case of the soft_strikes array I apply a threshold of 5 failed attempts to prevent myself from getting locked out when I’m testing something and accumulate a couple of authentication failures.

Another type of entry I saw in the logfile was this:

Mar  9 08:57:41 mail smtpd[60467]: 9a2bd96031c80f43 smtp connected address=137.116.168.210 host=<unknown>
Mar  9 08:58:02 mail smtpd[60467]: 9a2bd96031c80f43 smtp disconnected reason="io-error: No TLS error"
Mar  9 08:58:02 mail smtpd[60467]: 9a2bd9613a6ac8cf smtp connected address=137.116.168.210 host=<unknown>
Mar  9 08:58:23 mail smtpd[60467]: 9a2bd9613a6ac8cf smtp disconnected reason="io-error: No TLS error"
Mar  9 08:58:23 mail smtpd[60467]: 9a2bd96200d7af07 smtp connected address=137.116.168.210 host=<unknown>
Mar  9 08:58:43 mail smtpd[60467]: 9a2bd96200d7af07 smtp disconnected reason="io-error: No TLS error"
Mar  9 08:58:43 mail smtpd[60467]: 9a2bd9631e6274bc smtp connected address=137.116.168.210 host=<unknown>
Mar  9 08:59:04 mail smtpd[60467]: 9a2bd9631e6274bc smtp disconnected reason="io-error: No TLS error"
Mar  9 08:59:04 mail smtpd[60467]: 9a2bd96433ed7e88 smtp connected address=137.116.168.210 host=<unknown>
Mar  9 08:59:25 mail smtpd[60467]: 9a2bd96433ed7e88 smtp disconnected reason="io-error: No TLS error"
Mar  9 08:59:25 mail smtpd[60467]: 9a2bd965e9c64dea smtp connected address=137.116.168.210 host=<unknown>
Mar  9 08:59:45 mail smtpd[60467]: 9a2bd965e9c64dea smtp disconnected reason="io-error: No TLS error"
Mar  9 08:59:46 mail smtpd[60467]: 9a2bd96636f64f0a smtp connected address=137.116.168.210 host=<unknown>
Mar  9 09:00:06 mail smtpd[60467]: 9a2bd96636f64f0a smtp disconnected reason="io-error: No TLS error"
Mar  9 09:00:06 mail smtpd[60467]: 9a2bd9673587584d smtp connected address=137.116.168.210 host=<unknown>
Mar  9 09:00:27 mail smtpd[60467]: 9a2bd9673587584d smtp disconnected reason="io-error: No TLS error"
Mar  9 09:00:27 mail smtpd[60467]: 9a2bd968df190e9b smtp connected address=137.116.168.210 host=<unknown>
Mar  9 09:00:47 mail smtpd[60467]: 9a2bd968df190e9b smtp disconnected reason="io-error: No TLS error"
Mar  9 09:00:48 mail smtpd[60467]: 9a2bd969ff622399 smtp connected address=137.116.168.210 host=<unknown>

This seems like some type of fire-and-forget strategy being used by a spammer to try and send as many mails through an open SMTP relay as possible. Given that I have configured OpenSMTPd to only accept authenticated SMTP over TLS connections and the spammer doesn’t even try to do a TLS handshake, it again correclty refuses the connection.

Because the attempts occur fairly rapidly, I could simply use the rate-limiting facilities of the pf firewall. I do actually use firewall rate limting, however I’ve also seen these fire-and-forget attempts at much lower cadence with several minutes in between each attempt. At some point the metering intervall for rate-limiting would become so long that I may block legitimate attempts to send mail.

I decided to write another AWK script to parse the logs for this kind of behaviour:

#!/bin/awk -f
/smtp connected address=.*/ {
        # get the IP address for this ID
        n = split($9, offender, "=");
        # add the IP address to a list
        # indexed by record ID
        addresses[$6] = offender[2];
}
/smtp disconnected reason="io-error: No TLS error"/ {
        # retrieve address by record ID and tally up the
        # number of failures
        strikes[addresses[$6]]++;
}
END {
        for (ip in strikes) if (strikes[ip] > 5) print ip;
}

The first block is essentially identical to the one in the first script and constructs an array mapping the log ID to an IP address. The second block matches those connections that error out because of missing TLS and the number of failed attempts is tallied up per IP. Finally, all IP addresses that failed more than 5 times are printed out.

The final piece of the puzzle is instructing the firewall to block connection attempts from the offending IP addresses. I simply wrote a shell script that pipes the output of the scripts into xargs and adds them to the spammers table using pfctl:

#!/bin/sh
for file in /etc/mail/scripts/*.awk; do
        awk -f $file /var/log/maillog | xargs pfctl -t spammers -T add
done

This script is periodically run using a cron job:

~       *       *       *       *       sh /etc/mail/scripts/run.sh 2> /dev/null

The matching lines in /etc/pf.conf are:

table <spammers> persist
[...]
block in quick from <spammers>

In order to prevent blocking an IP address forever (users of IP addresses do occasionally change), I’m also expiring table entries that are more than one week (604800 seconds) old:

~       *       *       *       *       pfctl -t spammers -T expire 604800 2> /dev/null

By using awk in combination with pfctl it’s possible to block some of these abusive hosts using only built-in tools of OpenBSD.