I am also sad to report that spamassassin has (so far) been quite disappointing at recognizing spam in an online form submission. It appears to be designed entirely around parsing the mail message format, and most of its functionality hinges on checking that all the necessary mail headers exist (To, From, Received, Subject, etc.) and that they obey various header-related rules.
I tested spamassassin on Ubuntu by installing it with
sudo apt install spamassassin
and then running it on a text file containing a spam message. The output includes a brief section describing its spam findings. Running it in local-only mode (the -L option, which skips online resources like Spamhaus):
spamassassin -L spam.txt
yields this bit of analysis:
Content analysis details: (6.2 points, 5.0 required)
 pts rule name              description
---- ---------------------- --------------------------------------------------
-0.0 NO_RELAYS              Informational: message was not relayed via SMTP
 0.9 MISSING_HEADERS        Missing To: header
 1.0 PP_MIME_FAKE_ASCII_TEXT BODY: MIME text/plain claims to be ASCII
                            but isn't
 2.7 MISSING_DATE           Missing Date: header
 1.0 MISSING_FROM           Missing From: header
-0.0 NO_RECEIVED            Informational: message has no Received headers
 0.6 MISSING_MID            Missing Message-Id: header
 0.0 MISSING_SUBJECT        Missing Subject: header
 0.0 NO_HEADERS_MESSAGE     Message appears to be missing most RFC-822
                            headers
 0.0 T_FILL_THIS_FORM_SHORT Fill in a short form with personal
                            information
While missing headers are certainly a problem for messages received via SMTP or another mail protocol, these checks are hardly applicable to contact form spam. In the report above, the missing-header rules (To, Date, From, Message-Id) account for 5.2 of the 6.2 points. Also, most of these headers can easily be added to the spam message text, after which the message will no longer be classified as spam, no matter how spammy its content is.
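To illustrate how little it takes to supply those headers, here is a minimal Python sketch that wraps a form submission in a mail message using only the standard library. The function name wrap_in_headers, the example.com addresses, and the subject line are all hypothetical placeholders, not anything spamassassin requires.

```python
from email.message import EmailMessage
from email.utils import formatdate, make_msgid


def wrap_in_headers(body,
                    sender="visitor@example.com",        # hypothetical address
                    recipient="webmaster@example.com",   # hypothetical address
                    subject="Contact form submission"):  # hypothetical subject
    """Return the form text as a mail message carrying the usual headers."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg["Date"] = formatdate(localtime=True)
    msg["Message-ID"] = make_msgid(domain="example.com")
    # set_content adds a text/plain Content-Type with a declared charset.
    msg.set_content(body)
    return msg.as_bytes()


if __name__ == "__main__":
    raw = wrap_in_headers("Fill in this short form to chat with your visitors.")
    with open("wrapped.txt", "wb") as fh:
        fh.write(raw)
```

Saving the result to a file and running spamassassin -L on it is one way to check which rules still fire once the headers are present.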
If you drop the -L option, the spamassassin check is slower:
spamassassin spam.txt
and the only additional check I noticed is the Spamhaus lookup (URIBL_SBL_A), which does recognize a spammy URL in the message text but adds only 0.1 points to the spam score.
Content analysis details: (6.6 points, 5.0 required)
 pts rule name              description
---- ---------------------- --------------------------------------------------
-0.0 NO_RELAYS              Informational: message was not relayed via SMTP
 1.2 MISSING_HEADERS        Missing To: header
 1.0 PP_MIME_FAKE_ASCII_TEXT BODY: MIME text/plain claims to be ASCII
                            but isn't
 0.1 URIBL_SBL_A            Contains URL's A record listed in the Spamhaus SBL
                            blocklist
                            [URIs: talkwithwebvisitors.com]
 1.0 MISSING_FROM           Missing From: header
-0.0 NO_RECEIVED            Informational: message has no Received headers
 1.4 MISSING_DATE           Missing Date: header
 1.8 MISSING_SUBJECT        Missing Subject: header
 0.1 MISSING_MID            Missing Message-Id: header
 0.0 NO_HEADERS_MESSAGE     Message appears to be missing most RFC-822
                            headers
 0.0 T_FILL_THIS_FORM_SHORT Fill in a short form with personal
                            information
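A quick tally of this second report shows where the 6.6 points come from. The Python sketch below parses the rule lines and groups them by name prefix. The grouping (MISSING_ and NO_ as header-related, URIBL_ as URL blocklist, everything else as body) is my own rough assumption for illustration, not a classification that spamassassin itself provides.

```python
import re

# Rule lines copied from the report above (run without -L).
REPORT = """\
-0.0 NO_RELAYS Informational: message was not relayed via SMTP
1.2 MISSING_HEADERS Missing To: header
1.0 PP_MIME_FAKE_ASCII_TEXT BODY: MIME text/plain claims to be ASCII
but isn't
0.1 URIBL_SBL_A Contains URL's A record listed in the Spamhaus SBL
blocklist
[URIs: talkwithwebvisitors.com]
1.0 MISSING_FROM Missing From: header
-0.0 NO_RECEIVED Informational: message has no Received headers
1.4 MISSING_DATE Missing Date: header
1.8 MISSING_SUBJECT Missing Subject: header
0.1 MISSING_MID Missing Message-Id: header
0.0 NO_HEADERS_MESSAGE Message appears to be missing most RFC-822
headers
0.0 T_FILL_THIS_FORM_SHORT Fill in a short form with personal
information
"""

# A rule line starts with a signed score followed by an upper-case rule name;
# wrapped description lines do not match and are skipped.
RULE_LINE = re.compile(r"^\s*(-?\d+\.\d+)\s+([A-Z][A-Z0-9_]*)\s")


def categorize(name):
    # Assumption: grouping by rule-name prefix is only a rough heuristic.
    if name.startswith(("MISSING_", "NO_")):
        return "headers"
    if name.startswith("URIBL_"):
        return "url_blocklist"
    return "body"


def tally(report):
    totals = {"headers": 0.0, "url_blocklist": 0.0, "body": 0.0}
    for line in report.splitlines():
        m = RULE_LINE.match(line)
        if m:
            totals[categorize(m.group(2))] += float(m.group(1))
    return {k: round(v, 1) for k, v in totals.items()}


print(tally(REPORT))
# {'headers': 5.5, 'url_blocklist': 0.1, 'body': 1.0}
```

Under that grouping, header-related rules supply 5.5 of the 6.6 points, while the URL blocklist hit and the body rules together supply only 1.1, well short of the 5.0 threshold.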
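I noticed that neither report mentions DKIM or SPF. I guess those are handled by plugins. Sadly, DKIM and SPF checks are not applicable to contact form spam either, since they validate the sending mail server and the message signature rather than the message content.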