Discussion:
Encoded subjects aren't rejected by header_checks
Tomasz Papszun
2004-11-30 11:34:14 UTC
Hello, List.

I have searched the list archive and read the documentation but I
haven't found the answer.

The question is:

how to block messages with _encoded_ known, "blacklisted" contents in
the Subject field?

I have
header_checks = pcre:$config_directory/badheaders

and the badheaders file contains some simple patterns used (among
other steps) for rejecting spam, e.g.:

/^Subject: .*prescription/ REJECT
/^Subject: .*penis/ REJECT

It works for unencoded headers, but unfortunately, when the Subject
field is encoded (ISO-8859-1, utf-8 etc.), spam containing the unwanted
word in the Subject is accepted anyway.
The MUA (mutt) displays the subject in a readable way, but the raw
header content isn't human-readable. Below are 3 examples:

1) The MUA displays:

Subject: FREE Sildenafil Citrate PRESCRIPTION

But the raw mailbox contains:

Subject: =?ISO-8859-1?b?RlJFRSBTaWxkZW5hZmlsIENpdHJhdGUgIFBSRVNDUklQVElPTg==?=

2) The MUA displays:
Subject: isde Natural penis enlaargement pilll. NEW! niku

But the raw mailbox contains:

Subject: =?utf-8?B?aXNkZSBOYXR1cmFsIHBl?=
=?utf-8?B?bmlzIGVubGFhcmdlbWVu?=
=?utf-8?B?dCBwaWxsbC4gTkVXISBu?=
=?utf-8?B?aWt1?=

3) The MUA displays:

Subject: =?utf-8?q?Natural increase you?=
=?utf-8?q?r penis solution!?=

And the raw mailbox contains:

Subject: =?utf-8?q?Natural increase you?=
=?utf-8?q?r penis solution!?=

Oh, the 3rd example is a different case. The raw header does contain that
word, but the message was let in, probably because the Subject spans two lines?

So there is another question: how to reject such illegal(?) (multi-line)
headers?
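
For reference, a header split over several lines is legal: RFC 2822 allows "folding", where a continuation line starts with whitespace. Below is a minimal Python sketch of unfolding such a header before matching. It assumes the continuation line in example 3 began with a single space (the archive shows no leading whitespace), and `unfold` is a hypothetical helper name. It only shows how a line-oriented regex behaves on folded versus unfolded text; it does not establish what Postfix did with this message.

```python
import re

# Same pattern as in the badheaders file, applied with Python's re module.
PATTERN = re.compile(r"^Subject: .*penis", re.IGNORECASE)

def unfold(raw_header: str) -> str:
    """Undo header folding: drop each line break that is followed by whitespace."""
    return re.sub(r"\r?\n(?=[ \t])", "", raw_header)

# Example 3 from this post; the leading space on line 2 is an assumption.
raw = (
    "Subject: =?utf-8?q?Natural increase you?=\n"
    " =?utf-8?q?r penis solution!?="
)

if __name__ == "__main__":
    # On the folded text '.' does not cross the newline, so nothing matches.
    print(bool(PATTERN.search(raw)))
    # After unfolding, the header is one logical line and the pattern matches.
    print(bool(PATTERN.search(unfold(raw))))
```
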

Thank you in advance
--
Tomasz Papszun SysAdm @ TP S.A. Lodz, Poland | And it's only
***@lodz.tpsa.pl http://www.lodz.tpsa.pl/iso/ | ones and zeros.
***@clamav.net http://www.ClamAV.net/ A GPL virus scanner
Magnus Bäck
2004-11-30 12:12:19 UTC
On Tuesday, November 30, 2004 at 12:33 CET,
Post by Tomasz Papszun
I have searched the list archive and read the documentation but I
haven't found the answer.

how to block messages with _encoded_ known, "blacklisted" contents in
the Subject field?
Use a proper content filter, not header_checks.
Post by Tomasz Papszun
I have
header_checks = pcre:$config_directory/badheaders

and the badheaders file contains some simple patterns used (among
other steps) for rejecting spam, e.g.:

/^Subject: .*prescription/ REJECT
/^Subject: .*penis/ REJECT

It works for unencoded headers, but unfortunately, when the Subject
field is encoded (ISO-8859-1, utf-8 etc.), spam containing the
unwanted word in the Subject is accepted anyway.
The problem is not ISO-8859-1 or UTF-8 but Base64 and quoted-printable.

header_checks only passes the raw header contents in its lookups. It is
just not the right tool for this. Implement a proper content filter if
you want to get serious on rejecting spam based on message contents.
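
To illustrate what "decode first, then match" means, here is a minimal Python sketch using the standard library's `email.header` module. It is not a Postfix feature and not a full content filter; `decode_subject`, `is_blacklisted` and `BLACKLIST` are hypothetical names, and the two sample values are the raw Subject contents from examples 1 and 2 of the original post (the leading spaces on the continuation lines are assumed).

```python
import re
from email.header import decode_header, make_header

# Hypothetical blacklist mirroring the two badheaders patterns.
BLACKLIST = re.compile(r"prescription|penis", re.IGNORECASE)

def decode_subject(raw_value: str) -> str:
    """Decode RFC 2047 encoded-words (Base64 or Q) into readable text."""
    return str(make_header(decode_header(raw_value)))

def is_blacklisted(raw_value: str) -> bool:
    """Match the blacklist against the decoded Subject, not the raw text."""
    return bool(BLACKLIST.search(decode_subject(raw_value)))

# Example 1: a single Base64 encoded-word.
example1 = "=?ISO-8859-1?b?RlJFRSBTaWxkZW5hZmlsIENpdHJhdGUgIFBSRVNDUklQVElPTg==?="

# Example 2: a folded header made of four Base64 encoded-words.
example2 = (
    "=?utf-8?B?aXNkZSBOYXR1cmFsIHBl?=\n"
    " =?utf-8?B?bmlzIGVubGFhcmdlbWVu?=\n"
    " =?utf-8?B?dCBwaWxsbC4gTkVXISBu?=\n"
    " =?utf-8?B?aWt1?="
)

if __name__ == "__main__":
    for raw in (example1, example2):
        print(decode_subject(raw), "->", is_blacklisted(raw))
```

A real content filter does this kind of decoding (and much more) before applying its rules, which is why it can catch what a raw-header pattern misses.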

[...]

--
Magnus Bäck
***@dsek.lth.se
Tomasz Papszun
2004-11-30 12:33:28 UTC
Post by Magnus Bäck
On Tuesday, November 30, 2004 at 12:33 CET,
Post by Tomasz Papszun
how to block messages with _encoded_ known, "blacklisted" contents in
the Subject field?
Use a proper content filter, not header_checks.
Oh, I had just hoped there was some simple and fast, Postfix-native
solution.
Post by Magnus Bäck
Post by Tomasz Papszun
I have
header_checks = pcre:$config_directory/badheaders

and the badheaders file contains some simple patterns used (among
other steps) for rejecting spam, e.g.:

/^Subject: .*prescription/ REJECT
/^Subject: .*penis/ REJECT

It works for unencoded headers, but unfortunately, when the Subject
field is encoded (ISO-8859-1, utf-8 etc.), spam containing the
unwanted word in the Subject is accepted anyway.

The problem is not ISO-8859-1 or UTF-8 but Base64 and quoted-printable.
So maybe someone has working examples of Base64 and QP strings for
blocking popular spam patterns?
Post by Magnus Bäck
header_checks only passes the raw header contents in its lookups. It is
just not the right tool for this. Implement a proper content filter if
you want to get serious on rejecting spam based on message contents.
Could you suggest some appropriate content filters for this task?

Thank you!
--
Tomasz Papszun SysAdm @ TP S.A. Lodz, Poland | And it's only
***@lodz.tpsa.pl http://www.lodz.tpsa.pl/iso/ | ones and zeros.
***@clamav.net http://www.ClamAV.net/ A GPL virus scanner
Magnus Bäck
2004-11-30 12:56:52 UTC
On Tuesday, November 30, 2004 at 13:33 CET,
[...]
Post by Tomasz Papszun
Post by Magnus Bäck
The problem is not ISO-8859-1 or UTF-8 but Base64 and
quoted-printable.

So maybe someone has working examples of Base64 and QP strings for
blocking popular spam patterns?
The problem is that the Base64 representation of a word depends on
its context, so there is no single Base64 string to match.
Example (foobar being the obvious bad word):

$ perl -MMIME::Base64 -e 'print encode_base64("enlarge your foobar");'
ZW5sYXJnZSB5b3VyIGZvb2Jhcg==
$ perl -MMIME::Base64 -e 'print encode_base64("get bigger foobar");'
Z2V0IGJpZ2dlciBmb29iYXI=

It's easier with QP, but it's not as widely used by spammers.
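
To spell out the Base64 point: Base64 turns every 3 input bytes into 4 output characters, so what a word looks like in the output depends on where it starts relative to those 3-byte groups (and on the characters next to it). A small Python sketch of the same experiment as the Perl one-liners; `b64` is just a hypothetical helper around the standard `base64` module.

```python
import base64

def b64(text: str) -> str:
    """Base64-encode an ASCII string and return the result as text."""
    return base64.b64encode(text.encode("ascii")).decode("ascii")

# "foobar" starts at byte offsets 0, 13 and 11, i.e. 0, 1 and 2 modulo 3.
SAMPLES = ("foobar", "enlarge your foobar", "get bigger foobar")

if __name__ == "__main__":
    for sample in SAMPLES:
        offset = sample.index("foobar") % 3
        print(f"offset mod 3 = {offset}: {b64(sample)}")
```

The encoding of the bare word ("Zm9vYmFy") does not occur in either of the two longer encodings, so a pattern built from it would miss both.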
Post by Tomasz Papszun
Post by Magnus Bäck
header_checks only passes the raw header contents in its lookups.
It is just not the right tool for this. Implement a proper content
filter if you want to get serious on rejecting spam based on message
contents.

Could you suggest some appropriate content filters for this task?
SpamAssassin, DSPAM, SpamBouncer, Bogofilter, ...

The first two can be integrated into Postfix with amavisd-new.

--
Magnus Bäck
***@dsek.lth.se
Tomasz Papszun
2004-11-30 14:40:06 UTC
Post by Magnus Bäck
On Tuesday, November 30, 2004 at 13:33 CET,
[...]
Post by Tomasz Papszun
Post by Magnus Bäck
The problem is not ISO-8859-1 or UTF-8 but Base64 and
quoted-printable.

So maybe someone has working examples of Base64 and QP strings for
blocking popular spam patterns?

The problem is that the Base64 representation of a word depends on
its context, so there is no single Base64 string to match.
I suspected something similar :-( but didn't know how to express it
precisely. Thanks!
Post by Magnus Bäck
$ perl -MMIME::Base64 -e 'print encode_base64("enlarge your foobar");'
ZW5sYXJnZSB5b3VyIGZvb2Jhcg==
$ perl -MMIME::Base64 -e 'print encode_base64("get bigger foobar");'
Z2V0IGJpZ2dlciBmb29iYXI=

It's easier with QP, but it's not as widely used by spammers.

Post by Tomasz Papszun
Post by Magnus Bäck
header_checks only passes the raw header contents in its lookups.
It is just not the right tool for this. Implement a proper content
filter if you want to get serious on rejecting spam based on message
contents.

Could you suggest some appropriate content filters for this task?

SpamAssassin, DSPAM, SpamBouncer, Bogofilter, ...

The first two can be integrated into Postfix with amavisd-new.
Thank you for the help!

--
Tomasz Papszun SysAdm @ TP S.A. Lodz, Poland | And it's only
***@lodz.tpsa.pl http://www.lodz.tpsa.pl/iso/ | ones and zeros.
***@clamav.net http://www.ClamAV.net/ A GPL virus scanner
