A piece of phish spam with some clever URL obfuscation

July 15, 2020

We were the target of a phish spam run today. In many respects it was a standard modern phish: it was specifically targeted at us, with a message and claimed sender tailored to our environment; it was in HTML; and the inducement to click was a claim of 'go here to retrieve a voicemail message'. However, it had one interesting trick that I hadn't seen before, which was how it obfuscated its target URL.

The first level of obfuscation was that the target in the <a href="..."> was entirely encoded in HTML hex entities, which probably only stops very basic spam recognizer engines (and serves as a big warning sign for others). However, even when decoded the direct URL came out to be '/blah/?of=<email address>', with no host in evidence. At first I stared at this in puzzlement, and then the penny dropped and I looked at the full HTML. Up at the top was a little thing:

<html> <base href="&#x68;&#x74; &#x74;&#x70; &#x73;&#x3A; &#x5C;&#x2F;[...]

(For a bit of extra obfuscation, that decodes to 'https:\/'. I've removed the hostname, and added strategic spaces between some hex entities so that this entry doesn't get an extra-wide line.)
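As a minimal sketch of the decoding step, Python's standard `html.unescape` will turn hex entities like these back into plain characters. The string below is only the visible fragment of the base href shown above, with the display-only spaces taken out; the hostname stays elided.

```python
import html

# The visible part of the <base href> value from the message, without the
# spaces that were added for display. The hostname is deliberately left out.
encoded = "&#x68;&#x74;&#x74;&#x70;&#x73;&#x3A;&#x5C;&#x2F;"

decoded = html.unescape(encoded)
print(decoded)  # prints https:\/
```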

The phish spammers had split their URL in two by using a base URL element. The <base href> had the hostname (and the https://, sort of); the <a href> had the path on the host. Given this, it seems likely that a decent number of anti-spam engines that parse HTML don't go as far as handling base URL elements (and anything that just does basic text matching is out in the cold).

(I have a personal little program that extracts URLs from email messages for my own uses. It didn't understand the base URL element, but I'm not sure I should bother fixing that.)
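For illustration, here is a rough sketch of how a URL extractor could cope with this split, using only Python's standard library. It is not my actual program. The `LinkExtractor` class, the `phish.example` hostname, the email address, and the exact position of the backslash in the sample are all made up for the example. Turning the backslash into a forward slash is my assumption about how to approximate what browsers do with http(s) URLs; `urljoin` does not do this by itself.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect <a href> targets, resolved against any <base href>."""

    def __init__(self):
        super().__init__()
        self.base = None
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser has already decoded character entities in attribute values.
        href = dict(attrs).get("href")
        if href is None:
            return
        if tag == "base" and self.base is None:
            # Only the first <base href> counts. Assumption: treat '\' as '/'
            # to approximate browser handling of http(s) URLs.
            self.base = href.replace("\\", "/")
        elif tag == "a":
            self.hrefs.append(href)

    def links(self):
        # Resolve at the end so the base applies to every collected link.
        if self.base is None:
            return list(self.hrefs)
        return [urljoin(self.base, h) for h in self.hrefs]


# Hypothetical message body modelled on the phish; the host is a placeholder.
sample = (
    '<html><base href="&#x68;&#x74;&#x74;&#x70;&#x73;&#x3A;&#x5C;&#x2F;phish.example/">'
    '<body><a href="&#x2F;blah&#x2F;?of=user@example.org">voicemail</a></body></html>'
)

extractor = LinkExtractor()
extractor.feed(sample)
found = extractor.links()
print(found)
```

Resolving the links only after the whole message has been parsed keeps the sketch simple, at the cost of not modelling a base element that shows up after some of the links.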

I expect that IMAP mail clients properly reconstruct the full URL as part of properly rendering modern HTML, although I haven't tested that. I don't know if web-based things like GMail do, although it's possible that document base URLs are used frequently enough in real HTML email that they have to.

(The phish spammer targeting us may have assumed that anyone using GMail or the like was a lost cause anyway, and have aimed at people using desktop or mobile IMAP clients.)
