Wandering Thoughts archives

2006-05-14

Absolute versus relative URLs in syndication feeds

I just changed DWiki to generate absolute URLs for the '(N comments)' links at the bottom of entries in my Atom syndication feeds, instead of the absolute path URLs it used to generate (URLs without the http://host/ portion). This scrubs out the last non-absolute URLs in my syndication feed; URLs in the text of entries in the feed have always been absolute, because I'm cautious and cynical.

In theory absolute URLs are unnecessary in Atom entries, because Atom has rules for how to handle relative URLs. And if you believe that all feed readers properly implement those rules, I have a pony for you. In theory the programmers of the bad feed reader are bozos, because the Atom spec is clear; in practice, even with Atom, people creating syndication feeds have a choice between purity and having their feeds widely read. Using only absolute links is one aspect of that choice.

(The difference between Atom and RSS here is that with Atom it is very clear who the bozo is.)

Back when I added syndication feeds to DWiki I made a choice to be pessimistic about feed readers getting relative URLs right all the time, and modified the DWikiText to HTML converter to generate absolute links for syndication feeds. The '(N comments)' link is generated separately, so I missed it; a problem report today validated my cynicism and pushed me to make this change.
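As a rough illustration of what this change amounts to (a sketch, not DWiki's actual code), an absolute path URL can be turned into an absolute URL with Python's standard `urllib.parse.urljoin`. The host name, the path, and the `absolutize` helper are all made-up placeholders.

```python
from urllib.parse import urljoin

# Hypothetical site root for illustration; not DWiki's real configuration.
SITE_ROOT = "http://example.org/"

def absolutize(url, base=SITE_ROOT):
    """Return url as an absolute URL, resolved against base.

    URLs that are already absolute pass through unchanged.
    """
    return urljoin(base, url)

# An absolute path URL (no http://host/ portion) becomes fully absolute.
comments_link = absolutize("/blog/tech/SomeEntry?showcomments")
```

Running every generated link through one helper like this is what would have caught a separately generated link such as the '(N comments)' one.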

(Depressingly, I believe the feed reader that had a problem was NetNewsWire 2.1; I had expected a bit better of it since it's well regarded.)

Other people feel differently, and deliberately stick to their guns in order to push the technology forward and so on. For example, Tim Bray uses fully relative URLs for the images in his feeds combined with XHTML and xml:base declarations (themselves relative to his feed's URL); the result is a nice test of proper XHTML and XML handling in feed readers. (Some fail, liferea included, but this encourages people to get them fixed.)
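To make it concrete what a feed reader has to get right here, the following is a minimal sketch of the resolution chain: each xml:base is resolved against the enclosing base (starting from the URL the feed was fetched from), and the relative reference is resolved last. The URLs and the `resolve` helper are invented for illustration and are not taken from any real feed.

```python
from functools import reduce
from urllib.parse import urljoin

def resolve(feed_url, xml_bases, reference):
    """Resolve reference against a chain of xml:base values.

    xml_bases is ordered outermost element first; each one may itself
    be relative to the base in effect outside it.
    """
    base = reduce(urljoin, xml_bases, feed_url)
    return urljoin(base, reference)

url = resolve(
    "http://example.org/blog/feed.atom",  # where the feed was fetched from
    ["2006/05/14/"],                      # a relative xml:base on the entry content
    "pic.png",                            # a relative image reference in the content
)
```

A reader that ignores xml:base, or that treats a relative xml:base as if it were absolute, ends up with a different (and broken) image URL.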

tech/AbsoluteUrlsInFeeds written at 21:15:58

Link: an engineering management hack

Engineering Management Hacks: The BigBook Technique is an amusing story of how a group of engineers got their management to pay attention to Brooks's Law ("Adding manpower to a late software project makes it later"). I won't spoil the punchline; read it yourself.

Around here we don't have problems with Brooks's Law, perhaps because we don't have the extra manpower to add to late projects to start with.

(From Daring Fireball.)

links/AManagementHack written at 15:55:52

A small user interface suggestion

Your button for 'mark all items as read' should not be right next to the button for 'advance to next unread item'. Liferea, I'm looking at you.

(Fortunately it was not a feed with a lot of unread or updated items. I think.)

programming/SmallUISuggestion written at 02:27:49

Weekly spam summary on May 13th, 2006

Unfortunately, the SMTP frontend died shortly after midnight on Tuesday morning, so some of the connection statistics are missing about 2.6 days of data. Given that, this week we:

  • got 11,652 messages from 229 different IP addresses.
  • handled 16,296 sessions from 808 different IP addresses.
  • received 110,313 connections from at least 35,408 different IP addresses since early Tuesday morning.
  • hit a highwater of 11 connections being checked at once since early Tuesday morning.

At the Monday morning volume timestamp, we had received 210,731 connections from at least 7,733 different IP addresses; from this I suspect that the spam storm from Saturday of last week continued full-bore last Sunday.

Kernel level packet filtering top ten:

Host/Mask           Packets   Bytes
218.254.83.47         13422    644K
212.216.176.0/24       6112    305K
209.91.186.139         4554    273K
61.128.0.0/10          3484    173K
68.147.8.249           3397    163K
221.216.0.0/13         3047    151K
218.0.0.0/11           2692    137K
220.160.0.0/11         2358    118K
74.0.215.4             2309    117K
68.167.80.52           2132   99671

Overall, this is a bit more active than last week, but it's mostly driven up by a few people; there seems to have been no overall volume surge.

  • 218.254.83.47 is a Hong Kong cablemodem, and was mentioned in passing last week.
  • 209.91.186.139 is in the CBL. (And Canadian, alas.)
  • 68.147.8.249 is in a Shaw Cable SPEWS listing. I've actually seen it in log summaries for previous weeks (although never high enough to get in this report), and it has a good looking DNS name, and it's not listed anywhere else, so I am going to whitelist it and see what happens.
  • 74.0.215.4 is a covad.net 'dialup' machine.
  • 68.167.80.52 returns from this April; we consider it a dialup machine, and it's also in bl.spamcop.net and the DSBL.

Connection time rejection stats:

  40201 total
  19942 dynamic IP
  16960 bad or no reverse DNS
   2033 class bl-cbl
    233 class bl-spews
    119 class bl-sdul
    118 class bl-dsbl
     83 class bl-sbl
     49 class bl-ordb
     19 class bl-njabl
      3 class bl-opm
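To put the two largest rows in proportion, here is a small sketch that computes each category's share of the total. The counts are copied from the table above; the variable names are just for illustration.

```python
# Counts copied from the connection time rejection table above.
total = 40201
rejections = {
    "dynamic IP": 19942,
    "bad or no reverse DNS": 16960,
    "class bl-cbl": 2033,
}

# Percentage of all connection time rejections for each category.
shares = {reason: 100.0 * count / total for reason, count in rejections.items()}

for reason, pct in shares.items():
    print(f"{reason:24s} {pct:5.1f}%")
```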

Although this looks down from last week, the details make Sunday's spam storm pop out. All 30 of the top 30 most rejected IP addresses were rejected more than 100 times; the most active one was our friend 218.254.83.47, with 619. 27 of the top 30 are currently in the CBL, 4 are currently in bl.spamcop.net, and 222.252.50.91 (123 rejections) is in SBL39408.

SBL39408 is one of those depressing SBL listings; it is for 222.252.0.0/15, which belongs to Vietnam Posts and Telecommunications Corp (VNN.VN). Created April 10th 2006, the two /16 halves of it are apparently the current worst and second worst /16 spam source networks on the Internet. Somehow I suspect that they are going to retain that status for a while.
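For anyone who wants to check the CIDR arithmetic, this is a quick sketch using Python's standard `ipaddress` module: it confirms that the rejected address falls inside the listed /15 and splits that /15 into its two /16 halves. Only the network and the address come from the text above; the variable names are mine.

```python
import ipaddress

# The SBL39408 network and the rejected address mentioned above.
listing = ipaddress.ip_network("222.252.0.0/15")
rejected = ipaddress.ip_address("222.252.50.91")

# Is the rejected address covered by the listing?
in_listing = rejected in listing

# The two /16 networks that make up the /15.
halves = [str(net) for net in listing.subnets(new_prefix=16)]
```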

Hotmail is doing much better this week:

  • one message accepted.
  • 4 messages rejected because they came from non-Hotmail email addresses (all from various non-US Hotmail domains; I really have to improve that check).
  • no messages sent to spamtraps, refused because the sender had already hit spamtraps, or rejected because of their originating IP address.

I'm willing to tentatively declare that Hotmail has fixed their problem. Besides, as far as I can tell the problematic free webmail provider is now Yahoo; I am getting significant advance fee fraud spam through Yahoo from a spam gang that they haven't stopped. (The situation is bad enough that I have started blocking non-US Yahoo operations as they spam us.)

The final numbers:

what         # this week  (distinct IPs)  # last week  (distinct IPs)
Bad HELOs            448              49          405              46
Bad bounces           10              10            8               7

More than half (244 out of 448) of the bad HELOs came from btconnect.com's pool of SMTP senders in 213.123.26.0/24, which HELO with names like 'hesa05uker.he.local' (sometimes capitalized). The pattern for usernames in the bad bounces is fairly similar to last week, including another bounce to that 38-character hex sequence (but from a different domain).

spam/SpamSummary-2006-05-13 written at 01:26:48



This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.