Wandering Thoughts archives

2007-01-28

Why DWiki doesn't use fully REST-ful URLs

REST is a style of web application writing where, among other things, you use simple structured URLs to represent resources instead of heavily parameterized ones. For example, 'http://example.com/users/cks/' is a RESTful URL but 'http://example.com/users?name=cks' is not.

(RESTful URLs are virtuous for a number of reasons, including being less alarming to search engines and being simpler, so it's easier for people to remember them, pass them around, and use them. Non-RESTful URLs map more directly to what the web application is actually doing, so the application doesn't need to decode and crack apart the URL to determine what to do.)

DWiki URLs are mostly but not entirely RESTful; views like the oldest 10 blog entries are '.../blog/oldest/10/', but actions like adding a comment use URLs like '.../Entry?writecomment'. I chose to use URL parameters for actions because that way I could guarantee there would never be a name collision between an action and the name of a real page.

This name collision issue comes up because a fully REST approach overloads the URL; it both names a resource and specifies what you want to do with it. If a given URL can have both sub-resources and things done to it, you have a potential for name collision, and either way you lose. Since at least some DWiki URLs have this potential problem, I opted to punt and go with explicit URL parameters for actions. (Well, usually. Logging in to DWiki uses a synthetic page with a name that can never be valid. I could have given actions similar illegal page names, but that would have made their URLs look ugly.)
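As a rough illustration of why the parameter approach sidesteps collisions (this is a sketch, not DWiki's actual code; the `split_request` helper and the `ACTIONS` set are made up for the example), the page name comes only from the URL path and the action only from the query string, so a page that happens to be called 'writecomment' can never be mistaken for the action:

```python
from urllib.parse import urlsplit

# Hypothetical action names, taken from the examples in this entry.
# DWiki's real list of actions is not given here.
ACTIONS = {"writecomment", "showcomments"}


def split_request(url):
    """Return (page_path, action) for a URL like '.../Entry?writecomment'."""
    parts = urlsplit(url)
    # The path names the resource and nothing else.
    page = parts.path.strip("/")
    # A bare query string such as '?writecomment' names the action, if any.
    action = parts.query if parts.query in ACTIONS else None
    return page, action


# An action on a page called 'Entry'.
print(split_request("http://example.com/blog/Entry?writecomment"))
# A real page that happens to be named 'writecomment'; no action involved.
print(split_request("http://example.com/blog/writecomment"))
```

Because the two pieces of information live in different parts of the URL, there is nothing to disambiguate.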

For URLs that are more user visible, like '.../blog/oldest/10/' and '.../blog/2007/01/', I decided that I wanted pretty URLs more than I wanted to avoid the chance of name collision. Since these are only alternate views for resources that you can get at already, they just turn off if there's a name collision with a real page.
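One way to picture 'they just turn off' is a resolver that always prefers real pages and only then tries the virtual views. This is a minimal sketch under assumptions of my own: the `resolve` function, the `REAL_PAGES` set, and the rule that a view is disabled when its first component ('oldest' or the year) exists as a real page in that directory are all hypothetical, not a description of how DWiki does it.

```python
import re

# Hypothetical set of real page paths; a real wiki would consult its page store.
REAL_PAGES = {"blog", "blog/FirstEntry"}

# Patterns for the two alternate views mentioned above.
OLDEST = re.compile(r"^(?P<dir>.+)/oldest/(?P<n>\d+)$")
MONTH = re.compile(r"^(?P<dir>.+)/(?P<year>\d{4})/(?P<month>\d{2})$")


def resolve(path, real_pages=REAL_PAGES):
    """Map a URL path to a real page or to a virtual view of a directory."""
    path = path.strip("/")
    # Real pages always win.
    if path in real_pages:
        return ("page", path)
    # '.../oldest/N': only active if no real page is named '.../oldest'.
    m = OLDEST.match(path)
    if m and f"{m['dir']}/oldest" not in real_pages:
        return ("oldest", m["dir"], int(m["n"]))
    # '.../YYYY/MM': only active if no real page is named '.../YYYY'.
    m = MONTH.match(path)
    if m and f"{m['dir']}/{m['year']}" not in real_pages:
        return ("month", m["dir"], int(m["year"]), int(m["month"]))
    return ("notfound", path)


print(resolve("/blog/oldest/10/"))
print(resolve("/blog/2007/01/"))
# With a real page called 'blog/oldest', the virtual view is switched off.
print(resolve("/blog/oldest/10/", {"blog", "blog/oldest"}))
```

Nothing is lost when a view turns off, since the same entries remain reachable through their ordinary page URLs.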

In hindsight the one blemish in the action approach is that 'show page with comments' is an action, but is for something that users will routinely see (and thus see the uglier URL). Since only real pages (not directories) can have comments, it would have been unambiguous to use REST URLs like '.../Entry/withcomments' instead of the current approach of '.../Entry?showcomments'.

(As a corollary, any action that only applies to real pages could be done that way. But I prefer to keep action handling uniform, even at the cost of somewhat uglier URLs.)

web/RESTNameCollisions written at 23:03:30

Weekly spam summary on January 27th, 2007

This week, we:

  • got 14,755 messages from 268 different IP addresses.
  • handled 23,910 sessions from 1,483 different IP addresses.
  • received 248,718 connections from at least 75,622 different IP addresses.
  • hit a highwater of 37 connections being checked at once.

Volume seems noticeably up compared to last week. The apparent jump in the number of different IP addresses trying to talk to us concerns me, since it is probably yet another indication of the growing zombie armies.

Day         Connections   different IPs
Sunday           35,443         +12,619
Monday           36,954         +12,391
Tuesday          31,523          +8,475
Wednesday        42,126         +13,711
Thursday         39,633         +11,063
Friday           36,720          +9,858
Saturday         26,319          +7,505

About all I can say about this table is that I can remember when 20,000 connections a day was the ordinary baseline. Hopefully it'll go back there someday.
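As a quick sanity check, the daily figures can be added up and compared with the weekly totals quoted at the top of this entry. The sketch below assumes that the '+' column counts IP addresses not already seen earlier in the week; that reading is an inference from the numbers, not something stated outright.

```python
# Daily figures copied from the table above: (connections, new distinct IPs).
daily = {
    "Sunday": (35443, 12619),
    "Monday": (36954, 12391),
    "Tuesday": (31523, 8475),
    "Wednesday": (42126, 13711),
    "Thursday": (39633, 11063),
    "Friday": (36720, 9858),
    "Saturday": (26319, 7505),
}

total_connections = sum(conns for conns, _ in daily.values())
total_new_ips = sum(new_ips for _, new_ips in daily.values())

print(total_connections)  # 248718, matching the weekly connection count
print(total_new_ips)      # 75622, matching the 'at least' IP address count
```

Both sums line up with the weekly summary, which is consistent with the '+' figures being per-day additions to a running set of addresses.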

Kernel level packet filtering top ten:

Host/Mask           Packets   Bytes
193.70.192.0/24       15081    680K
213.4.149.12          13873    712K
213.29.7.0/24         12301    734K
221.186.214.155        8435    505K
66.46.180.235          6534    392K
60.231.152.85          5745    292K
81.115.40.8            5059    270K
72.1.187.162           4544    212K
64.40.176.61           4412    212K
66.15.119.165          3188    149K

I'd call this about the same as last week.

  • 213.4.149.12 and 66.46.180.235 return from last week.
  • 221.186.214.155 is a Japanese IP address without valid reverse DNS.
  • 60.231.152.85 is a bigpond.net.au cablemodem that has appeared here before.
  • 81.115.40.8 is a telecomitalia.it machine reappearing from December.
  • 72.1.187.162 and 64.40.176.61 were blocked for repeated bad HELOs.
  • 66.15.119.165 is in the SORBS DUL.

In general, a broad variety of the usual suspects.

Connection time rejection stats:

  63406 total
  40314 dynamic IP
  15167 bad or no reverse DNS
   4941 class bl-cbl
   1475 class bl-sbl
    281 class bl-dsbl
    210 class bl-njabl
    124 class bl-pbl
    106 class bl-sdul

This is up significantly this week compared to last week, and the CBL and the SBL appear to have caught on fire. Such a huge SBL presence deserves a breakdown:

1291 SBL50451 69.42.169.0/24, listed as a spam source and spam website hoster (25-Jan-2007)
137 SBL43664 63.139.56.0/23, aka 'GO TECH HOSTING', listed as a spam source and more (18-Oct-2006)
16 SBL50430 plus SBL50333 wanadoo.co.uk's main mail machines, listed for advance fee fraud (24-Jan-2007)
10 SBL50325 sify.net webmail, listed for advance fee fraud (22-Jan-2007)
10 SBL50181 advance fee fraud spam source (18-Jan-2007, but active since November, and tried to hit us last week too)
10 SBL50211 65.99.209.155, also a carryover from last week, but now apparently removed from the SBL.

SBL50451 managed to connect to us a few times before it got SBL listed, but it looks like it didn't manage to deliver anything because the messages it was trying to send had URLs that tripped some of our other spam filtering.

Only two of the top 30 most rejected IP addresses were rejected 100 times or more this week, but the leader is a real champion; 83.196.30.53 was rejected 2,546 times due to it being a wanadoo.fr dialup, and 189.139.79.21 was rejected 100 times due to being a Mexican IP address without working reverse DNS (it's also on the CBL et al). In other news, 14 of the top 30 are currently in the CBL, 9 are currently listed in bl.spamcop.net, and two are in the SBL.

The two SBL-listed addresses are 209.205.237.36, part of SBL41018, a /20 Pacnet escalation listing from 24-Dec for spammer hosting that we saw before in December, and 66.236.249.115, SBL37424, a /26 ROKSO listing from 19-Oct for Richard Simnett aka S-Infotech and Direct Media Network. As usual, neither was actually rejected for being SBL-listed; the Pacnet IP was blocked for bad reverse DNS, and the Simnett IP was blocked because we have that /24 blocked as an old spam source.

This week Hotmail brought us:

  • 2 messages accepted.
  • no messages rejected because they came from non-Hotmail email addresses.
  • 29 messages sent to our spamtraps.
  • no messages refused because their sender addresses had already hit our spamtraps.
  • 1 message refused due to its origin IP address being from saix.net of South Africa.

And the final numbers:

what          # this week   (distinct IPs)   # last week   (distinct IPs)
Bad HELOs            1171              134          1578              101
Bad bounces           229              130           455              345

This is an improvement from last week, but not a great one. There's no clear winner of the bad HELO sweepstakes, just a bunch of people with middle double-digit rejection counts.

The clear champion of bad bounces is 193.138.163.135, with 64, all to the username 'erqxsdtlqele'. In terms of general sources, Germany and Italy fought it out this week, with contributions from Russia and various other places around the world. Random alphabetic jumble usernames continued their overall domination of the bad bounce targets, but this week saw some usernames with leading numbers show up, along with a number of more plausible usernames. Bad bounces were sent to only 148 different bad usernames this week.

spam/SpamSummary-2007-01-27 written at 01:23:22


