Why DWiki doesn't use fully REST-ful URLs
REST is a style of web application writing where, among other things, you use simple structured URLs to represent resources instead of heavily parameterized ones. For example, 'http://example.com/users/cks/' is a RESTful URL but 'http://example.com/users?name=cks' is not.
(RESTful URLs are virtuous for a number of reasons, including being less alarming to search engines and being simpler, so it's easier for people to remember them and pass them around and use them. Non-RESTful URLs map more directly to what the web application is actually doing, so the application doesn't need to decode and crack apart the URL to determine what to do.)
DWiki URLs are mostly but not entirely REST; things like the oldest 10 blog entries are '.../blog/oldest/10/', but actions like adding a comment use URLs like '.../Entry?writecomment'. I chose to use URL parameters for handling actions because that way I could guarantee there never would be a name collision between an action and the name of a real page.
This name collision issue comes up because a fully REST approach overloads the URL; it both names a resource and specifies what you want to do with it. If a given URL can have both sub-resources and things done to it, you have a potential for name collision, and either way you lose. Since at least some DWiki URLs have this potential problem, I opted to punt and go with explicit URL parameters for actions. (Well, usually. Logging in to DWiki uses a synthetic page with a name that can never be valid. I could have given actions similar illegal page names, but that would have made their URLs look ugly.)
For URLs that are more user visible, like '.../blog/oldest/10/' and '.../blog/2007/01/', I decided that I wanted pretty URLs more than I wanted to avoid the chance of name collision. Since these are only alternate views for resources that you can get at already, they just turn off if there's a name collision with a real page.
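As a rough illustration of the two rules above (actions live in the query string, and pretty virtual views give way to real pages), here is a minimal sketch in Python. It is not DWiki's actual code: the `resolve` function, the `pages` and `dirs` sets, and the result tuples are all hypothetical names invented for this example, and only the '.../oldest/N/' view is handled.

```python
from urllib.parse import urlsplit

# Hypothetical example content; DWiki's real page store is not shown here.
PAGES = {"blog/Entry"}
DIRS = {"blog"}

def resolve(url, pages, dirs):
    """Classify a URL as an action, a real page/directory, or a virtual view.

    Returns a (kind, target, detail) tuple. Illustrative only.
    """
    parts = urlsplit(url)
    path = parts.path.strip("/")

    # Actions ride in the query string ('?writecomment'), so they can
    # never collide with the name of a real page.
    if parts.query:
        return ("action", path, parts.query)

    # Real pages and directories always win over virtual views.
    if path in pages:
        return ("page", path, None)
    if path in dirs:
        return ("dir", path, None)

    # Virtual view '<dir>/oldest/<n>': only reached when no real page has
    # that name, so the view simply turns off on a collision.
    segs = path.split("/")
    if len(segs) >= 3 and segs[-2] == "oldest" and segs[-1].isdigit():
        base = "/".join(segs[:-2])
        if base in dirs:
            return ("view", base, ("oldest", int(segs[-1])))

    return ("notfound", path, None)
```

With this ordering, '/blog/oldest/10/' resolves to the virtual view unless someone creates a real page called 'blog/oldest/10', at which point the real page is served instead, while '/blog/Entry?writecomment' is always an action on 'blog/Entry' no matter what pages exist.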
In hindsight the one blemish in the action approach is that 'show page with comments' is an action, but is for something that users will routinely see (and thus see the uglier URL). Since only real pages (not directories) can have comments, it would have been unambiguous to use REST URLs like '.../Entry/withcomments' instead of the current approach of '.../Entry?showcomments'.
(As a corollary, any action that only applies to real pages could be done that way. But I prefer to keep action handling uniform, even at the cost of somewhat uglier URLs.)
Weekly spam summary on January 27th, 2007
This week, we:
- got 14,755 messages from 268 different IP addresses.
- handled 23,910 sessions from 1,483 different IP addresses.
- received 248,718 connections from at least 75,622 different IP addresses.
- hit a highwater of 37 connections being checked at once.
Volume seems noticeably up compared to last week. The apparent jump in the number of different IP addresses trying to talk to us concerns me, since it is probably yet another indication of the growing zombie armies.
About all I can say about this table is that I can remember when 20,000 connections a day was the ordinary baseline. Hopefully it'll go back there someday.
Kernel level packet filtering top ten:
  Host/Mask           Packets   Bytes
  188.8.131.52/24       15081    680K
  184.108.40.206        13873    712K
  220.127.116.11/24     12301    734K
  18.104.22.168          8435    505K
  22.214.171.124         6534    392K
  126.96.36.199          5745    292K
  188.8.131.52           5059    270K
  184.108.40.206         4544    212K
  220.127.116.11         4412    212K
  18.104.22.168          3188    149K
I'd call this about the same as last week.
- 22.214.171.124 and 126.96.36.199 return from last week.
- 188.8.131.52 is a Japanese IP address without valid reverse DNS.
- 184.108.40.206 is a bigpond.net.au cablemodem that has appeared here before.
- 220.127.116.11 is a telecomitalia.it machine reappearing from December.
- 18.104.22.168 and 22.214.171.124 were blocked for repeated bad
- 126.96.36.199 is in the SORBS DUL.
In general, a broad variety of the usual suspects.
Connection time rejection stats:
  63406 total
  40314 dynamic IP
  15167 bad or no reverse DNS
   4941 class bl-cbl
   1475 class bl-sbl
    281 class bl-dsbl
    210 class bl-njabl
    124 class bl-pbl
    106 class bl-sdul
This is up significantly this week compared to last week, and the CBL and the SBL appear to have caught on fire. Such a huge SBL presence deserves a breakdown:
|1291||SBL50451||188.8.131.52/24, listed as a spam source and spam website hoster (25-Jan-2007)|
|137||SBL43664||184.108.40.206/23, aka 'GO TECH HOSTING', listed as a spam source and more (18-Oct-2006)|
|16||SBL50430 plus SBL50333||wanadoo.co.uk's main mail machines, listed for advance fee fraud (24-Jan-2007)|
|10||SBL50325||sify.net webmail, listed for advance fee fraud (22-Jan-2007)|
|10||SBL50181||advance fee fraud spam source (18-Jan-2007, but active since November, and tried to hit us last week too)|
|10||SBL50211||220.127.116.11, also a carryover from last week, but now apparently removed from the SBL.|
SBL50451 managed to connect to us a few times before it got SBL listed, but it looks like it didn't manage to deliver anything because the messages it was trying to send had URLs that tripped some of our other spam filtering.
Only two of the top 30 most rejected IP addresses were rejected 100 times or more this week, but the leader is a real champion; 18.104.22.168 was rejected 2,546 times for being a wanadoo.fr dialup, and 22.214.171.124 was rejected 100 times for being a Mexican IP address without working reverse DNS (it's also on the CBL et al). In other news, 14 of the top 30 are currently in the CBL, 9 are currently in bl.spamcop.net, and two are in the SBL.
The SBL-listed two are 126.96.36.199, part of SBL41018, a /20 Pacnet escalation listing from 24-Dec for spammer hosting that we saw before in December, and 188.8.131.52, SBL37424, a /26 ROKSO listing from 19-Oct for Richard Simnett aka S-Infotech and Direct Media Network. As usual, neither was actually rejected for being SBL-listed; the Pacnet IP was blocked for bad reverse DNS, and the Simnett IP was blocked because we have that /24 blocked as an old spam source.
This week Hotmail brought us:
- 2 messages accepted.
- no messages rejected because they came from non-Hotmail email addresses.
- 29 messages sent to our spamtraps.
- no messages refused because their sender addresses had already hit our spamtraps.
- 1 message refused due to its origin IP address being from saix.net of South Africa.
And the final numbers:
|what||# this week||(distinct IPs)||# last week||(distinct IPs)|
This is an improvement from last week, but not a great one. There's no clear winner of the bad HELO sweepstakes, just a bunch of people with middle double-digit rejection counts.
The clear champion of bad bounces is 184.108.40.206, with 64, all to the username 'erqxsdtlqele'. In terms of general sources, Germany and Italy fought it out this week, with contributions from Russia and various other places around the world. Random alphabetic jumble usernames continued their overall domination of the bad bounce targets, but this week saw some ones with leading numbers show up, along with a number of more plausible usernames. Bad bounces were sent to only 148 different bad usernames this week.