Wandering Thoughts archives


A growing realization about tcpdump and reading IP traffic

Here is a gotcha about reading tcpdump output that recent events have been tattooing on my forehead:

The only sure way to tell whether a packet is going to your gateway or to something on the local network is to look at the destination Ethernet address.

To put it another way: a packet being sent to your network's gateway to be forwarded does not have the gateway's IP address in it. Thus, tcpdump output without Ethernet addresses does not tell you whether a packet was actually sent to your gateway or whether it was just floating by on the network. The same applies if you are reading tcpdump output on the sending machine; until you look at the destination MAC, you don't actually know where the machine is sending the packets, you just think you know.

This is obvious once you think about it (assuming that you know enough about how IP works), as is its interaction with tcpdump being promiscuous and how switches can flood traffic through your network. But you do have to think about it, and not doing so has tripped me up at least twice now. It's certainly not intuitive that more or less the only thing your machine's IP stack does with your gateway's IP address is to ARP for its Ethernet address.

(I think one reason that this is so easy to overlook is that it feels like a layering violation. It's rational to think that the use of an IP gateway should be visible in the IP headers of a packet, instead of only showing up one level lower.)
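To make the point concrete, here is a minimal Python sketch of how a sender picks the destination Ethernet address. It assumes a single local subnet and a single default gateway with no other routes; the names next_hop and build_frame, the addresses, and the pre-filled ARP table are all made up for illustration and are not how any particular IP stack is written.

```python
import ipaddress

def next_hop(dst_ip, local_net, gateway_ip):
    """Return the IP address whose Ethernet address the sender must look up."""
    dst = ipaddress.ip_address(dst_ip)
    net = ipaddress.ip_network(local_net)
    # On-link destinations are reached directly; everything else goes
    # through the gateway (assumption: one default route, nothing else).
    return dst if dst in net else ipaddress.ip_address(gateway_ip)

def build_frame(dst_ip, local_net, gateway_ip, arp_table):
    """Model the two destination fields of an outgoing packet."""
    hop = next_hop(dst_ip, local_net, gateway_ip)
    return {
        # The gateway only ever shows up here, as a MAC address.
        "eth_dst": arp_table[str(hop)],
        # The IP header always carries the final destination.
        "ip_dst": dst_ip,
    }

# Hypothetical addresses from the documentation ranges.
ARP_TABLE = {
    "192.0.2.1": "00:00:5e:00:53:01",   # the gateway
    "192.0.2.20": "00:00:5e:00:53:14",  # a machine on the local network
}

remote = build_frame("198.51.100.7", "192.0.2.0/24", "192.0.2.1", ARP_TABLE)
local = build_frame("192.0.2.20", "192.0.2.0/24", "192.0.2.1", ARP_TABLE)
```

In this model the remote packet has the gateway's MAC as its Ethernet destination while its IP destination is still 198.51.100.7. To see the Ethernet addresses in real traffic, run tcpdump with -e, which prints the link-level header on each line.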

sysadmin/IPRealization written at 23:41:45

BitTorrent's file fragmentation problem

When BitTorrent receives a file, it gets the various chunks out of order, generally in a completely random order. This presents the client with the problem of putting them in order and in place.

I believe that historically there have been three approaches to deal with this. The very first BitTorrent clients did it the simple way: they put every received block in its correct place by seeking to that spot and writing the block out. The problem with this was horrible file fragmentation (resulting in terrible sequential read performance); because the blocks were written in random order, they were generally allocated randomly around the disk instead of nicely sequentially.

Next came the approach of always growing the file in order, and reordering blocks inside the file. When the client got block N, it initially wrote it at the current end of file; when the file grew to be more than N blocks long, block N finally could be swapped into its correct location in exchange for whatever was already there. This avoids file fragmentation (the client is always expanding the file sequentially), but at the cost of an increasing amount of file IO to shuffle blocks into their correct places.

(This file IO does not matter all that much for typical clients, which have much more disk bandwidth than network bandwidth, but it can be a significant issue if you are running BitTorrent on fast networks, especially since disks are seek limited.)
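The grow-in-order approach can be modelled in a few lines of Python. This is only an in-memory sketch of the idea as described above, not the code of any actual client: the file is a list of slots, each holding a block number, and the function names receive and assemble are my own. Each swap stands for the extra disk IO that this approach pays.

```python
def receive(slots, block):
    """Append a newly received block at the end of the 'file', then swap
    every block that can now reach its correct slot. Returns the swap count."""
    slots.append(block)
    swaps = 0
    s = 0
    while s < len(slots):
        home = slots[s]
        if home != s and home < len(slots):
            # The file is now long enough to hold this block in its
            # correct place; exchange it with whatever is there.
            slots[s], slots[home] = slots[home], slots[s]
            swaps += 1
            # Slot s holds a different block now, so look at it again.
        else:
            s += 1
    return swaps

def assemble(arrival_order):
    """Feed blocks in arrival order; return the final layout and total swaps."""
    slots = []
    total_swaps = 0
    for block in arrival_order:
        total_swaps += receive(slots, block)
    return slots, total_swaps

layout, swaps = assemble([2, 0, 1])
```

For the arrival order 2, 0, 1 the model ends with the blocks in order after two swaps. The file itself only ever grows by appending, which is the whole point of the approach.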

The final approach is for the client to pre-write the file (with empty contents) before it starts receiving anything, and then to directly write received blocks into their correct locations. Pre-writing the file forces sequential allocation (possibly better than growing the file does), and rewriting parts of it later generally doesn't change this. The cost of this approach is a potentially significant startup delay, as the client writes what may be several gigabytes to disk.
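As a rough sketch of this pre-write approach in Python, assuming a fixed block size and with made-up function names (preallocate, write_block and demo are mine, and the four-byte block size is only there to keep the example small):

```python
import os
import tempfile

BLOCK_SIZE = 4  # unrealistically small, for illustration only

def preallocate(path, num_blocks, block_size=BLOCK_SIZE):
    """Write the whole file out as zeros, front to back, before any
    real data arrives. This is the startup cost of the approach."""
    with open(path, "wb") as f:
        for _ in range(num_blocks):
            f.write(b"\0" * block_size)

def write_block(path, index, data, block_size=BLOCK_SIZE):
    """Overwrite one block in place at its final offset."""
    with open(path, "r+b") as f:
        f.seek(index * block_size)
        f.write(data)

def demo():
    """Receive three blocks out of order and return the finished file."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "download.bin")
        preallocate(path, 3)
        for index, data in [(2, b"cccc"), (0, b"aaaa"), (1, b"bbbb")]:
            write_block(path, index, data)
        with open(path, "rb") as f:
            return f.read()
```

The sketch writes real zeros rather than just setting the file's length, because merely extending a file can leave it sparse, with nothing actually allocated on disk yet.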

(Note that many of these sequential allocation assumptions break down if you are using a log-structured or copy-on-write filesystem such as ZFS, where rewriting a block generally puts it in a new location. Copying the file again after you've received it may be the only good solution.)

I wish I could tell you that BitTorrent has solved this problem, but as far as I know it hasn't; you just get to pick which drawback you want. I believe that most BitTorrent clients today default to the second approach but give you an option to do the third.

tech/BitTorrentFragmentation written at 01:28:08

