How dd does blocking

August 31, 2006

For a conceptually simple program, dd has a number of dark corners. One of them (at least for me) is how it deals with input and output block sizes, and how the various blocking arguments change things around.

  • ibs= sets the input block size, the size of the read()s that dd will make. Since you can get partial reads in various situations, this is really the maximum size that dd will ever read at once.
  • obs= sets the output block size and makes dd 'reblock' output; dd will accumulate input until it can write a full sized output block (except at EOF, where it may write a final partial block).
  • bs= sets the (maximum) IO block size for both reads and writes, but it turns off reblocking; if dd gets a partial read, it will immediately write that partial block.
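To make 'partial reads' concrete, here is a small Python sketch (not dd itself) that writes small chunks into a pipe and then asks for a much larger block on each read. The function name read_sizes_from_pipe and the chunk sizes are made up for illustration; the point is only that read() hands back what is available, not what you asked for.

```python
import os

def read_sizes_from_pipe(chunks, ibs):
    """Write each chunk to a pipe, then read() up to ibs bytes back.

    Returns the size of every read, to show that a read from a pipe
    returns whatever is currently available rather than a full block.
    (Hypothetical helper for illustration only.)
    """
    r, w = os.pipe()
    sizes = []
    try:
        for chunk in chunks:
            os.write(w, chunk)            # small write into the pipe
            data = os.read(r, ibs)        # ask for a big block
            sizes.append(len(data))       # but get only what was there
    finally:
        os.close(r)
        os.close(w)
    return sizes

sizes = read_sizes_from_pipe([b"x" * 100] * 3, ibs=65536)
print(sizes)
```

Each read here asks for 64 KB and comes back with 100 bytes, which is the situation dd is in when something slow or bursty is feeding it.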

Because of the reblocking or lack thereof, 'ibs=N obs=N' is subtly different from 'bs=N'. The former will accumulate multiple partial reads together in order to write N bytes, while the latter won't.
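The difference is easier to see with numbers. The following Python sketch models the two behaviours as described above; it is my own simplified model and not dd's actual code, and the name copy_blocks is invented for the example.

```python
def copy_blocks(reads, obs=None):
    """Model the blocks dd would write for a sequence of read() results.

    reads: iterable of bytes objects, one per read().
    obs:   output block size for the reblocking case ('ibs=N obs=N');
           None models 'bs=N', where each read is written as-is.
    Returns the list of blocks that would be handed to write().
    """
    writes = []
    if obs is None:
        # bs= mode: no reblocking, a partial read becomes a partial write.
        for chunk in reads:
            if chunk:
                writes.append(chunk)
        return writes

    buffer = bytearray()
    for chunk in reads:
        buffer += chunk
        # Write out every full output block accumulated so far.
        while len(buffer) >= obs:
            writes.append(bytes(buffer[:obs]))
            del buffer[:obs]
    if buffer:
        # EOF: flush the final partial block.
        writes.append(bytes(buffer))
    return writes

# Four partial reads of 300 bytes each, with a 512-byte block size.
partial_reads = [b"a" * 300, b"b" * 300, b"c" * 300, b"d" * 300]
print([len(w) for w in copy_blocks(partial_reads)])           # like bs=512
print([len(w) for w in copy_blocks(partial_reads, obs=512)])  # like ibs=512 obs=512
```

In this model the bs= case writes four 300-byte blocks, while the obs= case writes two full 512-byte blocks and one 176-byte block at EOF. The same 1200 bytes go out either way; only the write sizes differ.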

(On top of this is the 'conv=sync' option, which pads every partial read out to the full input block size, normally with NUL bytes.)
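As a rough illustration of what that padding does to the data, here is a Python sketch that assumes NUL padding of each short read up to the input block size. The helper name sync_pad is hypothetical, and this is a model of the behaviour rather than anything taken from dd.

```python
def sync_pad(chunk, ibs, pad=b"\0"):
    """Pad a short, non-empty input block out to ibs bytes.

    Full-sized blocks are returned unchanged. (Hypothetical helper;
    assumes NUL padding.)
    """
    if 0 < len(chunk) < ibs:
        return chunk + pad * (ibs - len(chunk))
    return chunk

reads = [b"a" * 300, b"b" * 512, b"c" * 100]
padded = [sync_pad(chunk, ibs=512) for chunk in reads]

print([len(p) for p in padded])
print(sum(len(c) for c in reads), "bytes in,",
      sum(len(p) for p in padded), "bytes out")
```

Note that the padding lands wherever a short read happened, including in the middle of the stream, so with partial reads the output is no longer a byte-for-byte copy of the input.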

So if you're reading from a network or a pipe but want to write in large efficient blocks, you want to use obs, not bs (and you probably want to set ibs too, because otherwise dd falls back to its default 512-byte input block size and you'll be doing a lot of small reads, each one a separate system call).


Last modified: Thu Aug 31 19:14:44 2006