2013-05-13
The Unix philosophy is not an end in itself
Today I feel like opening a can of worms that I've alluded to before.
Here is something very important about the Unix philosophy (regardless
of what exactly that is): the Unix philosophy was not conceived as an
empty philosophy that was an end in itself. Instead it is above all a
theory about how to make computers easy, powerful, and useful. This
philosophy (or at least the things built by people following it at Bell
Labs and elsewhere) has been extraordinarily successful, and I'm not
just talking about Unix; concepts first pioneered in Unix and C now form
core pieces of pretty much every computer system in the world.
But it's possible to take this too far. To put it one way, it's my
strong view that the core goal of Unix is to be useful, not to be
philosophically pure. The underlying purpose comes first and fitting
how to be useful into 'the Unix way of doing things' comes second. If
Unix has to be non-Unixy for a while (or even permanently) in order
to be useful, then, well, I pick usefulness. Excessive minimalism
and 'Unixness' for the sake of minimalism and Unixness is a kind of
masochism.
(Of course the devil is in the details, as it always is. It's certainly
possible to ruin Unix without getting anything worth it in exchange.)
What this biases me towards is an environment where one solves the
problem first and then tries to make it fit into the traditional 'Unix
way' second. This is why part of me thinks that GNU sort's -h option is
perfectly fine: it solves a real problem (and solves it now).
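As a concrete illustration of the problem that -h solves, here is a small sketch. The sample sizes are made up for this example, and it assumes a sort that supports the GNU -h extension:

```shell
# Made-up sizes in the style of 'du -h' output.
sizes='1K
2M
512'

# Plain numeric sort only looks at the leading digits, so 512 sorts last.
printf '%s\n' "$sizes" | sort -n

# GNU sort's -h understands the K and M suffixes, so 512 sorts first.
printf '%s\n' "$sizes" | sort -h
```

Without -h you would have to convert the sizes back to plain numbers yourself before sorting, which is the kind of 'Unixy' purity that costs real time.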
(The counterargument is that Unix cannot be all things to all people.
As with all systems, at some point you have to draw a line and say 'this
doesn't fit, you need to go elsewhere'. I don't know how to balance
this. I do know that a certain amount of griping about 'the one true
Unix way' and how (some) modern Unixes are ruining it reminds me an
awful lot of the griping of Lisp adherents at the rise of Unix, and for
that matter the griping of Unix people (myself sometimes included) at
the rise of Windows and Macs.)
UnixPhilosophyPurpose written at 00:29:34
2013-05-05
Unix is not necessarily Unixy
As I've written about before, in some quarters there
is a habit of saying that everything added to Unix needs to be 'Unixy'.
One of the many problems with this is that a number of aspects of Unix
itself are not 'Unixy'. I don't mean that in a theoretical way, where
we debate about whether a particular API or approach is really 'Unixy'.
I mean that in a concrete sense, in that Bell Labs, generally regarded
as the home of Unix and the people who understand its essential nature
best, built various things differently than mainline Unix. In some cases
they did this after mainline Unix had established something, which is a
clear sign that they felt that other Unix developers had gotten it wrong.
(In the end their vision of the right way to do things was so extreme
that they started over from scratch so they didn't have to worry
about backwards compatibility. The result of that was Plan 9.)
The easiest place to see this is in the approach that Bell Labs took to
networking. Unfortunately I don't believe that manual pages from post-V7
Research Unix are online, but the next best thing is the networking
manual pages for Plan 9 (which has essentially the same interface from
what I understand). Plan 9 networking is completely different from the
BSD sockets API that is now the Unix standard; it is in large part much
more high level. You can read about it in the Plan 9 dial(2) manpage, and a version of
this interface without the Plan 9 bits has resurfaced in the Go net
package's Dial() and Listen() APIs.
You can certainly argue that these APIs are fundamentally not comparable
to the BSD sockets API because they're on a different level (the BSD
sockets API is a kernel API, while most of the Plan 9 API is implemented
in library code). But in a sense this is beside the point, which
is that the Plan 9 API is how Bell Labs thought programs should do
networking.
(You can also argue that the Plan 9 API is insufficient in practice
and that programs need and want more control over networking than it
offers. I'm sympathetic to this argument but it does open up a can of
worms about when one should discount the Bell Labs view on 'what is
Unix' and what can replace it.)
UnixIsNotUnixy written at 23:37:01
2013-05-01
Two xargs gotchas that you may not know about
I know, I've been harping on xargs a bit lately. But this stuff is
important because most people's vague intuitions about how xargs
behaves are actually wrong.
If you're like most people, you probably vaguely think that xargs
operates on lines of input and the purpose of the GNU -0 extension to
xargs (and find et al) is so that some joker putting a newline in a
file name doesn't cause the world to blow up. Actually it's much worse
than that.
The simple way to put this is that xargs doesn't operate on lines, it
operates on words. Words are the same as lines only if your lines
don't have any whitespace, backslashes, single quotes (') or double
quotes ("), all of which xargs will interpret in various ways. Oh,
and blank lines are neither errors nor empty arguments under normal
circumstances, they are simply word-separating whitespace. In short,
newlines are only the beginning of the things that nasty people can put
in their filenames to give you heartburn.
(Normally you don't see any of this because your input to xargs is
well formed and simple.)
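A minimal sketch of the word splitting described above. The count_args helper is a hypothetical name invented for this illustration; it just reports how many separate arguments xargs built from standard input. The last command assumes an xargs with the -0 extension:

```shell
# count_args is a hypothetical helper for this illustration: it reports
# how many separate arguments xargs built from standard input.
count_args() {
    xargs sh -c 'echo "$#"' sh
}

# One line with a space becomes two arguments.
printf 'my file.txt\n' | count_args        # prints 2

# Blank lines are only separators, not empty arguments.
printf 'a\n\n\nb\n' | count_args           # prints 2

# Double quotes are consumed by xargs and group words together.
printf '"my file.txt"\n' | count_args      # prints 1

# With the -0 extension only NUL bytes separate arguments.
printf 'my file.txt\0' | xargs -0 sh -c 'echo "$#"' sh    # prints 1
```

The first case is the one that bites in practice: a single file name with a space in it silently turns into two different file names.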
The other trap (as I alluded to) is the
portable behavior of xargs if you don't give an explicit -E
argument. If you don't, some versions of xargs will assume that a
line with only an underscore (_) actually means the (logical) end
of file and won't read any further input. It will probably surprise no
one that Solaris 10 update 8 (that bastion of old times) behaves this
way. Fortunately Linux, FreeBSD, and OpenBSD don't appear to do so.
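Here is a small sketch of the logical end-of-file behavior. It assumes an xargs that follows the SuS description of -E, under which an empty -E argument disables the feature; the END marker is an arbitrary string chosen for this example:

```shell
# With an explicit end-of-file string, input after it is ignored.
printf 'a\nEND\nb\n' | xargs -E END echo      # prints: a

# An empty -E argument asks xargs to disable the feature, so a lone
# underscore is passed through like any other word.
printf 'a\n_\nb\n' | xargs -E '' echo         # prints: a _ b
```

If you care about running on old systems, the second form is probably the defensive choice whenever your input could contain a bare underscore.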
(One of the morals here is that sometimes GNU programs make important
innovations, as I believe that xargs -0 and find ... -print0 came
from GNU.)
XargsTwoGotchas written at 23:48:39
2013-04-25
How SuS probably requires the 'run at least once' xargs behavior
A commentator left a long comment on my entry about how xargs
behaves with no input arguing that the
Single Unix Specification for xargs
actually requires it to not run if standard input is empty. I think
it's more likely to be the other way around, so today I want to run
down why I think the SuS probably requires this annoying behavior.
There are two important sections of the SuS xargs specification
here and I'm going to quote both, bolding important bits:
The xargs utility shall construct a command line consisting of
the utility and argument operands specified followed by as many
arguments read in sequence from standard input as fit in length and
number constraints specified by the options. The xargs utility
shall then invoke the constructed command line and wait for its
completion. This sequence shall be repeated until one of the following
occurs:
- An end-of-file condition is detected on standard input.
[... other conditions elided ...]
[...] The utility named by utility shall be executed one or more
times until the end-of-file is reached or the logical end-of-file
string is found. [...]
Now we get to play the fun game of interpreting standards. The easiest
place to play this game with is the last sentence I quoted, which
says both that the utility shall be executed at least once and that
this happens until end-of-file is reached. If end of file is reached
immediately, which takes precedence? In the style of reading standards
that I've absorbed, explicit statements generally trump implications;
that would mean that the explicit promise that utility shall be
executed at least once trumps the potential implication of not running
it on immediate EOF.
The first paragraph as a whole offers a similar conflict. It is easy to
read it as a series of steps: first read in as many arguments as you can
that fit, then run the command, and only then check for exit conditions
and repeat if they are not met. You don't check for exit conditions
before you run the command once because that's not what the series
of steps tells you to do, and 'zero arguments' is not ruled out as a
valid number of arguments to read from standard input; ergo, xargs
runs the command line once even on immediate EOF. You can also read it
as a general description instead of a series of steps, with the 'this
sequence shall be repeated until ...' forming the framing procedure
around the specific two steps used to form and run each command line; in
this reading it's correct to run zero times if there is an immediate end
of file on standard input since the framing loop's exit condition has
been met.
If we read the first paragraph using an 'explicit trumps implicit' rule
then I think that we have to conclude that the paragraph is the set of
steps that xargs is intended to follow as it executes because this is
exactly how the paragraph is written. This interpretation is reinforced
by the 'one or more times' language in the later paragraph.
None of this is unambiguous; the SuS specification never comes out and
says outright 'xargs runs once even if it reads no arguments'. But
given how much the usual extremely legalistic, 'every word and phrase
and ordering decision counts' approach to reading standards pushes us
towards the 'xargs runs once on EOF' interpretation, I think it's
probably what SuS actually requires.
(Note that none of this matters in practice. As covered in the first
entry, existing systems have no common behavior.
The closest you can get is to always specify -r so that xargs does
not run once, which works on GNU findutils, sufficiently recent FreeBSD,
and OpenBSD.)
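For illustration, this is what -r looks like in use, assuming an xargs that accepts it (it is not in the SuS specification):

```shell
# With -r, xargs skips the command entirely when it reads no arguments.
xargs -r echo does run </dev/null        # prints nothing

# With input present, -r changes nothing.
printf 'a b\n' | xargs -r echo got       # prints: got a b
```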
PS: this is not the most crazy thing in the SuS xargs specification.
If you care about xargs portability and want to be horrified, read
the description of -E carefully.
(Also, these crazy things are almost certainly not the fault of
the SuS authors.)
XargsZeroArgsIssueII written at 01:02:16
2013-04-10
Some important things about OpenBSD PF's max-* options
In older versions of the OpenBSD pf.conf manpage (such as the one you
may be running on a firewall that is too important to reboot, much less
put through a chancy upgrade), the 'max <number>' option for stateful
tracking is described this way:
Limits the number of concurrent states the rule may create. When this
limit is reached, further packets that would create state will not
match this rule until existing states time out.
This is, how shall I put it, a lie (as before). In current versions of pf.conf this phrasing has
been declared inoperative and revised to be 'further packets that would
create state are dropped until existing states time out'. This new
phrasing is correct as far as it goes but it leaves several important
things out.
First, this also applies to all of the max-* variants (max-src-nodes,
max-src-states, max-src-conn, and max-src-conn-rate). You could maybe
deduce this because the manpage doesn't say anything about what they do
when the limit is hit, so clearly they inherit max's behavior (this is
the way of Unix manpages).
Next, how things are logged (if they are logged) depends on your
OpenBSD version. In OpenBSD 4.4, this dropping is completely silent
and in fact happens after the point where packets are logged (so if
you specify log on such a rule, what it logs will sometimes be a
lie; it will claim that packets are accepted when they were in fact
dropped). Because overload <table> is only used (or allowed) for the
TCP connection limits, this means that there is essentially no way to
tell when a UDP ratelimit (perhaps one to limit traffic to your DNS
server) has triggered or what it
affects.
(You can watch some pfctl -si counters tick up. This is not very much
use if you want to know what your ratelimit is affecting and whether it
is too small, too big, or just right.)
In OpenBSD 5.2 the logs are now honest, as far as I can tell from the
kernel source code (I don't have a handy 5.2-based firewall where I can
test this). The logs will now accurately record both that a packet
was dropped and that it was dropped due to connection limits. There is
still no way to log just dropped packets but at least you can now log
all traffic and sort out the mess later (assuming that your logs do not
explode from the volume).
(The overload <table> clause still only applies to TCP connections.
As far as I can tell this is a completely artificial limitation in PF
and I personally think it's a stupid one. I would certainly like to be
able to automatically put IPs that are hammering on our DNS server with
UDP queries into a table to be blocked wholesale for a while.)
My overall conclusion from my recent experiences (this included) is that
OpenBSD PF is not very good for UDP ratelimiting. For instance, actual
volume per time limits can only be constructed indirectly and only work
for some UDP-based protocols (and, I think, often only for cooperative
clients).
(I'm not completely sure how OpenBSD matches states for UDP packets, but
I have a sneaking suspicion that a DDoS program that reused the same UDP
source port for all its forged DNS queries would match an existing PF
UDP state table entry and so never hit PF's state table entry based rate
limits. You can't play this trick with TCP connections because they
have actual connection state.)
OpenBSDPfMaxNotes written at 00:15:53
2013-04-04
An irritating OpenBSD PF limitation on redirections
I am generally fond of OpenBSD's PF packet filter but every so often I
run across a seemingly arbitrary limitation that drives me up the wall.
Today's limitation is on where you can redirect packets to as part of
NAT'ing and general address translation. I'll start by sketching out a
simplified version of the problem I'm trying to solve.
Part of our complex networking setup is a scheme where specific internal
machines, sitting on 'sandbox' subnets in private address space, can be
reached by the outside world through public IP addresses that sit on
what is effectively a virtual subnet. Through a complex dance involving
two firewalls, these machines are bidirectionally NAT'd to their public
IPs when they talk to the outside world. Our problem is that sometimes
internal machines try to use the public IPs, and we'd like to make that
work.
What we want to do is conceptually simple: when a packet from the
internal network and to the public IP shows up on the sandbox firewall,
it should be rewritten to the internal IP instead and put back on the
internal network. Something like, in pf-ese:
pass in quick on $int_if from <int_lan> to $PUBIP rdr-to $INTIP route-to $int_if
(It's not necessary to rewrite the source address and in
fact it's a feature to not do so. Update: as covered in comments, it
may be necessary to rewrite the source address to force return traffic
to flow through the firewall to be fixed up.)
As it happens, OpenBSD PF is specifically documented (in the
pf.conf manpage)
to not allow this:
Redirections cannot reflect packets back through the interface they
arrive on, they can only be redirected to hosts connected to different
interfaces or to the firewall itself.
In the fine OpenBSD tradition this is in
fact not completely true. The specific LAN segment that is $int_if
actually has two separate subnets on it for historical reasons and machines on the other subnet can
talk to $INTIP through this rdr-to rule without problems. It's
only machines on the same subnet that can't (and not because PF
blocks the packets; I've checked).
What I assume is happening is that PF and OpenBSD's routing stack are
interacting badly. Under normal circumstances a router will not route
a packet from host A on a subnet to host B on the same subnet (at most
it will send an ICMP redirect). In an ideal world PF would be able to
bypass this restriction when it rdr-to's something, especially with an
explicit route-to (in my books, route-to should mean 'shut up and
send the packet out that interface no matter what'). In this world PF
apparently can't, which is an irritating limitation that gets in the
way of what I maintain is a perfectly sensible thing to want to do.
(There are any number of cases where you might want to redirect traffic
nominally to the outside world back to an internal machine.)
PS: as the pf.conf manpage notes, theoretically the way around this
is to add NAT'ing with a pass out rule. I was unable to get this to
work when I tried it but I might have been using options that were
slightly wrong. I assume that this NAT'ing process is enough to fool
the routing system into accepting the packet as something that could
be validly routed.
(On the other hand, if 'pass out' is applied after routing is done
I don't see how this can work. It would make sense for it to be a
post-routing action, since routing is what normally decides the outbound
interface, but the pf.conf manpage doesn't document whether this is
the case or whether some deep magic is happening.)
OpenBSDPfRedirIssue written at 02:54:24
2013-04-02
Why listen(2)'s backlog parameter has such an odd meaning
In light of what the listen(2) backlog parameter actually means, ie very little, you might sensibly wonder why
it has such an odd and basically useless definition. For instance, it
might be quite useful to be able to put a real limit on the number of
TCP connections that have been fully established but not accept()'d by
your server.
The simple answer is that the listen(2) backlog is not about helping
your program out, it is about limiting how many kernel resources can be
(invisibly) consumed by connections that haven't yet been accept()'d
and surfaced in your program (and limited, if only by file descriptor
limits). Such connections are basically invisible to normal tools,
programs, and Unix limits because they haven't yet been materialized
as file descriptors and exist only in the depths of the kernel socket
stack. This means that the kernel needs to limit them somehow. In theory
the kernel could have applied the same limit to everything and not
provided any way for applications to change it. In practice, I suspect
that the early developers of BSD wanted to allow a way for some select
daemons they expected to be unusually active to raise the normal limit
(and perhaps for very inactive daemons to lower it to save kernel
memory).
This leads to a simple but not particularly useful rule for what the
listen(2) backlog actually limits: anything that your kernel thinks
uses enough resources to care about. And this has changed over time.
As kernels have found clever ways to handle various things that have
traditionally consumed resources (such as half-open TCP connections),
they've stopped counting against the backlog limits. Some of this
evolution has been driven by necessity (such as people on the Internet
exploiting half-open TCP connections as one of the first denial of
service attacks) and some of it has simply been driven by the cleverness
of kernel programmers. This has led to the current situation where
understanding the effects of any specific backlog requires knowing
something about the kernel implementation of the specific socket type
involved and what things in it do and don't use up kernel resources.
(See also Derrick Petzold's somaxconn - That pesky limit, which has some interesting
quotes from Stevens' Unix Network Programming.)
WhyListenBacklog written at 01:04:29
2013-03-30
One irritation in xargs's interface
Xargs is generally a nice command that more or less works right.
Some people could criticize Unix for needing it so much (which is
mostly a product of command line length limitations) and the need for -0 is a bit annoying,
but on the whole it's good. But xargs has one little corner case
that is really annoying; as a bonus, it's even non-portable in an
irritating way.
Here it is, presented in illustrated form:
$ xargs echo does run </dev/null
Now the question: will this produce any output? In other words, does
xargs run the command once even if there are no (extra) arguments
to give to it?
The answer is that it does in some but not all versions of xargs:
- Solaris 10 runs echo once and has no option to disable this.
- GNU findutils xargs (commonly used on Linux) normally runs echo
  once but can turn this off with -r aka --no-run-if-empty.
- FreeBSD doesn't run echo and has no option to change this.
  Recent versions accept -r for compatibility with GNU xargs;
  old versions don't.
- OpenBSD runs echo once but can turn this behavior off with -r.
- Mac OS X doesn't run echo and has no -r argument.
Based on the current manpage, NetBSD xargs behaves the same as
FreeBSD xargs (including accepting a do-nothing -r argument).
The Single Unix Specification for xargs is
rather ambiguous about what behavior is allowed or required; it
certainly never definitely states things either way (and it has no
-r argument). My close reading leads me to believe that SuS
probably requires xargs to run echo once, but only by implication.
This would match what I believe is historical behavior (as suggested
by Solaris, which is very historical). I assume that at some point
FreeBSD decided that this historical behavior was a bad idea and
changed it.
My view is that (historical) xargs behavior is stupid and is a bear
trap waiting to bite you in unusual situations. You almost never
want to run the xargs command even if there is nothing for it to
operate on. In many situations and usages you'll get odd results if
there is nothing to operate on; in extreme cases you may get dangerous
explosions. This is an easy issue to overlook because everyone almost
always uses xargs in situations that do generate argument lists
(especially when you're testing your command lines or scripts). In fact
I suspect that many people using xargs on Linux, Solaris, and OpenBSD
machines don't even know about this potential gotcha, which sort of
proves my point.
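On a system whose xargs has no way to turn this off, one possible workaround is to check for input yourself. The following is only a sketch: xargs_if_any is a hypothetical helper name rather than a standard utility, and it assumes that mktemp and a POSIX grep are available:

```shell
# xargs_if_any is a hypothetical helper, not a standard utility.
# It spools stdin to a temporary file and only runs xargs when the
# spool contains at least one non-blank character.
# (POSIX sh has no local variables, so 'spool' and 'status' leak.)
xargs_if_any() {
    spool=$(mktemp) || return 1
    cat >"$spool"
    if grep '[^[:space:]]' "$spool" >/dev/null; then
        xargs "$@" <"$spool"
        status=$?
    else
        status=0
    fi
    rm -f "$spool"
    return "$status"
}

# No input: echo is never run, so there is no output.
xargs_if_any echo does run </dev/null

# Some input: behaves like plain xargs and prints 'got a b'.
printf 'a b\n' | xargs_if_any echo got
```

The cost is that all of the input has to be spooled before anything runs, so this gives up the streaming behavior of a plain pipeline.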
(This entry is yet another illustration of how a simple entry idea can
turn out much more interesting than I expected when I started writing
it. Before I started actually checking systems I would have confidently
told you that all versions of xargs would run echo once; I had no
idea how tangled the actual situation was.)
XargsZeroArgsIssue written at 01:44:47
2013-03-21
The FreeBSD iSCSI initiator is not ready for serious use (as of 9.1)
Our ZFS fileservers use iSCSI to talk
to the backend disk storage, so if FreeBSD is to be a viable Solaris
replacement for us its iSCSI initiator
implementation needs to be up to the level of Solaris's (or Linux's).
I recently did some basic testing to see if it looked like it was, and
I'm afraid that it's not; as of FreeBSD 9.1, the iSCSI initiator seems
suitable only for experimentation.
In my testing I ran into two significant issues and one major issue,
after which I stopped looking for further problems because it seemed
pointless.
The first issue is clear in the FreeBSD iscsi.conf(5) manpage,
which is simply full of 'not implemented yet'
notes for various iSCSI connection parameters. Unfortunately a number of these
are (potentially) important for good
performance. This is the least significant issue, since I wouldn't
really care about it if everything else worked (instead it would
just be a vague caution sign).
The second issue is how the iSCSI initiator is managed and
connections are established. Basically, FreeBSD provides absolutely
no support for this and in particular there is no boot time
daemon that you run to connect to all of your configured iSCSI
targets. Instead you get to somehow run one instance of
iscontrol(8) per
target, restarting any that die because their error recovery is (as the
manual says) not 'fully compliant' or due to other issues. iscontrol
has at least some bad limitations; in my experimentation, it would not
start against a target that had no LUNs advertised (which is valid). I
did not test its behavior if all LUNs of a target got deleted while it
was running, but I wouldn't be surprised if it also exited (which would
be a bad problem for us).
Both of these pale against the major issue, which was performance or
the lack thereof. I'm not going to quote numbers for reasons I'll
discuss later, but it was bad at all levels: in ZFS, in UFS, and just
banging on the raw iSCSI disk. Streaming read performance was roughly
3/5ths of what Linux got in the same environment. Untuned streaming
write performance was one tenth of the streaming read performance,
but with drastically increased iSCSI parameters (not needed for Linux)
FreeBSD managed to pull that up to 2/3rds of its read performance and
2/5ths of what Linux could get. Web searches turned up other people
reporting catastrophically bad iSCSI write speeds (I suspect that they
did not tune iSCSI parameters, which reduces it to merely bad), so I
don't think that this is just me or just my test environment.
This level of (non)performance is a complete non-starter for us. It's
so bad that there is no point in me spending more time to go beyond my
quick experiment and basic tests. Under other circumstances I might have
looked at the code and dug into things further to see if I could find
some fixable defect, but I don't feel that there's any point here. The
other issues make it clear to me that no one has run the FreeBSD iSCSI
initiator in production (at least no one sane), and I have no desire to
be the first person on the block to find all of the other problems it
may have.
(The situation with iscontrol alone makes it clear that no one has
exposed this to real usage, because no sane sysadmin would tolerate
running their entire iSCSI initiator connection handling that way. I
don't object to separate iscontrol instances; I do object to no master
daemon and no integration with the FreeBSD startup system.)
(Also, you don't want to know how FreeBSD handles or in this case
doesn't handle the various iSCSI dynamic discovery methods.)
All of this leaves me disappointed. I wanted FreeBSD to be a viable
competitor and alternative, something that we could really consider.
Now our options are much narrower.
(Well, I can always hope that the FreeBSD iSCSI initiator improves
drastically in the next, oh, year, since we're not about to replace our
current Solaris fileserver infrastructure right away. We've only just
started to think about a replacement project; it may be two or three
years before we actually need to make a choice and deploy.)
Sidebar: my test environment and cautions
At this point I will say it out loud: I was not testing FreeBSD on
physical hardware. I discovered all of this during very basic tests
in a virtual machine. Normally this would make even me question my
results, but I did a number of things to validate them. First, I tested
(streaming) TCP bandwidth between the FreeBSD VM and the iSCSI backend
(which is on real hardware) and got figures of close to the raw wire
bandwidth; I can be reasonably sure that the FreeBSD VM was not having
its network bandwidth choked by the virtualization system. Second,
I also ran a Linux VM in the same virtualization environment and
measured its performance (network and iSCSI). As noted above, it did
significantly better than FreeBSD did (despite actually having less
RAM allocated to it).
It's always possible that FreeBSD iSCSI is choking on something about
the virtualization environment that doesn't affect its raw TCP speed or
Linux. My current view is that the odds of this are sufficiently low
(for various reasons) that it is not worth the hassle of spinning up a
physical FreeBSD machine just to be sure.
(Partly this is because I found other people on the Internet also
complaining about the FreeBSD iSCSI write speeds. If I was the sole
person having problems, I would suspect myself instead of FreeBSD.)
I suppose the one quick test I should do is to feed the FreeBSD VM a
whole lot more memory to see if that suddenly improves both read and
write speeds a whole lot. But even if FreeBSD had Linux-level read and
write performance, the other significant issues would probably sink it
here.
FreeBSDiSCSIClientNoGo written at 00:37:53
2013-03-15
A dive into the depths of yes `yes no`
The blog entry of the current time interval is m. tang's yes `yes no`, which winds up exploring just
that; as the author says, doing it 'sort of slowed my computer below
the threshold of usefulness, so I had to restart it'. Unfortunately
the author's original explanation that the second, outer yes buffers
the output of yes no endlessly (eating up all memory in the process)
is totally wrong (as various people have noted and corrected since the
entry started going around). As it happens I think that there are some
interesting things hiding under the covers here, so I'm going to talk
about them.
First off, let's understand why this command line probably explodes
your system. To be clear I'll rewrite this in a more modern but less
aesthetically pleasing shell syntax and call it 'yes $(yes no)'.
This is more or less equivalent to:
shvar=$(yes no)
yes $shvar
Just the first line alone is enough to blow up your shell because it
asks the shell to read an endless amount of input and try to hold it in
memory (here in the form of a shell variable). The same thing happens in
the original command line, just without the intermediate shell variable.
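To see the mechanism without blowing anything up, here is a bounded sketch of the same thing. The head commands are my addition to make both sides finite; they are not part of the original command line:

```shell
# Bound the inner command so the substitution can finish.
shvar=$(yes no | head -n 3)

# The shell has now collected all of the output; unquoted, it is
# split into three separate 'no' arguments for the outer yes.
yes $shvar | head -n 2
```

This prints 'no no no' twice. The outer yes is only started once the inner command's output has ended, which with an unbounded 'yes no' never happens.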
You might wonder why the shell doesn't have some limit on how much
input it's willing to read this way. While this is a self-inflicted
accident, it's not as if Unix machines really deal well with running
out of memory; on a 64-bit machine you could easily blow up the entire
system doing this (on a 32-bit machine you might run into per-process
address space limits before then). Saving you from this would be at
least somewhat nice. I suspect that the real answer basically boils down
to 'tradition'; this is such a rare (and self-inflicted) situation that
no shell has bothered to deal with it yet and since no shell has, not
dealing with it has become the default.
(Unix has a great deal of this sort of 'someone else did it this way
first' historical practice that has basically fossilized over the years.
Even if it doesn't necessarily entirely make sense it's often easier for
people who are reimplementing commands to just go with existing (lack
of) practice. Part of this ties into the social problems involved with
changing things in Unix.)
Beyond that, though, there are some issues with having a limit. First
you have to decide on the semantics of what happens when the limit is
hit. Do you discard the output entirely or truncate it? Do you count
this as a failure for the purposes of set -e or do you pass on the
exit status of the 'yes no', whatever that is (and it may not be a
failure)? In the case of 'yes $(yes no)' do you even try to run the
second yes (with a truncated or empty argument list) or do you fail
the entire command on the spot? There are arguments either way for
much of this (and the choices interact with each other); you'll have
to figure out what the most useful answers are in practice, whether your
proposed change breaks any existing script practices, and then how much
POSIX lets you get away with (if you care about being a POSIX-compatible
Bourne shell; things like zsh have it easier here).
Then there's the issue of what the limit should be. We don't want
to just limit the shell to the kernel's exec() limit (which
is on the combined size of the arguments and the environment); it's valid to simply read a lot of output
into an unexported shell variable and then process it. In fact in some
situations this is how you deal with an 'arguments too big' problem. So
what do you set? People are probably going to complain about almost any
value.
(I suppose the real answer is to have the limit be user-settable. You
could even start out with the limit available but default to unlimited,
then a year or two later introduce a default limit.)
Sidebar: $() and set -e today
Since I just experimented with this:
set -e
echo 1 $(false)
v=$(false)
echo 2
This will echo '1' but not '2' in Bash, dash, ksh, FreeBSD's sh, and
Solaris 10's /usr/xpg4/bin/sh, which makes me assume that this is
actually what POSIX requires. Just to be different, Solaris 10's
/bin/sh doesn't echo anything (even if you flip the order around).
(I have not been masochistic enough to obtain and boot a PDP 11 V7 image
just to see what the V7 sh would do, but I suspect it's the same as
Solaris /bin/sh.)
ExploringYesYesNo written at 00:13:17
These are my WanderingThoughts
This is part of CSpace, and is written by ChrisSiebenmann.