Wandering Thoughts

2015-01-31

The problem with punishing people over policy violations

Back in my entry on why user-hostile policies are a bad thing I said that I believed threatening to punish people was generally not effective from a business perspective. I owe my readers an explanation for that, because on the surface it seems like an attractive idea.

The problem with punishing people is that practically by definition a meaningful punishment must hurt, and generally it can't hurt retroactively. However, when you hurt people, and especially when you hurt people's future with you (through bad performance reviews because of policy violations, docking their future pay, and so on), the people involved may decide to react to the hurt by simply quitting and finding another job.

This means that any time you are contemplating punishing someone in a meaningful way, you must ask yourself whether whatever they did is bad enough to risk losing them over it (or bad enough that you should lose them over it). Sometimes the answer will be yes because it was really, really bad; sometimes the answer will be yes because they're easy to replace. But if it wasn't a really bad thing, and if they would be disruptive to lose and a pain to replace, well, do you want to run that risk?

Obviously, the worse your punishment is, the higher the chance of this happening. In particular, if your punishment means that they'll wind up noticeably underpaid relative to their counterparts elsewhere (whether through denial of promotion, denial of performance raises, or so on), you'd better hope that they really love working for you.

(You can always hope that they'd have a hard time finding another job (or at least another job that's as attractive as yours even after you punish them), so that they have no real choice but to suck it up and take it. But for high-demand professionals this is probably not very likely. And even if it's the case now, you've armed a ticking time bomb; I suspect that you're going to lose them as soon as they can go.)

(This is separate from the additional problems of punishing people at universities, where I was more focused on removal of computer or network access than a larger view of punishments in general.)

tech/PolicyPunishmentProblem written at 23:35:59

Upgrades and support periods

Suppose, hypothetically, that you are a vendor and you want to push people to upgrade more frequently. No problem, you say: you will just reduce the support period for your old releases. This is a magic trick that will surely cause everyone to upgrade at least as fast as you want them to, basically at a pace that you choose, right?

Well, no, obviously not. There are clearly at least two forces operating here. On the one hand you have people's terror of lack of support; this pushes them to upgrade. On the other hand, you have people's 'terror' of the work and risk involved in upgrades; this pushes them to not upgrade. Pushing with ever-shorter support periods from the vendor side can only get you so far, because the other force is pushing back against you, and after a certain point people simply don't move any more. Once you've hit that point you can reduce your support period all you want, but it won't have any effect.

Generally I think there will be diminishing returns from shorter and shorter support periods as you push more and more people to their limit of terror and they say 'well, to hell with it then'. I also suspect that this is neither a linear decay nor a smooth one; there are probably inflection points where a whole lot of people will drop out at once.
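
If it helps to make the dynamics concrete, here's a deliberately crude toy model in Python. All of the numbers are invented; it's only meant to show the shape of the thing, where shortening the support period eventually stops producing upgrades and only produces dropouts:

    # A crude toy model (all numbers invented): each user will upgrade at
    # most so often; demanding upgrades more often than that doesn't speed
    # them up, it pushes them into the 'to hell with it' bucket.
    def outcomes(support_years, tolerances):
        required = 1.0 / support_years  # upgrades per year you're demanding
        upgrades = sum(1 for t in tolerances if t >= required)
        dropouts = len(tolerances) - upgrades
        return upgrades, dropouts

    # A population that tolerates at most one upgrade every 1 to 5 years:
    users = [1.0, 0.5, 1.0 / 3, 0.25, 0.2] * 20
    for years in (5, 3, 2, 1, 0.5):
        print(years, outcomes(years, users))

Run this and you see plateaus punctuated by cliffs, not a smooth curve: nothing changes until the required pace crosses some group's tolerance, and then a whole block of users stops upgrading at once.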

Aggressively lowering your support periods will have one effect, though: it can persuade people to totally abandon your system and go find another one that isn't trying to drag them around through terror. This is a win only if you don't want users.

(By the way, the rapidly upgrading products in the real world that do this at scale don't do it by having short support periods.)

tech/UpgradesAndSupport written at 01:31:54

2015-01-30

I've come to believe Django's way of defining database tables is wrong

Django defines both database tables and HTML forms in the same way, a way that seems to be extremely common in web frameworks across several languages (and which I think first surfaced in Rails, although I may well be wrong there):

class AForm(forms.Form):
    login = forms.CharField(...)
    email = forms.EmailField(...)
    ...

This is very appealing initially, and Django goes well out of its way to make it all work. But over time I've come around to feeling that this is in fact the wrong way to do forms, database tables, and so on in Python. Why it's wrong boils down to one famous sentence from 'import this' (aka the Zen of Python):

Explicit is better than implicit.

Django form classes and database classes are full to the brim with implicit magic. They're essentially an illusion. Worse, they're not really a very Pythonic illusion. We accept the illusion because it's convenient and because this way of defining forms and tables has become more or less standard, but that doesn't mean that it's right in Python.

(My view is that the initial Rails version was a reasonably natural looking DSL that happened to also be valid Ruby code with the right mangling. The Python version is clearly not something you can read as a DSL, so I think the seams show much more; it looks like Python but it's kind of bizarre Python.)

Given the Tim Peters remark I led with, I think a more Pythonic way would make explicit various things that are currently implicit. I don't have a good handle on what that would look like, though. Doing the setup in class __init__? Defining the table or form layout by calling code instead of defining a class? Either would be (or at least could be) more explicit and less magical.
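
To illustrate the direction I mean, here's a little toy sketch of the call-based flavor. This is an invented API, not real Django and not a concrete proposal; the point is just that every field exists because a visible call created it:

    # Toy sketch only; 'Form' and 'add_field' are invented names, not
    # anything Django actually offers.
    class Form:
        def __init__(self, name):
            self.name = name
            self.fields = {}

        def add_field(self, fname, kind, **opts):
            # A field exists because this call ran, not because a
            # metaclass scanned a class body for special attributes.
            self.fields[fname] = (kind, opts)

    aform = Form("AForm")
    aform.add_field("login", "char", max_length=32)
    aform.add_field("email", "email")

It's wordier and less pretty than the class-based version, but nothing here happens behind your back.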

(Part of the issue is that Python has decided that structures with named fields are only really available as classes. Once you have classes, all sorts of temptations start materializing and looking at least partly natural.)

PS: It's my view that the magical, illusory nature of Django form and table definitions starts showing through very distinctly once you want to do more complex things with them. Like much magic, it works best if you don't touch it at all.

python/ORMMagicClassesWrong written at 02:14:09

2015-01-29

The practical result of OpenBSD's support policy

Recently I read Ted Unangst's long term support considered harmful (via), where he mentions OpenBSD's relatively short term support policy (it's one year at most and less in practice) and then says:

Now on the one hand, this forces users to upgrade at least once per year. On the other hand, this forces users to upgrade at least once per year. [...]

Oh, how this makes me laugh. Sadly.

We have a not insignificant number of OpenBSD firewalls. I don't believe that any of them are running a currently supported OpenBSD release, and if any are it's not because we upgraded them; it's because they were set up recently. Other than that, we leave the firewalls strictly alone until we have to change them for some reason.

(One of the reasons we never attempt to upgrade firewalls is that OpenBSD more or less explicitly doesn't have backwards compatibility between releases in things like PF; OpenBSD can and has changed PF syntax, rules, and rule handling around from release to release. When the core of your firewall may have changed, upgrades are not a 'cvs up; recompile', they are a full exercise of 'install a spare machine then retest and requalify everything from scratch' (which in our small environment is done by hand on scrounged hardware). Deployment of the result has its own pains.)

I don't think we're alone here; I suspect that there are lots of people running OpenBSD releases that are out of support. OpenBSD's short support period certainly accomplishes the goal of fewer (valid) bug reports to OpenBSD and less work for OpenBSD to do, but it doesn't necessarily either get users to upgrade or reduce the actual bugs that they may encounter. Instead the effect of this short support period and lack of long term support is to maroon more OpenBSD users without any support at all.

I doubt that OpenBSD cares about such marooned users, but as usual I feel like mentioning the practical results of a policy, not just the theoretical ones.

All of this leads me to laugh very hollowly at Ted Unangst's conclusion that:

A one year support window isn't too short; it's too long.

I'm pretty certain that the major goal this would achieve would be to allow OpenBSD to reject even more bug reports.

(There is a general lesson here but I'm going to leave it to another entry.)

Sidebar: OpenBSD and practical support

To put it very simply, we aren't much worried by the lack of support for our current firewalls because we don't expect any support, and I don't think we'll ever file any bug reports with OpenBSD even if we find issues (which we do from time to time). Our attitude on the state of OpenBSD is 'we get what we get and it's nice if it works'. If it doesn't work, we find something that does.

(This is why we ran Fedora L2TP VPN servers for a while.)

Some people might consider this lack of bug reports to be antisocial, but I have some opinions for them.

unix/OpenBSDSupportPolicyResults written at 01:07:27

2015-01-28

A thought about social obligations to report bugs

One of the things that people sometimes say is that you have a social obligation to report bugs when you find them. This seems most common in the case of open source software, although I've read it said about, eg, developers on closed source platforms. Let's set aside all of the possible objections to this for the moment, because I want to point out an important issue here that I feel doesn't get half as much attention as it should.

If users have a social obligation to report bugs, projects have a mirror social obligation to make reporting bugs a pleasant or at least not unpleasant experience.

Put flatly, this is only fair. If you are going to say that people need to go out of their way to do something for you (in the abstract and general sense), I very strongly reject the idea that you get to require them to go through unpleasant things or get abused in the process. If you try to require that, you are drastically enlarging the scope of the social obligation you are trying to drop on people, and this is inequitable. You're burdening people all out of proportion for what they are doing.

As a corollary to this, if you want to maintain that users of any particular project (especially your project) have a social obligation to report bugs to 'pay for' the software, you have the obligation of 'paying for' their bug reports by making that project's bug reporting a pleasant process. If you create or tolerate an unpleasant bug reporting process or environment while putting pressure on people to report bugs, you are what I can only describe as an asshole.

(You're also engaged in something that is both ineffective and alienating, but I'm not talking about practicalities here, I'm talking about what's just. If we're all in this together, being just is for everyone to try to make everyone else's life better. Projects make the life of users better by developing software, users make projects better by doing good bug reports, and projects make the life of users better by making bug reports as pleasant as possible.)

(This is probably one of the cases where either I've convinced you by the end of the thesis or you're never going to be convinced, but sometimes I write words anyways.)

tech/BugReportExperienceObligation written at 01:58:20

2015-01-27

Our current email anti-virus system is probably ineffective now

Last month I noticed that classical viruses by email were still around, despite a past history of low virus detection by our main mail system. Well, funny you should mention that. As it happens, late last week the whole university was battered by a large tide of infected phish/virus emails over several days (and we had several infections ourselves). If our anti-spam system is any good at detecting viruses, I'd expect to see a serious uptick in virus detection, because the actual rate of virus emails was clearly up significantly.

The good news is that there is a definite uptick over the two days with the bulk of the attack. The bad news is that the numbers involved are not very high: 81 Monday, 95 Tuesday, 112 Wednesday, 101 Thursday, and 47 Friday, against a normal weekday rate of around 50 detected viruses a day. And it's highly likely that at least some viruses made it through this screening to reach our users.

(Note that some of these 'viruses' are actually phish spam. It's possible that they're phish spam with executables attached; I don't know.)

It's possible that some of the viruses were detected as spam, but there are two strikes against this. The first is that detected spam volume does not seem to fluctuate much over those days. The second is that detecting viruses as spam instead is actually bad for us; if it's detected as an actual virus, the anti-spam system removes the viral content instead of merely marking the Subject: line.

Unfortunately I don't know what options we have, or how much work it's worth putting into this in general. After all, if our actual virus email rate is quite low outside of anomalies such as this one, it probably doesn't matter that our current anti-spam system seems at best so-so at detecting viruses. We could plow a lot of time and effort into evaluating (free) options like ClamAV only to find that they block only a small extra amount of email, which would hardly be worth it.

(I have complicated attitudes on anti-virus stuff, but the short summary is that I think it's very dangerous to put much emphasis on email filtering to keep viruses out.)

spam/LowVirusDetection-2015-01 written at 01:36:54

2015-01-26

Some notes on keeping up with Go packages and commands

Bearing in mind that just go get'ing things is a bad way to remember what packages you're interested in, it can be useful to keep an eye on updates to Go packages and commands. My primary tool for this is Dmitri Shuralyov's Go-Package-Store, which lets you keep an eye on not only what stuff in $GOPATH/src has updates but what they are. However, there are a few usage notes that I've accumulated.

The first and most important thing to know about Go-Package-Store, and something that I only realized recently myself (oh, the embarrassment), is that Go-Package-Store does not rebuild packages or commands. All it does is download new versions (including fetching and updating their dependencies). You can see this in the commands it runs if you pay attention, since it specifically runs 'go get -u -d'. This decision is sensible and basically necessary, since many commands and (sub) packages aren't installed with 'go get <repo top level>', but it does mean that you're going to have to do the rebuilds yourself when you want them.

So, the first thing this implies is that you need to keep track of the 'go get' command that rebuilds each command in $GOPATH/bin that you care about; otherwise, sooner or later you'll be staring at a program in $GOPATH/bin and resorting to web searches to find out what repository it came from and how it's built. I suggest putting this information in a simple shell script that just does a mass rebuild, with one 'go get' per line; when I want to rebuild a specific command, I cut and paste its line.

(Really keen people will turn the text file into a script so that you can do things like 'rebuild <command>' to run the right 'go get' to rebuild the given command.)
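
Here's a sketch of what such a 'rebuild' helper could look like, done in Python since that's handy for this sort of thing. The command names and repository paths in the table are made-up placeholders, and you'd fill in your own:

    #!/usr/bin/python
    # A sketch of a 'rebuild <command>' helper. REBUILDS maps each command
    # you care about to the 'go get' target that builds it; the entries
    # here are illustrative placeholders, not real import paths.
    import subprocess
    import sys

    REBUILDS = {
        "somecmd": "example.org/you/somerepo/cmd/somecmd",
        "othercmd": "example.org/you/otherrepo",
    }

    def rebuild(name):
        # 'go get -u' fetches, updates, and (re)builds the target.
        subprocess.check_call(["go", "get", "-u", REBUILDS[name]])

    if __name__ == "__main__":
        # With no arguments, do a mass rebuild of everything.
        for name in sys.argv[1:] or sorted(REBUILDS):
            rebuild(name)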

The next potentially tricky area is dependent packages, in several ways. The obvious issue is that having G-P-S update a dependent package doesn't in any way tell you that you should rebuild the command that uses it; in fact G-P-S doesn't particularly know what uses which package. The easy but brute force way to deal with this is to just rebuild all commands every so often (well, run 'go get -u' against them; I'm not sure how much Make-like dependency checking that does).

The next issue is package growth. What I've noticed over time is that using G-P-S winds up with me having extra packages that aren't needed by the commands (and packages) that I have installed. As a result I both pay attention to what packages G-P-S is presenting updates for and periodically look through $GOPATH/src for packages that make me go 'huh?'. Out of place packages get deleted instead of updated, on the grounds that if they're actual dependencies of something I care about they'll get re-fetched when I rebuild commands.

(I also delete $GOPATH/pkg/* every so often. One reason that all of this rebuilding doesn't bother me very much is that I track the development version of Go itself, so I actively want to periodically rebuild everything with the latest compiler. People with big code bases and stable compilers may not be so sanguine about routinely deleting compiled packages and so on.)

I think that an explicit 'go get -u' of commands and packages that you care about will reliably rebuild dependent packages that have been updated but not (re)built in the past by Go-Package-Store, but I admit that I sometimes resort to brute force (ie deleting $GOPATH/pkg/*) just to be sure. Go things build very fast and I'm not building big things, so my attitude is 'why not?'.

Sidebar: Where I think the extra packages come from

This is only a theory. I haven't tested it directly; it's just the only cause I can think of.

Suppose you have a command that imports a sub-package from a repository. When you 'go get' the command, I believe that Go only fetches the further imported dependencies of the sub-package itself. Now, later on Go-Package-Store comes along, reports that the repository is out of date, and when you tell it to update things it does a 'go get' on the entire repository (not just the sub-package initially used by the command). This full-repo 'go get' presumably imports either all dependencies used in the repository or all dependencies of the code in the top level of the repository (I'm not sure which), which may well add extra dependencies over what the sub-package needed.

(The other possible cause is shifting dependencies in packages that I use directly, but some stray packages are so persistent in their periodic returns that I don't really believe that.)

programming/GoPackagesKeepingUp written at 01:53:43

2015-01-25

The long term problem with ZFS on Linux is its license

Since I've recently praised ZFS on Linux as your only real choice today for an advanced filesystem, I need to bring up the long term downside because, awkwardly, I do believe that btrfs is probably going to be the best pragmatic option in the long term and is going to see wider adoption once it works reliably.

The core of the problem is ZFS's license, which I've written about before. What I didn't write about back then, because I didn't know enough at the time, was the full effect on ZoL of not being included in distributions. The big effect is that it will probably never be easy or supported to make your root filesystem a ZFS pool. A ZFS root filesystem needs first class support in the installer, and unless distributions restructure their installers (and they have no reason to do so) it will almost certainly be rather difficult, both politically and otherwise, to add this. Since no installer-created filesystem can be a ZFS one and the root filesystem pretty much has to be created in the installer, that rules out an easy ZFS root.

(Okay, you can shuffle around your root filesystem after the basic install is done. But that's a big pain.)

In turn this means that ZFS on Linux is probably always going to be a thing for experts. To use it you need to leave disk space untouched in the installer (or add disk space later), then at least fetch the ZoL packages from an additional repository and have them automatically built and installed for your kernel. And of course you have to live with a certain amount of missing integration in all of the bits (especially if you go out of your way to use a ZFS root filesystem).

(And as I've seen there are issues with mixing ZFS and non-ZFS filesystems. I suspect that these issues will turn out to be relatively difficult to fix, if they can be at all. Certainly things seem much more likely to work well if all of your filesystems are ZFS filesystems.)

PS: Note that in general having non-GPLv2, non-bundled kernel modules is not an obstacle to widespread adoption if people want what you have to offer. A large number of people have installed binary modules for their graphics cards, for one glaring example. But I don't think that fetching these modules has been integrated into installers despite how popular they are.

(Also, I may be wrong here. If ZFS becomes sufficiently popular, distributions might at least make it easy for people to make third party augmented installers that have support for ZFS. Note that ZFS support in an installer isn't as simple as the choice of another filesystem; ZFS pools are set up quite differently from normal filesystems and good ZFS root pool support has to override things like setup for software RAID mirroring.)

linux/ZFSOnLinuxRootFSProblem written at 04:20:46

2015-01-24

Web applications and generating alerts due to HTTP requests

One of the consequences and corollaries of never trusting anything you get from the network is that you should think long and hard before you make your web application generate alerts based on anything in incoming HTTP requests. Because outside people can put nearly anything into HTTP requests and because the Internet is very big, it's very likely that sooner or later some joker will submit really crazy HTTP requests with all sorts of bad or malicious content. If you're alerting on this, well, you can wind up with a large pile of alerts (or with an annoying trickle of alerts that numbs you to them and to potential problems).

Since the Internet is very big and much of it doesn't give much of a damn about your complaints, 'alerts' about bad traffic from random bits of the Internet are unlikely to be actionable alerts. You can't get the traffic stopped at its source (although you can waste a lot of time trying), and if your web application is competently coded it shouldn't be vulnerable to these malicious requests anyways. So such an alert is reporting that someone rattled the doorknobs (or tried to kick the door in); well, that happens all the time (ask any sysadmin with an exposed SSH port). It's still potentially useful to feed this information to a trend monitoring system, but 'HTTP request contains bad stuff' should not be an actual alert that goes to humans.

(However, if your web application is only exposed inside what is supposed to be a secured and controlled environment, bad traffic may well be an alert-worthy thing because it's something that's really never supposed to happen.)

A corollary to this is that web frameworks should not default to treating 'HTTP request contains bad stuff' as any sort of serious error that generates an alert. Serious errors are things like 'cannot connect to database' or 'I crashed'; 'HTTP request contains bad stuff' is merely a piece of information. Sadly there are frameworks that get this wrong. And yes, under normal circumstances a framework's defaults should be set for operation on the Internet, not in a captive internal network, because this is the safest and most conservative assumption (for a definition of 'safest' that is 'does not deluge people with pointless alerts').

(This implies that web frameworks should have a notion of different types or priorities of 'errors' and should differentiate what sort of things get what priorities. They should also document this stuff.)
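
As a concrete illustration of the distinction, here's a small sketch using Python's standard logging module. The logger names, destinations, and messages are all made up for illustration; a real framework would wire this into its own error handling machinery:

    import logging

    # Two loggers with different priorities and destinations: bad input is
    # trend data, while genuinely serious errors are things humans should see.
    badinput = logging.getLogger("webapp.badinput")
    badinput.setLevel(logging.INFO)
    badinput.addHandler(logging.FileHandler("badinput.log"))

    errors = logging.getLogger("webapp.errors")
    errors.setLevel(logging.ERROR)
    errors.addHandler(logging.StreamHandler())  # stand-in for something that pages a human

    # 'HTTP request contains bad stuff' is merely information:
    badinput.info("bad request from %s: %r", "203.0.113.1", "/../../etc/passwd")
    # 'cannot connect to database' is a genuine alert:
    errors.error("cannot connect to database")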

web/WebAppsAndAlerts written at 00:15:16

2015-01-23

A problem with gnome-terminal in Fedora 21, and tracking it down

Today I discovered that Fedora 21 subtly broke some part of my environment to the extent that gnome-terminal refuses to start. More than that, it refuses to start with a completely obscure error message:

; gnome-terminal
Error constructing proxy for org.gnome.Terminal:/org/gnome/Terminal/Factory0: Error calling StartServiceByName for org.gnome.Terminal: GDBus.Error:org.freedesktop.DBus.Error.Spawn.ChildExited: Process org.gnome.Terminal exited with status 8

If you're here searching for the cause of this error message, let me translate it: what it really means is that your session's dbus-daemon could not start /usr/libexec/gnome-terminal-server when gnome-terminal asked it to. In many cases this is because your system's environment had not initialized $LC_CTYPE or $LANG to some UTF-8 locale at the time your session was being set up (even if one of these environment variables gets set later, by the time you're running gnome-terminal). In the modern world, an increasing number of Gnome bits absolutely insist on being in a UTF-8 locale and fail hard if they aren't.

Some of you may be going 'what?' here. What you suspect is correct; the modern Gnome 3 'gnome-terminal' program is basically a cover script rather than an actual terminal emulator. Instead of opening up a terminal window itself, it exists to talk over DBus to a master gnome-terminal-server process (which will theoretically get started on demand). It is the g-t-s process that is the actual terminal emulator; it creates the windows, starts the shells, and so on. And yes, one process handles all of your gnome-terminal windows; if that process ever hits a bug (perhaps because of something happening in one window) and dies, all of them die. Let's hope g-t-s doesn't have any serious bugs.

To find the cause of this issue, well, if I'm being honest, a bunch of it was found with an Internet search on the error message. This didn't turn up my exact problem, but it did turn up people reporting locale problems and also a mention of gnome-terminal-server, which I hadn't known about before. For actual testing and verification I did several things:

  • First I used strace on gnome-terminal itself, which told me nothing useful.

  • I discovered that starting gnome-terminal-server by hand before running gnome-terminal made everything work.

  • I used dbus-monitor --session to watch DBus messages when I tried to start gnome-terminal. This didn't really tell me anything that I couldn't have seen from the error message, but it did verify that there was really a DBus message being sent.

  • I found the dbus-daemon process that was handling my session DBus and used 'strace -f -p ...' on it while I ran gnome-terminal. This eventually wound up with it starting gnome-terminal-server and g-t-s exiting after writing a message to standard error. Unfortunately the default strace settings truncated the message, so I reran strace while adding '-e write=2' to completely dump all messages to standard error. This got me the helpful error message from g-t-s:
    Non UTF-8 locale (ANSI_X3.4-1968) is not supported!

    (If you're wondering if dbus-daemon sends standard error from either itself or processes that it starts to somewhere useful, ha ha no, sorry, we're all out of luck. As far as I can tell it specifically sends standard error to /dev/null.)

  • I dumped the environment of the dbus-daemon process with 'tr '\0' '\n' </proc/<PID>/environ | less' and inspected what environment variables it had set. This showed that it had been started without my usual $LC_CTYPE setting (cf).

With this in hand I could manually reproduce the problem by trying to start gnome-terminal-server with $LC_CTYPE unset, and then I could fix up my X startup scripts to set $LC_CTYPE before they ran dbus-launch.
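
As a side note, you can reproduce the gist of the check that g-t-s is making in a few lines of Python. This is only an approximation of the idea, not g-t-s's actual code; in an environment where neither $LC_CTYPE nor $LANG selects a UTF-8 locale, the codeset comes out as ANSI_X3.4-1968, ie plain ASCII:

    import locale
    # Adopt locale settings from the environment, as programs do at startup.
    locale.setlocale(locale.LC_CTYPE, "")
    charset = locale.nl_langinfo(locale.CODESET)
    if charset != "UTF-8":
        # The same complaint we dug out of g-t-s's stderr with strace.
        print("Non UTF-8 locale (%s) is not supported!" % charset)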

(This entry is already long enough so I am going to skip my usual rant about Gnome and especially Gnome 3 making problems like this very difficult for even experienced system administrators to debug because there are now so many opaque moving parts to even running Gnome programs standalone, much less in a full Gnome environment. How is anyone normal supposed to debug this when gnome-terminal can't even be bothered to give you a useful error summary in addition to the detailed error report from DBus?)

linux/GnomeTerminalUTF8Required written at 01:54:19
