Wandering Thoughts archives


I wish Python didn't allow any callable to be a 'metaclass'

One of the things that any number of discussions of metaclasses will tell you is that any callable object can be a metaclass, not just a class. This is literally and technically true, in that you can set any callable object (such as a function) as a class's metaclass and it will be invoked on class creation so that it can modify the class being created (which is one of the classical uses of metaclasses).

The problem is that modifying classes as they are being created is the least of what metaclasses can do. Using a callable as your 'metaclass' means that you get none of those other things (all of which require being a real metaclass class). And even your ability to modify classes as they're being created is incomplete; using a callable as your metaclass means that any subclasses of your class will not get your special metaclass processing. This may surprise you, the users of your code and your 'metaclass', or both at once. Unfortunately it's easy to miss this if you don't know metaclasses well and you don't subclass your classes (either in testing or in real code).
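A minimal sketch of this surprise (the names here are mine, purely for illustration): a function used as a 'metaclass' runs for the class that names it, but subclasses are built by plain type and get none of its processing.

```python
# A function used as a 'metaclass' is called once, for the class that
# names it; it does not carry over to subclasses.
def noisy(name, bases, cdict):
    print("creating", name)
    return type(name, bases, cdict)

class Base(metaclass=noisy):   # prints "creating Base"
    pass

class Child(Base):             # prints nothing; Child is built by plain type
    pass

# Afterwards both classes are ordinary instances of type:
print(type(Base) is type, type(Child) is type)  # True True
```

Because noisy() just returns what type() built, there is no trace of it left on Base, and so nothing for Child to inherit.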

I understand why Python has allowed any callable to be specified as the metaclass of a class; it's plain convenient. In the simple case it gives you a minimal way of processing a class (or several classes) as they're being created; you can just write a function that fiddles around with things and be done with it; you don't need the indirection and extra code of a class that inherits from type and has a __new__ method and all of that. It also at least looks more general than restricting metaclasses to be actual classes.

The problem is that this convenience is a trap lying in wait for the unwary. It works only in one place and one way and doesn't in others, failing in non-obvious ways. And if you need to convert your callable into a real metaclass because now you need some additional features of a real metaclass class, suddenly the behavior of subclasses of the original class may change.

So on the whole I wish Python had not done this. I feel it's one of the rare places where Python has prioritized surface convenience and generality a little too much. Unfortunately we're stuck with this decision, since setting metaclass to any callable is fully documented in Python 3 and probably can't ever be deprecated.

PS: Note that Python is actually inconsistent here between real metaclass classes and other callables, since a metaclass that is a class will have its __new__ invoked, not its __call__, even if it has the latter and thus is callable in general. This is absolutely necessary to get metaclass classes working right, but that this inconsistency exists is another sign that this whole 'any callable' thing shouldn't be there in the first place.
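To illustrate that inconsistency with a small sketch (hypothetical names, mine): a metaclass class has its __new__ run at class creation time, while its __call__ only fires later, when instances of the class are created.

```python
class Meta(type):
    def __new__(mcls, name, bases, cdict):
        print("Meta.__new__ for", name)
        return super().__new__(mcls, name, bases, cdict)

    def __call__(cls, *args, **kwargs):
        # This runs when instances of the class are created,
        # not when the class itself is.
        print("Meta.__call__ for", cls.__name__)
        return super().__call__(*args, **kwargs)

class C(metaclass=Meta):   # prints "Meta.__new__ for C"; __call__ is not used
    pass

obj = C()                  # now "Meta.__call__ for C" fires
```

So even though Meta is callable via its own __call__, class creation goes through __new__ (and __init__) instead, exactly the opposite of what happens with a plain function 'metaclass'.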

Sidebar: the arguments to your metaclass callable

The arguments for a general callable are slightly different from the arguments a real metaclass __new__ receives. You get called as:

def metacls(cname, bases, cdict):
    # cname is the class name, bases is the tuple of base classes,
    # and cdict is the namespace from the class body.
    return type(cname, bases, cdict)

If you want to call type.__new__ directly, you must provide a valid metaclass as the first argument. type itself will do, of course. Using a metacls() function that shims in an actual class as the real metaclass is beautifully twisted but is going to confuse everyone. Especially if your real metaclass has a __new__.

(If your real metaclass has a __new__, this will get called for any subclasses of what you set the metaclass function on. I suppose you could abuse this to more or less block subclassing a class if you wanted to. Note that this turns out to not be a complete block, at least in Python 3, but that's another entry.)
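A sketch of that twisted shim (the names are mine, not anything standard): the function hands class creation off to a real metaclass class, which means subclasses really do pick up that metaclass and its __new__ processing.

```python
class RealMeta(type):
    def __new__(mcls, name, bases, cdict):
        print("RealMeta.__new__ for", name)
        return super().__new__(mcls, name, bases, cdict)

def metacls(cname, bases, cdict):
    # The shim: create the class through an actual metaclass class.
    return RealMeta(cname, bases, cdict)

class D(metaclass=metacls):   # prints "RealMeta.__new__ for D"
    pass

class E(D):                   # also prints: E inherits RealMeta, not metacls
    pass

print(type(D) is RealMeta, type(E) is RealMeta)  # True True
```

Anyone reading the class statement for D will naturally assume metacls is its metaclass, when the class they actually get is an instance of RealMeta; hence the confusion.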

python/MetaclassCallableIssues written at 01:32:34


Your example code should work and be error-free

This is one of those cases where I'm undoubtedly preaching to the choir, but one thing that sends me into a moderate rage is articles about programming that helpfully include sample code as illustrations but then have errors in the sample code. The worst errors are subtle errors, things where the code almost works but occasionally is going to blow up.

Actually, let me clarify that. It's common to omit error checking in sample code and sometimes the demands of space and comprehension mean that you can't really put in all of the code to handle the full complexity of a situation with all of its corner cases. But if you do this you should also add a note about it, especially about unhandled corner cases. Omitting error checks (especially without a note) is more forgivable in a language with exceptions, since the code will at least malfunction obviously.

Perhaps it is not obvious to my readers why this is a bad idea. The answer is simple: sooner or later someone is going to either copy your sample code into their program more or less as is or use it as a guideline to write their own. After all, you've already given them the algorithm and the structure of what they want to do; why shouldn't they copy your literal code rather than rewrite from scratch based on their understanding of your article? If your code is good, it's actually less error-prone to just copy it. Of course the exact people who need your article and are going to copy your code are the people who are the worst equipped to spot the errors and omitted corner cases lurking in it.

The more subtle problem is that anyone who does know enough to spot errors in your sample code is going to immediately distrust your entire article. If you have buggy code in your examples, what else have you screwed up? Everything you're saying is suspect, especially if the flaw is relatively fundamental, like a concurrency race, as opposed to a relatively simple mistake.

(Since I've put code examples into entries on Wandering Thoughts, this is of course kind of throwing stones in what may well be a glass house. I know I've had my share of errors that commenters have pointed out, although I do try to run everything I put in an entry in the exact form it appears here.)

(This entry is sadly brought to you by my irritation with this article on some Go concurrency patterns. It contains potentially interesting stuff but also code with obvious, classic, and un-commented-on concurrency races, so I can't trust it at all. (Well, hopefully the concurrency races are as obvious as I think they are.))

programming/ExamplesShouldWork written at 01:57:15


Web ads considered as a security exposure

One of the things that reading Twitter has exposed me to is a number of people who deploy browser adblockers as part of their security precautions. This isn't because they're the kind of person who's strongly opposed to ads, and it's not even because they don't want their users (and themselves) to be tagged and tracked around the web (although that is a potential concern in places). It's because they see web ads themselves as a security risk, or more specifically a point of infection.

The problem with web ads is web ad networks. It's a fact that every so often web ad networks have been compromised by attackers and used to serve up 'ads' that are actually exploits. This doesn't just affect secondary or sketchy websites; major mainstream websites use ad networks, which means that visiting sites normally considered quite trustworthy and secure (like major media organizations) can expose you to this.

(As an extra risk, almost all ad networks use HTTP instead of HTTPS, so you're vulnerable to man-in-the-middle attacks on exposed networks like your usual random coffee shop wifi.)

Based on my understanding of modern sophisticated ad networks and the process of targeting ads, they also offer great opportunities for highly targeted attacks. At least some networks offer realtime bidding on individual ad impressions and as part of this they pass significant amounts of information about the person behind the request to the bidders. Want to target your malware against people in a narrow geographical area with certain demographics? You can do that, either by winning bids or by hijacking the same information processes from within a compromised ad network. You might even be able to do very specific 'watering hole' style attacks against people who operate from a restricted IP address range, such as a company's outgoing firewall.

(The great thing about winning bids is that you may not even be playing with your own money. After all, it's probably not too difficult to compromise one of the companies that's bidding to put its ads in front of people.)

If you're thinking about the risks here, web ad blockers make a lot of sense. They don't even have to be deeply comprehensive; just blocking the big popular web ad networks that are used by major sites probably takes out a lot of the exposure for most people.

I don't think about ad blockers this way myself, partly because I already consider myself low risk (I'm a Linux user with JavaScript and Flash blocked by default), but this is certainly something I'm going to think about for people at work. Maybe we should join the places that do this as a standard recommendation or configuration.

web/WebAdsSecurityExposure written at 01:53:32


My current views on Firefox adblocker addons

I normally do my web browsing through a filtering proxy that strips out many ads and other bad stuff, and on top of that I use NoScript so basically all JavaScript based things drop out. However this proxy only does HTTP, so I've known for a while that as the web moved more and more to HTTPS my current anti-ad solution would be less and less effective. This led to me playing around with various options in my testing browser but never pushed me to put anything in my main browser. What pushed me over the edge relatively recently was reaching my tolerance limit for YouTube ads and discovering that AdBlock Plus would reliably block them. Adding ABP made YouTube a drastically nicer experience for me; I consider its additional ad-blocking features to basically be a nice side effect.

(The popup ads are only slightly irritating, but then YT started feeding me more and more long, unskippable ads. At that point it was either stop watching YT videos or do something about it.)

What makes a bunch of people twitchy about AdBlock Plus is that it's run by a company, combined with that company's business model of allowing some ads through. Although ABP is open source, this means that its development is subject to changes in business model, and we've seen that cause problems before. Eventually various things made me uncomfortable and unhappy enough to switch to AdBlock Edge, which is a fork of ABP with a bunch of things removed. In my 'basically use the defaults' setup, AdBlock Edge works the same as AdBlock Plus. It certainly removes the YouTube ads, which is what I really care about right now.

(My honest opinion is that AdBlock Plus is probably not going to go bad, partly because a fair number of people are paying attention to it since it's a quite popular Firefox extension. Still, I feel a bit better with AdBlock Edge, perhaps because I've been burned by changing extension business models before.)

Neither AdBlock Plus nor AdBlock Edge appears to have made my Firefox particularly slow or particularly memory consuming. It's possible that I simply haven't noticed the impact because it's mild enough to not be visible for me, especially given my already filtered non-JavaScript browser environment. People certainly do report that these extensions cause them problems.

Recently µBlock has been showing up in the information sources that I follow, so I gave it a try. Sadly, the results for me aren't positive, in that µBlock did nothing to stop YouTube ads. Since this is the most important thing for me, I'm willing to forgive ABP and ABE a certain amount of resource consumption in order to get it. I do like the general µBlock pitch of being leaner and more efficient, so someday I hope it picks up this ability.

(As far as I know there's nothing else that blocks YouTube ads. I'd obviously be happy with a standalone extension for this plus µBlock for general blocking, but as far as I know no such thing exists.)

PS: I use other technology to block the scourge of YouTube autoplay. It's possible that this pile of hacks is interacting badly with µBlock.

web/FirefoxAdBlockers written at 03:01:30


Planning ahead in documentation: kind of a war story

I'll start with my tweet:

Current status: changing documentation to leave notes for myself that we'll need in three or four years. Yes, this is planning way ahead.

What happened is that we just upgraded our internal self-serve DHCP portal from Ubuntu 10.04 LTS to Ubuntu 14.04 LTS because 10.04 is about to go out of support. After putting the new machine into production last night, we discovered that we'd forgotten one part of how the whole system was supposed to work and so that bit of it didn't work on the new server. Specifically, the part we'd forgotten involved another machine that needed to talk to our DHCP portal; the DHCP portal hostname had changed, and the DHCP portal system wasn't set up to accept requests from the other machine. That we'd forgotten this detail wasn't too surprising, given that the last time we really thought much about the whole collection of systems was probably four years or so ago when we updated it to Ubuntu 10.04.

So what I spent part of today doing was adding commentary to our build instructions that will hopefully serve as a reminder that parts of the overall DHCP portal extend off the machine itself. I also added some commentary about gotchas I'd hit while building and testing the new machine, and some more stuff about how to test the next version. I put all of this into the build instructions because the build instructions are the one single piece of documentation that we're guaranteed to read when we're building the next version.

As it happens, I can make a pretty good prediction of when the next version will be built: somewhat before when Ubuntu 14.04 stops being supported. On Ubuntu's current schedule that will be about a year after Ubuntu 18.04 LTS comes out, ie four years from now (but this time around we might rebuild the machine sooner than 'in a race with the end of support').

Preparing documentation notes for four years in the future may seem optimistic, but this time around it seemed reasonably prudent given our recent experiences. At the least it could avoid future me feeling irritated with my past self for not doing so.

(I'm aware that in some places either systems would hardly last four years without drastic changes or at the least people would move on so it wouldn't really be your problem. Neither is true here; in particular, our infrastructure is surprisingly stable.)

sysadmin/DocumentingPlanningAhead written at 01:20:24


The technical side of Python metaclasses (a pointer)

I recently read Ionel Cristian Mărieș' Understanding Python metaclasses (via Planet Python), which is a great but deeply technical explanation of what is going on with them in Python 3. To give you the flavour, Ionel goes right down to the CPython interpreter source code to explain some aspects of attribute lookup. If nothing else, this is probably the most thorough documentation I've ever seen of the steps and order in Python's attribute lookups. There are even very useful decision tree diagrams. I'll probably be using this as a reference for some time, and if you're interested in this stuff I highly recommend reading it.

I'm personally biased, of course, so I prefer my own series on using and then understanding Python metaclasses. Ionel has a much more thorough explanation of the deep technical details (and it's for Python 3, where mine is for Python 2), but I think it would have lacked context and made my eyes glaze over had I read it before I wrote my series and wound up with my own understanding of metaclasses. But Ionel's writeup is a great reference that's more thorough than, for example, my writeup on attribute lookup order.

(But the curse (and blessing) of writing the entries myself is that I can no longer look at metaclass explanations with a normal person's eyes; I simply know too much and that influences me even if I try to adopt a theoretical outsider view.)

I do disagree with Ionel on one aspect, which is that I don't consider general callable objects to be real metaclasses. General callable objects can only hook __new__ in order to mutate the class being created; true metaclasses do much more and work through what is a fundamentally different mechanism. But this is really a Python documentation issue, since the official documentation is the original source of this story and I can hardly blame people for repeating it or taking it at its word.

PS: I continue to be surprised that Python lacks official documentation of its attribute lookup order. Yes, I know, the Python language specification is not actually a specification, it's an informal description.

python/MetaclassIonelTechnicalSide written at 02:20:53


Good technical writing is not characterless and bland

Recently Evan Root left a comment on my entry on a bad Linux kernel message where he said:

I believe the reason why the Yama message is cryptic and 'intriguing' is because tedious committee sanitized messages such as "AppArmor: AppArmor initialized" are at odds with the core principal behind Ubuntu "Linux for human beings"

This is not an uncommon view in some quarters but as it happens I disagree with it. It's my view that there are two things wrong here.

The largest is that clear technical writing doesn't have to be characterless. Good technical writing is alive; it has personality and character. Bland dry technical writing, the kind of writing that has been scrubbed clean of all trace of character or voice by some anodyne committee, is not good writing. You can be informative without boring people to sleep, even in messages like this. In fact, if you look around it's plain that the best technical writing does very much have a voice and is actively talking to you in that voice.

(There is technical writing where you mostly have to scrub the voice out, like technical specifications, but this is because they are very formal and have to be absolutely clear and unambiguous.)

Such writing with personality is of course harder to create than bland dry writing, which is one reason people settle for unobjectionably bland writing. Pretty much anyone can turn that out on demand just by being as boring and literal as possible. But that is not what people should be producing; we should be producing writing that is clear, informative, and has a voice, even if it takes more effort. This is possible.

(This is the same broad school of writing that produces useless code comments that say nothing at great length.)

The smaller thing wrong is that the original message of 'Yama: becoming mindful' cannot be described as a message for human beings (not in the sense that the Ubuntu slogan means it, at least). That is because it is an in-joke and practically by definition in-jokes are not particularly aimed at outsiders. Here the audience for the in-joke is not even 'current Linux users', it is 'kernel developers and other experts'. A relative outsider can, with work and the appropriate general cultural background, decode the in-joke to guess what it means, but that doesn't make it any less of an in-joke.

(And if you do not know what 'Yama' is in the context of the kernel, you will probably be completely lost.)

An in-joke may have character and voice, but it neatly illustrates that merely having character and voice doesn't make writing (or messages) good. The first goal of good writing is to be clear and informative. Then you give it voice.

(This is of course not a new or novel thing in any way; lots of people have been saying this about technical writing for years. I just feel like adding another little brick to the pile.)

tech/GoodWritingNotDry written at 22:19:01

ZFS can apparently start NFS fileservice before boot finishes

Here's something that I was surprised to discover the other day: ZFS can start serving things over NFS before the system is fully up. Unfortunately this can have a bad effect because it's possible for this NFS traffic to cause further ZFS traffic in some circumstances.

Since this sounds unbelievable, let me report what I saw first. As our problem NFS fileserver rebooted, it stalled reporting 'Reading ZFS config:'. At the same time, our iSCSI backends reported a high ongoing write volume to one pool's set of disks and snoop on the fileserver could see active NFS traffic. ptree reported that what was running at the time was the 'zfs mount -a' that is part of the /system/filesystem/local target.

(I recovered the fileserver from this problem by the simple method of disconnecting its network interface. This caused nlockmgr to fail to start, but at least the system was up. ZFS commands like 'zfs list' stalled during this stage; I didn't think to do a df to capture the actual mounts.)

Although I can't prove it from the source code, I have to assume that 'zfs mount -a' is enabling NFS access to filesystems as it mounts them. An alternate explanation is that /etc/dfs/sharetab had listings for all of the filesystems (ZFS adds them as part of sharing them over NFS) and this activated NFS service for filesystems as they appeared. The net effect is about the same.

This is obviously a real issue if you want your system to be fully up and running okay before any NFS fileservice starts. Since apparently some sorts of NFS traffic under some circumstances can stall further ZFS activity, well, this is something you may care about; we certainly do now.

In theory the SMF dependencies say that /network/nfs/server depends on /system/filesystem/local, as well as nlockmgr (which didn't start). In practice, well, how the system actually behaves is the ultimate proof and all I can do is report what I saw. Yes, this is frustrating. That ZFS and SMF together hide so much in black magic is a serious problem that has made me frustrated before. Among other things it means that when something goes odd or wrong you need to be a deep expert to understand what's going on.

solaris/ZFSNFSServiceDuringBoot written at 01:41:42


Our ZFS fileservers have a serious problem when pools hit quota limits

Sometimes not everything goes well with our ZFS fileservers. Today was one of those times and as a result this is an entry where I don't have any solutions, just questions. The short summary is that we've now had a fileserver get very unresponsive and in fact outright lock up when a ZFS pool that's experiencing active write IO runs into a pool quota limit.

Importantly, the pool has not actually run out of actual disk space; it has only run into the quota limit, which is about 235 GB below the space limit as 'zfs list' reports it (or would, if there was no pool quota). Given things we've seen before with full pools I would not have been surprised to experience these problems if the pool had run itself out of actual disk space. However it didn't; it only ran into an entirely artificial quota limit. And things exploded anyways.

(Specifically, the pool had a quota setting, since refquota on a pool where all the data is in filesystems isn't good for much.)

Unfortunately we haven't gotten a crash dump. By the time there were serious problem indications the system had locked up, and anyways our past attempts to get crash dumps in the same situation have been ineffective (the system would start to dump but then appear to hang). To the extent that we can tell anything, the few console messages that get logged sort of vaguely suggest kernel memory issues. Or perhaps I am simply reading too much into messages like 'arl_dlpi_pending unsolicited ack for DL_UNITDATA_REQ on e1000g1'. Since the problem is erratic and usually materializes with little or no warning, I don't think we've captured eg mpstat output during the run-up to a lockup to see things like whether CPU usage is going through the roof.

I don't think that this happens all the time, as we've had this specific pool go to similar levels of being almost full before and the system hasn't locked up. The specific NFS IO pattern likely has something to do with it, as we've failed to reproduce system lockups in a test setup even with genuinely full pools, but of course we have no real idea what the IO pattern is. Given our multi-tenancy we can't even be confident that IO to the pool itself is the only contributor; we may need a pattern of IO to other pools as well to trigger problems.

(I also suspect that NFS and iSCSI are probably all involved in the problem. Partly this is because I would have expected a mere pool quota issue with ZFS alone to have been encountered before now, or even with ZFS plus NFS since a fair number of people run ZFS based NFS fileservers. I suspect we're one of the few places using ZFS with iSCSI as the backend and then doing NFS on top of it.)

One thing that writing this entry has convinced me is that I should pre-write a bunch of questions and things to look at in a file so I have them on hand the next time things start going south and I don't have to rely on my fallible memory to come up with what troubleshooting we want to try. Of course these events are sufficiently infrequent that I may forget where I put the file by the time the next one happens.

solaris/ZFSNFSPoolQuotaProblem written at 00:38:04


'Inbox zero' doesn't seem to work for me but it's still tempting

Every so often I read another paean to the 'inbox zero' idea and get tempted to try to do it myself. Then I come to my senses, because what I've found over time is that the 'inbox zero' idea simply doesn't work for me because it doesn't match how I use email.

I do maintain 'inbox zero' in one sense; I basically don't allow unread email to exist. If it's in my actual MH inbox, I've either read it, am in the process of reading it, or I've been distracted by something being on fire. But apart from that my inbox becomes one part short term to-do tracker, one part 'I'm going to reply to this sometime soon', and one part 'this is an ongoing issue' (and there's other, less common parts).

What I do try to do is keep the size of my inbox down; at the moment my goal is 'inbox under 100', although I'm a bit short of achieving that (as I write this my inbox has 105 messages). Some messages naturally fall out as I deal with them or their issue resolves itself; other messages start quietly rotting until I go in to delete them or otherwise dump them somewhere else. Usually messages start rotting once they aren't near the top of my inbox, because then they scroll out of visibility. I try to go through my entire inbox every so often to spot such messages.

What it would take to get me to inbox zero is ultimately not a system but discipline. I need most or all of the things that linger in my inbox, so if they're not in my inbox they need to be somewhere else and I need to check and maintain that somewhere else just as I check and maintain my inbox. So far I've simply not been successful at the discipline necessary to do that; when I take a stab at it, I generally backslide under pressure and then the 'other places' that I established this time around start rotting (and I may forget where they are).

On the other hand, I'm not convinced that inbox zero would be useful for me as opposed to make-work. To the extent that I can see things that would improve my ability to deal with email and not have things get lost, 'inbox zero' seems like a clumsy indirect way to achieve them. More useful would be something like status tags so that I could easily tag and see, say, my 'needs a reply' email. You can do such status tagging via separate folders, but that's kind of a hack from one perspective.

(I'd also love to get better searching of my mail. Of course none of this is going to happen while I insist on clinging grimly to my current mail tools. But on the other hand my current tools work pretty well and efficiently for me and I haven't seen anything that's really as attractive and productive as they are.)

(A couple of years ago I wrote about how I use email, which touches on this from a somewhat different angle. This entry I'm writing partly to convince myself that trying for inbox zero or pining over it is foolish, at least right now.)

Sidebar: why the idea of inbox zero is continually tempting

I do lose track of things every so often. I let things linger without replies, I forget things I was planning to do and find them again a month later, and so on. Also I delete a certain amount of things because keeping track of them (whether in my inbox or elsewhere) is just too much of a pain. And I've had my inbox grow out of control in the past (up to thousands of messages, where of course I'm not finding anything any more).

A neat, organized, empty inbox where this doesn't happen is an attractive vision, just like a neat organized and mostly or entirely clear desk is. It just doesn't seem like a realistic one.

sysadmin/InboxNonZero written at 02:06:58
