Wandering Thoughts archives

2015-08-21

What surprised me about the Python assignment puzzle

Yesterday I wrote about a Python assignment puzzle and how it worked, but I forgot to write about what was surprising about it for me. The original puzzle is:

(a, b) = a[b] = {}, 5

The head-scratching bit for me was the middle, including the whole question of 'how does this even work'. So the real surprise here for me is that in serial assignments, Python processes the assignments left to right.

The reason this was a big surprise is that my broad mental model of serial assignment comes from C. In C, assignment is an expression that yields the value assigned (i.e. the value of 'a = 2' is 2). So in C and languages like it, serial assignment is a series of assignment expressions that happen right to left; you start with the actual expression producing a value, you do the rightmost assignment, which yields the value again, and you ripple leftwards. So a serial assignment groups like this:

a = (b = (c = (d = <expression>)))

Python doesn't work this way, of course; assignment is not an expression and doesn't produce a value. But I was still thinking of serial assignment as proceeding right to left by natural default and was surprised to learn that Python has chosen to do it in the other order. There's nothing wrong with this and it's perfectly sensible; it's just a decision that was exactly opposite from what I had in my mind.
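
One way to see the order for yourself is to make the assignment targets observable. This is only a minimal sketch; the Recorder class and the log list are made up for illustration and aren't part of the puzzle.

```python
# Hypothetical helper: records every item assignment made through it.
log = []


class Recorder:
    def __init__(self, name):
        self.name = name

    def __setitem__(self, key, value):
        log.append((self.name, key, value))


x = Recorder("x")
y = Recorder("y")

# The right-hand side is evaluated once, then each target is assigned
# in turn, starting from the leftmost one.
x[0] = y[0] = 1

print(log)  # [('x', 0, 1), ('y', 0, 1)]
```

The entry for x shows up in the log before the entry for y, which is the opposite of what the C grouping would suggest.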

(Looking back, I assumed in this entry that Python's serial assignment order was right to left without bothering to look it up.)

How did my misapprehension linger for so long? Well, partly it's that I don't use serial assignment very much in Python; in fact, I don't think anyone does much of it and I have the vague impression that it's not considered good style. But it's also that it's quite rare for the assignment order to actually matter, so you may not discover a mistaken belief about it for a very long time. This puzzle is a deliberately perverse exercise where it very much does matter, as the leftmost assignment actively sets up the variables that the next assignment then uses.
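
A much smaller case where the order is visible, sketched here with made-up names (lst and i), is one where a later target uses a name that an earlier target has just rebound:

```python
lst = [0, 0]
i = 0

# Left to right: 'i' is rebound to 1 first, and only then is lst[i]
# evaluated as a target, so the store goes to lst[1], not lst[0].
i = lst[i] = 1

print(i, lst)  # 1 [0, 1]
```

If the assignments ran right to left, the store would have gone to lst[0] instead.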

AssignmentPuzzleSurprise written at 21:58:55

What's going on with a Python assignment puzzle

Via @chneukirchen, I ran across this tweet:

Armin just came up with this puzzle, how well do you know obscure Python details? What's a after this statement?:
(a, b) = a[b] = {}, 5

This is best run interactively for maximum head-scratching. I had to run it in an interpreter myself and then think for a while, because there are several interesting Python things going on here.

Let's start by removing the middle assignment. That gives us:

(a, b) = {}, 5

This is Python's multiple variable assignment ('x, y = 10, 20') written to make the sequence nature of the variable names explicit (which is why the Python tutorial calls this 'sequence unpacking'). Writing the list of variables as an explicit tuple (or list) is optional but is something even I've done sometimes, although I think writing it this way has fallen out of favour. Thus it's equivalent to:

t = ({}, 5)
(a, b) = t

The next trick is that (somewhat to my surprise) when you're assigning one value to several targets at once (as in 'x = y = 10') and doing sequence unpacking for one of those assignments, Python doesn't require you to do sequence unpacking for every assignment. The following is valid:

(a, b) = x = t

Here a and b become the individual elements of the tuple t while x is the whole tuple. I suppose this is a useful trick to remember if you sometimes want both the tuple and its elements for different purposes.
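
As a quick check of that claim, here is a small sketch using the same names as above:

```python
t = ({}, 5)

# One target unpacks the tuple, the other takes it whole.
(a, b) = x = t

print(x is t)     # True: x is the very same tuple object
print(a is t[0])  # True: a is the dictionary inside the tuple
print(b)          # 5
```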

The next trick happening is that Python explicitly handles repeated variable assignment (sometimes called 'chained assignment' or 'serial assignment') in left to right order. So first the leftmost set of assignments are handled, and second the next leftmost, and so on. Here we only have two sets of assignments, so the entire statement is equivalent to the much more verbose form:

t = ({}, 5)
(a, b) = t
a[b] = t

(When you do this outside of a function, the first (leftmost) assignment also creates a and b as names, which means that the second (right) assignment then has them available to use and doesn't get a 'name is not defined' error.)

The final 'trick' is due to what variables mean in Python, which creates the recursion in a[b]'s value. The tuple t that winds up assigned to a[b] contains a reference to the dictionary that a becomes another reference to, which means that the tuple contains a dictionary that contains the tuple again and it's recursion all the way down.
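
Running the original statement and poking at the result shows the cycle directly. This is just a sketch for experimenting; the name entry is made up here.

```python
(a, b) = a[b] = {}, 5

# The repr marks the point where the structure refers back to itself.
print(a)  # {5: ({...}, 5)}

entry = a[5]          # the tuple that was stored under key 5
print(entry[0] is a)  # True: its first element is the dictionary itself
print(entry[1])       # 5
```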

(When you combine Python's name binding behavior with serial assignment like this, you can wind up with fun bugs.)
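
A common example of this sort of bug (offered as an illustration, not necessarily the specific one meant above) is using serial assignment with a mutable value, which binds every name to the same object:

```python
# Both names are bound to one and the same list.
a = b = []
a.append(1)
print(b)  # [1]

# Two separate lists need two separate expressions.
c, d = [], []
c.append(1)
print(d)  # []
```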

AssignmentPuzzleUnpacked written at 01:57:27

2015-08-20

Using abstract namespace Unix domain sockets and SO_PEERCRED in Python

Linux has a special version of Unix domain sockets where the socket address is not a socket file in the filesystem but instead in an abstract namespace. It's possible to use them from Python without particular problems, including checking permissions with SO_PEERCRED, but it's not completely obvious how.

(For general information on using Unix domain sockets from Python, see UnixDomainSockets.)

With a normal Unix domain socket, the address you give is the path to a socket file. Per the Linux unix(7) manpage, an abstract socket address is simply your abstract name with a 0 byte on the front. This is trivial in Python and works exactly as you'd hope:

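import socket

# sname is your abstract name; the value here is only a placeholder.
sname = "example-name"

s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
# The leading 0 byte puts the address in the abstract namespace.
s.bind("\0" + sname)
s.listen(10)
# or s.connect(...) to talk to a server
....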

This works in both Python 2 and Python 3. Somewhat to my surprise, Python 3 converts the Unicode null 'byte' codepoint to a 0 byte without complaints. How Python 3 converts any non-ASCII in sname to bytes depends on your locale, as usual, which means that under some circumstances you may need to do explicit conversion to bytes and handle conversion errors. You can call .bind() or .connect() with a bytes address instead of a Unicode one.
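
If you'd rather not depend on the locale, a minimal Python 3 sketch of the explicit bytes approach looks like the following. It is Linux-only, the name itself is made up for illustration, and UTF-8 is simply an assumed choice of encoding.

```python
import os
import socket

# Hypothetical abstract name with a non-ASCII character in it; the PID
# is included only to keep the name unique from run to run.
sname = "example-caf\u00e9-%d" % os.getpid()

# Encode explicitly so the result doesn't depend on the locale.
addr = b"\0" + sname.encode("utf-8")

server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
server.bind(addr)
server.listen(1)

client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
client.connect(addr)
conn, _ = server.accept()

client.sendall(b"hello")
data = conn.recv(5)

# What the kernel reports as the server's address.
bound = server.getsockname()

for sock in (conn, client, server):
    sock.close()
```

No socket file is created anywhere, and the name goes away on its own once the last socket using it is closed.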

Sockets in the abstract namespace have no permissions, unlike regular Unix domain sockets (which are protected by file and/or directory permissions). If you want to add a permissions system, you can obtain the UID, GID, and PID of the other end with SO_PEERCRED like so:

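import struct
# Fall back to 17 (the value on mainstream Linux architectures) if the
# socket module doesn't define SO_PEERCRED.
SO_PEERCRED = getattr(socket, "SO_PEERCRED", 17)
# The result is a struct ucred, unpacked here as three ints: pid, uid, gid.
creds = s.getsockopt(socket.SOL_SOCKET, SO_PEERCRED, struct.calcsize("3i"))
pid, uid, gid = struct.unpack("3i", creds)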

This comes from a 2011 Stack Overflow answer, more or less (I have added my own little modifications to it).

The situation with the definition for SO_PEERCRED turns out to be a little bit complicated. The Python 3 socket module has had a definition for it for some time (it looks like since 2011 or so). Most versions of Python 2.x don't have a SO_PEERCRED constant defined in the socket module; the exception is the Fedora version of Python, which apparently has had this patched in for a very long time now. In addition, the '17' here is only correct on mainstream Linux architectures; some oddball ones like MIPS have other values. You may have to check in Python 3 or compile a little C program to get the correct value. Yes, this is irritating and you can see why the Fedora people patched Python (and why it got added to Python 3).

As you might suspect, SO_PEERCRED can be used by either end of a Unix domain socket connection (and it works on any Unix domain socket, not just ones in the abstract namespace). It's merely most useful for a server to find out what the client is, since clients usually trust servers.
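
To illustrate that either end can ask, here is a self-contained sketch that wraps the getsockopt() call in a helper and tries it on both halves of a socketpair(). The peer_creds name is made up for this example, and the whole thing assumes Linux.

```python
import os
import socket
import struct

# 17 is only correct on mainstream Linux architectures.
SO_PEERCRED = getattr(socket, "SO_PEERCRED", 17)


def peer_creds(sock):
    """Return (pid, uid, gid) for the other end of a Unix domain socket."""
    fmt = "3i"
    raw = sock.getsockopt(socket.SOL_SOCKET, SO_PEERCRED, struct.calcsize(fmt))
    return struct.unpack(fmt, raw)


# socketpair() gives two connected Unix domain sockets in one process,
# so each end should report this process's own credentials.
left, right = socket.socketpair()
left_view = peer_creds(left)
right_view = peer_creds(right)
left.close()
right.close()

print(left_view, right_view)
```

A server would call something like this on the connection returned by accept() and compare the UID against whatever it is prepared to allow.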

(Trusting the server may or may not be wise when you're dealing with Unix domain sockets in the abstract namespace, since anyone can grab any name in it. For my purposes I don't really care; my use is a petty little hack on my own personal machine and it doesn't involve anything sensitive.)

AbstractUnixSocketsAndPeercred written at 01:19:05


