Wandering Thoughts archives

2018-09-17

Python 3 supports not churning memory on IO

I am probably late to this particular party, just as I am late to many Python 3 things, but today (in the course of research for another entry) I discovered the pleasant fact that Python 3 now supports read and write IO to and from appropriate pre-created byte buffers. This is supported at the low level and also at the high level with file objects (as covered in the io module).

In Python 2, one of the drawbacks of Python for relatively high-performance IO-related code was that reading data always required allocating a new string to hold it, and changing what you were writing also required new strings (you could write the same byte string over and over again without memory allocation, although not necessarily a Unicode string). Python 3's introduction of mutable bytestring objects (aka 'read-write bytes-like objects') means that we can bypass both issues now. For reading, you can read data into an existing mutable bytearray (or a suitable memoryview), or into a set of them. For writing, you can write a mutable bytestring and then mutate it in place to write different data a second time. This probably doesn't help much if you're generating entirely new data (unless you can do it piece by piece), but it's great if you only need to change a bit of the data to write a new chunk of stuff.
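
As a minimal sketch of what this looks like with ordinary file objects (the file names here are invented for illustration, and this is just how I'd expect you to use it, not code from any particular program):

buf = bytearray(64 * 1024)

# Open unbuffered so readinto() goes straight to the underlying raw
# file object; it fills our existing buffer and returns how many
# bytes it read, instead of allocating a new bytes object each time.
with open("/etc/hostname", "rb", buffering=0) as f:
    n = f.readinto(buf)

# For writing, mutate part of the same buffer in place and write it
# (or a slice of it) out again, instead of building new byte strings.
buf[0:4] = b"XXXX"
with open("/tmp/scratch-output", "wb", buffering=0) as out:
    out.write(memoryview(buf)[:n])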

One obvious question here is how you limit how much data you read. Python modules in the standard library appear to have taken two different approaches to this. The os module and the io module use the total size of the pre-allocated buffer or buffers you've provided as the only limit. The socket module defaults to the size of the buffer you provide, but allows you to further limit the amount of data read to below that. This initially struck me as odd, but then I realized that network protocols often have situations where you know you want only a few more bytes in order to complete some element of a protocol. Limiting the amount of data read below the native buffer size means that you can have a single maximum-sized buffer while still doing short reads if you only want the next N bytes.
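
Here is a hedged sketch of the socket version of this; the host, port, and request are made up purely for illustration:

import socket

buf = bytearray(64 * 1024)
view = memoryview(buf)

# Hypothetical connection; the host and the request are invented.
conn = socket.create_connection(("www.example.com", 80))
conn.sendall(b"HEAD / HTTP/1.0\r\nHost: www.example.com\r\n\r\n")

# recv_into() defaults to filling as much of the buffer as it can,
# but the optional nbytes argument lets us ask for at most the next
# 16 bytes even though the buffer itself is 64 KB.
n = conn.recv_into(view, 16)
print(n, bytes(view[:n]))
conn.close()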

(If I'm understanding things right, you could do this with a memoryview of explicitly limited size. But this would still require a new memoryview object, and they actually take up a not tiny amount of space; sys.getsizeof() on a 64-bit Linux machine says they're 192 bytes each. A bytearray's fixed size is actually smaller, apparently coming in at 56 bytes for an empty one and 58 bytes for one with a single byte in it.)
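
If you want to check this on your own machine, sys.getsizeof() is the tool to use; the exact numbers will vary with Python version and platform, so treat the figures above as indicative:

import sys

b = bytearray(200)
print(sys.getsizeof(memoryview(b)))    # memoryview object overhead
print(sys.getsizeof(bytearray()))      # empty bytearray
print(sys.getsizeof(bytearray(b"x")))  # bytearray holding one byte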

Sidebar: Subset memoryviews

Suppose you have a big bytearray object, and you want a memoryview of the first N bytes of it. As far as I can see, you actually need to make two memoryviews:

>>> b = bytearray(200)
>>> b[0:4]
bytearray(b'\x00\x00\x00\x00')
>>> m = memoryview(b)
>>> ms = m[0:30]
>>> ms[0:4] = b'1234'
>>> b[0:4]
bytearray(b'1234')

It is tempting to do 'memoryview(b[0:30])', but that creates a copy of the bytearray that you then get a memoryview of, so your change doesn't actually change the original bytearray (and you're churning memory). Of course if you intend to do this regularly, you'd create the initial memoryview up front and keep it around for the lifetime of the bytearray itself.
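
For contrast, here is the copying version; the write goes to the temporary copy made by the slice, so the original bytearray stays all zeros:

>>> b = bytearray(200)
>>> mc = memoryview(b[0:30])
>>> mc[0:4] = b'1234'
>>> b[0:4]
bytearray(b'\x00\x00\x00\x00')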

I'm a little bit surprised that memoryview objects don't have support for creating subset views from the start, although I'm sure there are good reasons for it.

python/Python3MutableBufferIO written at 23:32:23

The importance of explicitly and clearly specifying things

I was going to write this entry in an abstract way, but it is easier and more honest to start with the concrete specifics and then move from there to the general conclusions and points I draw from them.

We recently encountered an unusual Linux NFS client behavior, which at the time I called a bug. I have since been informed that this is not actually a bug but is Linux's implementation of what Linux people call "close-to-open cache consistency", which is written up in the Linux NFS FAQ, section A8. I'm not sure what to call the FAQ's answer; it is partly a description of concepts and partly a description of the nominal kernel implementation. However, this kernel implementation has changed over time, as we found out, with changes in user-visible behavior. In addition, the FAQ doesn't make any attempt to describe how this interacts with NFS locking, or whether NFS locking has any effect on it at all.

As someone who has to deal with this from the perspective of programs that are running on Linux NFS clients today and will likely run on Linux NFS clients for many years to come, what I need is a description of the official requirements for client programs. This is not a description of what works today or what the kernel does today, because as we've seen that can change; instead, it would be a description of what the NFS developers promise will work now and in the future. As with Unix's file durability problem, this would give me something to write client programs to and mean that if I found that the kernel deviated from this behavior I could report it as a bug.

(It would also give the NFS maintainers something clear to point people to if what they report is not in fact a bug but rather a misunderstanding of what the kernel requires.)

On the Linux NFS mailing list, I attempted to write a specific description of this from the FAQ's wording (you can see my attempt here), and then asked some questions about what effect using flock() had on this (since the FAQ is entirely silent on that). This uncovered another Linux NFS developer who apparently has a different (and less strict) view of what the kernel should require from programs here. It has not yet yielded any clarity on what's guaranteed about flock()'s interaction with Linux CTO cache consistency.

The importance of explicitly and clearly specifying things is that it deals with all four issues that have been uncovered here. With a clear and explicit specification (which doesn't have to be a formal, legalistic thing), it would be obvious what writers of programs must do to guarantee things working, not just now but also into the future. All of the developers could be sure that they were in agreement about how the code should work (and if there was disagreement, it would be immediately uncovered). Any unclear or unspecified areas would at least become obvious; you could notice that the specification says nothing about what flock() does. And it would be much clearer whether some kernel behavior was a bug or whether a kernel change had introduced a deviation from the agreed specification.

This is a general thing, not something specific to the Linux kernel or kernels in general. For 'kernel' you can substitute 'any system that other people base things on', like compilers, languages, web servers, etc etc. In a sense this applies to anything that you can describe as an API. If you have an API, you want to know how you use the API correctly, what the API actually is (not just the current implementation), if the API is ambiguous or incomplete, and if something is a bug (it violates the API) or just a surprise. All of this is very much helped by having a clear and explicit description of the API (and, I suppose I should add, a complete one).

tech/ExplicitSpecImportance written at 01:06:10

