2013-10-06
What reloading a Python module really does
If you're like me, your initial naive idea of what reload() does is that it re-imports the module and then replaces the old module object in sys.modules with the new module object. Except that that can't be right, because that would leave references to the old module object in any other module that had imported the module. So the better but still incorrect vision of reloading is that it re-imports the module as a new module object then overwrites the old module's namespace in place with the new module's namespace (making all references to the module use the new information). But it turns out that this is still wrong, as is hinted in the official documentation for reload().
What seems to really happen is that the new module code is simply executed in the old module's namespace. As the new code runs it defines names or at least new values for names (including for functions and classes, since def and class are actually executable statements) and those new names (or values) overwrite anything that is already there in the module namespace. After this finishes you have basically overwritten the old module namespace in place with all of the new module's names and bindings and so on.
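As a rough mental model (and only a mental model; the real import machinery also deals with bytecode caching, packages, extension modules, and so on), you can think of reload() as doing something like this sketch, where mod is the already-imported module object:

def naive_reload(mod):
    # Re-read and recompile the module's source...
    with open(mod.__file__) as f:
        code = compile(f.read(), mod.__file__, 'exec')
    # ...and execute it in the *existing* module's namespace. Nothing is
    # cleared first, so names that the new source doesn't define are left
    # exactly as they were.
    exec(code, mod.__dict__)
    return mod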
This has two consequences. The first is mentioned in the official documentation: module-level names don't get deleted if they're not defined in the new version of the module. This includes module-level functions and classes, not just variables. As a corollary, if you renamed a function or a class between the initial import and your subsequent reload, the reloaded module will have both the old and the new versions.
The second is that module reloads are not atomic in the face of some errors. If you reload a module and it has an execution error partway through, what you now have is some mix of the new module code (everything that ran before the error happened) and old module code (everything afterwards). As before this applies to variables, to any initialization code that the module runs, and to class and function definitions.
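To make both consequences concrete, here's a small self-contained demonstration. Everything in it (the module name demo_mod, its two versions, the simulated error) is invented for illustration; it writes a throwaway demo_mod.py in the current directory and assumes a reasonably modern Python 3:

import os
import sys
import time
from importlib import reload   # older Python 3 had this in imp; in Python 2 it's a builtin

sys.path.insert(0, os.getcwd())   # make sure demo_mod.py is importable

V1 = "A = 1\nB = 1\n\ndef old_name():\n    return 'v1'\n"
# The second version renames the function, changes A, and then fails
# before it ever gets around to reassigning B.
V2 = "A = 2\n\ndef new_name():\n    return 'v2'\n\nraise RuntimeError('boom')\n\nB = 2\n"

def write_module(source):
    with open('demo_mod.py', 'w') as f:
        f.write(source)
    time.sleep(1.1)   # let the mtime change so stale cached bytecode isn't reused

write_module(V1)
import demo_mod
print(demo_mod.A, demo_mod.B)                  # 1 1

write_module(V2)
try:
    reload(demo_mod)
except RuntimeError:
    pass                                       # the reload died partway through

# Consequence one: the renamed function lingers alongside the new one.
print(hasattr(demo_mod, 'old_name'), hasattr(demo_mod, 'new_name'))   # True True
# Consequence two: a mix of new code (A) and old code (B).
print(demo_mod.A, demo_mod.B)                  # 2 1

(This leaves demo_mod.py and possibly a __pycache__ directory behind; it's purely a throwaway illustration.)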
What I take away from this is that module reloading is not something that I want to ever try to use in live production code, however convenient it might be to have on-the-fly code updating in a running daemon. There are just too many ways to wind up with an overall program that works now but won't work after a full restart (or that will work differently).
(This behavior is the same in Python 2 and Python 3, although in Python 3 the reload() call is no longer a builtin and is now a module-level function in the imp module.)
The per-NFS-operation data for NFS mounts in /proc/self/mountstats
The last big chunk of statistics in mountstats is a bunch of separate information for each NFS operation. Since the exact operations are version-specific, the list of operations depends on the NFS version in use; however, the statistics are the same for all protocols (and all operations).
A typical example (for NFS v3) looks like this:
per-op statistics
        NULL: 0 0 0 0 0 0 0 0
     GETATTR: 192686 192688 0 22726724 21580832 2800 79242 83598
     SETATTR: 10782 10783 0 1699192 1552608 1139 18919 20152
[....]
Operations are actually numerical codes and here they're printed in that numerical order, although with their standard names instead of their numbers. For each operation the actual numbers are, in order:
- operations: How many requests we've done for this operation.
- transmissions: How many times we've actually transmitted a RPC request for this operation. As you might have gathered from the last entry, this can exceed the operation count due to timeouts and retries.
- major timeouts: How many times a request has had a major timeout. Major timeouts produce 'nfs: server X not responding, still trying' messages. You can have timeouts and retries without having major timeouts (and my example lines do; GETATTR shows two more transmissions than operations but zero major timeouts).
- bytes sent: This includes not just the RPC payload but also the RPC headers and so on. It closely matches the on-the-wire size.
- bytes received: As with bytes sent, this is the full size.
- cumulative queue time: How long (in milliseconds) all requests spent queued for transmission before they were sent.
- cumulative response time: How long (in milliseconds) it took to get a reply back after the request was transmitted. The kernel comments call this the RPC RTT.
- cumulative total request time: How long (in milliseconds) all requests took from when they were initially queued to when they were completely handled. The kernel calls this the RPC execution time.
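If you want to pull these numbers out programmatically, the per-op lines are easy to take apart; each one is just an operation name, a colon, and the eight counters above. Here's a minimal parsing sketch in Python (the field names are my own, chosen to match the list above):

# Field names for the eight per-operation counters, in order.
FIELDS = ('operations', 'transmissions', 'major_timeouts',
          'bytes_sent', 'bytes_received',
          'queue_time_ms', 'response_time_ms', 'total_time_ms')

def parse_op_line(line):
    op, rest = line.split(':', 1)
    return op.strip(), dict(zip(FIELDS, (int(x) for x in rest.split())))

op, stats = parse_op_line('GETATTR: 192686 192688 0 22726724 21580832 2800 79242 83598')
if stats['operations']:
    # Average reply size, using the division suggested below.
    print(op, stats['bytes_received'] / stats['operations'], 'bytes received per operation')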
At a first guess I think you'll want to divide bytes sent by transmissions but bytes received by operations if you're computing the average request and reply sizes. Doing it differently will only matter in the face of retransmissions due to retries.
As you can see from my example lines, the total request time is a bit larger than the sum of the queue time and the response time. I don't know enough about the Linux NFS RPC code to know where this extra time may come from.
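To put concrete numbers on both of these points using the GETATTR line from my example: 22726724 bytes sent over 192688 transmissions is about 118 bytes per request and 21580832 bytes received over 192686 operations is 112 bytes per reply, while the queue time plus the response time (2800 + 79242 = 82042 ms) comes up 1556 ms short of the 83598 ms total request time.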
All of this is extremely useful for performance monitoring, especially if you sum it up across all operations to get a global view of how RPC for a given mount is performing. Most NFS RPC operations have relatively obvious names, but it's helpful to know that COMMIT is part of NFS v3 writes. Another issue is that in NFS v3 it's possible to set different flags on WRITE operations (some can be made synchronous) and the per-operation statistics don't break these out separately. You may thus see wildly different WRITE performance numbers at different times depending on whether they are synchronous (and generally slow) or asynchronous.
(In fact I don't think anything in the overall mountstats data lets you tell these two sorts of WRITEs apart, even to the extent of finding out how much data is being written with synchronous WRITEs and how much with asynchronous ones. This is an unfortunate omission since asynchronous writes are likely to be much faster.)
PS: there may be other issues waiting with NFS v4 operations. We don't use NFS v4 here so I've never looked at it.
Sidebar: where to find things in the kernel
The code that prints the data is the rpc_print_iostats() routine in net/sunrpc/stats.c. The data is explained in comments in include/linux/sunrpc/metrics.h.