2018-12-31
Our Linux ZFS fileservers work much like our OmniOS ones
I wrote recently about our new generation of Linux-based ZFS fileservers. One of the things that I covered only in passing is how surprisingly similar they are to our current OmniOS fileservers. Although we've changed from OmniOS to Linux, which is a significant shift at one level, a great deal of the administration and how we work with them has basically not changed at all. From the perspective of using and operating our ZFS fileservers, very little has fundamentally changed and most of our practices and procedures are just as before. Looking at things so far, I think that there are three reasons for this (or perhaps four, if I'm going to be pessimistic).
The first reason is that modern Unixes are quite similar to each other. On top of that, OmniOS is less foreign than Solaris was (it used many more GNU versions of things than Solaris had, and on top of that we added various stuff), and by moving from OmniOS to Linux we're moving 'upwards', to an environment with more GNU tools, programs with more features, and so on. Moving from Linux to Solaris would be dislocating (and was), because so many things we're used to would be missing (including /bin/sh as a POSIX shell), but OmniOS is more mainstream than Solaris and we're moving the other way.
The second reason is that the two big ZFS commands, zfs and zpool, are basically the same between OmniOS and Linux (and in general between all ZFS versions). This is pretty much by design. Right from the start on Solaris, ZFS defined its own management commands in addition to the kernel components, and people who ported ZFS to other environments have kept those commands as unchanged as possible. ZFS on Linux does have some Linux-specific components and aspects, such as its systemd services and ZED, but we don't interact with them on a day to day basis. Because the ZFS commands are the same and OmniOS was substantially similar to Linux, many of our management scripts and so on can (and do) operate basically the same on Linux as they did on OmniOS, or even are the same scripts.
The third reason (and the final positive one) is that we do a lot of our ZFS administration through a few locally written front end commands. Since we wrote these commands, we can make them behave exactly the same on Linux as on OmniOS, even if the actual underlying mechanisms are significantly different (for example, we do extensive translation of NFS export options, but that's all hidden by the local command). If we had to work at the level of straight ZFS commands on a routine basis, some things would be more noticeably different; for example, OmniOS device names are quite different from Linux device names.
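To illustrate the idea (this is a made-up sketch for this entry, not our actual tool; both the calling conventions and the exact sharenfs option strings are invented), such a front end might look like:

    import subprocess

    # Hypothetical sketch of a local front end that hides platform
    # differences behind one interface. The option strings on both
    # sides are illustrative, not our real translation rules.
    def set_nfs_exports(filesystem, rw_hosts, root_hosts, platform):
        if platform == "omnios":
            # Solaris-style share options (illustrative).
            opts = "rw=%s,root=%s" % (":".join(rw_hosts),
                                      ":".join(root_hosts))
        else:
            # Linux exportfs-style options (illustrative).
            parts = ["rw=@%s" % h for h in rw_hosts]
            parts.extend("no_root_squash=@%s" % h for h in root_hosts)
            opts = ",".join(parts)
        subprocess.check_call(["zfs", "set",
                               "sharenfs=%s" % opts, filesystem])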
(In the OmniOS days, transforming the device names into a far more understandable form was much more important than it is now, since we were using iSCSI and thus really wanted to know which iSCSI backend a particular disk was on. Today, most of the important things are visible in the Linux device names we use, although they still require some mental translation.)
The final reason is that we haven't yet had to troubleshoot issues, which is the area where there are clear and significant differences. While we have much more ongoing metrics on our new Linux fileservers than we ever set up on OmniOS, we have no equivalents of our DTrace monitoring scripts. For all that Linux is a more familiar environment for finding some problems, there's some information that the DTrace scripts gave us ready access to that I'm not sure we have any good equivalents of.
(For instance, I'm not sure we have a good way to find out what clients are the most active ones for a given fileserver. There's nfswatch, but I'm not sure that's going to be enough. We can (and do) gather client-side statistics from our own clients, but that doesn't help us if a significant amount of traffic comes from other people's client machines, which it sometimes does.)
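One crude stopgap I can imagine is just counting NFS packets by client address. Here is a rough sketch of the idea; it assumes tcpdump is installed, needs to run as root on the fileserver, and counts packets rather than NFS operations, so it's only an approximation of real activity:

    import collections
    import subprocess

    # Rough sketch: count packets sent to the NFS port per client,
    # by parsing tcpdump output.
    proc = subprocess.Popen(["tcpdump", "-l", "-n", "dst", "port", "2049"],
                            stdout=subprocess.PIPE,
                            universal_newlines=True)
    counts = collections.Counter()
    try:
        for line in proc.stdout:
            # Lines look like: '... IP client.port > server.2049: ...'
            fields = line.split()
            if "IP" in fields:
                src = fields[fields.index("IP") + 1]
                # Drop the trailing '.port' to get the client address.
                counts[src.rsplit(".", 1)[0]] += 1
    except KeyboardInterrupt:
        pass
    finally:
        proc.terminate()
    for client, num in counts.most_common(10):
        print("%s: %d packets" % (client, num))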
Of course, getting here took a bunch of work. We had to adapt and modify our local programs and some of our scripts, design and build some new systems to replace things we were doing on OmniOS, and figure out how to hook things into existing Linux and ZFS on Linux facilities like ZED. But now that we're here, it's pleasant how similar the new environment is to the old, as far as operating it goes.
PS: Someday eBPF and bpftrace and so on may make a solid DTrace replacement, with some work on our part to build equivalent scripts, but it's not there out of the box right now on Ubuntu 18.04 LTS.
PPS: Of course, as far as our NFS clients are concerned everything is just the same. Filesystems are mounted in the same way, using the same paths; it's just that the fileserver names changed. With our automounter replacement, clients don't even really care about the fileserver names either; all NFS mounts are driven by a magic file that's automatically generated from our master data file of ZFS filesystems and where they are.
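As a sketch of the general idea (both file formats here are invented stand-ins for our real ones), the generation step might look like:

    # Hypothetical sketch of generating the client mount data from a
    # master file of ZFS filesystems. Both formats are invented.
    def generate_mount_file(masterfile, outfile):
        with open(masterfile) as mf, open(outfile, "w") as of:
            for line in mf:
                line = line.strip()
                if not line or line.startswith("#"):
                    continue
                # Assumed master format: '<filesystem> <fileserver>'
                fs, server = line.split()
                # Clients NFS-mount server:/<fs> without caring
                # which fileserver it currently lives on.
                of.write("/%s\t%s:/%s\n" % (fs, server, fs))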
Thinking about DWiki's Python 3 Unicode issues
DWiki (the code behind this blog) is currently Python 2, and it has to move to Python 3 someday, even if I'm in no hurry to make that move. The end of 2018, with only a year of official Python 2 support remaining, seems like a good time to take stock of what I expect to be the biggest aspect of that move, which is character set and Unicode issues (this is also the big issue I ignored when I got DWiki tentatively running under Python 3 a few years ago).
The current Python 2 version of DWiki basically ignores encoding issues. It lets you specify the character set that the generated HTML will declare, but it pretty much treats everything as bytes and makes no attempt to validate that your content is actually valid in the character set you've claimed. This is not viable in Python 3 for various reasons, including that it's not how the Python 3 version of WSGI works (as covered in PEP 3333). Considering Unicode issues for a Python 3 version of DWiki means thinking about everywhere that DWiki reads and writes data, and deciding what encoding that data is in (and then properly inserting error checks to handle the cases where that data is not actually properly encoded).
The primary source of text data for DWiki is the text of pages and comments. Here in 2018, the only sensible encoding for these is UTF-8, and I should probably just hardcode that assumption into reading them from the filesystem (and writing comments out to the filesystem). Relying on Python's system encoding setting, whatever it is, doesn't seem like a good idea, and I don't think this should be settable in DWiki's configuration file. UTF-8 also has the advantage for writing things out that it's a universal encoder; you can encode any Unicode str to UTF-8, which isn't true of all character encodings.
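Concretely, I expect reading and writing to look something like this (a sketch, with illustrative function names rather than DWiki's actual storage API):

    # Sketch: hardcode UTF-8 instead of relying on the system
    # encoding. The function names are illustrative.
    def read_page(path):
        # A page that isn't valid UTF-8 raises UnicodeDecodeError,
        # which the caller has to turn into a sensible error.
        with open(path, encoding="utf-8", errors="strict") as fp:
            return fp.read()

    def write_comment(path, text):
        # Encoding any Unicode str to UTF-8 always succeeds.
        with open(path, "w", encoding="utf-8") as fp:
            fp.write(text)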
Another source of text data is the names of files and directories in the directory hierarchy that DWiki serves content from; these will generally appear in links and various other places. Again, I think the only sensible decision in 2018 is to declare that all filenames have to be UTF-8 and undefined things happen if they aren't. DWiki will do its best to do something sensible, but it can only do so much. Since these names propagate through to links and so on, I will have to make sure that UTF-8 in links is properly encoded.
(In general, I probably want to use the 'backslashreplace' error handling option when decoding to Unicode, because that's the option that both produces correct results and preserves as much information as possible. Since this introduces extra backslashes, I'll have to make sure they're all handled properly.)
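For example, decoding raw filename bytes this way never fails and keeps the damaged bytes visible:

    import os

    # Sketch: decode directory entries (read as bytes) with
    # 'backslashreplace', so badly encoded names survive in a
    # recognizable form instead of raising UnicodeDecodeError.
    def listdir_decoded(dirpath):
        result = []
        for name in os.listdir(os.fsencode(dirpath)):
            # b'caf\xe9' (Latin-1 'café') decodes to 'caf\xe9' with
            # a literal backslash, rather than blowing up.
            result.append(name.decode("utf-8", "backslashreplace"))
        return result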
For HTML output, once again the only sensible encoding is UTF-8. I'll take out the current configuration file option and just hard-code it, so the internal Unicode HTML content that's produced by rendering DWikiText to HTML will be encoded to UTF-8 bytestrings. I'll have to make sure that I consistently calculate my ETag values from the same version of the content, probably the bytestring version (the current code calculates the ETag hash very late in the process).
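The core of this is small; the important part is hashing the same bytes that actually get sent. A sketch (the hash choice here is just for illustration):

    import hashlib

    # Sketch: encode the rendered HTML to UTF-8 once, and derive the
    # ETag from that same bytestring, so the two can never disagree.
    def finish_response(html):
        body = html.encode("utf-8")
        etag = '"%s"' % hashlib.sha1(body).hexdigest()
        return body, etag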
DWiki interacts with the HTTP world through WSGI, although it's all my own WSGI implementation in a normal setup. PEP 3333 clarifies WSGI for Python 3, and it specifies two sides of things here: what types are used where, and some information on header encoding. For output, generally my header values will be in ISO-8859-1; however, for some redirections, the Location: header might include UTF-8 derived from filenames, and I'll need to encode it properly. Handling incoming HTTP headers and bodies is going to be more annoying and perhaps more challenging; people and programs may well send me incorrectly formed headers that aren't properly encoded, and for POST requests (for example, for comments) there may be various encodings in use and also the possibility that the data is not correctly encoded (eg it claims to be UTF-8 but doesn't decode properly). In theory I might be able to force people to use UTF-8 on comment submissions, and probably most browsers would accept that.
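Both directions come down to small conversions. A sketch of what I expect this to look like:

    from urllib.parse import quote

    # Sketch: header values in PEP 3333 are native strings limited
    # to Latin-1 code points.
    def location_header(path):
        # Percent-encoding turns UTF-8 from filenames into pure
        # ASCII: '/caf\u00e9' becomes '/caf%C3%A9'.
        return ("Location", quote(path, safe="/"))

    def header_to_unicode(value):
        # WSGI hands us header values decoded as Latin-1; undo that
        # and reinterpret the bytes as UTF-8, using backslashreplace
        # for anything malformed.
        return value.encode("latin-1").decode("utf-8", "backslashreplace")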
Since I don't actually know what happens in the wild here, a sensible first-pass Python 3 implementation should probably log and reject with an HTTP error any comment submission that is not in UTF-8, or any HTTP request with headers that don't properly decode. If I see any significant quantity of these that appears legitimate, I can add code that tries to handle the situation.
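In code, that first pass is just strict decoding plus logging. A sketch:

    # Sketch: strict first-pass handling of comment POST bodies.
    # Anything that doesn't decode as UTF-8 gets logged and
    # rejected; the caller turns None into an HTTP 400 response.
    def decode_post_body(raw, log):
        try:
            return raw.decode("utf-8", errors="strict")
        except UnicodeDecodeError as e:
            log.warning("rejecting non-UTF-8 comment submission: %s", e)
            return None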
(Possibly I should start by adding code to the current Python 2 version of DWiki that looks for this situation and logs information about it. That would give me a year or two of data at a minimum. I should also add an accept-charset attribute to the current comment form.)
DWiki has on-disk caches of data created with Python's pickle module. I'll have to make sure that the code reads and writes these using bytestrings and in binary mode, without trying to encode or decode the data (in my current code, I read and write the pickled data myself, rather than going through the pickle module).
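That should amount to no more than binary-mode file IO around pickle's bytes interface. A sketch:

    import pickle

    # Sketch: cache files are read and written as uninterpreted
    # bytes; pickle itself works in bytes, so no text encoding or
    # decoding is involved anywhere.
    def read_cache(path):
        with open(path, "rb") as fp:
            return pickle.loads(fp.read())

    def write_cache(path, obj):
        with open(path, "wb") as fp:
            fp.write(pickle.dumps(obj))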
The current DWiki code does some escaping of bad characters in text, because at one point control characters kept creeping in and blowing up my Atom feeds. This escaping should stay in a Python 3 Unicode world, where it will become more correct and reliable (currently it really operates on bytes, which has various issues).
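In a Unicode world this becomes a straightforward substitution on str instead of bytes. A sketch (DWiki's actual escaping may differ in details):

    import re

    # Sketch: escape control characters (other than tab, newline,
    # and carriage return) in already-decoded text so they can't
    # blow up Atom feeds. Operating on str means this sees real
    # characters, not raw bytes.
    _CTRLRE = re.compile('[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]')

    def escape_controls(text):
        return _CTRLRE.sub(lambda m: '\\x%02x' % ord(m.group()), text)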
Since in real life most things are properly encoded and even mostly ASCII, mistakes in all of this might lurk undetected for some time. To deal with this, I should set up two torture test environments for DWiki, one where there is UTF-8 everywhere I can think of (including in file and directory names) and one where there is incorrectly encoded UTF-8 everywhere I can think of (or things just not encoded as UTF-8, but instead Latin-1 or something). Running DWiki against both of these would smoke out many problems and areas I've missed. I should also put together some HTTP tests with badly encoded headers and comment POST bodies and so on, although I'm not sure what tools are available to create deliberately incorrect HTTP requests like that.
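Creating the badly encoded side of this is easy on Unix, where filenames are fundamentally bytes. A sketch:

    import os

    # Sketch: put both a valid UTF-8 name and the same name encoded
    # as Latin-1 (invalid as UTF-8) into a test directory. Using
    # bytes paths sidesteps Python's filesystem encoding, so this
    # is Unix-only.
    def make_torture_dir(base):
        os.makedirs(base, exist_ok=True)
        open(os.path.join(base, "caf\u00e9.txt"), "w").close()
        open(os.path.join(os.fsencode(base), b"caf\xe9.txt"), "wb").close()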
All of this is clearly going to be a long term project and I've probably missed some areas, but at least I'm starting to think about it a bit. Also, I now have some preliminary steps I can take while DWiki is still a Python 2 program (although whether I'll get around to them is another question, as it always is these days with work on DWiki's code).
PS: Rereading my old entry has also reminded me that there's DWiki's logging messages as well. I'll just declare those to be UTF-8 and be done with it, since I can turn any Unicode into UTF-8. The rest of the log file may or may not be UTF-8, but I really don't care. Fortunately DWiki doesn't use syslog (although I've already wrestled with that issue).
Sidebar: DWiki's rendering templates and static file serving
DWiki has an entire home-grown template system that's used as part of the processing model. These templates should be declared to be UTF-8 and loaded as such, with it being a fatal internal error if they fail to decode properly.
DWiki can also be configured to serve static files. In Python 3, these static files should be loaded uninterpreted as (binary mode) bytestrings and served back out that way, especially since they can be used for things like images (which are binary data to start with). Unfortunately this is going to require some code changes in DWiki's storage layer, because right now these static files are loaded from disk with the same code that is also used to load DWikiText pages, which have to be decoded to Unicode as they're loaded.