Wandering Thoughts archives

2017-04-21

OmniOS's (suddenly) changed future and us

Today, Robert Treat, the CEO of OmniTI, sent a message to the OmniOS user discussion list entitled The Future of OmniOS. The important core of that email is in two sentences:

Therefore, going forward, while some of our staff may continue contributing, OmniTI will be suspending active development of OmniOS. Our next release, currently in beta, will become the final release from OmniTI. [...]

As Treat's message explains, OmniTI has been basically the only major contributor to and organizing force behind OmniOS, and no real outside development community has materialized around it. OmniTI is no longer interested in having things continue this way, so they are opting for what Treat aptly calls 'a radical approach'.

(There is much more to Treat's email and if you use OmniOS you should read it in full.)

For various reasons, I don't expect this attempt to materialize a community to be successful. This probably means that OmniOS will wither away as a viable and usable Illumos distribution, especially if you want to more or less freeze a version and then only have important bug fixes or security fixes ported into it (which is what we want). That presents an obvious problem for us, since we use OmniOS on our current generation of fileservers. I think that we'll be okay for the remaining lifetime of the current generation, but that's mostly because we're approaching the point where we should start work on the next generation (which I have somewhat optimistically put at some time in 2018 in previous entries).

(We ran our first generation of fileservers for longer than that, but it wasn't entirely a great experience so we'd like to not push things so far this time around.)

One option for our next generation of fileservers is another Illumos distribution. I have no current views here because I haven't looked at what's available since we picked OmniOS several years ago, but investigating the current Illumos landscape is clearly now a priority. However, I have some broad concerns about Illumos in general and about 10G Ethernet support on Illumos in particular, because I think 10G is going to be absolutely essential in our next generation of fileservers (living without 10G in our current generation is already less than ideal). Another option is ZFS on a non-Illumos platform, either Linux or FreeBSD, which seems to be a viable option now and should likely be even more viable in 2018.

(Oracle Solaris 11 remains very much not an option for all sorts of reasons, including my complete lack of faith and trust in Oracle.)

Fortunately we probably don't have to make any sort of decision soon. We should have at least until the end of 2017 to see how the OmniOS situation develops, watch changes in other Illumos distributions, and so on. And maybe I will be surprised by what happens with OmniOS, as I actually was by Illumos flourishing outside of Sun/Oracle.

(The one thing that could accelerate our plans is some need to buy new fileserver hardware sooner than expected because we have a limited time funding source available for it. But given a choice we'd like to defer hardware selection for a relatively long time, partly because perhaps a SSD-based future can come to pass.)

Sidebar: On us (not) contributing to OmniOS development

The short version is that we don't have the budget (in money or in free staff time) to contribute to OmniOS as an official work thing and I'm not at all interested in doing it on a purely personal basis on my own time.

solaris/OmniOSChangedFuture written at 23:08:06

Link: Rob Landley's Linux Memory Management Frequently Asked Questions

Rob Landley's Linux Memory Management Frequently Asked Questions is what it says in the title (via). As of writing this pointer to it, the current version of the FAQ appears to be from 2014, but much of its information is relatively timeless.

Landley's FAQ is good for a lot more than Linux-specific information. It's a good general overview of a number of hardware MMU concepts and general Unix MMU concepts. Things like page tables, TLBs, and even memory mappings are basically universal, and the FAQ has a nice solid explanation of all of them and much more.

(See also Rob Landley's other writing, and his website in general.)

links/LandleyLinuxMMFAQ written at 20:35:13

A surprising reason grep may think a file is a binary file

Recently, 'fgrep THING FILE' for me has started to periodically report 'Binary file FILE matches' for files that are not in fact binary files. At first I thought a stray binary character might have snuck into one file this was happening to, because it's a log file that accumulates data partly from the Internet, but then it happened to a file that is only hand-edited and that definitely shouldn't contain any binary data. I spent a chunk of time tonight trying to find the binary characters or mis-encoded UTF-8 or whatever it might be in the file, before I did the system programmer thing and just fetched the Fedora debuginfo package for GNU grep so that I could read the source and set breakpoints.

(I was encouraged into this course of action by this Stackexchange question and answers, which quoted some of grep's source code and in the process gave me a starting point.)

As this answer notes, there are two cases where grep thinks your file is binary: if there's an encoding error detected, or if it detects some NUL bytes. Both of these sound at least conceptually simple, but it turns out that grep tries to be clever about detecting NULs. Not only does it scan the buffers that it reads for NULs, but it also attempts to see if it can determine that a file must have NULs in the remaining data, in a function helpfully called file_must_have_nulls.
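
The buffer scan is the conceptually simple half; in rough sketch form (my simplification, not grep's actual code), it amounts to a memchr() over each block of data that grep reads:

#include <string.h>

/* Sketch: a buffer that contains a NUL byte marks the
   whole file as binary. */
static int buffer_has_nulls(const char *buf, size_t size)
{
    return memchr(buf, '\0', size) != NULL;
}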

You might wonder how grep or anything can tell if a file has NULs in the remaining data. Let me answer that with a comment from the source code:

/* If the file has holes, it must contain a null byte somewhere. */

Reasonably modern versions of Linux (since kernel 3.1) have some special additional lseek() options, per the manpage. One of them is SEEK_HOLE, which seeks to the nearest 'hole' in the file. Holes are unwritten data and Unix mandates that they read as NUL bytes, so if a file has holes, it's got NULs and so grep will call it a binary file.
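
Put together, a minimal sketch of this hole check (my code, not grep's actual file_must_have_nulls) looks like the following. It relies on the fact that lseek() reports an implicit hole at the end of every file, so a hole reported before st_size is a real one:

#define _GNU_SOURCE
#include <sys/stat.h>
#include <unistd.h>

/* Sketch: must the remaining data (from offset 'cur' onward)
   contain NUL bytes because of a hole? */
static int must_have_nulls(int fd, off_t cur)
{
    struct stat st;
    off_t hole;

    if (fstat(fd, &st) == -1)
        return 0;
    hole = lseek(fd, cur, SEEK_HOLE);
    if (hole == (off_t)-1)
        return 0;               /* error or unsupported: assume no holes */
    lseek(fd, cur, SEEK_SET);   /* put the file offset back for reading */
    return hole < st.st_size;
}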

SEEK_HOLE is not implemented on all filesystems. More to the point, the implementation of SEEK_HOLE may not be error-free on all filesystems all of the time. In my particular case, the files that are being unexpectedly reported as binary are on ZFS on Linux, and it appears that under some mysterious circumstances the latest development version of ZoL can report that there are holes in a file when there aren't. It appears to be a timing issue, but strace gave me a clear smoking gun and I managed to reproduce the problem in a simple test program:

open("testfile", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0600, st_size=33005, ...}) = 0
read(3, "aaaaaaaaa"..., 32768) = 32768
lseek(3, 32768, SEEK_HOLE)              = 32768

The file doesn't have any holes, yet sometimes it's being reported as having one at the exact current offset (and yes, the read() is apparently important to reproduce the issue).
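
A test program along the lines of the strace above looks something like this (a simplified sketch):

#define _GNU_SOURCE
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

/* Read the first 32K of a fully-written file, then ask where the
   next hole is. The correct answer for a hole-free file is st_size
   (the implicit end-of-file hole); the misbehavior reports a hole
   at the current offset instead. */
int main(int argc, char **argv)
{
    char buf[32768];
    struct stat st;
    off_t hole;
    int fd;

    if (argc != 2) {
        fprintf(stderr, "usage: %s file\n", argv[0]);
        exit(1);
    }
    fd = open(argv[1], O_RDONLY);
    if (fd == -1 || fstat(fd, &st) == -1) {
        perror(argv[1]);
        exit(1);
    }
    if (read(fd, buf, sizeof(buf)) == -1) {
        perror("read");
        exit(1);
    }
    hole = lseek(fd, (off_t)sizeof(buf), SEEK_HOLE);
    printf("file size %lld, first hole reported at %lld\n",
           (long long)st.st_size, (long long)hole);
    exit(0);
}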

(Interested parties can see more weirdness in the ZFS on Linux issue.)

linux/GrepBinaryFileReason written at 00:57:12

