Wandering Thoughts

2019-04-19

V7 Unix programs are often not written the way you would expect

Yesterday I wrote that V7 ed read its terminal input in cooked mode a line at a time, which was an efficient, low-CPU design that was important on V7's small and low-power hardware. Then in comments, frankg pointed out that I was wrong about part of that, namely about how ed read its input. Here, straight from the V7 ed source code, is how ed read input from the terminal:

getchr()
{
	[...]
	if (read(0, &c, 1) <= 0)
		return(lastc = EOF);
	lastc = c&0177;
	return(lastc);
}

gettty()
{
	[...]
	while ((c = getchr()) != '\n') {
	[...]
}

(gettty() reads characters from getchr() into a linebuf array until end of line, EOF, or it runs out of space.)

In one way, this is surprising; it's very definitely not how we'd write this today, and if you did, many Unix programmers would immediately tell you that you're being inefficient by making so many calls to read() and that you should instead use a buffer, for example through stdio's fgets(). Very few modern Unix programs do character-at-a-time reads from the kernel, partly because on modern machines it's not very efficient.

(It may have been comparatively less inefficient on V7 on the PDP-11, if for example the relative cost of making a system call was lower than it is today. My impression is that this may have been the case.)
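
For contrast, here is a minimal sketch of what a buffered, stdio-based version of this might look like (hypothetical code, not anything from the actual ed source):

#include <stdio.h>

/* stdio reads from the kernel in large chunks behind the scenes,
   so this makes one read() per buffer instead of one per character. */
int
gettty(char *linebuf, int size)
{
	if (fgets(linebuf, size, stdin) == NULL)
		return EOF;	/* end of input or a read error */
	return 0;
}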

V7 had stdio in more or less its modern form, complete with fgets(). V6 had a precursor version of stdio and buffered IO (see eg the manpage for getc()). However, many V7 and V6 programs didn't necessarily use them; instead they used more basic system calls. This is one of the things that often gives the code for early Unix programs (V7 and before) an unusual feel, along with the short variable names and the lack of comments.

The situation with ed is especially interesting, because in V5 Unix, ed appears to have still been written in assembly; see ed1.s, ed2.s, and ed3.s here in 's1' of the V5 sources. In V6, ed was rewritten in C to create ed.c (still in a part of the source tree called 's1'), but it still used the same read() based approach that I think it used in the assembly version.

(I haven't looked forward from V7 to see if later versions were revised to use some form of buffering for terminal input.)

Sidebar: An interesting undocumented ed feature

Reading this section of the source code for ed taught me that it has an interesting, undocumented, and entirely characteristic little behavior. Officially, ed commands that have you enter new text have that new text terminated by a '.' on a line by itself:

$ ed newfile
a
this is new text that we're adding.
.

This is how the V7 ed manual documents it and how everyone talks about it. But what the actual ed source code implements on input is this, from that gettty() function:

if (linebuf[0]=='.' && linebuf[1]==0)
        return(EOF);
return(0);

In other words, it turns a single line with '.' into an EOF. The consequence of this is that if you type a real EOF at the start of a line, you get the same result, thus saving you one character (you use Control-D instead of '.' plus newline). This is very V7 Unix behavior, including the lack of documentation.
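
That is, you can also end input mode with a real EOF (shown here as ^D, ie Control-D):

$ ed newfile
a
this is new text that we're adding.
^D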

This is also a natural behavior in one sense. A proper program has to react to EOF here in some way, and it might as well do so by ending the input mode. It's also natural to go on to try reading from the terminal again for subsequent commands; if this is a real and persistent EOF, for example because the pty closed, you'll just get EOF again and eventually quit. V7 ed is slightly unusual here in that it deliberately converts '.' by itself into EOF instead of signaling it in a different way, but that's also the simplest approach; if you have to have some signal for each case and you're going to treat them the same, you might as well use the same signal for both.

Modern versions of ed appear to faithfully reimplement this convenient behavior, although they don't appear to document it. I haven't checked OpenBSD, but both FreeBSD ed and GNU ed work like this in a quick test. I haven't checked their source code to see if they implement it the same way.

unix/EdV7CodedUnusually written at 23:49:59

Links: A Practitioner's Guide to System Dashboard Design (with a bonus)

A Practitioner's Guide to System Dashboard Design is a four article series on system dashboard design by Cory Watson of One Mo' Gin. The parts are:

  1. Structure and Layout
  2. Presentation and Accessibility
  3. What Charts To Use
  4. Context Improvement

If you like these (and I did), you probably also want to read Cory's The CASE Method: Better Monitoring For Humans, and perhaps peruse the full articles index for additional things to read.

(Via somewhere that I've now forgotten and can't find again. Perhaps it was Twitter or Mastodon.)

links/SystemDashboardDesign written at 22:09:26

2019-04-18

One reason ed(1) was a good editor back in the days of V7 Unix

It is common to describe ed(1) as being line-oriented, as opposed to screen-oriented editors like vi. This is completely accurate, but it is perhaps not a complete enough description for today, because ed is line-oriented in a way that is now uncommon. After all, you could say that your shell is line-oriented too, and very few people use shells that work and feel the same way ed does.

The surface difference between most people's shells and ed is that most people's shells have some version of cursor-based interactive editing. The deeper difference is that this requires the shell to run in character-by-character TTY input mode, also called raw mode. By contrast, ed runs in what Unix usually calls cooked mode, where it reads whole lines from the kernel and the kernel handles things like backspace. All of ed's commands are designed so that they work in this line-focused way (including being terminated by the end of the line), and as a whole ed's interface makes this whole-line input approach natural. In fact I think ed makes it so natural that it's hard to think of things being any other way. Ed was designed for line-at-a-time input, not just to not be screen-oriented.

(This was carefully preserved in UofT ed's very clever zap command, which let you modify a line by writing out the modifications on a new line beneath the original.)

This input mode difference is not very important today, but in the days of V7 and serial terminals it made a real difference. In cooked mode, V7 ran very little code when you entered each character; almost everything was deferred until it could be processed in bulk by the kernel and then handed to ed as a single line, which ed could also process all at once. A version of ed that tried to work in raw mode would have been much more resource intensive, even if it still operated on a single line at a time.

(If you want to imagine such a version of ed, think about how a typical readline-enabled Unix shell can move back and forth through your command history while only displaying a single line. Now augment that sort of interface with a way of issuing vi-like bulk editing commands.)
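
In modern terms, the difference is whether or not a program turns off the kernel's canonical ('cooked') line processing. Here is a hypothetical sketch using POSIX termios (V7 itself had the older stty/gtty interface instead):

#include <termios.h>
#include <unistd.h>

/* Put standard input into a raw-ish mode: the kernel stops
   assembling lines and handling backspace, and read() returns
   each character as it is typed. */
void
rawmode(void)
{
	struct termios t;

	tcgetattr(STDIN_FILENO, &t);
	t.c_lflag &= ~(ICANON | ECHO);	/* no line editing, no echo */
	t.c_cc[VMIN] = 1;		/* read() returns after one character */
	t.c_cc[VTIME] = 0;
	tcsetattr(STDIN_FILENO, TCSANOW, &t);
}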

This is part of why I feel that ed(1) was once a good editor (cf). Ed is carefully adapted for the environment of early Unixes, which ran on small and slow machines with limited memory (which led to ed not holding the file it's editing in memory). Part of that adaptation is being an editor that worked with the system, not against it, and on V7 Unix that meant working in cooked mode instead of raw mode.

(Vi appeared on more powerful, more capable machines; I believe it was first written when BSD Unix was running on Vaxes.)

Update: I'm wrong in part about how V7 ed works; see the comment from frankg. V7 ed runs in cooked mode but it reads input from the kernel a character at a time, instead of in large blocks.

unix/EdDesignedForCookedInput written at 23:25:56

A pattern for dealing with missing metrics in Prometheus in simple cases

Previously, I mentioned that Prometheus expressions are filters, which is part of Prometheus having a generally set-oriented view of the world. One of the consequences of this view is that you can quite often have expressions that give you a null result when you really want the result to be 0.

For example, let's suppose that you want a Grafana dashboard that includes a box that tells you how many Prometheus alerts are currently firing. When alerts are firing, Prometheus exposes an ALERTS metric for each active alert, so on the surface you would count these up with:

count( ALERTS{alertstate="firing"} )

Then one day you don't have any firing alerts and your dashboard's box says 'N/A' or 'null' instead of the '0' that you want. This happens because 'ALERTS{alertstate="firing"}' matches nothing, so the result is a null set, and count() of a null set is a null result (or, technically, a null set).

The official recommended practice is to not have any metrics or metric label values that come and go; all of your metrics and label sets should be as constant as possible. As you can tell from the official Prometheus ALERTS metric, not even Prometheus itself actually fully follows this, so we need a way to deal with it.

My preferred way of dealing with this is to use 'or vector(0)' to make sure that I'm never dealing with a null set. The easiest thing to use this with is sum():

sum( ALERTS{alertstate="firing"} or vector(0) )

Using sum() has the useful property that the extra vector(0) element has no effect on the result. You can often use sum() instead of count() because many sporadic metrics have the value of '1' when they're present; it's the accepted way of creating what is essentially a boolean 'I am here' metric such as ALERTS.

If you're filtering for a specific value or value range, you can still use sum() instead of count() by using bool on the comparison:

sum( node_load1 > bool 10 or vector(0) )

If you're counting a value within a range, be careful where you put the bool; it needs to go on the last comparison. Eg:

sum( node_load1 > 5 < bool 10 or vector(0) )

If you have to use count() for more complicated reasons, the obvious approach is to subtract 1 from the result.
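
That is, something like:

count( ALERTS{alertstate="firing"} or vector(0) ) - 1

(The subtraction compensates for the extra element that vector(0) contributes.)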

Unfortunately this approach starts breaking down rapidly when you want to do something more complicated. It's possible to compute a bare average over time using a subquery:

avg_over_time( (sum( ALERTS{alertstate="firing"} or vector(0) ))[6h:] )

(Averages over time of metrics that are 0 or 1, like up, are the classical way of figuring out things like 'what percentage of the time is my service down'.)

However, I don't know how to do this if you want something like an average over time by alert name or by hostname. In both cases, alerts that were present only some of the time can't be filled in with 'vector(0)' for the times they were missing, because the labels don't match (and can't be made to match). Nor do I know of a good way to get the divisor for a manual averaging. Perhaps you would want to use an otherwise unnecessary subquery so you can exactly control the step and thus the divisor. This would be something like:

sum_over_time( (sum( ALERTS{alertstate="firing"} ) by (alertname))[6h:1m] ) / (6*60)

Experimentation suggests that this provides plausible results, at least. Hopefully it's not too inefficient. In Grafana, you need to write the subquery as '[$__range:1m]' but the division as '($__range_s / 60)', because the Grafana template variable $__range includes the time units.

(See also Existential issues with metrics.)

sysadmin/PrometheusMissingMetricsPattern written at 00:39:58

2019-04-17

Private browsing mode versus a browser set to keep nothing on exit

These days, apparently a steadily increasing variety of websites are refusing to let you visit their site if you're in private browsing or incognito mode. These websites are advertising that their business model is invading your privacy (not that that's news), but what I find interesting is that these sites don't react when I visit them in a Firefox that has a custom history setting of 'clear history when Firefox closes'. As far as I can tell this still purges cookies and other website traces as effectively as private browsing mode does, and it has the side benefit for me that Firefox is willing to remember website logins.

(I discovered this difference between the two modes in the aftermath of moving away from Chrome.)

So, this is where I say that everyone should do this instead of using private browsing mode? No, not at all. To be bluntly honest, my solution is barely usable for me, never mind someone who isn't completely familiar with Firefox profiles and capable of wiring up a complex environment that makes it relatively easy to open a URL in a particular profile. Unfortunately Firefox profiles are not particularly usable, so much so that Firefox had to invent an entire additional concept (container tabs) in order to get a reasonably approachable version.

(Plus, of course, Private Browsing/Incognito is effectively a special purpose profile. It's so successful in large part because browsers have worked hard to make it extremely accessible.)

Firefox stores and tracks cookies (and presumably local storage) on a per-container basis, for obvious reasons, but apparently doesn't have per-container settings for how long they last or when they get purged. Your browsing history is global; history entries are not tagged with what container they're from. Mozilla's Firefox Multi-Account Containers addon looks like it makes containers more flexible and usable, but I don't think it changes how cookies work here, unfortunately; if you keep cookies in general, you keep them for all containers.

I don't think you can see what container a given cookie comes from through Firefox's normal Preferences stuff, but you can with addons like Cookie Quick Manager. Interestingly, it turns out that Cookie AutoDelete can be set to be container aware, with different rules for different containers. Although I haven't tried to do this, I suspect that you could set CAD so that your 'default' container (ie your normal Firefox session) kept cookies but you had another container that always threw them away, and then set Multi-Account Containers so that selected annoying websites always opened in that special 'CAD throws away all cookies' container.

(As covered in the Cookie AutoDelete wiki, CAD can't selectively remove Firefox localstorage for a site in only some containers; it's all or nothing. If you've set up a pseudo-private mode container for some websites, you probably don't care about this. It may even be a feature that any localstorage they snuck onto you in another container gets thrown away.)

web/PrivateBrowsingVsKeepNothing written at 00:46:39

2019-04-15

How Linux starts non-system software RAID arrays during boot under systemd

In theory, you do not need to care about how your Linux software RAID arrays get assembled and started during boot because it all just works. In practice, sometimes you do, and on a modern systemd-based Linux this seems to be an unusually tangled situation. So here is what I can determine so far about how it works for software RAID arrays that are assembled and started outside of the initramfs, after your system has mounted your real root filesystem and is running from it.

(How things work for starting software RAID arrays in the initramfs is quite varied between Linux distributions. There is some distribution variation even for post-initramfs booting, but these days the master version of mdadm ships canonical udev and systemd scripts, services, and so on and I think most distributions use them almost unchanged.)

As has been the case for some time, the basic work is done through udev rules. On a typical Linux system, the main udev rule file for assembly will be called something like 64-md-raid-assembly.rules and be basically the upstream mdadm version. Udev itself identifies block devices that are potentially Linux RAID members (probably mostly based on the presence of RAID superblocks), and mdadm's udev rules then run mdadm in a special incremental assembly mode on them. To quote the manpage:

This mode is designed to be used in conjunction with a device discovery system. As devices are found in a system, they can be passed to mdadm --incremental to be conditionally added to an appropriate array.

As array components become visible to udev and cause it to run mdadm --incremental on them, mdadm progressively adds them to the array. When the final device is added, mdadm will start the array. This makes the software RAID array and its contents visible to udev and to systemd, where it will be used to satisfy dependencies for things like /etc/fstab mounts and thus trigger them happening.
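
In effect, for each new component device the udev rules run something like this (the specific device name here is made up):

mdadm --incremental --export /dev/sdb1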

(There are additional mdadm udev rules for setting up device names, starting mdadm monitoring, and so on. And then there's a whole collection of general udev rules and other activities to do things like read the UUIDs of filesystems from new block devices.)

However, all of this only happens if all of the array component devices show up in udev (and show up fast enough); if only some of the devices show up, the software RAID will be partially assembled by mdadm --incremental but not started because it's not complete. To deal with this situation and eventually start software RAID arrays in degraded mode, mdadm's udev rules start a systemd timer unit when enough of the array is present to let it run degraded, specifically the templated timer unit mdadm-last-resort@.timer (so for md0 the specific unit is mdadm-last-resort@md0.timer). If the RAID array isn't assembled and the timer goes off, it triggers the corresponding templated systemd service unit, using mdadm-last-resort@.service, which runs 'mdadm --run' on your degraded array to start it.

(The timer unit is only started when mdadm's incremental assembly reports back that it's 'unsafe' to assemble the array, as opposed to impossible. Mdadm reports this only once there are enough component devices present to run the array in a degraded mode; how many devices are required (and what devices) depends on the specific RAID level. RAID-1 arrays, for example, only require one component device to be 'unsafe'.)

Because there's an obvious race potential here, the systemd timer and service both work hard to not act if the RAID array is actually present and already started. The timer conflicts with 'sys-devices-virtual-block-<array>.device', the systemd device unit representing the RAID array, and as an extra safety measure the service refuses to run if the RAID array appears to be present in /sys/devices. In addition, the udev rule that triggers systemd starting the timer unit will only act on software RAID devices that appear to belong to this system, either because they're listed in your mdadm.conf or because their home host is this host.

(This is the MD_FOREIGN match in the udev rules. The environment variables come from mdadm's --export option, which is used during udev incremental assembly. Mdadm's code for incremental assembly, which also generates these environment variables, is in Incremental.c. The important enough() function is in util.c.)
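
For concreteness, the important parts of the timer unit look roughly like this (paraphrased from the upstream mdadm source, so don't take the details as exact):

[Unit]
Description=Timer to wait for more drives before activating degraded array.
DefaultDependencies=no
Conflicts=sys-devices-virtual-block-%i.device

[Timer]
OnActiveSec=30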

As far as I know, none of this is documented or official; it's just how mdadm, udev, and systemd all behave and interact at the moment. However this appears to be pretty stable and long standing, so it's probably going to keep being the case in the future.

PS: As far as I can tell, all of this means that there are no real user-accessible controls for whether or not degraded software RAID arrays are started on boot. If you want to specifically block degraded starts of some RAID arrays, it might work to 'systemctl mask' either or both of the last-resort timer and service unit for the array. If you want to always start degraded arrays, well, the good news is that that's supposed to happen automatically.
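
For an array md0, that would be something like (untested):

systemctl mask mdadm-last-resort@md0.timer
systemctl mask mdadm-last-resort@md0.service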

linux/SoftwareRaidAssemblySystemd written at 22:37:24

2019-04-14

A VPN for me but not you: a surprise when tethering to my phone

My phone supports VPNs, of course, and I have it set up to talk to our work VPN. This is convenient for reasons beyond mere privacy when I'm using it over networks I don't entirely trust; there are various systems at work that can only be reached from 'inside' machines (including the VPN server), or which are easier to use that way.

My phone also supports tethering other devices to it to give them Internet access through the phone's connection (whatever that is at the time). This is built into iOS as a standard function, not supplied through a provider addition or feature (as far as I know, Apple doesn't allow cellular providers any control over whether tethering can be used), and is something that I wind up using periodically.

As I found out the first time I tried to do both at once, my phone has what I consider an oddity: only the phone's traffic uses the (phone) VPN, not the traffic from any tethered devices. The VPN is for the phone only, not for any attached devices; they're on their own, which is sometimes inconvenient for me. It would be a fair bit easier if any random machine I tethered to the phone could take advantage of the phone's VPN and didn't have to set up a VPN configuration itself.

(In fact we've had problems on our VPN servers in the past when there were multiple VPN connections from the same public IP, which is what I'd get if I had both the phone and a tethered machine using the VPN at the same time. I think those aren't there any more, although I'm not sure.)

As far as I know, there is no technical requirement that forces this; in general you certainly could route NAT'd tethered traffic through the VPN connection too. If anything, my phone may have to go out of its way to route locally originated traffic in one way and tethered traffic in another way (although this depends on how NAT and VPNs interact in the iOS kernel). Doing things this way seems likely to be mostly or entirely a policy decision, especially by now (after so many years of iOS development, and a succession of people asking about this on the Internet, and so on).

(I don't currently have a position on whether it's a good or a bad policy decision, although I think it is a bit surprising. I certainly expected tethered traffic to be handled just the same way as local traffic from the phone itself.)

tech/IPhoneExclusiveVPN written at 20:59:30

2019-04-13

Remembering that Prometheus expressions act as filters

In conventional languages, comparisons like '>' and other boolean operations like 'and' give you implicit or explicit boolean results. Sometimes this is a pseudo-boolean result; in Python, if you say 'A and B', you famously get either the value of A (if it's falsy) or the value of B as the end result, instead of a plain True or False. However, PromQL doesn't work this way. As I keep having to remember over and over, in Prometheus, comparisons and other boolean operators are filters.

In PromQL, when you write 'some_metric > 10', what happens is that first Prometheus generates a full instant vector for some_metric, with all of the metric points and their labels and their values, and then it filters out any metric point in the instant vector where the value isn't larger than 10. What you have left is a smaller instant vector, but all of the values of the metric points in it are their original ones.

The same thing happens with 'and'. When you write 'some_metric and other_metric', the other_metric is used only as a filter; metric points from some_metric are only included in the result set if there is the same set of labels in the other_metric instant vector. This means that the values of other_metric are irrelevant and do not propagate into the result.

The large scale effect of this is that the values that tend to propagate through your rule expression are whatever started out as the first metric you looked at (or whatever arithmetic you perform on them). Sometimes, especially in alert rules, this can bias you toward putting one condition in front of the other. For instance, suppose that you want to trigger an alert when the one-minute load average is above 20 and the five-minute load average is above 5, and you write the alert rule as:

expr: (node_load5 > 5) and (node_load1 > 20)

The value available in the alert rule and your alert messages is the value of node_load5, not node_load1, because node_load5 is what you started out the rule with. If you find the value of node_load1 more useful in your alert messages, you'll want to flip the order of these two clauses around.
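
That is:

expr: (node_load1 > 20) and (node_load5 > 5)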

As the PromQL documentation covers, you can turn comparison operations from filters into pseudo-booleans by using 'bool', as in 'some_metric > bool 10'. As far as I know, there is no way to do this with 'and', which always functions as a filter, although you can at least select what labels have to match (or what labels to ignore).

PS: For some reason I keep forgetting that 'and', 'or', and 'unless' can use 'on' and 'ignoring' to select what labels you care about. What you can't do with them, though, is propagate some labels from the right side into the result; if you need that, you have to use 'group_left' or 'group_right' and figure out how to re-frame your operation so that it involves a comparison, since 'and' and company don't work with grouping.
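
For example, the classic trick for pulling a label such as nodename onto another metric is something like:

node_load1 * on(instance) group_left(nodename) node_uname_info

This works because node_uname_info is an 'I am here' style metric with a constant value of 1, so the multiplication leaves node_load1's values unchanged while group_left copies the extra label across.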

(I was going to confidently write an entry echoing something that I said on the Prometheus users mailing list recently, but when I checked the documentation and performed some tests, it turned out I was wrong about an important aspect of it. So this entry is rather smaller in scope, and is written mostly to get this straight in my head, since I keep forgetting the details of it.)

sysadmin/PrometheusExpressionsFilter written at 23:59:31

WireGuard was pleasantly easy to get working behind a NAT (or several)

Normally, my home machine is directly connected to the public Internet by its DSL connection. However, every so often this DSL connection falls over, and these days my backup method of Internet connectivity is that I tether my home machine through my phone. This tethering gives me an indirect Internet connection; my desktop is on a little private network provided by my phone and then my phone NAT's my outgoing traffic. Probably my cellular provider adds another level of NAT as well, and certainly the public IP address that all of my traffic appears from can hop around between random IPs and random networks.

Most of the time this works well enough for basic web browsing and even SSH sessions, but it has two problems when I'm connecting to things at work. The first is that my public IP address can change even while I have an SSH connection present (but perhaps not active enough), which naturally breaks the SSH connection. The second is that I only have 'outside' access to our servers; I can only SSH to or otherwise access machines that are accessible from the Internet, which excludes most of the interesting and important ones.

Up until recently I've just lived with this, because the whole issue just doesn't come up often enough to get me to do anything about it. Then this morning my home DSL connection died at a fairly inopportune time, when I was scheduled to do something from home that involved both access to internal machines and things that very much shouldn't risk having my SSH sessions cut off in mid-flight (and that I couldn't feasibly do from within a screen session, because it involved multiple windows). I emailed a co-worker to have them take over, which they fortunately were able to do, and then I decided to spend a little time to see if I could get my normal WireGuard tunnel up and running over my tethered and NAT'd phone connection, instead of its usual DSL setup. If I could bring up my WireGuard tunnel, I'd have both a stable IP for SSH sessions and access to our internal systems even when I had to use my fallback Internet option.

(I won't necessarily have uninterrupted SSH sessions, because if my phone changes public IPs there will be a pause while WireGuard re-connects and so on. But at least I'll have the chance to have sessions continue afterward, instead of them being intrinsically broken.)

Well, the good news is that my WireGuard setup basically just worked as-is when I brought it up behind however many layers of NAT'ing are going on. The actual WireGuard configuration needed no changes and I only had to do some minor tinkering with my setup for policy-based routing (and one of the issues was my own fault). It was sufficiently easy that now I feel a bit silly for having not tried it before now.

(Things would not have been so easy if I'd decided to restrict what IP addresses could talk to WireGuard on my work machine, as I once considered doing.)

This is of course how WireGuard is supposed to work. Provided that you can pass its UDP traffic in both directions (which fortunately seems to work through the NAT'ing involved in my case), WireGuard doesn't care where your traffic comes from as long as it has the right keys, and your server will automatically update its idea of what (external) IP your client has right now when it gets new traffic, which makes everything work out.

(WireGuard is actually symmetric; either end will update its idea of the other end's IP when it gets appropriate traffic. It's just that under most circumstances your server end rarely changes its outgoing IP.)
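
A minimal sketch of what the client end's configuration looks like in this sort of setup (every name, key, and IP here is made up):

[Interface]
PrivateKey = <client private key>
Address = 192.0.2.2/32

[Peer]
PublicKey = <server public key>
Endpoint = wg.example.org:51820
AllowedIPs = 192.0.2.1/32, 10.0.0.0/8
# Periodic keepalives help hold the NAT mappings open from the inside.
PersistentKeepalive = 25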

I knew that in theory all of this should work, but it's still nice to have it actually work out in practice, especially in a situation with at least one level of NAT going on. I'm actually a little bit amazed that it does work through all of the NAT magic going on, especially since WireGuard is just UDP packets flying back and forth instead of a TCP connection (which any NAT had better be able to handle).

On a side note, although I did everything by hand this morning, in theory I could automate all of this through dhclient hook scripts, which I'm already using to manage my resolv.conf (as covered in this entry). Of course this brings up a little issue, because if the WireGuard tunnel is up and working I actually want to use my regular resolv.conf instead of the one I switch to when I'm tethering (without WireGuard). Probably I'm going to defer all of this until the next time my DSL connection goes down.
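
(A sketch of what such a hook might look like, with everything about it hypothetical, including the use of wg-quick:

# /etc/dhcp/dhclient-exit-hooks.d/wireguard
case "$reason" in
    BOUND|RENEW|REBIND)
        # (re)establish the WireGuard tunnel over the new connection
        wg-quick up wg0 2>/dev/null || true
        ;;
esac

dhclient-script sources these hook files with $reason set to what just happened, so this only acts when we get or renew a lease.)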

linux/WireGuardBehindNAT written at 00:16:23

2019-04-12

Getting (and capturing) spam can sometimes be useful to see what's in it

We have what is now a long-standing system for logging email attachment type information (everyone should have one). For more than a year we've been receiving .iso attachments that cause our program to log cryptic reports claiming that we sniffed them as tar archives that were oddly empty:

attachment application/x-iso9660-image; MIME file ext: .iso; tar no files?!

(This one is unusual in that it had a correct MIME type. The more common MIME type these come with is application/octet-stream.)

Our commercial anti-spam system (Sophos PureMessage) consistently identifies these as CXmail/IsoDl-A.

I've been vaguely wanting to figure out why these messages cause our program to do this and what was actually in these file attachments for some time, but I've been hampered by the fact that I didn't actually have an example file. Our email system consistently rejects these for being malware (and anyway they weren't sent to me), and for various reasons we don't try to have our attachment type logging system save copies of things under any circumstances. I added some extra logging to the system, but it didn't produce anything.

(In some environments, an attachment logging and filtering system would be critical enough that you should be able to capture copies of things that either cause it problems or that seem questionable. In our environment, it's not and making it capture things would raise both operational issues (like managing what it captures and not running out of disk space) and policy ones (around privacy and so on).)

However, I also run a sinkhole SMTP server on another machine. Recently it got a boring spam message that I almost ignored, except that I noticed it had a suspicious attachment whose MIME type information claimed it was an ISO file (although it had a .img extension). Out of a spirit of curiosity, I extracted the attachment and poked around in it, discovering that it really was an ISO image (well, a UDF filesystem) and contained a single .EXE. Out of more curiosity, I fed it to our attachment logger program to see if it would reproduce the 'tar no files?!' issue. Lo and behold, it did. Now armed with a reproduction case that I could poke around in, I was soon able to narrow this down to a long-standing issue in the Python tarfile module.
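
I won't swear that this is exactly the issue, but here is a hypothetical sketch of how tarfile's ignore_zeros option can produce a 'no files' result for something that isn't a tar archive at all:

import io
import tarfile

# 64 'blocks' of data that is in no way a tar archive.
blob = io.BytesIO(b"X" * 512 * 64)

# With ignore_zeros=True, tarfile skips blocks it can't parse instead
# of giving up, so a non-tar file opens 'successfully' and then
# appears to contain no members at all.
tf = tarfile.open(fileobj=blob, mode="r:", ignore_zeros=True)
print(tf.getmembers())   # prints [] -- 'tar no files?!'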

So, every so often it's useful to get (and capture) spam. Provided that it's interesting and useful spam, at least.

spam/SpamCapturingCanBeUseful written at 00:20:38
