2023-09-22
Changing GNU Emacs Lisp functions through advice-add, not brute force
It's a tradition with me that sooner or later, I hit a GNU Emacs
function that doesn't work the way I want it to and has no applicable
customization options. My traditional brute force approach to dealing
with these functions has been to redefine them; I'd copy their code
to my .emacs or some personal .el file, modify or replace it to
taste, and then insure that my definition got used instead of the
standard one. If what I really cared about was a keybinding, sometimes
I could give my version a new name (and bind the key I cared about
to it). Recently, however, Ben Zanin pointed me at advice-add
and, after a
while, I was able to work out how to do some changes with it in a
nicer way than my previous brute force approach. So here are some
notes.
The simplest situation for advice-add is when you want to completely replace a function with a different implementation, which is easily done with ':override':
(advice-add 'mh-forwarded-letter-subject :override
            (lambda (_ignored subject)
              (concat "Fwd: " subject)))
This redefines the function that generates the Subject: header for email forwarded in MH-E so that it completely ignores the author of the mail being forwarded. (This turns out to be possible through standard customizations, because a format string can use (only) specific arguments, unlike C printf formatting; I didn't know that at the time I hacked this up.)
Sometimes I want a function to not do something, for example to never use NMH's anno to modify my existing email messages. MH-E doesn't directly expose this as a customization, but it does have a specific function to execute (N)MH commands, and we can just not run that function if we're trying to execute anno:
(advice-add 'mh-exec-cmd :before-until
            (lambda (cmd &rest args)
              (string-equal cmd "anno")))
The 'Ways to compose advice' documentation section is a little bit confusing, but my rule of thumb is that ':before-until' is for situations where I don't want the function to run if something is true, while ':before-while' is for situations where I only want the function to run if something is true.
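For contrast, here is a hypothetical ':before-while' guard (my own sketch; 'my-frob-buffer' is a made-up function). With ':before-while', the advised function only runs when the advice returns non-nil:

(advice-add 'my-frob-buffer :before-while
            (lambda (&rest _args)
              ;; only frob buffers that are visiting a file
              (buffer-file-name)))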
The brute force version of this is to use ':around', which lets you wrap around the original function in a single function of your own:
(defun no-annotate (oldfun cmd &rest args)
  (if (not (string-equal cmd "anno"))
      (apply oldfun cmd args)))
(advice-add 'mh-exec-cmd :around 'no-annotate)
Another thing you might want to do is modify the argument list of a function, for example to change how MH-E handles email replies so that you always automatically get cc'd on them. This is done with ':filter-args' advice, but this advice is slightly tricky because as far as I can see you don't get passed the normal arguments; instead you get them all as a single list. So you have to write your modification like this:
(defun repl-cc-me (args)
  (if (string-equal (car args) "repl")
      (append args '("-cc" "me"))
    args))
(advice-add 'mh-exec-cmd :filter-args 'repl-cc-me)
Although it may be an obvious thing to say, if you're going to filter a function's arguments you're going to have to know exactly what arguments it's passed, and there may be surprises. For example, mh-exec-cmd turns out not to always be passed a list of just strings; sometimes there may be numbers or sublists mixed in.
You can advise the same function (such as mh-exec-cmd) multiple times to do different things. Here I've advised it twice, once to ignore attempts to run "anno" and once to modify what "repl" gets handed as arguments. Obviously you need your pieces of advice not to clash (as is the case here); otherwise you may need to combine them into a single piece of advice that sorts things out properly.
As both of these examples illustrate, you may need to advice-add some function other than your initial target. My initial targets were the behavior of 'mh-forward' and 'mh-reply', but in both cases the behavior is in the middle of the function, so unless I want to re-define them I have to hook into something else. Finding where to hook in requires reading the code and following the control flow. If you need to alter one function's use of another function that is used generally (such as 'mh-exec-cmd', which is used all over MH-E), you generally have to hope that it's called with some argument you can detect that's specific to your function.
(In fact 'mh-forward's annotation behavior is a couple of layers of functions down, but once I found the root cause I decided I didn't want any annotation to be happening from anywhere, not just when forwarding messages.)
One somewhat tricky option here that I didn't actually get working the one time I tried it is to create a dynamically scoped variable that you use to signal that you are in your function of interest:
(defvar mh-reply-marker nil "Are we in mh-reply?")
(defun mark-mh-reply (oldfun message &optional reply-to includep)
  (let ((mh-reply-marker t))
    ;; funcall, not apply, since the arguments aren't packed into a list
    (funcall oldfun message reply-to includep)))
(advice-add 'mh-reply :around 'mark-mh-reply)
Then your advising code for other functions can check mh-reply-marker to see if they should do things or just quietly stay out of the way. This is a real usage case for ':around', because we need to wrap the original function in that 'let'.
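As a sketch of what that looks like (my assumption, not code from MH-E), the earlier repl-cc-me filter could be made to fire only inside mh-reply by checking the marker:

(defun repl-cc-me-when-replying (args)
  ;; only add '-cc me' when we're somewhere inside mh-reply
  (if (and mh-reply-marker (string-equal (car args) "repl"))
      (append args '("-cc" "me"))
    args))
(advice-add 'mh-exec-cmd :filter-args 'repl-cc-me-when-replying)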
The other thing that I haven't gotten fully and completely working is advising interactive functions. In theory this is supposed to work, but in practice at various times either invoking the function itself has failed or other functions that do '(call-interactively 'mh-reply)' have failed with mysterious errors. My conclusion is that this is a level of advising that is currently beyond my ken, and if I need to dabble in waters this deep, I'm probably back to re-defining functions (it may be brute force but it works).
Sidebar: 'let' and dynamically scoped variables
If you're new to Lisp and dynamic scoping, it may be a little bit surprising that your in-function 'let' of 'mh-reply-marker' persists even into functions that you call, but it does, and this feature is actually used all over GNU Emacs in various ways. One example is temporarily forcing the value of a customizable setting in order to do something. MH-E can be set to prefer plain text over HTML for email with both, and it has a function 'mh-show-preferred-alternative' to override that temporarily; this function works by nulling out the customization you set and re-displaying the message. You can write an inverse of it with the same idea:
(defun mh-show-plaintext ()
  "Show text/plain instead of text/html for a message."
  (interactive)
  (let ((mm-discouraged-alternatives '("text/html")))
    (mh-show nil t)))
(I have other personal functions that work this way.)
2023-09-21
HTTP Basic Authentication and your URL hierarchy
We're big fans of Apache's implementation of HTTP Basic Authentication, but we recently discovered that there are some subtle implications of how Basic Authentication can interact with your URL hierarchy within a web application. This is because of when HTTP Basic Authentication is and isn't sent to you, and specifically that browsers don't preemptively send the Authorization header when they are moving up your URL hierarchy (well, for the first time). That sounds abstract, so let's give a more concrete example.
Let's suppose you have a web application that allows authenticated people to interact with it at both a high level, with a URL of '/manage/', and at the level of dealing with a specific item, with a URL of, say, '/manage/frobify/{item}'. You would like a person to frobify some item, so you (automatically) send them email saying 'please visit <url> to frobify {item}'. They visit that URL while not yet authenticated, which causes the web server to return a HTTP 401 and gets their browser to ask them for their login and password on your site (for a specific 'realm', effectively a text label). Their re-request with an Authorization header succeeds, and the person delightedly frobifies their item. At the end of this process, your web application redirects them to '/manage/'. Because this URL is above the URL the person has been dealing with, their browser will not preemptively send the Authorization header, and your web server will once again respond with a HTTP 401.
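Schematically the whole exchange looks something like this (a sketch with almost all headers left out; the realm and item names are invented):

GET /manage/frobify/item1                  -> 401, WWW-Authenticate: Basic realm="manage"
GET /manage/frobify/item1 (Authorization)  -> 200, the frobify page
POST /manage/frobify/item1 (Authorization) -> 302, Location: /manage/
GET /manage/ (no Authorization sent)       -> 401, WWW-Authenticate again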
Because this is all part of the same web application, your HTTP Basic Authentication will use the same realm setting for both URLs in your web server, and thus in your WWW-Authenticate header. In theory the browser can see that it already knows an authentication for this realm and automatically retry with the Authorization header. In practice a browser may not always do this in all circumstances, and may instead stop to ask the person for their login and password again. With this URL design you're at the mercy of the browser to do what you want.
(This can be confusing to the person, especially if (from their perspective) they just pressed a form button that said 'yes, really frobify {item}' and now they're getting challenged again. They may well think that their action failed; after all, successful actions don't usually cause you to get re-challenged for authentication.)
Unfortunately it's hard to see how to get out of this while still having a sensible URL hierarchy, short of never sending people direct links for actions and always having them enter at the top level of your application. One not entirely great option is that when people frobify their items, they are never automatically redirected up; instead they just wind up back on '/manage/frobify/{item}' except that now the page says 'congratulations, you have frobified this item, click here to go to the top level'. This is slightly less convenient (well, if people actually want to go to your '/manage/' page) but won't leave people in doubt about whether or not they really did successfully frobify their item.
When you look at your logs, this behavior may be surprising to you if you've forgotten the complexities of when browsers preemptively send HTTP Basic Authentication information. HTTP Basic Authentication doesn't work like a regular cookie, where you can set it once and then assume it will always come back, which is the model of authentication we're generally most familiar with.
2023-09-20
Restarting nfs-server on a Linux NFS (v3) server isn't transparent
A while back I wrote an article on enabling NFS v4 on an Ubuntu 22.04 fileserver (instead of just NFS v3), where one of the final steps was to restart 'nfsd', the NFS server daemon (sort of), with 'systemctl restart nfs-server'. In that article I said that as far as I could tell this entire process was transparent to NFS v3 clients that were talking to the NFS server. Unfortunately I have to take that back. Restarting 'nfs-server' will cause the NFS server to discard locks obtained by NFS v3 clients, without telling the NFS v3 clients anything about this. This results in the NFS v3 clients thinking that they hold locks while the NFS server believes that everything is unlocked and so will allow another client to lock it.
(What happens with NFS v4 clients is more uncertain to me; they may more or less ride through things.)
On Linux, the NFS server is in the kernel and runs as kernel processes, generally visible in process lists as '[nfsd]'. You might wonder how these processes are started and stopped, and the answer is through a little user-level shim, rpc.nfsd. What this program actually does is write to some files in /proc/fs/nfsd that control the portlist, the NFS versions offered, and the number of kernel nfsd threads that are running. To restart (kernel) NFS service, the nfs-server.service unit first stops it with 'rpc.nfsd 0', telling the kernel to run '0' nfsd threads, and then starts it again by writing some appropriate number of threads into place, which starts NFS service. The nfs-server.service systemd unit also does some other things.
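In other words, the restart boils down to something like the following (a sketch; the real nfs-server.service does more than this, and the thread count here is made up):

# stop NFS service by telling the kernel to run zero nfsd threads
rpc.nfsd 0
# start it again with some number of threads
rpc.nfsd 8
# or equivalently, write to the control file directly
echo 0 >/proc/fs/nfsd/threads
echo 8 >/proc/fs/nfsd/threads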
(As a side note, you can see what NFS versions your NFS server is currently supporting by looking at /proc/fs/nfsd/versions. Sadly this can't be changed while there are NFS server threads running.)
If you restart the kernel NFS server either with 'systemctl restart nfs-server' or by hand by writing '0' and then some number to /proc/fs/nfsd/threads, the kernel will completely drop knowledge of all locks from NFS v3 clients. Unfortunately running 'sm-notify' doesn't seem to recover them; they're just gone.
Locks from NFS v4 clients suffer a somewhat less predictable and certain fate. If the NFS v4 client is actively doing NFS operations to the server, its locks will generally be preserved over a 'systemctl restart nfs-server'. If the client isn't actively doing NFS operations and doesn't do any for a while, I'm not certain that its locks will be preserved, and certainly they aren't immediately there (they seem to only come back when the NFS v4 client re-attaches to the server).
Looked at from the right angle, this makes sense. The kernel has to release locks from NFS clients when it stops being an NFS server, and a sensible signal that it's no longer an NFS server is when it's told to run zero NFS threads. However, it does seem to lead to an unfortunate result for at least NFS v3 clients.
2023-09-19
How Unix shells used to be used as an access control mechanism
Once upon a time, one of the ways that system administrators controlled who could log in to what server was by assigning special administrative shells to logins, either on a particular system or across your entire server fleet. Today, special shells (mostly) aren't an effective mechanism for this any more, so modern Unix people may not have much exposure to this idea. However, vestiges of this live on in typical Unix configurations, in the form of /sbin/nologin (sometimes in /usr/sbin) and how many system accounts have this set as their shell in /etc/passwd.
The normal thing for /sbin/nologin to do when run is to print something like 'This account is currently not available.' and exit with status 1 (in a surprising bit of cross-Unix agreement, the Linux, FreeBSD, and OpenBSD versions of nologin all appear to print exactly the same message). By making this the shell of some account, anything that executes the account's shell as part of accessing it will fail, so login (locally or over SSH) and a normal su will both fail. Typical versions of su usually have special features to keep you from overriding this by supplying your own shell (often involving /etc/shells, or deliberately running the /etc/passwd shell for the user). Otherwise, there is nothing that prevents processes from running under the login's UID, and in fact it's extremely common for such system accounts to be running various processes.
Unix system administrators have long used this basic idea for their own purposes, creating their own fleet of administrative shells to, for example, tell you that a machine was only accessible by staff. You would then arrange for all non-staff logins on the machine to have that shell as their login shell (there might be such logins if, for example, the machine is a NFS fileserver). Taking the idea one step further, you might suspend accounts before deleting them by changing the account's shell to an administrative shell that printed out 'your account is suspended and will be deleted soon, contact <X> if you think this is a terrible mistake' and then exited. In an era when everyone accessed your services by logging in to your machines through SSH (or earlier, rlogin and telnet), this was an effective way of getting someone's attention and a reasonably effective way of denying them access (although even back then, the details could be complex).
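Such an administrative shell can be a tiny script; a minimal sketch (reusing the message wording from the hypothetical suspension notice above) is just:

#!/bin/sh
echo 'Your account is suspended and will be deleted soon;'
echo 'contact <X> if you think this is a terrible mistake.'
exit 1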
(Our process for disabling accounts gives such accounts a special shell, but it's mostly as a marker for us for reasons covered in that entry.)
You could also use administrative shells to enforce special actions when people logged in. For example, newly created logins might be given a special shell that would make them agree to your usage policies, force them to change their password, and then through magic change their shell to a regular shell. Some of this could be done through existing system features (sometimes there was a way to mark a passwd entry so that it forced an immediate password change), but generally not all of it. Again, this worked well when you could count on people starting using your systems by logging in at the Unix level (which generally is no longer true).
Sensible system administrators didn't try to use administrative shells to restrict what people could do on a machine, because historically such 'restricted shells' had not been very successful at being restrictive. Either you let someone have access or you didn't, and any 'restriction' was generally temporary (such as forcing people to do one time actions on their first login). Used this way, administrative shells worked well enough that many old Unix environments accumulated a bunch of local ones, customized to print various different messages for various purposes.
PS: One trick you could do with some sorts of administrative shells was make them trigger alarms when run. If some people were not really supposed to even try to log in to some machine, you might want to know if someone tried. One reason this is potentially an interesting signal is that anyone who gets as far as running a login shell definitely knows the account's password (or otherwise can pass your local Unix authentication).
(These days I believe this would be considered a form of 'canary token'.)
2023-09-18
Making a function that defines functions in GNU Emacs ELisp
Suppose that for some reason you're trying to create a number of functions that follow a fixed template; for example, they should all be called 'mh-visit-<name>' and should all use mh-visit-folder to visit the (N)MH folder '+inbox/<name>'. In my last installment I did this with an Emacs Lisp macro, but it turns out there are reasons to prefer a function over a macro. For example, you apparently can't use a macro in dolist, which means you have to write out all of the macro invocations for all of the functions you want to create by hand, instead of having a list of all of the names and dolist'ing over it.
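That is, with a function version (the make-visit-func defined below) you can write something like this, where the folder names are invented for illustration:

(dolist (fname '("fred" "barney" "wilma"))
  (make-visit-func fname))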
There are two ways to write this function to create functions, a simpler version that doesn't necessarily work and a more complicated version that always does (as far as I know). I'll start with the simple version and describe the problem:
(defun make-visit-func (fname)
  (let ((folder-name (concat "+inbox/" fname))
        (func (concat "mh-visit-" fname)))
    (defalias (intern func)
      (lambda () (interactive) (mh-visit-folder folder-name))
      (format "Visit MH folder %s." folder-name))))
If you try this in an Emacs *scratch* buffer, it may well work. If you put this into a .el file (one that has no special adornment) and use it to create a bunch of functions in that file, then try to use one of them, Emacs will tell you 'mh-visit-<name>: Symbol’s value as variable is void: folder-name'. This is because folder-name is dynamically scoped, and so is not captured by the lambda we've created here; the 'folder-name' in the lambda is just a free-floating variable. As far as I know, there is no way to create a lexically bound variable and a closure without making all elisp code in the file use lexical binding instead of dynamic scoping.
(As of Emacs 29.1, .el files that aren't specifically annotated on their first lines still use dynamic scoping, so your personal .el files are probably set up this way. If you innocently create a .el file and start pushing code from your .emacs into it, dynamic scoping is what you get.)
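For reference, turning on lexical binding is done with a file-local variable on the very first line of the .el file, something like this (the file name and description here are made up):

;;; myhacks.el --- my personal hacks  -*- lexical-binding: t; -*-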
Fortunately we can use a giant hammer, basically imitating the structure of our macro version and directly calling apply. This version looks like this:
(defun make-visit-func (fname)
  (let ((folder-name (concat "+inbox/" fname))
        (func (concat "mh-visit-" fname)))
    (apply `(defalias ,(intern func)
              (lambda ()
                ,(format "Visit MH folder %s." folder-name)
                (interactive)
                (mh-visit-folder ,folder-name))))))
(This time I've attached the docstring to the lambda, not the alias, which is really the right thing but which seems to be hard in the other version.)
As I understand it, we are effectively doing what a macro would do; we are creating the S-expression version of the function we want, with our let-created variables being directly spliced in by value, not by their (dynamically bound) names.
PS: I probably should switch my .el files over to lexical binding and fix anything that breaks, especially since I only have one right now. But the whole thing irritates me a bit. And I think the apply-based version is still a tiny bit more efficient and better, since it directly substitutes in the values (and puts the docstring on the lambda).
2023-09-17
Unix shells are generally not viable access control mechanisms any more
Once upon a time, if you had a collection of Unix systems, you could reasonably do a certain amount of access control to your overall environment by forcing logins to have specific administrative shells. As a bonus, these administrative shells could print helpful messages about why the particular login wasn't being allowed to use your system. This is a quite attractive bundle of features, but unfortunately this no longer works in a (modern) Unix environment with logins (such as we have). There are two core problems.
First, you almost certainly operate a variety of services that normally only use Unix logins as a source of (password) authentication and perhaps a UID to operate as, and ignore the login's shell. This is the common pattern of Samba, IMAP servers, Apache HTTP Basic Authentication, and so on. In some cases you may be able to teach these services to look at the login's shell and do special things, but some of them are sealed black boxes and even the ones that can be changed require you to go out of your way. If you forget one, it fails open (allowing access to people with an administrative shell that should lock them out).
(One of these services is SSH itself, since you can generally initiate SSH sessions and ask for port forwarding or other features that don't cause SSH to run the login shell.)
Second, you may operate general authentication services, such as LDAP or a Single Sign On system, and if you do these authentication services are generally blind to what they're being used for and thus to whether or not a login with a special shell should be allowed to pass this particular authentication. The only real solution is to have multiple versions of these authentication systems with different logins in them, and point systems at different ones based on exactly who should be allowed to use them.
A similar issue happens with Apache HTTP Basic Authentication in common configurations, where you have a single authentication realm with a single Apache htpasswd file that covers an assortment of different services. If you need certain logins ('locked' logins or the like) to be excluded from some of these services but not others, either you need multiple htpasswd files (at least) or you need to teach each such service to do additional checks.
(In general you're going to have to try to carefully review who should be able to use which of your services when, and the resulting matrix is often surprisingly complicated and tangled. Life gets more complicated if you're using administrative shells for reasons other than just locking people out with a message, for example to try to force an initial password change.)
Today, the only two measures of login access control that really work in a general environment are either scrambling the login's password (and disabling any SSH authorized keys) or excluding the login entirely from your various authentication data sources (your LDAP servers, your Apache htpasswd files, and so on). It's a pity that changing people's shells is no longer enough (it was both easy and convenient), but that's how the environment has evolved.
2023-09-16
Apache's HTTP Basic Authentication could do with more logging
Suppose, not entirely hypothetically, that you use Apache and have an area of your website protected with Apache's HTTP Basic Authentication. A user comes to you with a problem report; while interacting with this area of the site, they unexpectedly got re-challenged for authentication. In fact, in your Apache logs you can see that they made an authenticated request that returned a HTTP redirect and literally moments later their browser's GET of the redirection target was met with a HTTP 401 response, indicating that Apache didn't think they were authenticated or maybe authorized. Unfortunately, our options for understanding exactly what happened are limited, because Apache doesn't really do logging about the Basic Authentication process.
There is one useful (or even critical) piece of information that Apache does log in the standard log format, and that is whether or not the HTTP 401 was because of a lack of authorization. Both failed authentication and failed authorization normally get HTTP 401 responses (although you can change the latter with AuthzSendForbiddenOnFailure, and perhaps should), but they appear differently in the normal access log. If there was a successful authentication but the user was not authorized, you will see their name in the log file:
192.168.1.1 - theuser [...] "GET /restricted HTTP/1.1" 401 ....
If they are not authenticated (for whatever reason), then there will be no user name logged; the leading bit will just be '192.168.1.1 - -'.
However, there are at least five reasons why this request was not authenticated (in Apache's view) and you can't tell them apart. The five reasons are the browser didn't send an Authorization header, the header or a part of it was malformed, the authorization source (such as your htpasswd file) was missing or unreadable, the header contained a user name that Apache didn't find or recognize, or the password in the header was incorrect. It would be nice to know which one of these had happened, because they lead to quite different causes and fixes.
(Apache may log errors if the authorization source is missing or unreadable; I haven't tested. That still leaves the other cases.)
For example, if your logs say that the browser didn't send the header at all, that is probably not a problem on your side. The rules for when browsers decide to send this header are a bit complex and potentially surprising, though, because Basic Authentication doesn't work like cookies, where a browser will always send them once set. And browsers make their own decisions about how to react to HTTP 401 responses on requests where they didn't send Authorization headers, so they may decide to re-ask the person for a name and password even though they have Basic Authentication credentials they could try.
(Having discovered AuthzSendForbiddenOnFailure, I am probably going to set it on several limited-access areas in our Apache configuration, because it's rather more user friendly. It's not an information disclosure for us because there are authenticated but otherwise unrestricted areas on the web server with the same credentials, so an attacker can already validate guessed passwords.)
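Setting it is a one-line addition to the relevant authentication context; here is a sketch (the location, realm, file path, and user names are all invented):

<Location "/restricted">
    AuthType Basic
    AuthName "Our restricted area"
    AuthBasicProvider file
    AuthUserFile "/etc/apache2/htpasswd"
    Require user alice bob
    # send 403 instead of 401 when authentication succeeded
    # but the user isn't authorized
    AuthzSendForbiddenOnFailure On
</Location>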
2023-09-15
Insuring that my URL server and client programs exit after problems
I recently wrote about my new simple system to open URLs on my desktop from remote machines, where a Python client (on the remote server) listens on a Unix domain socket for URLs that programs (like mail clients) want opened, and reports these URLs to the server on my desktop, which passes them to my browser. The server and client communicate over SSH; the server starts by SSH'ing to the remote machine and running the client. On my desktop, I run the server in a terminal window, because that's the easy approach.
Whenever I have a pair of communicating programs like this, one of my concerns is making sure that each end notices when the other goes away or the communication channel breaks, and cleans itself up. If the SSH connection is broken or the remote client exits for some reason, I don't want the server to hang around looking like it's still alive and functioning; similarly, if the server exits or the SSH connection is broken, I want the remote client to exit immediately, rather than hang around claiming to other parties that it can accept URLs and pass them to my desktop to be opened in a browser.
On the server this is relatively simple. I started with my standard stanza for Python programs that I want to die when there are problems:
signal.signal(signal.SIGINT, signal.SIG_DFL)
signal.signal(signal.SIGPIPE, signal.SIG_DFL)
signal.signal(signal.SIGHUP, signal.SIG_DFL)
If I was being serious I should check what SIGINT was initially set to, but this is a casual program, so I'll never run it with SIGINT deliberately masked. Setting SIGHUP isn't strictly necessary today, but I didn't remember that until I checked, and in any case Python could change its default handling someday.
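The more careful version would look something like this sketch, which only resets SIGINT if it hasn't been deliberately ignored:

# only reset SIGINT if it isn't already being deliberately ignored
if signal.getsignal(signal.SIGINT) is not signal.SIG_IGN:
    signal.signal(signal.SIGINT, signal.SIG_DFL)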
Since all the server does is read from the SSH connection to the client, I can detect both client exit and SSH connection problems by looking for end of file, which is signalled by an empty read result:
def process(host: str) -> None:
    pd = remoteprocess(host)
    assert(pd.stdout)
    while True:
        # an empty read means EOF: the client exited or the connection broke
        line = pd.stdout.readline()
        if not line:
            break
        [...]
As far as I know, our SSH configurations use TCP keepalives, so if the connection between my machine and the server is broken, both ends will eventually notice.
Arranging for the remote client to exit at appropriate points is a bit harder and involves a hack. The client's sign that the server has gone away is that the SSH connection gets closed, and one sign of that is that the client's standard input gets closed. However, the client is normally parked in socket.accept() waiting for new connections over its Unix socket, not trying to read from the SSH connection. Rather than write more complicated Python code to try to listen for both a new socket connection and end of file on standard input (for example using select), I decided to use a second thread and brute force. The second thread tries to read from standard input and forces the entire program to exit if it sees end of file:
def reader() -> None:
    while True:
        try:
            s = sys.stdin.readline()
            if not s:
                os._exit(0)
        except EnvironmentError:
            os._exit(0)

[...]

def main() -> None:
    [the same signal setup as above]
    t = threading.Thread(target=reader, daemon=True)
    t.start()
    [rest of code]
In theory the server is not supposed to send anything to the client, but in practice I decided that I would rather have the client exit only on an explicit end of file indication. The use of os._exit() is a bit brute force, but at this point I want all of the client to exit immediately.
This threading approach is brute force but also quite easy, so I'm glad I opted for it rather than complicating my life a lot with select and similar options. These days maybe the proper easy way to do this sort of thing is asyncio with streams, but I haven't written any asyncio code.
(I may take this as a challenge and rewrite the client as a proper asyncio based program, just to see how difficult it is.)
All of this appears to work in casual testing. If I Ctrl-C the server in my terminal window, the remote client dutifully exits. If I manually kill the remote client, my local server exits. I haven't simulated having the network connection stop working and having SSH recognize this, but my network connections don't get broken very often (and if my network isn't working, I won't be logged in to work and trying to open URLs on my home desktop).
2023-09-14
An important difference between intern and make-symbol in GNU Emacs ELisp
Suppose, not hypothetically, that for some reason you're trying to create a GNU Emacs ELisp macro that defines a function, and for your sins you don't want to directly specify the name of your new function. In my case, I want to create a bunch of functions with names of the form 'mh-visit-<something>', which all use mh-visit-folder to visit the (N)MH folder '+inbox/<something>'. Ordinary people using macros to create functions probably give the macro the full name of the new function, but here it's less annoying if I can put it together in the macro.
(I will put the summary here: unless you know what you're doing, use intern instead of make-symbol in situations where you're making up symbols, like on the fly function definitions.)
If we passed in the name of our new function to be, I believe the resulting macro would be (without niceties like docstrings):
(defmacro make-visit-func (fname folder)
  `(defalias ',fname
     (lambda () (interactive)
       (mh-visit-folder ,(format "+inbox/%s" folder)))))
;; usage: (make-visit-func mh-visit-fred "fred")
(You might think we would use defun, but defun is actually a macro itself and defalias is the direct thing, as I understand it; see Defining functions.)
The ` and , bits are backquoting, which lets us create a quoted list of the code the macro will generate while splicing in some variables. The peculiar ',fname' is (E)Lisp for the value of 'fname' spliced into the list but quoted instead of evaluated; 'fname' itself is not a string but a symbol, which we have here because symbols are how functions get their (global) names.
(We don't have to quote 'mh-visit-fred' when we use the macro because macro arguments are effectively automatically quoted, by virtue of being unevaluated.)
If we're going to use the folder name to create both the full folder and the name of our new function, we need to turn a string into a symbol. If you quickly scan the documentation on creating symbols, there are two plausible ELisp functions to use, intern and make-symbol. The latter's name certainly sounds like what we want, and if you aren't an ELisp expert, you might rewrite our make-visit-func to use it, like so:
;; This doesn't actually work, don't copy it.
(defmacro make-visit-func (folder)
  `(defalias ',(make-symbol (format "mh-visit-%s" folder))
     (lambda () (interactive)
       (mh-visit-folder ,(format "+inbox/%s" folder)))))
;; usage: (make-visit-func "fred")
If you do this, you will spend some time being rather confused. Your macro will execute without errors, and debugging approaches like macroexpand will give you a result that works when evaluated. If you change to using intern, this will work, so you actually want:
;; This does actually work, you can copy it.
(defmacro make-visit-func (folder)
  `(defalias ',(intern (format "mh-visit-%s" folder))
     (lambda () (interactive)
       (mh-visit-folder ,(format "+inbox/%s" folder)))))
What I believe is happening is caused by the following innocent seeming bit in the description of make-symbol, emphasis mine:
This function returns a newly-allocated, *uninterned* symbol whose name is name (which must be a string).
Normally, the symbols that serve as the names of functions are interned, making them both visible for later use and kept around by ELisp garbage collection. However, what make-symbol gives us is an unreferenced symbol object (with a name assigned to it), so when defalias connects our lambda to it to make it a function, it is still not visible or retained. As a result it disappears in a puff of smoke afterward; even if it hasn't been literally garbage collected, there is no reference to it.
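You can see the difference interactively: intern hands back the same symbol every time you give it the same name, while make-symbol makes a fresh, unregistered symbol on every call:

(eq (intern "foo") (intern "foo"))            ; => t
(eq (make-symbol "foo") (make-symbol "foo"))  ; => nil
(eq (make-symbol "foo") (intern "foo"))       ; => nil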
Evaluating the result of macroexpand worked because it was the (more or less) printed list result, and when read and evaluated back it created a normal, interned symbol. Here it is:
(defalias 'mh-visit-fred
  (lambda nil (interactive)
    (mh-visit-folder "+inbox/fred")))
(I've split this on three lines for readability.)
Although 'mh-visit-fred' came from 'make-symbol', it is printed as text (well, its name is), and when read back in to be evaluated, it will be a normal interned symbol. This is one of the cases where the printed representation is not actually the read syntax for this value. I believe that the actual, neurotically correct print syntax for 'make-symbol's result uses the '#:' prefix. Although I believe Emacs accepts this as read syntax, I don't know if current versions ever print it on output unless you try very hard.
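You can coax the printer into showing the '#:' form by binding print-gensym, which makes uninterned symbols print with that prefix:

(let ((print-gensym t))
  (prin1-to-string (make-symbol "mh-visit-fred")))
;; => "#:mh-visit-fred"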
(Having gone through this once, I'm writing it down so that if I ever do this sort of thing again I won't have to re-learn this.)
PS: Since I am not an Emacs Lisp expert, it's quite possible that there are better ways to do all of this macro.
2023-09-13
A user program doing intense IO can manifest as high system CPU time
Recently, our IMAP server had unusually high CPU usage and was increasingly close to saturating its CPU. When I investigated with 'top' it was easy to see the culprit processes, but when I checked what they were doing with the strace command, they were all busy madly doing IO, in fact processing recursive IMAP LIST commands by walking around in the filesystem. Processes that intensely do IO like this normally wind up in "iowait", not in active CPU usage (whether user or system CPU usage). Except here these processes were, using huge amounts of system CPU time.
What was happening is that these IMAP processes trying to do recursive IMAP LISTs of all available 'mail folders' had managed to escape into '/sys'. The processes were working away more or less endlessly because Dovecot (the IMAP server software we use) makes the entirely defensible but less common decision to follow symbolic links when traversing directory trees, and Linux's /sys has a lot of them (and may have ones that form cycles, so a directory traversal that follows symbolic links may never terminate). Since /sys is a virtual filesystem that is handled entirely inside the Linux kernel, traversing it and reading directories from it does no actual IO to actual disks. Instead, it's all handled in kernel code, and all of the work to traverse around it, list directories, and so on shows up as system time.
Operating on a virtual filesystem isn't the only way that a program can turn a high IO rate into high system time. You can get the same effect if you're repeatedly re-reading the same data that the kernel has cached in memory. Since the kernel can satisfy your IO requests without going to disk, all of the effort required turns into system CPU time inside the kernel. This is probably easiest to have happen with reading data from files, but you can also have programs that are repeatedly scanning the same directories or calling stat() (or lstat()) on the same filesystem names. All of those can wind up as entirely in-kernel activities, because the modern Linux kernel is very good at caching things.
(Most people's IMAP servers don't have the sort of historical configuration issues we have that create these exciting adventures.)