Wandering Thoughts

2025-02-09

Providing pseudo-tags in DWiki through a simple hack

DWiki is the general filesystem based wiki engine that underlies this blog, and for various reasons having to do with how old it is, it lacks a number of features. One of the features that I've wanted for more than a decade has been some kind of support for attaching tags to entries and then navigating around using them (although doing this well isn't entirely easy). However, it was always a big feature, both in implementing external files of tags and in tagging entries, and so I never did anything about it.

Astute observers of Wandering Thoughts may have noticed that some years ago, it acquired some topic indexes. You might wonder how this was implemented if DWiki still doesn't have tags (and the answer isn't that I manually curate the lists of entries for each topic, because I'm not that energetic). What happened is that when the issue was raised in a comment on an entry, I realized that I sort of already had tags for some topics because of how I formed the 'URL slugs' of entries (which are their file names). When I wrote about some topics, such as Prometheus, ZFS, or Go, I'd almost always put that word in the wikiword that became the entry's file name. This meant that I could implement a low rent version of tags simply by searching the (file) names of entries for words that matched certain patterns. This was made easier because I already had code to obtain the general list of file names of entries since that's used for all sorts of things in a blog (syndication feeds, the front page, and so on).

That this works as well as it does is a result of multiple quirks coming together. DWiki is a wiki so I try to make entry file names be wikiwords, and because I have an alphabetical listing of all entries that I look at regularly, I try to put relevant things in the file name of entries so I can find them again and all of the entries about a given topic sort together. Even in a file based blog engine, people don't necessarily form their file names to put a topic in them; you might make the file name be a slug-ized version of the title, for example.

(The actual implementation allows for both positive and negative exceptions. Not all of my entries about Go have 'Go' as a word, and some entries with 'Go' in their file name aren't about Go the language, eg.)

Since the implementation is a hack that doesn't sit cleanly within DWiki's general model of the world, it has some unfortunate limitations (so far, although fixing them would require more hacks). One big one is that as far as the rest of DWiki is concerned, these 'topic' indexes are plain pages with opaque text that's materialized through internal DWikiText rendering. As such, they don't (and can't) have Atom syndication feeds, the way proper fully supported tags would (and you can't ask for 'the most recent N Go entries', and so on; basically there are no blog-like features, because they all require directories).

One of the lessons I took from the experience of hacking pseudo-tag support together was that as usual, sometimes the perfect (my image of nice, generalized tags) is the enemy of the good enough. My solution for Prometheus, ZFS, and Go as topics isn't at all general, but it works for these specific needs and it was easy to put together once I had the idea. Another lesson is that sometimes you have more data than you think, and you can do a surprising amount with it once you realize this. I could have implemented these simple tags years before I did, but until the comment gave me the necessary push I just hadn't thought about using the information that was already in entry names (and that I myself used when scanning the list).

DWikiSimpleTagSolution written at 22:56:03;

2025-01-21

A change in the handling of PYTHONPATH between Python 3.10 and 3.12

Our long time custom for installing Django for our Django based web application was to install it with 'python3 setup.py install --prefix /some/where', and then set a PYTHONPATH environment variable that pointed to /some/where/lib/python<ver>/site-packages. Up through at least Python 3.10 (in Ubuntu 22.04), you could start Python 3 and then successfully do 'import django' with this; in fact, it worked on different Python versions if you were pointing at the same directory tree (in our case, this directory tree lives on our NFS fileservers). In our Ubuntu 24.04 version of Python 3.12 (which also has the Ubuntu packaged setuptools installed), this no longer works, which is inconvenient to us.

(It also doesn't seem to work in Fedora 40's 3.12.8, so this probably isn't something that Ubuntu 24.04 broke by using an old version of Python 3.12, unlike last time.)

The installed site-packages directory contains a number of '<package>.egg' directories, a site.py file that I believe is generic, and an easy-install.pth that lists the .egg directories. In Python 3.10, strace says that Python 3 opens site.py and then easy-install.pth during startup, and then in a running interpreter, 'sys.path' contains the .egg directories. In Python 3.12, none of this happens, although CPython does appear to look at the overall 'site-packages' directory and 'sys.path' contains it, as you'd expect. Manually adding the .egg directories to a 3.12 sys.path appears to let 'import django' work, although I don't know if everything is working correctly.

I looked through the 3.11 and 3.12 "what's new" documentation (3.11, 3.12) but couldn't find anything obvious. I suspect that this is related to the removal of distutils in 3.12, but I don't know enough to say for sure.

(Also, if I use our usual Django install process, the Ubuntu 24.04 Python 3.12 installs Django in a completely different directory setup than in 3.10; it now winds up in <top level>/local/lib/python3.12/dist-packages. Using 'pip install --prefix ...' does create something where pointing PYTHONPATH at the 'dist-packages' subdirectory appears to work. There's also 'pip install --target', which I'd forgotten about until I stumbled over my old entry.)

All of this makes it even more obvious to me than before that the Python developers expect everyone to use venvs and anything else is probably going to be less and less well supported in the future. Installing system-wide is probably always going to work, and most likely also 'pip install --user', but I'm not going to hold my breath for anything else.

(On Ubuntu 24.04, obviously we'll have to move to a venv based Django installation. Fortunately you can use venvs with programs that are outside the venv.)

Pythonpath310Vs312Change written at 22:40:55;

2025-01-16

Some stuff about how Apache's mod_wsgi runs your Python apps (as of 5.0)

We use mod_wsgi to host our Django application, but if I understood the various mod_wsgi settings for how to run your Python WSGI application when I originally set it up, I've forgotten it all since then. Due to recent events, exactly how mod-wsgi runs our application and what we can control about that is now quite relevant, so I spent some time looking into things and trying to understand settings. Now it's time to write all of this down before I forget it (again).

Mod_wsgi can run your WSGI application in two modes, as covered in the quick configuration guide part of its documentation: embedded mode, which runs a Python interpreter inside a regular Apache process, and daemon mode, where one or more Apache processes are taken over by mod_wsgi and used exclusively to run WSGI applications. Normally you want to use daemon mode, and you have to use daemon mode if you want to do things like run your WSGI application as a Unix user other than the web server's normal user or use packages installed into a Python virtual environment.

(Running as a separate Unix user puts some barriers between your application's data and a general vulnerability that gives the attacker read and/or write access to anything the web server has access to.)

To use daemon mode, you need to configure one or more daemon processes with WSGIDaemonProcess. If you're putting packages (such as Django) into a virtual environment, you give an appropriate 'python-home=' setting here. Your application itself doesn't have to be in this venv. If your application lives outside your venv, you will probably want to set either or both of 'home=' and 'python-path=' to, for example, its root directory (especially if it's a Django application). The corollary to this is that any WSGI application that uses a different virtual environment, or 'home' (starting current directory), or Python path needs to be in a different daemon process group. Everything that uses the same process group shares all of those.

To associate a WSGI application or a group of them with a particular daemon process, you use WSGIProcessGroup. In simple configurations you'll have WSGIDaemonProcess and WSGIProcessGroup right next to each other, because you're defining a daemon process group and then immediately specifying that it's used for your application.

Within a daemon process, WSGI applications can run in either the main Python interpreter or a sub-interpreter (assuming that you don't have sub-interpreter specific problems). If you don't set any special configuration directive, each WSGI application will run in its own sub-interpreter and the main interpreter will be unused. To change this, you need to set something for WSGIApplicationGroup, for instance 'WSGIApplicationGroup %{GLOBAL}' to run your WSGI application in the main interpreter.

Some WSGI applications can cohabit with each other in the same interpreter (where they will potentially share various bits of global state). Other WSGI applications are one to an interpreter, and apparently Django is one of them. If you need your WSGI application to have its own interpreter, there are two ways to achieve this; you can either give it a sub-interpreter within a shared daemon process, or you can give it its own daemon process and have it use the main interpreter in that process. If you need different virtual environments for each of your WSGI applications (or different Unix users), then you'll have to use different daemon processes and you might as well have everything run in their respective main interpreters.

(After recent experiences, my feeling is that processes are probably cheap and sub-interpreters are a somewhat dark corner of Python that you're probably better off avoiding unless you have a strong reason to use them.)

You normally specify your WSGI application to run (and what URL it's on) with WSGIScriptAlias. WSGIScriptAlias normally infers both the daemon process group and the (sub-interpreter) 'application group' from its context, but you can explicitly set either or both. As the documentation notes (now that I'm reading it):

If both process-group and application-group options are set, the WSGI script file will be pre-loaded when the process it is to run in is started, rather than being lazily loaded on the first request.

I'm tempted to deliberately set these to their inferred values simply so that we don't get any sort of initial load delay the first time someone hits one of the exposed URLs of our little application.

For our Django application, we wind up with a collection of directives like this (in its virtual host):

WSGIDaemonProcess accounts ....
WSGIProcessGroup accounts
WSGIApplicationGroup %{GLOBAL}
WSGIScriptAlias ...

(This also needs a <Directory> block to allow access to the Unix directory that the WSGIScriptAlias 'wsgi.py' file is in.)

If we added another Django application in the same virtual host, I believe that the simple update to this would be to add:

WSGIDaemonProcess secondapp ...
WSGIScriptAlias ... process-group=secondapp application-group=%{GLOBAL}

(Plus the <Directory> permissions stuff.)

Otherwise we'd have to mess around with setting the WSGIProcessGroup and WSGIApplicationGroup on a per-directory basis for at least the new application. If we specify them directly in WSGIScriptAlias we can skip that hassle.

(We didn't used to put Django in a venv, but as of Ubuntu 24.04, using a venv seems the easiest way to get a particular Django version into some spot where you can use it. Our Django application doesn't live inside the venv, but we need to point mod_wsgi at the venv so that our application can do 'import django.<...>' and have it work. Multiple Django applications could all share the venv, although they'd have to use different WSGIDaemonProcess settings, or at least different names with the same other settings.)

ModWsgiHowAppsRun written at 23:13:25;

2025-01-15

(Multiple) inheritance in Python and implicit APIs

The ultimate cause of our mystery with Django on Ubuntu 24.04 is that versions of Python 3.12 before 3.12.5 have a bug where builtin types in sub-interpreters get unexpected additional slot wrappers (also), and Ubuntu 24.04 has 3.12.3. Under normal circumstances, 'list' itself doesn't have a '__str__' method but instead inherits it from 'object', so if you have a class that inherits from '(list,YourClass)' and YourClass defines a __str__, the YourClass.__str__ is what gets used. In a sub-interpreter, there is a list.__str__ and suddenly YourClass.__str__ isn't used any more.

(mod_wsgi triggers this issue because in a straightforward configuration, it runs everything in sub-interpreters.)

This was an interesting bug, and one of the things it made me realize is that the absence of a __str__ method on 'list' itself had implicitly because part of list's API. Django had set up class definitions that were 'class Something(..., list, AMixin)', where the 'AMixin' had a direct __str__ method, and Django expected that to work. This only works as long as 'list' doesn't have its own __str__ method and instead gets it through inheritance from object.__str__. Adding such a method to 'list' would break Django and anyone else counting on this behavior, making the lack of the method an implicit API.

(You can get this behavior with more or less any method that people might want to override in such a mixin class, but Python's special methods are probably especially prone to it.)

Before I ran into this issue, I probably would have assumed that where in the class tree a special method like __str__ was implemented was simply an implementation detail, not something that was visible as part of a class's API. Obviously, I would have been wrong. In Python, you can tell the difference and quite easily write code that depends on it, code that was presumably natural to experienced Python programmers.

(Possibly the existence of this implicit API was obvious to experienced Python programmers, along with the implication that various builtin types that currently don't have their own __str__ can't be given one in the future.)

InheritanceAndImplictAPI written at 23:16:22;

2025-01-13

A mystery with Django under Apache's mod_wsgi on Ubuntu 24.04

We have a long standing Django web application that these days runs under Python 3 and a more modern version of Django. For as long as it has existed, it's had some forms that were rendered to HTML through templates, and it has rendered errors in those forms in what I think of as the standard way:

{{ form.non_field_errors }}
{% for field in form %}
  [...]
  {{ field.errors }}
  [...]
{% endfor %}

This web application runs in Apache using mod_wsgi, and I've recently been working on moving the host this web application runs on to Ubuntu 24.04 (still using mod_wsgi). When I stood up a test virtual machine and looked at some of these HTML forms, what I saw was that when there were no errors, each place that errors would be reported was '[]' instead of blank. This did not happen if I ran the web application on the same test machine in Django's 'runserver' development testing mode.

At first I thought that this was something to do with locales, but the underlying cause is much more bizarre and inexplicable to me. The template operation for form.non_field_errors results in calling Form.non_field_errors(), which returns a django.forms.utils.ErrorList object (which is also what field.errors winds up being). This class is a multiple-inheritance subclass of UserList, list, and django.form.utils.RenderableErrorMixin. The latter is itself a subclass of django.forms.utils.RenderableMixin, which defines a __str__() special method value that is RenderableMixin.render(), which renders the error list properly, including rendering it as a blank if the error list is empty.

In every environment except under Ubuntu 24.04's mod_wsgi, ErrorList.__str__ is RenderableMixin.render and everything works right for things like 'form.non_field_errors' and 'field.errors'. When running under Ubuntu 24.04's mod_wsgi, and only then, ErrorList.__str__ is actually the standard list.__str__, so empty lists render as '[]' (and had I tried to render any forms with actual error reports, worse probably would have happened, especially since list.__str__ isn't carefully escaping special HTML characters).

I have no idea why this is happening in the 24.04 mod_wsgi. As far as I can tell, the method resolution order (MRO) for ErrorList is the same under mod_wsgi as outside it, and sys.path is the same. The RenderableErrorMixin class is getting included as a parent of ErrorList, which I can tell because RenderableMixin also provides a __html__ definition, and ErrorList.__html__ exists and is correct.

The workaround for this specific situation is to explicitly render errors to some format instead of counting on the defaults; I picked .as_ul(), because this is what we've normally gotten so far. However the whole thing makes me nervous since I don't understand what's special about the Ubuntu 24.04 mod_wsgi and who knows if other parts of Django are affected by this.

(The current Django and mod_wsgi setup is running from a venv, so it should also be fully isolated from any Ubuntu 24.04 system Python packages.)

(This elaborates on a grumpy Fediverse post of mine.)

DjangoModWsgiUbuntu2404Mystery written at 23:10:33;

2024-12-22

Two views of Python type hints and catching bugs

I recently wrote a little Python program where I ended up adding type hints, an experience that I eventually concluded was worth it overall even if it was sometimes frustrating. I recently fixed a small bug in the program; like many of my bugs, it was a subtle logic bug that wasn't caught by typing (and I don't think it would have been caught by any reasonable typing).

One view you could take of type hints is that they often don't catch any actual bugs, and so you can question their worth (when viewed only from a bug catching perspective). Another view, one that I'm more inclined to, is that type hints sweep away the low hanging fruit of bugs. A type confusion bug is almost always found pretty fast when you try to use the code, because your code usually doesn't work at all. However, using type hints and checking them provides early and precise detection of these obvious bugs, so you get rid of them right away before they take up your time with you trying to work out why this object doesn't have the methods or fields that you expect.

("Type hints", which is to say documenting what types are used where for what, also have additional benefits, such as accurate documentation and enabling type based things in IDEs, LSP servers, and so on.)

So although my use of type hints and mypy didn't catch this particular logic oversight, my view of them remains positive. And type hints did help me make sure I wasn't adding an obvious bug when I fixed this issue (my fix required passing an extra argument to something, creating an opportunity for a bit of type confusion if I got the arguments wrong).

Sidebar: my particular non-type bug

This program reports the current, interesting alerts from our Prometheus metrics system. For various reasons, it supports getting the alerts as of some specific time, not just 'now', and it also filters out some alerts when they aren't old enough. My logic bug with was with the filtering; in order to compute the age of an alert, I did:

age = time.time() - alert_started_at

The logic problem is that when I'm getting the alerts at a particular time instead of 'now', I also want to compute the age of the alert as of that time, not as of 'right now'. So I don't want 'time.time()', I want 'as of the logical time when we're obtaining this information'.

(This sort of logic oversight is typical for non-obvious bugs that linger in my programs after they're basically working. I only noticed it because I was adding a new filter, and needed to get the alerts as of a time when what I wanted to filter out was happening.)

TypeHintsAndCatchingBugs written at 23:03:19;

2024-11-29

Python type hints are probably "worth it" in the large for me

I recently added type hints to a little program, and that experience wasn't entirely positive that left me feeling that maybe I shouldn't bother. Because I don't promise to be consistent, I went back and re-added type hints to the program all over again, starting from the non-hinted version. This time I did the type hints rather differently and the result came out well enough that I'm going to keep it.

Perhaps my biggest change was to entirely abandon NewType(). Instead I set up two NamedTuples and used type aliases for everything else, which amounts to three type aliases in total. Since I was using type aliases anyway, I only added them when it was annoying to enter the real type (and I was doing it often enough). I skipped doing a type alias for 'list[namedTupleType]' because I couldn't come up with a name that I liked well enough and that it's a list is fundamental to how it's interacted with in the code involved, so I didn't feel like obscuring that.

Adding type hints 'for real' had the positive aspect of encouraging me to write a bunch of comments about what things were and how they worked, which will undoubtedly help future me when I want to change something in six months. Since I was using NamedTuples, I changed to accessing the elements of the tuples through the names instead of the indexes, which improved the code. I had to give up 'list(adict.items())' in favour of a list comprehension that explicitly created the named tuple, but this is probably a good thing for the overall code quality.

(I also changed the type of one thing I had as 'int' to a float, which is what it really should have been all along even if all of the normal values were integers.)

Overall, I think I've come around to the view that doing all of this is good for me in the same way that using shellcheck is good for my shell scripts, even if I sometimes roll my eyes at things it says. I also think that just making mypy silent isn't the goal I should be aiming for. Instead, I should be aiming for what I did to my program on this second pass, doing things like introducing named tuples (in some form), adding comments, and so on. Adding final type hints should be a prompt for a general cleanup.

(Perhaps I'll someday get to a point where I add basic type hints as I write the code initially, just to codify my belief about the shape of what I'm returning and passing in, and use them to find my mistakes. But that day is probably not today, and I'll probably want better LSP integration for it in my GNU Emacs environment.)

TypeHintsProbablyWorthItForMe written at 23:07:45;

2024-11-27

Some notes on my experiences with Python type hints and mypy

As I thought I might, today I spent some time adding full and relatively honest type hints to my recent Python program. The experience didn't go entirely smoothly and it left me with a number of learning experiences and things I want to note down in case I ever do this again. The starting point is that my normal style of coding small programs is to not make classes to represent different sorts of things and instead use only basic built in collection types, like lists, tuples, dictionaries, and so on. When you use basic types this way, it's very easy to pass or return the wrong 'shape' of thing (I did it once in the process of writing my program), and I'd like Python type hints to be able to tell me about this.

(The first note I want to remember is that mypy becomes very irate at you in obscure ways if you ever accidentally reuse the same (local) variable name for two different purposes with two different types. I accidentally reused the name 'data', using it first for a str and second for a dict that came from an 'Any' typed object, and the mypy complaints were hard to decode; I believe it complained that I couldn't index a str with a str on a line where I did 'data["key"]'.)

When you work with data structures created from built in collections, you can wind up with long, tangled compound type name, like 'tuple[str, list[tuple[str, int]]]' (which is a real type in my program). These are annoying to keep typing and easy to make mistakes with, so Python type hints provide two ways of giving them short names, in type aliases and typing.NewType. These look almost the same:

# type alias:
type hostAlertsA = tuple[str, list[tuple[str, int]]]

# NewType():
hostAlertsT = NewType('hostAlertsT', tuple[str, list[tuple[str, int]]])

The problem with type aliases is that they are aliases. All aliases for a type are considered to be the same, and mypy won't warn if you call a function that expects one with a value that was declared to be another. Suppose you have two sorts of strings, ones that are a host name and ones that are an alert name, and you would like to keep them straight. Suppose that you write:

# simple type aliases
type alertName = str
type hostName = str

func manglehost(hname: hostName) -> hostName:
  [....]

Because these are only type aliases and because all type aliases are treated as the same, you have not achieved your goal of keeping you from confusing host and alert names when you call 'manglehost()'. In order to do this, you need to use NewType(), at which point mypy will complain (and also often force you to explicitly mark bare strings as one or the other, with 'alertName(yourstr)' or 'hostName(yourstr)').

If I want as much protection against this sort of type confusion, I want to make as many things as possible be NewType()s instead of type aliases. Unfortunately NewType()s have some drawbacks in mypy for my sort of usage as far as I can see.

The first drawback is that you cannot create a NewType of 'Any':

error: Argument 2 to NewType(...) must be subclassable (got "Any")  [valid-newtype]

In order to use NewType, I must specify concrete details of my actual (current) implementation, rather than saying just 'this is a distinct type but anything can be done with it'.

The second drawback is that this distinct typing is actually a problem when you do certain sorts of transformations of collections. Let's say we have alerts, which have a name and a start time, and hosts, which have a hostname and a list of alerts:

alertT  = NewType('alertT',  tuple[str, int])
hostAlT = NewType('hostAlT', tuple[str, list[alertT]])

We have a function that receives a dictionary where the keys are hosts and the values are their alerts and turns it into a sorted list of hosts and their alerts, which is to say a list[hostAlT]). The following Python code looks sensible on the surface:

def toAlertList(hosts: dict[str, list[alertT]) -> list[hostAlT]:
  linear = list(hosts.items())
  # Don't worry about the sorting for now
  return linear

If you try to check this, mypy will declare:

error: Incompatible return value type (got "list[tuple[str, list[alertT]]]", expected "list[hostAlT]")  [return-value]

Initially I thought this was mypy being limited, but in writing this entry I've realized that mypy is correct. Our .items() returns a tuple[str, list[alertT]], but while it has the same shape as our hostAlT, it is not the same thing; that's what it means for hostAlT to be a distinct type.

However, it is a problem that as far as I know, there is no type checked way to get mypy to convert the list we have into a list[hostAlT]. If you create a new NewType to be the list type, all it 'aListT', and try to convert 'linear' to it with 'l2 = aListT(linear)', you will get more or less the same complaint:

error: Argument 1 to "aListT" has incompatible type "list[tuple[str, list[alertT]]]"; expected "list[hostAlT]"  [arg-type]

This is a case where as far as I can see I must use a type alias for 'hostAlT' in order to get the structural equivalence conversion, or alternately use the wordier and as far as I know less efficient list comprehension version of list() so that I can tell mypy that I'm transforming each key/value pair into a hostAlT value:

linear = [hostAlT(x) for x in hosts.items()]

I'd have the same problem in the actual code (instead of in the type hint checking) if I was using, for example, a namedtuple to represent a host and its alerts. Calling hosts.items() wouldn't generate objects of my named tuple type, just unnamed standard tuples.

Possibly this is a sign that I should go back through my small programs after I more or less finish them and convert this sort of casual use of tuples into namedtuple (or the type hinted version) and dataclass types. If nothing else, this would serve as more explicit documentation for future me about what those tuple fields are. I would have to give up those clever 'list(hosts.items())' conversion tricks in favour of the more explicit list comprehension version, but that's not necessarily a bad thing.

Sidebar: aNewType(...) versus typing.cast(typ, ....)

If you have a distinct NewType() and mypy is happy enough with you, both of these will cause mypy to consider your value to now be of the new type. However, they have different safety levels and restrictions. With cast(), there are no type hint checking guardrails at all; you can cast() an integer literal into an alleged string and mypy won't make a peep. With, for example, 'hostAlT(...)', mypy will apply a certain amount of compatibility checking. However, as we saw above in the 'aListT' example, mypy may still report a problem on the type change and there are certain type changes you can't get it to accept.

As far as I know, there's no way to get mypy to temporarily switch to a structural compatibility checking here. Perhaps there are deep type safety reasons to disallow that.

TypeHintsAndMyPyNotes written at 23:35:15;

2024-11-26

Python type hints may not be for me in practice

Python 3 has optional type hints (and has had them for some time), and some time ago I was a bit tempted to start using some of them; more recently, I wrote a small amount of code using them. Recently I needed to write a little Python program and as I started, I was briefly tempted to try type hints. Then I decided not to, and I suspect that this is how it's going to go in the future.

The practical problem of type hints for me when writing the kind of (small) Python programs that I do today is that they necessarily force me to think about the types involved. Well, that's wrong, or at least incomplete; in practice, they force me to come up with types. When I'm putting together a small program, generally I'm not building any actual data structures, records, or the like (things that have a natural type); instead I'm passing around dictionaries and lists and sets and other basic Python types, and I'm revising how I use them as I write more of the program and evolve it. Adding type hints requires me to navigate assigning concrete types to all of those things, and then updating them if I change my mind as I come to a better understanding of the problem and how I want to approach it.

(In writing this it occurs to me that I do often know that I have distinct types (for example, for what functions return) and I shouldn't mix them, but I don't want to specify their concrete shape as dicts, tuples, or whatever. In looking through the typing documentation and trying some things, it doesn't seem like there's an obvious way to do this. Type aliases are explicitly equivalent to their underlying thing, so I can't create a bunch of different names for eg typing.Any and then expect type checkers to complain if I mix them.)

After the code has stabilized I can probably go back to write type hints (at least until I get into apparently tricky things like JSON), but I'm not sure that this would provide very much value. I may try it with my recent little Python thing just to see how much work it is. One possible source of value is if I come back to this code in six months or a year and want to make changes; typing hints could give me both documentation and guardrails given that I'll have forgotten about a lot of the code and structure by then.

(I think the usual advice is that you should write type hints as you write the program, rather than go back after the fact and try to add them, because incrementally writing them during development is easier. But my new Python programs tend to sufficiently short that doing all of the type hints afterward isn't too much work, and if it gets me to do it at all it may be an improvement.)

PS: It might be easier to do type hints on the fly if I practiced with them, but on the other hand I write new Python programs relatively infrequently these days, making typing hints yet another Python thing I'd have to try to keep in my mind despite it being months since I used them last.

PPS: I think my ideal type hint situation would be if I could create distinct but otherwise unconstrained types for things like function arguments and function returns, have mypy or other typing tools complain when I mixed them, and then later go back to fill in the concrete implementation details of each type hint (eg, 'this is a list where each element is a ...').

TypeHintsMaybeNotForMe written at 22:58:58;

2024-08-26

What's going on with 'quit' in an interactive CPython session (as of 3.12)

We're probably all been there at some time or the other:

$ python
[...]
>>> quit
Use quit() or Ctrl-D (i.e. EOF) to exit

It's an infamous and frustrating 'error' message and we've probably all seen it (there's a similar one for 'exit'). Today I was reminded of this CPython behavior by a Fediverse conversation and as I was thinking about it, the penny belatedly dropped on what is going on here in CPython.

Let's start with this:

>>> type(quit)
<class '_sitebuiltins.Quitter'>

In CPython 3.12 and earlier, the CPython interactive interpreter evaulates Python statements; as far as I know, it has little to no special handling of what you type to it, it just evaluates things and then prints the result under appropriate circumstances. So 'quit' is not special syntax recognized by the interpreter, but instead a Python object. The message being printed is not special handling but instead a standard CPython interpreter feature to helpfully print the representation of objects, which the _sitebuiltins.Quitter class has customized to print this message. You can see all of this in Lib/_sitebuiltins.py, along with classes used for some other, related things.

(Then the 'quit' and 'exit' instances are created and wired up in Lib/site.py, along with a number of other things.)

This is changing in Python 3.13 (via), which defaults to using a new interactive shell, which I believe is called 'pyrepl' (see Libs/_pyrepl). Pyrepl has specific support for commands like 'quit', although this support actually reuses the _sitebuiltins code (see REPL_COMMANDS in Lib/_pyrepl/simple_interact.py). Basically, pyrepl knows to call some objects instead of printing their repr() if they're entered alone on a line, so you enter 'quit' and it winds up being the same as if you'd said 'quit()'.

CPythonInteractiveQuit written at 21:33:17;

(Previous 10 or go back to July 2024 at 2024/07/31)

Page tools: See As Normal.
Search:
Login: Password:

This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.