A peculiarity of the X Window System: Windows all the way down

March 5, 2024

Every window system has windows, as an entity. Usually we think of these as being used for, well, windows and window-like things: application windows, those extremely annoying pop-up modal dialogs that are always interrupting you at the wrong time, even perhaps things like pop-up menus. In its original state, X has more windows than that. Part of how and why it does this is that X allows windows to nest inside each other, in a window tree, which you can still see today with 'xwininfo -root -tree'.

One of the reasons that X has copious nested windows is that X was designed with a particular model of writing X programs in mind, and that model made everything into a (nested) window. Seriously, everything. In an old fashioned X application, windows are everywhere. Buttons are windows (or several windows if they're radio buttons or the like), text areas are windows, menu entries are each a window of their own within the window that is the menu, visible containers of things are windows (with more windows nested inside them), and so on.

This copious use of windows allows a lot of things to happen on the server side, because various things (like mouse cursors) are defined on a per-window basis, and also windows can be created with things like server-set borders. So the X server can render sub-window borders to give your buttons an outline and automatically change the cursor when the mouse moves into and out of a sub-window, all without the client having to do anything. And often input events like mouse clicks or keys can be specifically tied to some sub-window, so your program doesn't have to hunt through its widget geometry to figure out what was clicked. There are more tricks; for example, you can get 'enter' and 'leave' events when the mouse enters or leaves a (sub)window, which programs can use to highlight the current thing (ie, subwindow) under the cursor without the full cost of constantly tracking mouse motion and working out what widget is under the cursor every time.
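As a rough illustration of what the server can do by itself in this model, here is a small Python sketch. It is a toy model, not Xlib or the X protocol: the `Window` class, `window_at`, `effective_cursor`, and `crossing_events` are hypothetical names invented for this example, and the cursor names are just labels. It assumes each window's position is relative to its parent and that a window with no cursor of its own shows its parent's cursor.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(eq=False)
class Window:
    """Toy stand-in for a server-side window (hypothetical, not Xlib)."""
    name: str
    x: int          # position relative to the parent window
    y: int
    width: int
    height: int
    cursor: Optional[str] = None   # None means "use the parent's cursor"
    children: List["Window"] = field(default_factory=list)
    parent: Optional["Window"] = None

    def add(self, child: "Window") -> "Window":
        child.parent = self
        self.children.append(child)
        return child


def window_at(win: Window, px: int, py: int) -> Window:
    """Return the deepest window containing (px, py), in win's coordinates."""
    # Later children are treated as stacked on top, so check them first.
    for child in reversed(win.children):
        cx, cy = px - child.x, py - child.y
        if 0 <= cx < child.width and 0 <= cy < child.height:
            return window_at(child, cx, cy)
    return win


def effective_cursor(win: Optional[Window]) -> str:
    """Walk up the tree until some window defines a cursor."""
    while win is not None:
        if win.cursor is not None:
            return win.cursor
        win = win.parent
    return "default"


def crossing_events(old: Window, new: Window) -> List[Tuple[str, str]]:
    """Simplified enter/leave: ignores events for intermediate windows."""
    if old is new:
        return []
    return [("leave", old.name), ("enter", new.name)]


top = Window("top", 0, 0, 200, 100, cursor="left_ptr")
button = top.add(Window("button", 10, 10, 80, 30, cursor="hand2"))
label = top.add(Window("label", 10, 50, 80, 30))  # inherits the cursor

over_button = window_at(top, 20, 20)
over_label = window_at(top, 20, 60)
print(over_button.name, effective_cursor(over_button))
print(over_label.name, effective_cursor(over_label))
print(crossing_events(over_button, over_label))
```

The point of the sketch is that the client appears nowhere in it; once the window tree exists, working out which sub-window the pointer is in and which cursor to show needs only information the server already has.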

The old, classical X toolkits like Xt and the Athena widget set (Xaw) heavily used this 'tree of nested windows' approach, and you can still see large window trees with 'xwininfo' when you apply it to old applications with lots of visible buttons; one example is 'xfontsel'. Even the venerable xterm normally contains a nested window (for the scrollbar, which I believe it uses partly to automatically change the X cursor when you move the mouse into the scrollbar). However, this doesn't seem to be universal; when I look at one Xaw-based application I have handy, it doesn't seem to use subwindows despite having a list widget of things to click on. Presumably in Xaw and perhaps Xt it depends on what sort of widget you're using, with some widgets using sub-windows and some not. Another program, written using Tk, does use subwindows for its buttons (with them clearly visible in 'xwininfo -tree').

This approach fell out of favour for various reasons, but certainly one significant one is that it's strongly tied to X's server side rendering. Because these subwindows are 'on top of' their parent (sub)windows, they have to be rendered individually; otherwise they'll cover what was rendered into the parent (and naturally they clip what is rendered to them to their visible boundaries). If you're sending rendering commands to the server, this is just a matter of what windows they're for and what coordinates you draw at, but if you render on the client, you have to ship over a ton of little buffers (one for each sub-window) instead of one big one for your whole window, and in fact you're probably sending extra data (the parts of all of the parent windows that get covered up by child windows).
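To put rough numbers on the 'extra data' point, here is a back-of-the-envelope calculation in Python. The window and button sizes are made up for illustration, and it assumes the buttons don't overlap each other, sit entirely inside the parent, and that the parent's buffer is sent whole (including the areas the buttons cover).

```python
def pixels(width: int, height: int) -> int:
    return width * height


# Hypothetical sizes, purely for illustration.
parent_w, parent_h = 800, 600
button_w, button_h = 100, 30
button_count = 20

# One X window: a single buffer for everything.
single_buffer = pixels(parent_w, parent_h)

# Nested windows: the parent's buffer plus one buffer per button.
children_total = button_count * pixels(button_w, button_h)
nested_total = single_buffer + children_total
uploads = 1 + button_count

# The parent pixels hidden under the buttons are sent but never seen.
extra_fraction = children_total / single_buffer

print(f"single window: {single_buffer} pixels in 1 upload")
print(f"nested windows: {nested_total} pixels in {uploads} uploads")
print(f"extra data: {extra_fraction:.1%}")
```

The extra pixels are modest here; the larger practical cost is probably turning one upload into many small ones.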

So in modern toolkits, the top level window and everything in it is generally only one X window with no nested subwindows, and all buttons and other UI elements are drawn by the client directly into that window (usually with client side drawing). The client itself tracks the mouse pointer and sends 'change the cursor to <X>' requests to the server as the pointer moves in and out of UI elements that should have different mouse cursors, and when it gets events, the client searches its own widget hierarchy to decide what should handle them (possibly including client side window decorations (CSD)).
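A minimal sketch of the client-side bookkeeping this implies, again in Python and again a toy model rather than any real toolkit's code. `Widget`, `SingleWindowClient`, and the `requests` list (which stands in for requests sent to the X server) are all hypothetical names for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Widget:
    """A UI element drawn by the client; it has no X window of its own."""
    name: str
    x: int
    y: int
    width: int
    height: int
    cursor: str = "left_ptr"

    def contains(self, px: int, py: int) -> bool:
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


class SingleWindowClient:
    def __init__(self, widgets: List[Widget], default_cursor: str = "left_ptr"):
        self.widgets = widgets
        self.default_cursor = default_cursor
        self.current_cursor = default_cursor
        # Stand-in for the requests the client would send to the server.
        self.requests: List[Tuple[str, str]] = []

    def widget_at(self, px: int, py: int) -> Optional[Widget]:
        # Later widgets are treated as drawn on top, so check them first.
        for widget in reversed(self.widgets):
            if widget.contains(px, py):
                return widget
        return None

    def on_motion(self, px: int, py: int) -> None:
        """Hit-test on every motion event; ask for a cursor change if needed."""
        widget = self.widget_at(px, py)
        wanted = widget.cursor if widget else self.default_cursor
        if wanted != self.current_cursor:
            self.requests.append(("set_cursor", wanted))
            self.current_cursor = wanted

    def on_click(self, px: int, py: int) -> Optional[str]:
        """Decide which widget, if any, should handle a click."""
        widget = self.widget_at(px, py)
        return widget.name if widget else None


client = SingleWindowClient([
    Widget("ok", 10, 10, 80, 30, cursor="hand2"),
    Widget("entry", 10, 50, 150, 30, cursor="xterm"),
])
for point in [(0, 0), (20, 20), (30, 25), (20, 60), (190, 95)]:
    client.on_motion(*point)
print(client.requests)
print(client.on_click(20, 20))
```

Compared to the nested-window approach, the client now has to see every pointer motion event and do its own hit-testing, which is exactly the work that sub-windows let the server do for it.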

(I think toolkits may create some invisible sub-windows for event handling reasons. Gnome-terminal and other Gnome applications appear to create a 1x1 sub-window, for example.)

As a side note, another place you can still find this many-window style is in some old fashioned X window managers, such as fvwm. When fvwm puts a frame around a window (such as the ones visible on windows on my desktop), the specific elements of the frame (the title bar, any buttons in the title bar, the side and corner drag-to-resize areas, and so on) are all separate X sub-windows. One thing I believe this is used for is to automatically show an appropriate mouse cursor when the mouse is over the right spot. For example, if your mouse is in the right side 'grab to resize right' border, the mouse cursor changes to show you this.

(The window managers for modern desktops, like Cinnamon, don't handle their window manager decorations like this; they draw everything as decorations and handle the 'widget' nature of title bar buttons and so on internally.)

Comments on this page:

By Anonymous at 2024-03-06 08:41:25:

'The Gimp' used to do this (even on MS-Windows), and even though these days the default is a single window, there still is an option to get the multiple windows GUI back. https://www.gimp.org/

By cks at 2024-03-06 11:10:57:

I think you're talking about a different thing, what I'll call 'window manager windows'. Programs can use one visible window with UI controls inside it, or separate windows, some with UI controls and some with content (which could let you make the content full-size on one display and park the controls on another). They can also use a MDI style UI, where they render visible sub-windows inside their main window and let you move and resize those sub-windows.

(Although it's not MDI, modern web applications sometimes have similar MDI-like window-style objects that you can grab and move around inside the web page. These are obviously not window manager windows; they're entirely rendered in HTML and manipulated through the DOM and JavaScript.)

However, in X this is almost completely separate from how many protocol-level Window objects the program is using. Every separate top level 'window manager window' has to be a separate protocol-level Window object, but programs can use or not use further Window objects inside their top level windows as they want. This is almost completely independent from how the UI looks, and you can't tell how many protocol-level Window objects an X program is using from looking at its UI; you have to look at protocol level things with tools like 'xwininfo'.

By Adam D. Ruppe at 2024-03-07 18:31:20:

"(I think toolkits may create some invisible sub-windows for event handling reasons. Gnome-terminal and other Gnome applications appear to create a 1x1 sub-window, for example.)"

Indeed. The reason for this is an interesting application and it took me quite a bit of digging to find it when I was faced with the problem it solves.

When you have child windows, keyboard events are sent to the first window, working from the descendant child up to the parent window, that is both under the current mouse pointer and subscribed to key events (aka "focus follows mouse") unless an explicit focus has been set to another sibling.

Suppose you want to embed another application in your window (something I still think is underutilized!). This other application will subscribe to key events, so unless you set an explicit focus to a valid sibling, it is going to get those events on a focus-follows-mouse model. This might be obnoxious for the user - move the mouse and the key events now don't go where you want - and it might be annoying for your application, since you no longer get the events on your top window, meaning things like your menu keyboard shortcuts stop working.

But there's a solution: that latter "unless". The top-level window creates this extra window outside the rest of the child tree. It is still a child of the application's top-level window, but not a child of anything else in there... making it a valid recipient of all these events with an explicit focus call.

Your application sets the focus to this child any time the top level thing gets it, and processes events through it. You then dispatch as you want - either sending it to the child widget handlers internally, or XSendEvent it to child windows as you wish, and things just work.
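A toy Python model of the delivery rule as described in this comment may make the trick easier to see. It is a deliberate simplification, not the full X protocol rule: `key_event_recipient` and the window names are hypothetical, and when the pointer is outside the focus window the sketch just hands the event to the focus window without modelling any further propagation.

```python
from typing import Dict, List, Optional, Set


def ancestors_inclusive(win: Optional[str],
                        parent: Dict[str, Optional[str]]) -> List[str]:
    """Return [win, its parent, its grandparent, ...] up to the top level."""
    chain = []
    while win is not None:
        chain.append(win)
        win = parent.get(win)
    return chain


def key_event_recipient(parent: Dict[str, Optional[str]],
                        selected: Set[str],
                        pointer_window: str,
                        focus: Optional[str] = None) -> Optional[str]:
    """Pick the window that receives a key press in this simplified model.

    parent:         child -> parent map for the window tree
    selected:       windows that have subscribed to key events
    pointer_window: deepest window under the mouse pointer
    focus:          window with explicit focus, or None for pointer-driven
    """
    chain = ancestors_inclusive(pointer_window, parent)
    if focus is not None:
        if focus not in chain:
            # Pointer is outside the focus window: the focus window gets it.
            # (Simplification: no further propagation is modelled.)
            return focus if focus in selected else None
        # Pointer is inside the focus window: search upward, stopping at it.
        chain = chain[:chain.index(focus) + 1]
    for win in chain:
        if win in selected:
            return win
    return None


# A top-level window with an embedded application and an extra focus window.
parent = {"top": None, "embedded": "top", "focus_proxy": "top"}
selected = {"top", "embedded", "focus_proxy"}

print(key_event_recipient(parent, selected, "embedded"))
print(key_event_recipient(parent, selected, "embedded", focus="top"))
print(key_event_recipient(parent, selected, "embedded", focus="focus_proxy"))
```

In this model, setting focus to the top-level window itself doesn't help, because the embedded window is inside it and still wins when the pointer is over it. Only focusing the separate sibling window takes the embedded window out of the search.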

(Note that even if you don't embed other applications and don't use child windows as widgets, you probably want child windows for things like popup menus, so the technique is generally useful.)

I wrote more details on Stack Overflow a couple of years ago here: https://stackoverflow.com/questions/71544036/can-i-change-the-focus-behavior-of-child-windows/71800780#71800780 or you can go directly to the XEmbed spec on freedesktop, which is the primary source I cite in there.

Written on 05 March 2024.
Last modified: Tue Mar 5 21:26:30 2024