Wayland has good reasons to put the window manager in the display server

utcc.utoronto.ca/~cks (cks), 2026-03-18 02:26

I recently ran across Isaac Freund's "Separating the Wayland Compositor and Window Manager" (via), which is excellent news as far as I'm concerned. But in passing, it says:

Traditionally, Wayland compositors have taken on the role of the window manager as well, but this is not in fact a necessary step to solve the architectural problems with X11. Although, I do not know for sure why the original Wayland authors chose to combine the window manager and Wayland compositor, I assume it was simply the path of least resistance. [...]

Unfortunately, I believe that there are excellent reasons to put the window manager into the display server the way Wayland has, and the Wayland people (who were also X people) were quite familiar with them and how X has had problems over the years because of its split.

One large and more or less core problem is that event handling is deeply entwined with window management. As an example, consider this sequence of (input) events:

  1. Your mouse starts out over one window. You type some characters.
  2. You move your mouse over to a second window. You type some more characters.
  3. You click a mouse button without moving the mouse.
  4. You type more characters.

Your window manager is extremely involved in deciding where all of those input events go, and in whether the second window receives a mouse button click event in the third step. If the window manager is separate from whatever is handling input events, you have two bad options. Either some things trigger synchronous delays in further event handling, or sufficiently fast typeahead and actions race the window manager. If the window manager doesn't update where future events should go fast enough, some of your typing and other actions are misdirected to the wrong place because the window manager is lagging.
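To make the race concrete, here is a toy Python model of it. This is only a sketch and not how any real display server works: the `Event` class, the `route` function, and the `wm_lag` parameter are all names made up for illustration, and it assumes a simple focus-follows-mouse policy. A `wm_lag` of 0 stands in for a window manager embedded in the server; a larger value stands in for a separate window manager whose focus change only lands after the server has already delivered that many more events.

```python
from dataclasses import dataclass


@dataclass
class Event:
    kind: str    # "key" or "motion"
    value: str   # the character typed, or the window the pointer entered


def route(events, start_focus, wm_lag=0):
    """Return (window, character) pairs for every key event.

    A focus change caused by a motion event takes effect only after
    wm_lag further events have been delivered (0 = no lag).
    """
    focus = start_focus
    pending = []     # (index at which the change applies, new focus)
    delivered = []
    for i, ev in enumerate(events):
        # Apply any focus changes the window manager has finished by now.
        still_pending = []
        for due, window in pending:
            if due <= i:
                focus = window
            else:
                still_pending.append((due, window))
        pending = still_pending

        if ev.kind == "motion":
            pending.append((i + 1 + wm_lag, ev.value))
        elif ev.kind == "key":
            delivered.append((focus, ev.value))
    return delivered


# Type "ab" over window A, move to window B, then quickly type "cde".
events = [
    Event("key", "a"), Event("key", "b"),
    Event("motion", "B"),
    Event("key", "c"), Event("key", "d"), Event("key", "e"),
]

embedded = route(events, "A", wm_lag=0)   # c, d and e all reach B
lagging = route(events, "A", wm_lag=2)    # c and d are misdirected to A
```

In the lagging case the only ways to get "c" and "d" to window B are to hold back event delivery until the window manager answers (the synchronous delay) or to have the window manager answer faster than you type.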

Embedding the window manager in the display server is the simple and obvious approach to ensuring that the window manager can see and react to all events without lag, and can freely intercept and modify all events as it wishes without clients having to care. The window manager can even do this using extremely local knowledge if it wants. Do you want your window manager to have key bindings that only apply to browser windows, where the same keys are passed through to other programs? An embedded window manager can easily do that (let's assume it can reliably identify browser windows).
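As a rough illustration of the browser-only key binding idea, here is a minimal Python sketch. Nothing here is a real compositor API; `Window`, `BINDINGS`, `dispatch_key`, and the specific keys and actions are hypothetical, and the `app_id` field simply assumes the window manager can reliably identify what kind of program a window belongs to.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    title: str
    app_id: str   # assumed to reliably identify the application


# Hypothetical binding table: (app_id, key) -> window manager action.
# Only browser windows have entries, so other programs see these keys.
BINDINGS = {
    ("browser", "Ctrl+Q"): "ignore",
    ("browser", "Alt+Left"): "previous-workspace",
}


def dispatch_key(focused, key):
    """Decide who gets a key press for the currently focused window.

    Returns ("wm", action) if the window manager consumes the key, or
    ("client", key) if the key is passed through to the program.
    """
    action = BINDINGS.get((focused.app_id, key))
    if action is not None:
        return ("wm", action)
    return ("client", key)


browser = Window("Some web page", "browser")
terminal = Window("shell", "terminal")

in_browser = dispatch_key(browser, "Ctrl+Q")     # consumed by the WM
in_terminal = dispatch_key(terminal, "Ctrl+Q")   # passed to the program
```

Because the decision is made inside the thing that delivers events, the program never sees a key the window manager consumed, and there is no window in which the key can go to the wrong place.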

(An outdated example of how complicated you can make mouse button bindings, never mind keyboard bindings, is my mouse button bindings in fvwm.)

X has a collection of mechanisms that try to allow window managers to manage 'focus' (which window receives keyboard input), intercept (some) keys at a window manager level, and do other things that modify or intercept events. The whole system is complex, imperfect, and limited, and a variety of these mechanisms have weird side effects on the X events that regular programs receive; you can often see this with a program such as xev. Historically, not all X programs have coped gracefully with all of the interceptions that window managers like fvwm can do.

(X also has two input event systems, just to make life more complicated.)

X's mechanisms also impose limits on what they'll allow a window manager to do. One famous example is that in X, mouse scroll wheel events always go to the X window under the mouse cursor. Even if your window manager uses 'click (a window) to make it take input', mouse scroll wheel input is special and cannot be directed to a window this way. In Wayland, a full server has no such limitations; its window manager portion can direct all events, including mouse scroll wheels, to wherever it feels like.
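Here is a small Python sketch of what "direct scroll wheel events wherever it feels like" can mean under click-to-focus. It is a toy model with made-up names (`route_click_to_focus`, `scroll_to`, `pass_click`), not the behaviour of any particular compositor. Setting `scroll_to="pointer"` imitates the X behaviour described above, while `scroll_to="focused"` is a policy an embedded window manager is free to choose.

```python
def route_click_to_focus(events, start_focus, scroll_to="focused",
                         pass_click=False):
    """Route (kind, value) events under a click-to-focus policy.

    scroll_to:  "focused" sends wheel events to the focused window,
                "pointer" sends them to the window under the pointer.
    pass_click: whether the click that focuses a window is also
                delivered to that window.
    """
    focus = start_focus
    pointer = start_focus
    delivered = []
    for kind, value in events:
        if kind == "motion":
            pointer = value                  # moving does not change focus
        elif kind == "click":
            if pointer != focus:
                focus = pointer              # the click changes focus
                if pass_click:
                    delivered.append((focus, "click"))
            else:
                delivered.append((focus, "click"))
        elif kind == "key":
            delivered.append((focus, value))
        elif kind == "scroll":
            target = focus if scroll_to == "focused" else pointer
            delivered.append((target, value))
    return delivered


events = [
    ("key", "a"),
    ("motion", "B"),
    ("scroll", "down"),   # pointer is over B, but A still has focus
    ("click", ""),
    ("scroll", "down"),
    ("key", "b"),
]

wm_choice = route_click_to_focus(events, "A", scroll_to="focused")
x_like = route_click_to_focus(events, "A", scroll_to="pointer")
```

The two runs differ only in where the first scroll event goes: window A with the "focused" policy, window B with the "pointer" policy.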

(This elaborates on a Fediverse post of mine.)