Notes
2026/05/30
What if a web inspector (think browser dev tools) kept track of the provenance of any UI element, so that you could jump straight to the code that is responsible for drawing that part of the UI? Modern web inspectors make the rendered UI of a web page inspectable by keeping track of the HTML structure of a document (and keeping it associated with the rendered visual output), but they do not track how and where the HTML structure originated. This puts a limit on how malleable the resulting web pages are, even if you have access to the source code: Unless the visual component that you might want to change corresponds directly to a part of the code that is easy to search for (e.g. if it contains static text that you can grep for), it can be hard to know which part of the code is responsible for a particular part of the UI.
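To make the idea of provenance a bit more concrete, here is a minimal sketch in TypeScript. Everything in it is hypothetical: `SourceRef`, `UINode`, `el`, and `provenance` are names made up for illustration, not part of any real inspector or framework, and the sketch assumes that a compiler or build step could inject the `origin` argument automatically instead of having it written by hand.

```typescript
// Hypothetical types: none of these names come from a real inspector API.
interface SourceRef {
  file: string;
  line: number;
}

interface UINode {
  tag: string;
  text: string;
  origin: SourceRef; // where in the code this node was created
  children: UINode[];
}

// Element factory that forces every node to carry its origin.
// Assumption: a build step would normally fill in `origin` for us.
function el(tag: string, origin: SourceRef, children: UINode[] = [], text = ""): UINode {
  return { tag, text, origin, children };
}

// Walk from the root to the node at `path` (a list of child indices)
// and collect the origin of every node along the way.
function provenance(root: UINode, path: number[]): SourceRef[] {
  const trail: SourceRef[] = [root.origin];
  let node = root;
  for (const index of path) {
    const child = node.children[index];
    if (child === undefined) throw new Error(`no child at index ${index}`);
    trail.push(child.origin);
    node = child;
  }
  return trail;
}

const page = el("main", { file: "app.ts", line: 10 }, [
  el("ul", { file: "list.ts", line: 4 }, [
    el("li", { file: "list.ts", line: 7 }, [], "dynamic item"),
  ]),
]);

// The list item has no static text to grep for, but its origin is still known.
const trail = provenance(page, [0, 0]);
```

An inspector built on something like this could offer "jump to source" for the selected element and for each of its ancestors, since `trail` holds one code location per level of the tree.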
Could we extend the idea of a web inspector to all the code that renders the UI? Traditionally, code and UI are two separate layers. Code is the input, UI the rendered output. Going from code to UI is a one-way street and there is no direct association between a part of the UI and the code that produced it. There are good reasons for that, most importantly flexibility: For any particular pixel on the screen there might be lots of different parts of the code that influence the final output. Think of a complex 2D or 3D renderer with support for different colors, text, shading, and other visual effects. The final color might be determined by completely different and mostly unrelated parts of the code. How could we jump from a UI “element” to the code, when there's neither a stable notion of what an “element” even is nor a single piece of code responsible for it?
A web inspector restricts this flexibility by imposing a grammar onto how UIs can be built: HTML is a shared set of structural rules that limit how web pages can be built, but in return also guarantee that we can all inspect them in terms of their shared structure. Sidestepping this structure, e.g. by rendering directly to a canvas, provides more flexibility, but also makes it harder to inspect the resulting UI (and makes it less accessible out of the box).
Imposing a grammar also (partially) solves the problem of knowing what is being pointed at. Let's say I hover over a text box and want to inspect the underlying UI element. What am I pointing at? Do I point at the entire paragraph? A single text span? The color of the text? Or perhaps even the parent container that positions the text box? There are ways to disambiguate the selection somewhat, for example by using a box selection to select all of the elements that fall within a selected rectangle, but any disambiguation fundamentally relies on a grammar that determines the concepts exposed by the web inspector.
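The "what am I pointing at?" question can be sketched as a hit test that returns every candidate under the pointer instead of a single answer. This is only an illustration under simplifying assumptions: `Box` and `candidatesAt` are hypothetical names, boxes are axis-aligned rectangles, and sibling boxes are assumed not to overlap (the first matching child wins).

```typescript
// Hypothetical layout node. `kind` is a concept defined by the grammar,
// e.g. "container", "paragraph", or "span".
interface Box {
  kind: string;
  x: number;
  y: number;
  width: number;
  height: number;
  children: Box[];
}

function contains(b: Box, px: number, py: number): boolean {
  return px >= b.x && px < b.x + b.width && py >= b.y && py < b.y + b.height;
}

// Returns every box under the point, innermost first.
// Assumption: siblings do not overlap, so the first matching child wins.
function candidatesAt(root: Box, px: number, py: number): Box[] {
  if (!contains(root, px, py)) return [];
  for (const child of root.children) {
    const hit = candidatesAt(child, px, py);
    if (hit.length > 0) return [...hit, root];
  }
  return [root];
}

const layout: Box = {
  kind: "container", x: 0, y: 0, width: 400, height: 200,
  children: [{
    kind: "paragraph", x: 20, y: 20, width: 360, height: 60,
    children: [{ kind: "span", x: 20, y: 20, width: 80, height: 20, children: [] }],
  }],
};

// Pointing at (30, 25) is ambiguous: span, paragraph, and container all qualify.
const kinds = candidatesAt(layout, 30, 25).map((b) => b.kind);
```

The list of candidates only exists because the grammar names the levels. Without a shared notion of "span", "paragraph", and "container", there would be nothing for the inspector to enumerate.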
Being able to select and inspect “invisible” elements such as parent containers requires exposing these elements as parts of the UI. We usually don't think of e.g. flexbox divs as being selectable parts of a web page (in contrast to the content they contain), but a web inspector turns them into inspectable and interactable elements of the UI.
If we wanted to keep the final output of our UI associated with the underlying code, we would have to stop thinking of code as input and UI as output and instead expose code as part of the UI. That doesn't mean that code has to be immediately visible, but each UI “component” would need some limited form of interactivity that allows us to “unfold” the code that is attached to that component (and which can point to multiple different parts of the program).
A fully inspectable UI would thus become a form of hypertext, not just because it links together code and UI, but because UI “elements” can link back to multiple snippets of code.
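One way to picture an element that links back to multiple snippets is to record, per property, every code location that touched it. The sketch below assumes exactly that and nothing more: `TracedElement`, `set`, and `unfold` are hypothetical names, and the file names and line numbers are made up.

```typescript
interface SourceRef {
  file: string;
  line: number;
}

// Hypothetical element whose properties remember who set them.
class TracedElement {
  private props = new Map<string, { value: string; setBy: SourceRef[] }>();

  // Set a property and remember which code location did it.
  set(name: string, value: string, source: SourceRef): void {
    const entry = this.props.get(name);
    if (entry === undefined) {
      this.props.set(name, { value, setBy: [source] });
    } else {
      entry.value = value;
      entry.setBy.push(source);
    }
  }

  get(name: string): string | undefined {
    return this.props.get(name)?.value;
  }

  // "Unfold" a property: every code location that touched it, oldest first.
  unfold(name: string): SourceRef[] {
    return [...(this.props.get(name)?.setBy ?? [])];
  }
}

const button = new TracedElement();
button.set("color", "gray", { file: "theme.ts", line: 12 });
button.set("color", "red", { file: "validation.ts", line: 48 });
button.set("label", "Submit", { file: "form.ts", line: 5 });

// The final color is "red", but two unrelated places contributed to it.
const links = button.unfold("color").map((s) => `${s.file}:${s.line}`);
```

Each entry in `links` is effectively a hyperlink from the UI into the code, which is what makes the result read more like hypertext than like a one-to-one mapping.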
There is an interesting convergence of ideas here that deserves more exploration: Code and UI would remain linked, as hypertext, which also potentially makes it easier to model programs as composable functions instead of siloed and monolithic apps. And if UI components remain linked to the code that affected them, editing that code might feel more like an interactive REPL session than the traditional write-compile-run cycle.
Would such a paradigm feel more like a shell than an app? Would it require a new way of thinking about programming languages, by supporting UI elements as first-class citizens instead of just “dead” output? What's the equivalent of the Unix philosophy for UIs of the 21st century?
Write programs that do one thing and do it well.
Write programs to work together.
Write programs to handle text streams, because that is a universal interface.
Text streams are a universal interface because they are not just output but also just as easily input in a world of C programs and shell scripts. This is not true for graphical UIs (not even for most terminal UIs, because they are not text streams), for two reasons: C programs and shell scripts don't have great support for handling more complicated data types out of the box, and graphical UI components cannot easily be consumed and manipulated as input because they lack a universal interface.
This is also why simply using a programming language with more sophisticated typed data and a REPL doesn't solve the fundamental problem: Unless the language and REPL include strong guarantees about how the typed data is displayed and manipulated as UI, the only reliable universal interface will still be text.
Any attempt at applying the Unix philosophy to graphical UIs needs to provide a universal interface for UIs, by imposing a grammar on how UIs can be expressed.
Here's one last direction that is potentially worth exploring in this context, even if it could turn out to be largely unrelated: If the final UI output remains linked to the code where it originated, do we need to change code to affect the final UI, resulting in one-way data flow? Or can we directly manipulate the UI and have these changes “pushed back” into the code?
This is closely linked to the idea of modeling UIs as a constraint system. CSS is implicitly constraint-based but it doesn't expose a good way to deal with or even see conflicting constraints: It is easy to end up in a situation where a rule doesn't have any effect because it is overridden by, or incompatible with, another part of the CSS. What if our UI system explicitly supported the idea of constraints and constraint programming so that we could see conflicting constraints (in the UI inspector) and react to them in the code (by having the pressure of unresolved constraints “pushed back” into the code)? This is only a vague idea at this point, but potentially relevant in the context of imposing a grammar on graphical UIs.
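As a very rough sketch of what "seeing" a conflict could mean, assume each constraint is just an allowed numeric range for one property, tagged with the code location that declared it. This is far simpler than a real constraint solver and is not how CSS works. `Constraint`, `Conflict`, and `findConflicts` are hypothetical names.

```typescript
interface Constraint {
  property: string; // e.g. "width"
  min: number;
  max: number;
  source: string; // hypothetical "file:line" label of the declaring code
}

interface Conflict {
  property: string;
  sources: string[]; // the two declarations that cannot both hold
}

// Ranges on the same property conflict when their intersection is empty.
// For ranges it is enough to compare the strictest lower bound with the
// strictest upper bound.
function findConflicts(constraints: Constraint[]): Conflict[] {
  const byProperty = new Map<string, Constraint[]>();
  for (const c of constraints) {
    const group = byProperty.get(c.property) ?? [];
    group.push(c);
    byProperty.set(c.property, group);
  }

  const result: Conflict[] = [];
  byProperty.forEach((group, property) => {
    const lower = group.reduce((a, b) => (b.min > a.min ? b : a)); // strictest lower bound
    const upper = group.reduce((a, b) => (b.max < a.max ? b : a)); // strictest upper bound
    if (lower.min > upper.max) {
      result.push({ property, sources: [lower.source, upper.source] });
    }
  });
  return result;
}

const rules: Constraint[] = [
  { property: "width", min: 300, max: Infinity, source: "layout.ts:8" },
  { property: "width", min: 0, max: 200, source: "sidebar.ts:21" },
  { property: "height", min: 40, max: 80, source: "layout.ts:9" },
];

// "width" cannot be at least 300 and at most 200 at the same time.
const found = findConflicts(rules);
```

Instead of one rule silently winning, the conflict is reported together with both of its sources, which is the kind of information an inspector could surface and that could be "pushed back" to the code that declared the constraints.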