Notes
2026/05/26
Continuation of A sketch for a minimal staged programming language.
Let's extend a simple programming language with the notion of stages, so that a function application can happen either at comptime or runtime:
t = v // variable
| (v, v, ..., v) => { t } // abstraction
| t@...@(t, t, ... t) // application
This is basically standard lambda calculus (with JS-like syntax), except that applications can have an annotation like @@ which marks them as happening at stage 2. We start by evaluating all applications of stage 0, then return the resulting bytecode as the output of stage 0, which might contain applications of later stages, such as f@(...), g@@(...), etc. After every stage, the stage annotations of all remaining function applications get decremented by the stage of the smallest remaining stage annotation in the resulting bytecode (so that the next smallest stage becomes stage 0), effectively eliminating one or more @ per function call.
Let's assume we also have some built-in functions and data types such as + and integers (where 2 +@ 2 evaluates 2 + 2 at stage 1), as well as let bindings to make our code a bit more readable.
Let's also assume that all function definitions are bound using let bindings and that a let binding is by default evaluated at stage 0 unless it is annotated, for example as =@ or =@@. Here's a full example:
twice = (f, x) => {
f@@(f@@(x))
}
add2 =@ (x) => {
x +@@ 2
}
add4 =@@ (x) => {
twice(add2, x)
}
add4@@(5)
Stage 0 of the above example would then evaluate to the following stage 1:
add2 = (x) => {
x +@ 2
}
add4 =@ (x) => {
add2@(add2@(x))
}
add4@(5)
Stage 1 would then evaluate to the following stage 2:
add4 = (x) => {
(x + 2) + 2
}
add4(5)
Stage 2 would then evaluate to the final output of 9.
This works, but implementing stages in such a way comes with a big disadvantage: A function application has to choose the exact stage it is evaluated at. But what if we want to fully apply the same function at different stages, for example reversing a list at comptime and another one at runtime?
Let's use a slightly more lenient approach where a stage annotation like f@@ marks the stage at which the function f is applied at the latest. If f is not fully resolved at stage 2, an error is thrown for f@@. But f can be applied earlier than at stage 2 if a function of stage 0 or 1 triggers a function call of f. This is useful because it allows us to now use the same function at multiple stages.
In the above example, the f@@(f@@(x)) call would then be immediately evaluated as part of twice(add2, x), which requires add2 to be bound at stage 0, so we need to change the let binding for add2 to use = instead of =@. Notice how there are only two stages because stage 1 is fully skipped:
// stage 0:
twice = (f, x) => {
f@@(f@@(x))
}
add2 = (x) => {
x +@@ 2
}
add4 =@@ (x) => {
twice(add2, x)
}
add4@@(5)
// stage 1 is skipped as there's no @(...), only @@(...)
// stage 2:
add4 = (x) => {
(x + 2) + 2
}
add4(5)
But why have an arbitrary number of stages at all? Isn't it enough to have a single comptime and a single runtime stage? Let's take a step back and consider why we might want to have different stages (and a split between comptime and runtime) at all. There are fundamentally two reasons:
If we only cared about the first of these two reasons, a split into a single comptime and a single runtime stage might be all we need. But once we add in side effects and data dependencies, two stages become overly restrictive. We might want to implement an import system as a comptime stage with network access, but this import system might rely on macros which we need to fully evaluate first for performance reasons, so we already need two comptime stages. If the imported code relies on staged evaluation, we might need to run further comptime stages.
What about multiple runtime stages? Having the ability to evaluate runtime code in stages gives us a limited form of just-in-time compilation: We could for example evaluate a regular expression and turn it into more performant code, even if it depends on user input at runtime, and then use the staged regular expression to match strings at a later runtime stage.
This raises the question of how the dependencies between stages should be specified. In the above examples, we assumed a numeric ordering, with every function application either belonging to a single fixed stage or being evaluated at that stage at the latest. But is that the right model in a world of side effects, where one stage might introduce other stages?
One possible alternative is to interpret stage annotations relative to the function they are defined in, so that an annotation marks a function application as being evaluated inside a function body before the function is called. This doesn't immediately solve the problem of establishing an order between multiple stages with complex data dependencies, but it might be a first step towards a dependency tree instead of a global linear order.
So far we have assumed that each run goes from a lower stage to a higher stage, so that we are guaranteed to come to an end after a finite number of stages. But what if a stage can't be successfully completed, for example because a network call is failing? We could of course retry that call in a loop, but that would block the entire stage at that point, when other parts of the stage might not depend on the network call at all. Is there value in letting an effect defer a stage, effectively staying on the same stage?
Staying on the same stage is not possible for regular function applications, because there is no way to specify that an application should remain on the same stage after that stage is evaluated when all we have are annotations such as f@(...). If f is evaluated at stage 1, it cannot contain function applications of stage 0 anymore (because they have already been evaluated) and a function application of stage 1 would be evaluated immediately as part of the current stage, while a function application of stage 2 would only be evaluated at the next stage. There's no way to insert the same stage again. But we can potentially work around that by giving effect handlers the ability to not just resume a continuation but also to defer it, which effectively just re-introduces a function application of the current stage without evaluating it.
This would give stages a way of recursing, by evaluating the same stage an arbitrary number of times, thus giving the evaluation across stages the same computational power as evaluation within a stage (where recursion is of course possible). Is that a good thing? Or would it be preferable to have the guarantee that once a stage is evaluated, we can move on to the next stage, guaranteeing that there's always a finite number of stages?
This is also related to the idea of implementing an import system as a comptime stage that can pull in more code (which can itself rely on various comptime stages). If we require a stage to move on to the next stage without deferring/re-evaluating itself or introducing new stages, the stage responsible for importing code has to execute all of the comptime evaluation of the imported code as part of a single stage.
Is there a difference in practice? Is it really so bad if we defer a stage by looping inside of the effect handler until the effect can be handled? Perhaps we only need different stages beyond one comptime and one runtime stage if different stages require different effect handlers and we want to make sure that different stages are being run with different capabilities. If we see effect handlers as capabilities, we might want to compile different stages with different effect handlers, so that we can for example guarantee that only some stages can make network calls.
But how do we specify that our own macro processing stage can access the file system (but not the network) whereas the import stage can access the network (but not the file system)? And do we still support multiple different runtime stages even if they have the exact same capabilities / effect handlers?
More questions than answers at this point...