Notes
2026/06/05

Staged programming vs. partial evaluation

How does staged programming (either using an arbitrary number of stages or comptime vs. runtime) compare to partial evaluation? Both evaluate some parts of the program earlier than what would normally happen for runtime call-by-value (or call-by-need) evaluation. This can be useful for macros, metaprogramming, unrolling/inlining of operations at comptime, just-in-time compilation, and other situations where a fragment of the program is specialized for a particular value while other values are only supplied later at runtime.

Implicit vs. explicit stages

There is one crucial difference though: Partial evaluation, in its most general form, would partially evaluate every function application that can be evaluated. In the presence of recursion (and Turing-completeness in general), this can quickly lead to infinite loops at comptime or at least to significant slowdown. Deciding what to partially evaluate and what to keep unevaluated is the hard part.

This is where staged programming shines: If we leave it to the programmer to manually annotate function calls as either comptime or runtime, we can just naively evaluate all comptime annotated calls. If this takes too long, well, just remove the comptime annotation and you get the normal runtime evaluation strategy.

So do we need multiple stages then or is a single comptime vs. runtime separation enough? That depends on what happens when a comptime call cannot be evaluated because the function cannot be resolved (yet). Partial evaluation naturally supports evaluating a program in multiple discrete steps, because further parts of the program will be evaluated as soon as we supply more arguments, while all parts of the program that still wait for a data dependency are kept unevaluated. If the comptime stage just errors out when it encounters an unresolved function that needs to be comptime evaluated, then staged programming ends up being less flexible than partial evaluation.

But that doesn't have to be the case. Instead of interpreting a comptime call as “evaluate this at comptime or raise an error” we could simply imagine a comptime call to mean “evaluate this at comptime if possible”. A comptime annotation would then just be a way to signify where partial evaluation can happen, not where it must happen.

Separate evaluation strategies

A language with these semantics would then just have two separate evaluation strategies: One is the partially-evaluated comptime strategy, where comptime calls are annotated as much as possible and runtime calls are left unevaluated, the other is the normal runtime evaluation, where comptime calls are treated just like runtime calls and all are evaluated using normal call-by-value (or call-by-need) evaluation.

One downside of this approach is that there's no way to explicitly raise an error if a comptime annotated call cannot be fully evaluated at comptime. This can be a problem if we have a really expensive call that we definitely want evaluated at comptime instead of runtime. One solution would be to check the result of comptime evaluation before we declare compilation done: If there are still comptime annotations, we raise an error. (But then what if there are some comptime calls that we really want to be evaluated only at comptime, and some that could be evaluated either at comptime or at runtime?)

Evaluating comptime annotations as soon as they can be resolved lets staged programming operate across an arbitrary number of (comptime) stages based on their data dependencies, just like partial evaluation in the general case, and in contrast to explicitly numbered stages that must evaluate a call at a particular stage or fail.

But are explicitly numbered stages really that different from partial evaluation based on whatever arguments are available? If we imagine a language with two separate evaluation modes, one for (partially) comptime evaluating a particular stage and one for runtime evaluating all function applications, we could decide how we want to handle the result of a comptime evaluation stage that still contains (unresolved) function applications of that stage after evaluating it: We could either raise an error, which guarantees that a stage is evaluated in one pass, or potentially evaluate that stage again at a later point in time when the data dependency can be fulfilled.

Data dependencies and deferred effects

How can data dependencies remain unfulfilled? For partial evaluation in general, we could imagine a function being partially applied to some but not all arguments. For entire programs this doesn't really make sense, but there is a related notion of depending on something that might only be available at runtime: Effects that are deferred until a later stage. This is different from using an effect handler to raise an error, because we would instead “return” from the effect handler with code that is to be executed at a later stage and continue the partial evaluation of the rest of the program in the meantime.

So maybe that's the simplest solution? An arbitrary number of numbered stages, with no order between them, which can be either executed using comptime partial evaluation (in which case a single stage is evaluated) or using runtime evaluation (in which case all stages are evaluated, but using normal call-by-value/need). The language itself wouldn't impose any restrictions on how and in which context these different evaluation modes are used and whether a comptime stage can have comptime calls “left over” after evaluating that stage or unresolved comptime calls will raise an error.