Notes
2026/05/14

Static checks that are almost trivially simple

Here's a static check that is so useful that almost every modern programming language implements it: Statically checking that a variable name corresponds to a bound variable, in other words, statically checked lexical scope. If I use baz in my code but there's no lexically bound variable with that name, I get an error, at compile time. This check is so trivial that we wouldn't think of it as a type system (the other big class of static checks that are used in practice) and it's useful enough for the 99% case that we happily give up the little bit of 1% extra dynamism that we would get without the restriction that all variables need to be resolved statically.

Are there other examples of checks that fall into this category? Checks that are so simple and minimally restrictive that we don't even think of them as resembling heavy machinery such as type systems?

Most checks and safety mechanisms in programming languages seem to either fully embrace dynamism and defer checking until runtime, or bring in a relatively complex type system (involving some form of parametric polymorphism, in other words generics). Type systems seem to turn into siren calls of increasing complexity, because adding just a little bit more complexity can make a type system less restrictive and nicer to use, which pushes type systems to become more complex over time. (Zig seems to be a notable example of a language that circumvents the need for generics by using comptime evaluation instead.)

Are there useful checks that can be implemented statically? Maybe even using a trivially simple type system, without leading to a restrictive user experience that would require a much more complex type system to solve. Here are two such proposals:

Function arity checks

What if (in addition to checking that a function name is lexically bound) we also checked that a function is called with the right number of arguments? This can be done with just about the simplest type system out there: simply typed lambda calculus (STLC) with a single base type *. A value gets type *, a function that takes one value and returns a value gets the type * -> *, a function that takes two values and returns a value the type * -> * -> *, a higher order map function might get the type (* -> *) -> *, etc.

Simple enough to implement, but the classic problem is that we might want to treat functions as first class values and for example store functions in a list. But if our value type is * and lists get this type, we have no way to express that a list contains functions. (This is then where generics usually enter the picture.) But what if we simply accept that functions aren't first class values? This does not mean that we lose the ability to return functions as return values or pass them to higher order functions as arguments, we “only” lose the ability to store functions in data structures.

Perhaps that's an acceptable and even useful property for some languages? In return, we get a strict separation between data and code, because we know that nothing of type * will ever contain functions, nor will anything with a function type ever be a plain value that cannot be called. This is actually quite useful:

Knowing the arity of our functions statically allows us to emit much more efficient code, especially for closures and partial evaluation. Efficiently supporting partial evaluation in a bytecode VM is surprisingly non-trivial and requires techniques like the one implemented in the ZINC VM.
Knowing that values of type * never contain functions makes it a lot easier to pretty print values, because we can just statically reject programs that would print anything with a function type and thus never need to worry about how to print functions, let alone closures.
STLC naturally forbids implicit recursion, because it is not possible to express the type of a recursive function without adding an explicit fixed point combinator to the language. This is useful if we want to guarantee that recursion only happens whenever a fixed point combinator is involved, which makes optimizations easier.

Static effect handlers

Here's another example of a useful static check that can be implemented with a trivially simple type system: Statically checking that effects are always called in the context of their corresponding effect handlers. A bit more concretely, let's say we call an effect such as read-file!("file.txt"). Instead of looking for a handler that implements the read-file! effect at runtime, we can use a STLC-like type system to track effect handlers in addition to function arguments.

In terms of implementation, function arguments and effect handlers are resolved in the same way, because we statically know where on the stack the function argument or effect handler is. But from the perspective of someone using the language, effect handlers are implicit function arguments, because a function deep down in the call stack can call an effect that is only handled much higher up in the stack, without any of the intermediate callers having to know about it.

(Does that also mean that any lexically unbound variable could just be treated as an effect? Which then needs to be statically resolved by one of the callers? In contrast to normal functions, effect handlers can capture a continuation and resume or skip execution of the callsite that invoked the effect, but perhaps this could be extended to all functions? So that the only difference between functions and effects is that effect handlers might not resume in a function-like way?)

Just like for function arity, statically resolving effect handlers has implications for passing and returning effects as values: Unless we want to bring in a much more complicated type system, effects cannot be stored as a value that would cause the effect to “lose” its effect signature, which effectively means that in a STLC-like type system we could pass effects directly as function arguments and return them as the result of a function, but not e.g. store them in a list. Maybe that's an acceptable tradeoff for a language?

Multiple return values

One interesting aspect of using STLC for function arity / effect signatures is that there's an asymmetry between passing functions/effects as arguments and returning them from a function: While we can pass multiple functions/effects with different types, we can only return a single one, because a function has a single return type. In functions with tuple types that's not a big deal, because we can just return a tuple of functions/effects, but this is not possible in STLC (without introducing extra tuple types), because a tuple is a value and would thus have the base type *, which doesn't allow us to store a function/effect inside of it.

(There's a slightly more subtle asymmetry between passing and returning values even in languages with more sophisticated type systems: Returning multiple values as a tuple is still returning a single value, which means that the tuple needs to be unpacked before it can be used as multiple function arguments, unless the function expects a single tuple as an argument. But in a ML/Haskell-like language with partial application, there is a difference between passing multiple values through currying vs. through a single tuple. The latter is symmetric to returning a single tuple value, the former isn't.)

One way to restore this symmetry might be to disallow implicit currying, so that calling a function with the wrong number of arguments is a type error. A function f with multiple arguments would then have a signature of (a, b, c) -> d, where (a, b, c) is not a tuple type (because all values would still have the single base type *) but just a more convenient way to express that the function f expects 3 arguments on the stack. We could then introduce a way to return multiple return values on the stack, so that a function g with type (x, y) -> (a, b, c) could be chained with f as f(g(x, y)) (potentially using a different syntax that makes it clear that g(x, y) supplies all three of f's arguments).

Escape hatches: dynamic checks and unsafety

Strictly separating our types into values and functions/effects can be very restrictive. After all, we lose the ability to store functions/effects in a list or any other data structure, which is a pretty fundamental limitation. Can we work around this limitation while keeping most of the benefits of our static checks?

One option is to “throw away” the type information of functions/effects when we want to store them in data structures and to cast them back to functions/effects when we extract them from a data structure and want to call them. We could either make * a supertype of values and functions/effects or introduce a new type such as Any or Dyn that can hold both values and functions/effects. The drawback of using a single type * for both values and functions/effects is that we would lose the ability to ensure that we never pass functions through IO boundaries / pretty print them / compare them for equality, so introducing a separate type that can hold arbitrary values and functions/effects seems preferable.

Once we have a separate type such as Any or Dyn, how do we convert between it to/from our value/function/effect types? We could either require dynamic runtime checks for every cast or just add a bit of unsafety to the language and silently assume that casts are correct. Dynamic checks probably make more sense for the surface language, whereas supporting unsafe casts could make sense for the underlying VM, so that we can compile more sophisticated type systems to bytecode without paying the performance cost of inserting dynamic checks.

A tower of type systems

Adding unsafety as an escape hatch raises another interesting question: Can a trivial type system save us from the siren call of adding a more sophisticated type system, while also making it possible to add more complex type system later, ideally only for those parts of the code that would benefit from it? Most languages treat type systems as all or nothing: You commit to a type system early in the design process and then you're stuck with it, for better or worse.

Could we instead implement one or more type systems as a tower of abstractions, so that specific parts of a program can opt into a more sophisticated type system while the rest of the program relies on much simpler static checks?

This is more of a question than an answer, but perhaps staged programming could be used to implement the static checks described here as a form of metaprogramming, leaving open the possibility that more sophisticated stages could then later be added as preprocessing steps? All stages would then rely on a bytecode with some support for unsafety, so that invariants already proven by a type system don't need to be re-checked dynamically at runtime.