Notes
2026/06/26

Fuck the emoji police

Unicode works. Unicode solves problems that ASCII and other custom encodings couldn't solve. Unicode allows billions of people to communicate in their native languages and writing systems.

Unicode sucks. Unicode assimilates bespoke encodings into a single and universal standard. Unicode forces minority languages and writing systems to fight for inclusion in one global standard.

This has been tried before. It never ends well.

One Ring to rule them all, One Ring to find them, One Ring to bring them all, And in the darkness bind them.

— Some dude who really liked inventing languages and writing systems

Or, if you want to score some points with philosophy sophomores, you can also invoke some French guy to say the same thing.

Such consensus does violence to the heterogeneity of language games.

— Jean-François Lyotard

Unicode is philosophically rotten. It fundamentally misunderstands what language is: Not a single system that we could all agree on but instead a colorful variety of language games, expressed in various languages and writing systems.

What if twins become bored of staging five-minute plays and instead invent their own language and writing system? Do you think they are going to successfully lobby the UNICODE CONSORTIUM to consider their language next to Klingon? Yeah, I don't think so.

No Unicode != ASCII

What's the alternative? Going back to ASCII? No. Instead of going from a single (universal) encoding back to a single (specific) encoding, any alternative to Unicode that wants to grasp languages and writing systems in all of their variety needs to support a multitude of systems, even ones that are built ad-hoc and might exist for just a few minutes before falling into disuse.

So here's a thought experiment: What if you had a chat app that allowed you to pick and even invent a writing system on the fly? One where you could choose which characters, pictographs, images and memes to send and write? (After all, aren't memes just images that through iterability have become signs?)

This chat app would allow you to decide what your keyboard looks like. It could just be a single button that sends “Yo” (remember when that was worth millions of dollars?), it could be your typical QWERTY layout except that all of your keys show and send different cat memes, or it could be a mostly typical QWERTY layout with a few rearranged or additional keys for sending “stuff” that you send frequently, where “stuff” could be as simple as a single glyph or as complex as an entire sentence.

Making the keyboard malleable is the easy part. The hard part is supporting various writing systems without baking in any assumptions about how we write. Left to right? Right to left? Top to bottom? Boustrophedon? Spiraling?

Instead of looking like a typical chat app, it would look more like a shared canvas, and a chat would develop as a collaborative form of scrapchatting. Do you write the glyphs left to right or right to left? Up to you. Are there line breaks or just one endless line? Up to you.
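To make "up to you" a little more concrete, here is a minimal sketch in Python that treats a writing direction as a plain function from a glyph's index to a position. Everything in it is an assumption made for illustration: the function names, the `per_line` and `per_column` parameters, a grid where y grows downward, and an Archimedean spiral with an arbitrary step.

```python
import math

# Each direction maps the index n of a glyph (0, 1, 2, ...) to a position.
# Assumption: x grows to the right, y grows downward, one cell per glyph.

def left_to_right(n, per_line=10):
    row, col = divmod(n, per_line)
    return (col, row)

def right_to_left(n, per_line=10):
    row, col = divmod(n, per_line)
    return (per_line - 1 - col, row)

def top_to_bottom(n, per_column=10):
    # Fill a column downward, then start the next column to the right.
    col, row = divmod(n, per_column)
    return (col, row)

def boustrophedon(n, per_line=10):
    # Reverse the direction on every other line.
    row, col = divmod(n, per_line)
    if row % 2 == 1:
        col = per_line - 1 - col
    return (col, row)

def spiral(n, step=0.5, spacing=1.0):
    # Archimedean spiral: the radius grows linearly with the angle.
    angle = step * n
    radius = spacing * angle / (2 * math.pi)
    return (radius * math.cos(angle), radius * math.sin(angle))

def endless_line(n):
    # No line breaks at all.
    return (n, 0)
```

Each participant could pick one of these, or invent another one, without the app having to know anything about it beyond "index in, position out".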

But that's just the direction. How are different glyphs combined? Most Latin alphabets are relatively tame: You might run into a couple of ligatures, but that's mostly it. But what about something like the Arabic script, where a letter can have up to four different forms depending on how it's combined with preceding or following characters? Or Hangul, the Korean alphabet, where a combination of characters is arranged in a block of two or three characters, sometimes horizontally, sometimes vertically?
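As a rough illustration of the Hangul case, here is a simplified sketch of how two or three characters could be arranged inside one block. The function name, the box format, and the proportions are made up for this example, and it deliberately ignores compound vowels that wrap around the initial consonant.

```python
# Vowels drawn as a tall stroke sit to the right of the initial consonant,
# vowels drawn as a wide stroke sit below it. (Simplified: no compound vowels.)
VERTICAL_VOWELS = set("ㅏㅐㅑㅒㅓㅔㅕㅖㅣ")
HORIZONTAL_VOWELS = set("ㅗㅛㅜㅠㅡ")

def layout_block(initial, vowel, final=None):
    """Return (character, x, y, width, height) boxes inside a unit square."""
    # Assumption: a final consonant takes the bottom third of the block.
    top = 1.0 if final is None else 2 / 3
    if vowel in VERTICAL_VOWELS:
        boxes = [(initial, 0.0, 0.0, 0.5, top),
                 (vowel, 0.5, 0.0, 0.5, top)]
    elif vowel in HORIZONTAL_VOWELS:
        boxes = [(initial, 0.0, 0.0, 1.0, top / 2),
                 (vowel, 0.0, top / 2, 1.0, top / 2)]
    else:
        raise ValueError("vowel not covered by this sketch")
    if final is not None:
        boxes.append((final, 0.0, top, 1.0, 1.0 - top))
    return boxes
```

Calling `layout_block("ㄱ", "ㅏ")` puts the two characters side by side, `layout_block("ㄱ", "ㅗ")` stacks them, and a third argument adds a character along the bottom.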

Our hypothetical chat app would need to support specifying all these rules. Rules that determine how a sequence of keystrokes is turned into a layout of glyphs on the screen. We could hardcode these rules for a bunch of languages, but then we'd be right back at what Unicode is trying to do. What we need is a way to express the rules. A language of languages (well, writing systems) of sorts.

A (non-)language of languages

Before sketching out a “language” for the rules of various writing systems, let's add one more requirement: The language of rules shouldn't privilege an existing language. No English-derived programming language keywords, etc. Is this just making our lives harder to illustrate a philosophical point? Maybe.

What we need is a visual programming language. (The horror, the horror.) Since the self-imposed constraint of not relying on an existing language or writing system is already hard enough, let's at least make our lives a little easier by deciding what is fixed in our rule language:

Rules are only executed in response to a keystroke. They are functions that operate on structured input (like a list of previous keystrokes) and can access variables (like a variable storing a cursor position). The only effect that rules can have is placing glyphs at positions and removing existing glyphs (so that two keystrokes can show up as a single combined character). Since a chat app is collaborative, there is data that is read-only (the keystrokes of other participants, local variables set by other participants) and data that can be mutated (your own placed glyphs, shared variables for the cursor position).
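Here is one way these fixed parts could look, as a minimal sketch in Python rather than in the rule language itself. All names (`Placed`, `Canvas`, `ligature_rule`, `press`) are hypothetical, and so is the example rule, which collapses the keystrokes "a" and "e" into a single "ae" glyph.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Placed:
    owner: str        # participant who placed the glyph
    glyph: str        # reference to an image stored outside the app
    position: tuple   # (x, y) on the shared canvas

@dataclass
class Canvas:
    placed: list = field(default_factory=list)
    shared: dict = field(default_factory=dict)  # shared variables, e.g. the cursor

    def place(self, owner, glyph, position):
        self.placed.append(Placed(owner, glyph, position))

    def remove(self, owner, item):
        # Glyphs placed by other participants are read-only.
        if item.owner != owner:
            raise PermissionError("only your own glyphs can be removed")
        self.placed.remove(item)

def ligature_rule(me, keystrokes, canvas):
    """Hypothetical rule: 'a' followed by 'e' becomes one combined glyph."""
    key = keystrokes[-1]
    mine = [p for p in canvas.placed if p.owner == me]
    if keystrokes[-2:] == ("a", "e") and mine and mine[-1].glyph == "a":
        previous = mine[-1]
        canvas.remove(me, previous)                # take back the lone "a"
        canvas.place(me, "ae", previous.position)  # put the combined glyph there
        return
    cursor = canvas.shared.get("cursor", (0, 0))
    canvas.place(me, key, cursor)
    canvas.shared["cursor"] = (cursor[0] + 1, cursor[1])

def press(me, key, history, canvas, rule):
    # Rules run only in response to a keystroke.
    history.append(key)
    rule(me, tuple(history), canvas)

canvas, history = Canvas(), []
for key in "cae":
    press("ann", key, history, canvas, ligature_rule)
# canvas.placed now holds "c" at (0, 0) and "ae" at (1, 0)
```

The rule never touches anything but its own glyphs and the shared cursor, which is all the effect a rule is supposed to have.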

As data types, there are glyphs (images that are stored outside the app), 2D coordinates (for both absolute points and relative distances), and numbers (as angles for rotating glyphs, as scaling factors).
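Written down as a sketch in Python, the three data types could look like this. The class and field names are assumptions, as is the `Placement` record that bundles a glyph with a position, an angle, and a scaling factor.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Glyph:
    image_ref: str  # the image itself lives outside the app

@dataclass(frozen=True)
class Vec2:
    # Used both for absolute points and for relative distances.
    x: float
    y: float

    def __add__(self, other):
        return Vec2(self.x + other.x, self.y + other.y)

    def scaled(self, factor):
        return Vec2(self.x * factor, self.y * factor)

    def rotated(self, degrees):
        r = math.radians(degrees)
        c, s = math.cos(r), math.sin(r)
        return Vec2(self.x * c - self.y * s, self.x * s + self.y * c)

@dataclass(frozen=True)
class Placement:
    glyph: Glyph
    at: Vec2
    angle: float = 0.0   # plain number, read as degrees
    scale: float = 1.0   # plain number, read as a scaling factor
```

Moving a point by a relative distance is then just `Vec2(1, 2) + Vec2(3, 4)`.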

How are variables identified in the language if the goal is not to use labels in English (or any other fixed language) anywhere? We could either identify variables using a combination of glyphs freely positioned on the canvas (without any rules governing the placement or combination) or simply not use any names at all and instead draw lines from the value of a variable to where it's used in the rule.

How are literal values shown? Using decimal numbers? This might be acceptable (and is certainly more general than using English language identifiers for variables), but would privilege one specific writing system for numbers over other representations. What if we show the value of a variable visually, as follows:

This assumes that there are different types for scalar values like angles and scaling factors. Alternatively, if we just want a single integer number type, we could use tally marks, with numbers 1 to 4 being shown as unary marks.
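The tally mark variant is easy to sketch in Python. The function name is made up, and the code assumes that a group of five is drawn as four marks struck through by a fifth, so that only 1 to 4 show up as plain unary marks.

```python
def tally_groups(n):
    """Split a non-negative integer into tally groups.

    Each group is a number from 1 to 5: 1 to 4 are plain unary marks,
    5 is assumed to be four marks struck through by a fifth.
    """
    if n < 0:
        raise ValueError("tally marks cannot show negative numbers")
    full, rest = divmod(n, 5)
    return [5] * full + ([rest] if rest else [])
```

So 7 would be drawn as one full group followed by two single marks, and 0 as nothing at all.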

But maybe this is all going a bit too far? Maybe it would be acceptable to piggyback on ASCII for the rule language and start with a simple textual programming language to get the data types and semantics of the language right? Adding a visual programming language that is translated to that textual language could then be done in a second step.