KCL part 2: program memory

In this post I'll cover a fun and interesting problem (and two solutions) in the implementation of KCL. This post is part of a series of blog posts on KCL. Previously: part 0: intro and part1: units.

KCL is an interpreted language - it is interpreted directly rather than being compiled to a binary and executed. When running, the current state of the program (i.e., the runtime values of its variables) is stored in the program memory. In KCL, scoping and name resolution are implemented as part of program memory, so rather than being a flat address space, it is structured according to the dynamic scoping rules of the language. Within a scope, it is basically a map from variable names to values.

Previously, the KCL interpreter made a copy of all of the program memory whenever a function was defined and used that when the function was called (all functions capture their environment, i.e., they're closures, and when executed the environment must reflect values at the time of function definition, not function call). Obviously, that is not very efficient! In programs with lots of functions and significant memory use (KCL also does not have much in the way of garbage collection), this caused dramatic slow-down of execution. I fixed this in two ways inspired by work from databases, concurrency, etc. The first solution used an efficient snapshotting/copy-on-write mechanism, the second used an epoch counter and something similar to MVCC (multi-version concurrency control). These fixed the performance issues with specific problem cases and caused a general performance improvement. The second solution was a simplification of the first and also a step towards another potential performance and engineering improvement (references in program memory).

KCL is mostly immutable - once a variable is created, it cannot be modified. However, the values can change due to a feature called tags. Tags are a way to refer to parts of objects. Consider a simple cube which we might construct by drawing a square using four lines and then extruding the square into a cube. Tags allow referring to each edge of the square. After the square is extruded, the same tags can be used to refer to the surfaces of the cube extruded from each line. In a sense, tags are names, and these names are updated as the program runs (I won't go into the details here, but for the purpose of this blog post, we just need to know that some properties of values can change). (Tags predate my involvement in KCL and are one of the things which makes KCL challenging to work with. However, they seem to be pretty intuitive for CAD users to use, and are a powerful and essential feature. I think they could be done differently and in a way which would make the rest of KCL nicer, but they're so widely used that the change would be infeasibly huge).

Mutable program memory is not an issue - it's how a real computer's address space works. However, KCL also has closures (functions which capture they're environment), and for various reasons it is important that they always see the state of program memory when it is created, not state when the closure is executed. This is an intrinsic problem with closures, and there are a few solutions:

program memory is immutable (the solution used in many functional languages),
closures always see the latest version of data (acceptable in some languages),
use a borrow checker or similar to statically avoid the issue (used in Rust),
make a copy of the memory which the closure might use (kind of common, but requires some static or dynamic analysis of the program state).

KCL used the last one, but because it didn't have any static analysis, it copied the whole program memory. That might seem really bad, but because programs are small, it's not too much of a problem. However, effectively all functions are closures, so potentially this can happen a lot and in some edge cases this can lead to a blow-up in memory usage and terrible performance. It turned out that this was happening often enough to need addressing.

Background - how KCL program memory works

As a reminder, KCL is an interpreted language with no pre-execution analysis or transformation. Program memory is used to store and lookup variables (of course), and it is also responsible for handling scoping, e.g., in

a = 42

fn foo(a) {
    return a + 10
}

b = foo(a = 0)

Program memory stores values for a and is responsible for making sure that the right a is accessed at any given point in execution (a is not renamed or encoded before execution). Scoping in KCL is non-trivial, in the above example, if a were not overridden in foo, then the outer a would be accessible inside foo (put another way, all functions are closures which capture all of their environment).

The way this is implemented is that program memory consists of environments (aka, an 'env') which are lexical scopes, they keep a reference to their enclosing environment. The top-level environments are modules. These environments are distinct from the call stack, though they might overlap (as in the above example, where the function foo is called from the same scope where it is declared).

This scheme of environments handles scoping and name resolution, and if program memory were immutable, it would be all that is needed. Unfortunately, although KCL does not support explicit mutation (e.g., reassignment), values can be mutated as a side-effect of some operations (as described above). So when declaring a function, we need some way to capture its environment at the point in time of declaration. The old way of doing things is that when declaring a function, it would make a copy of all of the program memory. When calling a function a new environment is created which references this copy. Copying all of the program's memory is obviously inefficient.

I've only mentioned creating and reading values in memory. Values should be deleted when no longer needed. This is complicated by the fact that declaring a function implicitly references values it could access. KCL doesn't have a value-level garbage collector. It's not as critical as it would be in a general purpose language. The work described below added some environment-level GC, but I won't describe it in this post.

First iteration - snapshots

My first iteration of improvement was to use copy-on-write (CoW, nothing to do with the Rust library type) snapshots - KittyCAD/modeling-app#5273. As well as the core change to CoW snapshots, that PR includes a bunch of engineering improvements to program memory (having a single memory for the whole program, rather than one module, better encapsulation, remove some special cases, separate out most of values in memory (i.e., memory just stores opaque values), adds a little caching, etc.).

The key ideas of environments and the call stack (as well as the fundamental idea of using program memory for scoping, etc.) are preserved. What changes is what happens when a function is declared. Rather than copying all of program memory, KCL makes a snapshot. Making a snapshot has the same observational semantics as making a copy (specifically, changes to program memory are not observed when reading from a snapshot), but the implementation is optimised. Creating a snapshot is essentially free (precisely, it is O(1) rather than copying memory is O(n) in the size of the memory) and memory use is much, much smaller.

To show how snapshots work, I'll go through a simple example which isn't KCL. Later I'll show how this works with environments, stacks, and functions. Consider the following pseudocode example:

a = 1
b = 2
c = 3
snapshot(x)
d = 4
b = 5
snapshot(y)
delete(d)
c = 6

Here is what reading each variable from each snapshot looks like:

| snapshot | a | b | c | d |

|x | 1 | 2 | 3 | - |
|y | 1 | 5 | 3 | 4 |
|current | 1 | 5 | 6 | - |

The current state of program memory can always be read directly without considering snapshots. Modifying or creating a new variable writes the new value into the current program memory and must also touch the most recent snapshot (if one exists) to make a copy of the old value (this is the copy-on-write element of the scheme). Since reads are more common that writes, this is a good trade-off for most programs.

So, in the example, the initial assignments into a, b, and c simply write into the current env of program memory. The assignment into d writes 4 into the current env and a tombstone into snapshot x. The second assignment to b copies the old value (2) into x and writes 5 into the current env. Deleting d copies 4 into snapshot y and removes it from the current env. The last assignment to c writes 6 to the current env and 3 to y.

Snapshots are read-only, mutations only ever affect the current state of program memory (we'll see how exactly later). Reading from a snapshot means looking in that snapshot for a variable, then every newer snapshot until the current state is reached or the variable is found.

In the example, to read from y, c d are found in y, and a and b are found in the current memory. To read from x, b is found in x, c and d are found in y, and a is found in the current memory.

OK, so that's reading and writing with snapshots in a hypothetical simple memory system. In real KCL, we also have to deal with scoping in the form of environments. Environments are in a tree and since data anywhere in that tree can be modified, when we have a snapshot, we also have to treat parent environments as snapshots. So, snapshots are organised in a tree where the parent of a snapshot is another snapshot, which is the snapshot of the parent environment. When accessing a variable (read or write) we walk up the tree of environments in the same way as without snapshots, but this time we're using a snapshot for each environment. Note that this does not mean we have to combine traversing a list of snapshots for each environment and a list of snapshots representing enclosing environments. For writes, we always write into the current environment (and update the most recent snapshot of it), for reads, we only look at the most recent snapshots to find a variable.

You can imagine that this would lead to a lot of snapshots, since every function declaration leads to a snapshot created which triggers a snapshot creation in all ancestor environments. However, in practice the tree of environments is pretty shallow (c.f., the call stack). Furthermore, we can make an important optimisation that if we need a snapshot for an environment (either the current one or an ancestor) and one exists already, then it can be reused if it is empty. You can think of this as merging two snapshots if there is no difference in the data they capture. This is the common case for parent snapshots.

That's reading and writing, now how are snapshots actually used? Every time a function is declared, a snapshot is created. When a function is called a new environment is created and pushed on to the callstack (nothing changes to the callee environment). This new environment has the function's snapshot as it's parent which means when reading variables during the function execution, the state of the variables at the time of the declaration is observed. So a few final observations on this setup:

The current environment is always 'just an environment', not a snapshot; snapshots can only be ancestors of an environment.
The parent of a regular environment can be either another regular environment or a snapshot, the parent of a snapshot is always another snapshot.
Since we only write into the current environment (never a parent environment), we never need to write into a snapshot of memory and snapshots are observably immutable (though note that the implementation of snapshots means that the snapshot Rust object is not immutable).

Second iteration - epochs

The above works nicely, but it is kinda complex. My next iteration tried to simplify the above, be more efficient when writing, and be a step towards implementing references in KCL program memory (referencing a value currently makes a copy which is inefficient).

A key observation is that even though we need to do all this stuff because KCL values are mutable, mutation is rare and restricted to one class of data within one class of values. Therefore, handling mutation just where needed, rather than as a global property is more efficient and means less code has to take mutability into account.

This approach removed snapshots (leaving environments and call stacks). Program memory maintains a global counter which is incremented whenever a function is declared, and that counter is saved with the function in memory (rather than a reference to a snapshot). This simple scheme wouldn't be practical in a general purpose language due to concurrency and counter overflow, but in KCL it's fine. The counter is called an epoch counter, where an epoch is the most fine-grained period of time we need to distinguish between memory states.

Whenever a value is created in memory, we save the current value of the epoch counter with it. Then, looking up a variable means finding it in memory, but ignoring it if it was created more recently than the epoch we are 'looking in'. Values which can be modified must handle multiple versions internally (rather than this being handled universally by program memory). Again, since the common case is immutability, this is a good trade-off!

Mutation in KCL is limited to tagging of objects. Essentially rather than just keeping a list of tags, an object stores the epoch of writing for each tag and may have multiple different tags for different epochs. Finding the tags for an object at an epoch means searching the epochs and tags for the most recent tag which existed at the specified epoch.

I will have availability for Rust coaching or adoption from March 2026; from a single call to ongoing 2 days/week. I can help your team get things done, adopt Rust and use it more effectively, or to accurately evaluate Rust as a new technology.

If you're adopting Rust, I can help make that a success with advice, 1:1 or group mentoring, design and code review, or online support. Coaching.