Concurrent Compaction in Automerge Repo
November 17, 2024
Introduction
automerge-repo is a library for integrating automerge with pluggable storage and networking providers. One interesting challenge for storage is that Automerge documents have to be regularly compacted (losslessly), otherwise they take up a lot of storage space, but we wanted to have a very minimal storage interface so that it is simple to write a new storage plugin. In particular, we wanted it to be possible to have multiple concurrent writers using the same storage backend without requiring that the user implement transactions or other concurrency controls. Implementing compaction without requiring concurrency control is a bit fiddly and I’ve described our approach a few times in the Automerge discord, so I’m writing it up here to have something to point people at.
The Problem
We’ll start with an abstract description of this problem and later we’ll talk about some Automerge specifics. Imagine that our application is appending events to a log. Every event is immediately written to disk. The storage interface is a simple key/value store where the keys are strings and the values are byte arrays. To write an event to disk we’ll encode the event into a byte array somehow (maybe we just serialize it to JSON) and then hash that encoded byte array to get a key to use for the event.
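To make that concrete, here is a minimal sketch of what such a storage interface and the event write path might look like. The names (KeyValueStore, writeEvent, keysWithPrefix) are illustrative, not automerge-repo's actual API:

```typescript
// A deliberately minimal key/value storage interface: string keys, byte-array values.
interface KeyValueStore {
  save(key: string, value: Uint8Array): Promise<void>
  load(key: string): Promise<Uint8Array | undefined>
  remove(key: string): Promise<void>
  // List every key that starts with the given prefix.
  keysWithPrefix(prefix: string): Promise<string[]>
}

// Write an event: encode it, hash the encoded bytes, and use the hash as the key.
async function writeEvent(storage: KeyValueStore, event: unknown): Promise<string> {
  const encoded = new TextEncoder().encode(JSON.stringify(event))
  const digest = await crypto.subtle.digest("SHA-256", encoded)
  const key = Array.from(new Uint8Array(digest))
    .map((b) => b.toString(16).padStart(2, "0"))
    .join("")
  await storage.save(key, encoded)
  return key
}
```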

There can be multiple instances of the application writing to the same storage at the same time, so things can actually look a bit like this:

Here the visual offset of the events and the event names suggest that these events are ordered somehow. The storage interface is just a key/value store though, so we have to encode the ordering into the events themselves. There are lots of ways to achieve this; for this example let’s assume all the applications have access to the same clock and they write the timestamp of the clock into the bytes of the event.
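Building on the sketch above, an event might carry that timestamp as a field, and readers would sort by it. Again, the shapes and names here are made up for illustration:

```typescript
interface LogEvent {
  timestamp: number // shared wall-clock time, used to order events on read
  payload: string
}

// Stamp the event with the clock before writing it.
async function appendEvent(storage: KeyValueStore, payload: string): Promise<string> {
  const event: LogEvent = { timestamp: Date.now(), payload }
  return writeEvent(storage, event)
}

// Reading the log means loading every key and sorting by the embedded timestamp.
async function readLog(storage: KeyValueStore): Promise<LogEvent[]> {
  const events: LogEvent[] = []
  for (const key of await storage.keysWithPrefix("")) {
    const bytes = await storage.load(key)
    if (bytes) events.push(JSON.parse(new TextDecoder().decode(bytes)) as LogEvent)
  }
  return events.sort((a, b) => a.timestamp - b.timestamp)
}
```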
Every so often the application compacts the log. The simplest way to describe how to do this is that we read all the events off of disk, perform the compression, write the compressed log back to disk and delete all the original events. What key should we write the compressed log to? Let’s say "compressed" for now.
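A sketch of this naive compaction, reusing the hypothetical interface from above (compressLog stands in for whatever lossless compaction the application performs):

```typescript
// WARNING: as described below, this version can lose data with concurrent writers.
async function compactNaive(
  storage: KeyValueStore,
  compressLog: (events: LogEvent[]) => Uint8Array
): Promise<void> {
  // Read every individual event currently on disk
  // (a fuller version would also fold in any existing "compressed" value).
  const eventKeys = (await storage.keysWithPrefix("")).filter((k) => k !== "compressed")
  const events: LogEvent[] = []
  for (const key of eventKeys) {
    const bytes = await storage.load(key)
    if (bytes) events.push(JSON.parse(new TextDecoder().decode(bytes)) as LogEvent)
  }
  // Write the compacted log to the single well-known key...
  await storage.save("compressed", compressLog(events))
  // ...and delete the originals.
  for (const key of eventKeys) {
    await storage.remove(key)
  }
}
```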

To read the log, an application first checks if there is a compressed key and reads that, and then reads any remaining keys. Unfortunately this scheme will have problems in the case of concurrent writers. Consider this situation:

Here there are two instances of the application which have loaded a shared log - events 1 through 3. Each instance then generates a new event and decides to compact the log. To compact the log each instance reads everything which is currently on disk and then starts compacting it. This means that the events each instance will see are the following:
Instance 1 | Instance 2
---|---
1,2,3,4 | 1,2,3,4,5
So far so good. The problem is that we have no guarantees about which order writes to storage will complete in. If instance 1 writes first and then instance 2 then everything is fine, but if instance 2 finishes first and then instance 1 then the compressed log on disk will be missing event 5.
The Solution
The basic problem here is that writing to the "compressed" key deletes whatever was there before, and because we don’t coordinate around writes to a key it’s always possible that a key has changed since we last loaded it. There’s one easy way to avoid this problem: give every write a unique name. We actually already do this with the individual events, which we write using their hash; we can extend this same principle to writing the compressed log.
Instead of writing the compressed log to the "compressed" key we’ll write it to "compressed/<hash>", where the hash is the hash of the compressed log. It’s possible that two nodes will independently compress the same events and write to the same key, but that’s fine as it will contain the same data. This means our read and compress operations now become:
- When we are reading from the log we’ll make sure to read every key which begins with compressed/. We will need to have some way of discarding duplicate entries from the log - we could give each event a unique ID for example.
- When we are compressing the log we read all the individual events off of disk as well as any compressed/<hash> keys, then write a new compressed/<hash> key with the new compressed log in it. Finally we delete all the <hash> keys we loaded and also any compressed/<hash> keys which we loaded.
Altogether what this means is that occasionally we will read some redundant data, but we never lose data and we don’t have to coordinate between processes using the same storage.
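Putting the two steps above together, a content-addressed compaction might look like the following sketch (still using the hypothetical KeyValueStore; compressLog is assumed to accept both raw events and previously compressed chunks):

```typescript
// Hash some bytes to get a content-derived key suffix.
async function contentHash(bytes: Uint8Array): Promise<string> {
  const digest = await crypto.subtle.digest("SHA-256", bytes)
  return Array.from(new Uint8Array(digest))
    .map((b) => b.toString(16).padStart(2, "0"))
    .join("")
}

async function compactSafe(
  storage: KeyValueStore,
  compressLog: (chunks: Uint8Array[]) => Uint8Array
): Promise<void> {
  // Read everything we can currently see: individual events and compressed/<hash> chunks.
  const loadedKeys = await storage.keysWithPrefix("")
  const chunks: Uint8Array[] = []
  for (const key of loadedKeys) {
    const bytes = await storage.load(key)
    if (bytes) chunks.push(bytes)
  }

  // Write the new compressed log under a key derived from its own content.
  const compressed = compressLog(chunks)
  await storage.save(`compressed/${await contentHash(compressed)}`, compressed)

  // Delete only the keys we actually loaded; anything a concurrent writer
  // has added in the meantime is left untouched and survives for the next pass.
  for (const key of loadedKeys) {
    await storage.remove(key)
  }
}
```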
Optimising Loads
In cases where the application already has the log in memory when it produces new events (as is the case in Automerge), the scheme described above can lead to a lot of redundant reads. Every time we compress we read everything off of disk, even though we probably already have most of the data in memory. We can avoid this problem by having the application keep track of what keys it loaded when it first loaded the log - let’s call this the “live keyset”. Then, every time the application writes a new event to disk it adds that event’s key to its live keyset, and every time the application compresses the log it adds the new compressed log key to its live keyset and removes all of the previous keys. Some care must be taken to make sure the live keyset is not updated until after storage operations are complete (e.g. don’t add a key to the live keyset until the write operation is complete).
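A sketch of that bookkeeping, with the live keyset only updated after each storage operation resolves (the class and method names are illustrative, and contentHash and KeyValueStore come from the earlier sketches):

```typescript
class CompactingLog {
  // The log as this instance already holds it in memory.
  private events: LogEvent[] = []
  // The storage keys the in-memory log was loaded from or written to.
  private liveKeys = new Set<string>()

  constructor(private storage: KeyValueStore) {}

  async append(event: LogEvent): Promise<void> {
    const encoded = new TextEncoder().encode(JSON.stringify(event))
    const key = await contentHash(encoded)
    await this.storage.save(key, encoded)
    // Only update in-memory state once the write has completed.
    this.events.push(event)
    this.liveKeys.add(key)
  }

  async compact(compressLog: (events: LogEvent[]) => Uint8Array): Promise<void> {
    // No re-reading from disk: compact the log we already hold in memory.
    const keysToReplace = [...this.liveKeys]
    const compressed = compressLog(this.events)
    const newKey = `compressed/${await contentHash(compressed)}`
    await this.storage.save(newKey, compressed)

    // Update the keyset only after the write has succeeded, then delete the old keys.
    this.liveKeys.add(newKey)
    for (const key of keysToReplace) {
      await this.storage.remove(key)
      this.liveKeys.delete(key)
    }
  }
}
```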
Automerge Specifics
For automerge-repo we implement this scheme as follows:
- Every time we load a document we keep track of which storage keys we loaded it from
- Every time the user makes a change to a document (using DocHandle.change) we obtain any new changes using Automerge.getChangesSince and we write them to <document ID>/incremental/<change hash> and add the key to the live keyset
- Whenever the total size of the incremental changes on disk is larger than the size of the compressed document we compress the document, then we save the compressed document to <document ID>/snapshots/<hash of document heads> and remove all the previous keys from the live keyset (after deleting them)
Note that we use the hash of the document heads instead of the hash of the compressed file as the key for the compressed log. This is because the Automerge compression format uses DEFLATE internally, which is non-deterministic, and so it’s possible that writes with the same data would produce different encodings. Happily, the heads of an automerge document uniquely identify the changes in the document, so we can use them as a stable identifier.
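To illustrate, deriving the snapshot key from the heads might look something like the sketch below. This is not automerge-repo's actual code: Automerge.getHeads and Automerge.save are real calls, but the way the heads are hashed and combined into a key here is an assumption, and KeyValueStore is the hypothetical interface from earlier.

```typescript
import * as Automerge from "@automerge/automerge"

// Derive a stable snapshot key from the document heads rather than from the
// (non-deterministic) compressed bytes.
async function snapshotKey(documentId: string, doc: Automerge.Doc<unknown>): Promise<string> {
  // The heads uniquely identify the set of changes in the document.
  const heads = Automerge.getHeads(doc)
  const digest = await crypto.subtle.digest("SHA-256", new TextEncoder().encode(heads.join(",")))
  const headsHash = Array.from(new Uint8Array(digest))
    .map((b) => b.toString(16).padStart(2, "0"))
    .join("")
  return `${documentId}/snapshots/${headsHash}`
}

async function saveSnapshot(
  storage: KeyValueStore,
  documentId: string,
  doc: Automerge.Doc<unknown>
): Promise<void> {
  // Automerge.save produces the compressed document; its bytes are not guaranteed
  // to be identical across runs, but the key above does not depend on them.
  const snapshot = Automerge.save(doc)
  await storage.save(await snapshotKey(documentId, doc), snapshot)
}
```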