Open questions

These aren’t missing features — they’re design problems where we don’t yet trust any answer, including our own. If you’ve faced one of these in your engine, we’d like to compare notes.

Compaction without losing history’s value

The event log is an asset — audit, provenance, future undo/diff features — until it’s a liability. What’s the right compaction contract? Snapshot + truncate loses blame; rolling windows lose old provenance; never compacting loses the disk. Yjs has the same question with different math.
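To make the snapshot + truncate trade-off concrete, here is a minimal sketch. It assumes a hypothetical event shape (`seq`, `author`, `path`, `value`) and a flat last-write-wins state; none of these names come from the engine, and the real log format may differ.

```typescript
// Hypothetical event shape, for illustration only.
interface LogEvent {
  seq: number;
  author: string;
  path: string;
  value: unknown;
}

interface Snapshot {
  upToSeq: number;
  state: Record<string, unknown>;
}

// Fold events into a flat path -> value state (last write wins, in log order).
function replay(base: Record<string, unknown>, events: LogEvent[]): Record<string, unknown> {
  const state: Record<string, unknown> = { ...base };
  for (const e of events) state[e.path] = e.value;
  return state;
}

// Snapshot + truncate: keep the state at `cutoff`, drop every event at or before it.
function compact(events: LogEvent[], cutoff: number): { snapshot: Snapshot; tail: LogEvent[] } {
  const head = events.filter((e) => e.seq <= cutoff);
  const tail = events.filter((e) => e.seq > cutoff);
  return { snapshot: { upToSeq: cutoff, state: replay({}, head) }, tail };
}

// Blame: who last wrote `path`? Undefined when the answer is not in the given events.
function blame(events: LogEvent[], path: string): string | undefined {
  let author: string | undefined;
  for (const e of events) if (e.path === path) author = e.author;
  return author;
}

const log: LogEvent[] = [
  { seq: 1, author: "ana", path: "title", value: "Draft" },
  { seq: 2, author: "ben", path: "body", value: "Hello" },
  { seq: 3, author: "ana", path: "body", value: "Hello, world" },
];

const { snapshot, tail } = compact(log, 2);
const stateAfter = replay(snapshot.state, tail); // same state as replaying the full log
const titleBlameBefore = blame(log, "title");    // "ana"
const titleBlameAfter = blame(tail, "title");    // undefined: the provenance was truncated
```

The state survives compaction intact; what disappears is the answer to "who wrote this?" for anything last touched before the cutoff. A contract that keeps blame would have to carry per-path provenance inside the snapshot, which is one of the options this question leaves open.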

A durable client queue replayed on reconnect is the obvious shape — but replaying guarded patches days later mostly means rejections, which pushes everything into session-style three-way resolution. Is offline-then-review (a reconnect producing a changeset to review rather than silent writes) the honest model for server-authoritative engines? We suspect yes, and haven’t built it.
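As a rough illustration of offline-then-review, the sketch below classifies each queued patch against current server state instead of applying or rejecting it. It assumes a guard is simply "the value the client saw when it made the edit"; `GuardedPatch`, `ReviewItem`, and `replayForReview` are hypothetical names, not engine API.

```typescript
// Hypothetical shapes, for illustration only.
interface GuardedPatch {
  path: string;
  expected: unknown; // guard: the value the client observed while offline
  next: unknown;     // the value the client wants to write
}

type ReviewItem =
  | { kind: "clean"; path: string; next: unknown }
  | { kind: "conflict"; path: string; base: unknown; theirs: unknown; mine: unknown };

// Build a changeset for review. Nothing is written to the server here.
// Simplification: patches are judged independently, so two queued patches on
// the same path are not chained.
function replayForReview(server: Record<string, unknown>, queue: GuardedPatch[]): ReviewItem[] {
  return queue.map((p): ReviewItem => {
    const current = server[p.path];
    // Primitive comparison only; structured values would need deep equality.
    if (current === p.expected) return { kind: "clean", path: p.path, next: p.next };
    return { kind: "conflict", path: p.path, base: p.expected, theirs: current, mine: p.next };
  });
}

const serverState: Record<string, unknown> = { title: "Q3 plan", owner: "ben" };
const offlineQueue: GuardedPatch[] = [
  { path: "title", expected: "Q3 plan", next: "Q3 plan (final)" },
  { path: "owner", expected: "ana", next: "cy" }, // the server moved on while offline
];

const review = replayForReview(serverState, offlineQueue);
// review[0] is "clean"; review[1] is a "conflict" carrying base, theirs, and mine
```

The conflict item already has the three-way shape (base, theirs, mine), which is why a stale guard naturally lands in session-style resolution rather than in a plain rejection.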

Per-document/per-field rules could live in schema documents — declarative, syncable, agent-readable, consistent with everything else. Or authorization could stay the host application’s job, keeping the engine small. The first is elegant and a big correctness surface; the second is honest and pushes complexity onto every app.

The three-way preview provides the data, but what do users actually want — field-level pick-and-choose? Restage-from-head and re-apply? Let the agent propose the merge and review that? The apps are the laboratory; no pattern has won yet.

A concrete sub-question: partial accept. Per-document accept already falls out of the model (each stage commits independently), but accepting some staged operations within one document while keeping the rest staged needs a selective drain plus a rebase of the remainder. Per-item triage of AI suggestions is proven UX elsewhere — is it worth the machinery, or is whole-stage commit the right simplicity?
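The machinery in question might look roughly like the sketch below. It assumes staged operations are ordered, guarded, single-path writes; `StagedOp`, `partialAccept`, and the `stale` flag are hypothetical and chosen only to show where the difficulty sits.

```typescript
// Hypothetical shapes, for illustration only.
interface StagedOp {
  id: string;
  path: string;
  expected: unknown; // guard against the state this op was staged on
  next: unknown;
}

interface RebasedOp extends StagedOp {
  stale: boolean; // true when an accepted op staged later wrote the same path
}

function partialAccept(
  head: Record<string, unknown>,
  staged: StagedOp[],
  accept: Set<string>,
): { head: Record<string, unknown>; remainder: RebasedOp[] } {
  const newHead: Record<string, unknown> = { ...head };
  const kept: StagedOp[] = [];
  const staleIds = new Set<string>();

  // Selective drain: commit accepted ops in stage order, keep the rest.
  for (const op of staged) {
    if (accept.has(op.id)) {
      newHead[op.path] = op.next;
      for (const k of kept) if (k.path === op.path) staleIds.add(k.id);
    } else {
      kept.push(op);
    }
  }

  // Rebase: re-point each kept guard at the new head plus earlier kept ops.
  const view: Record<string, unknown> = { ...newHead };
  const remainder = kept.map((op): RebasedOp => {
    const rebased = { ...op, expected: view[op.path], stale: staleIds.has(op.id) };
    view[op.path] = op.next;
    return rebased;
  });
  return { head: newHead, remainder };
}

const result = partialAccept(
  { title: "A", tags: "x" },
  [
    { id: "s1", path: "title", expected: "A", next: "B" },
    { id: "s2", path: "tags", expected: "x", next: "y" },
    { id: "s3", path: "title", expected: "B", next: "C" },
  ],
  new Set(["s2", "s3"]),
);
// result.head is { title: "C", tags: "y" }
// result.remainder holds s1, now guarded on "C" and flagged stale
```

The drain itself is short. The open part is what to do with `s1`: after rebasing it is a valid write that would move the title back from "C" to "B", which is probably not what anyone staging it intended. Whole-stage commit never has to answer that.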

A document’s event stream replays anywhere, but its schema version stamps are folder-local coordinates — so import only works into a fresh or cloned folder that carries the schema document’s history. Importing into a folder whose schema evolved independently is a remap problem: translate sequence stamps by matching reconstructed schema content. Nothing extra needs to be stored to make that possible — but the remap tool doesn’t exist, and whether it’s ever needed (versus “clone whole folders” being the only real use case) is open.
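A remap tool along these lines could be sketched as follows. This assumes each folder's schema history can be reconstructed as a list of (sequence stamp, canonicalized schema content) pairs; `SchemaVersion` and `buildRemap` are hypothetical names, and content is compared as plain strings.

```typescript
// Hypothetical: one entry per schema version, with content already canonicalized
// so that equal schemas produce equal strings.
interface SchemaVersion {
  seq: number;     // folder-local sequence stamp
  content: string; // reconstructed schema content
}

// Map each source stamp to the target stamp with identical schema content.
// Source versions the target never had map to undefined.
function buildRemap(source: SchemaVersion[], target: SchemaVersion[]): Map<number, number | undefined> {
  const targetByContent = new Map<string, number>();
  for (const v of target) {
    // If the target revisited the same content, keep the earliest stamp
    // (an arbitrary choice made for this sketch).
    if (!targetByContent.has(v.content)) targetByContent.set(v.content, v.seq);
  }
  const remap = new Map<number, number | undefined>();
  for (const v of source) remap.set(v.seq, targetByContent.get(v.content));
  return remap;
}

const sourceHistory: SchemaVersion[] = [
  { seq: 4, content: '{"title":"string"}' },
  { seq: 9, content: '{"due":"date","title":"string"}' },
  { seq: 12, content: '{"due":"date","priority":"number","title":"string"}' },
];
const targetHistory: SchemaVersion[] = [
  { seq: 2, content: '{"title":"string"}' },
  { seq: 7, content: '{"due":"date","title":"string"}' },
];

const remap = buildRemap(sourceHistory, targetHistory);
// 4 maps to 2, 9 maps to 7, 12 has no match in the target folder
```

The unmatched stamp is where independent evolution bites: events stamped with it cannot be imported until the target either adopts that schema version or the tool decides on a nearest-compatible substitute.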

The projection layer reshapes a fully synced document into a more ergonomic view — picks, renames, sorts, groupings. A subset of those operations inverts cleanly, which makes writable lenses possible: edit the view, and the edit translates back into path-disjoint physical operations that merge cleanly. The mechanics exist at a very alpha level — but we genuinely don’t know whether they’re worth the effort, what the right shape is, or whether projections should simply stay read-only and writes always go through the physical document. If you’ve built (or abandoned) a lens layer over a sync engine, this is the question we’d most like to compare notes on.
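For the cleanly invertible subset, a writable lens can be very small. The sketch below covers only pick and rename over flat fields; `LensField`, `project`, and `putBack` are hypothetical names, and sorts and groupings are deliberately left out because they do not invert this simply.

```typescript
// Hypothetical lens: pick some physical fields and expose them under view names.
interface LensField {
  view: string;     // key in the projected view
  physical: string; // path in the physical document
}

interface PhysicalOp {
  path: string;
  value: unknown;
}

// Forward direction: build the ergonomic view.
function project(doc: Record<string, unknown>, lens: LensField[]): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const f of lens) out[f.view] = doc[f.physical];
  return out;
}

// Backward direction: one physical op per changed view field.
// Each op touches exactly one physical path, so edits to different view
// fields never overlap.
function putBack(
  lens: LensField[],
  before: Record<string, unknown>,
  after: Record<string, unknown>,
): PhysicalOp[] {
  const ops: PhysicalOp[] = [];
  for (const f of lens) {
    if (after[f.view] !== before[f.view]) ops.push({ path: f.physical, value: after[f.view] });
  }
  return ops;
}

const contactLens: LensField[] = [
  { view: "name", physical: "full_name" },
  { view: "email", physical: "email_addr" },
];
const physicalDoc: Record<string, unknown> = {
  full_name: "Ana Li",
  email_addr: "ana@example.com",
  internal_score: 7, // not picked, so the view can neither see nor write it
};

const contactView = project(physicalDoc, contactLens);
const editedView = { ...contactView, email: "ana@new.example" };
const lensOps = putBack(contactLens, contactView, editedView);
// lensOps is [{ path: "email_addr", value: "ana@new.example" }]
```

Everything beyond this (edits through a sorted or grouped view, fields the view computes rather than copies) is where the effort question starts.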

One host document, one changeset is today’s model. Several named changesets per host (parallel agent proposals, draft vs. review lanes) is repeatedly tempting and repeatedly deferred — is the added model complexity worth it, or is “one changeset, host documents are cheap” the right discipline?

How far does “schemas as documents” stretch?

Migration happens per document on its next read, with the result written back — so cold documents accumulate pending migrations until someone loads them, and “is the migration done?” has no clean answer without a bulk sweep, which doesn’t exist. Relatedly: should a schema write warn when it will strand existing documents as flagged-invalid, and what do good repair workflows for those documents look like? Nobody has hit these walls yet, which is not the same as the walls not existing.
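The migrate-on-read behavior and its blind spot can be sketched like this. The store, the `schemaVersion` field, and the two example migrations are all hypothetical; the only thing taken from the description above is "migrate on next read, write the result back".

```typescript
// Hypothetical storage shape, for illustration only.
interface StoredDoc {
  id: string;
  schemaVersion: number;
  data: Record<string, unknown>;
}

type Migration = (data: Record<string, unknown>) => Record<string, unknown>;

// migrations[n] upgrades a document from version n to version n + 1.
const migrations: Migration[] = [
  (d) => ({ ...d, tags: [] }), // v0 -> v1: add an empty tags field
  (d) => {
    const { title, ...rest } = d; // v1 -> v2: rename title to heading
    return { ...rest, heading: title };
  },
];
const LATEST = migrations.length;

const store = new Map<string, StoredDoc>();

// Read path: apply pending migrations, then write the upgraded document back.
function readDoc(id: string): StoredDoc | undefined {
  const stored = store.get(id);
  if (stored === undefined || stored.schemaVersion >= LATEST) return stored;
  let data = stored.data;
  for (const migrate of migrations.slice(stored.schemaVersion)) data = migrate(data);
  const upgraded: StoredDoc = { id, schemaVersion: LATEST, data };
  store.set(id, upgraded);
  return upgraded;
}

// "Is the migration done?" can only be answered by touching every document.
function pendingCount(): number {
  let pending = 0;
  store.forEach((d) => {
    if (d.schemaVersion < LATEST) pending++;
  });
  return pending;
}

store.set("a", { id: "a", schemaVersion: 0, data: { title: "Hot doc" } });
store.set("b", { id: "b", schemaVersion: 1, data: { title: "Cold doc", tags: [] } });

const hot = readDoc("a");            // migrated to v2 and written back
const stillPending = pendingCount(); // 1: document "b" waits until someone reads it
```

`pendingCount` is effectively the missing bulk sweep in miniature: it has to scan the whole store, which is cheap in a sketch and potentially not in a real folder.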