Behind Backend Branching: Fingerprint + 3-Way Diff

17 May 2026 · 7 minutes
Junwen Feng


Engineer

Backend branching in InsForge

Giving developers a safe "staging environment" — but not just a database. The whole backend, with auth, storage, edge functions, scheduled jobs, email templates, and RLS policies all included.

Every InsForge project runs on its own EC2 instance, hosting a full Postgres, PostgREST, and the insforge server. When a developer wants to validate an RLS change, an auth provider switch, or a brand-new schema migration, today there's only one option: try it directly on prod. Backend branching is about replacing "try it on prod" with "try it on a branch, validate, then merge back to parent."

This sounds like the familiar "dev DB" concept in the database world. But what makes InsForge special is this: what it offers isn't just a database — it's the entire backend's "code": auth configs, storage buckets, edge function source, realtime channels, AI gateway, and so on. So "opening a branch" isn't just cloning a database; it's cloning "the whole backend configuration + data." And "merge" isn't just schema — it's a fine-grained, field-level merge across every system config table.

This post breaks down the design we eventually shipped.


1. The Status Quo: The Backend "Code" Lives in Postgres

Each project's Postgres holds two kinds of data: platform metadata and developer business data. Starting from 2.0, we refactored the schema to isolate them:

| Category | Schema | Owner | Who can modify |
| --- | --- | --- | --- |
| Platform-reserved | auth, storage, functions, email, ai, realtime, schedules, system, deployments, etc. | InsForge OSS | Via SDK / dashboard / CLI; raw SQL changes are unsupported |
| Developer business | public + user-defined | Application code | Free DDL + DML |

The key tables in the platform schemas are essentially InsForge backend's "source code":

text
auth.config / auth.oauth_configs / auth.custom_oauth_configs     ← auth strategy and providers
storage.buckets / storage.config                                  ← buckets and global storage config
functions.definitions                                             ← edge function source itself
email.config / email.templates                                    ← SMTP and email templates
ai.configs                                                        ← model gateway
realtime.config / realtime.channels                               ← realtime config
schedules.jobs                                                    ← cron jobs
system.migrations / system.custom_migrations / system.secrets     ← platform migrations and secrets

In other words: "backend code = DB rows". Switching an OAuth provider from Google to GitHub isn't editing a YAML file — it's changing a few rows in auth.oauth_configs.


2. Creating a Branch: The T0 "Fingerprint"

Branch creation itself isn't complex: spin up a new EC2 instance in the same region, pg_dump the parent, and pg_restore into it. The tricky part is a handful of environment-specific details:

  1. JWT_SECRET is reused from parent — so existing tokens remain valid on the branch; ANON_KEY / API_KEY are regenerated — so SDKs don't accidentally fire requests at the wrong environment.
  2. Storage is one-way visible: existing parent files are visible to the branch, but new files on the branch are invisible to the parent.
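
The one-way visibility rule can be sketched as a read-through overlay. This is a minimal illustration only: in-memory maps stand in for the real object stores, and createBranchStorage, put, and get are hypothetical names, not InsForge APIs.

```typescript
// Hypothetical sketch: Maps stand in for the parent and branch object stores.
function createBranchStorage(parentObjects: ReadonlyMap<string, string>) {
  const branchObjects = new Map<string, string>();
  return {
    // Writes always land on the branch side; the parent store is never touched.
    put(key: string, body: string): void {
      branchObjects.set(key, body);
    },
    // Reads prefer the branch copy and fall back to the parent's existing file.
    get(key: string): string | undefined {
      return branchObjects.get(key) ?? parentObjects.get(key);
    },
  };
}

const parentStore = new Map<string, string>([["avatars/a.png", "parent-bytes"]]);
const branchStorage = createBranchStorage(parentStore);
branchStorage.put("avatars/b.png", "branch-bytes");
```

Because the branch only ever holds a read-only view of the parent map, nothing written on the branch can leak back.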

But none of that is the hard part. The hard part is what happens when you merge back to parent and run into conflicts.

For example: at 8:00 we create a branch, at 8:30 we change the schema on the branch, at 9:00 someone changes the schema on parent, and at 10:00 we try to merge back. Three things can happen:

  1. The branch and parent changes have no conflict at all;
  2. Both sides modified the same table and the results disagree;
  3. The table the branch modified has already been dropped on parent.

Our approach is three-way diff: compare three states — parent at T0, parent now, and branch now — and merge safely.

2.1 The T0 Fingerprint

The question three-way diff has to answer is: "Since T0, did parent change this object? Did branch change it? Or did both?" The only reliable way to answer is to keep a cryptographic fingerprint of the T0 state. The fingerprint has to be:

  • Stable: capturing the same state multiple times must produce identical hashes — unaffected by physical row order, JSON key order, audit timestamps, or other noise;
  • Fine-grained: precise down to "a single table," "a single policy," "a single function," "a single config row" — so conflicts can be pinpointed to the object level;
  • Lightweight: we can't take a full pg_dump every time — merges are a high-frequency operation.

Every branch captures a snapshot of parent's T0 fingerprint at creation time and stores it in the branch metadata. It's immutable after that — even a branch reset (which rolls the database back to T0) does not reset T0 itself. It's the baseline every future merge references.

2.2 The Five Buckets of a BranchFingerprint

A fingerprint is made of five buckets:

ts
interface BranchFingerprint {
  tables:         Record<string, string /* sha256 */>;
  policies:       Record<string, string>;
  functions:      Record<string, string>;
  configs:        Record<string, string>;  // row-set hash for each config table
  edge_functions: Record<string, string>;
}

Each bucket corresponds exactly to one kind of diff unit at merge time. tables/policies/functions live at the DDL layer (sourced from pg_class/pg_policies/pg_proc); configs lives at the DML layer (per-table row-set hash).

2.3 Three Layers of Canonical Hashing

Saying "compute a hash" isn't enough — the input has to be a stable byte stream. We use three layers of normalization:

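  1. Key ordering: sort keys at every level of the object before serializing. Neither Postgres's row order nor JSON.stringify's field order is stable: JSON.stringify emits keys in insertion order, so two logically equal objects can serialize to different bytes.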
  2. Row cleanup: strip two kinds of columns before serializing:
    • Audit columns like created_at / updated_at / last_used_at — frequently bumped by triggers. Without stripping, the same row hashes differently across dry-runs, producing "ghost modifies."
    • The excludeColumns / excludeKeys declared per table (from the mergeable matrix in the next section) — environment-specific fields like OAuth client_secret are masked at the fingerprint stage, so they never even enter the diff.
  3. Row ordering: sort the whole table's rows by id before hashing. Physical insert order is unstable; PK ordering is the only reliable anchor.
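
The three layers above can be sketched roughly as follows. This is an illustration under assumptions, not the shipped implementation: canonicalize, cleanRow, and hashRowSet are hypothetical names, the primary key is assumed to be a column called id compared as a string, and the only real library call is Node's createHash.

```typescript
import { createHash } from "node:crypto";

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };
type Row = { [column: string]: Json };

// Audit columns named in the text; per-table exclusions are passed in by the caller.
const AUDIT_COLUMNS = ["created_at", "updated_at", "last_used_at"];

// Layer 1: serialize with keys sorted at every nesting level.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return "[" + value.map((item) => canonicalize(item)).join(",") + "]";
  const entries = Object.entries(value).sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
  return "{" + entries.map(([k, v]) => JSON.stringify(k) + ":" + canonicalize(v)).join(",") + "}";
}

// Layer 2: drop audit columns and any per-table excluded columns.
function cleanRow(row: Row, excludeColumns: string[] = []): Row {
  const drop = new Set([...AUDIT_COLUMNS, ...excludeColumns]);
  const cleaned: Row = {};
  for (const [column, value] of Object.entries(row)) {
    if (!drop.has(column)) cleaned[column] = value;
  }
  return cleaned;
}

// Layer 3: sort rows by primary key, then hash the canonical byte stream.
function hashRowSet(rows: Row[], excludeColumns: string[] = []): string {
  const sorted = [...rows].sort((a, b) => {
    const left = String(a["id"]);
    const right = String(b["id"]);
    return left < right ? -1 : left > right ? 1 : 0;
  });
  const canonical = sorted.map((row) => canonicalize(cleanRow(row, excludeColumns))).join("\n");
  return createHash("sha256").update(canonical).digest("hex");
}

// Same logical state, different row order, key order, and audit timestamps.
const first: Row[] = [
  { id: "2", name: "b", updated_at: "2026-05-17T08:00:00Z" },
  { id: "1", name: "a", meta: { y: 1, x: 2 } },
];
const second: Row[] = [
  { meta: { x: 2, y: 1 }, name: "a", id: "1" },
  { name: "b", id: "2", updated_at: "2026-05-17T09:30:00Z" },
];
console.log(hashRowSet(first) === hashRowSet(second));
```

The two sample row sets differ only in the kinds of noise the list calls out, so they should hash identically, while any change to a non-excluded column produces a different digest.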

2.4 Capture Path: Introspection, Not pg_dump

For the DDL parts (tables/policies/functions) we chose PG catalog introspection over pg_dump --schema-only:

  1. The newlines, comments, and SET preambles in pg_dump output drift across PG versions and dump configurations — the hash isn't stable.
  2. Pulling structured fields out of pg_class/pg_policies/pg_proc and canonicalizing them ourselves is far more deterministic than parsing SQL text.
  3. It's fast. Merge dry-runs land in the sub-second range.

Introspection metadata can't be replayed directly, so we pre-render a ready-to-run CREATE TABLE and a standalone index list into the result — but deliberately exclude these two fields from the hash. That keeps older T0 snapshots (taken before we added them) diffable against today's parent_now / branch_now. Backward-compatible by construction.
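
A small sketch of why leaving the pre-rendered fields out of the hash keeps old snapshots comparable. The TableSnapshot shape and the field names createTableSql and indexSql are assumptions made for illustration; the post does not show the real structure.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape: structured catalog fields plus two replay-only rendered fields.
interface TableSnapshot {
  columns: { name: string; type: string; nullable: boolean }[];
  createTableSql?: string; // rendered for replay, deliberately not hashed
  indexSql?: string[];     // rendered for replay, deliberately not hashed
}

function hashTable(snapshot: TableSnapshot): string {
  // Only the structured fields feed the hash, in catalog column order.
  const structured = snapshot.columns.map((c) => [c.name, c.type, c.nullable]);
  return createHash("sha256").update(JSON.stringify(structured)).digest("hex");
}

const columns = [{ name: "id", type: "uuid", nullable: false }];
// An older T0 snapshot, captured before the rendered fields existed.
const oldSnapshot: TableSnapshot = { columns };
// A current snapshot of the same table, with the rendered fields filled in.
const newSnapshot: TableSnapshot = {
  columns,
  createTableSql: "CREATE TABLE public.todos (id uuid NOT NULL);",
  indexSql: [],
};
console.log(hashTable(oldSnapshot) === hashTable(newSnapshot));
```

Since the rendered SQL never reaches the digest, adding or changing how it is rendered cannot turn an unchanged table into a false "modified".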


3. Merging: The Mergeable Matrix — How Each Table Participates

At merge time we hold three fingerprints: parent_T0, parent_now, branch_now. But not every diff should be merged. An OAuth client_secret is a prod credential on prod and a dev credential on a branch; the fact that they're "different" is by design, not a conflict. To let the diff engine tell apart "should be merged" from "shouldn't be merged," we need an explicit rulebook — that's the mergeable matrix.

3.1 Four Actions

Every platform table is registered in the Mergeable Matrix with one of four strategies:

  • always_mergeable: the whole row participates in three-way diff and is UPSERTed wholesale.
  • conditionally_mergeable: same as above, but first strip certain columns (excludeColumns) or certain keys (excludeKeys). The two work differently: the former is column-level pruning (other columns of the same row still merge), the latter is row-level skipping (the entire row is excluded from the diff). OAuth redirect_url, ANON_KEY / API_KEY inside secrets — these fall into the "strip" category.
  • append_only: branch is only allowed to append to the tail of T0; modification at any position is disallowed. Used primarily for migrations.
  • never_mergeable: the default. Anything in SYSTEM_SCHEMAS = {auth, storage, functions, email, ai, realtime, schedules, system, deployments} that isn't explicitly registered in the matrix falls into this bucket — e.g., auth.users, storage.objects, system.audit_logs. Business data rows in user schemas also default here.
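
One way to picture the matrix is a lookup table with never_mergeable as the fallback. The entries below are illustrative guesses rather than the real registrations, and ruleFor, prepareRows, and the key field on each row are hypothetical names.

```typescript
type MergeRule =
  | { action: "always_mergeable" }
  | { action: "conditionally_mergeable"; excludeColumns?: string[]; excludeKeys?: string[] }
  | { action: "append_only" }
  | { action: "never_mergeable" };

// Illustrative entries only; the real matrix and its exact exclusions are not shown in the post.
const MERGEABLE_MATRIX: Record<string, MergeRule> = {
  "email.templates": { action: "always_mergeable" },
  "auth.oauth_configs": {
    action: "conditionally_mergeable",
    excludeColumns: ["client_id", "client_secret", "redirect_uri"],
  },
  "system.secrets": {
    action: "conditionally_mergeable",
    excludeKeys: ["JWT_SECRET", "ANON_KEY", "API_KEY"],
  },
  "system.custom_migrations": { action: "append_only" },
};

// Anything not registered falls through to never_mergeable.
function ruleFor(table: string): MergeRule {
  return MERGEABLE_MATRIX[table] ?? { action: "never_mergeable" };
}

type ConfigRow = { key: string; [column: string]: unknown };

// excludeKeys skips whole rows; excludeColumns prunes columns but keeps the row.
function prepareRows(table: string, rows: ConfigRow[]): ConfigRow[] {
  const rule = ruleFor(table);
  if (rule.action === "never_mergeable") return [];
  if (rule.action !== "conditionally_mergeable") return rows;
  const skipKeys = new Set(rule.excludeKeys ?? []);
  const skipColumns = new Set(rule.excludeColumns ?? []);
  return rows
    .filter((row) => !skipKeys.has(row.key))
    .map((row) => {
      const pruned: ConfigRow = { key: row.key };
      for (const [column, value] of Object.entries(row)) {
        if (!skipColumns.has(column)) pruned[column] = value;
      }
      return pruned;
    });
}
```

Making never_mergeable the default means a newly added platform table stays out of merges until someone registers it on purpose.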

3.2 The "Why" Behind a Few Key Decisions

Why the OAuth trio + allowed_redirect_urls are always excluded. client_id / client_secret / redirect_uri and the redirect allow-list are all per-environment: the branch runs against a dev app registration (often with temporary entries like localhost:3000 or preview-xxxx.vercel.app), while prod runs against the prod registration. Push these back to parent and you either get an OAuth callback hijack or an open-redirect. Environment-specific ≠ configuration drift.

Why JWT_SECRET / ANON_KEY / API_KEY are excluded too. At branch creation, JWT_SECRET is copied from parent (so tokens remain valid on the branch), but the API keys are regenerated on the branch. Merging JWT_SECRET back is meaningless because it is already identical on both sides, and pushing the branch's API keys onto prod would change prod's keys and instantly disconnect every client.

Why user-schema data is "never merged." Business data on the branch is test data (the tables might even be empty in --schema-only mode), so merging it back would pollute prod. Schema migration (DDL) is meaningful; business data isn't. So user-schema tables only run the DDL diff, and their data rows never enter the configs bucket at all.


4. Three-Way Diff: parent_T0 vs parent_now vs branch_now

We now have three fingerprints: T0 (the moment the branch was created), parent_now (parent's state at merge time), and branch_now (branch's state at merge time). This section is about how to triangulate them into changes + conflicts.

4.1 Per-Object Decision Matrix

For every object (a user table, a policy, a function, a config row, an edge function slug), consult this 5-row table:

| parent_T0 vs parent_now | branch_now vs parent_T0 | Meaning | Output |
| --- | --- | --- | --- |
| same | same | Neither side touched it | skip |
| same | changed | Only branch touched it | apply branch_now → parent |
| changed | same | Only parent touched it | skip (parent keeps its own evolution) |
| changed | changed, and parent_now == branch_now | Both sides made the same change in parallel | skip (already converged) |
| changed | changed, and parent_now != branch_now | Both modified the same object, into different things | CONFLICT, merge is blocked |

Row 4 is the edge case many diff systems ignore: if both sides independently made the exact same change after T0 (e.g., both rewrote the same RLS policy from auth.uid() = user_id into the identical new expression), their hashes end up equal. That isn't a conflict — it's duplicated work that's already converged. No-op is the right answer.
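
The decision table reduces to a few hash comparisons. A minimal sketch, assuming the object exists in all three snapshots (creation and deletion are left out) and using the hypothetical names decide and Decision:

```typescript
type Decision = "skip" | "apply_branch" | "conflict";

// Each argument is the object's hash in one of the three fingerprints.
function decide(parentT0: string, parentNow: string, branchNow: string): Decision {
  const parentChanged = parentNow !== parentT0;
  const branchChanged = branchNow !== parentT0;
  if (!branchChanged) return "skip";          // rows 1 and 3: branch did not touch it
  if (!parentChanged) return "apply_branch";  // row 2: only branch touched it
  // Rows 4 and 5: both changed; identical results have already converged.
  return parentNow === branchNow ? "skip" : "conflict";
}

console.log(decide("h0", "h0", "h1")); // only branch changed
console.log(decide("h0", "h1", "h1")); // both changed to the same thing
console.log(decide("h0", "h1", "h2")); // both changed, differently
```

Only the last comparison needs all three hashes at once, which is why a two-way diff between parent_now and branch_now cannot distinguish "only branch changed" from a true conflict.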

4.2 The Append-Only Diff for Migrations

system.custom_migrations runs on a separate merge path — semantically it's a "history log." The core precondition: t0 must be a prefix of both parentNow and branchNow. If either side violates that, the corresponding migration history was rebased/reset, and we conflict immediately.

Assuming the precondition holds, look at the tails:

  • branchTail empty → no-op;
  • parentTail empty → branch's tail appends to parent directly;
  • both have tails → conflict, prompting the user to rebase the branch manually.

The third case is intentionally conservative. We don't try to inspect what each migration actually does — if both sides have a tail, we declare conflict. Migrations are arbitrary SQL: they might CREATE TABLE, ALTER, or DELETE FROM .... Even if the two sides didn't touch the same table, appending them in timestamp-interleaved order could produce hidden problems like "parent's migration #101 assumed a state, but branch's interloping migration #101 invalidated it." Asking the user to rebase manually is more reliable than letting the algorithm guess.
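
The prefix precondition and the three tail cases can be sketched as below. Migrations are modeled as plain string identifiers, and diffAppendOnly, isPrefix, and the result shape are hypothetical names chosen for this illustration.

```typescript
type AppendOnlyResult =
  | { kind: "noop" }
  | { kind: "append"; migrations: string[] }
  | { kind: "conflict"; reason: string };

// True when every entry of prefix appears at the same position in full.
function isPrefix(prefix: string[], full: string[]): boolean {
  return prefix.length <= full.length && prefix.every((id, i) => full[i] === id);
}

function diffAppendOnly(t0: string[], parentNow: string[], branchNow: string[]): AppendOnlyResult {
  // Precondition: T0 must be a prefix of both histories.
  if (!isPrefix(t0, parentNow)) return { kind: "conflict", reason: "parent history diverged from T0" };
  if (!isPrefix(t0, branchNow)) return { kind: "conflict", reason: "branch history diverged from T0" };

  const parentTail = parentNow.slice(t0.length);
  const branchTail = branchNow.slice(t0.length);
  if (branchTail.length === 0) return { kind: "noop" };
  if (parentTail.length === 0) return { kind: "append", migrations: branchTail };
  // Both sides appended: no attempt is made to inspect or interleave the SQL.
  return { kind: "conflict", reason: "both sides appended migrations; rebase the branch" };
}

console.log(diffAppendOnly(["m1", "m2"], ["m1", "m2"], ["m1", "m2", "m3"]));
```

Note that the function never looks inside a migration: the decision depends only on list positions, which matches the conservative stance described above.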

4.3 The "Three Views" of a Conflict Report

DiffResult isn't one thing — it's three parallel views, each aimed at a different consumer:

jsonc
{
  "summary":      { "added": 5, "modified": 2, "conflicts": 1 },
  "rendered_sql": "BEGIN;\n-- [DDL] ...\n-- [DATA] ...\nCOMMIT;",
  "changes":      [{ "schema", "object", "type", "action", "sql" }, ...],
  "conflicts":    [{ "schema", "object", "type",
                     "parent_t0_hash", "parent_now_hash", "branch_now_hash",
                     "hint" }, ...]
}

  • rendered_sql is for humans on the dashboard and CLI: wrapped in BEGIN/COMMIT, split into DDL / DATA / MIGRATION sections. On conflict it leads with a -- ⚠️ MERGE BLOCKED banner plus the three hashes per conflicting object, and the SQL body is still rendered (annotated "do NOT run as-is"), so a developer can see "what would have run" and resolve manually.
  • changes is the structured list for agents — each entry carries a SQL fragment and a note explaining why some entries won't actually be applied (e.g., v1 always skips deletes against parent to avoid data loss).
  • conflicts is the programmatic conflict list. The three hashes let an agent reason precisely — "for this object, T0 was X, parent is now Y, branch is now Z" — and decide whether to rebase.
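
For an agent consuming the report, the shape above might be typed roughly like this. The field names come from the example; the field types (all strings and numbers) and the helpers isBlocked and describeConflict are assumptions for illustration.

```typescript
// Field names follow the example report; the types are assumed.
interface Conflict {
  schema: string;
  object: string;
  type: string;
  parent_t0_hash: string;
  parent_now_hash: string;
  branch_now_hash: string;
  hint: string;
}

interface DiffResult {
  summary: { added: number; modified: number; conflicts: number };
  rendered_sql: string;
  changes: { schema: string; object: string; type: string; action: string; sql: string }[];
  conflicts: Conflict[];
}

// Any conflict blocks the whole merge; there is no partial apply.
function isBlocked(result: DiffResult): boolean {
  return result.conflicts.length > 0;
}

// Turn the three hashes into the sentence an agent can reason about.
function describeConflict(c: Conflict): string {
  return `${c.schema}.${c.object} (${c.type}): T0 was ${c.parent_t0_hash}, ` +
    `parent is now ${c.parent_now_hash}, branch is now ${c.branch_now_hash}`;
}

const sample: DiffResult = {
  summary: { added: 0, modified: 1, conflicts: 1 },
  rendered_sql: "BEGIN;\nCOMMIT;",
  changes: [],
  conflicts: [{
    schema: "public", object: "todos", type: "table",
    parent_t0_hash: "X", parent_now_hash: "Y", branch_now_hash: "Z",
    hint: "rebase the branch",
  }],
};
console.log(isBlocked(sample), sample.conflicts.map(describeConflict));
```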

4.4 Why We Chose to "Stop" Instead of Auto-Resolving

Many version-control systems attempt automatic three-way merges (git's textual merge is the classic example). We deliberately don't:

  1. State is harder than text. An OAuth config row can have strict-ordering requirements internally (e.g., the order of entries in the redirect_uri allow-list affects prefix matching) — textual merge can't see that.
  2. The cost of being wrong is high. A bad backend-config merge isn't visibly broken like a code conflict marker; the downstream is live traffic.
  3. The dev agent is in the loop. InsForge's target users are agent-driven development flows — handing a structured conflict to an agent that can carry context back to the branch is far more accurate than asking a context-free algorithm to guess.

So today's merge state machine has only two paths: ready → merging → merged, or ready → merging → conflict → roll back entirely to ready. There's no "partial apply." On the parent side, the whole merge sits inside a BEGIN ... COMMIT transaction; failure means ROLLBACK, and parent never sits in a half-applied state.
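
The two paths can be written down as an explicit transition table. This is a sketch of the states named above; TRANSITIONS and transition are hypothetical names, and the real state machine may carry more bookkeeping.

```typescript
type MergeState = "ready" | "merging" | "merged" | "conflict";

// Allowed moves: ready → merging → merged, or ready → merging → conflict → ready.
const TRANSITIONS: Record<MergeState, MergeState[]> = {
  ready: ["merging"],
  merging: ["merged", "conflict"],
  conflict: ["ready"],
  merged: [],
};

function transition(from: MergeState, to: MergeState): MergeState {
  if (!TRANSITIONS[from].includes(to)) {
    throw new Error(`illegal transition ${from} -> ${to}`);
  }
  return to;
}

// A conflicting merge ends up back at ready with nothing applied.
let state: MergeState = "ready";
state = transition(state, "merging");
state = transition(state, "conflict");
state = transition(state, "ready");
console.log(state);
```

There is deliberately no edge from merging to any "partially applied" state, mirroring the single BEGIN ... COMMIT transaction on the parent side.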


5. Things We Deliberately Left Out

The hard part of engineering is always in what you decide not to do. The current version intentionally skips:

  • Column-level conflict detection. Two sides editing different columns of the same row also count as a conflict (conservative). The cost is that developers may need to split and retry migrations — but it avoids a class of "column-level merge looks fine, semantics are wrong" incidents.
  • Storage object copy. S3 objects aren't copied at branch creation. Reads fall back transparently to parent via the InsForge OSS layer; writes are isolated. A product trade-off — copying hundreds of GB for a "test" is too slow and too expensive.
  • Nested branches. Branch-of-a-branch is rejected outright, to keep things simple.
  • Continuous rebase. Branches don't auto-pull new changes from parent. Once the two diverge far enough, the only options are to merge or to delete the branch and start over. This collapses a dimension of the system's state space.
  • Revert history. There's no "roll back to a particular branch version" after a branch is deleted. Use project backup when needed.

6. Closing

Backend branching looks like "database branching," but the real problem it solves is branching the entire backend configuration — auth, storage, functions, templates, policies, migrations — together. The process touches several platform modules, and the hidden premise of the whole design is this: the most dangerous failure isn't a diff computed wrong; it's a diff computed right but applying things that shouldn't have been merged. So the fingerprint has to be stable, the mergeable matrix has to be explicit, and conflicts have to surface early.

The first version has shipped, but plenty remains — column-level diff, cross-branch rebase, storage object sync. The skeleton runs, though, and that's reason enough to write up the technical details and share them with everyone.