Algorithm Selection for Merge in Distributed CouchDB Pipelines

In distributed CouchDB deployments spanning edge/IoT endpoints and mobile backends, the merge algorithm you bind to each document class dictates data consistency, replication latency, and operational overhead. When bidirectional replication surfaces divergent document states, the pipeline must route conflicting revisions through a deterministic resolution path rather than trusting CouchDB’s default winning-revision selection. That default only picks which revision to return on reads (highest generation, then lexicographically highest hash) and never deletes the losing branches. This page shows how to configure the replication job, detect conflicts from the _changes feed, implement a routing resolver in Python, and choose between competing merge strategies. It is the decision layer of Conflict Detection & Automated Resolution Strategies: the algorithm you select here sits between the raw _conflicts array (produced by the concurrency patterns catalogued in conflict generation models) and a reconciled document tree whose losing leaves have been tombstoned. Getting it wrong silently discards writes; getting it right means evaluating network partition frequency, data semantics, and convergence guarantees before you commit a strategy to a topology.

Configuration Schema & Required Parameters

Deploying any merge algorithm starts from an explicit _replicator document that keeps every divergent leaf flowing to the resolver. There is no flag needed to “retain” conflicts: replication always copies every divergent leaf revision and CouchDB never discards losing branches on its own, so the merge pipeline always receives complete revision trees. Deploy the job with the standard _replicator document schema; a continuous edge-to-central job looks like this:

{
  "_id": "rep_edge_to_central",
  "source": "https://edge-node-01.local:5984/iot_telemetry",
  "target": "https://central-db.cluster:5984/iot_telemetry",
  "continuous": true,
  "create_target": false,
  "filter": "sync_filters/replication_filter",
  "user_ctx": {
    "name": "replicator_svc",
    "roles": ["_admin"]
  }
}

The filter value uses the ddocname/filtername form (here, the replication_filter function inside the _design/sync_filters document) — not a _design/.../_filter path. Whether you run this job continuously or as a scheduled sweep is a separate decision covered in continuous vs one-way sync; the merge router below is agnostic to that choice.

Parameter	Type	Default	Effect on the merge pipeline
`_id`	string	— (required)	Names the replication job in the `_replicator` database; use a stable, descriptive id per topology edge.
`source` / `target`	string or object	— (required)	Database URLs (or objects with embedded auth/headers). Both must be reachable from the replicator node.
`continuous`	boolean	`false`	When `true`, holds an open `_changes` listener so conflicts surface within seconds; when `false`, they surface only per sweep.
`create_target`	boolean	`false`	Creates the target DB if missing. Leave `false` in production so a typo can’t spawn a stray database.
`filter`	string	none	`ddocname/filtername` server-side filter. Mutually exclusive with `doc_ids` and `selector` — set exactly one.
`selector`	object	none	Mango selector that narrows the replicated stream declaratively.
`doc_ids`	array	none	Explicit `_id` allow-list; useful for replaying a known conflicted set through the resolver.
`user_ctx`	object	none	Roles the job runs under. Writing to `_replicator` and tombstoning losers needs `_admin` or an equivalent database role.

CouchDB treats doc_ids, filter, and selector as mutually exclusive, so pick one and validate active state with GET /_active_tasks (or GET /_scheduler/jobs) after deploying. For authoritative field-level guidance, consult the Apache CouchDB Replicator documentation.
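
The rules in the table can be checked before a job document is ever written to _replicator. The sketch below is a hypothetical pre-flight helper (validate_replication_doc is not part of CouchDB or any client library); it only encodes the constraints stated above, namely required endpoints, the mutually exclusive narrowing fields, and the ddocname/filtername form.

```python
from typing import Any

# Fields CouchDB treats as mutually exclusive ways to narrow a replication.
NARROWING_FIELDS = ("doc_ids", "filter", "selector")


def validate_replication_doc(doc: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means these checks passed.

    Hypothetical helper: it covers only the rules described in the table.
    """
    problems: list[str] = []
    for required in ("source", "target"):
        if not doc.get(required):
            problems.append(f"missing required field: {required}")

    narrowing = [name for name in NARROWING_FIELDS if name in doc]
    if len(narrowing) > 1:
        problems.append(
            f"mutually exclusive fields set together: {', '.join(narrowing)}"
        )

    flt = doc.get("filter")
    if isinstance(flt, str) and (flt.startswith("_design/") or flt.count("/") != 1):
        problems.append("filter must use the ddocname/filtername form")
    return problems


job = {
    "_id": "rep_edge_to_central",
    "source": "https://edge-node-01.local:5984/iot_telemetry",
    "target": "https://central-db.cluster:5984/iot_telemetry",
    "continuous": True,
    "filter": "sync_filters/replication_filter",
}
print(validate_replication_doc(job))  # prints []
```

A check like this catches typos early, but it does not replace confirming the job state on the server after deployment.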

Streaming Detection / Monitoring Setup

Routing decisions must be evaluated at the point the pipeline consumes the _changes feed, before losing leaves are pruned downstream. Request the feed with style=all_docs, include_docs=true, and conflicts=true: style=all_docs lists every leaf revision in each row, and conflicts=true, which only takes effect alongside include_docs=true, adds the computed _conflicts array to each returned document. The minimal listener below streams changes and yields only the documents that actually have competing branches:

import json
import httpx


def stream_conflicts(db_url: str, since: str = "now"):
    """Yield (doc_id, winning_rev, conflicts[]) for each conflicted document.

    Uses the continuous _changes feed with style=all_docs so every leaf
    revision is reported, and include_docs + conflicts=true so the computed
    _conflicts array rides along without a second round trip.
    """
    params = {
        "feed": "continuous",
        "since": since,
        "style": "all_docs",
        "include_docs": "true",
        "conflicts": "true",
        "heartbeat": "10000",
    }
    with httpx.stream("GET", f"{db_url}/_changes", params=params, timeout=None) as r:
        r.raise_for_status()  # fail loudly on auth or missing-database errors
        for line in r.iter_lines():
            if not line:  # heartbeat keep-alive
                continue
            row = json.loads(line)
            doc = row.get("doc") or {}
            conflicts = doc.get("_conflicts")
            if conflicts:  # only conflicted docs carry this key
                yield doc["_id"], doc["_rev"], conflicts


if __name__ == "__main__":
    for doc_id, rev, conflicts in stream_conflicts("http://localhost:5984/iot_telemetry"):
        print(f"conflict: {doc_id} winner={rev} losers={conflicts}")

Classification must stay idempotent: during network flapping the same conflict can appear on the feed more than once, so the resolver has to tolerate re-processing a document whose losers were already tombstoned. Emit a per-document metric here (conflict detected, timestamp) so you can later measure resolution latency end to end.
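
One way to keep classification idempotent is to fingerprint each conflict state and skip fingerprints already seen. The sketch below is illustrative only: conflict_key and SeenConflicts are hypothetical names, and the in-memory set assumes a single worker per feed with a backlog small enough to hold in memory.

```python
import hashlib


def conflict_key(doc_id: str, winning_rev: str, conflicts: list[str]) -> str:
    """Stable fingerprint of one conflict state: same leaves, same key."""
    leaves = sorted([winning_rev, *conflicts])  # order on the feed is irrelevant
    raw = "\x00".join([doc_id, *leaves])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class SeenConflicts:
    """In-memory dedup set (assumption: one worker per feed)."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    def first_sighting(self, doc_id: str, winning_rev: str,
                       conflicts: list[str]) -> bool:
        """Return True the first time this exact leaf set is reported."""
        key = conflict_key(doc_id, winning_rev, conflicts)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True
```

Because the key covers the whole leaf set, a document that gains a new conflicting branch after a partial resolution produces a new key and is processed again.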

Core Implementation

The core of a merge pipeline is a router that inspects each conflicted document, selects an algorithm by namespace, applies it across every branch, and commits the merged winner and the loser tombstones in a single _bulk_docs request. Writing the winner without the tombstones is the classic mistake: the conflict only clears when the losing leaves are deleted. The class below adds structured logging, bounded retries with exponential backoff when a write is rejected as an update conflict, and concurrent branch fetches via asyncio. Deterministic dispatch keeps it compatible with the externalised rules described in auto-merge rule engines.

import asyncio
import logging
import os
from typing import Any, Callable

import httpx

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("merge-router")


def select_algorithm(doc_id: str) -> str:
    """Map a document namespace (the _id prefix) to a merge strategy name."""
    if doc_id.startswith("telemetry."):
        return "lww"
    if doc_id.startswith("config."):
        return "semantic"
    if doc_id.startswith("collab."):
        return "crdt"  # no CRDT merge is registered below, so these escalate
    return "escalate"  # unknown namespace -> manual review


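def merge_lww(branches: list[dict]) -> dict:
    """Last-write-wins: keep the branch with the newest application timestamp.

    Compares updated_at as strings, so every writer must emit the same
    fixed-width ISO 8601 UTC format; branches without the field sort oldest.
    """
    return max(branches, key=lambda d: d.get("updated_at", ""))


def merge_semantic(branches: list[dict]) -> dict:
    """Field-union: keep every field seen on any branch.

    Branches are applied oldest to newest, so when two branches carry the same
    field the newer branch's value wins, even if it never edited that field.
    """
    merged: dict[str, Any] = {}
    for branch in sorted(branches, key=lambda d: d.get("updated_at", "")):
        for key, value in branch.items():
            if not key.startswith("_"):
                merged[key] = value  # newer branch overwrites older per field
    return merged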


STRATEGIES: dict[str, Callable[[list[dict]], dict]] = {
    "lww": merge_lww,
    "semantic": merge_semantic,
}


class MergeRouter:
    """Resolve conflicted CouchDB documents by namespace-selected strategy."""

    def __init__(self, db_url: str, max_retries: int = 5):
        self.db_url = db_url.rstrip("/")
        self.max_retries = max_retries

    async def resolve(self, client: httpx.AsyncClient, doc_id: str,
                      conflicts: list[str]) -> list[dict] | None:
        """Merge and commit one document; `conflicts` is only a hint from the feed."""
        strategy = select_algorithm(doc_id)
        if strategy not in STRATEGIES:
            log.warning("escalating %s (strategy=%s)", doc_id, strategy)
            return None  # caller routes to the manual-review queue

        doc_url = f"{self.db_url}/{doc_id}"
        for attempt in range(1, self.max_retries + 1):
            # Re-read the winner and its live conflict list on every attempt so a
            # retry never writes against a stale _rev or an already-deleted leaf.
            resp = await client.get(doc_url, params={"conflicts": "true"})
            resp.raise_for_status()
            current = resp.json()
            losers = current.pop("_conflicts", [])
            if not losers:
                log.info("%s has no open conflicts, nothing to do", doc_id)
                return []

            # Fetch every losing leaf in parallel.
            loser_resps = await asyncio.gather(
                *[client.get(doc_url, params={"rev": rev}) for rev in losers]
            )
            branches = [current] + [r.json() for r in loser_resps]

            merged = dict(STRATEGIES[strategy](branches))  # copy before mutating
            merged["_id"] = doc_id
            merged["_rev"] = current["_rev"]  # write against the current winner

            # Send the winner and the tombstones in one request. CouchDB applies
            # each document independently, so inspect every result row.
            batch = {"docs": [merged, *(
                {"_id": doc_id, "_rev": rev, "_deleted": True} for rev in losers
            )]}
            results = await self._commit(client, batch)
            failed = [row for row in results if "error" in row]
            if not failed:
                log.info("resolved %s via %s (attempt %d)", doc_id, strategy, attempt)
                return results
            backoff = min(2 ** attempt, 30)
            log.warning("%d write(s) rejected on %s (%s), retrying in %ss",
                        len(failed), doc_id, failed[0]["error"], backoff)
            await asyncio.sleep(backoff)
        raise RuntimeError(f"exhausted retries resolving {doc_id}")

    async def _commit(self, client: httpx.AsyncClient, batch: dict) -> list[dict]:
        """POST the batch; update conflicts come back per row, not as HTTP 409."""
        resp = await client.post(f"{self.db_url}/_bulk_docs", json=batch)
        resp.raise_for_status()
        return resp.json()


async def _main() -> None:
    db_url = os.environ.get("COUCH_URL", "http://localhost:5984/iot_telemetry")
    router = MergeRouter(db_url)
    async with httpx.AsyncClient(timeout=30) as client:
        # Example: resolve one known conflicted document.
        await router.resolve(client, "telemetry.sensor-42", conflicts=["2-b91d..."])


if __name__ == "__main__":
    asyncio.run(_main())

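Retry logic must respect CouchDB’s _rev generation sequence: _bulk_docs reports an update conflict per document in its response body rather than as an HTTP 409 on the request, so inspect every result row, and when one is rejected re-read the current winner and rebuild the batch before retrying so you never write against a stale generation. Backpressure the loop with a bounded task group when draining a large _conflicts backlog so I/O-bound revision fetches don’t overwhelm the CouchDB cluster.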
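
Bounded draining can be sketched with a semaphore that caps how many resolutions are in flight. The helper name drain_bounded and the default limit of 8 are assumptions for illustration; the worker argument stands in for a call such as router.resolve bound to a client.

```python
import asyncio
from typing import Awaitable, Callable, Iterable


async def drain_bounded(
    items: Iterable[str],
    worker: Callable[[str], Awaitable[None]],
    limit: int = 8,
) -> None:
    """Run worker(item) for every item with at most `limit` in flight."""
    gate = asyncio.Semaphore(limit)

    async def guarded(item: str) -> None:
        async with gate:  # blocks while `limit` workers are already running
            await worker(item)

    # gather schedules one task per item up front; only the HTTP work is
    # throttled. Feed very large backlogs in chunks to bound task count too.
    await asyncio.gather(*(guarded(item) for item in items))
```

Tune the limit against what the cluster tolerates; each resolution issues one read per losing leaf plus a write.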

Strategy Variants & Trade-offs

Production sync pipelines typically implement one of three paradigms, each optimal for a different workload. Map document namespaces to strategies with an explicit matrix rather than a single global default:

Document namespace	Partition tolerance	Recommended algorithm
`telemetry.*`	High (frequent drops)	LWW (timestamp-based)
`config.*`	Medium	Semantic field-union
`collab.*`	Low (offline sync)	CRDT (G-Counter / LWW-Register)

Last-Write-Wins (LWW) is the cheapest strategy: a single pass over the branches that compares application-level timestamps, with no field inspection. Revision generations count edits rather than time, so they are a poor proxy for recency. LWW suits high-frequency telemetry and stateless sensor metrics where eventual consistency tolerates a discarded overwrite. Its failure mode is clock skew, so the timestamp-normalisation and tombstone details live in the dedicated guide on implementing Last-Write-Wins in CouchDB.
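
As a minimal sketch of timestamp-based LWW, the helper below parses updated_at into timezone-aware UTC before comparing, and breaks ties on the _rev string so every node picks the same branch. The function names are hypothetical, and treating naive timestamps as UTC is an assumption that must match how the writers behave.

```python
from datetime import datetime, timezone


def parse_utc(value: str) -> datetime:
    """Parse an ISO 8601 timestamp into an aware UTC datetime.

    Assumption: timestamps without an offset were written in UTC.
    """
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def lww_winner(branches: list[dict]) -> dict:
    """Newest updated_at wins; equal timestamps fall back to the _rev string."""
    return max(branches, key=lambda d: (parse_utc(d["updated_at"]), d["_rev"]))


branches = [
    {"_rev": "2-a", "updated_at": "2024-05-01T12:00:00+02:00"},  # 10:00 UTC
    {"_rev": "2-b", "updated_at": "2024-05-01T11:00:00Z"},       # 11:00 UTC
]
print(lww_winner(branches)["_rev"])  # prints 2-b
```

Comparing the raw strings would have picked the first branch here, because "12:00" sorts after "11:00" even though it is the earlier instant.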

Semantic (field-union) merging evaluates field-level diffs against domain rules, preserving non-overlapping updates — e.g. accepting a new device.firmware_version from one branch while keeping the device.network_config edited on another. It needs a schema-aware diffing engine and adds moderate CPU cost, but it prevents data loss in structured configuration payloads.
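
A field-level diff needs something to diff against. The sketch below assumes the caller can supply a common-ancestor body (CouchDB keeps old revision bodies only until compaction, so this is not guaranteed) and merges top-level fields only. three_way_merge is a hypothetical helper, and the flat field names are simplified from the nested example above.

```python
from typing import Any


def three_way_merge(
    base: dict[str, Any], ours: dict[str, Any], theirs: dict[str, Any]
) -> tuple[dict[str, Any], list[str]]:
    """Merge top-level fields; return (merged, fields both sides changed differently)."""
    merged: dict[str, Any] = {}
    clashes: list[str] = []
    missing = object()  # sentinel so a removed field differs from a None value
    keys = {k for d in (base, ours, theirs) for k in d if not k.startswith("_")}
    for key in sorted(keys):
        b, o, t = (d.get(key, missing) for d in (base, ours, theirs))
        if o == t:        # both sides agree, or neither touched the field
            value = o
        elif o == b:      # only theirs changed it
            value = t
        elif t == b:      # only ours changed it
            value = o
        else:             # both changed it to different values
            clashes.append(key)
            continue
        if value is not missing:
            merged[key] = value
    return merged, clashes


base = {"firmware_version": "1.0", "network_config": {"ssid": "a"}}
ours = {"firmware_version": "1.1", "network_config": {"ssid": "a"}}
theirs = {"firmware_version": "1.0", "network_config": {"ssid": "b"}}
merged, clashes = three_way_merge(base, ours, theirs)
```

Here merged carries the new firmware version from one branch and the edited network config from the other, and clashes is empty. A non-empty clashes list is the signal to apply a domain rule or escalate rather than guess.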

Conflict-Free Replicated Data Types (CRDTs) enforce mathematical convergence through commutative, associative, and idempotent operations, ideal for offline-first collaboration and distributed counters. The price is payload overhead and custom document schemas, since the merge state must be embedded in every write.
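
To make the convergence properties concrete, here is a minimal grow-only counter (G-Counter) sketch. The per-node map would be stored inside the document body; the node ids and function names are illustrative assumptions, not a schema prescribed by CouchDB.

```python
def g_increment(counter: dict[str, int], node_id: str, amount: int = 1) -> dict[str, int]:
    """Return a copy of the counter with this node's slot increased."""
    if amount < 0:
        raise ValueError("a G-Counter only grows")
    updated = dict(counter)
    updated[node_id] = updated.get(node_id, 0) + amount
    return updated


def g_merge(a: dict[str, int], b: dict[str, int]) -> dict[str, int]:
    """Element-wise max: commutative, associative, and idempotent."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}


def g_value(counter: dict[str, int]) -> int:
    """The counter's value is the sum of every node's slot."""
    return sum(counter.values())


edge_1 = g_increment({}, "edge-1", 2)
edge_2 = g_increment(g_increment({}, "edge-2", 3), "edge-1", 1)
merged = g_merge(edge_1, edge_2)
print(g_value(merged))  # prints 5
```

Because each node only ever raises its own slot, merging branches in any order, or merging the same branch twice, yields the same map. That is the property that lets a CRDT strategy skip timestamps entirely.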

Strategy	Consistency guarantee	Latency / cost	Implementation complexity	Best fit
LWW	Convergent, lossy (one branch discarded)	Lowest	Low	Telemetry, ephemeral metrics
Semantic field-union	Convergent, lossless for disjoint fields	Medium (diff per field)	Medium	Config & profile documents
CRDT	Strong eventual consistency, lossless	Higher payload, low compute	High (schema + ops)	Counters, offline collaboration

Deployment & Orchestration

Run the resolver as a small stateless service, one replica per replication partition. Running two replicas against the same partition doubles the 409 retry pressure because both race to tombstone the same losers, so scale horizontally by sharding namespaces, not by cloning workers on one stream. Configure everything through the environment so the same image serves every edge:

# Container environment (one replica per partition)
COUCH_URL=https://central-db.cluster:5984/iot_telemetry
MERGE_MAX_RETRIES=5
MERGE_NAMESPACES=telemetry.,config.,collab.
HEALTHCHECK_PORT=8080

Expose a /healthz endpoint that confirms the worker can still reach CouchDB and that its _changes listener is alive. On an idle database the cursor legitimately stops advancing while heartbeats keep arriving, so treat the listener as stalled only when neither a change row nor a heartbeat has arrived for a few heartbeat intervals. Package the listener from the detection section and the MergeRouter into one entrypoint, pin the since checkpoint in durable storage so a restart resumes rather than replays, and let your orchestrator restart the pod on a failed health check.
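
The liveness half of that health check can be sketched as a small tracker that the listener updates and the /healthz handler reads. FeedLiveness is a hypothetical class. It treats any feed traffic, heartbeat or change row, as proof of life, on the assumption that an idle database legitimately leaves the cursor unchanged. The three-interval tolerance is also an assumption.

```python
import time
from typing import Callable, Optional


class FeedLiveness:
    """Track when the _changes listener last heard anything from CouchDB."""

    def __init__(self, heartbeat_s: float = 10.0, tolerance: int = 3,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._limit = heartbeat_s * tolerance  # silence allowed before "stalled"
        self._clock = clock
        self._last_signal = clock()
        self.last_seq: Optional[str] = None

    def on_heartbeat(self) -> None:
        """Call for every empty keep-alive line on the feed."""
        self._last_signal = self._clock()

    def on_row(self, seq: str) -> None:
        """Call for every change row; also records the cursor position."""
        self.last_seq = seq
        self._last_signal = self._clock()

    def healthy(self) -> bool:
        """True while the feed has been silent for no longer than the limit."""
        return self._clock() - self._last_signal <= self._limit
```

The injectable clock exists only to make the tracker testable; in the worker the default monotonic clock is enough.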

Troubleshooting & Common Errors

Symptom / error	Likely cause	Remediation
`conflict` error row in the `_bulk_docs` response (or `409 Conflict` on a single-document write)	Winner `_rev` went stale between read and write	Re-read the document, rebuild the batch against the fresh `_rev`, retry with backoff
`doc_update_conflict` in logs	Two resolvers racing the same partition	Enforce single-replica-per-partition; shard by namespace
Conflicts never clear	Winner written but losers not tombstoned	Include `_deleted: true` entries for every `_conflicts` rev in the same batch
`filter` job replicates nothing	`filter` used a `_design/.../_filter` path	Switch to the `ddocname/filtername` form; validate with `GET /_active_tasks`
Revision tree grows unbounded	Compaction never runs after resolution	Schedule `POST /{db}/_compact` in low-traffic windows; watch `_revs_limit`
Checkpoint drift / replaying old changes	`since` cursor not persisted across restarts	Store the last `seq` durably and resume from it
Rising escalation rate	Unknown namespace or failed semantic validation	Deepen error handling per error handling & retry logic and inspect the queue
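
For the checkpoint-drift row above, a durable cursor can be as simple as a file replaced atomically after each processed batch. This is a sketch under assumptions: the JSON layout and function names are hypothetical, and a single local file suits only one worker per partition. The seq value is stored as an opaque string.

```python
import json
import os
import tempfile
from pathlib import Path


def save_checkpoint(path: Path, seq: str) -> None:
    """Persist the last processed seq via temp file + atomic rename."""
    fd, tmp = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump({"since": seq}, handle)
        handle.flush()
        os.fsync(handle.fileno())  # make sure the bytes reach the disk first
    os.replace(tmp, path)  # readers see either the old file or the new one


def load_checkpoint(path: Path, default: str = "now") -> str:
    """Return the stored seq, or `default` when no checkpoint exists yet."""
    try:
        return json.loads(path.read_text(encoding="utf-8"))["since"]
    except FileNotFoundError:
        return default
```

Save the checkpoint only after the corresponding resolution has been committed, so a crash replays a change instead of skipping it.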

When merge confidence falls below threshold or semantic validation fails repeatedly, do not force a lossy write. Route the document through fallback resolution chains and, if still unresolved, into manual review sync queues so an operator inspects it with a full audit trail. Track three operational signals throughout: resolution latency (detection to committed _rev), fallback rate (share of conflicts escalated), and revision-tree depth (max generations before pruning).

FAQ

Does CouchDB ever merge conflicting revisions for me?

No. CouchDB deterministically selects a winning revision to return on reads, but it keeps every divergent leaf in the revision tree. Merging — and deleting the losing leaves — is entirely the application’s responsibility, which is why algorithm selection is a first-class pipeline concern rather than a database setting.
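
The read-time selection can be approximated in a few lines, which makes it clear that nothing is merged. The sketch assumes the ordering is: live leaves before deleted ones, then higher generation, then the lexicographically higher hash. It illustrates that ordering as understood here and is not CouchDB's own code; the leaf dictionaries are a simplified, hypothetical shape.

```python
def pick_winner(leaves: list[dict]) -> dict:
    """Pick the leaf a read would return, under the assumed ordering."""

    def rank(leaf: dict) -> tuple[bool, int, str]:
        generation, _, digest = leaf["rev"].partition("-")
        # Compare generations as integers so "10-..." beats "9-...".
        return (not leaf.get("deleted", False), int(generation), digest)

    return max(leaves, key=rank)


leaves = [
    {"rev": "3-aaa", "deleted": True},
    {"rev": "2-bbb"},
    {"rev": "2-ccc"},
]
print(pick_winner(leaves)["rev"])  # prints 2-ccc
```

Every node applies the same ordering, so all replicas agree on the winner without coordinating. The other leaves remain in the tree until the application deletes them.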

Why must I tombstone the losing revisions instead of just writing a new winner?

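Writing a new winner only extends the winning branch; it does not remove the competing branches. The document stays conflicted until each rev in its _conflicts array is deleted with {"_id": id, "_rev": rev, "_deleted": true}. Sending the merged winner and those tombstones in one _bulk_docs request clears the conflict in a single round trip, but CouchDB applies each document in the batch independently, so check every result row before treating the conflict as resolved.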
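
The batch shape and the per-row check can be sketched as two pure helpers. Both function names are hypothetical, and the revision strings in the usage lines are made up for illustration.

```python
def build_resolution_batch(merged_body: dict, doc_id: str, winner_rev: str,
                           losers: list[str]) -> dict:
    """Build a _bulk_docs payload: one merged winner plus one tombstone per loser."""
    # Drop underscore-prefixed metadata from the merged body. This also drops
    # _attachments, so carry those over separately if the documents use them.
    winner = {k: v for k, v in merged_body.items() if not k.startswith("_")}
    winner.update({"_id": doc_id, "_rev": winner_rev})
    tombstones = [{"_id": doc_id, "_rev": rev, "_deleted": True} for rev in losers]
    return {"docs": [winner, *tombstones]}


def failed_rows(results: list[dict]) -> list[dict]:
    """Return the _bulk_docs result rows that report an error."""
    return [row for row in results if "error" in row]


batch = build_resolution_batch(
    {"_rev": "2-old", "temp": 21.5}, "telemetry.sensor-42", "3-abc", ["2-def", "2-ghi"]
)
rejected = failed_rows([
    {"ok": True, "id": "telemetry.sensor-42", "rev": "4-new"},
    {"id": "telemetry.sensor-42", "error": "conflict",
     "reason": "Document update conflict."},
])
```

An empty list from failed_rows means every write in the batch landed; anything else calls for a re-read and another attempt.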

Can I mix algorithms in a single database?

Yes, and you usually should. Bind strategies by document namespace (the _id prefix or a type field): timestamp-based LWW for telemetry.*, field-union for config.*, CRDT operations for collab.*. A single global strategy either loses config edits or over-engineers throwaway telemetry.

How do I stop clock skew from corrupting LWW decisions?

Synchronise edge devices with NTP (or PTP for tighter bounds) and prefer hybrid logical clocks over raw wall-clock timestamps. Always parse timestamps as timezone-aware UTC. The full mitigation is covered in the Last-Write-Wins implementation guide linked above.
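
A hybrid logical clock pairs the wall clock with a counter so that timestamps never move backwards, even when a device clock does. The sketch below follows the commonly described update rules for local events and received messages; the class and function names are assumptions, and wall time is taken in integer milliseconds.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class HLC:
    """Orders by wall_ms first, then by counter."""
    wall_ms: int
    counter: int = 0


def hlc_tick(last: HLC, now_ms: int) -> HLC:
    """Timestamp a local write."""
    if now_ms > last.wall_ms:
        return HLC(now_ms, 0)
    return HLC(last.wall_ms, last.counter + 1)  # clock stalled or went backwards


def hlc_receive(last: HLC, remote: HLC, now_ms: int) -> HLC:
    """Advance past both the local state and a timestamp seen on a replicated doc."""
    wall = max(last.wall_ms, remote.wall_ms, now_ms)
    if wall == last.wall_ms and wall == remote.wall_ms:
        return HLC(wall, max(last.counter, remote.counter) + 1)
    if wall == last.wall_ms:
        return HLC(wall, last.counter + 1)
    if wall == remote.wall_ms:
        return HLC(wall, remote.counter + 1)
    return HLC(wall, 0)
```

Storing the pair in the document, for example as the value LWW compares, gives a total order that tolerates modest skew. It does not protect against a device whose clock is far in the future, so clock synchronisation still matters.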

Where do documents go when no algorithm can resolve them?

They escalate. The router returns without committing, the pipeline flags resolution_status: "manual_review", and the document flows through the fallback chain into the manual review queue rather than being silently overwritten.
