Conflict Generation Models in CouchDB Replication

In distributed edge and mobile deployments, document conflicts are not anomalies; they are deterministic outcomes of concurrent state mutations under network uncertainty. A conflict generation model names the precise conditions under which divergent document revisions are produced, propagated, and materialized in the database — and if you cannot reproduce those conditions on demand, you cannot prove your sync pipeline handles them. This page gives edge/IoT developers, mobile backend engineers, and Python sync builders a way to configure, stream-observe, and deterministically generate each model in a controlled environment before it hits production. It sits under CouchDB Replication Architecture & Revision Fundamentals, which establishes how MVCC and replication checkpoints interact to surface divergence; here we focus on the upstream question of how that divergence is created in the first place.

The canonical concurrent-write case: two disconnected nodes edit the same parent revision, each producing a generation-3 leaf locally. Replication copies both leaves into one database, where they coexist as a conflict until the application reconciles them.

CouchDB selects a winning revision deterministically: among non-deleted leaves the highest generation count wins, and ties are broken by the revision hash that sorts highest. The losing live branches are retained as additional leaves and surfaced through a computed _conflicts array on reads with ?conflicts=true. The winner is chosen at read time, in memory, and is purely the default revision returned; selecting it never deletes the losers. Every generation model below is a different route to that same divergent-leaf state, and each one demands its own simulation profile.
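The ordering described above can be sketched in a few lines. This is an illustrative model, not CouchDB's implementation: `pick_winner` is a hypothetical helper, the short hashes are made up, and plain string comparison of hashes is assumed to match CouchDB's ordering only for equal-length lowercase hex strings.

```python
from typing import Iterable, List, Tuple


def pick_winner(leaves: Iterable[Tuple[str, bool]]) -> str:
    """Return the winning rev from (rev, deleted) leaf pairs.

    Assumed ordering: live leaves beat deleted ones, then the higher
    generation wins, then the hash that sorts higher.
    """
    def sort_key(leaf: Tuple[str, bool]):
        rev, deleted = leaf
        generation, _, rev_hash = rev.partition("-")
        return (not deleted, int(generation), rev_hash)

    return max(leaves, key=sort_key)[0]


# Two live generation-3 leaves plus one deleted generation-4 leaf (made-up hashes).
leaves: List[Tuple[str, bool]] = [("3-aaa", False), ("3-bbb", False), ("4-ccc", True)]
winner = pick_winner(leaves)
# Only live losing leaves would show up in _conflicts.
conflicts = [rev for rev, deleted in leaves if rev != winner and not deleted]
```

The deleted generation-4 leaf loses despite its higher generation, which is why tombstoning a branch removes it from contention.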

Configuration Schema & Required Parameters

Reliable conflict generation starts from replication that keeps every divergent leaf flowing into one database: nothing is discarded, so the conflict is guaranteed to materialize. Deploy the simulation target with the standard _replicator document schema. Each _replicator document describes one direction only, so a bidirectional edge-to-central setup needs a second document with source and target swapped; the edge-to-central half used for reproduction looks like this:

{
  "_id": "rep_conflict_sim_bidi",
  "source": "https://edge-node-01.local:5984/conflict_sim",
  "target": "https://central-db.cluster:5984/conflict_sim",
  "continuous": true,
  "create_target": false,
  "user_ctx": {
    "name": "conflict_sim_svc",
    "roles": ["_admin"]
  }
}
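Because one _replicator document covers one direction, a bidirectional setup is a pair of documents. The sketch below builds such a pair; `replication_pair` and the `_push`/`_pull` id suffixes are hypothetical naming choices, and `user_ctx` is left out for brevity.

```python
import json
from typing import Any, Dict, List


def replication_pair(edge_url: str, central_url: str, job_prefix: str) -> List[Dict[str, Any]]:
    """Build two one-way _replicator documents that together sync both directions."""
    def one_way(suffix: str, source: str, target: str) -> Dict[str, Any]:
        return {
            "_id": f"{job_prefix}_{suffix}",
            "source": source,
            "target": target,
            "continuous": True,
            "create_target": False,
        }

    return [
        one_way("push", edge_url, central_url),   # edge -> central
        one_way("pull", central_url, edge_url),   # central -> edge
    ]


pair = replication_pair(
    "https://edge-node-01.local:5984/conflict_sim",
    "https://central-db.cluster:5984/conflict_sim",
    "rep_conflict_sim",
)
print(json.dumps(pair, indent=2))
```

Each document would be written to the _replicator database separately; deleting one of them pauses only that direction.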

The generator itself is driven by a small set of parameters that map directly onto the four models. Each parameter shapes which divergence pattern you reproduce and how aggressively:

Parameter	Type	Default	Effect on the generated conflict
`db_url`	string	— (required)	Target database URL the generator writes against; must retain conflicts (no auto-merge upstream).
`auth`	tuple	— (required)	Basic-auth credentials for the write user; needs write access plus `_bulk_docs` for tombstone cleanup.
`concurrent_writes`	int	`3`	Number of threads racing the same `base_rev`; higher values widen the conflict fan-out per document.
`delay_ms`	int	`50`	Jitter injected before results are collected, emulating network latency between racing writers.
`base_rev`	string	— (required)	The shared parent revision every writer targets; forcing a common parent is what guarantees divergence.
`partition_window_s`	int	`0`	For split-brain runs, how long to withhold replication so independent histories accumulate before merge.

Because CouchDB never discards a losing branch on its own, you do not need any special “retain conflicts” flag — the job simply must not run an auto-merge worker on the target during a simulation. Validate the job is live with GET /_active_tasks or GET /_scheduler/jobs before generating load. For authoritative field-level guidance on the replication document, consult the official CouchDB Replicator documentation.
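The liveness check can be reduced to a small predicate over the parsed scheduler response. This sketch assumes the GET /_scheduler/jobs body contains a `jobs` array whose entries carry a `doc_id` field; `job_is_scheduled` is a hypothetical helper and the sample response is hand-written and trimmed.

```python
from typing import Any, Dict


def job_is_scheduled(scheduler_jobs: Dict[str, Any], doc_id: str) -> bool:
    """Return True if a job created from the given _replicator doc is listed."""
    return any(job.get("doc_id") == doc_id for job in scheduler_jobs.get("jobs", []))


# Trimmed, hand-written stand-in for a parsed GET /_scheduler/jobs body.
sample_response = {
    "total_rows": 1,
    "offset": 0,
    "jobs": [{"database": "_replicator", "doc_id": "rep_conflict_sim_bidi"}],
}
is_live = job_is_scheduled(sample_response, "rep_conflict_sim_bidi")
is_missing = job_is_scheduled(sample_response, "rep_unknown")
```

Being listed only shows that the scheduler knows the job; a job that keeps crashing can also be listed, so a stricter gate would inspect the job's history as well.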

Streaming Detection / Monitoring Setup

To confirm a model actually produced conflicts, rather than assuming it did, subscribe to the target’s _changes feed with style=all_docs so every leaf revision is reported, plus include_docs=true and conflicts=true so each document carries its computed _conflicts array (conflicts is ignored unless include_docs is set). The minimal listener below streams changes and yields only the documents that diverged, giving you a live conflict-generation rate:

import json

import httpx


def stream_generated_conflicts(db_url: str, since: str = "now"):
    """Yield (doc_id, winning_rev, losers[]) for each conflicted document.

    Uses the continuous _changes feed with style=all_docs so every leaf is
    reported, plus include_docs + conflicts=true so the computed _conflicts
    array arrives without a second round trip.
    """
    params = {
        "feed": "continuous",
        "since": since,
        "style": "all_docs",
        "include_docs": "true",
        "conflicts": "true",
        "heartbeat": "10000",
    }
    with httpx.stream("GET", f"{db_url}/_changes", params=params, timeout=None) as r:
        for line in r.iter_lines():
            if not line:  # heartbeat keep-alive
                continue
            row = json.loads(line)
            doc = row.get("doc") or {}
            losers = doc.get("_conflicts")
            if losers:  # only diverged docs carry this key
                yield doc["_id"], doc["_rev"], losers


if __name__ == "__main__":
    for doc_id, rev, losers in stream_generated_conflicts("http://localhost:5984/conflict_sim"):
        print(f"generated: {doc_id} winner={rev} losers={losers}")

Emit a per-document metric here (conflict detected, timestamp, model label) so you can attribute each spike back to the generation profile that caused it and later measure how quickly your resolver drains the backlog. The same feed is the input to every downstream resolver described in conflict detection & automated resolution strategies, so wiring detection first means the same observability path serves both simulation and production.
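A minimal sketch of that per-document metric, kept in memory for illustration. `ConflictEvent`, `ConflictMetrics`, and the model labels are hypothetical names; a real pipeline would forward these events to whatever metrics backend it already uses.

```python
import time
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class ConflictEvent:
    doc_id: str
    model: str          # hypothetical label, e.g. "concurrent_write"
    loser_count: int
    detected_at: float = field(default_factory=time.time)


class ConflictMetrics:
    """Collects conflict events and counts them per generation model."""

    def __init__(self) -> None:
        self.events: List[ConflictEvent] = []
        self.by_model: Counter = Counter()

    def record(self, doc_id: str, model: str, losers: List[str]) -> ConflictEvent:
        event = ConflictEvent(doc_id, model, len(losers))
        self.events.append(event)
        self.by_model[model] += 1
        return event


metrics = ConflictMetrics()
metrics.record("sensor-1", "concurrent_write", ["3-aaa"])
metrics.record("sensor-2", "split_brain", ["3-aaa", "3-bbb"])
metrics.record("sensor-3", "concurrent_write", ["2-ccc"])
```

Calling `record` from the loop that consumes the listener's `(doc_id, rev, losers)` tuples is enough to attribute each spike to a model label.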

Core Implementation

The following framework generates deterministic conflicts for pipeline validation. It creates a base document, races several threads against the same revision to force _rev collisions, and then verifies that the divergence materialized. Structured logging, a bounded HTTP session with connection reuse, and explicit 409 handling make it safe to run in CI/CD. Annotations mark the non-obvious lines.

import logging
import time
import uuid
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Any, Dict, List, Tuple

import requests

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("conflict-generator")


class ConflictGenerator:
    """Simulates deterministic CouchDB conflict generation for pipeline validation."""

    def __init__(self, db_url: str, auth: Tuple[str, str]):
        self.db_url = db_url.rstrip("/")
        self.session = requests.Session()
        self.session.auth = auth
        self.session.headers.update({"Content-Type": "application/json"})

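    def _request(self, method: str, endpoint: str, **kwargs) -> requests.Response:
        # Bound every call so a dead node fails the run instead of hanging CI.
        kwargs.setdefault("timeout", 10)
        resp = self.session.request(method, f"{self.db_url}/{endpoint}", **kwargs)
        resp.raise_for_status()
        return resp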

    def create_base_document(self, doc_id: str, initial_data: Dict[str, Any]) -> str:
        """Create the shared parent every writer will diverge from; return its _rev."""
        payload = {"_id": doc_id, **initial_data}
        return self._request("PUT", doc_id, json=payload).json()["rev"]

    def _concurrent_write(self, doc_id: str, rev: str, payload: Dict[str, Any]) -> Dict[str, Any]:
        """Write against a fixed parent rev to force divergence.

        Only one racer wins the local write; the rest see 409. Divergence
        proper appears once these independent leaves are replicated together.
        """
        payload["_rev"] = rev
        try:
            resp = self._request("PUT", doc_id, json=payload)
            return {"status": "success", "new_rev": resp.json()["rev"]}
        except requests.exceptions.HTTPError as exc:
            if exc.response is not None and exc.response.status_code == 409:
                return {"status": "conflict", "message": "revision mismatch"}
            raise

    def generate_conflicts(
        self,
        doc_id: str,
        base_rev: str,
        concurrent_writes: int = 3,
        delay_ms: int = 50,
    ) -> List[Dict[str, Any]]:
        """Spawn concurrent threads that all mutate the same document revision."""
        payloads = [
            {"_id": doc_id, "sensor_id": f"node_{i}", "value": i * 10, "ts": time.time()}
            for i in range(concurrent_writes)
        ]
        results: List[Dict[str, Any]] = []
        with ThreadPoolExecutor(max_workers=concurrent_writes) as executor:
            futures = []
            for p in payloads:
                futures.append(executor.submit(self._concurrent_write, doc_id, base_rev, p))
                # Stagger submissions so delay_ms actually separates the writers;
                # sleeping after all submits would only delay result collection.
                time.sleep(delay_ms / 1000.0)
            for future in as_completed(futures):
                results.append(future.result())
        log.info("generated %d write outcomes for %s", len(results), doc_id)
        return results

    def verify_conflict_state(self, doc_id: str) -> Dict[str, Any]:
        """Fetch the document with conflicts=true to confirm divergence materialized."""
        return self._request("GET", f"{doc_id}?conflicts=true").json()


if __name__ == "__main__":
    COUCH_URL = "http://localhost:5984/conflict_sim"
    AUTH = ("admin", "password")
    generator = ConflictGenerator(COUCH_URL, AUTH)

    test_doc_id = f"conflict_test_{uuid.uuid4().hex[:8]}"
    base_rev = generator.create_base_document(test_doc_id, {"type": "telemetry"})
    log.info("base revision: %s", base_rev)

    outcomes = generator.generate_conflicts(test_doc_id, base_rev, concurrent_writes=3)
    for outcome in outcomes:
        log.info("write outcome: %s", outcome)

    state = generator.verify_conflict_state(test_doc_id)
    log.info("final document state: %s", state)

The single-database race above reproduces only the local half of divergence: one writer succeeds and the rest receive 409, so no conflict is stored. To reproduce true replication conflicts, run two ConflictGenerator instances against separate databases (or against two replicas whose bidirectional replication is withheld for partition_window_s), then let replication merge the trees. The resulting _conflicts array is what you feed into a resolver, as covered in algorithm selection for merge.
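When a second database is inconvenient, sibling leaves can also be grafted into one database the way the replicator does it, by posting to _bulk_docs with new_edits set to false. The sketch below only builds the request body; `divergent_leaves` and `rev_hash` are hypothetical helpers, and the synthetic hashes are an assumption made for illustration rather than how CouchDB derives revision ids.

```python
import hashlib
from typing import Any, Dict, List


def rev_hash(seed: str) -> str:
    """Derive a stable 32-hex-character revision id from a seed (illustrative only)."""
    return hashlib.sha256(seed.encode("utf-8")).hexdigest()[:32]


def divergent_leaves(doc_id: str, base_rev: str, bodies: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Build a _bulk_docs body that grafts one sibling leaf per body onto base_rev."""
    generation, _, parent_hash = base_rev.partition("-")
    next_gen = int(generation) + 1
    docs = []
    for index, body in enumerate(bodies):
        leaf_hash = rev_hash(f"{doc_id}:{base_rev}:{index}")
        docs.append({
            **body,
            "_id": doc_id,
            "_rev": f"{next_gen}-{leaf_hash}",
            # Revision path, newest first: the new leaf, then the shared parent.
            "_revisions": {"start": next_gen, "ids": [leaf_hash, parent_hash]},
        })
    # new_edits=False asks CouchDB to store the given revisions as-is.
    return {"new_edits": False, "docs": docs}


payload = divergent_leaves(
    "conflict_test_demo",
    "1-" + "a" * 32,
    [{"sensor_id": "node_0", "value": 0}, {"sensor_id": "node_1", "value": 10}],
)
```

Posting `payload` to the database's _bulk_docs endpoint and then reading the document with ?conflicts=true should show one sibling as the winner and the other in _conflicts. Because the hashes are derived from fixed seeds, repeated runs produce the same leaves.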

Conflict Generation Models & Trade-offs

For production sync automation, generation models fall into four operational categories. Each has a distinct trigger, a distinct detection signal, and a distinct mitigation, so parameterize your simulation to cover all four rather than only the easy concurrent-write case:

Concurrent Write Divergence: Two or more nodes independently mutate the same document revision before replication synchronizes state. CouchDB retains every leaf and surfaces the losers in _conflicts. This is the direct consequence of revision tree mechanics: the winner is chosen at read time, never during compaction, and the losing branches persist until the application deletes them.
Partition-Induced Split-Brain: Network segmentation isolates edge nodes, letting independent mutation histories accumulate. On reconnection, the replication engine merges the revision trees and materializes conflicts wherever branch heads diverge. Topology-aware routing mitigates it, but simulation remains critical for validating fallback behavior.
Mobile Offline Queue Replay: Devices buffer mutations while offline and flush them in bulk on reconnection. High-volume replay against a recently updated server revision triggers systematic conflict generation, because every queued write still names a parent _rev the server has already moved past. The effect is worst when optimistic concurrency checks are bypassed, and drift between device timestamps and the CouchDB cluster clock then makes timestamp-based resolution of the resulting conflicts unreliable.
IoT Telemetry Collision: High-frequency sensor streams from many devices target the same logical document, producing overlapping revision windows. When write intervals fall below replication latency, the CouchDB cluster experiences rapid revision churn. Choosing the right sync topology model spreads those writes across documents or nodes and lowers collision probability.

Generation model	Primary trigger	Detection signal	Simulation knob	Mitigation
Concurrent write divergence	Same-rev writes on two nodes	`_conflicts` on `?conflicts=true` read	`concurrent_writes`	Deterministic resolver, tombstone losers
Partition-induced split-brain	Reconnect after isolation	Conflict burst on merge seq	`partition_window_s`	Topology-aware routing, per-partition workers
Mobile offline queue replay	Bulk flush vs. moved server rev	409 storm then conflict spike	Large batch + stale `base_rev`	Hybrid logical clocks, optimistic retry
IoT telemetry collision	Sub-latency write interval	Sustained high revision churn	Low `delay_ms`, many writers	Shard by device, aggregate at edge

The trade-off here is not consistency versus latency; it is coverage versus test time. Concurrent-write and telemetry-collision runs complete in seconds; split-brain and offline-replay runs require withholding replication for a full partition_window_s, so budget for the longer sweep in CI when validating those two.
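One way to make that budget explicit is to keep a profile per model and sum the time each one forces the run to wait. Everything here is a placeholder: `SimulationProfile`, the profile keys, and all numeric values are assumptions chosen for illustration, not recommended settings.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class SimulationProfile:
    concurrent_writes: int
    delay_ms: int
    partition_window_s: int


# Placeholder values only; tune them against your own topology.
PROFILES: Dict[str, SimulationProfile] = {
    "concurrent_write": SimulationProfile(concurrent_writes=3, delay_ms=50, partition_window_s=0),
    "split_brain": SimulationProfile(concurrent_writes=3, delay_ms=50, partition_window_s=120),
    "offline_replay": SimulationProfile(concurrent_writes=20, delay_ms=50, partition_window_s=60),
    "telemetry_collision": SimulationProfile(concurrent_writes=16, delay_ms=5, partition_window_s=0),
}


def minimum_sweep_seconds(profiles: Dict[str, SimulationProfile]) -> float:
    """Lower bound on wall-clock time: partition windows plus one delay per profile."""
    return sum(p.partition_window_s + p.delay_ms / 1000.0 for p in profiles.values())


budget_s = minimum_sweep_seconds(PROFILES)
```

The result is only a floor, since replication catch-up and HTTP round trips add to it, but it shows at a glance that the two partition-based models dominate the sweep.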

Deployment & Orchestration

Run the generator as a small stateless job, one instance per replication partition you are validating. Running two generators against the same partition inflates the 409 retry pressure without producing a cleaner conflict, so shard by document namespace or device group instead of cloning writers on one stream. Drive everything through the environment so the same image serves every simulated edge:

# Container environment (one generator per simulated partition)
COUCH_URL=https://central-db.cluster:5984/conflict_sim
GEN_CONCURRENT_WRITES=3
GEN_DELAY_MS=50
GEN_PARTITION_WINDOW_S=0
HEALTHCHECK_PORT=8080
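A sketch of reading those variables into a typed settings object, so a bad value fails at startup rather than mid-run. `GeneratorSettings` is a hypothetical name; the defaults mirror the environment block above and the parameter table.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class GeneratorSettings:
    couch_url: str
    concurrent_writes: int = 3
    delay_ms: int = 50
    partition_window_s: int = 0
    healthcheck_port: int = 8080

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "GeneratorSettings":
        """Parse settings from a mapping; COUCH_URL is required, the rest default."""
        settings = cls(
            couch_url=env["COUCH_URL"].rstrip("/"),
            concurrent_writes=int(env.get("GEN_CONCURRENT_WRITES", "3")),
            delay_ms=int(env.get("GEN_DELAY_MS", "50")),
            partition_window_s=int(env.get("GEN_PARTITION_WINDOW_S", "0")),
            healthcheck_port=int(env.get("HEALTHCHECK_PORT", "8080")),
        )
        if settings.concurrent_writes < 1:
            raise ValueError("GEN_CONCURRENT_WRITES must be at least 1")
        if settings.delay_ms < 0 or settings.partition_window_s < 0:
            raise ValueError("delays and windows must not be negative")
        return settings


# Passing a plain dict keeps the example independent of the real environment.
example = GeneratorSettings.from_env(
    {"COUCH_URL": "https://central-db.cluster:5984/conflict_sim/", "GEN_DELAY_MS": "5"}
)
```

Credentials are deliberately not modelled here; how they reach the container depends on your secret store.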

Expose a /healthz endpoint that confirms the generator can still reach CouchDB and that its listener is still receiving data. Track the time of the last line received, heartbeats included, rather than the seq cursor alone: a quiet database keeps sending heartbeats without advancing seq, so only silence lasting longer than a couple of heartbeat intervals indicates a stalled listener. Pin the since checkpoint in durable storage so a restart resumes the observation window rather than replaying it, and let your orchestrator restart the pod on a failed health check. When a simulation run finishes, clean up by tombstoning generated leaves in a single _bulk_docs batch so the next run starts from a known-clean tree.
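A sketch of the state such a health check might read, assuming a local file is durable enough for the checkpoint. `FeedCursor` is a hypothetical helper; the 10-second default matches the 10000 ms heartbeat used by the listener earlier, and the two-interval grace period is an arbitrary choice.

```python
import json
import os
import tempfile
import time
from pathlib import Path
from typing import Callable


class FeedCursor:
    """Tracks listener liveness and persists the last processed seq."""

    def __init__(self, path: Path, heartbeat_s: float = 10.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.path = path
        self.heartbeat_s = heartbeat_s
        self.clock = clock
        self.last_line_at = clock()

    def touch(self) -> None:
        """Call for every line received, including empty heartbeat lines."""
        self.last_line_at = self.clock()

    def is_live(self, grace: float = 2.0) -> bool:
        """True while the feed has been silent for at most `grace` heartbeats."""
        return self.clock() - self.last_line_at <= grace * self.heartbeat_s

    def save(self, seq: str) -> None:
        """Write the seq to a temp file, then atomically swap it into place."""
        fd, tmp = tempfile.mkstemp(dir=self.path.parent)
        with os.fdopen(fd, "w") as handle:
            json.dump({"since": seq}, handle)
        os.replace(tmp, self.path)

    def load(self, default: str = "now") -> str:
        try:
            return json.loads(self.path.read_text())["since"]
        except FileNotFoundError:
            return default


# Demonstration with a fake clock so no real waiting is needed.
with tempfile.TemporaryDirectory() as workdir:
    fake_now = [0.0]
    cursor = FeedCursor(Path(workdir) / "since.json", clock=lambda: fake_now[0])
    first_since = cursor.load()        # no checkpoint yet
    cursor.save("42-abc")              # made-up seq value
    resumed_since = cursor.load()
    fake_now[0] = 25.0                 # 25 s of silence, beyond 2 heartbeats
    stalled = not cursor.is_live()
```

The /healthz handler would return a failure status whenever `is_live()` is false, and the listener would call `touch()` on every line and `save()` after each processed change.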

Troubleshooting & Common Errors

Symptom / error	Likely cause	Remediation
`409 Conflict` on all but one writer and no `_conflicts`	All writers targeted the same node; only one leaf was created	Run writers against separate databases, then replicate the trees together
No conflict materializes after replication	An auto-merge worker drained the losers before you observed them	Pause resolvers during the run; verify with `?conflicts=true` immediately after
`doc_update_conflict` in logs	Two generators racing the same partition	Enforce one generator per partition; shard by namespace
Conflict count keeps climbing after cleanup	Winner written but losers never tombstoned	Delete every rev in `_conflicts` with `_deleted: true` in the same batch
Revision tree grows unbounded	Compaction never runs between simulation sweeps	Schedule `POST /db/_compact` in low-traffic windows; watch `revs_limit`
Checkpoint drift / replaying old changes	`since` cursor not persisted across restarts	Store the last `seq` durably and resume from it
Split-brain run produces no conflicts	`partition_window_s` too short for both histories to advance	Widen the window so each side commits at least one independent leaf
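The cleanup remedy in the table, deleting every revision listed in _conflicts in one batch, can be sketched as a pure function that builds the _bulk_docs body. `tombstone_batch` is a hypothetical helper, and the sample document uses made-up revision hashes.

```python
from typing import Any, Dict, Optional


def tombstone_batch(doc: Dict[str, Any], merged: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build a _bulk_docs body that deletes every losing leaf of a conflicted doc.

    If `merged` is given, it is written on top of the current winner in the
    same batch, so the resolution and the cleanup travel together.
    """
    docs = [
        {"_id": doc["_id"], "_rev": rev, "_deleted": True}
        for rev in doc.get("_conflicts", [])
    ]
    if merged is not None:
        docs.append({**merged, "_id": doc["_id"], "_rev": doc["_rev"]})
    return {"docs": docs}


# Shape of a document fetched with ?conflicts=true (hashes shortened, made up).
conflicted = {"_id": "sensor-1", "_rev": "3-bbb", "_conflicts": ["3-aaa", "3-abc"], "value": 20}
batch = tombstone_batch(conflicted, merged={"value": 25})
```

Each entry names the losing revision it deletes, so the winner is left untouched unless a merged body is supplied.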

Persistent 409 storms during offline-replay simulation are expected — they are the generation signal, not a failure — but production replay of the same pattern belongs in error handling & retry logic and the focused guide on handling 409 conflicts in replication jobs. Whether you reproduce conflicts under a continuous listener or a scheduled sweep is a separate decision covered in continuous vs one-way sync.

FAQ

Why do concurrent writes to a single node not produce a conflict?

A single CouchDB node serializes writes against the current _rev: the first writer wins and the rest receive 409 Conflict, so only one leaf is created. True divergence needs two independent leaves that later meet through replication. Reproduce it by writing to two databases (or two paused replicas) and then replicating their trees together.

Does CouchDB ever auto-resolve the conflicts I generate?

No. CouchDB deterministically picks a winning revision to return on reads, but it retains every divergent leaf in the revision tree. Nothing is auto-merged or auto-deleted; clearing a conflict is entirely the application’s job, which is exactly why generating and observing conflicts safely is a prerequisite to validating a resolver.

How do I make conflict generation deterministic in CI?

Fix the shared base_rev, pin concurrent_writes, and force ordering with the delay_ms jitter so every run races the same parent. Assert on the _conflicts array length after the run rather than on which specific hash wins, since the winner is a lexicographic tiebreak you should treat as opaque.
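A sketch of such an assertion, written so that two runs with different winning hashes both pass as long as the fan-out matches. `assert_conflict_fanout` is a hypothetical helper and the documents are hand-written samples.

```python
from typing import Any, Dict


def assert_conflict_fanout(doc: Dict[str, Any], expected_losers: int) -> None:
    """Fail unless the document carries exactly the expected number of losers."""
    losers = doc.get("_conflicts", [])
    if len(losers) != expected_losers:
        raise AssertionError(
            f"{doc.get('_id')}: expected {expected_losers} losing leaves, found {len(losers)}"
        )


# Different winners, same fan-out: both runs count as equivalent.
run_a = {"_id": "d", "_rev": "2-ffff", "_conflicts": ["2-aaaa", "2-bbbb"]}
run_b = {"_id": "d", "_rev": "2-eeee", "_conflicts": ["2-1111", "2-2222"]}
assert_conflict_fanout(run_a, 2)
assert_conflict_fanout(run_b, 2)
```

A document read without any conflicts has no _conflicts key at all, which the helper treats as zero losers.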

What is the difference between split-brain and offline-replay generation?

Split-brain accumulates independent histories on both sides during a network partition and conflicts on reconnect. Offline-replay buffers writes on one side and flushes them against a server revision that has already moved, so the conflict burst is one-directional and usually preceded by a 409 storm. They need different simulation windows and different mitigations.

Where do generated conflicts go once I stop simulating?

They flow into the same resolution path as production conflicts: a resolver reads the _conflicts array, merges the branches, and tombstones the losers. Documents no strategy can resolve should escalate to manual review sync queues rather than being silently overwritten.

