Self-hosting OpenPartner: Postgres, click ingestion, and what scales when
A practical guide to running OpenPartner in production on your own infrastructure — from the single-box Compose deploy to the parts of the stack that need attention as click volume climbs into the millions per month.
OpenPartner ships with a Docker Compose template that runs the full stack (Postgres, click router, API, admin portal) on a single box. Run docker compose up -d, fill in the install wizard, and you’re tracking clicks. For most programs that’s the whole story.
This post is for the small fraction of self-hosters who outgrow the single-box deploy or want to know what’s going to bend before it does. The honest version: most things just keep working, and the parts that need attention are predictable.
What’s in the stack
The four services in the default compose:
- Postgres: source of truth for clicks, identities, events, attributions, commissions
- Router: the redirect endpoint at /r/:slug. Records the click, sets cref, 302s.
- API: everything else. SDK ingestion, admin actions, attribution engine, payouts.
- Portal: the admin and partner web UIs. Static-ish, talks to the API.
Plus optional dependencies you bring (or don’t):
- A reverse proxy (Caddy, Traefik, nginx) for TLS
- An object store for uploads (S3-compatible) once you need shared file storage
- An SMTP relay (or transactional email API) for partner notifications
Day 1: single-box deploy
Two CPU cores, 4GB RAM, a 50GB SSD. Caddy in front for TLS. That handles roughly:
- A few hundred thousand clicks per day comfortably
- Tens of thousands of conversions
- The full admin and partner experience for hundreds of partners
The bottleneck on a box this size is almost always Postgres write throughput on the Click table, since every click is one insert. If your average click rate is below 10 per second, you won’t notice anything. Above that, read the next section.
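To compare the per-day and per-second figures above, a small conversion helper is enough. This is only arithmetic on average rates (real traffic is burstier than its average), and the function names are made up for illustration.

```typescript
// Seconds in one day, used to convert between daily totals and per-second averages.
const SECONDS_PER_DAY = 86400;

/** Average clicks per second implied by a daily click total. */
function averageClicksPerSecond(clicksPerDay: number): number {
  return clicksPerDay / SECONDS_PER_DAY;
}

/** Daily click total implied by a sustained average rate. */
function clicksPerDayAtRate(clicksPerSecond: number): number {
  return clicksPerSecond * SECONDS_PER_DAY;
}

console.log(clicksPerDayAtRate(10)); // 864000
console.log(averageClicksPerSecond(300000).toFixed(2)); // 3.47
```

So a sustained 10 clicks per second is 864,000 clicks per day, and 300,000 clicks per day averages out to about 3.5 per second.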
What scales linearly with click volume
The click router is the only hot path that scales with click volume (rather than conversion volume or partner count). It does three things:
- Look up the share-link by slug
- Insert a Click row
- 302 the user with ?cref=<clickId> appended
Each of those is fast. The contention point at scale is the Click insert.
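The three router steps can be sketched in a few lines. This is not the OpenPartner router: the ShareLink and ClickRow shapes, the in-memory stand-ins for Postgres, and the clk_ id format are all assumptions made for illustration.

```typescript
// Hypothetical shapes; the real schema may differ.
interface ShareLink { slug: string; destination: string; }
interface ClickRow { id: string; slug: string; ts: Date; }

// In-memory stand-ins for the share-link table and the Click table.
const shareLinks: Record<string, ShareLink> = {
  "spring-sale": { slug: "spring-sale", destination: "https://example.com/pricing?plan=pro" },
};
const clickLog: ClickRow[] = [];

/** Append cref to a URL, keeping any existing query string and fragment intact. */
function appendCref(destination: string, clickId: string): string {
  const hashAt = destination.indexOf("#");
  const base = hashAt === -1 ? destination : destination.slice(0, hashAt);
  const fragment = hashAt === -1 ? "" : destination.slice(hashAt);
  const separator = base.indexOf("?") === -1 ? "?" : "&";
  return base + separator + "cref=" + encodeURIComponent(clickId) + fragment;
}

/** Look up, record, redirect. Returns null when the slug is unknown. */
function handleRedirect(slug: string, now: Date): { status: number; location: string } | null {
  // 1. Look up the share link by slug.
  if (!Object.prototype.hasOwnProperty.call(shareLinks, slug)) return null;
  const link = shareLinks[slug];
  // 2. Insert a Click row (here: push onto an array).
  const click: ClickRow = { id: "clk_" + (clickLog.length + 1), slug: slug, ts: now };
  clickLog.push(click);
  // 3. Redirect with the click id appended as cref.
  return { status: 302, location: appendCref(link.destination, click.id) };
}

console.log(appendCref("https://example.com/", "clk_1")); // https://example.com/?cref=clk_1
```

Only step 2 touches shared, growing state, which is why the insert is where contention shows up first.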
Two patterns for going past the single-box limit:
Pattern 1: separate router + Postgres replica
Run multiple router instances behind a load balancer, all writing to a single primary Postgres. Clicks are insert-only on the hot path; read replicas pick up the analytics queries so the primary doesn’t get blocked by big aggregations.
This works comfortably to ~10M clicks/month on a sensibly-sized primary (8 vCPU / 16GB RAM, NVMe storage). Postgres handles inserts faster than people expect when you’re not also running 30 dashboards on the same instance.
Pattern 2: partition the Click table by month
Once monthly click volume crosses ~50M, declarative partitioning on Click.ts keeps inserts cheap and makes archiving as simple as dropping an old partition. The schema supports this without changes: the only steps are running CREATE TABLE click_2026_05 PARTITION OF "Click" FOR VALUES FROM ... on a monthly cron and detaching old partitions.
Most self-hosters never need this. It’s documented because the few who do, need it without a refactor.
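For the monthly cron, the DDL can be generated rather than hand-written. The sketch below assumes "Click" is already declared as range-partitioned on ts and that partitions follow the click_YYYY_MM naming used above; the function names are hypothetical. Postgres range bounds are inclusive at FROM and exclusive at TO, so each partition runs from the first of the month to the first of the next.

```typescript
/** Zero-pad a month number to two digits. */
function pad2(n: number): string {
  return n < 10 ? "0" + n : String(n);
}

/** DDL that creates the partition for one calendar month (month is 1-12). */
function monthlyPartitionDdl(year: number, month: number): string {
  if (month < 1 || month > 12) throw new RangeError("month must be 1-12");
  // December rolls over into January of the following year.
  const nextYear = month === 12 ? year + 1 : year;
  const nextMonth = month === 12 ? 1 : month + 1;
  const name = "click_" + year + "_" + pad2(month);
  const from = year + "-" + pad2(month) + "-01";
  const to = nextYear + "-" + pad2(nextMonth) + "-01";
  return 'CREATE TABLE IF NOT EXISTS ' + name + ' PARTITION OF "Click" ' +
    "FOR VALUES FROM ('" + from + "') TO ('" + to + "');";
}

/** DDL that detaches an old month so it can be archived or dropped separately. */
function detachPartitionDdl(year: number, month: number): string {
  if (month < 1 || month > 12) throw new RangeError("month must be 1-12");
  return 'ALTER TABLE "Click" DETACH PARTITION click_' + year + "_" + pad2(month) + ";";
}

console.log(monthlyPartitionDdl(2026, 5));
console.log(detachPartitionDdl(2025, 5));
```

Creating next month's partition a few days early avoids inserts failing at midnight on the first because no partition matches.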
What scales with conversion + commission volume
Different hot path — the inbound SDK calls. identify() writes an Identity row;
event() runs the attribution query, writes a Commission row, and fires webhooks.
The attribution query is the interesting part. It joins Event → Identity → Click within the lookback window. With proper indexes (the schema ships them), it stays fast at millions of events. The rare slow case is a user with hundreds of clicks in the lookback window — the engine has to load all of them to compute multi-touch. That’s normal; bigger boxes solve it before architecture does.
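To make the lookback window concrete, here is the windowing step on its own, done in memory. The engine's actual query and attribution models are not shown in this post; the ClickTouch shape and the function name are assumptions for illustration.

```typescript
// Hypothetical minimal view of a click: an id and a timestamp in epoch milliseconds.
interface ClickTouch { id: string; ts: number; }

const MS_PER_DAY = 24 * 60 * 60 * 1000;

/**
 * Return the clicks that fall inside the lookback window ending at the event,
 * oldest first. Clicks after the event or before the window start are excluded.
 */
function clicksInLookback(clicks: ClickTouch[], eventTs: number, lookbackDays: number): ClickTouch[] {
  const windowStart = eventTs - lookbackDays * MS_PER_DAY;
  return clicks
    .filter((c) => c.ts >= windowStart && c.ts <= eventTs)
    .sort((a, b) => a.ts - b.ts);
}

const eventTs = Date.UTC(2026, 4, 31); // 31 May 2026
const touches = clicksInLookback(
  [
    { id: "late", ts: Date.UTC(2026, 4, 30) },
    { id: "old", ts: Date.UTC(2026, 3, 1) },
    { id: "future", ts: Date.UTC(2026, 5, 2) },
    { id: "mid", ts: Date.UTC(2026, 4, 10) },
  ],
  eventTs,
  30,
);
console.log(touches.map((c) => c.id)); // [ 'mid', 'late' ]
```

The slow case described above is simply this list being hundreds of entries long for one user, since a multi-touch model has to see every one of them.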
Webhooks are async — they go through a retry queue (WebhookDelivery table) with
exponential backoff. If your downstream is slow or down, the queue absorbs it. No
back-pressure on the conversion path.
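Exponential backoff here means the wait doubles after each failed delivery, usually up to a cap. The sketch below shows that schedule; the 1-second base, the 1-hour cap, and the function names are assumptions, not OpenPartner's actual retry settings.

```typescript
/**
 * Delay before retry number `attempt` (1-based): base * 2^(attempt - 1), capped.
 * The default base and cap are illustrative values only.
 */
function backoffDelayMs(attempt: number, baseMs: number = 1000, capMs: number = 3600000): number {
  if (attempt < 1) throw new RangeError("attempt starts at 1");
  return Math.min(capMs, baseMs * Math.pow(2, attempt - 1));
}

/** The full list of delays for a delivery that keeps failing. */
function backoffSchedule(maxAttempts: number): number[] {
  const delays: number[] = [];
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    delays.push(backoffDelayMs(attempt));
  }
  return delays;
}

console.log(backoffSchedule(5)); // [ 1000, 2000, 4000, 8000, 16000 ]
```

Because each pending delivery is a row with a next-attempt time, a slow downstream only grows the queue; it never holds up the request that recorded the conversion.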
Storage: file uploads need a real home
Default file uploads (brand logos, creator avatars) write to local disk at
/var/lib/openpartner/uploads. On a single-box deploy, mount a persistent volume there or
the files vanish on container restart. On any multi-host or ephemeral runtime, switch to
S3-backed storage:
OPENPARTNER_STORAGE_KIND=s3
OPENPARTNER_STORAGE_S3_BUCKET=...
OPENPARTNER_STORAGE_S3_REGION=...
See the env reference for the full set. This is the most common “I deployed and now my logos are gone” surprise.
Backups: pg_dump daily, that’s it
Postgres is the only stateful service. Daily pg_dump (or your managed-Postgres provider’s
snapshot mechanism) is enough. The repo’s Makefile includes a make backup target that
dumps to a date-stamped file.
For added safety, use WAL archiving and point-in-time recovery on a managed Postgres. The attribution data cannot be re-derived from logs; if you lose Postgres, you lose history.
Observability: it’s just a Node app
The API and router are Node services with structured JSON logs to stdout. Pipe them into
whatever log stack you already have — Loki, Datadog, CloudWatch, plain files. Health
endpoints at /healthz (liveness) and /readyz (readiness, checks Postgres reachability).
Postgres metrics are the single most important thing to watch. Specifically:
- pg_stat_statements to see if any query is dominating
- Connection count vs max_connections (default 100; you’ll want 200+ on a busy box)
- Replication lag if you’ve added replicas
- Disk usage on the data volume
If those are green, the rest of the stack is almost always green.
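The connection check is the easiest of these to automate. The sketch below classifies headroom from two numbers you can read with SELECT count(*) FROM pg_stat_activity and SHOW max_connections. The 70% and 90% thresholds and the function name are assumptions; pick values that match your own alerting.

```typescript
type HeadroomLevel = "ok" | "warn" | "critical";

/**
 * Classify connection usage against max_connections.
 * Thresholds (70% warn, 90% critical) are illustrative, not a recommendation from the project.
 */
function connectionHeadroom(
  activeConnections: number,
  maxConnections: number,
): { utilization: number; level: HeadroomLevel } {
  if (maxConnections <= 0) throw new RangeError("maxConnections must be positive");
  const utilization = activeConnections / maxConnections;
  let level: HeadroomLevel = "ok";
  if (utilization >= 0.9) {
    level = "critical";
  } else if (utilization >= 0.7) {
    level = "warn";
  }
  return { utilization: utilization, level: level };
}

console.log(connectionHeadroom(85, 100).level); // warn
console.log(connectionHeadroom(85, 200).level); // ok
```

The two example calls show why raising max_connections on a busy box buys headroom: the same 85 connections go from a warning to comfortable.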
The portability dividend
The whole point of self-hosting is that you own the data. Settings → Export data
produces CSV + JSON + a pg_dump-compatible SQL file you can hand to anyone. The
portability doc covers the export and re-import path in
full — including how attribution gets re-derived from raw events under the configured
model on the destination instance.
That export works the same way on the hosted plan. Migration in either direction is the same operation.
When not to self-host
Self-hosting makes sense if any of these are true:
- You have an ops team that runs Postgres in production already
- You have compliance constraints that require data residency you control
- You want to fork the code and run modifications
It does not make sense if:
- You’re a solo founder and ops time is your scarcest resource
- You don’t have an on-call rotation for Postgres
- The program is small enough that the hosted Revshare plan ($0/month + 3%) is cheaper than your time
The right choice changes as the program grows. Both sides of the door are open and the data moves between them losslessly. Start where the cost-of-ops vs cost-of-hosted math points today; switch later if it changes.
The full self-host index is at /docs/self-host/compose, and the GitHub repo is at github.com/getcoherence/openpartner.