Deployment topology & sizing
How the Bedrock components combine into a deployment, how coverage cells map routers to geography, and what actually drives scaling. It also separates what needs high availability (HA), namely the Directory and the web position recorder, from what is relay infrastructure (the routers).
Components
A deployment is built from four kinds of process. Only the router scales horizontally; the Directory is a singleton.
| Component | Count | Scales? | Notes |
|---|---|---|---|
| Directory | 1 | No — singleton | The identity authority: mints tokens, signs the revocation list, holds the trust root. Single point of failure — there is no documented replication, standby, or HA. If it is down, new logins and token issuance stop; already-cached tokens on routers and clients keep working. |
| Server / router | 1+ | Yes, horizontal | Relays cell-scoped traffic. Redundancy comes from running federated peers, not from making one box highly available (see What needs HA). Carries a set of coverage cells and keeps local state in RocksDB (chat for 24 h; drawings and channel definitions indefinitely) that is not replicated across routers. |
| Gateway | 0+ | Optional | Stateless interop bridge. Connects to one or more routers as a Directory-authenticated client (service IdentityToken) and subscribes. Not on the critical path — the mesh keeps working if a gateway is down. |
| Web / Android / Node | many | — | Clients, each pointed at a static router endpoint (no load balancer or service discovery). The web tier also runs a position recorder that durably archives position history to its database — the one durable record of positions (positions are non-durable in the mesh; clients show only a ~10-minute live trail). |
What needs HA
Durability is concentrated in two places; everything else is relay or client-owned. That is the whole reason "make the routers highly available" is not the goal.
- Directory: the one true SPOF. It is the identity authority (token signing, revocation list, trust root) with no replication or standby in source. If it is down, there are no new logins and no token issuance, but already-issued tokens on routers and clients keep working, so the live mesh keeps running. This is the component whose availability matters most. Source ships no HA mechanism; design redundancy around it (see Status & Roadmap).
- Web position recorder: the position archive. The web tier durably records position history (web:app/domains/tracks/position_zenoh_recorder.ts). Positions are otherwise non-durable: they relay through the mesh and clients show only a ~10-minute live trail. If position history matters to you, that recorder is the thing worth protecting. Voice is real-time and never archived.
- Routers: relays, redundant by federation, not by HA. A router forwards live traffic and keeps short-lived local buffers. For live continuity you don't make one router highly available; you run federated peers on the same cells and let clients reconnect. What a client is sending is covered by the client outbox (web/Android), which is durable on the client and replayed on reconnect, so a router restart never loses outbound messages.
The one caveat: per-router durable history is not replicated
A router does hold authoritative durable state — chat (24 h TTL), drawings (indefinite), channel definitions (indefinite) — in its local RocksDB, and there is no backfill between routers. The outbox is sender-side: it covers what you send, not what you receive. So a client that reconnects to a different peer after a failure sees that peer's history, not the dead router's — received chat/drawing history can be lost on failover. For short-lived tactical traffic this is an accepted tradeoff; if long-lived shared drawings matter to you, treat it as a real limitation (see Known limits & gaps).
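The gap between the sender-side outbox and per-router history can be seen in a toy model. This is a sketch, not the client or router implementation: `RouterSim`, `deliver`, and `flushOutbox` are hypothetical names, and the in-memory arrays stand in for the client's durable outbox and the router's RocksDB store.

```typescript
// Toy model (assumption): a router is just an up/down flag plus its own local history.
interface RouterSim {
  name: string;
  up: boolean;
  history: string[]; // stands in for the router's local store; never shared with peers
}

// A router accepts a message only while it is up, and records it locally.
function deliver(router: RouterSim, msg: string): boolean {
  if (!router.up) return false;
  router.history.push(msg);
  return true;
}

// Replay the client's outbox in order; stop at the first message the router refuses.
// Messages that are not accepted stay queued for the next reconnect.
function flushOutbox(outbox: string[], router: RouterSim): number {
  let sent = 0;
  while (outbox.length > 0) {
    const next = outbox[0];
    if (next === undefined || !deliver(router, next)) break;
    outbox.shift();
    sent++;
  }
  return sent;
}
```

In this model, a message queued while router A is down is delivered once the client flushes against router B, but B's `history` never contains what only A received. That is the same asymmetry described above: outbound traffic survives failover, received history does not.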
Coverage cells
Routing in Bedrock is cell-first: every position, heartbeat, and drawing is tagged with the geographic cell it happened in, and the cell is the first segment of the message address (waypoint/<cell>/...).
- Cells are geohash-5. A cell is exactly five characters from the geohash base32 alphabet (digits plus lowercase letters, minus a, i, l, o), enforced by GEOHASH5_PATTERN in the Directory (directory:app/domains/shared/server_token_service.ts). A geohash-5 cell is roughly a few km on a side at mid-latitudes (the create-server runbook uses ~5 km × ~5 km as a rule of thumb).
- Deny-by-default ACL. A router's transport access control denies everything, then adds exactly two kinds of allow rule: waypoint/global/** (auth, chat, voice, channels, key-rotation, liveliness, revocation relay) and one waypoint/<cell>/** per cell in the router's ServerToken.coverage_cells (server:src/router.rs). A router with no coverage cells drops all cell-routed traffic; it is misconfigured.
- 256-cell cap. A single ServerToken may carry at most 256 coverage cells (MAX_COVERAGE_CELLS = 256, directory:app/domains/shared/server_token_service.ts). The source comment notes this covers ~6,400 km² at the widest geohash-5 cell size (~5 km × ~5 km near the equator, i.e. 256 cells of ~25 km² each).
- Manual assignment at enrollment. Cells are chosen by the operator and baked into the ServerToken when the server is registered; there is no automatic geography-to-cell allocation tool. Two routers covering neighbouring areas simply own disjoint cell lists.
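The cell rules above can be checked before enrollment with a small operator-side lint. This is a minimal sketch, not the Directory's code: the regex is an assumed equivalent of GEOHASH5_PATTERN (the real pattern is not reproduced on this page), and `validateCoverageCells` and `allowRules` are hypothetical helper names.

```typescript
// Assumed equivalent of GEOHASH5_PATTERN: five characters from the geohash
// base32 alphabet (digits plus lowercase letters, minus a, i, l, o).
const GEOHASH5 = /^[0-9b-hjkmnp-z]{5}$/;

// Mirrors MAX_COVERAGE_CELLS from the Directory source.
const MAX_COVERAGE_CELLS = 256;

// Returns a list of problems; an empty list means the cell list looks enrollable.
function validateCoverageCells(cells: string[]): string[] {
  const problems: string[] = [];
  if (cells.length === 0) {
    problems.push("no coverage cells: the router would drop all cell-routed traffic");
  }
  if (cells.length > MAX_COVERAGE_CELLS) {
    problems.push(`too many cells: ${cells.length} > ${MAX_COVERAGE_CELLS}`);
  }
  for (const cell of cells) {
    if (!GEOHASH5.test(cell)) problems.push(`not a geohash-5 cell: ${cell}`);
  }
  return problems;
}

// The allow rules a deny-by-default router would end up with:
// one global rule plus one rule per coverage cell.
function allowRules(cells: string[]): string[] {
  return ["waypoint/global/**"].concat(cells.map((c) => `waypoint/${c}/**`));
}
```

For example, `allowRules(["u09tv"])` yields `waypoint/global/**` and `waypoint/u09tv/**`, while a cell such as `u09ta` is rejected because `a` is not in the geohash alphabet.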
Cells are set when you register a server. See Add a server (Step 2 "Decide what to register" and "How geofencing works").
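Because cells are picked by hand, it helps to be able to turn a coordinate into its geohash-5 cell when drawing up a cell list. The sketch below uses the standard public geohash encoding (interleaved longitude/latitude bisection, base32 output); `encodeGeohash5` and `cellPrefix` are hypothetical helper names, not functions from the Bedrock source.

```typescript
// Geohash base32 alphabet: digits plus lowercase letters, minus a, i, l, o.
const GEOHASH_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

// Standard geohash encoding, truncated to 5 characters (25 bits).
function encodeGeohash5(lat: number, lon: number): string {
  let latMin = -90, latMax = 90;
  let lonMin = -180, lonMax = 180;
  let hash = "";
  let bits = 0;        // bits collected for the current character
  let value = 0;       // numeric value of the current character
  let useLon = true;   // geohash alternates bits, starting with longitude
  while (hash.length < 5) {
    if (useLon) {
      const mid = (lonMin + lonMax) / 2;
      if (lon >= mid) { value = value * 2 + 1; lonMin = mid; }
      else { value = value * 2; lonMax = mid; }
    } else {
      const mid = (latMin + latMax) / 2;
      if (lat >= mid) { value = value * 2 + 1; latMin = mid; }
      else { value = value * 2; latMax = mid; }
    }
    useLon = !useLon;
    bits++;
    if (bits === 5) {
      hash += GEOHASH_BASE32.charAt(value);
      bits = 0;
      value = 0;
    }
  }
  return hash;
}

// Address prefix for traffic in a cell, following the waypoint/<cell>/... layout.
function cellPrefix(lat: number, lon: number): string {
  return `waypoint/${encodeGeohash5(lat, lon)}/`;
}
```

With the widely used reference point at latitude 57.64911, longitude 10.40744, `encodeGeohash5` returns `u4pru`, so traffic from that position would be addressed under `waypoint/u4pru/`.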
Topology patterns
Single router
Smallest possible deployment: one Directory, one router holding all the coverage cells. Every client connects to the same endpoint. No federation, no redundancy.
flowchart TD
D["Directory<br/>(identity authority)"]
R["Router<br/>(all cells)"]
C["Clients<br/>(web / android / node)"]
D -. tokens .-> R
D -. tokens .-> C
C --> R
Use it for: a single site, a demo, or a pilot where one box comfortably carries the whole operating area.
Single-site HA
Multiple routers covering the same geography — i.e. identical coverage cells — with clients spread across them (round-robin DNS or a static endpoint list). Each router still routes the same cells.
flowchart TD
D["Directory"]
R1["Router A<br/>(cells X, Y, Z)"]
R2["Router B<br/>(cells X, Y, Z)"]
C1["Clients (group 1)"]
C2["Clients (group 2)"]
D -. tokens .-> R1
D -. tokens .-> R2
C1 --> R1
C2 --> R2
R1 <--> R2
Redundancy here is federation, not HA of one box
Two routers on the same cells give live continuity: if one fails, clients reconnect to the other and traffic keeps flowing. There is no standby promotion and no automatic failover; client redirection is operator-managed (DNS / static endpoint list). This is not state replication: received chat/drawing history is per-router and not backfilled, so a client reconnecting to the peer sees the peer's history, not the dead router's (see What needs HA). Positions are non-durable in the mesh; routers relay them and clients keep a ~10-minute live trail.
Use it for: a single site that wants more than one box live so one failure doesn't take everyone offline — accepting that received durable history is per-router.
Multi-site / federated
One router per region, each carrying different coverage cells, linked into a mesh by an explicit static peer list. Cross-region traffic relays over the federation hop.
flowchart LR
D["Directory"]
RE["Router EU<br/>(EU cells)"]
RU["Router US<br/>(US cells)"]
CE["EU clients"]
CU["US clients"]
D -. tokens .-> RE
D -. tokens .-> RU
CE --> RE
CU --> RU
RE <==>|"transport.connect<br/>mTLS gossip"| RU
Use it for: a deployment spanning regions where each region has its own router and operators in one region need to see relevant traffic from another.
Federation
Routers federate router↔router. The link is not auto-discovered — the operator declares it:
- Static peer list. Each router lists the peers it dials in transport.connect (server:src/config.rs), a list of Zenoh locators. Add a locator to connect, prune it to disconnect.
- mTLS gossip with CN pinning. Federation links are mutually authenticated; an inbound peer's certificate CN is its router principal_id (server:src/active_peers.rs). Federation does not re-sign or re-verify per-hop application envelopes (payloads relay byte-identical across the hop), so the revocation audit on the peer's CN is the only enforcement, and an effectively-removed peer router is fully cut off only once its cert expires and the operator prunes it from transport.connect.
- Operator chooses the shape. Because peering is explicit, the operator decides whether routers form a chain, a star, or a full mesh. There is no topology controller.
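Since there is no topology controller, nothing stops an operator from leaving a router stranded by accident. The sketch below checks that a set of peer lists forms one connected federation. It is a planning aid under stated assumptions: `PeerLists` and `isConnected` are hypothetical names, routers are identified by name rather than by Zenoh locator, and a dial in either direction is treated as a usable link.

```typescript
// Router name -> names of the peers it dials (a stand-in for each router's
// transport.connect list, which in a real config holds Zenoh locators).
type PeerLists = Record<string, string[]>;

// True if every router can reach every other router over the declared links.
function isConnected(dials: PeerLists): boolean {
  const adj: Record<string, string[]> = {};
  const link = (a: string, b: string): void => {
    (adj[a] = adj[a] || []).push(b);
  };
  for (const from of Object.keys(dials)) {
    adj[from] = adj[from] || [];
    for (const to of dials[from] || []) {
      link(from, to); // assumption: a dial gives a link usable in both directions
      link(to, from);
    }
  }
  const names = Object.keys(adj);
  const start = names[0];
  if (start === undefined) return true; // no routers, trivially connected

  // Breadth-first search from an arbitrary router.
  const seen: Record<string, boolean> = {};
  seen[start] = true;
  const queue: string[] = [start];
  while (queue.length > 0) {
    const cur = queue.shift();
    if (cur === undefined) break;
    for (const next of adj[cur] || []) {
      if (!seen[next]) {
        seen[next] = true;
        queue.push(next);
      }
    }
  }
  return names.every((n) => seen[n] === true);
}
```

A chain such as `{ a: ["b"], b: ["c"], c: [] }` passes, while `{ a: ["b"], b: [], c: [] }` fails because router `c` has no link to the others.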
Sizing
What drives scaling
There is no single "users per box" number in the source. Scale is driven by three independent pressures:
- Geography → more routers / more cells. Wider or more-fragmented coverage means more geohash-5 cells, and (past the 256-cell cap per ServerToken, or past one box's reach) more routers.
- Redundancy → more routers. Wanting more than one router live for a given area means duplicating its cells onto additional boxes (see Single-site HA) — bounded by the state caveat, not by a config limit.
- Operators / devices → bigger revocation list + Directory load. More principals and devices grow the revocation list every router polls and the token-issuance load on the (single) Directory.
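The first two pressures reduce to simple arithmetic. The sketch below estimates a router count from a cell count and a desired number of live copies. It assumes one ServerToken per router and uses the runbook's ~5 km × ~5 km rule of thumb for area; the function names are hypothetical, and the result says nothing about load, which the source does not model.

```typescript
// Mirrors MAX_COVERAGE_CELLS from the Directory source.
const MAX_COVERAGE_CELLS = 256;

// Rule-of-thumb cell area (~5 km x ~5 km); real cells shrink away from the equator.
const APPROX_CELL_AREA_KM2 = 25;

// Minimum routers needed just to hold the cells, assuming one ServerToken per router.
function minRoutersForCells(cellCount: number): number {
  return Math.ceil(cellCount / MAX_COVERAGE_CELLS);
}

// Routers needed when every cell should be live on `liveCopies` routers.
function routersNeeded(cellCount: number, liveCopies: number): number {
  return minRoutersForCells(cellCount) * liveCopies;
}

// Rough upper estimate of the area covered by a cell list.
function approxAreaKm2(cellCount: number): number {
  return cellCount * APPROX_CELL_AREA_KM2;
}
```

For instance, 600 cells need at least 3 routers on geography alone, and 6 if each area should stay live through the loss of one router.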
The hard, source-defined limits a deployment runs into:
| Limit | Value | Where |
|---|---|---|
| Coverage cells per ServerToken | 256 (MAX_COVERAGE_CELLS) | directory:app/domains/shared/server_token_service.ts |
| ServerToken TTL | default 30 days, max 365 days | directory:app/domains/shared/server_token_service.ts (DEFAULT_SERVER_TOKEN_TTL_DAYS, MAX_SERVER_TOKEN_TTL_DAYS) |
| Chat retention (durable, per-router) | 24 h TTL, swept hourly | server:src/store_ttl.rs (CHAT_TTL_MS) |
| Key-rotation grace period | 1–60 min, default 10 | server:src/config.rs (default_grace_minutes, MAX_GRACE_PERIOD_MINUTES) |
| Revocation-list poll cadence | default 300 s | server:src/config.rs (default_revocation_sync_interval_secs) |
| Max invite size | 65,536 bytes | server:src/invite_handler.rs (MAX_INVITE_BYTES) |
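The limits in this table can be captured as constants and used to sanity-check a planned configuration before rollout. This is a hypothetical pre-flight check, not part of Bedrock: the values come from the table, but `LIMITS`, `Plan`, and `checkPlan` are illustrative names, and whether the real services reject or adjust out-of-range values is not stated here.

```typescript
// Values copied from the limits table; the field names are this sketch's own.
const LIMITS = {
  defaultServerTokenTtlDays: 30,
  maxServerTokenTtlDays: 365,
  minGraceMinutes: 1,
  maxGraceMinutes: 60,
  defaultGraceMinutes: 10,
  maxInviteBytes: 65536,
};

// A planned configuration; anything left undefined is assumed to use the default.
interface Plan {
  serverTokenTtlDays?: number;
  graceMinutes?: number;
  largestInviteBytes?: number;
}

// Returns one message per value that falls outside the documented limits.
function checkPlan(plan: Plan): string[] {
  const problems: string[] = [];
  const ttl = plan.serverTokenTtlDays;
  if (ttl !== undefined && (ttl <= 0 || ttl > LIMITS.maxServerTokenTtlDays)) {
    problems.push(`ServerToken TTL ${ttl} d is outside 1..${LIMITS.maxServerTokenTtlDays} d`);
  }
  const grace = plan.graceMinutes;
  if (grace !== undefined && (grace < LIMITS.minGraceMinutes || grace > LIMITS.maxGraceMinutes)) {
    problems.push(`grace period ${grace} min is outside ${LIMITS.minGraceMinutes}..${LIMITS.maxGraceMinutes} min`);
  }
  const invite = plan.largestInviteBytes;
  if (invite !== undefined && invite > LIMITS.maxInviteBytes) {
    problems.push(`invite of ${invite} bytes exceeds ${LIMITS.maxInviteBytes} bytes`);
  }
  return problems;
}
```

An empty plan passes because every value falls back to its default; a 400-day token TTL or a 70,000-byte invite is reported.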
e2-small is the demo/dev baseline, not a validated production size
The Terraform example environments provision a GCE e2-small (2 vCPU / 2 GB) with a 20 GB pd-balanced disk for both the server and directory modules (infrastructure:terraform/modules/server-environment/, .../directory-environment/). These are demo/dev examples — there is no autoscaling and no load balancer in the modules, and no production sizing has been validated in source. Treat e2-small as a starting point, not a recommendation.
Worked example
A 2-site, ~50-operator deployment: operators split across Europe and the US, with one router per region and a gateway feeding an external system.
Shape:
- 1 Directory: the single identity authority for both sites. Mints all operator/device tokens and signs the one revocation list both routers poll.
- 2 routers, peer-linked:
    - Router EU: coverage cells over the EU operating area.
    - Router US: coverage cells over the US operating area (a disjoint cell list).
    - Each lists the other in transport.connect, forming a 2-node mesh over mTLS gossip.
- 1 gateway: connects to one (or both) routers as a Directory-authenticated service client and bridges to the external system. If it dies, the mesh is unaffected.
- Clients: web and Android operators, each pointed at the router for their region.
How it connects and what federates:
- Before anything, the PKI and the Directory exist, and each box meets its prerequisites — see Before you begin.
- Each router is registered against the Directory with its region's cells — see Add a server (Step 2 sets the coverage cells; Step 3 mints and installs the ServerToken). The EU and US routers get different cell lists.
- An EU operator authenticates against the Directory, receives a token, and connects to Router EU. Their position/heartbeat/drawing traffic is tagged with EU cells and routes through Router EU.
- A US operator does the same against Router US with US cells.
- What federates: global-namespace traffic (waypoint/global/**: chat, voice, channels, liveliness, key-rotation, revocation relay) and any cell-scoped traffic for cells the receiving router covers both relay across the transport.connect hop. Because the cell lists are disjoint, an EU client's cell-scoped (position/drawing) traffic is only delivered on routers that cover EU cells; the cross-region visibility you get is what the global namespace and overlapping coverage carry. Durable chat is held on the router that received it (24 h TTL) and is not replicated to the peer.
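The delivery rule in the last step can be modelled in a few lines. This sketch only mirrors the allow rules described on this page (global namespace plus one rule per coverage cell) using a simplified prefix match; it is not the router's ACL code. The `Router` shape, the function names, the sample cells, and the key suffixes such as `position/op-17` are all illustrative assumptions.

```typescript
interface Router {
  name: string;
  coverageCells: string[];
}

// Simplified stand-in for the allow rules waypoint/global/** and waypoint/<cell>/**.
function routerAllows(router: Router, key: string): boolean {
  const parts = key.split("/");
  if (parts.length < 2 || parts[0] !== "waypoint") return false;
  const scope = parts[1];
  if (scope === undefined) return false;
  return scope === "global" || router.coverageCells.indexOf(scope) !== -1;
}

// Names of the routers on which a message with this key would be delivered.
function deliveredOn(routers: Router[], key: string): string[] {
  return routers.filter((r) => routerAllows(r, key)).map((r) => r.name);
}

// The worked example: two routers with disjoint (illustrative) cell lists.
const exampleRouters: Router[] = [
  { name: "router-eu", coverageCells: ["u09tv", "gcpvj"] },
  { name: "router-us", coverageCells: ["dr5re"] },
];
```

With these routers, a key under `waypoint/u09tv/` is delivered only on `router-eu`, a key under `waypoint/global/` is delivered on both, and a key for a cell neither router covers is delivered nowhere.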
Known limits & gaps
These are the current reality, not future plans. Plan around them; do not assume an HA mechanism the source doesn't implement. The framing is in What needs HA — these are the sharp edges of it.
- Directory is a single point of failure. No replication, standby, or HA in source. Directory down ⇒ no new logins or token issuance (cached tokens keep working, so the live mesh continues). This is the availability priority.
- Per-router durable history is not replicated (by design). Routers are relays; chat (24 h), drawings (indefinite), and channels (indefinite) live in each router's local RocksDB with no replication or backfill, so failover loses the dead router's received history. Acceptable for short-lived tactical traffic; a real limitation if long-lived shared drawings matter.
- No automated failover. Nothing promotes a standby or reroutes clients automatically; client redirection is operator-managed (DNS / static endpoint lists). Live continuity comes from running federated peers, not from HA of one box.
- No autoscaling and no load balancer. The Terraform modules provision fixed single instances; clients use static router endpoints.
- No per-router capacity SLA. Source defines no "clients per router" or throughput model. Capacity must be measured, not read off a number.
- No multi-region cell-assignment tooling. Coverage cells are chosen and assigned manually at enrollment; there is no algorithm or tool to allocate geohash cells across regions.
For where these sit relative to planned work, see Status & Roadmap.
Verified against server@ab688f0, directory@9c5e565, web@80e3ec2, infrastructure@b3849c0.