Wire Protocol

How clients and servers talk over Zenoh. Routing is by key expression. Ground truth is the waypoint_common protos (common/proto/server/*) plus the server router (server/); clients (android, web), gateway, and node all speak this.

For per-message verification (gates, signatures, revocation) see security/model.md; for trust roots and tokens see security/pki.md. This page is the wire shape and routing.

Key namespace

All traffic is rooted at waypoint/. The namespace is cell-first: geocentric topics carry a geohash-5 cell segment right after the root; non-geocentric topics use the literal cell global.

| Key expression | Purpose |
|---|---|
| waypoint/global/auth/<node_id> | Auth handshake (queryable) |
| waypoint/<cell>/pos/<principal_id> | Position updates |
| waypoint/<cell>/heartbeat/<principal_id> | Presence heartbeats |
| waypoint/<cell>/draw/<drawing_id> | Drawing shapes (durable) |
| waypoint/<cell>/draw/del/<drawing_id> | Drawing-delete tombstones |
| waypoint/<cell>/sensor/<sensor_id> | Sensor data (namespace reserved; no server handler yet) |
| waypoint/global/chat/<chat_id>/<msg_id> | Chat messages (durable) |
| waypoint/global/channel/<channel_id> | Channel definition (durable) |
| waypoint/global/channel/del/<channel_id> | Channel-delete tombstone |
| waypoint/global/channel/members/<channel_id> | Channel membership updates |
| waypoint/global/channel/transfer/<channel_id> | Channel ownership-transfer record (durable; in-flight) |
| waypoint/global/invite/<invitee_principal_id>/<channel_id> | Channel invite tokens (durable) |
| waypoint/global/voice/<voice_id>/<principal_id>/<session_id> | Voice datagrams (best-effort) |
| waypoint/global/cmd/<target_node>/<cmd_id> | Point-to-point device commands |
| waypoint/global/ack/<source_node>/<cmd_id> | Command acks (back to issuer) |
| waypoint/global/router/<principal_id>/live | Router liveliness token |
| waypoint/global/router/<principal_id>/token | Router ServerToken gossip |
| waypoint/global/sec/revocations | Directory-signed RevocationList feed |
| waypoint/global/state/<class>/<key> | Generic router-hosted state |

<cell> is a 5-char geohash (base32, 0123456789bcdefghjkmnpqrstuvwxyz). A router serves only the cells in its ServerToken.coverage_cells; traffic on an uncovered cell is denied by the ACL. Wildcards follow key-expression semantics (** = any suffix).
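As an illustration of the cell-first layout, the following minimal sketch validates a geohash-5 cell against the alphabet above and builds a position key. The function names and the rule that a principal id must be non-empty and free of `/` and `*` are assumptions made for this example, not taken from waypoint_common.

```rust
/// Geohash base32 alphabet, as listed in the text above.
const GEOHASH_ALPHABET: &str = "0123456789bcdefghjkmnpqrstuvwxyz";

/// True when `cell` is exactly five characters, all from the geohash alphabet.
/// (Hypothetical helper name.)
fn is_geohash5(cell: &str) -> bool {
    cell.chars().count() == 5 && cell.chars().all(|c| GEOHASH_ALPHABET.contains(c))
}

/// Builds `waypoint/<cell>/pos/<principal_id>`.
/// Returns None for a malformed cell, or for a principal id that is empty or
/// contains a key separator or wildcard (an assumption of this sketch).
fn position_key(cell: &str, principal_id: &str) -> Option<String> {
    let id_ok = !principal_id.is_empty()
        && !principal_id.chars().any(|c| c == '/' || c == '*');
    if is_geohash5(cell) && id_ok {
        Some(format!("waypoint/{}/pos/{}", cell, principal_id))
    } else {
        None
    }
}
```

Note that the literal cell `global` fails `is_geohash5`, which matches the rule that only non-geocentric topics use it.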

Storage / durability (observable behaviour)

| Prefix | Durability |
|---|---|
| waypoint/<cell>/pos/** | In-memory ring, ~10 min |
| waypoint/global/chat/** | Durable; 24 h sweep |
| waypoint/<cell>/draw/** | Durable; pruned on tombstone or coverage exit (no time TTL) |
| waypoint/global/channel/**, waypoint/global/invite/** | Durable until tombstoned |
| waypoint/<cell>/heartbeat/**, waypoint/global/voice/**, .../router/*/live | Transient (none) |

The storage engine + sweep cadence are server-internal (see server/ repo).
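To make the prefix-to-durability mapping concrete, here is a small sketch that classifies a key by its segments. The `Durability` enum and `durability_for` function are hypothetical names for this example, and the router liveliness key is assumed to be the waypoint/global/router/<principal_id>/live shape from the namespace table. The real server may organise this differently.

```rust
/// Durability classes from the table above (names are illustrative).
#[derive(Debug, PartialEq)]
enum Durability {
    MemoryRing,             // in-memory ring, ~10 min
    DurableSwept,           // durable, 24 h sweep
    DurablePruned,          // durable, pruned on tombstone or coverage exit
    DurableUntilTombstoned, // durable until tombstoned
    Transient,              // not stored
    Unlisted,               // prefix not covered by the table
}

/// Maps a concrete key to its durability class by matching key segments.
fn durability_for(key: &str) -> Durability {
    let segments: Vec<&str> = key.split('/').collect();
    match segments.as_slice() {
        ["waypoint", "global", "chat", ..] => Durability::DurableSwept,
        ["waypoint", "global", "channel", ..] | ["waypoint", "global", "invite", ..] => {
            Durability::DurableUntilTombstoned
        }
        ["waypoint", "global", "voice", ..] | ["waypoint", "global", "router", _, "live"] => {
            Durability::Transient
        }
        // Any other global key is outside the table.
        ["waypoint", "global", ..] => Durability::Unlisted,
        // Remaining arms treat the second segment as a geohash cell.
        ["waypoint", _, "pos", ..] => Durability::MemoryRing,
        ["waypoint", _, "draw", ..] => Durability::DurablePruned,
        ["waypoint", _, "heartbeat", ..] => Durability::Transient,
        _ => Durability::Unlisted,
    }
}
```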

AuthEnvelope (the wire wrapper)

Every payload on a transport sample is wrapped in one AuthEnvelope proto (common/proto, canonical verify in common/src/auth_envelope.rs):

AuthEnvelope {
  identity_token      // raw Directory-signed IdentityToken bytes
  payload             // SealedContent blob (see Payload confidentiality below)
  classification      // signed-cleartext classification level (relay-readable, unforgeable)
  owner_principal_id  // signed-cleartext channel owner (set on channel control messages)
  nonce               // 12-byte random nonce
  issued_at_ms        // sender wall-clock at pack time
  device_signature    // 64-byte Ed25519 over the canonical signing input,
                      //   by the token's principal_sign_key
}

There is no nested SignedEnvelope/EncryptedEnvelope. Authenticity and integrity come from the Ed25519 device_signature. verify runs, in order: nonce length 12 → issued_at_ms within ±replay window → IdentityToken decodes + Directory Ed25519 sig valid + not expired → device_signature verifies under principal_sign_key → (principal_id, nonce) not replayed. After verify, the server's gate_verified drops revoked principals, revoked device sign-keys, and tokens with empty device_id, and surfaces max_classification. Full detail in security/model.md.
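The ordering of the verify steps can be sketched as follows. This is not the canonical code in common/src/auth_envelope.rs: the struct, error names, and function are hypothetical, and the two Ed25519 checks are reduced to precomputed booleans so the example stays dependency-free. What it illustrates is that cheap structural checks run first and the nonce is recorded only after both signatures pass.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq)]
enum VerifyError {
    BadNonceLength,
    OutsideWindow,
    BadIdentityToken,
    BadDeviceSignature,
    Replayed,
}

/// Inputs the ordering depends on. The booleans stand in for the real
/// cryptographic checks, which this sketch does not perform.
struct Checked<'a> {
    principal_id: &'a str,
    nonce: &'a [u8],
    issued_at_ms: u64,
    token_ok: bool,      // IdentityToken decodes, Directory sig valid, not expired
    device_sig_ok: bool, // device_signature verifies under principal_sign_key
}

/// Runs the checks in the documented order and stops at the first failure.
fn verify_order(
    env: &Checked<'_>,
    now_ms: u64,
    window_ms: u64,
    seen: &mut HashSet<(String, Vec<u8>)>,
) -> Result<(), VerifyError> {
    if env.nonce.len() != 12 {
        return Err(VerifyError::BadNonceLength);
    }
    if now_ms.abs_diff(env.issued_at_ms) > window_ms {
        return Err(VerifyError::OutsideWindow);
    }
    if !env.token_ok {
        return Err(VerifyError::BadIdentityToken);
    }
    if !env.device_sig_ok {
        return Err(VerifyError::BadDeviceSignature);
    }
    // Record the nonce last, so unauthenticated traffic cannot fill the set.
    if !seen.insert((env.principal_id.to_string(), env.nonce.to_vec())) {
        return Err(VerifyError::Replayed);
    }
    Ok(())
}
```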

Payload confidentiality — server-blind E2E content encryption

Current model: member content — chat, drawings, channel definitions, positions, and voice — is encrypted with AES-256-GCM under the deployment group key before it enters the envelope. The payload field carries a SealedContent blob: version(1)=0x01 | epoch(u32 BE) | nonce(12) | GCM(ciphertext‖tag) (common/src/crypto/). Heartbeats remain plaintext so liveness/presence survives a missing or stale key.
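The SealedContent framing can be split with a few lines of parsing. The sketch below is not the code in common/src/crypto/; the type and function names are hypothetical, and the 16-byte GCM tag length is an assumption (the common AES-GCM default) used only to reject blobs that are too short. No decryption is attempted here.

```rust
#[derive(Debug, PartialEq)]
struct SealedHeader<'a> {
    epoch: u32,
    nonce: [u8; 12],
    ciphertext_and_tag: &'a [u8],
}

#[derive(Debug, PartialEq)]
enum ParseError {
    TooShort,
    UnknownVersion(u8),
}

const VERSION: u8 = 0x01;
const HEADER_LEN: usize = 1 + 4 + 12; // version + epoch + nonce
const GCM_TAG_LEN: usize = 16; // assumption: standard 16-byte tag

/// Splits version | epoch (u32 BE) | nonce (12) | ciphertext‖tag.
fn parse_sealed(blob: &[u8]) -> Result<SealedHeader<'_>, ParseError> {
    if blob.len() < HEADER_LEN + GCM_TAG_LEN {
        return Err(ParseError::TooShort);
    }
    if blob[0] != VERSION {
        return Err(ParseError::UnknownVersion(blob[0]));
    }
    let epoch = u32::from_be_bytes([blob[1], blob[2], blob[3], blob[4]]);
    let mut nonce = [0u8; 12];
    nonce.copy_from_slice(&blob[5..HEADER_LEN]);
    Ok(SealedHeader {
        epoch,
        nonce,
        ciphertext_and_tag: &blob[HEADER_LEN..],
    })
}
```

The parsed epoch is what a receiver would use to pick the group key before attempting to open the ciphertext.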

The router is payload-blind. The classification gate and channel-ownership check read the signed-cleartext classification and owner_principal_id envelope fields — the router never decodes payload. It stores and relays opaque ciphertext. It is group-key-free: the Directory issues the group key to clients only via GET /api/group-key (Bearer auth) and denies it to relay/server principals. The server never receives or holds the group key.

Scope of protection. This blinds the router/host — the central aggregator that relays and stores everyone's content and history. It does not protect against edge-device capture: a captured member leaks its own content and its copy of the deployment-wide group key until revocation + rotation. Edge capture is bounded (one viewpoint) and revocable; tightening it (per-channel keys, rotation-on-loss, forward secrecy) is tracked as follow-up.

Accepted residuals. Metadata visible to the router: principal identities, timing/tempo, key-expression cell / chat_id / msg_id, the cleartext classification level, message sizes. The deployment group key gives every member read access to every channel (no per-channel need-to-know). Not forward-secret beyond rotation.

Implementation status. Implemented in common, server, and Android. Web, gateway, and node clients, and the flag-day wire-breaking fleet cutover, remain outstanding.

Connection / auth handshake

Client                                  Router
  ├── session open (TLS) ──────────────►│  server-cert TLS (clients); mTLS for
  │                                      │  router↔router federation
  ├── get(waypoint/global/auth/**) ─────►│  auth handshake — queryable reply
  │◄── AuthResponse { server_token } ────┤  router's Directory-signed ServerToken
  │                                      │  (max_classification + coverage_cells)
  ├── declare_subscriber(waypoint/...) ─►│
  ├── get(waypoint/global/chat/**) ─────►│  late-join state replay
  │◄── stored AuthEnvelope samples ──────┤
  └── put / declare_publisher(...) ─────►│  normal pub/sub

The handshake is a queryable, not pub/sub: the router declares a queryable on waypoint/global/auth/** and the client issues a get against it. Group keys are not in this reply; clients fetch them from the Directory (GET /api/group-key).

Group-key rotation

The Directory owns rotation and issues group keys to clients only — the server is group-key-free and broadcasts nothing about keys over the mesh. Clients fetch the (current, previous) bundle from GET /api/group-key at login/extend. A client that receives content sealed under an epoch it does not hold refreshes its bundle from the Directory (refresh-on-miss) and drops the triggering frame — the wire frame is never processed into real content. Clients may surface a non-destructive "encrypted · can't display" placeholder for the dropped chat/drawing content (built only from the cleartext key expression and verified sender, replaced if a later refresh + catch-up reopen succeeds) rather than show nothing; see #21 item 3. The two-epoch retention (GroupKeyManager current + previous) lets a client open messages sent across a rotation boundary. IdentityToken.key_epoch records the epoch current at token mint.
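The two-epoch lookup and the refresh-on-miss decision can be sketched as below. This is an illustration, not the actual GroupKeyManager: the `GroupKeys` struct, the `KeyLookup` enum, and the 32-byte key type are assumptions for the example (32 bytes follows from AES-256), and the Directory fetch itself is left out.

```rust
/// Outcome of looking up the key for a frame's epoch.
#[derive(Debug, PartialEq)]
enum KeyLookup<'a> {
    /// The epoch is held; the frame can be opened with this key.
    Found(&'a [u8; 32]),
    /// The epoch is unknown: drop the frame and refresh the bundle.
    RefreshAndDrop,
}

/// Holds the (current, previous) bundle fetched from the Directory.
struct GroupKeys {
    current: (u32, [u8; 32]),
    previous: Option<(u32, [u8; 32])>,
}

impl GroupKeys {
    /// Picks the key matching `epoch`, checking current first, then previous.
    fn key_for(&self, epoch: u32) -> KeyLookup<'_> {
        if self.current.0 == epoch {
            return KeyLookup::Found(&self.current.1);
        }
        match &self.previous {
            Some((e, key)) if *e == epoch => KeyLookup::Found(key),
            _ => KeyLookup::RefreshAndDrop,
        }
    }
}
```

Returning `RefreshAndDrop` for both newer and older unknown epochs keeps the client from ever treating an unopenable frame as content.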

Offline edge. A client that can reach the router but not the Directory tolerates a stale key up to a configurable bound (~7 days); past the bound it fails closed for secure content until it next reaches the Directory. Heartbeats stay plaintext throughout, so presence survives a missing or stale key.

State catch-up

No separate StateSync protocol — a late-joining client issues a get against the durable prefix and the router replays stored samples:

get("waypoint/global/chat/<chat_id>/**")     // chat history
get("waypoint/<cell>/draw/**")                // drawings for a cell
get("waypoint/global/channel/**")             // channel definitions
get("waypoint/global/invite/<own_pid>/**")    // invites addressed to me

The stored AuthEnvelope is re-served verbatim, so the original signer's identity verifies end-to-end on catch-up. Receivers verify these with SkewPolicy::AllowStale (relaxes only the freshness/skew check; device-sig, token-sig, and revocation gates still apply — see security/model.md).
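A late-joining client needs one selector per durable prefix it cares about. The helper below only assembles the selector strings shown above; the function name and its parameters are hypothetical, and issuing the actual get calls through the transport is deliberately left out.

```rust
/// Builds the catch-up selectors for the given chats and cells, plus the
/// channel definitions and the invites addressed to `own_principal_id`.
fn catch_up_selectors(
    chat_ids: &[&str],
    cells: &[&str],
    own_principal_id: &str,
) -> Vec<String> {
    let mut selectors = Vec::new();
    for chat_id in chat_ids {
        selectors.push(format!("waypoint/global/chat/{}/**", chat_id));
    }
    for cell in cells {
        selectors.push(format!("waypoint/{}/draw/**", cell));
    }
    selectors.push("waypoint/global/channel/**".to_string());
    selectors.push(format!("waypoint/global/invite/{}/**", own_principal_id));
    selectors
}
```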

Federation

Router-to-router gossip. Peers list each other in transport.connect; the link is mTLS with the peer cert CN pinned to node_id (server/src/active_peers.rs). Forwarded IdentityTokens are byte-identical across hops, so the Directory signature stays verifiable end-to-end — routers never re-sign identity claims.

The authoritative ceiling rides the Directory-signed ServerToken.max_classification; enforcement is the per-message publish-side PEP server/src/classification_gate.rs (min(sender, server)), not the TLS chain.
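The min(sender, server) rule can be illustrated with an ordered enum. The level names below are placeholders, not the real classification levels, and reading the rule as "a message passes when its classification does not exceed the lower of the two ceilings" is an assumption of this sketch; server/src/classification_gate.rs is the reference.

```rust
/// Placeholder levels, ordered from lowest to highest by declaration order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Level {
    Low,
    Medium,
    High,
}

/// The effective ceiling is the lower of the sender's and the server's maximum.
fn effective_ceiling(sender_max: Level, server_max: Level) -> Level {
    sender_max.min(server_max)
}

/// A message passes the gate when it does not exceed the effective ceiling.
fn gate_allows(message: Level, sender_max: Level, server_max: Level) -> bool {
    message <= effective_ceiling(sender_max, server_max)
}
```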

Channel ownership & transfer

A channel (chat or voice) has exactly one owner, a Directory principal. Ownership is the authority to mutate channel state: edit the def, manage members, delete, and transfer.

How ownership is established and enforced.

  • On create, the owner is the creating principal. It rides as the signed-cleartext AuthEnvelope.owner_principal_id on the def at waypoint/global/channel/<id> (covered by device_signature, so unforgeable; the router never decodes the sealed Chat/Voice payload).
  • The router holds an in-memory owner map, authoritative, rebuilt from durable storage on boot. A channel mutation is accepted only when the verified sender is the current owner or holds an operator identity (IdentityKind::Operator — the operator-authority machine principal, not a human session role).
  • classification is immutable after create; a def re-publish that changes it is dropped. owner_principal_id is likewise not mutable by a plain def re-publish — the owner map ignores a re-claimed owner on an existing channel. Ownership changes only through the transfer record below.

Transfer mechanism (key waypoint/global/channel/transfer/<id>):

  • The record is an AuthEnvelope signed by the current owner, with signed-cleartext owner_principal_id set to the new owner, and a sealed-empty payload (mirrors the delete tombstone). No new proto message and no envelope-field change is needed: the record carries the new owner in the existing owner_principal_id cleartext field and is distinguished from a def by the key shape.
  • The router reconstructs the change from the record alone: new owner = cleartext owner_principal_id; old owner = the verified signer (token principal); channel = key id.
  • Server gate, in order: verify (sig / replay / revocation) → sender == current authoritative owner (else drop) → new owner non-empty and ≠ current owner → set_owner in the owner map + store.put the transfer record durably. On boot the owner map reads defs for the base owner, then overlays the latest transfer record per channel. The def stays byte-intact (still signed by the original creator, end-to-end verifiable on catch-up); the transfer record is the auditable ownership-change trail.
  • Clients learn of a transfer by receiving the record on the **-subscribed channel prefix and update both the channel def's owner and per-member owner flags locally. Initiation is gated on being the current owner (operator identities are machine principals, not exposed in client channel UIs).
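Since the transfer flow is agreed but not yet built, the following is only a sketch of how the owner map and the server gate could fit together. All names are hypothetical. It assumes envelope verification has already passed, reads the second gate step as rejecting a transfer to the current owner, and omits the durable store.put.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum TransferError {
    UnknownChannel,
    NotCurrentOwner,
    EmptyNewOwner,
    AlreadyOwner,
}

/// In-memory map of channel id to owner principal id.
struct OwnerMap {
    owners: HashMap<String, String>,
}

impl OwnerMap {
    /// Boot rebuild: base owners from defs, then the latest transfer per channel.
    fn rebuild(defs: &[(&str, &str)], latest_transfers: &[(&str, &str)]) -> OwnerMap {
        let mut owners = HashMap::new();
        for (channel, owner) in defs {
            owners.insert(channel.to_string(), owner.to_string());
        }
        for (channel, owner) in latest_transfers {
            owners.insert(channel.to_string(), owner.to_string());
        }
        OwnerMap { owners }
    }

    /// Channel mutations are accepted from the current owner or an operator identity.
    fn may_mutate(&self, channel: &str, sender: &str, sender_is_operator: bool) -> bool {
        sender_is_operator || self.owners.get(channel).map(String::as_str) == Some(sender)
    }

    /// Transfer gate: signer must be the current owner; the new owner must be
    /// non-empty and different from the current owner.
    fn apply_transfer(
        &mut self,
        channel: &str,
        verified_signer: &str,
        new_owner: &str,
    ) -> Result<(), TransferError> {
        let current = self
            .owners
            .get(channel)
            .map(String::as_str)
            .ok_or(TransferError::UnknownChannel)?;
        if current != verified_signer {
            return Err(TransferError::NotCurrentOwner);
        }
        if new_owner.is_empty() {
            return Err(TransferError::EmptyNewOwner);
        }
        if new_owner == current {
            return Err(TransferError::AlreadyOwner);
        }
        self.owners.insert(channel.to_string(), new_owner.to_string());
        Ok(())
    }
}
```

Because the def itself is never rewritten, the boot overlay is what makes a past transfer survive a router restart.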

Implementation status. In-flight — agreed protocol, not yet built. Outstanding in every repo: common (transfer key segment), server (classify_key arm + handle_transfer + boot overlay), web and android (publish the transfer record; consume it to update local ownership state — Android already carries the transfer UI + command types as a stub). Tracked in #48.

Replay defence

AuthEnvelope carries a 12-byte nonce + issued_at_ms checked against a sliding window (default 60 s). Outside the window → rejected as stale/future; a repeated (principal_id, nonce) inside the window → rejected. Independent of transport timestamps.
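A sliding-window replay cache along these lines could look like the sketch below. The `ReplayCache` type and its eviction strategy are assumptions for illustration; the real implementation may bound memory differently. Entries older than the window are evicted on each check, which is safe because a replay carrying such a timestamp would already be rejected by the window test.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum ReplayVerdict {
    Fresh,
    OutsideWindow,
    Replayed,
}

/// Remembers (principal_id, nonce) pairs seen within the window.
struct ReplayCache {
    window_ms: u64,
    seen: HashMap<(String, [u8; 12]), u64>,
}

impl ReplayCache {
    fn new(window_ms: u64) -> Self {
        ReplayCache { window_ms, seen: HashMap::new() }
    }

    /// Checks the timestamp window first, then the nonce cache.
    fn check(
        &mut self,
        principal_id: &str,
        nonce: [u8; 12],
        issued_at_ms: u64,
        now_ms: u64,
    ) -> ReplayVerdict {
        if now_ms.abs_diff(issued_at_ms) > self.window_ms {
            return ReplayVerdict::OutsideWindow;
        }
        // Evict entries whose timestamp has aged out of the window.
        let window = self.window_ms;
        self.seen.retain(|_, ts| now_ms.saturating_sub(*ts) <= window);
        match self.seen.entry((principal_id.to_string(), nonce)) {
            Entry::Occupied(_) => ReplayVerdict::Replayed,
            Entry::Vacant(slot) => {
                slot.insert(issued_at_ms);
                ReplayVerdict::Fresh
            }
        }
    }
}
```

The same nonce from a different principal is treated as fresh, since the cache key is the (principal_id, nonce) pair.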

Where to read the code

| Concern | Location |
|---|---|
| Envelope + verify (canonical) | common/src/auth_envelope.rs, common/proto/server/* |
| Key namespace, ACL, storage, handshake | server/ (router) |
| Group-key rotation proto | common/proto/server/keys.proto |
| Classification PEP | server/src/classification_gate.rs |
| Android pack/verify | android/.../core/crypto/AuthEnvelope.kt, core/engine/TacNetUtils.kt |
| Web transport | web/inertia/features/transport/ |