The Architecture of SAMIZDAT
The 30-second mental model: a publisher’s CLI hands content to their
local samizdat-node, which announces it through one or more
samizdat-hub matchmakers; a consumer’s local samizdat-node hears
about the new content from its hubs and pulls the bytes directly from
whichever node still holds them. So:
publisher -> node -> hub -> ... -> hub -> node -> consumer
The hubs handle routing, discovery, and NAT traversal. They never see the actual content: every payload they forward is opaque to them, encrypted with a key derived from a hash that the hubs do not have.
The pages in this section walk through the layers of the data model, bottom-up:
- Objects – the atomic content-addressed unit. A file, split into 256kB chunks, organised as a Merkle tree whose root hash is the object’s identity.
- Collections – an immutable Patricia tree of path -> object mappings. The root hash is the collection’s identity.
- Series – an Ed25519 keypair that lets a publisher issue signed pointers (editions) to successive collections over time.
- Subscriptions – the per-node registration that turns “I care about this series” into eager fetches of new editions.
- Autovacuum – how the node keeps its local cache bounded.
The matchmaking primitive that lets the hub broker queries without learning the content is the riddle; it is introduced in the objects page.
The network of hubs is itself a graph rather than a tree: any hub can federate with any other hub by acting as if it were a node on the other hub’s network. A single query naturally terminates at a default hop depth of 6 because each hop consumes one of the riddles in the query. See the network page for the transport details.
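The termination argument can be illustrated with a toy simulation. The sketch below models hubs as a directed graph and a query as a list of opaque riddle tokens, with each forward consuming one token. The `forward` function, the `links` mapping, and the string placeholders standing in for riddles are illustrative assumptions, not the real transport, which may also deduplicate or rate-limit queries.

```python
from collections import deque

DEFAULT_HOPS = 6  # default hop depth stated in the text


def forward(links: dict[str, list[str]], start: str, riddles: list[str]) -> list[tuple[str, int]]:
    """Return every (hub, hop) visit made while a query spreads through the hubs.

    No loop detection is used on purpose: the query stops only because
    it runs out of riddles, even when the hub graph contains cycles.
    """
    visits = []
    queue = deque([(start, 0, list(riddles))])
    while queue:
        hub, hop, remaining = queue.popleft()
        visits.append((hub, hop))
        if not remaining:
            continue  # nothing left to spend, so the query dies here
        rest = remaining[1:]  # this hop consumes one riddle
        for peer in links.get(hub, []):
            queue.append((peer, hop + 1, rest))
    return visits


# Three hubs federated in a ring: a cycle, not a tree.
ring = {"A": ["B"], "B": ["C"], "C": ["A"]}
visits = forward(ring, "A", ["riddle"] * DEFAULT_HOPS)
print(max(hop for _, hop in visits))  # deepest hop reached: 6
```

Even though the ring would let a query circulate forever, the walk ends after six forwards because the seventh hub visited receives a query with no riddles left.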