Chapter 9: Build Your Own
HLD, LLD, Class Diagrams & Implementation Roadmap
High-Level Design (HLD)
💡 System Overview
PCCL consists of 3 main components communicating over TCP/IP:
- Master Server - Coordination & consensus
- Peer Client - Training & collective operations
- P2P Mesh - Direct data transfer between peers
┌───────────────────────────────────────────────────────────────────────┐
│                        HIGH-LEVEL ARCHITECTURE                        │
└───────────────────────────────────────────────────────────────────────┘
                           ┌─────────────────┐
                           │  MASTER SERVER  │
                           │                 │
                           │  • Membership   │
                           │  • Consensus    │
                           │  • Topology     │
                           └────────┬────────┘
                                    │
                      Control Plane │ (lightweight messages)
                    ────────────────┼────────────────
                                    │
         ┌──────────────────────────┼──────────────────────────┐
         │                          │                          │
         ▼                          ▼                          ▼
┌─────────────────┐        ┌─────────────────┐        ┌─────────────────┐
│   PEER CLIENT   │        │   PEER CLIENT   │        │   PEER CLIENT   │
│                 │        │                 │        │                 │
│ • Model/Grads   │◄──────►│ • Model/Grads   │◄──────►│ • Model/Grads   │
│ • Optimizer     │        │ • Optimizer     │        │ • Optimizer     │
│ • Collectives   │        │ • Collectives   │        │ • Collectives   │
└─────────────────┘        └─────────────────┘        └─────────────────┘
         ▲                          ▲                          ▲
         │                          │                          │
         └──────────────────────────┴──────────────────────────┘
                          Data Plane (P2P Mesh)
                           (heavy tensor data)
Component Responsibilities
| Component | Responsibilities | Communicates With |
|---|---|---|
| Master Server | Track peers, coordinate state transitions, optimize topology | All Peers (control) |
| Peer Client | Hold model weights, run training, execute collectives | Master + Other Peers |
| P2P Connection | Transfer tensor data directly between peers | Peer ↔ Peer only |
Entity Relationship Diagram
┌───────────────────────────────────────────────────────────────────────┐
│                      ENTITY RELATIONSHIP DIAGRAM                      │
└───────────────────────────────────────────────────────────────────────┘
┌──────────────┐    1:N     ┌──────────────┐
│    Master    │───────────►│     Peer     │
├──────────────┤            ├──────────────┤
│ id           │            │ id (rank)    │
│ address      │            │ address      │
│ port         │            │ port         │
│ topology     │            │ phase        │
└──────────────┘            │ state        │
                            │ hash         │
                            └──────┬───────┘
                                   │
                                   │ N:N
                                   ▼
                            ┌──────────────┐
                            │ P2PConnection│
                            ├──────────────┤
                            │ peer_a_id    │
                            │ peer_b_id    │
                            │ bandwidth    │
                            │ latency      │
                            │ socket       │
                            └──────┬───────┘
                                   │
                                   │ 1:N
                                   ▼
                            ┌──────────────┐    N:1     ┌──────────────┐
                            │  Operation   │◄───────────│    Tensor    │
                            ├──────────────┤            ├──────────────┤
                            │ id (tag)     │            │ id           │
                            │ type         │            │ data_ptr     │
                            │ state        │            │ size         │
                            │ participants │            │ dtype        │
                            └──────────────┘            │ device       │
                                                        └──────────────┘
Low-Level Design (LLD)
Phase 1: Core Infrastructure
┌───────────────────────────────────────────────────────────────────────┐
│                     PHASE 1: CLASS DIAGRAM - CORE                     │
└───────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────┐
│            TcpServer            │
├─────────────────────────────────┤
│ - address: String               │
│ - port: u16                     │
│ - listener: TcpListener         │
│ - connections: Vec              │
├─────────────────────────────────┤
│ + bind(addr, port)              │
│ + accept() -> Connection        │
│ + broadcast(msg)                │
│ + shutdown()                    │
└─────────────────────────────────┘
                 ▲
                 │ uses
                 │
┌─────────────────────────────────┐   ┌─────────────────────────────────┐
│             Master              │   │              Peer               │
├─────────────────────────────────┤   ├─────────────────────────────────┤
│ - peers: HashMap                │   │ - id: PeerId                    │
│ - topology: RingTopology        │   │ - phase: ConnectionPhase        │
│ - state: MasterState            │   │ - state: ConnectionState        │
├─────────────────────────────────┤   │ - master_conn: TcpStream        │
│ + register_peer(peer)           │   │ - p2p_conns: Vec                │
│ + accept_peer(peer_id)          │   ├─────────────────────────────────┤
│ + remove_peer(peer_id)          │   │ + connect_master(addr)          │
│ + broadcast_abort(op_id)        │   │ + connect_peer(peer_id)         │
│ + optimize_topology()           │   │ + send_to_master(msg)           │
│ + get_ring_order() -> Vec       │   │ + recv_from_master() -> Msg     │
└─────────────────────────────────┘   └────────────────┬────────────────┘
                                                       │
                                                       │ has
                                                       ▼
                                      ┌─────────────────────────────────┐
                                      │          P2PConnection          │
                                      ├─────────────────────────────────┤
                                      │ - remote_peer_id: PeerId        │
                                      │ - socket: TcpStream             │
                                      │ - send_buffer: Vec              │
                                      │ - recv_buffer: Vec              │
                                      │ - bandwidth: f64                │
                                      ├─────────────────────────────────┤
                                      │ + send(data: &[u8])             │
                                      │ + recv(buf: &mut [u8])          │
                                      │ + measure_bandwidth() -> f64    │
                                      │ + close()                       │
                                      └─────────────────────────────────┘
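As a rough illustration of the `Master` box above, the following sketch keeps a peer registry in a `HashMap` and derives a ring order from the accepted peers. It is a minimal sketch under stated assumptions, not the actual implementation: `PeerId` as a plain `u32`, the `PeerInfo` struct, the `next_id` counter, and the choice to sort the ring by id are all hypothetical, and `register_peer` takes an address and port instead of a peer object to keep the example self-contained.

```rust
use std::collections::HashMap;

/// Assumption: a peer id is a plain integer (the diagram only names the type `PeerId`).
pub type PeerId = u32;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ConnectionPhase {
    Registered,
    Accepted,
}

/// Hypothetical record the master keeps per peer.
#[derive(Debug)]
pub struct PeerInfo {
    pub address: String,
    pub port: u16,
    pub phase: ConnectionPhase,
}

#[derive(Debug, Default)]
pub struct Master {
    peers: HashMap<PeerId, PeerInfo>,
    next_id: PeerId,
}

impl Master {
    pub fn new() -> Self {
        Self::default()
    }

    /// Add a peer in the `Registered` phase and hand back its id.
    pub fn register_peer(&mut self, address: &str, port: u16) -> PeerId {
        let id = self.next_id;
        self.next_id += 1;
        let info = PeerInfo {
            address: address.to_string(),
            port,
            phase: ConnectionPhase::Registered,
        };
        self.peers.insert(id, info);
        id
    }

    /// Promote a registered peer to `Accepted`; returns false for unknown ids.
    pub fn accept_peer(&mut self, id: PeerId) -> bool {
        match self.peers.get_mut(&id) {
            Some(peer) => {
                peer.phase = ConnectionPhase::Accepted;
                true
            }
            None => false,
        }
    }

    /// Drop a peer from the registry; returns false if it was not present.
    pub fn remove_peer(&mut self, id: PeerId) -> bool {
        self.peers.remove(&id).is_some()
    }

    /// Ring order over accepted peers only, sorted by id as a placeholder
    /// for a bandwidth-aware ordering.
    pub fn get_ring_order(&self) -> Vec<PeerId> {
        let mut order: Vec<PeerId> = self
            .peers
            .iter()
            .filter(|(_, p)| p.phase == ConnectionPhase::Accepted)
            .map(|(id, _)| *id)
            .collect();
        order.sort_unstable();
        order
    }
}
```

Keeping registered-but-not-accepted peers out of the ring order means a pending peer can never be scheduled into a collective before the vote passes.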
State Machine Classes
┌───────────────────────────────────────────────────────────────────────┐
│                       STATE ENUMS & TRANSITIONS                       │
└───────────────────────────────────────────────────────────────────────┘
┌─────────────────────┐  ┌─────────────────────┐  ┌─────────────────────┐
│   ConnectionPhase   │  │   ConnectionState   │  │   CollectiveState   │
├─────────────────────┤  ├─────────────────────┤  ├─────────────────────┤
│ • Registered        │  │ • Idle              │  │ • VoteInitiate      │
│ • Accepted          │  │ • VotingAcceptPeers │  │ • Performing        │
└─────────────────────┘  │ • SyncingState      │  │ • VoteComplete      │
                         │ • CollectiveRunning │  └─────────────────────┘
                         └─────────────────────┘
State Transition Rules:
───────────────────────────────────────────────────────────────────────
Registered ──[vote passes]──► Accepted
Idle ──[peers pending]──► VotingAcceptPeers ──[vote done]──► Idle
Idle ──[hash mismatch]──► SyncingState ──[sync done]──► Idle
Idle ──[start collective]──► CollectiveRunning ──[complete]──► Idle
VoteInitiate ──[all agree]──► Performing ──[done]──► VoteComplete ──[all ack]──► (end)
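The transition rules above map naturally onto enums with an explicit transition function, so that any edge not listed is rejected instead of silently accepted. The sketch below is one possible encoding; the `Event` enum and the method names `on_event` and `advance` are hypothetical and simply mirror the bracketed labels in the rules.

```rust
/// Per-peer connection state, mirroring the `ConnectionState` box.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ConnectionState {
    Idle,
    VotingAcceptPeers,
    SyncingState,
    CollectiveRunning,
}

/// Hypothetical event names taken from the bracketed transition labels.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Event {
    PeersPending,
    VoteDone,
    HashMismatch,
    SyncDone,
    StartCollective,
    Complete,
}

impl ConnectionState {
    /// Apply one event; every edge that is not in the rule list is an error.
    pub fn on_event(self, event: Event) -> Result<ConnectionState, String> {
        match (self, event) {
            (Self::Idle, Event::PeersPending) => Ok(Self::VotingAcceptPeers),
            (Self::VotingAcceptPeers, Event::VoteDone) => Ok(Self::Idle),
            (Self::Idle, Event::HashMismatch) => Ok(Self::SyncingState),
            (Self::SyncingState, Event::SyncDone) => Ok(Self::Idle),
            (Self::Idle, Event::StartCollective) => Ok(Self::CollectiveRunning),
            (Self::CollectiveRunning, Event::Complete) => Ok(Self::Idle),
            (state, ev) => Err(format!("illegal transition: {:?} on {:?}", state, ev)),
        }
    }
}

/// Per-operation state: VoteInitiate -> Performing -> VoteComplete -> (end).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CollectiveState {
    VoteInitiate,
    Performing,
    VoteComplete,
}

impl CollectiveState {
    /// Next state in the linear sequence, or `None` once the operation has ended.
    pub fn advance(self) -> Option<CollectiveState> {
        match self {
            Self::VoteInitiate => Some(Self::Performing),
            Self::Performing => Some(Self::VoteComplete),
            Self::VoteComplete => None,
        }
    }
}
```

Because every non-`Idle` state returns only to `Idle`, a peer can be in at most one of voting, syncing, or running a collective at a time under this encoding.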
Phase 2: Collective Operations
┌───────────────────────────────────────────────────────────────────────────┐
│                   PHASE 2: CLASS DIAGRAM - COLLECTIVES                    │
└───────────────────────────────────────────────────────────────────────────┘
                     ┌─────────────────────────────────┐
                     │       CollectiveOperation       │  (abstract)
                     ├─────────────────────────────────┤
                     │ - tag: u32                      │
                     │ - state: CollectiveState        │
                     │ - participants: Vec             │
                     │ - buffer_backup: Option         │
                     ├─────────────────────────────────┤
                     │ + execute() -> Result           │
                     │ + abort()                       │
                     │ + restore_backup()              │
                     └────────────────┬────────────────┘
                                      │
            ┌─────────────────────────┼─────────────────────────┐
            │                         │                         │
            ▼                         ▼                         ▼
┌───────────────────────┐ ┌───────────────────────┐ ┌───────────────────────┐
│       AllReduce       │ │       Broadcast       │ │       AllGather       │
├───────────────────────┤ ├───────────────────────┤ ├───────────────────────┤
│ - reduce_op: ReduceOp │ │ - root: PeerId        │ │ - recv_counts: Vec    │
├───────────────────────┤ ├───────────────────────┤ ├───────────────────────┤
│ + reduce_scatter()    │ │ + execute()           │ │ + execute()           │
│ + all_gather()        │ └───────────────────────┘ └───────────────────────┘
│ + execute()           │
└───────────────────────┘
┌─────────────────────────────────┐   ┌─────────────────────────────────┐
│            ReduceOp             │   │          RingTopology           │
├─────────────────────────────────┤   ├─────────────────────────────────┤
│ • Sum                           │   │ - order: Vec                    │
│ • Avg                           │   │ - bandwidth_matrix: Matrix      │
│ • Min                           │   ├─────────────────────────────────┤
│ • Max                           │   │ + get_next(peer) -> PeerId      │
└─────────────────────────────────┘   │ + get_prev(peer) -> PeerId      │
                                      │ + optimize() -> Vec             │
                                      │ + compute_chunk_bounds(n) -> Vec│
                                      └─────────────────────────────────┘
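The `RingTopology` and `ReduceOp` boxes above can be sketched in a few lines. This is an illustrative version under assumptions: `PeerId` is a plain `u32`, `get_next` and `get_prev` return `Option` so that an unknown peer is not a panic (the diagram shows a bare `PeerId`), chunk bounds are returned as `(start, end)` pairs, and the `combine` helper is a hypothetical name. How `Avg` is finalized is also an assumption: here it accumulates like `Sum` and the caller divides by the peer count afterwards.

```rust
pub type PeerId = u32;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ReduceOp {
    Sum,
    Avg,
    Min,
    Max,
}

impl ReduceOp {
    /// Combine two partial values. `Avg` accumulates as a sum; the caller is
    /// assumed to divide by the number of peers once the reduction is done.
    pub fn combine(self, a: f32, b: f32) -> f32 {
        match self {
            ReduceOp::Sum | ReduceOp::Avg => a + b,
            ReduceOp::Min => a.min(b),
            ReduceOp::Max => a.max(b),
        }
    }
}

pub struct RingTopology {
    order: Vec<PeerId>,
}

impl RingTopology {
    pub fn new(order: Vec<PeerId>) -> Self {
        Self { order }
    }

    fn position(&self, peer: PeerId) -> Option<usize> {
        self.order.iter().position(|&p| p == peer)
    }

    /// Successor of `peer` in the ring, wrapping around at the end.
    pub fn get_next(&self, peer: PeerId) -> Option<PeerId> {
        let i = self.position(peer)?;
        Some(self.order[(i + 1) % self.order.len()])
    }

    /// Predecessor of `peer` in the ring, wrapping around at the start.
    pub fn get_prev(&self, peer: PeerId) -> Option<PeerId> {
        let i = self.position(peer)?;
        let n = self.order.len();
        Some(self.order[(i + n - 1) % n])
    }

    /// Split `n` elements into one contiguous (start, end) range per peer.
    /// The first `n % peers` chunks receive one extra element.
    pub fn compute_chunk_bounds(&self, n: usize) -> Vec<(usize, usize)> {
        let peers = self.order.len();
        if peers == 0 {
            return Vec::new();
        }
        let base = n / peers;
        let extra = n % peers;
        let mut bounds = Vec::with_capacity(peers);
        let mut start = 0;
        for i in 0..peers {
            let len = base + if i < extra { 1 } else { 0 };
            bounds.push((start, start + len));
            start += len;
        }
        bounds
    }
}
```

For a ring `[7, 3, 9]` and a 10-element buffer, this yields chunk ranges `(0, 4)`, `(4, 7)`, `(7, 10)`, so every element belongs to exactly one chunk even when the length is not divisible by the peer count.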
Phase 3: Fault Tolerance
┌───────────────────────────────────────────────────────────────────────┐
│               PHASE 3: CLASS DIAGRAM - FAULT TOLERANCE                │
└───────────────────────────────────────────────────────────────────────┘
                   ┌─────────────────────────────────┐
                   │         FailureDetector         │
                   ├─────────────────────────────────┤
                   │ - heartbeat_interval: Duration  │
                   │ - timeout: Duration             │
                   │ - last_seen: HashMap            │
                   ├─────────────────────────────────┤
                   │ + start_monitoring()            │
                   │ + record_heartbeat(peer_id)     │
                   │ + get_failed_peers() -> Vec     │
                   │ + is_alive(peer_id) -> bool     │
                   └─────────────────────────────────┘
┌─────────────────────────────────┐   ┌─────────────────────────────────┐
│         AbortController         │   │          StateRecovery          │
├─────────────────────────────────┤   ├─────────────────────────────────┤
│ - abort_flag: AtomicBool        │   │ - backup_store: HashMap         │
│ - abort_reason: Option          │   ├─────────────────────────────────┤
├─────────────────────────────────┤   │ + save_backup(op_id, data)      │
│ + signal_abort(reason)          │   │ + restore_backup(op_id) -> Data │
│ + check_abort() -> bool         │   │ + clear_backup(op_id)           │
│ + reset()                       │   │ + has_backup(op_id) -> bool     │
└─────────────────────────────────┘   └─────────────────────────────────┘
                   ┌─────────────────────────────────┐
                   │         SharedStateSync         │
                   ├─────────────────────────────────┤
                   │ - hasher: SimpleHash            │
                   │ - local_hash: u64               │
                   ├─────────────────────────────────┤
                   │ + compute_hash(tensors) -> u64  │
                   │ + compare_hashes(peers) -> Vec  │  (returns outliers)
                   │ + sync_from_peer(peer_id)       │
                   │ + sync_to_peer(peer_id)         │
                   └─────────────────────────────────┘
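The `FailureDetector` and `AbortController` boxes above could look roughly like the sketch below. It makes a few assumptions that are not in the diagram: `PeerId` is a `u32`, the methods take an explicit `now: Instant` (hence the hypothetical `_at` suffix) so the timeout logic can be exercised without sleeping, the background `start_monitoring()` loop is left out, and the `abort_reason` field is omitted so that the controller is a single atomic flag.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicBool, Ordering};
use std::time::{Duration, Instant};

pub type PeerId = u32;

pub struct FailureDetector {
    timeout: Duration,
    last_seen: HashMap<PeerId, Instant>,
}

impl FailureDetector {
    pub fn new(timeout: Duration) -> Self {
        Self { timeout, last_seen: HashMap::new() }
    }

    /// Remember when a heartbeat from `peer` arrived.
    pub fn record_heartbeat_at(&mut self, peer: PeerId, now: Instant) {
        self.last_seen.insert(peer, now);
    }

    /// A peer is alive if it is known and its last heartbeat is within `timeout`.
    pub fn is_alive_at(&self, peer: PeerId, now: Instant) -> bool {
        self.last_seen
            .get(&peer)
            .map_or(false, |&seen| now.saturating_duration_since(seen) <= self.timeout)
    }

    /// All known peers whose last heartbeat is older than `timeout`, sorted by id.
    pub fn get_failed_peers_at(&self, now: Instant) -> Vec<PeerId> {
        let mut failed: Vec<PeerId> = self
            .last_seen
            .keys()
            .copied()
            .filter(|&p| !self.is_alive_at(p, now))
            .collect();
        failed.sort_unstable();
        failed
    }
}

/// Shared abort flag that worker threads can poll between chunk transfers.
pub struct AbortController {
    abort_flag: AtomicBool,
}

impl AbortController {
    pub fn new() -> Self {
        Self { abort_flag: AtomicBool::new(false) }
    }

    pub fn signal_abort(&self) {
        self.abort_flag.store(true, Ordering::SeqCst);
    }

    pub fn check_abort(&self) -> bool {
        self.abort_flag.load(Ordering::SeqCst)
    }

    pub fn reset(&self) {
        self.abort_flag.store(false, Ordering::SeqCst);
    }
}
```

Because all `AbortController` methods take `&self`, the controller can be shared across threads behind an `Arc` without a mutex, which is the usual reason for choosing an `AtomicBool` here.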
Sequence Diagrams
Peer Join Flow
┌───────────────────────────────────────────────────────────────────────┐
│                          SEQUENCE: PEER JOIN                          │
└───────────────────────────────────────────────────────────────────────┘
 NewPeer                 Master               ExistingPeers
    │                       │                       │
    │──── connect() ───────►│                       │
    │                       │                       │
    │◄─── REGISTERED ───────│                       │
    │                       │                       │
    │                       │◄─── vote_accept? ─────│
    │                       │                       │
    │                       │──── vote_accept ─────►│
    │                       │                       │
    │                       │◄─── all_agree ────────│
    │                       │                       │
    │◄─── ACCEPTED ─────────│                       │
    │                       │                       │
    │──────────────── establish P2P ───────────────►│
    │                       │                       │
    │                       │◄─── sync_state ───────│
    │◄─────────────── receive state ────────────────│
    │                       │                       │
    │             READY TO PARTICIPATE              │
    ▼                       ▼                       ▼
All-Reduce Flow
┌───────────────────────────────────────────────────────────────────────┐
│                         SEQUENCE: ALL-REDUCE                          │
└───────────────────────────────────────────────────────────────────────┘
  Peer0           Peer1           Peer2          Master
    │               │               │               │
    │─────────────── vote_initiate ────────────────►│
    │               │               │               │
    │◄───────────────── all_agreed ─────────────────│
    │               │               │               │
    │ backup_buffer │ backup_buffer │ backup_buffer │
    │               │               │               │
    │ ════════ REDUCE-SCATTER (N-1 steps) ═════════ │
    │               │               │               │
    │──── chunk ───►│               │               │
    │               │──── chunk ───►│               │
    │◄───────────── chunk ──────────│               │
    │               │               │               │
    │     (accumulate received chunks locally)      │
    │               │               │               │
    │ ══════════ ALL-GATHER (N-1 steps) ═══════════ │
    │               │               │               │
    │──── chunk ───►│               │               │
    │               │──── chunk ───►│               │
    │◄───────────── chunk ──────────│               │
    │               │               │               │
    │─────────────── vote_complete ────────────────►│
    │               │               │               │
    │◄───────────────── all_acked ──────────────────│
    │               │               │               │
    │ clear_backup  │ clear_backup  │ clear_backup  │
    ▼               ▼               ▼               ▼
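To make the two data-movement phases of the sequence above concrete, the sketch below simulates a ring all-reduce (sum) entirely in memory: each `Vec<f32>` stands for one peer's buffer, and a "send" is a copy into the next peer's buffer. It is an assumption-laden toy rather than the real network path: the voting, backup, and master round trips are omitted, and the function names `chunk_bounds` and `ring_all_reduce_sum` are hypothetical.

```rust
/// Split `len` elements into `peers` contiguous (start, end) ranges.
fn chunk_bounds(len: usize, peers: usize) -> Vec<(usize, usize)> {
    let (base, extra) = (len / peers, len % peers);
    let mut start = 0;
    (0..peers)
        .map(|i| {
            let end = start + base + usize::from(i < extra);
            let range = (start, end);
            start = end;
            range
        })
        .collect()
}

/// In-memory ring all-reduce with sum. `buffers[i]` is peer i's local buffer;
/// on return every buffer holds the element-wise sum over all peers.
pub fn ring_all_reduce_sum(buffers: &mut [Vec<f32>]) {
    let n = buffers.len();
    if n < 2 {
        return;
    }
    let len = buffers[0].len();
    assert!(
        buffers.iter().all(|b| b.len() == len),
        "all peers must hold equally sized buffers"
    );
    let bounds = chunk_bounds(len, n);

    // Reduce-scatter: in step s, peer i sends chunk (i - s) mod n to peer i + 1,
    // which adds it to its own copy. After n - 1 steps peer i owns the fully
    // reduced chunk (i + 1) mod n.
    for step in 0..n - 1 {
        // Snapshot all outgoing chunks first so the sends of one step are simultaneous.
        let sends: Vec<(usize, Vec<f32>)> = (0..n)
            .map(|i| {
                let chunk = (i + n - step) % n;
                let (s, e) = bounds[chunk];
                (chunk, buffers[i][s..e].to_vec())
            })
            .collect();
        for (i, (chunk, data)) in sends.into_iter().enumerate() {
            let (s, _) = bounds[chunk];
            let next = (i + 1) % n;
            for (k, v) in data.into_iter().enumerate() {
                buffers[next][s + k] += v;
            }
        }
    }

    // All-gather: in step s, peer i forwards chunk (i + 1 - s) mod n, which is
    // already fully reduced, and the receiver overwrites its stale copy.
    for step in 0..n - 1 {
        let sends: Vec<(usize, Vec<f32>)> = (0..n)
            .map(|i| {
                let chunk = (i + 1 + n - step) % n;
                let (s, e) = bounds[chunk];
                (chunk, buffers[i][s..e].to_vec())
            })
            .collect();
        for (i, (chunk, data)) in sends.into_iter().enumerate() {
            let (s, e) = bounds[chunk];
            buffers[(i + 1) % n][s..e].copy_from_slice(&data);
        }
    }
}
```

Each peer sends and receives one chunk per step, so the per-peer traffic is about 2 × (N-1)/N of the buffer size regardless of how many peers participate, which is the main appeal of the ring layout.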
Failure Recovery Flow
┌───────────────────────────────────────────────────────────────────────┐
│                  SEQUENCE: FAILURE DURING COLLECTIVE                  │
└───────────────────────────────────────────────────────────────────────┘
  Peer0        Peer1(dies)        Peer2          Master
    │               │               │               │
    │ ══════════ ALL-REDUCE IN PROGRESS ═══════════ │
    │               │               │               │
    │──── chunk ───►│ CRASH! 💀     │               │
    │               X               │               │
    │               X               │               │
    │          (timeout waiting for Peer1)          │
    │               │               │               │
    │─────────────── report_failure ───────────────►│
    │               │               │               │
    │◄──────────────── ABORT signal ────────────────│
    │               │               │◄───── ABORT ──│
    │               │               │               │
    │ restore_backup│               │ restore_backup│
    │               │               │               │
    │               │               │◄─ remove_peer─│
    │◄────────────── topology_update ───────────────│
    │               │               │               │
    │ ════════ RETRY ALL-REDUCE (2 peers) ═════════ │
    │               │               │               │
    │──────────────────────────────►│               │
    │               │               │               │
    │           SUCCESS!            │               │
    ▼                               ▼               ▼
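The recovery path in the sequence above (back up, abort, restore, drop the dead peer, retry) can be expressed as a small retry loop around the collective. The sketch below is only an illustration: `run_with_retry` is a hypothetical helper, the collective itself is passed in as a closure that reports the failed peer id on error, buffers are `Vec<f32>`, and all master communication is omitted.

```rust
use std::collections::HashMap;

pub type PeerId = u32;

/// Backup store keyed by operation id, following the `StateRecovery` box.
#[derive(Default)]
pub struct StateRecovery {
    backup_store: HashMap<u32, Vec<f32>>,
}

impl StateRecovery {
    pub fn save_backup(&mut self, op_id: u32, data: &[f32]) {
        self.backup_store.insert(op_id, data.to_vec());
    }
    pub fn restore_backup(&self, op_id: u32) -> Option<Vec<f32>> {
        self.backup_store.get(&op_id).cloned()
    }
    pub fn clear_backup(&mut self, op_id: u32) {
        self.backup_store.remove(&op_id);
    }
    pub fn has_backup(&self, op_id: u32) -> bool {
        self.backup_store.contains_key(&op_id)
    }
}

/// Run `attempt` until it succeeds. On failure the buffer is restored from the
/// backup, the reported peer is removed from `participants`, and the operation
/// is retried with the survivors. Returns the number of attempts made.
pub fn run_with_retry<F>(
    op_id: u32,
    buffer: &mut Vec<f32>,
    participants: &mut Vec<PeerId>,
    recovery: &mut StateRecovery,
    mut attempt: F,
) -> Result<usize, String>
where
    F: FnMut(&mut Vec<f32>, &[PeerId]) -> Result<(), PeerId>,
{
    recovery.save_backup(op_id, buffer.as_slice());
    let mut attempts = 0;
    loop {
        attempts += 1;
        match attempt(&mut *buffer, participants.as_slice()) {
            Ok(()) => {
                // Success: the backup is no longer needed.
                recovery.clear_backup(op_id);
                return Ok(attempts);
            }
            Err(dead) => {
                // Abort path: undo any partial accumulation.
                *buffer = recovery
                    .restore_backup(op_id)
                    .ok_or_else(|| "missing backup".to_string())?;
                let before = participants.len();
                participants.retain(|&p| p != dead);
                if participants.len() == before {
                    return Err(format!("peer {} is not a participant", dead));
                }
                if participants.len() < 2 {
                    return Err("not enough peers left to retry".to_string());
                }
            }
        }
    }
}
```

Restoring before retrying matters because a ring all-reduce mutates the buffer in place: after a mid-operation crash, some chunks already contain partial sums that must not be reduced a second time.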
Language Showdown
Recommended Stack
| Component | Language | Why |
|---|---|---|
| Core Library | Rust or C++ | Performance, memory control |
| Master Node | Go or Rust | Concurrency, simplicity |
| Python Bindings | PyO3 / pybind11 | PyTorch integration |
Implementation Checklist
Phase 1: Core Infrastructure
- ☐ TcpServer / TcpClient classes
- ☐ Master with peer registry
- ☐ Peer with master connection
- ☐ P2PConnection class
- ☐ ConnectionPhase enum
- ☐ ConnectionState enum
- ☐ Message serialization (protobuf/msgpack)
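Whichever serialization format is chosen for the last item, TCP delivers a byte stream, so messages need framing on top of it. The sketch below shows one common approach, a one-byte message type followed by a 4-byte big-endian length prefix. The layout, the `MAX_FRAME_LEN` limit, and the function names are assumptions for illustration and not a wire format taken from this chapter; the payload bytes would be whatever protobuf or msgpack produces.

```rust
use std::io::{self, Read, Write};

/// Assumed upper bound on a single control message, to reject corrupt lengths early.
pub const MAX_FRAME_LEN: usize = 16 * 1024 * 1024;

/// Write one frame: [type: u8][len: u32 big-endian][payload bytes].
pub fn write_frame<W: Write>(w: &mut W, msg_type: u8, payload: &[u8]) -> io::Result<()> {
    if payload.len() > MAX_FRAME_LEN {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "payload too large"));
    }
    let len = payload.len() as u32;
    w.write_all(&[msg_type])?;
    w.write_all(&len.to_be_bytes())?;
    w.write_all(payload)
}

/// Read one frame written by `write_frame`; blocks until the whole frame has arrived.
pub fn read_frame<R: Read>(r: &mut R) -> io::Result<(u8, Vec<u8>)> {
    let mut header = [0u8; 5];
    r.read_exact(&mut header)?;
    let len = u32::from_be_bytes([header[1], header[2], header[3], header[4]]) as usize;
    if len > MAX_FRAME_LEN {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame too large"));
    }
    let mut payload = vec![0u8; len];
    r.read_exact(&mut payload)?;
    Ok((header[0], payload))
}
```

Both functions are generic over `Read` and `Write`, so the same code works on a `TcpStream` in production and on an in-memory `Vec<u8>` or `Cursor` in tests.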
Phase 2: Collective Operations
- ☐ RingTopology class
- ☐ CollectiveOperation base class
- ☐ AllReduce (reduce-scatter + all-gather)
- ☐ ReduceOp enum (Sum, Avg, Min, Max)
- ☐ CollectiveState enum
- ☐ Async operation support
- ☐ Multiple concurrent operations
Phase 3: Fault Tolerance
- ☐ FailureDetector (heartbeat-based)
- ☐ AbortController (atomic flag)
- ☐ StateRecovery (backup/restore)
- ☐ SharedStateSync (hash comparison)
- ☐ SimpleHash implementation
- ☐ Graceful peer removal
- ☐ Operation retry logic
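For the hash-comparison item, the idea from the `SharedStateSync` class is that every peer hashes its shared state and the peers whose hash disagrees with the majority are the outliers that need a resync. The sketch below uses 64-bit FNV-1a purely as a stand-in hash; it is not the SimpleHash algorithm named in the checklist, and the function signatures (tensors as `Vec<f32>`, hashes as `(PeerId, u64)` pairs) are assumptions.

```rust
use std::collections::HashMap;

pub type PeerId = u32;

/// 64-bit FNV-1a over the little-endian bytes of every element.
/// Stand-in only: a real implementation would use the project's own hash.
pub fn compute_hash(tensors: &[Vec<f32>]) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf29ce484222325;
    const PRIME: u64 = 0x100000001b3;
    let mut h = OFFSET_BASIS;
    for tensor in tensors {
        for value in tensor {
            for byte in value.to_le_bytes() {
                h ^= byte as u64;
                h = h.wrapping_mul(PRIME);
            }
        }
    }
    h
}

/// Return the peers whose hash differs from the most common hash.
/// Ties between equally common hashes are broken by the smaller hash value
/// so that every peer reaches the same verdict.
pub fn compare_hashes(hashes: &[(PeerId, u64)]) -> Vec<PeerId> {
    let mut counts: HashMap<u64, usize> = HashMap::new();
    for &(_, h) in hashes {
        *counts.entry(h).or_insert(0) += 1;
    }
    let majority = match counts
        .iter()
        .max_by_key(|&(&h, &c)| (c, std::cmp::Reverse(h)))
    {
        Some((&h, _)) => h,
        None => return Vec::new(),
    };
    hashes
        .iter()
        .filter(|&&(_, h)| h != majority)
        .map(|&(p, _)| p)
        .collect()
}
```

Comparing 8-byte hashes over the control plane keeps the common case cheap: full tensor data only has to cross the data plane for the peers that turn out to be outliers.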
Phase 4: Optimization
- ☐ Bandwidth measurement
- ☐ ATSP solver for ring order
- ☐ Connection pooling
- ☐ Quantization support
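Choosing the ring order from measured bandwidths is an asymmetric travelling salesman problem (ATSP): find the directed cycle through all peers with the lowest total link cost. The sketch below is an exhaustive branch-and-bound search, which is only practical for a handful of peers; a real solver would use a heuristic for larger groups. The cost definition (for example `1.0 / bandwidth` per directed link), the non-negative cost assumption, and the function names are all hypothetical.

```rust
/// Total cost of visiting `order` as a closed ring.
fn ring_cost(cost: &[Vec<f64>], order: &[usize]) -> f64 {
    let n = order.len();
    (0..n).map(|i| cost[order[i]][order[(i + 1) % n]]).sum()
}

/// Depth-first search over all orders that start at peer 0.
/// Prunes branches whose partial cost already matches or exceeds the best ring
/// found so far (valid because costs are assumed to be non-negative).
fn search(
    cost: &[Vec<f64>],
    current: &mut Vec<usize>,
    used: &mut Vec<bool>,
    partial: f64,
    best_order: &mut Vec<usize>,
    best_cost: &mut f64,
) {
    if partial >= *best_cost {
        return;
    }
    let n = cost.len();
    let last = *current.last().unwrap();
    if current.len() == n {
        // Close the ring by returning to the first peer.
        let total = partial + cost[last][current[0]];
        if total < *best_cost {
            *best_cost = total;
            *best_order = current.clone();
        }
        return;
    }
    for next in 1..n {
        if used[next] {
            continue;
        }
        used[next] = true;
        current.push(next);
        search(cost, current, used, partial + cost[last][next], best_order, best_cost);
        current.pop();
        used[next] = false;
    }
}

/// `cost[i][j]` is the cost of the directed link i -> j, e.g. 1.0 / bandwidth.
/// Returns the cheapest ring (as indices, starting at peer 0) and its total cost.
pub fn optimize_ring(cost: &[Vec<f64>]) -> (Vec<usize>, f64) {
    let n = cost.len();
    if n == 0 {
        return (Vec::new(), 0.0);
    }
    // Start from the identity order as the initial upper bound.
    let mut best_order: Vec<usize> = (0..n).collect();
    let mut best_cost = ring_cost(cost, &best_order);
    let mut current = vec![0usize];
    let mut used = vec![false; n];
    used[0] = true;
    search(cost, &mut current, &mut used, 0.0, &mut best_order, &mut best_cost);
    (best_order, best_cost)
}
```

Fixing peer 0 as the starting point removes rotations of the same ring from the search, which cuts the number of candidate orders from N! to (N-1)!.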
Phase 5: Integration
- ☐ Python bindings (PyO3/pybind11)
- ☐ PyTorch tensor support
- ☐ CUDA memory handling
- ☐ Stress testing