@Alberto I have some additional information for you, let me know if it makes sense
@ugur please check that I don't mess up the RFC understanding
In short, aside from the network itself, we need to measure how the protocol affects the synchronisation time and how the protocol parameters depend on it
When a message travels through de-MLS, it doesn't just hop between Waku nodes. It goes through MLS encryption (ratchet tree operations that scale as O(log₂ n)), consensus voting where every member sends an individually encrypted vote, freeze rounds where commit candidates need to reach 2n/3 of members before deterministic selection can work, and so on. All of this adds time on top of the pure network propagation that the baseline experiment captures.
So even if Waku delivers a message to 2n/3 nodes in 3 seconds, the protocol might need 8 seconds end-to-end for everyone to actually be "synchronized" in the de-MLS sense, i.e. they've decrypted, voted, and converged on the same state.
Simplified flow:
1. Member proposes "add Alice"
2. Proposal MLS-encrypted, broadcast via Waku
3. Every member decrypts and votes (n vote messages)
4. Consensus reached (2n/3 votes tallied) ← 2n/3 must sync here
5. Steward batches approved proposals β CommitCandidate
6. CommitCandidate broadcast via Waku
7. ─── FREEZE WINDOW (Δ) ─── ← 2n/3 must sync here
8. Deterministic selection → MLS tree update → new epoch
The baseline experiment covers steps 6-7. The full protocol also needs time for steps 2-4 (consensus) and 8 (crypto). The connection:
protocol Δ >= network Δ + crypto overhead + consensus time + margin
                  │              │                │
              baseline       measurement A    measurement B
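The inequality can be sketched as a quick sanity check. The function name and all numbers below are placeholders I made up, not real measurements; the point is just how the components from the baseline and measurements A/B would combine:

```python
# Hypothetical sanity check for the protocol-Δ inequality above.
# All timings are placeholders until the actual benchmarks exist.

def min_protocol_delta(network_delta, crypto_overhead, consensus_time, margin=1.0):
    """Lower bound (seconds) on the freeze window the protocol must allow."""
    return network_delta + crypto_overhead + consensus_time + margin

# e.g. Waku reaches 2n/3 nodes in 3s (baseline), crypto adds 0.5s (A),
# consensus takes 3s (B), plus a 1s safety margin:
print(min_protocol_delta(3.0, 0.5, 3.0))  # 7.5
```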
Proposed additional measurements

|   | What | Why | How it connects to 2n/3 |
|---|------|-----|-------------------------|
| A | MLS crypto timing per operation | CPU floor at each group size | Added on top of every network Δ |
| B | Consensus convergence time | n vote messages → 2n/3 agreement | Must finish before freeze even starts |
| C | Full epoch round-trip | Proposal → commit → freeze → sync | The real end-to-end Δ |
| D | Join burst throughput | Many joins at once, peak load | Stress test of the assumption |
| E | Emergency proposal propagation | High-priority message to 2n/3 | Time-critical protocol events |
| F | Network partition recovery | What happens when the assumption breaks | Failure mode analysis |
The batch size problem: how many proposals can one epoch handle?
This is something we need to figure out from the benchmarks. The steward computes MLS tree updates sequentially: each add or remove is a tree operation. If we allow too many proposals in one epoch, the steward spends most of the epoch just computing, and the group is effectively blocked.
Here's the problem visualized (with rough illustrative numbers):
epoch = 30 seconds

Scenario A: 5 proposals per epoch (comfortable)
├── consensus voting ──── 5s ───┤
├── steward computes ── 0.5s ───┤
├── freeze window Δ ──── 15s ───┤
├── idle/chat ───────── 9.5s ───┤  ← group works normally
✓ OK

Scenario B: 200 proposals per epoch (problem!)
├── consensus voting ─── 10s ───┤
├── steward computes ─── 20s ───┤  ← tree ops for 200 members
├── freeze window Δ ──── 15s ───┤
│                               │
└── total: 45s > 30s epoch ─────┘  ✗ DOESN'T FIT
✗ Group is stuck
So we probably need a max batch size per epoch: the steward takes the first k approved proposals, commits those, and defers the rest to the next epoch. That limit depends on:
k_max = (epoch_budget - consensus_time - freeze_duration - margin) / T_per_tree_op(n)
Where T_per_tree_op(n) comes directly from measurement A. For example, if at n=500 each tree operation takes ~5ms, and we have 10 seconds of epoch budget after consensus and freeze:
k_max = 10s / 5ms = 2000 → probably fine for adds at this size
But if the tree operation takes 50ms at n=1000 with larger payloads:
k_max = 10s / 50ms = 200 → still okay, but getting tighter
The real concern is that tree operations might not scale linearly: if there's overhead that compounds with batch size, the limit could be much lower. That's exactly what measurement A should reveal.
This matters for the join burst test (measurement D): if 200 people try to join at once but k_max is 50, then the first 50 join in epoch E, the next 50 in epoch E+1, and so on, spreading the load across 4 epochs.
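The k_max formula and the burst spreading can be sketched together. The function names are mine, and the timings are the same placeholder numbers as above; I work in integer milliseconds so the division is exact:

```python
import math

def k_max(epoch_ms, consensus_ms, freeze_ms, margin_ms, tree_op_ms):
    """Max proposals the steward can commit in one epoch,
    given the per-tree-operation cost from measurement A."""
    budget = epoch_ms - consensus_ms - freeze_ms - margin_ms
    return max(0, budget // tree_op_ms)

def epochs_needed(joins, kmax):
    """How many epochs a join burst spreads across."""
    return math.ceil(joins / kmax)

# 30s epoch, 5s consensus, 15s freeze, no margin → 10s compute budget
print(k_max(30_000, 5_000, 15_000, 0, 5))    # 2000 (n=500, ~5ms/op)
print(k_max(30_000, 5_000, 15_000, 0, 50))   # 200  (n=1000, ~50ms/op)
print(epochs_needed(200, 50))                # 4    (200 joins, k_max=50)
```

Note this assumes per-operation cost is constant within a batch; if measurement A shows compounding overhead, `tree_op_ms` would itself grow with k and the budget check becomes nonlinear.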
The question becomes: is the user experience acceptable? And does the deferred queue work correctly? @ugur
What makes it worse at scale
Three factors compound as the group grows:
- Tree depth increases: MLS tree operations are O(log₂ n). At n=1000, each add/remove touches ~10 tree levels instead of ~7 at n=100. Individual operations get slower.
- CommitCandidate size grows: k proposals × MLS proposal bytes + 1 commit. A 100-proposal candidate could be tens of KB. Larger messages take longer to propagate through Waku, eating into the freeze window.
- Welcome messages grow linearly: each new joiner gets a Welcome containing the full ratchet tree (~350KB at n=1000). If 50 people join in one batch, that's ~17MB of Welcome traffic from the steward alone.
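The Welcome traffic figure is just a multiplication, but worth having as a reusable check. This is a sketch under the assumption stated above, i.e. that each Welcome carries the full ratchet tree; the function name is mine:

```python
def welcome_traffic_bytes(joiners, ratchet_tree_bytes):
    """Total Welcome bytes the steward sends for one batch of joins,
    assuming each Welcome carries the full ratchet tree."""
    return joiners * ratchet_tree_bytes

# ~350KB tree at n=1000, 50 joiners in one batch
total = welcome_traffic_bytes(50, 350 * 1024)
print(total // (1024 * 1024))  # 17 (MB), matching the ~17MB figure above
```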
Timer tuning: finding the minimum epoch
All protocol timers derive from epoch_duration:
epoch_duration (default 30s)
├── freeze_duration = epoch / 2   ← this IS the protocol Δ
├── consensus_timeout = 15s       ← must finish before freeze
└── join_timeout ≈ 2 × epoch
We want to find the minimum epoch that works for each group size. So we'd test:
- Group sizes: n ∈ {200, 500, 750, 1000} (same as original experiment)
- Epoch durations: {10, 15, 30, 60}s (which gives freeze Δ of {5, 7.5, 15, 30}s)
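A small sketch of the timer derivation, under my reading that consensus has to fit in the non-frozen half of the epoch (since freeze = epoch/2 and consensus must finish before the freeze). The function name is mine; it also surfaces that the hardcoded 15s consensus timeout may not fit the shorter epochs in the test grid:

```python
def derived_timers(epoch_s):
    """Timers derived from epoch_duration, per the tree above.
    consensus_timeout stays at the hardcoded 15s for comparison."""
    return {
        "freeze": epoch_s / 2,       # this IS the protocol Δ
        "consensus_timeout": 15.0,   # hardcoded today
        "join_timeout": 2 * epoch_s,
    }

for epoch in (10, 15, 30, 60):
    t = derived_timers(epoch)
    # does consensus fit in the non-frozen half of the epoch?
    fits = t["consensus_timeout"] <= epoch - t["freeze"]
    print(f"epoch={epoch}s freeze={t['freeze']}s consensus_fits={fits}")
```

Under these assumptions only the 30s and 60s epochs accommodate the 15s consensus timeout, which feeds directly into the configurability question below.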
Questions for discussion
- Does extending the experiment this way make sense alongside the network-level tests?
- For join burst: what's a realistic "spike" size? 10 joins? 100? And should we cap the batch size per epoch based on measurement A results?
- The Welcome message grows linearly with group size (~350KB at n=1000). Should we measure whether Waku handles messages that large reliably?
- The consensus timeout is hardcoded at 15s. If benchmarks show it needs more at large n, should it become configurable?
- For the batch size limit: should excess proposals be deferred automatically to the next epoch, or should there be backpressure on proposal creation (e.g., reject new proposals when the queue is full)?