Experiment Scope for Testing the de-MLS Synchronization Assumption

TL;DR: Since the assumption that at least (2n/3) members see a message within \Delta seconds is critical for the de-MLS RFC, we propose measuring it experimentally for selected network sizes and \Delta values.

Assumption

We assume that at least (2n/3) of the members become synchronized within \Delta time.

Definition of Synchronization

For this experiment, we define synchronization as follows:

A message generated by one node is considered synchronized if it is seen by at least (2n/3) members within \Delta seconds.

In other words, after a node produces a message at time (t_0), we check whether at least (2n/3) members have seen that message by time (t_0 + \Delta).
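In code form, this check is a small predicate (a sketch; `is_synchronized` and the `seen_times` timestamp list are hypothetical names, with the threshold taken as ceil(2n/3)):

```python
import math

def is_synchronized(seen_times, t0, delta, n):
    # seen_times: receive timestamp (seconds) per member that saw the message.
    # The message is synchronized if at least ceil(2n/3) of the n members
    # saw it no later than t0 + delta.
    seen_in_time = sum(1 for t in seen_times if t <= t0 + delta)
    return seen_in_time >= math.ceil(2 * n / 3)
```

With n = 3 the threshold is 2, so two timely receipts suffice; at n = 1000 it is 667.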

Goal

The goal is to evaluate whether the network satisfies the above synchronization assumption under different network sizes and time bounds.

Proposed Parameters

  • n \in {200, 500, 750, 1000}
  • \Delta \in {3, 5, 7, 10} seconds

Experiment Outline

For each (n, \Delta) configuration:

  1. Initialize a network with (n) Waku/Logos messaging nodes.
  2. Select one node to generate a message.
  3. Broadcast the message at time (t_0).
  4. Measure how many nodes have seen the message by time (t_0 + \Delta).
  5. Check whether the message reached at least (2n/3) members within the time bound.
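The outline above can be sketched as a measurement loop over the parameter grid (a sketch: `broadcast` is a hypothetical stand-in for the real Waku harness, stubbed here with random latencies):

```python
import itertools
import random

NS = [200, 500, 750, 1000]
DELTAS = [3, 5, 7, 10]  # seconds

def broadcast(n):
    # Stub for the real measurement harness: inject one message into an
    # n-node network and collect per-node receive timestamps.
    t0 = 0.0
    return t0, [t0 + random.expovariate(1 / 2.0) for _ in range(n)]

def run_config(n, delta):
    t0, receive_times = broadcast(n)
    seen = sum(1 for t in receive_times if t - t0 <= delta)
    return {
        "n": n,
        "delta": delta,
        "fraction_seen": seen / n,
        "threshold_met": 3 * seen >= 2 * n,  # seen >= 2n/3, integer-safe
    }

results = [run_config(n, d) for n, d in itertools.product(NS, DELTAS)]
```

Each result dict maps directly onto the "Core Output" items: the fraction seen and whether the 2n/3 threshold was reached.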

Core Output

For each configuration, the experiment should report:

  • the fraction of nodes that saw the message within \Delta,
  • whether the (2n/3) threshold was reached,
  • the observed dissemination behavior across the tested parameter sets.

Discussion

Does this kind of experiment make sense? Any suggestions on what we could add or remove to make it more useful?


I think this setup is fairly easy to replicate and measure. Also very well defined, thanks.

We were just about to start with the new version v0.38.0.rc-1. Do you want us to measure this using just logos-messaging (waku)?

Sure, but I'm just wondering: what else would there be?

@Alberto I have some additional information for you, let me know if it makes sense

@ugur please check that I haven't messed up the RFC understanding

In short, aside from the network, we need to measure how the protocol itself affects the synchronisation time, and how the protocol parameters should depend on it.

When a message travels through de-MLS, it doesn’t just hop between Waku nodes. It goes through MLS encryption (ratchet tree operations that scale with O(log₂ n)), consensus voting where every member sends an individually encrypted vote, freeze rounds where commit candidates need to reach 2n/3 before deterministic selection can work, and so on. All of this adds time on top of the pure network propagation that the baseline experiment captures.

So even if Waku delivers a message to 2n/3 nodes in 3 seconds, the protocol might need 8 seconds end-to-end for everyone to actually be “synchronized” in the de-MLS sense, meaning they’ve decrypted, voted, and converged on the same state.

Simplified flow:


1. Member proposes "add Alice"

2. Proposal MLS-encrypted, broadcast via Waku

3. Every member decrypts and votes (n vote messages)

4. Consensus reached (2n/3 votes tallied)  ← 2n/3 must sync here

5. Steward batches approved proposals → CommitCandidate

6. CommitCandidate broadcast via Waku

7. ═══ FREEZE WINDOW (Δ) ═══  ← 2n/3 must sync here

8. Deterministic selection → MLS tree update → new epoch
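The 2n/3 checks at steps 4 and 7 can be done with integer arithmetic to avoid float thresholds (a sketch; `votes_for` is a hypothetical tally counter):

```python
def supermajority_reached(votes_for, n):
    # True once at least 2n/3 members are counted; 3*votes >= 2*n avoids floats.
    return 3 * votes_for >= 2 * n
```

At n = 1000 this fires at 667 votes (3 × 667 = 2001 ≥ 2000), not at 666.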

The baseline experiment covers step 6→7. The full protocol also needs time for steps 2→4 (consensus) and 8 (crypto). The connection:

protocol Δ  >=  network Δ  +  crypto overhead  +  consensus time  +  margin

                    ↑               ↑                    ↑

               Δ baseline    measurement A        measurement B
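As a worked instance of that inequality (the component split here is invented, chosen only to be consistent with the 3 s network / 8 s end-to-end example above):

```python
def protocol_delta_lower_bound(network_delta, crypto_overhead, consensus_time, margin):
    # All values in seconds; returns the minimum protocol-level Delta.
    return network_delta + crypto_overhead + consensus_time + margin

# Hypothetical split: 3 s network Delta + 1.5 s crypto + 2.5 s consensus + 1 s margin
bound = protocol_delta_lower_bound(3.0, 1.5, 2.5, 1.0)  # -> 8.0
```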

Proposed additional measurements

    What                              Why                                   How it connects to 2n/3
A   MLS crypto timing per operation   CPU floor at each group size          Added on top of every network Δ
B   Consensus convergence time        n vote messages → 2n/3 agreement      Must finish before freeze even starts
C   Full epoch round-trip             Proposal → commit → freeze → sync     The real end-to-end Δ
D   Join burst throughput             Many joins at once, peak load         Stress test of the assumption
E   Emergency proposal propagation    High-priority message to 2n/3         Time-critical protocol events
F   Network partition recovery        What happens when assumption breaks   Failure mode analysis

The batch size problem: how many proposals can one epoch handle?

This is something we need to figure out from the benchmarks. The steward computes MLS tree updates sequentially: each add or remove is a tree operation. If we allow too many proposals in one epoch, the steward spends most of the epoch just computing, and the group is effectively blocked.

Here’s the problem visualized (with some rough, illustrative numbers):

epoch = 30 seconds

Scenario A: 5 proposals per epoch (comfortable)
├── consensus voting ──── 5s ───
├── steward computes ── 0.5s ───
├── freeze window Δ ─── 15s ────
├── idle/chat ─────── 9.5s ─────  ← group works normally
                                   ✓ OK

Scenario B: 200 proposals per epoch (problem!)
├── consensus voting ── 10s ────
├── steward computes ── 20s ────  ← tree ops for 200 members
├── freeze window Δ ─── 15s ────
│                              │
└── total: 45s > 30s epoch ────┘  ← DOESN'T FIT
                                   ✗ Group is stuck

So we probably need a max batch size per epoch: the steward takes the first k approved proposals, commits those, and defers the rest to the next epoch. That limit depends on:

k_max = (epoch_budget - consensus_time - freeze_duration - margin) / T_per_tree_op(n)

Where T_per_tree_op(n) comes directly from measurement A. For example, if at n=500 each tree operation takes ~5ms, and we have 10 seconds of epoch budget after consensus and freeze:

k_max = 10s / 5ms = 2000  ← probably fine for adds at this size

But if the tree operation takes 50ms at n=1000 with larger payloads:

k_max = 10s / 50ms = 200  ← still okay, but getting tighter

The real concern is that tree operations might not scale linearly: if there’s overhead that compounds with batch size, the limit could be much lower. That’s exactly what measurement A should reveal.
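The two worked examples above can be reproduced with a small helper (a sketch; it works in milliseconds to keep the division exact, and assumes the linear per-operation cost model that measurement A still needs to confirm):

```python
def k_max(epoch_budget_ms, consensus_ms, freeze_ms, margin_ms, per_op_ms):
    # Max proposals per epoch given the time left after consensus,
    # freeze window, and safety margin are subtracted.
    headroom = epoch_budget_ms - consensus_ms - freeze_ms - margin_ms
    if per_op_ms <= 0 or headroom <= 0:
        return 0
    return headroom // per_op_ms

# 10 s of headroom after consensus/freeze/margin, as in the examples above:
k_max(30_000, 10_000, 5_000, 5_000, 5)   # 5 ms/op  -> 2000
k_max(30_000, 10_000, 5_000, 5_000, 50)  # 50 ms/op -> 200
```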

This matters for the join burst test (measurement D): if 200 people try to join at once but k_max is 50, then the first 50 join in epoch E, the next 50 in epoch E+1, and so on, spreading the load across 4 epochs.
The question becomes: is the user experience acceptable? And does the deferred queue work correctly? @ugur

What makes it worse at scale

Three factors compound as the group grows:

  1. Tree depth increases: MLS tree operations are O(log₂ n). At n=1000, each add/remove touches ~10 tree levels instead of ~7 at n=100. Individual operations get slower.

  2. CommitCandidate size grows: k proposals × MLS proposal bytes + 1 commit. A 100-proposal candidate could be tens of KB. Larger messages take longer to propagate through Waku, eating into the freeze window.

  3. Welcome messages grow linearly: each new joiner gets a Welcome containing the full ratchet tree (~350KB at n=1000). If 50 people join in one batch, that’s ~17MB of Welcome traffic from the steward alone.
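The three factors can be sanity-checked with quick arithmetic (a sketch; the ~350 KB Welcome size is the figure quoted above, not a measurement):

```python
import math

def tree_depth(n):
    # MLS ratchet tree depth grows as ceil(log2 n).
    return math.ceil(math.log2(n))

def welcome_traffic_kb(welcome_kb, joiners):
    # Total Welcome bytes the steward sends for one join batch.
    return welcome_kb * joiners

tree_depth(100)               # -> 7 levels
tree_depth(1000)              # -> 10 levels
welcome_traffic_kb(350, 50)   # -> 17500 KB, i.e. ~17 MB
```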


Timer tuning: finding the minimum epoch

All protocol timers derive from epoch_duration:

epoch_duration (default 30s)

  ├── freeze_duration = epoch / 2   ← this IS the protocol Δ
  ├── consensus_timeout = 15s       ← must finish before freeze
  └── join_timeout ≈ 2 × epoch

We want to find the minimum epoch that works for each group size. So we’d test:

  • Group sizes: n ∈ {200, 500, 750, 1000} (same as original experiment)
  • Epoch durations: {10, 15, 30, 60}s (which gives freeze Δ of {5, 7.5, 15, 30}s)
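The timer relationships and the proposed grid can be expressed directly (a sketch; the 15 s consensus_timeout is the hardcoded value from the timer tree above, kept fixed here):

```python
EPOCHS_S = [10, 15, 30, 60]

def derived_timers(epoch_s):
    return {
        "freeze_duration": epoch_s / 2,   # this is the protocol Delta
        "consensus_timeout": 15,          # currently hardcoded
        "join_timeout": 2 * epoch_s,
    }

freeze_deltas = [derived_timers(e)["freeze_duration"] for e in EPOCHS_S]
# freeze_deltas matches the grid: 5.0, 7.5, 15.0, 30.0 seconds
```

Note that at epoch = 10 s the hardcoded 15 s consensus_timeout already exceeds the 5 s pre-freeze half, which feeds directly into the configurability question in the discussion.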

Questions for discussion

  • Does extending the experiment this way make sense alongside the network-level tests?
  • For join burst: what’s a realistic “spike” size? 10 joins? 100? And should we cap the batch size per epoch based on measurement A results?
  • The Welcome message grows linearly with group size (~350KB at n=1000). Should we measure whether Waku handles messages that large reliably?
  • The consensus timeout is hardcoded at 15s. If benchmarks show it needs more at large n, should it become configurable?
  • For the batch size limit β€” should excess proposals be deferred automatically to the next epoch, or should there be backpressure on proposal creation (e.g., reject new proposals when the queue is full)?