Discussion: Timing Correlation Risks in Mixnet Delay Strategies
This summary captures a technical discussion regarding the implementation of mixing delays and their interaction with auxiliary mechanisms (such as RLN based DoS protection) for libp2p based mixnets. While the discussion originated around the Exponential Delay Strategy and RLN (Rate Limiting Nullifier) used as DoS protection, the insights are relevant for any pluggable delay or security mechanism within the mixnet.
1. The “Timing Floor” (Minimum Delay Interference)
In the current specification and implementation, DoS protection proof generation is executed in parallel with the mixing delay timer by the intermediate nodes (this is done as an optimization). In such cases, the effective time a packet is held/delayed by an intermediate mix node is dictated by:
$$\text{Effective Delay} = \max(\text{Sampled Delay}, \text{RLN Proof Generation Time})$$
The Security Risk:
If the sampled delay is smaller than the time taken for the RLN proof generation to complete, the delay of each packet processing by the intermediate node would always become fixed leading to predictability. An external observer can then correlate incoming and outgoing traffic which completely defeats the purpose of having random delays.
2. The “Timing Ceiling” (Vulnerabilities in Truncation)
For practical reasons the delays are truncated (nim-libp2p PR #2120) to avoid “long tail” of an exponential distribution that could lead to extreme latencies (e.g., > 60s). For the portion of packets that fall into the statistical tail (e.g., the top 5%), the delay is no longer random. These packets are all delayed by the exact same value. This again increases the chance of possible timing correlation by an external observer.
3. Proposed Mitigation
To eliminate predictable spikes at both the floor (processing) and the ceiling (practical limit), the consensus is to follow the approach below.
-
Handling the “Long Tail”: If a sampled delay exceeds the practical threshold (e.g., 5 seconds), the node should resample until it obtains a value within the valid range. This redistributes the probability weight across the rest of the curve, maintaining a smooth, unpredictable distribution.
-
Handling varying RLN proof generation times across hardware: Set a minimum value of delay based on
average proof generation time+buffer(e.g~100msecsbased on logos-messaging RLN data +buffer).Few open points:
- Determine how much buffer time is appropriate.
- fine-tune
proof generation timeto be used
Key Requirements for Delay Implementations
-
Stochastic Independence: The exit time should remain as close to the intended probability distribution as possible.
-
Avoid Truncation: Any logic that forces diverse samples into a single fixed value (max) introduces a timing correlation risk. Rely on resampling rather.
-
Distribution Integrity: Implementers must ensure that auxiliary mechanisms do not create artificial “spikes” that facilitate traffic analysis.