Scenarios:
Scenario 1: Rare Service Discovery in a Large Noisy Network
Goal: Verify that service discovery finds rare services faster than random walking and that discoverers converge toward the right registrars rather than wandering.
Network setup:
- 250 registrars → server mode, no services, no xprPublishing
- 150 kad-only peers → plain KadDHT (not ServiceDiscovery) — act as routing noise
- 80 advertisers → advertise a common service e.g. “/logos/popular/1.0.0”
- 3 advertisers → advertise the rare service “/logos/rare/1.0.0”
- 40 discoverers → call lookup() for “/logos/rare/1.0.0” only
How to run:
- Start registrars first. Wait for routing tables to stabilise (~30s).
- Start 80 popular-service advertisers. Let them get confirmed (~60s).
- Start 3 rare-service advertisers. Record the exact timestamp.
- Start 40 discoverers. Each calls lookup("/logos/rare/1.0.0") in a loop every 10s.
Existing logs to watch:
| Log line | What it tells you |
|---|---|
| debug "getting adverts" | Each GET_ADS sent; count per discoverer |
| debug "adverts found" count=N | N=0 means empty response from that registrar |
| debug "advert accepted" | Rare service ad admitted at registrar |
| cd_lookup_peers_found metric | Total peers found per lookup call |
| cd_lookup_requests metric | Total lookups initiated |
| cd_registrar_cache_ads gauge | Cache size per registrar over time |
| cd_service_table_peers gauge | Routing table growth toward service hash |
Missing logs — need to add:
- debug "lookup complete" with fields serviceId, peersFound, registrarsContacted, bucketsTraversed, durationMs. Currently no single log captures when a lookup finishes and what it took.
- debug "empty response from registrar". Currently adverts found count=0 covers this case but has no distinct label; a dedicated log line lets it be grepped separately.
- debug "routing table state" at lookup start. Log how many peers are in each bucket of DiscT(service_id_hash) before the first GET_ADS goes out; currently no snapshot log exists.
What to check:
- Time from the rare ad being admitted (advert accepted) to the first discoverer finding it (adverts found count>0)
- Ratio of adverts found count=0 to total getting adverts; it should decrease over time as routing tables converge
- cd_service_table_peers should grow steadily for discoverers; if it stays flat, the routing table is not being updated from closerPeers
- Whether the same registrar peer ID appears repeatedly in getting adverts; this indicates discoverers are not advancing through buckets
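The checks above can be computed from collected logs with a small script. The following is a minimal sketch, not the project's tooling: it assumes a hypothetical line layout of "<unix_ts> <node> <message> key=value ...", and the registrar= and service= fields are assumptions that would need to match whatever the real log lines carry. Lines are assumed to be sorted by time.

```python
import re
from collections import Counter

# Hypothetical log layout: "<unix_ts> <node> <message> [key=value ...]".
LINE = re.compile(
    r"^(?P<ts>\d+(?:\.\d+)?) (?P<node>\S+) "
    r"(?P<msg>getting adverts|adverts found|advert accepted)(?P<rest>.*)$"
)
KV = re.compile(r"(\w+)=(\S+)")
RARE = "/logos/rare/1.0.0"


def summarize(lines, service=RARE):
    """Summarize time-ordered log lines for the rare-service scenario."""
    get_ads = empty = 0
    admitted_at = first_found_at = None
    hits = Counter()  # (discoverer, registrar) -> number of GET_ADS sent
    for line in lines:
        m = LINE.match(line.strip())
        if m is None:
            continue
        ts, node = float(m["ts"]), m["node"]
        fields = dict(KV.findall(m["rest"]))
        if m["msg"] == "getting adverts":
            get_ads += 1
            hits[(node, fields.get("registrar"))] += 1
        elif m["msg"] == "adverts found":
            if int(fields.get("count", 0)) == 0:
                empty += 1
            elif first_found_at is None and admitted_at is not None:
                first_found_at = ts
        elif fields.get("service") == service and admitted_at is None:
            # First "advert accepted" for the rare service.
            admitted_at = ts
    latency = None
    if admitted_at is not None and first_found_at is not None:
        latency = first_found_at - admitted_at
    return {
        "empty_ratio": empty / get_ads if get_ads else None,
        "time_to_first_discovery": latency,
        "repeated_registrars": {k: n for k, n in hits.items() if n > 1},
    }
```

Running summarize over successive time windows, rather than over the whole run, would show whether the empty-response ratio actually falls as routing tables converge.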
REGISTER Storm / Retry Explosion
Things to check in logs:
- Number of retries to the same registrar
- REGISTER requests per second
- Whether advertisers retry without waiting
- Registrar cache size change over time
- How waiting times change with registrar cache size
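The first three checks above reduce to simple counting once REGISTER events have been extracted from the logs. This is a minimal sketch under the assumption that each event is available as a (timestamp_seconds, advertiser_id, registrar_id) tuple; that tuple shape and the function name are illustrative, not part of the implementation.

```python
from collections import Counter, defaultdict


def register_storm_stats(events):
    """events: iterable of (timestamp_seconds, advertiser_id, registrar_id)."""
    events = list(events)
    # REGISTER requests bucketed into one-second bins.
    per_second = Counter(int(ts) for ts, _, _ in events)
    by_pair = defaultdict(list)
    for ts, advertiser, registrar in events:
        by_pair[(advertiser, registrar)].append(ts)
    retries = {}
    min_gap = {}
    for pair, stamps in by_pair.items():
        stamps.sort()
        retries[pair] = len(stamps) - 1  # first attempt is not a retry
        gaps = [b - a for a, b in zip(stamps, stamps[1:])]
        if gaps:
            # A very small gap suggests a retry sent without waiting.
            min_gap[pair] = min(gaps)
    return {
        "peak_per_second": max(per_second.values(), default=0),
        "retries": retries,
        "min_gap": min_gap,
    }
```

Comparing min_gap against the waiting time the registrar actually handed out shows whether an advertiser retried before it was allowed to.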
Client mode
This test verifies that client-mode nodes can use service discovery only as discoverers. Server-mode nodes may act as discoverers, advertisers, or registrars; a client-mode node should be able to search for peers providing a service, but it must not advertise its own service or accept and store advertisements from others.
- 80 nodes → server mode registrars
- 40 nodes → server mode advertisers
- 30 nodes → server mode discoverers
- 100 nodes → client mode
- Keep just 1 service, for simplicity and to test client-mode functionality specifically
First, start the server-mode registrars and advertisers. Let the advertisers register their advertisements.
Then start the client-mode nodes and make them discover the same service. The expected result is that client-mode nodes should be able to send GET_ADS requests and discover valid advertisers.
After that, intentionally try to misuse client mode. Pick 20 client-mode nodes and try to make them advertise. Pick another 20 client-mode nodes and try to make them behave like registrars by accepting REGISTER requests from advertisers. These operations should fail cleanly. They should not silently succeed, should not add anything to an advertisement cache, and should not start registrar/advertiser loops in the background.
Things to check in logs:
- Client-mode nodes successfully send GET_ADS requests
- Client-mode nodes receive valid advertisements for the requested service
- Client-mode nodes do not send REGISTER requests for their own advertisements
- A clear error or warning appears when advertise/register is attempted from client mode
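The expected behaviour can be stated as a small executable model, which is useful as a reference when writing the real test assertions. Everything here is hypothetical: DiscoveryNode, ClientModeError, and the method names are illustrative stand-ins, not the service discovery API.

```python
class ClientModeError(RuntimeError):
    """Raised when a client-mode node is asked to advertise or register."""


class DiscoveryNode:
    # Toy model of the expected mode rules; not the real implementation.
    def __init__(self, server_mode):
        self.server_mode = server_mode
        self.ad_cache = []        # ads stored when acting as registrar
        self.advertise_loops = 0  # background advertise loops started

    def _require_server(self, operation):
        if not self.server_mode:
            raise ClientModeError(f"{operation} is not allowed in client mode")

    def lookup(self, service_id, registrars):
        # Allowed in both modes: ask registrars for ads of one service.
        return [ad for r in registrars for ad in r.ad_cache if ad[0] == service_id]

    def advertise(self, service_id, registrar):
        self._require_server("advertise")
        self.advertise_loops += 1
        registrar.handle_register(service_id, self)

    def handle_register(self, service_id, advertiser):
        self._require_server("REGISTER handling")
        self.ad_cache.append((service_id, advertiser))
```

The guard runs before any state changes, so a rejected call leaves the cache empty and starts no loop, which is the "fail cleanly" property the misuse step is meant to confirm.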
Malicious Registrars Poisoning Routing Tables
Service discovery success depends heavily on registrars helping advertisers and discoverers move closer to the correct service-specific region of the keyspace.
This test checks whether malicious routing information slowly corrupts search tables.
- 120 registrar nodes
- 50 malicious registrars
- 30 advertisers
- 40 discoverers
The malicious registrars should intentionally return bad closerPeers lists. They should:
- return duplicate peers
- return only malicious peers
- return unreachable peers
- return themselves repeatedly
- return peers far from service hash
- return random garbage peer IDs
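To measure how much of this bad input reaches the search tables, each closerPeers response can be audited in the test harness. The sketch below is one possible audit, assuming peer IDs and the service hash are equal-length byte strings compared by XOR distance, and that the harness knows the set of peer IDs it launched (known_ids). The function and field names are hypothetical. Reachability is not covered here because it requires an actual dial.

```python
def xor_distance(a: bytes, b: bytes) -> int:
    """Kademlia-style XOR distance between two equal-length IDs."""
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


def audit_closer_peers(responder, peers, service_hash, known_ids):
    """Count suspicious entries in one closerPeers list."""
    report = {"duplicates": 0, "self_refs": 0, "unknown_ids": 0, "not_closer": 0}
    base = xor_distance(responder, service_hash)
    seen = set()
    for peer in peers:
        if peer in seen:
            report["duplicates"] += 1
            continue
        seen.add(peer)
        if peer == responder:
            report["self_refs"] += 1
        elif peer not in known_ids:
            # Not a node the harness started: likely a garbage ID.
            report["unknown_ids"] += 1
        elif xor_distance(peer, service_hash) >= base:
            # No closer to the service hash than the responder itself.
            report["not_closer"] += 1
    return report
```

A not_closer entry is only a signal, since an honest registrar with a sparse table may also return peers that are no closer than itself; the trend across many responses matters more than any single list.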
Things to check in logs:
Ticket Grinding Attack
The malicious advertisers repeatedly try to game the waiting-time system. We want to verify that malicious advertisers cannot get artificially lower wait times.
Malicious advertiser behaviour:
- retry too early
- retry too late
- modify t_wait_for
- reuse old tickets
- use ticket from another registrar
- slightly modify advertisement while reusing ticket
- intentionally drop tickets and restart
Things to check in logs:
- Early retry rejections
- Late retry rejections
- Ticket validation failures
- Lower-bound state changes
- Whether malicious advertisers are eventually penalized with higher waiting times
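Most of the behaviours listed above should map to a distinct rejection reason at the registrar. The following is a minimal sketch of such a check, not the protocol's actual ticket format: the Ticket fields other than t_wait_for, the HMAC-based authentication, and the RETRY_WINDOW constant are all assumptions made for illustration.

```python
import hashlib
import hmac
from dataclasses import dataclass

RETRY_WINDOW = 1.0  # assumed: seconds after the wait ends in which a retry is valid


@dataclass(frozen=True)
class Ticket:
    ad_digest: bytes   # binds the ticket to one advertisement
    t_init: float      # when the advertiser first asked
    t_mod: float       # when this ticket was issued
    t_wait_for: float  # wait imposed by the registrar
    mac: bytes         # keyed with a secret only the issuing registrar holds


def _mac(secret, ad_digest, t_init, t_mod, t_wait_for):
    msg = ad_digest + f"|{t_init}|{t_mod}|{t_wait_for}".encode()
    return hmac.new(secret, msg, hashlib.sha256).digest()


def issue(secret, ad, t_init, now, t_wait_for):
    digest = hashlib.sha256(ad).digest()
    return Ticket(digest, t_init, now, t_wait_for,
                  _mac(secret, digest, t_init, now, t_wait_for))


def validate(secret, ticket, ad, now):
    """Return "ok" or a rejection reason for a retried REGISTER."""
    expected = _mac(secret, ticket.ad_digest, ticket.t_init,
                    ticket.t_mod, ticket.t_wait_for)
    if not hmac.compare_digest(expected, ticket.mac):
        return "bad_mac"      # tampered fields or another registrar's ticket
    if hashlib.sha256(ad).digest() != ticket.ad_digest:
        return "ad_mismatch"  # advertisement changed while reusing the ticket
    earliest = ticket.t_mod + ticket.t_wait_for
    if now < earliest:
        return "too_early"
    if now > earliest + RETRY_WINDOW:
        return "too_late"     # also covers replay of an old ticket
    return "ok"
```

In this model, dropping tickets and restarting yields no ticket to validate, so that behaviour has to be caught by the lower-bound state rather than by ticket checks.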
Same-IP Sybil Flooding
Service discovery includes IP similarity scoring to prevent many nodes from the same subnet dominating the advertisement cache. This test checks whether the IP tree logic actually works.
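To make the expected effect concrete, here is a simplified model of an IP similarity score. It is a sketch under stated assumptions, not the implementation's IP tree: it counts, for every IPv4 prefix length, how many cached addresses share the candidate's prefix, and reports the fraction of prefix lengths that are more crowded than a uniform spread would predict. The class and method names are hypothetical.

```python
import ipaddress
from collections import Counter


class IpTree:
    """Simplified prefix-count model of an IP similarity tree (IPv4 only)."""

    def __init__(self):
        self.total = 0
        self.prefix_counts = Counter()

    @staticmethod
    def _prefixes(ip):
        bits = format(int(ipaddress.IPv4Address(ip)), "032b")
        return [bits[:depth] for depth in range(1, 33)]

    def add(self, ip):
        self.total += 1
        self.prefix_counts.update(self._prefixes(ip))

    def score(self, ip):
        """Fraction of prefix lengths (0.0 to 1.0) that are over-represented."""
        if self.total == 0:
            return 0.0
        crowded = 0
        for depth, prefix in enumerate(self._prefixes(ip), start=1):
            # Under a uniform spread, a depth-d prefix holds total / 2**d ads.
            expected = self.total / (2 ** depth)
            if self.prefix_counts[prefix] > expected:
                crowded += 1
        return crowded / 32
```

With a cache filled from one subnet, a new advertiser from that subnet scores close to 1.0 while an advertiser from an unrelated range scores 0.0; the test should confirm that the real scoring separates the two groups in the same direction.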
- 100 registrars
- 500 advertisers
Out of these:
All advertisers continuously advertise services.
Things to check in logs:
Popular Service Hotspot Attack
Service discovery specifically tries to avoid hotspotting near service hashes. This test checks whether rare services still remain discoverable when one service becomes extremely popular.
The popular service should aggressively try to dominate registrar caches.
Things to check in logs:
- Cache entries per service
- WAIT times for popular vs rare services
- Number of rare ads stored
- Lookup latency for rare services vs the popular one
Advertisement Expiry and Churn Chaos
This checks whether stale advertisements continue being returned after expiry time E.
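The core check can be expressed as a filter over the advertisements that discoverers receive. This is a minimal sketch assuming the harness can pair each returned ad with the time it was admitted and the time it was seen in a response; the tuple layout and function name are illustrative.

```python
def stale_ads(returned, expiry_s):
    """Return (peer, age_seconds) for ads seen after their expiry time E.

    returned: iterable of (peer_id, admitted_at, seen_at) in seconds.
    """
    return [
        (peer, seen_at - admitted_at)
        for peer, admitted_at, seen_at in returned
        if seen_at - admitted_at > expiry_s
    ]
```

An empty result over the whole run means no registrar served an ad past E; any entry identifies the advertiser and how old the ad was when it was served.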
Start:
- 100 registrars
- 100 advertisers
- 100 discoverers
Allow advertisements to stabilize first.
Then suddenly:
Things to check in logs:
Oversized / Corrupted Advertisement Attack
Registrars and discoverers should reject invalid advertisements safely.
Start:
- 50 registrars
- 50 malicious advertisers
The malicious advertisers send:
Things to check in logs:
Eclipse Attack Near Service Hash
Service discovery assumes that querying random registrars across buckets prevents eclipse attacks. This test checks whether that assumption holds.
Start:
The malicious registrars should:
- suppress honest advertisements
- return empty responses
- return fake advertisements
- return only malicious closer peers
Things to check in logs:
- Honest peer discovery success rate
- Number of malicious registrars contacted
- Percentage of invalid ads returned
- Whether discoverers terminate early
Concurrent Advertise + Lookup + Churn Race Test
Test concurrency bugs, async races, routing-table corruption, and cleanup issues.
Start:
- 150 registrars
- 200 advertisers
- 200 discoverers
Continuously:
- start advertisers
- stop advertisers
- rotate services
- perform lookups
- restart registrars
Everything should happen simultaneously for a long duration.
Things to check in logs: