From 13362940e15e8784a8e1f76f475cbb793741f737 Mon Sep 17 00:00:00 2001 From: Arceliar Date: Fri, 25 Jun 2021 18:03:22 -0500 Subject: [PATCH 1/8] draft of blog post on v0.4rc3 benchmarks --- .../2021-06-26-v0-4-prerelease-benchmarks.md | 56 +++ .../mobility1-10-10_arrival_progress.svg | 337 +++++++++++++ .../mobility1-10-30_arrival_progress.svg | 337 +++++++++++++ .../mobility1-10-60_arrival_progress.svg | 337 +++++++++++++ .../mobility1-30-10_arrival_progress.svg | 337 +++++++++++++ .../mobility1-30-30_arrival_progress.svg | 337 +++++++++++++ .../mobility1-30-60_arrival_progress.svg | 337 +++++++++++++ .../2021-06-26/mobility2_arrival_progress.svg | 430 +++++++++++++++++ .../2021-06-26/mobility2_traffic_progress.svg | 443 ++++++++++++++++++ .../images/2021-06-26/scalability1-grid4.svg | 408 ++++++++++++++++ .../images/2021-06-26/scalability1-line.svg | 414 ++++++++++++++++ .../images/2021-06-26/scalability1-rtree.svg | 375 +++++++++++++++ 12 files changed, 4148 insertions(+) create mode 100644 _posts/2021-06-26-v0-4-prerelease-benchmarks.md create mode 100644 assets/images/2021-06-26/mobility1-10-10_arrival_progress.svg create mode 100644 assets/images/2021-06-26/mobility1-10-30_arrival_progress.svg create mode 100644 assets/images/2021-06-26/mobility1-10-60_arrival_progress.svg create mode 100644 assets/images/2021-06-26/mobility1-30-10_arrival_progress.svg create mode 100644 assets/images/2021-06-26/mobility1-30-30_arrival_progress.svg create mode 100644 assets/images/2021-06-26/mobility1-30-60_arrival_progress.svg create mode 100644 assets/images/2021-06-26/mobility2_arrival_progress.svg create mode 100644 assets/images/2021-06-26/mobility2_traffic_progress.svg create mode 100644 assets/images/2021-06-26/scalability1-grid4.svg create mode 100644 assets/images/2021-06-26/scalability1-line.svg create mode 100644 assets/images/2021-06-26/scalability1-rtree.svg diff --git a/_posts/2021-06-26-v0-4-prerelease-benchmarks.md 
b/_posts/2021-06-26-v0-4-prerelease-benchmarks.md new file mode 100644 index 0000000..a95a216 --- /dev/null +++ b/_posts/2021-06-26-v0-4-prerelease-benchmarks.md @@ -0,0 +1,56 @@ +--- +layout: post +title: "v0.4 Pre-release Benchmarks" +date: 2021-06-26 21:00:00 +0000 +author: Arceliar +--- + +### Improvements in v0.4 + +As noted in a [recent post](2021-06-19-preparing-for-v0-4.md), the upcoming v0.4 release will include a number of major changes to how Yggdrasil routes traffic. +Most of these changes aim to improve performance in dynamic networks and reduce bandwidth consumption from protocol traffic. +It will take some time to get a sense for how these affect performance in a live network, but until then, I thought it could be interesting to look at some benchmark results. + +### Mesh Network Lab + +All of the results shown here are from [meshnet-lab](https://github.com/mwarning/meshnet-lab). Meshnet-lab simulates mesh networks using Linux's network namespace functionality. Each node is given a network namespace, which can be linked to other namespaces to simulate an arbitrary topology. + +Although meshnet-lab supports many other mesh networking protocols, I thought it would be best to focus on comparing Yggdrasil `v0.3.16` (the latest stable release) with `v0.4rc3` (the most recent release candidate). The point of this post is to highlight what kind of performance changes we expect to see in the new Yggdrasil release, not to compare Yggdrasil to other mesh routers. + +#### Mobility1 + +If I understand correctly, the `mobility1` benchmark simulates a dynamic unit disc graph. A two-dimensional plane of nodes is simulated, with nodes having connections to other nodes that fall within a certain radius. The network periodically moves all nodes a random distance between 0 and X (X=10,30,60m) in a 1km x 1km virtual space, then waits some amount of time (10s or 30s) before pinging 200 random paths. 
The paths are limited to source/destination pairs that are in the same connected component, so it only tests paths that plausibly could work. + +![mobility1-10-10_arrival_progress](/assets/images/2021-06-26/mobility1-10-10_arrival_progress.svg) +![mobility1-10-30_arrival_progress](/assets/images/2021-06-26/mobility1-10-30_arrival_progress.svg) +![mobility1-10-60_arrival_progress](/assets/images/2021-06-26/mobility1-10-60_arrival_progress.svg) + +![mobility1-30-10_arrival_progress](/assets/images/2021-06-26/mobility1-30-10_arrival_progress.svg) +![mobility1-30-30_arrival_progress](/assets/images/2021-06-26/mobility1-30-30_arrival_progress.svg) +![mobility1-30-60_arrival_progress](/assets/images/2021-06-26/mobility1-30-60_arrival_progress.svg) + +These mobility tests are an area where Yggdrasil has struggled up to now, as seen in the `v0.3.16` results. Basically, when a node moves, this can affect the coords of other nodes in the network. With the changes in `v0.4rc3`, the 30s tests are generally not a problem. The 10s tests see some loss, due to the time it takes to detect failed links before we can route around them. + +#### Mobility2 + +The `mobility2` test is essentially a variation of the above. Nodes periodically move a random (increasing) step size with a 15s delay before testing 200 random paths. This test also monitors bandwidth usage. + +![mobility2_arrival_progress](/assets/images/2021-06-26/mobility2_arrival_progress.svg) +![mobility2_traffic_progress](/assets/images/2021-06-26/mobility2_traffic_progress.svg) + +The main feature to note is that, aside from having terrible reliability in this test, `v0.3.16` uses a ridiculous amount of bandwidth when mobility is involved. With `v0.4rc3`, the bandwidth use drops to around 10 KB/s or below, depending on how mobile things are. I'm fairly certain that most of this bandwidth is still a reaction to mobility events in the network, because (as we're about to see) the bandwidth use is pretty low in static networks. 
+ +#### Scalability1 + +The `scalability1` test set involves running the network over line, tree, or square grid networks. The line and tree networks start at 50 nodes and increase to 300. The grid network starts at 49 nodes (7x7) and increases the side length by 1 at each step, up to 289 nodes (17x17). This test waits for about 5 minutes before pinging 200 paths (slowly, over an additional 5 minutes), and measures both packet delivery rate and network utilization. + +![scalability1-line](/assets/images/2021-06-26/scalability1-line.svg) +![scalability1-rtree](/assets/images/2021-06-26/scalability1-rtree.svg) +![scalability1-grid](/assets/images/2021-06-26/scalability1-grid.svg) + +There's not a whole lot to say here, `v0.4rc3` is just an improvement across the board. Note that it's a little surprising how the bandwidth use *decreases* as the network grows. I think this is an artifact of how the test works. Each network measures reliability by pinging a fixed number of paths (200). The bandwidth used by these pings counts towards the test results. In the line network, increasing the network size also increases the path length at an equal rate, so the bandwidth use per node stays about the same. In the grid and rtree networks, path length doesn't increase as rapidly as the number of nodes, so the bandwidth from the 200 test pings is increasing slower than the network size, which results in decreased average bandwidth use per node. In the future, it may be interesting to run some variation of this test without the pings, to get a better measurement of how the idle protocol traffic scales. + +### Conclusion + +The upcoming v0.4 release changes how packets are routed through the network. It's hard to predict exactly how this will affect network performance, but benchmarks in simulated networks may give us some insight into what we can expect. 
If the above benchmarks are at least qualitatively accurate, then we have reason to be optimistic about performance in the next release. +
diff --git a/assets/images/2021-06-26/mobility1-10-10_arrival_progress.svg b/assets/images/2021-06-26/mobility1-10-10_arrival_progress.svg new file mode 100644 index 0000000..f395742 --- /dev/null +++ b/assets/images/2021-06-26/mobility1-10-10_arrival_progress.svg @@ -0,0 +1,337 @@
[SVG plot, Gnuplot 5.2 patchlevel 8. Title: "Mobility1 test of 50 nodes. Start inside 1x1km square. Step duration is 10 seconds. Step width is 0-10m. 100MBit/s - 1ms latency links." x-axis: 10s steps [-], 0 to 30. y-axis: packet arrival [%], 0 to 100. Series: v0.3.16 [%], v0.4rc3 [%].]
diff --git a/assets/images/2021-06-26/mobility1-10-30_arrival_progress.svg b/assets/images/2021-06-26/mobility1-10-30_arrival_progress.svg new file mode 100644 index 0000000..31a90d9 --- /dev/null +++ b/assets/images/2021-06-26/mobility1-10-30_arrival_progress.svg @@ -0,0 +1,337 @@
[SVG plot, Gnuplot 5.2 patchlevel 8. Title: "Mobility1 test of 50 nodes. Start inside 1x1km square. Step duration is 10 seconds. Step width is 0-30m. 100MBit/s - 1ms latency links." x-axis: 10s steps [-], 0 to 30. y-axis: packet arrival [%], 0 to 100. Series: v0.3.16 [%], v0.4rc3 [%].]
diff --git a/assets/images/2021-06-26/mobility1-10-60_arrival_progress.svg b/assets/images/2021-06-26/mobility1-10-60_arrival_progress.svg new file mode 100644 index 0000000..ad17ee8 --- /dev/null +++ b/assets/images/2021-06-26/mobility1-10-60_arrival_progress.svg @@ -0,0 +1,337 @@
[SVG plot, Gnuplot 5.2 patchlevel 8. Title: "Mobility1 test of 50 nodes. Start inside 1x1km square. Step duration is 10 seconds. Step width is 0-60m. 100MBit/s - 1ms latency links." x-axis: 10s steps [-], 0 to 30. y-axis: packet arrival [%], 0 to 100. Series: v0.3.16 [%], v0.4rc3 [%].]
diff --git a/assets/images/2021-06-26/mobility1-30-10_arrival_progress.svg b/assets/images/2021-06-26/mobility1-30-10_arrival_progress.svg new file mode 100644 index 0000000..e273259 --- /dev/null +++ b/assets/images/2021-06-26/mobility1-30-10_arrival_progress.svg @@ -0,0 +1,337 @@
[SVG plot, Gnuplot 5.2 patchlevel 8. Title: "Mobility1 test of 50 nodes. Start inside 1x1km square. Step duration is 30 seconds. Step width is 0-10m. 100MBit/s - 1ms latency links." x-axis: 30s steps [-], 0 to 30. y-axis: packet arrival [%], 0 to 100. Series: v0.3.16 [%], v0.4rc3 [%].]
diff --git a/assets/images/2021-06-26/mobility1-30-30_arrival_progress.svg b/assets/images/2021-06-26/mobility1-30-30_arrival_progress.svg new file mode 100644 index 0000000..9e6e841 --- /dev/null +++ b/assets/images/2021-06-26/mobility1-30-30_arrival_progress.svg @@ -0,0 +1,337 @@
[SVG plot, Gnuplot 5.2 patchlevel 8. Title: "Mobility1 test of 50 nodes. Start inside 1x1km square. Step duration is 30 seconds. Step width is 0-30m. 100MBit/s - 1ms latency links." x-axis: 30s steps [-], 0 to 30. y-axis: packet arrival [%], 0 to 100. Series: v0.3.16 [%], v0.4rc3 [%].]
diff --git a/assets/images/2021-06-26/mobility1-30-60_arrival_progress.svg b/assets/images/2021-06-26/mobility1-30-60_arrival_progress.svg new file mode 100644 index 0000000..880f3cc --- /dev/null +++ b/assets/images/2021-06-26/mobility1-30-60_arrival_progress.svg @@ -0,0 +1,337 @@
[SVG plot, Gnuplot 5.2 patchlevel 8. Title: "Mobility1 test of 50 nodes. Start inside 1x1km square. Step duration is 30 seconds. Step width is 0-60m. 100MBit/s - 1ms latency links." x-axis: 30s steps [-], 0 to 30. y-axis: packet arrival [%], 0 to 100. Series: v0.3.16 [%], v0.4rc3 [%].]
diff --git a/assets/images/2021-06-26/mobility2_arrival_progress.svg b/assets/images/2021-06-26/mobility2_arrival_progress.svg new file mode 100644 index 0000000..25f68f8 --- /dev/null +++ b/assets/images/2021-06-26/mobility2_arrival_progress.svg @@ -0,0 +1,430 @@
[SVG plot, Gnuplot 5.2 patchlevel 8. Title: "Mobility2 Test for 50 randomly placed nodes in a 1x1km square. Move in random directions of 50-400m in 50m increments. Wait and measure ping arrival over 60s in 10s intervals each time. 100MBit/s - 1ms latency links." x-axis: unlabeled, 0 to 50. y-axis: packet arrival [%], 0 to 100. Series: v0.3.16 [%], v0.4rc3 [%].]
diff --git a/assets/images/2021-06-26/mobility2_traffic_progress.svg b/assets/images/2021-06-26/mobility2_traffic_progress.svg new file mode 100644 index 0000000..381cf04 --- /dev/null +++ b/assets/images/2021-06-26/mobility2_traffic_progress.svg @@ -0,0 +1,443 @@
[SVG plot, Gnuplot 5.2 patchlevel 8. Title: "Mobility2 Test for 50 randomly placed nodes in a 1x1km square. Move in random directions of 50-400m in 50m increments. Wait and measure ping arrival over 60s in 10s intervals each time. 100MBit/s - 1ms latency links." x-axis: unlabeled, 0 to 50. y-axis: tx traffic per node [KB/s], 0 to 600. Series (legend as in the file): v0.3.16 [%], v0.4rc3 [%].]
diff --git a/assets/images/2021-06-26/scalability1-grid4.svg b/assets/images/2021-06-26/scalability1-grid4.svg new file mode 100644 index 0000000..e2d8689 --- /dev/null +++ b/assets/images/2021-06-26/scalability1-grid4.svg @@ -0,0 +1,408 @@
[SVG plot, Gnuplot 5.2 patchlevel 8. Title: "Traffic by routing protocol on grid4 dataset with 100MBit/s - 1ms latency links. 1. Start daemons, 2. Wait 300s, 3. Measure for 300s with <node_count> random pings". x-axis: # number of nodes, 0 to 300. Left y-axis: tx per node [KB/s], 0 to 3. Right y-axis: packet arrival [%], 0 to 100. Series: v0.3.16 [KB/s/node], v0.3.16 [%], v0.4rc3 [KB/s/node], v0.4rc3 [%].]
diff --git a/assets/images/2021-06-26/scalability1-line.svg b/assets/images/2021-06-26/scalability1-line.svg new file mode 100644 index 0000000..a7dcc46 --- /dev/null +++ b/assets/images/2021-06-26/scalability1-line.svg @@ -0,0 +1,414 @@
[SVG plot, Gnuplot 5.2 patchlevel 8. Title: "Traffic by routing protocol on line dataset with 100MBit/s - 1ms latency links. 1. Start daemons, 2. Wait 300s, 3. Measure for 300s with <node_count> random pings". x-axis: # number of nodes, 50 to 300. Left y-axis: tx per node [KB/s], 0 to 18. Right y-axis: packet arrival [%], 0 to 100. Series: v0.3.16 [KB/s/node], v0.3.16 [%], v0.4rc3 [KB/s/node], v0.4rc3 [%].]
diff --git a/assets/images/2021-06-26/scalability1-rtree.svg b/assets/images/2021-06-26/scalability1-rtree.svg new file mode 100644 index 0000000..8831639 --- /dev/null +++ b/assets/images/2021-06-26/scalability1-rtree.svg @@ -0,0 +1,375 @@
[SVG plot, Gnuplot 5.2 patchlevel 8. Title: "Traffic by routing protocol on rtree dataset with 100MBit/s - 1ms latency links. 1. Start daemons, 2. Wait 300s, 3. Measure for 300s with <node_count> random pings". x-axis: # number of nodes, 50 to 300. Left y-axis: tx per node [KB/s], 0 to 1.2. Right y-axis: packet arrival [%], 0 to 100. Series: v0.3.16 [KB/s/node], v0.3.16 [%], v0.4rc3 [KB/s/node], v0.4rc3 [%].]
From 20b8e25b0bbd6e80deb64624bcdb1935eb0025b0 Mon Sep 17 00:00:00 2001 From: Arceliar Date: Fri, 25 Jun 2021 18:05:32 -0500 Subject: [PATCH 2/8] grid4 --- _posts/2021-06-26-v0-4-prerelease-benchmarks.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_posts/2021-06-26-v0-4-prerelease-benchmarks.md b/_posts/2021-06-26-v0-4-prerelease-benchmarks.md index a95a216..66dc215 100644 --- a/_posts/2021-06-26-v0-4-prerelease-benchmarks.md +++ b/_posts/2021-06-26-v0-4-prerelease-benchmarks.md @@ -46,7 +46,7 @@ The `scalability1` test set involves running the network over line, tree, or squ ![scalability1-line](/assets/images/2021-06-26/scalability1-line.svg) ![scalability1-rtree](/assets/images/2021-06-26/scalability1-rtree.svg) -![scalability1-grid](/assets/images/2021-06-26/scalability1-grid.svg) +![scalability1-grid](/assets/images/2021-06-26/scalability1-grid4.svg) There's not a whole lot to say here, `v0.4rc3` is just an improvement across the board. Note that it's a little surprising how the bandwidth use *decreases* as the network grows. I think this is an artifact of how the test works. Each network measures reliability by pinging a fixed number of paths (200). The bandwidth used by these pings counts towards the test results. In the line network, increasing the network size also increases the path length at an equal rate, so the bandwidth use per node stays about the same. In the grid and rtree networks, path length doesn't increase as rapidly as the number of nodes, so the bandwidth from the 200 test pings is increasing slower than the network size, which results in decreased average bandwidth use per node. 
In the future, it may be interesting to run some variation of this test without the pings, to get a better measurement of how the idle protocol traffic scales. From 6544fd0761abf214d05afd9a686a96c7508a5090 Mon Sep 17 00:00:00 2001 From: Arceliar Date: Fri, 25 Jun 2021 21:02:52 -0500 Subject: [PATCH 3/8] explain v0.3 and v0.4 differences a bit --- .../2021-06-26-v0-4-prerelease-benchmarks.md | 23 ++++++++++++++----- 1 file changed, 17 insertions(+), 6 deletions(-) diff --git a/_posts/2021-06-26-v0-4-prerelease-benchmarks.md b/_posts/2021-06-26-v0-4-prerelease-benchmarks.md index 66dc215..41e9a9e 100644 --- a/_posts/2021-06-26-v0-4-prerelease-benchmarks.md +++ b/_posts/2021-06-26-v0-4-prerelease-benchmarks.md @@ -5,21 +5,32 @@ date: 2021-06-26 21:00:00 +0000 author: Arceliar --- +### The Problem with v0.3 + +In the current stable release of Yggdrasil, `v0.3.16`, routing works basically the same way that it has always worked since release. Traffic is forwarded by greedy routing in a metric space. In essence, each node has a distance label (`coords` in the code), and given the distance labels of any two nodes, you can calculate the distance of some path between them. Traffic is forwarded to whichever peer minimizes that distance to the destination. This has been discussed in an [earlier blog post](2018-07-17-world-tree.md), so let's not worry about the details of how it works right now. We'll just focus on what happens when it *doesn't* work. + +To be able to send traffic to a destination `D`, the sender `S` must look up `D`'s distance label and key in the DHT. This happens just before session setup, where ephemeral keys are exchanged. You can think of it a bit like a DNS lookup: it maps some static information (the node's Yggdrasil IPv6 address) onto some dynamic information (the node's distance label). 
If anything happens to the network that causes the destination node `D`'s distance label to change, then all traffic to `D` will drop until `S` can look up `D`'s new distance label. However, that lookup depends on the DHT, and the DHT *also* uses distance labels for communication, so DHT lookups for `D` will fail for some amount of time until that completes. While that's happening, `S` cannot communicate with `D`, even if the path between `S` and `D` is unaffected. Further exacerbating the problem, the DHT search is an iterative process that requires round trip communication with multiple nodes. These nodes are essentially random, meaning most of them are likely to be near the edge of the network, where connections are comparatively unreliable and costly to use. If any part of the lookup fails, then this delays search progress (if it doesn't cause the search to fail entirely). + +The network tries to combat these problems by having `D` refresh itself in the DHT and send a notification to `S` when `D`'s distance label changes. However, there is no guarantee that `D` knows every node which is tracking it in the DHT, and these notifications will hit a dead end and be dropped if the distance labels of the recipients have also changed. This often happens if `S` and `D` share a common ancestor in the tree. Basically, if `S` and `D` are in a LAN with gateway `G`, and `G`'s connection to the outside world dies, this disrupts the connection between `S` and `D` (and leaves the DHT in a broken state, where they can't look each other up, until things time out). + ### Improvements in v0.4 As noted in a [recent post](2021-06-19-preparing-for-v0-4.md), the upcoming v0.4 release will include a number of major changes to how Yggdrasil routes traffic. Most of these changes aim to improve performance in dynamic networks and reduce bandwidth consumption from protocol traffic. 
-It will take some time to get a sense for how these affect performance in a live network, but until then, I thought it could be interesting to look at some benchmark results. +Without repeating too much from that earlier blog post, the basic goal here is to insulate the routing from changes to distance labels. +This happens through a mix of reactive opportunistic source routing and a fallback to proactive DHT-based routing, both of which use distance labels for path setup, but neither of which is broken when the distance labels change (provided that the links in the path still work). + +Since it may take a while to see how this affects performance in a live network, and because it's a bit difficult to actually measure these things in a real network, it seems like it would be useful to look at some results from benchmarks on simulated networks. ### Mesh Network Lab -All of the results shown here are from [meshnet-lab](https://github.com/mwarning/meshnet-lab). Meshnet-lab simulates mesh networks using Linux's network namespace functionality. Each node is given a network namespace, which can be linked to other namespaces to simulate an arbitrary topology. +All of the results shown here are from [meshnet-lab](https://github.com/mwarning/meshnet-lab). You should probably just read the documentation if you want to know more, but to summarize: meshnet-lab simulates mesh networks using network namespaces on Linux. Each node is given a network namespace, which can be linked to other namespaces to simulate an arbitrary topology. Links are added and removed as needed to e.g. simulate movement in a mobile ad hoc network. -Although meshnet-lab supports many other mesh networking protocols, I thought it would be best to focus on comparing Yggdrasil `v0.3.16` (the latest stable release) with `v0.4rc3` (the most recent release candidate). 
The point of this post is to highlight what kind of performance changes we expect to see in the new Yggdrasil release, not to compare Yggdrasil to other mesh routers. +Although meshnet-lab supports many other mesh networking protocols, this post will focus on comparing Yggdrasil `v0.3.16` (the latest stable release) with `v0.4rc3` (the most recent release candidate). The point of this post is to highlight what kind of performance changes we expect to see in the new Yggdrasil release, not to compare Yggdrasil to other mesh routers. #### Mobility1 -If I understand correctly, the `mobility1` benchmark simulates a dynamic unit disc graph. A two-dimensional plane of nodes is simulated, with nodes having connections to other nodes that fall within a certain radius. The network periodically moves all nodes a random distance between 0 and X (X=10,30,60m) in a 1km x 1km virtual space, then waits some amount of time (10s or 30s) before pinging 200 random paths. The paths are limited to source/destination pairs that are in the same connected component, so it only tests paths that plausibly could work. +The `mobility1` benchmark simulates a dynamic [unit disc graph](https://en.wikipedia.org/wiki/Unit_disk_graph). Nodes are simulated within a two-dimensional Euclidean plane, with each node having connections to other nodes that fall within a certain radius. The network periodically moves all nodes a random distance between 0 and X (X=10,30,60m) in a 1km x 1km virtual space, then waits some amount of time (10s or 30s) before pinging 200 random paths. The paths are limited to source/destination pairs that are in the same connected component, so it only tests paths that plausibly could work. 
![mobility1-10-10_arrival_progress](/assets/images/2021-06-26/mobility1-10-10_arrival_progress.svg) ![mobility1-10-30_arrival_progress](/assets/images/2021-06-26/mobility1-10-30_arrival_progress.svg) @@ -29,7 +40,7 @@ If I understand correctly, the `mobility1` benchmark simulates a dynamic unit di ![mobility1-30-30_arrival_progress](/assets/images/2021-06-26/mobility1-30-30_arrival_progress.svg) ![mobility1-30-60_arrival_progress](/assets/images/2021-06-26/mobility1-30-60_arrival_progress.svg) -These mobility tests are an area where Yggdrasil has struggled up to now, as seen in the `v0.3.16` results. Basically, when a node moves, this can affect the coords of other nodes in the network. With the changes in `v0.4rc3`, the 30s tests are generally not a problem. The 10s tests see some loss, due to the time it takes to detect failed links before we can route around them. +These mobility tests are an area where Yggdrasil has struggled up to now, as seen in the `v0.3.16` results. Basically, when a node moves, this can affect the coords of other nodes in the network. With the changes in `v0.4rc3`, the 30s tests are generally in good shape. The 10s tests see some loss, due to the time it takes to detect failed links before we can route around them. #### Mobility2 @@ -48,7 +59,7 @@ The `scalability1` test set involves running the network over line, tree, or squ ![scalability1-rtree](/assets/images/2021-06-26/scalability1-rtree.svg) ![scalability1-grid](/assets/images/2021-06-26/scalability1-grid4.svg) -There's not a whole lot to say here, `v0.4rc3` is just an improvement across the board. Note that it's a little surprising how the bandwidth use *decreases* as the network grows. I think this is an artifact of how the test works. Each network measures reliability by pinging a fixed number of paths (200). The bandwidth used by these pings counts towards the test results. 
In the line network, increasing the network size also increases the path length at an equal rate, so the bandwidth use per node stays about the same. In the grid and rtree networks, path length doesn't increases as rapidly as the number of nodes, so the bandwith from the 200 test pings is increasing slower than the network size, which results in decreased average bandwidth use per node. In the future, it may be interesting to run some variation of this test without the pings, to get a better measurement of how the idle protocol traffic scales. +There's not a whole lot to say here, `v0.4rc3` is just an improvement across the board. Note that it's a little surprising how the bandwidth use *decreases* as the network grows. This may be an artifact of how the test works. Each network measures reliability by pinging a fixed number of paths (200). The bandwidth used by these pings counts towards the test results. In the line network, increasing the network size also increases the path length at an equal rate, so the bandwidth use per node stays about the same. In the grid and rtree networks, path length doesn't increases as rapidly as the number of nodes, so the bandwith from the 200 test pings is increasing slower than the network size, which results in decreased average bandwidth use per node. In the future, it may be interesting to run some variation of this test without the pings, to get a better measurement of how the idle protocol traffic scales. 
### Conclusion From f531938bdc5c74259407381eaf219d2f609345fe Mon Sep 17 00:00:00 2001 From: Arceliar Date: Sat, 26 Jun 2021 15:06:40 -0500 Subject: [PATCH 4/8] benchmark editing --- _posts/2021-06-26-v0-4-prerelease-benchmarks.md | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/_posts/2021-06-26-v0-4-prerelease-benchmarks.md b/_posts/2021-06-26-v0-4-prerelease-benchmarks.md index 41e9a9e..53998ab 100644 --- a/_posts/2021-06-26-v0-4-prerelease-benchmarks.md +++ b/_posts/2021-06-26-v0-4-prerelease-benchmarks.md @@ -7,11 +7,13 @@ author: Arceliar ### The Problem with v0.3 -In the current stable release of Yggdrasil, `v0.3.16`, routing works basically the same way that it has always worked since release. Traffic is forwarded by greedy routing in a metric space. In essence, each node has a distance label (`coords` in the code), and given the distance label of any two nodes, you can calculate the distance of some path between them. Traffic is forwarded to whichever peer minimizes that distance to the destination. This has been discussed in an [earlier blog post](2018-07-17-world-tree.md), so lets not worry about the details of how it works right now. We'll just focus on what happens when it *doesn't* work. +In the current stable release of Yggdrasil, `v0.3.16`, routing works basically the same way that it has always worked since release. Traffic is forwarded by greedy routing in a metric space. In essence, each node has a distance label (`coords` in the code), and given the distance label of any two nodes, you can calculate the distance of some path between them. Traffic is forwarded to whichever peer minimizes that distance to the destination. This has been discussed in an [earlier blog post](2018-07-17-world-tree.md), so lets not worry about the details of how it works for now. Instead, we'll focus on what happens when it *doesn't* work. 
-To be able to send traffic to a destination `D`, the sender `S` must look up the node's distance label and key in the DHT. This happens just before session setup, where ephemeral keys are exchanged. You can think of it a bit like a DNS lookup: it maps some static information (the node's Yggdrasil IPv6 address) onto some dynamic information (the node's distance label). If anything happens to the network that causes the destination node `D`'s distance label to change, then all traffic to `D` will drop until the `S` can look up `D`'s new distance label. However, that lookup depends on the DHT, and the DHT *also* uses distance labels for communication, so DHT lookups for `D` will fail for some amount of time until that completes. While that's happening, `S` cannot communicate with `D`, even if the path between `S` and `D` is unaffected. Further exacerbating the problem, the DHT search is an iterative process that requires round trip communication with multiple nodes. These nodes essentially random, meaning most of them are likely to be near the edge of the network, where connections are comparatively unreliable and costly to use. If any part of the lookup fails, then this delays search progress (if it doesn't cause the search to fail entirely). +To be able to send traffic to a destination `D`, the sender `S` must look up the node's distance label and key in the DHT. This happens just before session setup, where ephemeral keys are exchanged. You can think of it a bit like a DNS lookup: it maps some known static information (the node's Yggdrasil IPv6 address) onto some unknown or dynamic information (the node's static key and dynamic distance label). If anything happens to the network that causes the destination node `D`'s distance label to change, then all traffic to `D` will drop until the `S` can look up `D`'s new distance label. 
However, that lookup depends on the DHT, and the DHT *also* uses distance labels for communication, so DHT lookups for `D` will fail for some amount of time, until the out-of-date information about `D` times out or is replaced. While that's happening, `S` cannot communicate with `D`, even if the path between `S` and `D` is unaffected. Further exacerbating the problem, the DHT search is an iterative process, which requires round trip communication with multiple nodes. These nodes are, for the most part, randomly distributed across the physical network, meaning most of them are likely to be near the edge of the network, where connections are comparatively unreliable and costly to use. If any part of the lookup fails, then this delays search progress (if it doesn't cause the search to fail entirely). -The network tries to combat these problems by having `D` refresh itself in the DHT and send a notification to `S` when `D`'s distance label changes. However, there is no guarantee that `D` knows every node which is tracking it in the DHT, and these notifications will hit a dead and and be dropped if the distance labels of the recipients have also changed. This often happens if `S` and `D` share a common ancestor in the tree. Basically, if `S` and `D` are in a LAN with gateway `G`, and `G`'s connection to the outside world dies, this disrupts the connection between `S` and `D` (and leaves the DHT in a broken state, where they can't look each other up, until things time out). +The network tries to combat these problems by having `D` refresh itself in the DHT and send a notification to `S` when `D`'s distance label changes. However, there is no guarantee that `D` knows every node which is tracking it in the DHT, and these notifications will hit a dead end and be dropped if the distance labels of the recipients have also changed. This often happens if `S` and `D` share a common ancestor in the tree. 
+ +To give a concrete example, if `S` and `D` are in a LAN with gateway `G`, and `G`'s connection to the outside world dies, then this disrupts the traffic flow between `S` and `D`. That happens even when the path between them in their own network is unaffected. It also causes various issues in the DHT, which hurt performance for the network in general, and prevents `S` and `D` in particular from being able to resume communication. ### Improvements in v0.4 @@ -26,7 +28,7 @@ Since it may take a while to see how this affects performance in a live network, All of the results shown here are from [meshnet-lab](https://github.com/mwarning/meshnet-lab). You should probably just read the documentation if you want to know more, but to summarize: meshnet-lab simulates mesh networks using network namespace on linux. Each node is give a network namespace, which can be linked to other namespaces to simulate an arbitrary topology. Links are added and removed as needed to e.g. simulate movement in a mobile adhoc network. -Although meshnet-lab supports many other mesh networking protocols, this post will focus on comparing Yggdrasil `v0.3.16` (the latest stable release) with `v0.4rc3` (the most recent release candidate). The point of this post is to highlight what kind of performance changes we expect to see in the new Yggdrasil release, not to compare Yggdrasil to other mesh routers. +Although meshnet-lab supports many other mesh networking protocols, this post will focus on comparing Yggdrasil `v0.3.16` (the latest stable release) with `v0.4rc3` (the most recent release candidate). Comparisons with other mesh routers would be interesting, but it would be best if those were done by an unbiased 3rd party (and using a stable `v0.4.X` release instead of a release candidate). Instead, this post will try to highlight (qualitatively) what sort of performance changes we expect to see in the new release. 
#### Mobility1 @@ -59,7 +61,7 @@ The `scalability1` test set involves running the network over line, tree, or squ ![scalability1-rtree](/assets/images/2021-06-26/scalability1-rtree.svg) ![scalability1-grid](/assets/images/2021-06-26/scalability1-grid4.svg) -There's not a whole lot to say here, `v0.4rc3` is just an improvement across the board. Note that it's a little surprising how the bandwidth use *decreases* as the network grows. This may be an artifact of how the test works. Each network measures reliability by pinging a fixed number of paths (200). The bandwidth used by these pings counts towards the test results. In the line network, increasing the network size also increases the path length at an equal rate, so the bandwidth use per node stays about the same. In the grid and rtree networks, path length doesn't increases as rapidly as the number of nodes, so the bandwith from the 200 test pings is increasing slower than the network size, which results in decreased average bandwidth use per node. In the future, it may be interesting to run some variation of this test without the pings, to get a better measurement of how the idle protocol traffic scales. +There's not a whole lot to say here, `v0.4rc3` is just an improvement across the board. Note that it's a little surprising how the bandwidth use *decreases* as the network grows. This may be an artifact of how the test works, since a fixed number of pings may represent proportionally more traffic in a small network, but that's speculation. 
### Conclusion From eafb927d57b043048eaf81589e68b630028d246b Mon Sep 17 00:00:00 2001 From: Arceliar Date: Sat, 26 Jun 2021 15:14:15 -0500 Subject: [PATCH 5/8] fix typos --- _posts/2021-06-26-v0-4-prerelease-benchmarks.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/_posts/2021-06-26-v0-4-prerelease-benchmarks.md b/_posts/2021-06-26-v0-4-prerelease-benchmarks.md index 53998ab..2be2312 100644 --- a/_posts/2021-06-26-v0-4-prerelease-benchmarks.md +++ b/_posts/2021-06-26-v0-4-prerelease-benchmarks.md @@ -22,11 +22,11 @@ Most of these changes aim to improve performance in dynamic networks and reduce Without repeating too much from that earlier blog post, the basic goal here is to insulate the routing from changes to distance labels. This happens through a mix of reactive opportunistic source routing and falling back to to proactive DHT-based routing, both of which use distance labels for path setup, but neither of which is broken when the distance labels change (provided that the links in the path still work). -Since it may take a while to see how this affects performance in a live network, and becuause it's a bit difficult to actually measure these things in a real network, it seems like it would be useful to look at some results from benchmarks on simulated networks. +Since it may take a while to see how this affects performance in a live network, and because it's a bit difficult to actually measure these things in a real network, it seems like it would be useful to look at some results from benchmarks on simulated networks. ### Mesh Network Lab -All of the results shown here are from [meshnet-lab](https://github.com/mwarning/meshnet-lab). You should probably just read the documentation if you want to know more, but to summarize: meshnet-lab simulates mesh networks using network namespace on linux. Each node is give a network namespace, which can be linked to other namespaces to simulate an arbitrary topology. 
Links are added and removed as needed to e.g. simulate movement in a mobile adhoc network. +All of the results shown here are from [meshnet-lab](https://github.com/mwarning/meshnet-lab). You should probably just read the documentation if you want to know more, but to summarize: meshnet-lab simulates mesh networks using network namespace on linux. Each node is given a network namespace, which can be linked to other namespaces to simulate an arbitrary topology. Links are added and removed as needed to e.g. simulate movement in a mobile adhoc network. Although meshnet-lab supports many other mesh networking protocols, this post will focus on comparing Yggdrasil `v0.3.16` (the latest stable release) with `v0.4rc3` (the most recent release candidate). Comparisons with other mesh routers would be interesting, but it would be best if those were done by an unbiased 3rd party (and using a stable `v0.4.X` release instead of a release candidate). Instead, this post will try to highlight (qualitatively) what sort of performance changes we expect to see in the new release. From 022bf5b38c534be38c4252a6af054061fdba2485 Mon Sep 17 00:00:00 2001 From: Arceliar Date: Sat, 26 Jun 2021 15:43:19 -0500 Subject: [PATCH 6/8] more editing for the benchmark post --- _posts/2021-06-26-v0-4-prerelease-benchmarks.md | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/_posts/2021-06-26-v0-4-prerelease-benchmarks.md b/_posts/2021-06-26-v0-4-prerelease-benchmarks.md index 2be2312..bcbcf17 100644 --- a/_posts/2021-06-26-v0-4-prerelease-benchmarks.md +++ b/_posts/2021-06-26-v0-4-prerelease-benchmarks.md @@ -5,9 +5,9 @@ date: 2021-06-26 21:00:00 +0000 author: Arceliar --- -### The Problem with v0.3 +### Revisiting v0.3 -In the current stable release of Yggdrasil, `v0.3.16`, routing works basically the same way that it has always worked since release. Traffic is forwarded by greedy routing in a metric space. 
In essence, each node has a distance label (`coords` in the code), and given the distance label of any two nodes, you can calculate the distance of some path between them. Traffic is forwarded to whichever peer minimizes that distance to the destination. This has been discussed in an [earlier blog post](2018-07-17-world-tree.md), so lets not worry about the details of how it works for now. Instead, we'll focus on what happens when it *doesn't* work. +In the current stable release of Yggdrasil, `v0.3.16`, routing works basically the same way that it has always worked since release. Traffic is forwarded by greedy routing in a metric space. In essence, each node has a "distance label", and given the distance label of any two nodes, you can calculate the distance of some path between them. In the code, this label is usually called `coords`, as it represents a position in the tree, but technically we don't care about the position itself, we only care that it works as a distance label. Traffic is forwarded to whichever peer minimizes that distance to the destination. This has been discussed in an [earlier blog post](2018-07-17-world-tree.md), so let's not worry about the details of how it works for now. Instead, we'll focus on what happens when it *doesn't* work. To be able to send traffic to a destination `D`, the sender `S` must look up the node's distance label and key in the DHT. This happens just before session setup, where ephemeral keys are exchanged. You can think of it a bit like a DNS lookup: it maps some known static information (the node's Yggdrasil IPv6 address) onto some unknown or dynamic information (the node's static key and dynamic distance label). If anything happens to the network that causes the destination node `D`'s distance label to change, then all traffic to `D` will drop until the `S` can look up `D`'s new distance label. 
However, that lookup depends on the DHT, and the DHT *also* uses distance labels for communication, so DHT lookups for `D` will fail for some amount of time, until the out-of-date information about `D` times out or is replaced. While that's happening, `S` cannot communicate with `D`, even if the path between `S` and `D` is unaffected. Further exacerbating the problem, the DHT search is an iterative process, which requires round trip communication with multiple nodes. These nodes are, for the most part, randomly distributed across the physical network, meaning most of them are likely to be near the edge of the network, where connections are comparatively unreliable and costly to use. If any part of the lookup fails, then this delays search progress (if it doesn't cause the search to fail entirely). @@ -46,12 +46,12 @@ These mobility tests are an area where Yggdrasil has struggled up to now, as see #### Mobility2 -The `mobility2` test is essentially a variation of the above. Nodes periodically move a random (increasing) step size with a 15s delay before testing 200 random paths. This test also monitors bandwidth usage. +The `mobility2` test is essentially a much more aggressive variation of the above. Nodes periodically move a random (increasing) step size with a 15s delay before testing 200 random paths. This test also monitors bandwidth usage. ![mobility2_arrival_progress](/assets/images/2021-06-26/mobility2_arrival_progress.svg) ![mobility2_traffic_progress](/assets/images/2021-06-26/mobility2_traffic_progress.svg) -The main feature to note is that, asside from having terrible reliability in this test, `v0.3.16` uses a ridiculous amount of bandwidth when mobility is involved. With `v0.4rc3`, the bandwith use drops to at or below around 10KBps, depending on how mobile things are. I'm fairly certain that most of this bandwith is still a reaction to mobility events in the network, because (as we're about to see) the bandwith use a pretty low in static networks. 
+The main feature to note is that, aside from having terrible reliability in this test, `v0.3.16` uses a ridiculous amount of bandwidth when mobility is involved. With `v0.4rc3`, the bandwidth use drops to at or below around 10KBps, depending on how mobile things are. I'm fairly certain that most of this bandwidth is still a reaction to mobility events in the network, because (as we're about to see) the bandwidth use is pretty low in static networks. #### Scalability1 @@ -67,3 +67,5 @@ There's not a whole lot to say here, `v0.4rc3` is just an improvement across the The upcoming v0.4 release changes how packets are routed through the network. It's hard to predict exactly how this will affect network performance, but benchmarks in simulated networks may give us some insight into what we can expect. If the above benchmarks are at least qualitatively accurate, then we have reason to be optimistic about performance in the next release. +If things go according to plan, then these changes should improve the user experience and overall usefulness of the network. Changes to the network state should no longer affect existing traffic flows, as long as the path the flow is using is unaffected. In cases where the path *is* affected, it should take much less time for the network to detect this and route around the damage (when it's possible to do so). With or without disruptive changes in the network, there should be reduced bandwidth from protocol traffic, leading to lower data use and longer battery life in energy constrained environments (e.g. mobile phones). 
+ From 7dfbd98c6229fd0c1df8c0e8bf1c9dd273b0f171 Mon Sep 17 00:00:00 2001 From: Arceliar Date: Sat, 26 Jun 2021 15:56:20 -0500 Subject: [PATCH 7/8] editing the conclusion --- _posts/2021-06-26-v0-4-prerelease-benchmarks.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_posts/2021-06-26-v0-4-prerelease-benchmarks.md b/_posts/2021-06-26-v0-4-prerelease-benchmarks.md index bcbcf17..b02a337 100644 --- a/_posts/2021-06-26-v0-4-prerelease-benchmarks.md +++ b/_posts/2021-06-26-v0-4-prerelease-benchmarks.md @@ -65,7 +65,7 @@ There's not a whole lot to say here, `v0.4rc3` is just an improvement across the ### Conclusion -The upcoming v0.4 release changes how packets are routed through the network. It's hard to predict exactly how this will affect network performance, but benchmarks in simulated networks may give us some insight into what we can expect. If the above benchmarks are at least qualitatively accurate, then we have reason to be optimistic about performance in the next release. +The upcoming v0.4 release changes how packets are routed through the network. While it's hard to say exactly how things will perform in the real world, the performance gains in the simulated networks give us reason to be optimistic. If things go according to plan, then these changes should improve the user experience and overall usefulness of the network. Changes to the network state should no longer affect existing traffic flows, as long as the path the flow is using is unaffected. In cases where the path *is* affected, it should take much less time for the network to detect this and route around the damage (when it's possible to do so). With or without disruptive changes in the network, there should be reduced bandwidth from protocol traffic, leading to lower data use and longer battery life in energy constrained environments (e.g. mobile phones). 
From a0f5fbbc7412af10cced176f6cda4735e3776874 Mon Sep 17 00:00:00 2001 From: Arceliar Date: Sat, 26 Jun 2021 15:57:28 -0500 Subject: [PATCH 8/8] conclusion --- _posts/2021-06-26-v0-4-prerelease-benchmarks.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_posts/2021-06-26-v0-4-prerelease-benchmarks.md b/_posts/2021-06-26-v0-4-prerelease-benchmarks.md index b02a337..ad43148 100644 --- a/_posts/2021-06-26-v0-4-prerelease-benchmarks.md +++ b/_posts/2021-06-26-v0-4-prerelease-benchmarks.md @@ -65,7 +65,7 @@ There's not a whole lot to say here, `v0.4rc3` is just an improvement across the ### Conclusion -The upcoming v0.4 release changes how packets are routed through the network. While it's hard to say exactly how things will perform in the real world, the performance gains in the simulated networks give us reason to be optimistic. +The upcoming v0.4 release changes how packets are routed through the network. While it's hard to say exactly how things will behave in the real world, the performance gains in the simulated networks give us reason to be optimistic. If things go according to plan, then these changes should improve the user experience and overall usefulness of the network. Changes to the network state should no longer affect existing traffic flows, as long as the path the flow is using is unaffected. In cases where the path *is* affected, it should take much less time for the network to detect this and route around the damage (when it's possible to do so). With or without disruptive changes in the network, there should be reduced bandwidth from protocol traffic, leading to lower data use and longer battery life in energy constrained environments (e.g. mobile phones).