Commits · 47ba0de8a241d662cf71c6c00fa48071a3b9bdf1 · HEME Clement / Komodo

May 29, 2024

improve performances by not shuffling vectors (dragoon/komodo!122 ) · 47ba0de8

STEVAN Antoine authored 10 months ago

in `./bins/inbreeding/`, this MR
- refactors the "list item drawing" from `environment.rs` and `strategy.rs` into the `draw_unique_elements` function of the new `random.rs` module
- uses a `HashSet` to draw unique indices into the slice of "things" to draw from, and then extracts the items at these indices
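
the idea of drawing unique indices instead of shuffling the whole vector can be sketched as follows. This is only a minimal illustration, not the code of the MR: the `XorShift64` generator is a hypothetical stand-in for a real seeded PRNG, and the signature of `draw_unique_elements` is assumed.

```rust
use std::collections::HashSet;

/// Tiny xorshift64* generator, a hypothetical stand-in for a real seeded PRNG.
struct XorShift64(u64);

impl XorShift64 {
    fn from_seed(seed: u8) -> Self {
        // mix the seed so that a seed of `0` does not give the all-zero state
        Self((seed as u64).wrapping_add(1).wrapping_mul(0x9E37_79B9_7F4A_7C15))
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x >> 12;
        x ^= x << 25;
        x ^= x >> 27;
        self.0 = x;
        x.wrapping_mul(0x2545_F491_4F6C_DD1D)
    }
}

/// Draws at most `n` distinct items from `things` without shuffling the slice.
/// (assumed signature, the real function may differ)
fn draw_unique_elements<T: Clone>(things: &[T], n: usize, rng: &mut XorShift64) -> Vec<T> {
    // never ask for more items than there are, otherwise the loop would not end
    let n = n.min(things.len());

    let mut seen = HashSet::with_capacity(n);
    let mut drawn = Vec::with_capacity(n);
    while drawn.len() < n {
        // the modulo introduces a slight bias, which is ignored in this sketch
        let i = (rng.next_u64() % things.len() as u64) as usize;
        // `insert` returns `false` when the index was already drawn
        if seen.insert(i) {
            // items are pushed in draw order, so a fixed seed gives a fixed output
            drawn.push(things[i].clone());
        }
    }
    drawn
}
```

compared to shuffling, only the `n` drawn items are cloned. The loop can retry many times when `n` gets close to `things.len()`, so this sketch is best suited to drawing a few items from a larger slice.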

## results
```bash
use ./bins/inbreeding
use std bench

const PRNG_SEED = 0
const OPTS = {
    nb_bytes: (10 * 1_024),
    k: 10,
    n: 20,
    nb_measurements: 100,
    nb_scenarii: 10,
    measurement_schedule: 1,
    measurement_schedule_start: 2_000,
    max_t: 2_000,
    strategies: [ "single:5" ],
    environment: null,
}

def run [rev: string] {
    git co $rev

    inbreeding build

    let a = bench --rounds 5 {
        inbreeding run --options ($OPTS | update environment "fixed:0") --prng-seed $PRNG_SEED
    }
    let b = bench --rounds 5 {
        inbreeding run --options ($OPTS | update environment "fixed:1") --prng-seed $PRNG_SEED
    }

    {
        0: $a,
        1: $b,
    }
}

let main = run a29b511d
let mr = run fix-shuffle
```
```bash
let table = [
    [env, main, mr, improvement];

    ["fixed:0", $main."0".mean, $mr."0".mean, (($main."0".mean - $mr."0".mean) / $main."0".mean * 100)],
    ["fixed:1", $main."1".mean, $mr."1".mean, (($main."1".mean - $mr."1".mean) / $main."1".mean * 100)],
]

$table | to md --pretty
```

| env     | main                    | mr                      | improvement (%)    |
| ------- | ----------------------- | ----------------------- | ------------------ |
| fixed:0 | 8sec 504ms 794µs 784ns | 6sec 353ms 206µs 645ns | 25.298530930431912 |
| fixed:1 | 727ms 648µs 292ns      | 639ms 443µs 782ns      | 12.12186037811795  |

the improvement is modest, about 25% on `fixed:0` and 12% on `fixed:1`, but the code is cleaner either way 🙏
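
as a check of the `improvement` column, which is the relative reduction of the mean run time in percent, the `fixed:0` row can be recomputed by hand. The symbols $t_\text{main}$ and $t_\text{mr}$ are notation introduced here for the two means, in seconds:

$$\frac{t_\text{main} - t_\text{mr}}{t_\text{main}} \times 100 = \frac{8.504794784 - 6.353206645}{8.504794784} \times 100 \approx 25.30$$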

May 28, 2024

split `examples/` into `benchmarks/` and `bins/` (dragoon/komodo!117 ) · bb626120

STEVAN Antoine authored 10 months ago

## new structure for the repository
- benchmarks are in `./benchmarks/` and can be run with either `cargo run --package benchmarks --bin <bench>` or the commands in `./benchmarks/README.md`
```
├── Cargo.toml
├── README.md
└── src
    └── bin
        ├── commit.rs
        ├── fec.rs
        ├── linalg.rs
        ├── operations
        │   ├── curve_group.rs
        │   └── field.rs
        ├── recoding.rs
        ├── setup.rs
        └── setup_size.rs
```

- examples are now in `./bins/` as standalone binaries and can be run either with `cargo run --package <pkg>` or with the help of the `cargo bin` command from `.nushell/cargo.nu`
```
├── curves
│   ├── Cargo.toml
│   ├── README.md
│   └── src
│       └── main.rs
├── inbreeding
│   ├── build.nu
│   ├── Cargo.toml
│   ├── consts.nu
│   ├── mod.nu
│   ├── plot.nu
│   ├── README.md
│   ├── run.nu
│   └── src
│       ├── environment.rs
│       ├── main.rs
│       └── strategy.rs
├── rank
│   ├── Cargo.toml
│   └── src
│       └── main.rs
└── rng
    ├── Cargo.toml
    └── src
        └── main.rs
```

- Nushell modules are now located in `./.nushell/`

## changelog
apart from the changes to the general structure of the repo:
- `binary.nu` -> `.nushell/binary.nu`
- new `cargo bin` command from `.nushell/cargo.nu`
- `error throw` is now defined in `.nushell/error.nu`
- main TOML has been greatly simplified because the dependencies of "examples" have been moved to the associated crates
- the rest is basically the same but in the new structure

May 27, 2024

Rand (dragoon/komodo!113 ) · 2b2783f3

STEVAN Antoine authored 10 months ago

- add `--prng-seed: u8` to fix the random number generator seed

## example
by running the following snippet, we get
- `first.123.png` and `second.123.png` with `--prng-seed 123` which are the same
- `first.111.png` and `second.111.png` with `--prng-seed 111` which are the same
- `first.111.png` and `first.123.png` are different

```bash
use ./scripts/inbreeding

const OPTS = {
    nb_bytes: (10 * 1_024),
    k: 10,
    n: 20,
    nb_scenarii: 10,
    nb_measurements: 10,
    measurement_schedule: 1,
    measurement_schedule_start: 0,
    max_t: 50,
    strategies: [
        "single:1",
        "double:0.5:1:2",
        "single:2",
        "double:0.5:2:3",
        "single:3",
        "single:5",
        "single:10",
    ],
    environment: "random-fixed:0.5:1",
}

inbreeding build

inbreeding run --options $OPTS --prng-seed 123 --output /tmp/first.123.nuon
inbreeding plot /tmp/first.123.nuon --options { k: $OPTS.k } --save /tmp/first.123.png

inbreeding run --options $OPTS --prng-seed 123 --output /tmp/second.123.nuon
inbreeding plot /tmp/second.123.nuon --options { k: $OPTS.k } --save /tmp/second.123.png

inbreeding run --options $OPTS --prng-seed 111 --output /tmp/first.111.nuon
inbreeding plot /tmp/first.111.nuon --options { k: $OPTS.k } --save /tmp/first.111.png

inbreeding run --options $OPTS --prng-seed 111 --output /tmp/second.111.nuon
inbreeding plot /tmp/second.111.nuon --options { k: $OPTS.k } --save /tmp/second.111.png
```

| seed | first | second |
| ---- | ----- | ------ |
| 123  | ![first.123](/uploads/6b09bf94ca7019e200a47e1a53adc533/first.123.png) | ![second.123](/uploads/fa77c1be84d6279f71cbfe3064c83242/second.123.png) |
| 111  | ![first.111](/uploads/bd31a0832825ecbef48178b2e8689a6f/first.111.png) | ![second.111](/uploads/e1c00e0a6765e82388a3c7142847bbab/second.111.png) |

add timestamp to measurements and delay start (dragoon/komodo!111 ) · 1f16e7e2

STEVAN Antoine authored 10 months ago

- add a timestamp to all the measurements of the _diversity_ from `inbreeding/mod.rs`
- allow delaying the start of the measurements with `--measurement-schedule-start`, to help complete already existing measurements

> ❗ **Important**  
> existing measurement files will have to change shape from
> ```
> table<strategy: string, diversity: list<float>>
> ```
> to
> ```
> table<strategy: string, diversity: table<t: int, diversity: float>>
> ```

fix decoding on empty shards and when less than $k$ (dragoon/komodo!110 ) · d039128d

STEVAN Antoine authored 10 months ago

this MR makes sure that
- the "inbreeding" experiment quits when there are fewer than $k$ shards
- `fec::decode` returns `KomodoError::TooFewShards` when no shards are provided
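
the second point can be illustrated with a small guard. This is a sketch under assumptions, not the actual `fec::decode`: the `KomodoError` enum below is a local stand-in whose fields are invented, `Shard` is reduced to a hypothetical `k` field, and `check_enough_shards` is a made-up helper name.

```rust
/// Local stand-in for the error type: the fields of the real
/// `KomodoError::TooFewShards` variant may differ.
#[derive(Debug, PartialEq)]
enum KomodoError {
    TooFewShards { given: usize },
}

/// Hypothetical shard that only carries the code parameter `k`.
struct Shard {
    k: usize,
}

/// Returns `k` when there are enough shards to attempt a decoding.
fn check_enough_shards(shards: &[Shard]) -> Result<usize, KomodoError> {
    let too_few = KomodoError::TooFewShards { given: shards.len() };

    // `k` is read from the shards: with no shard at all there is nothing to
    // read it from, so fail early instead of indexing into an empty slice
    let k = match shards.first() {
        Some(shard) => shard.k,
        None => return Err(too_few),
    };

    if shards.len() < k {
        return Err(too_few);
    }
    Ok(k)
}
```

using `first()` rather than `shards[0]` turns the empty case into a regular error instead of a panic.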

May 24, 2024

make "inbreeding" scripts a module (dragoon/komodo!109) · d594d917

STEVAN Antoine authored 11 months ago

just a small QoL improvement

measure "inbreeding" for multiple recoding scenarii (dragoon/komodo!108 ) · 11cd2cdc

STEVAN Antoine authored 11 months ago

this MR is two-fold
- refactor `run.nu` and `plot.nu` from `scripts/inbreeding/` into Nushell modules with `--options` as argument instead of `options.nu` (a7cebb95, 6b72191f and 5f1c4963)
- introduce another level of depth to the measurements (a0e52e95)

> 💡 **Note**  
> in the table below
> - $s$ is the number of recoding scenarii averaged together
> - $m$ is the number of measurements per point
> - two iterations of the same experiment are shown side by side for comparison

$s$ | $m$ | iteration 1 | iteration 2
:--:|:----:|:-------------------------:|:-------------------------:
1   | 10   | ![inbreeding_1_10.1](/uploads/c593393edb3513c9d77b0fe134c27fd7/inbreeding_1_10.1.png) | ![inbreeding_1_10.2](/uploads/97c85b36833112de51a2b756ade53479/inbreeding_1_10.2.png)
1   | 100  | ![inbreeding_1_100.1](/uploads/af4da1d7cf76ef43fb39c2a3a529b7cd/inbreeding_1_100.1.png) | ![inbreeding_1_100.2](/uploads/e187298709d524437dea503be6ac555f/inbreeding_1_100.2.png)
1   | 1000 | ![inbreeding_1_1000.1](/uploads/394821777baeff9fec589440ba4c554c/inbreeding_1_1000.1.png) | ![inbreeding_1_1000.2](/uploads/1d592b791075f204f8a0ebdd739403dd/inbreeding_1_1000.2.png)
10  | 100  | ![inbreeding_10_100.1](/uploads/3c822e7669e9f0b4d97919e5a3bd4bca/inbreeding_10_100.1.png) | ![inbreeding_10_100.2](/uploads/aa6b54ec64f82ca386dae4e262dcd0b6/inbreeding_10_100.2.png)
100 | 10   | ![inbreeding_100_10.1](/uploads/7ef383143d981717b0ad01dee0359eb0/inbreeding_100_10.1.png) | ![inbreeding_100_10.2](/uploads/ce71a32a7f6e9ba3c4dae563aeefd856/inbreeding_100_10.2.png)
100 | 100  | ![inbreeding_100_100.1](/uploads/e2038273051f959d8be69fef9ba7a493/inbreeding_100_100.1.png) | ![inbreeding_100_100.2](/uploads/0ef30735597e1a6812484d1cac4d34ca/inbreeding_100_100.2.png)

we can see that
- the smaller the $s$, the more the two figures on each line differ -> this is likely because, with a single recoding scenario, repeating the same experiment gives very different results and measurements. Running the same experiment $s$ times and averaging helps reduce the variance along this axis
- the smaller the $m$, the noisier the measurements of each point -> this is simply because, when $m$ is small, the variance of the empirical mean measured for each point is higher
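
the second observation follows from the usual variance of an empirical mean. As a sketch, assuming the $m$ measurements $X_1, \dots, X_m$ of a point are independent and share the same variance $\sigma^2$ (the symbols $X_i$ and $\sigma^2$ are notation introduced here, not taken from the MR):

$$\mathrm{Var}\left(\frac{1}{m} \sum_{i = 1}^{m} X_i\right) = \frac{1}{m^2} \sum_{i = 1}^{m} \mathrm{Var}(X_i) = \frac{\sigma^2}{m}$$

under these assumptions, going from $m = 10$ to $m = 1000$ divides the standard deviation of each point by $10$.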

## final results
![inbreeding](/uploads/e561f4e4acad8eedbb3ccf1a4666c302/inbreeding.png)
![inbreeding_100_100_1](/uploads/37ab0eacb5159137595579dfbb20250c/inbreeding_100_100_1.png)

May 23, 2024

Refactor plot commands (dragoon/komodo!104 ) · 0f43be24

STEVAN Antoine authored 11 months ago

this MR moves run and plot commands from `examples/benches/README.md` to
- `scripts/setup/`: `run.nu` and `plot.nu`
- `scripts/commit/`: `run.nu` and `plot.nu`
- `scripts/recoding/`: `run.nu` and `plot.nu`
- `scripts/fec/`: `run.nu` and `plot.nu`
- `scripts/inbreeding/`: `build.nu`, `run.nu` and `plot.nu`

to generate all the figures at once
```bash
use scripts/setup/run.nu; seq 0 13 | each { 2 ** $in } | run --output data/setup.ndjson
use ./scripts/setup/plot.nu; plot data/setup.ndjson --save ~/setup.pdf

use scripts/commit/run.nu; seq 0 13 | each { 2 ** $in } | run --output data/commit.ndjson
use ./scripts/commit/plot.nu; plot data/commit.ndjson --save ~/commit.pdf

use scripts/recoding/run.nu; seq 0 18 | each { 512 * 2 ** $in } | run --ks [2, 4, 8, 16] --output data/recoding.ndjson
use ./scripts/recoding/plot.nu; plot data/recoding.ndjson --save ~/recoding.pdf

use scripts/fec/run.nu; seq 0 18 | each { 512 * 2 ** $in } | run --ks [2, 4, 8, 16] --output data/fec.ndjson
use ./scripts/fec/plot.nu; plot encoding data/fec.ndjson --save ~/encoding.pdf
use ./scripts/fec/plot.nu; plot decoding data/fec.ndjson --save ~/decoding.pdf
use ./scripts/fec/plot.nu; plot e2e data/fec.ndjson --save ~/e2e.pdf

use ./scripts/fec/plot.nu; plot combined data/fec.ndjson --recoding data/recoding.ndjson --save ~/comparison.pdf
use ./scripts/fec/plot.nu; plot ratio data/fec.ndjson --recoding data/recoding.ndjson --save ~/ratio.pdf

./scripts/inbreeding/build.nu
./scripts/inbreeding/run.nu --output data/inbreeding.nuon
./scripts/inbreeding/plot.nu data/inbreeding.nuon --save ~/inbreeding.pdf
```

> 💡 **Note**  
> this took around 27min 18sec in total on my machine with 14min 45sec for the inbreeding section only and 12min 33sec for the rest

define more complex inbreeding strategies (dragoon/komodo!103 ) · 61a2320e

STEVAN Antoine authored 11 months ago

this MR:
- refactors the "inbreeding" example into `examples/inbreeding/`
- adds `--strategy` and `--environment`
  - `Strategy::draw` will draw the number of shards to keep for recoding
  - `Environment::update` will update the pool of shards by losing some of them
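
the split between the two roles could look like the sketch below. Everything in it is an assumption made for illustration: the variants mirror the `single:<n>`, `double:<p>:<n>:<m>` and `fixed:<n>` strings seen in the other MRs, but their meaning is guessed, and the real `Strategy::draw` and `Environment::update` most likely take a random number generator instead of the uniform sample `u` used here.

```rust
/// Guessed reading of the strategy strings: `Single` always keeps `n` shards,
/// `Double` keeps `n` shards with probability `p` and `m` shards otherwise.
enum Strategy {
    Single { n: usize },
    Double { p: f64, n: usize, m: usize },
}

impl Strategy {
    /// Draws the number of shards to keep for recoding, given a sample `u`
    /// assumed to be uniform in `[0, 1)`.
    fn draw(&self, u: f64) -> usize {
        match *self {
            Strategy::Single { n } => n,
            Strategy::Double { p, n, m } => {
                if u < p {
                    n
                } else {
                    m
                }
            }
        }
    }
}

/// Guessed reading of `fixed:<n>`: `n` shards are lost at each update.
enum Environment {
    Fixed { n: usize },
}

impl Environment {
    /// Updates the pool of shards by losing some of them.
    /// This sketch drops the last `n` shards, a real environment would
    /// presumably pick the lost shards at random.
    fn update<T>(&self, shards: &mut Vec<T>) {
        match *self {
            Environment::Fixed { n } => {
                let kept = shards.len().saturating_sub(n);
                shards.truncate(kept);
            }
        }
    }
}
```

keeping the randomness outside of `draw` makes the sketch deterministic, which is convenient when checking it by hand.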

May 21, 2024

run examples in release mode (dragoon/komodo!101 ) · be3b6b93

STEVAN Antoine authored 11 months ago

- update `benches/README.md` to use `cargo run --release --example ...`
- add `build-examples` to `Makefile` to build all examples in release

### minor change
add two `eprintln!` in `inbreeding.rs` to show the experiment parameters

better plots (dragoon/komodo!102 ) · 9c3cd0a7

STEVAN Antoine authored 11 months ago

- new `scripts/plot.nu` with common tools and options
- better sets of parameters
- better commands in `benches/README.md`

fix and improve inbreeding example (dragoon/komodo!100 ) · c9f4b174

STEVAN Antoine authored 11 months ago

* 240be2a6 add `--measurement-schedule` to inbreeding
* ace8b364 include max_t in the range
* a3345a88 shuffle the shards before recoding
* 6995c103 use `fec::recode_random`
* 2e729ab2 add second level of progress bar
* 84bef553 update the README
* 42ec5760 make Clippy happy

May 13, 2024

make figures better (dragoon/komodo!98 ) · b4e53ac6

STEVAN Antoine authored 11 months ago

this MR makes the plots a bit nicer.

## new figures
![setup](/uploads/e6a7ac4e7460d8ff7015906216f9d30b/setup.png)
![commit](/uploads/a40700913594771cefeba45a44b2370b/commit.png)
![recoding](/uploads/1f3e86763c897dcf034e4b01ba858ada/recoding.png)
![decoding](/uploads/dd703ba4af59b4043ae5bb966f8b55ae/decoding.png)
![encoding](/uploads/6dbc077bbfe8086357074a0e18c8b530/encoding.png)
![e2e](/uploads/9eb92dbb8dc013ef803bde70f3e04f02/e2e.png)
![inbreeding](/uploads/9f7dac6a48c24e97448e92a69a506d2f/inbreeding.png)

May 02, 2024

add an example to study the _recoding inbreeding_ phenomenon (dragoon/komodo!97 ) · 7d5fca82

STEVAN Antoine authored 11 months ago

this MR adds `examples/inbreeding.rs`, which allows two things
- _naive recoding_: in order to generate a new random shard, we first $k$-decode the whole data and then $1$-encode a single shard
- _true recoding_: to achieve the same goal, we directly $k$-recode shards into a new one

## the scenario
regardless of the _recoding strategy_, the scenario is the same
1. data is split into $k$ shards and $n$ original shards are generated
2. for a given number of steps $s$, $k$ shards are drawn randomly with replacement and we count the number of successful decodings, which gives a measure of the _diversity_, $$\delta = \frac{\#\text{successes}}{\#\text{attempts}}$$
3. create a new _recoded shard_ and add it to the $n$ previous ones, i.e. $n$ increases by one
4. repeat steps 2. and 3. as long as you want
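
the diversity measurement of step 2 boils down to a success ratio. Below is a minimal sketch, where the `diversity` function and its `try_decode` closure are hypothetical names: the closure is assumed to draw $k$ shards and report whether decoding them succeeded.

```rust
/// Estimates the diversity as the fraction of successful decoding attempts.
/// `try_decode` is a hypothetical callback that draws `k` shards and returns
/// `true` when they could be decoded.
fn diversity(attempts: usize, mut try_decode: impl FnMut() -> bool) -> f64 {
    // by convention in this sketch, no attempt means a diversity of zero
    if attempts == 0 {
        return 0.0;
    }

    let successes = (0..attempts).filter(|_| try_decode()).count();
    successes as f64 / attempts as f64
}
```

for instance, a closure that fails on every fourth call gives `diversity(8, ...) == 0.75`, i.e. 6 successes out of 8 attempts.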
 
## results
![inbreeding](/uploads/b81614abcae01b7c915435aa87ccaec0/inbreeding.png)
