Commits · 6b9af3df375836cd709f427c05722c46b8fe2c8e · HEME Clement / Komodo

Jul 05, 2024

add `.env.nu` to load Nushell modules (dragoon/komodo!154 ) · 6b9af3df

STEVAN Antoine authored 9 months ago

adds a `.env.nu` file to load Nushell modules automatically thanks to the `nuenv` hook from the `nu-hooks` package.

6b9af3df

improve setup and commit plots (dragoon/komodo!152 ) · 4bd3943c

STEVAN Antoine authored 9 months ago

- show the log of the degree for high values
- don't show the "time" Y label because the units are already in the time values
- don't rotate the X tick labels

## results
![setup](/uploads/75dfd2033801a88e446e0ce8cae24167/setup.png)
![commit](/uploads/cca5bdfabb8f4242d21bf68079a5566a/commit.png)

4bd3943c

Jun 07, 2024

bump to 0.3.0 (dragoon/komodo!151 ) · 414c94fa

STEVAN Antoine authored 9 months ago

0.3.0

414c94fa

move inbreeding to `dragoon/nc-diversity` (dragoon/komodo!150) · 84d14a15

STEVAN Antoine authored 9 months ago

see [`dragoon/nc-diversity`](https://gitlab.isae-supaero.fr/dragoon/nc-diversity)

84d14a15

move "inbreeding" snippets from README to real scripts (dragoon/komodo!148 ) · ff5752a1

STEVAN Antoine authored 10 months ago

ff5752a1

add "long full recoding" test (dragoon/komodo!147 ) · 447e4473

STEVAN Antoine authored 10 months ago

this test will
- recode for $\#steps \in \{10, 20, 100\}$
- at $t = 0$, select $k$ shards at random among the $n$ encoded ones
- at $t \geq 1$, use all $k$ shards to recode $k$ brand new shards
- make sure the last set of $k$ shards, recoded $\#steps$ times, can decode the data

## example with $(k, n) = (3, 5)$ and $\#steps = 3$
- $(s_i)_{1 \leq i \leq k}$ are the $k$ source shards
- $(e_j)_{1 \leq j \leq n}$ are the $n$ encoded shards
- $(m_i)_{1 \leq i \leq k}$ are the $k$ randomly selected shards
- $(n_i)_{1 \leq i \leq k}$ are the shards after step $1$
- $(o_i)_{1 \leq i \leq k}$ are the shards after step $2$
- $(p_i)_{1 \leq i \leq k}$ are the shards after step $3$
- the $(p_i)_{1 \leq i \leq k}$ will be used for decoding

```mermaid
graph TD;

    s1 --> e1; s1 --> e2; s1 --> e3; s1 --> e4; s1 --> e5;
    s2 --> e1; s2 --> e2; s2 --> e3; s2 --> e4; s2 --> e5;
    s3 --> e1; s3 --> e2; s3 --> e3; s3 --> e4; s3 --> e5;

    e1 --> m1;
    e3 --> m2;
    e4 --> m3;

    m1 --> n1; m1 --> n2; m1 --> n3;
    m2 --> n1; m2 --> n2; m2 --> n3;
    m3 --> n1; m3 --> n2; m3 --> n3;

    n1 --> o1; n1 --> o2; n1 --> o3;
    n2 --> o1; n2 --> o2; n2 --> o3;
    n3 --> o1; n3 --> o2; n3 --> o3;

    o1 --> p1; o1 --> p2; o1 --> p3;
    o2 --> p1; o2 --> p2; o2 --> p3;
    o3 --> p1; o3 --> p2; o3 --> p3;
```
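
as a rough sketch of the chain above (not the actual test code), each shard can be modelled by its coefficients with respect to the $k$ source shards: recoding is then a matrix product and "can decode" means "the coefficient matrix has rank $k$". the toy prime field `P = 257`, the names `recode`, `rank` and `still_decodable` and the fixed mixing matrix are all assumptions made for illustration, the real test draws random coefficients in the field of the curve.

```rust
/// Prime modulus of a toy field GF(P), a stand-in for the real field (assumption).
const P: u64 = 257;

/// A shard is represented only by its coefficients w.r.t. the `k` source shards.
type Coeffs = Vec<u64>;

/// Recodes `shards` (non-empty) into new ones: row `i` of `mix` holds the weights of new shard `i`.
fn recode(shards: &[Coeffs], mix: &[Vec<u64>]) -> Vec<Coeffs> {
    mix.iter()
        .map(|weights| {
            let mut out = vec![0; shards[0].len()];
            for (w, shard) in weights.iter().zip(shards) {
                for (o, c) in out.iter_mut().zip(shard) {
                    *o = (*o + w * c) % P;
                }
            }
            out
        })
        .collect()
}

/// Modular exponentiation, used to invert a pivot with Fermat's little theorem.
fn pow_mod(mut base: u64, mut exp: u64) -> u64 {
    let mut acc = 1;
    base %= P;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = acc * base % P;
        }
        base = base * base % P;
        exp >>= 1;
    }
    acc
}

/// Rank of the coefficient matrix over GF(P), by Gauss-Jordan elimination.
fn rank(shards: &[Coeffs]) -> usize {
    let mut m: Vec<Coeffs> = shards.to_vec();
    let cols = m.first().map_or(0, |row| row.len());
    let mut pivots = 0;
    for col in 0..cols {
        // look for a row, not used as a pivot yet, with a non-zero entry in this column
        let pivot = match (pivots..m.len()).find(|&r| m[r][col] != 0) {
            Some(pivot) => pivot,
            None => continue,
        };
        m.swap(pivots, pivot);
        let inv = pow_mod(m[pivots][col], P - 2);
        for r in 0..m.len() {
            if r != pivots && m[r][col] != 0 {
                let factor = m[r][col] * inv % P;
                for c in 0..cols {
                    m[r][c] = (m[r][c] + P - factor * m[pivots][c] % P) % P;
                }
            }
        }
        pivots += 1;
    }
    pivots
}

/// Recodes `steps` times in a row and tells whether the last `k` shards can still decode.
fn still_decodable(selected: Vec<Coeffs>, mix: &[Vec<u64>], steps: usize) -> bool {
    let k = selected.len();
    let last = (0..steps).fold(selected, |shards, _| recode(&shards, mix));
    rank(&last) == k
}
```

with $k$ independent shards at $t = 0$, the last set stays decodable as long as every mixing matrix is invertible, whatever the number of steps, whereas a single singular mixing matrix loses information for good.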

447e4473

working on FEC tests again (dragoon/komodo!146 ) · 0e229c28

STEVAN Antoine authored 10 months ago

- pass `n` to `try_all_decoding_combinations` and don't try to decode when shards have been recoded ($\#shards > n$) and there are no recoded shards in the $k$ combination under review ($\max(is) < n$)
- pass `recoding_steps` and `should_not_be_decodable` as arguments to `end_to_end_with_recoding_template`
- fix $n = 5$ => this leads to tests that run in less than 10sec again
- add $(k, n) = (8, 10)$ => tests still run in less than 13sec
- split recoding scenarii into "_simple_" and "_chain_"
- show indices in a "_pretty_" format, i.e. showing indices greater than $n$ as `(n)`, `(n + 1)`, ...
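
a minimal sketch of the skip rule and of the "pretty" indices described above, assuming shards are identified by their index, with indices $\geq n$ being recoded shards. `combinations`, `should_skip` and `pretty` are hypothetical helpers, not the functions of the test suite, and the output format of `pretty` is a guess.

```rust
/// Returns all the sorted `k`-combinations of the indices `0..total`.
fn combinations(total: usize, k: usize) -> Vec<Vec<usize>> {
    fn go(start: usize, total: usize, k: usize, current: &mut Vec<usize>, out: &mut Vec<Vec<usize>>) {
        if current.len() == k {
            out.push(current.clone());
            return;
        }
        for i in start..total {
            current.push(i);
            go(i + 1, total, k, current, out);
            current.pop();
        }
    }
    let mut out = Vec::new();
    go(0, total, k, &mut Vec::new(), &mut out);
    out
}

/// Skip rule: shards have been recoded (`total > n`) and the combination `is`
/// contains no recoded shard (`max(is) < n`).
fn should_skip(is: &[usize], n: usize, total: usize) -> bool {
    total > n && is.iter().max().map_or(true, |&max| max < n)
}

/// Shows the indices of recoded shards as `(n)`, `(n + 1)`, ...
fn pretty(is: &[usize], n: usize) -> String {
    let items: Vec<String> = is
        .iter()
        .map(|&i| match i.checked_sub(n) {
            Some(0) => "(n)".to_string(),
            Some(d) => format!("(n + {d})"),
            None => i.to_string(),
        })
        .collect();
    format!("[{}]", items.join(", "))
}
```

for $(k, n) = (3, 5)$ and one recoded shard, `combinations(6, 3)` yields 20 candidates, 10 of which only contain original shards and would be skipped.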

0e229c28

Jun 06, 2024

complete the FEC and "linear algebra" tests (dragoon/komodo!145 ) · 7b0eae24

STEVAN Antoine authored 10 months ago

- `komodo::linalg::Matrix::random` is tested
- `komodo::linalg::Matrix::inverse` is tested on more matrix sizes, from $1$ to $20$ random matrices
- `komodo::field` tests have been double-checked
- pure "recoding" tests from `komodo::fec` have been double-checked
- `end_to_end` and `end_to_end_with_recoding` now run for $k \in [3, 5]$ and $\rho \in [\frac{1}{2}, \frac{1}{3}]$ with $n = \lfloor \frac{k}{\rho} \rfloor$
- all "_$k$ among $n + t$_" combinations are tested with `try_all_decoding_combinations`, possibly with some removals in case recoding is involved with `is_inside`
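
as a small worked example of $n = \lfloor \frac{k}{\rho} \rfloor$, here is a sketch that writes the code rate as a fraction $\rho = \frac{num}{den}$ to stay in integer arithmetic. `nb_encoded_shards` is a hypothetical helper, not part of `komodo`.

```rust
/// Number of encoded shards for `k` source shards and a code rate `rho = num / den`,
/// i.e. `n = floor(k / rho) = floor(k * den / num)`, computed with integers only.
fn nb_encoded_shards(k: usize, num: usize, den: usize) -> usize {
    assert!(num > 0 && num <= den, "the code rate should be in (0, 1]");
    // integer division truncates, which is the floor for non-negative values
    k * den / num
}
```

with $k = 3$ this gives $n = 6$ for $\rho = \frac{1}{2}$ and $n = 9$ for $\rho = \frac{1}{3}$, and with $k = 5$ it gives $n = 10$ and $n = 15$ respectively.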

> ❗ **Important**  
> on my machine, `make test` goes from less than 8sec on latest `main` to around 40sec with this MR

7b0eae24

remove `rng` and `curves` from `bins/` (dragoon/komodo!144) · e55fc269

STEVAN Antoine authored 10 months ago

they have been moved to [dragoon/binaries](https://gitlab.isae-supaero.fr/dragoon/binaries).

e55fc269

Jun 05, 2024

remove "end to end" test case from "inbreeding" (dragoon/komodo!142) · 136df509

STEVAN Antoine authored 10 months ago

this is the trivial case where it is always possible to decode the original data, so there is no need to have it here.

136df509

Jun 03, 2024

refactor "inbreeding" Nushell modules (dragoon/komodo!141) · 92520b53

STEVAN Antoine authored 10 months ago

this MR
- moves the Nushell modules from `bins/inbreeding/` to `bins/inbreeding/src/.nushell/`
- creates a `NUSHELL` constant in `consts.nu` to allow the following more robust construct
```bash
use consts.nu
use $consts.NUSHELL ...
```
- updates the README

92520b53

add snippet to plot all "inbreeding" experiments (dragoon/komodo!139 ) · 902e5b4c

STEVAN Antoine authored 10 months ago

902e5b4c

refactor parsing in "inbreeding" (dragoon/komodo!140) · 0d615007

STEVAN Antoine authored 10 months ago

this simply makes "parsing" in the inbreeding modules more robust and centralized.

0d615007

return integer columns from `inbreeding inspect` (dragoon/komodo!138 ) · dae90f4c

STEVAN Antoine authored 10 months ago

dae90f4c

May 31, 2024

make `inbreeding watch` better and more robust (dragoon/komodo!136) · e841c66b

STEVAN Antoine authored 10 months ago

- early return when format is bad twice
- refactor

e841c66b

add `inbreeding list` (dragoon/komodo!135 ) · 3f4c8ddb

STEVAN Antoine authored 10 months ago

wait for
- !134 

## description
assuming !134 has been merged, this MR allows running something like
```bash
inbreeding list | input list --fuzzy | inbreeding load $in | inbreeding plot
```
which will fuzzy find among the available experiments and then plot the selected one after loading it 👌

3f4c8ddb

add information about experiment to `inbreeding load` (dragoon/komodo!134 ) · 142e673e

STEVAN Antoine authored 10 months ago

this can be used by `plot` without passing extra arguments, i.e. the pipeline becomes
```bash
use bins/inbreeding

let experiment = "..."
```
```bash
inbreeding load $experiment | inbreeding plot
```
instead of having to know the value of `k` and do
```bash
inbreeding load $experiment | inbreeding plot --options { k: $k }
```

142e673e

branch out RNG when measuring (dragoon/komodo!130 ) · a3a5f2e2

STEVAN Antoine authored 10 months ago

- addresses #9 
- needed !133 to work

this MR simply adds `+ Clone` to `rng` and removes the `mut` from the `rng` of the `measure_inbreeding` function.

running the same snippet of code from #9 yields the following two images with a schedule of $1$ and $5$ respectively

![000_1.1](/uploads/ace74e299c6d4374df713ab5004f3c1d/000_1.1.png)
![000_1.2](/uploads/c784f5b62c41004971fd3048f7b5babc/000_1.2.png)

we see that all measurements on $t$ where $t \equiv 0 \pmod 5$ are the same in both images 🎉
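
the idea of branching out the RNG can be sketched as follows: the measurement takes the RNG by shared reference and only consumes randomness from a clone, so measuring more or less often never changes the state seen by the experiment. the `Rng` below is a tiny SplitMix64 generator used as a stand-in for the real seeded RNG, and `measure` and `run` are hypothetical, much simpler than `measure_inbreeding`.

```rust
/// A tiny deterministic PRNG (SplitMix64), standing in for the real seeded RNG.
#[derive(Clone)]
struct Rng(u64);

impl Rng {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// A measurement consumes randomness, but only from its own branch of the RNG:
/// `rng` is not `mut` and is cloned, so the state of the caller is left intact.
fn measure(rng: &Rng, nb_draws: usize) -> u64 {
    let mut branch = rng.clone();
    (0..nb_draws).fold(0, |acc, _| acc ^ branch.next_u64())
}

/// Runs steps `0..=max_t` and measures every `schedule >= 1` steps,
/// returning the `(t, measurement)` pairs.
fn run(seed: u64, max_t: usize, schedule: usize) -> Vec<(usize, u64)> {
    let mut rng = Rng(seed);
    let mut measurements = Vec::new();
    for t in 0..=max_t {
        if t % schedule == 0 {
            measurements.push((t, measure(&rng, 10)));
        }
        rng.next_u64(); // the experiment itself is the only thing advancing the main RNG
    }
    measurements
}
```

in this sketch, the measurements of `run(seed, max_t, 5)` are exactly those of `run(seed, max_t, 1)` taken at the multiples of $5$.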

a3a5f2e2

May 30, 2024

fix experiment hash by not hashing _measurement schedule_ (dragoon/komodo!133 ) · 3b26bad1

STEVAN Antoine authored 10 months ago

related to
- #9

in order for the measurements not to influence the experiment, the seeds passed to the runs need to not include the _measurement schedule_ parameters!

EDIT: in the end, it's more than that, we want to only include things related to the environment in its hash, nothing related to the measurements, i.e. we want to either
- exclude `strategies`, `nb_scenarii`, `measurement_schedule`, `measurement_schedule_start`, `nb_measurements` and `max_t`
- include `nb_bytes`, `k`, `n` and `environment`
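
a minimal sketch of such a hash, assuming a flat `Experiment` struct and using the `DefaultHasher` from the standard library as a stand-in for whatever hash function `inbreeding` really uses. only the environment-related fields are fed to the hasher, and the measurement-related fields are deliberately left out.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical, simplified view of the parameters of an experiment.
struct Experiment {
    // related to the environment: included in the hash
    nb_bytes: usize,
    k: usize,
    n: usize,
    environment: String,
    // related to the measurements: excluded from the hash
    measurement_schedule: usize,
    nb_measurements: usize,
    max_t: usize,
}

impl Hash for Experiment {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.nb_bytes.hash(state);
        self.k.hash(state);
        self.n.hash(state);
        self.environment.hash(state);
    }
}

/// Identifier of the experiment, independent of how it is measured.
fn experiment_hash(experiment: &Experiment) -> u64 {
    let mut hasher = DefaultHasher::new();
    experiment.hash(&mut hasher);
    hasher.finish()
}
```

two experiments that only differ by their `measurement_schedule` then share the same identifier, and thus the same seeds.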

3b26bad1

add `inbreeding watch` command (dragoon/komodo!131 ) · a58c650a

STEVAN Antoine authored 10 months ago

## output sample
let's say the current seed is `"b239e48345ac457b492cc164f58c010d07292c88e4791e607d91796baec7f334"` and the experiment has ID `fixed:0-single:3-5-50-10240`, and that we are watching both the creation of the first experiment and a few runs, e.g.
- `002d8de28913efbf7dbd111b817ae901fee2d47882ba7aa76d293c2d95d9652c`
- `015672b37c9cf1a6b475937987294f9a503a922ffbcfdfc5d18ef839fac91b8c`
- `080a51a17ac43fcbdf08f77f3bc993983fef589bd9672f04382af4b16dd09b13`

then the output would look like
```
b239e48  fixed:0-single:3-5-50-10240            at 2024-05-30 15:24:12
b239e48  fixed:0-single:3-5-50-10240  002d8de   at 2024-05-30 15:24:14
b239e48  fixed:0-single:3-5-50-10240  015672b   at 2024-05-30 15:24:16
b239e48  fixed:0-single:3-5-50-10240  080a51a   at 2024-05-30 15:24:18
```
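
the layout of these lines can be reproduced with a few lines of Rust, assuming hashes are shortened to their first 7 characters and the run column is padded when there is no run yet. `short` and `watch_line` are hypothetical names, the real command is a Nushell one.

```rust
/// First 7 characters of a hash, like the short hashes of the sample output.
fn short(hash: &str) -> &str {
    hash.get(..7).unwrap_or(hash)
}

/// Formats one line: seed, experiment ID, optional run hash and timestamp.
fn watch_line(seed: &str, experiment: &str, run: Option<&str>, timestamp: &str) -> String {
    // `{:7}` pads the run column to 7 characters so that the timestamps stay aligned
    format!(
        "{}  {}  {:7}   at {}",
        short(seed),
        experiment,
        run.map_or("", |r| short(r)),
        timestamp
    )
}
```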

a58c650a

remove all mentions to "naive" and "true" recoding (dragoon/komodo!132) · 71410d6f

STEVAN Antoine authored 10 months ago

we are switching
- _naive recoding_ to _$(k, 1)$-re-encoding_
- _true recoding_ to _$k$-recoding_

71410d6f

hash the whole experiment and remove timestamp (dragoon/komodo!129 ) · e38dcb17

STEVAN Antoine authored 10 months ago

this is to have a complete identifier for each run, which does not depend at all on the time at which the experiment runs.

e38dcb17

fix random item draw (dragoon/komodo!127 ) · e5311d6c

STEVAN Antoine authored 10 months ago

related to
- !122

## description
!122 introduced a `draw_unique_indices` function which uses a `HashSet` to accumulate unique indices in the range `0..<len`.
however, a `HashSet` does not preserve the order of insertion when iterating over the elements of the set... which results in apparent randomness, even though the RNG seed is the same 😮 

this MR switches back to using `shuffle` which used to work, even though a bit less performant 👌   
it's basically a revert of !122, while keeping the refactoring into `random.rs`.
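
to illustrate the difference: the `HashSet` of the Rust standard library is seeded randomly by default, so its iteration order can change from one run of the program to the next, whereas a shuffle only depends on the RNG state. below is a sketch of a deterministic draw using a partial Fisher-Yates shuffle, which is an assumption and not necessarily the `shuffle` call used in `random.rs`. `Rng` is a toy SplitMix64 generator and `draw_unique` a hypothetical name.

```rust
/// A tiny deterministic PRNG (SplitMix64), standing in for the real seeded RNG.
struct Rng(u64);

impl Rng {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Draws `n` unique items from `things`, in an order that only depends on the RNG state.
fn draw_unique<T: Clone>(things: &[T], n: usize, rng: &mut Rng) -> Vec<T> {
    let mut indices: Vec<usize> = (0..things.len()).collect();
    let n = n.min(indices.len());
    // partial Fisher-Yates shuffle: only the first `n` positions need to be settled
    // (the modulo introduces a slight bias, which is fine for a sketch)
    for i in 0..n {
        let j = i + (rng.next_u64() % (indices.len() - i) as u64) as usize;
        indices.swap(i, j);
    }
    indices[..n].iter().map(|&i| things[i].clone()).collect()
}
```

two calls with the same seed return the same items in the same order, which is what the experiments need to be reproducible.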

## measuring the performance
i did run the same timing experiment from !122 but with `main` on `bb55005f` and the MR on `fix-shuffle`

| env     | main                   | mr                      | improvement         |
| ------- | ---------------------- | ----------------------- | ------------------- |
| fixed:0 | 6sec 244ms 238µs 45ns | 8sec 734ms 929µs 328ns | -39.88783363238997  |
| fixed:1 | 639ms 720µs 39ns      | 731ms 360µs 261ns      | -14.325051024390373 |

we lose a bit of performance

e5311d6c

support proper 32-byte RNG seeds (dragoon/komodo!126 ) · bb55005f

STEVAN Antoine authored 10 months ago

- add optional `$.help` to argument `err` of `error throw`
- parse `prng_seed: [u8; 32]` in `rng` and `inbreeding`
- compute the "_local_" seed by hashing the "_global_" seed, the strategy and the iteration index
- pass `--prng-seed: string`, a 64-char long seed to `inbreeding run`
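
a possible sketch of these two steps, assuming the 64-char seed is hexadecimal. `parse_seed` and `local_seed` are hypothetical names, and the `DefaultHasher` of the standard library is only a stand-in for the hash function that is really used: it outputs 64 bits, so four hashes are concatenated to fill the 32 bytes.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Parses a 64-char hexadecimal string into a 32-byte seed.
fn parse_seed(hex: &str) -> Result<[u8; 32], String> {
    if hex.len() != 64 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return Err(format!("expected 64 hexadecimal characters, found {:?}", hex));
    }
    let mut seed = [0u8; 32];
    for (i, byte) in seed.iter_mut().enumerate() {
        // the input is ASCII only, so slicing two bytes at a time is safe
        *byte = u8::from_str_radix(&hex[2 * i..2 * i + 2], 16).map_err(|e| e.to_string())?;
    }
    Ok(seed)
}

/// Derives a "local" seed from the "global" seed, the strategy and the iteration index.
fn local_seed(global: &[u8; 32], strategy: &str, iteration: usize) -> [u8; 32] {
    let mut seed = [0u8; 32];
    for (chunk_index, chunk) in seed.chunks_mut(8).enumerate() {
        let mut hasher = DefaultHasher::new();
        (global, strategy, iteration, chunk_index).hash(&mut hasher);
        chunk.copy_from_slice(&hasher.finish().to_le_bytes());
    }
    seed
}
```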

bb55005f

May 29, 2024

parse experiment names with regex (dragoon/komodo!128 ) · 3b4d14d1

STEVAN Antoine authored 10 months ago

because the "_environment_" can contain an experiment separator, e.g. `random-fixed`, more powerful `parse` patterns need to be used.
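
to see why a plain split on `-` is not enough, here is a sketch in Rust which swaps the regex for a split from the right: the last fields never contain the separator, so whatever remains on the left is the environment. it assumes names of the form `{env}-{strategy}-{k}-{n}-{nb_bytes}`, without the leading timestamp, and `parse_experiment` is a hypothetical helper, not the Nushell code of the MR.

```rust
#[derive(Debug, PartialEq)]
struct Experiment {
    env: String,
    strategy: String,
    k: usize,
    n: usize,
    nb_bytes: usize,
}

/// Parses `{env}-{strategy}-{k}-{n}-{nb_bytes}` where only `env` may contain a `-`.
fn parse_experiment(name: &str) -> Option<Experiment> {
    // split from the right, at most 5 fields: the last one yielded is the whole environment
    let mut fields = name.rsplitn(5, '-');
    let nb_bytes = fields.next()?.parse::<usize>().ok()?;
    let n = fields.next()?.parse::<usize>().ok()?;
    let k = fields.next()?.parse::<usize>().ok()?;
    let strategy = fields.next()?.to_string();
    let env = fields.next()?.to_string();
    Some(Experiment { env, strategy, k, n, nb_bytes })
}
```

a naive split of `random-fixed:0.5:1-single:3-5-50-10240` on `-` gives 6 fields instead of 5, whereas the function above recovers `random-fixed:0.5:1` as the environment.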

3b4d14d1

add `inbreeding inspect` to look at the cache (dragoon/komodo!125 ) · 0b51ce48

STEVAN Antoine authored 10 months ago

* 7b44a886 refactor constants from `inbreeding load`
* 64e446f7 move `remove-cache-prefix` to new `path.nu`
* 822b03fd add `inbreeding inspect`

0b51ce48

complete experiment dump paths with k, n, ... (dragoon/komodo!124 ) · 85f60a29

STEVAN Antoine authored 10 months ago

in order to add more information in the experiment names, e.g. $k$, $n$ and $\#\text{bytes}$, this MR changes the cache files format from
```
{seed}/{timestamp}/{env}/{strategy}/...
```
to
```
{seed}/{timestamp}-{env}-{strategy}-{k}-{n}-{nb_bytes}/...
```

85f60a29

make `.nushell/` a proper module (dragoon/komodo!123 ) · 45540a7c

STEVAN Antoine authored 10 months ago

this MR turns `./.nushell/` into a directory module by
- adding `mod.nu`
- exporting all the modules

all uses of `.nushell/` have been fixed to not mention `.nu` internal modules anymore.

> 💡 **Note**  
> the `.nushell venv` module has been removed because, when the `$venv.VENV` activation script is not there, Nushell can't parse the whole `.nushell` module. it is very annoying to have to rely on the state of the external filesystem to be able to simply parse a module...

45540a7c

use all x values to compute x limits in `inbreeding plot` (dragoon/komodo!120 ) · f270ac3e

STEVAN Antoine authored 10 months ago

f270ac3e

add tests for the `color.nu` module (dragoon/komodo!121 ) · 4212bb72

STEVAN Antoine authored 10 months ago

this also bumps Nushell to 0.93.0 to include the "extra" command `fmt`, see [Nushell 0.92.0](https://www.nushell.sh/blog/2024-04-02-nushell_0_92_0.html#incorporating-the-extra-feature-by-default-toc)

4212bb72

improve performances by not shuffling vectors (dragoon/komodo!122 ) · 47ba0de8

STEVAN Antoine authored 10 months ago

in `./bins/inbreeding/`, this MR
- refactors the "list item drawing" from `environment.rs` and `strategy.rs` into the `draw_unique_elements` function of the new `random.rs` module
- uses a `HashSet` to draw unique indices in the slice of "things" to draw from and then extracts the items corresponding to these indices

## results
```bash
use ./bins/inbreeding
use std bench

const PRNG_SEED = 0
const OPTS = {
    nb_bytes: (10 * 1_024),
    k: 10,
    n: 20,
    nb_measurements: 100,
    nb_scenarii: 10,
    measurement_schedule: 1,
    measurement_schedule_start: 2_000,
    max_t: 2_000,
    strategies: [ "single:5" ],
    environment: null,
}

def run [rev: string] {
    git checkout $rev

    inbreeding build

    let a = bench --rounds 5 {
        inbreeding run --options ($OPTS | update environment "fixed:0") --prng-seed $PRNG_SEED
    }
    let b = bench --rounds 5 {
        inbreeding run --options ($OPTS | update environment "fixed:1") --prng-seed $PRNG_SEED
    }

    {
        0: $a,
        1: $b,
    }
}

let main = run a29b511d
let mr = run fix-shuffle
```
```bash
let table = [
    [env, main, mr, improvement];

    ["fixed:0", $main."0".mean, $mr."0".mean, (($main."0".mean - $mr."0".mean) / $main."0".mean * 100)],
    ["fixed:1", $main."1".mean, $mr."1".mean, (($main."1".mean - $mr."1".mean) / $main."1".mean * 100)],
]

$table | to md --pretty
```

| env     | main                    | mr                      | improvement        |
| ------- | ----------------------- | ----------------------- | ------------------ |
| fixed:0 | 8sec 504ms 794µs 784ns | 6sec 353ms 206µs 645ns | 25.298530930431912 |
| fixed:1 | 727ms 648µs 292ns      | 639ms 443µs 782ns      | 12.12186037811795  |

the improvement is quite nice, even though not huge, but the code is cleaner anyways 🙏
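
for reference, the "improvement" column is the relative gain in percent, $\frac{main - mr}{main} \times 100$. the same computation can be sketched in Rust with `std::time::Duration`, where `improvement` is a hypothetical helper.

```rust
use std::time::Duration;

/// Relative improvement of `mr` over `main`, in percent (positive means `mr` is faster).
fn improvement(main: Duration, mr: Duration) -> f64 {
    (main.as_secs_f64() - mr.as_secs_f64()) / main.as_secs_f64() * 100.0
}
```

feeding it the mean times of the `fixed:0` row, i.e. `Duration::from_nanos(8_504_794_784)` and `Duration::from_nanos(6_353_206_645)`, gives about 25.3, in line with the table.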

47ba0de8

May 28, 2024

include the standard deviation of "inbreeding" measurements (dragoon/komodo!119 ) · a29b511d

STEVAN Antoine authored 10 months ago

this adds `$.diversity.e` to the output of `inbreeding load` and input of `inbreeding plot` to show error bars in the final plot.

these can be discarded by running something similar to
```bash
inbreeding load $experiment
    | reject diversity.e
    | inbreeding plot --options { k: $OPTS.k }
```

a29b511d

split "inbreeding" load and plot (dragoon/komodo!118 ) · 31d58f4c

STEVAN Antoine authored 10 months ago

plotting "inbreeding" results is now done in two steps
- `load` the data from raw experiment files with `inbreeding load`
- `plot` the results by piping this data into `inbreeding plot`

31d58f4c

split `examples/` into `benchmarks/` and `bins/` (dragoon/komodo!117 ) · bb626120

STEVAN Antoine authored 10 months ago

## new structure for the repository
- benchmarks are in `./benchmarks/` and can be run with either `cargo run --package benchmarks --bin <bench>` or the commands in `./benchmarks/README.md`
```
├── Cargo.toml
├── README.md
└── src
    └── bin
        ├── commit.rs
        ├── fec.rs
        ├── linalg.rs
        ├── operations
        │   ├── curve_group.rs
        │   └── field.rs
        ├── recoding.rs
        ├── setup.rs
        └── setup_size.rs
```

- examples are now in `./bins/` as standalone binaries and can be run either with `cargo run --package <pkg>` or with the help of the `cargo bin` command from `.nushell/cargo.nu`
```
├── curves
│   ├── Cargo.toml
│   ├── README.md
│   └── src
│       └── main.rs
├── inbreeding
│   ├── build.nu
│   ├── Cargo.toml
│   ├── consts.nu
│   ├── mod.nu
│   ├── plot.nu
│   ├── README.md
│   ├── run.nu
│   └── src
│       ├── environment.rs
│       ├── main.rs
│       └── strategy.rs
├── rank
│   ├── Cargo.toml
│   └── src
│       └── main.rs
└── rng
    ├── Cargo.toml
    └── src
        └── main.rs
```

- Nushell modules are now located in `./.nushell/`

## changelog
apart from the changes to the general structure of the repo:
- `binary.nu` -> `.nushell/binary.nu`
- new `cargo bin` command from `.nushell/cargo.nu`
- `error throw` is now defined in `.nushell/error.nu`
- main TOML has been greatly simplified because the dependencies of "examples" have been moved to the associated crates
- the rest is basically the same but in the new structure

bb626120

fix the random number generator seeds (dragoon/komodo!116 ) · 173a1088

STEVAN Antoine authored 10 months ago

related to
- dragoon/komodo!113

this MR makes sure that the seeds given to each "strategy + scenario" loop are different by generating a bunch of unique seeds per strategy.

173a1088

dump inbreeding results raw (dragoon/komodo!115 ) · 8448ef57

STEVAN Antoine authored 10 months ago

this will make `inbreeding run` dump all the data as-is to files on disk.
data will be arranged in the following way
- the `$CACHE` is `~/.cache/komodo/inbreeding/`
- each seed will have its own directory: `$CACHE/<seed>/`
- each strategy and environment will have their own directory: `$CACHE/<seed>/<date>/<env>/<strat>/`
- finally each recoding scenario will go to its own subdirectory: `$CACHE/<seed>/<date>/<env>/<strat>/<i>`
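
the layout above boils down to joining path components. a sketch with `std::path`, where `scenario_dir` is a hypothetical helper and the cache root is passed explicitly instead of being read from the environment:

```rust
use std::path::{Path, PathBuf};

/// Directory of the `i`-th recoding scenario: `<cache>/<seed>/<date>/<env>/<strat>/<i>`.
fn scenario_dir(cache: &Path, seed: &str, date: &str, env: &str, strategy: &str, i: usize) -> PathBuf {
    cache
        .join(seed)
        .join(date)
        .join(env)
        .join(strategy)
        .join(i.to_string())
}
```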

> ❗ **Important**  
> the plotting pipeline is broken for now but will be fixed in later MRs soon...

8448ef57

May 27, 2024

don't run inbreeding if "start > max_t" (dragoon/komodo!114 ) · ac90901b

STEVAN Antoine authored 10 months ago

ac90901b

Rand (dragoon/komodo!113 ) · 2b2783f3

STEVAN Antoine authored 10 months ago

- add `--prng-seed: u8` to fix the random number generator seed

## example
by running the following snippet, we get
- `first.123.png` and `second.123.png` with `--prng-seed 123` which are the same
- `first.111.png` and `second.111.png` with `--prng-seed 111` which are the same
- `first.111.png` and `first.123.png` are different

```bash
use ./scripts/inbreeding

const OPTS = {
    nb_bytes: (10 * 1_024),
    k: 10,
    n: 20,
    nb_scenarii: 10,
    nb_measurements: 10,
    measurement_schedule: 1,
    measurement_schedule_start: 0,
    max_t: 50,
    strategies: [
        "single:1",
        "double:0.5:1:2",
        "single:2",
        "double:0.5:2:3",
        "single:3",
        "single:5",
        "single:10",
    ],
    environment: "random-fixed:0.5:1",
}

inbreeding build

inbreeding run --options $OPTS --prng-seed 123 --output /tmp/first.123.nuon
inbreeding plot /tmp/first.123.nuon --options { k: $OPTS.k } --save /tmp/first.123.png

inbreeding run --options $OPTS --prng-seed 123 --output /tmp/second.123.nuon
inbreeding plot /tmp/second.123.nuon --options { k: $OPTS.k } --save /tmp/second.123.png

inbreeding run --options $OPTS --prng-seed 111 --output /tmp/first.111.nuon
inbreeding plot /tmp/first.111.nuon --options { k: $OPTS.k } --save /tmp/first.111.png

inbreeding run --options $OPTS --prng-seed 111 --output /tmp/second.111.nuon
inbreeding plot /tmp/second.111.nuon --options { k: $OPTS.k } --save /tmp/second.111.png
```

| seed | first | second |
| ---- | ----- | ------ |
| 123  | ![first.123](/uploads/6b09bf94ca7019e200a47e1a53adc533/first.123.png) | ![second.123](/uploads/fa77c1be84d6279f71cbfe3064c83242/second.123.png) |
| 111  | ![first.111](/uploads/bd31a0832825ecbef48178b2e8689a6f/first.111.png) | ![second.111](/uploads/e1c00e0a6765e82388a3c7142847bbab/second.111.png) |

2b2783f3

mix colors for hybrid recoding strategies (dragoon/komodo!112 ) · 9d6e6e7a

STEVAN Antoine authored 10 months ago

- define `scripts/color.nu` to manipulate RGB colors, especially mix two colors together
- compute the color of _hybrid recoding strategies_ as a weighted sum of the two _simple recoding strategies_ involved, e.g. if the strategy is "10% of the time recode 2 shards and 90% of the time recode 3", then the color of that curve will be 10% the color of the simple strategy recoding 2 shards and 90% the color of the other simple strategy recoding 3 shards
- make the _hybrid_ curves transparent and dashed
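
the weighted sum can be sketched channel by channel. the following Rust snippet is only an illustration of the idea, `color.nu` is a Nushell module and its actual rounding and representation may differ. `Rgb` and `mix` are hypothetical names.

```rust
/// An RGB color with 8-bit channels.
type Rgb = (u8, u8, u8);

/// Mixes two colors: `weight` of `a` and `1 - weight` of `b`, channel by channel.
fn mix(a: Rgb, b: Rgb, weight: f64) -> Rgb {
    let w = weight.clamp(0.0, 1.0);
    let channel = |x: u8, y: u8| (w * f64::from(x) + (1.0 - w) * f64::from(y)).round() as u8;
    (channel(a.0, b.0), channel(a.1, b.1), channel(a.2, b.2))
}
```

for the strategy "10% of the time recode 2 shards and 90% of the time recode 3", the curve would get `mix(color_of_2, color_of_3, 0.1)`.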

## example
![foo](/uploads/71afca07a87fc38e3fa005cebfab4e50/foo.png)

9d6e6e7a

add timestamp to measurements and delay start (dragoon/komodo!111 ) · 1f16e7e2

STEVAN Antoine authored 10 months ago

- add a timestamp to all the measurements of the _diversity_ from `inbreeding/mod.rs`
- allow delaying the start of the measurements with `--measurement-schedule-start`, to help complete already existing measurements

> ❗ **Important**  
> existing measurement files will have to change shape from
> ```
> table<strategy: string, diversity: list<float>>
> ```
> to
> ```
> table<strategy: string, diversity: table<t: int, diversity: float>>
> ```

1f16e7e2