Sunday, September 10, 2023

Benchmarker

Hi,

This is one of my first Rust apps.  

I use it to benchmark an application: it runs the application multiple times and produces readings and graphs.

Benchmark

Benchmarking data collector - runs an application as a child process, collecting stats (time, CPU usage, memory usage) and generating benchmarking reports.



Why

I created it for two reasons:
- to learn how to create and manage complete Rust applications using the whole ecosystem: crates, builds, publishing
- to get benchmarks for my other projects

Code




High-level idea

  • run the application multiple times

  • collect all readings of interest:

    • time
    • CPU
    • memory
  • process outputs and provide results as:

    • CSV/excel
    • graphs

Save outputs to a local DB/file so that a slowdown or speedup can be detected in the next release of an application.
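The "run the application multiple times" step can be sketched with the standard library alone. This is a minimal illustration, not the tool's actual code: `time_runs` is a hypothetical helper, it measures wall-clock time only (no CPU or memory), and `echo` stands in for the application under test.

```rust
use std::io;
use std::process::{Command, Stdio};
use std::time::{Duration, Instant};

/// Runs `program` with `args` `runs` times and returns the wall-clock
/// duration of each run. CPU and memory are not collected in this sketch.
fn time_runs(program: &str, args: &[&str], runs: usize) -> io::Result<Vec<Duration>> {
    let mut timings = Vec::with_capacity(runs);
    for _ in 0..runs {
        let start = Instant::now();
        // Discard the child's output so it does not mix with the benchmark log.
        let status = Command::new(program)
            .args(args)
            .stdout(Stdio::null())
            .stderr(Stdio::null())
            .status()?;
        let elapsed = start.elapsed();
        if !status.success() {
            return Err(io::Error::new(
                io::ErrorKind::Other,
                format!("{program} exited with {status}"),
            ));
        }
        timings.push(elapsed);
    }
    Ok(timings)
}

fn main() -> io::Result<()> {
    // `echo` is only a placeholder application for illustration.
    let timings = time_runs("echo", &["hello"], 10)?;
    for (i, t) in timings.iter().enumerate() {
        println!("Run {i}: {:.3} ms", t.as_secs_f64() * 1000.0);
    }
    Ok(())
}
```

Failing fast on a non-zero exit status is a design choice of this sketch: a crashed run would otherwise contribute a misleadingly short timing.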


Methodology

For each benchmark run:

  • run multiple times (default 10)
  • remove outliers
  • average output results
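The "remove outliers, then average" steps could look like the sketch below. The post does not say which outlier rule the tool uses, so the 1.5 × IQR rule (Tukey fences) here is an assumption, and `percentile`, `remove_outliers` and `average` are hypothetical names. The sample values are made up for illustration.

```rust
/// Linearly interpolated percentile (`p` in 0.0..=1.0) of a sorted, non-empty slice.
fn percentile(sorted: &[f64], p: f64) -> f64 {
    let rank = p * (sorted.len() - 1) as f64;
    let lo = rank.floor() as usize;
    let hi = rank.ceil() as usize;
    sorted[lo] + (sorted[hi] - sorted[lo]) * (rank - lo as f64)
}

/// Keeps only samples inside [Q1 - 1.5*IQR, Q3 + 1.5*IQR].
/// ASSUMPTION: the IQR rule is one common choice, not necessarily the tool's.
fn remove_outliers(samples: &[f64]) -> Vec<f64> {
    // With very few samples the quartiles are not meaningful; keep everything.
    if samples.len() < 4 {
        return samples.to_vec();
    }
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let q1 = percentile(&sorted, 0.25);
    let q3 = percentile(&sorted, 0.75);
    let iqr = q3 - q1;
    let (low, high) = (q1 - 1.5 * iqr, q3 + 1.5 * iqr);
    samples
        .iter()
        .copied()
        .filter(|x| (low..=high).contains(x))
        .collect()
}

/// Arithmetic mean, or `None` for an empty slice.
fn average(samples: &[f64]) -> Option<f64> {
    if samples.is_empty() {
        None
    } else {
        Some(samples.iter().sum::<f64>() / samples.len() as f64)
    }
}

fn main() {
    // Made-up timings in milliseconds with one obvious outlier (100.0).
    let runs = [10.0, 11.0, 9.0, 10.0, 12.0, 10.0, 11.0, 9.0, 10.0, 100.0];
    let kept = remove_outliers(&runs);
    println!("kept {} of {} runs: {:?}", kept.len(), runs.len(), kept);
    if let Some(avg) = average(&kept) {
        println!("average after outlier removal: {avg:.2} ms");
    }
}
```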


Build

cargo build -r --bin benchmark 

Usage

benchmark 0.1.0
Benchmarking data collector.

USAGE:
    benchmark [OPTIONS] <APPLICATION>

ARGS:
    <APPLICATION>    Application path (just name if it is in the same directory)

OPTIONS:
    -h, --help           Print help information
    -l, --log <LOG>      Set custom log level: info, debug, trace [default: info]
    -r, --runs <RUNS>    Number of runs to be executed [default: 10]
    -V, --version        Print version information

Example output

09:33:24.899 (t: main) INFO - benchmark - Application to be benchmark is: /opt/workspace/ballista/target/release/examples/example_processing
09:33:24.899 (t: main) INFO - benchmark - Number of runs: 10
09:33:24.902 (t: main) INFO - benchmark - Collecting data::example_processing
09:33:24.902 (t: main) INFO - benchmark::bench::analysis - Run 0 of 10
09:33:24.947 (t: main) INFO - benchmark::bench::analysis - Run 1 of 10
09:33:24.983 (t: main) INFO - benchmark::bench::analysis - Run 2 of 10
09:33:25.016 (t: main) INFO - benchmark::bench::analysis - Run 3 of 10
09:33:25.049 (t: main) INFO - benchmark::bench::analysis - Run 4 of 10
09:33:25.087 (t: main) INFO - benchmark::bench::analysis - Run 5 of 10
09:33:25.132 (t: main) INFO - benchmark::bench::analysis - Run 6 of 10
09:33:25.188 (t: main) INFO - benchmark::bench::analysis - Run 7 of 10
09:33:25.238 (t: main) INFO - benchmark::bench::analysis - Run 8 of 10
09:33:25.288 (t: main) INFO - benchmark::bench::analysis - Run 9 of 10
09:33:25.338 (t: main) INFO - benchmark - Processing outputs
0.04,130,18752,
0.03,140,18664,
0.03,156,18856,
0.03,153,18868,
0.04,152,18884,
0.04,140,18904,
0.05,136,19404,
0.05,145,19220,
0.05,137,18780,
0.05,138,18788,
09:33:25.339 (t: main) INFO - benchmark::bench::collector - SUMMARY:
09:33:25.339 (t: main) INFO - benchmark::bench::collector - Time [ms]:: min: 30, max: 50, avg: 41 ms
09:33:25.339 (t: main) INFO - benchmark::bench::collector - CPU [%]:: min: 130, max: 156, avg: 142.7 %
09:33:25.339 (t: main) INFO - benchmark::bench::collector - Memory [kB]:: min: 18664, max: 19404, avg: 18912 kB

Process finished with exit code 0


An output directory named "bench_<your_app_name>" (e.g. bench_example_processing) is also created in the current directory of the benchmark app. It contains:

Output CSV file:

Time,Cpu,Mem
0.04,130,18752
0.03,140,18664
0.03,156,18856
0.03,153,18868
0.04,152,18884
0.04,140,18904
0.05,136,19404
0.05,145,19220
0.05,137,18780
0.05,138,18788
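For readers who want to post-process the CSV themselves, here is a small sketch that parses the file above and recomputes min/max/avg per column. It is not the tool's code: `Reading`, `parse_csv` and `summarize` are hypothetical names, and the column units (seconds, percent, kB) are inferred from the summary log.

```rust
/// The CSV content shown above, embedded so the sketch is self-contained.
const SAMPLE_CSV: &str = "Time,Cpu,Mem
0.04,130,18752
0.03,140,18664
0.03,156,18856
0.03,153,18868
0.04,152,18884
0.04,140,18904
0.05,136,19404
0.05,145,19220
0.05,137,18780
0.05,138,18788";

/// One benchmark run. ASSUMPTION: time in seconds, CPU in percent, memory in kB.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Reading {
    time_s: f64,
    cpu_pct: f64,
    mem_kb: f64,
}

/// Parses `Time,Cpu,Mem` rows; skips the header and tolerates a trailing comma.
fn parse_csv(text: &str) -> Result<Vec<Reading>, String> {
    let mut readings = Vec::new();
    for (idx, line) in text.lines().enumerate() {
        let line = line.trim();
        if line.is_empty() || line.starts_with("Time") {
            continue;
        }
        let fields: Vec<&str> = line.trim_end_matches(',').split(',').map(str::trim).collect();
        if fields.len() != 3 {
            return Err(format!("line {}: expected 3 fields, got {}", idx + 1, fields.len()));
        }
        let num = |s: &str| s.parse::<f64>().map_err(|e| format!("line {}: {e}", idx + 1));
        readings.push(Reading {
            time_s: num(fields[0])?,
            cpu_pct: num(fields[1])?,
            mem_kb: num(fields[2])?,
        });
    }
    Ok(readings)
}

/// Returns (min, max, avg) of the values, or `None` if there are none.
fn summarize(values: impl Iterator<Item = f64>) -> Option<(f64, f64, f64)> {
    let mut count = 0usize;
    let mut min = f64::INFINITY;
    let mut max = f64::NEG_INFINITY;
    let mut sum = 0.0;
    for v in values {
        min = min.min(v);
        max = max.max(v);
        sum += v;
        count += 1;
    }
    (count > 0).then(|| (min, max, sum / count as f64))
}

fn main() -> Result<(), String> {
    let readings = parse_csv(SAMPLE_CSV)?;
    // Convert seconds to milliseconds to match the units of the summary log.
    if let Some((min, max, avg)) = summarize(readings.iter().map(|r| r.time_s * 1000.0)) {
        println!("Time [ms]: min: {min:.0}, max: {max:.0}, avg: {avg:.0}");
    }
    if let Some((min, max, avg)) = summarize(readings.iter().map(|r| r.cpu_pct)) {
        println!("CPU [%]: min: {min:.0}, max: {max:.0}, avg: {avg:.1}");
    }
    if let Some((min, max, avg)) = summarize(readings.iter().map(|r| r.mem_kb)) {
        println!("Memory [kB]: min: {min:.0}, max: {max:.0}, avg: {avg:.0}");
    }
    Ok(())
}
```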

and output graphs:

summary report: summary_report.txt

TEST

cargo build --example test_app -r   

cargo run --bin benchmark -- /opt/workspace/app_banchmark/target/release/examples/test_app   

cargo run --bin benchmark -- "/opt/workspace/app_banchmark/target/release/examples/test_app -additional -app -params"
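The second command passes the application and its parameters as one quoted string, so that string has to be split into a program and its arguments before the child process is started. A minimal sketch of one way to do that follows. `split_command` is a hypothetical helper, not taken from the tool, and it assumes plain whitespace splitting: arguments that themselves contain spaces or quotes are not handled.

```rust
use std::process::Command;

/// Splits "program arg1 arg2" into the program and its arguments.
/// ASSUMPTION: simple whitespace splitting, no quoting or escaping support.
fn split_command(command_line: &str) -> Option<(String, Vec<String>)> {
    let mut parts = command_line.split_whitespace().map(String::from);
    let program = parts.next()?;
    Some((program, parts.collect()))
}

/// Builds a `Command` ready to be spawned, or `None` for an empty string.
fn build_command(command_line: &str) -> Option<Command> {
    let (program, args) = split_command(command_line)?;
    let mut cmd = Command::new(program);
    cmd.args(args);
    Some(cmd)
}

fn main() {
    // Hypothetical path and flags, for illustration only.
    let line = "/path/to/test_app -first -second";
    if let Some((program, args)) = split_command(line) {
        println!("program: {program}");
        println!("args: {args:?}");
    }
    // `build_command(line)` would return a `Command` for the same input.
    assert!(build_command(line).is_some());
}
```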


TODO:

  • incremental runs - use date/time in the output dir name
  • local DB or file structure to track changes over time (application trends)
  • move away from the GNU time dependency to sysinfo
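As a rough idea of what the first TODO item might look like, the sketch below appends a timestamp to the existing "bench_<your_app_name>" directory name so that each run gets its own directory. This is only one possible layout: `output_dir` is a hypothetical helper, and using Unix seconds as the suffix is an assumption made to stay within the standard library.

```rust
use std::path::{Path, PathBuf};
use std::time::{SystemTime, UNIX_EPOCH};

/// Builds "bench_<app_name>_<unix_secs>" from the application path.
/// ASSUMPTION: a Unix-seconds suffix; a formatted date would need an extra crate.
fn output_dir(app_path: &Path, unix_secs: u64) -> Option<PathBuf> {
    let name = app_path.file_stem()?.to_str()?;
    Some(PathBuf::from(format!("bench_{name}_{unix_secs}")))
}

fn main() {
    let now = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_secs())
        .unwrap_or(0);
    // Hypothetical application path, for illustration only.
    let app = Path::new("/path/to/example_processing");
    match output_dir(app, now) {
        Some(dir) => println!("output directory: {}", dir.display()),
        None => eprintln!("could not derive a directory name from {}", app.display()),
    }
}
```

Passing the timestamp in as a parameter keeps the naming logic deterministic and easy to test.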



Note: For monitoring long-running processes like servers / streaming apps - see https://github.com/yarenty/app_tracer.
