Benchmarking Best Practices
This page describes best practices for writing benchmarks with Gungraun that produce accurate, meaningful, and reproducible results.
Use a Separate Workspace Member for Benchmarks
For library benchmarks, place your benchmarks in a separate workspace member
directory (for example, benchmarks/) rather than in the same crate as the code
being benchmarked.
Rust inlines functions freely within a crate but does not inline across crate
boundaries without explicit #[inline] attributes or link-time optimization
(LTO). This difference means in-crate benchmarks can report significantly
better performance than what users of your library will actually experience.
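To make this concrete, consider a hypothetical function in my_lib: built from a separate crate, the benchmark pays the real call cost unless the library explicitly opts in to cross-crate inlining (or LTO is enabled):

```rust
// Hypothetical library code in `my_lib` (src/lib.rs).

// Without #[inline] or LTO, a call from another crate (such as a separate
// `benchmarks` workspace member) is a real function call.
pub fn add_one(x: u64) -> u64 {
    x + 1
}

// Cross-crate inlining only happens when the library opts in explicitly.
#[inline]
pub fn add_one_inlined(x: u64) -> u64 {
    x + 1
}
```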
This problem is not specific to Gungraun; regardless of the benchmarking framework, it is generally a good idea to keep benchmarks in a separate workspace member.
Setting Up a Separate Benchmarks Crate
Create a workspace member for your benchmarks:
```toml
# `Cargo.toml` (workspace root of your crate `my_lib`)
[workspace]
members = ["benchmarks"]

# ...
```

```toml
# `benchmarks/Cargo.toml`
[package]
name = "benchmarks"
version = "0.1.0"
edition = "2024"
publish = false

[dependencies]
my_lib = { path = ".." }

[dev-dependencies]
gungraun = "0.18.1"

# Assuming there is a gungraun benchmark in `benchmarks/benches/gungraun.rs`
[[bench]]
harness = false
name = "gungraun"
path = "benches/gungraun.rs"
```
You can now run the benchmarks with cargo bench -p benchmarks or
cargo bench --workspace.
This approach avoids the inlining problem entirely and gives you measurements that reflect what users will observe. It also lets you use a different Rust version or profile settings for benchmarking than your main project requires.
Keep Benchmark Functions Clean
The body of your benchmark function should contain only the code you want to measure. Setup and teardown logic should be handled elsewhere to avoid attributing their costs to the function under test.
Gungraun provides built-in mechanisms for this:
```rust
extern crate gungraun;
use gungraun::prelude::*;
use std::hint::black_box;

fn process_data(data: Vec<u64>) -> u64 {
    data.len() as u64
}

fn expensive_setup(n: u64) -> Vec<u64> {
    (0..n).collect()
}

#[library_benchmark]
#[bench::with_setup(args = [100], setup = expensive_setup)]
fn bench_processing(data: Vec<u64>) -> u64 {
    black_box(process_data(black_box(data)))
}

library_benchmark_group!(name = my_group, benchmarks = bench_processing);

fn main() {
    main!(library_benchmark_groups = my_group);
}
```
See Setup and Teardown for the full range of options, including teardown functions.
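As an illustration only, here is a sketch that adds a teardown function, assuming the teardown attribute mirrors setup and receives the benchmark function's return value; see the linked chapter for the authoritative details:

```rust
extern crate gungraun;
use gungraun::prelude::*;
use std::hint::black_box;

fn process_data(data: Vec<u64>) -> u64 {
    data.len() as u64
}

fn expensive_setup(n: u64) -> Vec<u64> {
    (0..n).collect()
}

// Hypothetical teardown: receives the benchmark's return value and can, for
// example, assert on it without the check being attributed to the benchmark.
fn check_result(result: u64) {
    assert_eq!(result, 100);
}

#[library_benchmark]
#[bench::with_teardown(args = [100], setup = expensive_setup, teardown = check_result)]
fn bench_processing(data: Vec<u64>) -> u64 {
    black_box(process_data(black_box(data)))
}

library_benchmark_group!(name = my_group, benchmarks = bench_processing);

fn main() {
    main!(library_benchmark_groups = my_group);
}
```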
Use black_box Appropriately
Wrap values in std::hint::black_box to prevent the compiler from optimizing
away computations. As a rule of thumb, wrap all input and output values:
```rust
extern crate gungraun;
use gungraun::prelude::*;
use std::hint::black_box;

fn expensive_computation(n: u64) -> u64 {
    n
}

#[library_benchmark]
#[bench::low(5)]
fn bench_example(n: u64) -> u64 {
    // Ensure `n` and `result` are used and not optimized away
    black_box(expensive_computation(black_box(n)))
}

library_benchmark_group!(name = example, benchmarks = bench_example);

fn main() {
    main!(library_benchmark_groups = example);
}
```
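For contrast, a sketch of the anti-pattern: with a constant argument and no black_box, the optimizer is free to evaluate the call at compile time, so the benchmark may no longer measure the work you care about:

```rust
extern crate gungraun;
use gungraun::prelude::*;

fn expensive_computation(n: u64) -> u64 {
    n
}

#[library_benchmark]
fn bench_without_black_box() -> u64 {
    // Anti-pattern: the argument is a compile-time constant, so the optimizer
    // may fold the whole call into a constant and little real work remains.
    expensive_computation(5)
}

library_benchmark_group!(name = bad_example, benchmarks = bench_without_black_box);

fn main() {
    main!(library_benchmark_groups = bad_example);
}
```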
Design for CI
Gungraun excels in CI environments because it produces consistent measurements even in noisy virtualized systems. To get the most from CI benchmarking:
- Use baselines to detect regressions across branches and runs
- Avoid benchmarks that depend on external state (network, filesystem timing, unseeded randomness); one way to keep inputs deterministic is sketched below
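As an illustration, here is a minimal sketch of deterministic input generation; the hand-rolled generator and the fixed_seed_input/bench_sum names are hypothetical, chosen only to avoid pulling in an RNG dependency:

```rust
extern crate gungraun;
use gungraun::prelude::*;
use std::hint::black_box;

fn sum(data: &[u64]) -> u64 {
    data.iter().sum()
}

// Deterministic input: a tiny fixed-seed generator instead of a system RNG,
// so every benchmark run processes identical data.
fn fixed_seed_input(len: usize) -> Vec<u64> {
    let mut state: u64 = 0x9E37_79B9_7F4A_7C15;
    (0..len)
        .map(|_| {
            state = state.wrapping_mul(6364136223846793005).wrapping_add(1);
            state
        })
        .collect()
}

#[library_benchmark]
#[bench::deterministic(args = [1000], setup = fixed_seed_input)]
fn bench_sum(data: Vec<u64>) -> u64 {
    black_box(sum(black_box(&data)))
}

library_benchmark_group!(name = ci_group, benchmarks = bench_sum);

fn main() {
    main!(library_benchmark_groups = ci_group);
}
```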
See Regressions for configuring regression detection.
Understanding Your Metrics
Gungraun measures instruction counts, cache behavior, and estimated cycles. These correlate with but do not equal wall-clock time:
- Instruction counts are precise and portable across systems
- Cache simulation approximates real cache behavior
- Estimated cycles provide a rough wall-clock approximation
For user-perceived latency validation, combine Gungraun with wall-clock benchmarks such as Criterion.rs. Use Gungraun to detect regressions and guide micro-optimizations; use wall-clock benchmarks to validate end-to-end performance claims.
Where to Go Next
- Library Benchmark Quickstart to start writing benchmarks
- Binary Benchmarks for benchmarking executables
- Regressions for setting up regression checks, for example in CI