02 - Measuring Performance

Measuring the runtime of benchmarks is often sufficient to identify performance regressions. However, runtime alone is an unbounded metric (aside from the trivial lower bound of zero) and provides little insight into where the actual bottlenecks are. To address this, the benchmarking jobs collect a comprehensive set of performance properties that offer deeper insight into your application's performance.

Performance Metrics: Understanding Hardware Effects

Modern processors include dedicated hardware features called performance counters, which collect detailed statistical data about application execution. Performance counters capture low-level metrics such as:

  • The volume of memory transfers
  • The number of executed floating-point operations
  • The number of predicted and mispredicted branch instructions
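
For intuition, these counters can also be read directly with Linux perf, the same profiler selected via the `profiler: perf` setting in the configuration below. The following sketch is not part of Daisytuner's API; it simply invokes `perf stat` in CSV mode on a benchmark binary and collects a few generic events, whose availability depends on your CPU.

```python
# Sketch: reading raw hardware counters with Linux perf.
# Assumes `perf` is installed and ./a.out is the benchmark binary.
import subprocess

events = ["branches", "branch-misses", "cache-references", "cache-misses"]

# `-x,` switches perf stat to CSV output, which perf writes to stderr.
result = subprocess.run(
    ["perf", "stat", "-x,", "-e", ",".join(events), "./a.out"],
    capture_output=True,
    text=True,
)

# CSV fields are: value, unit, event name, ... Skip events that were
# not counted or not supported on this machine.
counts = {}
for line in result.stderr.splitlines():
    fields = line.split(",")
    if len(fields) >= 3 and fields[0].strip().isdigit():
        counts[fields[2]] = int(fields[0])

print(counts)  # e.g. {'branches': 1000000, 'branch-misses': 25000, ...}
```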

From these raw statistics, Daisytuner computes higher-level performance metrics. Many of these metrics are inherently bounded, which makes them effective indicators of how well the hardware is utilized. Examples include:

  • Cache hit-to-miss ratios
  • Branch misprediction rates
  • Vectorization rates of floating-point operations
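
To see why bounded ratios are easier to interpret than raw counts, consider a small illustrative calculation. The formulas below are the standard textbook definitions and are meant only as an approximation; the exact definitions Daisytuner uses per metric may differ.

```python
# Illustrative derivation of bounded metrics from raw counter values.
# These are textbook definitions; Daisytuner's exact formulas may differ.

def branch_misprediction_rate(branch_misses: int, branches: int) -> float:
    """Fraction of branch instructions that were mispredicted (0..1)."""
    return branch_misses / branches

def cache_hit_ratio(cache_references: int, cache_misses: int) -> float:
    """Fraction of cache accesses that hit in the cache (0..1)."""
    return (cache_references - cache_misses) / cache_references

# 25,000 mispredictions out of 1,000,000 branches -> 2.5% rate.
print(branch_misprediction_rate(25_000, 1_000_000))  # 0.025

# 200,000 misses out of 4,000,000 references -> 95% hit ratio.
print(cache_hit_ratio(4_000_000, 200_000))  # 0.95
```

Because both values lie between 0 and 1, a change in them points directly at a hardware bottleneck, regardless of how long the benchmark ran.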

Collected metrics are available in our web dashboard and are also provided automatically in pull request comments, visually summarized in a convenient hardware diagram.

Figure: Hardware diagram summarizing the collected metrics.

How to Use

Include the desired metrics using the `metrics` parameter in your benchmark definition:

```yaml
run:
  benchmark_1:
    command: ./a.out
    measurements: 5
    profiler: perf
    metrics:
      - flop_dp
      - memory_volume
      - br_misp_ratio
```

You can find a comprehensive list of available metrics for each computing partition in the Partitions Reference.