dev.dokimos.core.comparison.RunComparison

public final class RunComparison extends Object

Regression-comparison engine that compares a baseline set of runs against a candidate set.

Each side may contain one or more runs (repetitions). Items are grouped by an item-identity key, aggregated across repetitions into a per-item pass-probability and per-evaluator mean, then paired across sides by key. The engine emits per-evaluator and overall deltas classified as IMPROVED, REGRESSED, or UNCHANGED, each backed by a significance test.
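The grouping and pairing steps described above can be sketched roughly as follows. This is a minimal illustration, not the library's implementation: `AggregationSketch`, `ItemOutcome`, `passProbability`, and `pairedDeltas` are hypothetical names, and dropping items whose key appears on only one side is an assumption, since the class description does not say how unpaired items are handled.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative sketch only; these types are hypothetical stand-ins, not library classes. */
final class AggregationSketch {

    /** One evaluated item within one run (hypothetical shape). */
    record ItemOutcome(String itemKey, boolean passed) {}

    /** Pass probability per item key, pooled over all repetitions on one side. */
    static Map<String, Double> passProbability(List<List<ItemOutcome>> runs) {
        Map<String, int[]> counts = new TreeMap<>(); // key -> {passes, total}
        for (List<ItemOutcome> run : runs) {
            for (ItemOutcome outcome : run) {
                int[] c = counts.computeIfAbsent(outcome.itemKey(), k -> new int[2]);
                if (outcome.passed()) {
                    c[0]++;
                }
                c[1]++;
            }
        }
        Map<String, Double> result = new TreeMap<>();
        counts.forEach((key, c) -> result.put(key, (double) c[0] / c[1]));
        return result;
    }

    /** Per-item delta (candidate minus baseline) for keys present on both sides. */
    static Map<String, Double> pairedDeltas(Map<String, Double> baseline,
                                            Map<String, Double> candidate) {
        Map<String, Double> deltas = new TreeMap<>();
        for (Map.Entry<String, Double> entry : baseline.entrySet()) {
            Double other = candidate.get(entry.getKey());
            if (other != null) { // assumption: unpaired items are skipped
                deltas.put(entry.getKey(), other - entry.getValue());
            }
        }
        return deltas;
    }

    public static void main(String[] args) {
        // Two baseline repetitions and one candidate run over items "a" and "b".
        var baseline = passProbability(List.of(
                List.of(new ItemOutcome("a", true), new ItemOutcome("b", false)),
                List.of(new ItemOutcome("a", false), new ItemOutcome("b", false))));
        var candidate = passProbability(List.of(
                List.of(new ItemOutcome("a", true), new ItemOutcome("b", true))));
        System.out.println(pairedDeltas(baseline, candidate));
    }
}
```

Keying the maps by item identity is what makes the later tests paired: each delta compares the same item on both sides rather than two unrelated samples.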

For single-run binary outcomes, the pass-rate comparison uses McNemar's test with continuity correction; in all other cases it uses a paired sign-flip permutation test with a bootstrap percentile confidence interval. A change is flagged only when |delta| > epsilon and the test is significant at level alpha.
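To make the statistical procedures named above concrete, here is a small sketch of textbook versions of each one. It is not the engine's code: the class and method names are hypothetical, and three details are assumptions rather than documented behavior: "significant" is read as p < alpha, a positive delta is treated as an improvement, and the permutation p-value uses an add-one correction.

```java
import java.util.Arrays;
import java.util.Random;

/** Illustrative sketch only; none of these names come from the library. */
final class SignificanceSketch {

    enum Change { IMPROVED, REGRESSED, UNCHANGED }

    /**
     * McNemar chi-square statistic with continuity correction.
     * b = items that passed in baseline and failed in candidate,
     * c = items that failed in baseline and passed in candidate.
     * Compare against 3.841 (chi-square, 1 degree of freedom) for alpha = 0.05.
     */
    static double mcNemarStatistic(int b, int c) {
        if (b + c == 0) {
            return 0.0; // no discordant pairs, so no evidence of change
        }
        double diff = Math.abs(b - c) - 1.0;
        return diff * diff / (b + c);
    }

    /** Two-sided paired sign-flip permutation p-value for the mean difference. */
    static double signFlipPValue(double[] diffs, int permutations, Random rng) {
        // Comparing sums is equivalent to comparing means because n is fixed.
        double observed = Math.abs(Arrays.stream(diffs).sum());
        int atLeastAsExtreme = 0;
        for (int p = 0; p < permutations; p++) {
            double sum = 0.0;
            for (double d : diffs) {
                sum += rng.nextBoolean() ? d : -d; // flip each sign with probability 1/2
            }
            if (Math.abs(sum) >= observed - 1e-12) {
                atLeastAsExtreme++;
            }
        }
        // Assumption: add-one correction so the estimate is never exactly zero.
        return (atLeastAsExtreme + 1.0) / (permutations + 1.0);
    }

    /** Bootstrap percentile interval for the mean difference; diffs must be non-empty. */
    static double[] bootstrapInterval(double[] diffs, int resamples, double alpha, Random rng) {
        double[] means = new double[resamples];
        for (int r = 0; r < resamples; r++) {
            double sum = 0.0;
            for (int i = 0; i < diffs.length; i++) {
                sum += diffs[rng.nextInt(diffs.length)]; // resample with replacement
            }
            means[r] = sum / diffs.length;
        }
        Arrays.sort(means);
        int lo = (int) Math.floor(alpha / 2 * (resamples - 1));
        int hi = (int) Math.ceil((1 - alpha / 2) * (resamples - 1));
        return new double[] { means[lo], means[hi] };
    }

    /** Flags a change only when it is both larger than epsilon and significant. */
    static Change classify(double delta, double epsilon, double pValue, double alpha) {
        if (Math.abs(delta) > epsilon && pValue < alpha) {
            return delta > 0 ? Change.IMPROVED : Change.REGRESSED;
        }
        return Change.UNCHANGED;
    }
}
```

Requiring both conditions in `classify` means a tiny but statistically significant delta, or a large but noisy one, is reported as UNCHANGED.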

Randomized procedures are deterministic for a fixed seed and evaluator set; the shared Random is consumed in evaluator-name order, so adding or removing evaluators shifts p-values of the others.
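The seed behavior described above follows from how a single `java.util.Random` hands out values. The sketch below is a simplified illustration with hypothetical names (`SharedRandomSketch`, `drawPerEvaluator`): it draws one value per evaluator, in sorted name order, from one shared generator. The real engine presumably consumes many draws per evaluator, but the ordering effect is the same.

```java
import java.util.Collection;
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;
import java.util.TreeSet;

/** Illustrative sketch only; not the library's implementation. */
final class SharedRandomSketch {

    /** Draws one value per evaluator from a single shared Random, in name order. */
    static Map<String, Double> drawPerEvaluator(Collection<String> evaluators, long seed) {
        Random shared = new Random(seed);
        Map<String, Double> draws = new TreeMap<>();
        for (String name : new TreeSet<>(evaluators)) { // TreeSet iterates in sorted order
            draws.put(name, shared.nextDouble());
        }
        return draws;
    }

    public static void main(String[] args) {
        // Same seed, same evaluator set: identical draws on every call.
        System.out.println(drawPerEvaluator(java.util.List.of("accuracy", "fluency"), 42L));
        // "coherence" sorts between the two, so "fluency" now receives a later draw.
        System.out.println(
                drawPerEvaluator(java.util.List.of("accuracy", "coherence", "fluency"), 42L));
    }
}
```

In this sketch, evaluators that sort before the inserted name keep their values, while those that sort after it receive different ones. Comparing p-values across runs is therefore only meaningful when the seed and the evaluator set are both held fixed.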

Nested Class Summary

| Modifier and Type | Class | Description |
| --- | --- | --- |
| static final class | RunComparison.Builder | Builder for RunComparison. |

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| static RunComparison.Builder | builder() | New builder with default configuration. |
| RunComparisonResult | compare(List<RunResult> baseline, List<RunResult> candidate) | Compares baseline runs against candidate runs. |
| static RunComparison | create() | Engine with default settings. |

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Method Details
- builder
  
  public static RunComparison.Builder builder()
  
  New builder with default configuration.
- create
  
  public static RunComparison create()
  
  Engine with default settings.
- compare
  
  public RunComparisonResult compare(List<RunResult> baseline, List<RunResult> candidate)
  
  Compares baseline runs against candidate runs. Either list may be empty.
  
  Throws:
  
  NullPointerException - if either argument is null
