Package dev.dokimos.core.gate
Class GateConfig
java.lang.Object
dev.dokimos.core.gate.GateConfig
Configuration for the server-free regression gate.
The defaults suit an LLM-judge gate: it flags only statistically significant aggregate and
per-evaluator drops, so a noisy judge does not flake the build, while severityMargin still
fails the gate on a single item that breaks hard. Pair the gate with a temperature-0 judge for
stable scores.
Accessors follow the framework convention (no get prefix). Construct via defaults() or builder().
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static final class | GateConfig.Builder | Builder for GateConfig.
static enum | | How baseline and candidate items are paired.
static enum | | What to do when an evaluator present in the baseline is missing from the candidate.
-
Method Summary
Modifier and Type | Method | Description
double | alpha() |
int | bootstrapIterations() |
boolean | bootstrapPasses() | Whether a missing-baseline bootstrap writes the file and passes (the default), versus the strict approval-test stance that writes it but fails once so the run is red until the new baseline is reviewed and committed.
static GateConfig.Builder | builder() | Creates a new builder.
static GateConfig | defaults() | Returns a configuration with all defaults.
boolean | failOnRegression() |
boolean | failOnRemovedItems() |
 | onRemovedEvaluator() |
 | pairing() |
int | permutationIterations() |
long | seed() |
double | severityMargin() |
boolean | updateBaseline() |
static boolean | updateBaselineRequested() | Whether an explicit baseline update was requested via DOKIMOS_UPDATE_BASELINE (env, the primary control because -D does not reach the test JVM under default Gradle or the IntelliJ runner) or the dokimos.updateBaseline system property.
-
Method Details
-
defaults
public static GateConfig defaults()
Returns a configuration with all defaults.
Returns: the default gate configuration
-
builder
public static GateConfig.Builder builder()
Creates a new builder.
Returns: a new builder pre-populated with defaults
-
updateBaselineRequested
public static boolean updateBaselineRequested()
Whether an explicit baseline update was requested via DOKIMOS_UPDATE_BASELINE (env, the primary control because -D does not reach the test JVM under default Gradle or the IntelliJ runner) or the dokimos.updateBaseline system property.
Returns: true if a baseline overwrite was requested out of band
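A minimal sketch of what such an out-of-band check could look like. It is not the library's implementation: the class name UpdateBaselineCheck and the rule that only "true" or "1" count as a request are assumptions, since the accepted values are not documented here. Only the variable name DOKIMOS_UPDATE_BASELINE and the property name dokimos.updateBaseline come from the description above.

```java
// Hypothetical stand-in for illustration; not the library's code.
final class UpdateBaselineCheck {

    // Assumption: "true" (any case) or "1" counts as a request.
    // The values the real gate accepts may differ.
    static boolean isTruthy(String value) {
        if (value == null) {
            return false;
        }
        String v = value.trim();
        return v.equalsIgnoreCase("true") || v.equals("1");
    }

    // Pure function, so the rule can be exercised without touching the real environment.
    static boolean requested(String envValue, String propertyValue) {
        return isTruthy(envValue) || isTruthy(propertyValue);
    }

    public static void main(String[] args) {
        boolean update = requested(
                System.getenv("DOKIMOS_UPDATE_BASELINE"),
                System.getProperty("dokimos.updateBaseline"));
        System.out.println("update baseline requested: " + update);
    }
}
```

Checking the environment variable as well as the system property matters because, as noted above, a -D flag passed to the build does not reach the test JVM under default Gradle or the IntelliJ runner.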
-
alpha
public double alpha()
Returns: the significance level for the statistical tests
-
seed
public long seed()
Returns: the RNG seed for the permutation and bootstrap tests (pinned for reproducibility)
-
permutationIterations
public int permutationIterations()
Returns: the permutation-test iteration count
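To illustrate how an iteration count and a pinned seed interact, here is a sketch of a generic one-sided paired permutation (sign-flip) test on per-item scores. It is a textbook procedure and not necessarily the test the gate runs; the class name, method name, and sample scores are hypothetical.

```java
import java.util.Random;

// Generic paired sign-flip permutation test; illustrative only.
final class PairedPermutationSketch {

    // Estimated p-value for "the candidate mean is lower than the baseline mean".
    static double pValue(double[] baseline, double[] candidate, int iterations, long seed) {
        int n = baseline.length;
        double[] diff = new double[n];
        double observed = 0.0;
        for (int i = 0; i < n; i++) {
            diff[i] = candidate[i] - baseline[i];
            observed += diff[i];
        }
        observed /= n;

        Random rng = new Random(seed); // pinned seed: same inputs give the same p-value
        int asExtreme = 0;
        for (int it = 0; it < iterations; it++) {
            double sum = 0.0;
            for (int i = 0; i < n; i++) {
                // Under the null hypothesis each paired difference is equally likely to have either sign.
                sum += rng.nextBoolean() ? diff[i] : -diff[i];
            }
            if (sum / n <= observed) {
                asExtreme++;
            }
        }
        // Add-one correction keeps the estimate strictly positive.
        return (asExtreme + 1.0) / (iterations + 1.0);
    }

    public static void main(String[] args) {
        double[] baseline = {0.9, 0.8, 0.85, 0.9, 0.95, 0.8, 0.9, 0.85};
        double[] candidate = {0.7, 0.6, 0.80, 0.7, 0.90, 0.6, 0.7, 0.80};
        System.out.println("p = " + pValue(baseline, candidate, 2000, 42L));
    }
}
```

More iterations reduce the Monte Carlo noise in the estimated p-value, and a fixed seed makes the result identical from run to run, which is what keeps a gate decision reproducible.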
-
bootstrapIterations
public int bootstrapIterations()
Returns: the bootstrap iteration count
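As a rough illustration of what a bootstrap iteration count controls, the sketch below computes a percentile bootstrap interval for the mean of paired score differences. Whether the gate uses a percentile interval is not stated here, so treat the method, the class name, and the sample values as assumptions.

```java
import java.util.Arrays;
import java.util.Random;

// Generic percentile bootstrap; illustrative only.
final class BootstrapSketch {

    // Returns {lower, upper} bounds of a (1 - alpha) interval for the mean of diff.
    static double[] meanDiffInterval(double[] diff, int iterations, double alpha, long seed) {
        Random rng = new Random(seed); // pinned seed makes the interval reproducible
        int n = diff.length;
        double[] means = new double[iterations];
        for (int it = 0; it < iterations; it++) {
            double sum = 0.0;
            for (int i = 0; i < n; i++) {
                sum += diff[rng.nextInt(n)]; // resample items with replacement
            }
            means[it] = sum / n;
        }
        Arrays.sort(means);
        int lo = (int) Math.floor((alpha / 2.0) * (iterations - 1));
        int hi = (int) Math.ceil((1.0 - alpha / 2.0) * (iterations - 1));
        return new double[] {means[lo], means[hi]};
    }

    public static void main(String[] args) {
        double[] diff = {-0.3, 0.1, -0.2, 0.0, -0.4, -0.1};
        double[] ci = meanDiffInterval(diff, 1000, 0.05, 42L);
        System.out.println("interval: [" + ci[0] + ", " + ci[1] + "]");
    }
}
```

An interval that lies entirely below zero would indicate a drop at the chosen significance level; a larger iteration count stabilizes the interval's endpoints.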
-
severityMargin
public double severityMargin()
Returns: the per-item worst-evaluator score-drop threshold that fails the gate (guard 2)
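One way to read "per-item worst-evaluator score drop" is sketched below: for a single item, take the largest baseline-minus-candidate drop across its evaluators and compare it to the margin. The class name, the evaluator names, the margin value of 0.5, and the strict comparison are all hypothetical; the gate's real comparison may be inclusive.

```java
import java.util.Map;

// Illustrative per-item severity check; not the library's code.
final class SeverityGuardSketch {

    // Largest baseline-minus-candidate drop across the evaluators of one item.
    static double worstDrop(Map<String, Double> baseline, Map<String, Double> candidate) {
        double worst = 0.0;
        for (Map.Entry<String, Double> e : baseline.entrySet()) {
            Double after = candidate.get(e.getKey());
            if (after == null) {
                continue; // a missing evaluator is governed by a separate policy
            }
            worst = Math.max(worst, e.getValue() - after);
        }
        return worst;
    }

    // Assumption: a drop strictly greater than the margin breaches the guard.
    static boolean breaches(Map<String, Double> baseline, Map<String, Double> candidate, double margin) {
        return worstDrop(baseline, candidate) > margin;
    }

    public static void main(String[] args) {
        Map<String, Double> baseline = Map.of("faithfulness", 0.9, "relevance", 0.8);
        Map<String, Double> candidate = Map.of("faithfulness", 0.3, "relevance", 0.8);
        System.out.println("breaches: " + breaches(baseline, candidate, 0.5));
    }
}
```

A check like this complements the statistical tests: a single item that collapses can fail the gate even when the aggregate change is not significant.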
-
pairing
Returns: the item pairing strategy
-
failOnRegression
public boolean failOnRegression()
Returns: whether a detected regression fails the gate
-
failOnRemovedItems
public boolean failOnRemovedItems()
Returns: whether a removed item (present in baseline, absent in candidate) fails the gate
-
onRemovedEvaluator
Returns: the policy for an evaluator present in the baseline but missing from the candidate
-
bootstrapPasses
public boolean bootstrapPasses()
Whether a missing-baseline bootstrap writes the file and passes (the default), versus the strict approval-test stance that writes it but fails once so the run is red until the new baseline is reviewed and committed. The CI no-baseline branch is unaffected either way: it never writes (the checkout is ephemeral) and always passes with a warning.
Returns: true if a local missing-baseline bootstrap passes after writing
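The three missing-baseline outcomes described for bootstrapPasses can be summarized as a small decision function. This is a sketch of the documented behavior only: how the gate detects CI is not stated here, so the runningInCi flag, the enum, and the class name are hypothetical.

```java
// Illustrative decision table for a missing baseline; not the library's code.
final class MissingBaselineSketch {

    enum Outcome { WRITE_AND_PASS, WRITE_AND_FAIL, WARN_AND_PASS }

    static Outcome onMissingBaseline(boolean runningInCi, boolean bootstrapPasses) {
        if (runningInCi) {
            // Ephemeral checkout: never write, always pass with a warning.
            return Outcome.WARN_AND_PASS;
        }
        // Local run: the baseline file is written either way; only the verdict differs.
        return bootstrapPasses ? Outcome.WRITE_AND_PASS : Outcome.WRITE_AND_FAIL;
    }

    public static void main(String[] args) {
        System.out.println(onMissingBaseline(false, true));
        System.out.println(onMissingBaseline(false, false));
        System.out.println(onMissingBaseline(true, false));
    }
}
```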
-
updateBaseline
public boolean updateBaseline()
Returns: whether the baseline should be overwritten from the candidate and the gate pass
-