Class GateService
Evaluates the regression gate for a candidate run by running the RunComparison engine and returning a pass/fail verdict. The
entity-to-core conversion, pairing decision, and engine invocation are shared with the per-case
diff view through ComparisonSupport.
Automatic baseline selection is scoped by experiment, the candidate's dataset version, and an optional git branch. Other dimensions (evaluator set, judge model/prompt, thresholds, tenant) are not yet part of baseline selection. A mismatched evaluator set across the two runs is handled by the engine itself: an evaluator present on only one side co-occurs on no shared item and is reported UNCHANGED, never a regression.
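The scoping rule above can be illustrated with a small, self-contained sketch. It is not the service's actual query: the Run record, the inScope helper, and the exclusion of the candidate itself are hypothetical names and assumptions introduced here, and the sketch deliberately stops at "which runs are eligible" without guessing how a single baseline is chosen among them.

```java
import java.util.List;
import java.util.Objects;
import java.util.UUID;

public class BaselineScopeSketch {
    // Hypothetical, simplified view of a run; the real entity is not shown in this page.
    record Run(UUID id, UUID experimentId, String datasetVersion, String gitBranch) {}

    // Returns the runs that fall inside the baseline scope described above:
    // same experiment, same dataset version as the candidate, and, when a
    // branch is supplied, the same git branch. A null branch means "no filter".
    static List<Run> inScope(List<Run> runs, Run candidate, String branch) {
        return runs.stream()
                // Assumption: the candidate is never its own baseline.
                .filter(r -> !r.id().equals(candidate.id()))
                .filter(r -> r.experimentId().equals(candidate.experimentId()))
                .filter(r -> Objects.equals(r.datasetVersion(), candidate.datasetVersion()))
                .filter(r -> branch == null || branch.equals(r.gitBranch()))
                .toList();
    }

    public static void main(String[] args) {
        UUID experiment = UUID.randomUUID();
        Run candidate = new Run(UUID.randomUUID(), experiment, "v2", "feature");
        List<Run> runs = List.of(
                candidate,
                new Run(UUID.randomUUID(), experiment, "v2", "main"),
                new Run(UUID.randomUUID(), experiment, "v1", "main"),          // other dataset version
                new Run(UUID.randomUUID(), UUID.randomUUID(), "v2", "main"));  // other experiment

        System.out.println(inScope(runs, candidate, "main").size());
        System.out.println(inScope(runs, candidate, null).size());
    }
}
```

Dimensions such as evaluator set or judge model do not appear in the filter, which mirrors the statement that they are not yet part of baseline selection.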
Known limitation (zero-eval items): the core ItemResult.success()
treats an item with no eval results as passing (an allMatch over an empty stream is
true), so the comparison engine counts a zero-eval item as passing. The server's SQL
countItemsWithAllEvalsPassed treats a zero-eval item as not passed. These two notions of
"passing" can therefore diverge for items that carry no eval results, a pre-existing core
semantic that is not addressed here.
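The divergence described above comes down to how an empty collection is treated. The following sketch reproduces it in isolation. EvalResult, coreSuccess, and sqlStylePassed are hypothetical stand-ins, and sqlStylePassed only approximates the server-side count by requiring at least one eval result; the actual SQL is not shown in this page.

```java
import java.util.List;

public class ZeroEvalDemo {
    // Hypothetical stand-in for one evaluator outcome on an item.
    record EvalResult(String evaluator, boolean passed) {}

    // Core-style check: allMatch over an empty stream is vacuously true,
    // so an item with no eval results counts as passing.
    static boolean coreSuccess(List<EvalResult> evals) {
        return evals.stream().allMatch(EvalResult::passed);
    }

    // Assumed approximation of the server-side notion: an item passes only
    // if it has at least one eval result and every one of them passed.
    static boolean sqlStylePassed(List<EvalResult> evals) {
        return !evals.isEmpty() && evals.stream().allMatch(EvalResult::passed);
    }

    public static void main(String[] args) {
        List<EvalResult> zeroEval = List.of();
        List<EvalResult> allPass = List.of(new EvalResult("exact-match", true));

        System.out.println(coreSuccess(zeroEval));     // true
        System.out.println(sqlStylePassed(zeroEval));  // false
        System.out.println(coreSuccess(allPass));      // true
        System.out.println(sqlStylePassed(allPass));   // true
    }
}
```

The two checks agree on every item that carries at least one eval result; they differ only on the zero-eval case.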
Constructor Summary
Constructors
Constructor: GateService(ExperimentRepository experimentRepository, ExperimentRunRepository runRepository, ComparisonSupport comparisonSupport)
Method Summary
Modifier and Type: GateResult
Method: evaluateGate(UUID experimentId, GateRequest request)
Description: Evaluates the regression gate for a candidate run within an experiment.
Constructor Details
GateService
public GateService(ExperimentRepository experimentRepository, ExperimentRunRepository runRepository, ComparisonSupport comparisonSupport)
Method Details
evaluateGate
@Transactional(readOnly=true)
public GateResult evaluateGate(UUID experimentId, GateRequest request)
Evaluates the regression gate for a candidate run within an experiment.
Parameters:
experimentId - the experiment the candidate belongs to
request - the gate request; candidateRunId is required
Returns:
the gate verdict, which is PASS, FAIL, or NO_BASELINE
Throws:
IllegalArgumentException - if the experiment or a referenced run does not exist, or a run does not belong to the experiment (surfaces as 404)
IllegalStateException - if the candidate or an explicit baseline run is not terminal (surfaces as 409)
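The exception-to-status mapping noted above can be sketched as follows. This is only an illustration of the documented behavior, not the application's real error handler: the class name, the statusFor helper, and the 500 fallback are assumptions introduced here.

```java
public class GateErrorMappingSketch {
    // Maps the exceptions documented for evaluateGate to HTTP status codes.
    static int statusFor(RuntimeException e) {
        if (e instanceof IllegalArgumentException) {
            return 404; // missing experiment/run, or run outside the experiment
        }
        if (e instanceof IllegalStateException) {
            return 409; // candidate or explicit baseline run is not terminal
        }
        return 500; // assumption: anything else is treated as a server error
    }

    public static void main(String[] args) {
        System.out.println(statusFor(new IllegalArgumentException("run not found")));  // 404
        System.out.println(statusFor(new IllegalStateException("run not terminal")));  // 409
    }
}
```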