All Classes and Interfaces (Dokimos :: Parent POM 0.25.0-SNAPSHOT API)

Class

Description

AbstractScopedRepository<T>

Base implementation of ScopedRepository.

AddItemsRequest

AddItemsRequest.EvalData

AddItemsRequest.ItemData

AgentEvalCase

Typed entry point for building an EvalTestCase for the agent evaluators.

AgentEvalCase.Builder

Builder for constructing an EvalTestCase that targets the agent evaluators.

AgentTrace

Wraps a complete agent execution trace for evaluation.

AgentTrace.Builder

Builder for constructing agent traces.

AggregationStrategy

Strategies for aggregating scores from multiple evaluation criteria.

AlertWebhook

A configured HTTP endpoint that receives a JSON alert when a completed run of one of its project's experiments shows a significant pass-rate regression against its baseline.

AlertWebhookController

CRUD over a project's regression-alert webhooks.

AlertWebhookDispatcher

Delivers regression alerts to a project's enabled webhooks.

AlertWebhookRepository

Tenant-scoped repository for AlertWebhook.

AlertWebhookRepositoryFragment

Entity-specific scoped finders for AlertWebhook.

AlertWebhookRepositoryFragmentImpl

Tenant-scoped implementation of the AlertWebhook finders.

AlertWebhookService

Manages a project's regression-alert webhooks.

AlertWebhookView

Public view of an AlertWebhook.

AlignmentController

Exposes the per-evaluator judge-human alignment for a run: the agreement rate between automated evaluator verdicts and human annotations.

AlignmentService

Computes per-run, per-evaluator agreement between automated evaluator verdicts and human annotations.
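
The agreement rate described here can be illustrated with a minimal sketch. It assumes agreement means the evaluator's pass/fail verdict equals the human's verdict on the same annotated item; the `VerdictPair` record, the `agreementRate` method, and the null-when-empty convention are hypothetical and not taken from AlignmentService.

```java
import java.util.List;

// Hypothetical sketch; not the actual AlignmentService implementation.
final class AgreementSketch {

    /** One annotated item: the evaluator's verdict and the human's verdict (assumed shape). */
    record VerdictPair(boolean evaluatorPassed, boolean humanPassed) {}

    private AgreementSketch() {}

    /** Fraction of pairs whose two verdicts agree, or null when nothing is annotated. */
    static Double agreementRate(List<VerdictPair> pairs) {
        if (pairs.isEmpty()) {
            return null; // assumption: no annotations means no rate, rather than 0 or 1
        }
        long agreeing = pairs.stream()
                .filter(p -> p.evaluatorPassed() == p.humanPassed())
                .count();
        return (double) agreeing / pairs.size();
    }
}
```

With four annotated items of which three agree, this sketch yields 0.75.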

AlignmentView

Per-run, per-evaluator agreement between automated evaluator verdicts and human annotations.

AlignmentView.EvaluatorAlignment

Agreement breakdown for a single evaluator across a run's annotated items.

Annotation

A human reviewer's verdict on a single ItemResult.

AnnotationController

Endpoints for the single human review annotation on a run item result.

AnnotationRepository

AnnotationRequest

Payload for creating or updating the annotation on a run item result.

AnnotationService

Service for human review annotations on run item results.

AnnotationVerdict

A human reviewer's verdict on a single run item result.

AnnotationView

Read model for an annotation on a run item result.

ApiKey

A scoped API key used to authenticate write requests against /api/v1/**.

ApiKeyAuthenticator

Authenticator backed by two credential sources that coexist for backward compatibility: the single legacy key configured via DOKIMOS_API_KEY (which maps to Role.ADMIN), and the scoped keys stored in api_keys (each hashed, each carrying its own role and tenant).

ApiKeyAuthFilter

Servlet filter that authenticates /api/v1/** by delegating credential resolution to an Authenticator, then enforces role-based authorization on the resolved Principal.

ApiKeyController

Manages scoped API keys.

ApiKeyHasher

Hashes raw API keys to the SHA-256 hex form stored in the database.
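
As a rough illustration of hashing a raw key to SHA-256 hex, the sketch below uses only the JDK. The class name `Sha256HexSketch`, the UTF-8 encoding, the lowercase output, and the absence of a salt are assumptions; the actual ApiKeyHasher may differ on any of them. `HexFormat` requires Java 17 or later.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

// Hypothetical sketch; not the actual ApiKeyHasher implementation.
final class Sha256HexSketch {

    private Sha256HexSketch() {}

    /** Returns the lowercase SHA-256 hex digest of the raw key's UTF-8 bytes. */
    static String hash(String rawKey) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] bytes = digest.digest(rawKey.getBytes(StandardCharsets.UTF_8));
            return HexFormat.of().formatHex(bytes); // 32 bytes become 64 hex characters
        } catch (NoSuchAlgorithmException e) {
            // Every conforming Java platform is required to provide SHA-256.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(hash("abc"));
    }
}
```

Storing only the digest means a lookup hashes the presented key and compares it against the stored hex string, so the raw key never has to be persisted.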

ApiKeyProperties

Configuration properties for API key authentication.

ApiKeyRepository

ApiKeyService

Mints and manages scoped API keys.

ApiKeyView

Metadata view of an ApiKey.

ArgMatchMode

Controls how a tool call's arguments are compared against the expected arguments.

ArgumentMatcher

Strategy for comparing a tool call's actual arguments against the expected arguments.

Assertions

Assertion utilities for evaluation-based testing.

AsyncTask

A non-blocking task that produces a TaskResult asynchronously.

Authenticator

Strategy that resolves the Principal behind an incoming request from its credential.

BaseEvaluator

Base class for implementing concrete evaluators.

BaselineFile

The committed regression-gate baseline (format v1).

BaselineFile.BaselineItem

One item's projection.

BaselineFile.DatasetInfo

Dataset summary metadata (advisory; never used for pairing).

BaselineFile.EvaluatorEntry

One evaluator's recorded outcome for an item.

BaselineFile.JudgeInfo

Judge provenance (advisory).

BaselineFile.Provenance

Build/judge context, kept separate from the measured signal and excluded from the byte-stability guarantee.

BaselineStore

Reads and writes the regression-gate baseline file, and reconstructs comparison-ready RunResults from it.

CallMetrics

Optional metrics describing the LLM call that produced an item result.

ClasspathDatasetResolver

ComparisonStatus

Classification of how a metric or item changed between a baseline and a candidate.

ComparisonSupport

Shared entity-to-core conversion and run-comparison invocation for the regression gate and the per-case diff view.

ComparisonSupport.ComparisonOutcome

Result of comparing two runs: the engine output, the pairing strategy, and the runs' loaded item entities so callers (the diff view) can derive per-case input text without re-querying.

ContextualRelevanceEvaluator

Evaluator that measures how relevant retrieved context chunks are to a user's input query.

ContextualRelevanceEvaluator.Builder

Builder for constructing ContextualRelevanceEvaluator instances.

ContextualRelevanceEvaluator.ContextScore

Represents the relevance score for a single context chunk.

ConversationalApplication

Functional interface representing an application that can engage in multi-turn conversations.

ConversationSimulator

Orchestrates multi-turn conversations between a simulated user and an application.

ConversationSimulator.Builder

Builder for constructing conversation simulators.

ConversationTrajectory

Represents a complete conversation trajectory between a simulated user and an application.

ConversationTrajectory.Builder

Builder for constructing conversation trajectories.

CreateAlertWebhookRequest

Request to register an alert webhook for a project.

CreateApiKeyRequest

Request to mint a scoped API key.

CreatedApiKeyView

Response returned exactly once when a key is minted.

CreateDatasetRequest

CreateLlmConnectionRequest

Request to register an OpenAI-compatible LLM connection.

CreateRunRequest

CreateRunResponse

CreateTraceEvalRuleRequest

Request to create or replace a trace eval rule.

CreateVersionRequest

CreateVersionRequest.ItemPayload

Dataset

A collection of examples for evaluation.

Dataset

A named container that owns one or more immutable DatasetVersions.

Dataset.Builder

Builder for constructing datasets.

DatasetArgumentsProvider

JUnit ArgumentsProvider that loads Examples from a Dataset.

DatasetController

DatasetDetails

DatasetDetails.VersionSummary

DatasetItem

A single example within a DatasetVersion.

DatasetItemRepository

DatasetItemView

DatasetParser

DatasetReporter

Marks a static field that supplies the Reporter for DatasetSource tests.

DatasetRepository

Tenant-scoped repository for Dataset.

DatasetRepositoryFragment

Entity-specific scoped finders for Dataset.

DatasetRepositoryFragmentImpl

Tenant-scoped implementation of the Dataset finders.

DatasetResolutionException

Thrown when a dataset cannot be resolved or fails to load.

DatasetResolver

Resolves a dataset URI to a Dataset.

DatasetResolverRegistry

Singleton registry for dataset resolvers.

DatasetRunExtension

Opens and reports a Reporter run for each DatasetSource test method.

DatasetRunExtension.DatasetItemRecorder

Collects the actual outputs and evaluation results a DatasetSource test body produced for one parameterized invocation.

DatasetService

Service for server-owned versioned datasets.

DatasetSource

Provides Examples from a Dataset as arguments to a parameterized test.

DatasetSummary

DatasetVersion

An immutable snapshot of a Dataset.

DatasetVersionDetails

DatasetVersionRepository

DiffCase

One row of the per-case run-diff table: a single item compared across baseline and candidate, with its per-evaluator old-to-new deltas.

DiffCase.EvaluatorDiff

A single evaluator's change on one item between baseline and candidate.

DiffController

Exposes the per-case run-diff view: the same comparison the CI gate runs, presented as a full, paginated table of every case with per-evaluator deltas.

DiffService

Builds the per-case run-diff view for the UI.

DiffSummary

Whole-run summary of a per-case run diff.

DiffView

Combined per-case run-diff payload: the whole-run summary plus the first (or requested) page of cases.

DokimosMcpServer

MCP server that exposes Dokimos evaluation tools over stdio transport.

DokimosServerApplication

DokimosServerReporter

Async HTTP Reporter that sends experiment results to a Dokimos server.

DokimosServerReporter.Builder

Builder for DokimosServerReporter.

DokimosServerReporter.ItemDeliveryFailure

Describes a batch of items permanently dropped after all delivery retries were exhausted.

DokimosTypeConversionException

Thrown when a stored output value cannot be converted to the requested type by one of the typed read accessors (for example EvalTestCase.actualOutputAs(Class) or Example.expectedOutputAs(dev.dokimos.core.OutputType)).

EmbabelSupport

Entry point for evaluating Embabel agent runs with Dokimos.

EmbabelTraceCollector

Accumulates an Embabel agent's tool calls into a Dokimos AgentTrace.

EnqueueJudgeRequest

Request to enqueue a server-side judge job for a run.

EvalJob

A unit of server-side scoring work: score every not-yet-evaluated item of a run with a single evaluator configuration, using a registered LlmConnection.

EvalJobController

EvalJobRepository

EvalJobService

Enqueues server-side judge jobs and reads the jobs registered for a run.

EvalJobStatus

Lifecycle state of an EvalJob.

EvalJobView

Public view of an EvalJob, returned by the enqueue endpoint and the per-run job listing.

EvalResult

The result of an evaluation.

EvalResult

EvalResult.Builder

Builder for constructing evaluation results.

EvalResultRepository

EvalTestCase

A test case for evaluation.

EvalTestCase.Builder

Builder for constructing test cases with multiple inputs and outputs.

EvalTestCaseParam

EvaluationCriterion

Defines a single evaluation dimension for trajectory evaluation.

EvaluationException

Thrown when an evaluation cannot be executed successfully.

Evaluator

Evaluates test cases and produces scored results.

EvaluatorDelta

Change in a single evaluator's mean score between baseline and candidate.

ExactMatchEvaluator

Evaluator that checks for exact string match between actual and expected outputs.

ExactMatchEvaluator.Builder

Example

A dataset example with inputs, expected outputs, and metadata.

Example.Builder

Builder for constructing examples with multiple inputs and outputs.

Experiment

An evaluation experiment that runs a task against a dataset and evaluates the results.

ExperimentRepository

Tenant-scoped repository for Experiment.

ExperimentRepositoryFragment

Entity-specific scoped finders for Experiment.

ExperimentRepositoryFragmentImpl

Tenant-scoped implementation of the Experiment finders.

ExperimentResult

Aggregated results from an experiment.

ExperimentResultExporter

Utility class for exporting ExperimentResult to various formats.

ExperimentRun

ExperimentRunRepository

Tenant-scoped repository for ExperimentRun.

ExperimentRunRepositoryFragment

Entity-specific scoped finders for ExperimentRun.

ExperimentRunRepositoryFragmentImpl

Tenant-scoped implementation of the ExperimentRun finders.

ExperimentService

ExperimentSummary

ExperimentSummary.LatestRunInfo

FaithfulnessEvaluator

Evaluator that uses an LLM to check how much of the actual output is backed by the given context.

FaithfulnessEvaluator.Builder

FileDatasetResolver

Resolves datasets from the filesystem.

GateConfig

Configuration for the server-free regression gate.

GateConfig.Builder

Builder for GateConfig.

GateConfig.Pairing

How baseline and candidate items are paired.

GateConfig.RemovedEvaluatorPolicy

What to do when an evaluator present in the baseline is missing from the candidate.

GateController

Exposes the CI regression gate.

GateRequest

Request to evaluate a CI regression gate for an already-ingested candidate run.

GateResult

Verdict of a CI regression gate.

GateResult.EvaluatorDrop

A per-item evaluator score drop.

GateResult.RegressedCase

A single regressed item, identified by its dataset item id when paired by id or by its positional index otherwise.

GateResult.RegressedEvaluator

A single evaluator's regression between baseline and candidate.

GateService

Evaluates a CI regression gate by comparing an already-ingested candidate run against a resolved baseline run with the core RunComparison engine and returning a pass/fail verdict.

GateVerdict

The result of a regression-gate comparison: the overall status, the pass-rate move, the regressed evaluators and cases, and any coverage-loss or threshold-drift warnings.

GateVerdict.EvaluatorDrop

A per-item evaluator score drop.

GateVerdict.RegressedCase

A single regressed item, identified by its dataset item id when paired by id or by its positional index otherwise.

GateVerdict.RegressedEvaluator

A single evaluator's significant regression between baseline and candidate.

GlobalExceptionHandler

HallucinationEvaluator

Evaluator that uses an LLM to detect hallucinations in the actual output.

HallucinationEvaluator.Builder

IngestedBatch

Marks an item batch (identified by idempotency key) as committed for a run.

IngestedBatch.IngestedBatchId

Composite primary key.

IngestedBatchRepository

ItemComparison

Comparison of a single paired item across baseline and candidate.

ItemResult

The outcome of executing a single example: the example itself, the actual outputs the task produced, the evaluator results, and optional metrics describing the underlying LLM call.

ItemResult

ItemResultRepository

Json

Shared, framework-internal JSON utilities backed by a single, immutable Jackson ObjectMapper.

JsonResultStore

Stores run records as a JSON array in a local file.

JudgeCallException

Unchecked failure raised when a judge HTTP call cannot complete.

JudgeJobTransactions

The transactional steps of the judge worker, each in its own committed transaction so the worker can make the LLM HTTP call between them without holding a database transaction open.

JudgeJobTransactions.ItemSnapshot

A detached view of an item result carrying only the fields the judge prompt needs.

JudgeJobTransactions.ScoredResult

Pairs an eval result with the id of the item result it belongs to, for batch persistence.

JudgeLM

A language model used for evaluation.

JudgeProperties

Tuning knobs for the background judge worker, bound from the dokimos.judge prefix.

JudgeScorer

Drives a single judge scoring: builds the prompt from the criteria and the selected parameters, calls the underlying JudgeLM, and parses the response.

JudgeScorer.ScoreOutcome

The result of scoring one item: a numeric score, the judge's reasoning, and the pass decision.

JudgeWorker

Background worker that drains the judge job queue.

LangChain4jSupport

Utilities for integrating with LangChain4j.

LlmConnection

A named, reusable pointer to an OpenAI-compatible endpoint used by the server-side judge.

LlmConnectionController

LlmConnectionProtocol

The API surface an LlmConnection's endpoint speaks.

LlmConnectionRepository

Tenant-scoped repository for LlmConnection.

LlmConnectionRepositoryFragment

Entity-specific scoped finders for LlmConnection.

LlmConnectionRepositoryFragmentImpl

Tenant-scoped implementation of the LlmConnection finders.

LlmConnectionService

Registers and reads LLM connections.

LlmConnectionView

Public view of an LlmConnection.

LlmCredentialService

Resolves and protects the API key for an LlmConnection.

LLMJudgeEvaluator

Evaluator that uses an LLM to evaluate outputs based on the specified criteria.

LLMJudgeEvaluator.Builder

LlmResponseUtils

Utility methods for processing LLM responses.

LLMSimulatedUser

An LLM-based simulated user for multi-turn conversation testing.

LLMSimulatedUser.Builder

Builder for constructing LLM simulated users.

MatchingStrategy

Strategy for determining if a retrieved item matches an expected item.

MeasuredTask

A task that produces outputs together with optional CallMetrics describing the underlying LLM call.

Message

Represents a single message in a conversation.

Message.Role

The role of a message sender in a conversation.

MetadataEntry

A single typed run-metadata key-value pair for DatasetSource.entries().

NoOpReporter

A no-op implementation of Reporter that does nothing.

OpenAiCompatibleJudge

A JudgeLM that calls an OpenAI-compatible chat completions endpoint over the JDK HTTP client, with no vendor SDK.
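
To show what calling a chat completions endpoint with only the JDK HTTP client can look like, here is a minimal sketch that builds (but does not send) such a request. The class and method names, the 30-second timeout, the single user message, and the hand-rolled JSON escaping are all assumptions for illustration; they are not taken from OpenAiCompatibleJudge.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

// Hypothetical sketch; not the actual OpenAiCompatibleJudge implementation.
final class ChatCompletionRequestSketch {

    private ChatCompletionRequestSketch() {}

    /** Escapes backslashes, quotes, and control characters for use inside a JSON string. */
    static String jsonEscape(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '\\' -> out.append("\\\\");
                case '"' -> out.append("\\\"");
                case '\n' -> out.append("\\n");
                case '\r' -> out.append("\\r");
                case '\t' -> out.append("\\t");
                default -> {
                    if (c < 0x20) {
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
                }
            }
        }
        return out.toString();
    }

    /** Builds a chat completions body with one user message. */
    static String body(String model, String prompt) {
        return "{\"model\":\"" + jsonEscape(model)
                + "\",\"messages\":[{\"role\":\"user\",\"content\":\""
                + jsonEscape(prompt) + "\"}]}";
    }

    /** Builds the POST request; baseUrl is assumed to have no trailing slash. */
    static HttpRequest build(String baseUrl, String apiKey, String model, String prompt) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/chat/completions"))
                .timeout(Duration.ofSeconds(30)) // assumed timeout
                .header("Content-Type", "application/json")
                .header("Authorization", "Bearer " + apiKey)
                .POST(HttpRequest.BodyPublishers.ofString(body(model, prompt)))
                .build();
    }
}
```

The resulting request would be sent with `java.net.http.HttpClient.send`. A production implementation would more likely serialize the body with a JSON library than escape strings by hand.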

OpenAiSupport

Utilities for integrating with the OpenAI Java SDK.

OpenResponsesJudge

A JudgeLM that calls an Open Responses endpoint (POST {baseUrl}/responses) over the JDK HTTP client, with no vendor SDK.

OtlpAnyValue

The OTLP AnyValue union in its JSON encoding.

OtlpExportTraceServiceRequest

The top level of the OTLP/HTTP JSON encoding of ExportTraceServiceRequest.

OtlpKeyValue

A single OTLP attribute: a key and an OtlpAnyValue holding the typed value.

OtlpProtobufConverter

Translates the protobuf ExportTraceServiceRequest into the same DTO tree Jackson produces for the OTLP/HTTP JSON encoding, so both encodings share one parser and persistence path.

OtlpResource

A resource and its attributes.

OtlpResourceSpans

The spans emitted by one resource, with that resource's attributes (for example service.name).

OtlpScopeSpans

The spans emitted by one instrumentation scope.

OtlpSpan

One OTLP span in the JSON encoding.

OtlpStatus

A span status.

OtlpTraceParser

Pure translation from the OTLP/HTTP JSON request DTOs into the server's own span model, with no persistence.

OtlpTraceParser.ParsedSpan

A span translated from OTLP, carrying the merged attributes and derived input/output text.

OtlpTraceParser.Result

The outcome of parsing: the valid spans and the count of malformed spans that were skipped.

OutputType<T>

A super-type token that captures a full generic type T (including its type arguments) at compile time so it survives erasure at runtime.
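
The super-type token idiom can be sketched as follows. `TypeTokenSketch` is a hypothetical stand-in, not the OutputType source: it assumes the token is created as an anonymous subclass, so the type argument is recorded in the subclass's generic superclass and can be read back reflectively at runtime.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;

// Hypothetical stand-in for illustration; not the actual OutputType source.
abstract class TypeTokenSketch<T> {

    private final Type type;

    protected TypeTokenSketch() {
        // For an anonymous subclass, the generic superclass carries the type argument.
        Type superclass = getClass().getGenericSuperclass();
        if (!(superclass instanceof ParameterizedType parameterized)) {
            throw new IllegalStateException("Create an anonymous subclass with a type argument");
        }
        this.type = parameterized.getActualTypeArguments()[0];
    }

    /** The full generic type captured at the creation site. */
    Type type() {
        return type;
    }
}

final class TypeTokenDemo {
    public static void main(String[] args) {
        // The trailing {} creates the anonymous subclass that preserves List<String>.
        Type captured = new TypeTokenSketch<List<String>>() {}.type();
        System.out.println(captured.getTypeName());
    }
}
```

A plain `Class<T>` argument cannot express `List<String>`, which is why typed accessors that must honor type arguments tend to accept a token like this instead.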

PageResponse<T>

Stable page envelope used by paginated endpoints.

PageResponse.PageablePage

Mirrors Spring's PageableObject JSON shape.

PageResponse.PageSort

Mirrors Spring's SortObject JSON shape.

PrecisionEvaluator

Evaluator that measures retrieval precision.

PrecisionEvaluator.Builder

Builder for constructing PrecisionEvaluator instances.

PriceTable

Turns a model id and token counts into a USD cost, returning null when the model is unknown or a count is missing (never throws, never fabricates a number).
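
The null-returning contract described here can be sketched as below. The `Price` record, the per-million-token pricing unit, the constructor, and the `cost` signature are assumptions for illustration only; the model id and prices in any usage are made up and do not reflect the actual PriceTable or any real price list.

```java
import java.util.Map;

// Hypothetical sketch; not the actual PriceTable implementation.
final class PriceTableSketch {

    /** USD per one million input and output tokens (assumed pricing unit). */
    record Price(double inputPerMillion, double outputPerMillion) {}

    private final Map<String, Price> prices;

    PriceTableSketch(Map<String, Price> prices) {
        this.prices = Map.copyOf(prices);
    }

    /** Returns the USD cost, or null when the model is unknown or a token count is missing. */
    Double cost(String modelId, Integer inputTokens, Integer outputTokens) {
        if (modelId == null || inputTokens == null || outputTokens == null) {
            return null; // missing input: report "unknown" instead of guessing
        }
        Price price = prices.get(modelId);
        if (price == null) {
            return null; // unknown model: no fabricated number
        }
        return (inputTokens * price.inputPerMillion()
                + outputTokens * price.outputPerMillion()) / 1_000_000.0;
    }
}
```

Returning null rather than zero keeps "cost unknown" distinguishable from "cost is free" in downstream aggregates.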

Principal

Authenticated caller.

Project

A project groups experiments.

ProjectController

ProjectRepository

Tenant-scoped repository for Project.

ProjectRepositoryFragment

Entity-specific scoped finders for Project, implemented over the tenant-scoped query helper.

ProjectRepositoryFragmentImpl

Tenant-scoped implementation of the Project finders.

ProjectService

ProjectSummary

PromoteRequest

Payload for promoting run item results into a new version of an existing dataset.

PromoteRequest.PromoteItem

A single run item result to promote.

RecallEvaluator

Evaluator that measures retrieval recall.
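
For orientation, the textbook definitions of retrieval recall and of its counterpart, precision, can be sketched as below. This assumes items match by string equality and that an empty denominator scores 0.0; both are assumptions, and the Dokimos evaluators may match items differently (for example through a MatchingStrategy).

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the textbook metrics; not the Dokimos evaluators themselves.
final class RetrievalMetricsSketch {

    private RetrievalMetricsSketch() {}

    /** Share of distinct retrieved items that are relevant. */
    static double precision(List<String> retrieved, Set<String> relevant) {
        Set<String> distinct = new HashSet<>(retrieved);
        if (distinct.isEmpty()) {
            return 0.0; // assumed convention for "nothing retrieved"
        }
        return (double) hits(distinct, relevant) / distinct.size();
    }

    /** Share of relevant items that were retrieved. */
    static double recall(List<String> retrieved, Set<String> relevant) {
        if (relevant.isEmpty()) {
            return 0.0; // assumed convention for "nothing expected"
        }
        return (double) hits(new HashSet<>(retrieved), relevant) / relevant.size();
    }

    /** Counts retrieved items that also appear in the relevant set. */
    private static long hits(Set<String> retrieved, Set<String> relevant) {
        return retrieved.stream().filter(relevant::contains).count();
    }
}
```

For retrieved items a, b, c, d and relevant items a, c, e, two items overlap, so precision is 2/4 and recall is 2/3.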

RecallEvaluator.Builder

Builder for constructing RecallEvaluator instances.

RegexEvaluator

Evaluator that checks if the actual output matches a regular expression pattern.

RegexEvaluator.Builder

RegressionAlertEvent

Published inside the run-completion transaction when a run regresses, and consumed only after that transaction commits.

RegressionAlertPayload

JSON body POSTed to a project's alert webhooks when a completed run regresses against its baseline.

RegressionAlertService

Computes, inside the run-completion transaction, whether a just-completed run regressed against its baseline, and publishes a RegressionAlertEvent when it did.

RegressionGate

Compares a candidate experiment result against a committed baseline and produces a GateVerdict.

RegressionGateRunner

The run lifecycle around RegressionGate: resolves whether to update, bootstrap, or compare a baseline, performs the file I/O, and turns a FAIL verdict into an AssertionError.

RegressionGateRunner.Environment

Reads the outside world the lifecycle depends on.

Reporter

Interface for reporting experiment results to an external system.

ResultStore

Persistence layer for evaluation run records.

ReviewQueueController

ReviewQueueItem

A run item surfaced in the review queue because it still needs a human verdict, carrying enough run, experiment, and project context for a reviewer to act on it without opening the run first.

ReviewQueueService

Surfaces run items that still need a human verdict so a reviewer can work through a single queue instead of opening runs one at a time.

Role

Role granted to an authenticated Principal.

RunComparison

Regression-comparison engine that compares a baseline set of runs against a candidate set.

RunComparison.Builder

Builder for RunComparison.

RunComparisonResult

Result of comparing a baseline set of runs against a candidate set.

RunController

RunDetails

Detail view of a single run.

RunDetails.EvalSummary

RunDetails.ItemSummary

RunHandle

A handle representing an active experiment run.

RunRecord

Persistent record of a single evaluation run.

RunRecord.EvalDetail

Single evaluator result for one example.

RunRecord.ItemDetail

Detail for a single evaluated example.

RunResult

Results from a single run of an experiment.

RunService

RunStatus

Status of an experiment run.

RunStatus

RunSummary

Summary view of a single run in a list.

ScopedRepository<T>

Common tenant-scoped operations shared by every scoped repository.

ServerDatasetResolver

Resolves dataset URIs of the form dataset://<name>@<version> against a Dokimos server.
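
A minimal sketch of splitting a URI of this form into its two parts follows. The `DatasetRef` record and `parse` method are hypothetical, and the sketch assumes both name and version are required and that the last `@` separates them; the actual resolver may accept other variants.

```java
// Hypothetical sketch; not the actual ServerDatasetResolver implementation.
final class DatasetUriSketch {

    /** The two parts of a dataset://<name>@<version> URI (assumed shape). */
    record DatasetRef(String name, String version) {}

    private static final String PREFIX = "dataset://";

    private DatasetUriSketch() {}

    /** Splits the URI at its last '@'; rejects a missing prefix, name, or version. */
    static DatasetRef parse(String uri) {
        if (uri == null || !uri.startsWith(PREFIX)) {
            throw new IllegalArgumentException("Expected a URI starting with " + PREFIX);
        }
        String rest = uri.substring(PREFIX.length());
        int at = rest.lastIndexOf('@');
        if (at <= 0 || at == rest.length() - 1) {
            throw new IllegalArgumentException("Expected dataset://<name>@<version>");
        }
        return new DatasetRef(rest.substring(0, at), rest.substring(at + 1));
    }
}
```

Plain string handling is used instead of `java.net.URI` because a generic URI parser would read the text before `@` as user info rather than as the dataset name.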

SignificanceResult

Outcome of a statistical significance test on a paired comparison.

SimulatedUser

Functional interface for simulating user behavior in multi-turn conversations.

SpanView

Public view of a TraceSpan, including its derived input/output text and flattened attributes.

SpaResourceConfig

Serves the single-page React app and forwards client-side routes to index.html so that a deep link or a page refresh on a route such as /traces or /api-keys is handled by the SPA router instead of returning 404.

SpringAiAlibabaSupport

Utilities for evaluating Spring AI Alibaba (graph/agent) runs with the Dokimos agent evaluators.

SpringAiAlibabaSupport.AlibabaAgentResponse

A Spring AI Alibaba agent run's output text paired with optional token usage, consumed by SpringAiAlibabaSupport.measuredAsyncTask(Function, String, PriceTable).

SpringAiSupport

Utilities for integrating with Spring AI.

StructuralMatchEvaluator

Compares the actual output against the expected output as JSON structures rather than as opaque strings.

StructuralMatchEvaluator.Builder

Builder for constructing structural match evaluators.

StructuralMatchMode

Controls how StructuralMatchEvaluator compares an expected JSON structure against an actual one.

Task

TaskCompletionEvaluator

Evaluates whether an AI agent completed the user's requested tasks.

TaskCompletionEvaluator.Builder

Builder for constructing the evaluator.

TaskResult

The outcome of executing a MeasuredTask: the actual outputs and optional metrics describing the underlying LLM call.

TenantPredicate

Builds the SQL/JPQL tenant predicate for a TenantScope against a tenant_id path.

TenantScope

Immutable tenant visibility used by every scoped repository read and by service-side write stamping.

TenantScopedFinder<T>

Reusable tenant-scoped query helper shared by every scoped repository implementation.

TenantScopeResolver

Derives the TenantScope and principal id for the current request from the Principal the auth filter placed on the request attribute.

TolerantArgumentMatcher

Default ArgumentMatcher that compares arguments structurally with a few deliberate tolerances.

TolerantArgumentMatcher.Builder

Builder for constructing the matcher.

ToolArgumentHallucinationEvaluator

Uses a judge LLM to assess whether tool call argument values are factually grounded in the user's input and preceding tool call results.

ToolArgumentHallucinationEvaluator.Builder

Builder for constructing the evaluator.

ToolCall

Represents a single tool invocation made by an AI agent.

ToolCall.Builder

Builder for constructing tool calls.

ToolCalls

Coercion helpers for turning a raw value into a typed List<ToolCall>.

ToolCallValidityEvaluator

Validates that tool calls are syntactically correct per their JSON schema definitions.

ToolCallValidityEvaluator.Builder

Builder for constructing the evaluator.

ToolCorrectnessEvaluator

Checks whether the agent used the expected set of tools.

ToolCorrectnessEvaluator.Builder

Builder for constructing the evaluator.

ToolCorrectnessEvaluator.MatchMode

The comparison mode for evaluating tool correctness.

ToolDefinition

Describes an available tool's contract including its name, description, and JSON schema.

ToolDefinition.Builder

Builder for constructing tool definitions.

ToolDescriptionReliabilityEvaluator

Evaluates tool description quality using a mix of rule-based checks and optional LLM checks.

ToolDescriptionReliabilityEvaluator.Builder

Builder for constructing the evaluator.

ToolEfficiencyEvaluator

Measures how efficiently an agent used its tools by detecting redundant calls.

ToolEfficiencyEvaluator.Builder

Builder for constructing the evaluator.

ToolErrorEvaluator

Detects tool execution failures by inspecting ToolCall.result().

ToolErrorEvaluator.Builder

Builder for constructing the evaluator.

ToolHandlers

Implements the four MCP tool handlers for the Dokimos evaluation framework.

ToolNameReliabilityEvaluator

Evaluates tool naming quality using a mix of rule-based checks and optional LLM checks.

ToolNameReliabilityEvaluator.Builder

Builder for constructing the evaluator.

ToolTrajectoryEvaluator

Evaluates an agent's tool-call trajectory against an expected trajectory using a selectable match mode.

ToolTrajectoryEvaluator.Builder

Builder for constructing the evaluator.

ToolTrajectoryEvaluator.MatchMode

The comparison mode for evaluating a trajectory.

Trace

A single distributed execution trace ingested over OTLP, holding one or more TraceSpan rows.

TraceController

OTLP trace ingestion and read endpoints.

TraceDetail

Full view of one trace: its metadata, all of its spans, and the online eval jobs scoped to those spans.

TraceEvalEnqueuer

Decides which online eval jobs to create for a freshly ingested trace and persists them.

TraceEvalJob

A unit of online scoring work: score one ingested TraceSpan's derived output against one TraceEvalRule's judge configuration.

TraceEvalJobRepository

TraceEvalJobStatus

Lifecycle state of a TraceEvalJob.

TraceEvalJobTransactions

The transactional steps of the trace eval worker, each in its own committed transaction so the worker can make the LLM HTTP call between them without holding a database transaction open.

TraceEvalJobTransactions.ClaimedJob

A claimed job paired with the span text snapshots the worker scores after the transaction closes.

TraceEvalJobView

Public view of a TraceEvalJob, returned in the per-trace detail.

TraceEvalRule

A per-project rule that enqueues an online judge evaluation when an ingested span matches.

TraceEvalRuleController

Manages per-project trace eval rules.

TraceEvalRuleRepository

Tenant-scoped repository for TraceEvalRule.

TraceEvalRuleRepositoryFragment

Entity-specific scoped finders for TraceEvalRule.