Class ToolCorrectnessEvaluator
java.lang.Object
dev.dokimos.core.BaseEvaluator
dev.dokimos.core.evaluators.agents.ToolCorrectnessEvaluator
All Implemented Interfaces:
Evaluator
Checks whether the agent used the expected set of tools.
This is a glass-box evaluator for tool proficiency that compares actual tool calls against expected tool calls. The score is the F1-score of tool name sets, balancing precision and recall. No LLM is required.
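As a rough illustration of the scoring described above, here is a minimal sketch of an F1 score over two sets of tool names. It is not the library's implementation: the class name `ToolNameF1Sketch`, the method `f1`, and the handling of empty sets are assumptions made for this example.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for illustration; not part of dev.dokimos.
class ToolNameF1Sketch {

    // F1 over two sets of tool names: the harmonic mean of precision and recall.
    static double f1(Set<String> expected, Set<String> actual) {
        // Assumption: two empty sets count as a perfect match.
        if (expected.isEmpty() && actual.isEmpty()) {
            return 1.0;
        }
        // Tool names that were both expected and actually called.
        Set<String> common = new HashSet<>(actual);
        common.retainAll(expected);
        if (common.isEmpty()) {
            return 0.0;
        }
        // Precision: share of called tools that were expected.
        double precision = (double) common.size() / actual.size();
        // Recall: share of expected tools that were called.
        double recall = (double) common.size() / expected.size();
        return 2 * precision * recall / (precision + recall);
    }

    public static void main(String[] args) {
        Set<String> expected = Set.of("search", "calculator");
        Set<String> actual = Set.of("search", "weather");
        // One of two called tools was expected, one of two expected tools was called.
        System.out.println(f1(expected, actual)); // prints 0.5
    }
}
```

An unexpected tool call lowers precision and a missing expected call lowers recall, so either kind of mistake reduces the score.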
Supports multiple match modes:
- NAMES_ONLY: compares tool name sets (default)
- NAMES_AND_ORDER: also checks invocation order
- NAMES_AND_ARGS: full structural comparison including arguments
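To make the difference between the three modes concrete, the sketch below compares two lists of tool calls at each level of strictness. Everything here is hypothetical: the `Call` record, the `Mode` enum, and the `matches` method are stand-ins written for this example, and the boolean exact-match semantics (including treating NAMES_AND_ARGS as order-sensitive) are assumptions, not the evaluator's actual scoring.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical illustration; none of these types belong to dev.dokimos.
class MatchModeSketch {

    // Stand-in for a recorded tool call: a tool name plus its arguments.
    record Call(String name, Map<String, Object> args) {}

    // Mirrors the three mode names listed above.
    enum Mode { NAMES_ONLY, NAMES_AND_ORDER, NAMES_AND_ARGS }

    // Extracts tool names in invocation order.
    static List<String> names(List<Call> calls) {
        return calls.stream().map(Call::name).collect(Collectors.toList());
    }

    // Assumed semantics: true when the actual calls match the expected calls under the mode.
    static boolean matches(Mode mode, List<Call> expected, List<Call> actual) {
        switch (mode) {
            case NAMES_ONLY:
                // Same set of names; order and arguments are ignored.
                return new HashSet<>(names(expected)).equals(new HashSet<>(names(actual)));
            case NAMES_AND_ORDER:
                // Same names in the same order; arguments are ignored.
                return names(expected).equals(names(actual));
            case NAMES_AND_ARGS:
                // Same names, order, and arguments (record and map equality).
                return expected.equals(actual);
            default:
                throw new IllegalArgumentException("unknown mode: " + mode);
        }
    }

    public static void main(String[] args) {
        List<Call> expected = List.of(
                new Call("search", Map.of("q", "java")),
                new Call("calculator", Map.of("expr", "1+1")));
        // Same tools and order as expected, but a different search argument.
        List<Call> actual = List.of(
                new Call("search", Map.of("q", "kotlin")),
                new Call("calculator", Map.of("expr", "1+1")));

        System.out.println(matches(Mode.NAMES_ONLY, expected, actual));      // prints true
        System.out.println(matches(Mode.NAMES_AND_ORDER, expected, actual)); // prints true
        System.out.println(matches(Mode.NAMES_AND_ARGS, expected, actual));  // prints false
    }
}
```

Each mode is strictly more demanding than the previous one in this sketch, so a pair of call lists that passes NAMES_AND_ARGS also passes the other two.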
Nested Class Summary
Nested Classes
Modifier and Type   Class   Description
static class                Builder for constructing the evaluator.
static enum                 The comparison mode for evaluating tool correctness.
Method Summary
Modifier and Type   Method      Description
                    builder()   Creates a new builder for constructing the evaluator.

Methods inherited from class dev.dokimos.core.BaseEvaluator
evaluate, evaluateAsync, evaluateAsync, name, threshold
Method Details
builder
Creates a new builder for constructing the evaluator.
Returns:
a new builder