Class ToolArgumentHallucinationEvaluator
java.lang.Object
dev.dokimos.core.BaseEvaluator
dev.dokimos.core.evaluators.agents.ToolArgumentHallucinationEvaluator
All Implemented Interfaces:
Evaluator
Uses a judge LLM to assess whether tool call argument values are factually
grounded in the user's input and preceding tool call results.
This is a glass-box evaluator for tool proficiency. For each tool call, the judge evaluates whether argument values can be derived from the user's request or from the results of earlier tool calls in the same execution. This supports multi-step agent workflows where later tool arguments are derived from earlier tool results (e.g., a search returns URLs, then a fetch tool uses one of those URLs).
When ToolCall.result() is populated, the result
is included as grounding context for subsequent tool calls. When result is null,
only the user input is considered as grounding context.
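The grounding rule above can be illustrated with a minimal sketch. It is one reading of the description, not the library's implementation: the `Call` record and the `contextsFor` helper are hypothetical stand-ins, and the real `ToolCall` type may carry different fields. For each call, the sketch collects the user input plus every non-null result of the calls that came before it.

```java
import java.util.ArrayList;
import java.util.List;

public class GroundingContextSketch {

    // Hypothetical stand-in for a tool call; NOT the library's ToolCall type.
    record Call(String name, String arguments, String result) {}

    // For each call, returns the texts a judge could treat as grounding:
    // the user input plus every non-null result of the calls before it.
    static List<List<String>> contextsFor(String userInput, List<Call> calls) {
        List<List<String>> contexts = new ArrayList<>();
        List<String> seen = new ArrayList<>();
        seen.add(userInput);
        for (Call call : calls) {
            // Snapshot what was known before this call ran.
            contexts.add(List.copyOf(seen));
            // A null result adds nothing for later calls (assumed reading).
            if (call.result() != null) {
                seen.add(call.result());
            }
        }
        return contexts;
    }

    public static void main(String[] args) {
        List<Call> calls = List.of(
                new Call("search", "{\"query\":\"release notes\"}", "https://example.com/notes"),
                new Call("fetch", "{\"url\":\"https://example.com/notes\"}", null),
                new Call("summarize", "{\"source\":\"https://example.com/notes\"}", null));

        List<List<String>> contexts = contextsFor("Summarize the release notes", calls);
        for (int i = 0; i < calls.size(); i++) {
            System.out.println(calls.get(i).name() + " grounded in " + contexts.get(i));
        }
    }
}
```

In this sketch the `fetch` call is judged against the user input and the search result, so a URL copied from the search output counts as grounded. Because `fetch` has a null result, `summarize` sees the same context as `fetch`.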
The score is the fraction of tool calls whose arguments were not judged to be hallucinated, ranging from 0.0 to 1.0.
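As a worked example of the scoring rule, the sketch below computes the fraction from a list of per-call verdicts. The `score` helper and the boolean verdict list are hypothetical, and the result for an execution with no tool calls is an assumption, since the description does not say how that case is handled.

```java
import java.util.List;

public class HallucinationScoreSketch {

    // Each element is true when the judge flagged that call's arguments
    // as hallucinated. Returns the fraction of calls that were not flagged.
    static double score(List<Boolean> hallucinated) {
        if (hallucinated.isEmpty()) {
            // Assumption: with no tool calls there is nothing to penalize.
            return 1.0;
        }
        long grounded = hallucinated.stream().filter(flag -> !flag).count();
        return (double) grounded / hallucinated.size();
    }

    public static void main(String[] args) {
        // Four tool calls, one flagged as hallucinated: 3 / 4.
        System.out.println(score(List.of(false, false, true, false))); // prints 0.75
    }
}
```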
Nested Class Summary
Nested Classes
Modifier and Type: static class
Description: Builder for constructing the evaluator.
Method Summary
Method: builder()
Description: Creates a new builder for constructing the evaluator.
Methods inherited from class dev.dokimos.core.BaseEvaluator:
evaluate, evaluateAsync, evaluateAsync, name, threshold
Method Details
builder
Creates a new builder for constructing the evaluator.
Returns:
a new builder