dev.dokimos.core.evaluators.agents.ToolCorrectnessEvaluator

All Implemented Interfaces: Evaluator

public class ToolCorrectnessEvaluator extends BaseEvaluator

Checks whether the agent used the expected set of tools.

This is a glass-box evaluator for tool proficiency that compares actual tool calls against expected tool calls. The score is the F1-score of tool name sets, balancing precision and recall. No LLM is required.
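The F1 computation over tool name sets can be sketched as follows. This is a minimal illustration and not the library's implementation: the class `ToolNameF1Sketch`, the helper `f1`, and the handling of empty inputs are assumptions made for the example.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ToolNameF1Sketch {

    // Hypothetical helper, not part of the dokimos API.
    // Computes F1 between the set of expected tool names and the set of actual tool names.
    static double f1(List<String> expected, List<String> actual) {
        Set<String> expectedSet = new HashSet<>(expected);
        Set<String> actualSet = new HashSet<>(actual);

        // Assumption: no tools expected and none called counts as a perfect score.
        if (expectedSet.isEmpty() && actualSet.isEmpty()) {
            return 1.0;
        }

        // Tools that appear in both sets.
        Set<String> common = new HashSet<>(expectedSet);
        common.retainAll(actualSet);
        if (common.isEmpty()) {
            return 0.0;
        }

        // Precision: share of called tools that were expected.
        double precision = (double) common.size() / actualSet.size();
        // Recall: share of expected tools that were called.
        double recall = (double) common.size() / expectedSet.size();
        return 2 * precision * recall / (precision + recall);
    }

    public static void main(String[] args) {
        // Two expected tools, both called, plus two unexpected calls:
        // precision = 0.5, recall = 1.0, so F1 is about 0.667.
        double score = f1(
                List.of("search", "calculator"),
                List.of("search", "calculator", "email", "weather"));
        System.out.println(score);
    }
}
```

Calling extra tools lowers precision and skipping expected tools lowers recall, so the F1 score penalizes both kinds of mistake.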

Supports multiple match modes:

NAMES_ONLY — compares tool name sets (default)
NAMES_AND_ORDER — also checks invocation order
NAMES_AND_ARGS — full structural comparison including arguments
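To show how the three modes differ in strictness, here is a small self-contained sketch. It uses hypothetical stand-ins (`Mode`, `ToolCall`, `exactMatch`) rather than the real dokimos types, and it reduces each mode to a yes/no match. How the library actually scores each mode is not stated here, and treating `NAMES_AND_ARGS` as order-sensitive is an assumption of this sketch.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class MatchModeSketch {

    // Hypothetical stand-ins; the real dokimos types may differ.
    enum Mode { NAMES_ONLY, NAMES_AND_ORDER, NAMES_AND_ARGS }

    record ToolCall(String name, Map<String, Object> args) {}

    // Extracts tool names in invocation order.
    static List<String> names(List<ToolCall> calls) {
        return calls.stream().map(ToolCall::name).collect(Collectors.toList());
    }

    // Returns true when actual matches expected under the given mode.
    static boolean exactMatch(Mode mode, List<ToolCall> expected, List<ToolCall> actual) {
        switch (mode) {
            case NAMES_ONLY:
                // Order and arguments are ignored; only the set of names matters.
                return new HashSet<>(names(expected)).equals(new HashSet<>(names(actual)));
            case NAMES_AND_ORDER:
                // Names must appear in the same sequence.
                return names(expected).equals(names(actual));
            case NAMES_AND_ARGS:
                // Assumption: names, order, and arguments must all be equal.
                return expected.equals(actual);
            default:
                throw new IllegalArgumentException("Unknown mode: " + mode);
        }
    }

    public static void main(String[] args) {
        List<ToolCall> expected = List.of(
                new ToolCall("search", Map.of("query", "weather in Paris")),
                new ToolCall("summarize", Map.of()));
        // Same calls, invoked in reverse order.
        List<ToolCall> swapped = List.of(expected.get(1), expected.get(0));

        System.out.println(exactMatch(Mode.NAMES_ONLY, expected, swapped));      // true
        System.out.println(exactMatch(Mode.NAMES_AND_ORDER, expected, swapped)); // false
    }
}
```

The same pair of call lists passes under `NAMES_ONLY` and fails under `NAMES_AND_ORDER`, which is the practical difference between the first two modes.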

Nested Class Summary

| Modifier and Type | Class | Description |
| --- | --- | --- |
| static class | ToolCorrectnessEvaluator.Builder | Builder for constructing the evaluator. |
| static enum | ToolCorrectnessEvaluator.MatchMode | The comparison mode for evaluating tool correctness. |

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| static ToolCorrectnessEvaluator.Builder | builder() | Creates a new builder for constructing the evaluator. |

Methods inherited from class dev.dokimos.core.BaseEvaluator
evaluate, evaluateAsync, evaluateAsync, name, threshold

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Method Details
- builder
  
  public static ToolCorrectnessEvaluator.Builder builder()
  
  Creates a new builder for constructing the evaluator.
  
  Returns:
  
  a new builder
