Class ToolDescriptionReliabilityEvaluator

java.lang.Object
dev.dokimos.core.BaseEvaluator
dev.dokimos.core.evaluators.agents.ToolDescriptionReliabilityEvaluator
All Implemented Interfaces:
Evaluator

public class ToolDescriptionReliabilityEvaluator extends BaseEvaluator
Evaluates tool description quality using a mix of rule-based checks and optional LLM checks.

Performs 13 checks across two categories:

Rule-based (always run):

  • input_arguments_clarity: Each parameter has a "description" key
  • input_arguments_types: Each parameter has a "type" key
  • max_num_input_arguments: Total params ≤ maxInputArgs (default 5)
  • max_optional_input_arguments: Optional params ≤ maxOptionalArgs (default 3)
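
The four rule-based checks above can be sketched as plain map lookups and size comparisons. This is an illustrative sketch only, not the evaluator's actual implementation: the class RuleChecksSketch, its method names, the representation of parameters as a map of parameter name to property map, and the use of a "required" name set to decide which parameters are optional are all assumptions made for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; names and data shapes are assumptions, not the library's API.
public class RuleChecksSketch {

    // True when every parameter's property map contains the given key
    // (for example "description" or "type").
    public static boolean everyParamHasKey(Map<String, Map<String, Object>> params, String key) {
        return params.values().stream().allMatch(p -> p.containsKey(key));
    }

    // Counts parameters whose names are not in the required set.
    // Assumption: a parameter is "optional" when it is not listed as required.
    public static long countOptional(Map<String, Map<String, Object>> params, Set<String> required) {
        return params.keySet().stream().filter(name -> !required.contains(name)).count();
    }

    // Runs the four rule-based checks and returns check name -> passed.
    public static Map<String, Boolean> runRuleChecks(Map<String, Map<String, Object>> params,
                                                     Set<String> required,
                                                     int maxInputArgs,
                                                     int maxOptionalArgs) {
        Map<String, Boolean> results = new LinkedHashMap<>();
        results.put("input_arguments_clarity", everyParamHasKey(params, "description"));
        results.put("input_arguments_types", everyParamHasKey(params, "type"));
        results.put("max_num_input_arguments", params.size() <= maxInputArgs);
        results.put("max_optional_input_arguments",
                countOptional(params, required) <= maxOptionalArgs);
        return results;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Object>> params = new LinkedHashMap<>();
        params.put("city", Map.of("type", "string", "description", "City name"));
        params.put("units", Map.of("type", "string")); // no "description" key
        // Limits mirror the documented defaults: maxInputArgs = 5, maxOptionalArgs = 3.
        System.out.println(runRuleChecks(params, Set.of("city"), 5, 3));
    }
}
```

In this sketch the "units" parameter lacks a "description" key, so input_arguments_clarity fails while the other three checks pass.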

LLM-based (require judge):

  • general_structure: Description includes purpose, inputs, and output
  • has_examples: Description includes usage examples
  • has_usage_notes: Description includes notes/limitations/caveats
  • intent_over_implementation: Description communicates what the tool does, not how it does it
  • clarity: Avoids obscure/ambiguous terms
  • redundancy: Avoids redundant information
  • input_arguments_enum: Applicable args include enumeration values
  • input_arguments_format: Applicable args include format specs
  • return_statement_quality: Output information is clearly described

Without a judge LLM, only the 4 rule-based checks run. The score is computed over only the checks that actually ran, so skipped LLM-based checks do not count against it.
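
One way to read "based on checks that actually ran" is as a pass ratio over the executed checks. The sketch below assumes exactly that: score = passed checks / executed checks, with skipped checks simply absent from the result map. The formula, the class ScoreSketch, and the score method are assumptions for illustration; the source does not state how the evaluator aggregates results.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; the aggregation formula is an assumption, not documented behavior.
public class ScoreSketch {

    // Fraction of executed checks that passed. Checks that did not run
    // (for example LLM-based checks when no judge is configured) are not in the map.
    public static double score(Map<String, Boolean> executed) {
        if (executed.isEmpty()) {
            return 0.0; // assumption: no executed checks yields a score of 0
        }
        long passed = executed.values().stream().filter(Boolean::booleanValue).count();
        return (double) passed / executed.size();
    }

    public static void main(String[] args) {
        // No judge configured: only the 4 rule-based checks are present.
        Map<String, Boolean> executed = new LinkedHashMap<>();
        executed.put("input_arguments_clarity", false);
        executed.put("input_arguments_types", true);
        executed.put("max_num_input_arguments", true);
        executed.put("max_optional_input_arguments", true);
        System.out.println(score(executed));
    }
}
```

Under this assumption the denominator is 4 when no judge is configured and 13 when all checks run, so the nine LLM-based checks never lower the score merely by being skipped.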