Class TrajectoryEvaluationCriteria
This class provides ready-to-use criteria for common conversation evaluation dimensions. Each criterion includes a detailed description that guides the LLM judge in scoring the conversation.
Example usage:
TrajectoryEvaluator evaluator = TrajectoryEvaluator.builder()
    .judge(judgeLM)
    .criteria(List.of(
        TrajectoryEvaluationCriteria.userSatisfaction(),
        TrajectoryEvaluationCriteria.problemResolution(),
        TrajectoryEvaluationCriteria.professionalTone()))
    .build();
-
Method Summary
Modifier and Type | Method | Description
static EvaluationCriterion | clarity() | Creates a criterion for evaluating clarity of communication.
static EvaluationCriterion | consistency() | Creates a criterion for evaluating consistency.
static EvaluationCriterion | conversationQuality() | Creates a criterion for evaluating conversation quality.
static EvaluationCriterion | custom(name, description) | Creates a custom evaluation criterion.
static EvaluationCriterion | custom(name, description, weight) | Creates a custom evaluation criterion with specified weight.
static EvaluationCriterion | goalCompletion() | Creates a criterion for evaluating goal completion.
static EvaluationCriterion | helpfulness() | Creates a criterion for evaluating helpfulness.
static EvaluationCriterion | informationAccuracy() | Creates a criterion for evaluating information accuracy.
static EvaluationCriterion | problemResolution() | Creates a criterion for evaluating problem resolution.
static EvaluationCriterion | professionalTone() | Creates a criterion for evaluating professional tone.
static EvaluationCriterion | responseRelevance() | Creates a criterion for evaluating response relevance.
static EvaluationCriterion | safety() | Creates a criterion for evaluating safety and appropriateness.
static EvaluationCriterion | userSatisfaction() | Creates a criterion for evaluating user satisfaction.
-
Method Details
-
userSatisfaction
Creates a criterion for evaluating user satisfaction. Assesses whether the user's concerns were adequately addressed and whether they seem satisfied by the end of the conversation.
- Returns:
- a user satisfaction criterion
-
goalCompletion
Creates a criterion for evaluating goal completion. Assesses whether the assistant successfully achieved the stated or implied goal of the conversation.
- Returns:
- a goal completion criterion
-
conversationQuality
Creates a criterion for evaluating conversation quality. Assesses the overall flow, coherence, and naturalness of the dialogue.
- Returns:
- a conversation quality criterion
-
responseRelevance
Creates a criterion for evaluating response relevance. Assesses whether the assistant's responses were on-topic and directly addressed the user's questions.
- Returns:
- a response relevance criterion
-
professionalTone
Creates a criterion for evaluating professional tone. Assesses whether the assistant maintained appropriate language, demeanor, and professionalism throughout.
- Returns:
- a professional tone criterion
-
problemResolution
Creates a criterion for evaluating problem resolution. Assesses whether issues or problems raised were effectively resolved.
- Returns:
- a problem resolution criterion
-
informationAccuracy
Creates a criterion for evaluating information accuracy. Assesses whether the information provided was correct and reliable.
- Returns:
- an information accuracy criterion
-
clarity
Creates a criterion for evaluating clarity of communication. Assesses how clearly and understandably the assistant communicated.
- Returns:
- a clarity criterion
-
helpfulness
Creates a criterion for evaluating helpfulness. Assesses how genuinely helpful and service-oriented the assistant was.
- Returns:
- a helpfulness criterion
-
consistency
Creates a criterion for evaluating consistency. Assesses whether the assistant maintained consistent information and behavior throughout the conversation.
- Returns:
- a consistency criterion
-
safety
Creates a criterion for evaluating safety and appropriateness. Assesses whether the assistant avoided harmful, inappropriate, or unsafe responses.
- Returns:
- a safety criterion
-
custom
Creates a custom evaluation criterion. Use this method when the pre-built criteria don't fit your evaluation needs.
- Parameters:
- name - the criterion name
- description - detailed instructions for evaluation
- Returns:
- a custom criterion with weight 1.0
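As a rough illustration of what a custom criterion might look like, the sketch below uses a stand-in EvaluationCriterion record and a local custom helper instead of the library's own types, because this page does not show their parameter types or accessors. The String parameters, the record fields, and the "escalation_handling" criterion are all assumptions made for the example; only the default weight of 1.0 comes from the description above.

```java
import java.util.Objects;

public class CustomCriterionSketch {

    // Hypothetical stand-in for the library's EvaluationCriterion type;
    // the real type's fields and accessors are not shown on this page.
    record EvaluationCriterion(String name, String description, double weight) {}

    // Hypothetical mirror of custom(name, description). The documented
    // behaviour it imitates is that the returned criterion has weight 1.0.
    static EvaluationCriterion custom(String name, String description) {
        Objects.requireNonNull(name, "name");
        Objects.requireNonNull(description, "description");
        return new EvaluationCriterion(name, description, 1.0);
    }

    public static void main(String[] args) {
        // The description acts as the instructions for the LLM judge,
        // so it states concretely what to look for.
        EvaluationCriterion escalation = custom(
                "escalation_handling",
                "Assess whether the assistant recognised when a request exceeded its "
                        + "abilities and handed the user over to a human agent clearly.");
        System.out.println(escalation.name() + " weight=" + escalation.weight());
    }
}
```

With the real library, the equivalent call would presumably be TrajectoryEvaluationCriteria.custom(name, description), passed to the evaluator's criteria list alongside the pre-built criteria.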
-
custom
Creates a custom evaluation criterion with a specified weight.
- Parameters:
- name - the criterion name
- description - detailed instructions for evaluation
- weight - the weight for score aggregation
- Returns:
- a custom criterion with the specified weight
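This page says the weight is used for score aggregation but does not give the formula. One common choice is a weighted mean, and the sketch below assumes exactly that. The WeightedScore record, the weightedMean helper, and the sample scores are hypothetical and are not part of the documented API.

```java
import java.util.List;

public class WeightedAggregationSketch {

    // Hypothetical pairing of a criterion's weight with the judge's score for it.
    record WeightedScore(String criterion, double weight, double score) {}

    // Weighted mean: sum(weight * score) / sum(weight).
    // Assumed aggregation rule; the documentation does not confirm it.
    static double weightedMean(List<WeightedScore> scores) {
        double weightedSum = 0.0;
        double totalWeight = 0.0;
        for (WeightedScore s : scores) {
            if (s.weight() < 0) {
                throw new IllegalArgumentException("weight must be non-negative: " + s.criterion());
            }
            weightedSum += s.weight() * s.score();
            totalWeight += s.weight();
        }
        if (totalWeight == 0.0) {
            throw new IllegalArgumentException("total weight must be positive");
        }
        return weightedSum / totalWeight;
    }

    public static void main(String[] args) {
        // Made-up scores: safety counts twice as much as the other two criteria.
        List<WeightedScore> scores = List.of(
                new WeightedScore("safety", 2.0, 1.0),
                new WeightedScore("clarity", 1.0, 0.5),
                new WeightedScore("helpfulness", 1.0, 0.5));
        System.out.println(weightedMean(scores));
    }
}
```

Under this assumed rule, raising a criterion's weight above 1.0 makes its score count proportionally more in the overall result, while the two-argument custom overload leaves it on equal footing with other weight-1.0 criteria.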
-