Dokimos

An evaluation framework for LLM applications in Java

Dataset-Driven Evaluation

Load test cases from JSON or CSV files, or create them programmatically. Run the same dataset across experiments or JUnit tests.
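As a rough illustration of the dataset-driven idea, the sketch below parses test cases from CSV text, builds another dataset in code, and runs the same task over both. It uses only the Java standard library. Every name in it (`DatasetSketch`, `Example`, `fromCsv`, `passRate`) is hypothetical and chosen for this example; it is not the Dokimos API, and the CSV parsing is deliberately naive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical names throughout; this is an illustration, not the Dokimos API.
final class DatasetSketch {

    /** One test case: an input prompt and the output we expect. */
    record Example(String input, String expected) {}

    /** Parses "input,expected" rows, skipping the header row and blank lines. */
    static List<Example> fromCsv(String csv) {
        List<Example> examples = new ArrayList<>();
        String[] lines = csv.strip().split("\\R");
        for (int i = 1; i < lines.length; i++) {           // i = 1 skips the header
            if (lines[i].isBlank()) continue;
            String[] cols = lines[i].split(",", 2);         // naive split: no quoted fields
            if (cols.length < 2) {
                throw new IllegalArgumentException("Bad row: " + lines[i]);
            }
            examples.add(new Example(cols[0].strip(), cols[1].strip()));
        }
        return examples;
    }

    /** Runs the task on every example and returns the fraction of exact matches. */
    static double passRate(List<Example> dataset, Function<String, String> task) {
        if (dataset.isEmpty()) return 0.0;
        long passed = dataset.stream()
                .filter(e -> e.expected().equals(task.apply(e.input())))
                .count();
        return (double) passed / dataset.size();
    }

    public static void main(String[] args) {
        // Same row format, two sources: CSV text and a dataset built in code.
        List<Example> fromFile = fromCsv("input,expected\n2+2,4\ncapital of France,Paris\n");
        List<Example> inCode = List.of(new Example("2+2", "4"));

        // Stand-in for the application under test (a real one would call an LLM).
        Function<String, String> task = q -> q.equals("2+2") ? "4" : "unknown";

        System.out.println(passRate(fromFile, task));
        System.out.println(passRate(inCode, task));
    }
}
```

Keeping the dataset separate from the task is what lets the same cases be reused across different runs: only the `task` function changes.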

Built-in Evaluators

Score outputs out of the box with the built-in evaluators, including LLM-based ones.
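To make the distinction between a deterministic evaluator and an LLM-based one concrete, here is a minimal sketch in plain Java. The `Evaluator` interface, `EvalResult` record, `exactMatch`, and `llmJudge` are all assumptions made up for this example and do not reflect the Dokimos API. The judge is modeled as a plain function from prompt to reply so that a stub can stand in for a real model call.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical names throughout; this is an illustration, not the Dokimos API.
final class EvaluatorSketch {

    /** Result of scoring one output: a score in [0, 1] and a pass/fail verdict. */
    record EvalResult(String evaluator, double score, boolean passed) {}

    /** Scores an actual output against an expected one. */
    interface Evaluator {
        EvalResult evaluate(String input, String expected, String actual);
    }

    /** Deterministic evaluator: passes only on a case-insensitive exact match. */
    static Evaluator exactMatch() {
        return (input, expected, actual) -> {
            boolean ok = expected.strip().equalsIgnoreCase(actual.strip());
            return new EvalResult("exact-match", ok ? 1.0 : 0.0, ok);
        };
    }

    /**
     * LLM-based evaluator: asks a judge for a 0-10 rating and passes when the
     * normalized score reaches the threshold. The judge is any function from
     * prompt to reply, so a real model client or a stub can be plugged in.
     */
    static Evaluator llmJudge(Function<String, String> judge, double threshold) {
        return (input, expected, actual) -> {
            String prompt = "Rate from 0 to 10 how well the answer matches the reference.\n"
                    + "Question: " + input + "\nReference: " + expected
                    + "\nAnswer: " + actual + "\nReply with the number only.";
            double score;
            try {
                score = Double.parseDouble(judge.apply(prompt).strip()) / 10.0;
            } catch (NumberFormatException e) {
                score = 0.0;                                  // unparseable reply counts as a failure
            }
            score = Math.max(0.0, Math.min(1.0, score));      // clamp to [0, 1]
            return new EvalResult("llm-judge", score, score >= threshold);
        };
    }

    public static void main(String[] args) {
        Function<String, String> stubJudge = prompt -> "8";   // stand-in for a model call
        List<Evaluator> evaluators = List.of(exactMatch(), llmJudge(stubJudge, 0.7));
        for (Evaluator e : evaluators) {
            System.out.println(e.evaluate("Capital of France?", "Paris", "It is Paris."));
        }
    }
}
```

In this sketch the exact-match evaluator rejects "It is Paris." while the stubbed judge accepts it, which is the usual reason to combine strict checks with a model-based one.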

Framework Integration

Works with JUnit 5 for parameterized testing and LangChain4j for evaluating AI Services. Integrate into existing CI/CD pipelines.