Record Class ExperimentResult

java.lang.Object
java.lang.Record
dev.dokimos.core.ExperimentResult
Record Components:
name - the experiment name
description - the experiment description
metadata - experiment metadata
runResults - results for each run of the experiment

public record ExperimentResult(String name, String description, Map<String,Object> metadata, List<RunResult> runResults) extends Record
Aggregated results from an experiment.

When an experiment is run multiple times, this record aggregates the results across all runs. Methods such as averageScore(String) and passRate() return values averaged across all runs.

  • Constructor Details

    • ExperimentResult

      public ExperimentResult(String name, String description, Map<String,Object> metadata, List<RunResult> runResults)
      Creates an instance of an ExperimentResult record class.
      Parameters:
      name - the value for the name record component
      description - the value for the description record component
      metadata - the value for the metadata record component
      runResults - the value for the runResults record component
  • Method Details

    • itemResults

      public List<ItemResult> itemResults()
      Returns all item results across all runs.

      For single-run experiments, this returns the same results as accessing the first run's item results directly.

      Returns:
      all item results flattened across runs
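As a rough illustration of what "flattened across runs" means, the sketch below concatenates per-run lists into one list. The flatten helper and the use of plain strings in place of ItemResult are hypothetical stand-ins, not part of the dokimos API, and the run-by-run ordering shown here is an assumption.

```java
import java.util.List;
import java.util.stream.Collectors;

class FlattenSketch {
    // Hypothetical helper: concatenates the item lists of each run, in run order.
    static <T> List<T> flatten(List<List<T>> itemsPerRun) {
        return itemsPerRun.stream()
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Strings stand in for ItemResult values (assumption for illustration).
        List<List<String>> runs = List.of(
                List.of("run1-item1", "run1-item2"),
                List.of("run2-item1", "run2-item2"));
        // prints [run1-item1, run1-item2, run2-item1, run2-item2]
        System.out.println(flatten(runs));
    }
}
```

With a single run, the flattened list has the same elements as that run's own list, which matches the single-run note above.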
    • runs

      public List<RunResult> runs()
      Returns the individual run results for detailed analysis.
      Returns:
      list of run results
    • runCount

      public int runCount()
      Returns the number of runs performed.
      Returns:
      the run count
    • totalCount

      public int totalCount()
      Returns the total number of items evaluated per run.

      This returns the count from the first run. All runs evaluate the same dataset, so this value is consistent across runs.

      Returns:
      the total count per run
    • passCount

      public double passCount()
      Returns the number of items that passed, averaged across all runs. The value is a double because this average need not be a whole number.
      Returns:
      the average pass count
    • failCount

      public double failCount()
      Returns the number of items that failed, averaged across all runs. The value is a double because this average need not be a whole number.
      Returns:
      the average fail count
    • passRate

      public double passRate()
      Returns the average pass rate across all runs.
      Returns:
      the average pass rate, between 0.0 and 1.0
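The averaging described for passCount(), failCount() and passRate() can be sketched with plain arrays. This is a minimal sketch under stated assumptions: the helper names are hypothetical, every run is assumed to evaluate the same number of items, and at least one run is assumed to exist. It is not the library's implementation.

```java
import java.util.Locale;

class PassRateSketch {
    // Hypothetical helper: mean number of passing items per run.
    // Assumes passedPerRun contains at least one run.
    static double averagePassCount(int[] passedPerRun) {
        double sum = 0.0;
        for (int passed : passedPerRun) {
            sum += passed;
        }
        return sum / passedPerRun.length;
    }

    // Hypothetical helper: mean of the per-run pass rates (passed / total).
    static double averagePassRate(int[] passedPerRun, int totalPerRun) {
        double sum = 0.0;
        for (int passed : passedPerRun) {
            sum += (double) passed / totalPerRun;
        }
        return sum / passedPerRun.length;
    }

    public static void main(String[] args) {
        int[] passedPerRun = {7, 8}; // two runs over the same 10 items
        // prints 7.50
        System.out.println(String.format(Locale.ROOT, "%.2f", averagePassCount(passedPerRun)));
        // prints 0.75
        System.out.println(String.format(Locale.ROOT, "%.2f", averagePassRate(passedPerRun, 10)));
    }
}
```

The fractional pass count of 7.50 in this example shows why an averaged count is reported as a double rather than an int.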
    • averageScore

      public double averageScore(String evaluatorName)
      Returns the average score for the specified evaluator across all runs.

      This first computes the average score within each run, then averages those values across all runs.

      Parameters:
      evaluatorName - the evaluator's name
      Returns:
      the computed average score
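The two-step averaging described for averageScore(String) (average within each run, then average those values across runs) might look like the following. The class and method names are hypothetical, and raw double arrays stand in for one evaluator's scores in each run; treat it as a sketch of the arithmetic, not the library's code.

```java
class AverageScoreSketch {
    // Arithmetic mean of one run's scores. Assumes a non-empty array.
    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    // Hypothetical helper: mean of the per-run means.
    static double averageAcrossRuns(double[][] scoresPerRun) {
        double sumOfRunMeans = 0.0;
        for (double[] run : scoresPerRun) {
            sumOfRunMeans += mean(run);
        }
        return sumOfRunMeans / scoresPerRun.length;
    }

    public static void main(String[] args) {
        double[][] scoresPerRun = {
                {1.0, 0.5},  // run 1, mean 0.75
                {0.5, 0.0}   // run 2, mean 0.25
        };
        // prints 0.5
        System.out.println(averageAcrossRuns(scoresPerRun));
    }
}
```

When every run scores the same number of items, this mean of means equals the mean over all scores pooled together. The two would differ only if runs had different item counts.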
    • scoreStdDev

      public double scoreStdDev(String evaluatorName)
      Returns the sample standard deviation of scores for the specified evaluator across runs.

      This measures how much the average score varies between runs. A high standard deviation suggests instability in your task or evaluator outputs.

      Uses sample standard deviation (N-1 denominator) since runs represent a sample of potential outcomes, not the complete population.

      For single-run experiments, this returns 0.0.

      Parameters:
      evaluatorName - the evaluator's name
      Returns:
      the standard deviation, or 0.0 for single-run experiments
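For reference, a sample standard deviation with the N-1 denominator over per-run average scores can be sketched as below. The helper name and the input array of run averages are assumptions made for illustration. The early return for fewer than two values mirrors the documented single-run behaviour.

```java
import java.util.Locale;

class ScoreStdDevSketch {
    // Hypothetical helper: sample standard deviation (N-1 denominator)
    // of the per-run average scores.
    static double sampleStdDev(double[] runAverages) {
        int n = runAverages.length;
        if (n < 2) {
            return 0.0; // a single run has no between-run variation
        }
        double mean = 0.0;
        for (double v : runAverages) {
            mean += v;
        }
        mean /= n;

        double sumOfSquares = 0.0;
        for (double v : runAverages) {
            sumOfSquares += (v - mean) * (v - mean);
        }
        return Math.sqrt(sumOfSquares / (n - 1));
    }

    public static void main(String[] args) {
        double[] runAverages = {0.7, 0.8, 0.9};
        // prints 0.100
        System.out.println(String.format(Locale.ROOT, "%.3f", sampleStdDev(runAverages)));
        // prints 0.0
        System.out.println(sampleStdDev(new double[] {0.8}));
    }
}
```

Dividing by N-1 rather than N gives a slightly larger value, and the gap is widest when there are only a few runs.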
    • evaluatorNames

      public Set<String> evaluatorNames()
      Returns the names of all evaluators used in this experiment.
      Returns:
      set of evaluator names
    • exportJson

      public void exportJson(Path path)
      Exports the experiment result to a JSON file.
      Parameters:
      path - the file path to write to
    • exportHtml

      public void exportHtml(Path path)
      Exports the experiment result to an HTML file.
      Parameters:
      path - the file path to write to
    • exportMarkdown

      public void exportMarkdown(Path path)
      Exports the experiment result to a Markdown file.
      Parameters:
      path - the file path to write to
    • exportCsv

      public void exportCsv(Path path)
      Exports the experiment result to a CSV file.
      Parameters:
      path - the file path to write to
    • toJson

      public String toJson()
      Returns the experiment result as a JSON string.
      Returns:
      JSON representation
    • toHtml

      public String toHtml()
      Returns the experiment result as an HTML string.
      Returns:
      HTML representation
    • toMarkdown

      public String toMarkdown()
      Returns the experiment result as a Markdown string.
      Returns:
      Markdown representation
    • toCsv

      public String toCsv()
      Returns the experiment result as a CSV string.
      Returns:
      CSV representation
    • toString

      public final String toString()
      Returns a string representation of this record class. The representation contains the name of the class, followed by the name and value of each of the record components.
      Specified by:
      toString in class Record
      Returns:
      a string representation of this object
    • hashCode

      public final int hashCode()
      Returns a hash code value for this object. The value is derived from the hash code of each of the record components.
      Specified by:
      hashCode in class Record
      Returns:
      a hash code value for this object
    • equals

      public final boolean equals(Object o)
      Indicates whether some other object is "equal to" this one. The objects are equal if the other object is of the same class and if all the record components are equal. All components in this record class are compared with Objects::equals(Object,Object).
      Specified by:
      equals in class Record
      Parameters:
      o - the object with which to compare
      Returns:
      true if this object is the same as the o argument; false otherwise.
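The component-wise equality described for equals and hashCode is standard Java record behaviour, illustrated below with a stand-in record. Summary is hypothetical and only mirrors the shape of some ExperimentResult components; it does not use RunResult or any dokimos type. The sketch requires Java 16 or later.

```java
import java.util.List;
import java.util.Map;

class RecordEqualitySketch {
    // Hypothetical stand-in record with String, Map and List components.
    record Summary(String name, Map<String, Object> metadata, List<String> runIds) {}

    // Two records built from equal components are equal and share a hash code.
    static boolean sameComponentsAreEqual() {
        Summary a = new Summary("exp", Map.of("model", "m1"), List.of("r1"));
        Summary b = new Summary("exp", Map.of("model", "m1"), List.of("r1"));
        return a.equals(b) && a.hashCode() == b.hashCode();
    }

    // Changing any one component breaks equality.
    static boolean differentNameIsEqual() {
        Summary a = new Summary("exp", Map.of(), List.of());
        Summary b = new Summary("other", Map.of(), List.of());
        return a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println(sameComponentsAreEqual()); // prints true
        System.out.println(differentNameIsEqual());   // prints false
    }
}
```

Because the Map and List components are compared with their own equals methods, two results holding distinct but equal collections still compare as equal.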
    • name

      public String name()
      Returns the value of the name record component.
      Returns:
      the value of the name record component
    • description

      public String description()
      Returns the value of the description record component.
      Returns:
      the value of the description record component
    • metadata

      public Map<String,Object> metadata()
      Returns the value of the metadata record component.
      Returns:
      the value of the metadata record component
    • runResults

      public List<RunResult> runResults()
      Returns the value of the runResults record component.
      Returns:
      the value of the runResults record component