
Spring AI Alibaba Integration

This page shows you how to evaluate a Spring AI Alibaba graph or agent run with Dokimos. Spring AI Alibaba's graph runtime carries its whole conversation as standard Spring AI message types, so Dokimos folds a run's OverAllState straight into an AgentTrace and reuses the same message extraction as the Spring AI integration.

What you get

  • Graph state to trace: fold a graph run's OverAllState "messages" list into a single AgentTrace with SpringAiAlibabaSupport.toAgentTrace(...).
  • Reuses Spring AI: tool-call and tool-definition conversion delegate to SpringAiSupport, so the same AssistantMessage/ToolResponseMessage handling applies.
  • Per-turn correlation: tool calls are matched to their results turn by turn, so a tool-call id reused across turns never binds to the wrong result.

Step 1: Add the dependency

This module pulls in dokimos-core and dokimos-spring-ai as transitive dependencies. You supply the Spring AI Alibaba SDK (spring-ai-alibaba-agent-framework, the 1.1.x line) yourself.

Version compatibility

This adapter targets the current Spring AI Alibaba 1.1.x line (spring-ai-alibaba-agent-framework, which carries ReactAgent and the graph runtime). Spring AI Alibaba is not source-compatible across releases: the 1.0.x line kept the agent types in spring-ai-alibaba-graph-core, 1.0.0.4 added a checked exception to CompiledGraph.invoke, and 1.1.x relocated the agent types and changed the ReactAgent builder. Use a 1.1.x version.

Maven

<dependency>
    <groupId>dev.dokimos</groupId>
    <artifactId>dokimos-spring-ai-alibaba</artifactId>
    <version>${dokimos.version}</version>
</dependency>

Gradle (Groovy DSL)

// Double quotes are required: Groovy only interpolates ${...} in double-quoted strings
implementation "dev.dokimos:dokimos-spring-ai-alibaba:${dokimosVersion}"

Step 2: Fold a graph run into a trace

A Spring AI Alibaba ReactAgent runs on a compiled graph. The graph keeps every intermediate tool call in its OverAllState, under the "messages" key. SpringAiAlibabaSupport.toAgentTrace(state) reads that list and builds one AgentTrace: the tool calls come from the assistant messages, and the final response is the text of the last assistant message.

If you already have the state, pass it directly:

import com.alibaba.cloud.ai.graph.OverAllState;
import dev.dokimos.core.agents.AgentTrace;
import dev.dokimos.springai.alibaba.SpringAiAlibabaSupport;

// The OverAllState from a graph run
OverAllState state = /* ... */;

AgentTrace trace = SpringAiAlibabaSupport.toAgentTrace(state);
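
To make the fold rule concrete, here is a minimal, self-contained sketch of it: tool calls are collected from assistant messages, and the final response is the text of the last assistant message. Msg, Trace, and fold are hypothetical stand-ins invented for this illustration; they are not the Spring AI message types or the Dokimos AgentTrace, and the real implementation may differ in detail.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: Msg and Trace are stand-ins, not Spring AI or Dokimos types.
final class FoldSketch {

    // role is "user", "assistant", or "tool"; toolCalls names the tools an
    // assistant message requested (empty for every other message).
    record Msg(String role, String text, List<String> toolCalls) {}

    record Trace(List<String> toolCalls, String finalResponse) {}

    static Trace fold(List<Msg> messages) {
        List<String> calls = new ArrayList<>();
        String finalResponse = null;
        for (Msg m : messages) {
            if (!"assistant".equals(m.role())) {
                continue; // only assistant messages contribute to the trace
            }
            calls.addAll(m.toolCalls()); // tool calls, in the order they were issued
            finalResponse = m.text();    // the last assistant message wins
        }
        return new Trace(List.copyOf(calls), finalResponse);
    }

    public static void main(String[] args) {
        Trace trace = fold(List.of(
                new Msg("user", "What's the weather in Paris?", List.of()),
                new Msg("assistant", "", List.of("getWeather")),
                new Msg("tool", "18C, sunny", List.of()),
                new Msg("assistant", "It is 18C and sunny in Paris.", List.of())));
        System.out.println(trace.toolCalls() + " -> " + trace.finalResponse());
    }
}
```

In this sketch the intermediate assistant message contributes its tool call but not the final response, because a later assistant message overwrites it.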

Step 3: Run the agent and read the state

The compiled graph is the full-fidelity entry point. Call getAndCompileGraph().invoke(...), which returns an Optional<OverAllState> carrying the whole run. The one-liner toAgentTrace(agent, inputs, config) does this for you: it invokes the agent's compiled graph and folds the returned state.

import com.alibaba.cloud.ai.graph.OverAllState;
import com.alibaba.cloud.ai.graph.RunnableConfig;
import com.alibaba.cloud.ai.graph.agent.ReactAgent;
import dev.dokimos.core.agents.AgentTrace;
import dev.dokimos.springai.alibaba.SpringAiAlibabaSupport;
import java.util.List;
import java.util.Map;
import org.springframework.ai.chat.messages.UserMessage;

// Build a ReactAgent on your Spring AI ChatClient
ReactAgent agent = ReactAgent.builder()
    .name("assistant")
    .chatClient(chatClient)
    .tools(toolCallbacks)
    .build();

// Inputs go in under the "messages" key
Map<String, Object> inputs = Map.of(
    "messages", List.of(new UserMessage("What's the weather in Paris?"))
);

// One-liner: invoke the compiled graph and fold the state
AgentTrace trace = SpringAiAlibabaSupport.toAgentTrace(agent, inputs, RunnableConfig.builder().build());

If you manage the graph yourself, invoke it and fold the Optional it returns:

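import com.alibaba.cloud.ai.graph.OverAllState;
import dev.dokimos.core.agents.AgentTrace;
import dev.dokimos.springai.alibaba.SpringAiAlibabaSupport;
import java.util.Optional;

// On versions where CompiledGraph.invoke declares a checked exception, catch or propagate it
Optional<OverAllState> state = agent.getAndCompileGraph().invoke(inputs);

AgentTrace trace = SpringAiAlibabaSupport.toAgentTrace(state);

Note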

Use getAndCompileGraph().invoke(...) rather than a single-shot call. The compiled graph preserves every intermediate tool call across turns; a single-shot call would lose them.

Per-turn windowing

A graph run can span several turns, and a sub-agent or loop may reuse a tool-call id across them. To keep results correlated, toToolCalls(state) windows the messages: each AssistantMessage that issues tool calls is matched only against the ToolResponseMessages that follow it, up to the next AssistantMessage. A call with no matching response in its window has a null result. This is what toAgentTrace uses, so multi-turn runs score correctly without any extra wiring.

If you want the raw calls without building a trace, read them directly:

import dev.dokimos.core.agents.ToolCall;
import dev.dokimos.springai.alibaba.SpringAiAlibabaSupport;

List<ToolCall> toolCalls = SpringAiAlibabaSupport.toToolCalls(state);
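
The windowing rule described above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the Dokimos implementation: Kind, Msg, Call, and correlate are hypothetical names, and the records stand in for the Spring AI AssistantMessage and ToolResponseMessage types. The example reuses the id call_1 in two turns, where the first turn never receives a response.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: stand-in types, not the Spring AI or Dokimos classes.
final class WindowingSketch {

    enum Kind { USER, ASSISTANT, TOOL_RESPONSE }

    // ASSISTANT: callIds lists the tool-call ids it issued.
    // TOOL_RESPONSE: callIds holds the id it answers, and text is the result.
    record Msg(Kind kind, List<String> callIds, String text) {}

    record Call(String id, String result) {}

    static List<Call> correlate(List<Msg> messages) {
        List<Call> out = new ArrayList<>();
        for (int i = 0; i < messages.size(); i++) {
            Msg m = messages.get(i);
            if (m.kind() != Kind.ASSISTANT) {
                continue;
            }
            for (String id : m.callIds()) {
                String result = null; // stays null if the window has no match
                // The window: everything after this message, up to the next assistant message.
                for (int j = i + 1; j < messages.size(); j++) {
                    Msg next = messages.get(j);
                    if (next.kind() == Kind.ASSISTANT) {
                        break; // window closed
                    }
                    if (next.kind() == Kind.TOOL_RESPONSE && next.callIds().contains(id)) {
                        result = next.text();
                        break;
                    }
                }
                out.add(new Call(id, result));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Msg> run = List.of(
                new Msg(Kind.USER, List.of(), "Weather in Paris?"),
                new Msg(Kind.ASSISTANT, List.of("call_1"), ""),   // turn 1: no response arrives
                new Msg(Kind.ASSISTANT, List.of("call_1"), ""),   // turn 2: same id reused
                new Msg(Kind.TOOL_RESPONSE, List.of("call_1"), "rainy"),
                new Msg(Kind.ASSISTANT, List.of(), "It is rainy in Paris."));
        correlate(run).forEach(System.out::println);
    }
}
```

A lookup keyed only on the id would bind the "rainy" result to both calls. With the window, the first call keeps a null result and only the second call receives "rainy".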

Step 4: Score with the agent evaluators

Convert the tool callbacks the agent was built with into ToolDefinitions, build an EvalTestCase with trace.toTestCase(input, tools), and run any of the agent evaluators. Use the builder() form for every agent evaluator.

import com.alibaba.cloud.ai.graph.RunnableConfig;
import dev.dokimos.core.EvalResult;
import dev.dokimos.core.EvalTestCase;
import dev.dokimos.core.agents.AgentTrace;
import dev.dokimos.core.agents.ToolDefinition;
import dev.dokimos.core.evaluators.agents.ToolCorrectnessEvaluator;
import dev.dokimos.core.evaluators.agents.ToolCallValidityEvaluator;
import dev.dokimos.springai.alibaba.SpringAiAlibabaSupport;
import java.util.List;

// Run the agent and fold its state
AgentTrace trace = SpringAiAlibabaSupport.toAgentTrace(agent, inputs, RunnableConfig.builder().build());

// Convert the tools the agent was given
List<ToolDefinition> tools = SpringAiAlibabaSupport.toToolDefinitions(toolCallbacks);

// Build the test case the agent evaluators expect
EvalTestCase testCase = trace.toTestCase("What's the weather in Paris?", tools);

// Evaluate
EvalResult validity = ToolCallValidityEvaluator.builder().build().evaluate(testCase);
EvalResult correctness = ToolCorrectnessEvaluator.builder().build().evaluate(testCase);

Tip

See Agent Evaluation for the full set of agent evaluators and the EvalTestCase keys they read.

Judges and async tasks

For judging and plain async execution, this module does not add its own asJudge or asyncTask. Spring AI Alibaba agents run on a standard Spring AI ChatModel or ChatClient, so use SpringAiSupport.asJudge(...) and SpringAiSupport.asyncTask(...) from the Spring AI integration directly.

Cost, tokens, and latency

For metrics capture, the module does add SpringAiAlibabaSupport.measuredAsyncTask(...). The ReactAgent graph path returns a bare AssistantMessage with no typed Usage, so you supply the token counts via an AlibabaAgentResponse carrier (AlibabaAgentResponse.of(text) for text-only, or with tokensIn/tokensOut when you have them). Latency is timed automatically and cost is composed from an optional PriceTable:

PriceTable prices = (model, in, out) -> /* your price map */ null;

AsyncTask task = SpringAiAlibabaSupport.measuredAsyncTask(
    example -> {
        String answer = runYourAgent(example.input()); // your ReactAgent call -> text
        // supply token counts from your usage source, or AlibabaAgentResponse.of(answer) for latency-only
        return new AlibabaAgentResponse(answer, promptTokens, completionTokens);
    },
    "your-model",
    prices);

See Cost and Pricing for the PriceTable seam and the run-detail metric cards.
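
One way to fill in the price-map placeholder is a per-model rate lookup. The sketch below is self-contained and makes several assumptions: Prices is a hypothetical stand-in that copies only the three-argument lambda shape of PriceTable, the BigDecimal return type is a guess, and the rates are placeholder numbers rather than real prices. Check the actual PriceTable signature before adapting it.

```java
import java.math.BigDecimal;
import java.util.Map;

// Illustrative only: Prices mimics the (model, in, out) lambda shape; it is not the Dokimos PriceTable.
final class PricingSketch {

    @FunctionalInterface
    interface Prices {
        // Assumed return type; null means "no price known for this model".
        BigDecimal cost(String model, long tokensIn, long tokensOut);
    }

    // Rates per one million tokens, split by direction.
    record Rate(BigDecimal inPerMillion, BigDecimal outPerMillion) {}

    static Prices fromRates(Map<String, Rate> rates) {
        BigDecimal million = BigDecimal.valueOf(1_000_000);
        return (model, in, out) -> {
            Rate rate = rates.get(model);
            if (rate == null) {
                return null; // unknown model: report no cost rather than a wrong one
            }
            // cost = (in * inRate + out * outRate) / 1,000,000
            return rate.inPerMillion().multiply(BigDecimal.valueOf(in))
                    .add(rate.outPerMillion().multiply(BigDecimal.valueOf(out)))
                    .divide(million);
        };
    }

    public static void main(String[] args) {
        // Placeholder rates, not real pricing.
        Prices prices = fromRates(Map.of(
                "your-model", new Rate(new BigDecimal("2.00"), new BigDecimal("6.00"))));
        System.out.println(prices.cost("your-model", 1_000, 500));
        System.out.println(prices.cost("other-model", 1_000, 500));
    }
}
```

With the placeholder rates, 1,000 input tokens and 500 output tokens work out to 0.005 in whatever currency the rates use. BigDecimal keeps the arithmetic exact, which matters when many small per-run costs are summed.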

Coopetition note

Spring AI Alibaba ships its own admin console that shows runs after the fact. That is useful for inspecting what happened. Dokimos is the gate that runs before: it scores a run's tool calls against the tools the agent was given and fails the build when the agent picks the wrong tool, hallucinates arguments, or misses the task. Use the admin console to look; use Dokimos in CI to block.
