This guide walks you through running your first evaluation using a pre-built insurance customer service agent example.

Prerequisites

Before you begin, make sure you have:

Run Your First Evaluation

1. Set up the example agent

Navigate to the insurance customer service example and configure the agent:
cd examples/bank-insurance
mv agent_config_openai.json agent_config.json
What’s included:
  • Pre-configured agent settings
  • Sample knowledge base for scenario building and evaluation
  • Ready-to-use runtime configuration
  • A prompt-based agent using OpenAI’s Chat Completions API
See our Input Setup guide for detailed instructions on creating customized agent configurations.
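If you would rather keep the original agent_config_openai.json in place, copying it works as well as renaming it. The sketch below is one way to do that; use_agent_config is a hypothetical helper name, not part of the simulator.

```shell
# Hypothetical helper (not part of the simulator): copy a config variant
# to agent_config.json so the original variant file is left untouched.
#   $1 = example directory, $2 = variant file name inside that directory
use_agent_config() {
  if [ ! -f "$1/$2" ]; then
    echo "missing: $1/$2" >&2
    return 1
  fi
  cp "$1/$2" "$1/agent_config.json"
  echo "active config: $1/agent_config.json"
}
```

From the directory that contains examples/, it could be called as: use_agent_config examples/bank-insurance agent_config_openai.json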
2. Set your API key

Export your OpenAI API key as an environment variable:
export OPENAI_API_KEY=your_api_key_here
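Because the run takes a few minutes, it can help to confirm the variable is actually set in the current shell before starting. A minimal sketch, assuming a POSIX-style shell; require_api_key is a hypothetical helper name.

```shell
# Hypothetical helper: confirm OPENAI_API_KEY is set and non-empty
# without printing the key itself.
require_api_key() {
  if [ -z "${OPENAI_API_KEY:-}" ]; then
    echo "OPENAI_API_KEY is not set" >&2
    return 1
  fi
  echo "OPENAI_API_KEY is set (${#OPENAI_API_KEY} characters)"
}
```

Calling require_api_key in the same terminal session where you ran export returns a non-zero status if the key is missing.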
3. Run the simulator

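Return to the directory that contains examples/ (step 1 changed into examples/bank-insurance), then start the simulation:
./run_arksim.sh run ./examples/bank-insurance/config.yaml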
The simulation will take a few minutes to complete. You’ll see progress updates in your terminal.
4. View your results

Once complete, find your evaluation results in the results/ folder:
| File | Description |
| --- | --- |
| results/evaluation/final_report.html | Visual summary with aggregated metrics and insights |
| results/conversation/ | Individual conversation logs with message-by-message details |
| results/evaluation/ | Granular per-turn and per-conversation scores |
Open results/evaluation/final_report.html in your browser to see your agent’s performance!
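To confirm that a run produced the outputs listed above, a small check along these lines may be useful. It is a sketch only: check_results is a hypothetical helper, and the paths are the ones named in this guide.

```shell
# Hypothetical helper: report which of the expected outputs exist.
#   $1 = results directory (defaults to "results")
check_results() {
  root="${1:-results}"
  rc=0
  for item in "$root/evaluation/final_report.html" \
              "$root/conversation" \
              "$root/evaluation"; do
    if [ -e "$item" ]; then
      echo "found:   $item"
    else
      echo "missing: $item"
      rc=1
    fi
  done
  return "$rc"
}
```

If check_results succeeds, the report can usually be opened from the terminal with xdg-open on Linux or open on macOS, followed by results/evaluation/final_report.html.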

Troubleshooting

The simulator looks for agent_config.json in the agent setup directory. Make sure you’ve renamed the file:
mv agent_config_openai.json agent_config.json
Use an absolute path for the bind mount to ensure the container sees your local files:
docker run --platform linux/amd64 \
	-v "$PWD/examples:/examples" \
	--env-file ./examples/.env \
	-it public.ecr.aws/f1v9v1i4/arklex/arksim:1.0.0-alpha \
	run ./examples/bank-insurance/config.yaml
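The docker command above reads environment variables from ./examples/.env. Docker's --env-file option expects one KEY=VALUE pair per line and takes the value literally, so it should not be quoted. The sketch below writes such a file from the key you exported earlier; write_env_file is a hypothetical helper, and it is an assumption here that the container only needs OPENAI_API_KEY.

```shell
# Hypothetical helper: write OPENAI_API_KEY into a Docker env file.
#   $1 = path of the env file to write, e.g. ./examples/.env
write_env_file() {
  if [ -z "${OPENAI_API_KEY:-}" ]; then
    echo "OPENAI_API_KEY is not set" >&2
    return 1
  fi
  # The subshell keeps the umask change local; a newly created file is
  # readable and writable by its owner only.
  ( umask 077; printf 'OPENAI_API_KEY=%s\n' "$OPENAI_API_KEY" > "$1" )
  echo "wrote $1"
}
```

Avoid committing the resulting file to version control, since it contains your API key.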