Prerequisites
Get started
Set your API key and config
OpenAI (default)
Other providers
Set your API key:

export OPENAI_API_KEY="your-api-key"
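Before launching a run, it can help to confirm that the key is actually visible to the process. The following is a minimal sketch and not part of arksim; the helper name `missing_keys` is hypothetical, and the list of variables should be adjusted to the provider you use.

```python
import os


def missing_keys(required, env=None):
    """Return the names in `required` that are unset or blank in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


# OPENAI_API_KEY is the variable named in this guide; swap in
# ANTHROPIC_API_KEY or GEMINI_API_KEY when using another provider.
missing = missing_keys(["OPENAI_API_KEY"])
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("All required API keys are set.")
```

Passing `env` explicitly keeps the helper easy to test without touching the real environment.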
Anthropic (Claude): Set your API key.

export ANTHROPIC_API_KEY="your-api-key"

In your config.yaml:

agent_config:
  agent_type: chat_completions
  agent_name: bank-insurance
  api_config:
    endpoint: https://api.anthropic.com/v1/messages
    headers:
      Content-Type: application/json
      x-api-key: "${ANTHROPIC_API_KEY}"
      anthropic-version: "2023-06-01"
    body:
      model: claude-opus-4-6
      max_tokens: 1024
      system: |
        You are a customer service chatbot for XYZ Bank insurance.
        XYZ Insurance, a core business within XYZ Bank Group, is one of
        Canada's leading providers of life, health, home, auto, and
        travel insurance.
        Rules:
        1. Do not flip roles.
        2. Avoid using bullet points or lists.
        3. Never exceed 80 words.
# ...
# LLM model
model: claude-opus-4-6
# LLM provider
provider: anthropic
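
The `${ANTHROPIC_API_KEY}` placeholder in the headers suggests the value is read from the environment at run time. How arksim performs that substitution is not shown on this page; the sketch below only illustrates the general idea with Python's standard `string.Template`, and `expand_placeholders` is a hypothetical helper.

```python
import os
from string import Template


def expand_placeholders(value, env=None):
    """Recursively replace ${VAR} placeholders in strings, dicts, and lists."""
    env = os.environ if env is None else env
    if isinstance(value, str):
        # safe_substitute leaves unknown placeholders untouched instead of raising.
        return Template(value).safe_substitute(env)
    if isinstance(value, dict):
        return {k: expand_placeholders(v, env) for k, v in value.items()}
    if isinstance(value, list):
        return [expand_placeholders(v, env) for v in value]
    return value


headers = {
    "Content-Type": "application/json",
    "x-api-key": "${ANTHROPIC_API_KEY}",
    "anthropic-version": "2023-06-01",
}
# A demo mapping stands in for os.environ so no real key is needed.
resolved = expand_placeholders(headers, env={"ANTHROPIC_API_KEY": "sk-demo"})
print(resolved["x-api-key"])  # sk-demo
```

Keeping the key in the environment rather than in config.yaml means the file can be committed without leaking credentials.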

Google Gemini: Set your API key. You can use Google Gemini as the evaluation LLM and optionally as the agent (Google Gemini exposes an OpenAI-compatible endpoint).

export GEMINI_API_KEY="your-api-key"

In your config.yaml:

agent_config:
  agent_type: chat_completions
  agent_name: e-commerce
  api_config:
    endpoint: https://generativelanguage.googleapis.com/v1beta/openai/chat/completions
    headers:
      Content-Type: application/json
      Authorization: "Bearer ${GEMINI_API_KEY}"
    body:
      model: gemini-2.5-flash
      messages:
        - role: system
          content: >-
            You are a shopping assistant. Please provide a complete answer
            to the user's question based on your knowledge. Response within
            50 words.
# ...
# LLM model
model: gemini-2.5-flash
# LLM provider
provider: google
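
The Anthropic and Gemini configs differ in where the system prompt goes: the Anthropic Messages API takes a top-level `system` field and requires `max_tokens`, while OpenAI-compatible chat completions endpoints (including the Gemini one used here) carry the system prompt as the first entry in `messages`. The sketch below shows the two body shapes side by side. The function names are hypothetical, no request is sent, and how arksim appends conversation turns to the configured body is not covered on this page.

```python
def anthropic_body(model, system_prompt, user_text, max_tokens=1024):
    """Body shape for the Anthropic Messages API: system prompt is top-level."""
    return {
        "model": model,
        "max_tokens": max_tokens,
        "system": system_prompt,
        "messages": [{"role": "user", "content": user_text}],
    }


def chat_completions_body(model, system_prompt, user_text):
    """Body shape for OpenAI-compatible endpoints: system prompt is a message."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_text},
        ],
    }


claude = anthropic_body("claude-opus-4-6", "You are a helpful agent.", "Hi")
gemini = chat_completions_body("gemini-2.5-flash", "You are a helpful agent.", "Hi")
print(sorted(claude))                  # ['max_tokens', 'messages', 'model', 'system']
print(gemini["messages"][0]["role"])   # system
```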
Run Simulation & Evaluation
Download the examples, then run simulation and evaluation via the CLI, Python, or the web UI.

Download examples: Run the example download command from your project directory. This creates an examples/ folder with three projects you can experiment with.

From the bank-insurance example directory, run:

cd examples/bank-insurance
arksim simulate-evaluate config.yaml

This uses the example's config.yaml (OpenAI) and scenarios.json. Ensure OPENAI_API_KEY is set from Step 2.

Run simulation and evaluation from Python using the bank-insurance example:

import asyncio

from arksim.simulation_engine import run_simulation, SimulationInput
from arksim.evaluator import run_evaluation, EvaluationInput
from arksim.config import AgentConfig
from arksim.scenario import Scenarios

BANK_INSURANCE_DIR = "./examples/bank-insurance"

# Example agent config (OpenAI; match bank-insurance config.yaml)
agent_config = AgentConfig(
    agent_type="chat_completions",
    agent_name="bank-insurance",
    api_config={
        "endpoint": "https://api.openai.com/v1/chat/completions",
        "headers": {
            "Content-Type": "application/json",
            "Authorization": "Bearer ${OPENAI_API_KEY}",
        },
        "body": {
            "model": "gpt-5.1",
            "messages": [
                {"role": "system", "content": "You are a customer service chatbot for XYZ Bank insurance."}
            ],
        },
    },
)

# Load the scenarios that drive the simulated conversations.
scenarios = Scenarios.load(f"{BANK_INSURANCE_DIR}/scenarios.json")

# run_simulation is a coroutine, so it is driven with asyncio.run.
simulation = asyncio.run(run_simulation(SimulationInput(
    agent_config=agent_config,
    num_conversations_per_scenario=1,
    max_turns=5,
    num_workers=50,
    output_file_path=f"{BANK_INSURANCE_DIR}/results/simulation.json",
)))

# Score the simulated conversations with built-in and custom metrics.
evaluation = run_evaluation(EvaluationInput(
    scenario_file_path=f"{BANK_INSURANCE_DIR}/scenarios.json",
    output_dir=f"{BANK_INSURANCE_DIR}/results",
    custom_metrics_file_paths=[f"{BANK_INSURANCE_DIR}/custom_metrics.py"],
    metrics_to_run=[
        "faithfulness",
        "helpfulness",
        "coherence",
        "relevance",
        "goal_completion",
        "agent_behavior_failure",
    ],
    model="gpt-5.1",
    provider="openai",
    num_workers=50,
    generate_html_report=True,
), simulation=simulation, scenarios=scenarios)

print("Simulation and evaluation complete.")
print(f"Simulated {len(simulation.conversations)} conversations.")
print(f"Evaluation results saved to {BANK_INSURANCE_DIR}/results")

Run simulation and evaluation from your browser with the arksim web UI. Your browser opens at http://localhost:8080. Set your API key in the environment before starting (e.g. OPENAI_API_KEY). Then open examples/bank-insurance/config.yaml or create a config that points at your chosen example.

You can run e-commerce and openclaw the same way from examples/e-commerce and examples/openclaw. See the example guides in Next Steps below.
View Results
Open final_report.html or evaluation.json in your results/evaluation directory for metrics and failure analysis. Use these to see how your agent performed and where it failed or could be improved.
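
For scripted checks, evaluation.json can also be inspected from Python. The schema of that file is not described on this page, so this sketch only loads it and reports the top-level structure. The file path and the helper name `describe_json` are assumptions.

```python
import json
from pathlib import Path


def describe_json(path):
    """Load a JSON file and map each top-level key to its value's type name."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, dict):
        return {key: type(value).__name__ for key, value in data.items()}
    # Non-object roots (for example a list) are reported under a single key.
    return {"<root>": type(data).__name__}


# Assumed location, based on the results/evaluation directory named above.
report = Path("examples/bank-insurance/results/evaluation/evaluation.json")
if report.exists():
    for key, kind in describe_json(report).items():
        print(f"{key}: {kind}")
else:
    print(f"{report} not found; run the evaluation first.")
```

Once the actual field names are known, the same loader can feed threshold checks in a CI job.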
Next Steps
Now that you’ve run your first simulation and evaluation, here’s where to go next.