Documentation Index
Fetch the complete documentation index at: https://docs.arklex.ai/llms.txt
Use this file to discover all available pages before exploring further.
What is ArkSim?
ArkSim is an open-source agent testing tool designed to help developers validate, simulate, and harden their agent systems. It focuses on reproducible interaction simulations to identify reliability and idempotency gaps before production deployment.Who is it for?
Developers looking to ensure their agents are robust, reliable, and ready for real-world interactions, without relying on manual testing.What makes it different?
Open-source
Fully transparent and customizable, letting you extend or adapt the tool as needed.
End-to-end testing
Test your agent in realistic scenarios to uncover gaps in behavior or reliability before production.
Pre-built scenarios
Use predefined scenarios to quickly validate your agent and jump-start testing effectively.
Easily extensible
Integrate with existing systems and tailor the testing environment to your specific needs.
Core Capabilities
Scenarios
A scenario is a test case that defines a simulated user’s attributes, goals, and prior knowledge. It allows you to test different versions of your agent against the same set of scenarios to compare behavior and catch regressions.Simulation
Simulation turns scenarios into live, multi-turn conversations with your agent. You get full transcripts (and optional metadata) that you can critically inspect or pass to evaluation.Evaluation
Evaluation scores your agent on those conversations. You get quantitative and qualitative metrics (e.g. helpfulness, faithfulness) and agent behavior failures that you can prioritize and fix.Frequently Asked Questions
What does ArkSim do?
What does ArkSim do?
ArkSim generates realistic multi-turn conversations between LLM-powered synthetic users and your agent, then evaluates every turn. Each synthetic user has a distinct profile, goal, and knowledge level. This reveals failures that only emerge across multiple conversation turns, like losing context, calling the wrong tool, or contradicting earlier responses.
How do I get started?
How do I get started?
Install with
pip install arksim, run arksim init to scaffold a starter config and agent file, then run arksim simulate-evaluate config.yaml. See the quickstart guide for a full walkthrough.What agents and frameworks are supported?
What agents and frameworks are supported?
ArkSim works with any AI agent, whether it is built with LangChain, CrewAI, OpenAI Agents SDK, or your own custom code. Connect through a Chat Completions HTTP endpoint, the A2A protocol, or load a Python agent class directly with no server needed. See the full list of supported integrations.
Can I run this in CI/CD?
Can I run this in CI/CD?
Yes. ArkSim runs as a CLI command that exits non-zero when quality thresholds are not met. Add it to any CI pipeline as a quality gate on every pull request. See the CI integration guide.
What metrics are available?
What metrics are available?
Built-in metrics include helpfulness, coherence, relevance, faithfulness, verbosity, goal completion, and agent behavior failure detection. You can also define custom quantitative and qualitative metrics with full access to the conversation context. See the evaluation guide.
Ready to start?
Test your AI agent with ArkSim. Follow the quick start or explore the source on GitHub.Quick Start
Set up your first simulation in minutes.
Source Code
Explore full implementation on GitHub.