Overview
Customize your simulation by controlling conversation count, model selection, parallelism, and evaluation settings using YAML configuration files or CLI flags.

For step-by-step execution with individual stage configurations, see the Advanced Usage section.
YAML Configuration
Use YAML configuration files to run the full pipeline (scenario building, conversation simulation, and conversation evaluation).

Understanding Key Settings
- num_conversations — How many test conversations to generate. Start with 5-10, increase for thorough testing.
- max_turns — Maximum back-and-forth exchanges per conversation. 5 is typical for most use cases.
- num_workers — Parallel processing threads. Higher values mean faster execution but more parallel API calls. Use `auto` to default to `num_conversations`.
- model — The LLM used for scenario generation and evaluation (not your agent).
Configuration File
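The settings above can be combined in a single YAML file. The sketch below is illustrative only: the key names mirror the settings described in this section, but the exact schema and default values may differ in your version, and the model and directory names are placeholders.

```yaml
# Illustrative full-pipeline configuration (key names and values are examples)
agent_setup_dir: ./agent_setup   # directory with agent_config.json and knowledge.json
output_dir: ./results
model: gpt-4o                    # placeholder model name; use your provider's model
provider: openai
num_conversations: 10            # start with 5-10, increase for thorough testing
max_turns: 5
num_workers: auto                # defaults to num_conversations
generate_html_report: true
```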
CLI Flags
Command-line flags override config file values and can be applied to both Binary and Docker runs.

Usage
Use `--flag value` (or `--flag=value`). Flags are written in kebab-case (e.g., `--num-conversations`) and are mapped to YAML keys automatically.
Example (Binary):
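A minimal invocation might look like the following. The binary name `arksim` and all values are assumptions for illustration; substitute your actual binary path and settings.

```shell
# Run the full pipeline, overriding config-file values with flags
# (binary name and values are illustrative)
./arksim run \
  --agent-setup-dir ./agent_setup \
  --num-conversations 10 \
  --max-turns 5 \
  --model gpt-4o
```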
Available Flags
| Flag | Applies to | Description |
|---|---|---|
| `--agent-setup-dir` | build, simulate, evaluate, run | Path to the agent setup directory (contains `agent_config.json` and `knowledge.json`). |
| `--output-dir` | build, simulate, evaluate, run | Output directory. If omitted, a stage-specific default is used. |
| `--model` | build, simulate, evaluate, run | LLM model used by Arksim (scenario generation and/or evaluation; not your agent). |
| `--provider` | build, simulate, evaluate, run | LLM provider (e.g., `openai`, `azure`). |
| `--temperature` | run | Temperature for LLM generation (full pipeline only). |
| `--embedding-model` | build, run | Embedding model name for vector operations. |
| `--enable-topic-modeling` | build, run | Whether to enable topic modeling for knowledge. |
| `--knowledge-method` | build, run | Knowledge clustering method: `knee`, `density_cp`, `ppr`, `hdbscan`, `meanshift`, `affprop`, `ratio`. |
| `--user-attribute-file` | build, run | Path to the user attributes file. |
| `--oversample-ratio` | build, run | Oversample ratio for user attribute generation. |
| `--seed` | build, run | Random seed for reproducibility. |
| `--num-conversations` | build, simulate, run | Number of conversations to generate/simulate. |
| `--num-conversations-per-persona` | build, run | Number of conversations per persona. |
| `--max-turns` | simulate, run | Maximum turns per conversation. |
| `--num-workers` | simulate, evaluate, run | Number of parallel workers. |
| `--display-time-delay` | simulate, run | Add delays between turns when displaying conversations. |
| `--input-dir` | simulate, evaluate | Input directory (simulate: pre-built scenarios; evaluate: simulation outputs). |
| `--generate-html-report` | evaluate, run | Whether to generate an HTML report. |
| `--score-threshold` | evaluate, run | If any per-conversation final score is below this threshold, exit with a non-zero code. |
| `--code-file-path` | evaluate, run | Path to the code file for fix generation (requires `--entry-function`). |
| `--entry-function` | evaluate, run | Entry function for code fix generation (requires `--code-file-path`). |
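For example, the evaluate-stage flags can be combined to gate a CI run on conversation scores. This is a sketch: the binary name `arksim`, the threshold value, and the boolean flag syntax are assumptions; check your version's help output for exact forms.

```shell
# Evaluate previously simulated conversations and fail (non-zero exit)
# if any per-conversation final score falls below the threshold.
# Binary name and values are illustrative.
./arksim evaluate \
  --input-dir ./results/simulations \
  --generate-html-report=true \
  --score-threshold 0.7
```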