
The Test Center lets you test your voice agents before they interact with real customers. Run simulated conversations, evaluate performance against measurable criteria, and identify issues like incorrect answers, script deviations, or unnatural phrasing.
By running the same test suites after each change, you can track improvements over time and catch regressions early.
Manual testing requires hours of phone calls to cover all your scenarios, and it’s difficult to consistently reproduce edge cases or validate that your agent handles every situation correctly. Simulations solve this by automating your testing workflow.
Test suites provide a repeatable way to validate your agent’s behavior. Instead of manually calling your agent and checking responses, test suites run multiple scenarios in parallel and evaluate results against defined success criteria. You can test dozens of scenarios in minutes, validate prompt changes systematically before deploying to clients, and catch consistency issues (greeting delivery, conversation flow problems) automatically.
This approach is especially effective for Flow Designer agents. The node-based structure makes it easier for the AI to generate targeted test cases that cover specific conversation paths, decision points, and edge cases. Custom prompt agents are also supported, though the generated tests may be broader since there’s no explicit conversation graph to analyze.
A test suite is a collection of test cases designed to evaluate a specific agent. Each suite is bound to one agent and generates multiple test cases based on scenarios you select.
To create a test suite, go to the Test Center and click + Test Suite.

Start by selecting the agent you want to test and the language for the simulated conversations. The test suite will be permanently linked to this agent.
From here, you can either write your own test cases manually or let the AI generate them for you based on scenarios.
Scenarios define what situations your test cases will cover. When you create a test suite, you can select from common scenarios or add your own custom scenarios.

Custom scenarios let you cover situations unique to your use case. There’s no limit to how many scenarios you can select.
Every test suite includes 4 base test cases generated automatically. These are not based on a template; the AI analyzes your specific agent to create meaningful tests. The analysis includes:
Each scenario you select adds one test case, so by default the total is the 4 base test cases plus the number of scenarios selected, up to the limit of 25 test cases per suite. You can adjust the total using the slider before generating.
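The default count described above can be worked through in a few lines of Python. This is only an illustrative sketch, not part of the product: the function name `default_test_case_count` is hypothetical, and it assumes the default is simply capped at the 25-test-case limit mentioned in this guide.

```python
BASE_TEST_CASES = 4   # base test cases included in every suite (per this guide)
MAX_TEST_CASES = 25   # per-suite limit (per this guide)


def default_test_case_count(num_scenarios: int) -> int:
    """Return the default number of test cases for a suite.

    Hypothetical helper: assumes the default is capped at the per-suite
    maximum; the product may handle the limit differently.
    """
    if num_scenarios < 0:
        raise ValueError("num_scenarios must be non-negative")
    return min(BASE_TEST_CASES + num_scenarios, MAX_TEST_CASES)


print(default_test_case_count(3))   # 4 base + 3 scenarios = 7
print(default_test_case_count(30))  # 34 would exceed the limit, so 25
```

For example, selecting 6 scenarios gives 4 + 6 = 10 test cases before any slider adjustment.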
Each test case consists of a scenario prompt that describes the situation to simulate, and a list of success criteria that determine whether the test passed or failed.
You can configure how criteria are evaluated: require all criteria to pass, or pass when any criterion is met. This flexibility lets you create both strict compliance tests and exploratory edge-case tests.
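The difference between the two evaluation modes can be sketched as follows. This is an illustration of the logic only, under the assumption that each criterion yields a simple pass or fail; the function `case_passed` and the mode names `"all"` and `"any"` are hypothetical and do not correspond to a documented API.

```python
from typing import Iterable, Literal


def case_passed(
    criteria_results: Iterable[bool],
    mode: Literal["all", "any"] = "all",
) -> bool:
    """Combine per-criterion outcomes into a single test case verdict.

    mode="all": every criterion must pass (strict compliance tests).
    mode="any": one passing criterion is enough (exploratory tests).
    """
    results = list(criteria_results)
    if mode == "all":
        return all(results)
    if mode == "any":
        return any(results)
    raise ValueError(f"unknown mode: {mode!r}")


# Two criteria passed and one failed.
outcomes = [True, True, False]
print(case_passed(outcomes, mode="all"))  # False
print(case_passed(outcomes, mode="any"))  # True
```

The same set of outcomes fails a strict test but passes an exploratory one, which is why the mode should be chosen per test case according to its purpose.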
Test suite generation works with both custom prompt agents and Flow Designer agents.
To run a test suite, go to Run History and select the suite you want to execute. The suite always runs against the agent it was created for; the target agent cannot be changed after creation. Configure any additional settings, such as the maximum number of turns, then click Run Suite.
At this point, Synthflow creates a session containing all the simulations. Each test case runs as a simulated call where your agent is paired with a persona agent, a simulated customer created automatically in the same language. The system records audio, generates transcripts, and evaluates results against your success criteria.
You’ll see live progress as runs complete.
After simulations finish, open the Run History tab to see all sessions. Each row shows the date, agent tested, test suite name, and overall status. Click any session to see the individual test case results.

Select a test case to view its full evaluation. The detail view shows:

This level of detail helps you understand exactly where your agent succeeded or failed, making it easier to iterate on your prompts or flow design.
Simulations end with one of three statuses:
Temporary limitation: during simulations, call transfers and Real-Time Booking (RTB) actions are not actually executed. They are simulated from the conversation transcript so success criteria can still be evaluated. Full action execution in the test environment is on the roadmap.
A test suite cannot be reused for a different agent. Each test suite is permanently linked to the agent it was created for. If you want to test a different agent, create a new test suite for that agent.
You can have up to 25 test cases per suite. The default is 4 base test cases plus one additional test case per scenario you select.
Simulations are free and do not consume minutes from your account balance.
If a test case fails unexpectedly, review the success criteria explanations in the test case details. The AI evaluates each criterion independently and provides reasoning for pass or fail. A test case may fail if the agent didn’t explicitly meet a specific criterion, even if the overall conversation was acceptable.
You can add, remove, or modify test cases within a test suite at any time. However, you cannot change the target agent.