Quick Guide
Connect Claude Code to Cekura
Install the Cekura MCP server (see the Claude Code tab in the MCP overview) and install the Cekura Skills repository. Skills give Claude Code the playbook for creating evaluators, running tests, and analyzing results through Cekura.
Set up mock data and cleanup hooks
Complex agents depend on database state: ships, credits, sectors, user metadata. Stand up a lightweight webhook server that seeds mock data before each run and resets the database when Cekura’s post-run webhook arrives. This keeps concurrent tests isolated and repeatable.
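A minimal sketch of such a server, using only the Python standard library and SQLite. Everything specific here is an assumption for illustration, not Cekura's documented contract: the `/seed` and `/hooks/post-run` paths, the `run_id` field in the JSON body, the `ships` table, and the fixture rows are all hypothetical. Check the actual webhook payload before adapting it.

```python
import json
import sqlite3
from contextlib import closing
from http.server import BaseHTTPRequestHandler, HTTPServer

DB_PATH = "mock_state.db"  # hypothetical test database file

# Hypothetical fixture rows: (ship id, sector, credits).
SEED_SHIPS = [("ship-1", "sector-7", 1000), ("ship-2", "sector-3", 250)]


def init_db(path=DB_PATH):
    """Create the mock table if it does not exist yet."""
    with closing(sqlite3.connect(path)) as conn, conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS ships ("
            "run_id TEXT, id TEXT, sector TEXT, credits INTEGER, "
            "PRIMARY KEY (run_id, id))"
        )


def seed_run(run_id, path=DB_PATH):
    """Insert fixture rows tagged with run_id so concurrent runs do not collide."""
    init_db(path)
    rows = [(run_id, *ship) for ship in SEED_SHIPS]
    with closing(sqlite3.connect(path)) as conn, conn:
        conn.executemany("INSERT OR REPLACE INTO ships VALUES (?, ?, ?, ?)", rows)


def cleanup_run(run_id, path=DB_PATH):
    """Delete only the rows that belong to the finished run."""
    init_db(path)
    with closing(sqlite3.connect(path)) as conn, conn:
        conn.execute("DELETE FROM ships WHERE run_id = ?", (run_id,))


class HookHandler(BaseHTTPRequestHandler):
    def _reply(self, status, payload):
        body = json.dumps(payload).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            data = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            return self._reply(400, {"error": "invalid JSON"})
        # "run_id" is an assumed field name, not a documented one.
        run_id = data.get("run_id") if isinstance(data, dict) else None
        if not run_id:
            return self._reply(400, {"error": "run_id required"})
        if self.path == "/seed":
            seed_run(run_id)
        elif self.path == "/hooks/post-run":
            cleanup_run(run_id)
        else:
            return self._reply(404, {"error": "unknown path"})
        self._reply(200, {"ok": True, "run_id": run_id})


def serve(port=8080):
    """Block forever, handling seed and post-run requests on localhost."""
    HTTPServer(("127.0.0.1", port), HookHandler).serve_forever()
```

Call `serve()` to start listening. Tagging every row with a run identifier, rather than wiping the whole table, is one way to keep overlapping runs from deleting each other's data; a separate database per run would work as well.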
Let Claude Code learn the schema
Ask Claude Code to read your database schema and agent tools through the Cekura MCP. Once it has the lay of the land, prompt it to generate scenario coverage. For example: “Using the Cekura skill, create 10 test cases covering ship movement, corp joins, and credit transfers.”
Run the full suite
Kick off all evaluators at once through the MCP. Cekura executes them concurrently and surfaces pass/fail with full transcripts and traces.
Triage with Claude
Ask Claude Code to pull failing runs and classify them: “Analyze these 4 failures and tell me which are real bugs vs. flakes.” Because runs are non-deterministic, rerun each failing test 3–4 times and ask Claude to compute a pass rate. One pass out of four usually points to a prompt or tool-routing bug, not infra.
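The pass-rate arithmetic is simple enough to check by hand or in a few lines of Python. The sketch below assumes you have exported rerun outcomes as `(test_name, passed)` pairs; that shape, the function names, and the 0.25 / 0.75 cutoffs are illustrative assumptions, not Cekura output formats or recommended thresholds.

```python
from collections import defaultdict


def pass_rates(results):
    """Map each test name to passes / total over its reruns.

    results: iterable of (test_name, passed) pairs, one per rerun.
    """
    counts = defaultdict(lambda: [0, 0])  # name -> [passes, total]
    for name, passed in results:
        counts[name][1] += 1
        if passed:
            counts[name][0] += 1
    return {name: p / t for name, (p, t) in counts.items()}


def classify(rate, bug_at_or_below=0.25, flake_at_or_above=0.75):
    """Rough triage label; both cutoffs are assumptions to tune per project."""
    if rate <= bug_at_or_below:
        return "likely bug"
    if rate >= flake_at_or_above:
        return "likely flake"
    return "inconclusive"


# Hypothetical outcomes from rerunning two failing tests four times each.
reruns = [
    ("ship_movement", False), ("ship_movement", False),
    ("ship_movement", True), ("ship_movement", False),
    ("credit_transfer", True), ("credit_transfer", True),
    ("credit_transfer", False), ("credit_transfer", True),
]

for name, rate in pass_rates(reruns).items():
    print(f"{name}: {rate:.2f} -> {classify(rate)}")
```

With these sample inputs, `ship_movement` passes one rerun in four (0.25) and lands in the "likely bug" bucket, while `credit_transfer` passes three in four (0.75) and is labeled a likely flake. Four reruns is a small sample, so treat the label as a starting point for reading the transcripts, not a verdict.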