What the replay URL is
- Reproducibility page: In the web app, go to Reproducibility (or open it from Behavior Search via Test Reproducibility for a session). The page URL can include sessionId and agentId: /reproducibility?sessionId=...&agentId=...
- Target URL (replay_url): A single field on that page where you enter the base URL of your agent’s replay endpoint (e.g. https://your-agent.example.com/replay or an ngrok URL). Kindred sends replay payloads to this endpoint when you click Run.
The replay payloads are POST requests with a JSON body that includes the original session’s messages or user input and an is_replay flag. Kindred runs that request multiple times (N runs) and compares the results to check determinism.
How to get to the Reproducibility page
- From Behavior Search: Search for sessions, select a session, then use Test Reproducibility. This opens the Reproducibility page with sessionId (and usually agentId) pre-filled.
- Direct URL: Go to /reproducibility and optionally add query params: ?sessionId=...&agentId=...
- App nav: Use the main navigation link to Reproducibility and then paste or choose a session ID (and agent ID if needed).
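The direct URL can also be assembled programmatically, for example to link to a session from your own tooling. This is a minimal sketch: the /reproducibility path and the sessionId and agentId parameter names come from the text above, while the function name, the base URL https://app.example.com, and the IDs are placeholders.

```python
from urllib.parse import urlencode


def reproducibility_url(base_url, session_id=None, agent_id=None):
    """Build a Reproducibility page URL with optional query parameters."""
    params = {}
    if session_id:
        params["sessionId"] = session_id
    if agent_id:
        params["agentId"] = agent_id
    url = base_url.rstrip("/") + "/reproducibility"
    # urlencode escapes any characters that are not URL-safe.
    return url + ("?" + urlencode(params) if params else "")


# Placeholder host and IDs, for illustration only.
print(reproducibility_url("https://app.example.com", "sess_123", "agent_456"))
```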
Set and save the Target URL (replay_url)
- On the Reproducibility page, ensure a session is selected (and agent ID is set or derived).
- In the Target URL (replay_url) field, enter the full base URL of your replay endpoint, e.g.:
  - https://your-agent.example.com/replay
  - https://abc123.ngrok.io (if your server listens on / or /replay)
- Click Save to store this URL for the current agent so it’s pre-filled next time.
Run reproducibility
- Set Target URL (replay_url) as above (and save if desired).
- Choose N runs (e.g. 1–20). Kindred will call your replay endpoint N times with the same payload.
- Click Run. Kindred compares the responses from the N runs.
- Review the results table for pass/fail and any divergence between runs.
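To make the idea of an N-run determinism check concrete, here is a rough sketch of the same loop run locally. It is not Kindred's comparison logic, which this page does not describe. Exact-match comparison against the first run is an assumption, and check_reproducibility, canonical, and the send callable are hypothetical names.

```python
import json
from collections import Counter


def canonical(response):
    """Serialize a JSON-like response with sorted keys so equal content compares equal."""
    return json.dumps(response, sort_keys=True)


def check_reproducibility(send, payload, n_runs):
    """Call send(payload) n_runs times and report whether all responses match.

    Assumption: "deterministic" means every response is identical to the first.
    """
    if n_runs < 1:
        raise ValueError("n_runs must be at least 1")
    outputs = [canonical(send(payload)) for _ in range(n_runs)]
    baseline = outputs[0]
    return {
        "runs": n_runs,
        "distinct_responses": len(Counter(outputs)),
        "deterministic": all(out == baseline for out in outputs),
        # Zero-based indices of runs that differ from the first run.
        "divergent_runs": [i for i, out in enumerate(outputs) if out != baseline],
    }
```

Here send can be any function that posts the payload to your replay endpoint and returns the parsed response. A stub such as lambda p: {"output": "hi"} is enough to try the function without a server.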
Replay endpoint contract
Your server should accept POST requests (e.g. at / or /replay) with a JSON body. Two common shapes:
- LLM path: { "messages": [...], "model": "gpt-4o-mini", "is_replay": true }
- User path: { "input": "user message", "is_replay": true }
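A minimal sketch of an endpoint that accepts both shapes, using only the Python standard library. It shows routing on the messages and input keys and serving / and /replay. The reply format (path, is_replay, output) and the echo behavior are placeholders, since this page does not specify what Kindred expects back. A real agent would call its model or pipeline where the comments indicate.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def build_reply(payload):
    """Route a replay payload by shape; return (HTTP status, reply dict)."""
    if not isinstance(payload, dict):
        return 400, {"error": "expected a JSON object"}
    is_replay = bool(payload.get("is_replay"))
    if "messages" in payload:
        # LLM path: a real agent would call its model with payload["messages"].
        messages = payload["messages"]
        last = messages[-1] if isinstance(messages, list) and messages else {}
        content = last.get("content", "") if isinstance(last, dict) else ""
        return 200, {"path": "llm", "is_replay": is_replay, "output": f"echo: {content}"}
    if "input" in payload:
        # User path: a real agent would run its pipeline on the user input.
        return 200, {"path": "user", "is_replay": is_replay, "output": f"echo: {payload['input']}"}
    return 400, {"error": "expected 'messages' or 'input'"}


class ReplayHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path not in ("/", "/replay"):
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            status, reply = 400, {"error": "invalid JSON"}
        else:
            status, reply = build_reply(payload)
        body = json.dumps(reply).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(host="127.0.0.1", port=5001):
    """Start the sketch server; blocks until interrupted."""
    HTTPServer((host, port), ReplayHandler).serve_forever()
```

Calling serve() starts the server on port 5001. Keeping the routing in build_reply, separate from the HTTP handler, makes it easy to exercise without opening a socket.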
Example: replay test server
The Kindred repo includes a small replay-server that accepts replay payloads and forwards them to OpenAI. You can use it to try reproducibility locally:
- Start the replay server (e.g. on port 5001) and set OPENAI_API_KEY.
- Expose it with a tunnel (e.g. ngrok http 5001).
- On the Reproducibility page, set Target URL (replay_url) to the ngrok URL (e.g. https://abc123.ngrok.io).
- Select a session that has replay context and click Run.
The replay server accepts POST / or POST /replay with the replay payload and returns the model response so Kindred can compare N runs.
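Before pointing Kindred at the tunnel, it can help to send one payload to the server yourself. The sketch below does that with the standard library. It assumes the server listens on localhost port 5001 and answers with a JSON body, which this page does not guarantee. build_replay_request and post_replay are hypothetical helper names.

```python
import json
import urllib.request


def build_replay_request(target_url, payload):
    """Build a JSON POST request for a replay endpoint."""
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        target_url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_replay(target_url, payload, timeout=30):
    """Send the payload and return the decoded JSON response.

    Assumption: the endpoint replies with a JSON body.
    """
    request = build_replay_request(target_url, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, post_replay("http://localhost:5001/replay", {"input": "user message", "is_replay": True}) sends the user-path shape from the contract above. If that call succeeds locally, the same payload should also reach the server through the ngrok URL.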
When a session is not replayable
If the session has no replay context (e.g. no user or LLM request to replay, or logs not found for your account), the UI will show that the session is not replayable and Run will be disabled. Use a session that was recorded with the tracer and that has the required context.
Next steps
- Set up the Kindred tracer so sessions are logged.
- Get your API key so you can use the dashboard and save replay URLs.