commandsSource-backedReview first Safety ✓ Privacy ✓

/prompt-eval-runbook - Prompt Eval Runbook Slash Command

Slash command runbook for designing and running prompt evaluations: define tasks, success criteria, golden outputs, regression checks, and privacy-safe reporting using Anthropic test-and-evaluate guidance.

by kiannidev·added 2026-06-16·

Claude Code

HarnessClaude Code

Invocation:/prompt-eval-runbook <feature-or-prompt-name>

Install

Source

/prompt-eval-runbook <feature-or-prompt-name>

Readiness

TrustReview first
Sourcesource-backed
Safety notesPresent
ReviewedYes

Documentation Source repository Registry JSON · LLM text

Review first — review before installing

Open the source and read safety notes before installing.

Safety notes

Eval fixtures must not contain live credentials or customer PII.
Treat model outputs as draft until human reviewers sign off on regressions.

Privacy notes

Redact proprietary prompts before exporting eval results outside the team.

Prerequisites

Representative user tasks and expected outcomes for the prompt under test.
Staging environment without production secrets in eval fixtures.

Schema details

Install type: cli
Troubleshooting: No

Runtime and command metadata

Command syntax: /prompt-eval-runbook <feature-or-prompt-name>

Full copyable content

/prompt-eval-runbook <feature-or-prompt-name>

About this resource

The /prompt-eval-runbook command turns Anthropic test-and-evaluate guidance into a repeatable Claude Code session checklist for prompt or skill changes.

Usage

/prompt-eval-runbook <feature-or-prompt-name>

What it does

Define tasks. List 5–10 representative user tasks with clear success criteria.
Draft fixtures. Create minimal inputs without production secrets.
Establish baselines. Capture current outputs for comparison after prompt edits.
Run regression pass. Re-run tasks after changes; flag quality, safety, or format regressions.
Score results. Use pass/partial/fail per task with one-line evidence.
Document rollback. Record prior prompt version or git ref to revert if eval fails.
Publish summary. Produce a privacy-safe eval report for reviewers.

Output format

Task list and success criteria
Baseline vs candidate comparison table
Regression findings with severity
Ship/no-ship recommendation
Rollback reference

Requirements

Access to staging Claude Code or Agent SDK environment for running tasks.
Reviewer approval before promoting prompt changes to production users.

#evals #prompts #slash-command #quality

Source citations

Add this badge to your README

Show that /prompt-eval-runbook - Prompt Eval Runbook Slash Command is listed on HeyClaude. Paste this Markdown into your README — it renders the badge and links back to this page.

[![Listed on HeyClaude](https://heyclau.de/badge/commands/prompt-eval-runbook.svg)](https://heyclau.de/entry/commands/prompt-eval-runbook)

How it compares

/prompt-eval-runbook - Prompt Eval Runbook Slash Command side by side with 3 alternatives on trust, install, platform support, and disclosed safety notes — all from reviewed registry metadata.

Field	/prompt-eval-runbook - Prompt Eval Runbook Slash Command Slash command runbook for designing and running prompt evaluations: define tasks, success criteria, golden outputs, regression checks, and privacy-safe reporting using Anthropic test-and-evaluate guidance. Open dossier	/targeted-test-generation - Targeted Test Generation Runbook Community slash command runbook for adding minimal automated tests around a changed module: inspect the git diff, mirror repository test conventions, and draft focused unit or integration tests using Anthropic develop-tests guidance. Open dossier	/documentation-refresh - Documentation Refresh Runbook Community slash command runbook to refresh stale project documentation after code changes: use git history to find affected docs, compare README commands to package scripts, and flag broken internal links before opening a docs PR. Open dossier	/frontend-visual-qa - Chrome Design Verification Runbook Community slash command runbook for frontend visual QA using documented Claude Code Chrome integration workflows: enable /chrome, open a local page, read console messages, and follow the design verification checklist from the Chrome integration guide. Open dossier
Trust
Install risk	Review first	Review first	Review first	Review first
Notes	Safety ✓ Privacy ✓	Safety ✓ Privacy ✓	Safety ✓ Privacy ✓	Safety ✓ Privacy ✓
Category	commands	commands	commands	commands
Source	source-backed	source-backed	source-backed	source-backed
Author	kiannidev	kiannidev	kiannidev	kiannidev
Added	2026-06-16	2026-06-16	2026-06-16	2026-06-16
Platforms	Claude Code	Claude Code	Claude Code	Claude Code
Source repo	—	—	—	—
Safety notes	✓Eval fixtures must not contain live credentials or customer PII. Treat model outputs as draft until human reviewers sign off on regressions.	✓Generated tests are drafts; run the project's test command locally before committing. Do not add network calls, production credentials, or destructive setup to proposed tests.	✓Read-only git history inspection unless the operator approves doc edits. Validate git refs before interpolating them into shell commands.	✓Chrome integration runs in a visible browser with your logged-in session; avoid production admin flows. Handle login pages and CAPTCHAs manually when the integration pauses.
Privacy notes	✓Redact proprietary prompts before exporting eval results outside the team.	✓Use synthetic fixtures only; do not copy production logs or customer records into tests.	✓Commit messages and doc drafts enter model context; scrub internal-only details first.	✓Console logs and screenshots may include staging data; redact before external sharing.
Prerequisites	Representative user tasks and expected outcomes for the prompt under test. Staging environment without production secrets in eval fixtures.	Git repository with an existing test runner configured in package scripts or docs. A concrete code change or diff hunk to cover with new or updated tests.	Git repository with commits and documentation files such as README.md or docs/. Lockfiles or package manifests when comparing documented install commands.	Claude Code 2.0.73+ and Claude in Chrome extension 1.0.36+ on Chrome or Edge. Local dev server reachable from the operator browser session.
Install	`/prompt-eval-runbook <feature-or-prompt-name>`	`/targeted-test-generation <file-or-symbol>`	`/documentation-refresh [since-ref]`	`/frontend-visual-qa <route-or-host>`
Config	—	—	—	—
Citations	Source repositorygithub.com 2026-06-16T23:15:18+00:00 Documentationplatform.claude.com Submitted by kiannidev2026-06-16	Source repositorygithub.com 2026-06-16T23:15:18+00:00 Documentationplatform.claude.com Submitted by kiannidev2026-06-16	Source repositorygithub.com 2026-06-16T23:15:18+00:00 Documentationgit-scm.com Submitted by kiannidev2026-06-16	Source repositorygithub.com 2026-06-16T23:15:18+00:00 Documentationcode.claude.com Submitted by kiannidev2026-06-16
Claim	Unclaimed	Unclaimed	Unclaimed	Unclaimed

Signals

Loading live community signals…