Promptfoo red teaming
Promptfoo · evals, prompt testing, red teaming, security
AI directory search
Use this when you know the topic you need: Claude Code, MCP, evals, RAG, agents, product, coding, prompting, foundations, or model internals.
14 matches for "testing"
Promptfoo Docs · Intermediate
Practical for regression-testing prompts, model changes, and LLM outputs.
Topics: Prompt testing, Evals, Red teaming
Made With ML · Intermediate
Useful path for production ML fundamentals that transfer to AI engineering.
Topics: MLOps, Testing, Deployment, ML systems
Ollama docs · Beginner to intermediate
Practical route into running and testing local models on your own machine.
Topics: Local models, LLM tools, AI engineering, Privacy
Workflow skill catalog
Matt Pocock / AI Hero · Skill catalog
Use this when you want opinionated coding-agent workflows instead of generic prompt snippets.
Start
Start with /teach, /grill-me, /to-prd, /to-issues, /tdd, /triage, or /handoff depending on the job.
Guide · Matt Pocock · Intermediate
You want TypeScript checks, tests, linters, and review loops that help agents produce better code and catch regressions quickly.
ai coding, typescript, testing, feedback loops
Guide / Claude skill · Matt Pocock · Intermediate
You want an agent workflow that implements behavior with a red, green, refactor loop instead of jumping straight to broad code changes.
claude skills, tdd, ai coding, testing
Guide · OpenAI · Intermediate
You need API-level guidance for testing outputs, comparing models, and catching regressions during upgrades.
openai, evals, quality, regression testing, reliability
Guide · OpenAI · Intermediate
You need the current OpenAI path for tracing, grading, and regression-testing agent workflows instead of only single-prompt evals.
openai, agents, evals, traces, graders
Models guide · OpenRouter · Beginner to advanced
You need to compare many model families through one catalog before testing prompts across providers.
openrouter, model comparison, model routing, opus, claude
Model docs · xAI · Intermediate
You want the specific Grok Build model details, pricing, and capabilities before testing xAI for agentic coding work.
xai, grok build, coding agents, model selection, agentic coding
Open source docs · Promptfoo · Intermediate
You need regression tests for prompts, models, and LLM outputs.
evals, prompt testing, red teaming
Free course · Made With ML · Intermediate
You need production ML habits that transfer to AI systems.
mlops, testing, deployment