AI educator

Hamel Husain

Hamel's AI evals guides

Very practical material on evaluating LLM apps before they disappoint users.

Start with: Read the evals guide and build a small test set for your own app.

Resources from Hamel Husain

Guide

LLM Evals

Intermediate

Your AI app needs quality checks before users see it.

Guides

Hamel's AI evals guides

Intermediate to advanced

Use this when you want Hamel Husain's material for evals and related AI skills.

Learner questions

Who should learn from Hamel Husain?

Builders shipping LLM systems should start here when they need help with evals, RAG, and LLM product quality. The strongest fit is a learner who prefers guides and workshops.

What should I do first?

Read the evals guide and build a small test set for your own app. After that, open one related resource below and write down the exact workflow, concept, or implementation pattern you want to apply.
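As a rough illustration of what "a small test set" might look like in practice, here is a minimal Python sketch. It is not taken from Hamel Husain's guides; the names `EvalCase`, `run_evals`, `pass_rate`, and `fake_app` are hypothetical, and the substring check stands in for whatever grading method your app actually needs.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class EvalCase:
    """One test example: an input prompt and a phrase the answer should contain."""
    prompt: str
    must_contain: str


def run_evals(app: Callable[[str], str], cases: List[EvalCase]) -> List[Tuple[EvalCase, bool]]:
    """Run every case through the app and record whether the check passed."""
    results = []
    for case in cases:
        output = app(case.prompt)
        # Assumption: a case-insensitive substring match is a good enough grader here.
        passed = case.must_contain.lower() in output.lower()
        results.append((case, passed))
    return results


def pass_rate(results: List[Tuple[EvalCase, bool]]) -> float:
    """Fraction of cases that passed; 0.0 for an empty test set."""
    if not results:
        return 0.0
    return sum(1 for _, passed in results if passed) / len(results)


def fake_app(prompt: str) -> str:
    """Hypothetical stand-in for your real LLM app, so the sketch runs offline."""
    if "refund" in prompt.lower():
        return "Refunds are available within 30 days of purchase."
    return "Sorry, I am not sure."


cases = [
    EvalCase(prompt="What is your refund policy?", must_contain="30 days"),
    EvalCase(prompt="How do I reset my password?", must_contain="reset link"),
]

results = run_evals(fake_app, cases)
for case, passed in results:
    print(("PASS" if passed else "FAIL"), "-", case.prompt)
print("Pass rate:", pass_rate(results))
```

To try it on your own app, replace `fake_app` with a function that calls your system and swap in examples drawn from real user inputs. Failing cases are the ones worth reading closely.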

What problem does this help with?

Very practical material on evaluating LLM apps before they disappoint users. Use this profile when you are comparing educators by topic, level, format, and practical usefulness rather than browsing random AI content.

How do I compare this with other educators?

Compare the skill coverage, the starting recommendation, the educator's own resources, and any available videos. If you need evals, search the directory for that skill and shortlist three profiles before committing to a course, book, or playlist.

More related resources

Resource	Provider	Kind	Level	Use when
OpenAI Cookbook	OpenAI	GitHub repo	Beginner to advanced	You need implementation examples rather than theory.
Prompt Engineering Guide	DAIR.AI	Guide	Beginner to advanced	You want examples of prompting techniques and patterns.
AI SDK v6 Crash Course	Matt Pocock	Workshop	Intermediate	You want a structured AI SDK v6 course that covers model choice, text and object generation, UI streams, agents, persistence, context engineering, evals, and advanced app patterns.
The AI Engineer Roadmap	Matt Pocock	Free tutorial	Beginner to intermediate	You want a guided path through core AI concepts, model selection, the AI engineering mindset, evals, and techniques for improving LLM-powered apps.
Evaluating AI Agents	DeepLearning.AI	Short course	Intermediate	You need to test, trace, and improve agent workflows instead of judging only single LLM responses.
Building and Evaluating Advanced RAG Applications	DeepLearning.AI	Short course	Intermediate	You already know basic RAG and need better retrieval, evaluation, and production-quality patterns.
LangChain for LLM Application Development	DeepLearning.AI	Short course	Beginner to intermediate	You want a fast introduction to building LLM applications with chains, retrieval, and tools.
OpenAI eval design guide	OpenAI	Guide	Intermediate	You need practical guidance for designing representative eval datasets, choosing graders, and turning model testing into an engineering loop instead of ad hoc spot checks.
OpenAI evals quickstart and datasets	OpenAI	Guide	Intermediate	You want OpenAI's current quickstart for turning examples into dataset-backed evals and improvement loops instead of relying on a deprecated docs path.
OpenAI Working with evals	OpenAI	Guide	Intermediate	You need API-level guidance for testing outputs, comparing models, and catching regressions during upgrades.
OpenAI Evaluate agent workflows	OpenAI	Guide	Intermediate	You need the current OpenAI path for tracing, grading, and regression-testing agent workflows instead of only single-prompt evals.
OpenAI model optimization	OpenAI	Guide	Intermediate	You need a practical optimization loop across prompt changes, evals, and fine-tuning rather than guessing which knob to turn next.