AI educator

Hamel Husain

Hamel's AI evals guides

Very practical material on evaluating LLM apps before they disappoint users.

Start with: Read the evals guide and build a small test set for your own app.

Videos

Educator videos are listed first. Similar videos are labelled and included when they cover the same skills or adjacent topics.

LangGraph introduction

Similar video

LangChain · agents, langgraph, llm orchestration, rag

RAG and LlamaIndex

Similar video

LlamaIndex · rag, documents, agents, context augmentation

Vector search and Weaviate

Similar video

Weaviate · vector search, rag, embeddings, hybrid search

Pinecone semantic search

Similar video

Pinecone · vector databases, rag, embeddings, search

LLM evaluation with W&B

Similar video

Weights & Biases · evals, llm apps, observability, mlops

AI evals with Phoenix

Similar video

Arize AI · evals, observability, tracing, rag debugging

Promptfoo red teaming

Similar video

Promptfoo · evals, prompt testing, red teaming, security

Skills

Learner questions

Who should learn from Hamel Husain?

Builders shipping LLM systems should start here when they need help with evals, RAG, and LLM product quality. The strongest fit is a learner who prefers guides and workshops.

What should I do first?

Read the evals guide and build a small test set for your own app. After that, open one related resource below and write down the exact workflow, concept, or implementation pattern you want to apply.
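As a rough illustration of what "a small test set" can look like, here is a minimal Python sketch. It is not taken from Hamel's guide. The names `EvalCase`, `fake_app`, and `run_evals`, along with the example prompts and checks, are all hypothetical placeholders; swap `fake_app` for a call into your own app and replace the checks with ones that reflect what your users actually need.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class EvalCase:
    """One hand-written example: an input prompt and a check on the output."""
    name: str
    prompt: str
    check: Callable[[str], bool]


def fake_app(prompt: str) -> str:
    """Hypothetical stand-in for your LLM app; replace with a real call."""
    if "refund" in prompt.lower():
        return "You can request a refund within 30 days of purchase."
    return "Sorry, I am not sure about that."


def run_evals(app: Callable[[str], str], cases: List[EvalCase]) -> Dict[str, bool]:
    """Run every case through the app and map each case name to pass/fail."""
    return {case.name: case.check(app(case.prompt)) for case in cases}


# A deliberately tiny test set (assumed examples, not real product data).
TEST_SET = [
    EvalCase(
        "mentions_refund_window",
        "What is your refund policy?",
        lambda out: "30 days" in out,
    ),
    EvalCase(
        "admits_uncertainty",
        "What is the CEO's favorite color?",
        lambda out: "not sure" in out.lower(),
    ),
    EvalCase(
        "stays_short",
        "What is your refund policy?",
        lambda out: len(out.split()) <= 50,
    ),
]

if __name__ == "__main__":
    results = run_evals(fake_app, TEST_SET)
    for name, passed in results.items():
        print(f"{'PASS' if passed else 'FAIL'}  {name}")
    print(f"{sum(results.values())}/{len(results)} passed")
```

Simple string checks like these will miss a lot, but they are cheap to write and give you a baseline to rerun whenever you change a prompt or model.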

What problem does this help with?

It helps you evaluate LLM apps before they disappoint users, with a strong practical focus. Use this profile when you are comparing educators by topic, level, format, and practical usefulness rather than browsing random AI content.

How do I compare this with other educators?

Compare the skill coverage, the starting recommendation, and the related videos. If you need evals, search the directory for that skill and shortlist three profiles before committing to a course, book, or playlist.

Related resources

Resource · Kind · Level · Use when
OpenAI Cookbook (OpenAI) · GitHub repo · Beginner to advanced · You need implementation examples rather than theory.
Prompt Engineering Guide (DAIR.AI) · Guide · Beginner to advanced · You want examples of prompting techniques and patterns.
LLM Evals (Hamel Husain) · Guide · Intermediate · Your AI app needs quality checks before users see it.
LlamaIndex Docs (LlamaIndex) · Docs and examples · Intermediate · You need to connect LLMs to documents, data, and retrieval.
W&B LLM Evaluation Course (Weights & Biases) · Free course · Intermediate · You need to debug and measure LLM app quality.
Pinecone Learn: Retrieval-Augmented Generation (Pinecone) · Guide · Beginner to intermediate · You need to understand the moving parts of RAG.
Weaviate Academy (Weaviate) · Free academy · Beginner to intermediate · You want structured vector database and retrieval lessons.
Phoenix by Arize (Arize AI) · Open source tool and docs · Intermediate · You need to trace, inspect, and evaluate LLM app behavior.
Promptfoo Intro (Promptfoo) · Open source docs · Intermediate · You need regression tests for prompts, models, and LLM outputs.
AI Evals for Engineers & PMs (Hamel Husain and Shreya Shankar) · Cohort course · Intermediate · You are shipping AI features and need a serious evaluation workflow.