AI education source

Weights & Biases

W&B Courses

Good for builders who need to measure, debug, and improve LLM apps rather than just demo them.

Start with: Building LLM-powered apps, then the evaluation material.

Videos

Educator videos are listed first. Similar videos are labelled and included when they cover the same skills or adjacent topics.

W&B LLM Evaluation Course

Educator video

Weights & Biases · evals, llm apps, observability

AI evals with Phoenix

Similar video

Arize AI · evals, observability, tracing, rag debugging

Promptfoo red teaming

Similar video

Promptfoo · evals, prompt testing, red teaming, security

AI Engineering with Chip Huyen

Similar video

Chip Huyen · ai engineering, production, systems, mlops

Full Stack Deep Learning lecture

Similar video

Full Stack Deep Learning · mlops, deployment, product ml, production

MLOps community production AI

Similar video

MLOps Community · mlops, production ml, ai systems, deployment

Skills

Notable work

Building LLM-powered apps
LLM evaluation course
W&B Weave examples

Learner questions

Who should learn from Weights & Biases?

Developers evaluating and deploying LLM apps should start here when they need to learn LLM apps, evals, experiment tracking, and MLOps. The strongest fit is a learner who wants free courses, guides, and examples.

What should I do first?

Take Building LLM-powered apps, then the evaluation material. After that, open one related resource below and write down the exact workflow, concept, or implementation pattern you want to apply.

What problem does this help with?

Good for builders who need to measure, debug, and improve LLM apps rather than just demo them. Use this profile when you are comparing educators by topic, level, format, and practical usefulness rather than browsing random AI content.

How do I compare this with other educators?

Compare the skill coverage, the starting recommendation, and the related videos. If you need LLM app skills, search the directory for that skill and shortlist three profiles before committing to a course, book, or playlist.

Related resources

Resource	Provider	Kind	Level	Use when
LLM Evals	Hamel Husain	Guide	Intermediate	Your AI app needs quality checks before users see it.
W&B LLM Evaluation Course	Weights & Biases	Free course	Intermediate	You need to debug and measure LLM app quality.
Phoenix by Arize	Arize AI	Open source tool and docs	Intermediate	You need to trace, inspect, and evaluate LLM app behavior.
Promptfoo Intro	Promptfoo	Open source docs	Intermediate	You need regression tests for prompts, models, and LLM outputs.
Made With ML	Made With ML	Free course	Intermediate	You need production ML habits that transfer to AI systems.
AI Evals for Engineers & PMs	Hamel Husain and Shreya Shankar	Cohort course	Intermediate	You are shipping AI features and need a serious evaluation workflow.
Full Stack Deep Learning Lectures	Full Stack Deep Learning	Course videos	Intermediate to advanced	You want the whole lifecycle of ML and AI product development.
Hamel's AI evals guides	Hamel Husain	Guides	Intermediate to advanced	You want Hamel Husain's material on evals and related AI skills.
AI Evals for Engineers and PMs	Shreya Shankar	Course	Intermediate	You want Shreya Shankar's material on evals and related AI skills.
Full Stack Deep Learning	Full Stack Deep Learning	Course	Intermediate to advanced	You want Full Stack Deep Learning's material on MLOps and related AI skills.