Fact
Reinforcement Learning from Human Feedback (RLHF)
August 13, 2025
RLHF is a crucial technique for aligning LLMs with human preferences. A reward model is first trained on human-ranked responses, and that model is then used to fine-tune the LLM with a reinforcement learning algorithm (commonly PPO), teaching it to generate outputs that humans prefer (see the sketch below).
Category: AI Training
Difficulty: Advanced
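
A minimal sketch of the two losses at the heart of RLHF, assuming PyTorch. The tensors, function names, and the simplified REINFORCE-with-KL-penalty update are illustrative stand-ins; production pipelines typically use PPO with clipping.

```python
import torch
import torch.nn.functional as F

# --- 1. Reward model: learn from human rankings (pairwise preference loss) ---
# Given scalar rewards for a human-preferred ("chosen") and a dispreferred
# ("rejected") response, the Bradley-Terry loss pushes the chosen response's
# reward above the rejected one's.
def reward_model_loss(chosen_rewards: torch.Tensor,
                      rejected_rewards: torch.Tensor) -> torch.Tensor:
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# --- 2. RL fine-tuning: maximize reward while staying near the original model ---
# A simplified policy-gradient objective with a KL penalty toward a frozen
# reference model (full RLHF pipelines typically use PPO instead).
def rl_step_loss(logprobs_policy: torch.Tensor,
                 logprobs_reference: torch.Tensor,
                 rewards: torch.Tensor,
                 kl_coef: float = 0.1) -> torch.Tensor:
    kl_penalty = logprobs_policy - logprobs_reference         # per-sample KL estimate
    shaped_reward = rewards - kl_coef * kl_penalty.detach()   # reward minus KL cost
    return -(shaped_reward * logprobs_policy).mean()          # REINFORCE-style loss

if __name__ == "__main__":
    # Toy tensors standing in for reward-model outputs and policy log-probs.
    chosen = torch.tensor([1.2, 0.8, 2.0])
    rejected = torch.tensor([0.3, 1.0, -0.5])
    print("reward model loss:", reward_model_loss(chosen, rejected).item())

    logp_policy = torch.tensor([-1.0, -0.7, -2.1], requires_grad=True)
    logp_ref = torch.tensor([-1.1, -0.9, -1.8])
    rewards = torch.tensor([0.5, 1.5, -0.2])
    loss = rl_step_loss(logp_policy, logp_ref, rewards)
    loss.backward()  # in practice, gradients like these update the policy (the LLM)
    print("rl step loss:", loss.item())
```

The key design point this sketch illustrates: the reward model distills human rankings into a scalar signal, and the RL step maximizes that signal while the KL term keeps the fine-tuned model from drifting too far from its original behavior.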