Alignment Newsletter #106: Evaluating generalization ability of learned reward models
Release Date:
Recorded by Robert Miles. More information about the newsletter here.