Machine Learning Posts

An Overview of RL Environments

21 minute read

Everything that happens in an RL environment between the policy update and the next rollout: verification, reward shaping, tool calling, curriculum design, ...