1. Initial Assignment: Problems start in pools based on Qwen3-4B solve rate
2. Model Improves: As training progresses, solve rates increase
3. Problems Graduate: Hard → Normal → Easy as model masters them
4. Filtering: Once rate hits 100%, problem is removed (no learning signal)
5. Sampling: Each batch samples from all pools to maintain curriculum
6. Result: Model always trains on appropriately challenging problems
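
The pool lifecycle described in this list can be sketched in a few lines of Python. This is a minimal illustration, not the actual implementation: the thresholds `HARD_MAX` and `NORMAL_MAX`, the function names, and the equal per-pool sampling are all assumptions, since the source does not give the cut-offs or the sampling weights.

```python
import random
from collections import defaultdict

# Hypothetical thresholds: the real solve-rate cut-offs are not stated.
HARD_MAX = 0.3    # solve rate below this -> "hard"
NORMAL_MAX = 0.7  # solve rate below this -> "normal", otherwise "easy"


def assign_pool(solve_rate):
    """Map a solve rate in [0, 1] to a pool name, or None if fully solved."""
    if solve_rate >= 1.0:
        return None  # filtered out: a 100% solve rate gives no learning signal
    if solve_rate < HARD_MAX:
        return "hard"
    if solve_rate < NORMAL_MAX:
        return "normal"
    return "easy"


def build_pools(solve_rates):
    """Group problem ids by pool; call again after solve rates are refreshed."""
    pools = defaultdict(list)
    for problem_id, rate in solve_rates.items():
        pool = assign_pool(rate)
        if pool is not None:
            pools[pool].append(problem_id)
    return pools


def sample_batch(pools, per_pool, rng):
    """Draw up to `per_pool` problems from every pool (assumed equal mix)."""
    batch = []
    for name in ("hard", "normal", "easy"):
        items = pools.get(name, [])
        batch.extend(rng.sample(items, min(per_pool, len(items))))
    return batch
```

Rebuilding the pools from refreshed solve rates is what lets a problem graduate: for example, a problem whose rate rises from 0.1 to 0.5 moves from `"hard"` to `"normal"` under these assumed thresholds, and one that reaches 1.0 drops out of every pool.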