Blue Yonder Interview Question

Implement reinforcement learning (RL) post-training for an LLM.
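One way to approach the question is a REINFORCE-style policy gradient with a KL penalty toward a frozen reference model. The sketch below is a minimal illustration under heavy assumptions, not a production recipe: the "LLM" is a toy bigram table of logits, and the vocabulary size, sequence length, reward function (`reward_fn`), and hyperparameters are all hypothetical choices made for the example. A real setup would swap the table for a pretrained transformer and the reward for a learned reward model or a verifier.

```python
import math
import random

VOCAB = 4      # toy vocabulary size (assumption)
SEQ_LEN = 6    # tokens generated per rollout (assumption)
TARGET = 2     # token the hypothetical reward prefers
BOS = 0        # context used for the first generated token


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def sample_sequence(logits, rng):
    """Sample (prev, token) steps autoregressively from logits[prev][token]."""
    prev, steps = BOS, []
    for _ in range(SEQ_LEN):
        probs = softmax(logits[prev])
        tok = rng.choices(range(VOCAB), weights=probs)[0]
        steps.append((prev, tok))
        prev = tok
    return steps


def reward_fn(steps):
    """Hypothetical reward: fraction of generated tokens equal to TARGET."""
    return sum(tok == TARGET for _, tok in steps) / len(steps)


def log_prob(logits, prev, tok):
    return math.log(softmax(logits[prev])[tok])


def train(iterations=400, batch=32, lr=1.0, beta=0.05, seed=0):
    rng = random.Random(seed)
    # Frozen reference policy (stands in for the pretrained/SFT model).
    ref = [[0.0] * VOCAB for _ in range(VOCAB)]
    # Trainable policy starts as a copy of the reference.
    policy = [row[:] for row in ref]
    history = []
    for _ in range(iterations):
        # 1. Roll out a batch of sequences from the current policy.
        rollouts = []
        for _ in range(batch):
            steps = sample_sequence(policy, rng)
            # Sampled log-ratio: a single-sample estimate of the sequence KL.
            kl = sum(log_prob(policy, p, t) - log_prob(ref, p, t)
                     for p, t in steps)
            rollouts.append((steps, reward_fn(steps), kl))
        # 2. KL-shaped return, with a batch-mean baseline to reduce variance.
        baseline = sum(r - beta * kl for _, r, kl in rollouts) / batch
        # 3. REINFORCE gradient: advantage * d log pi(tok | prev) / d logits.
        grad = [[0.0] * VOCAB for _ in range(VOCAB)]
        for steps, r, kl in rollouts:
            adv = (r - beta * kl) - baseline
            for prev, tok in steps:
                probs = softmax(policy[prev])
                for k in range(VOCAB):
                    indicator = 1.0 if k == tok else 0.0
                    grad[prev][k] += adv * (indicator - probs[k]) / batch
        # 4. Gradient ascent step on the policy logits.
        for i in range(VOCAB):
            for k in range(VOCAB):
                policy[i][k] += lr * grad[i][k]
        history.append(sum(r for _, r, _ in rollouts) / batch)
    return policy, history


if __name__ == "__main__":
    policy, history = train()
    print("mean reward, first 20 iterations:", sum(history[:20]) / 20)
    print("mean reward, last 20 iterations:", sum(history[-20:]) / 20)
```

The four numbered steps carry over to a real model: sample completions, score them, subtract a KL penalty against the reference so the policy does not drift too far, and update with a policy gradient. Methods such as PPO or GRPO refine the update step (clipping, group-relative advantages) but keep the same overall loop.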