Multi-Sample Preference Optim | Pangram Labs