Differentiator is openness
To underscore its commitment to open source, Nvidia is revealing some of Nemotron 3’s inner workings: it is releasing a dataset of real-world telemetry for safety evaluations, along with 3 trillion tokens from Nemotron 3’s pretraining, post-training, and RL datasets.
In addition, Nvidia is open-sourcing its NeMo Gym and NeMo RL libraries, which provide Nemotron 3’s training environments and post-training foundation, as well as NeMo Evaluator, which helps builders validate model safety and performance. All are now available on GitHub and Hugging Face. Of these, Mayham noted, NeMo Gym may be the most “strategically significant” piece of the release.
Pre-training teaches models to predict tokens, not to complete domain-specific tasks, and traditional RL from human feedback (RLHF) doesn’t scale for complex agentic behaviors, Mayham explained. NeMo Gym enables RL with verifiable rewards — essentially computational verification of task completion rather than subjective human ratings. That is, did the code pass tests? Is the math correct? Were the tools called properly?
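The idea of verifiable rewards can be made concrete with a small sketch. The snippet below is purely illustrative and does not use NeMo Gym’s actual API; the `solution` entry-point name and the reward functions are assumptions for the example. It shows the core pattern the article describes: a programmatic check (did the code pass its tests? is the numeric answer right?) produces the reward, rather than a human rating.

```python
# Illustrative sketch of RL with verifiable rewards (NOT NeMo Gym's actual API):
# a programmatic check scores each model output instead of a human judge.

def verify_code(candidate_src: str, test_cases: list[tuple[int, int]]) -> float:
    """Return 1.0 if the generated function passes every test, else 0.0."""
    namespace: dict = {}
    try:
        exec(candidate_src, namespace)   # load the model-generated code
        fn = namespace["solution"]       # assumed entry-point name
        ok = all(fn(x) == expected for x, expected in test_cases)
        return 1.0 if ok else 0.0
    except Exception:
        return 0.0                       # crashes earn zero reward

def verify_math(candidate_answer: str, ground_truth: float) -> float:
    """Reward exact numeric agreement rather than a subjective rating."""
    try:
        return 1.0 if abs(float(candidate_answer) - ground_truth) < 1e-9 else 0.0
    except ValueError:
        return 0.0

# An output that passes its tests gets reward 1.0 ...
good = "def solution(x):\n    return x * 2"
print(verify_code(good, [(1, 2), (3, 6)]))   # → 1.0
# ... while a wrong one gets 0.0, with no human in the loop.
bad = "def solution(x):\n    return x + 2"
print(verify_code(bad, [(1, 2), (3, 6)]))    # → 0.0
```

Because the reward is computed, not elicited, this kind of check scales to the complex agentic behaviors RLHF struggles with.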

