Agent: Claude, GPT-4
LLM: Claude 3.5, GPT-4
#agents #llm-evaluation #reinforcement-learning #benchmarking #agentic-ai
NeMo Gym is NVIDIA's framework for evaluating and improving LLM-based agents using realistic environments. It provides modular infrastructure for agent evaluation, benchmarking, and RL training with support for stateful interactions like code execution and tool calling.
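To make "stateful interactions" concrete, here is a minimal, framework-agnostic sketch of an environment whose state persists across tool calls and is scored at the end of an episode. It does not use NeMo Gym's actual API: `CounterEnv`, `call_tool`, `scripted_agent`, and `evaluate` are hypothetical names invented for illustration, and the scripted agent stands in for an LLM policy.

```python
from dataclasses import dataclass, field


@dataclass
class CounterEnv:
    """Hypothetical stateful environment: the agent must reach `target` via tool calls."""
    target: int = 3
    value: int = 0
    history: list = field(default_factory=list)

    def call_tool(self, name: str, amount: int = 1) -> str:
        # State persists across calls, so earlier actions affect later observations.
        if name == "add":
            self.value += amount
        elif name == "reset":
            self.value = 0
        else:
            return f"error: unknown tool {name!r}"
        self.history.append((name, amount))
        return f"value={self.value}"

    def reward(self) -> float:
        # Binary check of the final state, computed when the episode ends.
        return 1.0 if self.value == self.target else 0.0


def scripted_agent(env: CounterEnv, max_steps: int = 10) -> float:
    """Stand-in for an LLM policy: adds 1 until the target is reached or steps run out."""
    for _ in range(max_steps):
        if env.value == env.target:
            break
        env.call_tool("add", 1)
    return env.reward()


def evaluate(num_episodes: int = 5) -> float:
    """Average reward over fresh environments, the basic shape of a benchmark run."""
    rewards = [scripted_agent(CounterEnv(target=n)) for n in range(1, num_episodes + 1)]
    return sum(rewards) / len(rewards)
```

The same per-episode reward that `evaluate` averages for benchmarking could also serve as the training signal in an RL loop, which is why evaluation and training can share one environment definition.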
