Overview
Model Context Protocol (MCP):
- Purpose: Connect LLMs to tools, data sources, and workflows
- Focus: General-purpose tool access
- Use case: Extending LLM capabilities with external APIs, databases, file systems
ORS:
- Purpose: Connect agents to reinforcement learning environments
- Focus: RL training and agent evaluation
- Use case: Training agents with reward signals, structured evaluation benchmarks
Key Differences
| Feature | MCP | ORS |
|---|---|---|
| Primary Purpose | Tool access, data integration | RL training environments |
| Episode Termination | No concept | finished signal |
| Rewards | No concept | Numeric feedback for RL |
| Tasks | No concept | Organised problems to solve |
| Splits | No concept | Tasks organised into splits |
| Session Management | Basic | Episode-centric (RL trajectories) |
| Tool Calling | Yes | Yes |
| Protocol | JSON-RPC over stdio/SSE | HTTP/REST + SSE |
Detailed Comparison
Tool Calling
Both protocols support tool calling with similar interfaces.
Tool Responses
ORS tool responses add two fields that MCP responses do not have:
- reward: For RL training feedback
- finished: For episode termination
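To make the difference in response shape concrete, here is a minimal sketch. The MCP-style result follows the `content` / `isError` layout of an MCP tool result. The ORS-style layout is an assumption made for illustration: only the `reward` and `finished` field names come from the comparison above, and the helper `to_ors_response` is hypothetical.

```python
import json

# MCP-style tool result: content items only, no reward or episode state.
mcp_result = {
    "content": [{"type": "text", "text": "4"}],
    "isError": False,
}


def to_ors_response(tool_result: dict, reward: float, finished: bool) -> dict:
    """Attach the two RL fields to a tool result.

    Hypothetical helper: the exact ORS wire format is not shown here, so
    reusing the "content" key is an assumption made for this sketch.
    """
    return {
        "content": tool_result["content"],
        "reward": float(reward),     # numeric feedback for RL training
        "finished": bool(finished),  # True ends the episode
    }


ors_response = to_ors_response(mcp_result, reward=1.0, finished=True)
print(json.dumps(ors_response, indent=2))
```

An RL trainer can read `reward` and `finished` straight from each response, whereas a plain tool client would simply consume `content`.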
Episode Structure
MCP: No concept of episodes. Tool calls are stateless or loosely stateful.
ORS: Episodes are first-class:
- Session = RL episode
- Episode continues until finished: true
- One complete trajectory through the environment
- Clear start (task) and end (finished signal)
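The episode lifecycle described above can be sketched as a simple loop. This is a toy, in-process illustration rather than an ORS client: `CountdownEnv`, `StepResult`, and `run_episode` are hypothetical names, and no network calls are made. Only the idea that an episode runs until `finished` is true comes from the text.

```python
from dataclasses import dataclass


@dataclass
class StepResult:
    observation: str
    reward: float
    finished: bool


class CountdownEnv:
    """Toy environment: the episode ends after a fixed number of tool calls."""

    def __init__(self, steps: int = 3) -> None:
        self.remaining = steps

    def call_tool(self, name: str, arguments: dict) -> StepResult:
        self.remaining -= 1
        done = self.remaining <= 0
        # Reward is given only on the final step, purely for illustration.
        return StepResult(f"{self.remaining} steps left", 1.0 if done else 0.0, done)


def run_episode(env: CountdownEnv, max_steps: int = 100) -> list:
    """Call tools until the environment signals finished, collecting the trajectory."""
    trajectory = []
    for _ in range(max_steps):
        result = env.call_tool("noop", {})
        trajectory.append(result)
        if result.finished:  # clear end of the episode
            break
    return trajectory


trajectory = run_episode(CountdownEnv(steps=3))
total_reward = sum(step.reward for step in trajectory)
print(len(trajectory), total_reward)
```

The `max_steps` cap is a defensive choice for the sketch, so that an environment that never reports `finished` cannot loop forever.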
Task Organization
MCP: No built-in task organization.
ORS: Tasks and splits:
- Tasks: Individual problems to solve
- Splits: Groups of tasks, e.g. train/val/test
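One way to picture tasks and splits is as a list of task records grouped by a split name. The sketch below is an assumption about how such data might be held in memory: the `Task` fields and the `group_by_split` helper are hypothetical and are not taken from the ORS specification.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    task_id: str
    prompt: str
    split: str  # e.g. "train", "val", or "test"


# Made-up example tasks, for illustration only.
tasks = [
    Task("add-1", "What is 2 + 2?", "train"),
    Task("add-2", "What is 3 + 5?", "train"),
    Task("add-3", "What is 7 + 1?", "val"),
    Task("add-4", "What is 9 + 4?", "test"),
]


def group_by_split(all_tasks: list) -> dict:
    """Group tasks into splits, preserving the original task order."""
    splits = defaultdict(list)
    for task in all_tasks:
        splits[task.split].append(task)
    return dict(splits)


splits = group_by_split(tasks)
print({name: [t.task_id for t in group] for name, group in splits.items()})
```

Keeping the split on each task makes it easy to train on one group and evaluate on another without the two overlapping.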
Next Steps
- ORS Quick Start: Build your first ORS server
- ORS Specification: Deep dive into the ORS protocol
- MCP Documentation: Learn about Model Context Protocol
- Implementation Guide: Implement an ORS server
Key Takeaway: MCP and ORS solve different problems. MCP connects LLMs to tools. ORS connects agents to RL training environments. Both are valuable, and they can work together in sophisticated systems.

