Two Approaches
Option 1: Use ORS Python SDK
- Pros: Fast, handles protocol details, well-tested
- Cons: Python only
Option 2: Implement HTTP Protocol
- Pros: Any language, full control, no dependencies
- Cons: More work, must handle all protocol details
Option 1: Python SDK Implementation
Key Methods
list_splits(cls) - Required:
- Class method (no instance needed)
- Returns list of split names or Split objects
- Example: return ["train", "validation", "test"]
list_tasks(cls, split) - Required:
- Class method
- Returns list of task objects (JSON)
- Can be async (return Awaitable)
- Task structure is environment-specific
get_prompt(self) - Required:
- Instance method (has access to self.task_spec)
- Returns Blocks (list of TextBlock/ImageBlock)
- Called once per episode
- Can be async
setup(self) - Optional:
- Called when episode starts
- Initialize environment state
- Can be async
teardown(self) - Optional:
- Called when episode ends
- Cleanup resources
- Can be async
num_tasks(cls, split) - Optional override:
- Class method, async
- Returns count of tasks for a split
- Default: calls list_tasks() and returns len() of the result
- Override for efficiency with large task sets
get_task(cls, split, index) - Optional override:
- Class method, async
- Returns a single task by index
- Default: calls list_tasks() and indexes into the result
- Override for lazy loading
get_task_range(cls, split, start, stop) - Optional override:
- Class method, async
- Returns tasks for range(start, stop); supports negative indices and None
- Default: calls get_task() for each index
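The default fallbacks and the reason to override them can be sketched in plain Python. This is a minimal illustration, not the SDK's actual base class: `SketchEnvironment`, `LazyEnvironment`, and `SIZES` are hypothetical names, and `list_tasks` is kept synchronous here for brevity even though the SDK allows it to be async.

```python
import asyncio


class SketchEnvironment:
    """Stand-in base class (hypothetical, NOT the real ORS SDK class)."""

    @classmethod
    def list_tasks(cls, split):
        raise NotImplementedError

    @classmethod
    async def num_tasks(cls, split):
        # Default described above: materialize every task, then count.
        return len(cls.list_tasks(split))

    @classmethod
    async def get_task(cls, split, index):
        # Default described above: materialize every task, then index.
        return cls.list_tasks(split)[index]

    @classmethod
    async def get_task_range(cls, split, start=None, stop=None):
        # slice.indices() resolves negative indices and None against the size.
        size = await cls.num_tasks(split)
        indices = range(*slice(start, stop).indices(size))
        return [await cls.get_task(split, i) for i in indices]


class LazyEnvironment(SketchEnvironment):
    """Overrides the count and single-task lookups so no full list is built."""

    SIZES = {"train": 1_000_000, "validation": 100, "test": 100}  # assumed sizes

    @classmethod
    def list_tasks(cls, split):
        return [{"id": f"{split}-{i}"} for i in range(cls.SIZES[split])]

    @classmethod
    async def num_tasks(cls, split):
        return cls.SIZES[split]

    @classmethod
    async def get_task(cls, split, index):
        size = cls.SIZES[split]
        if index < 0:
            index += size
        if not 0 <= index < size:
            raise IndexError(index)
        return {"id": f"{split}-{index}"}
```

With these overrides, `asyncio.run(LazyEnvironment.get_task_range("train", -2, None))` resolves the last two tasks without ever building the million-element list.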
Tool Decorator
- Decorated with @tool
- Takes self and optionally one Pydantic model parameter
- Returns ToolOutput
- Can be async
When a tool takes no parameter model, input_schema is null in the tool listing.
Task-Specific Tools
By default, tools are shared and visible to all tasks via GET /{env_name}/tools. You can mark tools as task-specific so they only appear within an active session.
Alternatively, override list_task_tools() to return tools dynamically based on the current task.
Call GET /{env_name}/task_tools (with an active session) to get the combined set of shared + task-specific tools.
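How the two listings relate can be sketched with plain dictionaries. Everything here is an illustrative assumption rather than the SDK's real data model: the tool names, the `needs_calculator` task field, and the listing shape (only `input_schema` is taken from the text above).

```python
# Hypothetical shared tools, visible through GET /{env_name}/tools.
SHARED_TOOLS = [
    {
        "name": "submit_answer",
        "input_schema": {
            "type": "object",
            "properties": {"answer": {"type": "string"}},
        },
    },
    # A tool with no parameter model lists a null (None) input schema.
    {"name": "give_up", "input_schema": None},
]


def list_task_tools(task_spec):
    """Return extra tools for this task only (assumed task_spec field)."""
    if task_spec.get("needs_calculator"):
        return [
            {
                "name": "calculator",
                "input_schema": {
                    "type": "object",
                    "properties": {"expression": {"type": "string"}},
                },
            }
        ]
    return []


def task_tools_listing(task_spec):
    """Combined set a session would see: shared tools plus task-specific ones."""
    return SHARED_TOOLS + list_task_tools(task_spec)
```

A task that does not request the calculator sees only the shared tools, so the two endpoints return the same set for it.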
Complete Example
See Quick Start for a working GSM8K environment.
Option 2: Custom HTTP Implementation
Required Endpoints
Implement these HTTP endpoints:
Discovery (no session required):
- GET /health
- GET /list_environments
- GET /{env_name}/tools
- GET /{env_name}/splits
- POST /{env_name}/tasks
- POST /{env_name}/num_tasks
- POST /{env_name}/task
- POST /{env_name}/task_range
- POST /create_session

- POST /create
- POST /delete
- POST /delete_session
- POST /ping

- GET /{env_name}/prompt
- GET /{env_name}/task_tools
- POST /{env_name}/call (returns result via SSE)
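Because the call endpoint streams its result over Server-Sent Events, a custom server has to frame its responses itself. The sketch below shows standard SSE framing (an `event:` line, a `data:` line, then a blank line). The `result` event name and both payload shapes are assumptions made for illustration; check the HTTP API reference for the actual event names and fields.

```python
import json


def format_sse(event: str, data: dict) -> str:
    """Serialize one SSE frame: event name, single-line JSON data, blank line."""
    payload = json.dumps(data, separators=(",", ":"))
    return f"event: {event}\ndata: {payload}\n\n"


def stream_tool_call(run_tool):
    """Yield SSE frames for one tool call; a failure becomes an 'error' event."""
    try:
        result = run_tool()
    except Exception as exc:
        # Payload shape is an assumption, not the documented ORS format.
        yield format_sse("error", {"message": str(exc)})
    else:
        # "result" is an assumed event name.
        yield format_sse("result", {"output": result})
```

Serving these frames with the `text/event-stream` content type lets any standard SSE client parse them.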
Example: Node.js/TypeScript
Session Management
Key points:
- Store session ID → environment instance mapping
- /create_session generates a session ID (no headers required)
- /create binds a session to an environment + task (requires X-Session-ID)
- /create accepts either task_spec directly, or split + index to resolve the task server-side
- Implement 15-minute inactivity timeout (reset on any request with that session ID)
- Clean up on /delete (call environment teardown)
- Handle concurrent sessions
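The bookkeeping behind these points can be sketched as a small in-memory store. This is one possible shape under stated assumptions: `SessionStore` and its method names are hypothetical, teardown is treated as synchronous, and the clock is injectable so the timeout can be exercised without waiting 15 minutes.

```python
import time
import uuid

SESSION_TIMEOUT_SECONDS = 15 * 60  # 15-minute inactivity timeout


class SessionStore:
    """Maps session IDs to environment instances (hypothetical helper)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._sessions = {}  # session_id -> {"env": instance or None, "last_seen": float}

    def create_session(self):
        """Generate a fresh session ID that is not yet bound to an environment."""
        session_id = uuid.uuid4().hex
        self._sessions[session_id] = {"env": None, "last_seen": self._clock()}
        return session_id

    def touch(self, session_id):
        """Reset the inactivity timer; call this on every request for the session."""
        entry = self._sessions[session_id]  # KeyError means unknown session
        entry["last_seen"] = self._clock()
        return entry

    def bind(self, session_id, env):
        """Attach an environment instance to an existing session."""
        self.touch(session_id)["env"] = env

    def delete(self, session_id):
        """Remove the session and run the environment's teardown, if bound."""
        entry = self._sessions.pop(session_id)
        if entry["env"] is not None:
            entry["env"].teardown()

    def expire_idle(self):
        """Delete every session idle for the full timeout; return their IDs."""
        now = self._clock()
        expired = [
            sid
            for sid, entry in self._sessions.items()
            if now - entry["last_seen"] >= SESSION_TIMEOUT_SECONDS
        ]
        for sid in expired:
            self.delete(sid)
        return expired
```

A real server would call `expire_idle()` from a periodic background task and guard the dictionary with a lock (or keep all access on one event loop) to handle concurrent sessions safely.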
Error Handling
Return proper HTTP status codes:
- 400: Bad request (invalid input, missing X-Session-ID)
- 404: Not found (session, tool, environment)
- 410: Gone (session was deleted but still referenced)
- 500: Server error
error event.
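The session-related status codes can be decided in one place before a request reaches the environment. The function below is a rough sketch: `resolve_session`, the `active` and `deleted` collections, and the JSON body shape are all hypothetical, and a real framework would also normalize header-name casing.

```python
def resolve_session(headers, active, deleted):
    """Return (status, body) for a request that needs a session."""
    session_id = headers.get("X-Session-ID")
    if not session_id:
        # Missing header is a client error.
        return 400, {"error": "missing X-Session-ID header"}
    if session_id in deleted:
        # The session existed once but has been deleted.
        return 410, {"error": "session was deleted"}
    if session_id not in active:
        # The ID was never issued by this server.
        return 404, {"error": "session not found"}
    return 200, {"session_id": session_id}
```

Distinguishing 410 from 404 requires remembering deleted IDs for some time; how long to keep them is a design choice this sketch leaves open.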
Testing Your Server
Quick test:
Deployment
Local Development
Production Deployment
Docker:
Cloud Deployment
Deploy to any cloud platform:
- AWS: ECS, Lambda, EC2
- GCP: Cloud Run, Compute Engine
- Azure: Container Instances, App Service
- Fly.io: fly launch
- Railway: Connect GitHub repo
Next Steps
- HTTP API: Complete endpoint documentation
- Quick Start: See a complete working example
Key Takeaway: Implementing an ORS server is straightforward. Use the Python SDK for quick development, or implement the HTTP protocol in any language for full control. Focus on proper reward design and episode termination.

