Core Principle: Actions are Tools
The only way agents interact with environments is by calling tools. This design choice:
- Leverages existing function calling support from LLM providers
- Provides a clear, structured interface
- Makes agent actions explicit and traceable
- Enables type-safe interactions with JSON Schema
What is a Tool?
A tool is a function that:
- Has a name and description
- Optionally defines input parameters (via JSON Schema)
- Returns a ToolOutput with content, reward, and finished flag
Example tools:
- submit: Submit an answer to a problem
- bash: Execute a bash command
- read_file: Read a file’s contents
- web_search: Search the web
- python: Execute Python code
Tool Specification
Tools are advertised via two endpoints:
- GET /{env_name}/tools: shared tools available to all tasks (no session required)
- GET /{env_name}/task_tools: shared tools plus task-specific tools (requires X-Session-ID, since task-specific tools depend on the active task)
GET /{env_name}/tools:
Tool Spec Fields
name (string, required):
- Tool identifier used in tool calls
- Should be descriptive (e.g., bash, not b)
- Convention: lowercase with underscores
description (string, required):
- Human-readable explanation of what the tool does
- Used by LLMs to decide when to call the tool
- Should be clear and specific
input_schema (object, nullable):
- JSON Schema defining tool parameters
- null if the tool takes no parameters (always present in output, never omitted)
- Enables validation and type checking
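As an illustration of the fields above, the following Python sketch models a tool spec and serializes it. The ToolSpec class, the is_conventional_name helper, the ping tool, and the command parameter are hypothetical names introduced here for the example; only the three field names and the rule that input_schema is always present (null when the tool takes no parameters) come from the description above.

```python
import json
import re
from dataclasses import asdict, dataclass
from typing import Any, Optional


@dataclass
class ToolSpec:
    """Hypothetical container mirroring the three documented spec fields."""
    name: str
    description: str
    input_schema: Optional[dict[str, Any]] = None  # None serializes to JSON null


def is_conventional_name(name: str) -> bool:
    """Check the 'lowercase with underscores' naming convention."""
    return re.fullmatch(r"[a-z][a-z0-9_]*", name) is not None


def to_json(spec: ToolSpec) -> str:
    # asdict keeps input_schema even when it is None, so the key is never omitted.
    return json.dumps(asdict(spec))


# A tool with one parameter (the "command" parameter name is an assumption).
bash = ToolSpec(
    name="bash",
    description="Execute a bash command",
    input_schema={
        "type": "object",
        "properties": {"command": {"type": "string"}},
        "required": ["command"],
    },
)

# A hypothetical parameterless tool: input_schema stays null.
ping = ToolSpec(name="ping", description="Hypothetical tool with no parameters")

print(to_json(bash))
print(to_json(ping))
```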
JSON Schema for Parameters
The input_schema follows the JSON Schema specification.
Supported types:
- string, number, boolean
- object (nested parameters)
- array (lists of values)
- null
Common keywords:
- required: Mandatory fields
- default: Default values
- enum: Allowed values
- description: Field documentation
- examples: Example values
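To make the schema keywords concrete, here is a small Python sketch of an input_schema that uses required, default, enum, and description, together with a helper that applies defaults and enforces required and enum. The schema contents (query, max_results, safe_search) and the prepare_input helper are assumptions made for illustration, and the helper covers only these keywords; it is not a full JSON Schema validator.

```python
from typing import Any

# Hypothetical schema for a web_search tool; parameter names are illustrative.
SEARCH_SCHEMA: dict[str, Any] = {
    "type": "object",
    "properties": {
        "query": {"type": "string", "description": "Search terms"},
        "max_results": {"type": "number", "default": 5},
        "safe_search": {
            "type": "string",
            "enum": ["off", "moderate", "strict"],
            "default": "moderate",
        },
    },
    "required": ["query"],
}


def prepare_input(schema: dict[str, Any], raw: dict[str, Any]) -> dict[str, Any]:
    """Apply defaults, then enforce `required` and `enum` (a small subset only)."""
    props = schema.get("properties", {})
    result = dict(raw)
    # Fill in defaults for parameters the caller left out.
    for key, prop in props.items():
        if key not in result and "default" in prop:
            result[key] = prop["default"]
    # Every required field must now be present.
    missing = [key for key in schema.get("required", []) if key not in result]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    # Values of enum-constrained fields must be one of the allowed choices.
    for key, value in result.items():
        allowed = props.get(key, {}).get("enum")
        if allowed is not None and value not in allowed:
            raise ValueError(f"{key!r} must be one of {allowed}")
    return result


print(prepare_input(SEARCH_SCHEMA, {"query": "open reward standard"}))
```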
Calling a Tool
Tools are called via POST /{env_name}/call with a JSON body:
- name: Tool to call (required)
- input: Parameters matching the tool’s input_schema (required)
- task_id: Optional identifier for SSE reconnection; clients can reconnect and retrieve results within a 60-second window
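The request body described above can be sketched in Python as follows. The build_call_body helper and the command parameter are hypothetical; the three body fields (name, input, task_id) are the ones listed above, and task_id is included only when the caller supplies one.

```python
import json
from typing import Any, Optional


def build_call_body(
    name: str, tool_input: dict[str, Any], task_id: Optional[str] = None
) -> str:
    """Serialize the JSON body for POST /{env_name}/call."""
    body: dict[str, Any] = {"name": name, "input": tool_input}
    if task_id is not None:
        # Optional: lets a client reconnect and retrieve the result later.
        body["task_id"] = task_id
    return json.dumps(body)


# "command" is an assumed parameter name for the bash tool.
print(build_call_body("bash", {"command": "ls -la"}))
```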
Tool Output
Every tool call returns a ToolOutput:
Wire Format
Tool call responses are delivered via Server-Sent Events and wrapped in a RunToolSuccess or RunToolError envelope:
Successful completion:
Key Fields
blocks: The content returned by the tool
- Always an array (even for single text output)
- Can be text, images, or both
- This is what the agent observes
reward: Feedback for RL training
- Optional (can be null)
- Environment-defined; see Rewards for design patterns
finished: Episode termination signal
- Defaults to false; always present in serialized output
- When true, the episode is complete
- Agent should stop calling tools and clean up the session
metadata: Optional structured data
- By convention, not included in the agent’s context window, but this is not enforced by the protocol
- Used for logging, debugging, analysis
- Can include execution time, resource usage, etc.
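The key fields can be modelled with a small Python dataclass, shown here as a sketch rather than the actual ORS type. The field names and defaults follow the descriptions above; the shape of a text block ({"type": "text", "text": ...}), the numeric reward type, and the helper names are assumptions made for this example.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Optional


@dataclass
class ToolOutput:
    blocks: list[dict[str, Any]]               # always a list, even for one text block
    reward: Optional[float] = None             # optional; None serializes to null
    finished: bool = False                     # defaults to false, always serialized
    metadata: Optional[dict[str, Any]] = None  # for logging and analysis


def text_output(text: str, **fields: Any) -> ToolOutput:
    # The text block shape used here is an assumption, not taken from the spec.
    return ToolOutput(blocks=[{"type": "text", "text": text}], **fields)


def agent_observation(output: ToolOutput) -> list[dict[str, Any]]:
    """By convention the agent sees the blocks, not the metadata."""
    return output.blocks


final = text_output(
    "Correct!", reward=1.0, finished=True, metadata={"elapsed_ms": 12}
)
print(json.dumps(asdict(final)))
```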
Tool Design Patterns
Pattern 1: Parameterless Tools
Tools that don’t need input:
Pattern 2: Simple Parameter Tools
Tools with basic parameters:
Pattern 3: Complex Parameter Tools
Tools with rich parameters:
Pattern 4: Enum Parameters
Tools with constrained choices:
Task-Specific Tools
Some environments need different tools depending on the task. For example, a multiple-choice task might offer a select_option tool, while an open-ended task offers submit_answer instead.
ORS supports this through two separate tool-listing endpoints:
| Endpoint | Requires session | Returns |
|---|---|---|
| GET /{env_name}/tools | No | Shared tools (constant across all tasks) |
| GET /{env_name}/task_tools | Yes (X-Session-ID) | Shared tools + task-specific tools |
Call task_tools instead of tools after creating a session to get the complete tool set. If an environment has no task-specific tools, both endpoints return the same result.
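The choice between the two endpoints can be sketched as a small Python helper. The function name and the (path, headers) return shape are hypothetical; the paths and the X-Session-ID header are the ones named in the table above.

```python
from typing import Optional


def tool_listing_request(
    env_name: str, session_id: Optional[str] = None
) -> tuple[str, dict[str, str]]:
    """Return the (path, headers) pair for listing tools.

    Without a session only the shared tools can be listed; with a session,
    task_tools returns the shared tools plus the task-specific ones.
    """
    if session_id is None:
        return f"/{env_name}/tools", {}
    return f"/{env_name}/task_tools", {"X-Session-ID": session_id}


print(tool_listing_request("math"))
print(tool_listing_request("math", session_id="abc123"))
```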
Multi-Modal Tools
Tools can return images and text:
- Screenshots from web navigation
- Visual feedback for agents
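A mixed text-and-image result might be assembled as in the sketch below. The block shapes are assumptions: the type, text, media_type, and data keys, and the base64 encoding of the image bytes, are illustrative choices and are not taken from the ORS wire format. Only the idea that blocks is an array holding text, images, or both comes from this page.

```python
import base64
from typing import Any


def image_block(png_bytes: bytes) -> dict[str, Any]:
    # Hypothetical image block: base64 keeps the binary data JSON-safe.
    return {
        "type": "image",
        "media_type": "image/png",
        "data": base64.b64encode(png_bytes).decode("ascii"),
    }


def screenshot_output(caption: str, png_bytes: bytes) -> dict[str, Any]:
    """Build a tool output containing a text block followed by an image block."""
    return {
        "blocks": [{"type": "text", "text": caption}, image_block(png_bytes)],
        "reward": None,
        "finished": False,
    }


# Placeholder bytes stand in for a real screenshot.
out = screenshot_output("Page loaded", b"\x89PNG placeholder")
print([block["type"] for block in out["blocks"]])
```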
Tool Calling Flow
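A minimal sketch of the calling loop, assuming the behaviour described on this page: the agent calls a tool, observes the returned blocks, and stops once finished is true. The run_episode function, the fake environment, and the scripted policy are all hypothetical stand-ins; a real client would send the call over HTTP and read the result from the SSE stream.

```python
from typing import Any, Callable

ToolCall = tuple[str, dict[str, Any]]


def run_episode(
    call_tool: Callable[[str, dict[str, Any]], dict[str, Any]],
    choose_action: Callable[[list[dict[str, Any]]], ToolCall],
    max_steps: int = 20,
) -> list[float]:
    """Call tools until `finished` is true or max_steps is reached.

    Returns the non-null rewards collected along the way.
    """
    observation: list[dict[str, Any]] = []
    rewards: list[float] = []
    for _ in range(max_steps):
        name, tool_input = choose_action(observation)
        output = call_tool(name, tool_input)
        observation = output["blocks"]  # this is what the agent observes
        if output.get("reward") is not None:
            rewards.append(output["reward"])
        if output["finished"]:
            break  # episode complete: stop calling tools
    return rewards


def fake_call_tool(name: str, tool_input: dict[str, Any]) -> dict[str, Any]:
    # Stand-in environment: the episode ends when the agent submits.
    done = name == "submit"
    return {
        "blocks": [{"type": "text", "text": "ok"}],
        "reward": 1.0 if done else None,
        "finished": done,
    }


def scripted_policy(observation: list[dict[str, Any]]) -> ToolCall:
    # Explore once with bash, then submit an answer.
    if not observation:
        return "bash", {"command": "ls"}
    return "submit", {"answer": "42"}


print(run_episode(fake_call_tool, scripted_policy))
```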
Best Practices
1. Clear Tool Names
2. Comprehensive Descriptions
3. Validate Tool Inputs
4. Provide Informative Outputs
5. Use finished Correctly
Tool Security
Input Validation
Always validate tool inputs:
- Check types match schema
- Validate ranges and constraints
- Reject malformed inputs
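The checks listed above could look like the following Python sketch, which tests types and numeric ranges against a schema and reports every problem it finds. It handles only the type, minimum, and maximum keywords and ignores unknown fields; the validate_input helper and the timeout parameter are hypothetical. A production server would more likely rely on a complete JSON Schema validator.

```python
from typing import Any

# Python types corresponding to the JSON Schema type names.
JSON_TYPES: dict[str, Any] = {
    "string": str,
    "boolean": bool,
    "object": dict,
    "array": list,
    "null": type(None),
}


def _is_number(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_input(schema: dict[str, Any], tool_input: Any) -> list[str]:
    """Return a list of error messages; an empty list means the input is accepted."""
    if not isinstance(tool_input, dict):
        return ["input must be an object"]
    errors: list[str] = []
    properties = schema.get("properties", {})
    for key, value in tool_input.items():
        prop = properties.get(key)
        if prop is None:
            continue  # unknown fields are ignored in this sketch
        expected = prop.get("type")
        if expected == "number":
            type_ok = _is_number(value)
        elif expected in JSON_TYPES:
            type_ok = isinstance(value, JSON_TYPES[expected])
        else:
            type_ok = True  # no type constraint declared
        if not type_ok:
            errors.append(f"{key}: expected {expected}")
            continue
        if _is_number(value):
            if "minimum" in prop and value < prop["minimum"]:
                errors.append(f"{key}: below minimum {prop['minimum']}")
            if "maximum" in prop and value > prop["maximum"]:
                errors.append(f"{key}: above maximum {prop['maximum']}")
    return errors


# Hypothetical schema with a bounded numeric parameter.
schema = {
    "type": "object",
    "properties": {"timeout": {"type": "number", "minimum": 1, "maximum": 60}},
}
print(validate_input(schema, {"timeout": 600}))
```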
Resource Limits
Prevent resource exhaustion:
- Set timeouts on tool execution
- Limit output size
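Both limits can be sketched with the Python standard library: subprocess.run enforces a timeout, and the captured output is truncated before it is returned. The run_limited helper and the two limit values are assumptions chosen for illustration. Note that this sketch still buffers the full output in memory before truncating, so a stricter implementation would cap the output while reading it.

```python
import subprocess
import sys

MAX_OUTPUT_CHARS = 10_000  # hypothetical limit
TIMEOUT_SECONDS = 5        # hypothetical limit


def run_limited(
    argv: list[str],
    timeout: float = TIMEOUT_SECONDS,
    max_chars: int = MAX_OUTPUT_CHARS,
) -> str:
    """Run a command with a timeout and return its (possibly truncated) output."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising.
        return f"[timed out after {timeout}s]"
    output = proc.stdout + proc.stderr
    if len(output) > max_chars:
        dropped = len(output) - max_chars
        output = output[:max_chars] + f"\n[truncated {dropped} characters]"
    return output


print(run_limited([sys.executable, "-c", "print('hello')"]))
```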
Next Steps
- Tasks & Splits: Organise problems for training and evaluation
- Rewards: Design reward signals for RL
- Implementing a Server: Build an ORS server with custom tools
- HTTP API: See how tools are listed and called
Key Takeaway: Tools are the agent’s interface to the environment. Design them carefully with clear names, comprehensive descriptions, proper validation, and informative outputs. The quality of your tools directly impacts agent performance.

