Tasks and splits are how ORS organises problems for training and evaluation. Tasks are the individual problems agents solve, while splits categorise these tasks into organised units: for example train/test splits, or splits for different types of problem (e.g. those which require CPUs versus GPUs).
Tasks
What is a Task?
A task is a specific problem for an agent to solve. Each task is represented as a JSON object with task-specific data.
Task Examples
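Because a task is a plain JSON object, it can be built and checked with nothing beyond the standard library. The sketch below is illustrative only: the keys `id`, `prompt` and `expected` are assumptions made for this example, since the task payload is specific to each environment.

```python
import json

# Hypothetical task for a string-manipulation environment.
# The key names are illustrative and not prescribed by ORS.
task = {
    "id": "reverse-0001",
    "prompt": "Reverse the string 'abc'.",
    "expected": "cba",
}

# A task has to survive a JSON round trip unchanged, so keep its
# values to JSON-native types (strings, numbers, lists, objects).
encoded = json.dumps(task)
decoded = json.loads(encoded)
```

Keeping every value JSON-native means the same object can be stored in a split and sent inline when an episode is created.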
Math environment:
Task Lifecycle
Accessing Tasks
The simplest way to retrieve tasks is to list all tasks in a split:
POST /{env_name}/num_tasks:
POST /{env_name}/get_task:
POST /{env_name}/get_task_range:
Returns tasks in the half-open range [start, stop). Both start and stop are optional.
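The half-open range [start, stop) follows the same convention as Python slicing: start is included and stop is excluded. A minimal local sketch of that selection rule, assuming a missing start means the first task and a missing stop means the end of the split (the function name `task_range` and the rejection of negative or reversed bounds are assumptions of this sketch, not documented server behaviour):

```python
from typing import Any, Optional


def task_range(
    tasks: list[Any],
    start: Optional[int] = None,
    stop: Optional[int] = None,
) -> list[Any]:
    """Select tasks in the half-open range [start, stop)."""
    # Assumption: an omitted bound falls back to the edge of the split.
    lo = 0 if start is None else start
    hi = len(tasks) if stop is None else stop
    # Assumption: negative or reversed bounds are treated as errors here.
    if lo < 0 or hi < lo:
        raise ValueError(f"invalid range [{lo}, {hi})")
    # Slicing already excludes index `hi` and clamps past the end.
    return tasks[lo:hi]


example_tasks = ["t0", "t1", "t2", "t3"]
middle = task_range(example_tasks, 1, 3)
everything = task_range(example_tasks)
head = task_range(example_tasks, stop=2)
```

With this convention, consecutive calls such as [0, 100) and [100, 200) cover a split without overlap or gaps.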
The server validates split names on all task endpoints and returns 400 for invalid splits.
Task as Episode Input
Tasks are passed when creating episodes. You can provide the task inline via task_spec, or reference it by split and index:
env_name defaults to the first registered environment, and secrets defaults to {}. Exactly one of task_spec or split+index must be provided (see CreateSession).
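The "exactly one" rule is easy to get wrong on the client side, so it can help to check it before sending a request. The following is a sketch of such a check, not the server's implementation: the function name `resolve_task_reference` and the returned tuples are hypothetical, and only the field names task_spec, split and index come from the text above.

```python
from typing import Any, Optional


def resolve_task_reference(
    task_spec: Optional[dict[str, Any]] = None,
    split: Optional[str] = None,
    index: Optional[int] = None,
) -> tuple[str, Any]:
    """Check that a task is given inline or by split+index, never both."""
    # split and index only make sense as a pair.
    if (split is None) != (index is None):
        raise ValueError("split and index must be provided together")

    has_inline = task_spec is not None
    has_reference = split is not None  # index is present too at this point

    # Both set or both missing violates the exactly-one rule.
    if has_inline == has_reference:
        raise ValueError("provide exactly one of task_spec or split+index")

    if has_inline:
        return ("inline", task_spec)
    return ("indexed", (split, index))


inline_choice = resolve_task_reference(task_spec={"prompt": "2 + 2"})
indexed_choice = resolve_task_reference(split="train", index=0)
```

Note that index 0 is a valid reference, which is why the checks compare against None instead of relying on truthiness.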
The environment uses the task to:
- Generate the initial prompt
- Determine correct answers
- Calculate rewards
- Track episode progress
Splits
What is a split?
A split is a named category of tasks. Splits organise tasks for different purposes in ML workflows. An example split structure:
- train - Tasks for training
- validation - Tasks for hyperparameter tuning
- test - Tasks for evaluation
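As a rough illustration, the three splits above can be written as small objects that pair a name with a type. The `type` field is mentioned in the Custom Splits discussion on this page; the `name` key and the variable names are assumptions made for this sketch.

```python
# Hypothetical representation of the standard splits as name/type pairs.
splits = [
    {"name": "train", "type": "train"},
    {"name": "validation", "type": "validation"},
    {"name": "test", "type": "test"},
]

# Splits that must stay out of the training loop: anything not typed "train".
held_out = [s["name"] for s in splits if s["type"] != "train"]
```

Filtering on the type rather than the name keeps this logic working when an environment adds custom split names.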
Split Structure
Accessing Splits
List available splits:
Custom Splits
Environments can define custom splits beyond train/validation/test:
- Difficulty-based splits (easy/medium/hard)
- Domain-specific splits (algebra/geometry/calculus)
- Time-based splits (before_2020/after_2020)
- Resource-based splits (CPU/GPU sandboxes)
Splits can be given as Split objects or bare strings. The server normalises bare strings: "train", "validation", and "test" map to their corresponding type, while any other name defaults to "type": "validation".
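The normalisation rule for bare strings can be sketched in a few lines. This is an illustration of the rule as described, not the server's code: the function name `normalise_split` and the `name` key on the resulting object are assumptions.

```python
from typing import Union

STANDARD_TYPES = {"train", "validation", "test"}


def normalise_split(split: Union[str, dict]) -> dict:
    """Turn a bare split name into a name/type object."""
    if isinstance(split, dict):
        # Explicit objects are passed through untouched (copied for safety).
        return dict(split)
    # Standard names map to their own type; anything else becomes validation.
    split_type = split if split in STANDARD_TYPES else "validation"
    return {"name": split, "type": split_type}


standard = normalise_split("test")
custom = normalise_split("hard")
explicit = normalise_split({"name": "hard", "type": "test"})
```

The fallback means a bare custom name such as "hard" is treated as a validation split. An environment that intends it for final evaluation would need to declare it explicitly with "type": "test".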
Convention: When using Split objects explicitly, map to standard types:
- Training-related → "type": "train"
- Evaluation-related → "type": "test"
- Tuning-related → "type": "validation"
Next Steps
- Tools: Design tools for solving tasks
- Rewards: Create reward signals for tasks
- Implementing a Server: Build an ORS server with tasks
- HTTP API: See how tasks are accessed via API
Key Takeaway: Tasks are the problems agents solve. Splits organise tasks so that training, tuning, and evaluation stay separate. Design task structures that are clear, validated, and organised into train/test splits to enable both learning and fair evaluation.

