Organising problems for RL training and evaluation
Tasks and splits are how ORS organises problems for training and evaluation. Tasks are the individual problems agents solve, while splits group these tasks into organised units: for example train/test splits, or splits for different types of problem (e.g. those which require CPUs versus GPUs).
{ "question": "If x + 5 = 12, what is x?", "answer": "7", "difficulty": "easy"}
Coding environment:
{ "problem_id": "reverse_string", "description": "Write a function to reverse a string", "test_cases": [ {"input": "hello", "output": "olleh"}, {"input": "world", "output": "dlrow"} ], "time_limit_seconds": 5}
Web navigation:
{ "task_id": "find_price", "goal": "Find the price of iPhone 15", "start_url": "https://example.com", "success_criteria": "Price found and extracted correctly"}
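As an illustration of how a task's fields might drive a reward, the sketch below scores a candidate solution against the test_cases of the coding task above. The score_solution helper and the fraction-passed reward are assumptions made for this example, not part of ORS, and time_limit_seconds is ignored here.

```python
import json

# Task copied from the coding environment example above.
task = json.loads("""
{ "problem_id": "reverse_string",
  "description": "Write a function to reverse a string",
  "test_cases": [ {"input": "hello", "output": "olleh"},
                  {"input": "world", "output": "dlrow"} ],
  "time_limit_seconds": 5 }
""")


def score_solution(task, solution):
    """Hypothetical helper: fraction of the task's test cases that `solution` passes."""
    cases = task["test_cases"]
    passed = sum(1 for case in cases if solution(case["input"]) == case["output"])
    return passed / len(cases)


def reverse_string(s):
    # Candidate solution an agent might submit.
    return s[::-1]


reward = score_solution(task, reverse_string)
print(reward)
```

A real environment would also need to sandbox the submitted code and enforce the time limit; both are left out to keep the sketch short.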
1. Environment defines tasks
2. Tasks organized into splits (e.g. train/test)
3. Agent requests tasks from a split
4. For each task:
   a. Create episode with task
   b. Get prompt (derived from task)
   c. Solve task via tool calls and receive rewards
   d. Receive finished signal
   e. Cleanup episode
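The lifecycle above can be sketched with a toy in-memory environment. Everything here (ToyEpisode, SPLITS, run_split, the single submit tool) is hypothetical and only mirrors the numbered steps; it is not the ORS server or client API.

```python
from dataclasses import dataclass


@dataclass
class ToyEpisode:
    """Hypothetical stand-in for an episode built from one task."""
    task: dict
    finished: bool = False

    def prompt(self) -> str:
        # The prompt is derived from the task.
        return self.task["question"]

    def submit(self, answer: str) -> float:
        # A single "tool call": returns a reward and ends the episode.
        self.finished = True
        return 1.0 if answer == self.task["answer"] else 0.0


# Steps 1-2: the environment defines tasks and organises them into splits.
SPLITS = {
    "train": [{"question": "What is 2+2?", "answer": "4"}],
    "test": [{"question": "If x + 5 = 12, what is x?", "answer": "7"}],
}


def run_split(split, agent):
    """Run `agent` (a callable from prompt to answer) over every task in a split."""
    rewards = []
    for task in SPLITS[split]:                  # Step 3: request tasks from a split
        episode = ToyEpisode(task)              # 4a: create episode with task
        prompt = episode.prompt()               # 4b: get prompt
        reward = episode.submit(agent(prompt))  # 4c: tool call, receive reward
        assert episode.finished                 # 4d: finished signal
        rewards.append(reward)
        del episode                             # 4e: cleanup
    return rewards


print(run_split("train", lambda prompt: "4"))
```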
Tasks are passed when creating episodes. You can provide the task inline via task_spec, or reference it by split and index:
POST /create
X-Session-ID: abc-123

{ "env_name": "math", "task_spec": { "question": "What is 2+2?", "answer": "4" }, "secrets": {} }
Or load directly from a split (the server resolves the task):
POST /create
X-Session-ID: abc-123

{ "split": "train", "index": 0, "secrets": {} }
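The two request bodies above could be assembled by a small client-side helper. The sketch below is an assumption about how a client might do this: build_create_payload is a hypothetical function, and the check that exactly one of task_spec or split+index is given is done locally rather than by the server.

```python
import json


def build_create_payload(task_spec=None, split=None, index=None,
                         env_name=None, secrets=None):
    """Hypothetical helper: build the JSON body for POST /create."""
    if (split is None) != (index is None):
        raise ValueError("split and index must be provided together")
    has_inline = task_spec is not None
    has_reference = split is not None and index is not None
    if has_inline == has_reference:
        raise ValueError("provide exactly one of task_spec or split+index")

    # secrets is sent as {} when the caller does not supply any.
    payload = {"secrets": secrets if secrets is not None else {}}
    if env_name is not None:
        payload["env_name"] = env_name
    if has_inline:
        payload["task_spec"] = task_spec
    else:
        payload["split"] = split
        payload["index"] = index
    return payload


inline = build_create_payload(
    task_spec={"question": "What is 2+2?", "answer": "4"}, env_name="math")
by_reference = build_create_payload(split="train", index=0)
print(json.dumps(inline))
print(json.dumps(by_reference))
```

Rejecting an ambiguous request before it is sent keeps the error close to the caller; the server may apply its own validation as well.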
env_name and secrets are optional: env_name defaults to the first registered environment, and secrets defaults to {}. Exactly one of task_spec or split+index must be provided (see CreateSession).
The environment uses the task to:
Type defaults: Environments can return splits as either Split objects or bare strings. The server normalises bare strings: "train", "validation", and "test" map to their corresponding type, while any other name defaults to "type": "validation".
Convention: When using Split objects explicitly, map each split to one of the standard types: train, validation, or test.
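The bare-string normalisation rule described above can be sketched as follows. This is an illustration only, not the server's implementation: the normalise_split function is hypothetical, and the "name" key is an assumed field of the Split object (only "type" appears in the text).

```python
STANDARD_TYPES = {"train", "validation", "test"}


def normalise_split(split):
    """Hypothetical helper: turn a bare string or a Split-like dict into a dict."""
    if isinstance(split, str):
        # Standard names map to their own type; anything else becomes "validation".
        split_type = split if split in STANDARD_TYPES else "validation"
        return {"name": split, "type": split_type}
    # Already a Split-like dict: return a copy unchanged.
    return dict(split)


for raw in ["train", "test", "gpu_only", {"name": "hard", "type": "test"}]:
    print(normalise_split(raw))
```

Under this rule a custom split such as "gpu_only" is treated as validation data unless the environment returns an explicit Split object with a different type.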
Key Takeaway: Tasks are the problems agents solve. Splits organize tasks for proper ML workflows. Design task structures that are clear, validated, and organized into train/test splits to enable both learning and fair evaluation.