Sessions as Episodes
In ORS, an episode:
- Starts with a specific task
- Continues through multiple tool calls
- Ends when finished: true is received from a ToolOutput
RL Episode Terminology
| RL Term | ORS Term | Description |
|---|---|---|
| Episode | Session | One complete trajectory |
| State | Environment instance | Full internal state on the server |
| Observation | Blocks (prompt + tool outputs) | Partial view of state returned to the agent |
| Action | Tool call | Agent action |
| Reward | ToolOutput.reward | Feedback signal |
| Terminal state | finished: true | Episode complete |
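To make the mapping concrete, the sketch below converts a ToolOutput-like record into the (observation, reward, done) triple familiar from RL libraries. The field names blocks, reward, and finished come from the table above; the dataclass layout, the to_transition helper, the dict shape of a block, and the 0.0 default for a missing reward are assumptions made for illustration, not the ORS schema.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ToolOutput:
    """Illustrative stand-in for an ORS ToolOutput (field layout is assumed)."""
    blocks: list[dict[str, Any]] = field(default_factory=list)
    reward: Optional[float] = None
    finished: bool = False


def to_transition(output: ToolOutput) -> tuple[list[dict[str, Any]], float, bool]:
    """Map a tool output to an RL-style (observation, reward, done) triple."""
    # Treating a missing reward as 0.0 is an assumption of this sketch.
    reward = 0.0 if output.reward is None else output.reward
    return output.blocks, reward, output.finished


# Hypothetical final step of an episode.
out = ToolOutput(blocks=[{"type": "text", "text": "Correct!"}], reward=1.0, finished=True)
observation, reward, done = to_transition(out)
```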
Episode Lifecycle
Complete Flow
States in Detail
1. Session ID Generation
2. Episode Initialization
env_name defaults to the first registered environment. Either task_spec or both split and index must be provided (see CreateSession).
What happens:
- Server resolves env_name (or defaults to the first environment) and task_spec (or loads it from split/index)
- Instantiates the environment class with task_spec and secrets
- Calls environment.setup() (async)
- Marks session as “ready” when setup completes
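The initialization steps above could look roughly like this on the server side. This is a minimal sketch, not the ORS server: REGISTRY, TASKS, SESSIONS, EchoEnvironment, and create_episode are hypothetical names, and a real server would load tasks and store sessions differently.

```python
import asyncio
from typing import Any, Optional


class EchoEnvironment:
    """Hypothetical environment used only for this sketch."""

    def __init__(self, task_spec: dict[str, Any], secrets: dict[str, str]):
        self.task_spec = task_spec
        self.secrets = secrets
        self.is_set_up = False

    async def setup(self) -> None:
        self.is_set_up = True


# Dicts preserve insertion order, so the first key is the first registered environment.
REGISTRY = {"echo": EchoEnvironment}
# Hypothetical task store keyed by (split, index).
TASKS = {("train", 0): {"question": "2 + 2"}}
SESSIONS: dict[str, dict[str, Any]] = {}


async def create_episode(
    session_id: str,
    env_name: Optional[str] = None,
    task_spec: Optional[dict[str, Any]] = None,
    split: Optional[str] = None,
    index: Optional[int] = None,
    secrets: Optional[dict[str, str]] = None,
):
    # 1. Resolve env_name, defaulting to the first registered environment.
    if env_name is None:
        env_name = next(iter(REGISTRY))
    # 2. Resolve task_spec, loading it from split/index when not given directly.
    if task_spec is None:
        if split is None or index is None:
            raise ValueError("provide task_spec, or both split and index")
        task_spec = TASKS[(split, index)]
    # 3. Instantiate the environment with task_spec and secrets.
    env = REGISTRY[env_name](task_spec, secrets or {})
    SESSIONS[session_id] = {"env": env, "ready": False}
    # 4. Run the async setup, then mark the session as ready.
    await env.setup()
    SESSIONS[session_id]["ready"] = True
    return env


env = asyncio.run(create_episode("session-1", split="train", index=0))
```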
3. Initial Observation
4. Action-Observation Loop
- Agent takes action (calls tool)
- Environment executes action
- Environment returns next state (blocks), reward, and termination flag
- If finished: false, repeat from step 1
- If finished: true, episode is complete
- Action: Tool call
- Observation: Blocks
- Reward: Reward signal
- Terminal: Finished flag
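The loop above can be sketched as a transport-agnostic rollout function. The call_tool and choose_action callables are placeholders supplied by the caller (for example, a wrapper around the tool-call endpoint and an agent policy); the dict shape of the output and the max_steps safety cap are assumptions of this sketch rather than part of ORS.

```python
from typing import Any, Callable


def run_episode(
    call_tool: Callable[[dict], dict],
    choose_action: Callable[[Any], dict],
    first_observation: Any,
    max_steps: int = 50,
) -> list[tuple[dict, float]]:
    """Run the action-observation loop until finished is true (or max_steps)."""
    trajectory = []
    observation = first_observation
    for _ in range(max_steps):
        action = choose_action(observation)       # agent takes an action
        output = call_tool(action)                # environment executes it
        trajectory.append((action, output["reward"]))
        observation = output["blocks"]            # next observation
        if output["finished"]:                    # terminal flag ends the episode
            break
    return trajectory


# Toy stand-in for an environment that finishes on the third tool call.
calls = {"count": 0}


def fake_call_tool(action: dict) -> dict:
    calls["count"] += 1
    done = calls["count"] == 3
    return {
        "blocks": [{"text": f"step {calls['count']}"}],
        "reward": 1.0 if done else 0.0,
        "finished": done,
    }


trajectory = run_episode(
    fake_call_tool,
    lambda obs: {"name": "noop", "input": {}},
    first_observation=[{"text": "prompt"}],
)
```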
5. Episode Termination
- Calls environment.teardown()
- Removes session from active sessions
- Frees memory and resources
Always call /delete when done, even if the episode finished naturally.
Episode Termination
The finished Signal
The finished field in ToolOutput is critical:
finished: true:
- Episode is complete
- Agent should stop calling tools
- Agent should call /delete to clean up
- Task succeeded or failed (check reward or blocks for details)
finished: false:
- Episode continues
- Agent should take another action
- State may have changed (reflected in blocks)
Termination Patterns
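The patterns in this section differ only in how many tool calls precede finished: true and in what the final reward says about success. A small sketch with hypothetical tool outputs illustrates this; the reward values (1.0 for success, 0.0 for failure) and the steps_until_finished helper are illustrative assumptions, not values defined by ORS.

```python
from typing import Optional


def steps_until_finished(outputs: list[dict]) -> Optional[int]:
    """Return the 1-based step at which finished first becomes true, else None."""
    for step, output in enumerate(outputs, start=1):
        if output["finished"]:
            return step
    return None


# Hypothetical output sequences, one per termination pattern.
immediate = [{"reward": 1.0, "finished": True}]
multi_step = [
    {"reward": 0.0, "finished": False},
    {"reward": 0.0, "finished": False},
    {"reward": 1.0, "finished": True},
]
failure = [
    {"reward": 0.0, "finished": False},
    {"reward": 0.0, "finished": True},  # the task failed, but the episode still ends
]

lengths = {
    "immediate": steps_until_finished(immediate),
    "multi_step": steps_until_finished(multi_step),
    "failure": steps_until_finished(failure),
}
```

In all three cases the client reacts the same way: stop calling tools once finished is true, then inspect the reward or blocks to tell success from failure.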
Pattern 1: Immediate Termination
Task completes in one step.
Pattern 2: Multi-Step Termination
Task requires multiple actions.
Pattern 3: Failure Termination
Task fails (but episode still terminates).
State Management
What’s Preserved in a Session?
Environment state:
- Instance variables in environment class
- Files created during episode (if environment has filesystem or persistent sandbox)
- Any side effects from tool executions
What’s NOT Preserved?
Across episodes:
- Each session is independent at the protocol level
- Session 1 and Session 2 have separate environment instances
- No shared instance state between sessions (though implementations may share class-level or cached state)
After a timeout:
- 15 minutes of inactivity → session deleted
- State is lost
- Must create new session
After finished: true:
- Episode data is final
- Further tool calls should not be made
- Call /delete for cleanup
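The distinction between per-session instance state and class-level state that an implementation may share can be shown in a few lines of plain Python. CounterEnvironment and its attributes are hypothetical and exist only for this illustration.

```python
class CounterEnvironment:
    # Class-level attribute: a single dict shared by every instance in the process.
    shared_cache: dict[str, int] = {}

    def __init__(self) -> None:
        # Instance attribute: private to one session's environment.
        self.calls = 0

    def call_tool(self, key: str) -> int:
        self.calls += 1
        # Mutating the class-level dict is visible to all other sessions.
        self.shared_cache[key] = self.shared_cache.get(key, 0) + 1
        return self.calls


session_1 = CounterEnvironment()
session_2 = CounterEnvironment()
session_1.call_tool("a")
session_1.call_tool("a")
session_2.call_tool("a")
```

Here session_1.calls and session_2.calls evolve independently, while shared_cache accumulates calls from both sessions. An environment author who wants strict isolation should keep mutable state on the instance.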
Session Timeout
Sessions automatically expire after 15 minutes of inactivity.
Inactivity Definition
“Inactivity” means no requests with that session’s X-Session-ID:
- /ping resets timer
- /{env_name}/call resets timer
- /{env_name}/prompt resets timer
- Any request with X-Session-ID resets the timer (except /delete, which removes the session)
Keeping Sessions Alive
For long-running episodes, periodically call /ping to reset the inactivity timer.
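One way to do this is a small background pinger, sketched below. The ping callable is supplied by the caller (for example, a function that sends a /ping request carrying the session's X-Session-ID); the KeepAlive class and the 300-second default interval are assumptions of this sketch, chosen only to stay well under the 15-minute timeout.

```python
import threading
import time
from typing import Callable


class KeepAlive:
    """Call ping() every interval_seconds on a background thread until exit."""

    def __init__(self, ping: Callable[[], None], interval_seconds: float = 300.0):
        self._ping = ping
        self._interval = interval_seconds
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self) -> None:
        # Event.wait returns False on timeout and True once stop is set,
        # so the loop pings after each full interval and exits promptly on stop.
        while not self._stop.wait(self._interval):
            self._ping()

    def __enter__(self) -> "KeepAlive":
        self._thread.start()
        return self

    def __exit__(self, *exc_info) -> None:
        self._stop.set()
        self._thread.join()


# Demo with a fake ping and a very short interval.
pings = []
with KeepAlive(lambda: pings.append(time.monotonic()), interval_seconds=0.01):
    time.sleep(0.1)  # stands in for a long-running step between tool calls
```

If ping raises, the background thread in this sketch stops silently; a production client would likely log the error and decide whether the session is still usable.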
Timeout Cleanup
When a session times out:
- Server calls environment.teardown()
- Session removed from active sessions
- Subsequent requests with that session ID → 404 Not Found
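The inactivity rule and the cleanup steps above can be sketched as a small in-memory session table. This is an illustrative model, not the ORS server implementation: SessionTable, SessionNotFound, the injectable clock, and the choice to expire at exactly 15 minutes (>=) are all assumptions.

```python
import asyncio
import time

TIMEOUT_SECONDS = 15 * 60


class SessionNotFound(KeyError):
    """In an HTTP server this would be reported as 404 Not Found."""


class SessionTable:
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._sessions = {}  # session_id -> (environment, last_seen)

    def add(self, session_id, environment):
        self._sessions[session_id] = (environment, self._clock())

    def touch(self, session_id):
        """Look up the session for an incoming request and reset its timer."""
        if session_id not in self._sessions:
            raise SessionNotFound(session_id)
        environment, _ = self._sessions[session_id]
        self._sessions[session_id] = (environment, self._clock())
        return environment

    async def sweep(self):
        """Tear down and remove every session idle for the full timeout."""
        now = self._clock()
        expired = [
            sid for sid, (_, seen) in self._sessions.items()
            if now - seen >= TIMEOUT_SECONDS
        ]
        for sid in expired:
            environment, _ = self._sessions.pop(sid)
            await environment.teardown()
        return expired


class DummyEnvironment:
    torn_down = False

    async def teardown(self):
        self.torn_down = True


# Demo with a fake clock so no real waiting is needed.
fake_now = [0.0]
table = SessionTable(clock=lambda: fake_now[0])
env = DummyEnvironment()
table.add("s1", env)
fake_now[0] = 14 * 60
table.touch("s1")                            # activity at minute 14 resets the timer
fake_now[0] = 28 * 60
expired_early = asyncio.run(table.sweep())   # only 14 idle minutes so far
fake_now[0] = 29 * 60
expired_late = asyncio.run(table.sweep())    # 15 idle minutes: session expires
```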
Session Best Practices
1. Always Delete Sessions
2. Check finished Flag
3. Handle Errors Gracefully
4. Use Context Managers
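A context manager guarantees cleanup even when a tool call raises. The sketch below is transport-agnostic: create and delete are caller-supplied callables (for example, thin wrappers around /create_session and /delete), and the session helper itself is a hypothetical name, not part of an ORS client library.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def session(create: Callable[[], str], delete: Callable[[str], None]) -> Iterator[str]:
    """Create a session and guarantee that delete runs on exit."""
    session_id = create()
    try:
        yield session_id
    finally:
        # Runs on normal exit and on exceptions alike.
        delete(session_id)


# Demo with fake create/delete functions that record what happened.
log = []


def fake_create() -> str:
    log.append("create")
    return "session-1"


def fake_delete(session_id: str) -> None:
    log.append(f"delete {session_id}")


try:
    with session(fake_create, fake_delete) as sid:
        log.append(f"use {sid}")
        raise RuntimeError("tool call failed")
except RuntimeError:
    pass  # the error still propagates to the caller, but cleanup already ran
```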
Debugging Sessions
Common Issues
Issue: “404 Session not found”
- Cause: Session timed out or was deleted
- Fix: Make sure no more than 15 minutes pass between requests, or use /ping to keep the session alive

- Cause: Trying to create episode with already-used session ID
- Fix: Generate new session ID with /create_session

- Cause: Calling tool after /delete was called
- Fix: Don’t reuse session IDs after deletion
Next Steps
Rewards Concept
Understand reward signals in episodes
Implementing a Client
Build a client that manages sessions
Key Takeaway: Sessions are RL episodes. They start with a task, continue until finished: true, and should always be cleaned up with /delete.
