Sessions & Episodes
In ORS, a session IS an RL episode. This page explains the episode concept, lifecycle, state management, and best practices.
Core Concept: Session = Episode
The most important concept in ORS: a session represents one complete RL episode (trajectory) through an environment.
An episode:
- Starts with a specific task
- Continues through multiple tool calls
- Ends when finished: true is received
- Represents one complete problem-solving attempt
RL Episode Terminology
| RL Term | ORS Term | Description |
|---|---|---|
| Episode | Session | One complete trajectory |
| State | Blocks (prompt + tool outputs) | Observable environment state |
| Action | Tool call | Agent action |
| Reward | ToolOutput.reward | Feedback signal |
| Terminal state | finished: true | Episode complete |
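The mapping in the table can be sketched as a small data conversion. This is a minimal illustration, not the ORS schema: the field names blocks, reward, and finished come from this page, while the ToolOutput defaults, the Transition class, and to_transition are hypothetical names introduced here.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ToolOutput:
    """Assumed shape: this page names only blocks, reward, and finished."""
    blocks: list[dict[str, Any]] = field(default_factory=list)
    reward: float = 0.0
    finished: bool = False


@dataclass
class Transition:
    """One RL step, using the terms from the table above (hypothetical class)."""
    action: dict[str, Any]             # Action: the tool call
    observation: list[dict[str, Any]]  # State: blocks
    reward: float                      # Reward: ToolOutput.reward
    terminal: bool                     # Terminal state: finished


def to_transition(tool_call: dict[str, Any], output: ToolOutput) -> Transition:
    # Pair the agent's tool call with what the environment returned for it.
    return Transition(tool_call, output.blocks, output.reward, output.finished)


output = ToolOutput(blocks=[{"text": "Correct!"}], reward=1.0, finished=True)
step = to_transition({"name": "submit", "input": {"answer": "4"}}, output)
```

A list of such transitions, ending with one where terminal is true, is the trajectory that a session represents.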
Episode Lifecycle
Complete Flow
States in Detail
1. Session ID Generation
2. Episode Initialization
- Server instantiates the environment class
- Passes task_spec and secrets to the constructor
- Calls environment.setup() (async)
- Marks session as “ready” when setup completes
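The initialization steps above can be sketched roughly as follows. Only the constructor arguments (task_spec, secrets), the async setup() call, and the “ready” status come from this page; EchoEnvironment, initialize_episode, and the in-memory sessions dictionary are assumptions made for illustration and say nothing about how a real ORS server stores sessions.

```python
import asyncio


class EchoEnvironment:
    """Hypothetical environment used only for this sketch."""

    def __init__(self, task_spec, secrets):
        self.task_spec = task_spec
        self.secrets = secrets
        self.resources_ready = False

    async def setup(self):
        await asyncio.sleep(0)  # stand-in for real async work (containers, files, ...)
        self.resources_ready = True


async def initialize_episode(sessions, session_id, env_cls, task_spec, secrets):
    env = env_cls(task_spec, secrets)   # instantiate and pass task_spec and secrets
    await env.setup()                   # run the async setup
    sessions[session_id] = {"env": env, "status": "ready"}  # mark as ready
    return env


sessions = {}
env = asyncio.run(
    initialize_episode(sessions, "session-1", EchoEnvironment, {"question": "2+2"}, {})
)
```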
3. Initial Observation
4. Action-Observation Loop
- Agent takes an action (calls a tool)
- Environment executes the action
- Environment returns the next state (blocks), reward, and termination flag
- If finished: false, repeat from the first step of the loop
- If finished: true, the episode is complete
In RL terms, each iteration consists of:
- Action: Tool call
- Observation: Blocks
- Reward: Reward signal
- Terminal: Finished flag
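The loop described above might look like this on the agent side. It is a sketch under stated assumptions: call_tool and choose_action are hypothetical callables standing in for the real tool call and the agent's policy, tool outputs are modeled as plain dictionaries, and the max_steps guard is an addition that this page does not mention.

```python
def run_episode(call_tool, choose_action, first_observation, max_steps=100):
    """Run one episode and return (total_reward, steps_taken)."""
    observation = first_observation
    total_reward = 0.0
    for step in range(1, max_steps + 1):
        action = choose_action(observation)      # agent takes an action
        output = call_tool(action)               # environment executes it
        total_reward += output["reward"]
        observation = output["blocks"]           # next state
        if output["finished"]:                   # terminal flag ends the loop
            return total_reward, step
    return total_reward, max_steps


def make_counting_env(target):
    """Toy environment (hypothetical): finishes on the target-th tool call."""
    state = {"count": 0}

    def call_tool(action):
        state["count"] += 1
        done = state["count"] >= target
        return {
            "blocks": [{"text": f"count={state['count']}"}],
            "reward": 1.0 if done else 0.0,
            "finished": done,
        }

    return call_tool


total, steps = run_episode(
    make_counting_env(3), lambda obs: {"name": "increment"}, [{"text": "start"}]
)
```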
5. Episode Termination
- Server calls environment.teardown()
- Removes session from active sessions
- Frees memory and resources
Call /delete when done, even if the episode finished naturally.
Episode Termination
The finished Signal
The finished field in ToolOutput is critical:
finished: true:
- Episode is complete
- Agent should stop calling tools
- Agent should call /delete to clean up
- Task succeeded or failed (check reward or blocks for details)
finished: false:
- Episode continues
- Agent should take another action
- State may have changed (reflected in blocks)
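To make the finished semantics concrete, here is a small sketch of three episodes expressed as lists of tool outputs: one that ends on its first step, one that ends after several steps, and one that ends in failure. The dictionary layout, the block contents, and the use of a 0.0 reward to signal failure are assumptions for illustration only.

```python
# Each episode is the ordered list of tool outputs the agent received.
immediate = [
    {"blocks": [{"text": "Correct!"}], "reward": 1.0, "finished": True},
]
multi_step = [
    {"blocks": [{"text": "File listed"}], "reward": 0.0, "finished": False},
    {"blocks": [{"text": "File read"}], "reward": 0.0, "finished": False},
    {"blocks": [{"text": "Answer accepted"}], "reward": 1.0, "finished": True},
]
failure = [
    {"blocks": [{"text": "Wrong answer"}], "reward": 0.0, "finished": True},
]


def steps_until_finished(outputs):
    """Return the 1-based step at which finished became true, or None."""
    for index, output in enumerate(outputs, start=1):
        if output["finished"]:
            return index
    return None
```

Note that the failed episode still terminates: finished reports that the episode is over, and the reward or blocks report how it went.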
Termination Patterns
Pattern 1: Immediate Termination
Task completes in one step.
Pattern 2: Multi-Step Termination
Task requires multiple actions.
Pattern 3: Failure Termination
Task fails (but episode still terminates).
State Management
What’s Preserved in a Session?
Environment state:
- Instance variables in environment class
- Files created during episode (if environment has filesystem)
- Any side effects from tool executions
What’s NOT Preserved?
Across episodes:
- Each session is independent
- Session 1 and Session 2 have separate state
- No shared state between sessions
After timeout:
- 15 minutes of inactivity → session deleted
- State is lost
- Must create new session
After finished: true:
- Episode data is final
- Further tool calls should not be made
- Call /delete for cleanup
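The preservation rules above follow naturally if each session owns its own environment instance. The sketch below assumes exactly that; NotebookEnvironment and the sessions dictionary are hypothetical and exist only to show instance variables persisting within one session while staying invisible to another.

```python
class NotebookEnvironment:
    """Hypothetical environment whose state lives in instance variables."""

    def __init__(self):
        self.notes = []

    def add_note(self, text):
        self.notes.append(text)
        return len(self.notes)


# One environment instance per session, so nothing is shared between sessions.
sessions = {
    "session-1": NotebookEnvironment(),
    "session-2": NotebookEnvironment(),
}

sessions["session-1"].add_note("first")    # state persists across tool calls
sessions["session-1"].add_note("second")   # within the same session
```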
Session Timeout
Sessions automatically expire after 15 minutes of inactivity.
Inactivity Definition
“Inactivity” means no requests with that session’s X-Session-ID:
- /ping resets timer
- /{env_name}/call resets timer
- /{env_name}/prompt resets timer
- Any request with X-Session-ID resets timer
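One way to picture the inactivity rule is a per-session "last seen" timestamp that every request refreshes. The sketch below is a model of that idea, not the server's actual bookkeeping: InactivityTracker and its methods are hypothetical, and the clock is injectable so the example can simulate time passing.

```python
import time

SESSION_TIMEOUT_SECONDS = 15 * 60  # 15 minutes, as stated above


class InactivityTracker:
    """Hypothetical model of the inactivity timer."""

    def __init__(self, now=time.monotonic):
        self._now = now
        self._last_seen = {}

    def touch(self, session_id):
        # Called for any request carrying this session's X-Session-ID.
        self._last_seen[session_id] = self._now()

    def expired(self):
        cutoff = self._now() - SESSION_TIMEOUT_SECONDS
        return [sid for sid, seen in self._last_seen.items() if seen < cutoff]


# Simulated clock: both sessions start at t=0, only session-a is used again.
clock = {"t": 0.0}
tracker = InactivityTracker(now=lambda: clock["t"])
tracker.touch("session-a")
tracker.touch("session-b")
clock["t"] = 10 * 60
tracker.touch("session-a")        # resets session-a's timer
clock["t"] = 20 * 60
expired_sessions = tracker.expired()
```

At the 20-minute mark only session-b has gone more than 15 minutes without a request, so it is the only one reported as expired.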
Keeping Sessions Alive
For long-running episodes, periodically call /ping.
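A background keep-alive could be sketched as below. The helper takes a ping callable rather than making an HTTP request itself, because this page does not specify the HTTP method or response of /ping; in practice the callable would send a request to /ping with the session's X-Session-ID header. The name start_keepalive and the five-minute default interval are assumptions.

```python
import threading


def start_keepalive(ping, interval_seconds=300.0):
    """Call ping() every interval_seconds until the returned event is set."""
    stop = threading.Event()

    def loop():
        # Event.wait returns False on timeout and True once stop is set.
        while not stop.wait(interval_seconds):
            ping()

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return stop, thread
```

Call stop.set() when the episode ends so that the thread exits before the session is deleted. Any interval comfortably below the 15-minute timeout should work.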
Timeout Cleanup
When a session times out:
- Server calls environment.teardown()
- Session removed from active sessions
- Subsequent requests with that session ID → 404 Not Found
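The cleanup sequence above can be modeled in a few lines. This is only an illustration: Environment, expire, and status_for are hypothetical names, the session store is a plain dictionary, and teardown() is written as a synchronous method here even though the real method may be async.

```python
class Environment:
    """Hypothetical environment that records whether teardown ran."""

    def __init__(self):
        self.torn_down = False

    def teardown(self):  # assumed synchronous for this sketch
        self.torn_down = True


def expire(sessions, session_id):
    env = sessions[session_id]
    env.teardown()              # server calls environment.teardown()
    del sessions[session_id]    # session removed from active sessions
    return env


def status_for(sessions, session_id):
    # Requests for unknown session IDs map to 404 Not Found.
    return 200 if session_id in sessions else 404


sessions = {"session-a": Environment()}
before = status_for(sessions, "session-a")
expired_env = expire(sessions, "session-a")
after = status_for(sessions, "session-a")
```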
Concurrent Sessions
Multiple agents can run episodes concurrently:
- Each session has independent state
- Sessions do not interfere with each other
- Server manages concurrency internally
- Server can handle many concurrent sessions
- Limited by server resources (memory, CPU)
- Each session incurs overhead
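On the client side, concurrent episodes can be driven with asyncio. The sketch below simulates each episode with a local counter instead of real requests, so run_episode and its arguments are hypothetical; the point is only that each coroutine keeps its own state and asyncio.gather collects the results.

```python
import asyncio


async def run_episode(session_id, steps):
    state = {"count": 0}            # per-episode state, never shared
    for _ in range(steps):
        await asyncio.sleep(0)      # yield control, as a network call would
        state["count"] += 1
    return session_id, state["count"]


async def main():
    pairs = await asyncio.gather(
        run_episode("session-a", 2),
        run_episode("session-b", 5),
        run_episode("session-c", 3),
    )
    return dict(pairs)


results = asyncio.run(main())
```

Because every session adds overhead on the server, a real client would likely cap how many episodes run at once, for example with asyncio.Semaphore.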
Session Best Practices
1. Always Delete Sessions
2. Check finished Flag
3. Handle Errors Gracefully
4. Use Context Managers
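A context manager is one way to guarantee cleanup even when a tool call raises. The sketch below uses an in-memory FakeClient so that it runs without a server; its create_session and delete methods are hypothetical stand-ins for whatever client calls reach /create_session and /delete.

```python
from contextlib import contextmanager


class FakeClient:
    """In-memory stand-in for an ORS client; method names are hypothetical."""

    def __init__(self):
        self.active = set()
        self._counter = 0

    def create_session(self):
        self._counter += 1
        session_id = f"session-{self._counter}"
        self.active.add(session_id)
        return session_id

    def delete(self, session_id):
        self.active.discard(session_id)


@contextmanager
def episode(client):
    session_id = client.create_session()
    try:
        yield session_id
    finally:
        client.delete(session_id)   # runs on success and on error


client = FakeClient()
try:
    with episode(client) as session_id:
        raise RuntimeError("tool call failed")
except RuntimeError:
    pass  # the error still propagates, but the session was already deleted
```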
Episode Patterns
Pattern: Single-Step Episodes
Simple tasks that complete in one action.
Pattern: Multi-Step Episodes
Complex tasks requiring exploration.
Pattern: Timeout Protection
Long episodes with keep-alive.
Debugging Sessions
Common Issues
Issue: “404 Session not found”
- Cause: Session timed out or was deleted
- Fix: Keep the gap between requests under 15 minutes, or use /ping to keep the session alive
- Cause: Trying to create episode with already-used session ID
- Fix: Generate new session ID with /create_session
- Cause: Calling tool after /delete was called
- Fix: Don’t reuse session IDs after deletion
Monitoring Sessions
Track episode progress.
Next Steps
Rewards Concept
Understand reward signals in episodes
Implementing a Client
Build a client that manages sessions
Testing Locally
Test episode logic with local server
Key Takeaway: Sessions are RL episodes. They start with a task, continue until finished: true, and should always be cleaned up with /delete. Understanding this concept is essential to working with ORS.
