Execution
How execution works
When you send a message in the Snippbot UI, it flows through a multi-stage execution pipeline:
```
User message in chat UI
          │
          ▼
Context built (project, memory, user preferences)
          │
          ▼
Context window checked (summarize older messages if over budget)
          │
          ▼
Message sent to LLM (streaming response)
          │
          ├──→ Text response → stream to UI
          └──→ Tool calls → execute tools → feed results back to LLM
                    │
                    └── Loop up to 10 turns until done
```

Chat execution loop
The core execution engine is the streaming chat loop. When you send a message:
- The conversation history is loaded and any file attachments are processed (text extraction, OCR, or vision analysis for images)
- The Context Builder assembles additional context: project info, user preferences, and memory recall results
- The Context Window Manager checks if the conversation exceeds the token budget; if so, older messages are summarized into a compact preamble
- The message is streamed to the LLM provider
- If the agent is in agentic mode (tool use enabled), tool calls are executed locally and their results are fed back to the LLM for up to 10 turns per message
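The steps above can be condensed into a single loop. The following is a runnable toy sketch, not Snippbot's actual code: the `Response` shape, the `fake_llm` stub, and the tool registry are all illustrative.

```python
# Toy sketch of the streaming chat loop: send messages, execute any tool
# calls locally, feed results back, up to the 10-turn limit per message.
from dataclasses import dataclass, field

MAX_TURNS = 10  # per-message safety limit from the docs

@dataclass
class Response:
    text: str = ""
    tool_calls: list = field(default_factory=list)  # (name, args) pairs

def run_chat_turn(llm, tools, messages):
    for _ in range(MAX_TURNS):
        response = llm(messages)
        if not response.tool_calls:
            return response.text          # final text response ends the loop
        for name, args in response.tool_calls:
            result = tools[name](**args)  # tools run locally
            messages.append({"role": "tool", "name": name, "content": result})
    return "(turn limit reached)"

# Stub LLM: asks for a tool on the first turn, answers on the second.
def fake_llm(messages):
    if any(m.get("role") == "tool" for m in messages):
        return Response(text="2 + 2 = 4")
    return Response(tool_calls=[("add", {"a": 2, "b": 2})])

tools = {"add": lambda a, b: str(a + b)}
```

Calling `run_chat_turn(fake_llm, tools, [{"role": "user", "content": "What is 2 + 2?"}])` walks one tool turn and then returns the final text.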
Context modes
The chat UI supports several context modes that change how the agent behaves:
| Mode | Purpose |
|---|---|
| default | General-purpose conversation |
| brainstorm | Requirements discussion; preserves confirmed requirements and decisions |
| output-refine | Iterative refinement; tracks changes requested and applied |
| browser | Web automation with Playwright |
| game | Tabletop RPG narrative with specialized story summarization |
Each mode uses a tailored summarization prompt when the context window is compacted, so the agent retains the most relevant information for that mode.
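A minimal sketch of how per-mode summarization prompts might be selected. The prompt strings and the `SUMMARY_PROMPTS` mapping are illustrative assumptions, not Snippbot's actual prompts.

```python
# Illustrative mapping from context mode to summarization prompt.
# Unknown modes fall back to the default prompt.
SUMMARY_PROMPTS = {
    "default": "Summarize the conversation so far.",
    "brainstorm": "Summarize, preserving all confirmed requirements and decisions.",
    "output-refine": "Summarize, listing every change requested and applied.",
    "browser": "Summarize, keeping page URLs and automation state.",
    "game": "Summarize the story so far, keeping plot threads and character state.",
}

def summarization_prompt(mode: str) -> str:
    return SUMMARY_PROMPTS.get(mode, SUMMARY_PROMPTS["default"])
```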
Context window management
Snippbot automatically manages the conversation context window to stay within model token limits. This is handled by the Context Window Manager (part of the Working Memory tier in the cognitive memory architecture).
How it works
- Token estimation: Messages are estimated at roughly 4 characters per token. Images count as approximately 1,000 tokens each.
- Budget calculation: The effective budget is determined by the model’s context limit minus the system prompt tokens and a safety margin of 20,000 tokens. The default budget is 150,000 tokens.
- Split point: When the conversation exceeds the budget, the manager walks backward from the most recent messages, always keeping at least the last 4 messages (2 full turns).
- Summarization: Older messages are summarized by Claude Haiku for speed and cost. If the LLM summary fails, a heuristic fallback truncates each message to 150 characters.
- Reassembly: The summary is injected as a system message, followed by the recent messages in full.
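The budgeting rules above can be expressed directly. The constants come from this page; the function names are illustrative, not Snippbot's internals.

```python
# Sketch of the documented token-budgeting rules.
CHARS_PER_TOKEN = 4
IMAGE_TOKENS = 1_000
SAFETY_MARGIN = 20_000
DEFAULT_BUDGET = 150_000
KEEP_LAST = 4  # always keep the last 4 messages (2 full turns)

def estimate_tokens(message: dict) -> int:
    text_tokens = len(message.get("text", "")) // CHARS_PER_TOKEN
    return text_tokens + IMAGE_TOKENS * message.get("images", 0)

def effective_budget(context_limit: int, system_prompt_tokens: int) -> int:
    return context_limit - system_prompt_tokens - SAFETY_MARGIN

def split_point(messages: list, budget: int = DEFAULT_BUDGET) -> int:
    """Walk backward from the newest message; return the index before
    which older messages should be summarized (0 = summarize nothing)."""
    total = 0
    for i in range(len(messages) - 1, -1, -1):
        total += estimate_tokens(messages[i])
        # Only split outside the protected tail of KEEP_LAST messages.
        if total > budget and i < len(messages) - KEEP_LAST:
            return i + 1
    return 0
```

For example, with ten 100-token messages and a 250-token budget, the split lands so that the last four messages survive in full.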
Model context limits
| Model | Context limit |
|---|---|
| Claude Sonnet 4.5 | 200,000 tokens |
| Claude Sonnet 4 | 200,000 tokens |
| Claude Haiku 4.5 | 200,000 tokens |
| Claude Opus 4 | 200,000 tokens |
| Gemini 2.0 Flash | 1,000,000 tokens |
| Gemini 2.5 Pro | 1,000,000 tokens |
Context window strategy
You can control context window behavior from the Agent Settings panel in the UI:
| Strategy | Behavior |
|---|---|
| preserve (default) | Keep as much conversation history as possible; summarize only when necessary |
| compact | Aggressively summarize to minimize token usage |
| summarize | Always summarize older messages to maintain a compact context |
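One way to picture the three strategies is as different trigger points relative to the token budget. The fractions below are illustrative assumptions, not Snippbot's actual thresholds.

```python
# Illustrative model: each strategy summarizes once usage crosses some
# fraction of the budget. Fractions are assumptions for illustration.
STRATEGY_TRIGGER = {
    "preserve": 1.0,   # summarize only once the budget is actually exceeded
    "compact": 0.5,    # summarize early to keep the context small
    "summarize": 0.0,  # always summarize older messages
}

def should_summarize(strategy: str, used_tokens: int, budget: int) -> bool:
    return used_tokens > budget * STRATEGY_TRIGGER[strategy]
```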
Agentic mode and tool execution
When agentic mode is enabled in the chat UI, the agent can use tools (file operations, code execution, web browsing, etc.) to complete tasks. Each message can trigger a multi-turn loop:
- The LLM responds with one or more tool calls
- Each tool is executed locally by the Tool Executor, which runs within the project’s working directory
- Tool results are collected and sent back to the LLM as the next turn
- The loop continues until the LLM produces a final text response (no more tool calls) or the 10-turn safety limit is reached
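Because the Tool Executor runs within the project's working directory, tool-supplied paths need to be confined to it. This is a sketch of that safety property; the function name and check are illustrative, not the Tool Executor's actual implementation.

```python
# Sketch: refuse tool-supplied paths that escape the project workspace.
from pathlib import Path

def resolve_in_workspace(workspace: str, relative: str) -> Path:
    """Resolve a path relative to the workspace root, raising if the
    resolved target lands outside the workspace."""
    root = Path(workspace).resolve()
    target = (root / relative).resolve()
    if root != target and root not in target.parents:
        raise ValueError(f"path escapes workspace: {relative}")
    return target
```

For instance, `resolve_in_workspace("/tmp", "../etc/passwd")` raises, while paths inside the workspace resolve normally.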
Agent settings
Execution behavior is configurable from the Agent Settings page in the Snippbot UI. These settings are stored per user in the local database.
| Setting | Default | Range | Description |
|---|---|---|---|
| Personality | balanced | precise, balanced, creative, minimal | Agent response style |
| Verbosity | normal | minimal, normal, verbose, debug | How much detail in responses |
| Auto-execute | false | on/off | Whether tasks run without manual approval |
| Approval threshold | 50 | 0–100 | Risk threshold above which approval is required |
| Max tokens per task | 4,096 | 1,024–131,072 | Token limit for a single task |
| Max retries | 3 | 0–10 | Maximum retry attempts on failure |
| Timeout | 300s | 30–3,600s | Task timeout in seconds |
| Context window strategy | preserve | compact, preserve, summarize | How older messages are handled |
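The documented ranges can be captured in a small validation sketch. The dataclass and field names are assumptions for illustration, not Snippbot's settings schema.

```python
# Sketch: agent settings with the documented defaults and range checks.
from dataclasses import dataclass

@dataclass
class AgentSettings:
    personality: str = "balanced"
    verbosity: str = "normal"
    auto_execute: bool = False
    approval_threshold: int = 50       # 0-100
    max_tokens_per_task: int = 4_096   # 1,024-131,072
    max_retries: int = 3               # 0-10
    timeout_s: int = 300               # 30-3,600

    def validate(self) -> None:
        assert self.personality in {"precise", "balanced", "creative", "minimal"}
        assert self.verbosity in {"minimal", "normal", "verbose", "debug"}
        assert 0 <= self.approval_threshold <= 100
        assert 1_024 <= self.max_tokens_per_task <= 131_072
        assert 0 <= self.max_retries <= 10
        assert 30 <= self.timeout_s <= 3_600
```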
Retries and failure handling
Failed tasks are automatically retried based on how the failure is classified:
Failure classes
| Class | Description | Retry behavior |
|---|---|---|
| transient | Temporary error (rate limit, network timeout) | Retry with exponential backoff |
| recoverable | Logic error the agent might fix on retry | Retry with error context added to prompt |
| terminal | Unrecoverable (bad credentials, invalid input) | Mark failed immediately, no retries |
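Combining these classes with the documented backoff schedule (immediate first attempt, then 30 s doubling per attempt), a retry policy might look like this sketch; all names are illustrative.

```python
# Sketch: map failure class + attempt number to a retry decision.
def backoff_delay(attempt: int) -> int:
    """Seconds to wait before the given attempt (attempt 1 is immediate)."""
    if attempt <= 1:
        return 0
    return 30 * 2 ** (attempt - 2)  # 30s, 60s, 120s, ...

def retry_plan(failure_class: str, attempt: int, max_retries: int = 3):
    if failure_class == "terminal" or attempt > max_retries:
        return ("fail", 0)
    if failure_class == "recoverable":
        # Retry, but add the error context to the next prompt.
        return ("retry_with_error_context", backoff_delay(attempt + 1))
    return ("retry", backoff_delay(attempt + 1))  # transient
```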
Exponential backoff
When retries are enabled (default: 3 attempts), the delay doubles each attempt:
```
Attempt 1: immediate
Attempt 2: 30s delay
Attempt 3: 60s delay
Attempt 4: 120s delay → mark failed
```

Sub-agent execution
Snippbot supports sub-agents: specialized agents that can be spawned to handle subtasks. Sub-agents have their own lifecycle, resource budgets, and concurrency controls.
Sub-agent roles
| Role | Purpose |
|---|---|
| researcher | Information gathering and analysis |
| coder | Code writing and implementation |
| reviewer | Code review and quality assessment |
| tester | Test creation and execution |
| analyst | Data analysis |
| creative | Creative writing and ideation |
| general | General-purpose tasks |
Sub-agent lifecycle
Sub-agents move through these states:
```
pending → awaiting_approval → queued → running ──→ completed
                                         │
                                         ├── failed
                                         ├── cancelled
                                         └── timed_out
```

Concurrency control
Sub-agent execution is governed by concurrency limits:
- Global maximum: 8 concurrent sub-agents across all parents
- Per-parent maximum: 5 concurrent sub-agents per parent agent
- Priority queuing: When limits are reached, sub-agents are queued with priority-based ordering (1 = highest, 10 = lowest)
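A sketch of priority-based queuing under these limits, using a heap. The `SubAgentQueue` class is an illustration, not Snippbot's scheduler.

```python
# Sketch: queue sub-agents with priority ordering (1 = highest), then
# start only those that fit the documented concurrency limits.
import heapq
from collections import Counter

GLOBAL_MAX = 8       # concurrent sub-agents across all parents
PER_PARENT_MAX = 5   # concurrent sub-agents per parent agent

class SubAgentQueue:
    def __init__(self):
        self._heap = []            # (priority, seq, parent_id, agent_id)
        self._seq = 0              # tie-breaker: FIFO within a priority
        self.running = Counter()   # parent_id -> running count

    def enqueue(self, parent_id, agent_id, priority=5):
        heapq.heappush(self._heap, (priority, self._seq, parent_id, agent_id))
        self._seq += 1

    def try_start(self):
        """Start the highest-priority queued sub-agent whose limits allow."""
        deferred, started = [], None
        while self._heap:
            prio, seq, parent, agent = heapq.heappop(self._heap)
            if sum(self.running.values()) < GLOBAL_MAX and self.running[parent] < PER_PARENT_MAX:
                self.running[parent] += 1
                started = agent
                break
            deferred.append((prio, seq, parent, agent))
        for item in deferred:      # re-queue anything we skipped
            heapq.heappush(self._heap, item)
        return started
```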
Resource limits
Each sub-agent has configurable resource constraints:
| Limit | Default | Range |
|---|---|---|
| Max turns | 20 | 1–100 |
| Max tokens | 100,000 | 1,000–1,000,000 |
| Timeout | 60 minutes | 1–240 minutes |
| Priority | 5 | 1–10 |
Team orchestration
For complex development tasks, Snippbot provides team orchestration: an autonomous multi-agent loop that follows the Architect, Executor, and Reviewer pattern:
- Architect (read-only): Analyzes the task, creates a plan, and identifies requirements
- Executor (full access): Implements the plan with full tool access (file writes, code execution)
- Reviewer (read-only): Reviews the output and issues a verdict: APPROVE, REQUEST_CHANGES, or BLOCK
If the reviewer requests changes, the loop iterates (up to 3 iterations by default). Each phase has its own model, turn limit, and tool access controls.
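The three-phase loop can be sketched as follows, with the documented default of 3 iterations. The phase callables stand in for real agents; all names are illustrative.

```python
# Sketch: Architect → Executor → Reviewer loop with verdict handling.
MAX_ITERATIONS = 3  # documented default

def run_team(architect, executor, reviewer, task):
    plan = architect(task)                    # read-only: analyze and plan
    feedback = None
    for _ in range(MAX_ITERATIONS):
        output = executor(plan, feedback)     # full tool access
        verdict, feedback = reviewer(output)  # read-only review
        if verdict == "APPROVE":
            return output
        if verdict == "BLOCK":
            raise RuntimeError(f"blocked: {feedback}")
        # REQUEST_CHANGES: iterate with the reviewer's feedback
    raise RuntimeError("max iterations reached without approval")
```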
Team configuration defaults
| Setting | Default |
|---|---|
| Max iterations | 3 |
| Architect max turns | 15 |
| Executor max turns | 30 |
| Reviewer max turns | 15 |
| Timeout | 30 minutes |
| Total token budget | 500,000 |
Execution events
Snippbot emits events throughout the execution lifecycle that you can observe in the UI:
- Execution lifecycle: execution.started, execution.paused, execution.resumed, execution.completed, execution.failed
- Task lifecycle: task.queued, task.started, task.completed, task.failed, task.retrying
- Sub-agent lifecycle: subagent.spawned, subagent.started, subagent.completed, subagent.failed
- Team orchestration: team.run.started, team.phase.started, team.phase.completed, team.review.decision, team.run.completed
- Context window: context.window.applied (emitted when older messages are summarized)
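A sketch of how a consumer might subscribe to these dot-separated event names by prefix. The `EventBus` class is an illustration, not Snippbot's actual event API.

```python
# Sketch: prefix-based subscription over the documented event names,
# e.g. subscribing to "task." matches task.started, task.failed, ...
from collections import defaultdict

class EventBus:
    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, prefix, handler):
        self._handlers[prefix].append(handler)

    def emit(self, name, payload=None):
        for prefix, handlers in self._handlers.items():
            if name.startswith(prefix):
                for handler in handlers:
                    handler(name, payload)
```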
Monitoring execution
The Snippbot UI provides several views for monitoring execution:
- Chat panel: Shows streaming responses, tool call results, and error messages in real time
- Activity panel: Displays execution events, tool outcomes, and sub-agent status
- Projects page: Shows task-level status for project workflows, including retry counts and failure details
Related
- Projects & Tasks: project hierarchy and status model
- Agents: agent tool access and workspace
- Memory: how memory and context interact
- Monitor Usage: observability and alerting