Execution

When you send a message in the Snippbot UI, it flows through a multi-stage execution pipeline:

User message in chat UI
Context built (project, memory, user preferences)
Context window checked (summarize older messages if over budget)
Message sent to LLM (streaming response)
├──→ Text response → stream to UI
└──→ Tool calls → execute tools → feed results back to LLM
└── Loop up to 10 turns until done

The core execution engine is the streaming chat loop. When you send a message:

  1. The conversation history is loaded and any file attachments are processed (text extraction, OCR, or vision analysis for images)
  2. The Context Builder assembles additional context: project info, user preferences, and memory recall results
  3. The Context Window Manager checks if the conversation exceeds the token budget --- if so, older messages are summarized into a compact preamble
  4. The message is streamed to the LLM provider
  5. If the agent is in agentic mode (tool use enabled), tool calls are executed locally and their results are fed back to the LLM for up to 10 turns per message
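Stages 1 through 3 of this pipeline can be sketched as follows. Every function and field name here is invented for illustration and is not Snippbot's actual API; the budget and message-retention constants come from the text below.

```python
# Hypothetical sketch of pipeline stages 1-3; all names are invented.

def process_attachments(attachments):
    # Stage 1: reduce each attachment to text (OCR/vision analysis elided).
    return [f"[attachment: {name}]" for name in attachments]

def build_context(project, preferences, memories):
    # Stage 2: assemble additional context for the prompt.
    return {"project": project, "preferences": preferences, "memories": memories}

def within_budget(messages, budget=150_000, chars_per_token=4):
    # Stage 3: rough token estimate at ~4 characters per token.
    return sum(len(m) // chars_per_token for m in messages) <= budget

def prepare_message(history, attachments, project):
    messages = history + process_attachments(attachments)
    if not within_budget(messages):
        # Over budget: summarize older messages, keep the most recent four.
        messages = ["[summary of older messages]"] + messages[-4:]
    context = build_context(project, {"personality": "balanced"}, [])
    return messages, context
```

Stages 4 and 5 (streaming and the tool loop) are covered in the agentic execution section below.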

The chat UI supports several context modes that change how the agent behaves:

Mode            Purpose
default         General-purpose conversation
brainstorm      Requirements discussion; preserves confirmed requirements and decisions
output-refine   Iterative refinement; tracks changes requested and applied
browser         Web automation with Playwright
game            Tabletop RPG narrative with specialized story summarization

Each mode uses a tailored summarization prompt when the context window is compacted, so the agent retains the most relevant information for that mode.
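One plausible shape for this per-mode prompt selection is a simple lookup with a fallback; the prompt texts below are invented paraphrases of each mode's stated focus, not Snippbot's actual prompts.

```python
# Hypothetical per-mode summarization prompts (texts are illustrative).

SUMMARIZATION_PROMPTS = {
    "default": "Summarize the conversation so far.",
    "brainstorm": "Summarize, preserving every confirmed requirement and decision.",
    "output-refine": "Summarize, tracking each change requested and applied.",
    "browser": "Summarize, keeping current page state and pending automation steps.",
    "game": "Summarize the story so far, keeping characters and plot threads.",
}

def summarization_prompt(mode):
    # Unknown modes fall back to the general-purpose prompt.
    return SUMMARIZATION_PROMPTS.get(mode, SUMMARIZATION_PROMPTS["default"])
```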

Snippbot automatically manages the conversation context window to stay within model token limits. This is handled by the Context Window Manager (part of the Working Memory tier in the cognitive memory architecture).

  1. Token estimation: Messages are estimated at roughly 4 characters per token. Images count as approximately 1,000 tokens each.
  2. Budget calculation: The effective budget is determined by the model’s context limit minus the system prompt tokens and a safety margin of 20,000 tokens. The default budget is 150,000 tokens.
  3. Split point: When the conversation exceeds the budget, the manager walks backward from the most recent messages, always keeping at least the last 4 messages (2 full turns).
  4. Summarization: Older messages are summarized by Claude Haiku for speed and cost. If the LLM summary fails, a heuristic fallback truncates each message to 150 characters.
  5. Reassembly: The summary is injected as a system message, followed by the recent messages in full.
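A rough sketch of this compaction logic, with the constants taken from the steps above; the function names and message shapes are invented for illustration, and the 30,000 system-prompt tokens is an assumed figure chosen to yield the 150,000 default budget.

```python
# Illustrative compaction sketch; constants from the text, names invented.

SAFETY_MARGIN = 20_000   # tokens reserved below the model's context limit
IMAGE_TOKENS = 1_000     # flat estimate per image
KEEP_LAST = 4            # always keep the last 4 messages (2 full turns)

def estimate_tokens(message):
    # Step 1: ~4 characters per token, plus a flat cost per image.
    return len(message.get("text", "")) // 4 + IMAGE_TOKENS * message.get("images", 0)

def effective_budget(context_limit=200_000, system_tokens=30_000):
    # Step 2: budget = context limit - system prompt tokens - safety margin.
    # (system_tokens here is an assumption that reproduces the 150,000 default.)
    return context_limit - system_tokens - SAFETY_MARGIN

def compact(messages, budget=150_000):
    # Steps 3-5: find the split point, summarize older messages, reassemble.
    if sum(estimate_tokens(m) for m in messages) <= budget:
        return messages
    kept, kept_tokens = [], 0
    for m in reversed(messages):          # walk backward from the newest
        t = estimate_tokens(m)
        if len(kept) >= KEEP_LAST and kept_tokens + t > budget:
            break
        kept.append(m)
        kept_tokens += t
    kept.reverse()
    older = messages[: len(messages) - len(kept)]
    if not older:
        return messages
    # Heuristic fallback summary: truncate each older message to 150 chars
    # (the real path asks Claude Haiku for a summary first).
    summary = " / ".join(m.get("text", "")[:150] for m in older)
    return [{"role": "system", "text": f"[Summary of earlier conversation] {summary}"}] + kept
```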
Model               Context limit
Claude Sonnet 4.5   200,000 tokens
Claude Sonnet 4     200,000 tokens
Claude Haiku 4.5    200,000 tokens
Claude Opus 4       200,000 tokens
Gemini 2.0 Flash    1,000,000 tokens
Gemini 2.5 Pro      1,000,000 tokens

You can control context window behavior from the Agent Settings panel in the UI:

Strategy             Behavior
preserve (default)   Keep as much conversation history as possible; summarize only when necessary
compact              Aggressively summarize to minimize token usage
summarize            Always summarize older messages to maintain a compact context

When agentic mode is enabled in the chat UI, the agent can use tools (file operations, code execution, web browsing, etc.) to complete tasks. Each message can trigger a multi-turn loop:

  1. The LLM responds with one or more tool calls
  2. Each tool is executed locally by the Tool Executor, which runs within the project’s working directory
  3. Tool results are collected and sent back to the LLM as the next turn
  4. The loop continues until the LLM produces a final text response (no more tool calls) or the 10-turn safety limit is reached
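The loop above can be sketched minimally as follows. The 10-turn limit comes from the text; the `llm` and `execute_tool` interfaces are invented for illustration, not Snippbot's actual API.

```python
# Minimal sketch of the agentic tool loop (interfaces are hypothetical).

MAX_TURNS = 10

def run_agentic(llm, execute_tool, prompt):
    messages = [{"role": "user", "content": prompt}]
    for _ in range(MAX_TURNS):
        reply = llm(messages)  # returns {"text": str, "tool_calls": [...]}
        messages.append({"role": "assistant", "content": reply["text"]})
        if not reply["tool_calls"]:
            return reply["text"]  # final text response: the loop is done
        for call in reply["tool_calls"]:
            # Tools run locally, inside the project's working directory.
            result = execute_tool(call)
            messages.append({"role": "tool", "content": result})
    return messages[-1]["content"]  # 10-turn safety limit reached
```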

Execution behavior is configurable from the Agent Settings page in the Snippbot UI. These settings are stored per user in the local database.

Setting                   Default    Range                                  Description
Personality               balanced   precise, balanced, creative, minimal   Agent response style
Verbosity                 normal     minimal, normal, verbose, debug        How much detail in responses
Auto-execute              false      on/off                                 Whether tasks run without manual approval
Approval threshold        50         0-100                                  Risk threshold above which approval is required
Max tokens per task       4,096      1,024-131,072                          Token limit for a single task
Max retries               3          0-10                                   Maximum retry attempts on failure
Timeout                   300s       30-3,600s                              Task timeout in seconds
Context window strategy   preserve   compact, preserve, summarize           How older messages are handled
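As one example of what the stored per-user record might look like, here are the defaults from the table as a dictionary; the field names are assumptions, not Snippbot's actual schema.

```python
# Hypothetical per-user settings record (field names assumed, defaults from the table).

DEFAULT_AGENT_SETTINGS = {
    "personality": "balanced",              # precise | balanced | creative | minimal
    "verbosity": "normal",                  # minimal | normal | verbose | debug
    "auto_execute": False,                  # tasks require manual approval by default
    "approval_threshold": 50,               # 0-100
    "max_tokens_per_task": 4096,            # 1,024-131,072
    "max_retries": 3,                       # 0-10
    "timeout_seconds": 300,                 # 30-3,600
    "context_window_strategy": "preserve",  # compact | preserve | summarize
}
```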

Failed tasks are automatically retried based on how the failure is classified:

Class         Description                                      Retry behavior
transient     Temporary error (rate limit, network timeout)    Retry with exponential backoff
recoverable   Logic error the agent might fix on retry         Retry with error context added to prompt
terminal      Unrecoverable (bad credentials, invalid input)   Mark failed immediately, no retries

When retries are enabled (default: 3 retries), the delay doubles with each subsequent attempt:

Attempt 1: immediate
Attempt 2: 30s delay
Attempt 3: 60s delay
Attempt 4: 120s delay → mark failed
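The schedule above can be expressed directly: a 30-second base delay that doubles per retry, with terminal failures excluded. The constants come from the text; the function names are illustrative.

```python
# Sketch of the retry schedule (base 30s, doubling; names are illustrative).

BASE_DELAY = 30  # seconds
MAX_RETRIES = 3  # default

def retry_delay(attempt):
    # Attempt 1 runs immediately; each retry doubles the previous delay.
    return 0 if attempt == 1 else BASE_DELAY * 2 ** (attempt - 2)

def should_retry(failure_class, attempt):
    # Terminal failures are marked failed immediately, with no retries.
    if failure_class == "terminal":
        return False
    return attempt <= MAX_RETRIES
```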

Snippbot supports sub-agents --- specialized agents that can be spawned to handle subtasks. Sub-agents have their own lifecycle, resource budgets, and concurrency controls.

Role         Purpose
researcher   Information gathering and analysis
coder        Code writing and implementation
reviewer     Code review and quality assessment
tester       Test creation and execution
analyst      Data analysis
creative     Creative writing and ideation
general      General-purpose tasks

Sub-agents move through these states:

pending → awaiting_approval → queued → running → completed
├── failed
├── cancelled
└── timed_out
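The lifecycle diagram can be written as a transition table. The diagram does not say exactly which states can branch to failed, cancelled, or timed_out; this sketch assumes they all leave from running.

```python
# The sub-agent lifecycle as a transition table (branch points assumed).

TRANSITIONS = {
    "pending": {"awaiting_approval"},
    "awaiting_approval": {"queued"},
    "queued": {"running"},
    "running": {"completed", "failed", "cancelled", "timed_out"},
}

def advance(state, new_state):
    # Reject any transition the diagram does not allow.
    if new_state not in TRANSITIONS.get(state, set()):
        raise ValueError(f"illegal transition: {state} -> {new_state}")
    return new_state
```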

Sub-agent execution is governed by concurrency limits:

  • Global maximum: 8 concurrent sub-agents across all parents
  • Per-parent maximum: 5 concurrent sub-agents per parent agent
  • Priority queuing: When limits are reached, sub-agents are queued with priority-based ordering (1 = highest, 10 = lowest)
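The admission check implied by these limits can be sketched as follows; the limits come from the text, while the data shapes and function name are invented.

```python
# Sketch of sub-agent admission under the concurrency limits above.
import heapq

GLOBAL_MAX = 8      # concurrent sub-agents across all parents
PER_PARENT_MAX = 5  # concurrent sub-agents per parent

def admit(running, queue, parent, priority):
    """Start the sub-agent now if limits allow, otherwise queue it by priority."""
    per_parent = sum(1 for p in running if p == parent)
    if len(running) < GLOBAL_MAX and per_parent < PER_PARENT_MAX:
        running.append(parent)
        return "running"
    heapq.heappush(queue, (priority, parent))  # 1 = highest, 10 = lowest
    return "queued"
```

Using a min-heap keyed on priority means the lowest number (highest priority) dequeues first when capacity frees up.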

Each sub-agent has configurable resource constraints:

Limit        Default      Range
Max turns    20           1-100
Max tokens   100,000      1,000-1,000,000
Timeout      60 minutes   1-240 minutes
Priority     5            1-10

For complex development tasks, Snippbot provides team orchestration --- an autonomous multi-agent loop that follows the Architect, Executor, and Reviewer pattern:

  1. Architect (read-only): Analyzes the task, creates a plan, and identifies requirements
  2. Executor (full access): Implements the plan with full tool access (file writes, code execution)
  3. Reviewer (read-only): Reviews the output and issues a verdict: APPROVE, REQUEST_CHANGES, or BLOCK

If the reviewer requests changes, the loop iterates (up to 3 iterations by default). Each phase has its own model, turn limit, and tool access controls.
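The iterate-until-approved loop can be sketched as follows; the three phase callables and their signatures are invented for illustration, while the iteration limit and verdicts come from the text.

```python
# Sketch of the Architect -> Executor -> Reviewer loop (interfaces invented).

MAX_ITERATIONS = 3

def run_team(architect, executor, reviewer, task):
    plan = architect(task)                    # read-only phase
    output = None
    for _ in range(MAX_ITERATIONS):
        output = executor(plan)               # full tool access
        verdict, feedback = reviewer(output)  # read-only phase
        if verdict == "APPROVE":
            return output
        if verdict == "BLOCK":
            raise RuntimeError(f"run blocked by reviewer: {feedback}")
        # REQUEST_CHANGES: fold the feedback into the plan and iterate.
        plan = f"{plan}\nRequested changes: {feedback}"
    return output  # iteration limit reached
```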

Setting               Default
Max iterations        3
Architect max turns   15
Executor max turns    30
Reviewer max turns    15
Timeout               30 minutes
Total token budget    500,000

Snippbot emits events throughout the execution lifecycle that you can observe in the UI:

  • Execution lifecycle: execution.started, execution.paused, execution.resumed, execution.completed, execution.failed
  • Task lifecycle: task.queued, task.started, task.completed, task.failed, task.retrying
  • Sub-agent lifecycle: subagent.spawned, subagent.started, subagent.completed, subagent.failed
  • Team orchestration: team.run.started, team.phase.started, team.phase.completed, team.review.decision, team.run.completed
  • Context window: context.window.applied (emitted when older messages are summarized)
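A minimal observer sketch for subscribing to events with these names; the `on`/`emit` API here is invented, not Snippbot's actual interface.

```python
# Tiny event-observer sketch (API is hypothetical).

_handlers = {}

def on(event, handler):
    # Register a handler for one event name, e.g. "execution.started".
    _handlers.setdefault(event, []).append(handler)

def emit(event, payload=None):
    # Deliver the payload to every handler registered for this event.
    for handler in _handlers.get(event, []):
        handler(payload)
```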

The Snippbot UI provides several views for monitoring execution:

  • Chat panel: Shows streaming responses, tool call results, and error messages in real time
  • Activity panel: Displays execution events, tool outcomes, and sub-agent status
  • Projects page: Shows task-level status for project workflows, including retry counts and failure details