Automate Browser

Snippbot’s browser tool gives agents full control over a Chromium instance via the Chrome DevTools Protocol (CDP). Agents can navigate, click, type, extract data, take screenshots, and record sessions for replay.

Figure: Snippbot browser automation panel showing the CDP connection and live viewport.

Browser backends

Choose your browser backend in Settings → Browser:

Backend	Description
Managed (default)	Headless Chromium launched and managed by Snippbot
Chrome	Use your already-installed Chrome browser
CDP	Connect to an already-running Chrome instance through its remote debugging port
Browserless	Connect to a browserless.io cloud instance
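For the CDP backend, Chrome has to be running with remote debugging enabled (for example, started with --remote-debugging-port=9222). Chrome then serves a small HTTP endpoint at /json/version whose JSON response includes a webSocketDebuggerUrl field. The sketch below shows how a client could read that URL. The port 9222 and the helper names are illustrative assumptions, not Snippbot settings or APIs.

```python
import json
from urllib.request import urlopen


def version_url(host: str = "127.0.0.1", port: int = 9222) -> str:
    """Build the URL of Chrome's version endpoint for a debugging port."""
    return f"http://{host}:{port}/json/version"


def extract_ws_url(payload: str) -> str:
    """Pull the browser-level WebSocket URL out of a /json/version response."""
    data = json.loads(payload)
    return data["webSocketDebuggerUrl"]


def discover_ws_url(host: str = "127.0.0.1", port: int = 9222) -> str:
    """Query a running Chrome and return its WebSocket debugger URL.

    Requires a Chrome instance that is actually listening on host:port.
    """
    with urlopen(version_url(host, port), timeout=5) as response:
        return extract_ws_url(response.read().decode("utf-8"))
```

If discover_ws_url() raises a connection error, Chrome is most likely not running with the debugging port open.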

Asking the agent to automate

In the chat interface, simply describe what you want:

Example prompt: “Navigate to github.com/trending and extract the top 10 repos as JSON”

Example prompt: “Go to my company’s Jira board and create a ticket: Title: Fix login timeout, Priority: High, Assignee: me”

Example prompt: “Search for ‘python async tutorial’ on YouTube and screenshot the first 5 results”

The agent uses browser tools internally — you see the results in the chat.

Available browser actions

The full set of browser capabilities, grouped by purpose:

Navigation

navigate — Go to a URL
go_back / go_forward — Browser history
reload — Reload current page

Interaction

click — Click an element
double_click — Double-click
right_click — Right-click (context menu)
type_text — Type into a field
press_key — Press keyboard keys (Enter, Tab, Escape, etc.)
hover — Hover over an element
drag_and_drop — Drag from one element to another

Forms

select_option — Select from a dropdown
check / uncheck — Toggle checkboxes

Scrolling

scroll — Scroll by pixels
scroll_to_element — Scroll to bring an element into view

Waiting

wait_for_selector — Wait for an element to appear
wait_for_navigation — Wait for page navigation to complete
wait_for_network_idle — Wait until there’s no network activity

Capture

screenshot — Screenshot of the viewport
screenshot_element — Screenshot of a specific element
screenshot_full_page — Full-page screenshot
pdf_export — Export page as PDF

Data extraction

get_text — Extract visible text
get_attribute — Get element attribute value
get_bounding_box — Get element position and size
is_visible / is_enabled — Check element state
element_count — Count matching elements

Dialogs

handle_dialog — Accept/dismiss JavaScript dialogs (auto-dismissed after 30s)
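
To make the action list concrete, here is a rough sketch of how a sequence of actions could be written down as data and checked against the names above. The action names come from this page; the plan format, the parameter names, and the selector are hypothetical and do not describe Snippbot's internal representation.

```python
# Action names copied from the lists above. The grouping dict and the
# plan format are hypothetical, for illustration only.
ACTIONS = {
    "navigation": {"navigate", "go_back", "go_forward", "reload"},
    "interaction": {"click", "double_click", "right_click", "type_text",
                    "press_key", "hover", "drag_and_drop"},
    "forms": {"select_option", "check", "uncheck"},
    "scrolling": {"scroll", "scroll_to_element"},
    "waiting": {"wait_for_selector", "wait_for_navigation",
                "wait_for_network_idle"},
    "capture": {"screenshot", "screenshot_element",
                "screenshot_full_page", "pdf_export"},
    "extraction": {"get_text", "get_attribute", "get_bounding_box",
                   "is_visible", "is_enabled", "element_count"},
    "dialogs": {"handle_dialog"},
}
KNOWN_ACTIONS = set().union(*ACTIONS.values())


def unknown_actions(plan):
    """Return the action names in a plan that are not in the documented set."""
    return [step["action"] for step in plan
            if step["action"] not in KNOWN_ACTIONS]


# A plan the agent might derive from the "trending repos" prompt.
# The "article" selector is a guess, not taken from the real page.
plan = [
    {"action": "navigate", "url": "https://github.com/trending"},
    {"action": "wait_for_selector", "selector": "article"},
    {"action": "get_text", "selector": "article"},
]
```

In normal use you never write such a plan yourself; the agent chooses the actions from your natural-language goal.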

DOM snapshots

Instead of CSS selectors, the agent can work with numbered element references taken from a DOM snapshot. This makes browser interaction more readable and less fragile.

The snapshot lists each interactive element with a numbered reference, its label, element type, and visibility state. For example, the agent might see a sign-in button as element 1, a username field as element 2, and a password field as element 3.

The agent can then issue an instruction such as “type ‘user@example.com’ into element 2”, and the DOM snapshot engine resolves the reference to the correct element.
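
The idea of numbered references can be sketched as a small lookup table. This is a minimal illustration under assumptions: the SnapshotElement fields mirror the description above (reference, label, type, visibility), while the selector field and the resolve function are hypothetical stand-ins for whatever the snapshot engine does internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnapshotElement:
    ref: int        # numbered reference shown to the agent
    label: str      # human-readable label
    kind: str       # element type, e.g. "button" or "input"
    visible: bool   # visibility state at snapshot time
    selector: str   # hypothetical: how the engine might locate the node


def resolve(snapshot, ref):
    """Map a numbered reference back to its element, refusing hidden ones."""
    for element in snapshot:
        if element.ref == ref:
            if not element.visible:
                raise ValueError(f"element {ref} is not visible")
            return element
    raise KeyError(f"no element with reference {ref}")


snapshot = [
    SnapshotElement(1, "Sign in", "button", True, "button[type=submit]"),
    SnapshotElement(2, "Username", "input", True, "#username"),
    SnapshotElement(3, "Password", "input", True, "#password"),
]

# "type 'user@example.com' into element 2" resolves to the username field.
target = resolve(snapshot, 2)
```

Because the agent only sees small integers, a change to the page's class names or IDs does not affect how it refers to elements; only the next snapshot changes.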

Recording and replay

Record a browser session and replay it later:

Start recording — open the Browser page (/browser) and click Record in the toolbar
Perform actions manually or let the agent control the browser
Stop recording — the session is saved as a JSON file in the agent workspace
Replay — open the saved session from the Browser page and click Replay in the toolbar

Recordings redact sensitive parameters (passwords, tokens) automatically.
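
As an illustration of what redaction in a saved session could look like, the sketch below masks values whose parameter name looks sensitive. The session layout, the key pattern, and the "[REDACTED]" marker are all assumptions for this example; the actual recording format is not specified here.

```python
import re

# Hypothetical pattern for parameter names that should never be stored.
SENSITIVE_KEY = re.compile(r"password|passwd|secret|token|api[_-]?key",
                           re.IGNORECASE)


def redact(value, key=""):
    """Return a copy of value with sensitive parameters masked."""
    if isinstance(value, dict):
        return {k: redact(v, k) for k, v in value.items()}
    if isinstance(value, list):
        return [redact(item, key) for item in value]
    if key and SENSITIVE_KEY.search(key):
        return "[REDACTED]"
    return value


# Hypothetical recording with one harmless and one sensitive step.
session = {
    "steps": [
        {"action": "type_text",
         "params": {"selector": "#user", "text": "user@example.com"}},
        {"action": "type_text",
         "params": {"selector": "#pass", "password": "hunter2"}},
    ]
}

safe_session = redact(session)
```

The function builds a new structure rather than editing in place, so the live session data is untouched while the saved copy is masked.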

Device emulation

Test responsive designs and mobile behavior:

Example prompt: “Emulate an iPhone 14 Pro Max and take a screenshot of twitter.com”

Supported presets:

iPhone 14, iPhone 14 Pro Max, Pixel 7, Galaxy S23
iPad Pro 11
Desktop 1920x1080, Desktop 1280x720

Or ask the agent for a custom viewport:

Example prompt: “Emulate a 375x812 viewport at 3x scale with a mobile user agent and take a screenshot”
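
At the protocol level, a custom viewport like this maps onto two CDP commands: Emulation.setDeviceMetricsOverride and Emulation.setUserAgentOverride. The sketch below only builds the JSON messages; how Snippbot sends them is not shown, and the helper names and the shortened user agent string are illustrative assumptions.

```python
import json


def emulation_commands(width, height, scale, mobile, user_agent, first_id=1):
    """Build the CDP messages for a custom viewport and user agent."""
    return [
        {
            "id": first_id,
            "method": "Emulation.setDeviceMetricsOverride",
            "params": {
                "width": width,
                "height": height,
                "deviceScaleFactor": scale,
                "mobile": mobile,
            },
        },
        {
            "id": first_id + 1,
            "method": "Emulation.setUserAgentOverride",
            "params": {"userAgent": user_agent},
        },
    ]


# 375x812 at 3x scale with a mobile user agent (string shortened for the example).
commands = emulation_commands(
    width=375,
    height=812,
    scale=3,
    mobile=True,
    user_agent="Mozilla/5.0 (iPhone; CPU iPhone OS 16_0 like Mac OS X)",
)
wire_messages = [json.dumps(command) for command in commands]
```

A device preset can be thought of as a named bundle of these same values.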

Network interception

Capture and mock network requests:

Example prompt: “Capture all API calls made by the page, then mock the /api/users endpoint to return test data”

The NetworkManager supports:

Request/response capture (HAR format)
Response mocking with custom status and body
Header injection
Request blocking
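
Response mocking over CDP is typically done with the Fetch domain: Fetch.enable pauses matching requests, and Fetch.fulfillRequest answers a paused request with a custom status, headers, and a base64-encoded body. The sketch below builds those two messages for the /api/users example. It is not the NetworkManager API; the helper names, the URL pattern, and the request ID are assumptions.

```python
import base64
import json


def enable_interception(message_id, url_pattern):
    """Build a Fetch.enable message that pauses requests matching a pattern."""
    return {
        "id": message_id,
        "method": "Fetch.enable",
        "params": {"patterns": [{"urlPattern": url_pattern}]},
    }


def fulfill_with_json(message_id, request_id, status, body):
    """Build a Fetch.fulfillRequest message that returns a mocked JSON body."""
    payload = json.dumps(body).encode("utf-8")
    return {
        "id": message_id,
        "method": "Fetch.fulfillRequest",
        "params": {
            "requestId": request_id,
            "responseCode": status,
            "responseHeaders": [
                {"name": "Content-Type", "value": "application/json"},
            ],
            # CDP expects the body as a base64 string when sent over JSON.
            "body": base64.b64encode(payload).decode("ascii"),
        },
    }


enable = enable_interception(1, "*/api/users*")
# "interception-1" stands in for the requestId from a Fetch.requestPaused event.
mock = fulfill_with_json(2, "interception-1", 200, [{"id": 1, "name": "Test User"}])
```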

Authentication persistence

The agent can save and restore login state across sessions:

Example prompt: “Log into GitHub with my credentials, then save the session for reuse”

Auth state (cookies, localStorage, sessionStorage) is encrypted at rest using Fernet. On the next session, the agent restores the state to skip the login step.

SSRF protection

The browser tool includes an SSRF guard that blocks:

Private and loopback IPv4 ranges (10.x, 172.16–31.x, 192.168.x, 127.x)
IPv6 loopback and link-local
Cloud metadata endpoints (169.254.169.254 for AWS, GCP, Alibaba)
Internal hostnames (.local, .internal)
Dangerous URL schemes (file://, ftp://, javascript:, gopher://)
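
The checks listed above can be approximated with Python's standard library. This is a simplified sketch, not Snippbot's guard: it uses a scheme allowlist (http and https) in place of a list of dangerous schemes, and it only inspects literal IP addresses in the URL without resolving hostnames.

```python
import ipaddress
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}          # allowlist instead of a blocklist
BLOCKED_SUFFIXES = (".local", ".internal")   # internal hostnames


def is_url_allowed(url: str) -> bool:
    """Return False for URLs that point at internal or unsafe targets."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed URL, e.g. a broken IPv6 literal
    if parts.scheme not in ALLOWED_SCHEMES:
        return False
    host = (parts.hostname or "").lower()
    if not host or host.endswith(BLOCKED_SUFFIXES):
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return True  # a hostname rather than a literal IP address
    # Covers private ranges, loopback, and link-local (incl. 169.254.169.254).
    return not (address.is_private or address.is_loopback
                or address.is_link_local)
```

A production guard would also need to check the addresses a hostname resolves to, and re-check after redirects, because a public-looking name can point at an internal IP.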

Rate limits

60 browser actions per 60 seconds (sliding window)
Actions that exceed the limit are queued with exponential backoff
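
A sliding-window limit with exponential backoff can be sketched as follows. The 60-per-60-seconds figures come from the text above; the class, the backoff base of 0.5 seconds, and the 30-second cap are assumptions chosen for illustration.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_actions within any window_seconds-long window."""

    def __init__(self, max_actions=60, window_seconds=60.0):
        self.max_actions = max_actions
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def try_acquire(self, now: float) -> bool:
        """Record an action at time `now` if the window has room."""
        cutoff = now - self.window_seconds
        # Drop timestamps that have slid out of the window.
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        if len(self._timestamps) < self.max_actions:
            self._timestamps.append(now)
            return True
        return False


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Delay before retry number `attempt` (0-based), doubling up to a cap."""
    return min(cap, base * (2 ** attempt))
```

A caller would pass time.monotonic() as now and, when try_acquire returns False, sleep for backoff_delay(attempt) before trying the queued action again.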

Viewing the live browser

Open the Browser page (/browser) in the Snippbot UI to see a live stream of what the agent is doing. The viewport updates at up to 2 frames/second by default.

Auto-snapshot during automation

When Auto-snapshot is enabled, Snippbot takes a screenshot at each step. This is invaluable for:

Debugging automation failures
Creating audit trails
Building replay recordings

Screenshots are saved to the agent’s workspace.

Tips

Use natural language goals, not step-by-step instructions. The agent handles the specifics.
Enable auto-snapshot for complex workflows so you can see exactly where things went wrong.
Use cursor sync in the Browser UI to track the agent’s mouse position.
Start with managed Chromium. Switch to CDP only if you need an existing browser profile or its extensions.