Browser API

Base path: /api/browser

All endpoints require authentication. See API Overview for auth details.

GET /api/browser/status

Response:

{
"running": true,
"backend": "managed",
"current_url": "https://github.com",
"tabs": 2,
"uptime_seconds": 3600
}
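As an illustration of consuming the status response, the following sketch builds the request and condenses the JSON into one line. The base URL and the bearer-token header are placeholders assumed for this example; the actual auth scheme is the one described in the API Overview.

```python
import json
import urllib.request

# Assumptions: host/port and the bearer-token header are placeholders,
# not values taken from this reference.
BASE_URL = "http://localhost:8000"
API_TOKEN = "YOUR_TOKEN"


def build_status_request(base_url=BASE_URL, token=API_TOKEN):
    """Build (without sending) a GET request for /api/browser/status."""
    return urllib.request.Request(
        f"{base_url}/api/browser/status",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def summarize_status(payload):
    """Condense a status response into one human-readable line."""
    if not payload.get("running"):
        return "browser is not running"
    return (
        f"{payload['backend']} browser at {payload['current_url']} "
        f"({payload['tabs']} tabs, up {payload['uptime_seconds']}s)"
    )


def fetch_status():
    """Send the request and summarize the response (needs a live server)."""
    with urllib.request.urlopen(build_status_request()) as resp:
        return summarize_status(json.load(resp))
```

Keeping request construction separate from sending makes the helper easy to check without a running browser session.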
POST /api/browser/close

Closes the browser session. Equivalent to clicking Close Browser in the UI.

POST /api/browser/action

General-purpose action endpoint. All browser interactions (navigation, clicking, typing, etc.) go through this single endpoint:

{
"action": "goto",
"url": "https://github.com/trending"
}

{
"action": "click",
"selector": "#sign-in-button"
}

{
"action": "type_text",
"selector": "input[name=email]",
"text": "user@example.com"
}

{
"action": "scroll",
"x": 0,
"y": 500
}

{
"action": "press_key",
"key": "Enter"
}

Available actions: goto, click, double_click, right_click, hover, type_text, press_key, select_option, check, uncheck, scroll, scroll_to_element, drag_and_drop, focus, blur, handle_dialog, wait_for_selector, wait_for_navigation, wait_for_network_idle, go_back, go_forward, reload
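Because every interaction shares one endpoint, a small client-side helper can build the request bodies and reject misspelled action names before anything is sent. This is a minimal sketch: `action_body`, `login_flow`, and `encode` are hypothetical helper names, and only the parameter fields shown in the examples above are used.

```python
import json

# Action names copied from the "Available actions" list.
AVAILABLE_ACTIONS = frozenset({
    "goto", "click", "double_click", "right_click", "hover", "type_text",
    "press_key", "select_option", "check", "uncheck", "scroll",
    "scroll_to_element", "drag_and_drop", "focus", "blur", "handle_dialog",
    "wait_for_selector", "wait_for_navigation", "wait_for_network_idle",
    "go_back", "go_forward", "reload",
})


def action_body(action, **params):
    """Return the JSON body for POST /api/browser/action as a dict."""
    if action not in AVAILABLE_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    return {"action": action, **params}


def login_flow(email):
    """A hypothetical sequence built only from the documented example fields."""
    return [
        action_body("goto", url="https://github.com/trending"),
        action_body("click", selector="#sign-in-button"),
        action_body("type_text", selector="input[name=email]", text=email),
        action_body("press_key", key="Enter"),
    ]


def encode(body):
    """Serialize one action body to UTF-8 JSON bytes for the request."""
    return json.dumps(body).encode("utf-8")
```

Each dict returned by `login_flow` would be sent as its own POST to /api/browser/action, in order.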

GET /api/browser/screenshot

Returns the current viewport as a binary image.

POST /api/browser/screenshot

Takes a screenshot with optional format/quality options:

{"format": "jpeg", "quality": 85}

Returns a binary image response (JPEG or PNG).
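Since the response is raw image bytes, a client may want to confirm which format it received before saving the file. The sketch below builds the POST request and identifies the format from the standard PNG and JPEG file signatures. The base URL and bearer-token header are assumed placeholders, and the sketch does not rely on any response header because none is documented here.

```python
import json
import urllib.request

# Assumptions: host/port and the bearer-token header are placeholders.
BASE_URL = "http://localhost:8000"
API_TOKEN = "YOUR_TOKEN"


def build_screenshot_request(fmt="jpeg", quality=85,
                             base_url=BASE_URL, token=API_TOKEN):
    """Build (without sending) a POST request for /api/browser/screenshot."""
    body = json.dumps({"format": fmt, "quality": quality}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/browser/screenshot",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def sniff_image_format(data):
    """Identify PNG or JPEG from the signature at the start of the bytes."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    return None
```

The sniffed format can then be used as the file extension when writing the bytes to disk.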

GET /api/browser/snapshot

Returns a numbered, AI-friendly representation of interactive elements:

{
"snapshot": "[1] Sign in (button)\n[2] Email (input)\n[3] Password (input)\n[4] Remember me (checkbox)\n",
"element_count": 47,
"url": "https://example.com/login"
}
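The snapshot string can be turned into structured records on the client. This sketch assumes every line follows the `[n] label (type)` shape seen in the example above; lines that do not match are skipped rather than treated as errors, since the full format is not specified here.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    number: int  # the bracketed index, e.g. 1 for "[1]"
    label: str   # visible text, e.g. "Sign in"
    kind: str    # element type in parentheses, e.g. "button"


# Assumed line shape: "[<number>] <label> (<kind>)"
_LINE = re.compile(r"^\[(\d+)\]\s+(.*?)\s+\((\w+)\)\s*$")


def parse_snapshot(snapshot):
    """Parse the "snapshot" field into a list of Element records."""
    elements = []
    for line in snapshot.splitlines():
        match = _LINE.match(line)
        if match:
            number, label, kind = match.groups()
            elements.append(Element(int(number), label, kind))
    return elements


def find(elements, kind):
    """Return the elements of one type, e.g. all inputs."""
    return [e for e in elements if e.kind == kind]
```

For the example response, `find(parse_snapshot(text), "input")` yields the Email and Password entries.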
GET /api/browser/tabs
POST /api/browser/tabs
DELETE /api/browser/tabs/{tab_id}
PUT /api/browser/tabs/{tab_id}/activate

Manage browser tabs: list, create, close, and switch between them.

GET /api/browser/requests

Returns recent network requests.

POST /api/browser/har/start
POST /api/browser/har/stop

Start and stop HAR capture. The stop endpoint returns the captured HAR data.

POST /api/browser/mock
{
"url_pattern": "*/api/users",
"status_code": 200,
"body": {"users": []},
"headers": {"Content-Type": "application/json"}
}
DELETE /api/browser/mock
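
The POST body above can be assembled with a small helper that fills in the fields from the example. This is a sketch: `mock_rule` is a hypothetical name, the default status code and JSON content type simply mirror the example, and how the server interprets `url_pattern` is not specified in this reference.

```python
import json


def mock_rule(url_pattern, body, status_code=200, headers=None):
    """Return the JSON body for POST /api/browser/mock as a dict."""
    rule = {
        "url_pattern": url_pattern,
        "status_code": status_code,
        "body": body,
        # Default mirrors the documented example; pass headers to override.
        "headers": (
            {"Content-Type": "application/json"}
            if headers is None
            else dict(headers)
        ),
    }
    json.dumps(rule)  # fail early if body or headers are not JSON-serializable
    return rule


# Reproduces the documented example, plus a hypothetical error response.
EMPTY_USERS = mock_rule("*/api/users", {"users": []})
USERS_DOWN = mock_rule("*/api/users", {"error": "unavailable"}, status_code=503)
```
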
GET /api/browser/cookies
POST /api/browser/cookies
DELETE /api/browser/cookies

Get, set, or clear browser cookies.

GET /api/browser/auth-states
POST /api/browser/auth-states
POST /api/browser/auth-states/{name}/restore
DELETE /api/browser/auth-states/{name}

Save and restore authentication state (cookies + localStorage) for reuse across sessions.

POST /api/browser/recording/start
{"name": "login-flow-test"}
POST /api/browser/recording/stop

The stop endpoint returns:

{
"recording_id": "rec_abc123",
"duration_seconds": 45,
"action_count": 12,
"file_path": "workspace/recordings/login-flow-test.json"
}
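A client might wrap the stop response in a typed record so later calls can reuse the identifier. In this sketch, `RecordingResult` and `start_body` are hypothetical names, and it is an assumption that the `{id}` placeholder in the /api/browser/recordings routes is the `recording_id` returned here.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class RecordingResult:
    recording_id: str
    duration_seconds: float
    action_count: int
    file_path: str

    @classmethod
    def from_json(cls, text):
        """Parse the JSON returned by POST /api/browser/recording/stop."""
        data = json.loads(text)
        return cls(
            recording_id=data["recording_id"],
            duration_seconds=data["duration_seconds"],
            action_count=data["action_count"],
            file_path=data["file_path"],
        )

    def replay_path(self):
        # Assumption: {id} in the recordings routes is this recording_id.
        return f"/api/browser/recordings/{self.recording_id}/replay"


def start_body(name):
    """JSON bytes for POST /api/browser/recording/start."""
    return json.dumps({"name": name}).encode("utf-8")
```
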
GET /api/browser/recordings
POST /api/browser/recordings/{id}/replay
GET /api/browser/recordings/{id}
DELETE /api/browser/recordings/{id}

List, replay, fetch, or delete saved recordings.

POST /api/browser/emulation
{
"preset": "iPhone 14 Pro"
}

Or custom:

{
"viewport": {"width": 375, "height": 812},
"device_scale_factor": 3,
"is_mobile": true,
"user_agent": "Mozilla/5.0 (iPhone...)"
}
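The two body shapes above can be produced by one helper that keeps the preset and custom forms separate, as the examples present them. This is a sketch under assumptions: `emulation_body` is a hypothetical name, the defaults for `device_scale_factor` and `is_mobile` are illustrative, and whether the server accepts a custom body without `user_agent` is not stated here.

```python
def emulation_body(preset=None, *, width=None, height=None,
                   device_scale_factor=1, is_mobile=False, user_agent=None):
    """Return the JSON body for POST /api/browser/emulation as a dict."""
    custom = width is not None or height is not None
    if preset is not None and custom:
        raise ValueError("pass either a preset or a custom viewport, not both")
    if preset is not None:
        return {"preset": preset}
    if width is None or height is None:
        raise ValueError("custom emulation needs both width and height")
    body = {
        "viewport": {"width": width, "height": height},
        "device_scale_factor": device_scale_factor,  # default is an assumption
        "is_mobile": is_mobile,                      # default is an assumption
    }
    if user_agent is not None:
        body["user_agent"] = user_agent
    return body
```

For example, `emulation_body("iPhone 14 Pro")` gives the preset form, while `emulation_body(width=375, height=812, device_scale_factor=3, is_mobile=True)` gives the custom form without a user agent.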
POST /api/browser/profile

Set a browser profile (user agent, locale, etc.).

POST /api/browser/screen-recording
POST /api/browser/screen-recording/from-recording/{recording_id}

Record the browser viewport as video, or generate a video from an existing action recording.

WS /api/browser/stream

Streams screenshot frames to the client as binary WebSocket messages at up to 2 fps. The UI’s Browser page uses this for the live viewport.

GET /api/browser/config
PUT /api/browser/config
{
"backend": "managed",
"headless": true,
"auto_snapshot": true,
"snapshot_interval_ms": 500,
"stream_fps": 2,
"stream_quality": 75,
"ssrf_allowlist": [],
"ssrf_blocklist": []
}
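
One way to change a setting is to read the current configuration with GET, apply overrides, and send the whole object back with PUT. Whether PUT accepts a partial body is not stated in this reference, so this sketch always sends the full object. `merged_config` and `put_body` are hypothetical helper names, and the key check only knows the keys shown in the example above.

```python
import json

# Keys taken from the example configuration body.
KNOWN_KEYS = frozenset({
    "backend", "headless", "auto_snapshot", "snapshot_interval_ms",
    "stream_fps", "stream_quality", "ssrf_allowlist", "ssrf_blocklist",
})


def merged_config(current, **overrides):
    """Return a copy of the current config with overrides applied."""
    unknown = set(overrides) - KNOWN_KEYS
    if unknown:
        raise KeyError(f"unknown config keys: {sorted(unknown)}")
    return {**current, **overrides}


def put_body(config):
    """JSON bytes for PUT /api/browser/config."""
    return json.dumps(config).encode("utf-8")
```

Rejecting unknown keys on the client catches typos such as `stream_fsp` before they reach the server.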