agent-browser

from linuxlewis/agent-skills

What it does

Automates web browsing tasks by navigating websites, interacting with page elements, taking screenshots, and performing browser actions via CLI commands.


Added Feb 4, 2026

Skill Details

SKILL.md

Browser automation using the agent-browser CLI. Use it when the user asks to browse websites, open webpages, interact with page elements, take screenshots, fill forms, click buttons, scrape content, or automate browser tasks.

Overview

# Agent Browser

Browser automation using the agent-browser CLI - a fast, headless browser automation tool for AI agents.

Installation

```bash
npm install -g agent-browser
agent-browser install  # Install browser binaries
```
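Before running any of the commands below, it can help to confirm that the CLI is actually on `PATH`. The following is a minimal sketch: `have_cmd` is a hypothetical helper name introduced here, not part of agent-browser, and it relies only on the standard `command -v` shell builtin.

```bash
# have_cmd <name>: succeed if <name> resolves to a command on PATH.
# (Hypothetical helper for this sketch; not an agent-browser feature.)
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Print a hint to stderr if the CLI is missing.
if ! have_cmd agent-browser; then
  echo "agent-browser not found; try: npm install -g agent-browser" >&2
fi
```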

Quick Start

```bash
# Navigate to a URL
agent-browser open https://example.com

# Get accessibility snapshot (shows refs like @e1, @e2)
agent-browser snapshot -i

# Click using ref from snapshot
agent-browser click @e2

# Type into an element
agent-browser fill @e3 "hello world"

# Take screenshot
agent-browser screenshot output.png
```

Workflow Pattern

1. Open - Navigate to the target URL
2. Snapshot - Get the accessibility tree to see available elements
3. Interact - Use refs (`@e1`, `@e2`, etc.) to interact with elements
4. Verify - Take a snapshot or screenshot to verify state
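The four steps above can be grouped into two small shell functions, split at the point where a ref has to be read from the snapshot output. This is a sketch under stated assumptions rather than part of the skill: the names `ab`, `inspect_page`, and `click_and_verify` are hypothetical, as is the `AGENT_BROWSER_BIN` variable, which exists only so the binary can be swapped for a stub. Only subcommands shown in this document are used.

```bash
# ab: thin wrapper around the CLI. AGENT_BROWSER_BIN is an assumption of this
# sketch (handy for substituting a stub); it is not an agent-browser option.
ab() {
  "${AGENT_BROWSER_BIN:-agent-browser}" "$@"
}

# Steps 1-2: open the page, then list interactive elements and their refs.
# usage: inspect_page <url>
inspect_page() {
  ab open "$1" && ab snapshot -i
}

# Steps 3-4: click a ref read from the snapshot, then capture the result.
# usage: click_and_verify <ref> <screenshot-path>
click_and_verify() {
  ab click "$1" && ab screenshot "$2"
}
```

A typical run would call `inspect_page https://example.com`, pick a ref such as `@e2` from the printed tree, and then call `click_and_verify @e2 after.png`.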

Core Commands

See [references/commands.md](references/commands.md) for the complete command reference.

Navigation

```bash
agent-browser open <url>   # Navigate to URL
agent-browser back         # Go back
agent-browser forward      # Go forward
agent-browser reload       # Reload page
```

Interaction

```bash
agent-browser click <selector>            # Click element (or @ref)
agent-browser fill <selector> <text>      # Clear and fill
agent-browser press <key>                 # Press key (Enter, Tab, etc.)
agent-browser select <selector> <value>   # Select dropdown option
```

Getting Information

```bash
agent-browser snapshot              # Accessibility tree with refs
agent-browser snapshot -i           # Interactive elements only
agent-browser get text <selector>   # Get element text
agent-browser get url               # Get current URL
```

Capture

```bash
agent-browser screenshot [path]   # Take screenshot
agent-browser screenshot --full   # Full page screenshot
agent-browser pdf <path>          # Save as PDF
```

Sessions

Use sessions to maintain browser state across commands:

```bash
agent-browser --session myproject open https://example.com
agent-browser --session myproject snapshot
agent-browser --session myproject click @e1
```
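Repeating `--session myproject` on every call is easy to get wrong, so one option is to wrap it in a shell function. The sketch below is illustrative only: `abs`, `AB_SESSION`, and `AGENT_BROWSER_BIN` are names invented here, and the only CLI feature it relies on is the `--session <name>` flag shown above.

```bash
# Session name used for every call; override by setting AB_SESSION beforehand.
# (AB_SESSION and AGENT_BROWSER_BIN are hypothetical names for this sketch.)
AB_SESSION=${AB_SESSION:-myproject}

# abs: run agent-browser with the shared --session flag prepended.
abs() {
  "${AGENT_BROWSER_BIN:-agent-browser}" --session "$AB_SESSION" "$@"
}

# Example calls, commented out so sourcing this snippet has no side effects:
# abs open https://example.com
# abs snapshot
# abs click @e1
```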

Selectors

- Refs: `@e1`, `@e2` (from snapshot output) - preferred
- CSS: `#id`, `.class`, `div > span`
- Text: `text=Submit`
- Role: `role=button[name="Submit"]`
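Several of the selector forms above contain characters the shell treats specially: a leading `#` starts a comment, `>` is a redirection, and `[...]` is a glob pattern whose inner double quotes would be stripped. Quoting the selector avoids all three problems. The sketch below shows quoted invocations as comments and adds `role_selector`, a hypothetical helper (not an agent-browser command) that builds a role selector string in the form listed above.

```bash
# role_selector <role> <name>: print a selector like role=button[name="Submit"].
# (Hypothetical helper for this sketch.)
role_selector() {
  printf 'role=%s[name="%s"]\n' "$1" "$2"
}

# Quote selectors so the shell passes them through unchanged:
#   agent-browser click '#id'          # unquoted, '#id' would be read as a comment
#   agent-browser click 'div > span'   # unquoted, '>' would redirect output
#   agent-browser click 'text=Submit'
#   agent-browser click "$(role_selector button Submit)"
```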

Best Practices

1. Always snapshot first - Get the accessibility tree before interacting
2. Use refs - Prefer `@e1` refs from snapshot over CSS selectors
3. Use sessions - Maintain state across multiple commands
4. Wait appropriately - Use `wait` for dynamic content
5. Verify actions - Snapshot or screenshot after interactions
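For the "verify actions" practice, one lightweight check is to compare the current URL against the one you expect after an interaction. The following is a sketch, not a documented recipe: `expect_url` and `AGENT_BROWSER_BIN` are hypothetical names, and it assumes that `agent-browser get url` prints the bare URL on stdout with nothing else around it.

```bash
# expect_url <expected>: succeed only if the browser reports exactly <expected>.
# Assumption: `get url` prints just the URL; adjust the comparison if the real
# output carries extra text. AGENT_BROWSER_BIN is a hypothetical override.
expect_url() {
  [ "$("${AGENT_BROWSER_BIN:-agent-browser}" get url)" = "$1" ]
}

# Example, commented out so sourcing this snippet has no side effects:
# agent-browser click @e2
# expect_url https://example.com/done || echo "unexpected page" >&2
```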
