feat: skill expansion — browser, security, SQL, files (16 skills total)
Novas skills instaladas: - openclaw-agent-browser v1.0.0 CLI Chromium — navegação, login, screenshots, state - skill-security-audit v1.0.0 SAST scanning, prompt injection, secrets audit - sql-toolkit v1.0.0 PostgreSQL/MySQL/SQLite — schema, query, otimização - file v1.0.0 Organização de arquivos por contexto - file-summary v1.0.0 Extração e resumo de PDFs, Word, Excel Workspace expandido: - TOOLS.md: +Browser automation, Security audit, SQL, File management - AGENTS.md: +Linux Analyst section (comandos, logs, rede, scripts) + Full-stack strategy - MEMORY.md: 16 skills indexadas, stack map, comandos Linux ref - SESSION-STATE.md: atualizado com contexto completo - lock.json: sincronizado com 16 skills instaladas
This commit is contained in:
@@ -0,0 +1,8 @@
|
||||
{
|
||||
"version": 1,
|
||||
"registry": "https://clawhub.ai",
|
||||
"slug": "openclaw-agent-browser",
|
||||
"installedVersion": "1.0.0",
|
||||
"installedAt": 1779234569455,
|
||||
"fingerprint": "fd248d74c33718c268110638a4a1aadc8759a30f70e11169b69100ff3bf35531"
|
||||
}
|
||||
@@ -0,0 +1,144 @@
|
||||
---
|
||||
name: agent-browser
|
||||
description: Headless browser automation CLI for AI agents. Use when interacting with websites — navigating pages, filling forms, clicking buttons, taking screenshots, extracting data, scraping, testing web apps, downloading files, or automating any browser task. Triggers on requests to "open a website", "fill out a form", "click a button", "take a screenshot", "scrape data", "test this web app", "login to a site", "monitor a page", or any task requiring programmatic web interaction.
|
||||
---
|
||||
|
||||
# Browser Automation with agent-browser
|
||||
|
||||
## Setup
|
||||
|
||||
Run `scripts/setup.sh` to install agent-browser and Chromium. Requires Node.js.
|
||||
|
||||
## Core Workflow
|
||||
|
||||
Every browser automation follows this pattern:
|
||||
|
||||
1. **Navigate**: `agent-browser open <url>`
|
||||
2. **Snapshot**: `agent-browser snapshot -i` (get element refs like `@e1`, `@e2`)
|
||||
3. **Interact**: Use refs to click, fill, select
|
||||
4. **Re-snapshot**: After navigation or DOM changes, get fresh refs
|
||||
|
||||
```bash
|
||||
agent-browser open https://example.com/form
|
||||
agent-browser snapshot -i
|
||||
# Output: @e1 [input type="email"], @e2 [input type="password"], @e3 [button] "Submit"
|
||||
|
||||
agent-browser fill @e1 "user@example.com"
|
||||
agent-browser fill @e2 "password123"
|
||||
agent-browser click @e3
|
||||
agent-browser wait --load networkidle
|
||||
agent-browser snapshot -i # Check result
|
||||
```
|
||||
|
||||
## Command Chaining
|
||||
|
||||
Chain with `&&` when you don't need intermediate output:
|
||||
|
||||
```bash
|
||||
agent-browser open https://example.com && agent-browser wait --load networkidle && agent-browser snapshot -i
|
||||
```
|
||||
|
||||
Run separately when you need to parse output first (e.g., snapshot to discover refs).
|
||||
|
||||
## Essential Commands
|
||||
|
||||
```bash
|
||||
# Navigate
|
||||
agent-browser open <url>
|
||||
agent-browser close
|
||||
|
||||
# See the page (always do this first)
|
||||
agent-browser snapshot -i # Interactive elements with refs
|
||||
agent-browser snapshot -i -C # Include onclick divs
|
||||
|
||||
# Interact using @refs
|
||||
agent-browser click @e1
|
||||
agent-browser fill @e2 "text"
|
||||
agent-browser select @e1 "option"
|
||||
agent-browser press Enter
|
||||
agent-browser scroll down 500
|
||||
|
||||
# Get info
|
||||
agent-browser get text @e1
|
||||
agent-browser get url
|
||||
agent-browser get title
|
||||
|
||||
# Wait
|
||||
agent-browser wait @e1 # For element
|
||||
agent-browser wait --load networkidle # For network idle
|
||||
|
||||
# Capture
|
||||
agent-browser screenshot page.png
|
||||
agent-browser screenshot --full # Full page
|
||||
agent-browser pdf output.pdf
|
||||
```
|
||||
|
||||
For the full command reference, see `references/commands.md`.
|
||||
|
||||
## Ref Lifecycle (Important)
|
||||
|
||||
Refs (`@e1`, `@e2`) are invalidated when the page changes. Always re-snapshot after:
|
||||
- Clicking links/buttons that navigate
|
||||
- Form submissions
|
||||
- Dynamic content loading (dropdowns, modals)
|
||||
|
||||
## Common Patterns
|
||||
|
||||
### Form Submission
|
||||
```bash
|
||||
agent-browser open https://example.com/signup
|
||||
agent-browser snapshot -i
|
||||
agent-browser fill @e1 "Jane Doe"
|
||||
agent-browser fill @e2 "jane@example.com"
|
||||
agent-browser select @e3 "California"
|
||||
agent-browser click @e5
|
||||
agent-browser wait --load networkidle
|
||||
```
|
||||
|
||||
### Login with State Persistence
|
||||
```bash
|
||||
agent-browser open https://app.example.com/login
|
||||
agent-browser snapshot -i
|
||||
agent-browser fill @e1 "$USERNAME" && agent-browser fill @e2 "$PASSWORD"
|
||||
agent-browser click @e3
|
||||
agent-browser wait --url "**/dashboard"
|
||||
agent-browser state save auth.json
|
||||
|
||||
# Reuse later
|
||||
agent-browser state load auth.json
|
||||
agent-browser open https://app.example.com/dashboard
|
||||
```
|
||||
|
||||
### Data Extraction
|
||||
```bash
|
||||
agent-browser open https://example.com/products
|
||||
agent-browser snapshot -i
|
||||
agent-browser get text @e5
|
||||
agent-browser get text body > page.txt
|
||||
```
|
||||
|
||||
### Screenshot & Diff
|
||||
```bash
|
||||
agent-browser screenshot baseline.png
|
||||
# ... changes happen ...
|
||||
agent-browser diff screenshot --baseline baseline.png
|
||||
```
|
||||
|
||||
### Parallel Sessions
|
||||
```bash
|
||||
agent-browser --session site1 open https://site-a.com
|
||||
agent-browser --session site2 open https://site-b.com
|
||||
agent-browser session list
|
||||
```
|
||||
|
||||
## Security (Optional)
|
||||
|
||||
```bash
|
||||
export AGENT_BROWSER_CONTENT_BOUNDARIES=1 # Wrap output for AI safety
|
||||
export AGENT_BROWSER_ALLOWED_DOMAINS="example.com" # Domain allowlist
|
||||
export AGENT_BROWSER_MAX_OUTPUT=50000 # Prevent context flooding
|
||||
```
|
||||
|
||||
## Cleanup
|
||||
|
||||
Always close sessions when done: `agent-browser close`
|
||||
@@ -0,0 +1,6 @@
|
||||
{
|
||||
"ownerId": "kn773vq0patpekkggy4d1mgct181zcnj",
|
||||
"slug": "openclaw-agent-browser",
|
||||
"version": "1.0.0",
|
||||
"publishedAt": 1772156126394
|
||||
}
|
||||
@@ -0,0 +1,135 @@
|
||||
# agent-browser Command Reference
|
||||
|
||||
## Navigation
|
||||
```bash
|
||||
agent-browser open <url> # Navigate (aliases: goto, navigate)
|
||||
agent-browser close # Close browser (aliases: quit, exit)
|
||||
```
|
||||
|
||||
## Snapshot (Primary Way to See the Page)
|
||||
```bash
|
||||
agent-browser snapshot -i # Interactive elements with refs (@e1, @e2...)
|
||||
agent-browser snapshot -i -C # Include cursor-interactive elements (onclick divs)
|
||||
agent-browser snapshot -s "#selector" # Scope to CSS selector
|
||||
agent-browser snapshot -i --json # JSON output for parsing
|
||||
```
|
||||
|
||||
## Interaction (Use @refs from snapshot)
|
||||
```bash
|
||||
agent-browser click @e1 # Click element
|
||||
agent-browser click @e1 --new-tab # Click and open in new tab
|
||||
agent-browser dblclick @e1 # Double-click
|
||||
agent-browser fill @e2 "text" # Clear and type text
|
||||
agent-browser type @e2 "text" # Type without clearing
|
||||
agent-browser select @e1 "option" # Select dropdown option
|
||||
agent-browser check @e1 # Check checkbox
|
||||
agent-browser uncheck @e1 # Uncheck checkbox
|
||||
agent-browser press Enter # Press key
|
||||
agent-browser keyboard type "text" # Type at current focus (no selector)
|
||||
agent-browser scroll down 500 # Scroll page
|
||||
agent-browser scroll down 500 --selector "div.content" # Scroll within container
|
||||
agent-browser drag @e1 @e2 # Drag and drop
|
||||
agent-browser upload @e1 file.pdf # Upload files
|
||||
agent-browser hover @e1 # Hover element
|
||||
```
|
||||
|
||||
## Get Information
|
||||
```bash
|
||||
agent-browser get text @e1 # Get element text
|
||||
agent-browser get text body > page.txt # Get all page text
|
||||
agent-browser get html @e1 # Get innerHTML
|
||||
agent-browser get url # Get current URL
|
||||
agent-browser get title # Get page title
|
||||
agent-browser get text @e1 --json # JSON output
|
||||
```
|
||||
|
||||
## Wait
|
||||
```bash
|
||||
agent-browser wait @e1 # Wait for element
|
||||
agent-browser wait --load networkidle # Wait for network idle
|
||||
agent-browser wait --url "**/page" # Wait for URL pattern
|
||||
agent-browser wait --fn "document.readyState === 'complete'" # JS condition
|
||||
agent-browser wait 2000 # Wait milliseconds
|
||||
```
|
||||
|
||||
## Downloads
|
||||
```bash
|
||||
agent-browser download @e1 ./file.pdf # Click to trigger download
|
||||
agent-browser wait --download ./output.zip # Wait for download
|
||||
agent-browser --download-path ./downloads open <url> # Set download dir
|
||||
```
|
||||
|
||||
## Capture
|
||||
```bash
|
||||
agent-browser screenshot # Screenshot to temp dir
|
||||
agent-browser screenshot page.png # Screenshot to path
|
||||
agent-browser screenshot --full # Full page screenshot
|
||||
agent-browser screenshot --annotate # Annotated with numbered labels
|
||||
agent-browser pdf output.pdf # Save as PDF
|
||||
```
|
||||
|
||||
## Diff (Compare Page States)
|
||||
```bash
|
||||
agent-browser diff snapshot # Current vs last snapshot
|
||||
agent-browser diff snapshot --baseline before.txt # Current vs saved file
|
||||
agent-browser diff screenshot --baseline before.png # Visual pixel diff
|
||||
agent-browser diff url <url1> <url2> # Compare two pages
|
||||
```
|
||||
|
||||
## Sessions
|
||||
```bash
|
||||
agent-browser --session site1 open https://site-a.com # Named session
|
||||
agent-browser session list # List active sessions
|
||||
agent-browser --session site1 close # Close specific session
|
||||
```
|
||||
|
||||
## Auth Vault
|
||||
```bash
|
||||
echo "pass" | agent-browser auth save github --url https://github.com/login --username user --password-stdin
|
||||
agent-browser auth login github # Login using saved profile
|
||||
agent-browser auth list # List profiles
|
||||
agent-browser auth delete github # Delete profile
|
||||
```
|
||||
|
||||
## State Persistence
|
||||
```bash
|
||||
agent-browser state save auth.json # Save cookies/localStorage
|
||||
agent-browser state load auth.json # Restore state
|
||||
agent-browser --session-name myapp open <url> # Auto-save/restore
|
||||
agent-browser state list # List saved states
|
||||
agent-browser state clean --older-than 7 # Cleanup old states
|
||||
```
|
||||
|
||||
## Security
|
||||
```bash
|
||||
# Content boundaries (recommended for AI agents)
|
||||
export AGENT_BROWSER_CONTENT_BOUNDARIES=1
|
||||
|
||||
# Domain allowlist
|
||||
export AGENT_BROWSER_ALLOWED_DOMAINS="example.com,*.example.com"
|
||||
|
||||
# Action policy
|
||||
export AGENT_BROWSER_ACTION_POLICY=./policy.json
|
||||
|
||||
# Output limits
|
||||
export AGENT_BROWSER_MAX_OUTPUT=50000
|
||||
```
|
||||
|
||||
## Debugging
|
||||
```bash
|
||||
agent-browser --headed open <url> # Visual browser
|
||||
agent-browser highlight @e1 # Highlight element
|
||||
agent-browser record start demo.webm # Record session
|
||||
agent-browser eval "document.title" # Run JavaScript
|
||||
```
|
||||
|
||||
## Connect to Existing Chrome
|
||||
```bash
|
||||
agent-browser --auto-connect open <url> # Auto-discover Chrome
|
||||
agent-browser --cdp 9222 snapshot # Explicit CDP port
|
||||
```
|
||||
|
||||
## Local Files
|
||||
```bash
|
||||
agent-browser --allow-file-access open file:///path/to/doc.pdf
|
||||
```
|
||||
@@ -0,0 +1,40 @@
|
||||
#!/bin/bash
|
||||
# agent-browser setup script
|
||||
# Installs agent-browser globally and downloads Chromium
|
||||
|
||||
set -e
|
||||
|
||||
echo "🌐 Installing agent-browser..."
|
||||
|
||||
# Check if already installed
|
||||
if command -v agent-browser &>/dev/null; then
|
||||
VERSION=$(agent-browser --version 2>/dev/null || echo "unknown")
|
||||
echo "✓ agent-browser already installed: $VERSION"
|
||||
else
|
||||
npm install -g agent-browser
|
||||
echo "✓ agent-browser installed"
|
||||
fi
|
||||
|
||||
# Install Chromium
|
||||
echo "📦 Installing Chromium browser..."
|
||||
if [[ "$(uname)" == "Linux" ]]; then
|
||||
agent-browser install --with-deps 2>/dev/null || agent-browser install
|
||||
else
|
||||
agent-browser install
|
||||
fi
|
||||
echo "✓ Chromium ready"
|
||||
|
||||
# Verify
|
||||
echo ""
|
||||
echo "🧪 Verifying..."
|
||||
agent-browser open https://example.com >/dev/null 2>&1
|
||||
TITLE=$(agent-browser get title 2>/dev/null || echo "")
|
||||
agent-browser close >/dev/null 2>&1
|
||||
|
||||
if [[ "$TITLE" == *"Example"* ]]; then
|
||||
echo "✅ agent-browser is working!"
|
||||
else
|
||||
echo "⚠️ Installation complete but verification unclear. Try: agent-browser open https://example.com"
|
||||
fi
|
||||
|
||||
agent-browser --version
|
||||
Reference in New Issue
Block a user