
BrowserAct Skills

MIT-licensed BrowserAct Agent Skill pack for installing and operating the `browser-act` browser automation CLI from Claude Code, Codex, OpenClaw, Cursor, OpenCode, Windsurf, Gemini CLI, and other skills-compatible agents.

by BrowserAct · added 2026-06-18
Harness: Claude Code, Codex, Windsurf, Gemini, Cursor, CLI, VS Code
Level: expert · Type: capability-pack · Verified: validated
Review first: review before installing

Open the source and read safety notes before installing.

Safety notes

  • BrowserAct can open pages, click, type, upload files, inspect state, capture screenshots, read page text, handle dialogs, export cookies, capture network requests, and operate logged-in browser sessions.
  • Use BrowserAct only on sites, accounts, and data sources where the user has authorization. Do not use it to evade access controls, violate site terms, scrape disallowed data, or bypass rate limits.
  • The entry skill declares confirmation gates for browser creation, deletion, local Chrome profile import, proxy/security changes, logins, form submissions, file uploads, and other sensitive operations; preserve those gates in agent workflows.
  • `solve-captcha` may send the challenge image to BrowserAct's verification-assistance service according to the skill metadata; do not use it with sensitive or unauthorized pages.
  • `remote-assist` can generate a live handoff URL for a human to take over. Treat that URL as access to the active browser session.
  • Skill Forge can generate reusable automation skills from explored sites. Review generated scripts, selectors, network assumptions, output schemas, and site authorization before reusing them at scale.

Privacy notes

  • BrowserAct workflows can expose page content, screenshots, URLs, credentials typed into forms, cookies, browser profiles, uploaded files, downloaded files, network requests, HAR data, session names, browser descriptions, and logs.
  • The BrowserAct skill metadata states that cookies, login sessions, page content, credentials, and browser profile data stay local, except the CAPTCHA challenge image when `solve-captcha` is invoked.
  • Chrome-direct and profile import workflows can connect agents to existing local browser state. Treat those modes as account access, not a blank test browser.
  • Log reports, feedback, Discord support, generated Skill Forge packages, and shared screenshots can leak private browsing or account context if submitted without review.
  • Managed proxy, stealth browser, and API-key features create additional BrowserAct service dependencies beyond local CLI execution.

Prerequisites

  • Python 3.12 or newer and the uv package manager for the documented CLI install path.
  • A compatible agent host that can read `SKILL.md` files and execute shell commands.
  • Chrome or Chromium for local `chrome` and `chrome-direct` browser modes.
  • A BrowserAct API key only for optional stealth browsers, stealth extraction, managed proxies, and CAPTCHA assistance.
  • A policy for which websites, accounts, browser profiles, uploads, downloads, form submissions, cookies, and network captures an agent may access.
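The tool prerequisites above can be checked roughly before installing. This is a minimal sketch, not part of BrowserAct: the function name, the dictionary keys, and the Chrome executable names are assumptions, and executable names vary by operating system. The running interpreter's version is only a rough signal, since uv may be able to supply Python 3.12 itself.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 12)):
    """Return a dict of prerequisite name -> bool (hypothetical helper)."""
    # Assumed executable names; adjust for your platform.
    chrome_names = ("google-chrome", "chromium", "chromium-browser", "chrome")
    return {
        "python": sys.version_info[:2] >= min_python,
        "uv": shutil.which("uv") is not None,
        "chrome_or_chromium": any(
            shutil.which(name) is not None for name in chrome_names
        ),
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'found' if ok else 'missing'}")
```

The access policy in the last prerequisite is an organizational decision and cannot be checked this way.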

Schema details

Install type: package
Reading time: 7 min
Difficulty score: 86
Troubleshooting: Yes
Breaking changes: No
Source repository stats
Scope
Source repo
Skill and platform metadata
Skill type: capability-pack
Skill level: expert
Verification: validated
Verified at: 2026-06-18
Retrieval sources
  • https://github.com/browser-act/skills/blob/main/README.md
  • https://github.com/browser-act/skills/blob/main/browser-act/SKILL.md
  • https://github.com/browser-act/skills/blob/main/docs/installation.md
  • https://github.com/browser-act/skills/blob/main/docs/skills.md
  • https://github.com/browser-act/skills/blob/main/docs/commands.md
  • https://github.com/browser-act/skills/blob/main/docs/skill-forge.md
  • https://github.com/browser-act/skills/blob/main/LICENSE
Tested platforms
Claude Code, Codex, OpenClaw, Cursor, OpenCode, Windsurf, Gemini CLI, VS Code, Skills-compatible agent hosts
| Platform | Support | Install path |
| --- | --- | --- |
| claude-code | Native | .claude/skills/<skill-name>/SKILL.md |
| codex | Native | .agents/skills/<skill-name>/SKILL.md |
| windsurf | Native | .windsurf/skills/<skill-name>/SKILL.md |
| gemini | Native | .gemini/skills/<skill-name>/SKILL.md or .agents/skills/<skill-name>/SKILL.md |
| cursor | Adapter | .cursor/rules/<skill-name>.mdc |
| cli | Manual | AGENTS.md or tool-specific context file |
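The platform install paths listed above can be expressed as a small lookup. This is an illustrative sketch only: `INSTALL_PATHS` and `skill_paths` are hypothetical names, and the `cli` entry records just `AGENTS.md` because the alternative ("tool-specific context file") has no fixed path.

```python
# Path templates copied from the platform listing; {name} is the skill name.
INSTALL_PATHS = {
    "claude-code": [".claude/skills/{name}/SKILL.md"],
    "codex": [".agents/skills/{name}/SKILL.md"],
    "windsurf": [".windsurf/skills/{name}/SKILL.md"],
    "gemini": [
        ".gemini/skills/{name}/SKILL.md",
        ".agents/skills/{name}/SKILL.md",
    ],
    "cursor": [".cursor/rules/{name}.mdc"],
    "cli": ["AGENTS.md"],  # manual setup; may instead be a tool-specific file
}


def skill_paths(platform, skill_name):
    """Return the candidate install paths for a skill on a platform."""
    templates = INSTALL_PATHS.get(platform)
    if templates is None:
        raise KeyError(f"unknown platform: {platform}")
    return [template.format(name=skill_name) for template in templates]


if __name__ == "__main__":
    print(skill_paths("claude-code", "browser-act"))
```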
Tool listing metadata
Full copyable content
uv tool install browser-act-cli --python 3.12

# Install the BrowserAct entry skill into your agent host.
# Skill source: https://github.com/browser-act/skills/tree/main/browser-act

browser-act get-skills core --skill-version 2.0.2

About this resource

BrowserAct Skills

BrowserAct Skills is the Agent Skill entry point for the browser-act CLI. The entry skill is intentionally small: it tells the agent when BrowserAct should be used, declares the CLI install path, and requires the agent to load current runtime guidance through browser-act get-skills core --skill-version 2.0.2 before running automation commands.

Use it when an agent needs a real browser workflow with indexed interactions, session ownership, screenshots, network capture, local or managed browser modes, and human handoff. It is a better fit for browser operation than for static HTTP fetching, and it should stay behind explicit authorization and confirmation boundaries.

Knowledge Freshness

The upstream skill declares version 2.0.2 and uses a two-layer design: a stable installed SKILL.md plus environment-aware runtime content returned by the CLI. That means the current browser list, active sessions, API-key status, directives, and command details are not fully represented in the static skill file.

Always run the documented get-skills core command at the start of a session and do not truncate its output. Verify CLI and skill compatibility before running sensitive browser operations.
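One way an agent harness might load the runtime guidance without truncating it is sketched below. The command itself is the documented one; the wrapper function `load_runtime_guidance` and its injectable `run` parameter are assumptions made for illustration and testing, not part of the BrowserAct CLI.

```python
import subprocess

# Documented startup command, split into an argument list.
GET_SKILLS_CMD = ["browser-act", "get-skills", "core", "--skill-version", "2.0.2"]


def load_runtime_guidance(run=subprocess.run):
    """Run the get-skills command and return its complete stdout.

    check=True raises CalledProcessError on a non-zero exit, so a failed
    load is not silently treated as empty guidance.
    """
    result = run(GET_SKILLS_CMD, capture_output=True, text=True, check=True)
    return result.stdout  # returned whole; no slicing or summarizing


if __name__ == "__main__":
    try:
        print(load_runtime_guidance())
    except FileNotFoundError:
        print("browser-act is not installed or not on PATH")
```

Capturing the output as a single string leaves any length limit to the caller, which is where truncation would otherwise tend to creep in.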

Retrieval Sources

This listing is grounded in:

  • The upstream browser-act/skills README.
  • The BrowserAct entry SKILL.md.
  • BrowserAct installation, skills, command reference, and Skill Forge docs.
  • Repository license and current GitHub metadata.

Core Workflow

Install the CLI:

uv tool install browser-act-cli --python 3.12

Install the skill from:

https://github.com/browser-act/skills/tree/main/browser-act

Then have the agent load runtime instructions:

browser-act get-skills core --skill-version 2.0.2

For full browser automation, the docs use the loop:

open -> state -> interact -> state -> close

For example, after selecting an approved browser and target URL, an agent calls state, reads indexed elements, then uses commands such as click <index>, input <index> <text>, select <index> <option>, screenshot, or network requests.
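The loop above can be read as a rule that every interaction is bracketed by `state` calls. The sketch below checks a planned step list against that reading. It is one interpretation of the loop, not BrowserAct behavior: the function name and the strictness of the rule are assumptions, and the step names are plain labels rather than exact CLI syntax, which comes from the `get-skills` runtime output.

```python
# Step labels taken from the loop and example commands; not CLI syntax.
INTERACTIONS = {"click", "input", "select"}
OBSERVATIONS = {"state", "screenshot"}


def follows_loop(steps):
    """Return True if steps match open -> state -> interact -> state -> close."""
    if len(steps) < 2 or steps[0] != "open" or steps[-1] != "close":
        return False
    middle = steps[1:-1]
    for i, step in enumerate(middle):
        if step in INTERACTIONS:
            before = middle[i - 1] if i > 0 else None
            after = middle[i + 1] if i + 1 < len(middle) else None
            # An interaction needs a fresh state before and after it.
            if before != "state" or after != "state":
                return False
        elif step not in OBSERVATIONS:
            return False
    return True


if __name__ == "__main__":
    plan = ["open", "state", "click", "state", "input", "state", "close"]
    print(follows_loop(plan))
```

A single `state` call between two interactions counts for both, which matches the repeated `state` in the documented loop.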

Capability Scope

| Area | BrowserAct Coverage |
| --- | --- |
| Entry skill | Activates BrowserAct and directs agents to current get-skills runtime guidance |
| CLI install | uv tool install browser-act-cli --python 3.12 |
| Browser modes | Local Chrome, Chrome direct/CDP, and optional stealth browser modes |
| Interaction | Indexed state output, click, input, select, type, keys, scroll, upload, screenshots, JavaScript eval, waits, tabs, dialogs, and cookies |
| Network | Request listing, request detail, HAR start/stop, offline toggle, and network clearing |
| Sessions | Named sessions, browser descriptions, multi-browser isolation, and session close/list commands |
| Human handoff | remote-assist for human takeover when automation stalls |
| Skill Forge | Optional extension skill that explores a site once and generates a reusable, parameterized skill package |

Use Cases

  • Let an agent inspect and interact with a JavaScript-rendered site through a real browser.
  • Use a supervised local Chrome session for an account workflow after explicit approval.
  • Capture screenshots or network requests for debugging a web workflow.
  • Hand off a difficult browser step to a human, then let the agent continue.
  • Generate a reusable site-specific skill with Skill Forge after a reviewed exploration pass.
  • Run multiple approved browser sessions without mixing cookies, account state, or task ownership.

Production Rules

  • Use BrowserAct only on authorized websites, accounts, and datasets.
  • Run browser-act get-skills core --skill-version 2.0.2 before the first browser command in each session.
  • Preserve confirmation gates for browser creation, profile import, form submission, uploads, proxy changes, CAPTCHA assistance, and remote handoff.
  • Prefer read-only inspection, screenshots, and state review before write actions such as clicks, form inputs, cookie changes, or uploads.
  • Close sessions after use and review logs, screenshots, network captures, and generated Skill Forge packages before sharing them.
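The confirmation-gate rule above could be enforced in an agent wrapper along these lines. This is a minimal sketch under stated assumptions: the operation labels in `GATED_OPERATIONS` are hypothetical names for the gated categories listed above, not BrowserAct command names, and `run_gated` is an illustrative helper rather than anything the skill provides.

```python
# Hypothetical labels for the gated categories named in the rules above.
GATED_OPERATIONS = frozenset({
    "browser-creation",
    "profile-import",
    "form-submission",
    "upload",
    "proxy-change",
    "captcha-assistance",
    "remote-handoff",
})


def run_gated(operation, action, approve):
    """Run action() only if the operation is ungated or the user approves.

    approve is a callback that takes the operation label and returns bool,
    for example a prompt shown to the user by the agent host.
    """
    if operation in GATED_OPERATIONS and not approve(operation):
        raise PermissionError(f"{operation} requires user approval")
    return action()


if __name__ == "__main__":
    def ask(operation):
        return input(f"Allow {operation}? [y/N] ").strip().lower() == "y"

    print(run_gated("upload", lambda: "uploaded", ask))
```

Read-only steps such as state review pass through without a prompt, which fits the preference for inspection before write actions.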

Source Review

Verified on 2026-06-18:

  • GitHub metadata reported browser-act/skills as an MIT-licensed Python repository with topics including ai-agents, automation, claude-code-skills, openclaw-skills, codex-skill, web-scraping, data-extraction, cursor, and openclaw.
  • The README describes BrowserAct as a browser automation CLI built for AI agents, with local Chrome, stealth browser modes, isolated sessions, indexed interaction, remote handoff, and Skill Forge.
  • browser-act/SKILL.md declares skill name browser-act, version 2.0.2, the uv tool install browser-act-cli --python 3.12 install command, BrowserAct homepage, Python/uv requirements, local browser-profile storage, CDP permissions, confirmation requirements, and the get-skills core startup command.
  • docs/installation.md documents agent-based skill installation, manual CLI install, optional API-key authentication for stealth browsers, stealth extraction, managed proxies, and CAPTCHA assistance, plus Python 3.12+ and Chrome/Chromium requirements.
  • docs/skills.md documents the two-layer architecture: installed entry skill plus runtime content returned by get-skills, including environment state, browser list, sessions, core commands, and dynamic directives.
  • docs/commands.md documents browser interaction, navigation, state, click/input/select, screenshots, network capture, cookies, CAPTCHA, remote assist, browser management, authentication, proxy management, and get-skills commands.
  • docs/skill-forge.md documents the optional Skill Forge extension for generating reusable, parameterized skill packages from reviewed browser explorations.

Safety and Privacy

BrowserAct is powerful enough to act inside real websites and accounts. Keep the scope explicit: which site, which account, which browser profile, what data may be read, and what operations require confirmation. Do not let an agent import a local profile, submit a form, upload a file, buy proxy capacity, solve a challenge, or share a remote-assist URL without user approval.

The BrowserAct skill itself says sensitive data remains local except for CAPTCHA challenge images when solve-captcha is used. That is still a real data-flow decision. Avoid those features on sensitive pages, and review logs, screenshots, network captures, and generated Skill Forge packages before sharing them.

Duplicate Check

Checked current content/skills/, content/tools/, content/mcp/, content/agents/, guides, README entries, open pull requests, and repository-wide content for BrowserAct, browser-act/skills, browser-act skill, BrowserAct Claude Code skill, BrowserAct Codex skill, BrowserAct OpenClaw skill, Skill Forge, and matching source URLs. No dedicated BrowserAct Skills entry, exact source URL duplicate, target file, or open duplicate PR was found.

Disclosure

Editorial listing. No paid placement or affiliate link is used. BrowserAct Skills is MIT-licensed open-source content; BrowserAct's optional managed proxies, stealth browsers beyond the free tier, API-key features, CAPTCHA assistance, Discord support, target websites, browser providers, and generated skills may have separate licenses, pricing, privacy controls, and operational requirements.

Source citations


How it compares

BrowserAct Skills side by side with 3 alternatives on trust, install, platform support, and disclosed safety notes — all from reviewed registry metadata.

BrowserAct Skills

MIT-licensed BrowserAct Agent Skill pack for installing and operating the `browser-act` browser automation CLI from Claude Code, Codex, OpenClaw, Cursor, OpenCode, Windsurf, Gemini CLI, and other skills-compatible agents.

Superpowers Skills

MIT-licensed Superpowers skill and plugin framework by Jesse Vincent for Claude Code, Codex App, Codex CLI, Cursor, Gemini CLI, Antigravity, Kimi Code, OpenCode, Pi, GitHub Copilot CLI, and other coding agents, covering brainstorming, planning, TDD, systematic debugging, subagent-driven development, code review, git worktrees, and finish-the-branch workflows.

Addy Osmani Agent Skills

Addy Osmani's production-grade Agent Skills pack for AI coding agents, with lifecycle slash commands, engineering workflow skills, review personas, quality gates, and cross-agent setup guidance for Claude Code, Cursor, Gemini CLI, Antigravity CLI, OpenCode, GitHub Copilot, and other agents.

ARIS Auto-Research-In-Sleep

ARIS is a Markdown-only skill workflow pack for autonomous ML research agents, with idea discovery, experiment planning, auto-review loops, paper writing, rebuttal, resubmission, slides, posters, Research Wiki, and cross-model reviewer workflows for Claude Code, Codex, OpenClaw, Cursor, and other agent hosts.

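Trust
| Field | BrowserAct Skills | Superpowers Skills | Addy Osmani Agent Skills | ARIS Auto-Research-In-Sleep |
| --- | --- | --- | --- | --- |
| Install risk | Review first | Review first | Review first | Review first |
| Notes | Safety, Privacy | Safety, Privacy | Safety, Privacy | Safety, Privacy |
| Category | skills | skills | skills | skills |
| Source | source-backed | source-backed | source-backed | source-backed |
| Author | BrowserAct | Jesse Vincent | Addy Osmani | wanshuiyin |
| Added | 2026-06-18 | 2026-06-18 | 2026-06-18 | 2026-06-18 |
| Platforms | Claude Code, Codex, Windsurf, Gemini, Cursor, CLI, VS Code | Claude Code, Codex, Windsurf, Gemini, Cursor, CLI | Claude Code, Codex, Windsurf, Gemini, Cursor, CLI | Claude Code, Codex, Windsurf, Gemini, Cursor, CLI |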
Source repo
Safety notes

BrowserAct Skills: BrowserAct can open pages, click, type, upload files, inspect state, capture screenshots, read page text, handle dialogs, export cookies, capture network requests, and operate logged-in browser sessions. Use BrowserAct only on sites, accounts, and data sources where the user has authorization. Do not use it to evade access controls, violate site terms, scrape disallowed data, or bypass rate limits. The entry skill declares confirmation gates for browser creation, deletion, local Chrome profile import, proxy/security changes, logins, form submissions, file uploads, and other sensitive operations; preserve those gates in agent workflows. `solve-captcha` may send the challenge image to BrowserAct's verification-assistance service according to the skill metadata; do not use it with sensitive or unauthorized pages. `remote-assist` can generate a live handoff URL for a human to take over. Treat that URL as access to the active browser session. Skill Forge can generate reusable automation skills from explored sites. Review generated scripts, selectors, network assumptions, output schemas, and site authorization before reusing them at scale.

Superpowers Skills: Superpowers installs skills plus harness-specific bootstrap or hook behavior that can affect how an agent responds from session start. Review installed hooks and plugin metadata for the target harness. The `using-superpowers` skill strongly requires skill checks before agent responses, while also stating that explicit user and project instructions take precedence. Keep that priority order intact. The workflow skills can direct agents to create specs, plans, branches, worktrees, tests, commits, subagent tasks, review packages, and long-running implementation loops. Subagent-driven development can run for extended periods and dispatch multiple agents. Use clear budgets, model selection rules, task boundaries, and stop conditions. The TDD skill intentionally requires failing tests before production code. Confirm that this discipline fits the project before enabling it as a default workflow. The optional visual companion uses a browser/server flow during brainstorming. Review local server behavior, ports, and auth before using it with private project context.

Addy Osmani Agent Skills: The slash commands are designed to guide real coding, testing, reviewing, committing, and shipping work; keep edits, commits, pushes, CI changes, and deploys behind the host's normal approval controls. `/build auto` is explicitly intended to generate a plan and implement multiple tasks in one approved pass. Use it on bounded specs, review the generated plan first, and stop on test failures or risky changes. The skills encode durable engineering workflows, not guaranteed-current framework APIs. Follow the source-driven-development guidance and verify current documentation before applying generated code. Security, CI/CD, observability, migration, and launch skills can touch production-sensitive systems. Require dry-run plans, rollback notes, and environment scoping before approving operational commands. Review personas and quality gates are useful second opinions, but they do not replace maintainer review, domain-specific tests, threat modeling, or release sign-off.

ARIS Auto-Research-In-Sleep: ARIS skills can guide agents through code changes, experiment planning, experiment execution, paper drafting, rebuttal drafting, and cross-model review loops; treat those workflows as high-impact research automation rather than passive documentation. The `research-pipeline` skill supports auto-proceed modes and reviewer loops. Keep expensive runs, repository mutations, cloud/GPU jobs, and paper-submission decisions behind explicit human approval. Cross-model review through Codex MCP, Claude-review, Gemini-review, or similar reviewer adapters is a quality-control signal, not scientific proof or peer review. Generated claims, citations, tables, plots, ablations, rebuttals, and paper text need source checks, experiment audits, citation audits, and human scientific review before being relied on or submitted. Review all copied skills, scripts, MCP server configuration, and reviewer routing before installing them into a sensitive repository or giving them shell, file, web, cloud, or GPU access.
Privacy notes

BrowserAct Skills: BrowserAct workflows can expose page content, screenshots, URLs, credentials typed into forms, cookies, browser profiles, uploaded files, downloaded files, network requests, HAR data, session names, browser descriptions, and logs. The BrowserAct skill metadata states that cookies, login sessions, page content, credentials, and browser profile data stay local, except the CAPTCHA challenge image when `solve-captcha` is invoked. Chrome-direct and profile import workflows can connect agents to existing local browser state. Treat those modes as account access, not a blank test browser. Log reports, feedback, Discord support, generated Skill Forge packages, and shared screenshots can leak private browsing or account context if submitted without review. Managed proxy, stealth browser, and API-key features create additional BrowserAct service dependencies beyond local CLI execution.

Superpowers Skills: Superpowers workflows can expose product ideas, specs, design docs, implementation plans, source code, tests, diffs, review findings, git history, branch names, tool outputs, and agent handoff prompts. The README states that the optional visual companion may load the Prime Radiant logo from the creator's website with the Superpowers version, and can be disabled with `SUPERPOWERS_DISABLE_TELEMETRY` or compatible Claude telemetry opt-outs. Do not include secrets, customer data, unpublished product strategy, private incidents, or proprietary code in public examples, review packages, support issues, or visual companion artifacts. Subagent prompts and review packages should be treated as private development artifacts because they may include source snippets, diffs, file paths, test output, and architecture decisions.

Addy Osmani Agent Skills: Using the pack with an AI agent can expose repository code, product requirements, architecture notes, tests, CI logs, deployment settings, incidents, security findings, and launch plans to the configured model provider. Do not paste secrets, customer data, private incident records, production credentials, unpublished roadmap details, or proprietary compliance material into public prompts, issues, screenshots, or PR bodies. Agent personas and review workflows may ask for browser traces, performance data, logs, build output, dependency lists, and environment details; redact tokens and private URLs before sharing artifacts.

ARIS Auto-Research-In-Sleep: Research automation can expose unpublished hypotheses, paper drafts, peer-review text, datasets, logs, source code, experiment traces, model outputs, reviewer comments, account names, and GPU or cloud configuration to the selected model providers and MCP tools. Cross-model review loops may send the same research artifact to multiple providers or local/remote reviewer services depending on configuration. Research Wiki, traces, generated reports, paper artifacts, and run logs can persist confidential results or private review material on disk. Do not share confidential reviews, unreleased findings, private datasets, credentials, proprietary code, or submission-sensitive artifacts with external services unless the research and account policies allow it.
Prerequisites

BrowserAct Skills:
  • Python 3.12 or newer and the uv package manager for the documented CLI install path.
  • A compatible agent host that can read `SKILL.md` files and execute shell commands.
  • Chrome or Chromium for local `chrome` and `chrome-direct` browser modes.
  • A BrowserAct API key only for optional stealth browsers, stealth extraction, managed proxies, and CAPTCHA assistance.

Superpowers Skills:
  • A supported coding-agent harness and its plugin or extension install path.
  • A repository where Superpowers can add skills, startup hooks, and workflow instructions for the selected agent.
  • A willingness to follow structured workflows such as brainstorming, planning, TDD, subagent implementation, code review, and branch finishing.
  • Project-specific instructions that clearly state where Superpowers workflows should be adapted or overridden.

Addy Osmani Agent Skills:
  • Claude Code plugin support, an Agent Skills compatible installer, or an agent/editor that can load Markdown instruction files.
  • A software project where lifecycle guidance for specs, planning, implementation, testing, review, simplification, or launch is appropriate.
  • A version-controlled workspace with a known approval model for edits, tests, commits, pushes, and deployments.
  • Current framework, platform, and API documentation for any concrete implementation work produced under these skills.

ARIS Auto-Research-In-Sleep:
  • A research project, ML paper idea, baseline repository, dataset, review packet, or experiment plan that is appropriate for agent-assisted research automation.
  • A compatible agent host that can consume Markdown skills, such as Claude Code, Codex, Cursor, OpenClaw, Antigravity, Trae, GitHub Copilot CLI, or a manual prompt workflow.
  • Model-provider credentials, MCP reviewer configuration, or local model routing only when using cross-model review or external reviewer loops.
  • Compute budget, GPU quota, experiment sandboxing, version control, and artifact directories before allowing autonomous experiment execution.
Install
BrowserAct Skills: uv tool install browser-act-cli --python 3.12
Superpowers Skills: /plugin install superpowers@claude-plugins-official
Addy Osmani Agent Skills: /plugin marketplace add addyosmani/agent-skills
ARIS Auto-Research-In-Sleep: git clone https://github.com/wanshuiyin/Auto-claude-code-research-in-sleep.git
Config
Citations
Claim: Unclaimed for all four entries.
