High Risk: This skill has significant security concerns. Review the findings below before installing.

playwright-scraper-skill

Caution·Scanned 2/19/2026

High-risk skill: the documentation and scripts instruct the user to execute shell commands (node scripts/playwright-stealth.js, npm install) and to run bundled scripts that accept user-supplied URLs. The skill also makes external network requests (e.g., https://m.discuss.com.hk/#hot, https://example.com) and writes screenshots and HTML to disk (/tmp/discuss-hk.png, screenshot-*.png).

from clawhub.ai·v1.2.0·36.2 KB·0 installs
Scanned from 1.2.0 at 3cc3df8 · Transparency log ↗
$ vett add clawhub.ai/waisimon/playwright-scraper-skill
Review security findings before installing

Playwright Scraper Skill 🕷️

Chinese documentation | English

An OpenClaw skill for Playwright-based web scraping with anti-bot evasion. Tested successfully on complex websites such as Discuss.com.hk.

📦 Installation: See INSTALL.md
📚 Full Documentation: See SKILL.md
💡 Examples: See examples/README.md


✨ Features

  • Pure Playwright — Modern, powerful, easy to use
  • Anti-Bot Protection — Hides automation, realistic UA
  • Verified — 100% success on Discuss.com.hk
  • Simple to Use — One-line commands
  • Customizable — Environment variable support

🚀 Quick Start

Installation

npm install
npx playwright install chromium

Usage

# Quick scraping
node scripts/playwright-simple.js https://example.com

# Stealth mode (recommended)
node scripts/playwright-stealth.js "https://m.discuss.com.hk/#hot"
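
Both scripts take the target URL as a command-line argument, so a wrapper may want to check it before launching a browser. The following is a minimal sketch, not part of the skill itself: `parseTargetUrl` is a hypothetical helper name, and restricting input to http and https is an assumption about what a caller would want.

```javascript
// Hypothetical helper: validate a user-supplied URL before handing it to a scraper script.
function parseTargetUrl(arg) {
  let url;
  try {
    // The WHATWG URL constructor throws on input it cannot parse.
    url = new URL(arg);
  } catch {
    throw new Error(`Not a valid URL: ${arg}`);
  }
  // Assumption: only web URLs are acceptable; file:, data:, etc. are rejected.
  if (url.protocol !== 'http:' && url.protocol !== 'https:') {
    throw new Error(`Unsupported URL scheme: ${url.protocol}`);
  }
  return url;
}

// Possible usage inside a script:
// const target = parseTargetUrl(process.argv[2]);
// console.log(`Scraping ${target.href}`);
```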

📖 Two Modes

| Mode | Use Case | Speed | Anti-Bot |
| --- | --- | --- | --- |
| Simple | Regular dynamic sites | Fast (3-5s) | None |
| Stealth | Sites with anti-bot | Medium (5-20s) | Medium-High |

Simple Mode

For sites without anti-bot protection:

node scripts/playwright-simple.js <URL>

Stealth Mode (Recommended)

For sites with Cloudflare or anti-bot protection:

node scripts/playwright-stealth.js <URL>

Anti-Bot Techniques:

  • Hide navigator.webdriver
  • Realistic User-Agent (iPhone)
  • Human-like behavior simulation
  • Screenshot and HTML saving support
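
The techniques listed above could be wired up roughly as follows. This is a sketch under stated assumptions, not the contents of scripts/playwright-stealth.js: the helper names, the example User-Agent string, the viewport size, and the delay range are all illustrative. The Playwright calls used (`browser.newContext`, `context.addInitScript`, `page.waitForTimeout`) are part of Playwright's public API.

```javascript
// Example iPhone User-Agent (an assumption; the skill's actual value is not shown here).
const EXAMPLE_IPHONE_UA =
  'Mozilla/5.0 (iPhone; CPU iPhone OS 16_0 like Mac OS X) ' +
  'AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.0 Mobile/15E148 Safari/604.1';

// Options for browser.newContext(); the User-Agent has to be set at context creation.
function stealthContextOptions(userAgent = EXAMPLE_IPHONE_UA) {
  return {
    userAgent,
    viewport: { width: 390, height: 844 }, // illustrative phone-sized viewport
    isMobile: true,
    hasTouch: true,
  };
}

// Registers a script that runs before any page script and makes
// navigator.webdriver read as undefined.
function hideWebdriver(context) {
  return context.addInitScript(() => {
    Object.defineProperty(navigator, 'webdriver', { get: () => undefined });
  });
}

// Random pause length in [minMs, maxMs], e.g. for page.waitForTimeout().
function randomDelayMs(minMs, maxMs) {
  return minMs + Math.floor(Math.random() * (maxMs - minMs + 1));
}

// Possible usage, assuming `chromium` was imported from 'playwright':
// const browser = await chromium.launch({ headless: true });
// const context = await browser.newContext(stealthContextOptions());
// await hideWebdriver(context);
// const page = await context.newPage();
// await page.goto(url);
// await page.waitForTimeout(randomDelayMs(500, 1500));
```

Keeping these as small functions that accept a context, rather than one monolithic script, makes each technique easy to toggle when a site behaves differently.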

🎯 Customization

All scripts support environment variables:

# Show browser
HEADLESS=false node scripts/playwright-stealth.js <URL>

# Custom wait time (milliseconds)
WAIT_TIME=10000 node scripts/playwright-stealth.js <URL>

# Save screenshot
SCREENSHOT_PATH=/tmp/page.png node scripts/playwright-stealth.js <URL>

# Save HTML
SAVE_HTML=true node scripts/playwright-stealth.js <URL>

# Custom User-Agent
USER_AGENT="Mozilla/5.0 ..." node scripts/playwright-stealth.js <URL>
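
For illustration, a script could collect these variables into one settings object as sketched below. The variable names come from the list above; the function name `readScraperConfig`, the 5000 ms fallback, and the treatment of unset values are assumptions, since the scripts' real defaults are not documented here.

```javascript
// Hypothetical helper: read the documented environment variables into one object.
function readScraperConfig(env = process.env) {
  const waitTime = Number.parseInt(env.WAIT_TIME ?? '', 10);
  return {
    // Assumption: headless unless HEADLESS is exactly "false".
    headless: env.HEADLESS !== 'false',
    // Assumption: fall back to 5000 ms when WAIT_TIME is unset or not a valid number.
    waitTimeMs: Number.isFinite(waitTime) && waitTime >= 0 ? waitTime : 5000,
    // null means "do not save a screenshot".
    screenshotPath: env.SCREENSHOT_PATH || null,
    saveHtml: env.SAVE_HTML === 'true',
    // null means "use the script's built-in User-Agent".
    userAgent: env.USER_AGENT || null,
  };
}

// Possible usage:
// const config = readScraperConfig();
// console.log(config.headless, config.waitTimeMs);
```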

📊 Test Results

| Website | Result | Time |
| --- | --- | --- |
| Discuss.com.hk | ✅ 200 OK | 5-20s |
| Example.com | ✅ 200 OK | 3-5s |
| Cloudflare Protected | ✅ Mostly successful | 10-30s |

📁 File Structure

playwright-scraper-skill/
├── scripts/
│   ├── playwright-simple.js       # Simple mode
│   └── playwright-stealth.js      # Stealth mode ⭐
├── examples/
│   ├── discuss-hk.sh              # Discuss.com.hk example
│   └── README.md                  # More examples
├── SKILL.md                       # Full documentation
├── INSTALL.md                     # Installation guide
├── README.md                      # This file
├── README_ZH.md                   # Chinese documentation
├── CONTRIBUTING.md                # Contribution guide
├── CHANGELOG.md                   # Version history
└── package.json                   # npm config

💡 Best Practices

  1. Try web_fetch first — OpenClaw's built-in tool is fastest
  2. Use Simple for dynamic sites — When no anti-bot protection
  3. Use Stealth for protected sites ⭐ — Main workhorse
  4. Use specialized skills — For YouTube, Reddit, etc.
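
One way to read the first three practices is as an escalation ladder: start with the cheapest tool and move up only when it fails. The sketch below encodes that reading. The function name `nextScrapeMode` and the mode labels are hypothetical, and specialized skills (item 4) are left out because they depend on the target site rather than on a previous failure.

```javascript
// Escalation order implied by the best practices: cheapest first.
const ESCALATION_ORDER = ['web_fetch', 'simple', 'stealth'];

// Returns the mode to try after `previousMode` failed,
// the first mode when nothing has been tried yet,
// or null when there is nothing left to escalate to.
function nextScrapeMode(previousMode = null) {
  if (previousMode === null) return ESCALATION_ORDER[0];
  const index = ESCALATION_ORDER.indexOf(previousMode);
  if (index === -1) throw new Error(`Unknown mode: ${previousMode}`);
  return ESCALATION_ORDER[index + 1] ?? null;
}

// Possible usage:
// let mode = nextScrapeMode();      // first attempt
// mode = nextScrapeMode(mode);      // escalate after a failure such as a 403
```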

🐛 Troubleshooting

Getting 403 blocked?

Use Stealth mode:

node scripts/playwright-stealth.js <URL>

Cloudflare challenge?

Increase the wait time and run with a visible browser window (HEADLESS=false):

HEADLESS=false WAIT_TIME=30000 node scripts/playwright-stealth.js <URL>

Playwright not found?

Reinstall:

npm install
npx playwright install chromium

More issues? See INSTALL.md


🤝 Contributing

Contributions welcome! See CONTRIBUTING.md


📄 License

MIT License - See LICENSE

