skill-defender

Verified·Scanned 2/18/2026

This skill is an offline, deterministic scanner that inspects installed skills for prompt injection, credential theft, exfiltration, obfuscation, and backdoors. It launches subprocesses via subprocess.run to execute python3 scan_skill.py on skill directories; this execution is purpose-aligned and is the only security-relevant behavior detected.

from clawhub.ai·v852e50a·62.1 KB·0 installs
Scanned from 1.0.0 at 852e50a · Transparency log ↗
$ vett add clawhub.ai/itsclawdbro/skill-defender

Skill Defender — Malicious Pattern Scanner

When to Run

Automatic Triggers

  1. New skill installed — Immediately run scan_skill.py against it before allowing use
  2. Skill updated — Re-scan after any file changes in a skill directory
  3. Periodic audit — Run batch scan on all installed skills when requested

Manual Triggers

  • User says "scan skill X" → scan that specific skill
  • User says "scan all skills" → batch scan all skills
  • User says "security check" or "audit skills" → same as above

Scripts

scripts/scan_skill.py — Single Skill Scanner

Scans one skill directory for malicious patterns. Produces JSON or human-readable output.

scripts/aggregate_scan.py — Batch Scanner

Scans ALL installed skills and produces a single JSON report. Includes a built-in allowlist to reduce false positives from security-related skills, API skills, and other known-safe patterns.

How to Run

# Scan a single skill (human-readable)
python3 scripts/scan_skill.py /path/to/skill-dir

# Scan a single skill (JSON output)
python3 scripts/scan_skill.py /path/to/skill-dir --json

# Scan ALL installed skills (JSON aggregate report)
python3 scripts/aggregate_scan.py

# With custom skills directory
python3 scripts/aggregate_scan.py --skills-dir /path/to/skills

# With verbose warnings
python3 scripts/scan_skill.py /path/to/skill-dir --verbose

# Exclude false positives
python3 scripts/scan_skill.py /path/to/skill-dir --exclude "pattern1" "pattern2"

Exit Codes (scan_skill.py)

  • 0 = clean or informational only
  • 1 = suspicious (medium/high findings)
  • 2 = dangerous (critical findings)
  • 3 = error

Output Format (aggregate_scan.py)

{
  "skills": [
    {
      "name": "skill-name",
      "verdict": "clean|suspicious|dangerous|error",
      "findingsCount": 0,
      "findings": []
    }
  ],
  "summary": "All 37 skills passed with no significant issues.",
  "totalSkills": 37,
  "cleanCount": 37,
  "suspiciousCount": 0,
  "dangerousCount": 0,
  "errorCount": 0,
  "timestamp": "2026-02-02T06:00:00+00:00"
}

Auto-Detection

Both scripts auto-detect paths:

  • Skills directory: Detected from script location (walks up to find skills/ parent), falls back to ~/clawd/skills, ~/skills, ~/.openclaw/skills
  • Scanner script: aggregate_scan.py finds scan_skill.py co-located in the same directory

Handling Results

✅ Clean (verdict: "clean")

  • No action needed — skill is safe

⚠️ Suspicious (verdict: "suspicious")

  • Warn the user with a summary of findings
  • Show the category and severity of each finding

🚨 Dangerous (verdict: "dangerous")

  • Block the skill — do not proceed with installation or use
  • Show the full detailed findings to the user
  • Require explicit user override to proceed

Built-in Allowlist

The aggregate scanner includes an allowlist for known false positives:

  • Security scanners (skill-defender, clawdbot-security-check) — their docs/scripts contain the very patterns they detect
  • Auth-dependent skills (tailscale, reddit, n8n, event-planner) — legitimately reference credential paths and API keys
  • Config-aware skills (memory-setup, eightctl, summarize) — reference config paths in documentation
  • Agent-writing skills (self-improving-agent) — designed to modify agent files

Pattern Reference

See references/threat-patterns.md for full documentation of all detected patterns, organized by category with explanations of why each is dangerous.

Important Notes

  • No external dependencies — standard library only (Python 3.9+)
  • Fast — under 1 second per skill, ~30 seconds for a full batch of 30+ skills
  • This is deterministic pattern matching (Layer 2 defense). Not LLM-based.
  • False positives are possible — the allowlist and --exclude flag help
  • The scanner will flag itself if scanned without the allowlist — this is expected