skill-defender

✓Verified·Scanned 2/18/2026

This skill is an offline, deterministic scanner that inspects installed skills for prompt injection, credential theft, exfiltration, obfuscation, and backdoors. It launches subprocesses via subprocess.run to execute python3 scan_skill.py on skill directories; this execution is purpose-aligned and is the only security-relevant behavior detected.

from clawhub.ai·v852e50a·62.1 KB·0 installs

Scanned from 1.0.0 at 852e50a · Transparency log ↗

$ vett add clawhub.ai/itsclawdbro/skill-defender

Skill Defender — Malicious Pattern Scanner

When to Run

Automatic Triggers

New skill installed — Immediately run scan_skill.py against it before allowing use
Skill updated — Re-scan after any file changes in a skill directory
Periodic audit — Run batch scan on all installed skills when requested

Manual Triggers

User says "scan skill X" → scan that specific skill
User says "scan all skills" → batch scan all skills
User says "security check" or "audit skills" → same as above

Scripts

`scripts/scan_skill.py` — Single Skill Scanner

Scans one skill directory for malicious patterns. Produces JSON or human-readable output.

`scripts/aggregate_scan.py` — Batch Scanner

Scans ALL installed skills and produces a single JSON report. Includes a built-in allowlist to reduce false positives from security-related skills, API skills, and other known-safe patterns.

How to Run

# Scan a single skill (human-readable)
python3 scripts/scan_skill.py /path/to/skill-dir

# Scan a single skill (JSON output)
python3 scripts/scan_skill.py /path/to/skill-dir --json

# Scan ALL installed skills (JSON aggregate report)
python3 scripts/aggregate_scan.py

# With custom skills directory
python3 scripts/aggregate_scan.py --skills-dir /path/to/skills

# With verbose warnings
python3 scripts/scan_skill.py /path/to/skill-dir --verbose

# Exclude false positives
python3 scripts/scan_skill.py /path/to/skill-dir --exclude "pattern1" "pattern2"

Exit Codes (scan_skill.py)

0 = clean or informational only
1 = suspicious (medium/high findings)
2 = dangerous (critical findings)
3 = error

Output Format (aggregate_scan.py)

{
  "skills": [
    {
      "name": "skill-name",
      "verdict": "clean|suspicious|dangerous|error",
      "findingsCount": 0,
      "findings": []
    }
  ],
  "summary": "All 37 skills passed with no significant issues.",
  "totalSkills": 37,
  "cleanCount": 37,
  "suspiciousCount": 0,
  "dangerousCount": 0,
  "errorCount": 0,
  "timestamp": "2026-02-02T06:00:00+00:00"
}

Auto-Detection

Both scripts auto-detect paths:

Skills directory: Detected from script location (walks up to find skills/ parent), falls back to ~/clawd/skills, ~/skills, ~/.openclaw/skills
Scanner script: aggregate_scan.py finds scan_skill.py co-located in the same directory

Handling Results

✅ Clean (`verdict: "clean"`)

No action needed — skill is safe

⚠️ Suspicious (`verdict: "suspicious"`)

Warn the user with a summary of findings
Show the category and severity of each finding

🚨 Dangerous (`verdict: "dangerous"`)

Block the skill — do not proceed with installation or use
Show the full detailed findings to the user
Require explicit user override to proceed

Built-in Allowlist

The aggregate scanner includes an allowlist for known false positives:

Security scanners (skill-defender, clawdbot-security-check) — their docs/scripts contain the very patterns they detect
Auth-dependent skills (tailscale, reddit, n8n, event-planner) — legitimately reference credential paths and API keys
Config-aware skills (memory-setup, eightctl, summarize) — reference config paths in documentation
Agent-writing skills (self-improving-agent) — designed to modify agent files

Pattern Reference

See references/threat-patterns.md for full documentation of all detected patterns, organized by category with explanations of why each is dangerous.

Important Notes

No external dependencies — standard library only (Python 3.9+)
Fast — under 1 second per skill, ~30 seconds for a full batch of 30+ skills
This is deterministic pattern matching (Layer 2 defense). Not LLM-based.
False positives are possible — the allowlist and --exclude flag help
The scanner will flag itself if scanned without the allowlist — this is expected