High Risk: This skill has significant security concerns. Review the findings below before installing.

forge

Caution · Scanned 2/18/2026

Forge is an autonomous QA swarm that discovers, builds, tests, auto-fixes, and auto-commits code across projects. It explicitly runs shell/CLI commands (nohup ${RUN_COMMAND} > backend.log 2>&1 &, cp .env.example .env), performs networked CLI operations (npx @claude-flow/cli@latest memory store), and writes .forge/progress.jsonl.

from clawhub.ai · v95d8e50 · 76.0 KB · 0 installs
Scanned from 1.0.0 at 95d8e50
$ vett add clawhub.ai/ikennaokpala/forge

Forge

Quality forged in, not bolted on.

Forge is an autonomous quality engineering swarm skill for Claude Code that combines BDD behavioral verification, 7 quality gates, confidence-tiered learning, and self-healing fix loops. It spawns 8 specialized agents that work in parallel to verify, test, fix, and commit — continuously — until every Gherkin scenario passes and every quality gate clears.


Key Features

  • 8 specialized agents working in parallel with cost-optimized model routing
  • Gherkin behavioral specifications as the single source of truth
  • 7 quality gates: Functional, Behavioral, Coverage, Security, Accessibility, Resilience, Contract
  • Confidence-tiered fix patterns (Platinum/Gold/Silver/Bronze) that evolve over time
  • Defect prediction based on historical failure data and file changes
  • Chaos/resilience testing with controlled failure injection
  • Cross-context dependency awareness with cascade re-testing
  • Shared types and cross-cutting validation across bounded contexts
  • Agent-optimized ADRs with MUST/MUST NOT constraints and verification commands
  • Visual regression testing with pixel-by-pixel comparison
  • Architecture-agnostic — monolith, microservices, monorepo, mobile+backend
  • Optional Agentic QE integration for enhanced pattern search, security scanning, and more
  • No mocking — all tests run against the real backend

Philosophy

Three Pillars

| Pillar | Source | What It Does |
|---|---|---|
| Build | DDD+ADR+TDD methodology | Structured development with quality gates, defect prediction, confidence-tiered fixes |
| Verify | BDD/Gherkin behavioral specs | Continuous behavioral verification: the PRODUCT works, not just the CODE |
| Heal | Autonomous E2E fix loop | Test → Analyze → Fix → Commit → Learn → Repeat |

"DONE DONE"

"DONE DONE" means: the code compiles AND the product behaves as specified. Every Gherkin scenario passes. Every quality gate clears. Every dependency graph is satisfied.


Quick Start

# Copy SKILL.md to your Claude Code skills directory
cp SKILL.md ~/.claude/skills/forge.md

# Run on your project
/forge --autonomous --context payments

Invocation Modes

| Command | Description |
|---|---|
| /forge --autonomous --all | Full autonomous run: all contexts, all gates |
| /forge --autonomous --context [name] | Single context autonomous run |
| /forge --verify-only | Behavioral verification only (no fixes) |
| /forge --verify-only --context [name] | Verify single context |
| /forge --fix-only --context [name] | Fix failures, don't generate new tests |
| /forge --learn | Analyze patterns, update confidence tiers |
| /forge --add-coverage --screens [names] | Add coverage for new screens/pages/components |
| /forge --spec-gen --context [name] | Generate Gherkin specs for a context |
| /forge --spec-gen --all | Generate Gherkin specs for all contexts |
| /forge --gates-only | Run quality gates without test execution |
| /forge --gates-only --context [name] | Run gates for single context |
| /forge --predict | Defect prediction only |
| /forge --predict --context [name] | Predict defects for single context |
| /forge --chaos --context [name] | Chaos/resilience testing for a context |
| /forge --chaos --all | Chaos testing for all contexts |

Architecture

Autonomous Loop

Specify → Test → Analyze → Fix → Audit → Gate → Commit → Learn → Repeat
┌────────────────────────────────────────────────────────────────────┐
│                    FORGE AUTONOMOUS LOOP                            │
├────────────────────────────────────────────────────────────────────┤
│                                                                    │
│  ┌──────────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐      │
│  │ Specify  │──▶│   Test   │──▶│ Analyze  │──▶│   Fix    │      │
│  │ (Gherkin)│   │ (Run)    │   │ (Root    │   │ (Tiered) │      │
│  └──────────┘   └──────────┘   │  Cause)  │   └──────────┘      │
│       ▲                        └──────────┘        │              │
│       │                                            ▼              │
│  ┌──────────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐      │
│  │  Learn   │◀──│  Commit  │◀──│  Gate    │◀──│  Audit   │      │
│  │ (Update  │   │ (Auto)   │   │ (7 Gates)│   │ (A11y)   │      │
│  │  Tiers)  │   └──────────┘   └──────────┘   └──────────┘      │
│  └──────────┘                                                     │
│       │                                                           │
│       └──────────────── REPEAT ──────────────────────────────────│
│                                                                    │
│  Loop continues until: ALL 7 GATES PASS or MAX_ITERATIONS (10)   │
└────────────────────────────────────────────────────────────────────┘
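
As a rough illustration of the termination rule in the diagram, the loop can be sketched as a bounded retry. This is a minimal sketch, not Forge's actual implementation: `run_forge_loop` and the `run_cycle` callback are hypothetical names, and the only value taken from the source is the cap of 10 iterations.

```python
from typing import Callable

MAX_ITERATIONS = 10  # iteration cap stated in the loop diagram


def run_forge_loop(
    run_cycle: Callable[[int], bool],
    max_iterations: int = MAX_ITERATIONS,
) -> tuple[bool, int]:
    """Repeat the cycle until all gates pass or the iteration cap is hit.

    run_cycle is a hypothetical callback standing in for one full
    Specify -> Test -> Analyze -> Fix -> Audit -> Gate -> Commit -> Learn
    pass. It returns True when all 7 gates pass.
    Returns (all_gates_passed, iterations_used).
    """
    for iteration in range(1, max_iterations + 1):
        if run_cycle(iteration):
            return True, iteration
    return False, max_iterations


# Usage: a stand-in cycle whose gates clear on the third pass.
passed, used = run_forge_loop(lambda i: i >= 3)
print(passed, used)  # True 3
```

If the gates never clear, the function returns `(False, 10)`, which mirrors the MAX_ITERATIONS exit in the diagram.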

Execution Phases

  1. Phase 0 — Backend setup (build, run, health check, seed data)
  2. Phase 1 — Behavioral specification & architecture records (Gherkin specs, ADRs)
  3. Phase 2 — Contract & dependency validation (schemas, shared types, cross-cutting)
  4. Phase 3 — Swarm initialization (load patterns, predictions, confidence tiers)
  5. Phase 4 — Spawn 8 autonomous agents in parallel
  6. Phase 5 — Quality gates evaluation (7 gates after every fix cycle)
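
The health check in Phase 0 might look something like the polling sketch below. The names `wait_until_healthy` and `http_probe`, the attempt count, and the delay are all assumptions made for illustration; the source only says that a health check happens after the backend is built and started.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 2.0) -> Callable[[], bool]:
    """Build a probe that reports True when the URL answers with a 2xx status."""
    def probe() -> bool:
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                return 200 <= response.status < 300
        except (urllib.error.URLError, OSError):
            # Connection refused, timeout, or a non-2xx HTTP error.
            return False
    return probe


def wait_until_healthy(
    probe: Callable[[], bool],
    attempts: int = 30,           # assumed default, not from the source
    delay_seconds: float = 1.0,   # assumed default, not from the source
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the probe until it succeeds or the attempts run out."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False


# Usage with a fake probe that becomes healthy on its third call.
calls = {"n": 0}

def fake_probe() -> bool:
    calls["n"] += 1
    return calls["n"] >= 3

print(wait_until_healthy(fake_probe, attempts=5, sleep=lambda _: None))  # True
```

Against a real service, the probe could be built as `http_probe("http://localhost:8081/health")`, using the port and endpoint from the sample config.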

Quality Gates

| Gate | Check | Threshold | Blocking |
|---|---|---|---|
| 1. Functional | All tests pass | 100% pass rate | YES |
| 2. Behavioral | Gherkin scenarios satisfied | 100% of targeted scenarios | YES |
| 3. Coverage | Path coverage | >=85% overall, >=95% critical | YES (critical only) |
| 4. Security | No secrets, SAST checks, no injection vectors | 0 critical/high violations | YES |
| 5. Accessibility | Labels, target sizes, contrast | WCAG AA | Warning only |
| 6. Resilience | Offline, timeout, error handling | Tested for target context | Warning only |
| 7. Contract | API response matches schema | 0 mismatches | YES |
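
The split between blocking and warning-only gates can be sketched as follows. This is an illustrative model, not Forge's code: `GateResult`, `evaluate`, and `coverage_gate` are hypothetical names. Treating "YES (critical only)" as "a critical-path shortfall blocks, an overall shortfall only warns" is an assumption about how the table should be read.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateResult:
    name: str
    passed: bool
    blocking: bool  # False for gates marked "Warning only"


def coverage_gate(overall: float, critical: float) -> list[GateResult]:
    """Split Gate 3 into its two thresholds (percentages from the table)."""
    return [
        # Assumption: only the critical-path threshold blocks a commit.
        GateResult("Coverage (critical)", critical >= 95.0, blocking=True),
        GateResult("Coverage (overall)", overall >= 85.0, blocking=False),
    ]


def evaluate(results: list[GateResult]) -> tuple[bool, list[str], list[str]]:
    """Return (can_commit, blocking_failures, warnings)."""
    failures = [r.name for r in results if not r.passed and r.blocking]
    warnings = [r.name for r in results if not r.passed and not r.blocking]
    return (not failures, failures, warnings)


results = [
    GateResult("Functional", passed=True, blocking=True),
    GateResult("Accessibility", passed=False, blocking=False),
    *coverage_gate(overall=82.0, critical=96.5),
]
can_commit, failures, warnings = evaluate(results)
print(can_commit, failures, warnings)
# True [] ['Accessibility', 'Coverage (overall)']
```

Under this reading, a failed Accessibility or Resilience gate is reported but does not stop the commit, while any failed blocking gate does.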

Agent Roles

| Agent | Model | Role |
|---|---|---|
| Specification Verifier | Sonnet | Generates/validates Gherkin specs and ADRs for bounded contexts |
| Test Runner | Haiku | Executes E2E test suites, parses results, maps failures to specs |
| Failure Analyzer | Sonnet | Root cause analysis, pattern matching, dependency impact assessment |
| Bug Fixer | Opus | Applies confidence-tiered fixes from first principles |
| Quality Gate Enforcer | Haiku | Evaluates all 7 gates, arbitrates agent disagreements |
| Accessibility Auditor | Sonnet | WCAG AA audit: labels, contrast, targets, focus order |
| Auto-Committer | Haiku | Stages fixed files, creates detailed commits with gate statuses |
| Learning Optimizer | Sonnet | Updates confidence tiers, defect prediction, coverage metrics |
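
One way to picture the routing in this table, combined with per-project `model_routing` overrides, is a default map merged with user overrides. This is a sketch under assumptions. Only the keys `bug-fixer`, `failure-analyzer`, and `test-runner` appear in the sample config; the other kebab-case keys and the `resolve_routing` function are hypothetical.

```python
# Defaults transcribed from the Agent Roles table.
# Keys other than bug-fixer, failure-analyzer, and test-runner are
# hypothetical spellings chosen to match the config's naming style.
DEFAULT_ROUTING: dict[str, str] = {
    "specification-verifier": "sonnet",
    "test-runner": "haiku",
    "failure-analyzer": "sonnet",
    "bug-fixer": "opus",
    "quality-gate-enforcer": "haiku",
    "accessibility-auditor": "sonnet",
    "auto-committer": "haiku",
    "learning-optimizer": "sonnet",
}


def resolve_routing(overrides: dict[str, str]) -> dict[str, str]:
    """Merge overrides onto the defaults, rejecting unknown agent names."""
    unknown = sorted(set(overrides) - set(DEFAULT_ROUTING))
    if unknown:
        raise ValueError(f"unknown agents in model_routing: {unknown}")
    return {**DEFAULT_ROUTING, **overrides}


routing = resolve_routing({"bug-fixer": "sonnet"})
print(routing["bug-fixer"], routing["test-runner"])  # sonnet haiku
```

Rejecting unknown keys is a design choice made for this sketch, so that a typo in an agent name does not silently fall back to the default model.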

Configuration

Project Config (optional)

# forge.config.yaml — placed at repo root
architecture: microservices
backend:
  services:
    - name: auth-service
      port: 8081
      healthEndpoint: /health
      buildCommand: npm run build
      runCommand: npm start
frontend:
  technology: react
  testCommand: npx cypress run --spec {target}
  testDir: cypress/e2e/
  specDir: cypress/e2e/specs/

# Model routing overrides
model_routing:
  bug-fixer: opus
  failure-analyzer: sonnet
  test-runner: haiku

# Visual regression
visual_regression:
  enabled: true
  threshold: 0.001

# Agentic QE integration
integrations:
  agentic-qe:
    enabled: true
    domains: [defect-intelligence, security-compliance, visual-accessibility, contract-testing]
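
For intuition about `visual_regression.threshold`, a pixel-by-pixel comparison can be sketched as below. The source does not define what the threshold measures. This sketch assumes it is the maximum allowed fraction of differing pixels, and `diff_ratio` and `passes_visual_check` are hypothetical helpers.

```python
Pixel = tuple[int, int, int]  # one RGB pixel


def diff_ratio(baseline: list[Pixel], candidate: list[Pixel]) -> float:
    """Fraction of pixels that differ between two same-sized images."""
    if len(baseline) != len(candidate):
        raise ValueError("images must have the same number of pixels")
    if not baseline:
        return 0.0
    changed = sum(1 for a, b in zip(baseline, candidate) if a != b)
    return changed / len(baseline)


def passes_visual_check(
    baseline: list[Pixel],
    candidate: list[Pixel],
    threshold: float = 0.001,  # value from the sample config; meaning assumed
) -> bool:
    """Pass when the share of changed pixels stays at or below the threshold."""
    return diff_ratio(baseline, candidate) <= threshold


# Usage: a 100x100 white image with a single changed pixel.
baseline = [(255, 255, 255)] * 10_000
candidate = list(baseline)
candidate[0] = (0, 0, 0)
print(passes_visual_check(baseline, candidate))  # True
```

With this interpretation, 0.001 tolerates up to 10 changed pixels in a 10,000-pixel image.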

Context Config (optional)

# forge.contexts.yaml — bounded context definitions
contexts:
  - name: identity
    testFile: identity.cy.ts
    specFile: identity.feature
    paths: 68
    subdomains: [Auth, Profiles, Verification]
  - name: payments
    testFile: payments.cy.ts
    specFile: payments.feature
    paths: 89
    subdomains: [Wallet, Cards, Transactions]

dependencies:
  identity:
    blocks: [payments, orders]
  payments:
    depends_on: [identity]
    blocks: [orders, subscriptions]

If no configuration files are present, Forge auto-discovers the project structure on first run.
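
The `blocks` edges in the context config suggest how cascade re-testing could be computed: when a context changes, every context reachable through `blocks` is re-tested. The breadth-first sketch below is an assumption about that behavior, not Forge's documented algorithm. `contexts_to_retest` is a hypothetical function, and `BLOCKS` copies the edges from the sample config.

```python
from collections import deque

# "blocks" edges copied from the sample forge.contexts.yaml.
BLOCKS: dict[str, list[str]] = {
    "identity": ["payments", "orders"],
    "payments": ["orders", "subscriptions"],
}


def contexts_to_retest(changed: str, blocks: dict[str, list[str]]) -> list[str]:
    """Return every downstream context of `changed`, in breadth-first order."""
    seen = {changed}
    order: list[str] = []
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for downstream in blocks.get(current, []):
            if downstream not in seen:
                seen.add(downstream)
                order.append(downstream)
                queue.append(downstream)
    return order


print(contexts_to_retest("identity", BLOCKS))
# ['payments', 'orders', 'subscriptions']
```

A fix in `identity` would therefore trigger re-tests in all three downstream contexts, while a fix in `payments` would reach only `orders` and `subscriptions`.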


Agentic QE Integration

Forge optionally integrates with Agentic QE via MCP for enhanced capabilities:

| Capability | Without AQE | With AQE |
|---|---|---|
| Pattern Storage | claude-flow memory | ReasoningBank (vector-indexed, 150x faster) |
| Defect Prediction | File changes + history | Specialized defect-intelligence agents |
| Security Scanning | Gate 4 static checks | Full SAST/DAST analysis |
| Accessibility | Built-in auditor | visual-tester + accessibility-auditor |
| Contract Testing | Schema validation | contract-validator + graphql-tester |
| Progress | .forge/progress.jsonl | AG-UI real-time streaming |

All AQE features are additive. Forge works identically without AQE installed.



License

MIT