devops
✓Verified·Scanned 2/18/2026
Automate deployments, manage infrastructure, and build reliable CI/CD pipelines.
from clawhub.ai·v01336a0·4.0 KB·0 installs
Scanned from 1.0.0 at 01336a0 · Transparency log ↗
$ vett add clawhub.ai/ivangdavila/devops
DevOps Rules
CI/CD Pipelines
- Fail fast: run linting and unit tests before expensive integration tests — saves time and compute
- Cache dependencies between runs —
npm installon every build wastes minutes - Pin action versions with SHA, not tags —
actions/checkout@v3can change, SHA is immutable - Secrets in environment variables, never in code or logs — mask them in CI output
- Parallel jobs for independent steps — test, lint, and build can run simultaneously
Deployment Strategies
- Blue-green: run new version alongside old, switch traffic atomically — instant rollback by switching back
- Canary: route percentage of traffic to new version — catch issues before full rollout
- Rolling: update instances incrementally — balance between speed and risk
- Always have rollback plan before deploying — know exactly how to revert
- Deploy the same artifact to all environments — build once, promote through stages
Infrastructure as Code
- Version control all infrastructure — terraform, ansible, cloudformation in git
- Never apply changes without plan/diff review —
terraform planbeforeapply - State files contain secrets — store remotely with encryption, never in git
- Modules for reusable components — don't copy-paste infrastructure definitions
- Separate environments with workspaces or directories — dev changes shouldn't affect prod
Containers
- One process per container — containers are not VMs
- Health checks are mandatory — orchestrators need them for routing and restarts
- Don't run as root — use non-root USER in Dockerfile
- Immutable images: config via environment, not baked in — same image in all environments
- Tag images with git SHA, not just
latest— know exactly what's deployed
Secrets Management
- Never store secrets in environment files committed to git — use vault, sealed secrets, or CI secret storage
- Rotate secrets regularly — automation makes rotation painless
- Different secrets per environment — dev leak shouldn't compromise prod
- Audit secret access — know who accessed what and when
- Secrets in memory, not disk when possible — temp files persist longer than expected
Monitoring & Alerting
- Four golden signals: latency, traffic, errors, saturation — start here
- Alert on symptoms, not causes — "users seeing errors" not "CPU high"
- Every alert must be actionable — if you can't do anything, it's noise
- Dashboard per service with key metrics — one glance shows health
- Structured logs (JSON) for machine parsing — grep works, but queries are better
Reliability
- Define SLOs before building alerting — what does "healthy" mean for this service?
- Error budgets: some failures are acceptable — 99.9% means 8 hours downtime/year is OK
- Chaos engineering in staging — break things intentionally before prod breaks accidentally
- Runbooks for common incidents — 3am is not the time to figure out recovery steps
- Post-mortems without blame — focus on systems, not people
Common Mistakes
- SSH into prod to fix things — all changes through automation, or you'll forget what you did
- No staging environment — "works on my machine" doesn't mean works in prod
- Ignoring flaky tests — they erode trust in CI, either fix or delete
- Manual steps in deployment — if it's not automated, it'll be done wrong eventually
- Monitoring only happy paths — check error rates and edge cases too
Networking
- Internal services don't need public IPs — use private subnets, expose only load balancers
- TLS everywhere, including internal traffic — zero trust, even behind firewall
- DNS for service discovery — hardcoded IPs break when things move
- Load balancer health checks separate from app health — LB needs fast response, app health can be thorough
- Firewall default deny — explicitly allow what's needed, block everything else