context-engineering

Verified · Scanned 2/18/2026

This skill provides context-engineering guidance plus two CLI tools, one for context analysis and one for compression evaluation: ./scripts/context_analyzer.py and ./scripts/compression_evaluator.py. Run them locally with python context_analyzer.py analyze <context_file> and python compression_evaluator.py evaluate <original_file> <compressed_file>. The skill exposes no network endpoints and reads no secret files.

by mrgoonie · v1.0.0 · 49.5 KB · 159 installs
Scanned from main at 05a2268
$ vett add mrgoonie/claudekit-skills/context-engineering

Context Engineering

Context engineering curates the smallest high-signal token set for LLM tasks. The goal: maximize reasoning quality while minimizing token usage.

When to Activate

  • Designing/debugging agent systems
  • Context limits constrain performance
  • Optimizing cost/latency
  • Building multi-agent coordination
  • Implementing memory systems
  • Evaluating agent performance
  • Developing LLM-powered pipelines

Core Principles

  1. Context quality > quantity - High-signal tokens beat exhaustive content
  2. Attention is finite - Recall follows a U-shaped curve that favors the beginning and end of the context
  3. Progressive disclosure - Load information just-in-time
  4. Isolation prevents degradation - Partition work across sub-agents
  5. Measure before optimizing - Know your baseline
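
Principle 2 can be illustrated with a small ordering helper. This is a minimal sketch, not part of the skill: the function name `arrange_for_attention` and the numeric priority scores are hypothetical. It assumes each context item already carries a priority, and it places the highest-priority items at the edges so the lowest-priority material falls in the middle.

```python
def arrange_for_attention(items):
    """Order (priority, text) pairs so high-priority text sits at the edges.

    Items are ranked by descending priority and dealt alternately to the
    front and the back, which pushes low-priority text toward the middle.
    The priority scores are an assumption made for this illustration.
    """
    ranked = sorted(items, key=lambda item: item[0], reverse=True)
    front, back = [], []
    for index, (_, text) in enumerate(ranked):
        if index % 2 == 0:
            front.append(text)
        else:
            back.append(text)
    # Reverse the back half so the second-ranked item ends up last.
    return front + back[::-1]


context = arrange_for_attention([
    (3, "task statement"),
    (1, "old tool log"),
    (2, "hard constraints"),
    (0, "small talk"),
])
```

Here the task statement lands first and the hard constraints last, with the log and small talk between them.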

Quick Reference

Topic        | When to Use                                        | Reference
Fundamentals | Understanding context anatomy, attention mechanics | context-fundamentals.md
Degradation  | Debugging failures, lost-in-middle, poisoning      | context-degradation.md
Optimization | Compaction, masking, caching, partitioning         | context-optimization.md
Compression  | Long sessions, summarization strategies            | context-compression.md
Memory       | Cross-session persistence, knowledge graphs        | memory-systems.md
Multi-Agent  | Coordination patterns, context isolation           | multi-agent-patterns.md
Evaluation   | Testing agents, LLM-as-Judge, metrics              | evaluation.md
Tool Design  | Tool consolidation, description engineering        | tool-design.md
Pipelines    | Project development, batch processing              | project-development.md

Key Metrics

  • Token utilization: Warning at 70%, trigger optimization at 80%
  • Token usage: Explains roughly 80% of the variance in agent performance
  • Multi-agent cost: ~15x single agent baseline
  • Compaction target: 50-70% reduction, <5% quality loss
  • Cache hit target: 70%+ for stable workloads
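
The utilization and compaction thresholds above can be turned into simple checks. This is a minimal sketch under stated assumptions: the function and constant names are hypothetical, token counts are supplied by the caller, and the 50-70% compaction target is read as an inclusive range. The <5% quality-loss criterion needs a separate evaluation and is not modeled here.

```python
WARNING_THRESHOLD = 0.70   # "Warning at 70%"
OPTIMIZE_THRESHOLD = 0.80  # "trigger optimization at 80%"


def utilization_status(tokens_used, context_window):
    """Classify context-window utilization as 'ok', 'warning', or 'optimize'."""
    if context_window <= 0:
        raise ValueError("context_window must be positive")
    utilization = tokens_used / context_window
    if utilization >= OPTIMIZE_THRESHOLD:
        return "optimize"
    if utilization >= WARNING_THRESHOLD:
        return "warning"
    return "ok"


def compaction_reduction(original_tokens, compacted_tokens):
    """Return the fraction of tokens removed by compaction."""
    if original_tokens <= 0:
        raise ValueError("original_tokens must be positive")
    return 1 - compacted_tokens / original_tokens


def compaction_within_target(original_tokens, compacted_tokens):
    """Check the reduction against the 50-70% target (range is an assumption)."""
    reduction = compaction_reduction(original_tokens, compacted_tokens)
    return 0.50 <= reduction <= 0.70
```

For example, `utilization_status(150_000, 200_000)` falls in the warning band, and compacting 1,000 tokens down to 400 is a 60% reduction, which is inside the target.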

Four-Bucket Strategy

  1. Write: Save context externally (scratchpads, files)
  2. Select: Pull only relevant context (retrieval, filtering)
  3. Compress: Reduce tokens while preserving info (summarization)
  4. Isolate: Split across sub-agents (partitioning)
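
The four buckets can be sketched as four small functions. This is a toy illustration, not the skill's implementation: all names are hypothetical, the scratchpad is an in-memory dict rather than a file, selection uses plain keyword overlap in place of real retrieval, and compression is simple truncation standing in for summarization.

```python
def write(scratchpad, key, note):
    """Write: save a note outside the active context (here, a dict)."""
    scratchpad[key] = note


def select(scratchpad, query, limit=2):
    """Select: return up to `limit` notes sharing words with the query."""
    query_words = set(query.lower().split())
    scored = []
    for key, note in scratchpad.items():
        overlap = len(query_words & set(note.lower().split()))
        if overlap:
            scored.append((overlap, key, note))
    # Highest overlap first; ties broken by key for a stable order.
    scored.sort(key=lambda row: (-row[0], row[1]))
    return [note for _, _, note in scored[:limit]]


def compress(note, max_words=8):
    """Compress: truncate to max_words (a stand-in for summarization)."""
    words = note.split()
    if len(words) <= max_words:
        return note
    return " ".join(words[:max_words]) + " ..."


def isolate(tasks, agent_count):
    """Isolate: split tasks round-robin into one list per sub-agent."""
    return [tasks[i::agent_count] for i in range(agent_count)]


scratchpad = {}
write(scratchpad, "db", "database migration failed on step three")
write(scratchpad, "ui", "button color changed to blue")
relevant = select(scratchpad, "why did the migration fail")
```

Only the database note shares a word with the query, so `relevant` contains that note alone and the UI note stays out of the context.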

Anti-Patterns

  • Exhaustive context over curated context
  • Critical info in middle positions
  • No compaction triggers before limits
  • Single agent for parallelizable tasks
  • Tools without clear descriptions

Guidelines

  1. Place critical info at beginning/end of context
  2. Implement compaction at 70-80% utilization
  3. Use sub-agents for context isolation, not role-play
  4. Design tools with 4-question framework (what, when, inputs, returns)
  5. Optimize for tokens-per-task, not tokens-per-request
  6. Validate with probe-based evaluation
  7. Monitor KV-cache hit rates in production
  8. Start minimal, add complexity only when proven necessary
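
Guideline 4 lends itself to a small data structure. The sketch below is one possible way to hold the four answers (what, when, inputs, returns) and flag any that are missing. The class name `ToolDescription`, its methods, and the example tool are hypothetical and are not taken from tool-design.md.

```python
from dataclasses import dataclass

QUESTIONS = ("what", "when", "inputs", "returns")


@dataclass(frozen=True)
class ToolDescription:
    """Answers to the four questions a tool description should cover."""
    what: str      # What does the tool do?
    when: str      # When should the agent call it?
    inputs: str    # What inputs does it take?
    returns: str   # What does it return?

    def missing_answers(self):
        """Return the names of questions left blank."""
        return [name for name in QUESTIONS if not getattr(self, name).strip()]

    def render(self):
        """Format the answers as a plain-text description, one per line."""
        return "\n".join(
            f"{name.capitalize()}: {getattr(self, name)}" for name in QUESTIONS
        )


draft = ToolDescription(
    what="Searches the ticket archive by keyword.",
    when="Use when the user asks about a past support ticket.",
    inputs="",
    returns="A list of matching ticket IDs with titles.",
)
gaps = draft.missing_answers()
```

Here `gaps` reports that the inputs answer is blank, which can be caught before the description reaches an agent.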

Scripts

  • context_analyzer.py - Context health analysis, degradation detection
  • compression_evaluator.py - Compression quality evaluation