Table of Contents
- TL;DR
- The Problem: AI Tools Can't See the Forest for the Trees
- Our Discovery: 97.5% of Context is Waste
- Case Study: Production Memory Leak
- How It Works: 5 Smart Strategies
- Implementation Guide
- Common Pitfalls & Solutions
- The Context Maturity Model
- Actionable Takeaways for Prompt Engineering
- FAQ
- Resources
- Citation
The Context Goldilocks Zone: Finding the 97.5% Waste in AI Coding Tools #
Date: July 11, 2025 | Author: Eric Liao | Reading time: 8 minutes
TL;DR #
Current AI coding tools waste 97.5% of their context window on irrelevant code. By implementing intelligent context selection, we achieved:
- 📉 97.5% fewer tokens (168,312 → 4,218)
- 💰 40x cost reduction
- ⚡ 7x faster responses
- ✅ Same quality (81% score)
Quick wins:
- Today: Basic filtering → 85% reduction
- This week: Dependency analysis → 95% reduction
- This month: Task strategies → 97%+ reduction
The Problem: AI Tools Can't See the Forest for the Trees #
Context in AI coding assistants is like a flashlight beam: it determines what the AI can "see" while solving your problem. Most tools today use floodlights when they need spotlights.
Example: Fixing a login bug
❌ Traditional: 168,312 tokens (entire codebase)
✅ Smart: 4,218 tokens (just auth files + tests)
Our Discovery: 97.5% of Context is Waste #
We tested 21 real coding tasks across 5 categories. Results:
| Metric  | Before     | After      | Impact |
|---------|------------|------------|--------|
| Tokens  | 168,312    | 4,218      | -97.5% |
| Cost    | $8.40/task | $0.21/task | -97.5% |
| Speed   | 8.4s       | 1.2s       | -85.7% |
| Quality | 81%        | 81%        | Same   |
Case Study: Production Memory Leak #
Task: Fix OOM errors in the payment service
Traditional approach: 284,291 tokens, 4 attempts, 8.4s
Smart selection: 5,832 tokens, 1 attempt, 1.2s
Selected files:

```
├── payment-service/processor.go       # Main logic
├── payment-service/cache.go           # Leak source
├── payment-service/processor_test.go  # Memory tests
├── shared/metrics/memory.go           # Monitoring
└── logs/payment-service-oom.log       # Error logs
```
Root cause found immediately: Unbounded cache growth
Other successful applications:
- 2FA Implementation: 94% reduction, 60% fewer bugs
- Database Migration: 92% reduction, avoided token limits
- API Refactoring: 96% reduction, 3x faster completion
How It Works: 5 Smart Strategies #
1. Dependency Analysis (Highest ROI) #
```go
// selectByDependency walks imports breadth-first from the files the
// task mentions, stopping once the token budget is spent.
// countTokens and getImports are helpers sketched elsewhere.
func selectByDependency(task Task, budget int) []string {
	selected := make(map[string]bool)
	queue := append([]string{}, task.MentionedFiles...)

	// Follow imports breadth-first; skipping already-selected files
	// also guards against circular dependencies.
	for len(queue) > 0 && countTokens(selected) < budget {
		file := queue[0]
		queue = queue[1:]
		if selected[file] {
			continue
		}
		selected[file] = true
		queue = append(queue, getImports(file)...)
	}

	files := make([]string, 0, len(selected))
	for f := range selected {
		files = append(files, f)
	}
	return files
}
```
2. Task-Specific Selection #
```go
strategies := map[TaskType]Config{
	Debug:    {includeTests: true, depth: 3, recentCommits: 10},
	Feature:  {includeExamples: true, depth: 2, relatedFiles: true},
	Refactor: {includeAllUsages: true, depth: 5},
}
```
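For reference, the snippet above assumes definitions roughly like the following. These are illustrative sketches, not the exact types in teeny-orb:

```go
// TaskType classifies the user's request so the selector can
// pick an appropriate strategy.
type TaskType int

const (
	Debug TaskType = iota
	Feature
	Refactor
)

// Config tunes how much context a given task type pulls in.
type Config struct {
	includeTests     bool // include *_test.* files (debugging)
	includeExamples  bool // include example usages (new features)
	includeAllUsages bool // include every call site (refactors)
	relatedFiles     bool // include siblings of mentioned files
	depth            int  // import hops to follow
	recentCommits    int  // recent commits to scan for hot files
}
```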
3. Smart Caching #
```go
// Invalidate on file changes, not just time
cache.Watch(file, func() {
	cache.Invalidate(file)
	cache.InvalidateDependents(file)
})
```
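`InvalidateDependents` does the interesting work. A minimal sketch of one way to implement it, assuming the cache keeps a reverse-dependency index; all names here are illustrative:

```go
import "sync"

// ContextCache stores per-file analysis results plus a reverse
// index: for each file, the files that import it.
type ContextCache struct {
	mu        sync.Mutex
	entries   map[string][]byte   // cached context per file
	importers map[string][]string // file -> files that depend on it
}

// InvalidateDependents drops the cached entry for every file that
// transitively imports the changed file.
func (c *ContextCache) InvalidateDependents(file string) {
	c.mu.Lock()
	defer c.mu.Unlock()

	queue := append([]string{}, c.importers[file]...)
	seen := map[string]bool{}
	for len(queue) > 0 {
		f := queue[0]
		queue = queue[1:]
		if seen[f] {
			continue // guard against import cycles
		}
		seen[f] = true
		delete(c.entries, f)
		queue = append(queue, c.importers[f]...)
	}
}
```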
4. Adaptive Learning #
```go
// Learn from feedback
learner.Track(task, usedFiles, missingFiles)
// After 10 tasks: 95%+ accuracy
```
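One plausible shape for such a learner, sketched with illustrative names (the task argument is dropped for brevity): score every file, reward files the model actually used, and reward files the selector missed even more, so they get picked up next time.

```go
// Learner keeps a relevance score per file, updated from feedback
// after each task.
type Learner struct {
	scores map[string]float64
}

// NewLearner returns a learner with an empty score table.
func NewLearner() *Learner {
	return &Learner{scores: make(map[string]float64)}
}

// Track boosts files the model actually used, and boosts files the
// selector omitted but the task needed even more.
func (l *Learner) Track(usedFiles, missingFiles []string) {
	for _, f := range usedFiles {
		l.scores[f] += 1.0
	}
	for _, f := range missingFiles {
		l.scores[f] += 2.0 // a missed file is the costlier mistake
	}
}
```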
5. Context Boundaries #
Use task structure (TODOs, comments) to define natural boundaries rather than arbitrary file counts.
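A hedged sketch of what boundary detection could look like: treat files that carry explicit task markers as the edge of the context. All names are illustrative.

```go
import (
	"os"
	"regexp"
)

// findTaskBoundary returns the files that carry explicit task
// markers (TODO/FIXME), which serve as the natural context boundary.
func findTaskBoundary(files []string) []string {
	marker := regexp.MustCompile(`\b(TODO|FIXME)\b`)
	var boundary []string
	for _, path := range files {
		data, err := os.ReadFile(path)
		if err != nil {
			continue // unreadable files fall outside the boundary
		}
		if marker.Match(data) {
			boundary = append(boundary, path)
		}
	}
	return boundary
}
```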
Implementation Guide #
Quick Start (1 hour) - 85% reduction #
```yaml
# .ai-context.yml
exclude:
  - "node_modules/**"
  - "dist/**"
  - "*.test.js"
  - ".git/**"
```
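Applying an exclude list like this takes only a few lines. A minimal Go sketch, assuming the patterns were already loaded from the YAML, and using simple prefix/suffix checks in place of real `**` globbing (which needs a glob library):

```go
import "strings"

// excluded reports whether a path matches any exclude pattern.
// "dir/**" patterns are treated as prefix matches and "*.ext"
// patterns as suffix matches; a production version would use a
// globbing library that understands "**".
func excluded(path string, patterns []string) bool {
	for _, p := range patterns {
		switch {
		case strings.HasSuffix(p, "/**"):
			if strings.HasPrefix(path, strings.TrimSuffix(p, "**")) {
				return true
			}
		case strings.HasPrefix(p, "*."):
			if strings.HasSuffix(path, strings.TrimPrefix(p, "*")) {
				return true
			}
		case path == p:
			return true
		}
	}
	return false
}
```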
Dependency Analysis (1 day) - 95% reduction #
```javascript
// Simple import follower: parseImports and resolveImport are
// placeholders for an AST walk and module resolution, respectively.
function getDependencies(file) {
  const imports = parseImports(file);
  return imports.map((imp) => resolveImport(imp));
}
```
Full System (1 week) - 97%+ reduction #
See complete implementation: github.com/rcliao/teeny-orb
Integration Examples #
VS Code + Continue.dev:
```typescript
// .continue/config.ts
config.contextProviders = [{
  name: "smart-context",
  params: {
    strategy: "dependency-aware",
    maxTokens: 5000,
    taskType: "auto-detect"
  }
}];
```
Similar patterns work for Cursor, Copilot, and Claude Code; see the /integrations/ directory for all templates.
Common Pitfalls & Solutions #
| Problem               | Solution                          |
|-----------------------|-----------------------------------|
| Circular dependencies | Track visited files               |
| Dynamic imports       | Pattern matching + config parsing |
| Monorepos             | Service boundary detection        |
| Missing context       | Adaptive learning from feedback   |
| Cache staleness       | File-change watchers              |
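For instance, dynamic imports defeat AST-based import following, but a regex pass recovers the common cases. A hedged sketch:

```go
import "regexp"

// dynamicImportRe matches common dynamic-import shapes such as
// import("./mod") and require("./mod") with a string-literal argument.
var dynamicImportRe = regexp.MustCompile(`(?:import|require)\(\s*['"]([^'"]+)['"]\s*\)`)

// findDynamicImports scans source text for dynamic imports that an
// AST-based import walker would miss.
func findDynamicImports(src string) []string {
	var deps []string
	for _, m := range dynamicImportRe.FindAllStringSubmatch(src, -1) {
		deps = append(deps, m[1])
	}
	return deps
}
```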
The Context Maturity Model #
| Level | Approach          | Token Waste | Your Team?        |
|-------|-------------------|-------------|-------------------|
| 0     | Send everything   | 95-99%      | Most teams        |
| 1     | Basic exclusions  | 85-95%      | After .gitignore  |
| 2     | Keyword matching  | 70-85%      | Basic search      |
| 3     | Dependency aware  | 50-70%      | Import following  |
| 4     | Task-specific     | 20-50%      | Strategy patterns |
| 5     | Adaptive learning | 2-20%       | ML feedback       |
| 6     | Goldilocks zone   | <5%         | Our approach      |
Actionable Takeaways for Prompt Engineering #
- Specify relevant files explicitly (immediate impact):
  - ❌ "Fix the login bug"
  - ✅ "Fix the login bug in auth/login.js and auth/validate.js"
  - Result: 90%+ context reduction, faster responses
- Use file patterns for focused searches:
  - ❌ "Find all API endpoints"
  - ✅ "Find all API endpoints in routes/**/*.js"
  - Result: avoids scanning unrelated files like tests, docs, and configs
- Include task type for smarter selection:
  - ❌ "Refactor the payment service"
  - ✅ "Refactor the payment service (focus on payment/*.go and its direct imports)"
  - Result: the AI follows the dependency chain instead of guessing
- Leverage natural boundaries:
  - ❌ "Implement the new feature"
  - ✅ "Implement the TODO items in features/checkout.js"
  - Result: uses existing code structure as context boundaries
- Review and control context explicitly:
  - ✅ "Show me what files you're including before analyzing"
  - ✅ "List the files in scope: auth/*.js, user/*.js"
  - ✅ "Exclude test files from the context"
  - ✅ "Only include files modified in the last commit"
  - Result: verify context before processing and avoid token waste
- Use context commands in AI tools:

  ```
  # Claude.ai
  "What files are currently in your context?"

  # GitHub Copilot
  @workspace /include auth/**/*.js

  # Cursor
  ⌘K → "Focus on files: [list]"
  ```

  Result: direct control over what the AI sees
FAQ #
Q: Does this work with my tool?
A: Yes. The principles apply to any AI coding tool, and we provide configs for the major ones.

Q: How much implementation effort is this?
A: Hours for basic filtering (85% reduction), days for dependency analysis (95%+), weeks for the full system (97%+).

Q: What about huge context windows?
A: Bigger ≠ better. More context means more noise, higher cost, and slower responses.
Resources #
- Code: github.com/rcliao/teeny-orb
- Experiments: /experiments/week5-8/
- Integration templates: /integrations/
This research was conducted on real production codebases. All metrics are reproducible using our open-source tools.
Citation #
```bibtex
@article{liao2025context,
  title={The Context Goldilocks Zone},
  author={Liao, Eric},
  year={2025},
  url={github.com/rcliao/teeny-orb}
}
```