The Archaeology Pattern
Diggin things down is a completely different paradigm compared to building things. This pattern specializes on how we can use AI to understand the codebase. It might sound simple at first but reality is a bit different.
The archaeology pattern often reveals:
- Undocumented workarounds for specific edge cases
- Implicit architectural decisions
- Hidden dependencies between modules
- Performance optimizations that look like complexity
- Security measures that seem redundant
Excavating Wisdom from Existing Codebases
When applying Vibe Coding to existing projects, you become an archaeologist who is excavating layers of decisions, uncovering hidden patterns, and reconstructing the evolution story. Every line of code contains history: why certain paths were chosen, what alternatives were rejected, and what edge cases shaped the architecture.
Without this archaeological understanding, AI might simplify architecture back to already-invalidated states, removing critical workarounds or “fixing” intentional complexity that handles important edge cases.
Core Principle
Existing code contains hidden reasons—don’t simplify without understanding why.
Technical debt often exists for valid historical reasons. A seemingly redundant check might prevent a race condition discovered in production. That “ugly” workaround might handle a third-party API’s undocumented behavior. The archaeology pattern ensures we preserve this hard-won wisdom.
The Five-Phase Archaeological Process
Phase 1: Initial Context Gathering
Start by having AI read the entire codebase to understand what you’re working with:
This is an ongoing project currently under heavy development.
Read all files and tell me about this project in non-technical terms.
Focus on:
- Code files (not existing documentation)
- Read code files fully
- What the system actually does
Create: devdocs/archaeology/small_summary.md
This gives you a baseline understanding before diving deeper.
Phase 2: Technical Architecture Discovery
Next, understand the technical design:
Analyze how this codebase is designed in terms of:
- Data flow paths
- Main abstractions
- Top-level design patterns
Explain at a high level, as if introducing the architecture to a new engineer.
Create: devdocs/archaeology/intro2codebase.md
Phase 3: Deep Execution Tracing
This is where the real archaeology happens. By following execution paths to understand actual behavior:
Identify every internal interface and submodule interaction.
For each interface, trace the execution path end-to-end:
- What calls it
- What it triggers
- How data/state moves through layers
- What outcomes it produces
Create trace files under devdocs/archaeology/traces/
Each trace should include:
- Entry Point
- Execution Path
- Resource Management
- Error Path
- Performance Characteristics
- Observable Effects
- Why This Design
- What feels incomplete/vulnerable/poorly designed
These traces reveal the true architecture, not what documentation claims or what naming suggests.
Phase 4: Insight Synthesis
After tracing, identify improvement opportunities while respecting existing decisions. This part doesnt actually meant to improve the codebase later. But we are contexting the AI to see the codebase in different angle.
Now look at all traces and what are the 5 things that would improve the codebase a lot?
Put these in devdocs/archaeology/5_things_or_not.md
And make sure after each of them think for a possible reason this thing is not implemented/fixed at the current code
(the reason is there might be some undocumented decisions and these 5 things might refer to them )
Phase 5: Concept Inventory
Map all concepts found in the code:
Create concept inventories:
1. devdocs/archaeology/concepts/technical_concepts_list.md
2. devdocs/archaeology/concepts/design_concepts_list.md
3. devdocs/archaeology/concepts/business_lvl_concepts_list.md
4. devdocs/archaeology/concepts/missing_concepts_list.md
For each concept include:
- High-level explanation (max 3 sentences)
- Current implementation status
- Hidden assumptions or edge-case handling
- Future-oriented design considerations
Surface implicit concepts that exist only in code.
Moving from Archaeology to Evolution
Phase 6: Foundation Extraction
Extract the true foundations from code analysis:
Based solely on codebase analysis, extract:
1. devdocs/archaeology/foundations/project_description.md
- What the system actually does
- Current use cases evident in code
- Problems being solved
2. devdocs/archaeology/foundations/philosophy.md
- Implicit design principles
- Coding patterns consistently used
- Architectural decisions evident in structure
3. devdocs/archaeology/foundations/known_requirements.md
- Requirements inferred from implementations
- Constraints visible in code
- Compliance/security measures present
Phase 7: Gap Analysis
Compare current state to desired future:
Create devdocs/evolution/gap_analysis.md:
1. What concepts need implementation
2. What architecture changes are required
3. What technical debt blocks progress
4. What can be incrementally improved
5. What requires complete rewrite
Phase 8: Strategic Planning
Cleanup Inventory
Before adding new code, identify what can be removed:
Scan for unused elements:
- Unreferenced files and modules
- Dead code paths
- Commented-out code blocks
- Duplicate implementations
- Abandoned features
Create: devdocs/archaeology/cleanup_inventory.md
WARNING: Never delete immediately—code that looks unused might be:
- Loaded dynamically
- Referenced in configuration
- Used by external systems
- Kept for compliance/audit reasons
Refactoring Opportunities
Identify high-value improvements:
Find refactoring candidates:
- Scattered database operations → Repository Pattern
- Mixed business/infrastructure logic → Service Pattern
- Repeated API patterns → Gateway abstraction
- Global state → Dependency Injection
- Hard-coded configs → Configuration Pattern
Create: devdocs/evolution/refactoring_opportunities.md
For each opportunity document:
- Current problematic pattern
- Proposed solution
- Benefits (testability, maintainability)
- Implementation effort
- Risk assessment
- ROI justification
Phase 9: Baseline Testing
Establish what currently works:
Create comprehensive smoke tests to validate implementation.
Create: smoke_tests/check_what_is_working/
Requirements:
- 5 test files with 5 test cases each
- No mocking—use real components
- Test actual behavior, not assumptions
- Verbose output showing exactly what fails
- Cover entire codebase systematically
Document results in report.md:
- What works as expected
- What's broken but acceptable
- Critical failures needing immediate attention
Phase 10: Implementation Roadmap
Create a phased execution plan:
Create devdocs/evolution/implementation_roadmap.md:
Phase A - Foundation (Week 1-2):
- Critical cleanup
- Essential refactoring
- Fix broken core functionality
- Establish CI/CD
Phase B - Core Refactoring (Week 3-4):
- Highest-ROI improvements
- Add abstraction layers
- Decouple modules
- Run smoke tests after each change
Phase C - Gap Filling (Week 5-8):
- Implement missing concepts
- Module-by-module improvements
- Add identified features
Phase D - Integration (Week 9-10):
- Ensure modules work together
- Performance optimization
- Documentation updates
- Comprehensive testing
Common Archaeological Discoveries
The archaeology pattern often reveals:
- Undocumented workarounds for specific edge cases that took weeks to discover
- Implicit architectural decisions that seem arbitrary but prevent subtle bugs
- Hidden dependencies between modules that look independent
- Performance optimizations disguised as unnecessary complexity
- Security measures that appear redundant but handle attack vectors
- Data migrations embedded in code to handle legacy formats
- Feature flags for abandoned experiments that still affect behavior
Why This Works
Manual archaeology would take weeks of code reading and interviews with original developers (if available). AI can traverse entire codebases in minutes, recognizing patterns humans might miss. The key is asking the right questions to extract not just what exists, but why it exists that way.
Critical Success Factors
- Never assume code is wrong—assume you don’t understand its purpose yet
- Document before changing—capture existing behavior first
- Test current state—establish baseline before modifications
- Preserve edge cases—they represent real-world lessons
- Respect technical debt—it often represents conscious trade-offs
The Archaeology Mindset
Approach existing code like an archaeologist approaches an ancient site:
- Every artifact has meaning
- Context is everything
- Preserve before modifying
- Document all findings
- Respect what came before
This pattern transforms the daunting task of understanding legacy code into a systematic exploration that preserves hard-won wisdom while identifying genuine improvement opportunities.