Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

The Archaeology Pattern

Diggin things down is a completely different paradigm compared to building things. This pattern specializes on how we can use AI to understand the codebase. It might sound simple at first but reality is a bit different.

The archaeology pattern often reveals:

  • Undocumented workarounds for specific edge cases
  • Implicit architectural decisions
  • Hidden dependencies between modules
  • Performance optimizations that look like complexity
  • Security measures that seem redundant

Excavating Wisdom from Existing Codebases

When applying Vibe Coding to existing projects, you become an archaeologist who is excavating layers of decisions, uncovering hidden patterns, and reconstructing the evolution story. Every line of code contains history: why certain paths were chosen, what alternatives were rejected, and what edge cases shaped the architecture.

Without this archaeological understanding, AI might simplify architecture back to already-invalidated states, removing critical workarounds or “fixing” intentional complexity that handles important edge cases.

Core Principle

Existing code contains hidden reasons—don’t simplify without understanding why.

Technical debt often exists for valid historical reasons. A seemingly redundant check might prevent a race condition discovered in production. That “ugly” workaround might handle a third-party API’s undocumented behavior. The archaeology pattern ensures we preserve this hard-won wisdom.

The Five-Phase Archaeological Process

Phase 1: Initial Context Gathering

Start by having AI read the entire codebase to understand what you’re working with:

This is an ongoing project currently under heavy development.
Read all files and tell me about this project in non-technical terms.

Focus on:
- Code files (not existing documentation)
- Read code files fully
- What the system actually does

Create: devdocs/archaeology/small_summary.md

This gives you a baseline understanding before diving deeper.

Phase 2: Technical Architecture Discovery

Next, understand the technical design:

Analyze how this codebase is designed in terms of:
- Data flow paths
- Main abstractions
- Top-level design patterns

Explain at a high level, as if introducing the architecture to a new engineer.

Create: devdocs/archaeology/intro2codebase.md

Phase 3: Deep Execution Tracing

This is where the real archaeology happens. By following execution paths to understand actual behavior:

Identify every internal interface and submodule interaction.
For each interface, trace the execution path end-to-end:
- What calls it
- What it triggers
- How data/state moves through layers
- What outcomes it produces

Create trace files under devdocs/archaeology/traces/
Each trace should include:
- Entry Point
- Execution Path
- Resource Management
- Error Path
- Performance Characteristics
- Observable Effects
- Why This Design
- What feels incomplete/vulnerable/poorly designed

These traces reveal the true architecture, not what documentation claims or what naming suggests.

Phase 4: Insight Synthesis

After tracing, identify improvement opportunities while respecting existing decisions. This part doesnt actually meant to improve the codebase later. But we are contexting the AI to see the codebase in different angle.

Now look at all traces and what are the 5 things that would improve the codebase a lot?

Put these in devdocs/archaeology/5_things_or_not.md

And make sure after each of them think for a possible reason this thing is not implemented/fixed at the current code
(the reason is there might be some undocumented decisions and these 5 things might refer to them  )

Phase 5: Concept Inventory

Map all concepts found in the code:

Create concept inventories:
1. devdocs/archaeology/concepts/technical_concepts_list.md
2. devdocs/archaeology/concepts/design_concepts_list.md
3. devdocs/archaeology/concepts/business_lvl_concepts_list.md
4. devdocs/archaeology/concepts/missing_concepts_list.md

For each concept include:
- High-level explanation (max 3 sentences)
- Current implementation status
- Hidden assumptions or edge-case handling
- Future-oriented design considerations

Surface implicit concepts that exist only in code.

Moving from Archaeology to Evolution

Phase 6: Foundation Extraction

Extract the true foundations from code analysis:

Based solely on codebase analysis, extract:

1. devdocs/archaeology/foundations/project_description.md
   - What the system actually does
   - Current use cases evident in code
   - Problems being solved

2. devdocs/archaeology/foundations/philosophy.md
   - Implicit design principles
   - Coding patterns consistently used
   - Architectural decisions evident in structure

3. devdocs/archaeology/foundations/known_requirements.md
   - Requirements inferred from implementations
   - Constraints visible in code
   - Compliance/security measures present

Phase 7: Gap Analysis

Compare current state to desired future:

Create devdocs/evolution/gap_analysis.md:
1. What concepts need implementation
2. What architecture changes are required
3. What technical debt blocks progress
4. What can be incrementally improved
5. What requires complete rewrite

Phase 8: Strategic Planning

Cleanup Inventory

Before adding new code, identify what can be removed:

Scan for unused elements:
- Unreferenced files and modules
- Dead code paths
- Commented-out code blocks
- Duplicate implementations
- Abandoned features

Create: devdocs/archaeology/cleanup_inventory.md

WARNING: Never delete immediately—code that looks unused might be:
- Loaded dynamically
- Referenced in configuration
- Used by external systems
- Kept for compliance/audit reasons

Refactoring Opportunities

Identify high-value improvements:

Find refactoring candidates:
- Scattered database operations → Repository Pattern
- Mixed business/infrastructure logic → Service Pattern
- Repeated API patterns → Gateway abstraction
- Global state → Dependency Injection
- Hard-coded configs → Configuration Pattern

Create: devdocs/evolution/refactoring_opportunities.md

For each opportunity document:
- Current problematic pattern
- Proposed solution
- Benefits (testability, maintainability)
- Implementation effort
- Risk assessment
- ROI justification

Phase 9: Baseline Testing

Establish what currently works:

Create comprehensive smoke tests to validate implementation.

Create: smoke_tests/check_what_is_working/

Requirements:
- 5 test files with 5 test cases each
- No mocking—use real components
- Test actual behavior, not assumptions
- Verbose output showing exactly what fails
- Cover entire codebase systematically

Document results in report.md:
- What works as expected
- What's broken but acceptable
- Critical failures needing immediate attention

Phase 10: Implementation Roadmap

Create a phased execution plan:

Create devdocs/evolution/implementation_roadmap.md:

Phase A - Foundation (Week 1-2):
- Critical cleanup
- Essential refactoring
- Fix broken core functionality
- Establish CI/CD

Phase B - Core Refactoring (Week 3-4):
- Highest-ROI improvements
- Add abstraction layers
- Decouple modules
- Run smoke tests after each change

Phase C - Gap Filling (Week 5-8):
- Implement missing concepts
- Module-by-module improvements
- Add identified features

Phase D - Integration (Week 9-10):
- Ensure modules work together
- Performance optimization
- Documentation updates
- Comprehensive testing

Common Archaeological Discoveries

The archaeology pattern often reveals:

  • Undocumented workarounds for specific edge cases that took weeks to discover
  • Implicit architectural decisions that seem arbitrary but prevent subtle bugs
  • Hidden dependencies between modules that look independent
  • Performance optimizations disguised as unnecessary complexity
  • Security measures that appear redundant but handle attack vectors
  • Data migrations embedded in code to handle legacy formats
  • Feature flags for abandoned experiments that still affect behavior

Why This Works

Manual archaeology would take weeks of code reading and interviews with original developers (if available). AI can traverse entire codebases in minutes, recognizing patterns humans might miss. The key is asking the right questions to extract not just what exists, but why it exists that way.

Critical Success Factors

  1. Never assume code is wrong—assume you don’t understand its purpose yet
  2. Document before changing—capture existing behavior first
  3. Test current state—establish baseline before modifications
  4. Preserve edge cases—they represent real-world lessons
  5. Respect technical debt—it often represents conscious trade-offs

The Archaeology Mindset

Approach existing code like an archaeologist approaches an ancient site:

  • Every artifact has meaning
  • Context is everything
  • Preserve before modifying
  • Document all findings
  • Respect what came before

This pattern transforms the daunting task of understanding legacy code into a systematic exploration that preserves hard-won wisdom while identifying genuine improvement opportunities.