Attention Mechanism

What Attention Actually Means

When AI reads your code, it doesn’t read like humans do. It uses “attention” - weighted focus across all tokens simultaneously.

Think of it like this:

  • Human reading: Sequential, left-to-right, building understanding
  • AI reading: Parallel, everything at once, finding patterns and connections
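Concretely, that parallel focus is usually computed as scaled dot-product attention: every token scores its relevance against every other token, and a softmax turns those scores into weights. The NumPy sketch below is a toy illustration of the idea only, not how any particular model or library implements it.

import numpy as np

def attention(Q, K, V):
    """Toy scaled dot-product attention: every token attends to every token at once."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                            # pairwise relevance scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)    # softmax: scores become focus weights
    return weights @ V, weights                                # each output is a weighted mix of all tokens

rng = np.random.default_rng(0)
tokens = rng.normal(size=(3, 4))          # 3 toy "tokens" with 4-dimensional embeddings
output, weights = attention(tokens, tokens, tokens)
print(weights.round(2))                   # each row sums to 1: how much each token "looks at" the others

The weights matrix is the set of spotlights described below: higher weight means brighter focus on that token.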

Why it matters

Knowing how attention works lets you take complexity off the LLM's plate: you arrange code and prompts so the model's focus lands where you need it.

The Spotlight Metaphor

Imagine a dark stage with 1000 actors. AI has 1000 spotlights it can dim or brighten. When you mention “user authentication”, suddenly:

  • Spotlights on “password” brighten
  • Spotlights on “login” brighten
  • Spotlights on “security” brighten
  • Spotlights on “recipe ingredients” dim

This happens for every token, instantaneously.

Why This Matters for Vibe Coding

1. Context and Relevance Matter Most

While AI processes everything in parallel, what matters most is relevance to the task at hand.

# If you're asking about authentication, this gets attention
def authenticate_user():
    pass
    
# This gets less attention regardless of position
def calculate_recipe_calories():
    pass

Put related information together and make your intent clear.

2. Proximity Is Power

Related concepts near each other strengthen attention bonds.

# Weak attention bond
class User:
    pass

# ... 500 lines later ...

def authenticate_user():
    pass

vs

# Strong attention bond  
class User:
    pass
    
def authenticate_user():
    pass

Keep related concepts close.

3. Repetition Reinforces

Each mention strengthens attention pathways.

# Single mention - weak signal
# This handles user data

# Multiple mentions - strong signal
# This UserService handles user authentication
# It manages user sessions and user preferences
class UserService:
    ...

But don’t overdo it - that’s keyword stuffing.

Context Window Limits

AI has finite context. When you approach the context limit:

  • Content relevant to your question gets prioritized
  • AI focuses on what’s needed for the current task
  • Very long contexts can sometimes cause confusion
  • Organization and clarity matter more than position

This is why grouping related files matters when you share several at once.
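One practical habit is to estimate token usage before pasting a pile of files. The sketch below uses a rough four-characters-per-token heuristic and an assumed 128,000-token window; both numbers are placeholders, so check your model's actual limit.

# Rough token budgeting before sharing files with the AI.
# Assumes ~4 characters per token and a 128k window; both are rough assumptions.
CONTEXT_LIMIT_TOKENS = 128_000

def estimate_tokens(text: str) -> int:
    return len(text) // 4          # crude heuristic, good enough for budgeting

def fits_in_context(file_contents: list[str], reserve_for_reply: int = 4_000) -> bool:
    used = sum(estimate_tokens(text) for text in file_contents)
    return used + reserve_for_reply <= CONTEXT_LIMIT_TOKENS

If the files don't fit, drop the least relevant group first rather than trimming everything evenly.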

Practical Implications

1. Lead with Intent

# BAD: Burying the lead
# This function does various things with data
# It processes some stuff and returns results
# Oh, by the way, it's for authentication

# GOOD: Clear intent upfront
# Authenticates users against the database
# Returns JWT token on success, throws on failure

2. Context Clustering

When sharing multiple files, group by relevance:

1. Core business logic files
2. Related utility files
3. Configuration files
4. Test files

Not alphabetically or randomly.
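As a sketch of that ordering (the categories come from the list above; the file paths and the GROUPS mapping are made up for illustration), you could assemble the prompt group by group instead of in whatever order your editor lists files:

# Hypothetical helper: concatenate files into a prompt grouped by relevance.
# The paths below are illustrative placeholders, not real project files.
from pathlib import Path

GROUPS = {
    "Core business logic": ["orders/service.py", "orders/models.py"],
    "Related utilities": ["utils/pricing.py"],
    "Configuration": ["settings.py"],
    "Tests": ["tests/test_orders.py"],
}

def build_prompt(groups: dict[str, list[str]]) -> str:
    sections = []
    for label, paths in groups.items():
        for path in paths:
            code = Path(path).read_text()
            sections.append(f"# --- {label}: {path} ---\n{code}")
    return "\n\n".join(sections)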

3. Distinctive Names for Better Attention

# Weak anchor
def process():
    pass

# Strong anchor
def process_payment_webhook():
    pass

Specific names help AI maintain focus.

Attention Hijacking

Some patterns grab too much attention:

TODO/FIXME Comments

# TODO: Fix this security vulnerability
def safe_function(data):
    return data  # AI obsesses over the TODO

AI might focus on the TODO instead of your actual request.

Strong Keywords

Words like “deprecated”, “legacy”, “broken”, and “hack” trigger strong attention. Use them carefully.

Working with Attention

The Priming Pattern

Start conversations by setting attention focus:

"I'm working on the authentication module. 
Please focus on security and user experience."

The Context Window Strategy

Share files in order of importance:

  1. The file you want to modify
  2. Direct dependencies
  3. Related interfaces
  4. Supporting utilities

The Attention Reset

When AI gets fixated on wrong things:

"Let's refocus. Ignore the previous discussion about X.
I need help specifically with Y."

Attention vs Memory

Attention ≠ Memory

  • Attention: What AI focuses on right now
  • Memory: What AI remembers from the codebase

You can direct attention. You can’t change memory.

The Compound Effect

Good attention management:

  • Faster correct responses
  • Less confusion
  • Better architectural coherence
  • Fewer tangential suggestions

Poor attention management:

  • AI suggests fixing non-issues
  • Focuses on wrong patterns
  • Misses critical requirements
  • Generates irrelevant code

Quick Tips

  1. Name precisely - auth_service not service1
  2. Comment intentions - Why, not what
  3. Order thoughtfully - Important stuff first
  4. Cluster related - Group by concept
  5. Reset when needed - Don’t fight confused attention

Understanding attention helps you communicate effectively with AI. It’s not about tricking the system - it’s about clarity.