The Probe Tests Pattern
Core Concept
All testing serves three purposes: Discovery (how it behaves), Diagnostic (what's broken), and Validation (meets requirements).
Probe Tests combine all three into a single development loop that mirrors how developers naturally work, running code to verify it works while learning how it behaves.
Probe Tests = Smoke + Discovery + Validation in one test
Why It Matters
In AI-assisted development, syntactically perfect code can solve the wrong problem. Probe tests reveal misunderstandings immediately through:
- ✅ Smoke - Does it run?
- 🔍 Discovery - How does it actually behave?
- ⚠️ Validation - Are critical requirements met?
This isn't about pass/fail; it's about understanding data flow, transformations, and the gap between design and implementation.
Important Distinction
Probe testing complements formal CI/CD frameworks but focuses on the developer experience during active development. It's for running code to inspect and understand, not automated regression testing.
Implementation Example
```python
def probe_user_registration():
    """
    PROBING: User registration system
    ✅ Verify it runs
    🔍 Discover actual behavior
    ⚠️ Validate critical rules
    """
    print("\n🔬 PROBING: User Registration")

    # SMOKE - Basic operation
    try:
        result = register_user("test@example.com", "password123")
        print(f"  ✅ Runs: returns {type(result)}")
    except Exception as e:
        print(f"  ❌ FAILED: {e}")
        return False

    # DISCOVER - Actual behavior
    print(f"  🔍 Result: {result}")
    if hasattr(result, '__dict__'):
        print(f"  🔍 Attributes: {list(result.__dict__.keys())}")

    # Test variations
    duplicate = register_user("test@example.com", "password123")
    weak_pass = register_user("test2@example.com", "123")
    print(f"  🔍 Duplicate handling: {duplicate}")
    print(f"  🔍 Weak password: {weak_pass}")

    # VALIDATE - Critical requirements
    if hasattr(result, 'password'):
        assert result.password != "password123", "❌ Password in plaintext!"
    assert hasattr(result, 'id') or 'id' in result, "❌ No user ID!"
    assert duplicate is None or duplicate.error, "❌ Duplicates allowed!"

    print("  ✅ OPERATIONAL + UNDERSTOOD + VALIDATED")
    return True
```
Benefits in AI Development
- Fast Feedback - Three insights from one test
- Self-Documenting - Output shows actual behavior
- Educational - AI learns from discoveries
- Safety Net - Critical validations prevent dangerous bugs
How to Implement the Probe Tests Pattern Correctly
0. Verbose Over Clever
We design Probe tests to generate detailed output that exposes data flow and critical transformations, including both raw data samples and summarized results. These tests serve dual audiences: human developers who need to understand system behavior, and AI assistants that require explicit context to provide accurate assistance.
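As a minimal sketch of such verbose output (the `load_transactions` helper, the file path, and the record fields are illustrative assumptions, not a prescribed API), a probe might print both a raw data sample and a summarized result:

```python
# Sketch of "verbose over clever" probe output.
# load_transactions and the record fields are hypothetical placeholders.
def probe_transaction_parsing():
    records = load_transactions("sample_transactions.csv")

    # Raw data sample: show the first few records exactly as parsed,
    # so both a human and an AI assistant see real field names and values.
    print(f"  🔍 Parsed {len(records)} records")
    for record in records[:3]:
        print(f"  🔍 Raw record: {record}")

    # Summarized result: an aggregate view that exposes transformations at a glance.
    total = sum(r["amount"] for r in records)
    currencies = sorted({r["currency"] for r in records})
    print(f"  🔍 Total amount: {total}, currencies: {currencies}")
```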
1. Comprehensiveness
Probe tests must encompass diverse testing scenarios. Traditional software development achieves this through specialized test categories:
- Unit tests - Test individual functions/methods in isolation
- Integration tests - Test how components work together
- End-to-end (E2E) tests - Test complete user workflows
- Acceptance tests - Verify business requirements are met
- Performance/Load tests - Test speed and scalability
Within the vibe coding paradigm, these traditional testing approaches are consolidated and adapted into our comprehensive Probe Tests pattern.
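As a rough sketch of that consolidation (the `SearchIndex` class and its methods are hypothetical), a single probe can touch unit-, integration-, and performance-level concerns in one pass:

```python
import time

# Sketch: one probe folding several traditional test categories into a single run.
# SearchIndex, add, lookup, and search are illustrative placeholders.
def probe_search_index():
    index = SearchIndex()

    # Unit-level: one method exercised in isolation.
    index.add("doc-1", "hello world")
    print(f"  🔍 Unit: lookup('hello') -> {index.lookup('hello')}")

    # Integration-level: indexing and query parsing working together.
    results = index.search("hello AND world")
    print(f"  🔍 Integration: parsed query returned {len(results)} results")

    # Performance-level: a rough timing check, not a formal benchmark.
    start = time.perf_counter()
    for i in range(1_000):
        index.add(f"doc-{i}", f"document number {i}")
    print(f"  🔍 Performance: 1,000 inserts in {time.perf_counter() - start:.3f}s")
```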
2. Progressive Complexity
Probe tests should follow a progression from simple to complex. Begin with isolated method verification and gradually advance to testing broader conceptual behaviors. Each successive test file should demonstrate deeper system understanding and offer better debugging opportunities for future issues.
Example progression: test_login_returns_token → … → test_basic_auth_flow_works
It is common for the first file to be test_01_initialization.py (a sketch of such a file follows the lists below), covering questions such as:
- Can we import the modules?
- Do classes instantiate?
- Are constants defined?
and for the second file to be test_02_connectivity.py, covering questions such as:
- Can we connect to the database?
- Do API endpoints respond?
- Are external services reachable?
- Do authentication mechanisms work?
- Can we read/write to external resources?
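Here is a minimal sketch of what such a first file might look like (module, class, and constant names such as `myapp`, `UserService`, and `MAX_RETRIES` are purely illustrative):

```python
# test_01_initialization.py - sketch of a first-stage probe file.
# All imported names are hypothetical; replace them with your project's modules.

def probe_imports():
    # Can we import the modules?
    import myapp
    print(f"  ✅ Imported myapp from {myapp.__file__}")

def probe_instantiation():
    # Do classes instantiate with default arguments?
    from myapp.services import UserService
    service = UserService()
    print(f"  ✅ UserService instantiated: {service}")

def probe_constants():
    # Are expected constants defined and sensible?
    from myapp import config
    print(f"  🔍 MAX_RETRIES = {config.MAX_RETRIES}")
    assert config.MAX_RETRIES > 0, "❌ MAX_RETRIES must be positive"

if __name__ == "__main__":
    probe_imports()
    probe_instantiation()
    probe_constants()
```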
3. Clear Intent
Probe tests should read like documentation so we can intervene when needed. Test names should be clear and readable, describing exactly what behavior is being verified.
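For example (the names below are illustrative, not a required convention), intent-revealing probe names let a reader understand the check without opening the body:

```python
# Illustrative names only; the point is that each name states the behavior under test.
def probe_expired_token_is_rejected():
    ...

def probe_duplicate_email_returns_validation_error():
    ...

# Compare with names that force the reader to open the body:
# def probe_auth_2(): ...
# def probe_misc_checks(): ...
```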
4. Full Implementation Coverage
The cost of comprehensive Probe tests is negligible compared to the cost of incorrect implementation. Generate as many tests as necessary to ensure confidence.
A practical guideline: ask the AI to create 5 Probe test files, each targeting a distinct logical component, with a minimum of 5 individual test cases per file.
```
Probe_tests/
├── test_01_initialization.py
├── test_02_core_functionality.py
├── test_03_user_workflows.py
├── test_04_edge_cases.py
└── test_05_integration.py
```
5. No Mocking: Provide Real Data
AI-generated tests frequently default to mocks or minimal output that mask actual system behavior, creating phantom successes: tests that pass even though the real functionality is broken.
It is your job to provide realistic data for Probe tests. When developing complex systems such as video processing engines or voice detection algorithms, AI cannot generate realistic data to test the code on its own. Without authentic test data, mocked tests pass while the real implementation fails (see the sketch below).
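As a minimal sketch (the sample file path and the `detect_voice_segments` function are assumptions standing in for your own recorded data and code), a probe fed with real data looks like this:

```python
from pathlib import Path

# Sketch of a probe driven by a real recording instead of a mock.
# samples/meeting_recording.wav and detect_voice_segments are placeholders.
def probe_voice_detection_on_real_audio():
    sample = Path("samples/meeting_recording.wav")
    assert sample.exists(), "❌ Provide a real recording; do not mock audio input"

    segments = detect_voice_segments(sample)

    # Verbose output so results can be judged against the known recording.
    print(f"  🔍 Found {len(segments)} voice segments")
    for start, end in segments[:5]:
        print(f"  🔍 Segment: {start:.2f}s -> {end:.2f}s")

    assert len(segments) > 0, "❌ No speech detected in a recording that contains speech"
```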
6. Always Rerun Your Probe Tests Yourself
Never rely solely on AI's test execution or interpretation. A critical pitfall occurs when AI reports overall success despite partial failures: even one failing test invalidates the entire suite.
Additionally, manually reviewing Probe test output provides immediate insight into the AI's implementation approach, revealing how it handles data transformations and input/output operations. This direct observation is invaluable for understanding the actual code behavior.
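One simple way to rerun everything yourself is a small local runner. This is only a sketch; it assumes the file layout suggested above and plain-script probe files, so adapt it to your project:

```python
# run_probes.py - minimal sketch of rerunning every probe file yourself.
import subprocess
import sys
from pathlib import Path

def main() -> int:
    failures = []
    for probe in sorted(Path("Probe_tests").glob("test_*.py")):
        print(f"\n🔬 Running {probe.name}")
        result = subprocess.run([sys.executable, str(probe)])
        if result.returncode != 0:
            failures.append(probe.name)

    # One failing probe invalidates the whole suite; never average it away.
    if failures:
        print(f"\n❌ {len(failures)} probe file(s) failed: {failures}")
        return 1

    print("\n✅ All probe files passed")
    return 0

if __name__ == "__main__":
    sys.exit(main())
```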