### Phase 1: ANALYZE
The analyze phase focuses on understanding the current codebase structure and identifying refactoring opportunities.
#### Domain Boundary Identification
Identify logical boundaries in the codebase by examining:
- Module dependencies and import patterns
- Data flow between components
- Shared state and coupling points
- Public API surfaces
Use AST-grep to analyze structural patterns. For Python, search for import patterns to understand module dependencies. For class hierarchies, analyze inheritance relationships and method distributions.
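As a rough illustration of the import analysis described above, the sketch below builds a module dependency graph. It uses Python's standard `ast` module as a stand-in for AST-grep, and the three-module project (`orders`, `billing`, `inventory`) is hypothetical and supplied inline so the example runs on its own.

```python
import ast


def collect_imports(source: str) -> set[str]:
    """Return the top-level names of modules imported by the given source."""
    imported: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # `import a.b` records "a"
            imported.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # `from a.b import c` records "a"; relative imports are skipped
            imported.add(node.module.split(".")[0])
    return imported


def build_dependency_graph(sources: dict[str, str]) -> dict[str, set[str]]:
    """Map each module to the other in-project modules it imports."""
    return {
        name: {dep for dep in collect_imports(src) if dep in sources and dep != name}
        for name, src in sources.items()
    }


# Hypothetical project contents, keyed by module name.
EXAMPLE_SOURCES = {
    "orders": "import billing\nfrom inventory import reserve\nimport json\n",
    "billing": "from inventory import price\n",
    "inventory": "import os\n",
}

graph = build_dependency_graph(EXAMPLE_SOURCES)
for module, deps in sorted(graph.items()):
    print(module, "->", sorted(deps))
```

Third-party and standard-library imports are filtered out here so the graph only shows coupling inside the project; that filter is a design choice of this sketch, not a requirement of the analysis.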
#### Coupling and Cohesion Metrics
Evaluate code quality metrics:
- Afferent Coupling (Ca): Number of classes depending on this module
- Efferent Coupling (Ce): Number of classes this module depends on
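- Instability (I): Ce / (Ca + Ce), ranging from 0 (maximally stable) to 1 (maximally unstable)
- Abstractness (A): Abstract classes / Total classes
- Distance from Main Sequence (D): |A + I - 1|, where 0 is ideal and values near 1 signal an imbalance between abstractness and stability
Modules with high coupling, low cohesion, or a large distance from the main sequence are refactoring candidates.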
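The formulas above can be computed directly once the counts are known. The following is a minimal sketch: the `ModuleMetrics` name and the sample counts are hypothetical, and treating a module with no dependencies in either direction as having instability 0 is an assumption of this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModuleMetrics:
    afferent: int          # Ca: incoming dependencies
    efferent: int          # Ce: outgoing dependencies
    abstract_classes: int
    total_classes: int

    @property
    def instability(self) -> float:
        total = self.afferent + self.efferent
        # Assumption: an isolated module (Ca + Ce == 0) is reported as 0.0.
        return self.efferent / total if total else 0.0

    @property
    def abstractness(self) -> float:
        if not self.total_classes:
            return 0.0
        return self.abstract_classes / self.total_classes

    @property
    def distance(self) -> float:
        # Distance from the main sequence: |A + I - 1|
        return abs(self.abstractness + self.instability - 1)


# Hypothetical counts: a concrete module that many others depend on.
core = ModuleMetrics(afferent=8, efferent=2, abstract_classes=0, total_classes=5)
print(f"I={core.instability:.2f} A={core.abstractness:.2f} D={core.distance:.2f}")
```

A module like `core` is stable (low I) but fully concrete (A = 0), so it sits far from the main sequence: it is hard to change and many modules depend on it.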
#### Structural Analysis Patterns
Use AST-grep to identify problematic patterns:
- God classes with too many methods or responsibilities
- Feature envy, where a method uses another class's data more than its own
- Long parameter lists indicating missing abstractions
- Duplicate code patterns across modules
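Two of the patterns above, god classes and long parameter lists, are easy to approximate by counting. The sketch below again uses Python's standard `ast` module in place of AST-grep; the thresholds and the sample source are hypothetical and should be tuned per codebase.

```python
import ast


def find_smells(source: str, max_methods: int = 20, max_params: int = 5):
    """Return (kind, name, count) tuples for classes and functions over the limits."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            # Count only methods defined directly in the class body.
            methods = [
                n for n in node.body
                if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))
            ]
            if len(methods) > max_methods:
                findings.append(("god_class", node.name, len(methods)))
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = node.args
            # *args and **kwargs are not counted; self/cls are ignored.
            params = [
                a.arg for a in args.posonlyargs + args.args + args.kwonlyargs
                if a.arg not in ("self", "cls")
            ]
            if len(params) > max_params:
                findings.append(("long_parameter_list", node.name, len(params)))
    return findings


# Hypothetical source with one oversized class and one long signature.
SAMPLE = '''
class Report:
    def load(self): ...
    def render(self): ...
    def send(self): ...

def create_user(name, email, phone, street, city, country): ...
'''

for finding in find_smells(SAMPLE, max_methods=2, max_params=5):
    print(finding)
```

Method count is only a proxy for responsibilities, so results from a check like this are candidates for review rather than confirmed problems.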
Create analysis reports documenting:
- Current architecture overview
- Identified problem areas with severity ratings
- Proposed refactoring targets with risk assessment
- Dependency graphs showing coupling relationships
### Phase 2: PRESERVE
The preserve phase establishes safety nets before making any changes.
#### Characterization Tests
Characterization tests capture existing behavior without assumptions about correctness. The goal is to document what the code actually does, not what it should do.
Steps for creating characterization tests:
- Step 1: Identify critical code paths by executing the code and observing which paths are exercised
- Step 2: Create tests that exercise these paths
- Step 3: Let tests fail initially to discover actual output
- Step 4: Update tests to expect actual output
- Step 5: Document any surprising behavior discovered
Characterization test naming convention: `test_characterize_[component]_[scenario]`
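To make the steps concrete, here is a small sketch of characterization tests for a hypothetical legacy function, `legacy_format_price`. The tests are plain functions named in a `test_characterize_<component>_<scenario>` style, which assumes a runner such as pytest that collects `test_*` functions. The expected values were taken from what the function actually returns, not from what a price formatter ought to return.

```python
def legacy_format_price(amount):
    """Hypothetical legacy function whose current behavior we want to pin down."""
    return "$" + str(round(amount, 2))


def test_characterize_format_price_two_decimals():
    assert legacy_format_price(3.14159) == "$3.14"


def test_characterize_format_price_whole_float():
    # Surprising: only one decimal place is kept, not "$5.00".
    assert legacy_format_price(5.0) == "$5.0"


def test_characterize_format_price_integer_input():
    # Surprising: integers are printed with no decimal places at all.
    assert legacy_format_price(2) == "$2"


def test_characterize_format_price_negative():
    # Surprising: the minus sign appears after the currency symbol.
    assert legacy_format_price(-1.5) == "$-1.5"
```

The comments record the surprising behavior called for in Step 5. Whether that behavior is a bug is a separate decision; during refactoring the tests only guarantee that it does not change by accident.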
#### Behavior Snapshots
For complex outputs, use snapshot testing to capture current behavior:
- API response snapshots
- Serialization output snapshots
- State transformation snapshots
- Error message snapshots
Snapshot files serve as behavior contracts during refactoring.
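A snapshot check can be as simple as serializing the output and comparing it with a stored file. The helper below is a minimal sketch, not a replacement for a dedicated snapshot-testing library: `assert_matches_snapshot` and `build_user_response` are hypothetical names, and recording the snapshot automatically on the first run is an assumption of this example.

```python
import json
import tempfile
from pathlib import Path


def assert_matches_snapshot(value, snapshot_path: Path) -> None:
    """Compare `value` to a stored JSON snapshot, recording it on the first run."""
    rendered = json.dumps(value, indent=2, sort_keys=True)
    if not snapshot_path.exists():
        snapshot_path.parent.mkdir(parents=True, exist_ok=True)
        snapshot_path.write_text(rendered, encoding="utf-8")
        return
    expected = snapshot_path.read_text(encoding="utf-8")
    if rendered != expected:
        raise AssertionError(f"snapshot mismatch for {snapshot_path.name}")


def build_user_response(user_id: int) -> dict:
    """Hypothetical API handler whose response shape must not change."""
    return {"id": user_id, "name": "Ada", "roles": ["admin"]}


with tempfile.TemporaryDirectory() as tmp:
    snapshot = Path(tmp) / "user_response.json"
    assert_matches_snapshot(build_user_response(1), snapshot)  # first run records
    assert_matches_snapshot(build_user_response(1), snapshot)  # later runs compare
    print("snapshot stable")
```

Sorting keys before writing keeps the snapshot independent of dictionary ordering, so only real changes in content produce a mismatch. In a real project the snapshot file would live in the repository rather than a temporary directory.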
#### Test Safety Net Verification
Before proceeding to the improve phase, verify:
- All existing tests pass (100% green)
- New characterization tests cover refactoring targets
- Code coverage meets threshold for affected areas
- No flaky tests exist in the safety net
Run mutation testing if available to verify test effectiveness.
### Phase 3: IMPROVE
The improve phase makes structural changes while continuously validating behavior preservation.
#### Incremental Transformation Strategy
Never make large changes at once. Follow this pattern:
- Make the smallest possible structural change
- Run full test suite
- If tests fail, revert immediately
- If tests pass, commit the change
- Repeat until the refactoring goal is achieved
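The loop above can be sketched as a small driver function. Everything here is illustrative: `apply_incrementally` is a hypothetical name, and the test, commit, and revert actions are passed in as callables so the example runs without a test runner or version control. In practice they might wrap the project's test command and its version-control commands. Stopping at the first failing step, rather than skipping it and continuing, is a design choice of this sketch.

```python
from typing import Callable, Iterable


def apply_incrementally(
    steps: Iterable[Callable[[], None]],
    run_tests: Callable[[], bool],
    commit: Callable[[], None],
    revert: Callable[[], None],
) -> int:
    """Apply steps one at a time and return how many were committed.

    A step that breaks the tests is reverted and the loop stops there.
    """
    committed = 0
    for step in steps:
        step()
        if not run_tests():
            revert()
            break
        commit()
        committed += 1
    return committed


# In-memory stand-ins for a working tree ("value") and its last commit ("saved").
state = {"value": 0, "saved": 0}


def add_one() -> None:
    state["value"] += 1


def break_it() -> None:
    state["value"] = -1


committed = apply_incrementally(
    steps=[add_one, add_one, break_it, add_one],
    run_tests=lambda: state["value"] >= 0,
    commit=lambda: state.update(saved=state["value"]),
    revert=lambda: state.update(value=state["saved"]),
)
print(committed, state)
```

Because each step is committed only after the tests pass, the revert always returns to a known-good state and at most one small change is ever lost.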
#### Safe Refactoring Patterns
Extract Method: When a code block can be named and isolated. Use AST-grep to identify candidates by searching for repeated code blocks or long methods.
Extract Class: When a class has multiple responsibilities. Move related methods and fields to a new class while maintaining the original API through delegation.
Move Method: When a method uses data from another class more than its own. Relocate while preserving all call sites.
Inline Refactoring: When indirection adds complexity without benefit. Replace delegation with direct implementation.
Rename Refactoring: When names do not reflect current understanding. Update all references atomically using AST-grep rewrite.
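As an example of one of these patterns, the sketch below shows the result of an Extract Class refactoring that keeps the original API through delegation. The `Customer` and `Address` classes are hypothetical: address fields and formatting once lived directly on `Customer`, and callers of `customer.street`, `customer.city`, and `customer.format_address()` continue to work unchanged.

```python
class Address:
    """Extracted class: owns the address fields and their formatting."""

    def __init__(self, street: str, city: str) -> None:
        self.street = street
        self.city = city

    def format(self) -> str:
        return f"{self.street}, {self.city}"


class Customer:
    """Original class: keeps its public API but delegates address work."""

    def __init__(self, name: str, street: str, city: str) -> None:
        self.name = name
        self._address = Address(street, city)

    @property
    def street(self) -> str:
        return self._address.street

    @property
    def city(self) -> str:
        return self._address.city

    def format_address(self) -> str:
        # Delegation preserves the method existing callers rely on.
        return self._address.format()


customer = Customer("Ada", "1 Main St", "London")
print(customer.format_address())
```

Once all call sites have been migrated to use `Address` directly, the delegating members on `Customer` can be removed in a later, separate step.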
#### AST-Grep Assisted Transformations
Use AST-grep for safe, syntax-aware transformations that match code structure rather than raw text:
For method extraction, create a rule that identifies the code pattern and rewrites to the extracted form.
For API migration, create a rule that matches old API calls and rewrites to new API format.
For deprecation handling, create rules that identify deprecated patterns and suggest modern alternatives.
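To show the idea behind a match-and-rewrite rule for API migration, the sketch below renames calls to one function into calls to another. It uses Python's standard `ast` module (with `ast.unparse`, available from Python 3.9) as a stand-in for an AST-grep rule; the function names `fetch` and `fetch_v2` are hypothetical.

```python
import ast


class RenameCall(ast.NodeTransformer):
    """Rewrite calls to `old_name(...)` into `new_name(...)`, keeping arguments."""

    def __init__(self, old_name: str, new_name: str) -> None:
        self.old_name = old_name
        self.new_name = new_name

    def visit_Call(self, node: ast.Call) -> ast.AST:
        self.generic_visit(node)  # rewrite nested calls first
        if isinstance(node.func, ast.Name) and node.func.id == self.old_name:
            new_func = ast.Name(id=self.new_name, ctx=ast.Load())
            node.func = ast.copy_location(new_func, node.func)
        return node


def migrate(source: str, old_name: str, new_name: str) -> str:
    """Return `source` with every direct call to `old_name` renamed."""
    tree = RenameCall(old_name, new_name).visit(ast.parse(source))
    return ast.unparse(ast.fix_missing_locations(tree))


print(migrate("result = fetch(url, timeout=5)", "fetch", "fetch_v2"))
```

Only call expressions are matched, so a variable that happens to be named `fetch` is left alone, which a plain text search-and-replace would not guarantee. Note that `ast.unparse` regenerates the source and drops comments and original formatting, so this approach is useful for illustration but a tool that edits the matched text in place is a better fit for real codebases.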
#### Continuous Validation Loop
After each transformation:
- Run unit tests (fast feedback)
- Run integration tests (behavior validation)
- Run characterization tests (snapshot comparison)
- Verify no new warnings or errors introduced
- Check performance benchmarks if applicable
---