data-modeler

Skill from matteocervelli/llms

What it does

Designs comprehensive Pydantic data models with robust type annotations, validation rules, and relationship mappings for Python applications.

Installation

Install the skill:

npx skills add https://github.com/matteocervelli/llms --skill data-modeler

Added: Jan 27, 2026

Skill Details

SKILL.md

Design data models with Pydantic schemas, comprehensive validation rules,

Purpose

The data-modeler skill provides comprehensive guidance for designing robust data models using Pydantic, Python's most popular data validation library. This skill helps the Architecture Designer agent create type-safe, validated data structures that serve as the foundation for feature implementations.

This skill emphasizes:

  • Type Safety: Complete type annotations for all fields
  • Validation: Comprehensive validators for business rules
  • Documentation: Clear field descriptions and constraints
  • Relationships: Proper modeling of entity relationships
  • Serialization: Correct handling of JSON/dict conversion

The data-modeler skill ensures that data models are not just simple data containers, but intelligent objects that enforce business rules, validate data integrity, and provide clear contracts for data interchange.

When to Use

This skill auto-activates when the agent describes:

  • "Design data models for..."
  • "Create Pydantic schemas for..."
  • "Define data structures with..."
  • "Model the data with..."
  • "Create validation rules for..."
  • "Define entity relationships..."
  • "Specify field constraints for..."
  • "Design request/response schemas..."

Provided Capabilities

1. Pydantic Schema Design

What it provides:

  • BaseModel class structure
  • Field definitions with types and constraints
  • Default values and factory functions
  • Optional vs required fields
  • Nested model composition
  • Model inheritance patterns

Guidance:

  • Use Field() for metadata and constraints
  • Provide description for all fields
  • Set appropriate default or default_factory
  • Use Optional[T] for nullable fields
  • Validate field names follow conventions

Example:

```python
from pydantic import BaseModel, Field, validator
from typing import Optional, List
from datetime import datetime
from enum import Enum


class UserRole(str, Enum):
    """User role enumeration."""

    ADMIN = "admin"
    USER = "user"
    GUEST = "guest"


class Address(BaseModel):
    """Nested address model."""

    street: str = Field(..., description="Street address", min_length=1, max_length=200)
    city: str = Field(..., description="City name", min_length=1, max_length=100)
    state: str = Field(..., description="State/province code", min_length=2, max_length=2)
    postal_code: str = Field(..., description="Postal/ZIP code", regex=r"^\d{5}(-\d{4})?$")
    country: str = Field(default="US", description="Country code (ISO 3166-1 alpha-2)")

    class Config:
        schema_extra = {
            "example": {
                "street": "123 Main St",
                "city": "Springfield",
                "state": "IL",
                "postal_code": "62701",
                "country": "US",
            }
        }


class User(BaseModel):
    """User data model with comprehensive validation."""

    # Identity fields
    id: Optional[int] = Field(None, description="User ID (auto-generated)")
    username: str = Field(..., description="Unique username", min_length=3, max_length=50)
    email: str = Field(..., description="Email address (validated)")

    # Profile fields
    full_name: str = Field(..., description="User's full name", min_length=1, max_length=200)
    role: UserRole = Field(default=UserRole.USER, description="User role")
    is_active: bool = Field(default=True, description="Account active status")

    # Nested model
    address: Optional[Address] = Field(None, description="Mailing address")

    # Lists
    tags: List[str] = Field(default_factory=list, description="User tags")

    # Timestamps
    created_at: datetime = Field(default_factory=datetime.utcnow, description="Creation timestamp")
    updated_at: Optional[datetime] = Field(None, description="Last update timestamp")

    class Config:
        """Pydantic model configuration."""

        # Allow ORM models to be parsed
        orm_mode = True
        # Use enum values in JSON
        use_enum_values = True
        # Example for documentation
        schema_extra = {
            "example": {
                "username": "johndoe",
                "email": "john@example.com",
                "full_name": "John Doe",
                "role": "user",
                "address": {
                    "street": "123 Main St",
                    "city": "Springfield",
                    "state": "IL",
                    "postal_code": "62701",
                },
                "tags": ["verified", "premium"],
            }
        }
```

2. Field-Level Validators

What it provides:

  • @validator decorator usage
  • Value transformation
  • Cross-field validation
  • Custom error messages
  • Pre and post validation

Validation Types:

  • Format validation: Email, URL, phone, regex
  • Range validation: min/max for numbers, length for strings
  • Business rules: Custom logic validation
  • Referential integrity: Cross-field checks

Example:

```python
from pydantic import BaseModel, Field, validator, root_validator
from typing import Optional
import re


class UserRegistration(BaseModel):
    """User registration with comprehensive validation."""

    username: str = Field(..., min_length=3, max_length=50)
    email: str = Field(...)
    password: str = Field(..., min_length=8)
    password_confirm: str = Field(..., min_length=8)
    age: int = Field(..., ge=13, le=120)
    phone: Optional[str] = Field(None)

    @validator('username')
    def validate_username(cls, v):
        """Validate username format."""
        if not re.match(r'^[a-zA-Z0-9_-]+$', v):
            raise ValueError('Username must contain only letters, numbers, hyphens, and underscores')
        # Check against reserved names
        reserved = ['admin', 'root', 'system']
        if v.lower() in reserved:
            raise ValueError(f'Username "{v}" is reserved')
        return v.lower()  # Normalize to lowercase

    @validator('email')
    def validate_email(cls, v):
        """Validate email format."""
        email_regex = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
        if not re.match(email_regex, v):
            raise ValueError('Invalid email format')
        return v.lower()  # Normalize to lowercase

    @validator('password')
    def validate_password_strength(cls, v):
        """Validate password strength."""
        if not re.search(r'[A-Z]', v):
            raise ValueError('Password must contain at least one uppercase letter')
        if not re.search(r'[a-z]', v):
            raise ValueError('Password must contain at least one lowercase letter')
        if not re.search(r'\d', v):
            raise ValueError('Password must contain at least one digit')
        if not re.search(r'[!@#$%^&*(),.?":{}|<>]', v):
            raise ValueError('Password must contain at least one special character')
        return v

    @validator('phone')
    def validate_phone(cls, v):
        """Validate phone number format."""
        if v is None:
            return v
        # Remove all non-digit characters
        digits = re.sub(r'\D', '', v)
        if len(digits) != 10:
            raise ValueError('Phone number must be 10 digits')
        # Return formatted phone
        return f'({digits[:3]}) {digits[3:6]}-{digits[6:]}'

    @root_validator
    def validate_passwords_match(cls, values):
        """Validate that passwords match (cross-field validation)."""
        password = values.get('password')
        password_confirm = values.get('password_confirm')
        if password != password_confirm:
            raise ValueError('Passwords do not match')
        return values
```

3. Model-Level Validators

What it provides:

  • @root_validator for cross-field validation
  • Pre-validation transformations
  • Post-validation checks
  • Complex business rule enforcement

Example:

```python
from pydantic import BaseModel, Field, root_validator
from datetime import date, datetime
from typing import Optional


class EventBooking(BaseModel):
    """Event booking with complex validation."""

    event_name: str = Field(...)
    start_date: date = Field(...)
    end_date: date = Field(...)
    attendees: int = Field(..., ge=1, le=1000)
    room_capacity: int = Field(..., ge=1)
    is_catering: bool = Field(default=False)
    catering_headcount: Optional[int] = Field(None, ge=1)

    @root_validator(pre=True)
    def convert_date_strings(cls, values):
        """Pre-validation: Convert date strings to date objects."""
        for field in ['start_date', 'end_date']:
            if field in values and isinstance(values[field], str):
                values[field] = datetime.strptime(values[field], '%Y-%m-%d').date()
        return values

    @root_validator
    def validate_dates(cls, values):
        """Validate date logic."""
        start = values.get('start_date')
        end = values.get('end_date')
        if start and end:
            # End must be after start
            if end < start:
                raise ValueError('End date must be after start date')
            # Maximum event duration: 30 days
            if (end - start).days > 30:
                raise ValueError('Event duration cannot exceed 30 days')
            # Must be future dates
            if start < date.today():
                raise ValueError('Event cannot be in the past')
        return values

    @root_validator
    def validate_capacity(cls, values):
        """Validate room capacity vs attendees."""
        attendees = values.get('attendees')
        capacity = values.get('room_capacity')
        if attendees and capacity:
            if attendees > capacity:
                raise ValueError(f'Attendees ({attendees}) exceeds room capacity ({capacity})')
        return values

    @root_validator
    def validate_catering(cls, values):
        """Validate catering requirements."""
        is_catering = values.get('is_catering')
        catering_headcount = values.get('catering_headcount')
        attendees = values.get('attendees')
        if is_catering:
            # Catering headcount required if catering enabled
            if not catering_headcount:
                raise ValueError('Catering headcount required when catering is enabled')
            # Catering headcount cannot exceed attendees
            if catering_headcount > attendees:
                raise ValueError('Catering headcount cannot exceed number of attendees')
        else:
            # No catering headcount if catering disabled
            if catering_headcount:
                raise ValueError('Catering headcount specified but catering is disabled')
        return values
```

4. Type Annotations and Constraints

What it provides:

  • Proper use of typing module
  • Generic types (List, Dict, Set, Tuple)
  • Union types and Optional
  • Literal types for constants
  • Custom types

Example:

```python
from pydantic import BaseModel, Field, constr, conint, confloat, conlist
from typing import List, Dict, Set, Optional, Union, Literal, Any
from datetime import datetime
from enum import Enum

# Custom constrained types
Username = constr(regex=r'^[a-zA-Z0-9_-]+$', min_length=3, max_length=50)
PositiveInt = conint(gt=0)
Percentage = confloat(ge=0.0, le=100.0)
NonEmptyList = conlist(str, min_items=1)


class ProductStatus(str, Enum):
    """Product status enum."""

    DRAFT = "draft"
    ACTIVE = "active"
    ARCHIVED = "archived"


class Product(BaseModel):
    """Product model with advanced type annotations."""

    # Basic types with constraints
    id: Optional[int] = None
    name: constr(min_length=1, max_length=200)
    sku: constr(regex=r'^[A-Z]{3}-\d{6}$')  # Format: ABC-123456

    # Numeric types with constraints
    price: confloat(gt=0.0, le=1000000.0)
    discount_percentage: Percentage = 0.0
    stock_quantity: PositiveInt

    # Enum
    status: ProductStatus = ProductStatus.DRAFT

    # Collections
    tags: List[str] = Field(default_factory=list)
    categories: Set[str] = Field(default_factory=set)
    attributes: Dict[str, Any] = Field(default_factory=dict)

    # Union types
    metadata: Union[Dict[str, str], None] = None

    # Literal type (specific values only)
    measurement_unit: Literal["kg", "lb", "oz", "g"]

    # Nested models
    dimensions: Optional['ProductDimensions'] = None

    # Timestamps
    created_at: datetime = Field(default_factory=datetime.utcnow)
    updated_at: Optional[datetime] = None


class ProductDimensions(BaseModel):
    """Product dimensions (nested model)."""

    length: confloat(gt=0)
    width: confloat(gt=0)
    height: confloat(gt=0)
    unit: Literal["cm", "in", "m"]

    @property
    def volume(self) -> float:
        """Calculate volume."""
        return self.length * self.width * self.height


# Resolve the 'ProductDimensions' forward reference
Product.update_forward_refs()
```

5. Relationship Mappings

What it provides:

  • One-to-one relationships
  • One-to-many relationships
  • Many-to-many relationships
  • Foreign key references
  • Embedded vs referenced documents

Relationship Patterns:

One-to-One:

```python
from pydantic import BaseModel, Field
from typing import Optional


class UserProfile(BaseModel):
    """User profile (one-to-one with User)."""

    user_id: int = Field(..., description="Foreign key to User")
    bio: Optional[str] = Field(None, max_length=500)
    avatar_url: Optional[str] = None


class User(BaseModel):
    """User with one-to-one profile."""

    id: int
    username: str
    profile: Optional[UserProfile] = None  # Embedded relationship
```

One-to-Many:

```python
from pydantic import BaseModel, Field
from typing import List
from datetime import datetime


class Comment(BaseModel):
    """Comment (many comments per post)."""

    id: int
    post_id: int = Field(..., description="Foreign key to Post")
    content: str
    created_at: datetime


class Post(BaseModel):
    """Post with many comments."""

    id: int
    title: str
    content: str
    comments: List[Comment] = Field(default_factory=list)  # Embedded list
```

Many-to-Many:

```python
from pydantic import BaseModel, Field
from typing import List


class Tag(BaseModel):
    """Tag entity."""

    id: int
    name: str


class Article(BaseModel):
    """Article with many tags."""

    id: int
    title: str
    tag_ids: List[int] = Field(default_factory=list)  # Reference tags by ID
    # OR embed the full Tag objects instead:
    # tags: List[Tag] = Field(default_factory=list)
```
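When the relationship itself lives in a separate store (e.g. a join table), the link can also be modeled explicitly. A minimal sketch, assuming an illustrative `ArticleTagLink` model that is not part of the skill's own examples:

```python
from pydantic import BaseModel, Field


class ArticleTagLink(BaseModel):
    """One row of a hypothetical article<->tag join table."""

    article_id: int = Field(..., description="Foreign key to Article")
    tag_id: int = Field(..., description="Foreign key to Tag")


# One link instance per (article, tag) pair
link = ArticleTagLink(article_id=1, tag_id=42)
```

An explicit link model keeps both sides small and mirrors a relational schema; embedding full `Tag` objects trades that for fewer lookups when reading.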

6. Serialization Strategies

What it provides:

  • JSON serialization/deserialization
  • dict() conversion with exclusions
  • json() output with formatting
  • Custom serializers for complex types
  • Alias usage for field naming

Example:

```python
from pydantic import BaseModel, Field
from datetime import datetime
from typing import Optional


class ApiResponse(BaseModel):
    """API response with serialization control."""

    id: int
    name: str
    internal_code: str = Field(..., alias="code")  # Use 'code' in JSON
    created_at: datetime
    secret_key: Optional[str] = None  # Should not be exposed
    _internal_state: str = "processing"  # Private attribute (not serialized)

    class Config:
        # Allow population by field name as well as by alias
        allow_population_by_field_name = True
        # Custom JSON encoders
        json_encoders = {
            datetime: lambda v: v.isoformat()
        }


# Usage
response = ApiResponse(
    id=1,
    name="Test",
    code="ABC123",
    created_at=datetime.utcnow(),
    secret_key="secret",
)

# Serialize to dict (exclude secret)
data = response.dict(exclude={'secret_key'})
# {'id': 1, 'name': 'Test', 'internal_code': 'ABC123', 'created_at': datetime(...)}

# Serialize to JSON with alias
json_str = response.json(by_alias=True, exclude={'secret_key'})
# {"id": 1, "name": "Test", "code": "ABC123", "created_at": "2025-10-29T..."}

# Include only specific fields
data = response.dict(include={'id', 'name'})
# {'id': 1, 'name': 'Test'}
```

Usage Guide

Step 1: Identify Data Entities

```
Requirements → Entities → Attributes → Relationships
```

Step 2: Define Base Models

```
Create BaseModel → Add fields → Set types → Add descriptions
```

Step 3: Add Constraints

```
Field(...) → min/max → regex → custom constraints
```

Step 4: Implement Validators

```
@validator → business rules → error messages → transformations
```

Step 5: Model Relationships

```
Identify relationships → Choose embedding vs reference → Add foreign keys
```

Step 6: Configure Serialization

```
Config class → JSON encoders → Aliases → ORM mode
```

Step 7: Add Examples

```
schema_extra → Example data → Documentation
```

Step 8: Test Models

```
Create instances → Validate data → Test edge cases → Check errors
```
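Testing a model means exercising both the happy path and the failure path. A hedged sketch of Step 8, using an illustrative `Item` model (not one of the skill's own examples):

```python
from pydantic import BaseModel, Field, ValidationError


class Item(BaseModel):
    """Illustrative model used only to demonstrate testing."""

    name: str = Field(..., min_length=1, max_length=50)
    price: float = Field(..., gt=0)


# Valid data constructs an instance
item = Item(name="Widget", price=9.99)

# Invalid data raises ValidationError, with one entry per failing field
try:
    Item(name="", price=-1)
    raised = False
except ValidationError as exc:
    raised = True
    failing_fields = {e["loc"][0] for e in exc.errors()}
```

Inspecting `exc.errors()` in tests lets you assert on exactly which fields and constraints failed, not just that a failure occurred.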

Best Practices

  1. Use Field() for Metadata

  - Always provide descriptions
  - Set constraints (min/max, regex)
  - Provide examples in Config

  2. Comprehensive Validation

  - Validate at field level when possible
  - Use root_validator for cross-field logic
  - Provide clear error messages

  3. Type Safety

  - Use specific types, not Any
  - Use Optional[T] for nullable fields
  - Use Enum for fixed choices

  4. Documentation

  - Provide a description for every field
  - Include examples in schema_extra
  - Add docstrings for complex logic

  5. Serialization Control

  - Use aliases for API compatibility
  - Exclude sensitive fields
  - Use custom encoders for complex types

  6. Model Organization

  - Group related models in the same file
  - Use inheritance for shared fields
  - Keep models focused and cohesive
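The "inheritance for shared fields" guidance can be sketched as a common base model; the `AuditedModel` and `Customer` names here are illustrative assumptions:

```python
from pydantic import BaseModel, Field
from datetime import datetime
from typing import Optional


class AuditedModel(BaseModel):
    """Base model holding audit fields shared across entities."""

    created_at: datetime = Field(default_factory=datetime.utcnow)
    updated_at: Optional[datetime] = None


class Customer(AuditedModel):
    """Concrete model; inherits created_at/updated_at automatically."""

    id: int
    name: str = Field(..., min_length=1)


customer = Customer(id=1, name="Acme")
```

Each subclass gets the audit fields and their defaults without repeating them, so a later change to the audit contract happens in one place.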

Resources

data-model-guide.md

Comprehensive data modeling guide including:

  • Entity-relationship principles
  • Normalization guidelines
  • Pydantic model structure best practices
  • Validation strategies and patterns
  • Type hints and annotations guide
  • Field constraints catalog

pydantic-patterns.md

Pydantic-specific patterns and examples:

  • BaseModel advanced usage
  • Custom validators (field and model level)
  • Field validators with dependencies
  • Root validators for complex logic
  • Config options and settings
  • Nested models and composition
  • Serialization and deserialization
  • ORM mode and database integration
  • Complete code examples

Example Usage

Input (from Architecture Designer agent):

```

"Design data models for a user management system with user profiles, addresses, roles, and audit logging."

```

Output (data-modeler skill provides):

```python
# Comprehensive data models with validation
from pydantic import BaseModel, Field, validator
from typing import Optional, List
from datetime import datetime
from enum import Enum


class UserRole(str, Enum):
    """User roles."""

    ADMIN = "admin"
    MANAGER = "manager"
    USER = "user"


class Address(BaseModel):
    """User address."""

    street: str = Field(..., min_length=1, max_length=200)
    city: str = Field(..., min_length=1, max_length=100)
    state: str = Field(..., min_length=2, max_length=2)
    postal_code: str = Field(..., regex=r'^\d{5}(-\d{4})?$')


class UserProfile(BaseModel):
    """User profile information."""

    bio: Optional[str] = Field(None, max_length=500)
    avatar_url: Optional[str] = None
    phone: Optional[str] = Field(None, regex=r'^\(\d{3}\) \d{3}-\d{4}$')

    @validator('avatar_url')
    def validate_avatar_url(cls, v):
        if v and not v.startswith(('http://', 'https://')):
            raise ValueError('Avatar URL must be HTTP/HTTPS')
        return v


class User(BaseModel):
    """Main user model."""

    id: Optional[int] = None
    username: str = Field(..., min_length=3, max_length=50)
    email: str = Field(...)
    full_name: str = Field(..., min_length=1, max_length=200)
    role: UserRole = UserRole.USER
    is_active: bool = True

    # Relationships
    profile: Optional[UserProfile] = None
    addresses: List[Address] = Field(default_factory=list)

    # Audit fields
    created_at: datetime = Field(default_factory=datetime.utcnow)
    created_by: Optional[int] = None
    updated_at: Optional[datetime] = None
    updated_by: Optional[int] = None

    @validator('email')
    def validate_email(cls, v):
        import re
        if not re.match(r'^[^@]+@[^@]+\.[^@]+$', v):
            raise ValueError('Invalid email format')
        return v.lower()

    class Config:
        orm_mode = True
        schema_extra = {
            "example": {
                "username": "johndoe",
                "email": "john@example.com",
                "full_name": "John Doe",
                "role": "user",
                "profile": {
                    "bio": "Software engineer",
                    "phone": "(555) 123-4567",
                },
                "addresses": [
                    {
                        "street": "123 Main St",
                        "city": "Springfield",
                        "state": "IL",
                        "postal_code": "62701",
                    }
                ],
            }
        }
```

Integration

Used By:

  • @architecture-designer (Primary) - Phase 2 sub-agent for architecture design

Integrates With:

  • architecture-planner skill - Data models designed after component structure
  • api-designer skill - Data models used in API request/response schemas

Workflow Position:

  1. Analysis Specialist completes requirements analysis
  2. Architecture Designer receives analysis
  3. architecture-planner skill designs component structure (Step 3)
  4. data-modeler skill designs data models (Step 4)
  5. api-designer skill designs API contracts using these models (Step 5)
  6. Results synthesized into PRP

---

Version: 2.0.0

Auto-Activation: Yes

Phase: 2 - Design & Planning

Created: 2025-10-29
