Creating a high-quality MCP server involves four main phases:
### Phase 1: Deep Research and Planning
#### 1.1 Understand Agent-Centric Design Principles
Before diving into implementation, understand how to design tools for AI agents by reviewing these principles:
Build for Workflows, Not Just API Endpoints:
- Don't simply wrap existing API endpoints - build thoughtful, high-impact workflow tools
- Consolidate related operations (e.g., `schedule_event` that both checks availability and creates the event)
- Focus on tools that enable complete tasks, not just individual API calls
- Consider what workflows agents actually need to accomplish
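As a rough illustration of consolidating operations, the sketch below folds the availability check and the event creation into a single `schedule_event` call. The `Calendar` class is a hypothetical in-memory stand-in for a real calendar API, and the message wording is an assumption.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Calendar:
    """Hypothetical in-memory stand-in for a real calendar API."""
    events: list = field(default_factory=list)  # (start, end, title) tuples

    def is_free(self, start: datetime, end: datetime) -> bool:
        # A slot is free when it ends before, or starts after, every booked event.
        return all(end <= s or start >= e for s, e, _ in self.events)


def schedule_event(cal: Calendar, title: str, start: datetime, end: datetime) -> str:
    """One workflow tool: check availability, then create the event."""
    if end <= start:
        return "Error: end must be after start. Swap the two times and retry."
    if not cal.is_free(start, end):
        return f"Error: {start:%Y-%m-%d %H:%M} is already booked. Try a later start time."
    cal.events.append((start, end, title))
    return f"Scheduled '{title}' from {start:%H:%M} to {end:%H:%M} on {start:%Y-%m-%d}."
```

An agent makes one call and gets either a confirmation or a next step, instead of chaining two endpoint wrappers itself.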
Optimize for Limited Context:
- Agents have constrained context windows - make every token count
- Return high-signal information, not exhaustive data dumps
- Provide "concise" vs "detailed" response format options
- Default to human-readable identifiers over technical codes (names over IDs)
- Consider the agent's context budget as a scarce resource
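One possible shape for the "concise" vs "detailed" option is sketched here. The `format_user` name and the user fields are hypothetical; the point is only that the default output is short and name-based, while the full record is opt-in.

```python
import json


def format_user(user: dict, response_format: str = "concise") -> str:
    """Render a user record at the requested level of detail."""
    if response_format == "concise":
        # Names over IDs: a short, human-readable line.
        return f"{user['name']} ({user['role']})"
    if response_format == "detailed":
        # Full record, including technical identifiers.
        return json.dumps(user, indent=2, sort_keys=True)
    raise ValueError("response_format must be 'concise' or 'detailed'")
```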
Design Actionable Error Messages:
- Error messages should guide agents toward correct usage patterns
- Suggest specific next steps: "Try using filter='active_only' to reduce results"
- Make errors educational, not just diagnostic
- Help agents learn proper tool usage through clear feedback
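A minimal sketch of an error that suggests a next step, assuming a hypothetical `list_items` tool whose `filter` parameter accepts `"all"` or `"active_only"`:

```python
def list_items(items: list, filter: str = "all", limit: int = 50) -> str:
    """Return item names, or an error that tells the agent what to try next."""
    valid = ("all", "active_only")
    if filter not in valid:
        return f"Error: unknown filter '{filter}'. Use one of: {', '.join(valid)}."
    selected = [i for i in items if filter == "all" or i["active"]]
    if len(selected) > limit:
        # Pick the hint that can actually help in the current state.
        if filter == "all":
            hint = "Try using filter='active_only' to reduce results."
        else:
            hint = f"Try a limit above {limit}."
        return f"Error: {len(selected)} items matched, above the limit of {limit}. {hint}"
    return "\n".join(i["name"] for i in selected) or "No items found."
```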
Follow Natural Task Subdivisions:
- Tool names should reflect how humans think about tasks
- Group related tools with consistent prefixes for discoverability
- Design tools around natural workflows, not just API structure
Use Evaluation-Driven Development:
- Create realistic evaluation scenarios early
- Let agent feedback drive tool improvements
- Prototype quickly and iterate based on actual agent performance
#### 1.3 Study MCP Protocol Documentation
Fetch the latest MCP protocol documentation:
Use WebFetch to load: https://modelcontextprotocol.io/llms-full.txt
This comprehensive document contains the complete MCP specification and guidelines.
#### 1.4 Study Framework Documentation
Load and read the following reference files:
- MCP Best Practices: [View Best Practices](./reference/mcp_best_practices.md) - Core guidelines for all MCP servers
For Python implementations, also load:
- Python SDK Documentation: Use WebFetch to load `https://raw.githubusercontent.com/modelcontextprotocol/python-sdk/main/README.md`
- [Python Implementation Guide](./reference/python_mcp_server.md) - Python-specific best practices and examples
For Node/TypeScript implementations, also load:
- TypeScript SDK Documentation: Use WebFetch to load `https://raw.githubusercontent.com/modelcontextprotocol/typescript-sdk/main/README.md`
- [TypeScript Implementation Guide](./reference/node_mcp_server.md) - Node/TypeScript-specific best practices and examples
#### 1.5 Exhaustively Study API Documentation
To integrate a service, read through ALL available API documentation:
- Official API reference documentation
- Authentication and authorization requirements
- Rate limiting and pagination patterns
- Error responses and status codes
- Available endpoints and their parameters
- Data models and schemas
To gather comprehensive information, use web search and the WebFetch tool as needed.
#### 1.6 Create a Comprehensive Implementation Plan
Based on your research, create a detailed plan that includes:
Tool Selection:
- List the most valuable endpoints/operations to implement
- Prioritize tools that enable the most common and important use cases
- Consider which tools work together to enable complex workflows
Shared Utilities and Helpers:
- Identify common API request patterns
- Plan pagination helpers
- Design filtering and formatting utilities
- Plan error handling strategies
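A pagination helper could look roughly like the following. The field names (`has_more`, `next_offset`, and so on) are assumptions for illustration, not a required schema.

```python
from typing import Any


def paginate(items: list, limit: int = 20, offset: int = 0) -> dict:
    """Slice a result list and report whether more pages remain."""
    limit = max(1, limit)
    offset = max(0, offset)
    page = items[offset:offset + limit]
    next_offset = offset + len(page)
    has_more = next_offset < len(items)
    return {
        "items": page,
        "count": len(page),
        "total": len(items),
        "offset": offset,
        "has_more": has_more,
        # Tell the agent exactly which offset to request next.
        "next_offset": next_offset if has_more else None,
    }
```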
Input/Output Design:
- Define input validation models (Pydantic for Python, Zod for TypeScript)
- Design consistent response formats (e.g., JSON or Markdown), and configurable levels of detail (e.g., Detailed or Concise)
- Plan for large-scale usage (thousands of users/resources)
- Implement character limits and truncation strategies (e.g., 25,000 tokens)
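One way to enforce a response budget is sketched below. It assumes the limit is counted in characters and uses 25,000 as a placeholder value; the truncation notice wording is also an assumption.

```python
CHARACTER_LIMIT = 25_000  # assumed budget, counted in characters

TRUNCATION_NOTICE = "\n[Truncated: use a filter or a smaller page to see the rest.]"


def truncate_response(text: str, limit: int = CHARACTER_LIMIT) -> str:
    """Cut an oversized response and say how to get the remainder."""
    if len(text) <= limit:
        return text
    # Reserve room for the notice so the result fits within the limit.
    keep = max(0, limit - len(TRUNCATION_NOTICE))
    return text[:keep] + TRUNCATION_NOTICE
```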
Error Handling Strategy:
- Plan graceful failure modes
- Design clear, actionable, LLM-friendly, natural language error messages which prompt further action
- Consider rate limiting and timeout scenarios
- Handle authentication and authorization errors
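The strategy above can be sketched as a single mapping from failure to next step. The function name and message wording are hypothetical, and `status=None` is used here to represent a timeout.

```python
from typing import Optional


def describe_api_failure(status: Optional[int], retry_after: Optional[int] = None) -> str:
    """Turn a failed call into guidance. `status=None` means the request timed out."""
    if status is None:
        return "Error: the request timed out. Retry once, or narrow the query to return less data."
    if status == 401:
        return "Error: authentication failed. Check that the API token is set and has not expired."
    if status == 403:
        return "Error: permission denied. Use credentials with access to this resource."
    if status == 404:
        return "Error: resource not found. Verify the name or ID, or list available items first."
    if status == 429:
        wait = retry_after if retry_after is not None else 60  # assumed default wait
        return f"Error: rate limit reached. Wait {wait} seconds before retrying."
    if 500 <= status < 600:
        return f"Error: the service returned {status}. This is usually temporary; retry shortly."
    return f"Error: unexpected status {status}. Check the request parameters."
```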
---
### Phase 2: Implementation
Now that you have a comprehensive plan, begin implementation following language-specific best practices.
#### 2.1 Set Up Project Structure
For Python:
- Create a single `.py` file or organize into modules if complex (see [Python Guide](./reference/python_mcp_server.md))
- Use the MCP Python SDK for tool registration
- Define Pydantic models for input validation
For Node/TypeScript:
- Create proper project structure (see [TypeScript Guide](./reference/node_mcp_server.md))
- Set up `package.json` and `tsconfig.json`
- Use MCP TypeScript SDK
- Define Zod schemas for input validation
#### 2.2 Implement Core Infrastructure First
To begin implementation, create shared utilities before implementing tools:
- API request helper functions
- Error handling utilities
- Response formatting functions (JSON and Markdown)
- Pagination helpers
- Authentication/token management
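A shared request helper might be shaped like this sketch. Everything named here is an assumption: `API_BASE_URL` is a placeholder, bearer-token auth is assumed, and the HTTP client is injected as a `transport` callable so the helper stays independent of any particular library.

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional

API_BASE_URL = "https://api.example.com/v1"  # hypothetical base URL

# A transport takes (url, headers, params) and returns the parsed response.
Transport = Callable[[str, dict, dict], Awaitable[Any]]


async def api_request(
    transport: Transport, path: str, token: str, params: Optional[dict] = None
) -> Any:
    """Build the URL and auth headers in one place, then delegate to the transport."""
    url = f"{API_BASE_URL}/{path.lstrip('/')}"
    headers = {"Authorization": f"Bearer {token}", "Accept": "application/json"}
    return await transport(url, headers, params or {})


async def fake_transport(url: str, headers: dict, params: dict) -> dict:
    """Test double that echoes what it was asked to send."""
    return {"url": url, "auth": headers["Authorization"], "params": params}
```

Every tool then calls `api_request` rather than assembling URLs and headers itself, and tests can pass `fake_transport` instead of touching the network.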
#### 2.3 Implement Tools Systematically
For each tool in the plan:
Define Input Schema:
- Use Pydantic (Python) or Zod (TypeScript) for validation
- Include proper constraints (min/max length, regex patterns, min/max values, ranges)
- Provide clear, descriptive field descriptions
- Include diverse examples in field descriptions
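The constraints listed above can be illustrated with a dependency-free stand-in. This sketch deliberately uses a stdlib dataclass rather than Pydantic so it runs anywhere; in a real server the same limits would be declared on a Pydantic model instead. `SearchInput` and its fields are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchInput:
    """Input for a hypothetical search tool (stdlib stand-in for a Pydantic model)."""
    query: str        # 1-200 characters, e.g. "overdue invoices" or "Q3 budget"
    limit: int = 20   # 1-100 results per page
    owner: str = ""   # optional username, e.g. "ada_l"

    def __post_init__(self) -> None:
        # Length constraint with an example in the message.
        if not 1 <= len(self.query) <= 200:
            raise ValueError("query must be 1-200 characters, e.g. 'overdue invoices'")
        # Numeric range constraint.
        if not 1 <= self.limit <= 100:
            raise ValueError("limit must be between 1 and 100")
        # Regex pattern constraint.
        if self.owner and not re.fullmatch(r"[a-z0-9_]{3,32}", self.owner):
            raise ValueError("owner must be 3-32 lowercase letters, digits, or underscores")
```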
Write Comprehensive Docstrings/Descriptions:
- One-line summary of what the tool does
- Detailed explanation of purpose and functionality
- Explicit parameter types with examples
- Complete return type schema
- Usage examples (when to use, when not to use)
- Error handling documentation, which outlines how to proceed given specific errors
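As an illustration of these docstring elements, here is a sketch for a hypothetical `search_tickets` tool. The tool, its parameters, and the stand-in data are invented for the example.

```python
def search_tickets(query: str, status: str = "open", limit: int = 20) -> str:
    """Search support tickets by keyword.

    Finds tickets whose title contains `query` and returns one line per match.

    Args:
        query (str): Keyword to match, e.g. "login failure" or "refund".
        status (str): "open", "closed", or "all". Defaults to "open".
        limit (int): Maximum matches to return. Defaults to 20.

    Returns:
        str: Lines of the form "<title> [<status>]", or "No tickets matched."

    Use when: looking up tickets by topic.
    Don't use when: the exact ticket is already known.

    Errors:
        "Error: status must be ...": retry with one of the listed values.
    """
    tickets = [
        {"title": "Login failure on mobile", "status": "open"},
        {"title": "Refund not received", "status": "closed"},
    ]  # stand-in data; a real tool would call the API here
    if status not in {"open", "closed", "all"}:
        return "Error: status must be 'open', 'closed', or 'all'. Retry with one of these values."
    matches = [
        t for t in tickets
        if query.lower() in t["title"].lower() and status in ("all", t["status"])
    ]
    if not matches:
        return "No tickets matched."
    return "\n".join(f"{t['title']} [{t['status']}]" for t in matches[:limit])
```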
Implement Tool Logic:
- Use shared utilities to avoid code duplication
- Follow async/await patterns for all I/O
- Implement proper error handling
- Support multiple response formats (JSON and Markdown)
- Respect pagination parameters
- Check character limits and truncate appropriately
Add Tool Annotations:
- `readOnlyHint`: true (for read-only operations)
- `destructiveHint`: false (for non-destructive operations)
- `idempotentHint`: true (if repeated calls have same effect)
- `openWorldHint`: true (if interacting with external systems)
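The four hints can be assembled as a plain mapping, as in this sketch. The `tool_annotations` helper is hypothetical, and how the resulting dictionary is attached to a tool depends on the SDK, which is not shown here.

```python
def tool_annotations(
    read_only: bool, destructive: bool, idempotent: bool, open_world: bool
) -> dict:
    """Build the annotation hints for one tool, rejecting a contradictory pair."""
    if read_only and destructive:
        raise ValueError("a read-only tool cannot also be destructive")
    return {
        "readOnlyHint": read_only,
        "destructiveHint": destructive,
        "idempotentHint": idempotent,
        "openWorldHint": open_world,
    }


# A search tool: reads from an external system, safe to repeat.
SEARCH_ANNOTATIONS = tool_annotations(True, False, True, True)
# A delete tool: modifies data, but deleting twice leaves the same end state.
DELETE_ANNOTATIONS = tool_annotations(False, True, True, True)
```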
#### 2.4 Follow Language-Specific Best Practices
At this point, load the appropriate language guide:
For Python: Load [Python Implementation Guide](./reference/python_mcp_server.md) and ensure the following:
- Using MCP Python SDK with proper tool registration
- Pydantic v2 models with `model_config`
- Type hints throughout
- Async/await for all I/O operations
- Proper imports organization
- Module-level constants (CHARACTER_LIMIT, API_BASE_URL)
For Node/TypeScript: Load [TypeScript Implementation Guide](./reference/node_mcp_server.md) and ensure the following:
- Using `server.registerTool` properly
- Zod schemas with `.strict()`
- TypeScript strict mode enabled
- No `any` types - use proper types
- Explicit Promise return types
- Build process configured (`npm run build`)
---
### Phase 3: Review and Refine
After initial implementation:
#### 3.1 Code Quality Review
To ensure quality, review the code for:
- DRY Principle: No duplicated code between tools
- Composability: Shared logic extracted into functions
- Consistency: Similar operations return similar formats
- Error Handling: All external calls have error handling
- Type Safety: Full type coverage (Python type hints, TypeScript types)
- Documentation: Every tool has comprehensive docstrings/descriptions
#### 3.2 Test and Build
Important: MCP servers are long-running processes that wait for requests over stdio or SSE/HTTP. Running one directly in your main process (e.g., `python server.py` or `node dist/index.js`) will block that process indefinitely.
Safe ways to test the server:
- Use the evaluation harness (see Phase 4) - recommended approach
- Run the server in tmux to keep it outside your main process
- Use a timeout when testing: `timeout 5s python server.py`
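The timeout approach can also be scripted. This sketch is a Python equivalent of the `timeout` command: it starts a process, waits a few seconds, and treats "still running" as a sign that the server started and is waiting for requests. The `stays_running` helper is hypothetical, and it only checks startup, not tool behavior.

```python
import subprocess
import sys


def stays_running(command: list, seconds: float = 5.0) -> bool:
    """Return True if the process is still alive after `seconds`, then stop it."""
    # stdin is a pipe that stays open, so a stdio server does not see end-of-input.
    with subprocess.Popen(
        command,
        stdin=subprocess.PIPE,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    ) as proc:
        try:
            proc.wait(timeout=seconds)
            return False  # exited on its own: a crash or an early exit
        except subprocess.TimeoutExpired:
            proc.kill()   # still waiting for requests, so stop it ourselves
            return True
```

For example, `stays_running([sys.executable, "server.py"])` returns without hanging the calling process, whichever way the server behaves.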
For Python:
- Verify Python syntax: `python -m py_compile your_server.py`
- Check imports work correctly by reviewing the file
- To manually test: Run server in tmux, then test with evaluation harness in main process
- Or use the evaluation harness directly (it manages the server for stdio transport)
For Node/TypeScript:
- Run `npm run build` and ensure it completes without errors
- Verify `dist/index.js` is created
- To manually test: Run server in tmux, then test with evaluation harness in main process
- Or use the evaluation harness directly (it manages the server for stdio transport)
#### 3.3 Use Quality Checklist
To verify implementation quality, load the appropriate checklist from the language-specific guide:
- Python: see "Quality Checklist" in [Python Guide](./reference/python_mcp_server.md)
- Node/TypeScript: see "Quality Checklist" in [TypeScript Guide](./reference/node_mcp_server.md)
---
### Phase 4: Create Evaluations
After implementing your MCP server, create comprehensive evaluations to test its effectiveness.
Load [Evaluation Guide](./reference/evaluation.md) for complete evaluation guidelines.
#### 4.1 Understand Evaluation Purpose
Evaluations test whether LLMs can effectively use your MCP server to answer realistic, complex questions.
#### 4.2 Create 10 Evaluation Questions
To create effective evaluations, follow the process outlined in the evaluation guide:
- Tool Inspection: List available tools and understand their capabilities
- Content Exploration: Use READ-ONLY operations to explore available data
- Question Generation: Create 10 complex, realistic questions
- Answer Verification: Solve each question yourself to verify answers
#### 4.3 Evaluation Requirements
Each question must be:
- Independent: Not dependent on other questions
- Read-only: Only non-destructive operations required
- Complex: Requiring multiple tool calls and deep exploration
- Realistic: Based on real use cases humans would care about
- Verifiable: Single, clear answer that can be verified by string comparison
- Stable: Answer won't change over time
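Verification by string comparison might be sketched as follows. Trimming whitespace and ignoring case before comparing are assumptions made here; a stricter harness could compare the raw strings.

```python
def answers_match(expected: str, actual: str) -> bool:
    """Compare two answers after collapsing whitespace and ignoring case."""
    def normalize(text: str) -> str:
        return " ".join(text.split()).casefold()

    return normalize(expected) == normalize(actual)
```

This is why each question needs a single, clear answer: "3" can be checked this way, while a free-form explanation cannot.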
#### 4.4 Output Format
Create an XML file with this structure:
```xml
<evaluation>
  <qa_pair>
    <question>Find discussions about AI model launches with animal codenames. One model needed a specific safety designation that uses the format ASL-X. What number X was being determined for the model named after a spotted wild cat?</question>
    <answer>3</answer>
  </qa_pair>
</evaluation>
```
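A file in this format could be read back with the standard library, as in the sketch below. It assumes an `evaluation` root containing `qa_pair` elements, each with `question` and `answer` children; adjust the tag names if the evaluation guide specifies otherwise.

```python
import xml.etree.ElementTree as ET


def load_qa_pairs(xml_text: str) -> list:
    """Return (question, answer) tuples, skipping incomplete pairs."""
    root = ET.fromstring(xml_text)
    pairs = []
    for qa in root.iter("qa_pair"):
        question = (qa.findtext("question") or "").strip()
        answer = (qa.findtext("answer") or "").strip()
        if question and answer:
            pairs.append((question, answer))
    return pairs
```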
---
# Reference Files