Posted on Sep 18
(Spanish Version) Building MCP Tools: A PDF Processing Server
Model Context Protocol (MCP) is an open standard for connecting AI models with external tools and services to extend their capabilities. This post gives a high-level walkthrough of building a comprehensive PDF processing server with FastMCP, covering its architecture, error handling, and production-grade features.
## Tools Available at a Glance
Server and File Utilities
* `server_info()`: Get the server configuration and status.
* `list_temp_resources()`: List files currently in the server's temporary directory.
* `upload_file()`, `upload_file_base64()`, `upload_file_url()`: Upload files to the server from your local machine or a URL.
* `get_resource_base64()`: Download a file from the server's temporary directory.
Text and Metadata
* `get_pdf_info()`: Quickly get the page count, file size, and encryption status.
* `extract_text()`: Extract the full text content from a PDF.
* `extract_text_by_page()`: Extract text from specific pages or page ranges.
* `extract_metadata()`: Read the PDF metadata (author, title, creation date, etc.).
PDF Manipulation
* `merge_pdfs()`: Combine multiple PDF files into a single document.
* `split_pdf()`: Split a PDF into multiple smaller files based on page ranges.
* `rotate_pages()`: Rotate specific pages within a PDF.
Conversion
* `pdf_to_images()`: Convert specific pages of the PDF to image files (PNG, JPEG).
* `images_to_pdf()`: Create a new PDF from a list of image files.
You can find the base code in the GitHub Repository
MCP PDF Server
## Our Case Study: Tracing the "extract_text" Tool
We will explore ‘extract_text’; all other tools share a consistent workflow and are easily accessible in the repository if you want to check them out.
Pattern
By separating the logic into “Service” → “Tool” → “Registry”, we keep the code clean, testable, and easy to extend. You can add your own tool by following this pattern exactly.
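To make the pattern concrete before walking through the real files, here is a minimal, dependency-free sketch. `MiniApp` is a hypothetical stand-in for the FastMCP application object, and `count_words` / `word_count` are invented examples; none of them come from the repository.

```python
from __future__ import annotations

from typing import Callable, Dict


# --- Service: plain Python, no knowledge of any server framework. ---
def count_words(text: str) -> int:
    return len(text.split())


# --- Hypothetical stand-in for the app object (NOT the FastMCP API). ---
class MiniApp:
    def __init__(self) -> None:
        self.tools: Dict[str, Callable[..., dict]] = {}

    def tool(self) -> Callable[[Callable[..., dict]], Callable[..., dict]]:
        def decorator(func: Callable[..., dict]) -> Callable[..., dict]:
            # Store the function under its own name so it can be looked up later.
            self.tools[func.__name__] = func
            return func

        return decorator


# --- Tool: thin wrapper that checks input and shapes the response. ---
def register(app: MiniApp) -> None:
    @app.tool()
    def word_count(text: str) -> dict:
        if not isinstance(text, str):
            raise ValueError("word_count failed: 'text' must be a string")
        return {"word_count": count_words(text)}


# --- Registry: the entry point wires tool modules into the app. ---
app = MiniApp()
register(app)
result = app.tools["word_count"]("hello MCP world")
print(result)
```

The three layers stay independent: the service can be tested alone, the tool only adapts inputs and outputs, and the entry point just lists which tool modules are active.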
## Step 1: The Core Logic - The "Service"
Before thinking about servers, tools, or protocols, we need a simple and reliable Python function that can perform our core task. This is the "Service Layer", the engine of the server.
**File:** `src/fastmcp_pdf_server/services/pdf_processor.py`
Our first step is to write a function that takes a file path and returns the text. We use the “pdfplumber” library for this. Note that the function returns a “TextExtractionResult” data class, which helps ensure a consistent data structure.
```python
from __future__ import annotations

from dataclasses import dataclass
from typing import List

import pdfplumber

from ..utils.validators import validate_pdf


# A data class provides a structured and predictable return type for our service.
# It's like a lightweight, self-documenting class.
@dataclass
class TextExtractionResult:
    text: str
    page_count: int
    char_count: int


def extract_text(file_path: str, encoding: str = "utf-8") -> TextExtractionResult:
    # First, run the file through a validator to ensure it exists, is a PDF,
    # and is within the allowed size limits. This fails early if the input is bad.
    pdf_path = validate_pdf(file_path)

    # Use pdfplumber to robustly open and process the PDF.
    with pdfplumber.open(str(pdf_path)) as pdf:
        texts: List[str] = []
        for page in pdf.pages:
            # Extract text, defaulting to an empty string if a page has no text.
            texts.append(page.extract_text() or "")

    # Join the text from all pages into a single string.
    text = "\n".join(texts)

    # Return an instance of our data class, ensuring the contract is fulfilled.
    return TextExtractionResult(text=text, page_count=len(texts), char_count=len(text))
```

This function is pure Python. It knows nothing about FastMCP. It could be unit tested with "pytest" or used in a completely different application. This separation is the foundation of a maintainable system. With the service logic in place, we can move on to the MCP "Tool".
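Because the service has no framework dependencies, its page-joining behaviour can be checked without a real PDF. The sketch below is only an illustration under stated assumptions: `join_pages` mirrors the loop in `extract_text` but accepts any iterable of page-like objects, and `FakePage` is a hypothetical test double, not part of pdfplumber or the repository.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class TextExtractionResult:
    text: str
    page_count: int
    char_count: int


class FakePage:
    """Hypothetical test double that mimics a page's extract_text()."""

    def __init__(self, content: Optional[str]) -> None:
        self._content = content

    def extract_text(self) -> Optional[str]:
        return self._content


def join_pages(pages: Iterable[FakePage]) -> TextExtractionResult:
    # Same logic as the service loop: pages with no text become "".
    texts: List[str] = [page.extract_text() or "" for page in pages]
    text = "\n".join(texts)
    return TextExtractionResult(text=text, page_count=len(texts), char_count=len(text))


def test_join_pages_handles_empty_page() -> None:
    result = join_pages([FakePage("Hello"), FakePage(None), FakePage("World")])
    assert result.page_count == 3
    assert result.text == "Hello\n\nWorld"
    assert result.char_count == 12


# pytest would discover the test by name; calling it directly also works.
test_join_pages_handles_empty_page()
```

Testing the real `extract_text` would additionally need a small sample PDF on disk, since `validate_pdf` and `pdfplumber.open` both work with file paths.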
## Step 2: The Bridge - The "Tool"
Now we need to expose our service function to the outside world as an MCP Tool. This "Tool Layer" acts as a bridge. It handles the messy reality of a tool call and translates it into a clean call to our service.
**File:** `src/fastmcp_pdf_server/tools/text_extraction.py`
This is the most critical piece of the puzzle. It will handle the tool call, resolve the file, call the service, and format the response.
```python
# Inside src/fastmcp_pdf_server/tools/text_extraction.py
from __future__ import annotations

import time
import uuid
from typing import Any

from fastmcp import FastMCP  # type: ignore

from ..services import pdf_processor
from ..services.file_manager import resolve_to_path
from ..utils.logger import get_logger

logger = get_logger(__name__)


# The 'register' function is a convention for grouping tool registrations.
# The main application will call this function, passing itself as an argument.
def register(app: FastMCP) -> None:
    # The decorator @app.tool() is what officially registers this function as an MCP tool.
    @app.tool()
    async def extract_text(file: Any, encoding: str | None = "utf-8") -> dict:
        """Extract all text from a PDF.

        Accepts:
        - Full path string
        - Short filename previously written to temp storage
        - Bytes / file type / dict with base64 (will be saved to temp)
        """
        # 1. Generate a unique ID for this specific operation. This is crucial for
        #    tracking a single request through logs.
        op_id = uuid.uuid4().hex
        start = time.perf_counter()
        try:
            # 2. Resolve the flexible 'file' input (which could be a path, filename, or
            #    base64 object) into a concrete and validated absolute file path.
            resolved = resolve_to_path(file, filename_hint="uploaded.pdf")

            # 3. Call the clean and testable service function with the resolved path.
            #    This is where the actual PDF processing happens.
            res = pdf_processor.extract_text(str(resolved), encoding or "utf-8")

            # 4. The service returns a data class. Now we format this into the
            #    final JSON-friendly dictionary for the client.
            duration_ms = int((time.perf_counter() - start) * 1000)
            return {
                "text": res.text,
                "page_count": res.page_count,
                "char_count": res.char_count,
                # The 'meta' block provides valuable operational data to the client.
                "meta": {
                    "operation_id": op_id,
                    "execution_ms": duration_ms,
                    "resolved_path": str(resolved),
                },
            }
        except Exception as e:  # noqa: BLE001
            # 5. This is the safety net. If any part of the process fails,
            #    log the full error for debugging...
            logger.error("extract_text error: %s", e)
            hint = (
                "Provide a full path, upload the file first via 'upload_file', "
                "or pass bytes/base64. Example payload:\n"
                "{\n"
                "  \"name\": \"upload_file\",\n"
                "  \"arguments\": {\n"
                "    \"file\": { \"base64\": \"<...>\", \"filename\": \"my.pdf\" }\n"
                "  }\n"
                "}"
            )
            # ...and raise a simple ValueError. FastMCP will turn this into a
            # clean and structured error response for the LLM, preventing a crash.
            raise ValueError(f"extract_text failed: {e}. {hint}")
```
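The tool is just a wrapper: a manager that coordinates other parts of the code. It handles messy inputs, calls the clean service logic, and packages the final response. Catching any exception, logging it, and re-raising a "ValueError" with an actionable hint is a critical best practice here, because the client receives a structured error instead of the server crashing.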
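The repository's `resolve_to_path` is not shown in this post, so the following is only a guess at how such a helper could work, based on the input types listed in the tool's docstring. The function name `resolve_to_path_sketch`, the explicit `temp_dir` parameter, and the handling of the `base64` and `filename` keys (borrowed from the hint payload above) are assumptions for illustration, not the project's actual code.

```python
from __future__ import annotations

import base64
import tempfile
from pathlib import Path
from typing import Any


def resolve_to_path_sketch(file: Any, temp_dir: Path, filename_hint: str = "uploaded.pdf") -> Path:
    """Turn a path, short filename, bytes, or base64 dict into a file path."""
    if isinstance(file, (bytes, bytearray)):
        # Raw bytes: persist them to temp storage under the hinted name.
        target = temp_dir / filename_hint
        target.write_bytes(bytes(file))
        return target
    if isinstance(file, dict) and "base64" in file:
        # Keep only the final path component so nothing is written outside temp_dir.
        name = Path(str(file.get("filename") or filename_hint)).name
        target = temp_dir / name
        target.write_bytes(base64.b64decode(file["base64"]))
        return target
    if isinstance(file, str):
        candidate = Path(file)
        if candidate.is_file():
            return candidate.resolve()
        # Fall back to a short filename previously saved in temp storage.
        in_temp = temp_dir / candidate.name
        if in_temp.is_file():
            return in_temp
        raise ValueError(f"File not found: {file}")
    raise ValueError(f"Unsupported file input: {type(file).__name__}")


with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    payload = {"base64": base64.b64encode(b"%PDF-1.4 demo").decode(), "filename": "my.pdf"}
    saved = resolve_to_path_sketch(payload, tmp_dir)
    print(saved.name, saved.read_bytes())
```

Raising `ValueError` for unknown inputs fits the tool's error handling: the message ends up in the structured error that the client sees.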
## Step 3: The Final Connection - The "Registration"
Our tool function is defined, but the server application still doesn't know it exists. The final step is to connect, or register, our tool module with the main "FastMCP" application instance.
**File:** `src/fastmcp_pdf_server/main.py`
This file is the entry point for our entire server. Its job is to build the application object and register all the tool sets.
```python
# Inside src/fastmcp_pdf_server/main.py
from __future__ import annotations

from typing import Any

from .config import settings
from .utils.logger import get_logger

logger = get_logger(__name__)


def build_app() -> Any:
    # This try/except block provides a friendly error if the user
    # forgot to install the dependencies from requirements.txt.
    try:
        from fastmcp import FastMCP  # type: ignore
    except Exception as exc:  # pragma: no cover
        raise SystemExit(
            "fastmcp is not installed. Please install the dependencies first."
        ) from exc

    # Initialize the main application, getting name and version from config.
    app = FastMCP(settings.server_name, version=settings.server_version)

    # --- Tool Registration ---
    # Import the modules that contain our tool definitions.
    from .tools import utilities, text_extraction, pdf_manipulation, conversion, uploads
    from .services.file_manager import cleanup_expired

    # Call the 'register' function of each module to attach their tools to the app.
    # This modular approach keeps the main file clean.
    utilities.register(app)
    text_extraction.register(app)
    pdf_manipulation.register(app)
    conversion.register(app)
    uploads.register(app)

    # --- Startup Tasks ---
    # It's a good practice to run cleanup tasks at startup.
    # Here, we delete any old files from the temp directory.
    try:
        cleanup_expired()
    except Exception as exc:  # noqa: BLE001
        logger.error("cleanup_expired at startup failed: %s", exc)

    return app
```
By importing modules and calling a "register" function from each one, the main file stays clean and acts as a high-level summary of the server's capabilities. Adding or removing an entire category of tools is as simple as adding or removing a line here.
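`cleanup_expired` is imported from the repository's file manager, and its implementation is not shown here. As a rough idea of what a startup cleanup can look like, the sketch below deletes files older than a time-to-live. The function name `cleanup_expired_sketch` and the `temp_dir` and `max_age_seconds` parameters are assumptions, not the project's actual signature.

```python
from __future__ import annotations

import os
import tempfile
import time
from pathlib import Path
from typing import List


def cleanup_expired_sketch(temp_dir: Path, max_age_seconds: float) -> List[str]:
    """Delete regular files in temp_dir whose modification time is past the limit."""
    cutoff = time.time() - max_age_seconds
    removed: List[str] = []
    for entry in temp_dir.iterdir():
        if entry.is_file() and entry.stat().st_mtime < cutoff:
            entry.unlink()
            removed.append(entry.name)
    return sorted(removed)


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    old_file = folder / "old.pdf"
    new_file = folder / "new.pdf"
    old_file.write_bytes(b"old")
    new_file.write_bytes(b"new")

    # Backdate the first file by two hours so it counts as expired.
    two_hours_ago = time.time() - 7200
    os.utime(old_file, (two_hours_ago, two_hours_ago))

    removed_names = cleanup_expired_sketch(folder, max_age_seconds=3600)
    remaining = sorted(p.name for p in folder.iterdir())
    print(removed_names, remaining)
```

Wrapping the call in `try`/`except`, as `build_app` does, means a failed cleanup is logged but does not stop the server from starting.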
## The Big Picture
Now, let’s trace a request from start to finish:
- An LLM calls the tool `extract_text`.
- The `FastMCP` application, built in `main.py`, routes the call to the async function `extract_text` inside `text_extraction.py`.
- The tool function calls `resolve_to_path` to get a clean file path.
- The tool function then calls the service `pdf_processor.extract_text` with that clean path.
- The service does the heavy lifting and returns a `TextExtractionResult` data class with `text`, `page_count`, and `char_count`.
- The tool function receives this result, copies its fields into a dictionary, adds the `meta` block, and returns the final enriched dictionary.
- `FastMCP` sends this final dictionary back to the LLM as a JSON response.
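The hand-off near the end of that trace, from the service's data class to the JSON-friendly response, can be sketched in isolation. `build_response` is a hypothetical helper name: the tool in this post builds the dictionary inline, and `dataclasses.asdict` is used here only as a compact alternative. The path in the usage lines is a made-up example value.

```python
from __future__ import annotations

import json
import time
import uuid
from dataclasses import asdict, dataclass


@dataclass
class TextExtractionResult:
    text: str
    page_count: int
    char_count: int


def build_response(result: TextExtractionResult, started: float, resolved_path: str) -> dict:
    # asdict() copies the data class fields into a plain dictionary.
    payload = asdict(result)
    # The meta block carries operational data alongside the extracted content.
    payload["meta"] = {
        "operation_id": uuid.uuid4().hex,
        "execution_ms": int((time.perf_counter() - started) * 1000),
        "resolved_path": resolved_path,
    }
    return payload


started = time.perf_counter()
service_result = TextExtractionResult(text="Hello", page_count=1, char_count=5)
response = build_response(service_result, started, "/tmp/example.pdf")
print(json.dumps(response, indent=2))
```

Everything in the response is a string, integer, or nested dictionary, so it serializes to JSON without custom encoders.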
## The Final Result
Using Claude Desktop as an MCP client, we can test the "extract_text" tool by registering our server in the configuration file "claude_desktop_config.json":
```json
{
  "mcpServers": {
    "pdf-processor-server": {
      "command": "D:\\Github Projects\\mcp_pdf_server\\.venv\\Scripts\\python.exe",
      "args": [
        "-m",
        "fastmcp_pdf_server"
      ],
      "env": {
        "TEMP_DIR": "D:\\Github Projects\\mcp_pdf_server\\temp_files"
      }
    }
  }
}
```

Once you've added the MCP server, it should look like this.
[](https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdiesxdd65n7z43bc4xyy.png)
With MCP clients like this one, it usually helps to mention the MCP server in your prompt (in this case, our "PDF Processor Server"); sometimes you also need to give the full path of the file.
[](https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzei96rk0aw55mcbkfde5.png)
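Hand-writing Windows paths in JSON is error-prone because every backslash must be doubled. As an optional convenience, the sketch below builds the same `mcpServers` entry with Python's `json` module, which handles the escaping. The helper name `build_server_entry` is invented for illustration, and the paths are the example values from the configuration above; adjust them to your machine.

```python
import json


def build_server_entry(python_exe: str, temp_dir: str) -> dict:
    """Return a config dictionary shaped like the claude_desktop_config.json entry."""
    return {
        "mcpServers": {
            "pdf-processor-server": {
                "command": python_exe,
                "args": ["-m", "fastmcp_pdf_server"],
                "env": {"TEMP_DIR": temp_dir},
            }
        }
    }


# Raw strings keep the Windows paths readable; json.dumps adds the escaping.
config = build_server_entry(
    python_exe=r"D:\Github Projects\mcp_pdf_server\.venv\Scripts\python.exe",
    temp_dir=r"D:\Github Projects\mcp_pdf_server\temp_files",
)
config_text = json.dumps(config, indent=2)
print(config_text)
```

The printed text can be merged into an existing configuration file by hand; the sketch deliberately does not write to disk.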
## What's Next?
You did it! You've set up a server, learned how to connect to it, ordered it to extract text, and even took a peek under the hood to see how everything works.
* **Explore Other Tools**: Check out the `README.md` file. You'll find a complete list of other tools you can call, such as `merge_pdfs`, `split_pdf`, and `pdf_to_images`.
* **Extend the Server**: Try adding your own tool! Follow the pattern.
* **Automate Your Life**: Think about your own workflows. Could you use this server to automatically extract text from invoices? Or to combine your weekly reports into a single PDF? The power is yours.
Happy Coding! 🤖
