Model Context Protocol

Generative AI has opened up a whole new world of opportunity. It is now foundational for business strategy, and adoption has found its way into many homes via OpenAI and ChatGPT (other GPTs are available).

Being able to ask a natural language question and receive a measured, comprehensive and (usually) accurate answer is still an incredible achievement. The Large Language Models (LLMs) we use are trained on huge publicly accessible datasets scraped from the internet. But what happens when you want to ask questions of specific datasets, for example your company's accounting systems or a bank's investment book of records? Previously we could fine-tune the LLM by retraining its existing model weights on this domain-specific data, or employ other techniques that involved an element of pre-training. These approaches worked, and worked well in some instances, but they needed to be coded from scratch each time. There was no convention behind each approach, and it usually involved a great deal of boilerplate code for both data extraction and model training.

What we really needed was an established standard that provided users with an API to connect to these applications and a protocol that allowed the returned information to be easily disseminated and used. A solution that allows us to do just that is named the Model Context Protocol.

A diagram illustrating the Model Context Protocol (MCP) that connects various applications, including chat interfaces, IDEs, AI applications, data systems, development tools, and productivity tools, emphasizing bidirectional data flow.
https://modelcontextprotocol.io/docs/getting-started/intro

Let's say a company wants to let its employees query multiple internal systems (databases, APIs, knowledge bases) through a single AI assistant. By using MCP, each system is wrapped as a server exposing structured tools (e.g. “search customer DB,” “fetch policy document”). The AI assistant, acting as a client, dynamically calls these tools during a conversation. This approach:

  • Provides secure, modular integration of many data sources without hard-coding.
  • Keeps the LLM lightweight (no retraining needed) by fetching live context at runtime.
  • Ensures auditability and control since each tool call is explicit and permissioned.
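To make the pattern concrete, here is a minimal, library-free sketch in plain Python. It is not the MCP SDK: the ToolRegistry class, the two tool functions and the permission model are all hypothetical, chosen only to illustrate how explicit, permissioned tool calls leave an audit trail.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class ToolRegistry:
    """Toy stand-in for a server exposing named tools (hypothetical, not the MCP SDK)."""
    tools: Dict[str, Callable[..., str]] = field(default_factory=dict)
    audit_log: List[str] = field(default_factory=list)

    def register(self, name: str, func: Callable[..., str]) -> None:
        self.tools[name] = func

    def call(self, name: str, arguments: Dict[str, str], allowed: Set[str]) -> str:
        # Every call is checked against the caller's permissions and recorded.
        if name not in self.tools:
            self.audit_log.append(f"UNKNOWN {name}")
            raise KeyError(f"No such tool: {name}")
        if name not in allowed:
            self.audit_log.append(f"DENIED {name}")
            raise PermissionError(f"Tool not permitted: {name}")
        self.audit_log.append(f"CALLED {name}")
        return self.tools[name](**arguments)


# Hypothetical tools wrapping two internal systems.
def search_customer_db(customer_name: str) -> str:
    return f"Found 1 record for {customer_name}"


def fetch_policy_document(policy_id: str) -> str:
    return f"Contents of policy {policy_id}"


registry = ToolRegistry()
registry.register("search_customer_db", search_customer_db)
registry.register("fetch_policy_document", fetch_policy_document)

if __name__ == "__main__":
    # The assistant (client) is only permitted to use one of the two tools.
    permitted = {"search_customer_db"}
    print(registry.call("search_customer_db", {"customer_name": "Acme"}, permitted))
    print(registry.audit_log)
```

A real MCP server replaces this registry with the protocol itself, but the shape is the same: tools are named, calls are explicit, and nothing about the model has to be retrained.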

MCP Servers are like the USB-C connectors for AI systems, and their impact is already being felt. The approach democratises access to data: it is open source, with no fees and no vendor lock-in. We are seeing startups emerge that can do the work of teams from Big 4 consultancies while doing something these consultancies invariably fail to achieve, namely delivering value (or anything at all…).

How does MCP work?

MCP follows a client-server architecture. The MCP Host is the AI application that creates and manages one or more MCP Clients. Each MCP Client maintains a connection to a single MCP Server and relays the context that server provides back to the host, which uses it to do its work.
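Under the hood, clients and servers exchange JSON-RPC 2.0 messages. The sketch below builds the two requests a client typically sends to discover and then invoke tools. The make_request helper and the request ids are illustrative assumptions, the tool name is borrowed from the example server built later in this post, and in practice a client library constructs these messages for you.

```python
import json
from typing import Any, Dict, Optional


def make_request(request_id: int, method: str,
                 params: Optional[Dict[str, Any]] = None) -> str:
    """Serialise a JSON-RPC 2.0 request of the kind an MCP client sends to a server."""
    message: Dict[str, Any] = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


# 1. Ask the server which tools it exposes.
list_tools = make_request(1, "tools/list")

# 2. Call one of them by name, passing its arguments (none in this case).
call_tool = make_request(2, "tools/call", {"name": "file_csv_info", "arguments": {}})

if __name__ == "__main__":
    print(list_tools)
    print(call_tool)
```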

Step 1: Setup

Let's run through a simple example. We will use Claude for Desktop as the MCP Host (the download link is given in Step 9). We will also require Node.js, which can be downloaded from the official Node.js website.

Step 2: Install uv

curl -LsSf https://astral.sh/uv/install.sh | sh

Then create a basic Python project:

uv init simple_server
cd simple_server

Step 3: Create venv

Again using uv, we create a virtual environment to isolate the code and libraries.

uv venv
source .venv/bin/activate

Step 4: Add dependencies and folders

We now add the required libraries and folder structure needed for this simple example

uv add "mcp[cli]" 
uv add pandas
uv add fastparquet

mkdir storage

Step 5: Add data

In this example we will look at a crocodile dataset downloaded from Kaggle.

Add this file to the storage folder.

We will also convert this file to Parquet to show how MCP can handle different data formats:

import pandas as pd
# Read the crocodile CSV
df = pd.read_csv("storage/crocodile_dataset.csv")
# Save the CSV data as Parquet
df.to_parquet("storage/crocodile_dataset.parquet", index=False)

Step 6: Add file reader

We now need a simple module that allows the server to read our datasets.

# file_reader.py
import pandas as pd

def read_csv_info() -> str:
    # Summarise the CSV dataset stored in the storage folder
    filename = 'crocodile_dataset.csv'
    file_path = 'storage/%s' % filename
    df = pd.read_csv(file_path)
    return f"CSV '{filename}' has {len(df)} rows and {len(df.columns)} columns."

def read_parquet_summary(filename: str = 'crocodile_dataset.parquet') -> str:
    # Summarise a Parquet dataset stored in the storage folder
    file_path = 'storage/%s' % filename
    df = pd.read_parquet(file_path)
    return f"Parquet '{filename}' has {len(df)} rows and {len(df.columns)} columns."
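One thing to watch: the relative storage/ path above only resolves if the server is started from the project directory. A possible hardening, sketched here with a hypothetical resolve_storage_path helper, is to anchor the path to a known base directory and reject filenames that would escape the storage folder.

```python
from pathlib import Path

# Assumption: inside file_reader.py you would use Path(__file__).parent / "storage".
# Path.cwd() is used here only so that the sketch runs on its own.
STORAGE_DIR = Path.cwd() / "storage"


def resolve_storage_path(filename: str, storage_dir: Path = STORAGE_DIR) -> Path:
    """Return the absolute path of `filename` inside `storage_dir` (hypothetical helper)."""
    base = storage_dir.resolve()
    candidate = (base / filename).resolve()
    # Refuse names such as "../secrets.txt" that would leave the storage folder.
    if candidate != base and base not in candidate.parents:
        raise ValueError(f"{filename!r} is outside the storage folder")
    return candidate
```

The returned path can be passed straight to pd.read_csv or pd.read_parquet in place of the formatted string.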

Step 7: Create MCP Server

# server.py
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("simple_server")

if __name__ == "__main__":
    mcp.run()

Step 8: Add MCP @mcp.tool()

Functions decorated with @mcp.tool() are registered with the MCP server, which lets the AI know it can call them to perform an action. We therefore need to create two new tools, one for the CSV dataset and one for the Parquet dataset.

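# server.py
from mcp.server.fastmcp import FastMCP

from file_reader import read_csv_info, read_parquet_summary

mcp = FastMCP("simple_server")

@mcp.tool()
def file_parquet_info() -> str:
    """Report the row and column counts of the Parquet dataset."""
    return read_parquet_summary("crocodile_dataset.parquet")

@mcp.tool()
def file_csv_info() -> str:
    """Report the row and column counts of the CSV dataset."""
    return read_csv_info()

if __name__ == "__main__":
    mcp.run()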
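Tools can also take arguments. To the best of our knowledge, FastMCP derives a tool's input schema from the function's type hints and its description from the docstring, so a well-annotated function is most of the work. The function below is a hypothetical extra tool, written as a plain function so it can be tried without a running server. It uses the standard-library csv module rather than pandas, and the name csv_column_names and its parameters are assumptions, not part of the original example.

```python
import csv
from pathlib import Path
from typing import List


# Adding @mcp.tool() above this function in server.py would expose it as a tool.
def csv_column_names(filename: str = "crocodile_dataset.csv",
                     storage_dir: str = "storage",
                     limit: int = 10) -> List[str]:
    """Return up to `limit` column names from a CSV file in the storage folder."""
    file_path = Path(storage_dir) / filename
    with file_path.open(newline="", encoding="utf-8") as handle:
        # The first row of the file is treated as the header.
        header = next(csv.reader(handle), [])
    return header[:limit]
```

Keeping the limit parameter small is a deliberate choice: tool results are fed back into the model's context, so short answers are usually preferable.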

Step 9: Install Claude Desktop

Claude for Desktop needs to be installed.

Download it here: https://www.anthropic.com/claude

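Now we have to tell Claude where to find the MCP Server we created above by creating a config file at the following location (on macOS), replacing the directory path with the location of your own project: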

~/Library/Application\ Support/Claude/claude_desktop_config.json

{
  "mcpServers": {
    "simple_server": {
      "command": "uv",
      "args": [
        "--directory",
        "/Users/jez/simple_server",
        "run",
        "server.py"
      ]
    }
  }
}
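Because a wrong absolute path is an easy mistake to make here, the entry can also be generated rather than typed. This sketch only prints JSON with the same structure as the config above; the build_config helper and the argument values passed to it are illustrative, and you would still paste the output into the config file yourself.

```python
import json
from pathlib import Path
from typing import Any, Dict


def build_config(server_name: str, project_dir: Path, script: str) -> Dict[str, Any]:
    """Build an mcpServers entry that launches `script` through uv (hypothetical helper)."""
    return {
        "mcpServers": {
            server_name: {
                "command": "uv",
                "args": ["--directory", str(project_dir), "run", script],
            }
        }
    }


if __name__ == "__main__":
    # Assumption: this is run from inside the project folder created in Step 2.
    config = build_config("simple_server", Path.cwd().resolve(), "server.py")
    print(json.dumps(config, indent=2))
```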

Now restart Claude and verify the MCP Server is active and available:

Screenshot of an AI assistant interface displaying a prompt that says 'How can I help you today?' with menu options for 'Use style,' 'Extended thinking,' 'Web search,' and 'simple_server.'

I am now able to ask the MCP server questions about the file:

A screenshot of a chat interface where a user asks about a crocodile file. The AI responds, confirming it sees a CSV file with information about the dataset, including its dimensions.
