Building an MCP Server for Fitness Data: Lessons from Arvo
How we built a Model Context Protocol server to let AI assistants query workout history, training insights, and exercise data. Architecture decisions, tool design, and what we learned.
What is an MCP server and how does it work with fitness data?
A Model Context Protocol (MCP) server lets AI assistants like Claude access your fitness data through structured tools. Instead of copy-pasting workout logs into a chat, the AI can directly query your training history, analyze volume trends, and suggest programming adjustments. Arvo's MCP server exposes 8 tools covering workout history, exercise stats, volume analysis, and AI-generated insights.
TL;DR
- Arvo's MCP server exposes 8 tools that let AI assistants query workout history, analyze volume, track progression, and surface training insights.
- Tool design matters more than you think: tools with narrow, specific parameters (getVolumeByMuscle) outperform broad ones (analyzeTraining) because LLMs handle constrained choices better.
- Authentication was the hardest part — MCP doesn't prescribe an auth model, so we built an OAuth2 PKCE flow with scoped permissions per tool.
- The killer use case wasn't what we expected: users ask Claude to compare their volume to evidence-based recommendations, not to generate workouts (Arvo already does that).
- MCP adoption is early but growing — exposing your data to AI assistants is a competitive moat for fitness apps.
Why Fitness Data Needs MCP
Here's a scene that plays out thousands of times a day: someone opens ChatGPT, types “analyze my training,” and then spends five minutes copy-pasting their workout logs from whatever app they use. The AI gets a lossy, unstructured dump — missed exercises, rounded numbers, no context about RPE or progression. It does its best, but it's working with a napkin sketch of your training history.
The problem isn't the AI. It's the interface. Fitness data is trapped inside apps, accessible only through their UIs. There's no standard way for an AI assistant to say “give me this user's last 10 workouts as structured data.”
Model Context Protocol (MCP) changes this. Created by Anthropic, MCP is an open standard that lets AI assistants connect to external data sources through typed, discoverable tools. Instead of “here are my last 10 workouts [paste],” the AI calls getRecentWorkouts(limit: 10) and gets structured JSON back — every set, rep, weight, RPE score, and timestamp intact.
For Arvo, this unlocks a fundamentally different interaction. Claude can analyze your actual training volume across weeks, spot plateaus in specific lifts, compare your programming to evidence-based volume recommendations, and flag recovery issues — all using real data instead of whatever you remember to paste.
A quick primer on MCP's three primitives: tools are functions the AI can call (this is what we use most), resources are data the AI can read on demand, and prompts are reusable templates. Arvo's server is almost entirely tool-based — we want the AI to actively query data, not passively receive it.
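The three primitives can be modeled roughly like this — an illustrative TypeScript sketch, not the actual MCP SDK types:

```typescript
// Illustrative model of MCP's three primitives -- type names are our own,
// not the real SDK's. Tools dominate Arvo's server.
type JsonSchema = Record<string, unknown>;

interface McpTool {
  kind: "tool";
  name: string;            // semantic cue the LLM uses to decide when to call it
  description: string;
  inputSchema: JsonSchema; // parameters, discoverable by the client at startup
}

interface McpResource {
  kind: "resource";
  name: string;
  uri: string;             // data the AI can read on demand
}

interface McpPrompt {
  kind: "prompt";
  name: string;
  template: string;        // reusable prompt template
}

type McpPrimitive = McpTool | McpResource | McpPrompt;

// Arvo's server is almost entirely tool-based:
const recentWorkouts: McpPrimitive = {
  kind: "tool",
  name: "getRecentWorkouts",
  description: "Get the user's recent workout sessions",
  inputSchema: { limit: { type: "number", maximum: 30, default: 7 } },
};
```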
The 8 Tools We Built
We started with 14 tool definitions, cut to 8 after testing. Fewer, sharper tools outperform a large surface area — LLMs get confused when they have too many similar options. Here are the final definitions (simplified from the actual Zod schemas):
```typescript
const tools = [
  {
    name: 'getRecentWorkouts',
    description: 'Get the user\'s recent workout sessions',
    parameters: { limit: z.number().max(30).default(7) },
  },
  {
    name: 'getExerciseHistory',
    description: 'Get progression data for a specific exercise',
    parameters: { exerciseName: z.string(), weeks: z.number().default(8) },
  },
  {
    name: 'getVolumeByMuscle',
    description: 'Weekly set volume per muscle group',
    parameters: { weeks: z.number().default(4) },
  },
  {
    name: 'getTrainingInsights',
    description: 'AI-generated training insights and flags',
    parameters: {},
  },
  {
    name: 'getProgressionTrend',
    description: 'Load progression trend for top exercises',
    parameters: { topN: z.number().default(5) },
  },
  {
    name: 'compareToRecommendations',
    description: 'Compare user volume to evidence-based targets',
    parameters: { muscleGroup: z.string().optional() },
  },
  {
    name: 'getWorkoutStreak',
    description: 'Training consistency and streak data',
    parameters: {},
  },
  {
    name: 'getSplitAnalysis',
    description: 'Analyze current training split structure',
    parameters: {},
  },
];
```

The design principle behind this set: narrow tools beat broad tools. We originally had an analyzeTraining tool that tried to do everything — volume, progression, insights, and recommendations in one call. It produced mediocre results because the LLM couldn't predict what shape the response would take. Splitting it into specific tools like getVolumeByMuscle and compareToRecommendations meant the AI could compose exactly the analysis the user asked for, calling two or three tools in sequence with predictable outputs.
Tool Design Lessons
After several iterations and observing how Claude, GPT, and other models interact with our tools, four patterns emerged that meaningfully improved response quality.
Lesson 1: Name tools as actions, not nouns. getVolumeByMuscle outperforms volumeData. LLMs use tool names as semantic cues for when to call them. A verb-based name like compareToRecommendations makes it obvious that this tool should be called when a user asks “am I doing enough chest work?” A noun-based name like recommendations is ambiguous — is it for reading recommendations or generating them?
Lesson 2: Return structured data, not prose. Early versions of getVolumeByMuscle returned strings like “Your chest volume is 14 sets which is within the optimal range.” This created two problems: the AI would parrot the string instead of synthesizing across multiple tool calls, and the formatting was locked in. Now it returns { chest: { sets: 14, mev: 10, mav: 16, status: "optimal" } } and the AI formats the data however best fits the conversation — a table, a bullet list, or woven into a paragraph.
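The before/after can be sketched in a few lines — a minimal TypeScript sketch using the chest thresholds from the example above (MEV 10, MAV 16); the helper names are illustrative, not Arvo's actual implementation:

```typescript
// Before (illustrative): a prose return locks in one formatting and invites
// the AI to parrot the string instead of synthesizing across tool calls.
function chestVolumeProse(sets: number): string {
  return `Your chest volume is ${sets} sets which is within the optimal range.`;
}

// After: a structured return -- the AI formats it however fits the conversation.
type VolumeStatus = {
  sets: number;
  mev: number; // minimum effective volume
  mav: number; // maximum adaptive volume
  status: "low" | "optimal" | "high";
};

function chestVolumeStructured(sets: number, mev = 10, mav = 16): VolumeStatus {
  const status = sets < mev ? "low" : sets > mav ? "high" : "optimal";
  return { sets, mev, mav, status };
}
```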
Lesson 3: Make parameters optional with sane defaults. Most of our tools have zero to two required parameters. When someone asks “how's my training going?”, the AI needs to call tools without asking clarifying questions first. If getVolumeByMuscle required a muscle group parameter, the AI would have to ask “which muscle group?” before doing anything useful. With weeks defaulting to 4 and no required params, it can immediately return a full overview.
Lesson 4: Cap result sizes. Returning 90 days of workout data in a single call overwhelms the context window and degrades response quality. Every tool has a default limit and a maximum. getRecentWorkouts defaults to 7 sessions, caps at 30. getExerciseHistory defaults to 8 weeks. If the user needs more, they can ask and the AI will call with a higher limit — but the default path stays fast and focused.
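The default-and-cap behavior can be expressed in plain TypeScript — a hedged sketch of the pattern, not the actual server code (the real schemas use Zod's `z.number().max(30).default(7)`, which rejects over-limit values rather than clamping them as this helper does):

```typescript
// Resolve a tool's limit parameter: missing values fall back to a default,
// requested values are clamped to a hard maximum so one call can never
// flood the context window.
function resolveLimit(requested: number | undefined, def: number, max: number): number {
  if (requested === undefined) return def;                  // default path: fast and focused
  return Math.min(Math.max(1, Math.floor(requested)), max); // never exceed the cap
}

const recentDefault = resolveLimit(undefined, 7, 30); // no argument -> 7 sessions
const recentCapped  = resolveLimit(90, 7, 30);        // 90 requested -> capped at 30
```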
Authentication: The Hard Part
MCP defines how tools are discovered and called. It does not define how users authenticate. This is left entirely to the server implementer, and it's where most of the complexity lives.
Our approach uses an OAuth2 PKCE flow:
- User opens Arvo's settings and navigates to the MCP integration page
- They click “Generate MCP Token” — this creates a scoped, read-only token tied to their account
- They paste the token into their Claude Desktop (or other MCP client) configuration
- Every tool call from the AI includes this token for authentication
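The client-side configuration in step 3 typically looks something like this — a sketch of a Claude Desktop `claude_desktop_config.json` entry, where the package name and environment variable are illustrative placeholders:

```json
{
  "mcpServers": {
    "arvo": {
      "command": "npx",
      "args": ["-y", "arvo-mcp-server"],
      "env": { "ARVO_MCP_TOKEN": "paste-generated-token-here" }
    }
  }
}
```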
The auth middleware on the server side is straightforward:
```typescript
async function authenticateRequest(token: string) {
  // Resolve the token to a Supabase user; invalid tokens yield no user.
  const { data, error } = await supabase.auth.getUser(token);
  if (error || !data.user) throw new McpError('UNAUTHORIZED');
  // Scopes restrict which tools this token may call (read-only by default).
  const scopes = await getTokenScopes(token);
  return { userId: data.user.id, scopes };
}
```

The tricky part is the architecture. MCP servers typically run locally on the user's machine (via stdio transport), but our data lives in Supabase. We can't embed a Supabase service-role key in a locally-running process — that would be a security disaster. Instead, the local MCP server acts as a thin proxy: it receives tool calls from Claude, forwards them to Arvo's REST API with the user's auth token, and the API queries Supabase with Row Level Security (RLS) ensuring users can only access their own data. The user's token never touches Supabase directly from the client — it's validated server-side through our API layer.
What Users Actually Ask Claude
We had assumptions about how people would use the MCP integration. We expected the primary use case to be workout generation — “create me a push day.” That made sense: it's what people do with ChatGPT today.
We were wrong. Arvo already generates workouts with its multi-agent periodization engine, and users know that. What they wanted was something Arvo's chat interface doesn't do: open-ended analysis. They wanted Claude to be a training analyst that reads their Arvo data, not a replacement for Arvo's workout generator.
The top use cases, ranked by query frequency:
Top MCP Use Cases
| Use Case | Tools Called | % of Queries |
|---|---|---|
| Volume audit | compareToRecommendations, getVolumeByMuscle | 34% |
| Plateau analysis | getExerciseHistory, getTrainingInsights | 28% |
| Program review | getSplitAnalysis, getVolumeByMuscle | 19% |
| Consistency check | getWorkoutStreak | 12% |
| Raw data export | getRecentWorkouts | 7% |
Some real examples of what users ask:
- Volume auditing: “Am I training enough back? Compare my volume to what the research recommends.” Claude calls compareToRecommendations and getVolumeByMuscle, cross-references the user's actual sets with MEV/MAV/MRV landmarks, and identifies undertrained muscle groups.
- Plateau detection: “My bench press has stalled for 3 weeks, what should I change?” Claude pulls 8 weeks of bench history via getExerciseHistory, confirms the stall in the data, then checks getTrainingInsights for related flags like insufficient chest volume or high fatigue accumulation.
- Program analysis: “Is my PPL split balanced?” Claude calls getSplitAnalysis to understand the split structure, then getVolumeByMuscle to check if any muscle groups are disproportionately under- or over-trained relative to the split's design.
- Accountability: “How consistent have I been this month?” getWorkoutStreak returns current streak, longest streak, training frequency over the last 30 days, and missed-day patterns.
The pattern is clear: users treat the MCP integration as a second opinion on their training. Arvo generates the program; Claude audits it. These are complementary, not competing.
Technical Architecture
The full request flow from AI client to data:
```
┌──────────────┐     ┌────────────────┐     ┌──────────────┐
│   Claude /   │────▶│    Arvo MCP    │────▶│   Arvo API   │
│  AI Client   │◀────│ Server (local) │◀────│  (Supabase)  │
└──────────────┘     └────────────────┘     └──────────────┘
      stdio             HTTP + Auth            RLS queries
```

Three layers, each with a clear responsibility:
- AI Client (Claude Desktop, Cursor, etc.) communicates with the MCP server over stdio — the simplest transport. The client discovers available tools on startup and calls them as needed during conversations.
- Arvo MCP Server runs as a local process on the user's machine. It validates tool parameters, forwards requests to Arvo's API with the user's auth token attached, and transforms responses into the format the AI client expects. No data is stored locally.
- Arvo API (backed by Supabase) handles authentication, authorization, and data access. Every query runs through Row Level Security — even if someone tampered with the local MCP server, they could only access data belonging to the authenticated user.
Performance is solid for an interactive use case: p50 = 180ms, p95 = 450ms per tool call. Most of that time is the Supabase query — the MCP server itself adds less than 20ms of overhead. Since Claude typically calls 2–3 tools per user question, the total data-fetching time stays around a second or less, even toward the 95th percentile.
For the full setup instructions and tool reference, see the MCP documentation.
What's Next for Fitness + MCP
Today's server is read-only. That's a deliberate choice — we wanted to nail the data access patterns before letting AI modify anything. But the roadmap is clear:
Write tools. “Move my leg day to Thursday” should work. A rescheduleSession tool with confirmation flow (the AI proposes the change, you approve) would cover the most-requested write operation. Swapping exercises, adjusting target sets, and marking deload weeks are next.
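The propose/approve pattern could look something like this — a speculative TypeScript sketch of the confirmation flow, with all names hypothetical since this tool doesn't exist yet:

```typescript
// Sketch of a propose/confirm write flow: the AI proposes a change, but
// nothing is written until the user explicitly approves the proposal.
type Proposal = {
  id: string;
  action: "rescheduleSession";
  from: string;        // current training day
  to: string;          // proposed training day
  confirmed: boolean;
};

const pending = new Map<string, Proposal>();

// Tool 1: the AI calls this; the server records a pending proposal.
function proposeReschedule(fromDay: string, toDay: string): Proposal {
  const p: Proposal = {
    id: `p${pending.size + 1}`,
    action: "rescheduleSession",
    from: fromDay,
    to: toDay,
    confirmed: false, // no write happens at this step
  };
  pending.set(p.id, p);
  return p; // the AI surfaces this to the user for approval
}

// Tool 2: called only after the user approves; this is where the write occurs.
function confirmProposal(id: string): Proposal | undefined {
  const p = pending.get(id);
  if (!p) return undefined;
  p.confirmed = true; // only now would the server persist the change
  return p;
}

const proposal = proposeReschedule("Wednesday", "Thursday");
const applied = confirmProposal(proposal.id);
```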
Cross-app integration. The real power of MCP is composability. Imagine one Claude conversation with both your Arvo training data and your MyFitnessPal nutrition data. “I've been stalling on bench — am I eating enough protein on push days?” Each app provides its own MCP server; Claude orchestrates across both. No integration partnership required.
Persistent coaching mode. Today, each Claude conversation starts fresh. With MCP, the AI could maintain a running context of your training — noticing that your squat has been trending up for 6 weeks and proactively suggesting a deload before you ask. This shifts from reactive Q&A to proactive coaching.
The competitive angle. MCP adoption is still early, but it's accelerating. The first fitness app with solid MCP support gets a distribution advantage through every AI assistant that supports the protocol — Claude, ChatGPT, Cursor, and whatever comes next. Your data becomes accessible everywhere, and the app that provides that access becomes the one users stick with.
If you're curious about the AI architecture that generates the training data MCP exposes, read about Arvo's multi-agent periodization engine.
MCP (Model Context Protocol) is an open standard created by Anthropic. Arvo's MCP integration is available to Pro subscribers. See our MCP documentation for setup instructions and the developer docs for the full API reference.