[do not merge] Add mcp-codemod, an automated v1 to v2 migration tool#3011

Draft
maxisbey wants to merge 4 commits into main from mcp-codemod

maxisbey commented Jun 27, 2026

Important

DO NOT MERGE. Opening as a draft for design review. The codemod is self-contained
(a new workspace package; nothing in mcp depends on it), but the scope, the mapping
tables, and the publishing prerequisite below deserve eyes before any of it is real.

Adds mcp-codemod, a libCST-based tool that automates the mechanical part of the v1 to v2 migration:

uvx mcp-codemod v1-to-v2 ./src
grep -rn '# mcp-codemod:' ./src   # everything left for a human

The design goal is narrow and deliberate: existing v1 code runs on v2, staying on the
legacy/compat paths. The codemod never moves users onto 2026-era features (tasks, MRTR,
resolvers, subscriptions/listen, cache hints, extensions) and never "modernizes" beyond
what working code requires — a spelling that still works on v2 (e.error.code, camelCase
construction kwargs at runtime) is only touched when leaving it would fail the user's own
type-checking. Every change whose meaning is unambiguous from the file alone is rewritten;
every site the codemod recognizes but will not guess at gets a # mcp-codemod: comment,
so the remaining work is one grep. Re-running on its own output is a no-op, and --dry-run
(optionally with --diff) previews a run without writing anything.
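
As a rough illustration of the "one grep" contract above, the leftover work in Python sources can also be listed programmatically. This is a minimal sketch, not part of the PR: `find_markers` and the sample text are hypothetical, and it uses the standard library's `tokenize` so that only real comments count, not strings that happen to contain the marker text.

```python
import io
import tokenize

MARKER = "# mcp-codemod:"


def find_markers(source: str) -> list[tuple[int, str]]:
    """Return (line number, message) for every marker comment in Python source."""
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # Only COMMENT tokens qualify; a string literal with the same text is skipped.
        if tok.type == tokenize.COMMENT and tok.string.startswith(MARKER):
            found.append((tok.start[0], tok.string[len(MARKER):].strip()))
    return found


# Hypothetical input: the marker message below is invented for illustration.
sample = '''\
text = "# mcp-codemod: not a marker, just a string"
# mcp-codemod: hypothetical message naming the manual fix
server = make_server()
'''
markers = find_markers(sample)
```

Markers written into `pyproject.toml` or `requirements*.txt` are not Python source, so for those files the plain `grep` shown earlier remains the simpler tool.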

Motivation and Context

docs/migration.md is ~1,600 lines, and most of what it asks for is tedium rather than
judgment. A codemod removes exactly that half, so a reader's (or an agent's) attention
goes to the changes that need it. The TypeScript SDK ships
@modelcontextprotocol/codemod as step 1 of its upgrade guide with the same philosophy
(minimal-working output, compat paths where they exist, markers elsewhere); this is the
Python counterpart.

What it rewrites (each gated on resolving a name through the file's imports, never on
matching text):

  • Import paths that moved: mcp.server.fastmcp -> mcp.server.mcpserver,
    mcp.types -> mcp_types (including the from mcp import types form),
    mcp.shared.version -> mcp_types.version.
  • Renamed symbols: FastMCP -> MCPServer, McpError -> MCPError,
    FastMCPError -> MCPServerError, streamablehttp_client -> streamable_http_client,
    and the removed Content / ResourceReference aliases.
  • Lowlevel decorator registrations — the biggest structural change v2 makes.
    @server.list_tools() / @server.call_tool() / all twelve v1 decorator kinds become
    server.add_request_handler(...) calls at the exact source position the decorator
    occupied (registration there is when the v1 decorator ran, so execution order is
    preserved by construction — no statement reordering, and the deprecated capabilities
    land on the warning-free registration path). Each is wired through a generated adapter
    that reproduces the v1 wrapper semantics the handler relied on: bare-list wrapping,
    call_tool's any-exception-to-isError contract with jsonschema input/output
    validation (tool lookup through the registered tools/list handler, exactly v1's own
    cache mechanism, so it works when list_tools lives in another module),
    read_resource content conversion, and the completion None-mapping. User handler
    bodies are never touched. A shape the adapter cannot serve honestly (a stacked
    decorator, a self.-attribute server, a non-v1 signature, a non-literal decorator
    argument, a taken name) is marked with the reason instead.
  • One-argument McpError(...) calls to MCPError.from_error_data(...) (same single
    ErrorData argument the v1 constructor took; the user's expression is kept as
    written). e.error.code and friends are deliberately left alone — they still work.
  • v1 positional arguments on the lowlevel Server(...) constructor to keywords (v2 is
    keyword-only after name but kept v1's names and order).
  • camelCase attribute reads on mcp.types models (.inputSchema -> .input_schema),
    restricted to the 40 field names v1 declared, plus the same rename in
    getattr/hasattr/setattr string literals (those break at runtime) and in
    construction keywords (those work at runtime but fail type-checking).
  • Client-surface breaks that construct silently and detonate on first use: an inline
    timedelta(...) session timeout gains .total_seconds() (a non-provable value is
    marked), cursor= on session list_* methods wraps into
    params=PaginatedRequestParams(...), and a pydantic AnyUrl(...)/FileUrl(...)
    wrapper around a resource URI is dropped where the target provably takes v2's plain
    str (marked where it cannot be proven).
  • The streamable_http_client(...) as (read, write, _) three-tuple to the v2 two-tuple.
  • The mcp requirement in pyproject.toml (PEP 621 tables and dependency groups) and
    requirements*.txt, to >=2,<3 — but only where the current constraint cannot accept
    any v2 release, and only the version specifier. Poetry tables, URL pins, unparseable
    lines, and the removed ws extra are marked instead.
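
To make "gated on resolving a name through the file's imports, never on matching text" concrete, here is a minimal sketch of the idea. It is an assumption-laden stand-in, not the codemod's code: it uses the standard library's `ast` rather than libCST, the `RENAMES` table shape and the helper names are hypothetical, and it ignores scoping. Only the `FastMCP -> MCPServer` rename itself comes from the list above.

```python
import ast

# One rename taken from the PR text; the table shape is an assumption.
RENAMES = {"mcp.server.fastmcp.FastMCP": "MCPServer"}


def imported_origins(tree: ast.Module) -> dict[str, str]:
    """Map each local name bound by `from X import Y [as Z]` to its qualified origin."""
    origins = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            for alias in node.names:
                origins[alias.asname or alias.name] = f"{node.module}.{alias.name}"
    return origins


def rename_sites(source: str) -> list[tuple[int, str, str]]:
    """Return (line, local name, v2 name) for uses that resolve to a renamed SDK symbol."""
    tree = ast.parse(source)
    origins = imported_origins(tree)
    sites = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
            origin = origins.get(node.id)
            new_name = RENAMES.get(origin) if origin else None
            # In this sketch a user-chosen alias keeps its spelling, so only
            # uses spelled like the imported name are reported.
            if new_name and origin.endswith("." + node.id):
                sites.append((node.lineno, node.id, new_name))
    return sorted(sites)


sdk_file = "from mcp.server.fastmcp import FastMCP\napp = FastMCP('demo')\n"
lookalike = "from mylib import FastMCP\napp = FastMCP('demo')\n"
no_import = "app = FastMCP('demo')\n"
```

Here `rename_sites(sdk_file)` reports the use on line 2, while `lookalike` and `no_import` yield nothing even though they spell the same name, which is the property the gating is meant to guarantee.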

What it deliberately only marks (never guesses): the v1 mcp.types names with no v2
home (pinned complete against the installed package by ratchet tests); imports of the
module namespaces v2 deleted (a 107-module ratchet accounts for the whole v1 namespace);
the v1 RootModel wrappers that became plain union aliases (constructing them or calling
pydantic methods on them is marked with the TypeAdapter fix); request_context /
request_handlers on receivers proven to be lowlevel servers; transport keywords on the
MCPServer constructor (the right destination depends on how the server is started);
streamablehttp_client results used outside a with item; positional constructor
arguments after the server name (v1's second positional was instructions, v2's is
title — a silent mis-route); and every removed API with no drop-in replacement. Marker
messages name the legacy-shaped fix and never advertise a 2026-era feature.

How Has This Been Tested?

  • 224 test functions/cases, 100% branch coverage on the new package (./scripts/test is
    green for the whole tree), strict pyright, ruff.
  • Runtime-proven adapters: the test suite migrates a six-registration v1 lowlevel
    server and serves it to a v1-shaped ClientSession over the legacy protocol
    (negotiating 2025-11-25) in-memory — list/call (ok, unknown tool, schema-invalid
    arguments as isError results), resources, read, subscribe, prompt. The emitted
    templates are additionally pinned against the installed v2: every registration method
    string registers, every params model exists, every injectable import resolves, and no
    template emits a 2026-era surface (InputRequiredResult, subscriptions/listen,
    cache_hints, extensions).
  • Audited against 11 real repositories with the batch harness at
    scripts/codemod-batch-test/: pinned commits of real v1 projects are migrated and
    type-checked against this workspace's v2, with the pristine side checked against the
    latest v1 as the baseline; every error that exists only on the migrated side must sit
    inside the span of a # mcp-codemod: marker. The manifest now includes the official
    reference servers plus decorator-heavy community servers (mysql_mcp_server and
    kaltura-mcp with seven decorators each, arxiv-mcp-server dispatching across
    subpackages), the marker path (fastapi_mcp's method-local server), client libraries
    (langchain-mcp-adapters, mcpadapt with the old spelling and a positional timedelta
    timeout), and an exact ==1.6.0 pin (chroma-mcp). Highlights: mysql_mcp_server's
    seven-decorator server migrates to zero new errors and zero markers; arxiv's
    13-file lowlevel server likewise.
  • Measured against ground truth: the 76 example files that exist on both v1.x and
    main were migrated by hand, so their diff approximates the correct migration. Of the
    51 files with a real migration diff, 13 are reproduced exactly, 23 partially, 12 are
    decorator-rewritten (their oracle is the runtime round-trip and the harness — the
    human migrations rewrote handlers natively, which the codemod deliberately does not
    do), and 0 are made worse.
  • The failure modes that matter most for a tool like this are rewriting code it should
    not touch and making code worse, so that is what most of the suite pins: a file that
    never imports the SDK is never modified even when it spells tempting names; nothing is
    rewritten into a silent NameError; nothing that works on v2 is broken; and a re-run
    over its own output is byte-for-byte identical.
  • The mapping tables are pinned against the installed v2 package by ratchet tests,
    so they cannot silently drift as v2 evolves: every rename target resolves, every
    removed API is provably absent, no flagged constructor keyword survives, no v1
    decorator name has come back as a live Server method, and the lowlevel positional
    conversion names must be keyword-only parameters of the installed constructor.

Breaking Changes

None. The package is additive; mcp does not depend on it.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Additional context

Needs an owner before this merges: the PR adds mcp-codemod to the publish
workflow's build step, so the PyPI project must be registered and a trusted publisher
configured for it (the same dance mcp-types needed) before this lands. The
ordering matters: if a release is tagged first, the upload job dies at the unregistered
mcp-codemod wheel after mcp is uploaded and before mcp-types, and mcp's
exact pin on mcp-types makes that half-published release uninstallable until the job
is re-run. (Without the workflow change the documented uvx mcp-codemod fails for every
reader instead, so the docs and the publishing have to land together either way.)

Deliberately not in this PR (each is a clean follow-up, and none should gate the
design review):

  • Restructuring docs/migration.md codemod-first into the two-journey split the
    TypeScript guide uses, with the mapping tables linked as the source of truth.
  • Per-transform selection (the TS codemod's --transforms). The transformer here is one
    integrated pass rather than discrete transforms, re-runs are idempotent, and disagreeing
    with a specific rewrite is handled by not committing those hunks -- deferred until
    someone actually asks for it.

AI Disclaimer

maxisbey added 2 commits July 1, 2026 12:08
A new `mcp-codemod` workspace package (`uvx mcp-codemod v1-to-v2 ./src`)
that rewrites every v1 -> v2 change whose meaning is unambiguous from the
file alone, and inserts a `# mcp-codemod:` comment above every site it
recognized but would not guess at. Built on libCST.

Names are resolved through each file's imports, never matched as text, so
an aliased import or an unrelated symbol that shares a name with an SDK
one is never touched. The camelCase to snake_case rename is restricted to
the field names v1's `mcp.types` actually declared. Anything whose correct
rewrite depends on information that is not in the file -- the lowlevel
decorator to `on_*` relocation, the transport keywords on the `MCPServer`
constructor -- is left exactly as written and marked instead, so the
remaining work is one grep. Re-running on the output is a no-op.
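
A minimal sketch of the restricted camelCase rename described above, under stated assumptions: only `inputSchema` is named in this PR, so the allowlist here holds that single entry (the real table is said to have 40), and `to_snake` / `rename_field` are hypothetical helpers. The sketch also skips the receiver check (that the attribute is read from an `mcp.types` model), which the codemod performs and a string function cannot.

```python
import re

# Only `inputSchema` is named in the PR text; every other entry is omitted here.
V1_CAMEL_FIELDS = frozenset({"inputSchema"})


def to_snake(name: str) -> str:
    """Convert camelCase to snake_case by prefixing each inner capital with `_`."""
    return re.sub(r"(?<=[a-z0-9])([A-Z])", lambda m: "_" + m.group(1).lower(), name)


def rename_field(name: str) -> str:
    """Rename only the fields v1 declared; leave every other camelCase name alone."""
    return to_snake(name) if name in V1_CAMEL_FIELDS else name


renamed = rename_field("inputSchema")
untouched = rename_field("userName")  # camelCase, but not a v1 mcp.types field
```

The allowlist is the point of the design: a blanket camelCase conversion would rewrite unrelated user attributes, while a closed list can only touch names v1 actually declared.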

The mapping tables are pinned against the installed v2 package by ratchet
tests so they cannot silently drift: every rename target must resolve,
every removed API must be provably absent, and no flagged constructor
keyword may survive on `MCPServer.__init__`. Measured against the example
files that exist on both `v1.x` and `main` (whose diff is the hand-written
migration), the codemod fully reproduces 13 of the 51 with a real
migration diff, improves 35 more, and makes none worse.

Also adds an "Automated migration" section to docs/migration.md, a mention
of the tool in README.v2.md, and the package to the publish workflow's
build step (the PyPI project and its trusted publisher must exist before a
release is tagged with this in it).

Three additions to mcp-codemod, closing the gaps a comparison with the
TypeScript codemod surfaced:

Imports of module namespaces v2 deleted outright (the experimental tasks
namespaces, the WebSocket transports, `mcp.shared.progress`) are now
marked with replacement guidance. A new ratchet test freezes the 107
public modules v1 shipped and asserts every one imports on v2, is
renamed, or is in the removed table, so the whole v1 module namespace is
provably accounted for.

The codemod now also updates the `mcp` requirement in `pyproject.toml`
(PEP 621 tables and dependency groups) and `requirements*.txt` to
`>=2,<3` -- only where the current constraint cannot accept any v2
release, and only the version specifier: name, extras, environment
marker, and spacing keep the user's spelling. Poetry tables and the
removed `ws` extra are marked instead of guessed at, under the same
`# mcp-codemod:` contract as source markers.
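
The "only where the current constraint cannot accept any v2 release" rule amounts to an interval check against the window `[2, 3)`. The sketch below is a deliberately narrow, hand-rolled illustration and not the codemod's implementation: `accepts_some_v2` and `pad` are hypothetical names, it understands only the `==`, `>=`, `>`, `<=`, `<` operators on plain numeric versions, and it returns `None` for anything else (the case the PR says is marked rather than guessed at).

```python
import re
from typing import Optional

CLAUSE = re.compile(r"(==|>=|<=|>|<)\s*(\d+(?:\.\d+)*)")


def pad(version: str, width: int = 4) -> tuple:
    """Turn '2' or '1.6.0' into a fixed-width tuple so that 2 == 2.0 == 2.0.0."""
    parts = [int(p) for p in version.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def accepts_some_v2(spec: str) -> Optional[bool]:
    """True/False if the specifier can/cannot admit a version in [2, 3); None if unparsed."""
    lo, lo_inclusive = pad("2"), True
    hi, hi_inclusive = pad("3"), False
    for clause in (c.strip() for c in spec.split(",")):
        if not clause:
            continue
        match = CLAUSE.fullmatch(clause)
        if match is None:
            return None  # e.g. `~=`, wildcards, URL pins: leave for a human
        op, version = match.group(1), pad(match.group(2))
        if op in (">=", ">", "=="):  # clause tightens the lower bound
            inclusive = op != ">"
            if version > lo or (version == lo and not inclusive):
                lo, lo_inclusive = version, inclusive
        if op in ("<=", "<", "=="):  # clause tightens the upper bound
            inclusive = op != "<"
            if version < hi or (version == hi and not inclusive):
                hi, hi_inclusive = version, inclusive
    return lo < hi or (lo == hi and lo_inclusive and hi_inclusive)


results = {s: accepts_some_v2(s) for s in ["==1.6.0", ">=1.0,<2", ">=1.2", "<=2.0", "~=1.4"]}
```

Under this sketch only the specifiers that come back `False` would be rewritten to `>=2,<3`; `>=1.2` already admits v2 and is left alone, and `~=1.4` is not understood, so it is reported rather than rewritten.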

`scripts/codemod-batch-test/` runs the codemod against pinned real
repositories and audits the marker contract end to end: it type-checks
the pristine clone against the latest v1 and the migrated copy against
this workspace's v2 with identical pyright settings, then requires every
error that exists only on the migrated side to sit next to a marker.
Across the four repos in the manifest every migration-surface error is
covered, and the audit caught two real bugs now fixed here: `Context`
imported from the old `.server` submodule is rehomed to the package (the
submodule holds the name at runtime, but a type checker treats a
non-re-exported name as private), and `request_context` on a receiver
the pre-pass proved holds a lowlevel `Server` is flagged again --
receiver-matched, so the live `ctx.request_context` idiom stays
untouched.

github-actions Bot commented Jul 1, 2026

📚 Documentation preview

Preview https://pr-3011.mcp-python-docs.pages.dev
Deployment https://3db5799b.mcp-python-docs.pages.dev
Commit d99b140
Triggered by @maxisbey
Updated 2026-07-01 13:48:58 UTC

maxisbey added 2 commits July 1, 2026 12:38
The goal is that migrated v1 code runs on v2 on its legacy paths, not
that it adopts v2 idioms. Applying that bar:

- Leave e.error.code / .message / .data chains alone: v2's MCPError
  keeps a typed .error ErrorData, so the v1 spelling runs and
  type-checks unchanged. The except-binding tracking goes with it.
- Rewrite one-argument McpError(...) calls to MCPError.from_error_data(...)
  instead of flattening the inline ErrorData: the user's expression is
  kept as written and the non-inline form no longer needs a marker.
- Convert v1 positional arguments on the lowlevel Server constructor to
  keywords (v2 is keyword-only after name but kept v1's names and order),
  pinned against the installed signature by a new ratchet test.
- Reword every marker message that pointed at replaced internals or at
  the successor of the removed experimental tasks API; state removals
  plainly instead of steering users onto new surfaces.
- Teach the batch harness that a reportArgumentType error naming a
  detonating argument type (timedelta, AnyUrl) is a real break, never
  v2 strictness drift, and ignore stale work/ directories.

The twelve v1 @server.* decorator kinds are gone on v2. Their sites now
become add_request_handler / add_notification_handler calls at the
decorator's exact source position (registration there is when the v1
decorator ran, so execution order is preserved and the deprecated
capabilities land on the warning-free path), wired through generated
adapters that reproduce the v1 wrapper semantics: bare-list wrapping,
call_tool's any-exception-to-isError contract with jsonschema input and
output validation (tool lookup through the registered tools/list
handler, v1's own cache mechanism, so cross-module list_tools works),
read_resource content conversion, and the completion None-mapping.
Handler bodies are never touched. Shapes the adapter cannot serve
honestly -- a stacked decorator, an attribute receiver, a non-v1
signature, a non-literal decorator argument, a taken name -- are marked
with the reason. The suite migrates a six-registration server and
serves it to a v1-shaped ClientSession over the legacy protocol; the
templates are pinned against the installed v2 (method strings register,
params models exist, imports resolve, no 2026-era surface is emitted).
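
Two of the wrapper semantics named above (bare-list wrapping and the any-exception-to-isError contract) can be sketched generically. This is not the template the codemod emits: `ToolResult` is a hypothetical stand-in for the SDK's result model, `adapt_call_tool` and `echo` are invented for illustration, and the jsonschema input/output validation and tool lookup are left out entirely.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Awaitable, Callable


@dataclass
class ToolResult:
    """Hypothetical stand-in for the SDK's call-tool result model."""
    content: list = field(default_factory=list)
    is_error: bool = False


def adapt_call_tool(handler: Callable[[str, dict], Awaitable[Any]]):
    """Wrap a v1-shaped handler so callers always receive a ToolResult."""
    async def adapter(name: str, arguments: dict) -> ToolResult:
        try:
            result = await handler(name, arguments)
        except Exception as exc:
            # Any exception becomes an error result instead of propagating.
            return ToolResult(content=[str(exc)], is_error=True)
        # A bare list returned by the handler is wrapped into a result.
        return ToolResult(content=list(result))
    return adapter


async def echo(name: str, arguments: dict) -> list:
    """A v1-style handler body, left exactly as the user wrote it."""
    if name != "echo":
        raise ValueError(f"Unknown tool: {name}")
    return [arguments["text"]]


wrapped = adapt_call_tool(echo)
ok = asyncio.run(wrapped("echo", {"text": "hi"}))
bad = asyncio.run(wrapped("nope", {}))
```

The handler body is never edited; all of the v1 behavior lives in the wrapper, which is the property the commit message relies on when it says handler bodies are never touched.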

Also on the client surface: inline timedelta session timeouts convert
to float seconds and non-provable values are marked (the mismatch only
fails on the first request); cursor= on session list_* methods wraps
into params=PaginatedRequestParams(...); pydantic URL wrappers around
resource URIs are dropped where the target provably takes v2's plain
str and marked elsewhere; constructions of and pydantic method calls on
the v1 RootModel wrappers that became plain union aliases are marked
with the TypeAdapter fix; ._mcp_server and the type-keyed handler dicts
are marked with their v2 homes. Adapters honor an explicit `uri: str`
annotation and keep v1's AnyUrl otherwise, and keep the emitted code
insensitive to user return annotations so a wrong annotation cannot
manufacture type errors inside generated code.
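
The timeout conversion described above is ordinary standard-library arithmetic. A minimal illustration, with a hypothetical timeout value (the surrounding session call is omitted because its exact signature is not given here):

```python
from datetime import timedelta

# v1-style value: a timedelta written inline by the user (hypothetical numbers).
v1_timeout = timedelta(minutes=1, seconds=30)

# v2-style value: the same duration as float seconds, as produced by
# appending .total_seconds() to the original expression.
v2_timeout = v1_timeout.total_seconds()
```

Appending the method call keeps the user's original expression intact, which fits the PR's stated preference for leaving user code as written wherever possible.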

Batch harness: seven more pinned repositories (two seven-decorator
servers, a multi-package lowlevel server, the method-local-server
marker path, two client libraries including a positional timedelta
timeout and the old streamablehttp spelling, and an exact ==1.6.0 pin).
Markers now cover the full statement they precede rather than a fixed
radius, Unknown-typed errors in files that carry markers classify as
cascade of a marked break, and the work directory is a dot-directory so
pytest never collects the cloned repositories' own suites. All eleven
repositories audit at zero uncovered errors.
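
The coverage rule the harness enforces (an error that exists only on the migrated side must fall inside the span of a marker, where a span is the marker plus the statement it precedes) might be sketched like this. It is an assumption-based illustration, not the harness code: `marker_spans` and `uncovered` are hypothetical names, error locations are reduced to bare line numbers, and compound statements contribute only their header lines.

```python
import ast

MARKER = "# mcp-codemod:"


def statement_span(node: ast.stmt) -> tuple:
    """Lines a statement occupies; for def/class/if/with and similar, the header only."""
    body = getattr(node, "body", None)
    if isinstance(body, list) and body:
        return node.lineno, max(node.lineno, body[0].lineno - 1)
    return node.lineno, node.end_lineno


def marker_spans(source: str) -> list:
    """For each marker comment, the range from the marker to the end of the next statement."""
    statements = [
        statement_span(n) for n in ast.walk(ast.parse(source)) if isinstance(n, ast.stmt)
    ]
    spans = []
    for number, line in enumerate(source.splitlines(), start=1):
        if line.strip().startswith(MARKER):
            following = [s for s in statements if s[0] > number]
            if following:
                # Nearest statement below the marker; on a tie prefer the longest span.
                _, end = min(following, key=lambda s: (s[0], -s[1]))
                spans.append((number, end))
    return spans


def uncovered(error_lines: list, spans: list) -> list:
    """Error lines that no marker span covers."""
    return [n for n in error_lines if not any(a <= n <= b for a, b in spans)]


# Hypothetical migrated file: the marker message and names are invented.
migrated = '''\
import os
# mcp-codemod: hypothetical reason
result = call(
    1,
    2,
)
other = broken()
'''
spans = marker_spans(migrated)
leftover = uncovered([4, 7], spans)
```

An error on line 4 sits inside the marked multi-line call and is covered; the one on line 7 is not, and under the audit described above it would count as an uncovered error.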

An adversarial review round over the full change confirmed ten defects,
all fixed with regression tests: adapter imports now inject at the top
of the module (a mid-file import as the anchor left registration code
running before its imports bound); the rewrite gates now also block a
handler named like a template local, and any module-level non-import
binding of a name the adapter references (both were silent runtime
breaks past the gates); import injection dedup now reads the updated
module's top-level import binds, so conditional or function-local
imports no longer suppress a needed injection; list_* adapters pass a
returned full result model through instead of double-wrapping (v1's
runtime behavior); the blocked-progress marker names
add_notification_handler (a request-handler registration would never
fire); the timeout transform skips already-v2 shapes so re-runs stay
no-ops; the emitted name scheme is defined once and shared between
templates and gates; and the harness classifier no longer lets a
marker cover a whole def/class body or write off arbitrary
Unknown-typed errors (header-only spans; cascade restricted to
propagation rules and never detonators).