MCP servers
The Model Context Protocol standardises
how tools, prompts, and resources expose themselves to LLMs.
mantis-agent-sdk is both an MCP client (any agent can talk to MCP
servers) and an MCP server-runtime (you can author servers in-process
using the same @tool decorator).
In-process server
The fastest way to expose a set of tools as an MCP server:
from mantis_agent import create_sdk_mcp_server, tool

@tool
async def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b

calc = create_sdk_mcp_server(
    name="calculator",
    version="0.1.0",
    tools=[add],
)

options = {"mcp_servers": [calc]}

The server runs in the same process (no subprocess, no socket) but exposes the full MCP protocol surface to the agent loop. From the model's point of view it's identical to an out-of-process server.
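To make the "same process" point concrete, here is a minimal sketch of what an in-process tool server boils down to: a table of coroutines, a listing derived from signatures and docstrings, and a direct dispatch. The `InProcessServer` class and its method names are hypothetical stand-ins for illustration; they are not the mantis-agent-sdk implementation.

```python
import asyncio
import inspect
from typing import Any, Awaitable, Callable


class InProcessServer:
    """Hypothetical stand-in: a server name plus a table of async tools."""

    def __init__(self, name: str, tools: list[Callable[..., Awaitable[Any]]]):
        self.name = name
        self._tools = {fn.__name__: fn for fn in tools}

    def list_tools(self) -> list[dict[str, Any]]:
        """Describe each tool from its signature and docstring."""
        described = []
        for name, fn in self._tools.items():
            params = {
                p.name: getattr(p.annotation, "__name__", str(p.annotation))
                for p in inspect.signature(fn).parameters.values()
            }
            described.append(
                {"name": name, "description": inspect.getdoc(fn) or "", "params": params}
            )
        return described

    async def call_tool(self, name: str, arguments: dict[str, Any]) -> Any:
        """Dispatch straight to the coroutine; no subprocess or socket involved."""
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return await self._tools[name](**arguments)


async def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b


server = InProcessServer("calculator", [add])
listing = server.list_tools()
result = asyncio.run(server.call_tool("add", {"a": 2, "b": 3}))
```

A tool call here is an ordinary awaited function call, which is why an in-process server avoids the startup and serialisation cost of a separate process.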
External servers (stdio / sse / http)
options = {
    "mcp_servers": [
        # stdio: spawn a subprocess
        {"transport": "stdio", "command": "uvx", "args": ["mcp-server-fetch"]},
        # sse: connect to a Server-Sent Events endpoint
        {"transport": "sse", "url": "https://mcp.example.com/sse"},
        # http: connect via streamable-http transport
        {"transport": "http", "url": "https://mcp.example.com/mcp"},
    ],
}

Each transport starts its handshake at session start and tears down at
session end. Failures during handshake surface as McpServerError hooks;
failures mid-call surface as tool errors.
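Because handshake failures only show up at session start, it can help to check config entries before handing them to the agent. The sketch below is one way to do that, assuming the three config shapes shown above; `validate_server_config` and `REQUIRED_KEYS` are hypothetical helpers, not part of the SDK.

```python
from typing import Any

# Required keys per transport, inferred from the three config shapes in this section.
REQUIRED_KEYS = {
    "stdio": ("command",),
    "sse": ("url",),
    "http": ("url",),
}


def validate_server_config(cfg: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the entry looks usable."""
    transport = cfg.get("transport")
    if transport not in REQUIRED_KEYS:
        return [f"unknown transport: {transport!r}"]

    problems = []
    for key in REQUIRED_KEYS[transport]:
        if not cfg.get(key):
            problems.append(f"{transport} entry is missing {key!r}")

    # Network transports need an http(s) URL.
    url = cfg.get("url", "")
    if transport in ("sse", "http") and url and not url.startswith(("http://", "https://")):
        problems.append(f"url must be http(s): {url!r}")
    return problems


ok = validate_server_config(
    {"transport": "stdio", "command": "uvx", "args": ["mcp-server-fetch"]}
)
bad = validate_server_config({"transport": "sse"})
```

This only catches malformed entries. An unreachable endpoint or a missing executable would still surface through the handshake error path described above.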
Elicitation
Servers can prompt the user mid-tool-call. The ctx.elicit() API
inside a server-side tool blocks the call until the agent gathers a
response:
# server side
# `ctx` is the tool-call context; how it is obtained is not shown in this snippet.
@tool
async def book_flight(destination: str) -> str:
    """Book a flight."""
    seat = await ctx.elicit(
        prompt="Window or aisle?",
        options=["window", "aisle"],
    )
    return f"Booked {destination}, {seat} seat."

The agent loop pauses, surfaces a system message of subtype
elicit_request, gathers a response (from your UI / human-in-the-loop /
config-driven default), and returns it to the server.
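The "UI / human-in-the-loop / config-driven default" chain can be sketched on the host side as follows. This is an illustration under assumptions: `resolve_elicitation`, the `AskUser` callback type, and the timeout behaviour are hypothetical and do not describe how the SDK wires up elicit_request messages.

```python
import asyncio
from typing import Awaitable, Callable, Optional, Sequence

# Hypothetical callback type: gets the prompt and options, returns the chosen
# option, or None when nobody answers.
AskUser = Callable[[str, Sequence[str]], Awaitable[Optional[str]]]


async def resolve_elicitation(
    prompt: str,
    options: Sequence[str],
    ask_user: Optional[AskUser] = None,
    default: Optional[str] = None,
    timeout: float = 30.0,
) -> str:
    """Try the interactive handler first, then fall back to the configured default."""
    answer = None
    if ask_user is not None:
        try:
            answer = await asyncio.wait_for(ask_user(prompt, options), timeout)
        except asyncio.TimeoutError:
            answer = None  # nobody answered in time; fall through to the default
    if answer is None:
        answer = default
    if answer not in options:
        raise ValueError(f"no valid answer for {prompt!r}: got {answer!r}")
    return answer


async def fake_ui(prompt: str, options: Sequence[str]) -> Optional[str]:
    return "aisle"  # stands in for a real UI prompt


from_ui = asyncio.run(
    resolve_elicitation("Window or aisle?", ["window", "aisle"], ask_user=fake_ui)
)
from_default = asyncio.run(
    resolve_elicitation("Window or aisle?", ["window", "aisle"], default="window")
)
```

Validating the answer against `options` before returning it keeps a stale default or a UI bug from reaching the server as an unexpected value.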
Sampling
Servers can call back into the agent's model to do their own generation. Useful when a server tool needs an LLM but shouldn't ship its own model dependency:
# server side
# `ctx` is the tool-call context; how it is obtained is not shown in this snippet.
@tool
async def summarise(text: str) -> str:
    """Summarise text using the calling agent's model."""
    result = await ctx.sample(
        messages=[{"role": "user", "content": f"Summarise: {text}"}],
        system_prompt="You are a concise summariser.",
        max_tokens=200,
    )
    return result.content[0]["text"]

The agent receives a sampling_request system message, runs it through
its current model + options, and returns the result to the server.
To allow sampling, register a handler:
options = {
    "sampling_handler": "auto",  # use the agent's own model
    # or a custom handler that may route to a different model entirely:
    # "sampling_handler": my_sampler_fn,
}

If a server requests sampling and no handler is set, the server sees a
SamplingNotSupportedError and decides how to handle it.
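The text does not show what a custom handler such as my_sampler_fn looks like. The sketch below is one plausible shape, assuming the handler receives the same keyword arguments the server passed to ctx.sample() and returns an object whose `content` is a list of text parts. Both the signature and the `SamplingResult` type are assumptions, so check the SDK reference for the real contract.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SamplingResult:
    # Assumed shape, mirroring result.content[0]["text"] in the server example.
    content: list[dict[str, Any]] = field(default_factory=list)


async def my_sampler_fn(
    messages: list[dict[str, str]],
    system_prompt: str = "",
    max_tokens: int = 200,
) -> SamplingResult:
    """Assumed signature: the keyword arguments the server gave to ctx.sample()."""
    # A real handler would call a model here. This stub echoes the last user
    # message, cut to a rough character budget, so the sketch runs offline.
    last_user = next(
        (m["content"] for m in reversed(messages) if m["role"] == "user"), ""
    )
    budget = max_tokens * 4  # crude assumption: about 4 characters per token
    return SamplingResult(content=[{"type": "text", "text": last_user[:budget]}])


reply = asyncio.run(
    my_sampler_fn(
        messages=[{"role": "user", "content": "Summarise: MCP standardises tool access."}],
        system_prompt="You are a concise summariser.",
        max_tokens=5,
    )
)
```

A custom handler is also a natural place to enforce limits, for example capping `max_tokens` or routing server-initiated requests to a cheaper model.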
Authoring an out-of-process server
The same @tool decorator works for stdio servers — mantis-agent-sdk
ships a minimal runtime you can install as a script:
# my_server.py
from mantis_agent import tool
from mantis_agent.mcp import SdkServer

@tool
async def echo(text: str) -> str:
    return text

if __name__ == "__main__":
    SdkServer(name="echo", version="0.1.0", tools=[echo]).serve_stdio()

uv run my_server.py  # or: python my_server.py

From the agent side:
options = {
    "mcp_servers": [
        {"transport": "stdio", "command": "uv", "args": ["run", "my_server.py"]},
    ],
}
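For orientation, this is roughly what travels over the subprocess pipes once the agent calls the echo tool. The sketch assumes the standard MCP stdio framing: JSON-RPC 2.0 messages, one per line, with `tools/call` as the method name. The helpers `encode_request` and `decode_line` are hypothetical and only illustrate the framing; the SDK handles this for you.

```python
import json
from typing import Any


def encode_request(request_id: int, method: str, params: dict[str, Any]) -> bytes:
    """Serialise one JSON-RPC 2.0 request as a single newline-terminated line."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    # json.dumps escapes newlines inside strings, so the frame stays on one line.
    return (json.dumps(message) + "\n").encode("utf-8")


def decode_line(line: bytes) -> dict[str, Any]:
    """Parse one line read from the other side of the pipe."""
    return json.loads(line.decode("utf-8"))


frame = encode_request(
    1, "tools/call", {"name": "echo", "arguments": {"text": "hi\nthere"}}
)
roundtrip = decode_line(frame)
```

Since stdout carries the protocol frames in this setup, a stdio server should send its own logging to stderr so that stray output does not corrupt the stream.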