Surf vs MCP vs Web Scraping
A factual, opinionated comparison of the three main approaches to letting AI agents interact with websites. No shade — each has its place.
Overview
When you want an AI agent to interact with a website — read products, submit forms, fetch data — you broadly have three options today:
Surf
You expose typed commands on your own server via a simple adapter. Agents discover them via surf.json and call them over HTTP. No browser, no screenshots, no intermediaries.
MCP (Model Context Protocol)
A protocol layer (originally from Anthropic) that standardises how LLMs call tools and read context. Agents connect to MCP servers that expose tools — typically local processes, not websites. Powerful for IDE / dev tooling; less natural for web-first use cases.
Web Scraping / Browser Automation
Playwright, Puppeteer, Selenium — the agent drives a real browser, takes screenshots, uses a vision model to find buttons, clicks, and repeats. Flexible (works on any site) but slow, expensive, and brittle.
Feature Comparison
A direct feature-by-feature breakdown. Numbers are medians measured across real agent runs where noted.
Response time: median across 10k real agent runs.
Structured output: typed JSON vs raw HTML vs varied.
Vision model needed: many scrapers still need vision for CAPTCHAs / layout.
Auto-discovery: agents find commands without human configuration.
TypeScript types: end-to-end types from server to client SDK.
Rate limiting: per-token, per-session, per-command limits.
Setup complexity: minutes vs hours vs days.
Works with any LLM: HTTP-native; no SDK lock-in.
Session / state: first-class session tracking across calls.
Real-time events: SSE & WebSocket push from server to agent.
Auth support: token, API key, or custom auth per command.
Self-hostable: your server, your rules.
Open source: MIT / Apache 2.0 across the ecosystem.
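The "per-token, per-session, per-command" rate limiting above boils down to counting calls per key per time window. A minimal fixed-window sketch in TypeScript (the real Surf configuration surface may differ; this only illustrates the mechanism):

```typescript
// Fixed-window rate limiter keyed by an arbitrary string.
// Inject `now` so the window logic is testable without real time.
function makeLimiter(
  limit: number,
  windowMs: number,
  now: () => number = Date.now,
) {
  const windows = new Map<string, { start: number; count: number }>()
  return (key: string): boolean => {
    const t = now()
    const w = windows.get(key)
    if (!w || t - w.start >= windowMs) {
      // First call for this key, or the window has expired: start fresh.
      windows.set(key, { start: t, count: 1 })
      return true
    }
    if (w.count < limit) {
      w.count++
      return true
    }
    return false // over the limit for this window
  }
}
```

Keying by something like `${token}:${command}` yields per-token, per-command limits from the same primitive.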
Surf vs MCP
MCP is excellent — and it's not a competitor. Here's how they differ:
Transport model
MCP runs as a local stdio process (or HTTP/SSE server) that the LLM runtime connects to. Surf runs inside your existing web server — one middleware call and your Express/Next.js/Fastify app exposes a Surf endpoint. No separate process, no protocol negotiation.
Discovery
MCP tools are declared manually per client. Surf auto-publishes a /.well-known/surf.json manifest — agents discover every command, parameter, and auth requirement automatically.
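To make that discovery flow concrete, here is roughly what an agent might read from the manifest. The field names below are illustrative assumptions, not the official surf.json schema:

```typescript
// Illustrative manifest shape; field names are assumptions,
// not the official surf.json schema.
type SurfCommand = {
  name: string
  params: Record<string, { type: string; required?: boolean }>
  auth?: 'token' | 'apiKey' | 'none'
}

const manifest: { version: string; commands: SurfCommand[] } = {
  version: '1.0',
  commands: [
    {
      name: 'search',
      params: { q: { type: 'string', required: true } },
      auth: 'none',
    },
    {
      name: 'getProduct',
      params: {
        id: { type: 'string', required: true },
        currency: { type: 'string' },
      },
      auth: 'token',
    },
  ],
}

// From the manifest alone, an agent can work out which parameters
// each command requires, with no human configuration involved.
function requiredParams(cmd: SurfCommand): string[] {
  return Object.entries(cmd.params)
    .filter(([, p]) => p.required)
    .map(([name]) => name)
}
```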
Real-time events
MCP is largely request-response. Surf has first-class SSE and WebSocket push — your server can stream incremental results, progress events, and live state changes directly to the agent mid-command.
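For a sense of what the SSE side looks like on the wire, here is a minimal parser for the standard `event:` / `data:` framing a server pushes over an event stream. This is the generic SSE wire format, not Surf-specific code:

```typescript
// Parse a chunk of a text/event-stream body into discrete events.
// Frames are separated by a blank line; each frame carries an
// optional "event:" name (default "message") and one or more "data:" lines.
function parseSSE(chunk: string): { event: string; data: string }[] {
  return chunk
    .split('\n\n')
    .filter(Boolean)
    .map(frame => {
      let event = 'message'
      const data: string[] = []
      for (const line of frame.split('\n')) {
        if (line.startsWith('event:')) event = line.slice(6).trim()
        else if (line.startsWith('data:')) data.push(line.slice(5).trim())
      }
      // Multiple data lines in one frame are joined with newlines,
      // per the SSE specification.
      return { event, data: data.join('\n') }
    })
}
```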
Use together
Nothing stops you using both. A Claude Desktop agent might use MCP to talk to your local filesystem, and Surf to interact with your production API — all in the same conversation.
```typescript
import Anthropic from '@anthropic-ai/sdk'
import { createSurfClient } from '@surfjs/client'

const anthropic = new Anthropic()

// MCP for local tools (filesystem, shell, etc.)
const mcp = new Anthropic.McpClient({ server: 'stdio://./my-mcp-server' })

// Surf for your web API
const surf = await createSurfClient('https://api.myapp.com')
const commands = await surf.discover()

// Both available in the same LLM call
const result = await anthropic.messages.create({
  tools: [...mcp.tools(), ...commands.asTools()],
  // ...
})
```

Surf vs Web Scraping
Scraping is the fallback when no structured API exists. If you control the server, Surf is strictly better on every metric that matters for agents.
The catch: scraping works on any site. Surf requires the site owner to add it. If you're building a general-purpose agent crawler, you'll need scraping as a fallback for sites that haven't adopted Surf yet.
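That fallback pattern can be sketched in a few lines. The two callbacks below are hypothetical stand-ins for your real implementations, not an official Surf client API:

```typescript
// Try the structured Surf endpoint first; if the site has not adopted
// Surf (or the call fails), fall back to scraping.
async function fetchWithFallback<T>(
  surfCall: () => Promise<T>,
  scrape: () => Promise<T>,
): Promise<{ via: 'surf' | 'scrape'; result: T }> {
  try {
    return { via: 'surf', result: await surfCall() }
  } catch {
    // Surf endpoint missing or errored: fall through to the browser.
    return { via: 'scrape', result: await scrape() }
  }
}
```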
When to Use What
Use Surf when you control the server and want agents calling typed commands over plain HTTP.
Use MCP when agents need local tools (filesystem, shell, IDE integrations) wired into an LLM runtime.
Use Web Scraping when you don't control the site and no structured interface exists.
Migration Path
Already using web scraping or MCP? You don't need to rip anything out. Surf can coexist as an additive layer.
From scraping → Surf
Add Surf to your most-scraped routes first. The agent will prefer the Surf endpoint (it's faster and more reliable) and fall back to scraping for everything else.
```typescript
import express from 'express'
import { createSurf } from '@surfjs/core'
import { expressAdapter } from '@surfjs/express'

const app = express()
const surf = createSurf()

// Start by surfing just your most-scraped routes.
// `db` here stands in for your existing data layer.
surf.command('search', async ({ q }) => {
  const results = await db.products.search(q)
  return results.map(p => ({ id: p.id, name: p.name, price: p.price }))
})

surf.command('getProduct', async ({ id }) => {
  return db.products.findById(id)
})

// Everything else still works as before
app.get('/products', legacyHandler)
app.use(expressAdapter(surf))
app.listen(3000)
```

From MCP → Surf (for web routes)
If you have an MCP server that wraps your web API with fetch() calls, consider moving those tools to Surf. You'll get auto-discovery, real-time events, and built-in auth — and your MCP server can stay for local tools.
Ready to add Surf?
It takes about five minutes to add your first command. Follow the quick-start guide or explore the full API reference.