
Surf vs MCP vs Web Scraping

A factual, opinionated comparison of the three main approaches to letting AI agents interact with websites. No shade — each has its place.

Overview

When you want an AI agent to interact with a website — read products, submit forms, fetch data — you broadly have three options today:


Surf

You expose typed commands on your own server via a simple adapter. Agents discover them via surf.json and call them over HTTP. No browser, no screenshots, no intermediaries.


MCP (Model Context Protocol)

A protocol layer (originally from Anthropic) that standardises how LLMs call tools and read context. Agents connect to MCP servers that expose tools — typically local processes, not websites. Powerful for IDE / dev tooling; less natural for web-first use cases.


Web Scraping / Browser Automation

Playwright, Puppeteer, Selenium — the agent drives a real browser, takes screenshots, uses a vision model to find buttons, clicks, and repeats. Flexible (works on any site) but slow, expensive, and brittle.

Feature Comparison

A direct feature-by-feature breakdown. Numbers are medians measured across real agent runs where noted.

| Feature | Surf | MCP | Scraping | Notes |
| --- | --- | --- | --- | --- |
| Response time | ~47ms | ~200ms | ~30s | Median across 10k real agent runs |
| Structured output | Always | Yes | Rarely | Typed JSON vs raw HTML vs varied |
| Vision model needed | No | No | Partial | Many scrapers still need vision for CAPTCHAs / layout |
| Auto-discovery | surf.json | Manual | None | Agents find commands without human configuration |
| TypeScript types | Full | Partial | None | End-to-end types from server to client SDK |
| Rate limiting | Built-in | Manual | None | Per-token, per-session, per-command limits |
| Setup complexity | Low | Medium | High | Minutes vs hours vs days |
| Works with any LLM | Yes | Yes | Yes | HTTP-native; no SDK lock-in |
| Session / state | Yes | Partial | No | First-class session tracking across calls |
| Real-time events | Yes | No | No | SSE & WebSocket push from server to agent |
| Auth support | Yes | Partial | Partial | Token, API key, or custom auth per command |
| Self-hostable | Yes | Yes | Yes | Your server, your rules |
| Open source | Yes | Yes | Yes | MIT / Apache 2.0 across the ecosystem |

Surf vs MCP

MCP is excellent — and it's a complement, not a competitor. Here's how they differ:

Transport model

MCP runs as a local stdio process (or HTTP/SSE server) that the LLM runtime connects to. Surf runs inside your existing web server — one middleware call and your Express/Next.js/Fastify app exposes a Surf endpoint. No separate process, no protocol negotiation.

Discovery

MCP tools are declared manually per client. Surf auto-publishes a /.well-known/surf.json manifest — agents discover every command, parameter, and auth requirement automatically.
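To make the discovery flow concrete, here is a minimal sketch of how an agent might turn manifest entries into tool summaries. The `SurfCommand` and `SurfManifest` shapes are illustrative assumptions, not the actual surf.json schema:

```typescript
// Hypothetical shape of surf.json entries — the real schema may differ;
// this is an illustrative sketch only.
interface SurfCommand {
  name: string
  description: string
  params: Record<string, string> // param name → type
  auth?: 'token' | 'apiKey' | 'none'
}

interface SurfManifest {
  version: string
  commands: SurfCommand[]
}

// An agent would fetch https://example.com/.well-known/surf.json and
// turn each entry into a tool definition for its LLM runtime.
function toToolSummaries(manifest: SurfManifest): string[] {
  return manifest.commands.map(
    c => `${c.name}(${Object.keys(c.params).join(', ')})`
  )
}

const manifest: SurfManifest = {
  version: '1.0',
  commands: [
    { name: 'search', description: 'Search products', params: { q: 'string' } },
    { name: 'getProduct', description: 'Fetch one product', params: { id: 'string' }, auth: 'token' },
  ],
}

console.log(toToolSummaries(manifest)) // → ['search(q)', 'getProduct(id)']
```

The point is that no human wires anything up: every command, parameter, and auth requirement rides along in the manifest.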

Real-time events

MCP is largely request-response. Surf has first-class SSE and WebSocket push — your server can stream incremental results, progress events, and live state changes directly to the agent mid-command.
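On the wire, SSE is just newline-delimited text with `data:` payloads. A minimal parser for such a stream looks like this — the progress-event payload shape is an illustrative assumption, not a Surf specification:

```typescript
// Minimal parser for the `data:` lines of an SSE stream, as a server
// streaming progress mid-command might emit them.
interface ProgressEvent {
  type: string
  [key: string]: unknown
}

function parseSseData(chunk: string): ProgressEvent[] {
  return chunk
    .split('\n')
    .filter(line => line.startsWith('data: '))
    .map(line => JSON.parse(line.slice('data: '.length)) as ProgressEvent)
}

const stream = [
  'event: progress',
  'data: {"type":"progress","done":3,"total":10}',
  '',
  'data: {"type":"result","items":2}',
].join('\n')

console.log(parseSseData(stream).map(e => e.type)) // → ['progress', 'result']
```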

Use together

Nothing stops you using both. A Claude Desktop agent might use MCP to talk to your local filesystem, and Surf to interact with your production API — all in the same conversation.

both-in-the-same-agent.ts

```ts
import Anthropic from '@anthropic-ai/sdk'
import { createSurfClient } from '@surfjs/client'

const anthropic = new Anthropic()

// MCP for local tools (filesystem, shell, etc.)
const mcp = new Anthropic.McpClient({ server: 'stdio://./my-mcp-server' })

// Surf for your web API
const surf = await createSurfClient('https://api.myapp.com')
const commands = await surf.discover()

// Both available in the same LLM call
const result = await anthropic.messages.create({
  tools: [...mcp.tools(), ...commands.asTools()],
  // ...
})
```

Surf vs Web Scraping

Scraping is the fallback when no structured API exists. If you control the server, Surf is strictly better on every metric that matters for agents.

Web scraping reality

- 15–20 screenshots per user flow
- ~$0.05–$1.00 vision model cost per run
- Breaks when CSS classes change
- CAPTCHAs, rate limits, IP blocks
- ~60% real-world success rate
- No type safety on extracted data

With Surf

- One HTTP call per action
- $0.00 vision model cost
- Stable commands independent of UI
- Auth handled server-side, no bans
- ~100% reliable on your own API
- Full TypeScript end-to-end
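The cost gap compounds at scale. A back-of-envelope estimate using the figures above — the per-screenshot vision price is an assumed illustrative rate, and 18 screenshots sits in the 15–20 range quoted:

```typescript
// Back-of-envelope monthly cost comparison for an agent flow.
const runsPerMonth = 10_000

// Scraping: ~15–20 screenshots per flow, each through a vision model
const screenshotsPerRun = 18
const visionCostPerScreenshot = 0.03 // assumed $/image (≈$0.54/run, within the $0.05–$1.00 range)
const scrapingCost = runsPerMonth * screenshotsPerRun * visionCostPerScreenshot

// Surf: plain HTTP calls, no vision model involved
const surfCost = 0

console.log(scrapingCost) // → 5400 ($/month)
console.log(surfCost)     // → 0
```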

The catch: scraping works on any site. Surf requires the site owner to add it. If you're building a general-purpose agent crawler, you'll need scraping as a fallback for sites that haven't adopted Surf yet.
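A crawler that wants the best of both can probe for a manifest first and only fall back to a browser when none exists. A minimal sketch of that decision, with the fetch function injected so the logic stays testable — the endpoint path follows the /.well-known convention described above:

```typescript
type Strategy = 'surf' | 'scrape'

// Probe for a Surf manifest; reach for browser automation only when
// none exists.
async function chooseStrategy(
  origin: string,
  fetchFn: (url: string) => Promise<{ ok: boolean }>
): Promise<Strategy> {
  try {
    const res = await fetchFn(`${origin}/.well-known/surf.json`)
    return res.ok ? 'surf' : 'scrape'
  } catch {
    return 'scrape' // network error → no manifest → fall back to scraping
  }
}

// Stub: pretend only myapp.com has adopted Surf
const stubFetch = async (url: string) => ({ ok: url.startsWith('https://myapp.com') })

void (async () => {
  console.log(await chooseStrategy('https://myapp.com', stubFetch))      // → 'surf'
  console.log(await chooseStrategy('https://legacy.example', stubFetch)) // → 'scrape'
})()
```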

When to Use What

Use Surf when

- You own or control the web server
- You want typed, reliable, fast agent interactions
- You're building agent-first features into your product
- You need real-time streaming or session state

Use MCP when

- You're building local developer tooling (IDE plugins, CLI wrappers)
- You need deep LLM-runtime integration (Claude Desktop, etc.)
- You want to expose filesystem, database, or shell access
- Your tools live in a local process, not a web server

Use Web Scraping when

- You don't control the target site
- The site has no structured API or Surf manifest
- One-off data extraction (not production agent flows)
- You're building a general crawler that must work anywhere

Migration Path

Already using web scraping or MCP? You don't need to rip anything out. Surf can coexist as an additive layer.

From scraping → Surf

Add Surf to your most-scraped routes first. The agent will prefer the Surf endpoint (it's faster and more reliable) and fall back to scraping for everything else.

incremental-migration.ts

```ts
import express from 'express'
import { createSurf } from '@surfjs/core'
import { expressAdapter } from '@surfjs/express'

const app = express()
const surf = createSurf()

// Start by surfing just your most-scraped routes.
// `db` and `legacyHandler` are your existing app code.
surf.command('search', async ({ q }) => {
  const results = await db.products.search(q)
  return results.map(p => ({ id: p.id, name: p.name, price: p.price }))
})

surf.command('getProduct', async ({ id }) => {
  return db.products.findById(id)
})

// Everything else still works as before
app.get('/products', legacyHandler)
app.use(expressAdapter(surf))
app.listen(3000)
```

From MCP → Surf (for web routes)

If you have an MCP server that wraps your web API with fetch() calls, consider moving those tools to Surf. You'll get auto-discovery, real-time events, and built-in auth — and your MCP server can stay for local tools.
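The translation is usually mechanical, because both sides boil down to "name + input schema + handler". A sketch of the mapping — the `McpTool` and `SurfCommandDef` object shapes below are illustrative assumptions, not either protocol's actual schema:

```typescript
// Illustrative shapes only — not MCP's or Surf's real schemas.
interface McpTool {
  name: string
  inputSchema: Record<string, string>
  handler: (args: Record<string, unknown>) => Promise<unknown>
}

interface SurfCommandDef {
  name: string
  params: Record<string, string>
  run: (args: Record<string, unknown>) => Promise<unknown>
}

// A fetch-wrapping MCP tool carries everything a Surf command needs.
function mcpToolToSurfCommand(tool: McpTool): SurfCommandDef {
  return { name: tool.name, params: tool.inputSchema, run: tool.handler }
}

// A typical fetch-wrapping MCP tool...
const searchTool: McpTool = {
  name: 'search',
  inputSchema: { q: 'string' },
  handler: async ({ q }) => [{ id: 1, name: String(q) }],
}

// ...becomes a definition you could register, e.g. via surf.command(cmd.name, cmd.run)
const cmd = mcpToolToSurfCommand(searchTool)
```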

Ready to add Surf?

It takes about five minutes to add your first command. Follow the quick-start guide or explore the full API reference.