By the end of this tutorial, you’ll have a working Claude agent web browsing system that can fetch URLs, parse HTML content, follow links autonomously, and synthesize multi-page research — all without human input between steps. We’ll handle both static pages and JavaScript-rendered content, and I’ll show you the retry and parsing patterns that actually hold up in production.
- Install dependencies — set up httpx, Playwright, BeautifulSoup4, and the Anthropic SDK
- Define browsing tools — build fetch_page, extract_links, and search_page as Claude tool-use functions
- Handle JavaScript rendering — add Playwright for SPAs and dynamic content
- Build the agent loop — wire tools into a Claude tool-use agentic loop
- Add content extraction — strip boilerplate and convert HTML to clean text for the model
- Add navigation memory — track visited URLs and manage context window limits
Why Build This Instead of Using a Hosted Tool?
Hosted browsing APIs like Browserbase or Firecrawl are excellent — I use them in production for certain jobs. But they cost money per page (Firecrawl is ~$0.001/page at the Extract tier, Browserbase ~$0.003/session minute), add a network hop, and you can’t control the parsing logic. If you’re doing competitive intelligence, automated research, or content audits at scale, owning the browser layer pays off quickly. You also get full control over how content is chunked and fed into Claude’s context.
For comparison: a full autonomous research run touching 10 pages with Claude Haiku 3.5 for tool-use logic and content synthesis costs roughly $0.004–0.008 in model tokens. Add your own server costs and you’re still well under hosted alternatives at volume.
Step 1: Install Dependencies
You need four core packages: httpx for static page fetching, playwright for JS rendering, beautifulsoup4 for parsing, and anthropic for the agent itself. (lxml comes along as the faster parser backend for BeautifulSoup.)
pip install httpx playwright beautifulsoup4 anthropic lxml
playwright install chromium # downloads ~130MB browser binary
Pin versions in production — Playwright’s API has changed significantly between minor versions. I currently use playwright==1.44.0 and anthropic==0.28.0 for stability.
Step 2: Define Browsing Tools
Claude tool use works by providing a JSON schema of available functions. The model decides which tool to call and with what arguments. You execute the tool and return results. This is the foundation of Claude tool use with Python — if you haven’t built a tool-use loop before, that article covers the mechanics in detail.
Define three tools: fetch a page, extract links from a page, and find text on a page.
BROWSING_TOOLS = [
    {
        "name": "fetch_page",
        "description": "Fetch the HTML content of a URL. Returns cleaned text content and page title. Use this to read web pages.",
        "input_schema": {
            "type": "object",
            "properties": {
                "url": {
                    "type": "string",
                    "description": "The full URL to fetch, including https://"
                },
                "use_javascript": {
                    "type": "boolean",
                    "description": "Set to true for SPAs, React/Vue apps, or pages that load content dynamically",
                    "default": False
                }
            },
            "required": ["url"]
        }
    },
    {
        "name": "extract_links",
        "description": "Extract all hyperlinks from a previously fetched page. Returns list of {url, text} objects.",
        "input_schema": {
            "type": "object",
            "properties": {
                "url": {"type": "string", "description": "URL of the page to extract links from"},
                "filter_pattern": {"type": "string", "description": "Optional substring to filter URLs by, e.g. '/blog/' or 'pricing'"}
            },
            "required": ["url"]
        }
    },
    {
        "name": "search_page",
        "description": "Search for specific text or keywords within a fetched page's content.",
        "input_schema": {
            "type": "object",
            "properties": {
                "url": {"type": "string"},
                "query": {"type": "string", "description": "Text or keywords to find on the page"}
            },
            "required": ["url", "query"]
        }
    }
]
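A mistyped `required` field or missing description fails quietly at request time, so it's worth sanity-checking tool definitions before sending them to the API. Here's a small helper you can run against `BROWSING_TOOLS` — `validate_tools` is my own utility, not part of the Anthropic SDK:

```python
def validate_tools(tools: list[dict]) -> list[str]:
    """Return a list of schema problems; an empty list means the tools look sane."""
    problems = []
    for tool in tools:
        name = tool.get("name", "<missing name>")
        if "description" not in tool:
            problems.append(f"{name}: missing description")
        schema = tool.get("input_schema", {})
        props = schema.get("properties", {})
        for req in schema.get("required", []):
            if req not in props:
                problems.append(f"{name}: required field '{req}' not in properties")
    return problems

# Example with a deliberate mistake: 'query' is required but never declared.
broken = [{"name": "search_page", "description": "Search a page.",
           "input_schema": {"type": "object",
                            "properties": {"url": {"type": "string"}},
                            "required": ["url", "query"]}}]
print(validate_tools(broken))  # flags the missing 'query' property
```

Run it once at startup; a non-empty result is almost always a copy-paste error in the schema.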
Step 3: Handle JavaScript Rendering
About 40% of pages you’ll want to scrape in real use cases are SPAs or have lazy-loaded content. Static httpx fetches return empty shells for these. Playwright handles both cases, but adds ~800ms per page due to browser startup. The practical solution: try static first, fall back to Playwright if the content looks empty.
import httpx
import asyncio
from playwright.async_api import async_playwright
from bs4 import BeautifulSoup
from urllib.parse import urljoin, urlparse

# Caches store cleaned text and raw HTML per URL to avoid re-fetching within
# a session. The agent sees only the cleaned text; extract_links needs the raw HTML.
page_cache: dict[str, str] = {}
raw_html_cache: dict[str, str] = {}

async def fetch_static(url: str) -> str:
    """Fast path for static HTML pages."""
    headers = {"User-Agent": "Mozilla/5.0 (compatible; research-bot/1.0)"}
    async with httpx.AsyncClient(timeout=15, follow_redirects=True) as client:
        response = await client.get(url, headers=headers)
        response.raise_for_status()
        return response.text

async def fetch_with_js(url: str) -> str:
    """Slower path for JavaScript-heavy pages. Uses headless Chromium."""
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        page = await browser.new_page()
        await page.goto(url, wait_until="networkidle", timeout=30000)
        content = await page.content()
        await browser.close()
        return content

def clean_html(html: str, url: str) -> str:
    """Strip scripts, styles, nav boilerplate. Return readable text."""
    soup = BeautifulSoup(html, "lxml")
    # Remove noise elements
    for tag in soup(["script", "style", "nav", "footer", "header", "aside"]):
        tag.decompose()
    # Get text with some structure preserved
    text = soup.get_text(separator="\n", strip=True)
    # Collapse excessive whitespace
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return "\n".join(lines)[:8000]  # Cap at 8k chars to manage context

async def fetch_page(url: str, use_javascript: bool = False) -> dict:
    """Main fetch function called by the agent."""
    if url in page_cache:
        return {"status": "cached", "content": page_cache[url], "url": url}
    try:
        html = await fetch_with_js(url) if use_javascript else await fetch_static(url)
        # Heuristic: if body text is very short, JS rendering is probably needed
        soup = BeautifulSoup(html, "lxml")
        if len(soup.get_text(strip=True)) < 200 and not use_javascript:
            html = await fetch_with_js(url)
            soup = BeautifulSoup(html, "lxml")  # re-parse the rendered HTML
        title = soup.find("title")
        title_text = title.get_text() if title else "No title"
        cleaned = clean_html(html, url)
        page_cache[url] = cleaned    # cleaned text for subsequent tool calls
        raw_html_cache[url] = html   # raw HTML for extract_links
        return {"status": "ok", "title": title_text, "content": cleaned, "url": url}
    except Exception as e:
        return {"status": "error", "error": str(e), "url": url}
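The 200-character emptiness heuristic is worth testing in isolation so you can tune it per site. Here's the same idea as a stdlib-only standalone — the regex tag-stripping is a stand-in for the BeautifulSoup check above, and the threshold is an assumption to adjust:

```python
import re

def looks_js_rendered(html: str, min_chars: int = 200) -> bool:
    """Heuristic: strip tags and check how much visible text remains.
    A near-empty body usually means the page builds its DOM in JavaScript."""
    # Drop script/style bodies first so inline JS doesn't count as "text"
    no_code = re.sub(r"(?is)<(script|style)[^>]*>.*?</\1>", " ", html)
    text = re.sub(r"(?s)<[^>]+>", " ", no_code)  # strip remaining tags
    visible = " ".join(text.split())
    return len(visible) < min_chars

# A typical SPA shell: empty root div plus a script bundle
spa_shell = "<html><body><div id='root'></div><script>/* app bundle */</script></body></html>"
print(looks_js_rendered(spa_shell))  # True: almost no visible text
```

Sites that serve skeleton screens with placeholder text can defeat any length-based check, so treat this as a first filter, not a guarantee.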
Step 4: Build the Agent Loop
This is the core loop: send messages to Claude with tools available, execute any tool calls it makes, and append results back into the conversation. The loop runs until Claude returns a message with no tool calls — meaning it has enough information to answer.
import anthropic
import json

client = anthropic.Anthropic()

async def run_browser_agent(task: str, max_iterations: int = 10) -> str:
    """
    Run a Claude agent that can browse the web to complete a research task.
    Returns the final synthesis as a string.
    """
    messages = [{"role": "user", "content": task}]
    system_prompt = """You are a web research agent. Use your browsing tools to gather
accurate information. Always fetch at least 2-3 sources before synthesizing.
Be systematic: check the main page first, then follow relevant links.
Cite specific URLs in your final answer."""

    for iteration in range(max_iterations):
        response = client.messages.create(
            model="claude-3-5-haiku-20241022",  # Haiku for speed/cost on tool calls
            max_tokens=4096,
            system=system_prompt,
            tools=BROWSING_TOOLS,
            messages=messages
        )

        # No tool calls = Claude is done browsing, return final answer
        if response.stop_reason == "end_turn":
            return response.content[0].text

        # Process tool calls
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                result = await execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": json.dumps(result)
                })

        # Append assistant response + tool results to conversation
        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": tool_results})

    return "Max iterations reached. Partial results may be incomplete."

async def execute_tool(name: str, inputs: dict) -> dict:
    """Dispatch tool calls to the right function."""
    if name == "fetch_page":
        return await fetch_page(inputs["url"], inputs.get("use_javascript", False))
    elif name == "extract_links":
        return extract_links(inputs["url"], inputs.get("filter_pattern"))
    elif name == "search_page":
        return search_page_content(inputs["url"], inputs["query"])
    return {"error": f"Unknown tool: {name}"}
Step 5: Add Content Extraction and Link Parsing
Complete the tool implementations. Link extraction needs to resolve relative URLs — this is a common production gotcha where you get paths like /pricing instead of https://example.com/pricing.
# Raw HTML per URL — have fetch_page store `html` here alongside page_cache,
# since link extraction needs the markup, not the cleaned text.
raw_html_cache: dict[str, str] = {}

def extract_links(url: str, filter_pattern: str | None = None) -> dict:
    """Extract and resolve all links from a previously fetched page."""
    if url not in raw_html_cache:
        return {"error": "Page not fetched yet. Call fetch_page first."}
    links = []
    soup = BeautifulSoup(raw_html_cache[url], "lxml")
    for a_tag in soup.find_all("a", href=True):
        absolute_url = urljoin(url, a_tag["href"])  # Handles relative URLs correctly
        link_text = a_tag.get_text(strip=True)
        if filter_pattern and filter_pattern not in absolute_url:
            continue
        if absolute_url.startswith("http"):
            links.append({"url": absolute_url, "text": link_text})
    return {"links": links[:50], "total_found": len(links)}  # Cap to avoid flooding context

def search_page_content(url: str, query: str) -> dict:
    """Find relevant sections of a cached page matching a query."""
    if url not in page_cache:
        return {"error": "Page not fetched. Call fetch_page first."}
    lines = page_cache[url].split("\n")
    query_lower = query.lower()
    # Return lines containing query terms with some surrounding context
    matches = []
    for i, line in enumerate(lines):
        if query_lower in line.lower():
            start = max(0, i - 2)
            end = min(len(lines), i + 3)
            matches.append("\n".join(lines[start:end]))
    return {"matches": matches[:10], "query": query, "url": url}
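The `urljoin` behavior is worth internalizing, because the failure mode (agents trying to fetch `/pricing` as a URL) is silent until a tool call errors. These are the stdlib cases that matter in practice:

```python
from urllib.parse import urljoin

base = "https://example.com/blog/post-1"

# Root-relative paths resolve against the domain, not the current directory
print(urljoin(base, "/pricing"))              # https://example.com/pricing
# Bare relative paths resolve against the current directory
print(urljoin(base, "related-post"))          # https://example.com/blog/related-post
# Absolute URLs pass through unchanged
print(urljoin(base, "https://other.com/x"))   # https://other.com/x
# Protocol-relative URLs inherit the base scheme
print(urljoin(base, "//cdn.example.com/a"))   # https://cdn.example.com/a
```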
Step 6: Add Navigation Memory and Context Management
Without tracking visited URLs, the agent will loop — fetching the same pages repeatedly. The context window also fills up fast once you have 5+ pages of content in the conversation. Here’s how to handle both.
class BrowsingSession:
    """Tracks state across an autonomous browsing session."""

    def __init__(self, max_pages: int = 15):
        self.visited_urls: set[str] = set()
        self.page_summaries: dict[str, str] = {}  # URL -> short summary
        self.max_pages = max_pages
        self.page_cache: dict[str, str] = {}

    def has_visited(self, url: str) -> bool:
        return url in self.visited_urls

    def record_visit(self, url: str, content: str):
        self.visited_urls.add(url)
        # Store only a summary (first 500 chars) for context management
        self.page_summaries[url] = content[:500]

    def at_limit(self) -> bool:
        return len(self.visited_urls) >= self.max_pages

    def context_summary(self) -> str:
        """Returns a compact summary of all visited pages for the system prompt."""
        lines = [f"Already visited ({len(self.visited_urls)} pages):"]
        for url, summary in self.page_summaries.items():
            lines.append(f"- {url}: {summary[:100]}...")
        return "\n".join(lines)
Inject the session’s visited URL list into the system prompt at each iteration so Claude doesn’t revisit pages. This keeps token usage lean — rather than putting full page content in every message, you let the model reference cached results via tool calls.
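One way to sketch that injection — `build_system_prompt` and the exact blocklist wording are my own choices, not a fixed API:

```python
BASE_PROMPT = """You are a web research agent. Use your browsing tools to gather
accurate information. Cite specific URLs in your final answer."""

def build_system_prompt(visited: set[str]) -> str:
    """Append an explicit blocklist of visited URLs to the base system prompt.

    An explicit "do NOT fetch" list steers the model away from repeat
    fetches more reliably than a soft reminder.
    """
    if not visited:
        return BASE_PROMPT
    blocklist = "\n".join(f"- {url}" for url in sorted(visited))
    return f"{BASE_PROMPT}\n\nDo NOT fetch these URLs again:\n{blocklist}"

prompt = build_system_prompt({"https://example.com/pricing"})
print(prompt)
```

Rebuild the prompt at each loop iteration so the blocklist stays current; the `system` parameter is re-sent on every API call anyway, so this costs nothing extra structurally.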
This context discipline matters a lot once you’re doing multi-step research. For deeper patterns around keeping agents grounded and verifiable, this guide on reducing LLM hallucinations in production covers verification strategies that pair well with web-sourced data.
Running It End-to-End
import asyncio

async def main():
    task = """Research the pricing pages of Anthropic, OpenAI, and Mistral.
Find the cost per million tokens for their cheapest available model.
Summarize in a comparison table."""
    result = await run_browser_agent(task)
    print(result)

asyncio.run(main())
A task like this typically takes 8–12 tool calls, runs in 25–40 seconds, and costs roughly $0.006–0.010 in Haiku tokens. Switching to Sonnet 3.5 for better reasoning on complex pages pushes that to ~$0.04–0.08 but significantly improves extraction accuracy on complicated layouts.
For production deployments where you’re running these at scale, pair this with LLM fallback and retry logic — rate limits, timeouts, and transient 5xx errors from target sites are the most common sources of failed runs.
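A minimal backoff sketch for those transient failures — `fetch_with_retry` is a name of my choosing, and which exceptions count as retryable depends on your HTTP client (for httpx you'd likely narrow the `except` to `httpx.TransportError` and 5xx `HTTPStatusError`s). Demonstrated here with a stand-in fetcher so it runs without network access:

```python
import asyncio
import random

async def fetch_with_retry(fetch, url: str, attempts: int = 3, base_delay: float = 1.0):
    """Retry an async fetch on transient errors with exponential backoff + jitter."""
    for attempt in range(attempts):
        try:
            return await fetch(url)
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: surface the error to the agent loop
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            await asyncio.sleep(delay)

# Stand-in fetcher that fails twice with a transient error, then succeeds
calls = {"n": 0}
async def flaky(url: str) -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient")
    return "ok"

result = asyncio.run(fetch_with_retry(flaky, "https://example.com", base_delay=0.01))
print(result)  # ok
```

The jitter matters at scale: without it, parallel agents that hit a rate limit together retry together and hit it again.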
Common Errors
Tool result exceeds context window
Symptom: anthropic.BadRequestError: prompt is too long after a few pages are fetched.
Fix: Your clean_html function’s 8000-char cap is the first line of defense, but conversations accumulate fast. Add a summarization step: when the conversation history exceeds ~80k tokens, ask Claude to summarize what it has learned so far, then restart the messages array with that summary. Track token count via response.usage.input_tokens on each API call.
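A sketch of that trigger logic — the threshold and helper names are mine, and the `ask_for_summary` step shown in the comment is whatever extra API call you use to get Claude's own recap:

```python
SUMMARIZE_THRESHOLD = 80_000  # tokens; leave headroom below the model's context limit

def should_summarize(input_tokens: int, threshold: int = SUMMARIZE_THRESHOLD) -> bool:
    """True once the conversation has grown enough to warrant compression."""
    return input_tokens >= threshold

def compress_history(messages: list[dict], summary: str, task: str) -> list[dict]:
    """Restart the conversation with the original task plus the model's summary.

    Old tool results are dropped entirely; only the distilled findings survive.
    """
    return [{"role": "user",
             "content": f"{task}\n\nResearch so far (summarized):\n{summary}"}]

# Inside the agent loop, roughly:
#   if should_summarize(response.usage.input_tokens):
#       messages = compress_history(messages, ask_for_summary(messages), task)
print(should_summarize(85_000))  # True
```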
Playwright hangs on slow or broken pages
Symptom: Agent stalls indefinitely on certain URLs; no error thrown.
Fix: Always set both a page timeout and a navigation timeout. wait_until="networkidle" can hang forever on pages with polling requests. Use wait_until="domcontentloaded" instead and add an explicit page.wait_for_timeout(2000) for content that needs a render cycle. Wrap Playwright calls in asyncio.wait_for(fetch_with_js(url), timeout=20) as a hard ceiling.
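The hard-ceiling wrapper can be written generically. This sketch demonstrates it on a stand-in coroutine so it runs without Playwright installed — in your code, the awaitable would be `fetch_with_js(url)` and the fallback becomes the error dict the agent loop already understands:

```python
import asyncio

async def with_hard_timeout(awaitable, seconds: float, fallback: dict) -> dict:
    """Enforce a hard ceiling on any fetch coroutine.

    Playwright's own timeouts cover navigation, but this wrapper guarantees
    the agent loop always gets control back, even if the page hangs.
    """
    try:
        return await asyncio.wait_for(awaitable, timeout=seconds)
    except asyncio.TimeoutError:
        return fallback

async def stalled_fetch() -> dict:  # stand-in for fetch_with_js on a bad page
    await asyncio.sleep(60)
    return {"status": "ok"}

result = asyncio.run(with_hard_timeout(stalled_fetch(), 0.1,
                                       fallback={"status": "error", "error": "timeout"}))
print(result)  # {'status': 'error', 'error': 'timeout'}
```

Returning the fallback dict instead of raising keeps the tool-result contract intact: Claude sees a structured error and can decide to skip the URL rather than crashing the run.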
Claude keeps re-fetching already-visited pages
Symptom: Agent wastes iterations fetching the same URL 2–3 times.
Fix: This usually means your system prompt isn’t showing the visited URL list clearly enough. Format it as an explicit blocklist: “Do NOT fetch these URLs again: [list]” rather than a soft reminder. Claude responds much better to explicit prohibitions than advisory notes in the browsing context. The system prompts best practices guide has concrete patterns for this kind of behavioral constraint.
What to Build Next
The most immediately useful extension is a scheduled competitive monitoring agent — run this on a cron job pointed at competitor pricing and product pages, diff the extracted content against a previous run, and have Claude summarize what changed. Pair it with a simple SQLite store for page snapshots and you have a lightweight change detection system that costs pennies per run. You can see a fuller version of that pattern in AI-powered competitor monitoring with Claude.
Frequently Asked Questions
Can a Claude agent browse the web without Playwright?
Yes — for static HTML sites, httpx or requests alone works fine and is much faster (200–400ms vs 1–2s with Playwright). The limitation is JavaScript-rendered content: React, Vue, and Angular apps typically return an empty shell without a real browser. Start with static fetching and add Playwright only when you encounter blank or minimal responses.
How do I handle sites that block bot traffic?
Rotate User-Agent strings, add reasonable delays between requests (1–3 seconds), and respect robots.txt. For sites with aggressive bot detection (Cloudflare Turnstile, etc.), managed services like Browserbase or ScrapingBee handle fingerprinting for you — it’s not worth reimplementing their anti-detection logic yourself. Always check a site’s Terms of Service before scraping at scale.
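For the robots.txt part, the standard library already covers parsing. A sketch — here the rules are parsed from a string so the example needs no network; in practice you'd fetch `https://<domain>/robots.txt` once per domain and cache the parser:

```python
from urllib.robotparser import RobotFileParser

def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check a URL against robots.txt rules for the given user agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

rules = """User-agent: *
Disallow: /private/
Allow: /
"""
print(allowed_by_robots(rules, "research-bot", "https://example.com/pricing"))    # True
print(allowed_by_robots(rules, "research-bot", "https://example.com/private/x"))  # False
```

Gate every `fetch_page` call through a check like this before the request goes out; it's a few milliseconds against cached rules.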
What’s the difference between this approach and using the Claude API’s built-in web search?
Claude’s built-in web search tool (available in certain API contexts) queries a search index and returns snippets — it doesn’t let you navigate to specific URLs, follow links, or interact with page structure. This tutorial gives you full control: you choose which URLs to visit, how to parse them, and how deeply to follow links. Use the built-in tool for quick lookups; build your own for autonomous multi-step research.
How do I manage the context window when browsing many pages?
Cap extracted text per page (8,000 chars is a reasonable default), store only summaries in the conversation history rather than full page content, and implement a mid-session summarization step when your token count exceeds about 80% of the model’s limit. Track response.usage.input_tokens on every API response and trigger summarization proactively rather than hitting errors.
Which Claude model should I use for a web browsing agent?
Haiku 3.5 for tool dispatch and light extraction — it’s fast and cheap (~$0.001 per browsing task). Use Sonnet 3.5 or Claude 3 Opus for the final synthesis step where reasoning quality matters, or for pages with complex, dense information that needs careful interpretation. A hybrid approach (Haiku for browsing loops, Sonnet for final answer) cuts costs by 60–70% compared to running Sonnet end-to-end.
Editorial note: API pricing, model capabilities, and tool features change frequently — always verify current details on the vendor’s website before building in production. Code examples are tested at time of writing; pin your dependency versions to avoid breaking changes. Some links in this article may be affiliate links — we may earn a commission if you sign up, at no extra cost to you.