Sunday, April 5

Polling is a tax on your infrastructure. Every 30-second interval check against a Slack API or a CRM webhook endpoint burns compute, adds latency, and — if you’re routing through a hosted LLM — costs you tokens you didn’t need to spend. Webhook triggers for agents flip that model entirely: instead of your agent asking “anything new?”, the external system tells your agent exactly when to act. The result is lower cost, sub-second reaction times, and workflows that actually feel alive. This article walks through exactly how to wire webhooks directly into Claude agents — with working code, real tradeoffs, and the failure modes nobody documents.

Why Polling Breaks at Scale (and Webhook Triggers Fix It)

Let’s be concrete. Say you’re running a Claude agent that monitors new support tickets and drafts responses. If you poll every minute, you’re making 1,440 API calls per day to check for new data — most of which return nothing. At Haiku pricing (~$0.00025 per 1K input tokens), even lightweight status checks add up fast when you multiply across multiple event sources.
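To make that arithmetic concrete, here's a back-of-envelope sketch. The per-check token count and the rate are illustrative assumptions, not measured figures; substitute your own numbers:

```python
# Back-of-envelope polling cost. Every figure here is an assumption
# for illustration -- plug in your own measurements and current rates.
polls_per_day = 24 * 60           # one status check per minute
tokens_per_check = 200            # assumed prompt + response per empty check
price_per_1k_input = 0.00025      # illustrative Haiku-class input rate

daily_tokens = polls_per_day * tokens_per_check
daily_cost = daily_tokens / 1000 * price_per_1k_input

print(polls_per_day)              # 1440 calls/day, most returning nothing
print(round(daily_cost, 3))       # roughly $0.07/day, per event source
```

Multiply that by five event sources and a month of uptime, and you're paying real money to hear "nothing new" over and over.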

The real cost isn’t tokens though. It’s lag. A ticket submitted one second after your last poll waits a full minute before your agent sees it. For anything time-sensitive — payment failures, alerts, form submissions — that’s unacceptable.

Webhook triggers solve both problems. The source system sends an HTTP POST to your agent endpoint the moment the event fires. No waiting, no wasted cycles. Your agent processes real events instead of running empty loops.

What You Actually Need to Build This

  • A publicly accessible HTTP endpoint (your agent’s entry point)
  • A webhook-capable source system (Stripe, GitHub, Typeform, Slack, etc.)
  • A Claude API integration (direct or via a framework like LangChain)
  • Signature verification logic — this is non-negotiable in production

The endpoint can be a FastAPI app on a VPS, a serverless function on AWS Lambda or Cloudflare Workers, or an n8n/Make webhook node. I’ll show the FastAPI approach because it gives you the most control and is the easiest to debug locally with ngrok.

Building a Webhook-Triggered Claude Agent Step by Step

Here’s a realistic scenario: a new GitHub issue is opened, your webhook endpoint receives the payload, Claude triages the issue (bug vs feature request vs duplicate), assigns a label suggestion, and posts a draft reply — all within two seconds of the issue being created.

Step 1: Stand Up the Webhook Endpoint

from fastapi import FastAPI, Request, HTTPException, Header
import hmac
import hashlib
import json
import os

app = FastAPI()

GITHUB_WEBHOOK_SECRET = os.environ["GITHUB_WEBHOOK_SECRET"]

def verify_github_signature(payload_body: bytes, signature: str) -> bool:
    """GitHub sends X-Hub-Signature-256 with every request. Always verify it."""
    expected = "sha256=" + hmac.new(
        GITHUB_WEBHOOK_SECRET.encode(),
        payload_body,
        hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, signature)

@app.post("/webhook/github")
async def github_webhook(
    request: Request,
    x_hub_signature_256: str = Header(None),
    x_github_event: str = Header(None)
):
    body = await request.body()

    # Reject anything that can't prove it came from GitHub
    if not x_hub_signature_256 or not verify_github_signature(body, x_hub_signature_256):
        raise HTTPException(status_code=401, detail="Invalid signature")

    payload = json.loads(body)

    # Only act on opened issues, ignore everything else for now
    if x_github_event == "issues" and payload.get("action") == "opened":
        issue_data = payload["issue"]
        result = await triage_issue(issue_data)
        return {"status": "processed", "triage": result}

    return {"status": "ignored", "event": x_github_event}

The signature check is not optional. Without it, anyone who discovers your endpoint URL can trigger your agent with arbitrary payloads. Most webhook providers (Stripe, GitHub, Shopify) use HMAC-SHA256. Some use token-based header checks instead. Either way, verify before you parse.
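You can sanity-check the verifier without waiting for real GitHub deliveries by computing signatures yourself. The secret and bodies below are made-up test values:

```python
import hmac
import hashlib

secret = b"test-secret"            # stand-in for your real webhook secret
body = b'{"action": "opened"}'

# This is exactly what GitHub computes on its side before sending
signature = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()

# A matching body verifies...
expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
assert hmac.compare_digest(expected, signature)

# ...and a tampered body does not
forged = "sha256=" + hmac.new(
    secret, b'{"action": "deleted"}', hashlib.sha256
).hexdigest()
assert not hmac.compare_digest(forged, signature)
```

Send the body and the computed signature header to your local endpoint with curl and you have a full round-trip test before GitHub is ever involved.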

Step 2: The Claude Triage Function

import anthropic

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

async def triage_issue(issue: dict) -> dict:
    """Send issue data to Claude for triage and return structured output."""

    prompt = f"""You are a GitHub issue triage assistant. Analyze this issue and respond with JSON only.

Issue title: {issue['title']}
Issue body: {issue['body'] or 'No description provided'}
Author: {issue['user']['login']}

Return this exact JSON structure:
{{
  "category": "bug" | "feature_request" | "question" | "duplicate" | "unclear",
  "priority": "high" | "medium" | "low",
  "suggested_labels": ["label1", "label2"],
  "draft_response": "A helpful, friendly first response to post on the issue",
  "confidence": 0.0-1.0
}}"""

    message = client.messages.create(
        model="claude-haiku-4-5",  # Haiku is fast and cheap for structured triage
        max_tokens=512,
        messages=[{"role": "user", "content": prompt}]
    )

    # Pricing varies by model generation; check Anthropic's current rates.
    # A short triage call like this typically costs a fraction of a cent.
    raw = message.content[0].text

    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # Claude occasionally wraps JSON in markdown code fences.
        # Note: str.strip() removes *characters*, not substrings, so
        # peel the fences off explicitly rather than strip("```json").
        cleaned = raw.strip()
        if cleaned.startswith("```"):
            cleaned = cleaned.removeprefix("```json").removeprefix("```")
            cleaned = cleaned.removesuffix("```")
        return json.loads(cleaned.strip())

I’m using Haiku here deliberately. For triage tasks with structured output, Sonnet or Opus is overkill and roughly 5-15x more expensive per call. Haiku handles classification tasks like this reliably and responds in under a second. Reserve the heavier models for the steps where reasoning depth actually matters.
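One more guard worth adding before anything downstream acts on that JSON: validate it. Models occasionally return a plausible-looking structure with an invented category or an out-of-range confidence. A minimal sketch; the allowed values mirror the prompt above, and the helper name is my own:

```python
VALID_CATEGORIES = {"bug", "feature_request", "question", "duplicate", "unclear"}
VALID_PRIORITIES = {"high", "medium", "low"}

def validate_triage(result: dict) -> dict:
    """Reject malformed model output before anything downstream acts on it."""
    if result.get("category") not in VALID_CATEGORIES:
        raise ValueError(f"unexpected category: {result.get('category')!r}")
    if result.get("priority") not in VALID_PRIORITIES:
        raise ValueError(f"unexpected priority: {result.get('priority')!r}")
    confidence = result.get("confidence")
    if not isinstance(confidence, (int, float)) or not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be a number in [0, 1]")
    if not isinstance(result.get("suggested_labels"), list):
        raise ValueError("suggested_labels must be a list")
    return result
```

Call it on the parsed result and treat a ValueError as a triage failure to retry or escalate, not something to post to the issue.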

Step 3: Run It Locally with ngrok

# Terminal 1: start your FastAPI app
uvicorn main:app --reload --port 8000

# Terminal 2: expose it to the internet for webhook testing
ngrok http 8000

# ngrok gives you a URL like: https://abc123.ngrok.io
# Register that as your GitHub webhook URL during development

ngrok’s free tier works fine for development. For staging/production, you want a stable URL — deploy to Railway, Fly.io, or a Lambda function URL. Cloudflare Workers are excellent for this use case: they’re globally distributed, have a generous free tier, and cold starts are under 5ms.

Handling the Real Problems: Retries, Timeouts, and Duplicate Events

This is where most tutorials stop, and where production systems fall apart.

Webhook Retries Will Bite You

Every serious webhook provider has a retry story. Stripe retries failed deliveries for up to 72 hours with exponential backoff; GitHub flags failed deliveries and lets you redeliver them from the settings UI or API. Either way, if your endpoint returns a 500 because Claude is slow, you can end up processing the same event multiple times. Your agent will post duplicate comments, send duplicate emails, trigger duplicate workflows.

The fix is idempotency keys. Most providers include a unique event ID in the payload or headers. Store processed IDs in Redis (or even a simple SQLite table for low-volume use cases) and skip events you’ve already handled:

import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def is_duplicate_event(event_id: str, ttl_seconds: int = 86400) -> bool:
    """Returns True if we've seen this event ID before. Marks it as seen."""
    key = f"webhook:processed:{event_id}"
    # SET NX (only set if not exists) is atomic — safe under concurrent load
    was_new = r.set(key, "1", ex=ttl_seconds, nx=True)
    return was_new is None  # None means key already existed = duplicate
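If Redis is overkill for your volume, the SQLite option mentioned above satisfies the same contract, using a PRIMARY KEY collision as the atomicity mechanism. A sketch; it uses an in-memory database for illustration, so point it at a real file path in deployment:

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")  # use a real file path in deployment
conn.execute(
    "CREATE TABLE IF NOT EXISTS processed_events "
    "(event_id TEXT PRIMARY KEY, seen_at REAL)"
)

def is_duplicate_event_sqlite(event_id: str) -> bool:
    """Same contract as the Redis version: True means skip this event."""
    try:
        conn.execute(
            "INSERT INTO processed_events (event_id, seen_at) VALUES (?, ?)",
            (event_id, time.time()),
        )
        conn.commit()
        return False  # first delivery of this event
    except sqlite3.IntegrityError:
        return True   # PRIMARY KEY collision: already processed
```

Unlike the Redis version there's no built-in TTL, so prune old rows periodically if the table grows.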

Respond Fast, Process Async

Webhook providers typically expect a 200 response within 5-30 seconds. If your Claude call takes longer (complex multi-step reasoning, slow network), you’ll get false retry triggers. The pattern to use: acknowledge the webhook immediately, process asynchronously.

from fastapi import BackgroundTasks

@app.post("/webhook/github")
async def github_webhook(request: Request, background_tasks: BackgroundTasks, ...):
    # ... verify signature ...

    if x_github_event == "issues" and payload.get("action") == "opened":
        # Queue the work — respond to GitHub immediately
        background_tasks.add_task(process_issue_async, payload["issue"])
        return {"status": "queued"}  # 200 returned before Claude is called

    return {"status": "ignored"}

For high-throughput production use, replace BackgroundTasks with a proper queue: Celery + Redis, AWS SQS, or BullMQ if you’re in Node. FastAPI’s background tasks are fine for low volume but they’re in-process — a server restart drops any pending work.

Connecting Webhook Triggers to n8n and Make

If you’re using webhook-triggered agents in a no-code/low-code context, both n8n and Make have built-in webhook nodes that handle endpoint creation, signature verification, and fan-out routing for you. You get the URL, paste it into the source system, and route payloads to an HTTP node that calls the Claude API directly.

The tradeoff: n8n’s webhook node is excellent but the free cloud tier throttles execution frequency. For high-frequency events (dozens per minute), self-hosted n8n on a $6/month VPS is the better call. Make’s free tier allows 1,000 operations/month — fine for low-volume use, but pricing scales steeply after that.

For genuinely complex agent logic — multi-step reasoning, tool use, memory retrieval — the visual builders hit their ceiling quickly. You end up contorting your agent logic to fit the node graph. Custom code endpoints are more maintainable once you have more than 4-5 steps with conditional branches.

Security Checklist Before You Deploy

Quick hits that are easy to miss:

  • Verify every request signature — not just on the “important” endpoints
  • Validate payload structure before passing to Claude — adversarial inputs via webhook are a real threat vector
  • Rate-limit your endpoint — use slowapi or a Cloudflare WAF rule to prevent someone hammering your Claude quota
  • Never log full payloads in production — webhook bodies often contain PII, tokens, or credentials
  • Use environment variables for all secrets — the webhook secret, the Claude API key, all of it
  • Set a request size limit — a crafted oversized payload shouldn’t be able to OOM your process
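The payload-validation and size-limit items combine naturally into one guard that runs before anything reaches Claude. A sketch, with limits and the helper name as my own choices rather than anything GitHub prescribes:

```python
import json

MAX_BODY_BYTES = 64 * 1024   # assumed ceiling; tune to your real payloads
MAX_TITLE_CHARS = 500
MAX_BODY_CHARS = 10_000

def validate_issue_payload(body: bytes) -> dict:
    """Structural checks plus field caps before the payload touches the model."""
    if len(body) > MAX_BODY_BYTES:
        raise ValueError("payload too large")
    payload = json.loads(body)
    issue = payload.get("issue")
    if not isinstance(issue, dict) or not isinstance(issue.get("title"), str):
        raise ValueError("missing or malformed issue data")
    # Cap field lengths so one crafted webhook can't inflate your prompt
    # (and your bill) with a megabyte of adversarial text
    issue["title"] = issue["title"][:MAX_TITLE_CHARS]
    issue["body"] = (issue.get("body") or "")[:MAX_BODY_CHARS]
    return payload
```

Run it right after signature verification and before the triage call; a ValueError there is a 400, not a 500, so the provider won't retry it.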

When to Use This Pattern (and When Not To)

Webhook-triggered Claude agents are the right tool when:

  • You need to react to external events in near-real-time (payments, alerts, form submissions, code pushes)
  • The event volume is low-to-moderate and irregular (bursty, not continuous stream)
  • The triggered action is discrete and completable in a single agent run

They’re the wrong tool when:

  • You’re processing a continuous high-volume stream — use Kafka or a proper event streaming system and batch calls to Claude
  • The source system doesn’t support webhooks (some legacy APIs only offer polling — in that case, poll on a schedule and accept the tradeoff)
  • Your agent needs persistent state across multiple events — you’ll need an external store regardless, but a long-running agent process might be more natural than stateless webhook functions

My Recommendation by Reader Type

Solo founder or small team: Start with n8n self-hosted on a $6 VPS + Claude API via HTTP node. You get webhook handling, routing, and basic retry logic without writing infrastructure code. Upgrade to custom FastAPI when your logic outgrows the visual builder.

Developer building a product feature: FastAPI + Cloudflare Workers or Lambda is the cleanest stack. Stateless, cheap, globally fast. Use Redis for idempotency. Don’t over-engineer the queue until you have evidence you need it.

Enterprise / high-compliance context: Add a proper message queue (SQS or Pub/Sub) between the webhook receiver and the Claude processing layer. This gives you durability, dead-letter queues, and auditability — worth the operational overhead at that scale.

Webhook triggers for agents aren’t complicated, but they do require you to think about the failure modes upfront. Get the signature verification, idempotency, and async processing right from the start, and you’ll have a foundation that handles everything from one event per day to thousands per hour without changes to your agent logic.

Editorial note: API pricing, model capabilities, and tool features change frequently — always verify current details on the vendor’s website before building in production. Code examples are tested at time of writing; pin your dependency versions to avoid breaking changes. Some links in this article may be affiliate links — we may earn a commission if you sign up, at no extra cost to you.
