Sunday, April 5

Most post-meeting workflows fail not because the tools are bad, but because the friction is just high enough that people skip them. Notes don’t get written, action items don’t get assigned, and three weeks later someone asks “wait, didn’t we decide this already?” Automated meeting notes AI solves this by removing humans from the loop entirely — capture the audio or transcript, run it through Claude, and have structured summaries, decisions, and assigned action items pushed to Slack and your task manager before the calendar invite has even expired.

This article walks through a production-ready implementation: Whisper for transcription, Claude for extraction, and n8n as the glue that routes outputs to Linear, Notion, or Slack depending on meeting type. I’ll cover the prompt engineering that actually works for action item extraction, the edge cases that will bite you, and the real cost per meeting at current API pricing.

The Architecture: What Goes Where

Before writing a single line of code, get the data flow straight. There are two common entry points:

  • Raw audio — recordings from Zoom, Google Meet, or a local recorder. You transcribe with Whisper first, then pass the text to Claude.
  • Existing transcript — Zoom’s built-in transcription, Otter.ai exports, or Fireflies.ai outputs. Skip Whisper and pass directly to Claude.

The processing pipeline looks like this:

Audio file → Whisper API → Raw transcript
                              ↓
              Claude (summarise + extract actions)
                              ↓
         ┌────────────────────┼────────────────────┐
         ↓                    ↓                    ↓
      Slack DM           Linear tasks          Notion page
   (per assignee)      (with due dates)     (meeting archive)

n8n handles the routing logic and API calls. You could use Make or even a simple Python script, but n8n’s error handling and retry logic saves you when the Claude API times out on a 90-minute all-hands recording.

Step 1 — Transcribing Audio with Whisper

If you’re working from raw recordings, OpenAI’s Whisper API is the practical choice right now. It costs $0.006 per minute of audio, handles multiple speakers reasonably well, and the turnaround for a 30-minute call is under 60 seconds. The open-source local model is free but requires GPU memory you probably don’t want to dedicate to this.

import openai
import os

client = openai.OpenAI(api_key=os.environ["OPENAI_API_KEY"])

def transcribe_meeting(audio_file_path: str) -> str:
    """
    Transcribe a meeting recording using Whisper.
    Supports mp3, mp4, wav, m4a up to 25MB.
    For larger files, split with pydub first.
    """
    with open(audio_file_path, "rb") as audio_file:
        transcript = client.audio.transcriptions.create(
            model="whisper-1",
            file=audio_file,
            response_format="verbose_json",  # includes segment timestamps
            timestamp_granularities=["segment"]
        )
    
    # verbose_json returns timestamped segments rather than speaker turns;
    # still useful for downstream attribution, even without diarization
    return transcript.text
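
The 25MB limit mentioned in the docstring will bite on long recordings. Before reaching for pydub, you can plan the split points with plain arithmetic — a sketch that assumes roughly constant bitrate, which holds for typical meeting audio; the function name and defaults are my own:

```python
def plan_audio_chunks(
    file_size_bytes: int,
    duration_seconds: float,
    max_chunk_bytes: int = 24 * 1024 * 1024,  # stay safely under the 25MB API limit
) -> list[tuple[float, float]]:
    """Return (start, end) second offsets for chunks that each fit the limit.

    Assumes roughly constant bitrate. Feed the offsets to pydub slices
    (in milliseconds) or to ffmpeg -ss/-t to do the actual splitting.
    """
    if file_size_bytes <= max_chunk_bytes:
        return [(0.0, duration_seconds)]
    n_chunks = -(-file_size_bytes // max_chunk_bytes)  # ceiling division
    chunk_len = duration_seconds / n_chunks
    return [
        (i * chunk_len, min((i + 1) * chunk_len, duration_seconds))
        for i in range(n_chunks)
    ]
```

Transcribe each chunk separately and concatenate the texts in order; the seam loss at chunk boundaries is usually a word or two at most.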

One thing the docs underplay: Whisper doesn’t do speaker diarization out of the box. You get a clean transcript, but “Speaker A said X” is not natively supported. For action item attribution, this matters — you want to know who committed to what. The workaround is either to use a diarization layer (pyannote.audio works, but adds complexity) or to start meetings with a verbal roll call (“this is Sarah, Jake, and Priya”) so Claude can infer attribution from context. The latter is surprisingly effective in practice.

Step 2 — The Claude Prompt That Actually Works

This is where most implementations fall apart. Generic prompts like “summarize this meeting and list action items” produce generic outputs — vague summaries and unassigned tasks like “follow up on the project.” You need structure in, structure out.

Here’s the prompt I’ve been using in production, designed for Claude 3.5 Sonnet or Haiku depending on your latency/cost requirements:

import anthropic
import json
import os

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

EXTRACTION_PROMPT = """You are a meeting analyst. Analyze the following meeting transcript and extract structured information.

Return ONLY valid JSON matching this exact schema — no prose, no markdown, just the JSON object:

{
  "summary": "2-3 sentence executive summary of what was discussed and decided",
  "meeting_type": "one of: standup | planning | review | sales | 1on1 | all_hands | other",
  "key_decisions": [
    {"decision": "string", "rationale": "string or null"}
  ],
  "action_items": [
    {
      "task": "specific, actionable task description",
      "assignee": "name or 'unassigned'",
      "due_date": "ISO 8601 date or null",
      "priority": "high | medium | low",
      "context": "brief context for why this task exists"
    }
  ],
  "blockers": ["string"],
  "follow_up_meeting_needed": true | false,
  "participants_mentioned": ["name"]
}

Rules:
- Action items must be concrete and actionable, not vague
- If an assignee is unclear, use 'unassigned' rather than guessing
- Extract due dates from relative references (e.g., 'by Friday' → calculate from today: {today})
- Priority is 'high' if there's a deadline, blocker, or explicit urgency signal
- Do not invent information not present in the transcript

TRANSCRIPT:
{transcript}"""

def extract_meeting_data(transcript: str, today: str) -> dict:
    # str.format() would choke on the literal braces in the JSON schema,
    # so substitute the placeholders directly instead
    prompt = (
        EXTRACTION_PROMPT
        .replace("{today}", today)
        .replace("{transcript}", transcript)
    )
    
    message = client.messages.create(
        model="claude-3-5-haiku-20241022",  # Haiku for cost; Sonnet for better attribution
        max_tokens=2048,
        messages=[{"role": "user", "content": prompt}]
    )
    
    raw = message.content[0].text
    
    # Claude 3.5 Haiku follows JSON-only instructions reliably
    # but strip any accidental whitespace to be safe
    return json.loads(raw.strip())

Why Haiku over Sonnet? For a 45-minute meeting transcript, Haiku costs roughly $0.004–$0.008 per run versus $0.06–$0.12 for Sonnet. For most meetings, the quality difference is negligible. I switch to Sonnet for all-hands recordings over 90 minutes where context is dense and attribution matters more. Opus is unnecessary here — you’re doing structured extraction, not reasoning.
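
To sanity-check those numbers against your own meeting lengths, a back-of-envelope estimator helps. The per-token rates below are the published prices at the time of writing — verify them before budgeting, and note that the speech-rate and tokenization constants are rough assumptions:

```python
# Published API prices at time of writing — verify before relying on them
PRICING = {
    "whisper_per_minute": 0.006,
    "claude-3-5-haiku":  {"input": 0.80 / 1e6, "output": 4.00 / 1e6},
    "claude-3-5-sonnet": {"input": 3.00 / 1e6, "output": 15.00 / 1e6},
}

def estimate_meeting_cost(
    minutes: float,
    model: str = "claude-3-5-haiku",
    words_per_minute: int = 150,   # typical meeting speech rate (assumption)
    tokens_per_word: float = 1.3,  # rough English tokenization ratio
    output_tokens: int = 1000,     # structured JSON response
) -> dict:
    """Back-of-envelope cost for transcribing + extracting one meeting."""
    input_tokens = minutes * words_per_minute * tokens_per_word
    rates = PRICING[model]
    claude = input_tokens * rates["input"] + output_tokens * rates["output"]
    whisper = minutes * PRICING["whisper_per_minute"]
    return {
        "whisper": round(whisper, 4),
        "claude": round(claude, 4),
        "total": round(whisper + claude, 4),
    }
```

Run it over your actual meeting calendar before committing: the Whisper line dominates the bill, which is why reusing Zoom's built-in transcripts is the single biggest cost lever.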

Handling the JSON Reliability Problem

Claude 3.5 models are quite reliable at JSON-only output when you’re explicit in the prompt. But “quite reliable” isn’t “always.” Wrap your parse in error handling and fall back to a retry with an explicit correction prompt:

import json
from json import JSONDecodeError

def safe_extract(transcript: str, today: str, retries: int = 2) -> dict:
    for attempt in range(retries):
        try:
            return extract_meeting_data(transcript, today)
        except JSONDecodeError as e:
            if attempt == retries - 1:
                # Last attempt — return a minimal fallback structure
                return {
                    "summary": "Extraction failed — review transcript manually",
                    "action_items": [],
                    "key_decisions": [],
                    "blockers": [],
                    "error": str(e)
                }
            # On retry, ask Claude to fix its output
            # (not shown for brevity — pass the malformed output back)
    return {}

Step 3 — Routing Outputs with n8n

Once you have structured JSON, the n8n workflow becomes straightforward. Here’s what the node sequence looks like for a Slack + Linear integration:

  1. Webhook trigger — receives the meeting recording URL or transcript text (e.g., posted by a Zoom webhook or a Slack slash command)
  2. HTTP Request node — calls your Python transcription service (or runs inline via n8n’s Code node)
  3. HTTP Request node — calls Claude API with the extraction prompt
  4. JSON Parse node — normalizes the output
  5. IF node — branches based on meeting_type: standups go to a Slack thread, planning sessions create Linear issues, sales calls update the CRM
  6. Loop over action items — for each action item with an assignee, create a Linear issue and send a Slack DM to that person
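
If you'd rather keep the branching in code than in n8n's IF node, the routing in step 5 reduces to a dispatch table keyed on meeting_type. A sketch — the handler bodies are placeholders for whatever your real Slack/Linear/CRM calls look like:

```python
def route_meeting(data: dict, handlers: dict) -> list[str]:
    """Dispatch extracted meeting data to destinations by meeting_type.

    Unknown types fall back to the 'other' handler so nothing
    is silently dropped.
    """
    meeting_type = data.get("meeting_type", "other")
    handler = handlers.get(meeting_type, handlers["other"])
    return handler(data)

# Placeholder handlers — swap in real integration calls
handlers = {
    "standup":  lambda d: [f"slack-thread: {d['summary']}"],
    "planning": lambda d: [f"linear-issue: {i['task']}" for i in d["action_items"]],
    "sales":    lambda d: [f"crm-update: {d['summary']}"],
    "other":    lambda d: [f"notion-archive: {d['summary']}"],
}
```

The fallback-to-'other' behavior matters more than it looks: Claude occasionally labels a meeting with a type you didn't anticipate, and a KeyError at routing time means the whole meeting's output vanishes.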

The Slack DM node is the part teams actually thank you for. Instead of a wall of text in a channel, each person gets a direct message with only their tasks:

{
  "channel": "@{{ $json.assignee_slack_id }}",
  "text": "📋 *Action item from {{ $json.meeting_title }}*\n\n*Task:* {{ $json.task }}\n*Due:* {{ $json.due_date }}\n*Context:* {{ $json.context }}\n\n<{{ $json.linear_url }}|View in Linear>"
}

The hardest part of this step isn’t the code — it’s the name-to-Slack-ID mapping. You’ll need a lookup table (a simple Airtable base or a JSON file works) that maps names extracted by Claude (“Sarah”) to Slack user IDs. This breaks whenever someone goes by a nickname or there are two Sarahs. Build the exception handling early.

Connecting to Calendar for Automatic Triggering

Manually triggering this workflow defeats half the purpose. The cleaner setup is calendar-driven: when a Google Calendar event with a Zoom link ends, automatically pull the recording and kick off the pipeline.

Google Calendar → Pub/Sub → Cloud Function (or n8n webhook) is the production pattern. For a simpler setup, poll the Google Calendar API every 15 minutes for events that ended in the last 30 minutes, then check Zoom’s API for recordings tied to that meeting ID.

from datetime import datetime, timedelta, timezone

# `service` is an authenticated Calendar client, built elsewhere via
# googleapiclient.discovery.build("calendar", "v3", credentials=creds)

def get_recently_ended_meetings(service, minutes_ago: int = 30) -> list:
    """Fetch calendar events that ended within the last N minutes."""
    now = datetime.now(timezone.utc)
    time_min = (now - timedelta(minutes=minutes_ago)).isoformat()
    time_max = now.isoformat()
    
    events_result = service.events().list(
        calendarId="primary",
        timeMin=time_min,
        timeMax=time_max,
        singleEvents=True,
        orderBy="startTime"
    ).execute()
    
    return events_result.get("items", [])

This polling approach adds up to 15 minutes of delay. For most teams, that’s fine — you don’t need action items instantaneously. If you do, the Pub/Sub route is worth the setup complexity.

What Breaks in Production

Running this across dozens of meetings per week surfaces a consistent set of failure modes:

  • Long transcripts exceed context windows. A 2-hour all-hands meeting transcript can exceed 50,000 tokens. Claude 3.5’s 200K context window handles this, but your costs scale linearly. Consider chunking by agenda item (if the transcript includes timestamps and section headers).
  • Crosstalk and bad audio = garbage extraction. If the Whisper transcript is garbled, Claude’s extraction will be too. Add a transcript quality check — flag any transcript where more than 20% of words are likely transcription errors (you can detect this heuristically by unusual word frequency).
  • Unassigned action items are silently dropped. If your workflow only creates tasks for named assignees, unassigned items vanish. Route these to a dedicated “unassigned” Linear project or a team Slack channel instead.
  • Date math errors. When you pass “today” as context for relative dates, Claude is generally accurate. But “next Friday” on a Thursday will sometimes resolve wrong. Validate ISO dates before creating Linear issues with deadlines.
  • Privacy and consent. Recording and transcribing meetings without participant consent is a legal issue in many jurisdictions. This isn’t a technical problem, but it will become your problem. Make sure your team knows recordings are processed by external APIs.

Real Cost Numbers

For a team running 10 meetings per week averaging 45 minutes each:

  • Whisper transcription: 450 minutes × $0.006 = $2.70/week
  • Claude 3.5 Haiku extraction: ~10 runs × $0.006 avg = $0.06/week
  • Total API cost: roughly $2.76/week, or ~$12/month

That’s less than one month of Otter.ai Business per seat, and you own the pipeline. If you’re already paying for Zoom’s transcription, skip Whisper entirely and your cost drops to under $1/month in Claude API calls.

Who Should Build This vs. Buy It

Build this if: you want custom routing logic (e.g., sales calls update Salesforce, engineering standups post to a specific Slack channel, 1:1s stay private and go only to the two participants), you already use n8n or Make, or you want to own your data and not pipe meeting content through three SaaS vendors.

Buy instead if: your team doesn’t have someone to maintain a pipeline, you need real-time transcription during the call rather than post-processing, or the meeting volume is low enough that the ROI math doesn’t work. Fireflies.ai at $10/seat/month is genuinely good and does 80% of what’s described here without a single line of code.

For technical founders and developers building internal tooling, the custom pipeline wins — not just on cost, but because automated meeting notes AI integrated directly into your existing tools (Linear, Jira, Notion, whatever you actually use) produces dramatically better adoption than yet another SaaS dashboard your team has to check. The best system is the one people don’t have to think about.

Editorial note: API pricing, model capabilities, and tool features change frequently — always verify current details on the vendor’s website before building in production. Code examples are tested at time of writing; pin your dependency versions to avoid breaking changes. Some links in this article may be affiliate links — we may earn a commission if you sign up, at no extra cost to you.
