How to Build an Automated Competitive Research Pipeline with Paperclip Sub-Agents
A step-by-step tutorial for building an automated competitive intelligence pipeline using Paperclip's fan-out/fan-in sub-agent pattern. Covers parent orchestrator setup, research sub-agents, result aggregation, and a full working 5-competitor weekly example.
Published 5/13/2026
Affiliate disclosure: This article references Paperclip throughout. We may earn a commission on Paperclip signups through our links once the affiliate program launches. This is a product-owned tutorial: the implementation is real and the examples are production-ready.
Prerequisites: This tutorial assumes you have a Paperclip company set up with at least one agent. If you’re starting from scratch, read Paperclip Autonomous Company Setup first, then return here. For an overview of multi-agent coordination concepts, see Paperclip Multi-Agent Coordination.
Every SaaS product manager has a version of this problem: competitor X just changed their pricing, and you found out from a customer who mentioned it on a call two weeks after it happened. Competitor Y shipped a feature you're about to build, so you're about to invest three sprints into something the market already has. Competitor Z just posted a job listing that signals they're moving into your market segment.
This information is public. It's not hard to find, if you're looking. The problem is that "if you're looking" translates to a weekly manual sweep of 5–10 competitor websites, which takes 2–3 hours that no one can reliably spare. The research happens in bursts, goes stale, and lives in someone's personal notes.
The solution is a competitive research pipeline: a set of agents that run on a schedule, each monitoring one competitor, whose outputs are aggregated into a structured digest that lands in your team’s channel every Monday morning without anyone lifting a finger.
This tutorial builds that pipeline using Paperclip’s fan-out/fan-in sub-agent pattern.
The Problem: Why Manual Competitive Research Doesn’t Scale
Manual competitive research has three failure modes that make it systematically unreliable:
Context decay. Research done once gets stale fast. The pricing you noted in Q1 may have changed three times by Q4. If you’re not monitoring continuously, you’re making decisions on stale data.
Coverage gaps. When time is limited, you cover the two or three competitors you think matter most. The one you skip might be the one that just changed their go-to-market strategy.
Single point of failure. Competitive research lives in one person’s head or their private notes. When they leave or go on vacation, the research gap is weeks wide before anyone notices.
An automated pipeline solves all three: it runs on schedule, covers the same competitors every week regardless of who’s busy, and produces structured outputs that persist in a shared system.
The Architecture: Fan-Out → Collect → Synthesize
The pipeline uses Paperclip’s native issue-based coordination to implement the fan-out/fan-in pattern:
Parent Orchestrator Agent
│
├─ [heartbeat trigger: every Monday 08:00]
│
├─ Fan-out: create child issues
│ ├─ Child Issue: Research competitor-a.com
│ ├─ Child Issue: Research competitor-b.com
│ ├─ Child Issue: Research competitor-c.com
│ ├─ Child Issue: Research competitor-d.com
│ └─ Child Issue: Research competitor-e.com
│
├─ [child agents pick up issues independently, run in parallel]
│
└─ Fan-in: parent wakes on issue_children_completed
└─ Aggregates child outputs → posts digest to team
Each layer has a single responsibility:
- Parent orchestrator: manages the schedule, creates child issues with consistent structure, waits for completion, aggregates results, posts the digest
- Research sub-agent: takes one competitor, runs the research scope (pricing, changelog, jobs, blog), returns a structured output in its completion comment
- Aggregation step: the parent reads each child’s completion comment, extracts the structured data, diffs it against the previous run, and formats the weekly digest
The key advantage of this architecture over a single agent researching every competitor in sequence is parallelism. Five child agents running simultaneously take roughly the wall-clock time of one. A 10-competitor sweep that takes an hour single-threaded finishes in about 6 minutes in parallel.
Step 1 — Configuring the Parent Orchestrator Agent
The parent orchestrator runs on a scheduled heartbeat. Its job is to create the child issues, then wake again when they’re all done to aggregate.
Agent configuration
# Parent Orchestrator Agent — Competitive Research
role: Competitive Research Orchestrator
description: >
Runs every Monday at 08:00. Creates one child research issue per competitor.
Waits for all children to complete, then aggregates outputs into a weekly digest
posted to the #competitive-intel channel.
heartbeat:
schedule: "0 8 * * 1" # Every Monday at 08:00 UTC
wakeReasons:
- schedule # Monday trigger
- issue_children_completed # All child research issues done
budget:
monthly: 50 # USD — covers parent + child API costs
alertAt: 80 # Pause at 80%, not 100%
competitors:
- name: Competitor A
url: https://competitor-a.com
priority: high
- name: Competitor B
url: https://competitor-b.com
priority: high
- name: Competitor C
url: https://competitor-c.com
priority: medium
- name: Competitor D
url: https://competitor-d.com
priority: medium
- name: Competitor E
url: https://competitor-e.com
priority: low
Parent heartbeat logic
On a schedule wake, the parent creates the child issues whose completion gates its own:
# Parent orchestrator — schedule wake handler (pseudocode)
def on_schedule_wake(context):
competitors = context.config["competitors"]
child_issue_ids = []
for competitor in competitors:
# Create a child research issue per competitor
child = paperclip.issues.create(
title=f"Research {competitor['name']} — week of {today()}",
description=build_research_brief(competitor),
parentId=context.current_issue_id,
goalId=context.goal_id,
assigneeAgentId=RESEARCH_AGENT_ID,
priority=competitor["priority"],
labels=["competitive-research", "auto-generated"]
)
child_issue_ids.append(child["id"])
# Update parent to in_progress, note the children
paperclip.issues.update(
issue_id=context.current_issue_id,
status="in_progress",
comment=f"Fan-out complete. Created {len(child_issue_ids)} research issues. Waiting for children to complete."
)
def build_research_brief(competitor):
return f"""
## Research scope for {competitor['name']}
URL: {competitor['url']}
### Required outputs (structured, in completion comment)
1. **Pricing**: tier names, prices, feature inclusions per tier (note any changes from last week)
2. **Changelog**: last 3 changelog entries (date, feature name, one-line description)
3. **Jobs**: active job postings (role, department — signals investment areas)
4. **Positioning**: any changes to homepage headline, value prop, or ICP language
### Output format
Return a JSON object in your completion comment with keys:
`pricing`, `changelog`, `jobs`, `positioning`, `scraped_at`, `delta_notes`
`delta_notes` should call out anything that changed vs. last week's known state.
"""
On an issue_children_completed wake, the parent aggregates:
def on_children_completed_wake(context):
# Fetch all child issues and their completion comments
children = paperclip.issues.list(
parent_id=context.current_issue_id,
status="done"
)
research_outputs = []
for child in children:
completion_comment = get_last_comment(child["id"])
parsed = extract_json_from_comment(completion_comment["body"])
if parsed:
research_outputs.append({
"competitor": child["title"],
"data": parsed
})
# Generate digest
digest = generate_weekly_digest(research_outputs)
# Post digest
paperclip.issues.update(
issue_id=context.current_issue_id,
status="done",
comment=digest
)
# Optionally: post to Slack
slack.post_message(channel="#competitive-intel", text=digest)
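The extract_json_from_comment helper above does the real parsing work. A minimal sketch, assuming each child posts its output inside a fenced json block as the sub-agent in Step 2 does (the regex-based approach is illustrative):
# Helper sketch: pull the structured payload out of a completion comment
import json
import re

def extract_json_from_comment(body):
    """Extract and parse the first fenced json block in a comment body.
    Returns the parsed object, or None if nothing parseable is found."""
    match = re.search(r"```json\s*\n(.*?)\n```", body, re.DOTALL)
    if not match:
        return None
    try:
        return json.loads(match.group(1))
    except json.JSONDecodeError:
        return None  # Malformed output: the parent skips this child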
Step 2 — Building the Research Sub-Agent
The research sub-agent is scoped to one task: receive a competitor brief, execute the research scope, return structured output. It should be stateless between runs — all context it needs is in the issue brief.
Sub-agent configuration
# Research Sub-Agent
role: Competitive Research Analyst
description: >
Picks up individual competitor research issues. For each issue, scrapes the
competitor's pricing page, changelog, job board, and homepage. Returns structured
JSON output in the completion comment.
heartbeat:
wakeReasons:
- issue_assigned
budget:
monthly: 20 # Scoped lower than parent — this agent is cost-contained
alertAt: 90
Research execution loop
# Research sub-agent heartbeat (pseudocode)
import json  # Needed for json.dumps in the completion comment below
def on_issue_assigned_wake(context):
issue = paperclip.issues.get(context.task_id)
brief = parse_brief(issue["description"])
competitor_url = brief["url"]
# Scrape pricing
pricing_data = scrape_pricing_page(
url=f"{competitor_url}/pricing",
extract=["tier_names", "prices", "feature_lists"]
)
# Scrape changelog
changelog_data = scrape_changelog(
url=find_changelog_url(competitor_url),
limit=3 # Last 3 entries only
)
# Scrape job board
jobs_data = scrape_jobs(
url=f"{competitor_url}/careers",
extract=["role", "department", "location"]
)
# Scrape homepage positioning
positioning_data = scrape_positioning(
url=competitor_url,
extract=["headline", "subheadline", "cta_text"]
)
# Compare to last known state (if available in issue history)
prior_state = get_prior_state(issue["parentId"], competitor_url)
delta_notes = compute_delta(prior_state, {
"pricing": pricing_data,
"changelog": changelog_data,
"jobs": jobs_data,
"positioning": positioning_data
})
# Structured output
output = {
"pricing": pricing_data,
"changelog": changelog_data,
"jobs": jobs_data,
"positioning": positioning_data,
"scraped_at": utcnow(),
"delta_notes": delta_notes
}
# Complete the issue with structured output in comment
paperclip.issues.update(
issue_id=context.task_id,
status="done",
comment=f"Research complete.\n\n```json\n{json.dumps(output, indent=2)}\n```"
)
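Two helpers referenced above, parse_brief and get_prior_state, are left undefined. Minimal sketches of both, assuming the brief format from Step 1 and an issues API that can filter by label and status (the filter parameters and matching logic are illustrative):
# Helper sketches for the research sub-agent (same assumptions as above)
import re

def parse_brief(description):
    """Pull the competitor URL out of the brief built in Step 1."""
    match = re.search(r"URL:\s*(\S+)", description)
    return {"url": match.group(1) if match else None}

def get_prior_state(parent_issue_id, competitor_url):
    """Find the most recent completed research issue for the same competitor
    and reuse its structured output as the baseline for compute_delta."""
    candidates = paperclip.issues.list(
        labels=["competitive-research"],
        status="done"
    )
    for issue in candidates:  # Assumed newest-first ordering
        if issue["parentId"] == parent_issue_id:
            continue  # Skip this week's own run
        if competitor_url not in issue["description"]:
            continue  # Different competitor
        parsed = extract_json_from_comment(get_last_comment(issue["id"])["body"])
        if parsed:
            return parsed
    return None  # First run: compute_delta treats everything as new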
What “scrape” means in practice
The research sub-agent uses web fetch tools to read publicly accessible pages. For a pricing page, it fetches the HTML, passes it to the LLM with a structured extraction prompt, and gets back normalized JSON. This works well for most SaaS pricing pages; it fails on JavaScript-heavy SPAs that render content client-side (the agent sees the shell, not the rendered content).
For JS-heavy pricing pages, add a fallback: if the initial fetch returns minimal content, try an archived copy (e.g., the most recent Wayback Machine snapshot) or flag the page for manual verification in the delta_notes field.
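One way to implement that fallback, using the Wayback Machine's public availability API (the length threshold and helper name are illustrative, not part of Paperclip):
# Fallback sketch: live fetch first, archived snapshot second
import requests

def fetch_with_archive_fallback(url, min_length=2000):
    """Fetch a page; if it looks like an empty SPA shell, try the most
    recent Wayback Machine snapshot before giving up."""
    html = requests.get(url, timeout=30).text
    if len(html) >= min_length:
        return {"html": html, "source": "live"}
    # The availability API returns the closest archived snapshot, if any
    resp = requests.get(
        "https://archive.org/wayback/available",
        params={"url": url},
        timeout=30
    ).json()
    snapshot = resp.get("archived_snapshots", {}).get("closest")
    if snapshot and snapshot.get("available"):
        archived = requests.get(snapshot["url"], timeout=30).text
        return {"html": archived, "source": "wayback"}
    # Nothing usable: flag for manual verification in delta_notes
    return {"html": None, "source": "needs_manual_check"}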
Step 3 — Aggregating Results in the Parent
The aggregation step is the fan-in: the parent collects structured JSON from each child’s completion comment and synthesizes a digest.
Diff from last run
The most valuable part of the digest is not the raw data — it’s what changed. Storing the previous week’s output and diffing against it surfaces actionable intelligence immediately:
def compute_digest_diff(current_week, prior_week):
changes = []
for competitor_id, current in current_week.items():
prior = prior_week.get(competitor_id, {})
# Check pricing changes
if current["pricing"] != prior.get("pricing"):
changes.append({
"competitor": competitor_id,
"type": "pricing_change",
"before": prior.get("pricing"),
"after": current["pricing"],
"severity": "high" # Always high — pricing changes affect deals
})
# Check new changelog entries
new_entries = [
e for e in current["changelog"]
if e not in prior.get("changelog", [])
]
if new_entries:
changes.append({
"competitor": competitor_id,
"type": "new_features",
"entries": new_entries,
"severity": "medium"
})
# Check new job postings in strategic areas
strategic_depts = ["engineering", "sales", "marketing", "product"]
new_jobs = [
j for j in current["jobs"]
if j["department"].lower() in strategic_depts
and j not in prior.get("jobs", [])
]
if new_jobs:
changes.append({
"competitor": competitor_id,
"type": "hiring_signal",
"jobs": new_jobs,
"severity": "low"
})
return sorted(changes, key=lambda x: ["high","medium","low"].index(x["severity"]))
Digest format
The parent posts the aggregated digest as a markdown comment. A good digest is scannable in 2 minutes:
## Competitive Research Digest — Week of 2026-05-13
### 🔴 High Priority Changes
**Competitor A — Pricing Change**
- Starter tier: $29/mo → $39/mo (+34%)
- Pro tier unchanged at $79/mo
- New Enterprise tier added at $199/mo (SSO, custom integrations)
- Action: Review how this affects deals where we're compared on price
### 🟡 New Features
**Competitor B — 3 new changelog entries**
- 2026-05-10: Native Slack integration (we have this; parity maintained)
- 2026-05-08: CSV bulk import (we do not have this; add to backlog?)
- 2026-05-05: Mobile app v2 launch (iOS and Android)
**Competitor C — 2 new changelog entries**
- 2026-05-11: API rate limits increased 10x for paid plans
- 2026-05-09: Zapier integration added
### 🟢 Hiring Signals
**Competitor D — 4 new engineering roles**
- 2x Senior Backend Engineers (Rust)
- 1x ML Engineer (signal: investing in AI features)
- 1x Head of Platform
### No Changes This Week
Competitor E: no pricing, changelog, or hiring changes detected.
---
*Scraped: 2026-05-13 08:12 UTC | 5 competitors monitored | Next run: 2026-05-20*
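The generate_weekly_digest call from the fan-in handler is mostly templating over the sorted change list. A condensed sketch that produces the format above, assuming the change list and week are passed in explicitly and format_change is a per-change markdown renderer (both assumptions, not Paperclip APIs):
# Digest renderer sketch: severity buckets map to the sections above
def generate_weekly_digest(research_outputs, changes, week_of):
    sections = {
        "high": "### 🔴 High Priority Changes",
        "medium": "### 🟡 New Features",
        "low": "### 🟢 Hiring Signals"
    }
    lines = [f"## Competitive Research Digest — Week of {week_of}"]
    for severity in ("high", "medium", "low"):
        matching = [c for c in changes if c["severity"] == severity]
        if matching:
            lines.append(sections[severity])
            lines.extend(format_change(c) for c in matching)  # Per-change markdown
    # Competitors with no detected changes get a one-line mention
    # (assumes "competitor" keys match between outputs and changes)
    changed = {c["competitor"] for c in changes}
    quiet = [o["competitor"] for o in research_outputs
             if o["competitor"] not in changed]
    if quiet:
        lines.append("### No Changes This Week")
        lines.append(f"{', '.join(quiet)}: no pricing, changelog, or hiring changes detected.")
    return "\n\n".join(lines)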
Full Working Example: 5 Competitors, Weekly Cadence, Slack-Ready Output
Putting it all together: a Monday morning trigger spawns 5 child issues simultaneously, each picked up by a research sub-agent worker. Within 6–10 minutes, all 5 complete and the parent wakes on issue_children_completed. The parent runs the aggregation, computes the diff, formats the digest, and posts it to the Paperclip issue thread and to #competitive-intel on Slack.
Timeline for a typical Monday run:
08:00:00 — Parent wakes on schedule trigger
08:00:30 — 5 child issues created (fan-out complete)
08:01:00 — Research agents pick up child issues (parallel execution begins)
08:04:15 — First child completes (competitor C — fast pricing page)
08:06:30 — All 5 children complete (fan-in triggers)
08:06:35 — Parent wakes on issue_children_completed
08:08:10 — Aggregation + digest generation complete
08:08:15 — Digest posted to issue thread and Slack channel
What the team sees in Slack at 08:08:
A pinned message with the weekly digest. No manual work required from anyone. The research lead reviews it, adds context where needed, and shares with the product team — all within the first 30 minutes of their Monday.
Cost for 5 competitors, weekly:
Estimated LLM cost per run (GPT-4o):
- 5 research sub-agents × ~8K tokens each = 40K tokens ≈ $0.12
- Parent aggregation ≈ 12K tokens ≈ $0.04
- Total per run: ~$0.16
Monthly cost: 4 runs × $0.16 ≈ $0.64 in LLM costs, plus the Paperclip subscription (which covers the platform, scheduling, and agent infrastructure). At this scale, research that previously cost 2–3 hours of PM time every week now costs under a dollar a month in API tokens.
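The same arithmetic as a sketch, assuming a blended rate of roughly $3 per million tokens (swap in your model's actual input/output pricing):
# Back-of-envelope cost model for the weekly pipeline
def estimate_monthly_cost(competitors=5, runs_per_month=4,
                          child_tokens=8_000, parent_tokens=12_000,
                          usd_per_million_tokens=3.0):
    tokens_per_run = competitors * child_tokens + parent_tokens
    cost_per_run = tokens_per_run / 1_000_000 * usd_per_million_tokens
    return {"per_run": round(cost_per_run, 2),
            "monthly": round(cost_per_run * runs_per_month, 2)}

# estimate_monthly_cost() -> {'per_run': 0.16, 'monthly': 0.62}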
Extending the Pipeline
The base pipeline covers pricing, changelog, jobs, and positioning — the core competitive signals. Several extensions are worth adding once the base pipeline is stable:
Review platform monitoring
Add G2 and Capterra to the research scope. Each sub-agent fetches the competitor's profile, extracts recent reviews (rating, review text excerpt, verified buyer), and flags review clusters, i.e. multiple reviews mentioning the same pain point (see the sketch after the snippet below). This surfaces what customers are saying about competitors right now, which is often more revealing than the competitor's own positioning copy.
# Add to sub-agent research scope
reviews_data = scrape_reviews(
platforms=["g2.com", "capterra.com"],
competitor=competitor_url,
limit=10, # Most recent reviews
extract=["rating", "excerpt", "pros", "cons", "date"]
)
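Flagging review clusters can then be as simple as counting recurring phrases across the extracted reviews. A minimal sketch; the threshold is illustrative, and a real implementation would likely have the LLM normalize pain points into shared tags first, since raw review phrasing rarely repeats verbatim:
# Cluster sketch: count recurring "cons" phrases across recent reviews
from collections import Counter

def flag_review_clusters(reviews, min_mentions=3):
    pain_points = Counter()
    for review in reviews:
        for phrase in review.get("cons", []):
            pain_points[phrase.lower().strip()] += 1
    return [{"pain_point": phrase, "mentions": count}
            for phrase, count in pain_points.most_common()
            if count >= min_mentions]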
Pricing change alerts
For competitors where pricing changes are high-stakes (you're frequently compared on price in active deals), add a real-time alerting path: if the current week's pricing data differs from last week's, trigger an immediate Slack alert rather than waiting for the Monday digest. This can be wired in as a condition in the aggregation step.
# In aggregation: check for high-severity changes before the weekly cadence
if any(c["severity"] == "high" for c in changes):
slack.post_message(
channel="#competitive-intel-alerts",
text=f"🚨 Pricing change detected: {format_pricing_alert(changes)}"
)
Sentiment trend tracking
Store 12 weeks of review data and run a sentiment trend analysis in the parent's monthly digest (driven by a separate monthly schedule). Compare sentiment scores over time: competitors whose review sentiment is declining are vulnerable in the market, while those whose sentiment is improving are gaining momentum. This is the kind of strategic signal that's nearly impossible to derive manually.
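A minimal trend computation over that stored history, assuming each week's stored output includes an average review rating (the function name and input format are illustrative):
# Trend sketch: compare recent weeks against the earlier half of the window
def sentiment_trend(weekly_ratings, window=12):
    """Positive delta = improving sentiment; negative = declining."""
    history = weekly_ratings[-window:]
    if len(history) < 4:
        return None  # Not enough history for a meaningful trend
    mid = len(history) // 2
    early = sum(history[:mid]) / mid
    recent = sum(history[mid:]) / (len(history) - mid)
    return round(recent - early, 2)

# sentiment_trend([4.2, 4.1, 4.0, 3.9, 3.8, 3.7]) -> -0.3 (declining)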
Wrapping Up
The fan-out/fan-in pattern is the foundational multi-agent coordination model for research workloads. Once you have it running for competitive research, the same architecture applies to:
- Market monitoring: one sub-agent per market segment, parent aggregates weekly market intelligence report
- Backlink and SEO monitoring: one sub-agent per domain cluster, parent identifies new competitor content
- Customer review monitoring: one sub-agent per review platform, parent surfaces emerging product gaps
The pipeline takes 2–3 hours to configure the first time. After that, it runs itself.
For the structural concepts behind multi-agent coordination in Paperclip — role isolation, budget allocation, and coordination failure modes — see Paperclip Multi-Agent Coordination. For pricing on the agent tier that supports this level of multi-agent orchestration, see the Paperclip Pricing Guide.
Last updated: May 2026