# B2B SaaS Customer Success
A hypothetical scenario, included to make the platform’s two-layer architecture claim concrete: “the platform is general-purpose; e-commerce just happens to be the first vertical.”
This page shows what a different vertical’s business pack looks like on the same Platform Core. The agents, tools, events, and YAMLs below are not built; they’re a worked example demonstrating that the Platform Core is genuinely business-agnostic.
If you’re evaluating the platform’s claims, this is the page that turns “could it” into “here’s what it would look like.”
## The setup
A B2B SaaS company sells a project-management tool to small and mid-market teams. They have ~2,000 customer accounts and a 3-person customer success team, whose work falls into four categories:
- Health monitoring — which accounts are at risk of churn, based on usage trends, support ticket patterns, and contract renewal proximity
- Onboarding follow-up — new customers in their first 30 days who haven’t completed key activation milestones
- Renewal preparation — customers approaching renewal, with talking points specific to the account’s history
- Escalation triage — incoming support tickets that need CSM attention vs. ones the support team can handle
The team currently does the first three by skimming dashboards once a week, and the fourth by reading every escalated ticket. Both are high-skill, low-leverage work. The customer-success scenario automates the analysis and prepares structured recommendations for human review.
## The agents
Three agents, similar in shape to order-triage but with different tools, different memory patterns, and different sub-agents:
- `account_health_main` — the front door. Receives weekly health-check triggers (one per account, fanned out from a cron). Pulls the account’s data; routes high-risk accounts to the risk-analysis sub-agent; logs healthy accounts.
- `risk_analysis` — the policy authority. Recalls the account’s prior risk assessments, applies the team’s playbook, and decides: stable / watch / urgent intervention.
- `csm_briefing` — the customer-facing voice (sort of). Drafts the briefing the human CSM reads before their call with the account, in the right format for that customer’s context.
The shape mirrors order-triage deliberately: one main, two sub-agents, three different responsibilities. The platform doesn’t care that the domain is different.
## The flow
A typical weekly health-check for one account (“Acme Corp”):
The cron triggers a health check:
- Cloudflare cron fans out — one `/run` invocation per active account. (Phase 4 multi-tenancy makes this scale naturally; Phase 1 single-tenant runs them serially.)
- HTTP client → account_health_main: `POST /run` with `{ instructions: "Weekly health check", payload: { account_id: "acme-corp" } }`
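The fan-out step above can be sketched as a small helper that builds one `/run` request per active account. This is a hypothetical illustration: the request-body shape comes from this page, but the worker URL, the helper name, and how the cron handler obtains the account list are all assumptions, not platform API.

```typescript
// Sketch of the weekly cron fan-out: one POST /run per active account.
// Assumed names: buildRunRequests, RunRequest. The body shape
// ({ instructions, payload: { account_id } }) is taken from this page.
interface RunRequest {
  url: string;
  method: "POST";
  body: string;
}

function buildRunRequests(baseUrl: string, accountIds: string[]): RunRequest[] {
  return accountIds.map((accountId): RunRequest => ({
    url: `${baseUrl}/run`,
    method: "POST",
    body: JSON.stringify({
      instructions: "Weekly health check",
      payload: { account_id: accountId },
    }),
  }));
}
// Phase 1 (single-tenant) would fire these serially from the scheduled
// handler; Phase 4 multi-tenancy could fan them out concurrently.
```
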
The main agent gathers facts:
- account_health_main → `get_account_metrics` tool: pulls 30 days of usage data — weekly active users, feature adoption depth, ticket volume, NPS responses
- account_health_main → `get_account_lifecycle` tool: pulls contract details — renewal date, plan tier, MRR, account age, owning CSM
- account_health_main runs LLM turn 1, classifies:
  - WAU dropped 35% in the last 30 days ⚠
  - 2 escalated tickets in the last 14 days ⚠
  - Renewal in 60 days
  - Decision: high-risk, delegate to risk_analysis
The risk_analysis sub-agent runs:
- account_health_main → risk_analysis: sub-agent gets a fresh six-layer context with its own system prompt
- risk_analysis → long-term memory: `recall_memory("risk history for acme-corp")` returns 2 prior matches: a “watch” assessment 3 months ago that resolved without intervention; a successful renewal 14 months ago after a major usage dip
- risk_analysis runs LLM turn 1, reasons:
  - Pattern: this account has dipped before and recovered
  - But: renewal proximity makes the cost of inaction higher this time
  - Two recent escalated tickets suggest a real product issue, not a usage seasonality blip
  - Decision: `urgent_intervention`
- risk_analysis → `csm_alerts` queue: `emit_event` with `{ severity: "urgent", account_id: "acme-corp", reasons, talking_points }`
- risk_analysis → long-term memory: `store_memory("Risk assessment for acme-corp: urgent_intervention. Reasons: ... ")`
- risk_analysis delegates to `csm_briefing`
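The alert payload emitted above can be typed and guarded like this. The field names (`severity`, `account_id`, `reasons`, `talking_points`) come from this page; the `CsmAlert` type, the builder function, and the runtime guard are illustrative assumptions, not the platform’s actual event API.

```typescript
// Hypothetical shape of the csm_alerts event risk_analysis emits.
// Assumed names: CsmAlert, buildCsmAlert.
type Severity = "stable" | "watch" | "urgent";

interface CsmAlert {
  severity: Severity;
  account_id: string;
  reasons: string[];
  talking_points: string[];
}

function buildCsmAlert(
  severity: Severity,
  accountId: string,
  reasons: string[],
  talkingPoints: string[],
): CsmAlert {
  // Mirror the YAML hard constraint: urgent_intervention requires
  // at least two corroborating signals, not just one.
  if (severity === "urgent" && reasons.length < 2) {
    throw new Error("urgent alerts require at least two corroborating signals");
  }
  return {
    severity,
    account_id: accountId,
    reasons,
    talking_points: talkingPoints,
  };
}
```

Note how the two-signal rule shows up twice in this scenario: once as a prompt hard constraint enforced by the LLM, and (in this sketch) once as a plain runtime check a queue producer could also enforce.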
The briefing sub-agent drafts the CSM-facing brief:
- risk_analysis → csm_briefing: sub-agent gets the decision and context
- csm_briefing runs LLM turn 1, drafts a 200-word briefing in the team’s preferred format: situation summary, key risks, recommended actions, suggested talking points
The chain unwinds:
- csm_briefing → risk_analysis: returns the briefing text
- risk_analysis → account_health_main: `AgentReport` with the assessment + briefing
- account_health_main → HTTP client: `200 OK` with consolidated report
Side effects: the `csm_alerts` queue has a new urgent event. A queue consumer (Phase 1: logs the event; Phase 2: posts to Slack / creates a Linear issue / creates a Salesforce task) picks it up. The owning CSM gets the briefing in their preferred channel.
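The Phase 1 / Phase 2 consumer split described above could be sketched as a routing function. This is one possible operator configuration, not shipped code: the `AlertEvent` and `Action` types, the function name, and the Slack channel name are all assumptions.

```typescript
// Hypothetical csm_alerts consumer routing. Phase 1 only logs;
// Phase 2 routes by severity to an external system.
interface AlertEvent {
  severity: "stable" | "watch" | "urgent";
  account_id: string;
}

type Action =
  | { kind: "log" }
  | { kind: "slack"; channel: string }
  | { kind: "salesforce_task"; accountId: string };

function routeAlert(event: AlertEvent, phase: 1 | 2): Action {
  // Phase 1: every event is simply logged, regardless of severity.
  if (phase === 1) return { kind: "log" };
  // Phase 2: urgent alerts become a task for the owning CSM;
  // everything else goes to a shared Slack channel.
  if (event.severity === "urgent") {
    return { kind: "salesforce_task", accountId: event.account_id };
  }
  return { kind: "slack", channel: "#csm-alerts" };
}
```
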
## The artifacts (mock)
These YAML files don’t exist in the repo — this is a worked example of what shipping the scenario would look like. The schemas are real; only the content is hypothetical.
### apps/worker/agents/account-health-main.yaml
```yaml
apiVersion: agent-platform/v1
kind: Agent

metadata:
  id: agent-account-health-main
  name: account_health_main
  version: 0.1.0
  role: main
  tags: [b2b-saas, customer-success]

model_tier: main

core_context:
  system_prompt:
    file: ./prompts/account-health-main.md
  identity: a customer-success front-door agent that classifies account health and routes high-risk accounts for analysis
  hard_constraints:
    - never expose internal account IDs in any output meant for humans
    - always cite the metric and the time window when calling out a risk signal
    - never make a renewal forecast yourself — that is the risk_analysis sub-agent's job
    - never escalate without first checking the account's lifecycle stage

characteristics:
  personality: methodical, signal-over-noise
  decision_style: balanced
  tone: professional

tools:
  - get_account_metrics
  - get_account_lifecycle
  - emit_event
  - delegate_to_risk_analysis

sub_agents:
  - risk_analysis

memory_config:
  working_memory_window: 10
  long_term_enabled: false
  shared_context_scopes: []

autonomy:
  max_delegation_depth: 2
  requires_human_approval: []
  allowed_sub_agents:
    - risk_analysis

escalation_rules:
  - condition: account_data_unavailable
    target: human
    reason: cannot retrieve metrics or lifecycle for this account — check upstream pipeline
  - condition: signals_contradict
    target: human
    reason: usage and tickets disagree on direction — needs human to disambiguate
```

### apps/worker/agents/risk-analysis.yaml
```yaml
apiVersion: agent-platform/v1
kind: Agent

metadata:
  id: agent-risk-analysis
  name: risk_analysis
  version: 0.1.0
  role: sub_agent
  tags: [b2b-saas, customer-success, policy]

model_tier: main

core_context:
  system_prompt:
    file: ./prompts/risk-analysis.md
  identity: a risk-analysis sub-agent that assesses account churn risk using historical patterns and recommends an intervention level
  hard_constraints:
    - always recall memory before deciding — non-negotiable
    - never recommend "no action" for an account with a renewal in under 60 days that shows declining usage
    - cite memory results in your reasoning
    - store the assessment before completing
    - urgent_intervention requires at least two corroborating signals, not just one

characteristics:
  personality: cautious, evidence-citing, pattern-aware
  decision_style: balanced
  tone: precise, structured

tools:
  - recall_memory
  - store_memory
  - emit_event
  - delegate_to_csm_briefing

sub_agents:
  - csm_briefing

memory_config:
  working_memory_window: 15
  long_term_enabled: true
  shared_context_scopes: []

autonomy:
  max_delegation_depth: 1
  requires_human_approval: []
  allowed_sub_agents:
    - csm_briefing

escalation_rules:
  - condition: contradictory_history
    target: human
    reason: prior assessments disagree about this account's pattern — human should disambiguate
  - condition: enterprise_account
    target: human
    reason: account is on the Enterprise tier — all assessments require human sign-off regardless of risk level
```

### apps/worker/agents/csm-briefing.yaml
```yaml
apiVersion: agent-platform/v1
kind: Agent

metadata:
  id: agent-csm-briefing
  name: csm_briefing
  version: 0.1.0
  role: sub_agent
  tags: [b2b-saas, customer-success, briefing]

model_tier: sub

core_context:
  system_prompt:
    file: ./prompts/csm-briefing.md
  identity: a briefing-drafting sub-agent that produces the human CSM's pre-call brief in the team's preferred format
  hard_constraints:
    - never include internal account IDs or internal Slack channels in the briefing
    - never invent customer personnel — only reference names from the account data
    - sections must follow the prescribed structure (situation / risks / actions / talking points)
    - cap each section at the word limit specified in the system prompt
    - never recommend specific commercial concessions (discounts, term changes) — that is the CSM's call

characteristics:
  personality: concise, structured, CSM-empathetic
  decision_style: conservative
  tone: professional, briefing-formal

tools: []

sub_agents: []

memory_config:
  working_memory_window: 10
  long_term_enabled: false
  shared_context_scopes: []

autonomy:
  max_delegation_depth: 0
  requires_human_approval: []
  allowed_sub_agents: []

escalation_rules:
  - condition: missing_assessment_context
    target: parent
    reason: insufficient context to draft an accurate briefing
```

### apps/worker/agents/prompts/risk-analysis.md (mock)
# Risk Analysis Sub-agent
You assess account churn risk. You are called by `account_health_main` when an account's metrics or lifecycle suggest something worth analyzing. You produce one of three assessments and a structured rationale.
## Your job
For every account you receive, recall memory first, then produce one of:
1. **`stable`** — no action needed; routine logging only. Use when:
   - Usage is flat or growing
   - No recent escalated tickets
   - Renewal is more than 90 days out OR last assessment was "stable"
2. **`watch`** — flag for the CSM's awareness, no intervention yet. Use when:
   - One declining signal (usage OR tickets, not both)
   - Renewal is 60–90 days out
   - Memory shows a similar past dip that self-resolved
3. **`urgent_intervention`** — CSM should reach out within 3 business days. Use when:
   - Two or more declining signals
   - Renewal is under 60 days
   - Memory shows an escalating pattern (this dip is deeper than past dips)
   - Memory shows a past intervention worked → repeat the play
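The decision table above can be sketched as a deterministic function. In the platform the LLM applies these criteria with judgment; this is only an illustration of the playbook’s logic, with assumed signal names (`usageDeclining`, `recentEscalatedTickets`, `memoryShowsEscalatingPattern`).

```typescript
// Illustrative only: the playbook criteria as a pure function.
// Signal names are assumptions, not platform fields.
interface Signals {
  usageDeclining: boolean;
  recentEscalatedTickets: boolean;
  daysToRenewal: number;
  memoryShowsEscalatingPattern: boolean;
}

type Assessment = "stable" | "watch" | "urgent_intervention";

function assess(s: Signals): Assessment {
  // Count corroborating declining signals.
  const declining =
    (s.usageDeclining ? 1 : 0) +
    (s.recentEscalatedTickets ? 1 : 0) +
    (s.memoryShowsEscalatingPattern ? 1 : 0);
  // Hard rule: urgent_intervention requires two signals, not one.
  if (declining >= 2) return "urgent_intervention";
  // One declining signal, or renewal inside the 60–90 day window: watch.
  if (declining === 1 || (s.daysToRenewal >= 60 && s.daysToRenewal <= 90)) {
    return "watch";
  }
  return "stable";
}
```
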
## Process
1. **Recall memory.** `recall_memory` with the account_id. Look for: prior assessments, prior interventions and their outcomes, known account quirks.
2. **Reason through the signals explicitly.** Walk through each metric and ticket pattern. Cite memory.
3. **Pick an assessment.** Apply the criteria above.
4. **Emit the alert event.** `emit_event` to the `csm_alerts` queue with severity, reasons, suggested talking points.
5. **Store the assessment.** `store_memory` with the decision and rationale. Future you depends on past you.
6. **Delegate to csm_briefing.** It will draft the CSM-facing brief.
## Hard rules
- Always recall memory first. Non-negotiable.
- `urgent_intervention` requires two signals, not one.
- Never recommend specific commercial concessions — that's the CSM's call.
- Cite memory in every assessment.
## Tone
Cautious. Evidence-citing. Pattern-aware. You're the analyst in front of the dashboard.

## What this demonstrates
| Platform feature | How this scenario uses it |
|---|---|
| Two-layer architecture (ADR-0005) | A new business pack (@agent-platform/customer-success + the YAMLs above) on the same Platform Core; zero edits to core needed |
| YAML agent definitions (ADR-0031) | Three agents authored as YAML; CSM-team operators can review and edit without touching code |
| Six-layer context (ADR-0006) | Each agent’s identity (account-health-main is methodical; risk-analysis is cautious; csm-briefing is structured) is its own immutable layer-1 |
| Delegation as tool (ADR-0022) | delegate_to_risk_analysis and delegate_to_csm_briefing synthesized at runtime |
| Long-term memory (ADR-0030) | risk_analysis recalls prior assessments per account; pattern recognition across past behavior is the scenario’s distinctive value |
| Custom tools per business pack | get_account_metrics and get_account_lifecycle are pack-specific; live in @agent-platform/customer-success, not core |
| Custom event topics per business pack | csm_alerts queue is defined by the pack; consumers can be Slack, Salesforce, Linear, PagerDuty — operator’s choice |
| Per-agent escalation rules | risk_analysis has an enterprise_account rule that escalates to a human regardless of decision; pack-specific policy expressed in YAML |
| Hard constraints in prompts | “Urgent_intervention requires two signals” enforced via the LLM; the runtime composes constraints into the system prompt |
| Model tiering (ADR-0008) | account_health_main and risk_analysis are main tier (judgment); csm_briefing is sub tier (drafting) |
Every Phase 1 feature is exercised. The platform doesn’t know or care that the domain is B2B SaaS instead of e-commerce.
## What would actually need to ship
To go from this hypothetical to a working B2B SaaS customer success deployment:
- A new business-pack package — `@agent-platform/customer-success` — wrapping the customer’s data sources (their CRM, their billing system, their product analytics)
- The 6 mock files above — 3 YAMLs and 3 prompts, real this time, in `apps/worker/agents/` (or a fork’s equivalent)
- 2 new tools — `get_account_metrics`, `get_account_lifecycle` — registered in the Worker’s tool registry
- 1 new queue — `csm_alerts` — declared in `apps/worker/wrangler.toml`, with a chosen consumer
- A seed for long-term memory — initial risk assessments for the customer’s accounts, so `recall_memory` returns useful results from day one
- Cron schedule for the weekly fan-out — one `/run` per active account, weekly
That’s it. No changes to Platform Core. No new runtime concepts. Just configuration in YAML + a small business-pack package + secrets for the customer’s data source.
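For the queue and cron items, the `wrangler.toml` additions might look like the fragment below. The `[[queues.producers]]`, `[[queues.consumers]]`, and `[triggers]` table names are standard Wrangler configuration; the queue name, binding name, batch size, and schedule are illustrative assumptions, not values from the repo.

```toml
# Hypothetical additions to apps/worker/wrangler.toml.
[[queues.producers]]
queue = "csm-alerts"
binding = "CSM_ALERTS"

[[queues.consumers]]
queue = "csm-alerts"
max_batch_size = 10

[triggers]
# Weekly health-check fan-out: Mondays, 06:00 UTC (example schedule).
crons = ["0 6 * * 1"]
```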
## What this does NOT do (and could in Phase 2+)
- Real CRM mutation. Phase 1 emits events; Phase 2’s human-approval-gated consumers can write back to Salesforce, Linear, etc.
- Cross-agent memory. The current design scopes each agent’s memory to itself; a future shared-memory feature (Phase 3) could let `csm_briefing` recall what `risk_analysis` stored without re-querying.
- Multi-tenancy. Phase 4 enables one Worker deployment to serve multiple SaaS companies’ customer success teams, each with their own tenant_id-scoped memory and event topics.
None of these change the agent design. They change the platform’s substrate.
## Why this is the most important page in the docs
The platform’s marketing claim is: general-purpose multi-agent AI for business operations. A reader who only sees the e-commerce scenarios might reasonably ask: “is this really general-purpose, or is it just an e-commerce framework with a generic-sounding name?”
The B2B SaaS hypothetical is the answer. Same platform, same agent runtime, same six-layer context model, same long-term memory subsystem, same event bus. Only the YAMLs and the tool implementations differ. That’s what general-purpose means architecturally; the proof is in the symmetry between this page and the order-triage page.
## Where to next
- Order Triage — the actually-built e-commerce flagship that this hypothetical mirrors
- Merchandising — the simpler, single-agent counterpoint
- Two-layer architecture — the design decision this scenario validates