Agentic Safety Monitor

A second AI agent that watches the parent-AI conversation flow for red-flag patterns and sends the doctor a direct SMS on a hit. Designed alongside the Parent Triage feature; the two are inseparable.

The product principle: AI in the parent flow is more defensible, not less, when it is supervised by another AI. The triage assistant's job is to be helpful to the parent in real time. The safety monitor's job is to make sure the doctor is paged when a conversation needs human attention right now — not in the morning when the doctor reads the inbox.

What it watches for

The safety monitor is a classifier-and-pattern-matcher, not a diagnostic agent. It is looking for conversation patterns that warrant immediate human escalation, not making medical judgments. Categories include:

Red-flag class	Example signals
Acute medical emergency	Mentions of "blue lips," "won't wake up," "can't breathe," "took the whole bottle" — patterns where time matters and the triage assistant alone is not enough.
Child abuse or neglect	Patterns suggesting non-accidental injury, repeated unexplained injuries across conversations, bruising in non-mobile infants, caretaker explanations inconsistent with developmental capabilities.
Mental-health crisis	Self-harm ideation in adolescent patients, parent describing severe distress in a teen, suicidality language patterns.
Medication misuse	Refill-shopping patterns, mentions of medication being given to a non-prescribed family member, accidental overdose or near-miss reports.
Comorbidity dot-connecting	The triage assistant is great at considering one issue at a time. The monitor sees the parent mention three things across three messages that, taken together, raise concern (e.g., a Type 1 diabetic patient who's been losing weight + skipping meals + describing fatigue — possible DKA or eating-disorder pattern).
HEEADSSS-confidential threads	Adolescent patient discussing topics that are confidential between patient and provider (sexuality, substance use, mental health) — the monitor ensures the parent-app side does not auto-respond and that the provider is paged for direct conversation.

The monitor does not diagnose. It does not treat. It pattern-matches against trained red-flag signals and pages the doctor.
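To make the "pattern-matcher, not diagnostic agent" framing concrete, here is a minimal Python sketch of the red-flag classes from the table, plus a phrase check for the acute-emergency row only. The names `RedFlagClass`, `Flag`, and `check_acute_emergency` are hypothetical, and plain substring matching is a stand-in for illustration; it is not how the production classifier is expected to work.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class RedFlagClass(Enum):
    """The categories listed in the table above."""
    ACUTE_EMERGENCY = "acute medical emergency"
    ABUSE_OR_NEGLECT = "child abuse or neglect"
    MENTAL_HEALTH_CRISIS = "mental-health crisis"
    MEDICATION_MISUSE = "medication misuse"
    COMORBIDITY_PATTERN = "comorbidity dot-connecting"
    CONFIDENTIAL_THREAD = "HEEADSSS-confidential thread"


# Example signals copied from the acute-emergency row of the table.
ACUTE_PHRASES = ("blue lips", "won't wake up", "can't breathe", "took the whole bottle")


@dataclass(frozen=True)
class Flag:
    red_flag_class: RedFlagClass
    matched_signal: str  # what triggered, so the alert can say why it fired


def normalize(text: str) -> str:
    # Lowercase and map curly apostrophes to straight ones before matching.
    return text.lower().replace("\u2019", "'")


def check_acute_emergency(message: str) -> Optional[Flag]:
    """Return a Flag for the first matching phrase, or None. No diagnosis is made."""
    text = normalize(message)
    for phrase in ACUTE_PHRASES:
        if phrase in text:
            return Flag(RedFlagClass.ACUTE_EMERGENCY, phrase)
    return None
```

The result carries only a class and the signal that matched. Anything resembling a clinical conclusion is deliberately absent from the type.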

What happens when it fires

A direct SMS to the doctor:

⚠ Starlight alert. Patient: April Smith (age 4). Parent: Mary Smith. Concerning conversation in parent app — possible non-accidental-injury indicators. Last 5 messages summarized: [link]. Tap to open chart.

The doctor opens the chart, reads the conversation, decides whether to call the parent, contact the practice's social worker, or escalate further.

A few design principles for the alerts:

High-precision, not high-recall. False alarms desensitize doctors to alerts. The monitor is tuned to alert only when there is real signal. Some red flags will be missed; that's the deliberate trade.
Always summarized and contextualized. The alert isn't "look at this conversation" — it's "here's what triggered, here's what the assistant said, here's what we recommend you do."
Always actionable. Every alert ends with a recommended action: call the parent, schedule an in-person, contact local resources, page the practice's covering colleague.
Always logged. The alert, the doctor's response time, the action taken — all logged in the chart and in compliance evidence.
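The alert principles above could be enforced at the point where the SMS is built. The following Python sketch is one possible shape, not the actual alert schema: every field name, the `format_sms` function, and the `AlertLogEntry` record are assumptions made for illustration, and the URL in any usage would be a placeholder.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class Alert:
    patient_name: str
    patient_age: int
    parent_name: str
    concern: str             # what triggered, phrased as a concern, not a finding
    summary_url: str         # link to the summarized conversation
    recommended_action: str  # e.g. "call the parent"


def format_sms(alert: Alert) -> str:
    """Render the SMS body; refuse alerts that lack a summary or an action."""
    if not alert.summary_url.strip():
        raise ValueError("alert must link to a conversation summary")
    if not alert.recommended_action.strip():
        raise ValueError("alert must end with a recommended action")
    return (
        f"Starlight alert. Patient: {alert.patient_name} (age {alert.patient_age}). "
        f"Parent: {alert.parent_name}. "
        f"Concerning conversation in parent app: {alert.concern}. "
        f"Summary: {alert.summary_url} "
        f"Recommended: {alert.recommended_action}."
    )


@dataclass
class AlertLogEntry:
    """What gets logged: the alert, when it went out, and what the doctor did."""
    alert: Alert
    sent_at: datetime
    opened_at: Optional[datetime] = None
    action_taken: Optional[str] = None

    def response_time(self) -> Optional[timedelta]:
        # None until the doctor has opened the alert.
        if self.opened_at is None:
            return None
        return self.opened_at - self.sent_at
```

Raising on a missing summary or action means an alert that is not contextualized and actionable cannot be sent at all, rather than relying on reviewers to notice.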

Why this is liability-positive, not liability-negative

The fear: "we built an AI agent that watches conversations for abuse — what if we miss something and get sued?"

The countervailing reality:

Today's baseline: doctors read after-hours messages in the morning. Concerning content sits unread for hours. There is no automated escalation. The current standard of care is "the doctor responds when they next check the inbox."
Our system's baseline: the monitor flags concerning patterns in real time and pages the doctor immediately. The doctor responds faster than they otherwise would have, and the chart documents the entire chain of detection → alert → response.
Legal posture: we are adding an oversight layer that did not exist before. We are not removing oversight. We are not making medical decisions; we are routing human attention to where it's needed.
Documentation: the monitor's alert log is itself evidence of the practice taking concerns seriously. It supports rather than undermines a defensible standard of care.

The general principle Erik articulated:

You're not making a diagnosis, you're pattern-matching against known danger signals and saying "this conversation needs human attention." The doctor gets an SMS that actually has signal instead of noise. From a liability perspective, you're adding oversight, not removing it.

How it relates to mandatory reporting

Mandatory child-abuse reporting laws apply to licensed clinicians, not to software. The safety monitor does not file a report. It surfaces concerning patterns to the doctor, who exercises clinical judgment about whether mandatory reporting obligations are triggered — exactly as the doctor would today, just earlier.

This means the monitor must:

Never claim to have detected abuse — only "concerning pattern, recommend review."
Never auto-route to anyone other than the doctor — not to CPS, not to police, not to family members.
Always preserve the underlying conversation verbatim in the chart — for the doctor's own clinical review and for any subsequent reporting the doctor decides to make.
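The three constraints above lend themselves to a pre-send check. The sketch below is a hedged illustration in Python: `check_alert`, the role strings, and the short list of forbidden phrasings are all hypothetical, and a real guard would need a far more careful wording review than a substring list.

```python
from typing import List

# Illustrative phrasings that would assert a finding rather than a concern.
FORBIDDEN_CLAIMS = ("abuse detected", "detected abuse", "abuse confirmed", "confirmed abuse")

# The monitor routes to the doctor and to no one else.
ALLOWED_RECIPIENT_ROLES = frozenset({"doctor"})


def check_alert(text: str, recipient_role: str, transcript_preserved: bool) -> List[str]:
    """Return a list of constraint violations; an empty list means the alert may go out."""
    problems: List[str] = []
    lowered = text.lower()
    for claim in FORBIDDEN_CLAIMS:
        if claim in lowered:
            problems.append(f"alert asserts a finding: {claim!r}")
    if recipient_role not in ALLOWED_RECIPIENT_ROLES:
        problems.append(f"recipient role {recipient_role!r} is not the doctor")
    if not transcript_preserved:
        problems.append("verbatim conversation is not preserved in the chart")
    return problems
```

For example, `check_alert("Concerning pattern, recommend review.", "doctor", True)` returns an empty list, while an alert addressed to any other role is rejected regardless of its wording.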

Engineering shape

A simplified architecture sketch:

Parent message arrives
  ↓
Parent Triage assistant generates response
  ↓
Conversation snapshot (parent message + assistant response + chart context)
  ↓
Safety Monitor classifier
  ├─ No flag → conversation logged, parent gets response
  └─ Flag → conversation logged, parent gets response + alert generated + doctor SMS + chart entry created
                ↓
            Doctor opens chart, reviews, takes action
                ↓
            Action logged to chart

The monitor runs in parallel with delivery of the triage response, not before it. The parent gets the triage response without delay; the doctor gets the alert independently. The triage assistant is allowed to be helpful even when the monitor is also concerned, and the monitor's alert ensures the doctor catches up to the conversation before too much time passes.
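One way to read the architecture sketch is shown below using Python's `asyncio`: the triage response is generated, a snapshot is taken, and then delivery to the parent and the monitor's check run concurrently. All function and class names are hypothetical, and the two model calls are replaced by trivial stand-ins so that the example is runnable.

```python
import asyncio
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Snapshot:
    parent_message: str
    assistant_response: str
    chart_context: str


@dataclass
class Outcome:
    delivered: List[str] = field(default_factory=list)
    alerts: List[str] = field(default_factory=list)
    log: List[str] = field(default_factory=list)


async def generate_triage_response(message: str) -> str:
    # Stand-in for the Parent Triage assistant.
    await asyncio.sleep(0)
    return f"Triage guidance for: {message}"


async def classify(snapshot: Snapshot) -> bool:
    # Stand-in for the red-flag classifier; a single phrase check for illustration.
    await asyncio.sleep(0)
    return "can't breathe" in snapshot.parent_message.lower()


async def deliver_to_parent(response: str, outcome: Outcome) -> None:
    outcome.delivered.append(response)


async def run_monitor(snapshot: Snapshot, outcome: Outcome) -> None:
    if await classify(snapshot):
        outcome.alerts.append("doctor SMS: concerning pattern, recommend review")
        outcome.log.append("alert and chart entry created")


async def handle_parent_message(message: str, chart_context: str) -> Outcome:
    outcome = Outcome()
    response = await generate_triage_response(message)
    snapshot = Snapshot(message, response, chart_context)
    outcome.log.append("conversation logged")
    # Delivery does not wait for the monitor's verdict; both are scheduled together.
    await asyncio.gather(
        deliver_to_parent(response, outcome),
        run_monitor(snapshot, outcome),
    )
    return outcome
```

Calling `asyncio.run(handle_parent_message("He can't breathe", "asthma history"))` yields one delivered response and one alert; a message with no red-flag phrase yields the response only.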

What's net new vs reused

Reused from v1 / parent triage:

Chart RAG infrastructure.
Conversation logging in the chart.
Doctor SMS notification (Twilio is already on the v1 stack).

Net new:

Red-flag classifier (Claude with explicit pattern instructions, fine-tuned on synthetic and de-identified real cases).
Alert generation and routing logic.
Doctor's review-and-action workflow on the alert.

Open questions

Tuning the precision/recall trade. Need real volume to calibrate. Until we have it, ship the monitor in "shadow mode" — the classifier runs but only the security officer sees alerts; the doctor doesn't get paged. After a calibration period, switch to live alerts.
Adolescent confidentiality. HIPAA + state-specific minor-confidentiality laws + practice policy interact. The monitor must understand which conversations should not be visible to the parent-account-holder, and route only to the doctor.
Parent awareness. Parents should be told that the doctor reviews conversations and that the assistant has an oversight layer. Whether parents are told the monitor exists by name is a UX decision — likely yes, in the spirit of the transparent-AI principle.
Multi-language. If we ever support non-English families, the classifier needs to work in those languages. Claude is multilingual, but the red-flag tuning would have to be redone per language.
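The shadow-mode idea from the first open question can be sketched as a routing switch plus a precision estimate computed from reviewed alerts. This is a minimal Python illustration under stated assumptions: `MonitorMode`, `ReviewedAlert`, `ready_for_live`, and the role strings are hypothetical, and the thresholds are left as caller-supplied parameters because the calibration targets are not yet decided.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple


class MonitorMode(Enum):
    SHADOW = "shadow"  # classifier runs, doctor is not paged
    LIVE = "live"      # doctor receives the SMS


def alert_recipients(mode: MonitorMode) -> Tuple[str, ...]:
    """In shadow mode only the security officer sees alerts."""
    if mode is MonitorMode.SHADOW:
        return ("security_officer",)
    return ("doctor",)


@dataclass(frozen=True)
class ReviewedAlert:
    alert_id: str
    was_warranted: bool  # reviewer's judgment after reading the conversation


def precision(reviews: List[ReviewedAlert]) -> Optional[float]:
    """Share of reviewed alerts judged warranted; None if nothing was reviewed."""
    if not reviews:
        return None
    warranted = sum(1 for r in reviews if r.was_warranted)
    return warranted / len(reviews)


def ready_for_live(reviews: List[ReviewedAlert], min_precision: float, min_alerts: int) -> bool:
    """True once enough alerts have been reviewed and precision meets the bar."""
    value = precision(reviews)
    return value is not None and len(reviews) >= min_alerts and value >= min_precision
```

Shadow-mode review measures precision directly. Estimating recall would additionally require labeling conversations the monitor did not flag, which this sketch does not attempt.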

Cross-references

Parent Triage — the feature this watches over.
Compliance · synthetic data program — the substrate for safe red-flag-classifier development.
Compliance · regulated SDLC — relevant for the FDA SaMD line and audit-trail engineering.
