Multi-Tenancy

Status: Implemented — migration 0005_multi_tenancy.sql, full app-layer enforcement as of 2026-03-26 Priority: Must be completed before onboarding a second practice

What is Multi-Tenancy?

Multi-tenancy means one running application serves multiple independent customers (tenants) at the same time, with each tenant's data completely isolated from every other.

For Starlight, a tenant is a practice — Dr. Prajapati's pediatric DPC practice is one tenant. When the second practice joins, they are another. Their patients, billing history, revenue, settings, and message templates must be invisible to each other.

Right now, Starlight is single-tenant. There is one database, and all data implicitly belongs to Dr. P's practice. There is no concept of "which practice does this record belong to" anywhere in the schema.

Why It Matters

The business plan targets 500 practices by Year 3.

There are two ways to serve multiple practices:

Approach	Description	Problem
Database per practice	Each practice gets their own Aurora cluster	Expensive ($30-50/mo × 500 practices = $15-25K/mo infrastructure), operationally painful to deploy and migrate
Shared database, tenant column	All practices share one DB, every table has a `practice_id` column	One database, low cost, easy to operate — standard SaaS approach

The correct approach is the shared database model. That means adding a practice_id foreign key to every tenant-scoped table.

Current State

The schema today has 14 tables. None of them have a practice_id column:

users                  — who can log in
families               — sibling groups
patients               — active members + pipeline prospects
patient_notes          — timestamped notes
visit_logs             — clinical visit records
wellness_visits        — AAP milestone tracking
billing_payments       — monthly paid/unpaid per patient
billing_change_actions — tier transition action items
message_templates      — email templates
nurture_progress       — drip sequence tracking
email_logs             — sent email history
revenue_history        — monthly revenue totals
practice_settings      — key-value config
pricing_tiers          — the 5 age-based membership tiers

This works fine for one practice. With two practices, data leaks across practices the moment any query forgets a filter — and there's nothing at the database level to prevent it.

The Risk of Doing This Later

Adding practice_id now costs one migration and one afternoon of work.

Adding practice_id after two practices have real production data costs:

A migration that touches every table and every index
Backfilling practice_id on existing rows without downtime
Auditing every query in the application to add WHERE practice_id = $current — missing even one is a data leak
Re-testing every feature across practice boundaries

The longer this waits, the more expensive it gets. Do it before Practice #2 onboards.

Decisions Required Before Implementation

These questions must be answered before writing any code. Getting them wrong means refactoring the multi-tenancy design later.

Decision 1: User ↔ Practice Relationship

How does a logged-in user know which practice they belong to?

Option A — One user, one practice (simple) The users table gets a practice_id column. A user belongs to exactly one practice. Logging in automatically scopes all queries to that practice.

users.practice_id → practices.id

Option B — Users can belong to multiple practices (complex) A junction table practice_memberships(user_id, practice_id, role) allows one user account (e.g. a locum doctor or a billing admin working for a group) to access multiple practices and switch between them.

practice_memberships (user_id, practice_id, role)

Recommendation: Start with Option A. The current model (Dr. P is the only user, role = 'practitioner') maps cleanly. Add multi-practice membership only when a real customer requires it.

Decision 2: Which Tables Are Tenant-Scoped?

Not all tables need a practice_id. Some are shared platform data, some are per-practice.

Table	Tenant-scoped?	Reasoning
`practices`	—	This is the tenant table itself
`users`	✅ Yes	Belongs to a practice (Option A)
`families`	✅ Yes	Sibling groups are per-practice
`patients`	✅ Yes	Core per-practice entity
`patient_notes`	Via patient	Inherited — `patient_id` FK is enough
`visit_logs`	Via patient	Inherited
`wellness_visits`	Via patient	Inherited
`billing_payments`	Via patient	Inherited
`billing_change_actions`	Via patient	Inherited
`nurture_progress`	Via patient	Inherited
`email_logs`	Via patient	Inherited
`message_templates`	✅ Yes	Each practice has their own templates
`revenue_history`	✅ Yes	Per-practice revenue
`practice_settings`	✅ Yes	Already named for it
`pricing_tiers`	❌ No	Platform-level reference data — all practices share the same 5 tiers

"Via patient" tables don't need their own practice_id column — they're scoped to a practice transitively through their patient_id foreign key. Queries against these tables always join through patients, which carries the practice_id.

Decision 3: `pricing_tiers` — Platform or Per-Practice?

The 5 tiers (Orion, Lyra, Sirius, Pegasus, Polaris) with their prices currently live as platform-level reference data shared across all practices.

This works if every DPC practice uses the same tier structure and prices. If a future practice wants custom tier names or different prices, pricing_tiers would need to become per-practice.

Recommendation: Keep pricing_tiers as shared platform data for now. When a practice needs custom pricing, introduce a practice_pricing_overrides table rather than duplicating the tier structure.

The Implementation Plan

Once decisions are made, the work breaks down into four steps.

Step 1 — Create the `practices` table

CREATE TABLE practices (
  id          uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  name        varchar(255) NOT NULL,
  created_at  timestamp DEFAULT now() NOT NULL,
  updated_at  timestamp DEFAULT now() NOT NULL
);

Step 2 — Backfill Dr. P's practice

Insert Dr. P's practice as the first row and capture the generated UUID. This UUID will be used to backfill all existing rows in the next step.

INSERT INTO practices (id, name)
VALUES ('00000000-0000-0000-0000-000000000001', 'Starlight Pediatrics')
RETURNING id;
-- Use this ID as STARLIGHT_PRACTICE_ID in the next step

Using a fixed UUID (rather than a random one) makes the backfill reproducible across dev, staging, and prod.

Step 3 — Add `practice_id` to tenant-scoped tables and backfill

Each tenant-scoped table gets a nullable practice_id column first, then all existing rows are backfilled, then the NOT NULL constraint is enforced. This allows the migration to run without downtime.

-- Phase A: add nullable columns
ALTER TABLE users             ADD COLUMN practice_id uuid REFERENCES practices(id);
ALTER TABLE families          ADD COLUMN practice_id uuid REFERENCES practices(id);
ALTER TABLE patients          ADD COLUMN practice_id uuid REFERENCES practices(id);
ALTER TABLE message_templates ADD COLUMN practice_id uuid REFERENCES practices(id);
ALTER TABLE revenue_history   ADD COLUMN practice_id uuid REFERENCES practices(id);
ALTER TABLE practice_settings ADD COLUMN practice_id uuid REFERENCES practices(id);

-- Phase B: backfill existing rows
UPDATE users             SET practice_id = '00000000-0000-0000-0000-000000000001';
UPDATE families          SET practice_id = '00000000-0000-0000-0000-000000000001';
UPDATE patients          SET practice_id = '00000000-0000-0000-0000-000000000001';
UPDATE message_templates SET practice_id = '00000000-0000-0000-0000-000000000001';
UPDATE revenue_history   SET practice_id = '00000000-0000-0000-0000-000000000001';
UPDATE practice_settings SET practice_id = '00000000-0000-0000-0000-000000000001';

-- Phase C: enforce NOT NULL
ALTER TABLE users             ALTER COLUMN practice_id SET NOT NULL;
ALTER TABLE families          ALTER COLUMN practice_id SET NOT NULL;
ALTER TABLE patients          ALTER COLUMN practice_id SET NOT NULL;
ALTER TABLE message_templates ALTER COLUMN practice_id SET NOT NULL;
ALTER TABLE revenue_history   ALTER COLUMN practice_id SET NOT NULL;
ALTER TABLE practice_settings ALTER COLUMN practice_id SET NOT NULL;

Step 4 — Add indexes

Every tenant-scoped table needs an index on practice_id. The most common query pattern is WHERE practice_id = $id AND ..., so composite indexes on (practice_id, ...) matter:

-- Single-column indexes for tables where practice_id is the primary filter
CREATE INDEX patients_practice_id_idx         ON patients(practice_id);
CREATE INDEX families_practice_id_idx         ON families(practice_id);
CREATE INDEX message_templates_practice_id_idx ON message_templates(practice_id);

-- Composite index for revenue history (most queries filter by practice + month range)
CREATE INDEX revenue_history_practice_month_idx ON revenue_history(practice_id, month);

-- practice_settings is a KV store — (practice_id, key) is the natural lookup
CREATE UNIQUE INDEX practice_settings_practice_key_idx ON practice_settings(practice_id, key);
-- Note: the current primary key on `key` alone becomes wrong once multi-tenant.
-- It must be replaced with a composite PK or unique constraint on (practice_id, key).

Application Layer Changes

The database migration is only half the work. Every query in the application must be updated to filter by practice_id.

What changes

Every SELECT against a tenant-scoped table adds WHERE practice_id = $practiceId
Every INSERT into a tenant-scoped table includes the practice_id value
The authenticated user's practice_id must be read from users at the start of every request and injected into all repository calls

The risk

Missing a single WHERE practice_id = $id in a query means one practice can read another's data. This is a data leak and a HIPAA violation once real PHI is in the system.

The safest pattern is to make practice_id a required parameter on every repository method — enforced by TypeScript's type system — so it is impossible to write a query that omits it:

// Every repository method requires practiceId — TypeScript makes it impossible to forget
class PatientRepository {
  async findAll(practiceId: string): Promise<Patient[]> {
    return db.select().from(patients).where(eq(patients.practiceId, practiceId))
  }

  async findById(practiceId: string, patientId: string): Promise<Patient | null> {
    return db.select().from(patients).where(
      and(eq(patients.practiceId, practiceId), eq(patients.id, patientId))
    ).limit(1).then(r => r[0] ?? null)
  }
}

Where `practiceId` comes from at runtime

// In each Lambda handler, after verifying the Cognito token:
const user = await userRepository.findById(cognitoSub)
const practiceId = user.practiceId  // scoped to this practice for the entire request

Defense in Depth: PostgreSQL Row-Level Security

The TypeScript repository pattern above enforces practice_id at the application layer. PostgreSQL Row-Level Security (RLS) adds a second, independent enforcement layer at the database level — so even a raw SQL query or a bug that bypasses the repository cannot leak data across practices.

How it works

RLS attaches a policy to each tenant-scoped table. When a policy is active, PostgreSQL rejects any query that doesn't satisfy it — regardless of whether the application remembered to add a WHERE clause.

The policy reads a session variable (app.practice_id) that the Lambda handler sets at the start of each connection:

-- Enable RLS on each tenant-scoped table
ALTER TABLE patients          ENABLE ROW LEVEL SECURITY;
ALTER TABLE families          ENABLE ROW LEVEL SECURITY;
ALTER TABLE message_templates ENABLE ROW LEVEL SECURITY;
ALTER TABLE revenue_history   ENABLE ROW LEVEL SECURITY;
ALTER TABLE practice_settings ENABLE ROW LEVEL SECURITY;

-- Create the isolation policy (repeat for each table)
CREATE POLICY practice_isolation ON patients
  USING (practice_id = current_setting('app.practice_id')::uuid);

CREATE POLICY practice_isolation ON families
  USING (practice_id = current_setting('app.practice_id')::uuid);

CREATE POLICY practice_isolation ON message_templates
  USING (practice_id = current_setting('app.practice_id')::uuid);

CREATE POLICY practice_isolation ON revenue_history
  USING (practice_id = current_setting('app.practice_id')::uuid);

CREATE POLICY practice_isolation ON practice_settings
  USING (practice_id = current_setting('app.practice_id')::uuid);

Setting the session variable at runtime

After reading practiceId from the authenticated user, set it on the connection before any queries run:

// In each Lambda handler, after verifying the Cognito token:
const user = await userRepository.findById(cognitoSub)
const practiceId = user.practiceId

// Set the session variable — PostgreSQL RLS reads this for every subsequent query
await db.execute(sql`SELECT set_config('app.practice_id', ${practiceId}, true)`)

// All queries from here are automatically scoped — even if practice_id is omitted

What this protects against

Scenario	TypeScript layer	RLS layer
Repository method missing `WHERE practice_id`	❌ Leaks data	✅ Blocked
Raw SQL in a migration script	❌ No filter	✅ Blocked
Ad-hoc query in a Lambda handler	❌ No filter	✅ Blocked
Correct repository call	✅ Filtered	✅ Also filtered

Important notes

RLS policies apply to all roles by default. If you connect as a superuser for migrations, bypass RLS explicitly: SET SESSION AUTHORIZATION DEFAULT or use BYPASSRLS on the migration role.
The true third argument to set_config makes the setting local to the current transaction. If you use connection pooling across requests, this prevents one request's practice_id from leaking into another.
"Via patient" tables (patient_notes, visit_logs, etc.) don't need RLS directly — they're always queried through a JOIN on patients, which is already protected.

A Note on `practice_settings`

The current practice_settings table uses key as its primary key:

practice_settings (
  key       varchar(100) PRIMARY KEY,
  value     text,
  updated_at timestamp
)

With multi-tenancy, two practices can have the same key (e.g. "reminder_email"). The primary key must change to (practice_id, key):

-- Current: PRIMARY KEY (key)           ← wrong for multi-tenant
-- Correct: PRIMARY KEY (practice_id, key)

This is a breaking schema change — the existing PK must be dropped and replaced.

When to Do This

Trigger	Action
Second practice is confirmed (even just signed up)	Start immediately
Second practice is 4+ weeks out	Begin schema migration in the current sprint
No second practice on the horizon	Add `practices` table and backfill now anyway — it's one migration

The minimum safe action right now: Create the practices table and insert Dr. P's practice. This commits to nothing but establishes the anchor that everything else will reference. The column additions to other tables can follow incrementally.

What is Multi-Tenancy?​

Why It Matters​

Current State​

The Risk of Doing This Later​

Decisions Required Before Implementation​

Decision 1: User ↔ Practice Relationship​

Decision 2: Which Tables Are Tenant-Scoped?​

Decision 3: pricing_tiers — Platform or Per-Practice?​

The Implementation Plan​

Step 1 — Create the practices table​

Step 2 — Backfill Dr. P's practice​

Step 3 — Add practice_id to tenant-scoped tables and backfill​

Step 4 — Add indexes​

Application Layer Changes​

What changes​

The risk​

Where practiceId comes from at runtime​

Defense in Depth: PostgreSQL Row-Level Security​

How it works​

Setting the session variable at runtime​

What this protects against​

Important notes​

A Note on practice_settings​

When to Do This​