Government & Public-Sector Data Sources for Pediatric DPC
Map of US public-sector data sources for Starlight Practice (pediatric DPC EMR + parent app + AI substrate). Covers NIH/NLM, CDC, FDA, HRSA, AHRQ, CMS, state public-health, and standardized vocabularies. Paid commercial sources (UpToDate, Lexicomp, Nelson's, Harriet Lane, paywalled AAP Red Book) are tracked separately in /docs/data-sources/commercial.md.
A URL flagged "unverified this session" could not be checked because WebFetch was blocked at compile time. Flagged URLs are the canonical .gov endpoints recalled from training; re-verify each one before crawling.
- MUST-HAVE — launch blocker
- NICE-TO-HAVE — v1.x AI grounding / CDS
- WAIT — analytics / research later
1. NIH / NLM Properties
1.1 PubMed / MEDLINE
- Description: ~37M citations & abstracts of biomedical literature; the backbone reference corpus.
- URL: https://pubmed.ncbi.nlm.nih.gov/ — unverified this session
- Access: E-utilities API (`eutils.ncbi.nlm.nih.gov`), bulk FTP (annual baseline + daily update files), web UI.
- Cost: Free.
- Format: XML (PubMed DTD), JSON via E-utils, MEDLINE flat-file.
- Cadence: Daily updates, annual baseline reload (December).
- License: US Government work / public domain for NLM-produced metadata. Abstracts may carry copyright by publishers — redistribution of full abstracts in derivative products requires care.
- Pediatric relevance: Filter by MeSH `Pediatrics`, `Child`, `Infant`, `Adolescent` for AI grounding on age-specific evidence.
- Priority: MUST-HAVE for AI substrate.
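
The MeSH filter above can be sketched as an E-utilities `esearch` query. This is a minimal illustration, not a finished client: the helper names (`pediatric_term`, `esearch_url`) are hypothetical, and the topic string is whatever the caller supplies. It only builds the URL; fetching, API keys, and rate limiting are left out.

```python
from urllib.parse import urlencode

# E-utilities esearch endpoint on the host named in the Access line.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# The four MeSH headings listed in this entry.
PEDIATRIC_MESH = ("Pediatrics", "Child", "Infant", "Adolescent")


def pediatric_term(topic: str) -> str:
    """AND a free-text topic with an OR-group of pediatric MeSH headings."""
    mesh_group = " OR ".join(f'"{h}"[MeSH Terms]' for h in PEDIATRIC_MESH)
    return f"({topic}) AND ({mesh_group})"


def esearch_url(topic: str, retmax: int = 20) -> str:
    """Build (but do not send) a PubMed esearch request URL."""
    params = {
        "db": "pubmed",
        "term": pediatric_term(topic),
        "retmode": "json",
        "retmax": retmax,
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"
```

Calling `esearch_url("asthma")` yields a URL whose `term` parameter restricts results to records indexed under at least one of the four headings.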
1.2 PubMed Central (PMC) Open Access Subset
- Description: Full-text articles, of which ~5M+ are in the OA subset (CC-BY family licenses).
- URL: https://www.ncbi.nlm.nih.gov/pmc/ — unverified this session
- Access: Bulk FTP (`ftp.ncbi.nlm.nih.gov/pub/pmc/`), OAI-PMH, E-utilities, OA Web Service API.
- Cost: Free.
- Format: NXML (JATS XML), PDF, plain text, package tarballs.
- Cadence: Daily.
- License: Per-article (CC-BY, CC-BY-NC, CC0); the OA subset is explicitly redistributable. Articles outside the OA subset can be read on the site but must not be bulk-downloaded or redistributed.
- Pediatric relevance: Full-text RAG corpus; superior to abstracts for clinical question answering.
- Priority: MUST-HAVE.
1.3 ClinicalTrials.gov
- Description: Registry & results database of ~500K clinical studies worldwide.
- URL: https://clinicaltrials.gov/ ; API: https://clinicaltrials.gov/data-api/api — unverified this session
- Access: REST API v2 (JSON), bulk download (XML/JSON ZIPs), CSV exports.
- Cost: Free.
- Format: JSON, XML, CSV.
- Cadence: Real-time / daily.
- License: Public domain (US Government).
- Pediatric relevance: Filter `StdAge=Child` for pediatric trials; useful for parent-app "trials near you" feature and for sourcing rare-disease cohorts.
- Priority: NICE-TO-HAVE.
1.4 MedlinePlus / MedlinePlus Connect
- Description: Consumer-facing health information; Connect lets EHRs link diagnosis/medication codes to patient-friendly content.
- URL: https://medlineplus.gov/ ; Connect: https://medlineplus.gov/connect/ — unverified this session
- Access: Connect Web Service (XML/JSON), Health Topics XML feed, bulk download.
- Cost: Free.
- Format: XML, JSON, HTML.
- Cadence: Continuous.
- License: Most NLM-authored content is public domain; some embedded images/encyclopedia entries are licensed (A.D.A.M.) and not redistributable.
- Pediatric relevance: Patient-education content keyed off ICD-10/RxNorm/LOINC — perfect for parent app "learn about this condition/med/lab" surfaces.
- Priority: MUST-HAVE for parent app.
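
A rough sketch of how a code-keyed Connect lookup URL might be assembled. The service host, the `mainSearchCriteria.*` parameter names, and the code-system OIDs are written from memory of the Connect web-service documentation and should be treated as assumptions to confirm; `connect_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumed Connect service endpoint; confirm against current NLM docs.
CONNECT_URL = "https://connect.medlineplus.gov/service"

# HL7 OIDs for the three code systems this entry mentions.
CODE_SYSTEM_OIDS = {
    "ICD10CM": "2.16.840.1.113883.6.90",
    "RXNORM": "2.16.840.1.113883.6.88",
    "LOINC": "2.16.840.1.113883.6.1",
}


def connect_url(system: str, code: str, language: str = "en") -> str:
    """Build (but do not send) a Connect request for one coded concept."""
    params = {
        "mainSearchCriteria.v.cs": CODE_SYSTEM_OIDS[system],
        "mainSearchCriteria.v.c": code,
        "knowledgeResponseType": "application/json",
        "informationRecipient.languageCode.c": language,
    }
    return f"{CONNECT_URL}?{urlencode(params)}"
```

For example, `connect_url("ICD10CM", "J45.909")` would request patient-education content for an asthma diagnosis code. Any A.D.A.M.-licensed material in the response still falls under the redistribution limits noted in the License line.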
1.5 RxNorm (NLM)
- Description: Normalized naming & codes for clinical drugs in the US.
- URL: https://www.nlm.nih.gov/research/umls/rxnorm/ — unverified this session
- Access: Monthly RRF release (UMLS download), RxNav REST API, RxClass, RxMix.
- Cost: Free.
- Format: RRF (pipe-delimited), JSON via RxNav.
- Cadence: Monthly full release; weekly updates via RxNav.
- License: The full release files are distributed under the UMLS Metathesaurus License (free, requires a UMLS account); content whose source is RxNorm itself carries no further restrictions.
- Pediatric relevance: Required for e-prescribing; pediatric weight-based dosing wiring; mapping to FDA NDC.
- Priority: MUST-HAVE.
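
Since the monthly release ships as pipe-delimited RRF, a small parser sketch may help when wiring ingestion. The column list below is the `RXNCONSO.RRF` layout as I understand it from the RxNorm technical documentation; verify it against the release's own file documentation before relying on it. `parse_rrf` is a hypothetical helper.

```python
from typing import Dict, Iterable, Iterator, Sequence

# Assumed RXNCONSO.RRF column order; confirm against the release docs.
RXNCONSO_COLUMNS = [
    "RXCUI", "LAT", "TS", "LUI", "STT", "SUI", "ISPREF", "RXAUI", "SAUI",
    "SCUI", "SDUI", "SAB", "TTY", "CODE", "STR", "SRL", "SUPPRESS", "CVF",
]


def parse_rrf(lines: Iterable[str], columns: Sequence[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per RRF row; RRF has no quoting, only '|' separators."""
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue
        fields = line.split("|")
        # Every RRF row ends with a trailing '|', which produces one extra
        # empty field after splitting; drop exactly that one.
        if fields[-1] == "":
            fields = fields[:-1]
        if len(fields) != len(columns):
            raise ValueError(f"expected {len(columns)} fields, got {len(fields)}")
        yield dict(zip(columns, fields))
```

A typical use is `for row in parse_rrf(open(path, encoding="utf-8"), RXNCONSO_COLUMNS)`, filtering on `row["SAB"] == "RXNORM"` to keep only RxNorm-sourced atoms.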
1.6 MeSH (Medical Subject Headings)
- Description: Controlled vocabulary of biomedical concepts; powers PubMed indexing.
- URL: https://www.nlm.nih.gov/mesh/ — unverified this session
- Access: Annual XML/RDF download, SPARQL endpoint (`id.nlm.nih.gov/mesh`).
- Cost: Free.
- Format: XML, RDF/Turtle, ASCII.
- Cadence: Annual (November).
- License: Public domain.
- Pediatric relevance: Concept tagging for AI corpora; faceted search of literature by age subgroups.
- Priority: NICE-TO-HAVE.
1.7 UMLS Metathesaurus
- Description: Cross-walks 200+ vocabularies (SNOMED, ICD, LOINC, RxNorm, MeSH, etc.) into unified concepts (CUIs).
- URL: https://www.nlm.nih.gov/research/umls/ — unverified this session
- Access: Bulk RRF download, REST API.
- Cost: Free with UMLS license (free account).
- Format: RRF, JSON.
- Cadence: Two releases/year (May, November).
- License: Free for US users; some source vocabularies have restrictions inherited (e.g., CPT). Must accept UMLS license.
- Pediatric relevance: The single most important crosswalk asset for clinical interoperability.
- Priority: MUST-HAVE.
1.8 NCBI Bookshelf (incl. GeneReviews, StatPearls)
- Description: Free full-text books and reports; GeneReviews is the gold standard for genetic conditions; StatPearls is open-access clinical reference.
- URL: https://www.ncbi.nlm.nih.gov/books/ — unverified this session
- Access: OAI-PMH, bulk FTP, E-utilities.
- Cost: Free.
- Format: NXML, PDF.
- Cadence: Continuous.
- License: Per-title; many CC-BY or NLM public domain. StatPearls is CC-BY-NC-ND — commercial redistribution restricted.
- Pediatric relevance: GeneReviews for inborn errors of metabolism, syndromic conditions; StatPearls for general peds reference.
- Priority: NICE-TO-HAVE (verify StatPearls license fits commercial use).
1.9 DailyMed (NLM)
- Description: Authoritative FDA-submitted drug labeling (Structured Product Labels).
- URL: https://dailymed.nlm.nih.gov/ — unverified this session
- Access: REST/SOAP API, bulk download (FTP).
- Cost: Free.
- Format: SPL (HL7 XML), PDF, JSON via API.
- Cadence: Daily.
- License: Public domain.
- Pediatric relevance: Pediatric Use sections, Boxed Warnings, Dosage & Administration — required for safe e-prescribing.
- Priority: MUST-HAVE.
1.10 Pillbox (retired Jan 2021)
Use DailyMed images going forward. WAIT / archival.
1.11 NLM Value Set Authority Center (VSAC)
- Description: Authoritative repository of value sets used in eCQMs (electronic Clinical Quality Measures).
- URL: https://vsac.nlm.nih.gov/ — unverified this session
- Access: API (FHIR ValueSet, SVS), web download.
- Cost: Free with UMLS license.
- Format: FHIR JSON/XML, SVS XML, Excel.
- Cadence: Continuous.
- License: Per-source; SNOMED, LOINC, RxNorm value sets are redistributable under their parent licenses.
- Pediatric relevance: Pre-built CMS/ONC-blessed value sets for well-child visits, immunizations, BMI, lead screening, etc.
- Priority: MUST-HAVE for quality measure reporting.
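
Value sets retrieved as FHIR JSON carry their member codes under `expansion.contains`, per the FHIR ValueSet resource definition. The sketch below flattens an already-downloaded expansion into `(system, code)` pairs; it does not call VSAC, and `expansion_codes` is a hypothetical helper.

```python
from typing import Any, Dict, List, Set, Tuple


def expansion_codes(value_set: Dict[str, Any]) -> Set[Tuple[str, str]]:
    """Collect (system, code) pairs from a FHIR ValueSet expansion."""
    found: Set[Tuple[str, str]] = set()

    def walk(items: List[Dict[str, Any]]) -> None:
        for item in items:
            # Abstract grouping entries may lack a code; skip those.
            if "system" in item and "code" in item:
                found.add((item["system"], item["code"]))
            # 'contains' can nest when the expansion is hierarchical.
            walk(item.get("contains", []))

    walk(value_set.get("expansion", {}).get("contains", []))
    return found
```

Membership tests such as `(system, code) in expansion_codes(vs)` are then enough to decide whether a recorded code satisfies a measure's value set.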
1.12 NIH Common Data Elements (CDE) Repository
- URL: https://cde.nlm.nih.gov/ — unverified this session
- Access: Web UI, REST API.
- Cost: Free.
- Pediatric relevance: Standardized question/answer items for assessments (PROMIS, NIH Toolbox).
- Priority: NICE-TO-HAVE.
1.13 NIH PROMIS / NIH Toolbox
- Description: Patient-Reported Outcomes Measurement Information System with pediatric and parent-proxy short forms.
- URL: https://www.healthmeasures.net/ — unverified this session (NIH-funded, hosted at Northwestern).
- Access: Free instruments via registration; Assessment Center API.
- License: Free for use; redistribution of items with attribution.
- Pediatric relevance: Validated peds PROs (anxiety, depression, mobility, peer relationships).
- Priority: NICE-TO-HAVE.
1.14 dbGaP, GenBank, ClinVar, MedGen
NCBI databases via E-utilities/FTP; free (dbGaP individual-level requires controlled access). ClinVar/MedGen useful for newborn-screen follow-up. WAIT unless we add genomics.
2. CDC Properties
2.1 CDC WONDER
- Description: Online query system for mortality, natality, cancer, STD, TB, environmental data.
- URL: https://wonder.cdc.gov/ — unverified this session
- Access: Web UI; programmatic XML POST API (rate-limited; deprecated for some datasets).
- Cost: Free.
- Format: TSV, XML.
- Cadence: Annual / quarterly depending on dataset.
- License: Public domain (US Government).
- Pediatric relevance: Infant mortality, birth defects, leading causes of death by age, vaccine-preventable disease counts.
- Priority: NICE-TO-HAVE (population baselines for synthetic patient generation).
2.2 CDC Growth Charts (2000)
- Description: US growth references for ages 2–20 (height-for-age, weight-for-age, BMI-for-age, weight-for-stature). Used above age 2.
- URL: https://www.cdc.gov/growthcharts/ ; data files: https://www.cdc.gov/growthcharts/cdc-data-files.htm — unverified this session
- Access: Direct CSV/Excel download of L, M, S parameters and percentile tables; PDF charts.
- Cost: Free.
- Format: CSV, Excel, PDF (for printable charts).
- Cadence: Static (2000 reference; stable).
- License: Public domain.
- Pediatric relevance: Core — every well-child visit calculates and plots growth. LMS parameters enable z-score / percentile calculation.
- Priority: MUST-HAVE.
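
The z-score and percentile calculation from LMS parameters follows the standard LMS transformation: z = ((X/M)^L - 1) / (L·S) when L ≠ 0, and z = ln(X/M) / S when L = 0. A minimal sketch, assuming the caller has already looked up L, M, and S for the child's sex and age from the downloaded tables (function names are hypothetical):

```python
import math


def lms_zscore(x: float, l: float, m: float, s: float) -> float:
    """Z-score of measurement x given LMS parameters for the age/sex row."""
    if x <= 0 or m <= 0 or s <= 0:
        raise ValueError("x, M, and S must be positive")
    if abs(l) < 1e-12:
        # Limiting case of the Box-Cox transform as L approaches 0.
        return math.log(x / m) / s
    return ((x / m) ** l - 1.0) / (l * s)


def z_to_percentile(z: float) -> float:
    """Convert a z-score to a percentile via the standard normal CDF."""
    return 50.0 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

A measurement equal to the median M gives z = 0 and the 50th percentile regardless of L and S, which makes a convenient sanity check after loading a table.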
2.3 WHO Growth Standards (CDC-hosted & recommended for 0–2)
- Description: WHO Multicentre Growth Reference Study; CDC recommends WHO charts for birth–age 24 months, then CDC charts after.
- URL: https://www.cdc.gov/growthcharts/who-growth-charts.htm ; primary: https://www.who.int/tools/child-growth-standards — unverified this session
- Access: Direct download of LMS tables.
- Cost: Free.
- Format: Excel, TXT.
- License: WHO terms allow non-commercial & commercial use with attribution; CDC-hosted copies are public domain.
- Pediatric relevance: Core for infants/toddlers.
- Priority: MUST-HAVE.
2.4 CDC/AAP/ACIP Immunization Schedules
- Description: Annual recommended childhood and adolescent immunization schedules + catch-up schedules.
- URL: https://www.cdc.gov/vaccines/hcp/imz-schedules/ — unverified this session
- Access: PDF, HTML; CDC Immunization Schedule API/JSON has been published in recent years (verify currency).
- Cost: Free.
- Format: PDF, HTML; machine-readable schedule (JSON) experimental.
- Cadence: Annual (released early each year by ACIP).
- License: Public domain.
- Pediatric relevance: Core — CDS for due/overdue immunizations.
- Priority: MUST-HAVE. Pair with CDS Connect logic and HL7 CDS for Immunizations (ICE/COVE).
2.5 CDC ACIP Vaccine Recommendations & Statements
- URL: https://www.cdc.gov/vaccines/acip/ — unverified this session
- Access: PDF, HTML, MMWR.
- Cost: Free.
- License: Public domain.
- Priority: MUST-HAVE (full text grounding for AI).
2.6 CVX / MVX Code Sets (CDC IIS)
- Description: CVX (vaccine-administered codes) and MVX (manufacturer codes) used in HL7 immunization messaging.
- URL: https://www2a.cdc.gov/vaccines/iis/iisstandards/vaccines.asp?rpt=cvx — unverified this session
- Access: HTML tables, downloadable XML.
- Cost: Free.
- License: Public domain.
- Priority: MUST-HAVE for vaccine recording.
2.7 Immunization Information Systems (IIS) — State Registries
- Description: Per-state registries (every state + DC has one). Bidirectional HL7 v2.5.1 messaging is required for "Public Health Reporting" in CMS Promoting Interoperability.
- URL: https://www.cdc.gov/vaccines/programs/iis/contacts-locate-records.html — unverified this session
- Access: Per-state onboarding (HL7 v2 over SOAP/MLLP/SFTP); CDC AIRA model docs at `repository.immregistries.org` — unverified this session
- Cost: Free (regulatory).
- Format: HL7 v2.5.1 VXU/QBP/RSP.
- License: N/A (regulated reporting).
- Pediatric relevance: Core — required for full vaccine history queries and meaningful use.
- Priority: MUST-HAVE (state-by-state rollout).
2.8 VAERS (Vaccine Adverse Event Reporting System)
- Description: Co-managed by CDC & FDA; passive surveillance of post-vaccination adverse events.
- URL: https://vaers.hhs.gov/data.html — unverified this session
- Access: CSV downloads (annual + current year), CDC WONDER VAERS query.
- Cost: Free.
- Format: CSV.
- Cadence: Weekly.
- License: Public domain (PII redacted).
- Pediatric relevance: Background-rate context for vaccine safety counseling.
- Priority: NICE-TO-HAVE.
2.9 NHANES (National Health and Nutrition Examination Survey)
- URL: https://www.cdc.gov/nchs/nhanes/ — unverified this session
- Access: Bulk SAS XPT & CSV files per cycle.
- Cost: Free.
- Format: SAS XPT, CSV, codebooks.
- Cadence: Continuous (2-year cycles).
- License: Public domain.
- Pediatric relevance: Pediatric anthropometry, lab norms, dietary intake; basis of CDC growth refs and BP percentiles. Critical for synthetic patient generation.
- Priority: MUST-HAVE for synthetic data + reference ranges.
2.10 NSCH (National Survey of Children's Health)
- Description: Annual HRSA-funded, Census-administered survey of ~50K households on child health, well-being, special health-care needs.
- URL: https://www.childhealthdata.org/ ; raw: https://www.census.gov/programs-surveys/nsch.html — unverified this session
- Access: Public-use SAS/Stata/CSV; restricted-use data via Census FSRDC.
- Cost: Free.
- Cadence: Annual.
- License: Public domain (public-use file).
- Pediatric relevance: Population priors for screening prevalences (ADHD, asthma, mental health, ACEs).
- Priority: NICE-TO-HAVE.
2.11 NSFG (National Survey of Family Growth)
https://www.cdc.gov/nchs/nsfg/ — adolescent reproductive-health priors. WAIT.
2.12 YRBSS (Youth Risk Behavior Surveillance System)
- URL: https://www.cdc.gov/healthyyouth/data/yrbs/ — unverified this session
- Access: CSV/SAS bulk + interactive analytic tool.
- Cost: Free.
- Cadence: Biennial.
- License: Public domain.
- Pediatric relevance: Adolescent health-risk prevalences; drives CDS thresholds for screening (substance use, sexual activity, suicidality).
- Priority: NICE-TO-HAVE.
2.13 MMWR (Morbidity and Mortality Weekly Report)
- URL: https://www.cdc.gov/mmwr/ — unverified this session
- Access: HTML, PDF, RSS.
- Cost: Free.
- Format: HTML/PDF; machine-readable JSON via CDC Data API for some series.
- License: Public domain.
- Pediatric relevance: Outbreak alerts, vaccine policy updates, screening guidance.
- Priority: NICE-TO-HAVE (RSS subscribe for alerts).
2.14 CDC Open Data Portal (data.cdc.gov)
- URL: https://data.cdc.gov/ — unverified this session
- Access: Socrata SODA API (JSON/CSV/XML), per-dataset endpoints.
- Cost: Free.
- Format: JSON, CSV, XML, GeoJSON.
- License: Public domain.
- Pediatric relevance: ~3,000 datasets including weekly flu/RSV/COVID by age, lead testing, BRFSS, environmental.
- Priority: NICE-TO-HAVE.
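
Per-dataset SODA endpoints follow the pattern `/resource/<dataset-id>.json` with `$`-prefixed query parameters. The sketch below builds such a URL; the dataset ID and column name in the usage note are placeholders, not real CDC identifiers, and `soda_url` is a hypothetical helper.

```python
from typing import Optional
from urllib.parse import urlencode


def soda_url(dataset_id: str, where: Optional[str] = None,
             limit: int = 1000, offset: int = 0,
             domain: str = "data.cdc.gov") -> str:
    """Build (but do not send) a SODA query URL for one dataset."""
    params = {"$limit": limit, "$offset": offset}
    if where:
        params["$where"] = where
    # Keep '$' literal so the SoQL parameter names stay readable.
    query = urlencode(params, safe="$")
    return f"https://{domain}/resource/{dataset_id}.json?{query}"
```

For instance, `soda_url("abcd-1234", where="age_group = '0-4'", limit=5)` pages through a hypothetical dataset five rows at a time; incrementing `offset` by `limit` walks the full result set.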
2.15 NNDSS (National Notifiable Diseases Surveillance System)
- URL: https://www.cdc.gov/nndss/ — unverified this session
- Access: Weekly tables via CDC Data Portal.
- Cost: Free.
- Pediatric relevance: Local incidence of pertussis, measles, varicella for clinical-suspicion priors.
- Priority: NICE-TO-HAVE.
2.16 CDC Lead Surveillance / CLPPP
https://www.cdc.gov/nceh/lead/data/ — state-level BLL prevalence for risk-based screening. NICE-TO-HAVE.
2.17 EPSDT (Early and Periodic Screening, Diagnostic, and Treatment) References & Bright Futures-aligned tools
- See HRSA section. CDC hosts the "Learn the Signs. Act Early." developmental milestone checklists (Free, public domain) at https://www.cdc.gov/ncbddd/actearly/ — unverified this session. MUST-HAVE for parent app developmental tracker.
3. FDA Properties
3.1 OpenFDA
- Description: Unified API surface across drug labels, drug events (FAERS), drug enforcement, device events (MAUDE), device 510(k)/PMA, food enforcement, NSDE, animal events.
- URL: https://open.fda.gov/ ; API: https://api.fda.gov/ — unverified this session
- Access: REST API (JSON), bulk downloads (JSON ZIPs).
- Cost: Free; 240 req/min and 1,000 req/day per IP without an API key, 240 req/min and 120K req/day with a free key.
- Format: JSON.
- Cadence: Quarterly to weekly depending on endpoint.
- License: Public domain.
- Pediatric relevance: Drug labeling extraction (`pediatric_use` field), pediatric AE signals via FAERS.
- Priority: MUST-HAVE.
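
Extracting the `pediatric_use` field can be sketched in two steps: build a `/drug/label.json` query that only matches labels containing that section, then pull the text out of the JSON response. Helper names are hypothetical, the drug name is supplied by the caller, and no request is actually sent here.

```python
from typing import Any, Dict, List, Optional
from urllib.parse import urlencode

LABEL_URL = "https://api.fda.gov/drug/label.json"


def label_query_url(generic_name: str, api_key: Optional[str] = None,
                    limit: int = 1) -> str:
    """Build a label search restricted to records that have pediatric_use."""
    search = f'openfda.generic_name:"{generic_name}" AND _exists_:pediatric_use'
    params: Dict[str, Any] = {"search": search, "limit": limit}
    if api_key:
        params["api_key"] = api_key
    return f"{LABEL_URL}?{urlencode(params)}"


def pediatric_use_text(response: Dict[str, Any]) -> List[str]:
    """Collect pediatric_use passages from a parsed label response."""
    passages: List[str] = []
    for result in response.get("results", []):
        # Label sections arrive as lists of strings.
        passages.extend(result.get("pediatric_use", []))
    return passages
```

Passing an API key is optional but raises the daily quota, so batch ingestion should supply one.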
3.2 Drugs@FDA
- URL: https://www.accessdata.fda.gov/scripts/cder/daf/ — unverified this session
- Access: Web search; bulk downloads at FDA download pages.
- Cost: Free.
- Format: Excel, JSON via OpenFDA.
- Pediatric relevance: Approval letters, labels; pediatric study requirements (PREA/BPCA).
- Priority: NICE-TO-HAVE (covered by OpenFDA + DailyMed in practice).
3.3 FDA Pediatric Labeling Information / Pediatric Studies Database
- URL: https://www.fda.gov/science-research/pediatrics/pediatric-labeling-information-database — unverified this session
- Access: Web; downloadable lists.
- Cost: Free.
- License: Public domain.
- Pediatric relevance: Authoritative list of drugs with pediatric-specific labeling changes from PREA/BPCA studies.
- Priority: MUST-HAVE (citation backbone for "is this drug pediatric-labeled?").
3.4 FAERS (FDA Adverse Event Reporting System)
- URL: https://www.fda.gov/drugs/questions-and-answers-fdas-adverse-event-reporting-system-faers — unverified this session
- Access: Quarterly ASCII/XML downloads; OpenFDA `/drug/event` API.
- Cost: Free.
- Format: XML, ASCII, JSON.
- Cadence: Quarterly.
- License: Public domain.
- Pediatric relevance: Pediatric subset extraction by patient age.
- Priority: NICE-TO-HAVE.
3.5 FDA NDC Directory
- URL: https://www.accessdata.fda.gov/scripts/cder/ndc/ — unverified this session
- Access: ZIP download (Excel/Text), OpenFDA `/drug/ndc`.
- Cost: Free.
- Cadence: Daily.
- License: Public domain.
- Pediatric relevance: NDC↔RxNorm crosswalk for e-Rx and pharmacy interfaces.
- Priority: MUST-HAVE.
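
One practical wrinkle in any NDC crosswalk is that the directory publishes hyphenated 10-digit codes in three layouts (4-4-2, 5-3-2, 5-4-1), while pharmacy and billing interfaces commonly expect the zero-padded 11-digit 5-4-2 form. A minimal normalization sketch, with a hypothetical function name:

```python
VALID_LAYOUTS = {(4, 4, 2), (5, 3, 2), (5, 4, 1)}


def ndc10_to_ndc11(ndc: str) -> str:
    """Pad a hyphenated 10-digit NDC to the 11-digit 5-4-2 form (no hyphens)."""
    parts = ndc.strip().split("-")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"not a hyphenated NDC: {ndc!r}")
    if tuple(len(p) for p in parts) not in VALID_LAYOUTS:
        raise ValueError(f"unexpected NDC layout: {ndc!r}")
    labeler, product, package = parts
    # Exactly one segment is short; zero-pad each to its 5-4-2 width.
    return labeler.zfill(5) + product.zfill(4) + package.zfill(2)
```

The hyphens matter: an unhyphenated 10-digit string is ambiguous across the three layouts, so the function refuses it rather than guess.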
3.6 FDA Orange Book / Purple Book
Monthly ZIP at fda.gov; therapeutic equivalence + biosimilars. Free, public domain. NICE-TO-HAVE.
3.7 FDA Drug Shortages
- URL: https://www.accessdata.fda.gov/scripts/drugshortages/ — unverified this session
- Access: Web + RSS + downloadable XLS; ASHP also publishes a partner feed.
- Cost: Free.
- Pediatric relevance: Critical for pediatric formulations (oral suspensions, low-dose strengths), which are frequently in shortage.
- Priority: NICE-TO-HAVE.
3.8 FDA MedWatch & Safety Communications
https://www.fda.gov/safety/medwatch — web + RSS, public domain. NICE-TO-HAVE (alerts).
3.9 GUDID / AccessGUDID
https://accessgudid.nlm.nih.gov/ — bulk + REST API; device UDI lookups. WAIT.
4. HRSA / MCHB
4.1 NSCH (HRSA-funded; see CDC §2.10)
HRSA MCHB funds the survey and the Census Bureau fields it. Primary access via https://www.childhealthdata.org/ — unverified this session. NICE-TO-HAVE for population priors (same priority as §2.10).
4.2 Title V MCH Block Grant Information System (TVIS)
https://mchb.tvisdata.hrsa.gov/ — state MCH performance measures (Excel). WAIT.
4.3 HRSA Data Warehouse
https://data.hrsa.gov/ — Socrata-style APIs; HPSA/MUA/MUP for telehealth planning. NICE-TO-HAVE.
4.4 Bright Futures (AAP / HRSA co-publication)
- Description: The recommended pediatric preventive-services schedule and visit content ("Periodicity Schedule" + visit forms).
- URL: https://brightfutures.aap.org/ ; HRSA: https://mchb.hrsa.gov/programs-impact/programs/bright-futures — unverified this session
- Access: PDF (free) for the Periodicity Schedule and visit summaries; the full Bright Futures Guidelines, 4th Edition book is paid (AAP). Pocket Guide and forms are free.
- Cost: Free for periodicity schedule, visit handouts, pre-visit questionnaires; paid for the complete guidelines.
- Format: PDF, some Word.
- License: AAP-copyrighted; HRSA-funded items often permit reproduction with attribution — must verify per-asset license.
- Pediatric relevance: Core — the canonical well-child visit framework for US pediatrics.
- Priority: MUST-HAVE (free portions); flag the paid book in `/commercial.md`.
4.5 Healthy People 2030
https://health.gov/healthypeople — objectives JSON API, public domain. National benchmarks for screening/immunization coverage. NICE-TO-HAVE.
4.6 Newborn Screening (NewSTEPs / APHL + HRSA RUSP)
- Description: Recommended Uniform Screening Panel (RUSP) is HRSA-blessed; NewSTEPs (APHL) tracks state implementations.
- URL: https://www.hrsa.gov/advisory-committees/heritable-disorders/rusp ; https://www.newsteps.org/ — unverified this session
- Access: PDF / web; some bulk via NewSTEPs login.
- Cost: Free.
- Pediatric relevance: Per-state newborn screen panels and follow-up workflows.
- Priority: MUST-HAVE for newborn-visit workflow.
5. AHRQ
5.1 HCUP (Healthcare Cost and Utilization Project)
- Description: Largest US all-payer encounter datasets. HCUP-KID = Kids' Inpatient Database (triennial, ~3M peds discharges); NIS has pediatric subset; NEDS = ED.
- URL: https://www.hcup-us.ahrq.gov/ — unverified this session
- Access: Purchased via HCUP Central Distributor; DUA required; nominal cost (often a few hundred USD per file for non-students; free for some federal users).
- Cost: Low-cost paid + DUA. Flag as "not free".
- Format: ASCII, SAS, Stata.
- Cadence: Annual / triennial.
- License: AHRQ DUA — restricted re-use; no individual identification, no commercial redistribution.
- Pediatric relevance: Pediatric inpatient/ED epidemiology; benchmark for "is this admission rate normal?"
- Priority: WAIT (until research/analytics phase).
5.2 MEPS (Medical Expenditure Panel Survey)
- URL: https://meps.ahrq.gov/ — unverified this session
- Access: Public-use SAS/Stata/ASCII.
- Cost: Free (public-use).
- Pediatric relevance: Family medical expenditure modeling; useful for DPC pricing analytics.
- Priority: NICE-TO-HAVE.
5.3 USPSTF Recommendations (AHRQ-supported)
- URL: https://www.uspreventiveservicestaskforce.org/ — unverified this session
- Access: Web, PDF; JSON/REST API for recommendations exists.
- Cost: Free.
- License: Public domain.
- Pediatric relevance: Pediatric screening recommendations (vision, depression in adolescents, lipid, etc.). The letter grades also matter for billing: grade A and B recommendations drive ACA preventive-services coverage.
- Priority: MUST-HAVE.
5.4 CDS Connect (AHRQ)
- URL: https://cds.ahrq.gov/cdsconnect — unverified this session
- Access: Web; CQL/FHIR artifacts free to download.
- Cost: Free.
- License: Per-artifact, mostly Apache 2.0 / CC.
- Pediatric relevance: Pre-built CDS artifacts incl. pediatric BMI, adolescent depression screening, opioid.
- Priority: MUST-HAVE for CDS scaffolding.
5.5 AHRQ PSNet, TeamSTEPPS, SOPS
https://psnet.ahrq.gov/ — patient-safety guidance. WAIT.
6. CMS
6.1 CMS Public Use Files / data.cms.gov
https://data.cms.gov/ — Socrata API, bulk CSV, public domain. Skews adult; useful for Medicaid peds provider directories. WAIT.
6.2 Medicaid & CHIP (T-MSIS DataHub)
- URL: https://www.medicaid.gov/dq-atlas/ ; https://data.medicaid.gov/ — unverified this session
- Access: Web reports; researcher access to T-MSIS Analytic Files via DUA.
- Cost: Free public; researcher tier requires DUA.
- Pediatric relevance: Medicaid/CHIP cover ~40% of US kids; utilization & quality measures (CMS Child Core Set) are key.
- Priority: NICE-TO-HAVE.
6.3 CMS Child Core Set (Quality Measures)
- URL: https://www.medicaid.gov/medicaid/quality-of-care/performance-measurement/adult-and-child-health-care-quality-measures/ — unverified this session
- Access: PDF, value sets in VSAC.
- License: Public domain (CMS-authored portions).
- Pediatric relevance: Pediatric quality measures we may need to report (well-child visits, immunization status, BMI, lead screening, ADHD follow-up).
- Priority: MUST-HAVE for value-based contracting.
6.4 CMS NPPES (NPI Registry)
- URL: https://npiregistry.cms.hhs.gov/ — unverified this session
- Access: REST API, bulk monthly download.
- Cost: Free.
- License: Public domain.
- Pediatric relevance: Provider directory backbone.
- Priority: MUST-HAVE.
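
A sketch of a directory query against the NPPES REST API. The `version=2.1` parameter and the `taxonomy_description`, `state`, `limit`, and `skip` fields are used as I understand them from the public API help page; confirm them before depending on this. `pediatric_provider_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

NPPES_API = "https://npiregistry.cms.hhs.gov/api/"


def pediatric_provider_url(state: str, limit: int = 50, skip: int = 0) -> str:
    """Build (but do not send) a search for pediatric-taxonomy providers."""
    params = {
        "version": "2.1",
        "taxonomy_description": "Pediatrics",
        "state": state.upper(),
        "limit": limit,
        "skip": skip,
    }
    return f"{NPPES_API}?{urlencode(params)}"
```

For anything beyond spot lookups, the monthly bulk file is likely the better ingestion path; the API suits validating a single NPI at onboarding time.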
6.5 HCPCS Level II
- URL: https://www.cms.gov/medicare/coding-billing/healthcare-common-procedure-system — unverified this session
- Access: Quarterly Excel/Text downloads.
- Cost: Free.
- License: Public domain (Level II); CPT (Level I) is NOT free — AMA license required.
- Priority: MUST-HAVE (vaccines, supplies billing).
6.6 CMS Promoting Interoperability / ONC USCDI
- URL: https://www.healthit.gov/isa/united-states-core-data-interoperability-uscdi — unverified this session
- Access: Web, downloadable spec.
- License: Public domain.
- Pediatric relevance: USCDI v3+ defines required EHR data classes — we comply with this for ONC certification.
- Priority: MUST-HAVE (compliance).
7. State / Local Public Health
7.1 State Immunization Information Systems (IIS)
See §2.7. Per-state. MUST-HAVE state-by-state.
7.2 State Newborn Screening Programs
See §4.6. MUST-HAVE for newborn workflows.
7.3 State Electronic Lab Reporting (ELR) & Syndromic Surveillance
- URL (CDC orchestrator): https://www.cdc.gov/nssp/ — unverified this session
- Access: Per-state HL7 v2 ORU connection.
- Pediatric relevance: Required public-health reporting (reportable conditions: measles, pertussis, lead, etc.).
- Priority: MUST-HAVE for ONC certification (Public Health Reporting criteria).
7.4 PRAMS
https://www.cdc.gov/prams/ — pregnancy risk assessment; public-use free. WAIT.
7.5 State Open-Data Portals
Socrata/ArcGIS hubs (data.ca.gov, health.data.ny.gov, data.texas.gov, etc.). WAIT per launch state.
8. Standardized Vocabularies & Code Sets
| Vocabulary | URL | Cost | License | Cadence | Format | Priority |
|---|---|---|---|---|---|---|
| ICD-10-CM (diagnoses) | https://www.cms.gov/medicare/icd-10/2026-icd-10-cm | Free | Public domain | Annual (Oct 1) | XML, TXT, Excel | MUST-HAVE |
| ICD-10-PCS (inpatient procedures) | CMS | Free | Public domain | Annual | XML, TXT | WAIT (outpatient peds rarely uses) |
| SNOMED CT US Edition | https://www.nlm.nih.gov/healthit/snomedct/us_edition.html | Free for US use | NLM/SNOMED Intl. Affiliate License | Twice/year (March, September) | RF2 | MUST-HAVE |
| LOINC (labs, observations) | https://loinc.org/downloads/ | Free | LOINC License (permissive, attribution) | Twice/year | CSV, OWL, FHIR | MUST-HAVE |
| RxNorm | https://www.nlm.nih.gov/research/umls/rxnorm/ | Free | Unrestricted | Monthly | RRF, JSON | MUST-HAVE |
| NDC | OpenFDA / FDA | Free | Public domain | Daily | JSON, Excel | MUST-HAVE |
| CVX/MVX (vaccines) | CDC IIS Standards | Free | Public domain | Continuous | XML/HTML | MUST-HAVE |
| HCPCS Level II | CMS | Free | Public domain | Quarterly | Excel | MUST-HAVE |
| CPT (Level I procedures) | AMA | PAID | AMA copyright; per-user license fees | Annual | proprietary | MUST-HAVE but paid — budget for AMA license |
| HL7 FHIR R4 / R4B / R5 | https://www.hl7.org/fhir/ | Free | CC0 (FHIR core) | Periodic | StructureDefinitions JSON/XML | MUST-HAVE |
| US Core IG | https://hl7.org/fhir/us/core/ | Free | CC0 | Yearly | FHIR IG | MUST-HAVE |
| USCDI | https://www.healthit.gov/isa/uscdi | Free | Public domain | Annual | PDF/JSON | MUST-HAVE |
| C-CDA | https://www.hl7.org/implement/standards/product_brief.cfm?product_id=492 | Free (HL7 membership not required for read) | HL7 license | Periodic | XML | NICE-TO-HAVE |
| UCUM (units) | https://ucum.org/ | Free | Open-source license | Periodic | XML | MUST-HAVE |
| MeSH | NLM | Free | Public domain | Annual | XML/RDF | NICE-TO-HAVE |
| MedDRA (regulatory AE terminology) | https://www.meddra.org/ | PAID for commercial | MSSO license tiered by revenue | Twice/year | ASCII | WAIT |
| ICF (functioning/disability) | WHO | Free | WHO license | Periodic | XML | WAIT |
CPT is NOT public domain. AMA charges per-user (or per-EHR-seat) license fees. Budget this into the rate card. Some workarounds: use the free CMS HCPCS Level II for vaccines/supplies; license CPT through a clearinghouse partner (Change Healthcare/Office Ally) where their distribution license can cover end-users.
9. Pediatric-Specific Guideline Documents (Free / Near-Free Subset)
| Guideline | Source | URL | Cost | Priority |
|---|---|---|---|---|
| CDC/AAP Childhood Immunization Schedule | CDC | cdc.gov/vaccines/hcp/imz-schedules | Free | MUST-HAVE |
| ACIP Vaccine Recommendations & Statements | CDC | cdc.gov/vaccines/acip | Free | MUST-HAVE |
| CDC Growth Charts (2-20) | CDC | cdc.gov/growthcharts | Free | MUST-HAVE |
| WHO Growth Standards (0-2) | WHO/CDC | who.int/tools/child-growth-standards | Free | MUST-HAVE |
| Bright Futures Periodicity Schedule | AAP/HRSA | brightfutures.aap.org | Free (schedule PDF) | MUST-HAVE |
| Bright Futures Pre-Visit Questionnaires | AAP/HRSA | brightfutures.aap.org/materials-and-tools | Free | MUST-HAVE |
| Bright Futures Guidelines, 4th Ed (full book) | AAP | shop.aap.org | PAID | flag in /commercial.md |
| AAP Red Book | AAP | redbook.solutions.aap.org | PAID | flag in /commercial.md |
| USPSTF Recommendations | AHRQ | uspreventiveservicestaskforce.org | Free (incl. API) | MUST-HAVE |
| CDC Lead Screening Recommendations | CDC | cdc.gov/nceh/lead | Free | MUST-HAVE |
| CDC Developmental Milestones (Learn the Signs) | CDC | cdc.gov/ncbddd/actearly | Free | MUST-HAVE |
| NHLBI Pediatric BP Tables | NIH/NHLBI | nhlbi.nih.gov | Free | MUST-HAVE |
| NHLBI Cholesterol Pediatric Guidelines (Expert Panel 2011) | NIH/NHLBI | nhlbi.nih.gov | Free | NICE-TO-HAVE |
| NHLBI Pediatric Asthma EPR-3 / 2020 Focused Updates | NIH/NHLBI | nhlbi.nih.gov/health-topics/asthma | Free | MUST-HAVE |
| NIDA Screening Tools (S2BI, BSTAD, CRAFFT) | NIH/NIDA | nida.nih.gov | Free | NICE-TO-HAVE |
| AAP-endorsed Bilirubin Guidelines (Hyperbilirubinemia 2022) | AAP | publications.aap.org | Free PDF (open access) | MUST-HAVE |
| AAP Obesity CPG (2023) | AAP | publications.aap.org | Free open access | MUST-HAVE |
| ADHD CPG (AAP 2019) | AAP | publications.aap.org | Free open access | MUST-HAVE |
AAP has been making select CPGs free open-access on publications.aap.org. Verify each at ingest time — license may be CC-BY-NC-ND restricting commercial redistribution but permitting clinical use.
10. Other Federal Sources Worth Knowing
| Source | URL | Pediatric relevance | Priority |
|---|---|---|---|
| NIH RePORTER | https://reporter.nih.gov/ | Funded research lookup | WAIT |
| NIDDK Data Repository | https://repository.niddk.nih.gov/ | Pediatric T1D, obesity cohorts | WAIT |
| NICHD DASH | https://dash.nichd.nih.gov/ | Pediatric study data archive | WAIT |
| EPA EJScreen | https://ejscreen.epa.gov/ | Environmental risk by census tract (asthma, lead) | NICE-TO-HAVE |
| EPA AirNow API | https://docs.airnowapi.org/ | Real-time AQI for asthma alerts | NICE-TO-HAVE |
| USDA FoodData Central | https://fdc.nal.usda.gov/ | Nutrition lookup for parent app | NICE-TO-HAVE |
| SAMHSA Treatment Locator API | https://findtreatment.gov/ | Adolescent SUD referrals | WAIT |
| NIH HEAL (Helping to End Addiction Long-term) Initiative | https://heal.nih.gov/ | Adolescent OUD resources | WAIT |
| Census ACS / Decennial | https://www.census.gov/data/developers/data-sets.html | SDoH risk stratification | NICE-TO-HAVE |
| NCES School Data | https://nces.ed.gov/ | School-located care planning | WAIT |
| Social Vulnerability Index (CDC/ATSDR) | https://www.atsdr.cdc.gov/placeandhealth/svi | SDoH overlay | NICE-TO-HAVE |
11. Implementation Notes for Starlight
Ingestion architecture
- Tier 1 (launch blockers) — ICD-10-CM, RxNorm, NDC, LOINC, SNOMED CT US, CVX/MVX, HCPCS, CDC + WHO growth charts, ACIP schedule, USCDI/US Core, Bright Futures periodicity, USPSTF, CDS Connect artifacts, MedlinePlus Connect, DailyMed, NPPES.
- Tier 2 (v1.x AI grounding) — PubMed/PMC OA, ClinicalTrials.gov, NHANES, NSCH, MMWR, AAP open-access CPGs, NHLBI BP tables, "Learn the Signs."
- Tier 3 (analytics/research) — HCUP-KID (paid, DUA), MEPS, T-MSIS, YRBSS, SVI.
- State-by-state rollout — IIS, ELR, syndromic surveillance, newborn screen follow-up — per launch jurisdiction.
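
One way to keep the tiering machine-readable is a small source registry that the ingestion scheduler can filter. This is only a sketch of a possible shape: the `Source` fields and `sources_for_tier` helper are hypothetical, and the entries shown are a handful taken from the tiers above rather than the full list.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Source:
    name: str
    tier: int               # 1 = launch blocker, 2 = v1.x grounding, 3 = analytics
    paid: bool = False      # True when acquisition has a fee
    needs_dua: bool = False  # True when a data use agreement is required


REGISTRY: List[Source] = [
    Source("ICD-10-CM", tier=1),
    Source("RxNorm", tier=1),
    Source("DailyMed", tier=1),
    Source("PubMed/PMC OA", tier=2),
    Source("NHANES", tier=2),
    Source("HCUP-KID", tier=3, paid=True, needs_dua=True),
    Source("MEPS", tier=3),
]


def sources_for_tier(tier: int, registry: List[Source] = REGISTRY) -> List[str]:
    """Names of sources at or below the given tier, in registry order."""
    return [s.name for s in registry if s.tier <= tier]
```

Keeping `paid` and `needs_dua` on the record means the same registry can also drive the licensing checklist in the next subsection.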
Licensing minefields to track
- CPT — budget AMA license.
- MedDRA — only if we go regulatory/PV. Skip otherwise.
- SNOMED CT — free for US use only; international expansion requires SNOMED International affiliate license.
- StatPearls — CC-BY-NC-ND blocks commercial redistribution.
- A.D.A.M. content in MedlinePlus — not redistributable.
- HCUP — DUA prohibits re-identification & commercial redistribution; analytics output OK with caveats.
- Bright Futures — AAP-copyrighted; HRSA grant terms generally permit reproduction with attribution; verify each asset.
- AAP CPGs — per-paper open-access status varies; verify CC license at ingest.
Verification ToDo
Before crawl/ingest, re-confirm every URL flagged unverified this session and capture: response 200 + content-type, license/ToS snapshot, rate-limit / robots.txt posture.
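
The checks above could be scripted roughly as follows, using only the standard library. This is a sketch under assumptions: the user-agent string is a placeholder, `probe` and `robots_allows` are hypothetical helpers, and some servers reject `HEAD`, in which case a `GET` fallback would be needed. License/ToS snapshots are not covered here.

```python
from dataclasses import dataclass
from typing import Callable, Optional
from urllib import error, request, robotparser

USER_AGENT = "starlight-verifier/0.1"  # placeholder identifier


@dataclass
class Probe:
    url: str
    status: Optional[int]
    content_type: Optional[str]
    ok: bool


def probe(url: str, opener: Callable = request.urlopen, timeout: float = 10.0) -> Probe:
    """Send a HEAD request and record status and Content-Type."""
    req = request.Request(url, method="HEAD", headers={"User-Agent": USER_AGENT})
    try:
        with opener(req, timeout=timeout) as resp:
            status, ctype = resp.status, resp.headers.get("Content-Type")
    except error.HTTPError as exc:
        # 4xx/5xx responses still carry a status code worth logging.
        status, ctype = exc.code, exc.headers.get("Content-Type")
    except error.URLError:
        status, ctype = None, None
    return Probe(url, status, ctype, ok=(status == 200))


def robots_allows(robots_txt: str, url: str, agent: str = USER_AGENT) -> bool:
    """Check an already-downloaded robots.txt body against a URL."""
    parser = robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)
```

The `opener` parameter exists so the function can be exercised with a stub in tests instead of hitting live .gov hosts.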
Last updated: 2026-05-07.