On October 20, 2025, Anthropic launched Claude for Life Sciences with explicit positioning for regulatory affairs managers, clinical coordinators, and scientists. On January 11, 2026, the company expanded the offering on both axes: a new Claude for Healthcare product with HIPAA-ready Claude for Enterprise access and connectors for the CMS Coverage Database, ICD-10, and the National Provider Identifier Registry; and an expanded Claude for Life Sciences with connectors for Medidata, ClinicalTrials.gov, bioRxiv, medRxiv, Open Targets, ChEMBL, ToolUniverse, and Owkin's Pathology Explorer. In April 2026, Anthropic acquired AI biotech startup Coefficient Bio in a reported $400M stock deal. On May 14, 2026, Anthropic announced a four-year, $200M partnership with the Gates Foundation, targeting vaccine candidate screening (starting with polio), therapy development for high-burden diseases like HPV and preeclampsia, and disease forecasting through the Institute for Disease Modeling; the full partnership also spans education and economic mobility.

Between those announcements, life sciences and healthcare adoption of LLMs moved from "interesting research tool" to operational infrastructure for regulated work. The compliance posture across most organizations has not kept pace.

This piece walks through the regulatory boundaries that apply when an LLM enters a regulated workflow in pharma, medical device, biotech, or clinical research. It is not a generic AI governance overview. It is the framework for deciding which specific FDA, EU, and ICH requirements attach to which specific use cases.

What Anthropic Just Endorsed

Claude for Life Sciences expanded the official capability list to cover literature reviews and hypothesis generation, study protocol and SOP drafting, bioinformatics and genomic data analysis, clinical trial operations, regulatory submission drafting and review, and compliance data compilation. The connector ecosystem now includes Benchling (electronic lab notebooks), BioRender, PubMed, Wiley's Scholar Gateway, Synapse.org, 10x Genomics, Medidata (trial enrollment and site performance), ClinicalTrials.gov, ToolUniverse (over 600 vetted scientific tools), bioRxiv and medRxiv preprints, Open Targets, ChEMBL, and Owkin's Pathology Explorer for tissue image analysis. Existing general-purpose connectors already covered Google Workspace, Microsoft 365, Databricks, and Snowflake.

Claude for Healthcare added operationally focused connectors for the CMS Local and National Coverage Determination database, ICD-10 diagnosis and procedure codes (sourced from CMS and CDC), and the National Provider Identifier Registry, among others, alongside HIPAA-ready Claude for Enterprise access. Two new Healthcare Agent Skills shipped with the launch: FHIR development and a sample prior authorization review skill. Life sciences Agent Skills additions include scientific problem selection, Allotrope conversion for instrument data, scVI-tools and Nextflow deployment for bioinformatics, and a sample clinical trial protocol draft generation skill that accounts for FDA guidelines and competitive landscape.

The Gates Foundation announcement deepens the trajectory. Reuters reported on May 14, 2026 that Anthropic and the Gates Foundation pledged $200 million over four years to back AI public goods and applications in health and education, with research centers using Claude to predict drug candidates for HPV and preeclampsia among the initiatives under consideration.

The implication for regulated workflows: the use cases Anthropic is now actively endorsing and tooling for are use cases that cross into GxP, GCP, HIPAA, and medical device territory.

Where Most Companies Get the Scoping Wrong

The most common compliance error in early LLM adoption is treating "AI" as a single regulatory question. It is not. The relevant regulations attach to specific use cases, not to the model itself.

A foundation model accessed via API is a third-party computerized system. Whether and how it falls under regulatory scope is determined by what records or decisions it produces, whether those records or decisions enter a regulated process, and the intended use and risk classification of that process.

The same Claude API call can be regulated under 21 CFR Part 11, FDA CSA, EU AI Act, IEC 62304, or none of the above, depending entirely on what the output does next.

The Five Compliance Boundaries

Boundary 1: Research output vs. GxP record

An LLM-generated literature summary used to inform a hypothesis is research output. The same summary saved into an electronic lab notebook supporting a GLP study or a CMC submission becomes part of a GxP record. The moment that transition happens, the record is subject to ALCOA+ data integrity expectations and, if signed or used as an attributable record, 21 CFR Part 11.

The boundary is the record system, not the model.
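That boundary can be made mechanical. A minimal sketch, assuming a hypothetical inventory of destination systems and a human-verification flag (all names here are illustrative, not part of any Anthropic or GxP tooling): LLM output is routed to a regulated record system only after documented substantive review.

```python
from dataclasses import dataclass

# Hypothetical inventory: which destination systems hold GxP records.
REGULATED_SYSTEMS = {"eln": True, "ebr": True, "personal_notes": False, "wiki": False}

@dataclass
class LlmOutput:
    text: str
    human_verified: bool  # substantive review against source data completed?

def route_output(output: LlmOutput, destination: str) -> str:
    """Allow writes to regulated record systems only after documented review.

    The GxP boundary attaches at the record system, not at the model: the same
    text is unregulated in a scratch wiki and a GxP record once it enters the ELN.
    """
    if destination not in REGULATED_SYSTEMS:
        raise ValueError(f"unknown destination: {destination}")
    if REGULATED_SYSTEMS[destination] and not output.human_verified:
        return "blocked: unverified LLM output may not enter a GxP record"
    return f"written to {destination}"
```

The point of the sketch is the shape of the control, not the implementation: the gate lives at the record-system boundary, so the same model call needs no gate at all when its output stays in unregulated scratch space.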

Boundary 2: Production and QMS software vs. device function

FDA's February 3, 2026 guidance, "Computer Software Assurance for Production and Quality Management System Software" (Level 2 revision, title updated to align with QMSR), explicitly covers software used as part of production or the quality management system. It explicitly excludes software that is itself a device function. SaMD and SiMD remain under the General Principles of Software Validation and IEC 62304.

LLM use cases that fall on the production/QMS side include batch record review support, complaint triage drafting, CAPA narrative drafting, training material generation, and SOP authoring assistance. CSA's risk-based assurance approach applies. Assurance activities are scaled to process risk, with high-risk uses receiving deeper testing and documentation and lower-risk uses receiving lighter evidence.

LLM use cases that fall on the device function side include any model directly providing a clinical recommendation, diagnostic output, or therapeutic decision support that meets the SaMD definition under IMDRF or the Clinical Decision Support criteria under FDA's January 6, 2026 Clinical Decision Support guidance (re-issued January 29, 2026). These are device territory. IEC 62304 software lifecycle requirements apply. If the model is AI/ML-enabled and integrated into a cleared device, FDA's predetermined change control plan (PCCP) framework becomes relevant.

The boundary is the intended use of the output, not the underlying technology.

Boundary 3: Configured product vs. custom integration

GAMP 5 Second Edition (ISPE, July 2022) software categorization matters here. A direct Claude API integration into a custom internal workflow is typically Category 5 (custom application). A configured commercial product that embeds an LLM and is sold as a regulated tool is typically Category 4. Off-the-shelf chat use of a hosted assistant for non-regulated tasks may sit at Category 3 if the use is constrained and the output never enters a regulated record.

The GAMP categorization drives the depth of supplier assessment, qualification testing, and ongoing periodic review obligations.
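The categorization logic above can be written down as a small decision function. The predicates are simplifications; the real determination follows GAMP 5 Second Edition and the supplier facts, so treat this as a sketch of the decision order, not a substitute for assessment.

```python
def gamp_category(custom_integration: bool, configured_product: bool,
                  output_enters_regulated_record: bool) -> str:
    """Illustrative GAMP 5 categorization for LLM deployments.

    Category 5: custom application (direct API integration into an internal workflow).
    Category 4: configured commercial product embedding an LLM.
    Category 3: constrained off-the-shelf use whose output never enters a regulated record.
    """
    if custom_integration:
        return "Category 5"
    if configured_product:
        return "Category 4"
    if not output_enters_regulated_record:
        return "Category 3"
    # Off-the-shelf chat feeding a regulated record breaks the Category 3 assumption.
    return "reassess: off-the-shelf use feeding a regulated record needs review"
```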

Boundary 4: EU AI Act intended-purpose test

Regulation (EU) 2024/1689 phases in its high-risk obligations: for Annex III systems they become enforceable August 2, 2026, and for AI systems that are safety components of products under Union harmonisation legislation (Annex I, which includes the MDR and IVDR) they apply from August 2, 2027. The classifications most relevant to life sciences are AI safety components of products regulated under the EU MDR or IVDR (high-risk via Article 6(1)), AI systems used for biometric categorization of natural persons, and AI systems used in employment for screening candidates (relevant to clinical investigator selection in some scenarios); the latter two are Annex III categories.

The MDR/IVDR safety component test is the one most life sciences companies will face. A standalone LLM used for SOP drafting does not meet this test. A model integrated into a medical device under MDR Rule 11 likely does. The intended purpose declaration drives the analysis.

Annex IV technical documentation is the deliverable required for high-risk systems. Conformity assessment must be completed before the system is placed on the EU market or put into service for high-risk applications.

Boundary 5: Clinical investigation use vs. supportive function

ICH E6(R3), the Step 4 final guideline adopted January 6, 2025 and effective in the EU on July 23, 2025 under Regulation (EU) No 536/2014, applies to clinical investigations. LLM use cases inside trial conduct need to be assessed for impact on source data quality and ALCOA principles (Annex 1), investigator oversight, sponsor responsibilities for vendor qualification, and essential records and their retention.

An LLM used to draft a protocol synopsis sits outside the trial conduct envelope. An LLM used to triage adverse event narratives, generate medical coding suggestions reviewed by a physician, or summarize site monitoring findings sits inside it. The latter requires documented qualification, defined human oversight, and traceability.

The HIPAA Layer

Independent of the five boundaries above, HIPAA applies any time Protected Health Information enters the workflow. Claude for Healthcare, launched January 11, 2026, makes Claude available through HIPAA-ready Claude for Enterprise products for providers, payers, and health tech companies. For life sciences companies, the HIPAA question arises more often than most teams anticipate: real-world evidence studies that touch identifiable patient records, expanded access programs, patient registries, and adverse event narratives that retain identifying detail all routinely involve PHI.

Whenever PHI flows through the LLM, the company is operating as a covered entity or business associate for that flow, and the HIPAA Privacy Rule (45 CFR Part 164 Subpart E) and Security Rule (45 CFR Part 164 Subpart C) apply. The operational minimums are a business associate agreement covering the AI service, minimum-necessary data flows into the model, access controls on every connector that can reach PHI, and a current security risk analysis that accounts for the new system.

Two HIPAA nuances are worth flagging. First, de-identification under the Safe Harbor method (45 CFR 164.514(b)(2)) removes data from HIPAA scope, but the bar is high: eighteen identifier categories must be removed and the residual risk of re-identification must be very small. LLM-mediated de-identification is itself a workflow that needs validation; foundation models without specific tooling are not reliable at the eighteen-identifier removal task and should not be treated as a de-identification engine without validated controls. Second, the HIPAA security risk analysis is an annual obligation, not a one-time activity. Adding an AI system to the environment triggers refresh.
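An independent residual-identifier check downstream of the model is one such control. A deliberately minimal sketch covering a handful of the eighteen Safe Harbor categories (the patterns are illustrative; a production control would use a validated de-identification tool, and this check is itself part of the workflow that needs validation):

```python
import re

# Illustrative detectors for a few Safe Harbor identifier categories.
# Deliberately incomplete: a real control covers all eighteen categories
# with a validated de-identification tool, not ad hoc regexes.
RESIDUAL_IDENTIFIER_PATTERNS = {
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "date": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
}

def residual_identifiers(text: str) -> list[str]:
    """Return the identifier categories still detectable in supposedly de-identified text."""
    return [name for name, pat in RESIDUAL_IDENTIFIER_PATTERNS.items()
            if pat.search(text)]
```

Any non-empty result means the text cannot be treated as outside HIPAA scope, regardless of what the upstream model was asked to do.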

The Framework Matrix

The following matrix shows which framework triggers for common LLM use cases in life sciences. HIPAA applies orthogonally whenever PHI is involved; treat the HIPAA Layer above as a fixed overlay on the matrix below. Apply professional judgment to fact patterns; the matrix is a starting point, not a substitute for assessment.

| Use case | Part 11 | CSA | EU AI Act | IEC 62304 / PCCP | ICH E6(R3) | ALCOA+ | Vendor Qual |
|---|---|---|---|---|---|---|---|
| Literature summary, not stored in regulated record | — | — | — | — | — | — | — |
| SOP drafting, reviewed and approved by SME | P | Y | — | — | — | — | — |
| Batch record review narrative, attached to EBR | Y | Y | — | — | — | Y | Y |
| Complaint triage drafting, QA reviewed | P | Y | — | — | — | P | Y |
| LLM integrated into SaMD for clinical decision | — | — | Y | Y | P | Y | Y |
| Clinical protocol drafting, clinical ops reviewed | — | — | — | — | — | — | Y |
| Adverse event narrative review during trial conduct | P | P | — | — | Y | Y | Y |
| Training material generation, used for GxP record training | Y | Y | — | — | — | Y | Y |
| Regulatory submission drafting, RA reviewed | P | — | — | — | P | P | Y |

Y = triggers · P = possibly triggers (fact-dependent) · — = does not trigger
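Encoded as data, the matrix can drive a first-pass scoping check. A sketch covering three rows (the Y/P column placements follow the boundary discussion above and are a starting point, not an assessment):

```python
FRAMEWORKS = ["Part 11", "CSA", "EU AI Act", "IEC 62304 / PCCP",
              "ICH E6(R3)", "ALCOA+", "Vendor Qual"]

# "Y" = triggers, "P" = possibly triggers (fact-dependent), "-" = does not trigger.
MATRIX = {
    "batch_record_review_narrative": ["Y", "Y", "-", "-", "-", "Y", "Y"],
    "complaint_triage_drafting":     ["P", "Y", "-", "-", "-", "P", "Y"],
    "llm_in_samd_clinical_decision": ["-", "-", "Y", "Y", "P", "Y", "Y"],
}

def triggered(use_case: str) -> dict[str, str]:
    """Frameworks that trigger ('Y') or possibly trigger ('P') for a use case."""
    row = MATRIX[use_case]
    return {fw: flag for fw, flag in zip(FRAMEWORKS, row) if flag != "-"}
```

Every "P" in the output is a flag for a fact-specific assessment, not a conclusion; the value of encoding the matrix is that no framework gets silently skipped.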

Four Patterns That Catch Most Companies Off Guard

The "drafting tool" assumption

Companies often assume that because a human reviews and approves LLM output, the LLM is just a drafting tool and no validation applies. This is partially correct. The risk depends on what the human is reviewing for and whether the LLM materially shapes the final record.

If the human reviews for typos and approves substantively unchanged content, the LLM is materially the author and the record's quality depends on the model. If the human substantively rewrites or substantively verifies against source data, the LLM is closer to a drafting aid and the burden is lighter. The boundary is the level of substantive review, not the existence of review.

The connector multiplier

When Claude connects to Benchling, the integration touches the electronic lab notebook directly. If the ELN is part of the regulated record system, that connection creates a new data flow that needs to be assessed under the lab's data integrity and Part 11 procedures. The connector is a feature; your validation scope expanded the moment you turned it on.

The same logic applies to Medidata, Synapse, Snowflake, Databricks, and any system that the LLM can read from or write to. The Medidata connector is particularly notable for life sciences sponsors: trial enrollment and site performance data is operational clinical trial data, and an LLM with read access to that data is operating inside the ICH E6(R3) envelope.

One compliance-positive worth noting: Anthropic's documentation indicates platform connectors like Benchling honor existing access permissions rather than bypassing them, which means the connector inherits the source system's authorization model rather than creating a new bypass route. That reduces, but does not eliminate, the validation lift. The data flow, audit trail completeness on the LLM side, and any new aggregation patterns still need assessment. And custom connectors built outside Anthropic's published set introduce an additional vendor qualification surface: each custom connector is effectively a new integration that needs its own supplier assessment under GAMP 5 principles.
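One way to keep that surface assessable is a per-connector data-flow inventory that forces the same questions for every connection. A sketch with illustrative fields:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ConnectorFlow:
    name: str
    direction: str            # "read", "write", or "read/write"
    regulated_source: bool    # does the connected system hold regulated records?
    custom_built: bool        # custom connectors add a supplier-qualification surface

def assessment_actions(flow: ConnectorFlow) -> list[str]:
    """Minimum assessment items a new connector adds to validation scope."""
    actions = ["document data flow", "review audit trail completeness"]
    if flow.regulated_source:
        actions.append("assess under data integrity / Part 11 procedures")
    if "write" in flow.direction:
        actions.append("define human review before writes land in records")
    if flow.custom_built:
        actions.append("supplier assessment for the custom connector (GAMP 5)")
    return actions
```

The checklist is intentionally minimal; the discipline is that turning a connector on creates an inventory entry, so the validation scope expands on paper at the same moment it expands in fact.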

Agent Skills change the governance surface

Anthropic's Agent Skills framework introduces governance considerations that prompt-only LLM use does not. Skills run in code execution environments where Claude has filesystem access and bash command execution. They can invoke tools, execute scripts, and access bundled resources. This is more capability than a prompt and more risk surface than a prompt.

Three specifics matter for regulated use. First, the execution environment is in validation scope: a Skill that runs scripts and shell commands is software performing the task, not a prompt, and the assurance evidence has to cover what the Skill actually executes. Second, Skill definitions are versioned artifacts; like any configured software item, a change to a Skill needs change control and defined re-testing triggers. Third, the sharing and distribution model determines who can introduce or modify a Skill in your environment, so approval workflows have to match how Skills actually propagate on the platform.

None of this disqualifies Skills for regulated use. It means the validation scope expands when Skills are introduced and the governance approach needs to be designed for the platform's actual sharing model rather than the assumed one.

Foundation model updates

The validation you completed today is against the model version you tested. Foundation models update. Anthropic's release cadence has produced multiple model generations in the past year, with capability improvements that are not isolated to model size. If your validation evidence references one model version and you are now calling another, your evidence is not current.

A periodic re-validation cadence and a change notification protocol with the supplier are both reasonable expectations. For SaMD applications, this same problem is what FDA's PCCP framework was designed to address. For non-device GxP use, your internal change control procedure must handle it.

Operational reality: if your supplier qualification record lists "Claude Sonnet 4" and your production API calls now hit Claude Opus 4.7, your evidence has a gap. The fix is not difficult; it requires a documented change control protocol that triggers a defined re-assessment when the underlying model version changes.
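The drift check itself is small. A sketch, assuming you capture the model identifier returned with each API response (the Anthropic Messages API includes the serving model's ID in its response body) and maintain a qualified-version list as part of the validation evidence; the version string below is one real Anthropic model ID used as an example:

```python
# Versions covered by the current validation evidence (example entry).
QUALIFIED_MODELS = {"claude-sonnet-4-20250514"}

def check_model_version(response_model: str) -> str:
    """Compare the model that actually served a call against the qualified list.

    A mismatch does not mean the output is wrong; it means the validation
    evidence no longer covers the system in use, so change control triggers.
    """
    if response_model in QUALIFIED_MODELS:
        return "covered"
    return f"change control triggered: {response_model} is not in the qualified set"
```

Run at call time or as a periodic audit over logged responses, this turns "our evidence references an old version" from a finding discovered in an inspection into an event your own change control catches first.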

What to Do Before Production Use

A practical assessment sequence for any LLM use case being moved from experimentation to regulated production:

  1. Define intended use precisely. Not "AI for compliance," but specifically what input, what output, what decision the output supports.
  2. Classify against the five boundaries. Which side of each boundary does the intended use sit on?
  3. Map applicable frameworks. Apply the matrix. List every framework that triggers.
  4. Conduct a risk assessment. Process risk under CSA, software risk under GAMP 5, patient risk under ISO 14971 if device-adjacent.
  5. Qualify the supplier. The foundation model provider is a third-party supplier. A TPS/COTS Fit-for-Use evaluation under GAMP 5 Second Edition principles is the appropriate framework for hosted API access.
  6. Document data flows. Every data ingress and egress. Every connector. Every system the LLM reads from or writes to.
  7. Define human oversight. Who reviews what, against what source, with what authority to reject.
  8. Establish change control. Model version notification, periodic re-validation cadence, re-testing triggers.
  9. Build the evidence package. Validation summary, test scripts, supplier assessment, risk assessment, change control protocol.
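The nine steps above can be enforced as a release gate that blocks production use until each step has a documented artifact. A sketch with abbreviated step names:

```python
# Abbreviated from the nine-step assessment sequence above, in order.
ASSESSMENT_STEPS = [
    "intended_use_defined", "boundaries_classified", "frameworks_mapped",
    "risk_assessment", "supplier_qualified", "data_flows_documented",
    "human_oversight_defined", "change_control_established", "evidence_package_built",
]

def release_gate(completed: set[str]) -> tuple[bool, list[str]]:
    """Return (ready_for_production, steps still missing), in sequence order."""
    missing = [step for step in ASSESSMENT_STEPS if step not in completed]
    return (not missing, missing)
```

Wiring this into a deployment pipeline makes "we skipped supplier qualification" a blocked release rather than a retrospective finding.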

For high-risk use cases under EU AI Act Annex III, add the Annex IV technical documentation deliverable and the conformity assessment pathway selection. Both become enforceable obligations August 2, 2026.

Where This Lands

Companies already doing this work at scale show what disciplined regulated deployment looks like. Novo Nordisk's NovoScribe, built with Claude Code, Amazon Bedrock, and MongoDB Atlas, has cut clinical study report production from over 10 weeks to under 10 minutes for the AI draft, with documentation reportedly receiving positive feedback from regulators. Pfizer's VOX platform, integrated with Claude through Amazon Bedrock, has reclaimed roughly 16,000 hours annually for its research teams. Bluenote, a specialized life sciences AI vendor, builds agents on Claude Platform and reports 50 to 75 percent acceleration in regulatory document production for its biopharma clients.

Two things stand out about these deployments. First, they are not turnkey claude.ai usage; they are validated, integrated, vendor-qualified builds that treat the LLM as a third-party component within a controlled architecture. Second, multi-tier vendor relationships (where the regulated company contracts with a specialized vendor, who in turn contracts with Anthropic) require multi-tier supplier qualification under GAMP 5 Second Edition principles. Anthropic is the sub-supplier; the qualification record needs to address both tiers, including the controls each layer commits to and the data flow between them.

The companies that will get this right are not the ones that wait for FDA or EMA to publish a definitive LLM-specific guidance. No such guidance is coming in the near term that will resolve every boundary above. The existing frameworks (21 CFR Part 11, FDA CSA, GAMP 5, ICH E6(R3), the EU AI Act, IEC 62304, HIPAA, and the data integrity guidance from MHRA, PIC/S, and WHO) are adequate to govern LLM use cases when correctly scoped.

The work is the scoping.

Where RegulatoryIQ Fits

The five boundaries above map directly to the template packages and AI analysis tools we have already built. If you are doing this scoping work yourself, the EU AI Act Assessment Pack and the TPS/COTS FFU Toolkit are the artifacts that compress the timeline.

For use-case-specific scoping, regulatory strategy on a hybrid AI deployment, or Annex IV technical documentation preparation, book a discovery call.


This analysis is general regulatory commentary. It is not legal advice and does not establish a consulting relationship. Apply professional judgment to specific fact patterns or engage qualified regulatory and legal counsel.

Zach Galloway is the founder of RegulatoryIQ, an AI-powered regulatory compliance platform for life sciences companies. He has 14+ years of experience in regulatory affairs and quality assurance for medical devices, SaMD, and clinical trials. He holds RAC, CMQ/OE, CQE, CPHQ, LSSBB, Lead Auditor (ISO 9001/13485/27001), AIGP, and ACRP-MDP certifications.

Scoping an LLM deployment into a regulated workflow? Book a discovery call or explore the EU AI Act Assessment Pack and TPS/COTS FFU Toolkit.