On October 20, 2025, Anthropic launched Claude for Life Sciences with explicit positioning for regulatory affairs managers, clinical coordinators, and scientists. On January 11, 2026, the company expanded the offering on both axes: a new Claude for Healthcare product with HIPAA-ready Claude for Enterprise access and connectors for the CMS Coverage Database, ICD-10, and the National Provider Identifier Registry; and an expanded Claude for Life Sciences with connectors for Medidata, ClinicalTrials.gov, bioRxiv, medRxiv, Open Targets, ChEMBL, ToolUniverse, and Owkin's Pathology Explorer. In April 2026, Anthropic acquired AI biotech startup Coefficient Bio in a reported $400M stock deal. On May 14, 2026, Anthropic announced a four-year, $200M partnership with the Gates Foundation, targeting vaccine candidate screening (starting with polio), therapy development for high-burden diseases like HPV and preeclampsia, and disease forecasting through the Institute for Disease Modeling; the full partnership also spans education and economic mobility.

Between those announcements, life sciences and healthcare adoption of LLMs moved from "interesting research tool" to operational infrastructure for regulated work. The compliance posture across most organizations has not kept pace.

This piece walks through the regulatory boundaries that apply when an LLM enters a regulated workflow in pharma, medical device, biotech, or clinical research. It is not a generic AI governance overview. It is the framework for deciding which specific FDA, EU, and ICH requirements attach to which specific use cases.

What Anthropic Just Endorsed

Claude for Life Sciences expanded the official capability list to cover literature reviews and hypothesis generation, study protocol and SOP drafting, bioinformatics and genomic data analysis, clinical trial operations, regulatory submission drafting and review, and compliance data compilation. The connector ecosystem now includes Benchling (electronic lab notebooks), BioRender, PubMed, Wiley's Scholar Gateway, Synapse.org, 10x Genomics, Medidata (trial enrollment and site performance), ClinicalTrials.gov, ToolUniverse (over 600 vetted scientific tools), bioRxiv and medRxiv preprints, Open Targets, ChEMBL, and Owkin's Pathology Explorer for tissue image analysis. Existing general-purpose connectors already covered Google Workspace, Microsoft 365, Databricks, and Snowflake.

Claude for Healthcare added operationally focused connectors for the CMS Local and National Coverage Determination database, ICD-10 diagnosis and procedure codes (sourced from CMS and CDC), and the National Provider Identifier Registry, among others, alongside HIPAA-ready Claude for Enterprise access. Two new Healthcare Agent Skills shipped with the launch: FHIR development and a sample prior authorization review skill. Life sciences Agent Skills additions include scientific problem selection, Allotrope conversion for instrument data, scVI-tools and Nextflow deployment for bioinformatics, and a sample clinical trial protocol draft generation skill that accounts for FDA guidelines and competitive landscape.

The Gates Foundation announcement deepens the trajectory. Reuters reported on May 14, 2026 that Anthropic and the Gates Foundation pledged $200 million over four years to back AI public goods and applications in health and education, with research centers using Claude to predict drug candidates for HPV and preeclampsia among the initiatives under consideration.

The implication for regulated workflows: the use cases Anthropic is now actively endorsing and tooling for are use cases that cross into GxP, GCP, HIPAA, and medical device territory.

Where Most Companies Get the Scoping Wrong

The most common compliance error in early LLM adoption is treating "AI" as a single regulatory question. It is not. The relevant regulations attach to specific use cases, not to the model itself.

A foundation model accessed via API is a third-party computerized system. Whether and how it falls under regulatory scope is determined by what records or decisions it produces, whether those records or decisions enter a regulated process, and the intended use and risk classification of that process.

The same Claude API call can be regulated under 21 CFR Part 11, FDA CSA, EU AI Act, IEC 62304, or none of the above, depending entirely on what the output does next.

The Five Compliance Boundaries

Boundary 1: Research output vs. GxP record

An LLM-generated literature summary used to inform a hypothesis is research output. The same summary saved into an electronic lab notebook supporting a GLP study or a CMC submission becomes part of a GxP record. The moment that transition happens, the record is subject to ALCOA+ data integrity expectations and, if signed or used as an attributable record, 21 CFR Part 11.

The boundary is the record system, not the model.
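That boundary can be made mechanical. A minimal sketch, assuming a hypothetical inventory of destination systems and a human-verification flag (all names here are illustrative, not part of any Anthropic or GxP tooling): LLM output is routed to a regulated record system only after documented substantive review.

```python
from dataclasses import dataclass

# Hypothetical inventory: which destination systems hold GxP records.
REGULATED_SYSTEMS = {"eln": True, "ebr": True, "personal_notes": False, "wiki": False}

@dataclass
class LlmOutput:
    text: str
    human_verified: bool  # substantive review against source data completed?

def route_output(output: LlmOutput, destination: str) -> str:
    """Allow writes to regulated record systems only after documented review.

    The GxP boundary attaches at the record system, not at the model: the same
    text is unregulated in a scratch wiki and a GxP record once it enters the ELN.
    """
    if destination not in REGULATED_SYSTEMS:
        raise ValueError(f"unknown destination: {destination}")
    if REGULATED_SYSTEMS[destination] and not output.human_verified:
        return "blocked: unverified LLM output may not enter a GxP record"
    return f"written to {destination}"
```

The point of the sketch is the shape of the control, not the implementation: the gate lives at the record-system boundary, so the same model call needs no gate at all when its output stays in unregulated scratch space.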

Boundary 2: Production and QMS software vs. device function

FDA's February 3, 2026 guidance, "Computer Software Assurance for Production and Quality Management System Software" (Level 2 revision, title updated to align with QMSR), explicitly covers software used as part of production or the quality management system. It explicitly excludes software that is itself a device function. SaMD and SiMD remain under the General Principles of Software Validation and IEC 62304.

LLM use cases that fall on the production/QMS side include batch record review support, complaint triage drafting, CAPA narrative drafting, training material generation, and SOP authoring assistance. CSA's risk-based assurance approach applies. Assurance activities are scaled to process risk, with high-risk uses receiving deeper testing and documentation and lower-risk uses receiving lighter evidence.

LLM use cases that fall on the device function side include any model directly providing a clinical recommendation, diagnostic output, or therapeutic decision support that meets the SaMD definition under IMDRF or the Clinical Decision Support criteria under FDA's January 6, 2026 Clinical Decision Support guidance (re-issued January 29, 2026). These are device territory. IEC 62304 software lifecycle requirements apply. If the model is AI/ML-enabled and integrated into a cleared device, FDA's predetermined change control plan (PCCP) framework becomes relevant.

The boundary is the intended use of the output, not the underlying technology.

Boundary 3: Configured product vs. custom integration

GAMP 5 Second Edition (ISPE, July 2022) software categorization matters here. A direct Claude API integration into a custom internal workflow is typically Category 5 (custom application). A configured commercial product that embeds an LLM and is sold as a regulated tool is typically Category 4. Off-the-shelf chat use of a hosted assistant for non-regulated tasks may sit at Category 3 if the use is constrained and the output never enters a regulated record.

The GAMP categorization drives the depth of supplier assessment, qualification testing, and ongoing periodic review obligations.
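The categorization logic above can be written down as a small decision function. The predicates are simplifications; the real determination follows GAMP 5 Second Edition and the supplier facts, so treat this as a sketch of the decision order, not a substitute for assessment.

```python
def gamp_category(custom_integration: bool, configured_product: bool,
                  output_enters_regulated_record: bool) -> str:
    """Illustrative GAMP 5 categorization for LLM deployments.

    Category 5: custom application (direct API integration into an internal workflow).
    Category 4: configured commercial product embedding an LLM.
    Category 3: constrained off-the-shelf use whose output never enters a regulated record.
    """
    if custom_integration:
        return "Category 5"
    if configured_product:
        return "Category 4"
    if not output_enters_regulated_record:
        return "Category 3"
    # Off-the-shelf chat feeding a regulated record breaks the Category 3 assumption.
    return "reassess: off-the-shelf use feeding a regulated record needs review"
```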

Boundary 4: EU AI Act intended-purpose test

Regulation (EU) 2024/1689 phases in its high-risk obligations: for Annex III systems they become enforceable August 2, 2026, and for AI systems that are safety components of products under Union harmonisation legislation (Annex I, which includes the MDR and IVDR) they apply from August 2, 2027. The classifications most relevant to life sciences are AI safety components of products regulated under the EU MDR or IVDR (high-risk via Article 6(1)), AI systems used for biometric categorization of natural persons, and AI systems used in employment for screening candidates (relevant to clinical investigator selection in some scenarios); the latter two are Annex III categories.

The MDR/IVDR safety component test is the one most life sciences companies will face. A standalone LLM used for SOP drafting does not meet this test. A model integrated into a medical device under MDR Rule 11 likely does. The intended purpose declaration drives the analysis.

Annex IV technical documentation is the deliverable required for high-risk systems. Conformity assessment must be completed before the system is placed on the EU market or put into service for high-risk applications.

Boundary 5: Clinical investigation use vs. supportive function

ICH E6(R3), the Step 4 final guideline adopted January 6, 2025 and effective in the EU on July 23, 2025 under Regulation (EU) No 536/2014, applies to clinical investigations. LLM use cases inside trial conduct need to be assessed for impact on source data quality and ALCOA principles (Annex 1), investigator oversight, sponsor responsibilities for vendor qualification, and essential records and their retention.

An LLM used to draft a protocol synopsis sits outside the trial conduct envelope. An LLM used to triage adverse event narratives, generate medical coding suggestions reviewed by a physician, or summarize site monitoring findings sits inside it. The latter requires documented qualification, defined human oversight, and traceability.

The HIPAA Layer

Independent of the five boundaries above, HIPAA applies any time Protected Health Information enters the workflow. Claude for Healthcare, launched January 11, 2026, makes Claude available through HIPAA-ready Claude for Enterprise products for providers, payers, and health tech companies. For life sciences companies, the HIPAA question arises more often than most teams anticipate: real-world evidence studies that touch identifiable patient records, expanded access programs, patient registries, and adverse event narratives that retain identifying detail all routinely involve PHI.

Whenever PHI flows through the LLM, the company is operating as a covered entity or business associate for that flow, and the HIPAA Privacy Rule (45 CFR Part 164 Subpart E) and Security Rule (45 CFR Part 164 Subpart C) apply. The operational minimums are a business associate agreement covering the AI service, minimum-necessary data flows into the model, access controls on every connector that can reach PHI, and a current security risk analysis that accounts for the new system.

Two HIPAA nuances are worth flagging. First, de-identification under the Safe Harbor method (45 CFR 164.514(b)(2)) removes data from HIPAA scope, but the bar is high: eighteen identifier categories must be removed and the residual risk of re-identification must be very small. LLM-mediated de-identification is itself a workflow that needs validation; foundation models without specific tooling are not reliable at the eighteen-identifier removal task and should not be treated as a de-identification engine without validated controls. Second, the HIPAA security risk analysis is an annual obligation, not a one-time activity. Adding an AI system to the environment triggers refresh.
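An independent residual-identifier check downstream of the model is one such control. A deliberately minimal sketch covering a handful of the eighteen Safe Harbor categories (the patterns are illustrative; a production control would use a validated de-identification tool, and this check is itself part of the workflow that needs validation):

```python
import re

# Illustrative detectors for a few Safe Harbor identifier categories.
# Deliberately incomplete: a real control covers all eighteen categories
# with a validated de-identification tool, not ad hoc regexes.
RESIDUAL_IDENTIFIER_PATTERNS = {
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "date": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
}

def residual_identifiers(text: str) -> list[str]:
    """Return the identifier categories still detectable in supposedly de-identified text."""
    return [name for name, pat in RESIDUAL_IDENTIFIER_PATTERNS.items()
            if pat.search(text)]
```

Any non-empty result means the text cannot be treated as outside HIPAA scope, regardless of what the upstream model was asked to do.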

The Framework Matrix

The following matrix shows which framework triggers for common LLM use cases in life sciences. HIPAA applies orthogonally whenever PHI is involved; treat the HIPAA Layer above as a fixed overlay on the matrix below. Apply professional judgment to fact patterns; the matrix is a starting point, not a substitute for assessment.

| Use case | Part 11 | CSA | EU AI Act | IEC 62304 / PCCP | ICH E6(R3) | ALCOA+ | Vendor Qual |
|---|---|---|---|---|---|---|---|
| Literature summary, not stored in regulated record | — | — | — | — | — | — | — |
| SOP drafting, reviewed and approved by SME | P | Y | — | — | — | — | — |
| Batch record review narrative, attached to EBR | Y | Y | — | — | — | Y | Y |
| Complaint triage drafting, QA reviewed | P | Y | — | — | — | P | Y |
| LLM integrated into SaMD for clinical decision | — | — | Y | Y | P | Y | Y |
| Clinical protocol drafting, clinical ops reviewed | — | — | — | — | — | — | Y |
| Adverse event narrative review during trial conduct | P | P | — | — | Y | Y | Y |
| Training material generation, used for GxP record training | Y | Y | — | — | — | Y | Y |
| Regulatory submission drafting, RA reviewed | P | — | — | — | P | P | Y |

Y = triggers · P = possibly triggers (fact-dependent) · — = does not trigger
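Encoded as data, the matrix can drive a first-pass scoping check. A sketch covering three rows (the Y/P column placements follow the boundary discussion above and are a starting point, not an assessment):

```python
FRAMEWORKS = ["Part 11", "CSA", "EU AI Act", "IEC 62304 / PCCP",
              "ICH E6(R3)", "ALCOA+", "Vendor Qual"]

# "Y" = triggers, "P" = possibly triggers (fact-dependent), "-" = does not trigger.
MATRIX = {
    "batch_record_review_narrative": ["Y", "Y", "-", "-", "-", "Y", "Y"],
    "complaint_triage_drafting":     ["P", "Y", "-", "-", "-", "P", "Y"],
    "llm_in_samd_clinical_decision": ["-", "-", "Y", "Y", "P", "Y", "Y"],
}

def triggered(use_case: str) -> dict[str, str]:
    """Frameworks that trigger ('Y') or possibly trigger ('P') for a use case."""
    row = MATRIX[use_case]
    return {fw: flag for fw, flag in zip(FRAMEWORKS, row) if flag != "-"}
```

Every "P" in the output is a flag for a fact-specific assessment, not a conclusion; the value of encoding the matrix is that no framework gets silently skipped.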

Four Patterns That Catch Most Companies Off Guard

The "drafting tool" assumption

Companies often assume that because a human reviews and approves LLM output, the LLM is just a drafting tool and no validation applies. This is partially correct. The risk depends on what the human is reviewing for and whether the LLM materially shapes the final record.

If the human reviews for typos and approves substantively unchanged content, the LLM is materially the author and the record's quality depends on the model. If the human substantively rewrites or substantively verifies against source data, the LLM is closer to a drafting aid and the burden is lighter. The boundary is the level of substantive review, not the existence of review.

The connector multiplier

When Claude connects to Benchling, the integration touches the electronic lab notebook directly. If the ELN is part of the regulated record system, that connection creates a new data flow that needs to be assessed under the lab's data integrity and Part 11 procedures. The connector is a feature; your validation scope expanded the moment you turned it on.

The same logic applies to Medidata, Synapse, Snowflake, Databricks, and any system that the LLM can read from or write to. The Medidata connector is particularly notable for life sciences sponsors: trial enrollment and site performance data is operational clinical trial data, and an LLM with read access to that data is operating inside the ICH E6(R3) envelope.

One compliance-positive worth noting: Anthropic's documentation indicates platform connectors like Benchling honor existing access permissions rather than bypassing them, which means the connector inherits the source system's authorization model rather than creating a new bypass route. That reduces, but does not eliminate, the validation lift. The data flow, audit trail completeness on the LLM side, and any new aggregation patterns still need assessment. And custom connectors built outside Anthropic's published set introduce an additional vendor qualification surface: each custom connector is effectively a new integration that needs its own supplier assessment under GAMP 5 principles.
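One way to keep that surface assessable is a per-connector data-flow inventory that forces the same questions for every connection. A sketch with illustrative fields:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ConnectorFlow:
    name: str
    direction: str            # "read", "write", or "read/write"
    regulated_source: bool    # does the connected system hold regulated records?
    custom_built: bool        # custom connectors add a supplier-qualification surface

def assessment_actions(flow: ConnectorFlow) -> list[str]:
    """Minimum assessment items a new connector adds to validation scope."""
    actions = ["document data flow", "review audit trail completeness"]
    if flow.regulated_source:
        actions.append("assess under data integrity / Part 11 procedures")
    if "write" in flow.direction:
        actions.append("define human review before writes land in records")
    if flow.custom_built:
        actions.append("supplier assessment for the custom connector (GAMP 5)")
    return actions
```

The checklist is intentionally minimal; the discipline is that turning a connector on creates an inventory entry, so the validation scope expands on paper at the same moment it expands in fact.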

Agent Skills change the governance surface

Anthropic's Agent Skills framework introduces governance considerations that prompt-only LLM use does not. Skills run in code execution environments where Claude has filesystem access and bash command execution. They can invoke tools, execute scripts, and access bundled resources. This is more capability than a prompt and more risk surface than a prompt.

Three specifics matter for regulated use. First, the execution environment is in validation scope: a Skill that runs scripts and shell commands is software performing the task, not a prompt, and the assurance evidence has to cover what the Skill actually executes. Second, Skill definitions are versioned artifacts; like any configured software item, a change to a Skill needs change control and defined re-testing triggers. Third, the sharing and distribution model determines who can introduce or modify a Skill in your environment, so approval workflows have to match how Skills actually propagate on the platform.

None of this disqualifies Skills for regulated use. It means the validation scope expands when Skills are introduced and the governance approach needs to be designed for the platform's actual sharing model rather than the assumed one.

Foundation model updates

The validation you completed today is against the model version you tested. Foundation models update. Anthropic's release cadence has produced multiple model generations in the past year, with capability improvements that are not isolated to model size. If your validation evidence references one model version and you are now calling another, your evidence is not current.

A periodic re-validation cadence and a change notification protocol with the supplier are both reasonable expectations. For SaMD applications, this same problem is what FDA's PCCP framework was designed to address. For non-device GxP use, your internal change control procedure must handle it.

Operational reality: if your supplier qualification record lists "Claude Sonnet 4" and your production API calls now hit Claude Opus 4.7, your evidence has a gap. The fix is not difficult; it requires a documented change control protocol that triggers a defined re-assessment when the underlying model version changes.
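The drift check itself is small. A sketch, assuming you capture the model identifier returned with each API response (the Anthropic Messages API includes the serving model's ID in its response body) and maintain a qualified-version list as part of the validation evidence; the version string below is one real Anthropic model ID used as an example:

```python
# Versions covered by the current validation evidence (example entry).
QUALIFIED_MODELS = {"claude-sonnet-4-20250514"}

def check_model_version(response_model: str) -> str:
    """Compare the model that actually served a call against the qualified list.

    A mismatch does not mean the output is wrong; it means the validation
    evidence no longer covers the system in use, so change control triggers.
    """
    if response_model in QUALIFIED_MODELS:
        return "covered"
    return f"change control triggered: {response_model} is not in the qualified set"
```

Run at call time or as a periodic audit over logged responses, this turns "our evidence references an old version" from a finding discovered in an inspection into an event your own change control catches first.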

What to Do Before Production Use

A practical assessment sequence for any LLM use case being moved from experimentation to regulated production:

  1. Define intended use precisely. Not "AI for compliance," but specifically what input, what output, what decision the output supports.
  2. Classify against the five boundaries. Which side of each boundary does the intended use sit on?
  3. Map applicable frameworks. Apply the matrix. List every framework that triggers.
  4. Conduct a risk assessment. Process risk under CSA, software risk under GAMP 5, patient risk under ISO 14971 if device-adjacent.
  5. Qualify the supplier. The foundation model provider is a third-party supplier. A TPS/COTS Fit-for-Use evaluation under GAMP 5 Second Edition principles is the appropriate framework for hosted API access.
  6. Document data flows. Every data ingress and egress. Every connector. Every system the LLM reads from or writes to.
  7. Define human oversight. Who reviews what, against what source, with what authority to reject.
  8. Establish change control. Model version notification, periodic re-validation cadence, re-testing triggers.
  9. Build the evidence package. Validation summary, test scripts, supplier assessment, risk assessment, change control protocol.
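The nine steps above can be enforced as a release gate that blocks production use until each step has a documented artifact. A sketch with abbreviated step names:

```python
# Abbreviated from the nine-step assessment sequence above, in order.
ASSESSMENT_STEPS = [
    "intended_use_defined", "boundaries_classified", "frameworks_mapped",
    "risk_assessment", "supplier_qualified", "data_flows_documented",
    "human_oversight_defined", "change_control_established", "evidence_package_built",
]

def release_gate(completed: set[str]) -> tuple[bool, list[str]]:
    """Return (ready_for_production, steps still missing), in sequence order."""
    missing = [step for step in ASSESSMENT_STEPS if step not in completed]
    return (not missing, missing)
```

Wiring this into a deployment pipeline makes "we skipped supplier qualification" a blocked release rather than a retrospective finding.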

For high-risk use cases under EU AI Act Annex III, add the Annex IV technical documentation deliverable and the conformity assessment pathway selection. Both become enforceable obligations August 2, 2026.

Where This Lands

Companies already doing this work at scale show what disciplined regulated deployment looks like. Novo Nordisk's NovoScribe, built with Claude Code, Amazon Bedrock, and MongoDB Atlas, has cut clinical study report production from over 10 weeks to under 10 minutes for the AI draft, with documentation reportedly receiving positive feedback from regulators. Pfizer's VOX platform, integrated with Claude through Amazon Bedrock, has reclaimed roughly 16,000 hours annually for its research teams. Bluenote, a specialized life sciences AI vendor, builds agents on Claude Platform and reports 50 to 75 percent acceleration in regulatory document production for its biopharma clients.

Two things stand out about these deployments. First, they are not turnkey claude.ai usage; they are validated, integrated, vendor-qualified builds that treat the LLM as a third-party component within a controlled architecture. Second, multi-tier vendor relationships (where the regulated company contracts with a specialized vendor, who in turn contracts with Anthropic) require multi-tier supplier qualification under GAMP 5 Second Edition principles. Anthropic is the sub-supplier; the qualification record needs to address both tiers, including the controls each layer commits to and the data flow between them.

The companies that will get this right are not the ones that wait for FDA or EMA to publish a definitive LLM-specific guidance. No such guidance is coming in the near term that will resolve every boundary above. The existing frameworks (21 CFR Part 11, FDA CSA, GAMP 5, ICH E6(R3), the EU AI Act, IEC 62304, HIPAA, and the data integrity guidance from MHRA, PIC/S, and WHO) are adequate to govern LLM use cases when correctly scoped.

The work is the scoping.

Where RegulatoryIQ Fits

The five boundaries above map directly to the template packages and AI analysis tools we have already built. If you are doing this scoping work yourself, the EU AI Act Assessment Pack and the TPS/COTS FFU Toolkit are the artifacts that compress the timeline.

For use-case-specific scoping, regulatory strategy on a hybrid AI deployment, or Annex IV technical documentation preparation, book a discovery call.


This analysis is general regulatory commentary. It is not legal advice and does not establish a consulting relationship. Apply professional judgment to specific fact patterns or engage qualified regulatory and legal counsel.

Zach Galloway is the founder of RegulatoryIQ, an AI-powered regulatory compliance platform for life sciences companies. He has 14+ years of experience in regulatory affairs and quality assurance for medical devices, SaMD, and clinical trials. He holds RAC, CMQ/OE, CQE, CPHQ, LSSBB, Lead Auditor (ISO 9001/13485/27001), AIGP, and ACRP-MDP certifications.

Scoping an LLM deployment into a regulated workflow? Book a discovery call or explore the EU AI Act Assessment Pack and TPS/COTS FFU Toolkit.