What Clinical Medicine Teaches Us About Governing Artificial Intelligence
Why This Analogy Works
Every discipline that learned to manage high-consequence, probabilistic interventions on human beings had to invent governance before it invented technology. Pharmaceutical medicine is the clearest example. Long before we understood receptor pharmacology or pharmacokinetics with any rigor, we had learned — often through catastrophe — that substances which alter human biology require provenance documentation, indication-specific labeling, contraindication warnings, post-market surveillance, scheduled access controls, independent review, and a regulatory architecture that assigns accountability when things go wrong. The thalidomide disaster of the late 1950s did not teach us chemistry; it taught us governance.
Artificial intelligence is in an analogous moment. We are deploying probabilistic, context-sensitive, often inscrutable interventions into consequential human decisions — credit, employment, healthcare, criminal justice, education, content moderation — and we are discovering, sometimes catastrophically, that technical sophistication does not substitute for governance maturity. The AAISM certification exists precisely because organizations need professionals who can build the governance layer that transforms AI from a laboratory curiosity into a deployable, defensible, accountable capability.
The pharmaceutical analogy is useful not because AI is literally like medicine, but because the governance pattern pharma evolved over a century maps with startling precision onto the AI security management body of knowledge. Every major AAISM concept has a pharmaceutical cousin that makes it intuitive, and the places where the analogy breaks down are themselves instructive.
The Four Artifacts: A Product Dossier for AI
The Cloud Security Alliance’s four pillars of responsible AI documentation — Data Sheets, Model Cards, Risk Cards, and Scenario Planning — correspond almost one-to-one with the components of a pharmaceutical product dossier. A Data Sheet is the Chemistry, Manufacturing and Controls record: it documents what went into the training set, where it came from, how it was processed, what known impurities and systemic biases it carries, and what limitations the manufacturing process imposes on the final product. A Model Card corresponds to the package insert’s Indications, Dosage, and Clinical Pharmacology sections: it describes what the model is intended to do, the population and conditions under which it was validated, and the performance metrics that justify its use for the stated indication. A Risk Card corresponds to the Contraindications, Warnings, and Adverse Reactions sections: it enumerates the known failure modes, the populations for whom the model should not be used, the foreseeable misuses, and the remediations or compensating controls that mitigate each. Scenario Planning is the Risk Evaluation and Mitigation Strategy or the pharmacovigilance Risk Management Plan: it anticipates what could go wrong under adversarial or unusual conditions, enumerates failure modes the validation process may not have exercised, and prescribes monitoring and response plans.
Understood this way, these artifacts are not bureaucratic overhead. They are the mechanism by which a model is transformed from something a data scientist built in a notebook into something a reasonable professional can deploy, rely on, and defend in front of a regulator or a court. The AAISM candidate should internalize that each artifact has a distinct primary audience (engineers, deployers, risk owners, executives and regulators respectively), a distinct owner (data steward, model owner, AI risk owner, governance committee), and a distinct refresh trigger (dataset change, retraining, incident, material change in context of use). The continuous feedback loop in the CSA diagram is the AI equivalent of pharmacovigilance: field findings must update the dossier, and a stale dossier is itself a governance failure.
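To make the ownership and refresh discipline concrete, here is a minimal sketch of the four artifacts as structured records. The field names and the Python dataclass representation are my own illustration, not a CSA or ISACA schema; a real program would enrich these considerably.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class GovernanceArtifact:
    """Metadata every artifact in the AI dossier carries."""
    system_name: str
    version: str
    owner: str                    # accountable role, e.g. "data steward"
    primary_audience: str         # e.g. "engineers", "deployers", "risk owners"
    refresh_triggers: list[str]   # events that force a review of this artifact
    last_reviewed: date

@dataclass
class DataSheet(GovernanceArtifact):
    sources: list[str] = field(default_factory=list)        # provenance
    known_biases: list[str] = field(default_factory=list)
    processing_notes: str = ""

@dataclass
class ModelCard(GovernanceArtifact):
    intended_use: str = ""
    validated_populations: list[str] = field(default_factory=list)
    performance_metrics: dict[str, float] = field(default_factory=dict)

@dataclass
class RiskCard(GovernanceArtifact):
    known_failure_modes: list[str] = field(default_factory=list)
    prohibited_uses: list[str] = field(default_factory=list)
    compensating_controls: dict[str, str] = field(default_factory=dict)

@dataclass
class ScenarioPlan(GovernanceArtifact):
    adversarial_scenarios: list[str] = field(default_factory=list)
    monitoring_plan: str = ""
    response_plan: str = ""

# Hypothetical example: a card whose refresh triggers have fired but whose
# last_reviewed date has not moved is itself a governance finding.
card = ModelCard(
    system_name="claims-triage", version="1.4", owner="model owner",
    primary_audience="deployers",
    refresh_triggers=["retraining", "material change in context of use"],
    last_reviewed=date(2024, 1, 15),
    intended_use="prioritise incoming insurance claims for human review",
    performance_metrics={"AUC": 0.87},
)
```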
Risk Tiering: The Schedule Principle
The single most important governance concept either field has produced is the idea that obligation scales with consequence. No drug-scheduling regime treats paracetamol the way it treats morphine. Paracetamol is available over the counter because, taken as directed, it is safe, its abuse potential is negligible, and the consequences of ordinary misjudgment are usually recoverable. Morphine sits behind controlled-drug schedules because its margin between therapeutic and harmful doses is narrow, its abuse potential is high, and the consequences of misjudgment are often irreversible and sometimes fatal. The regulator does not demand identical documentation, review, and access controls for both, and a regulatory regime that tried to do so would collapse under its own weight while failing to protect anyone adequately.
The EU AI Act is, in essence, a drug-scheduling regime for artificial intelligence. It sorts AI systems into unacceptable risk (banned outright, like substances with no therapeutic value and severe harm potential), high risk (permitted but subject to conformity assessment, post-market monitoring, and incident reporting, like prescription drugs with mandatory pharmacovigilance), limited risk (subject to transparency obligations, like drugs that require specific consumer warnings), and minimal risk (governed by general product safety rules, like OTC substances). NIST AI RMF achieves the same outcome through its Govern and Map functions without legal force. ISO/IEC 42001 builds tiering into its risk-based management system expectations. The specific labels vary; the principle is universal.
For the AAISM candidate, the operational consequence is a governance instrument that many programs lack: an AI risk classification policy that defines tier criteria (use case, population affected, reversibility of outcomes, autonomy level, regulatory exposure, data sensitivity), assigns the authority to classify (crucially, not the model owner alone, for the same reason a pharmaceutical sponsor cannot self-schedule its own product), and specifies the obligations that flow from each tier — depth of documentation, independence of validation, frequency of monitoring, level of executive approval required for deployment. The moment an exam scenario describes a data science team self-certifying the risk level of its own model, you should hear the same alarm a pharmacist hears when a manufacturer claims its own drug is OTC without regulatory review. That is a classic conflict of interest, and its remediation — independent classification and independent validation — is the AI descendant of the principle US banking regulators codified in SR 11-7 more than a decade before the current AI moment.
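Reduced to code, such a classification policy might look like the sketch below. The tier names echo the EU AI Act, but the criteria, thresholds, and obligations are invented for illustration and would come from the organization's own policy.

```python
from enum import Enum

class RiskTier(Enum):
    MINIMAL = 1
    LIMITED = 2
    HIGH = 3
    UNACCEPTABLE = 4

# Obligations scale with tier, mirroring the schedule principle.
OBLIGATIONS = {
    RiskTier.MINIMAL:      {"validation": "self-assessment",        "approval": "team lead"},
    RiskTier.LIMITED:      {"validation": "peer review",            "approval": "AI risk owner"},
    RiskTier.HIGH:         {"validation": "independent validation", "approval": "governance committee"},
    RiskTier.UNACCEPTABLE: {"validation": "n/a",                    "approval": "deployment prohibited"},
}

def classify(use_case: dict) -> RiskTier:
    """Illustrative tiering logic. The criteria come from policy, and the
    classification is made by the AI risk function, never the model owner."""
    if use_case.get("prohibited_practice"):            # e.g. social scoring
        return RiskTier.UNACCEPTABLE
    if (use_case.get("consequential_decision")         # credit, hiring, healthcare
            or not use_case.get("outcome_reversible", True)
            or use_case.get("autonomy_level", 0) >= 2):
        return RiskTier.HIGH
    if use_case.get("interacts_with_public"):          # transparency obligations apply
        return RiskTier.LIMITED
    return RiskTier.MINIMAL

tier = classify({"consequential_decision": True, "outcome_reversible": False})
print(tier, OBLIGATIONS[tier])
```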
The AI Management System: ISO/IEC 42001 as Good Manufacturing Practice
Pharmaceutical manufacturers do not demonstrate quality by testing the final product alone. They demonstrate it by operating under Good Manufacturing Practice — a management system that governs facilities, equipment qualification, personnel training, document control, change control, deviation management, and continuous improvement. The premise is that a disciplined system reliably produces quality outputs, whereas ad-hoc brilliance does not scale and does not survive personnel changes.
ISO/IEC 42001 is the Good Manufacturing Practice of AI. It defines an AI Management System (AIMS) with clauses covering context of the organization, leadership commitment, planning, support and resources, operation, performance evaluation, and improvement, plus AI-specific controls in its annexes. Its relationship to ISO/IEC 27001 mirrors the relationship between pharmaceutical GMP and general quality management — the structural logic is the same, but the hazards it addresses are domain-specific. ISO/IEC 23894 plays the role that ICH Q9 plays in pharmaceutical quality risk management: each adapts generic risk management principles (ISO 31000 in the AI case) to the particular uncertainties of its field. ISO/IEC 22989 and 23053 play the role of the pharmacopoeia’s vocabulary and monograph structure, giving the field a shared language so that a “model,” a “training dataset,” or an “inference” means the same thing across organizations.
The AAISM candidate should recognize that when an exam scenario describes an organization with policies but no management system — documents without document control, controls without change management, incidents without corrective action — the correct framing is not “add more policies” but “establish an AIMS.” This is the AI equivalent of saying a pharmaceutical company does not need more SOPs; it needs GMP.
The Lifecycle: From Discovery to Post-Market Surveillance
A drug does not enter the market through a single decision. It passes through discovery, preclinical testing, Phase I safety trials, Phase II efficacy and dose-finding trials, Phase III confirmatory trials, regulatory review, launch, and post-market surveillance — and the risk assessment applied at each stage is specific to that stage. Preclinical toxicology asks whether the molecule is safe enough to test in humans at all; Phase III asks whether it is efficacious and safe in the intended population; post-market surveillance asks what happens in populations the trials could not adequately represent.
The AI system lifecycle — plan and design, collect and process data, build and use model, verify and validate, deploy and use, operate and monitor, retire — demands the same stage-specific discipline. At problem framing, the security manager should ask whether AI is the right tool at all (the pharma analog is asking whether a drug is the right modality versus a device, a behavioral intervention, or no intervention). At data collection, the focus is provenance, quality, representativeness, bias, and privacy (the pharma analog is supply chain qualification and raw material testing). At model development, the focus shifts to robustness, explainability, and security properties (preclinical pharmacology). At validation, the question is whether testing is sufficient to support the intended use (the clinical trial question). At deployment, operational controls and monitoring take center stage (launch readiness). In operation, drift, abuse, and emergent behavior require continuous attention (pharmacovigilance). At retirement, data disposition and residual obligations must be managed (product discontinuation and long-tail liability).
Many AAISM exam questions can be decoded by asking “at what lifecycle stage is this scenario, and what should a mature program be doing at that stage?” A question about a model that is performing worse than it did at launch is an operate-and-monitor question, and the answer usually involves drift detection, root cause analysis, and a decision between retraining, restricting, or retiring — not “retrain the model immediately,” which presupposes the diagnosis rather than performing it.
The AI-Specific Threat Landscape: A New Pharmacopoeia of Adverse Events
Classical drug safety recognized a taxonomy of adverse events — allergic reactions, dose-dependent toxicities, drug-drug interactions, idiosyncratic reactions, teratogenicity, and so on — and each type demanded different surveillance and mitigation. AI security requires a similar taxonomy, and frameworks like MITRE ATLAS, the OWASP Top 10 for LLM Applications, and NIST AI 100-2 provide it.
Adversarial examples are the AI equivalent of drug-drug interactions: inputs that are safe in isolation become harmful in specific combinations that the validation process did not test. Data poisoning is the AI equivalent of raw material contamination, and like pharmaceutical contamination, it can be introduced accidentally by a negligent supplier or intentionally by a malicious one; either way, the consequence manifests in the finished product. Model extraction and model inversion are the AI equivalents of reverse-engineering a proprietary formulation, with the added wrinkle that model inversion can also leak the identities of individuals in the training set — a privacy harm with no clean pharmaceutical parallel. Membership inference attacks, which determine whether a specific record was in the training data, are a distinctively AI-era concern that cryptographic and statistical disclosure controls were never designed to address. Prompt injection, especially its indirect variant where malicious instructions arrive through data a language model retrieves rather than through the user’s prompt, is a genuinely new class of vulnerability that classical security’s input validation vocabulary only partially captures. Jailbreaking — coaxing a model past its safety training — is closer to social engineering than to any pharmaceutical precedent, which is why defenses rooted purely in technical controls often fail.
The AAISM candidate should be able to distinguish these precisely on the exam, because ISACA often constructs distractors by swapping similar-sounding attacks. Data poisoning corrupts the training process; model poisoning corrupts the model artifact itself, perhaps through a compromised supply chain. Prompt injection exploits the instruction-following behavior of a language model; jailbreaking exploits the alignment training that is supposed to constrain it. Membership inference reveals whether a record was used; model inversion reconstructs what the records looked like. These are not interchangeable, and the controls that address each differ.
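To make one of these distinctions tangible, here is a toy sketch of the simplest membership inference attack, using an assumed scikit-learn setup of my own construction: an overfit model assigns visibly lower loss to records it was trained on, and that gap alone leaks who was in the training set. Model inversion, by contrast, would try to reconstruct the records themselves.

```python
# Toy membership-inference sketch: an overfit model's per-record loss
# separates training members from non-members. Illustrative only.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_in, y_in = X[:1000], y[:1000]      # "members": used for training
X_out, y_out = X[1000:], y[1000:]    # "non-members": never seen by the model

model = DecisionTreeClassifier(random_state=0).fit(X_in, y_in)  # deliberately overfit

def per_record_loss(m, X, y):
    # negative log-probability of the true label, one value per record
    p = m.predict_proba(X)[np.arange(len(y)), y]
    return -np.log(np.clip(p, 1e-12, None))

losses = np.concatenate([per_record_loss(model, X_in, y_in),
                         per_record_loss(model, X_out, y_out)])
is_member = np.concatenate([np.ones(1000), np.zeros(1000)])

# AUC above 0.5 means loss alone reveals training-set membership.
print("membership-inference AUC:", round(roc_auc_score(is_member, -losses), 3))
```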
Trustworthy AI Characteristics: The Therapeutic Profile
A good drug is not simply “effective.” It is effective, safe, tolerable, manufacturable at consistent quality, acceptable to patients, and compatible with the healthcare system that will administer it. These properties interact and sometimes conflict: the most effective molecule may have intolerable side effects, the safest formulation may be too expensive to manufacture at scale, the most convenient dosing schedule may compromise efficacy.
Trustworthy AI characteristics — validity and reliability, safety, security and resilience, accountability and transparency, explainability and interpretability, privacy, and fairness with harmful bias managed — interact in the same way. Explainability can conflict with intellectual property protection when the explanation reveals proprietary features. Privacy-preserving techniques like differential privacy can degrade fairness if they disproportionately add noise to small subgroups. Robustness to adversarial input can reduce accuracy on benign input. Transparency can enable adversarial probing that undermines security.
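A toy illustration of the privacy and fairness tension, with invented numbers: the Laplace noise that makes a count differentially private barely moves the measured rate for a large group but can badly distort it for a small subgroup.

```python
import numpy as np

rng = np.random.default_rng(0)
epsilon = 0.1                      # strict privacy budget -> larger noise
sensitivity = 1.0                  # adding/removing one person changes a count by at most 1
noise_scale = sensitivity / epsilon

true_rate = 0.30
groups = {"majority group": 100_000, "small subgroup": 200}

for name, n in groups.items():
    true_count = true_rate * n
    noisy_count = true_count + rng.laplace(scale=noise_scale)
    expected_rate_error = noise_scale / n   # E[|Laplace(b)|] = b, so error in the rate is b/n
    print(f"{name:15s} true rate {true_rate:.3f}  "
          f"noisy rate {noisy_count / n:.3f}  "
          f"expected error ±{expected_rate_error:.4f}")
```

The same privacy budget yields an expected rate error of roughly 0.0001 for the large group and 0.05 for the small one, which is why fairness measurements on small subgroups are the first casualty of naive noise addition.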
The AAISM candidate’s job is not to maximize any single property but to help the organization make these trade-offs deliberately, with the right decision-makers informed and accountable. When a scenario asks what the AI security manager should do when the product team wants to ship a more accurate but less explainable model, the answer is almost never “demand the explainable one” or “approve the accurate one.” It is to ensure the trade-off is characterized, escalated to the appropriate risk owner with clear information, and resolved through a decision that is documented and defensible. This is the same judgment a pharmacist exercises when a physician’s prescription presents a benefit-risk question — the pharmacist does not overrule the physician, but also does not silently dispense; they ensure the decision is made with eyes open.
AI Safety vs AI Security: A Distinction That Matters
Pharmaceutical medicine distinguishes between efficacy (does the drug do what it is supposed to do?), safety (does it avoid harming the patient through its normal mechanism?), and security (is it protected from tampering, counterfeiting, and diversion?). These are related but not identical concerns, managed by overlapping but distinct functions.
AI has a parallel distinction that trips up many candidates. AI safety concerns whether the system behaves as intended and avoids causing harm through its normal operation — a self-driving car that fails to recognize a pedestrian, a medical AI that misdiagnoses, a language model that confidently fabricates citations. AI security concerns whether the system resists adversarial interference — the same self-driving car defeated by adversarial stickers on a stop sign, the same medical AI whose training data was poisoned by a competitor, the same language model probed through membership inference to confirm which individuals' records were in its training set. The distinction matters because the controls, the threat models, the owners, and often the regulators are different. Safety failures often stem from inadequate validation, representativeness gaps, or distribution shift. Security failures often stem from adversarial adaptation that no amount of benign validation would have surfaced.
Mature AI programs address both and recognize their overlap — a safety failure and a security failure can produce identical-looking incidents, and only careful post-incident analysis distinguishes them. The AAISM candidate should expect exam scenarios that blur the line and reward the answer that reflects awareness of the distinction.
Third-Party and Foundation Model Risk: The Generic Drug Problem at Scale
When a hospital dispenses a generic drug, it relies on an assurance chain it did not build. The manufacturer followed GMP, the regulator inspected the facility, the pharmacopoeia specified the acceptable impurity profile, and the distributor maintained cold chain integrity. If any link breaks, patients are harmed, and the hospital bears moral and often legal responsibility despite not having manufactured the drug. The hospital’s risk management cannot end at its own walls.
Foundation models have created the same situation for AI, at unprecedented scale and opacity. An organization deploying a language model from a frontier lab inherits that lab’s training data decisions, its safety training, its alignment choices, its known vulnerabilities, and its supply chain. The deploying organization did not make those decisions and often cannot inspect them, but it bears the consequences when they go wrong. AAISM treats third-party and foundation model risk as a distinct discipline requiring assurance mechanisms — vendor questionnaires, model cards and data sheets from the provider, contractual obligations around incident notification and retraining, independent testing where feasible, and contingency planning for the day a provider deprecates a model or suffers a compromise.
The pharmaceutical analogy suggests what mature assurance looks like: the equivalent of a Certificate of Analysis for each model version, the equivalent of supplier qualification audits, the equivalent of adverse event reporting flowing back up the chain, and the equivalent of having a secondary supplier qualified when single-sourcing is unacceptable. An AI program that treats a foundation model as infrastructure-as-a-service, without these assurances, is operating at the governance maturity of a hospital that buys pharmaceuticals from whoever offers the best price without inspecting the label.
Incident Response: Distinguishing Hazard, Incident, and Harm
Pharmacovigilance distinguishes carefully between a signal (a pattern suggesting a possible adverse relationship), an adverse event (an undesired occurrence temporally associated with the drug, without implying causation), and an adverse drug reaction (a harm causally attributed to the drug). The distinctions matter because the response to each differs — a signal triggers investigation, an adverse event triggers documentation and analysis, a reaction triggers labeling changes, restrictions, or withdrawal.
AAISM preserves an analogous taxonomy. An AI hazard is a condition or property of the system that could lead to harm — an inadequately tested model, a known bias, an absent monitoring control. An AI incident is an event in which the system behaved in a way that caused or nearly caused harm. An AI harm is a realized adverse outcome to a person, group, organization, or society. Mature programs track all three, because hazards that are not incidents today may become incidents tomorrow, and incidents that did not cause harm today reveal failure modes that will cause harm eventually. The AAISM candidate should recognize scenarios in which an organization is responding only to realized harms while ignoring accumulating hazards, and identify this as a governance failure — the equivalent of a hospital that responds to deaths but not to near-misses.
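One way to keep the three categories from collapsing into a single queue is to track them as distinct record types with distinct mandatory responses. The sketch below is my own illustration, not an ISACA-defined schema.

```python
from dataclasses import dataclass
from enum import Enum

class RecordType(Enum):
    HAZARD = "hazard"       # condition or property that could lead to harm
    INCIDENT = "incident"   # event that caused or nearly caused harm
    HARM = "harm"           # realized adverse outcome

# Each category obliges a different response, mirroring pharmacovigilance's
# signal / adverse event / adverse reaction distinction.
REQUIRED_RESPONSE = {
    RecordType.HAZARD: "log, assign owner, schedule remediation or accept with rationale",
    RecordType.INCIDENT: "investigate root cause, document, update the Risk Card",
    RecordType.HARM: "notify the risk owner, consider restriction or withdrawal, report if required",
}

@dataclass
class AIRiskRecord:
    record_type: RecordType
    system: str
    description: str

    def required_response(self) -> str:
        return REQUIRED_RESPONSE[self.record_type]

near_miss = AIRiskRecord(RecordType.INCIDENT, "claims-triage",
                         "model auto-denied a claim later overturned on manual review")
print(near_miss.required_response())
```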
Monitoring: Pharmacovigilance for Models
A drug’s label is not the end of its story. Post-market surveillance captures adverse events that the trials missed, populations that were underrepresented in testing, long-term effects that emerged only with prolonged use, and interactions with co-morbidities or other drugs that no trial could reasonably have anticipated. Pharmacovigilance is why the pharmaceutical industry catches problems its pre-market testing missed.
AI systems need the same continuous surveillance, and for analogous reasons. Distribution shift means the input data a model sees in production drifts away from the data it was trained and validated on, sometimes slowly and sometimes abruptly — the equivalent of a drug being used in populations the trial underrepresented. Concept drift means the relationship between inputs and the target variable changes over time — the equivalent of a pathogen evolving resistance. Adversarial adaptation means attackers learn the model’s weaknesses and exploit them, which has no stable pharmaceutical analog but is perhaps closest to antimicrobial resistance. Feedback loops, where a model’s outputs influence the data it will later be retrained on, can amplify bias or degrade performance in ways that only longitudinal monitoring will reveal.
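As one concrete monitoring primitive, here is a sketch of the Population Stability Index, a common way to quantify the distribution shift described above. The interpretation thresholds in the comment are conventional rules of thumb, not AAISM requirements.

```python
import numpy as np

def population_stability_index(expected, actual, bins=10):
    """Compare the production (actual) distribution of a feature or score
    against its training/validation (expected) distribution."""
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
    # widen the outer edges so production values outside the baseline range still count
    edges[0] = min(edges[0], actual.min()) - 1e-9
    edges[-1] = max(edges[-1], actual.max()) + 1e-9
    e_frac = np.histogram(expected, bins=edges)[0] / len(expected)
    a_frac = np.histogram(actual, bins=edges)[0] / len(actual)
    e_frac = np.clip(e_frac, 1e-6, None)   # avoid log(0) and division by zero
    a_frac = np.clip(a_frac, 1e-6, None)
    return float(np.sum((a_frac - e_frac) * np.log(a_frac / e_frac)))

rng = np.random.default_rng(0)
baseline = rng.normal(0.0, 1.0, 50_000)     # scores at validation time
production = rng.normal(0.4, 1.2, 50_000)   # scores observed in the field

psi = population_stability_index(baseline, production)
# Common rule of thumb: < 0.1 stable, 0.1-0.25 investigate, > 0.25 escalate
print(f"PSI = {psi:.3f}")
```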
The AAISM candidate should understand that monitoring is not a technical afterthought but a governance obligation that flows from deployment. The design of monitoring — what metrics, what thresholds, what escalation paths, what authority to restrict or withdraw — should be established before deployment, not after the first incident. And monitoring findings must feed back into the documentation loop: an emerging drift pattern updates the Model Card’s stated limitations, an exploited vulnerability updates the Risk Card, an unanticipated failure mode updates the Scenario Planning, and a sufficiently serious finding triggers retraining, restriction, or retirement.
Program Thinking: The Formulary Committee
A hospital does not manage pharmaceuticals one prescription at a time. It operates a Pharmacy and Therapeutics Committee — a standing body that maintains the formulary, evaluates new additions, monitors utilization, reviews adverse events, and sets policy that individual prescribers operate within. This is program thinking: the committee’s work is continuous, its decisions shape thousands of individual prescriptions, and its absence would force every prescriber to re-litigate medication policy with each patient.
AI governance requires the same construct: a standing AI governance committee (or equivalent body) that maintains the AI inventory, approves risk classifications for new systems, reviews incidents and lessons learned, updates policy, and escalates to executive leadership when risk appetite is approached or exceeded. The committee’s composition reflects the multi-disciplinary nature of AI risk: security, privacy, legal, compliance, model risk if the organization has that function, line-of-business representation, and increasingly a dedicated AI ethics or responsible AI function. Its authority flows from executive sponsorship, and its legitimacy depends on being seen as enabling rather than obstructing — the formulary committee that said no to everything would be routed around, and the AI governance committee that only said no would be routed around too.
The AAISM candidate should recognize that when a scenario describes an organization without such a body — or with one that meets quarterly and rubber-stamps whatever the data science team brings — the correct response is structural, not procedural. Adding another policy to a program that lacks a governance body is the equivalent of adding another protocol to a hospital without a P&T committee: the document exists, and nothing changes.
Where the Analogy Breaks
Every analogy has limits, and the AAISM candidate should know where this one frays, because exam questions sometimes exploit precisely these seams.
Pharmaceutical harm is usually individual, physical, and temporally proximate — a patient takes a drug and has a reaction hours, days, or weeks later. AI harm is often collective, informational, and temporally diffuse — a biased hiring model disadvantages a protected class over years before the pattern is detectable, a recommendation engine subtly shapes a population’s beliefs without any single user experiencing an identifiable incident. This means AI monitoring cannot rely solely on individual adverse event reporting; it requires population-level measurement of disparate impact, belief drift, and systemic effects that have no clean pharmaceutical template.
Pharmaceutical efficacy is reasonably stable — the same molecule interacts with the same receptors the same way across patients, with variation driven by genetics, comorbidities, and dose. AI efficacy is unstable because the environment is reflexive — once a model is deployed, the world adapts to it. Fraud detection models are studied by fraudsters; content moderation models are probed by content creators; credit models reshape the applications they receive. This means AI validation is never truly complete in the way a drug approval is, and the governance model must reflect that permanence of uncertainty.
Pharmaceutical supply chains, while complex, involve physical substances that can be tested, labeled, and certified. AI supply chains involve data, weights, and code that can be silently modified, are difficult to verify independently, and often originate from opaque foundation models whose own provenance is unclear. The assurance problem is harder, not easier, than in pharma, which means the governance solution cannot simply import pharma’s controls; it must invent new ones — model signing, training data attestations, evaluation transparency, and emerging techniques like model fingerprinting and watermarking.
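A minimal sketch of the Certificate of Analysis idea applied to model weights: verify the artifact's hash against a provider-published manifest before loading it, so silent modification in the supply chain is at least detectable. The manifest format here is hypothetical, and a real deployment would use cryptographic signatures (for example, Sigstore) rather than a bare hash comparison.

```python
import hashlib
import json
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a potentially large model artifact without loading it into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_artifact(weights_path: Path, manifest_path: Path) -> None:
    """Compare the local artifact against the provider's published manifest.
    The manifest keys ("model", "version", "sha256") are hypothetical; a real
    provider's attestation format will differ."""
    manifest = json.loads(manifest_path.read_text())
    actual = sha256_of(weights_path)
    if actual != manifest["sha256"]:
        raise RuntimeError(
            f"{weights_path.name}: hash {actual} does not match the manifest "
            f"for {manifest['model']} {manifest['version']} — do not load."
        )
    print(f"verified {manifest['model']} {manifest['version']}")

# Usage (paths are placeholders):
# verify_artifact(Path("models/foo-7b.safetensors"), Path("models/foo-7b.manifest.json"))
```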
Pharmaceutical regulation evolved over a century with a clear public-health mandate and relatively stable science. AI regulation is evolving in real time against a technology that reshapes itself faster than any legislative cycle can track, which means the AAISM candidate must be prepared for a regulatory landscape that will look materially different in three years. This is why frameworks like NIST AI RMF emphasize adaptability, why ISO/IEC 42001 is structured as a management system rather than a control checklist, and why mature programs invest in regulatory intelligence as a continuous function rather than a compliance project.
The Security Manager as Hospital Pharmacist
The deepest lesson of the analogy is what it implies about the AAISM professional’s role. A hospital pharmacist is not the physician and not the patient. They do not diagnose, and they do not decide whether to take the medication. But they are the professional who ensures that the right drug, at the right dose, for the right patient, through the right route, with the right monitoring, reaches the point of use — and who has the authority and the duty to intervene when any of those conditions is unmet. They are accountable for the integrity of a system they did not design and the safe use of substances they did not manufacture.
The AI security manager sits in the same position. They are not the data scientist and not the business decision-maker. They do not build the model and do not decide whether to use it. But they are the professional who ensures that the right model, validated for the right use, with the right documentation, subject to the right monitoring, reaches deployment — and who has the authority and the duty to intervene when any of those conditions is unmet. Their value is not in the depth of their machine learning expertise or the sophistication of their adversarial testing. It is in the judgment, the governance discipline, and the stakeholder orchestration that transforms AI from a laboratory capability into a deployable, defensible, accountable organizational function.
