Model Design by Kelly Emrick, DHSc, PhD, MBA
How Do I Use the Radiology AI Governance Knowledge Base Model?
The interactive model serves as a practical reference for radiology executives and IT leaders managing the governance challenges of clinical AI. Start with the Overview tab to understand the eight governance goals and what makes radiology data uniquely complex. From there, the model can be explored in any sequence: open Stakeholders & RACI when forming an AI committee or defining responsibilities; review the Data Lifecycle tab step-by-step when designing data flows or auditing existing ones; refer to Technical Controls during vendor assessments or when planning privacy-preserving architectures like federated learning or secure enclaves. The Bias, Quality & Drift and Model Governance tabs support the clinical safety discussion, while the Regulatory Compass consolidates your obligations under HIPAA, GDPR, FDA SaMD, and the EU AI Act into one view.
The model concludes with the Maturity Assessment, a 42-item self-evaluation covering seven domains that generates a live radar chart, an overall maturity level (from Initial to Optimized), and an automatically generated action plan prioritizing your three lowest-scoring areas with specific next steps. Use it as a summary for board presentations, a strategic guide for your next governance cycle, or a collaborative diagnostic tool that aligns clinical, technical, compliance, and executive stakeholders around your current position and future directions.
Radiology AI Data & Application Governance
A comprehensive interaction model for governing imaging data, AI models, and the people, policies, and platforms that connect them — from acquisition through retirement, across HIPAA, GDPR, FDA, and the EU AI Act.
The Governance Imperative
Radiology is a data-intensive specialty whose imaging archives hold rich identifiers, scanner metadata, and decades of clinical context. Responsible AI deployment demands governance that addresses algorithmic bias, transparency, and medico-legal accountability — not as a regulatory checkbox, but as a clinical safety discipline that runs in parallel with the technology itself.
Scope of the Model
An AI data governance model in radiology spans imaging data from capture through disposal — primary clinical imaging (X-ray, CT, MRI, US, NM), derived annotations (radiologist reports, segmentations, structured findings), associated metadata, and the AI models trained on those data.
It governs both the data and the algorithms. Data without model oversight is incomplete; model oversight without data lineage is unverifiable.
Eight Governance Objectives
- Protect patient privacy & security
- Maintain data quality & integrity
- Mitigate bias and ensure fairness
- Achieve and document regulatory compliance
- Establish accountability and transparency
- Enable lawful innovation and research
- Drive operational efficiency in AI adoption
- Sustain trust with patients, clinicians, and regulators
What This Knowledge Base Delivers
For Radiology Executives
A defensible framework for board reporting, vendor diligence, and regulatory readiness — with maturity scoring that maps directly to investment priorities.
For IT Leadership
Technical control patterns (de-ID, federated learning, differential privacy, secure enclaves, audit trails) mapped to data lifecycle stages and risk tiers.
For Compliance & Legal
A regulatory compass spanning HIPAA, GDPR, FDA SaMD, and the EU AI Act, with policy/SOP templates and DPIA checkpoints aligned to each authority.
Why Radiology Is Different
Unlike most health data, imaging data live in PACS with large file sizes, rich metadata, and indefinite retention horizons. They retain long-term scientific value — an MRI from 2015 may become training data for a 2027 model. That temporal asymmetry changes the privacy, consent, and data-stewardship calculus. Governance must treat de-identification as a process, not a one-time step, and treat retrospective imaging as a regulated asset class.
Stakeholders & the RACI Matrix
Successful AI governance is multidisciplinary. The most consistent lesson from leading academic medical centers and NHS trusts: stakeholder alignment, not technology, is the rate-limiting step. Below is a reference RACI matrix mapping core governance activities to the roles that should own them.
Core Governance Body
An AI Governance Steering Committee (or Radiology AI Working Group, depending on organizational scope) should convene at minimum the following voting roles:
- Chief AI / Data & Analytics Officer (chair)
- Radiology Clinical Lead (physician)
- Lead Radiologic Technologist
- Data Science / ML Lead
- IT & Security Lead (CISO delegate)
- Compliance & Privacy Officer
- Legal Counsel
- Patient / Community Liaison
Subcommittees
The steering committee typically delegates to standing subcommittees:
- Technical Evaluation — model validation, drift review, retraining approvals
- Ethics & Equity Review — bias audits, fairness metrics, patient impact
- Vendor & Procurement — due diligence, BAAs/DPAs, contract terms
- Incident Response — AI-related safety events, decommissioning
Programs at Penn Medicine and within the NHS (see Lessons from the Field below) report that an explicit committee-and-subcommittee structure was among the highest-leverage decisions for their governance maturity.
RACI Matrix — Core Governance Activities
| Activity | Radiologist | Data Sci. | IT / Security | Compliance | Exec / CDAO | Vendor | Patient Adv. |
|---|---|---|---|---|---|---|---|
| Define clinical use case & success criteria | R | C | I | I | A | I | C |
| Dataset curation & annotation QA | R | R | C | C | A | — | — |
| De-identification & PHI risk review | C | R | R | A | I | C | I |
| Model validation & clearance review | R | R | C | C | A | C | I |
| Bias & fairness audit | C | R | I | C | A | C | C |
| Drift monitoring & performance review | C | R | C | I | A | C | — |
| Vendor due diligence & contracting | C | C | R | R | A | C | — |
| Audit log review & access management | I | I | R | A | I | — | — |
| Incident response & decommissioning | R | R | R | C | A | C | I |
| Patient communication & transparency | C | I | I | R | A | — | R |
Lessons from the Field
Penn Medicine
Convened a Radiology AI Committee with clinicians, AI experts, and administrators. Reported that the most important decision was “including all stakeholders in the process” and aligning department-level governance with enterprise-level AI priorities.
NHS Trust (Breast Imaging)
A consultant radiologist convened both an AI Working Group and an AI Project Group spanning clinicians, IT, legal, contracts, and R&D. The critical first step: articulating an AI vision and presenting it to executive leadership before any pilots launched.
The Imaging Data Lifecycle
Governance must address every stage of the data lifecycle — from the moment a scanner produces an image to the moment that image (or the model trained on it) is decommissioned. Each stage has distinct risks, controls, and accountable owners.
Governance Checkpoints Across the Lifecycle
Each stage requires a defined approval gate before the data progresses. Suggested checkpoint mapping:
Pre-Acquisition Gate
Protocol standardization sign-off, scanner QA verification, consent framework documented in EHR/RIS workflow.
De-Identification Gate
Automated DICOM header scrub + burned-in-pixel detection + sample audit by privacy officer before any data leaves the clinical archive for AI use.
Access Gate
IRB or ethics review (research) or AI Committee approval (operational), Data Use Agreement on file, role-based access provisioned with time limits.
Sharing Gate
Re-identification risk assessment, HIPAA/GDPR pathway documented (limited dataset, anonymization, federated alternative), BAA/DPA executed.
Retention Gate
Annual review of dataset utility vs. retention obligations; archived datasets re-attested for ongoing research relevance.
Deletion Gate
Cryptographic erasure verification, model lineage updated to reflect dataset retirement, audit log permanently retained.
Technical Controls Library
Privacy-by-design in radiology AI rests on six interlocking technical controls. Each addresses a different failure mode — and each carries trade-offs that governance must explicitly accept.
De-Identification
Removes or obfuscates direct identifiers (name, MRN, SSN) and indirect identifiers (device ID, scan parameters, acquisition timestamps, burned-in pixel annotations) from images and metadata. Standard tools include the RSNA Clinical Trial Processor (CTP), the DICOM Toolkit (DCMTK), and commercial offerings.
Treat de-identification as a process, not a one-time transform. Audit a sample of every de-identified batch — burned-in pixels and free-text annotation fields are the most common leak vectors.
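As a concrete illustration, here is a minimal header-scrub sketch using pydicom. The tag list is a small hypothetical subset, not a complete de-identification profile; a real pipeline would follow DICOM PS3.15 profiles and pair the scrub with the pixel-level review described above.

```python
# Minimal DICOM header-scrub sketch (pydicom). Illustrative only: the tag list
# is a hypothetical subset, not a complete de-identification profile.
import pydicom

DIRECT_IDENTIFIERS = [
    "PatientName", "PatientID", "PatientBirthDate",
    "OtherPatientIDs", "AccessionNumber", "InstitutionName",
]

def scrub(path_in: str, path_out: str) -> None:
    ds = pydicom.dcmread(path_in)
    for keyword in DIRECT_IDENTIFIERS:
        if keyword in ds:
            ds.data_element(keyword).value = ""   # blank direct identifiers
    ds.remove_private_tags()                      # vendor private tags often carry PHI
    # Route anything that may carry burned-in text to manual pixel review.
    if getattr(ds, "BurnedInAnnotation", "YES") != "NO":
        print(f"{path_in}: possible burned-in PHI, flag for pixel review")
    ds.save_as(path_out)
```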
Federated Learning
Instead of pooling images centrally, each participating site trains a local copy of the model and shares only gradient updates with a central aggregator. Raw images never leave the originating institution. NVIDIA Clara FL, OpenFL, and TensorFlow Federated are the dominant frameworks for medical imaging.
FL is particularly valuable for rare-disease imaging, multi-center clinical trials, and any scenario where data localization laws (e.g., EU Member State-level restrictions) prohibit central pooling.
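A toy sketch of the aggregation step makes the pattern concrete: sites send weight updates, never images, and the coordinator averages them weighted by local sample counts. Plain NumPy, with hypothetical numbers.

```python
# Toy FedAvg aggregation: each site contributes a weight update and its local
# sample count; the coordinator combines them without ever seeing images.
import numpy as np

def fedavg(updates: list[np.ndarray], sizes: list[int]) -> np.ndarray:
    total = sum(sizes)
    return sum(u * (n / total) for u, n in zip(updates, sizes))

# Three hypothetical sites with different cohort sizes.
site_updates = [np.array([0.10, -0.20]),
                np.array([0.12, -0.18]),
                np.array([0.08, -0.25])]
global_update = fedavg(site_updates, sizes=[1200, 300, 500])
```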
Differential Privacy
Adds calibrated mathematical noise to data, gradients, or model outputs so that the contribution of any single patient cannot be recovered, providing a quantifiable privacy guarantee (the ε, δ budget). Commonly applied to FL gradient updates before aggregation.
Sensitivity-aware DP variants dynamically adjust noise magnitude based on the rarity or sensitivity of training samples, preserving more utility for common cases while protecting outliers.
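For intuition, here is a sketch of the Gaussian mechanism applied to a single federated update: clip to bound any one patient's influence, then add noise. In practice sigma is derived from the (ε, δ) budget by a privacy accountant (e.g., Opacus or TensorFlow Privacy); the values below are illustrative.

```python
# Gaussian-mechanism sketch for one federated update: clip, then add noise.
# sigma is illustrative; an accountant would derive it from the (eps, delta) budget.
import numpy as np

def dp_sanitize(update: np.ndarray, clip_norm: float = 1.0,
                sigma: float = 0.8, seed: int = 0) -> np.ndarray:
    rng = np.random.default_rng(seed)
    norm = max(np.linalg.norm(update), 1e-12)
    clipped = update * min(1.0, clip_norm / norm)        # bound sensitivity
    noise = rng.normal(0.0, sigma * clip_norm, size=update.shape)
    return clipped + noise                               # privatized update
```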
Secure Enclaves
Trusted Execution Environments (Intel SGX, AWS Nitro Enclaves, GCP Confidential VMs) compute on encrypted data inside hardware-isolated regions. Memory contents are invisible even to system administrators and cloud operators.
Useful pattern: a vendor’s algorithm runs inside the enclave on identifiable hospital data, and only aggregate inference results leave. The vendor never sees raw images; the hospital never exposes its data to vendor infrastructure.
Immutable Audit Trails
Every action — data access, annotation, model training run, sharing event, inference query — generates an immutable audit record. This enables forensic review of breaches, regulatory inspection, and root-cause analysis of model failures.
Tooling: MLflow or Weights & Biases for model lineage; Splunk or ElasticSearch for system-level logs; data versioning via DVC or LakeFS. Integrate PACS access logs into the same audit fabric — siloed logs are unauditable in practice.
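To illustrate immutability, here is a minimal hash-chained audit record in plain Python: each entry commits to its predecessor, so any retroactive edit breaks verification. The schema is hypothetical, and a production system would back this with WORM storage or a managed ledger.

```python
# Minimal hash-chained audit log: each record commits to the previous hash,
# so tampering with history is detectable on verification.
import hashlib, json, time

def append_event(log: list[dict], actor: str, action: str, resource: str) -> None:
    prev = log[-1]["hash"] if log else "0" * 64
    event = {"ts": time.time(), "actor": actor, "action": action,
             "resource": resource, "prev": prev}
    event["hash"] = hashlib.sha256(
        json.dumps(event, sort_keys=True).encode()).hexdigest()
    log.append(event)

def verify(log: list[dict]) -> bool:
    for i, e in enumerate(log):
        body = {k: v for k, v in e.items() if k != "hash"}
        digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        if e["hash"] != digest or (i and e["prev"] != log[i - 1]["hash"]):
            return False
    return True
```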
Encryption & Access Control
Encryption at rest (AES-256) for PACS, archives, and AI training stores; TLS 1.3 for all in-flight data. Network segmentation isolates AI development environments from clinical production networks. Multi-factor authentication and role-based access control gate every system entry point.
For cloud AI workloads, customer-managed encryption keys (CMK) are the governance preference — cloud-provider-managed keys offer convenience but reduce institutional control over key rotation and revocation.
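As a small sketch of the at-rest pattern, the snippet below encrypts an object with AES-256-GCM via the Python cryptography package. The governance-relevant part, CMK custody and rotation, happens in the KMS and is out of scope here.

```python
# AES-256-GCM at-rest encryption sketch (`cryptography` package). In a CMK
# setup the key would come from the institution's KMS, not be generated inline.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)   # stand-in for a KMS-held CMK
aead = AESGCM(key)
nonce = os.urandom(12)                      # 96-bit nonce, unique per object
ciphertext = aead.encrypt(nonce, b"<imaging object bytes>", b"study-uid-aad")
plaintext = aead.decrypt(nonce, ciphertext, b"study-uid-aad")
```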
Control Selection Decision Tree
Match controls to data sensitivity and use case — not all controls are warranted for all workloads.
Internal QI / Clinical Use
De-ID (where feasible) + audit trails + encryption + RBAC. Federated learning rarely needed.
Multi-Site Research
All foundational controls + FL with DP. Re-identification risk assessment required.
External Vendor Validation
All controls + secure enclave or zero-trust pattern. BAA/DPA mandatory; vendor must demonstrate equivalent posture.
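One way to keep this decision tree auditable is to encode it as data rather than leave it to case-by-case judgment. A minimal sketch follows, with tier names and control sets mirroring the text above.

```python
# The decision tree encoded as a lookup, so control selection is reviewable.
BASELINE = {"de-identification", "audit trails", "encryption", "RBAC"}

CONTROLS_BY_TIER = {
    "internal_qi": BASELINE,
    "multi_site_research": BASELINE | {
        "federated learning", "differential privacy",
        "re-identification risk assessment"},
    "external_vendor_validation": BASELINE | {
        "secure enclave or zero-trust pattern", "BAA/DPA",
        "vendor posture review"},
}

def required_controls(tier: str) -> set[str]:
    return CONTROLS_BY_TIER[tier]
```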
Bias, Data Quality & Drift
High-quality, representative data is the foundation of trustworthy radiology AI. Bias is not a single problem but a class of failures — demographic, technical, geographic, label-quality, and shortcut-learning — each requiring different mitigation. Continuous drift monitoring closes the loop on temporal shifts after deployment.
Five Sources of Bias in Imaging AI
Demographic Bias — Datasets weighted toward one ethnicity, age band, sex, or body habitus produce models that systematically underperform on the underrepresented groups.
Technical & Scanner Bias — Different vendors, field strengths, reconstruction kernels, and acquisition protocols introduce distributional shifts that masquerade as “real” signal during training.
Geographic & Institutional Bias — A model trained at one academic center captures that institution’s case mix, referral patterns, and prevalence rates — not the world’s.
Label-Quality Bias — Inter-reader disagreement, hurried free-text reports, and inconsistent grading scales propagate as label noise that the model learns as truth.
Shortcut Learning — Models latch onto image artifacts (chest tubes, laterality markers, hospital-specific image processing) instead of pathology — revealed only by explainability tools.
Mitigation Strategy Stack
Data-Level Mitigation — Multi-site collaboration, oversampling underrepresented groups, augmentation, harmonization (e.g., ComBat for MRI). Document dataset demographics before training.
In-Processing Mitigation — Add fairness regularization terms during training; adversarial debiasing; group-balanced loss functions. Frameworks: IBM AI Fairness 360, Aequitas.
Post-Processing Calibration — Calibrate the decision threshold per subgroup to equalize false-positive or false-negative rates. Document the trade-off explicitly in the model card.
Subgroup Performance Reporting — Mandatory subgroup metrics (sensitivity, specificity, AUC) by sex, age band, race/ethnicity (where ethically captured), scanner vendor, and site. Republish with each model update.
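A minimal version of such a report, assuming a validation DataFrame with hypothetical label, score, and subgroup columns, might look like this (pandas plus scikit-learn):

```python
# Subgroup AUC report sketch. Column names are hypothetical; small subgroups
# are reported as NaN rather than silently dropped.
import pandas as pd
from sklearn.metrics import roc_auc_score

def subgroup_auc(df: pd.DataFrame, group_col: str, min_n: int = 50) -> pd.Series:
    def _auc(g: pd.DataFrame) -> float:
        if len(g) < min_n or g["label"].nunique() < 2:
            return float("nan")               # too small / degenerate to score
        return roc_auc_score(g["label"], g["score"])
    return df.groupby(group_col).apply(_auc)

# report = subgroup_auc(validation_df, group_col="scanner_vendor")
```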
Explainability Audit — Saliency maps, Grad-CAM, and counterfactual analysis surface which image features the model is actually using — the most reliable shortcut-learning detector.
Drift Monitoring — Track input distribution statistics and model confidence over time. Alerts on significant shifts trigger investigation, retraining, or temporary suspension.
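As one concrete pattern, a two-sample Kolmogorov-Smirnov test can compare the live model-score distribution against the validation baseline; the alert threshold below is illustrative, not a standard.

```python
# Minimal score-drift check: KS test of live outputs vs. validation baseline.
import numpy as np
from scipy.stats import ks_2samp

def drift_alert(baseline: np.ndarray, live: np.ndarray,
                p_threshold: float = 0.01) -> bool:
    _stat, p_value = ks_2samp(baseline, live)
    return p_value < p_threshold    # True -> open a drift investigation

# Hypothetical usage: if drift_alert(val_scores, last_30_days_scores): escalate.
```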
Model Lifecycle Governance
Governance for AI models parallels software quality management — but with clinical-safety stakes. Validation, explainability, monitoring, and version control all extend across a five-stage lifecycle that should be formalized in policy and enforced through MLOps tooling.
Validation Tiers
Tier 1 — Internal Holdout
Model evaluated on held-out portion of training data. Reports AUC, sensitivity, specificity with confidence intervals. Necessary but insufficient.
Tier 2 — External Validation
Evaluation on data from sites and scanners not represented in training. Reveals geographic and equipment bias.
Tier 3 — Prospective Silent Pilot
Model runs in clinical environment without affecting care decisions; outputs compared to radiologist ground truth.
Tier 4 — Clinical Outcome Study
Live use with measured patient outcomes. Required for high-risk deployments under the EU AI Act.
Explainability Requirements
Models should produce interpretable outputs — saliency maps, attention weights, or feature attributions — that allow clinicians to understand and contest predictions. Governance should classify models by explainability tier:
High — Glass-box model with native interpretability (rare in deep learning).
Medium — Black-box model paired with post-hoc XAI (Grad-CAM, SHAP, LIME).
Low — Black-box with no XAI — should be flagged as elevated risk and require additional human-in-the-loop controls.
Opacity in image-based AI is a documented patient-safety concern and should never be accepted silently.
Predetermined Change Control Plan (PCCP)
The FDA’s evolving guidance for AI/ML-enabled SaMD permits pre-authorized model modifications under a Predetermined Change Control Plan. Governance should mirror this internally even when not legally required: define before deployment which classes of update are pre-approved, which require re-review, and which trigger a new clearance.
Pre-Approved Updates
Retraining on additional data of the same type and population, with no architecture change. Approved by AI Committee chair.
Standard Re-Review
Architecture changes, new input modalities, expanded indications. Full Technical Evaluation Subcommittee review.
Major Re-Clearance
New clinical claims, new patient populations, change in intended use. Full governance review and, if SaMD, FDA notification.
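Internally, the three classes can be reduced to a triage function so every proposed change is classified consistently. The change descriptors below are hypothetical labels for illustration, not FDA terminology.

```python
# PCCP-style change triage sketch, mirroring the three classes above.
def classify_change(arch_changed: bool, new_modality: bool, new_indication: bool,
                    new_population: bool, new_clinical_claim: bool) -> str:
    if new_clinical_claim or new_population:
        return "major re-clearance"     # full review; FDA notification if SaMD
    if arch_changed or new_modality or new_indication:
        return "standard re-review"     # Technical Evaluation Subcommittee
    return "pre-approved"               # same-type retraining; chair sign-off
```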
CI/CD for Clinical AI
Continuous Integration / Continuous Deployment pipelines should automate the mechanical work of retraining and testing — but never bypass human review for clinical models. A defensible pipeline includes:
- Automated dataset versioning with cryptographic hashes (DVC, LakeFS)
- Automated training-run logging (MLflow, Weights & Biases) with hyperparameters and metrics
- Automated unit tests, integration tests, and regression tests on held-out gold sets
- Automated subgroup performance evaluation with fairness gates
- Mandatory human approval gate before any deployment to production inference
- Automated rollback capability if monitoring detects post-deployment regression
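Two items from this list lend themselves to a short sketch: a dataset fingerprint that ties each training run to exact bytes, and a fairness gate that fails the build when any subgroup drops below a floor. Both are assumptions about pipeline shape, not prescriptions.

```python
# (1) Dataset fingerprint for lineage; (2) fairness gate that blocks deployment.
import hashlib
from pathlib import Path

def dataset_fingerprint(root: str) -> str:
    h = hashlib.sha256()
    for f in sorted(Path(root).rglob("*")):
        if f.is_file():
            h.update(f.read_bytes())
    return h.hexdigest()    # log next to the MLflow run for reproducibility

def fairness_gate(subgroup_auc: dict[str, float], floor: float = 0.80) -> None:
    failing = {g: round(a, 3) for g, a in subgroup_auc.items() if a < floor}
    if failing:
        raise SystemExit(f"Fairness gate failed: {failing}")  # fail the pipeline
```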
Decommissioning — The Forgotten Stage
Most AI governance frameworks address deployment exhaustively and decommissioning sparsely. A model that is no longer monitored, no longer maintained, or no longer aligned to current data has become a latent risk. Governance must define:
- Clear triggers for decommissioning (sustained drift, vendor exit, superseded by replacement, policy change)
- Communication plan for end-users and downstream systems
- Archival of model artifacts, training data lineage, and validation history (regulatory inspection may follow years later)
- Post-mortem review documenting lessons learned for future deployments
The Regulatory Compass
Radiology AI sits at the intersection of privacy law, medical device law, and emerging AI-specific law. Each authority imposes overlapping obligations — treating them as one program with one mapping is more efficient than running parallel compliance tracks.
HIPAA
- Technical safeguards: access control, audit logs, encryption
- De-identification via Safe Harbor or Expert Determination
- Business Associate Agreements with all vendors
- 6-year retention of audit logs & policies
- Breach notification within 60 days
- Limited Data Sets permitted with DUA for research
GDPR
- Lawful basis required for every processing activity
- Data Protection Impact Assessment (DPIA) for AI uses
- Patient rights: access, rectification, erasure, portability
- Privacy by design and by default mandated
- Pseudonymized data still personal data; only true anonymization escapes scope
- Data Processing Agreements with all processors
FDA SaMD
- Premarket review: 510(k), De Novo, or PMA pathway
- Good Machine Learning Practice (GMLP) principles
- Predetermined Change Control Plan for updates
- Quality Management System (ISO 13485 alignment)
- Real-world performance monitoring expectations
- Transparency & algorithm change communication
EU AI Act
- Risk management system across the AI lifecycle
- Data governance: relevance, representativeness, error analysis
- Technical documentation & record-keeping
- Transparency to users (clear AI disclosure)
- Human oversight mandatory
- Accuracy, robustness, and cybersecurity requirements
- Post-market monitoring & serious incident reporting
EU MDR
- CE marking via Notified Body (most clinical AI is Class IIa+)
- Clinical evaluation and Post-Market Clinical Follow-up
- Unique Device Identification (UDI)
- EUDAMED registration
- Risk management per ISO 14971
Key ISO / IEC Standards
- ISO/IEC 27001 — Information security management
- ISO/IEC 23894 — AI risk management guidance
- ISO/IEC 42001 — AI management system
- ISO 13485 — Medical device QMS
- ISO 14971 — Medical device risk management
NIST AI Risk Management Framework
- Govern — culture, policies, accountability
- Map — context, use case, risk identification
- Measure — quantitative & qualitative analysis
- Manage — prioritize, respond, communicate, monitor
- Increasingly cited in federal procurement & state laws
Professional Society Guidance
- Stakeholder consultation expectations
- Professional accountability frameworks
- Model card / dataset card disclosure norms
- Continuing education requirements for AI use
One Program, Many Authorities — A Unified Mapping
Rather than running separate HIPAA, GDPR, FDA, and EU AI Act compliance tracks, map your governance controls once to all relevant authorities. Each requirement an authority imposes typically overlaps with two or three others. The pattern that scales:
Authority × Control Crosswalk
Build a matrix where rows are your governance controls and columns are the authorities. A single audit log requirement, for example, satisfies HIPAA, GDPR Article 30, EU AI Act Article 12, and ISO 27001 simultaneously — documented once.
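In practice the crosswalk can live as plain data in version control. A sketch follows, seeded with the audit-log example from the text; the other rows are illustrative.

```python
# Authority-by-control crosswalk as data: one control, every citation it satisfies.
CROSSWALK = {
    "audit logging": {"HIPAA Security Rule", "GDPR Art. 30",
                      "EU AI Act Art. 12", "ISO/IEC 27001"},
    "de-identification": {"HIPAA Safe Harbor / Expert Determination",
                          "GDPR anonymization & pseudonymization"},
    "DPIA / risk assessment": {"GDPR Art. 35", "EU AI Act risk management",
                               "ISO/IEC 23894"},
}

def authorities_satisfied(control: str) -> set[str]:
    return CROSSWALK.get(control, set())
```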
Single Source of Truth
Maintain one policy library, one risk register, one DPIA template, and one model card template. Tag each document with the authorities it satisfies. This dramatically simplifies audits and inspections.
Governance Maturity Assessment
Rate your organization on each of 42 items across 7 governance domains. Scores generate an overall maturity level, a radar of domain strength, and a prioritized action plan focused on the lowest-scoring areas.
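The scoring logic is straightforward to reproduce outside the interactive model. A sketch follows, assuming 1-to-5 ratings per item and illustrative band cut-points.

```python
# Maturity scoring sketch: average each domain, band the overall mean, and
# surface the three lowest domains for the action plan. Cut-points illustrative.
from statistics import mean

BANDS = ["Initial", "Developing", "Defined", "Managed", "Optimized"]

def assess(scores: dict[str, list[int]]):
    domain_avg = {d: mean(items) for d, items in scores.items() if items}
    overall = mean(domain_avg.values())
    band = BANDS[min(int(overall) - 1, 4)]        # 1.x Initial ... 5.0 Optimized
    priorities = sorted(domain_avg, key=domain_avg.get)[:3]
    return overall, band, priorities

# overall, band, focus = assess({"Privacy": [3, 4], "Bias": [2, 2]})
```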
Auto-Generated Action Plan
Recommendations are prioritized by lowest-scoring domains first.
Maturity Bands
- Initial — Ad-hoc, undocumented, person-dependent.
- Developing — Some policies exist, inconsistently applied.
- Defined — Documented, trained, applied across most projects.
- Managed — Measured, monitored, continuously improved.
- Optimized — Embedded in culture, externally auditable, drives strategy.