Part 5 of the “Preparing for Joint Commission AI Certification” series
The RUAIH guidance is clear that AI monitoring can’t be a one-time validation exercise. Algorithms drift. Data inputs change. Model updates happen. Without ongoing monitoring, an AI tool that worked well at deployment can silently degrade, potentially affecting patient care before anyone notices.
But the guidance also acknowledges reality: “Ongoing post-deployment monitoring should be risk-based and scaled to your setting.”
For regional health systems without dedicated data science teams, the question isn’t whether to monitor AI; it’s how to do so effectively with limited resources.
Why Monitoring Matters
AI tools are fundamentally different from traditional software. A scheduling application either works or it doesn’t. An AI algorithm can work, yet work differently than expected, or work well for some patients but poorly for others.
Several factors can cause AI performance to change over time:
Data drift: The patients you serve may differ from the population the AI was trained on, or your patient population may shift over time.
Concept drift: Clinical practices, documentation patterns, or disease presentations may evolve, changing what the AI is trying to predict.
Model updates: Vendors periodically retrain and update their algorithms. These updates usually improve performance, but not always for every use case.
Integration changes: Updates to your EHR, new clinical workflows, or changes in how data flows can affect AI inputs.
The RUAIH guidance captures this well: “AI algorithms may have the capacity to learn and adapt over time, data inputs can change or drift over time, and AI tools and their underlying algorithms may be updated periodically. This means AI tool outcomes and performance can change.”
Risk-Stratify Your Monitoring Approach
Not every AI tool requires the same monitoring intensity. The guidance recommends asking: “How close is this tool to patient care decisions, and what could go wrong if it performs poorly?”
High-Risk: Clinical Decision Support
AI tools that directly influence diagnostic or treatment decisions require the most rigorous monitoring:
- Sepsis prediction and early warning systems
- Diagnostic imaging AI
- Treatment recommendation algorithms
- Clinical deterioration alerts
- Risk stratification affecting treatment plans
Monitoring approach: Quarterly (or more frequent) performance reviews, regular accuracy assessment against ground truth, ongoing bias monitoring across patient populations, and immediate investigation of any suspected failures.
Medium-Risk: Clinical-Adjacent AI
AI tools that inform clinical workflows without autonomously driving decisions warrant regular but less intensive monitoring:
- Ambient clinical documentation
- Clinical documentation improvement suggestions
- Care gap identification
- Prior authorization support
- Patient no-show predictions
Monitoring approach: Quarterly or semi-annual performance reviews, sampling-based quality assessment, user feedback collection, and trend monitoring for emerging issues.
Lower-Risk: Administrative and Operational AI
AI tools supporting non-clinical operations present lower patient safety risk but still warrant baseline monitoring:
- Revenue cycle optimization
- Staffing predictions
- Supply chain management
- Scheduling optimization
Monitoring approach: Annual review aligned with contract renewals, exception-based monitoring triggered by user complaints or unexpected outcomes, and integration into existing operational reporting.
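If it helps to make the tiers concrete, here is a minimal Python sketch of a tier-to-cadence registry. The tool names and review intervals are illustrative assumptions, not a prescribed taxonomy:

```python
# Illustrative only: tool names, tiers, and cadences are examples,
# not a prescribed taxonomy. Adjust to your own risk stratification.
REVIEW_CADENCE_DAYS = {"high": 90, "medium": 180, "lower": 365}

AI_TOOLS = {
    "sepsis_early_warning": "high",
    "ambient_documentation": "medium",
    "staffing_forecast": "lower",
}

for tool, tier in AI_TOOLS.items():
    print(f"{tool}: {tier} risk, review every {REVIEW_CADENCE_DAYS[tier]} days")
```

Even a table this small, kept somewhere shared, answers the first question a reviewer will ask: which tools do you have, and how often do you look at them?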
Practical Monitoring Approaches
You don’t need a data science team to monitor AI effectively. Several approaches work for resource-constrained environments:
Leverage Vendor-Provided Monitoring
The RUAIH guidance explicitly notes that monitoring resources “may be obtained externally in agreements with third-party vendors, or be part of vendor-agreed-upon support or tooling for AI tools.”
When contracting with AI vendors, ask:
- Do you provide performance dashboards or monitoring tools?
- What metrics do you track, and how can we access them?
- How will you notify us of model updates or performance changes?
- Can you provide performance reports specific to our patient population?
Many vendors offer monitoring capabilities, but you may need to ask for them and ensure they’re included in your agreement.
Use Existing Quality Infrastructure
The guidance recommends: “When possible, use structures you already have (quality, patient safety, compliance) rather than creating something new.”
Integrate AI monitoring into existing processes:
Quality committee: Add AI tool performance as a standing agenda item. Review dashboards and exception reports.
Patient safety reporting: Ensure your incident reporting system can capture AI-related events. Train staff to recognize and report AI issues.
Compliance audits: Include AI tool verification in periodic compliance reviews.
EHR governance: If your EHR governance process reviews clinical decision support, extend it to AI-powered CDS.
Establish Sampling-Based Review
For clinical AI tools, periodic sampling can identify performance issues without requiring automated dashboards:
Example for ambient documentation: Each month, review 10-20 randomly selected AI-generated notes. Compare them against the source audio recordings or verify key details with the documenting providers. Track accuracy trends over time.
Example for clinical predictions: Quarterly review of prediction accuracy. For a sepsis alert system, check what percentage of alerts were true positives. Compare performance across demographic groups.
Sampling isn’t perfect, but it’s far better than no monitoring, and it’s achievable without dedicated analytics staff.
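To show how little tooling the sampling step itself needs, here is a minimal Python sketch that draws a reproducible monthly sample for human review. The file name and columns (note_id, provider) are hypothetical stand-ins for whatever export your documentation vendor provides:

```python
import csv
import random

SAMPLE_SIZE = 15  # within the 10-20 notes per month suggested above

# Hypothetical export: one row per AI-generated note this month,
# with note_id and provider columns.
with open("ai_notes_this_month.csv", newline="") as f:
    notes = list(csv.DictReader(f))

# Seed with the review period so auditors can regenerate the same sample.
random.seed("2025-06")
sample = random.sample(notes, min(SAMPLE_SIZE, len(notes)))

for note in sample:
    # Hand these off for manual comparison against audio or provider recall.
    print(note["note_id"], note["provider"])
```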
Monitor User Feedback
Clinicians using AI tools daily are your best early warning system. Establish clear channels for them to report:
- AI outputs that seem incorrect or unusual
- Patterns of AI errors they’re noticing
- Workflow issues caused by AI behavior
- Changes in AI behavior after updates
This doesn’t require complex systems: a designated email address, a Teams channel, or a simple form on your intranet can work.
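Even a flat file beats scattered emails. Here is a minimal sketch of structured feedback capture in Python; the log location, field names, and example entry are all hypothetical:

```python
import csv
import os
from datetime import date

FEEDBACK_LOG = "ai_feedback_log.csv"  # hypothetical shared location
FIELDS = ["date", "tool", "category", "description", "reporter"]

def log_feedback(tool, category, description, reporter):
    """Append one user-reported AI issue to a shared CSV log."""
    write_header = not os.path.exists(FEEDBACK_LOG)
    with open(FEEDBACK_LOG, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if write_header:
            writer.writeheader()
        writer.writerow({
            "date": date.today().isoformat(),
            "tool": tool,
            "category": category,  # e.g., "incorrect output", "post-update change"
            "description": description,
            "reporter": reporter,
        })

# Hypothetical example report.
log_feedback("ambient_documentation", "incorrect output",
             "Note attributed symptoms to the wrong family member",
             "reporting clinician")
```

The point is the consistent fields, not the storage: the same structure works as a form, a spreadsheet, or a ticket template.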
What to Monitor
Specific metrics depend on the AI tool, but key dimensions include the following (a short computation sketch follows these lists):
Performance Metrics
- Accuracy/precision/recall for classification tasks
- Error rates and error types
- Processing time and system availability
- Completion rates (for documentation AI)
Equity Metrics
- Performance broken down by patient demographics (age, race/ethnicity, sex)
- Alert rates across patient populations
- Outcome differences across groups
Operational Metrics
- User adoption and utilization rates
- Workflow efficiency impacts
- User satisfaction scores
- Time savings achieved
Safety Metrics
- Incidents or near-misses involving AI
- Complaints or concerns reported
- Overrides or rejections of AI recommendations by clinicians
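If you can get a spreadsheet export of alerts and chart-reviewed outcomes, several of these dimensions can come out of one short script. A minimal pandas sketch, assuming hypothetical alert_fired, sepsis_confirmed, and race_ethnicity columns:

```python
import pandas as pd

# Hypothetical export: one row per monitored encounter, with a
# chart-reviewed outcome column. File and column names are illustrative.
df = pd.read_csv("sepsis_alert_review.csv")
fired = df[df["alert_fired"] == 1]

# Performance: positive predictive value of fired alerts.
print(f"Overall PPV: {fired['sepsis_confirmed'].mean():.2f}")

# Equity: alert rate and PPV broken down by demographic group.
print(df.groupby("race_ethnicity")["alert_fired"].mean())          # alert rate
print(fired.groupby("race_ethnicity")["sepsis_confirmed"].mean())  # PPV by group
```

If any group’s PPV lags the overall number, that’s a concrete finding to take to your vendor and your quality committee.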
Vendor Communication Channels
The RUAIH guidance emphasizes establishing “clear feedback channels between third-party vendors and those responsible for monitoring the AI tool so that relevant parties stay informed about model changes or updates.”
Key information to receive from vendors:
- Advance notice of model updates (ideally 30+ days for significant changes)
- Release notes explaining what changed and why
- Updated performance data following updates
- Known issues or limitations that emerge
Key information to share with vendors:
- Performance issues you identify in your environment
- Patient population concerns (e.g., “Your model seems less accurate for our elderly population”)
- Feature requests or workflow concerns
- Safety events potentially related to the AI
Document these communications. If a vendor repeatedly fails to notify you of updates or address concerns, that’s relevant information for governance and contract renewal decisions.
Monitoring Documentation
Your monitoring program should generate records that demonstrate ongoing oversight:
For each AI tool, document:
- Monitoring plan (frequency, metrics, responsible parties)
- Performance reviews completed (date, findings, actions taken)
- Issues identified and resolution status
- Vendor communications regarding performance
- Any changes to monitoring approach
This documentation serves two purposes: it drives actual oversight, and it provides evidence of governance if certification or accreditation reviews examine your AI program.
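The record itself can be as simple as a shared template. Here is one possible structure sketched as a Python dataclass; the fields mirror the list above, and the example entry is hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class MonitoringReview:
    """One completed performance review for one AI tool."""
    tool: str
    review_date: str              # ISO date, e.g. "2025-06-30"
    metrics_reviewed: list[str]
    findings: str
    actions_taken: str
    open_issues: list[str] = field(default_factory=list)
    vendor_notified: bool = False

# Hypothetical example entry.
review = MonitoringReview(
    tool="sepsis_early_warning",
    review_date="2025-06-30",
    metrics_reviewed=["PPV", "alert rate by demographic group"],
    findings="PPV stable; alert rate higher for patients over 80",
    actions_taken="Raised the age-group difference with the vendor",
    vendor_notified=True,
)
```

Whether you store these as dataclasses, spreadsheet rows, or committee minutes matters less than keeping the same fields from review to review.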
Getting Started: Minimum Viable Monitoring
If you’re building a monitoring program from scratch, start with these essentials:
Month 1:
- List all AI tools in production
- Risk-stratify each tool (high/medium/lower)
- Identify who is responsible for each tool’s monitoring
Month 2:
- For high-risk tools: Establish quarterly review calendar and define key metrics
- For all tools: Ensure vendor contact is documented and communication channel exists
- Create simple issue reporting mechanism for end users
Month 3:
- Conduct first monitoring review for highest-risk tools
- Add AI monitoring as standing item on quality/governance committee agenda
- Document monitoring plan for each tool
You won’t have automated dashboards or real-time drift detection yet. That’s okay. What you will have is conscious, regular oversight, which is what the RUAIH guidance actually requires.
Pre-Deployment: Questions to Ask Vendors
The RUAIH guidance emphasizes that monitoring starts before deployment: “During procurement, healthcare organizations should request information from developers/vendors on how AI tools were tested and validated for their intended use.”
Before purchasing or deploying an AI tool, ask:
- How was this tool tested and validated?
- What population was it trained and tested on? How does that compare to our patient population?
- How were relevant biases evaluated?
- Are you willing to validate or tune the model on our local data?
- What monitoring tools do you provide?
- How will you notify us of model updates?
- What performance metrics will you share with us?
Document vendor responses. These become the baseline for ongoing monitoring and the foundation for vendor accountability.
Next in the series: “AI Safety Events: Building a Reporting Culture Before Something Goes Wrong”
Harness.health helps regional health systems build AI governance programs aligned with Joint Commission RUAIH guidance. Learn more about our platform.