CARAT — Clinical AI Reliability & Accuracy Test

The Standard

The world does not need another AI product. It needs a way to know which ones are safe.

Every ambient AI company claims accuracy. Every vendor promises safety. Every sales deck shows a time savings graph. But when a hospital CEO asks to see the hallucination rate, the clinical error data, or the sovereignty audit — the room goes quiet.

CARAT changes that. It is an open, independently validated benchmark that scores clinical AI platforms across the seven dimensions that determine whether a system is safe to deploy in a public health environment. No marketing language. No cherry-picked numbers. Just the score.

"In healthcare, the question is never whether AI can save time. The question is whether it can be trusted with a life."

CARAT Clinical Advisory Board

The Seven Dimensions

What CARAT measures.

Seven dimensions. Each one independently measurable. Together they define whether a clinical AI platform is genuinely ready for a public health environment — not just a private clinic demo.

Dimension One

Accuracy Score

How faithfully does the AI capture what was actually said? We measure hallucination rate — content generated that was never spoken — and omission rate — clinically relevant content that was spoken but not captured. Tested across 100 standardised clinical scenarios validated by practising clinicians.

Hallucination Rate · Omission Rate · Verbatim Fidelity
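As a rough illustration of the two rates described above, the following minimal Python sketch compares the clinical facts spoken in a consultation with the facts that appear in the AI note. The function name `accuracy_rates`, the representation of facts as normalised strings, and the choice of denominators are all assumptions made for illustration; the published CARAT methodology may define them differently.

```python
from typing import Iterable, Tuple


def accuracy_rates(spoken_facts: Iterable[str],
                   note_facts: Iterable[str]) -> Tuple[float, float]:
    """Return (hallucination_rate, omission_rate) for one scenario.

    Assumption: each fact is a normalised string, so set comparison is enough.
    Assumption: hallucination rate is measured against what the note contains,
    and omission rate against what was actually spoken.
    """
    spoken = set(spoken_facts)
    noted = set(note_facts)

    hallucinated = noted - spoken   # in the note, never spoken
    omitted = spoken - noted        # spoken, missing from the note

    hallucination_rate = len(hallucinated) / len(noted) if noted else 0.0
    omission_rate = len(omitted) / len(spoken) if spoken else 0.0
    return hallucination_rate, omission_rate


if __name__ == "__main__":
    # Hypothetical scenario data, not taken from any CARAT test set.
    spoken = ["chest pain", "aspirin 300 mg", "no allergies", "onset 2 hours ago"]
    note = ["chest pain", "aspirin 300 mg", "fever"]
    h_rate, o_rate = accuracy_rates(spoken, note)
    print(f"hallucination rate: {h_rate:.2f}, omission rate: {o_rate:.2f}")
```

In practice the hard part is deciding what counts as one fact and when two phrasings match, which is why the scenarios rely on a clinician-verified ground truth rather than string comparison alone.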

Dimension Two

Safety Score

In clinical AI, an error is not a bug report. It is a patient safety event. We score drug name accuracy against a standardised medication corpus, clinical risk flag detection, and whether the platform has governance controls that catch errors before they enter the permanent record.

Drug Name Accuracy · Risk Flag Detection · Governance Controls

Dimension Three

Sovereignty Score

Where does patient data go? For how long? Under whose jurisdiction? We score data hosting location, security certification level, government compliance frameworks achieved, and whether the vendor can demonstrate that patient data never leaves the country of care. This dimension is a pass or fail for government health systems.

Data Hosting Location · Security Certification · Government Compliance

Dimension Four

Clinician Trust Score

Adoption rate tells you whether clinicians actually use the tool. Edit rate tells you how much they trust what it produces. Satisfaction score tells you whether it helps or adds a new burden. We measure all three because a clinical AI platform that clinicians quietly stop using is not a solution. It is an expensive pilot.

Adoption Rate · Edit Rate · Clinician Satisfaction

Dimension Five

Consent and Transparency Score

Patients have a right to know they are being recorded. We score whether the platform requires explicit documented patient consent before recording begins, whether patients are notified clearly and in plain language, and whether consent records are retained and auditable. In 2026, lawsuits have already been filed against ambient AI companies for recording without adequate consent. This dimension is not optional.

Patient Consent Protocol · Notification Standard · Consent Audit Trail

Dimension Six

Edit Rate Score

The edit rate is the most honest measure of real world AI accuracy that exists. It tells you what clinicians actually do with AI output after the demo is over. A low edit rate means clinicians trust what the AI produces. A high edit rate means they are essentially rewriting notes — and the AI is adding work, not removing it. We score vendors on their independently verified edit rate from live clinical deployments.

Pre-Sign Edit Rate · Field Level Corrections · Time to Final Sign-off
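One possible way to quantify a pre-sign edit rate is to compare the AI draft with the note the clinician finally signs, word by word. The sketch below uses Python's standard `difflib.SequenceMatcher`; the function name `edit_rate` and the definition "one minus the word-level similarity ratio" are assumptions for illustration, not the measure CARAT itself applies.

```python
from difflib import SequenceMatcher


def edit_rate(ai_draft: str, signed_note: str) -> float:
    """Fraction of the note that changed between AI draft and sign-off.

    Assumption: the edit rate is 1 minus the word-level similarity ratio.
    0.0 means the clinician signed the draft unchanged; 1.0 means no words
    in the draft survived into the signed note.
    """
    draft_words = ai_draft.split()
    signed_words = signed_note.split()
    matcher = SequenceMatcher(None, draft_words, signed_words, autojunk=False)
    return 1.0 - matcher.ratio()


if __name__ == "__main__":
    # Hypothetical notes, for illustration only.
    draft = "Patient reports chest pain for two days"
    signed = "Patient reports chest pain for two hours"
    print(f"edit rate: {edit_rate(draft, signed):.2f}")
```

A word-level measure like this treats every change equally. A correction to a dose or a drug name is a single word but may matter more than a paragraph of rephrasing, which is one reason field level corrections are tracked separately.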

Dimension Seven

De-identification and Data Destruction Score

When a clinical consultation ends, what happens to the recording? What happens to the transcript? What happens to the draft note? Most vendors cannot answer these questions precisely. We score the anonymisation standard applied to stored data, the verified retention period, the data destruction protocol and proof of destruction, the right to erasure process under applicable privacy law, and whether a complete audit trail of every deletion event exists and is independently verifiable. In a world where patient conversations contain some of the most sensitive data that exists, the question of what happens after the note is signed matters as much as the note itself.

Anonymisation Standard · Data Retention Period · Verified Destruction Protocol · Right to Erasure Compliance · Deletion Audit Trail

Global Leaderboard

The CARAT Leaderboard.

The first independently scored ranking of ambient clinical AI platforms in the world. Scores are derived from published peer reviewed research, vendor disclosures, and real world deployment data. The first full scored release publishes August 2026.

August 2026 Release: Full CARAT scores will be published in the first week of August 2026, incorporating real world outcome data from a contracted Australian Government health system deployment with 5,000 clinicians. Subscribe to the ClinicalAI Pulse below to receive the full report the moment it publishes.

Platform	Status	Organisation	Accuracy	Safety	Sovereignty	Clinician Trust	Consent	Edit Rate	De-ID and Destruction	CARAT Score
MedTalk AI	Gov Contracted	Canberra Health Services, Australia	Aug 2026	Aug 2026	Aug 2026	Aug 2026	Aug 2026	Aug 2026	Aug 2026	Aug 2026
Nuance DAX	Pending	Microsoft, United States	—	—	—	—	—	—	—	Pending
Abridge	Pending	Abridge AI, United States	—	—	—	—	—	—	—	Pending
Suki	Pending	Suki AI, United States	—	—	—	—	—	—	—	Pending
Heidi Health	Pending	Heidi Health, Australia	—	—	—	—	—	—	—	Pending

Scores are derived from published peer-reviewed research, public vendor disclosures and, where available, real-world deployment data. Vendors are invited to submit their own data for independent verification. The methodology is published openly at CARATscore.org/methodology.

The Process

How CARAT scores are determined.

Every score is derived from a combination of standardised test scenarios, published research, and real world deployment data. No vendor pays for inclusion. No score is negotiated. The methodology is published openly so any researcher in the world can replicate it.

1

Standardised Scenarios

100 validated clinical conversations across Emergency, Outpatient and Specialty settings. Each has a clinician-verified ground truth note.

2

Independent Scoring

AI output is scored against ground truth by the clinical advisory board. Hallucinations, omissions, drug errors and risk flags are each catalogued.

3

Real World Validation

Where deployment data is available from real health systems, it is incorporated and weighted above simulated testing.
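To make "weighted above simulated testing" concrete, here is a minimal Python sketch of one way such a blend could work. The function name `combine_scores`, the 0 to 100 scale, and the default weight of 0.7 are hypothetical placeholders; the text above says only that real world data outweighs simulated results, not by how much.

```python
from typing import Optional


def combine_scores(simulated: float,
                   real_world: Optional[float],
                   real_world_weight: float = 0.7) -> float:
    """Blend a simulated-scenario score with a real-world deployment score.

    Assumption: both scores share the same scale (for example 0 to 100).
    Assumption: the weight 0.7 is illustrative; any value above 0.5 keeps
    real-world data weighted above simulated testing.
    """
    if not 0.5 < real_world_weight <= 1.0:
        raise ValueError("real_world_weight must be in (0.5, 1.0]")
    if real_world is None:
        # No deployment data available: fall back to simulated testing alone.
        return simulated
    return real_world_weight * real_world + (1.0 - real_world_weight) * simulated


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    print(combine_scores(simulated=80.0, real_world=None))
    print(round(combine_scores(simulated=80.0, real_world=90.0), 2))
```

Rejecting weights of 0.5 or below keeps the sketch faithful to the stated rule, so a misconfigured weight fails loudly instead of silently favouring simulated results.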

4

Public Publication

Scores and methodology are published openly. Vendors may submit data for reconsideration. Every change to a score is logged and versioned.

Clinical Advisory Board

Independent voices. No vendor ties.

CARAT scores are validated by an international clinical advisory board of practising clinicians with direct experience deploying AI in live healthcare environments across multiple countries. Every board member is independent. No member is employed by or has a financial interest in any platform scored by CARAT.

Emergency Medicine Clinician

Advisory Board Member — North America

We are seeking a practising Emergency Medicine clinician with direct experience reviewing AI-generated clinical documentation in a high-volume acute care setting.

Clinical Informatics Specialist

Advisory Board Member — Europe

We are seeking a clinician with expertise in health informatics, EHR governance, and AI documentation quality assessment from a European health system context.

Primary Care Physician

Advisory Board Member — Asia Pacific

We are seeking a primary care clinician from the Asia Pacific region with experience assessing AI documentation tools across diverse patient populations and multilingual settings.

Are you a practising clinician? We are building the most credible independent clinical AI advisory board in the world. If you have direct experience with AI documentation tools in a clinical setting and want to contribute to the global standard, we want to hear from you. Board members are not paid. They are credited. And they help shape the standard that every hospital procurement team will reference.

Express Your Interest

Get in Touch

We want to hear from you.

Whether you are a clinician who wants to join the advisory board, a vendor who wants to submit your platform data, a journalist covering clinical AI, or a hospital procurement team with questions about the CARAT standard — there is a right door for you below.

For Vendors

Submit Your Platform Data

Every platform on the CARAT leaderboard is invited to submit its own data for independent verification. Transparency is not a weakness. It is the only way to build trust.

hello@caratscore.org →

For Clinicians

Join the Advisory Board

We are recruiting independent practising clinicians from North America, Europe and Asia Pacific. No vendor ties. No payment. Just the most credible seat at the most important table in clinical AI right now.

hello@caratscore.org →

For Media

Press and Media Enquiries

The first CARAT scores publish August 2026. If you are a journalist, editor or producer covering clinical AI, health technology or patient safety and want early access or an embargo briefing, we would love to talk.

hello@caratscore.org →

General Enquiries

hello@caratscore.org

Powered By

MedTalk AI — medtalk.co

Atif Nisar

linkedin.com/in/atif-medtalk

Subscribe to the Pulse

Does your clinical AI
earn the right to be trusted?

The weekly briefing for people who make decisions about clinical AI.
