→ Proof Points
Hey {{first_name}},
I wrote this from Las Vegas, at HIMSS26: the large, loud, and somewhat bewildering annual gathering of hospital systems and health IT providers, held in a city designed to remind you that rational decision-making is entirely optional.
There is no better setting to watch the AI hype cycle unfold in real time.
In the Gartner Hype Cycle of things, I've been an early adopter my whole life: First GPS. First mobile phone. First Internet radio. First VR headset… I was so far ahead of the curve at one point that I could buy a gadget, own it for a year, and sell it on eBay for more than I'd paid.
But I got wiser. Now I wait for (at least) Version 3. I've been around long enough to watch multiple technologies ride the full Gartner hype cycle: robotics, IoT, Web 3.0, blockchain, digital twins. They all followed the same arc. Technology trigger. Peak of inflated expectations. Trough of disillusionment. Then eventually, if the underlying idea was sound, a plateau of productivity.
AI in healthcare is somewhere between the peak and the trough right now, hurtling at speed.
The good news? Some people are already doing the work to pull it toward the plateau.
And they did it in eight days! See below…
— Paul
OUR PODCAST
How do papers really get published?
No, we did not speak with an AI. We got the insider scoop with The Secret Editor. In this episode of Prove It, I sat down with a senior journal editor to pull back the curtain on how papers actually get published.
The question we're answering:
🤔 What is the #1 rule for getting accepted into the best journals?
DEEP DIVE
Someone checked OpenAI's homework

On 7 January 2026, a Wednesday, OpenAI launched ChatGPT Health.
By the following Thursday, eight days later, 18 researchers at Mount Sinai's Icahn School of Medicine had collected their data, run their analysis, written a 2,000-word critique, and submitted it to Nature Medicine. A month later, it was covered in the Guardian.
Much of the work happened over a weekend. On their own time. Without specific funding. The data collection took three days. The paper was accepted in 35 days, moving to the front of the queue at Nature Medicine as an accelerated article preview, with Eric Topol (one of the most respected figures in digital health AI globally) listed as a peer reviewer.
This is what "move fast and prove things" looks like when it's done right.
The paper studied 60 vignettes across 21 clinical conditions and found something we should all be concerned about:
→ ChatGPT Health could be talked out of its own clinical recommendations.
For example, you could tell it that despite your sudden onset of acute symptoms, your mum thinks you're fine, and the system would revise its triage recommendation downward. A system like that isn't helping anyone, least of all the patient.
The researchers identified two major failure modes.
1️⃣ First: over-triage pushes emergency resources toward patients who don't urgently need them, crowding out those who do.
2️⃣ Second: under-triage means a patient presenting with symptoms of a heart attack is told to rest and take paracetamol. In a healthcare system already running at capacity (and in the US, one now consuming nearly 20 cents of every dollar of GDP), both failure modes are expensive. Potentially fatal.
Compare that to the companies I've worked with who have spent years building validated health AI.
Ada Health had 50 doctors and scientists on staff when we first engaged with them. They spent years refining triage for their regulated medical device in both the EU and the US, and their safety team is world renowned. Hospitals, health systems, and pharma trust them enough to change care flows, diligence that was rewarded last week with a paper in the New England Journal of Medicine AI.
Woebot Health had a dozen clinical psychologists and psychiatrists building and stress-testing their suicide safety detection (a very important subject). This included wrestling with the genuinely difficult problem of distinguishing "I wish I was dead" from Gen Z shorthand for finding something funny, like "omg im ded ☠️". These teams understand how hard nuance is because they've spent thousands of hours getting it wrong first. They also understand that their business models require human oversight at a level that ChatGPT's simply can't support.
Then there is Microsoft. In June 2025, their CEO claimed their AI was “4 times better than human doctors” at diagnosing disease. This was published on Microsoft's own blog, titled The Path to Medical Superintelligence.
Not in a peer-reviewed journal. Not replicated by an independent team. Just announced, the way you're not supposed to. (Microsoft made another PR push just last week!)
I am not saying AI cannot eventually reach extraordinary diagnostic capability. It probably can, eventually. But which route gets you there safely? And who is showing their working, so that when we look back in a few years and things have gone wrong (as they always do), we can fix them?
"Move fast and break things" has a real cost in healthcare that it doesn't carry in consumer tech. These are people's lives. If your triage system calls a heart attack a panic attack, there is no moving on.
The Mount Sinai team understood the urgency. They burned a weekend. They did the work. And their code and data are all publicly available. Anyone can verify their findings independently. That is (and should be!) the standard.
Not a press release on a company blog.
📝 TIP
Most digital health companies are sleeping on preprints.
A preprint, published on a server such as medRxiv, lets you share your findings before full peer review is complete. That means when you finish a study, you don't have to wait 18 months for a journal to catch up before you can begin shaping the narrative publicly.
I played a small part in getting preprints accepted in medicine through work on the BMJ's editorial board. The BMJ became an institutional sponsor of medRxiv.
Similarly, the Health Chatbot User's Guide (a UK government-funded initiative at the University of Birmingham) recently announced its launch in Nature Health while the project was still underway. Traditionally, research only surfaces at the end, like some sort of nerdy submarine, prompting a lot of people to ask "why wasn't I involved?"
But this is the right approach for public-facing work. Building in public, when you do it with proper methodology, is not a concession of standards. Done with precision, it can solidify your spot in this space.
The Proof Point
If you're waiting until everything is complete before communicating how you're working and what you've found so far, you're leaving influence, authority, and competitive positioning on the table. Don't hand them to those who would be rather less careful with our health.
— Paul
FROM OUR DESK
This Month at ProofStack
This was my first time at the infamous HIMSS conference, and my honest take is that it was not what I was expecting. It was like a state fair, but for health nerds. Think puppy parks, coffee machines, robots, virtual reality headsets, even a guy dressed up as a pulse oximeter. (Yep.)
There was also a ton of AI fatigue. The expo floor was full of companies describing the exact same AI use case, usually what they hope their AI will do to close revenue gaps or improve billing, yet very few could say whether they had data to back up the marketing claims on their booths, how they know it works, or how it helps patients!
It was easy to feel disorientated amongst 25,000 people, a great many sponsor logos, and a question nobody was saying out loud but everyone was circling awkwardly: will the US government continue to permit large, unregulated tech companies to practice medicine without a license?
Fortunately I got to spend time with my first podcast interviewee, Dr. Acacia Parks!
ONE QUESTION
What's the biggest lie in digital health right now?

P.S. Do you know your evidence score? We created this quick quiz to help you understand your regulatory compliance level.






