20 - 28 December 2025

AI in the NHS Weekly Newsletter - Issue #29

This festive period saw the group tackle some of the most substantive debates of the year, from the thorny question of whether AVT audio recordings should be retained for medico-legal purposes, to a paradigm-shifting discussion on whether human clinicians "hallucinate" in their notes just as LLMs do. The community welcomed new tools including Vigil (for simplified MHRA reporting) and a HEIC-to-JPG converter, whilst sharing Christmas cheer through AI-generated content and the much-appreciated Xmas Special Newsletter. A standout NEJM AI paper on VeriFact--automated clinical fact-checking--sparked debate on the future of documentation verification, and members explored the "jagged frontier" of AI capabilities as 2025 draws to a close.

Major Topic Sections

1. The Great AVT Audio Retention Debate (20 Dec)

The week opened with what became the most substantive single-day discussion of the period: should ambient voice transcription (AVT) tools retain the original consultation audio?

A digital health specialist opened proceedings with a reminder that out-of-hours services have recorded telephone consultations since the early 2000s without the "world exploding." This sparked a vigorous debate about the parallels--and differences--between historical call recording and modern AI transcription.

"The principle is we keep raw data for future verification & traceability. If this is the principle then all audio from AVT should be kept." -- Innovation-focused GP

A healthcare informatics specialist pushed back thoughtfully, questioning whether the comparison held:

"Is it because a conscious decision was made to do that or is it because x-rays were physical things that formed part of the record? We don't keep blood samples forever. Or empty vaccine vials after writing down the batch number." -- Healthcare Informatics Specialist

The debate intensified when an AVT company CEO entered the discussion, explaining the practical constraints: data controller responsibilities, cybersecurity implications equivalent to holding a "shadow patient record," and the fact that some organisations explicitly requested audio NOT be retained.

"It's a massive cybersecurity risk (equivalent to EHR level capabilities) - no AVT vendor should really want to get into that space unless they are prepped to be cleared to the same level to hold essentially a shadow patient record." -- AVT Company CEO

A clinical safety expert emphasised the importance of proper DCB0160 processes:

"This is why the process of doing the DCB0160 correctly is so important, and why I automatically judge people as a bit... amateurish if they see it as a chore to be outsourced to someone else." -- Clinical Safety Expert

The discussion also touched on the Abridge lawsuit in California, where stricter consent requirements apply, and the practical nightmare of Subject Access Requests for audio files.

Key dates referenced: 20 December (primary), with callbacks throughout the week

2. LLMs as Knowledge Bases: The Reliability Question (23 Dec)

Tuesday saw a technical deep-dive into whether LLMs can reliably serve as clinical knowledge sources--a question with profound implications for medical AI deployment.

An AVT company CEO articulated the fundamental challenge:

"The problem with LLMs is they aren't designed to be knowledge bases - they are encoded by accident in the training of their weights to produce coherent sentences - but their original design was for translation (text to text)." -- AVT Company CEO

"The reasoning is actually BS - there was a paper recently that the reasoning graph they give is actually nothing to do with what is activated in the model - they literally pretend. So it's not traceable." -- AVT Company CEO

A recently qualified GP and AI developer offered a more optimistic technical perspective:

"Greedy decoding, low temperature (0) and tight server architecture can make output near deterministic. With a fixed knowledge base (or graph) the underlying LLM is almost irrelevant if the semantic match is high." -- Recently Qualified GP & AI Developer

The discussion evolved into liability implications, with a clinical safety expert noting:

"Until there's a case that gets to at least the Court of Appeal, insurance/indemnity providers will not know the extent of their liabilities for harm cases. It really is the wild west of unmade law just now." -- Clinical Safety Expert

The concept of a "Bolam test for AI" was floated half-jokingly--but may prove prescient.

Key dates referenced: 23 December (primary), 27 December (follow-up)

3. Do Humans Hallucinate Too? The 7% Revelation (27 Dec)

Perhaps the most paradigm-shifting discussion of the period emerged on Boxing Day weekend, when an AVT company CEO shared findings from incidental testing:

"We did some incidental testing of human clinician outputs and humans do indeed appear to hallucinate information - 7% was the figure although this was a tiny study so should be done properly." -- AVT Company CEO

This challenged the prevailing assumption that AI hallucination is categorically different from human error. A digital health specialist noted from personal audit experience:

"I saw lots of information added to notes that had never been discussed in consultation in my time auditing calls OOH. Commonly, red flag checking that never happened." -- Digital Health & Clinical AI Specialist

"It's not always cut and dried though. Some clinicians genuinely thought they had asked the question or heard the answer." -- Digital Health & Clinical AI Specialist

An innovation-focused GP initially pushed back, distinguishing intentional falsification from LLM design limitations:

"This is falsification, criminal activity. LLM hallucinations are due to the design not malicious. I would say both are different." -- Innovation-focused GP

But the conversation revealed more nuance--template consultations, dot phrases, and copy-paste culture all create risks of "accidental hallucination" without any AI involvement.

A cardiologist and clinical informatician offered a philosophical coda:

"All data is contextualised and challenged by smart humans automatically, something that's incredibly hard to replicate in AI because you'd essentially then be programming a version of distrust and mistrust into algorithms." -- Cardiologist & Clinical Informatician

Key dates referenced: 27 December (primary), 28 December (follow-up)

4. VeriFact: Automated Clinical Fact-Checking (27 Dec)

A digital health specialist shared an NEJM AI paper introducing VeriFact--an AI system for automatically verifying clinical documents against patient records.

"This kind of thing genuinely offers hope for improved summaries... VeriFact achieved ~93% agreement with clinician chart review, exceeding inter-clinician agreement itself." -- Digital Health & Clinical AI Specialist

The methodology uses hybrid retrieval (dense + sparse embeddings) with an "LLM-as-a-judge" approach. A recently qualified GP raised valid concerns:

"So instead of sending health data to one platform we now send it to 3. This can work if self hosted/siloed otherwise the data risks are off the charts." -- Recently Qualified GP & AI Developer

An innovation-focused GP listed practical objections including cost, latency, and the irony of adding another hallucination-prone layer to catch hallucinations. But others saw genuine promise in the approach.
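For readers curious what "hybrid retrieval" means mechanically, here is a miniature sketch: a dense (embedding-cosine) score fused with a sparse (term-overlap) score. The fusion weight and scoring functions are illustrative assumptions, not VeriFact's actual implementation, and the term-overlap function is a crude stand-in for BM25:

```python
import math
from collections import Counter

def cosine(u, v):
    """Dense score: cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def term_overlap(query, doc):
    """Sparse score: fraction of query terms found in the document
    (a simplified stand-in for BM25-style lexical matching)."""
    q, d = Counter(query.lower().split()), Counter(doc.lower().split())
    return sum(min(q[t], d[t]) for t in q) / max(sum(q.values()), 1)

def hybrid_score(q_vec, d_vec, query, doc, alpha=0.5):
    """Fuse dense and sparse evidence; alpha is an illustrative weight.
    Top-scoring EHR passages would then go to an LLM judge that labels
    each extracted claim as supported or unsupported."""
    return alpha * cosine(q_vec, d_vec) + (1 - alpha) * term_overlap(query, doc)
```

The appeal of combining both signals is that dense matching catches paraphrases while sparse matching catches exact clinical terms; either alone misses cases the other handles.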

Key dates referenced: 27 December

5. Tools & Gifts: Community Contributions (22, 26 Dec)

The festive spirit extended to practical tool-sharing.

When a clinical safety expert asked for a reliable HEIC-to-JPG converter, within minutes a digital health specialist had spun up a Python solution and published it to GitHub:

"done. have at it folks" -- Digital Health & Clinical AI Specialist

The playful licensing noted it was "for use exclusively by members of the 'AI in the NHS' WhatsApp group."
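For the curious, a converter along these lines is only a few lines of Python. The sketch below assumes the pillow-heif plugin for Pillow; the library choice and function names are illustrative, not necessarily what was published to GitHub:

```python
from pathlib import Path

def jpg_target(src: Path) -> Path:
    """Map an input .heic/.HEIC path to a .jpg sibling path."""
    return src.with_suffix(".jpg")

def convert_folder(folder: str) -> None:
    """Convert every HEIC image in a folder to JPEG.
    Requires: pip install pillow pillow-heif (assumed library choice)."""
    from PIL import Image                          # lazy imports keep the
    from pillow_heif import register_heif_opener   # path helper dependency-free
    register_heif_opener()  # teaches Pillow to open HEIC/HEIF files
    for src in Path(folder).glob("*.[hH][eE][iI][cC]"):
        Image.open(src).convert("RGB").save(jpg_target(src), "JPEG", quality=90)
```

The `.convert("RGB")` step matters because HEIC images may carry an alpha channel, which JPEG cannot store.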

On Boxing Day, a new tool was unveiled:

"Boxing day is traditionally the day you get to return faulty gifts to the shops, so in the spirit of that tradition Team Curistica has a gift for you... Vigil allows you to quickly create a report that you can submit via email to the MHRA." -- Digital Health & Clinical AI Specialist

Community members immediately tested it, identified bugs, and saw fixes deployed within hours--a perfect example of the group's collaborative ethos.

Key dates referenced: 22 December, 26 December

Quote Wall

"Your very first query/test on ChatGPT was 'summarise this in the style of Brian Blessed'" -- Clinical Safety Expert on early AI experiments

"The problem with LLMs is they aren't designed to be knowledge bases - they are encoded by accident" -- AVT Company CEO on fundamental AI architecture

"I feel bad enough every time I scoff an illicit bit of cheese from the fridge, I don't need Gemini using that data to share it with my life insurance company" -- Practice Manager on smart home privacy

"Gemini is a much better doctor than it is a lawyer for sure" -- Recently Qualified GP & AI Developer

"Anyone who is anyone knows about this group!" -- Healthcare Informatics Specialist on the community's reputation

"Trust is earned continuously - through care, clarity & humility" -- From the "12 Days of Christmas (Clinical AI Edition)"

"We can't build a new train line as a nation without it costing a major chunk of GDP, and even then we STILL can't do it" -- Clinical Safety Expert on UK infrastructure challenges

"It's only when change comes that we think deeply about the value of what we have" -- Digital Health & Clinical AI Specialist on the art of clinical note-taking

Journal Watch

Academic Papers & Key Studies

VeriFact: Automated Clinical Fact-Checking (NEJM AI, December 2025). Introduces VeriFact, an AI system achieving ~93% agreement with clinician chart review for verifying clinical documents against EHRs. Uses hybrid retrieval and an LLM-as-a-judge methodology. Shared: 27 December

Clinical Error Significance Classification (BMJ Quality & Safety, 2025). Work by Asgari et al. on classifying the clinical significance of errors in AI-generated content. Referenced: 27 December

Australian Telco Firewall Failure Investigation (Slashdot/official report, December 2025). Analysis of the Optus outage in which 10 mistakes in a firewall upgrade led to 14 hours of emergency services downtime, 455 failed calls, and 2 deaths. Shared: 20 December https://it.slashdot.org/story/25/12/19/2241221

HealthBench: Evaluating LLMs in Healthcare (OpenAI, 2025). Framework for benchmarking LLM performance in conversational healthcare AI. Shared: 24 December https://openai.com/index/healthbench/

AI Psychosis: What Physicians Should Know (Medscape, December 2025). Article on emerging patterns of AI-related mental health presentations, suggesting clinicians should "ask about AI use the way you ask about sleep, substance use, or social isolation." Shared: 23 December https://www.medscape.com/viewarticle/ai-psychosis-what-physicians-should-know-about-emerging-2025a100104z

Anthropic's Bloom Research (Anthropic, December 2025). New research from Anthropic on AI interpretability. Shared: 28 December https://www.anthropic.com/research/bloom

Industry & News Articles

DXS Systems Data Breach (TechCrunch, December 2025). NHS clinical decision support provider confirms data breach. Shared: 22 December https://techcrunch.com/2025/12/18/tech-provider-for-nhs-england-confirms-data-breach/

TPP Quadruples Profits (HSJ, December 2025). Major NHS IT supplier reports significant profit increase. Shared: 22 December https://www.hsj.co.uk/technology-and-innovation/major-nhs-it-supplier-quadruples-profits/7040606.article

Moorfields AI Research Legal Battle (HSJ, December 2025). Trust launches legal action against a consultant over AI research IP. Shared: 23 December https://www.hsj.co.uk/technology-and-innovation/trust-launches-legal-battle-with-consultant-who-led-ai-research/7040624.article

NHS England Removes Open Source Policy (Digital Health, December 2025). NHS England quietly removes open source policy web pages. Shared: 22 December https://www.digitalhealth.net/2025/12/nhs-england-quietly-removes-open-source-policy-web-pages/

AI Debt Boom & Corporate Bond Sales (Financial Times, December 2025). Analysis of AI-driven debt and potential bubble indicators. Shared: 23 December

What 2026 Looks Like (LessWrong, 2021). Predictions about 2026 made in 2021, with Wayback Machine verification. Shared: 22 December https://www.lesswrong.com/posts/6Xgy6CAf2jqHhynHL/what-2026-looks-like

Technical Resources & Guidelines

MedASR: Google Medical Speech Recognition. Google's health AI developer foundation for medical speech recognition. Shared: 23 December

ProfValMed: AI Medical Reference Tool. GPT-4o + RAG-based clinical reference system (FDA cleared). Discussed: 23 December https://profvalmed.com/

AI-2027 Predictions. Follow-up to the 2026 predictions, forecasting AI developments to 2027. Shared: 23 December https://ai-2027.com/

NotebookLM 30-Minute Lectures (Testing Catalog). Google tests extended audio lectures, including British English voices. Shared: 23 December https://www.testingcatalog.com/exclusive-google-tests-30-minute-audio-lectures-on-notebooklm/

Stanford HAI: What Workers Want from AI. Shared: 28 December https://hai.stanford.edu/news/what-workers-really-want-from-artificial-intelligence

The Shape of AI: Jaggedness & Bottlenecks (One Useful Thing). Shared: 28 December https://www.oneusefulthing.org/p/the-shape-of-ai-jaggedness-bottlenecks

Two's Company, Three's a Crowd: AI in the Consultation (BJGP Life, December 2025). Shared: 24 December https://bjgplife.com/twos-company-threes-a-crowd-ai-in-the-consultation/

Group Personality Snapshot

This Christmas period captured everything that makes this community special: the ability to pivot from deeply technical debates about transformer architecture and clinical liability, to collective mockery of AI-generated stethoscopes, and back to thoughtful reflection on the art of medical record-keeping that technology risks erasing.

Dashboard Table

Total Messages: 383
Peak Day: Saturday 20 Dec (82 messages)
Most Active Period: 11:00-14:00
Average/Day: 42.6 messages
Weekend Activity: 38% (146/383)
Weekday Activity: 62% (237/383)
Date Range: 9 days (20-28 Dec)

Key Insights

Saturday Dominance: The AVT audio retention debate on 20 Dec generated sustained engagement from 00:16 through 18:51

Christmas Resilience: Despite the holiday, 25 Dec maintained healthy activity primarily through greetings and the Xmas Newsletter release

Weekend Deep-Dives: Saturday 27 Dec saw the second-highest weekend engagement with the clinical hallucination discussion

Morning Peak: Most active window was 10:00-14:00 across all days

Late Night Thinkers: Substantive posts at 01:40 (28 Dec) and 23:23 (25 Dec) from a cardiologist and clinical informatician

Top 10 Contributors (by message count)

1. Digital Health & Clinical AI Specialist (Moderator) - 58 messages
2. Innovation-Focused GP - 42
3. Clinical Safety Expert - 31
4. Recently Qualified GP & AI Developer - 28
5. Healthcare Informatics Specialist - 24
6. Radiology & AI Specialist - 18
7. AVT Company CEO - 16
8. Community Healthcare Lead - 14
9. Pharmacist & Content Curator - 13
10. Clinical Safety Officer & Digital GP - 12

Hottest Debate Topics (by message volume)

1. AVT Audio Retention & Storage: ~65 messages (peak: Sat 20 Dec)
2. LLMs as Knowledge Bases / Liability: ~40 messages (peak: Tue 23 Dec)
3. Human vs AI Hallucinations in Notes: ~35 messages (peak: Sat 27 Dec)
4. VeriFact & Clinical Verification: ~20 messages (peak: Sat 27 Dec)
5. NotebookLM Voice Preferences: ~15 messages (peak: Tue 23 Dec)
6. Smart Fridges / AI Privacy: ~12 messages (peak: Mon 22 Dec)
7. HEIC to JPG Conversion: ~10 messages (peak: Mon 22 Dec)
8. Tick-boxes vs Free-text (Legal): ~10 messages (peak: Mon 22 Dec)

Discussion Quality Metrics

Evidence-Based Contributions: High--multiple academic papers cited and discussed

Cross-Expertise Engagement: Strong--clinicians, developers, CSOs, and executives all participated

Constructive Debate: Excellent--AVT audio discussion maintained respect despite strong disagreements

Tool Sharing: 2 new tools released (Vigil, HEIC converter)

External Resource Sharing: 20+ links to papers, articles, and tools

Daily Theme Summary

Saturday, 20 December 2025

Primary Theme: AVT Audio Recording & Storage Debate

Key Discussion: Vigorous exchange on whether ambient voice transcription tools should retain original audio recordings. Debate centred on medico-legal implications, data controller responsibilities, cybersecurity risks, and parallels to historical OOH call recording. An AVT company CEO provided insider perspective on why audio storage isn't offered by default.

Secondary Discussions:

- Australian telco firewall failure causing 2 deaths (clinical safety parallel)

- DCB0160 clinical safety processes

- California vs UK data protection requirements

- Vaccine batch number analogy for traceability

Notable: Most active day of the period with 82 messages. Newsletter #28 published.

Sunday, 21 December 2025

Primary Theme: Newsletter Distribution & Light Discussion

Key Discussion: Newsletter #28 officially published with audio summary. A low-traffic day, with the group recovering from Saturday's intensive debate.

Secondary Discussions:

- Fake DDR5 memory hardware scams

- Hardware authenticity concerns

Notable: Quietest day of the period (18 messages). Newsletter archive link shared.

Monday, 22 December 2025

Primary Theme: Data Breaches & Community Tool Building

Key Discussion: DXS Systems data breach reported, prompting discussion of NHS supplier security. Separately, a request for HEIC-to-JPG conversion led to rapid community tool development, with a GitHub repo created in under 15 minutes.

Secondary Discussions:

- Microsoft Voice Typing compliance in GP settings

- Tick-boxes vs free-text for medico-legal purposes

- TPP quadruples profits story

- AI fridges with Gemini integration (privacy concerns)

- NHS removes open source policy pages

- 2026 predictions written in 2021

Notable: Strong collaborative spirit with multiple solutions offered for file conversion.

Tuesday, 23 December 2025

Primary Theme: LLMs as Knowledge Bases -- Technical & Legal Implications

Key Discussion: Deep technical debate on whether LLMs can reliably serve as clinical knowledge sources. Discussion of transformer architecture limitations, the "fake" nature of LLM reasoning chains, and implications for clinical liability. The Bolam test for AI was floated.

Secondary Discussions:

- Google MedASR announcement

- AI debt bubble / financial concerns

- NotebookLM British English voices coming

- Preferred voice personalities (Brian Blessed debate)

- Moorfields AI research legal battle

- AI Psychosis article from Medscape

- LLM coding capabilities (Claude vs Gemini)

Notable: Second-highest activity day (76 messages). Strong mix of technical and lighter content.

Wednesday, 24 December 2025 (Christmas Eve)

Primary Theme: AI Mental Health & Evaluation Frameworks

Key Discussion: Discussion of how to assess AI use in mental health contexts, prompted by the AI Psychosis article. The HealthBench framework was shared for LLM evaluation.

Secondary Discussions:

- ChatGPT "Wrapped" feature

- BMJ Quality & Safety paper on clinical AI

- BJGP Life article on AI in consultations

- GitHub usage resolution for new year

- Coding debugging tips

Notable: Group beginning to wind down for Christmas. Festive greetings starting.

Thursday, 25 December 2025 (Christmas Day)

Primary Theme: Christmas Greetings & AI-Generated Festive Content

Key Discussion: Community exchanged warm wishes and shared AI-generated Christmas images and audio. The Xmas Special Newsletter was released as a gift to the group.

Secondary Discussions:

- "12 Days of Christmas (Clinical AI Edition)" poem shared

- AI-generated Die Hard images

- eGPlearning Christmas Special video

Notable: Despite the holiday, 36 messages posted. Strong community spirit and mutual appreciation expressed.

Friday, 26 December 2025 (Boxing Day)

Primary Theme: Vigil Tool Launch -- MHRA Reporting Simplified

Key Discussion: New tool released to simplify MHRA incident reporting. Community immediately tested, found bugs, provided feedback, and saw fixes deployed within hours.

Secondary Discussions:

- ChatGPTea shop reference

- NHS hardware specifications for edge AI

- Privacy implementation review of Vigil

Notable: Excellent example of community-driven tool development and rapid iteration.

Saturday, 27 December 2025

Primary Theme: Human Clinician "Hallucinations" & VeriFact Paper

Key Discussion: Paradigm-shifting discussion triggered by the revelation that human clinicians also appear to "hallucinate" information in notes (a figure of ~7% in a small informal test). The VeriFact paper from NEJM AI was shared, proposing automated fact-checking against EHRs. Debate on whether AI introduces new risks or simply makes existing risks visible.

Secondary Discussions:

- Template consultation risks

- Copy-paste culture in hospital notes

- Claude vs ChatGPT for coding

- Record falsification vs unintentional error

- Data controller implications for verification layers

Notable: Second-highest weekend activity (52 messages). Major conceptual breakthrough in framing AI documentation risks.

Sunday, 28 December 2025

Primary Theme: The Art of Clinical Notes & AI's Future

Key Discussion: Reflective discussion on what makes clinical notes valuable--author context, community knowledge, reading between the lines. Thoughtful analysis of what AI standardisation might lose as well as gain.

Secondary Discussions:

- Qwen model recommendations for local deployment

- Stanford AI + work studies

- "Jagged frontier" concept for AI capabilities

- Anthropic Bloom research

Notable: Philosophical and forward-looking close to the period. Template reporting history shared from radiology experience.

Newsletter compiled for the AI in the NHS WhatsApp Group. Date Range: 20-28 December 2025. This newsletter uses role-based descriptors throughout to protect participant privacy.

Archive: https://www.curistica.com/ai-in-the-nhs-newsletters Contact: keith.grimes@curistica.com

Whether it's clinical safety (DCB0129/0160), data protection (DPIA/Privacy Notices), or the ongoing governance of Clinical AI that integrates with your ways of working, visit www.curistica.com or contact hello@curistica.com.

Brought to you by Curistica - your healthtech innovation partner.


AI in the NHS Weekly Newsletter is produced by Curistica Ltd for members of the AI in the NHS WhatsApp community. All contributors are anonymised. Views expressed are those of individual community members and do not represent any organisation.