curistica.com

Executive Summary

A week shaped by two big-vendor moves into the clinical workspace and a sustained group debate about whether the wider safety conversation is keeping pace. Anthropic's "Claude Code in Healthcare" webinar showcased clinicians building working tools at remarkable speed, but several members noted that data protection, regulation, and clinical safety received little airtime in the session itself. In parallel, OpenAI launched a free clinical-tier ChatGPT product for verified clinicians in the United States with its own benchmark, HealthBench Professional, prompting a careful UK reading of the regulatory implications. South Korea's Class 3 medical AI authorisation, roughly equivalent to EU Class IIb, was greeted as a meaningful first.

Underneath the headline product news, the group worked through harder structural questions. A long technical thread explored "Q-day", the prospective end of current public-key cryptography, and what an integrated NHS data strategy might look like once that day arrives. Members compared notes on second-hand Mac Studio rigs and the rapidly improving economics of running open-weight models such as DeepSeek V4 and Kimi 2.6 locally. AVT discussion returned to accent handling, fine-tuning, and the realistic limits of off-the-shelf transcription. A serious cyber thread covered a published vulnerability in the MCP ecosystem, the Vercel platform incident, and the continuing supply-chain disruption flowing from the Stryker attack.

The week closed with members watching the post-event recordings, comparing notes on AVT performance, and reflecting on what genuinely useful AI in primary care looks like once the marketing dust settles.

Activity at a Glance

Week 46 generated 444 messages across the period, with peak activity on Thursday 23 April (105 messages) as members worked through model news, regulatory questions, and the post-quantum thread. Saturday 18 April was unusually busy at 98 messages as the previous week's AVT and infrastructure conversation carried over, and Friday 24 April reached 85 messages with the post-Anthropic-webinar review and Microsoft's Agent Mode launch in play.
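For readers who want to check the figures, the daily counts listed in Appendix B can be totalled in a few lines of Python. This is only a sanity-check sketch: the counts are copied from this issue, and the variable names are illustrative.

```python
# Daily message counts as reported in Appendix B of this issue.
daily_counts = {
    "Sat 18 April": 98,
    "Sun 19 April": 30,
    "Mon 20 April": 37,
    "Tue 21 April": 33,
    "Wed 22 April": 55,
    "Thu 23 April": 105,
    "Fri 24 April": 85,
    "Sat 25 April (until 09:00)": 1,
}

total = sum(daily_counts.values())

# Rank days from busiest to quietest.
ranked = sorted(daily_counts.items(), key=lambda item: item[1], reverse=True)
peak_day, peak_count = ranked[0]

# Share of the week's traffic carried by the three busiest days.
top_three_share = sum(count for _, count in ranked[:3]) / total

print(f"Total messages: {total}")
print(f"Peak day: {peak_day} ({peak_count})")
print(f"Share from three busiest days: {top_three_share:.0%}")
```

The three busiest days (Thursday, the first Saturday, and Friday) account for roughly two thirds of the week's traffic.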

📌 Major Topic Sections

1. Anthropic Claude Code in Healthcare and the Safety Gap

The week's centre of gravity was Anthropic's "Claude Code in Healthcare: How Physicians Are Building with Claude" webinar, available on demand. Members watched live and recorded, and the response was a study in honest enthusiasm tempered by professional caution.

The capability story was striking. Clinicians demonstrated building functional internal tools, dashboards, and workflow aids quickly with Claude Code in the desktop application. One member, having previously used Claude Code in VS Code, was impressed with the desktop implementation, particularly the ability to visualise changes. Another noted that Claude Code seemed less token-hungry than Claude Desktop for similar work.

The concerns were equally consistent. A number of members pointed out that across the session, clinical safety received no explicit treatment, and data protection and regulatory framing were touched on only briefly. One member summarised the worry succinctly:

"Very exciting stuff, but folks: SAFETY WAS NOT MENTIONED ONCE." — Digital health and clinical AI specialist

"Best to share in the group and create a repository so we don't waste efforts on existing tools created by one of us." — Primary care AI lead

"Need paid version for using Claude Code... AI - privacy, GDPR and protection of data and lack of UK EHR will prevent all functionality and we should be careful in designing automated tools or agentic tools. This can lead to everyone making their own solutions and lot of reinvention of wheels." — Primary care clinician

The point about heterogeneous tooling proved a productive thread in its own right. If many clinicians build many small applications, iterate on some, and share them informally, the result is a large ecosystem with a real maintenance and assurance burden. One member observed that this also expands the risk surface, because where one clinician builds a tool with a flaw and another uses it, the question of accountability becomes thorny. The webinar did acknowledge the need for expert input on privacy and engineering, and Anthropic hinted at potential HIPAA and GDPR plugins for future work.

2. OpenAI ChatGPT for Clinicians: A Free US-Only Launch

OpenAI announced a free healthcare version of its current model for verified clinicians in the United States, accompanied by a published evaluation on its own HealthBench Professional benchmark. The product is positioned as a clinical decision support, documentation support, and research tool. Members read the launch material carefully and produced a balanced assessment.

On the positive side, the group welcomed the visible investment in healthcare model performance, the strengthened data protection framing, and the involvement of clinicians in evaluation. The benchmark methodology, including comparisons against specialty-matched physicians with web access on real and difficult clinical tasks, was acknowledged as a serious effort even though the benchmark is publisher-built.

On the cautious side, members noted that a tool of this nature would raise classification questions in UK and EU contexts, and that the absence of a published privacy policy, full evaluations, and a UK regulatory pathway means the product will not be available domestically for now. One member offered a measured, pragmatic frame: shadow use of AI is already widespread, and a clinician-facing tool with stronger evaluation and protections may, in some settings, be a step forward compared with unsupervised consumer chatbot use. The counterpoint was equally important. If access expands to clinicians who have not yet developed safe-use practices, the net effect could be greater risk rather than less.

"Overall, there is a pragmatism about this that I'll applaud. Shadow use of AI is rife, in UK, EU, and US, so something that is better than the alternative could be argued to be a step forward." — Digital health specialist

3. Korean Class 3 Approval: A Global First

Late in the week, news reached the group of South Korea's Ministry of Food and Drug Safety granting a Class 3 authorisation for a regulated next-generation AI in healthcare, roughly equivalent to EU Class IIb. The framing was that this is the first of its kind globally, putting Korea ahead of the United States and European Union in approving regulated AI for clinical use at this risk classification.

"This seems like a big deal…?" — Healthcare technology strategist

The group recognised the significance both as a regulatory milestone and as a competitive signal. It surfaces the question of which jurisdictions will set the early operating norms for high-risk clinical AI, and whether UK and EU pathways can adapt with sufficient pace.

4. Q-day, Sovereignty, and the Future of Integrated Data

A reflective and technically rich thread explored the long-term security architecture for NHS data. Several members noted that intelligence agencies and commercial actors are assumed to operate on a "harvest now, decrypt later" basis: collect encrypted data today and decrypt it once quantum computing weakens the underlying cryptographic primitives. The frequently quoted estimate for "Q-day", the point at which current public-key cryptography can no longer be relied upon, is around 2029. Post-quantum cryptography schemes are in active development, but transitions across complex estates take time.
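One common way to reason about "harvest now, decrypt later" is Mosca's inequality: data is exposed if the time it must stay confidential, plus the time needed to migrate to post-quantum schemes, exceeds the time remaining until Q-day. The sketch below is illustrative only. It assumes the 2029 estimate quoted in the thread and counts from 2026, and the asset names, shelf lives, and migration times are hypothetical figures, not estimates from the group.

```python
from dataclasses import dataclass


@dataclass
class DataAsset:
    name: str
    shelf_life_years: float  # how long the data must stay confidential (assumed)
    migration_years: float   # time to move it to post-quantum schemes (assumed)


def exposed_years(asset: DataAsset, years_to_q_day: float) -> float:
    """Apply Mosca's inequality to one asset.

    Exposure exists when shelf life + migration time > time to Q-day.
    Returns the length of the exposure window in years (0 if none).
    """
    return max(0.0, asset.shelf_life_years + asset.migration_years - years_to_q_day)


# Assumption: Q-day around 2029 (the estimate quoted in the thread), counted from 2026.
YEARS_TO_Q_DAY = 2029 - 2026

# Hypothetical assets with illustrative figures only.
assets = [
    DataAsset("lifelong patient record", shelf_life_years=80, migration_years=5),
    DataAsset("short-lived session token", shelf_life_years=0.01, migration_years=1),
]

for asset in assets:
    gap = exposed_years(asset, YEARS_TO_Q_DAY)
    status = "at risk" if gap > 0 else "within margin"
    print(f"{asset.name}: {status} (exposure {gap:.1f} years)")
```

The design point is that long-lived data such as health records can already be at risk, because anything harvested today still needs to be confidential long after the estimated Q-day.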

Members discussed practical mitigations. Digital borders, well-engineered air gaps, and integrated data behind closed organisational boundaries were proposed as core ingredients. The point was made that older-generation tech professionals who delivered integration in the era of patchy international links solved many of these problems already, and that institutional memory of those approaches is valuable.

A provocative thought experiment landed in the middle of the week: if Integrated Care Boards could no longer afford CSU services and clinicians had to procure their own equipment, what would secure-enough tooling actually look like? Responses converged on a federation-led approach with disciplined "gold build" reference configurations, tightly controlled apps, and the lessons of CSU operational experience translated into smaller-scale practice federations.

"Integrated data behind closed borders is achievable for the truly large organisations, and the NHS should be investing heavily in this." — NHS digital and federation lead

"Dig out your older gen tech folk who got tons done in the days of patchy international links and you'll find many solutions exist already." — NHS digital and federation lead

5. Local Models, Mac Studios, and the Improving Economics of On-Prem AI

A consistent thread across the week traced the steadily improving case for running models locally. Members shared tips on second-hand Mac Studios with large unified memory configurations, the imminent Mac mini refresh, and the integrated-memory architecture that, while not GPU-grade, is increasingly capable. A clustered Mac Studio approach was cited as a way to run sizeable models without enterprise GPU spend.

New open-weight model releases featured: a major DeepSeek V4 update and Kimi 2.6 were shared mid-week, alongside a wider community piece on running models such as MiniMax M2.7 and Gemma 4 on personal hardware. The argument was that iterative improvements in open models are eroding the assumption that on-prem inference requires hundreds of thousands of pounds of hardware.

"A suite of local models fine tuned to meet your specific needs are the future, with orchestration layer." — Health technology builder

"Just needs us to shift our mind set from being consumers of tech, to builders." — Health technology builder

The thread also included a frank exchange about the resource intensity of frontier models. Several members felt that recent Claude Opus iterations have become more token-hungry, with adaptive thinking sometimes consuming budget faster than expected. Others reported steady performance. The honest summary was that performance is being tuned in flight by the vendors, and that monitoring usage closely remains worthwhile.

6. AVT, Accents, and the Limits of Off-the-Shelf Transcription

The ambient voice technology conversation continued to mature. Members explored whether AVT tools can be tuned to individual users with strong accents in the same way that voice recognition systems for radiology have been tuned over many years. The technical answer is that LoRA fine-tuning on a curated dataset is achievable and does not require data-centre-scale infrastructure, but that this raises governance questions about consent for the data used and regression testing across other accents.
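A rough sense of why LoRA does not need data-centre-scale infrastructure comes from counting trainable parameters. LoRA freezes each original weight matrix and learns two small low-rank factors in its place. The sketch below does the arithmetic for a single matrix; the 1024 x 1024 size and rank 8 are illustrative assumptions, not figures from any particular AVT model.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix.

    LoRA keeps the original matrix frozen and learns two low-rank factors,
    B (d_out x rank) and A (rank x d_in), whose product is the update.
    """
    return rank * (d_in + d_out)


# Illustrative assumption: a 1024 x 1024 projection matrix and rank 8.
d_in, d_out, rank = 1024, 1024, 8

full = d_in * d_out                             # parameters updated by full fine-tuning
lora = lora_trainable_params(d_in, d_out, rank)  # parameters updated by LoRA

print(f"Full fine-tuning: {full:,} parameters")
print(f"LoRA (rank {rank}): {lora:,} parameters")
print(f"Trainable fraction: {lora / full:.2%}")
```

Under these assumptions LoRA trains under 2% of the parameters of the matrix it adapts, which is what brings per-accent or per-site tuning within reach of modest hardware. The governance questions about consent and regression testing are unaffected by the saving.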

A practical concern surfaced repeatedly: some AVT tools, when summarising, can transform a patient's expression of opinion ("I think my migraines are caused by hormone imbalance") into an apparent statement of fact ("migraines are caused by hormone imbalance"), which then implies a clinician concurrence that may not exist. The mitigation is review of the generated output, but the time cost can offset the headline efficiency gains.
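To make the review step more targeted, one could imagine a simple check that flags summary sentences which restate a hedged patient remark without attribution. The sketch below is a deliberately naive illustration using string matching. The phrase lists and function names are hypothetical, it is not how any AVT product works, and it would miss paraphrases, so it is no substitute for clinical review.

```python
import re

# Hypothetical, non-exhaustive phrase lists chosen for illustration.
HEDGES = ("i think", "i believe", "i feel", "i reckon", "i wonder if")
ATTRIBUTION = ("patient", "reports", "thinks", "believes", "feels", "states")
DROP_WORDS = {"my", "that", "the"}


def normalise(text: str) -> str:
    """Lowercase, strip punctuation, and drop words that change on rephrasing."""
    words = re.findall(r"[a-z']+", text.lower())
    return " ".join(w for w in words if w not in DROP_WORDS)


def hedged_claims(transcript: list[str]) -> list[str]:
    """Return the claim part of each patient utterance that opens with a hedge."""
    claims = []
    for utterance in transcript:
        lowered = utterance.lower().strip()
        for hedge in HEDGES:
            if lowered.startswith(hedge):
                claims.append(normalise(lowered[len(hedge):]))
                break
    return claims


def flag_unattributed(transcript: list[str], summary: list[str]) -> list[str]:
    """Flag summary sentences that restate a hedged claim without attribution."""
    claims = hedged_claims(transcript)
    flagged = []
    for sentence in summary:
        norm = normalise(sentence)
        attributed = any(marker in norm.split() for marker in ATTRIBUTION)
        if not attributed and any(claim and claim in norm for claim in claims):
            flagged.append(sentence)
    return flagged


# Usage with the example from the discussion.
transcript = ["I think my migraines are caused by hormone imbalance"]
summary = [
    "Migraines are caused by hormone imbalance.",
    "Patient thinks migraines are caused by hormone imbalance.",
]
print(flag_unattributed(transcript, summary))
```

In this example only the first summary sentence is flagged, because the second keeps the attribution to the patient.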

"AVT is time neutral but gives me a more eye contact. Most patients also like that I explain my examination in greater detail (for AVT to document)." — Practising GP

A member proposed an attractive direction of travel: a federation-curated data pipeline in which patients with accents that the system finds difficult are invited to contribute short voice samples, with clear consent, into a continuously improving local training corpus. This was framed as the collaborative alternative to a per-patient charging model that would gradually drain general practice resources.

7. MCP Supply Chain Vulnerability and the Wider Cyber Picture

A serious security disclosure circulated mid-week describing a critical vulnerability affecting the Model Context Protocol ecosystem. The framing of the published analysis was that this represents a systemic weakness in how AI agents reach external tools, with potential consequences for any healthcare deployment using MCP-mediated workflows. Members urged a careful read and a review of any clinical pipelines using MCP.

In parallel, members shared news of the Vercel platform incident and recommended that anyone whose AI software suppliers list Vercel as a sub-processor make appropriate enquiries. The disruption flowing from the cyber attack on Stryker, dating back to early March, also returned to the group's attention with NHS England communications and NHS Supply Chain advisories on supply of medical equipment and consumables.

The drumbeat of these stories underpinned a familiar refrain: clinical AI sits inside a software supply chain, and the assurance work has to follow the supply chain rather than just the clinical-facing surface.

8. Parliamentary Evidence and the Regulatory Conversation

A member shared notes from the Parliamentary committee session on innovation in the NHS held on 14 April, which the group then read and discussed. The submission emphasised a shift from isolated pilots to system-wide adoption, and a regulatory philosophy that explicitly weighs the risk of failing to deploy a beneficial new tool alongside the risk of deploying it. The recommendation was for iterative, performance-based regulation rather than a single high barrier, and for regulatory thinking that addresses producer, provider, professional, and patient as a connected system.

The reception in the group was constructive but measured. There was support for thoughtful regulatory modernisation, and a parallel concern that the witness mix in such sessions does not always include enough operational researchers and implementers to balance the strategic-level testimony.

9. Microsoft Office Agent Mode and Vibe Working

Microsoft's Agent Mode launch for Word, Excel, and PowerPoint was shared with a mixture of curiosity and dry humour. Members noted the commitment to "vibe working" inside the productivity suite as a significant productisation step. A separate observation about the prominence of safety messaging in vendor briefings produced a wry exchange about whether a single safety slide can fairly be said to discharge the responsibility.

🌟 Lighter Moments

The week did not lack levity. The names members proposed for hypothetical future Claude releases produced one of the standout exchanges:

"Wait till Claude Mythos (will detect your deepest computer secrets), Claude Thanos (telepathic skills), Claude Sauron (this will see everything everyone does), Claude Voldemort (this will suck out your soul)." — Healthcare AI strategist

A cheerful side-quest emerged when one member announced an intention to use Claude to design STL files for 3D printing, leading to the discovery of the open-source ClaudeCAD project, and a swift acknowledgement that the rest of the day's housework was now in jeopardy.

The phlebotomy-robot thread showcased the group's gallows humour at full volume, with bribes of leftover Easter chocolate, the prospect of unionised robots, and the inevitable Gillette-model-of-business observation about consumables. In the same spirit, a member's question about a CIA news story prompted a careful, evidence-based response about quantum magnetometry and nitrogen-vacancy diamond sensors, and the conclusion that the underlying physics is real even if the operational claim does not survive scrutiny.

A small but well-loved running joke continued about the persistent inability of dictation systems to render "bronchi" correctly after twenty-five years of practice, and a request for a standard clinical agentic benchmark with a "pelican on a bicycle" energy.

💬 Quote Wall

"Sovereignty is the name of the game, but if I'm honest I think our compute will have to reside in Europe until such time as we can address our energy costs." — Healthcare technology strategist (referenced from earlier in the month, returned in a thread on local models)

"It would be good to see greater commitment of federations and scaled providers developing solutions like this. There's no reason it couldn't be a commercial offering of its own and revenue generation for patient services." — Digital health specialist

"The clinical stuff you mention really should be picked up in the review of the summary or transcript. AVT is nowhere near ready for direct access without clinical review of the outputs. The obvious risk is clinicians taking shortcuts." — NHS digital and federation lead

"What I've noticed is that the great writers and bloggers have unusual benchmarks that they come back to, and it's got me thinking about a good clinical one." — Digital health and clinical AI specialist

"Reality is stranger than the most outlandish fiction at the moment." — Healthcare AI strategist

📚 Journal Watch

🔭 Looking Ahead

Several themes will be worth tracking in the week ahead. Independent evaluation of the new clinician-tier ChatGPT product against UK regulatory expectations, particularly around classification and data protection, will be a priority. The MCP vulnerability will continue to ripple through any organisation using agentic workflows in clinical pipelines, and group members are likely to share remediation experience as they work through it.

On the local-model side, more members will experiment with Mac Studio clusters and the latest open-weight releases, and there is appetite for shared notes on what works for which clinical use cases. The Claude Design announcement is likely to feature in next week's exchanges as members test it on real wireframing tasks. And as always, the AVT conversation will return whenever a member encounters a transcription error worthy of sharing.

Finally, the post-quantum thread is unlikely to disappear. The group's instinct that strategic data infrastructure decisions need to consider Q-day timelines is one of the more distinctive contributions this group makes to the wider conversation, and it is worth continuing.

👥 Group Personality

Week 46 was a characteristic showcase of what makes this group distinctive. The ability to hold a serious technical discussion about post-quantum cryptography alongside a Hydra-and-SkyNet riff about NHS reorganisations, without losing momentum on either, is a real cultural asset. The clinical realism remained intact: members tested the marketing claims of major launches against day-to-day practice, and arrived at conclusions that were neither hype nor reflexive negativity. The builder energy was visible in the local-model thread, the AVT data pipeline proposal, and the willingness to share token-economics tips and Python scripts.

Two qualities were especially evident this week. The first was the readiness to disagree productively. Threads on jobs impact, commercial AI ethics, and integrated-care infrastructure produced a wide range of views, and the conversation stayed substantive throughout. The second was the willingness of senior members to share half-formed thinking openly, which then enabled others to build on it. The agentic clinical benchmark idea, in particular, is the kind of thing that emerges only in a group where people are comfortable thinking out loud.

📊 Appendix A: Visual Summary

The visual summary shows weekday traffic dominating the week, with Thursday emerging as the peak day as members worked through model news, regulatory questions, and the post-quantum thread. Saturday 18 April recorded a notably busy morning as the AVT and infrastructure conversation carried over from the previous week.

📈 Appendix B: Activity Metrics

Total messages: 444 across the period

Most active day: Thursday 23 April (105 messages)

Daily breakdown: Sat 18 April 98, Sun 19 April 30, Mon 20 April 37, Tue 21 April 33, Wed 22 April 55, Thu 23 April 105, Fri 24 April 85, Sat 25 April (until 09:00) 1.

Most active period: late morning to early evening on weekdays, with notable evening clusters during the Anthropic webinar review

Estimated unique contributors: approximately 30 across the week

Quality indicators:

Debate topics:

📅 Appendix C: Daily Theme Summary

Saturday 18 April 2026

A busy first day of the period, dominated by AVT and accents, the realistic ceiling of off-the-shelf transcription, and the case for federation-curated voice data pipelines with consent. Discussion ranged across LoRA fine-tuning, regression testing, and the long shadow of voice recognition in radiology. A separate thread explored data sources for private healthcare activity in the United Kingdom, with members sharing the Private Healthcare Information Network and the ADAPT programme as starting points. The Claude Design launch landed and was widely flagged as a useful tool for UX wireframing. Mid-afternoon brought a sustained thread on Claude Opus token economics in API use, with members trading practical tips on PDF-to-markdown conversion ahead of agent runs. The day ended on a lighter note with ClaudeCAD and 3D printing.

Sunday 19 April 2026

A quieter day with a deliberate, reflective tone. Members shared the Stryker cyber attack timeline and NHS England's communication, with an emphasis on the importance of multi-factor authentication and supplier supply-chain awareness. A discussion on Microsoft Copilot in NHS settings explored its functional limitations, the absence of Outlook connectivity, and the implications of an "entertainment" liability framing in a clinical context. A reflective video on responsible AI deployment was shared and recommended for circulation to prospective AI buyers. Notes from the Parliamentary committee session of 14 April were distributed and discussed.

Monday 20 April 2026

Token economics and platform reliability dominated, with members comparing Opus 4.7 adaptive thinking against Sonnet 4.6 in routine work. A useful piece on Claude token consumption was shared. The MCP supply-chain vulnerability disclosure was circulated and discussed seriously. A member shared news of the Vercel platform incident, prompting a wider conversation about sub-processor diligence in AI software supply chains. The day's lighter moments included a Hydra-and-SkyNet riff on NHS reorganisations and a brisk exchange about AI receptionists in primary care.

Tuesday 21 April 2026

A range-finding day. Members discussed routes into TPP SystmOne integration for new entrants and the recurring difficulty of finding direct contact channels. The OpenAI ChatGPT for Clinicians announcement landed and dominated the afternoon, with members reading the launch material carefully and producing a balanced UK-context assessment. An evening thread reflected on whether the medico-legal profession is prepared for new categories of AI-related case. The Federated Data Platform returned to the conversation, with members sharing differing views on user-friendliness and value-for-money, and discussing the underlying business case for centralised analytics versus existing distributed approaches.

Wednesday 22 April 2026

A wide-ranging day. Members shared a forecast that AI could reshape professional services within four to five years and weighed it against direct experience of frontier model performance, which several described as still "jagged" in real use. Phlebotomy robotics, employee tracking by major tech firms, and a member's reflective article on AI role impersonation in clinical oncology produced lively threads. A frank discussion explored what corporate AI strategy might look like for an organisation prioritising data security, with thin-client architectures and disciplined browsing controls cited as practical starting points. An evening thread reflected on a member's AI talk to United States executives and the engagement of UK clinicians.

Thursday 23 April 2026

The week's peak day. The OpenAI clinician-tier launch returned for deeper analysis, with members assessing benchmark methodology, classification implications, and the pragmatic case for clinician-facing tools that improve on shadow consumer use. A long thread explored the post-quantum cryptography horizon, "Q-day" as commonly estimated for 2029, the harvest-now-decrypt-later assumption in intelligence and commercial sectors, and what a sustainable integrated-data architecture for the NHS might look like. The Anthropic Claude Code in Healthcare webinar was watched live and on-demand, generating a sustained safety-framing critique. A shared thought experiment about clinicians providing their own equipment generated practical proposals for federation-led "gold builds". The day also included an informal proposal for an agentic clinical day benchmark using a synthetic dataset across triage, labs, referrals, consults, meetings, prescriptions, and education.

Friday 24 April 2026

The third-busiest day of the period. The Claude Code in Healthcare review continued, with members watching the recording, exchanging detailed observations, and reflecting on the gap between the capability story and the safety story. A wide-ranging discussion explored data privacy, the asymmetry of insider versus outsider risk, and how mid-grade staff can become the most common source of leaks for a variety of motivations. New open-weight model releases (DeepSeek V4, Kimi 2.6) were shared and tested. Microsoft's Agent Mode launch was dissected, including an exchange about the prominence of safety messaging in vendor briefings. The day closed with a careful, evidence-based response to a news story about a CIA rescue operation and the underlying physics of magnetometric heart-signal detection.

Saturday 25 April 2026 (until 09:00)

A short coverage window before the newsletter cut-off, opened by news of South Korea's Class 3 medical AI authorisation, framed as the first of its kind globally and a meaningful regulatory milestone.

This newsletter was generated from the AI in the NHS WhatsApp group conversations between 18 and 25 April 2026. All contributors are anonymised by role descriptor only. Direct quotes have been verified against source messages. URLs have been checked against the source data.

AI in the NHS is a community of practice for clinicians, technologists, researchers, and policymakers exploring how artificial intelligence can responsibly benefit the National Health Service.

Newsletter #46 — compiled by Curistica.

AI in the NHS Weekly Newsletter - Issue #46