{"text":[[{"start":6.5,"text":"Two AI medical tools matched or surpassed doctors across a range of diagnostic and treatment decisions, in the latest sign that specialist health large language models are moving closer to demonstrating clinical value. "}],[{"start":19.6,"text":"Mira, developed by researchers in Germany, outperformed physicians in analyses of diseases including pancreatic cancer and pneumonia, while Google’s Amie produced more precise treatments and investigation plans than humans, according to results published in Nature on Wednesday. "}],[{"start":36.1,"text":"The studies suggest specialist health AI tools can give better medical advice than general consumer AI models. But their inventors and independent experts warned that the tests were conducted in controlled simulations and did not mean the tools were ready for real-world clinical use. "}],[{"start":52.45,"text":"“We are getting a preview of how AI could transform medicine,” said Jakob Kather, whose academic group at TUD Dresden University of Technology and Heidelberg University co-developed Mira. "}],[{"start":65.35000000000001,"text":"“I see AI agents as being similar to the autopilot system in an airplane. These systems can support and relieve medical professionals by taking over routine tasks, but ultimate responsibility will always remain with the physicians,” he added."}],[{"start":80.9,"text":"Mira draws on patient data from an electronic health record system and can choose from more than 85,000 options, including ordering diagnostic tests, prescribing medication and scheduling procedures. The researchers tested it using information from more than 500 emergency department clinical cases, which were passed to it via chats with AI agents acting as patients."}],[{"start":102.80000000000001,"text":"Mira notched a diagnostic accuracy of 87.1 per cent across eight conditions including appendicitis and lung embolism, according to the Nature paper. That compared with 78.1 per cent achieved by a panel of six physicians across specialities."}],[{"start":119.60000000000001,"text":"Amie used Google’s Gemini AI model to respond to data given to it by actors role-playing patients. The scientists tested Amie against 21 primary care physicians on 100 multi-visit case scenarios, which were grounded in current UK clinical practice guidelines and drug recommendations."}],[{"start":137.85000000000002,"text":"Amie matched real physicians in patient management reasoning capabilities and aligned its plans more closely with the guidelines than they did, the scientists found. It outperformed human professionals’ reasoning on medication in difficult cases. "}],[{"start":151.15000000000003,"text":"Both AI models had limitations, their inventors acknowledged. Mira still suggested “care that deviated from best practices” for a “small but non-zero” fraction of patients, the researchers said. "}],[{"start":163.35000000000002,"text":"The case information offered by the AI agents might have been “more structured than real speech of patients in emergency departments”, with fewer omissions and inconsistencies, they added."}],[{"start":173.95000000000002,"text":"The Amie study represented a “milestone” but neither the case mix nor the text-based patient scenarios were representative of a real clinical setting, the AI tool’s developers said. "}],[{"start":184.55,"text":"Amie exhibited “promising capabilities” but was “not ready for real-world translation” and required more work to curb problems such as latent reasoning errors, the scientists said."}],[{"start":195.65,"text":"Researchers not involved in the studies praised their rigour but echoed the caveat that both were based on carefully regulated simulations of patients. "}],[{"start":204.4,"text":"“This is some remove from the messy, complex, human world of everyday healthcare,” said Catherine Pope, professor of medical sociology at the University of Oxford."}],[{"start":215.85,"text":"Many of the reported instances of the AI models’ superiority reflected the “precision and completeness of plans” they offered, rather than showing “clear differences in clinical correctness”, said Julie Jacko, chaired professor of health informatics and data science at the University of Edinburgh."}],[{"start":234.6,"text":"“Overall, this is a strong experimental study and a meaningful step forward, but it demonstrates performance against a structured standard rather than fully capturing the complexity of real clinical decision-making,” Jacko said."}],[{"start":249,"text":"There was also a “question about where Amie’s advantage actually comes from”, given that on one benchmark general-purpose AI models had scored similarly, said Wei Xing, assistant professor in the University of Sheffield’s School of Mathematical and Physical Sciences."}],[{"start":264.35,"text":"“This suggests Amie’s edge may reflect the rapid general progress of AI models, more than the specific system built around it,” he said."}],[{"start":279.5,"text":""}]],"url":"https://audio.ftcn.net.cn/album/a_1781766171_6654.mp3"}