Many doctors shudder when they read that the quality of artificial intelligence recommendations for diagnosing and treating patients equals or exceeds that of human physicians. No one believes doctors will become obsolete because of AI, but the technology could improve their performance.
A new study led by Prof. Dan Zeltzer, a digital health expert from the Berglas School of Economics at Tel Aviv University (TAU), compared the accuracy of AI recommendations with those of doctors at the renowned Cedars-Sinai Medical Center in Los Angeles. Because the center operates CS Connect, a virtual urgent care clinic, he decided to collaborate with an Israeli start-up called K Health.
The paper was recently presented at the annual conference of the American College of Physicians (ACP) and published in the journal Annals of Internal Medicine under the title “Comparison of Initial Artificial Intelligence (AI) and Final Physician Recommendations in AI-Assisted Virtual Urgent Care Visits.”
“Cedars-Sinai operates a virtual urgent care clinic offering telemedicine consultations with doctors specializing in family and emergency medicine,” said Zeltzer.
“Recently, an AI system has been integrated into the clinic – an algorithm based on machine learning that conducts the initial intake through a dedicated chat, incorporates data from the patient’s medical record and provides the attending physician with diagnostic and treatment suggestions at the start of the visit – including prescriptions, tests and referrals,” he said.
“When its confidence is sufficient, the AI offers diagnostic and management recommendations (prescriptions, laboratory tests and referrals),” said the digital health expert. “After interacting with the algorithm, patients proceed to a video visit with a doctor, who makes the final determination of diagnosis and treatment. To ensure the AI’s recommendations are reliable, the algorithm – trained on medical records from millions of cases – only offers recommendations when its confidence level is high, giving no recommendation in roughly one in five cases.
“In this study, we compared the quality of the recommendations of the AI system with the real decisions of doctors in the clinic.”
The researchers examined a sample of 461 online visits over one month in the summer of 2024. The study focused on adult patients with relatively common symptoms – respiratory, dental, urinary and vaginal. In all the visits examined, the patients were first evaluated by the algorithm, which provided recommendations, and then seen by a doctor in a video consultation.
All recommendations – both the algorithm’s and the doctors’ – were then evaluated by a panel of four physicians with at least a decade of clinical experience, who rated each recommendation on a four-point scale: optimal, reasonable, inadequate or potentially harmful. The evaluators judged the recommendations against the patient’s medical history, the information collected during the visit and transcripts of the video consultations.
The compiled ratings yielded striking results: the AI’s recommendations were judged optimal in 77% of cases, compared with only 67% of the doctors’ decisions; at the other end of the scale, the AI’s recommendations were rated potentially harmful in a smaller share of cases than the doctors’ decisions (2.8% versus 4.6%). In 68% of cases, the AI and the doctor received the same score; in 21% of cases, the algorithm scored higher than the doctor; and in 11% of cases, the doctor’s decision was considered better.
The explanations the evaluators gave for the differences in ratings highlight several advantages of the AI system over human doctors.
Advantages of AI
First, the AI adheres more strictly to medical guidelines – for example, not prescribing antibiotics for a viral infection. Second, the AI identifies relevant information in the medical record, such as recurrent cases of a similar infection that may influence the appropriate treatment. And third, the AI more precisely identifies symptoms that could indicate a more serious condition, such as eye pain reported by a contact-lens wearer, which may signal an infection.
Doctors, on the other hand, are more flexible than the algorithm and have an advantage in assessing the patient’s actual condition. For example, if a COVID-19 patient reports shortness of breath, a doctor may recognize it as relatively mild respiratory congestion, whereas the AI, relying solely on the patient’s responses, might refer them unnecessarily to the emergency room.
Zeltzer concluded that “in this study, we found that AI, based on a targeted intake process, can provide diagnostic and treatment recommendations that, in many cases, are more accurate than those made by doctors.
“A limitation of the study is that we do not know which doctors reviewed the AI’s recommendations in the chart, or to what extent they relied on them,” he said. “Thus, the study measured only the accuracy of the algorithm’s recommendations, not their impact on the doctors.”
He added that the study is unique because it tested the algorithm in a real-world setting with real cases, whereas most studies focus on examples from certification exams or textbooks.
“The relatively common conditions included in our study represent approximately two-thirds of the clinic’s case volume, so the findings are meaningful for assessing AI’s readiness to serve as a decision-support tool in a doctor’s practice,” said Zeltzer.
“We may soon see a time when algorithms assist with a growing share of medical decisions, bringing relevant data to the doctor’s attention and helping them make faster decisions with fewer human errors,” he predicted. “Of course, many questions remain about the best way to implement AI in the diagnostic and treatment process, as well as the optimal integration of human expertise and AI.”
When asked whether Israeli doctors fear that AI might replace them, Zeltzer said the answer was mixed.
“I am not aware of any specific survey in Israel on this subject. In general, sentiment toward AI tends to be mixed: excitement about its potential alongside concerns about harmful and disruptive impacts, and claims that it is overhyped.
“In health care, the debate is not new,” he said. “In 2016, Geoffrey Hinton, often called the ‘godfather of AI’ and later a Nobel Prize laureate, predicted that AI would surpass radiologists within five years. Nine years later, AI’s accuracy in radiology has improved considerably, but no radiologist has lost their job because of AI.
“There are several reasons for this. AI excels at certain tasks but lags in others. Health systems are cautious and move slowly. Safety, trust and regulation all slow adoption.”