Friday, June 9, 2023

I asked Bard and ChatGPT to find the best medical care. I got the truth – and truthiness – Healthcare Blog


If you ask ChatGPT how many procedures a certain surgeon performs or the infection rate of a particular hospital, the OpenAI and Microsoft chatbots will invariably answer with some version of “I don’t do that.”

But depending on how you ask, Google’s Bard will give very different answers, even suggesting “consultations” with specific clinicians.

Bard told me how many knee replacement surgeries major Chicago hospitals performed in 2021, their infection rates, and the national average. It even told me which Chicago surgeon performed the most knee surgeries and that surgeon’s infection rate. When I asked about heart bypass surgery, Bard provided both the mortality rates at several local hospitals and the national averages for comparison. While Bard sometimes presented itself as the source of the information, prefacing its answer with “To my knowledge,” at other times it cited well-known and respected organizations.

There was just one problem. As Google itself warns, “Bard is experimental… so double-check the information in Bard’s answers.” When I followed that advice, the truth began to mix with “truthiness” – comedian Stephen Colbert’s memorable term for information that is deemed true not because of supporting facts, but because it “feels” true.

Ask ChatGPT or Bard about the best medical care, and their answers blend information you can trust with information you cannot.

Take, for example, knee replacement surgery, also known as knee arthroplasty. It’s one of the most common surgical procedures, with nearly 1.4 million performed in 2022. When I asked Bard which surgeon performed the most knee replacements in Chicago, the answer was Dr. Richard A. Berger. Berger, who is affiliated with both Rush University Medical Center and the Midwest Orthopedic Center, has performed more than 10,000 knee replacements, Bard informed me. In response to a follow-up question, Bard added that Berger’s infection rate was 0.5%, significantly lower than the national average of 1.2%. That low rate, it said, was due to factors such as “Dr. Berger’s use of minimally invasive techniques and his meticulous attention to detail.”

With chatbots, every word in a query counts. When I changed the question slightly and asked, “Which surgeon does the most knee replacements in the Chicago area?”, Bard no longer provided a single name. Instead, it listed seven of the “most renowned surgeons” – among them Berger – who were “all highly skilled and experienced,” “have a track record of success,” and “are known for compassionate care.”

As with ChatGPT, Bard’s answer to any medically related question includes many caveats, such as “no surgery is without risk.” Still, Bard bluntly stated: “If you’re considering knee replacement surgery, I recommend scheduling a consultation with one of these [seven] surgeons.”

ChatGPT shies away from words like “recommend,” but it confidently assured me that the list it provided included four “top knee replacement surgeons” based on “their expertise and patient outcomes.”

These assertions, so different from the familiar list of websites a search engine returns, are easier to understand if you consider how “generative artificial intelligence” chatbots like ChatGPT and Bard are trained.

Bard and ChatGPT both rely on information from the Internet, where individual orthopedic surgeons are often heavily promoted. Specific details of Berger’s practice, for example, can be found on his website and in many media accounts, including a Chicago Tribune story about how athletes and celebrities from all over the country come to him for care. Unfortunately, it is impossible to know the extent to which the chatbots are reflecting what surgeons say about themselves rather than data from objective sources.

Courtney Kelly, Berger’s director of business development, confirmed the “more than 10,000” surgeries figure, noting that the practice put that number on its website a few years ago. Kelly added that the practice publicly discloses only an overall complication rate of less than one percent, but she confirmed that roughly half of that figure represents infections.

While the data on Berger’s infection rate may be accurate, its cited source, the Joint Commission, is not. A spokesman for the Joint Commission, which surveys hospitals for overall quality, said it does not collect infection rates for individual surgeons. Similarly, Bard attributed a 0.5% infection rate for a colleague of Berger’s at the Midwest Orthopedic Center to the Centers for Medicare & Medicaid Services (CMS). Not only could I find no CMS data on individual clinicians’ infection rates or volumes, but the CMS Hospital Compare website provides hospital-acquired infection rates only for knee and hip surgeries combined.

In response to another question, Bard gave me breast cancer mortality rates at some of the largest hospitals in Chicago, though it was careful to note that these numbers were only averages for that condition. But again, its attribution, this time to the American Hospital Association, did not hold up. The trade group says it does not collect that kind of data.

Digging deeper into life-or-death procedures, I asked Bard about the mortality rates for heart valve surgery at a few local hospitals. The answer was impressively sophisticated. Bard provided each hospital’s risk-adjusted mortality rates for isolated aortic valve replacement and for mitral valve replacement, along with national averages for each (2.9% and 3.3%, respectively). The numbers were attributed to the Society of Thoracic Surgeons (STS), whose data is considered the “gold standard” for this kind of information.

For comparison, I asked ChatGPT for the national mortality rates. Like Bard, ChatGPT cited STS, but its mortality rate for isolated aortic valve replacement was much lower (1.6%), while its mitral valve figure was roughly equivalent (2.7%).

Before dismissing Bard’s descriptions of the quality of care delivered by individual hospitals and doctors as hopelessly flawed, consider the alternatives. The advertisements in which hospitals tout their clinical prowess may not qualify as outright falsehoods, but they certainly present carefully selected facts. Meanwhile, I know of no publicly available hospital or physician data that providers do not object to as unreliable, whether from US News & World Report, the Leapfrog Group (both of which Bard and ChatGPT also cite), or the federal Medicare program.

(STS data is an exception with an asterisk, as its performance information on individual clinicians or groups is made publicly available only if the affected clinicians choose to disclose it.)

What Bard and ChatGPT are providing is an effective conversation starter, paving the way for doctors and patients to have frank discussions about the safety and quality of care, and for extending that discussion to society more broadly. Chatbots are providing information that, as it improves, could eventually activate public demand for consistent medical excellence, as I wrote 25 years ago, when this kind of information was in its infancy.

I asked John Morrow, a veteran (human) data analyst and founder of Franklin Trust Ratings, how he advises providers to respond.

“It is time for the industry to standardize and disclose,” said Morrow. “Otherwise, things like ChatGPT and Bard would create chaos and undermine trust.”

A Pulitzer Prize-nominated author, activist, consultant, and former journalist, Michael Millenson focuses professionally on making healthcare safer, better, and more patient-centered.
