The IC² Institute kicked off its new colloquium series with “Chat GPT and the Future of Healthcare” on February 23. Co-sponsored by UT Good Systems and moderated by IC² Executive Director S. Craig Watkins, the event featured four experts:
- Ying Ding (Bill & Lewis Suit Professor at the School of Information and Adjunct Professor at the Department of Population Health, Dell Medical School)
- Greg Durrett (Assistant Professor of Computer Science, UT Austin)
- Will Griffin (Vice President of Product Strategy at Blockchain Creative Labs (FOX))
- Justin Rousseau, M.D. (Assistant Professor, Dell Medical School)
The panelists provided clinical, industry, and academic perspectives on the risks, rewards, and ethics of using Chat GPT in the realm of healthcare. We’ve compiled highlights of the conversation below.
Let’s start with the fundamentals. When we talk about large language models, from a technical and functional perspective, what are we talking about?
Greg — The current systems reflect the language model that is trained over the web … these models are trained to predict the next word given words that have come before. … The thing that has really super-charged these models is adding another layer of supervision that comes from reinforcement learning from human feedback … They get people to sit down and say, “these responses are better than these responses” and this is how, when you use a product like Chat GPT, you get this more curated-feeling experience because they’ve smoothed out a lot of rough edges where it goes off the rails … Behavior has been instilled into it through this extra supervision phase. That’s what has productized it and taken it to the next level.
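To make the idea of "predicting the next word given words that have come before" concrete, here is a minimal sketch. It is a toy word-pair counter, not how Chat GPT is actually built: real large language models use neural networks trained on vastly more text, and the human-feedback stage Greg describes is not modeled here at all. The function names and the tiny corpus are invented for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        following[current_word][next_word] += 1
    return following


def predict_next_word(model, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical training text; a real model is trained on a large slice of the web.
corpus = (
    "the patient has a fever . "
    "the patient has a cough . "
    "the patient has a fever and a headache ."
)
model = train_bigram_model(corpus)

print(predict_next_word(model, "patient"))  # has
print(predict_next_word(model, "a"))        # fever
```

Even this toy version shows why the quality of the source text matters: the model can only repeat patterns it has counted, and it has nothing to say about a word it has never seen.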
Will — On a practical level, all AI starts out with a source of data … A large language model starts out with a source of data with words in it. … But we have to keep a healthy skepticism … It’s like I have four friends, with various levels of worldliness or common sense. The internet is one; whatever you conjure in your own mind is one; TV and other media is one; and now Chat GPT shows up. It’s just another source. And it’s only as good, just like your friends, as their education, how much they read, how much their parents loved them … So Chat GPT is only as good as that, plus, it’s only as good as the testers … So this fifth input, we really have to watch this one because the adoption is outstripping the quality of the product … We have to demand that our fifth friend steps their game up.
What about the distinction between synthesis and reasoning and its implications for Chat GPT? How do we incorporate these models into medical practice without any real safeguards?
Greg — There are a lot of cool examples of how Chat GPT can be applied. … But I think the biggest place where it can help us is simply sifting through information and synthesizing it and allowing us to digest it more easily … Where it starts to fail is when you ask for synthesis of a bunch of information and complex reasoning that, let’s say, only a physician can do. We haven’t really seen that work reliably enough … Using it for more sophisticated reasoning is something we should be leery of. We need to demand higher levels of quality and better understanding of what’s going on there before we dive into this headfirst.
Justin — As a clinician, I spend a lot of time generating content from the electronic health record—and seeing the potential of GPT to augment that work is exciting. But I worry a lot about how it impacts the health of our patients and the health of our doctors and the entire health team.
Ying — Sites like WebMD are just presenting information; they’re providing easy access to information, but the content is still generated by a human. But here, with GPT, they are writing in front of you; they are quite humanized! … Also, if you can generate words, you can generate actions. We’re so busy; we’re all burned out. Why don’t we let GPT do the time-consuming things we don’t care about: connect with hotels, go to different websites, and find the cheapest flight for you …
Will — There are a lot of totally wonderful use cases for Chat GPT today … benign uses … Unfortunately, that’s not how we get ethics baked into the process … we get ethics baked into the process when things go wrong. … That’s why the universities are so important. Every university is now building in rules around the use of GPT. We will model and train our students in the behavior we want everyone else in society to benefit from. This is the unique charge for academia. Because when I leave here today, I assure you this (discussing ethics) is not top of mind. The pressures that are going on in industry at all times, but especially at this moment, mean that … we’re really relying on academia to lead the way in this.
Chat GPT may help limit the pressures on healthcare workers who are fatigued and overwhelmed, but it also poses significant risks. Why is monitoring these systems so important?
Justin — There’s the first step of validating and demonstrating the accuracy of those models that are used, and I think that’s what we’re seeing. “Okay, this is doing a good job of summarizing large amounts of info from various sources” … but we have to look at when we put it into use: how do we use it, and how does that impact outcomes? … So far what we’ve seen is that sometimes institutions implement a tool and they just let it go. They don’t continue to monitor it even if things change over time. We saw how the sepsis prediction model during the time of COVID started to shift, and people weren’t noticing and just kept on letting it run … It overburdened the care team with too many alerts, and it was missing cases where patients were becoming septic, and people should have been alerted … We need to continually monitor and measure outcomes or we will miss those cases when it is causing harm …
And here’s another piece: I worry that if we depend on these models to do our work for us … and we take the human out of the equation, then we’re in really big trouble, because we don’t even see the incorrect content that might be provided to the patient or given to other care providers.
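The kind of ongoing monitoring Justin calls for can be sketched in a few lines. This is an illustrative sketch only, not the method used by any hospital or by the sepsis model he mentions: the `PeriodCounts` structure, the thresholds, and every number below are made up for the example. It compares a recent period against a baseline on the two symptoms he describes, too many alerts and missed cases.

```python
from dataclasses import dataclass


@dataclass
class PeriodCounts:
    label: str
    patients: int      # patients scored by the model in this period
    alerts: int        # alerts the model fired
    true_cases: int    # patients who actually developed the condition
    caught_cases: int  # true cases that received an alert


def alert_rate(period):
    """Fraction of scored patients who triggered an alert."""
    return period.alerts / period.patients


def sensitivity(period):
    """Fraction of true cases the model alerted on (0.0 if there were no cases)."""
    if period.true_cases == 0:
        return 0.0
    return period.caught_cases / period.true_cases


def check_for_drift(baseline, current, max_alert_ratio=1.5, max_sensitivity_drop=0.10):
    """Return warnings when the current period departs from the baseline.

    Both thresholds are assumptions chosen for illustration.
    """
    warnings = []
    if alert_rate(current) > max_alert_ratio * alert_rate(baseline):
        warnings.append("alert rate rose")
    if sensitivity(baseline) - sensitivity(current) > max_sensitivity_drop:
        warnings.append("sensitivity fell")
    return warnings


# Hypothetical counts, not real clinical data.
baseline = PeriodCounts("validation period", patients=1000, alerts=100, true_cases=50, caught_cases=40)
current = PeriodCounts("recent period", patients=1000, alerts=220, true_cases=60, caught_cases=36)

print(check_for_drift(baseline, current))  # ['alert rate rose', 'sensitivity fell']
```

A check like this only flags that something has changed. Deciding why it changed, and whether patients are being harmed, still requires the human review the panelists emphasize.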
What is the role of the university in educating students about the ethical application of this technology, particularly as it’s being adopted so quickly?
Ying — When we talk about ethics, we always judge machine and human differently. Humans make all kinds of mistakes … So how do we compare human error and machine error?
Will — It’s paramount. For almost every university, at the core of its mission is an obligation to society. It is the university’s job to remind these students of their obligations to humanity. If universities don’t do it, I don’t know who will.
What’s the roadmap for using these applications in healthcare?
Justin — This is an issue we haven’t touched on: How do we certify AI before putting it to use in healthcare? There are calls for randomized controlled trials … We’re not going to be able to keep up with the pace of innovation by doing randomized controlled trials. They take too long, and people are going to put these things into use anyway … There need to be some alternative methods for evaluating these tools before implementing them and continuing to monitor them for shifts in outcome and impacts.