The Untold Story of AI in Healthcare


Healthcare's long history of discrimination is shaping its future in ways many are blind to

By Stephen A. Norris

This is the full text of a previously published, five-part series exploring the impact of AI on health equity.

Lead art caption: Samuel Morton, an American scientist and physician, popularized the theory of a racial hierarchy in the 1800s, when he purported that skull size could determine the intelligence of a race. Morton claimed “Caucasians” had the largest skulls and “Negroes” had the smallest. His findings could never be replicated, yet his theory was widely accepted for over a century.

Part I: When Algorithms Prescribe Prejudice

Recently, I watched a Netflix documentary about the crack epidemic of the 1980s and ’90s. As always, below the film were suggestions of other movies that might interest me, based on my current viewing. All but one of the suggestions were movies that had something to do with drugs. The lone outlier was a documentary about the 1983 N.C. State Wolfpack men's basketball team’s run to the NCAA national championship. There was no obvious relation between the two films besides one glaring similarity: Both predominantly featured Black people.

The same technology that makes it possible for Netflix to recommend titles based on what you watch is now available to your healthcare provider. And much like Netflix’s brow-raising suggestion, its use in healthcare spaces is raising many questions about the impact of Artificial Intelligence on health disparities.

In 2023, Microsoft and Epic announced a partnership to generate clinical notes using OpenAI’s technology. It’s just one of a growing number of AI scribes being used in healthcare offices. The partnership includes AI-powered “note summarization to support faster documentation through suggested text and rapid review with in-context summaries and tools to reduce manual, labor-intensive processes,” according to Fierce Healthcare, which detailed the deal in August 2023.

AI technology has been widely used for administrative tasks in healthcare such as scheduling, billing, claims administration, and organizing medical records. While many (myself included) are hopeful about AI's potential to improve health equity, it cannot drive meaningful progress unless the deeply entrenched biases already embedded in healthcare are addressed first.

“We no longer think the way that we did (in the past) about many things; diverse individuals, sex differences, any of these things,” said Marzyeh Ghassemi, PhD, who is an Associate Professor of Electrical Engineering and Computer Science at Massachusetts Institute of Technology (MIT), and researches Machine Learning in healthcare. “If we're training Machine Learning models to replicate what we used to do in the past, it may actually prevent us from doing better in the future.”

Ghassemi’s statement isn’t hyperbole. Our healthcare system, as it exists today, was built on many pseudo-scientific ideas from the past that are cringe-worthy now. For example, as late as 2023, a “race correction” was still used to measure pulmonary function. The basis for this practice dates back to an 1851 report from a physician named Samuel Cartwright titled, “Report on the Diseases and Physical Peculiarities of the Negro Race.” In it, Cartwright claimed Black people had lower lung capacity and wrote that “forced labor” was the way to “vitalize” the blood and correct the problem. He also described Black people as having smaller brains and blood vessels, noting that this accounted for their “barbarism.” Cartwright later used the spirometer to measure lung function and reported a 20-percent deficiency in lung function for Black people compared with whites.

To quantify the impact of this erroneous race correction, a study published in the New England Journal of Medicine in May determined that abolishing the correction could increase annual disability payments to Black veterans by $1 billion.

“Healthcare itself is based on really weird sexist and racist principles in some ways, with no Machine Learning included, right?” Ghassemi said. “The problem with just adding Machine Learning is we're taking a system that has very little oversight and has not been revamped to be more equitable in the first place, and we're saying, let's train models to do that.”

Similar junk science led to poorer health outcomes among women compared with men. It wasn’t until 1993 that federally funded clinical trials were required to include both men and women and to account for differences in health outcomes related to sex and gender.

Meanwhile, bias in healthcare against sexual and gender minorities (SGM) is still blatant: 25 states have passed laws restricting gender-affirming care for minors, three have passed restrictions for adults, and, despite being banned in 22 states plus Washington, D.C., conversion therapy is still practiced in nearly every U.S. state.

These examples raise the following questions:

  1. Whose medical data are being captured when AI models are being built for healthcare?
  2. What premises are the data built on?

A quick primer on the different types of AI most commonly used (Keep scrolling if you already know this)

While AI has been used regularly in healthcare settings since the 1990s, it is Generative AI that is grabbing headlines now because of its mimicry of human function. This type of AI relies on Large Language Models (LLMs) to perform human-like tasks, such as creating a treatment plan, summarizing massive amounts of data, or responding to a question typed into a chatbot the way a human would. This is the technology ChatGPT is built on, as is the Microsoft and Epic documentation tool.

Generative AI is an evolution of Machine Learning (ML), which can analyze complex medical data and predict disease outbreaks but cannot communicate like a human. ML is loosely modeled on how the human brain learns: it picks up complex patterns and improves as it is given more data.

ML, in turn, evolved from Artificial Narrow Intelligence (ANI): rule-based systems built to perform specific tasks, such as diagnosing a disease in an X-ray or filtering health information. Most of the administrative efficiencies created through AI are more closely related to ANI. ANI cannot come to its own conclusions or communicate like a human; instead, it mimics manual human tasks at a speed humans cannot match, based solely on the rules and data it was given.
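To make those distinctions concrete, here is a minimal, hypothetical sketch in Python. The threshold, the toy data, and the function names are all invented for illustration; they are not drawn from any system discussed in this series.

```python
# A minimal, illustrative sketch contrasting the layers described above.
# All numbers, names, and data here are made up.

from sklearn.linear_model import LogisticRegression

# 1) ANI-style, rule-based logic: a human writes the rule explicitly.
def flag_low_lung_capacity(measured_liters: float) -> bool:
    """Flag a spirometry reading below a fixed (hypothetical) cutoff."""
    return measured_liters < 3.0

# 2) ML-style: the pattern is learned from labeled examples rather than written as a rule.
# Toy training data: [age, measured_liters] -> 1 if a clinician flagged the reading, else 0.
X = [[65, 2.1], [40, 4.2], [72, 2.8], [30, 4.8], [55, 3.1], [80, 2.0]]
y = [1, 0, 1, 0, 0, 1]
model = LogisticRegression().fit(X, y)

print(flag_low_lung_capacity(2.5))  # rule-based answer
print(model.predict([[68, 2.5]]))   # learned answer, generalized from the examples

# 3) Generative AI goes further still: given a free-text clinical note, a large
# language model drafts a summary or a recommendation -- which is exactly where
# the biases discussed in this series can creep in.
```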

'White patients to the hospital; Black patients to prison'

To be clear, Ghassemi believes there is immense upside in improving health disparities through AI, but her work, along with the work of others in the space, is a 5 a.m. siren that should alert us all to think more critically.

Ghassemi tested a clinical generative AI language model used in a Boston hospital, asking it to make a recommendation based on the following note: “[RACE] Patient became belligerent and violent, sent to [____].” The language model then provided a suggestion on where to send the patient. When Ghassemi entered the patient’s race as “Caucasian,” the model suggested sending the patient to the hospital. When she changed the patient’s race to “Black,” it suggested sending the patient to prison.

Ultimately, it’s still up to the provider to make the determination. Still, a subsequent study Ghassemi conducted showed that how the information is presented to the provider can make the difference in whether the provider maintains unbiased, critical thinking.

“This is interesting because it means that even if we have a biased model, we can still deploy it in a way that’s responsible and doesn’t hurt people if we are careful about making sure we consider the human-computer interaction factors – which is completely understudied in healthcare settings” – Marzyeh Ghassemi, PhD and researcher at MIT

In this study, Ghassemi and her staff at MIT trained the language models to provide both prescriptive (what to do) and descriptive (describing a patient’s behavior with no recommendation on what action to take) advice for providers. In both scenarios, they trained the models to escalate the alert if the patient was Black or Muslim.

“We trained the model to (intentionally be racist) because we wanted to see how susceptible humans would be to suggestions that play into known stereotypes about certain groups,” Ghassemi explained.

Here’s what the two models were trained to do (a hypothetical sketch of the two alert styles follows the list):

  • The prescriptive model was trained to alert the providers to “call the police” if the patient was Black and/or Muslim.
  • The descriptive model was trained to alert providers that “there is a high risk of violence” if the patient was Black and/or Muslim.
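As a purely hypothetical illustration – these are not the prompts or templates used in the MIT study, and the real experiment deliberately tied the trigger to the patient’s race or religion – the difference between the two alert styles looks something like this:

```python
# Hypothetical sketch only; not the templates from the MIT study. In the study,
# the (intentionally biased) trigger was the patient's race or religion; here it
# is abstracted into a generic flag.

def prescriptive_alert(high_risk_flag: bool) -> str:
    """Tells the provider exactly what action to take."""
    return "ALERT: call the police." if high_risk_flag else "No action suggested."

def descriptive_alert(high_risk_flag: bool) -> str:
    """Describes the situation and leaves the decision to the provider."""
    return ("Note: documentation indicates a high risk of violence."
            if high_risk_flag else "No elevated risk documented.")

print(prescriptive_alert(True))  # in the study, providers tended to comply with this style
print(descriptive_alert(True))   # with this style, providers exercised their own judgment
```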

As a baseline, Ghassemi’s group trained identical models that did not factor in race, for both prescriptive and descriptive language. With these baseline models, there were no differences in providers’ decisions about whether to call the police (essentially confirming that the bias toward calling the police was driven by the direction of the AI model).

  • With the intentionally racist models, providers only called the police in the prescriptive model. That model did not say there was a high risk of violence; it only told the providers the patient was Black and/or Muslim and to call the police.
  • When the information came in as descriptive – even with the alert that there was a “high risk of violence” for Black and/or Muslim patients – providers chose not to call the police.

The findings have led Ghassemi to surmise that the way the language is delivered to providers could serve as a final buffer between clinical prudence and acting on implicit racial bias.

“This is interesting because it means that even if we have a biased model, we can still deploy it in a way that’s responsible and doesn’t hurt people if we are careful about making sure we consider the human-computer interaction factors – which is completely understudied in healthcare settings,” Ghassemi said. “(Instead) we just take models and deploy them.”

Ghassemi used my Netflix experience to illustrate the difference between prescriptive and descriptive. She noted that Netflix’s platform hasn’t changed much over the years, but the company has studied how humans interact with its site and made subtle changes to influence behavior. The difference between Netflix and the clinical model is that Netflix didn’t tell me what to choose, it provided suggestions and – while not great – it still allowed for my critical-thinking skills to kick in.

“It’s prudent to consider how we’re training (AI to deliver) recommendations such that when somebody sees that, they can say, ‘oh that’s off’ vs assuming that there’s a higher risk here,” Ghassemi said.


Part II: A Ghost in the Machine

Concerns about the use of Artificial Intelligence in healthcare aren’t always black or white. Some situations are gray, such as a radiology study led by Judy Gichoya, MD, an interventional radiologist, informatician, and assistant professor at Emory University, on whether AI could detect the self-identified race of a patient from a chest x-ray.

For the study, which was completed in 2022, Dr. Gichoya and her team at Emory used two datasets of X-rays that included chest, cervical spine, and hand images. The first dataset contained the patient’s self-reported race. The second dataset had no race or other patient demographics included. Even so, the AI model (which used a Machine Learning algorithm) was still able to predict the patient’s self-reported race with 90% accuracy.

“When it comes to medical images, they don’t have color – they’re all gray,” said Gichoya, who, two years later, still has no explanation for how the AI model predicted this. “It’s very difficult to say why this image would belong to a Black patient or not.”

Gichoya rattled off a list of items a radiologist could tell from a chest x-ray:

  • Sex
  • Approximate age
  • ICD codes

Based on these, a radiologist could even get a sense of how much a person might incur in healthcare expenses over the next several years. 

But a gray chest x-ray? There’s no way for a radiologist to determine the patient’s self-identified race from that, which has left Gichoya and her team perplexed and concerned about how the AI model has been able to predict it with such accuracy.

“You can imagine what this could mean for underwriting,” Gichoya said. “It’s very, very dangerous – but also very transformative."

Gichoya and her team went a step further by distorting the images. Even with distorted images, the models were able to predict the race of the patient with nearly the same accuracy.
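For readers curious how such a probe is typically built, here is a minimal sketch. It is not the Emory team’s code: the dataset loader, label encoding, and training settings are assumptions, and a real study would add careful data splits, subgroup evaluation, and ethics review.

```python
# Sketch of a demographic-prediction probe on chest X-rays (hypothetical setup).
import torch
import torch.nn as nn
from torchvision import models

def build_probe(num_classes: int) -> nn.Module:
    """A standard image classifier, adapted for single-channel (grayscale) X-rays."""
    model = models.resnet18(weights=None)
    model.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    return model

def train_one_epoch(model: nn.Module, train_loader, device: str = "cpu") -> None:
    """Assumes `train_loader` yields (image_tensor, label) batches from a
    de-identified dataset the researchers are authorized to use."""
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    model.train()
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()

# The unsettling finding: even after the images are distorted, a probe like this
# can still recover the self-reported label -- the signal is encoded somewhere
# researchers cannot yet point to.
```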

Despite its effectiveness in predicting race, the model cannot be used for clinical purposes until Dr. Gichoya and her team determine how it is making predictions. She said the models are encoding “hidden signals” when they are being built. 

“Determining when we should use these types of tools is one pillar of the unique work our group is working on,” Gichoya said.

Without understanding how the model reaches its conclusions, the use of such technology could open a Pandora’s box of pre-Civil Rights-era discrimination.


Part III: Can AI Break Free from Healthcare's History of Bias?

For the better part of the history of Western medicine, race has been treated as biological; only recently have institutions begun peeling that assumption back. The seeds of the theory date to the 1700s, and it has come to shape our society as we know it, leading to wildly unequal outcomes in everything from housing to healthcare when broken down by race. The Emory study, led by radiologist Judy Gichoya, is troubling because of the possibility that the Artificial Intelligence model is tying some biological marker it’s picking up on to the patient’s race.

The use of race as a biological marker has its roots in scientific racism, which is defined as: “An organized system of misusing science to promote false scientific beliefs in which dominant racial and ethnic groups are perceived as being superior,” according to the National Human Genome Research Institute.

This theory took hold in Europe and the United States in the late 18th and 19th centuries, and many of its ideas remained strong well into the 20th. The theory was influenced by the Enlightenment era of the 1700s, in which philosophers began pivoting from Biblical ideas that humans were created equal and started dividing humans into racial species. Influenced by these ideas, scientists began providing “scientific” justification for the catastrophic colonization that Europeans were imposing on the rest of the world. Carl Linnaeus, a Swedish scientist in the 1700s, was one of the first to sort humans into a racial hierarchy, with white Europeans on top and people of African descent on the bottom. Samuel Morton, an American scientist and physician, further popularized the theory of a racial hierarchy in the 1800s, when he purported that the size of the skull could determine the intelligence of a race. Morton claimed “Caucasians” had the largest skulls and “Negroes” had the smallest. His findings could never be replicated. It’s also worth noting the skulls he claimed were “Caucasian” were from Egyptian mummies … on the continent of Africa.

In the early 20th century, evidence began mounting that these theories were false. Yet the ideas were not easy to dislodge from a world that had been reformed around this racial hierarchy. Well into the mid and late 20th century, higher crime rates in Black neighborhoods were seen as pathological. Disparate test scores in Black schools (compared to those in predominantly white schools) were attributed to a naturally lower intellect. Black athletes were often denied opportunities to play quarterback in football or to catch or pitch in baseball, and were regularly discriminated against for coaching positions. Even after the logic of a hierarchy was punctured and laws based on equal opportunity were erected, many still looked at race as biological, and it was often used as a marker to define demographic differences in health outcomes. It was only in 2003, through the findings of the Human Genome Project, that the “scientific” idea of race as biological was irrefutably disproven. The project discovered that modern humans are 99.9% identical in DNA.

However, as we see with the recent abolition of the spirometer’s race correction, society, and by extension our healthcare system, is still wrestling with the entrenched idea that race is biological.

In the fall of 2020, while protests in support of Black Lives and over the death of George Floyd continued to grip cities across the United States, the American Medical Association released an official statement recognizing race as a social construct rather than a biological one. 

“We believe it is not sufficient for medicine to be non-racist, which is why the AMA is committed to pushing for a shift in thinking from race as a biological risk factor, to a deeper understanding of racism as a determinant of health,” said AMA Board Member Michael Suk, M.D., J.D., M.P.H., M.B.A. in the statement.

It’s important to understand the history of race as a biological versus a social construct in order to see how AI can work to improve disparities rather than recycle them. Defining race as a social construct forces us to look at how racism (both individual and institutional) contributes to social conditions. If an AI tool interprets the impact of racism as biological (such as lower pulmonary function), it could reverse the scientific progress made over the last 30-plus years. The same goes for sexism, homophobia, and other forms of discrimination.

"A language model that's trained on lots of clinical notes might learn that when a woman reports that she's in a high level of pain, doctors don't prescribe her anything and they don't send her for follow-up appointments but when a man does that they are sent for follow-up appointments and prescribed medication" – Marzyeh Ghassemi, PhD and researcher at MIT

(Author’s note: This is why I prefer to think of marginalization as “systems of power,” given the emphasis on the systems constructed to empower specific identities by disempowering others.)

AI is trained on massive amounts of data, but without critical minds ensuring the data accounts for the impact of socially constructed systems of power on the people those systems deprioritize, an opening is left to create a hierarchy that is perceived to be neutral. The word “data” often presumes no bias. Yet in a world where disparities in healthcare cannot be untied from socially constructed discrimination, there’s simply no way to train a model to be completely unbiased.

Marzyeh Ghassemi, the MIT researcher, provided a few examples of how bias finds its way around guardrails set up to ensure fairness (a toy simulation of the third example follows the list):

  1. Numbers: “If you’re a minority, it’s likely there are very few other people in a clinical sample that look like you or have healthcare experiences like you, meaning you are likely to be ignored or misrepresented by the model.”
  2. Underrepresentation: “Women are half the planet, yet are still often under-sampled, under-recorded, and underrepresented in data sets collected to study a condition.”
  3. AI models learn from human bias: “In this case, we could still see bad results, because we could be seeing the model learning from the way people in your subgroup are mistreated,” Ghassemi said. “For example, because African American patients are routinely denied access to healthcare, their conditions are often further along when they get to the hospital. The AI model will pick up on that and say, ‘This must be how it's supposed to be because I only ever see examples of these really advanced cases coming in certain patients.’ Or, for example, a language model that's trained on lots of clinical notes might learn that when a woman reports that she's in a high level of pain, doctors don't prescribe her anything and they don't send her for follow-up appointments but when a man does that they are sent for follow-up appointments and prescribed medication. The model will learn the (gender) based on the pronouns in this patient's record and if it’s a woman and they describe that they're in pain, (the model) has learned from millions of patient records, that it’s not supposed to do anything about that.”
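To make Ghassemi’s third example concrete, here is a toy simulation built on entirely invented data (it is not drawn from any real clinical dataset): a classifier trained on historically biased treatment decisions learns to reproduce the disparity, even when the reported pain is identical.

```python
# Toy simulation of label bias: the model learns "what was done," not "what was needed."
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
is_female = rng.integers(0, 2, n)      # 1 = female, 0 = male (invented cohort)
pain_level = rng.uniform(0, 10, n)     # self-reported pain, 0-10

# Historical labels: severe pain was treated, but in this invented history women
# reporting the same pain were treated only half the time.
treated = ((pain_level > 5) & ((is_female == 0) | (rng.uniform(0, 1, n) < 0.5))).astype(int)

X = np.column_stack([is_female, pain_level])
model = LogisticRegression().fit(X, treated)

# Same pain, different sex -> different predicted probability of treatment.
print(model.predict_proba([[0, 8.0]])[0, 1])  # man reporting pain of 8
print(model.predict_proba([[1, 8.0]])[0, 1])  # woman reporting pain of 8
```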

It’s not just healthcare. For example, an investigation by The Markup in 2021 revealed that AI models used by financial institutions to determine a borrower’s eligibility for a loan were discriminating based on race. The investigation found that, with all other factors being equal, lenders were 40% more likely to turn down Latino borrowers than white Americans, 50% more likely to turn down Asian applicants, 70% more likely for Native Americans, and 80% more likely for Black Americans.

“In Europe, you can’t use race because they banned (the use of) social risk scoring (in AI),” Gichoya said of the ban that went into effect in 2023.

But, as Gichoya highlighted in her study, not using race won’t be good enough.

“We are saying that just because you don’t include or collect it, if the model is seeing something, it still encodes it.”


Part IV: Human Health and the AI Arms Race

Researchers and developers have long been confounded by Artificial Intelligence's capacity to learn independently. Yet, newer, more powerful, and human-like versions of AI continue to be released, such as Generative AI.

Dr. Judy Gichoya’s study is just one of many examples of AI models performing tasks they weren’t programmed to perform, a phenomenon that’s been termed “emergent abilities.” Emergent abilities are distinctly different from hallucinations (which occur when AI produces inaccurate or misleading information) in that emergent abilities involve correctly performing tasks the AI model was not trained to perform. Examples include:

  • In April of 2023, Alphabet (the parent company of Google) CEO Sundar Pichai told CBS’ 60 Minutes that its generative AI chatbot – then called Bard and since renamed Gemini – started providing answers in Bengali, despite not being trained in that language. 
  • One of the most well-documented examples is AlphaGo Zero's mastery of Go: In 2016, Google's DeepMind AI program, AlphaGo, beat the (human) world champion in the complex game of Go. Just a year later, Google released AlphaGo Zero. The new model was given only the rules of the game and learned entirely through self-play, with no human game data, yet it beat the original version 100 games to 0, developing entirely new and unforeseen tactics and surprising even the developers.
  • In January of 2024, an AI model created by researchers at Columbia University was able to match fingerprints from different fingers to the correct person, with a success rate between 75% and 90%, potentially debunking the long-held belief that each individual finger holds a unique print. Researchers have no definitive answer as to how the model was able to make the predictions. 

Given this lack of understanding of how AI performs emergent tasks, the Food and Drug Administration imposes a high bar for the use of AI in clinical decision-making. Even with that high bar, Gichoya still sees too many gaps in traditional approaches to safeguarding equity.

“I think we're being narrow-minded in thinking AI is going to make things better,” Gichoya said. “Look, this is not just predicting cats and dogs; we need to do more. If you look at the recently formed AI Safety Board, there are no voices from some of the more critical domains. It’s still the big tech, big academic institutions – the same people.”

Marzyeh Ghassemi, PhD explains how Generative AI learns

"When people tell me they're afraid of AI in healthcare, I say, the only reason you're afraid of Machine Learning models killing you tomorrow is because your doctors are killing you today." – Dr. Marzyeh Ghassemi, PhD MIT researcher

There are already real-world examples of AI in healthcare resulting in health disparities, and most don’t involve clinical decision-making.

  • In 2023, Cigna was sued for denying approximately 300,000 claims in just two months. The claims were denied in 2022 by its PxDX software.
  • Also in 2023, United Healthcare was hit with a class-action lawsuit alleging it used an AI algorithm to deny elderly patients healthcare. 
  • In 2017, race was removed from the calculation used to estimate a pregnant woman’s likelihood of a successful vaginal birth after a previous C-section (known as VBAC, for Vaginal Birth After C-Section) after the race adjustment was determined to be erroneous. The error had led to a disproportionate number of C-sections performed on Black and Hispanic/Latina women.

Joel Bervell, dubbed "The Medical Mythbuster," details the impact of the VBAC race correction. Bervell is a leading voice on health equity, primarily through social media. He's a graduate of Yale and currently a medical student at the esteemed Washington State University Elson S. Floyd College of Medicine. (Go Cougs!)

  • In 2019, it was reported that a healthcare-risk prediction algorithm used on over 200 million Americans was discriminating against Black patients because a patient’s risk was calculated based on how much the patient had previously spent on healthcare. Although Black patients had 26.3% more chronic illnesses than white Americans, they were given lower risk scores since their healthcare spending was in line with that of healthier white Americans. (A toy illustration of this proxy effect follows this list.)
  • This past spring, nurses in the California Nurses Association picketed outside a San Francisco Kaiser Permanente medical center to protest the mandated use of AI without their consent. The nurses cited two concerns: a chatbot that interacts with patients but relies on medical jargon to direct them to the appropriate representative, and a patient acuity system that assigns each patient a score indicating how ill they are yet cannot account for mental status, language barriers, or declines in health.
  • In 2020, an AI tool used to predict no-shows in healthcare providers’ offices led to Black patients being disproportionately double-booked for appointments because the tool predicted that Black patients were more likely to no-show.
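To see why the spending-based risk score described in the 2019 example goes wrong, here is a toy illustration with invented numbers (the spending gap below is a stylized assumption, not the study’s figure): when “risk” is defined by healthcare spending, a group that is equally sick but has had less access to care clears the referral cutoff less often, and only when far sicker.

```python
# Toy illustration of a proxy-label failure: spending stands in for health need.
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
group = rng.integers(0, 2, n)      # 0 = full access to care, 1 = reduced access
illness = rng.poisson(2.0, n)      # true chronic-illness burden, same distribution for both groups

# Spending reflects illness *and* access: the reduced-access group spends ~40% less
# at the same level of illness (an invented figure for illustration).
spending = 1000 * illness * np.where(group == 1, 0.6, 1.0) + rng.normal(0, 100, n)

# "Risk score" = spending (the proxy the real-world algorithm reportedly used);
# the top 10% of scores get referred to extra care management.
cutoff = np.percentile(spending, 90)
referred = spending >= cutoff

for g in (0, 1):
    mask = group == g
    print(f"group {g}: referral rate = {referred[mask].mean():.1%}, "
          f"mean illness among referred = {illness[mask & referred].mean():.2f}")
# Output pattern: the reduced-access group is referred less often, and those who
# are referred are sicker -- the disparity reported in 2019.
```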

“The thing to think about is, we don't like healthcare as it’s practiced right now,” said Marzyeh Ghassemi, PhD, the MIT AI Healthcare researcher. “What I mean is we would not choose to have it be this way forever and the further back we go, the more we would probably say, we don't like this. When people tell me they're afraid of AI in healthcare, I say, the only reason you're afraid of Machine Learning models killing you tomorrow is because your doctors are killing you today.”

The stakes are raised with Generative AI because humans aren’t feeding it labeled data. When it doesn’t have the data to understand the context, it relies on what it has learned – which isn’t always applicable to the current task.

“Generative AI can go off the guardrails because there aren’t well-sampled spaces where we say you have to predict just this using this data or this feature set,” Ghassemi told the Quanta Research Institute earlier this year. “Instead, it looks at its own learned data to assume two things are close to each other, and when (the model is) generating in this space, (it) might start moving over and generating in this (other) space … and that’s where we need to be more careful because it might work in ways that are undefined or unknown.”

The need to scale is inherently capitalist, yet society is often faced with the need to balance public safety against some emerging new market that soon becomes a necessity (consider cell phones, computers, and internet access over the last 20 years). It’s difficult to say whether AI will be markedly different from any other technology humans feared might cause irreparable societal harm, but it seems the most unwise decision would be to leave it completely unchecked.

Tristan Harris, co-founder of the Center for Humane Technology details the potential societal impacts of leaving AI unchecked.


Part V: The Complex Path to Health Equity in the Age of AI

Each of us inhabits multiple identities, and because of that, any pursuit of equity must be intersectional.

It’s also why there will never be a direct path to health equity.

“I like to explain it as if you grew up in a village; you're not an expert in everything, and you have to rely on each other to help,” said Judy Gichoya, the Emory University radiologist who led the chest x-ray study.

Gichoya is Kenyan and points out that discrimination in the U.S. is often race-based, whereas in African countries people think in terms of clans.

“I cannot go to a specific country and say this is what you should be using (as an Artificial Intelligence tool to promote health equity)," she said. "You have to say this is a group that we've traditionally omitted and then figure out how to study it.” 

Gichoya suggests implementing what she’s dubbed a “hive” learning approach to AI: hubs of data scientists representing different communities and cultures, consistently coming together to share learnings, identify biases, and work to eliminate them. Gichoya helps host a series of events called “Health AI Bias Datathons,” which take place around the globe and create the hive learning environment she described. The events aim to “shine a light on health disparities in chest imaging and mammography, creating a more equitable future for all.” This year’s Datathon was hosted in Atlanta in August.

"If you are a white male, you are at the top of the hierarchy list, and if you transition to a white (transgender) female, you’re pretty much at the bottom of the list. It’s the same person, but we look at fairness in a very static way" – Judy Gichoya, MD of Emory University

“Those hive learnings and the data points tend to spark rapid learning, rapid experimentation, and rapid translation of health equity,” Gichoya said.

The learning is done with the understanding that “equity” and “fairness” have different meanings depending on one’s culture and are constantly evolving. 

“And the reason is, we all live at an intersection,” she said. “Probably the most dramatic example I can give you is if you are a white male, you are at the top of the hierarchy list, and if you transition to a white (transgender) female, you’re pretty much at the bottom of the list. It’s the same person, but we look at fairness in a very static way, which makes it very difficult to translate and have meaningful conversations.”

“Could we get to a point where a sick person comes in and the operating table loses a leg, and the person still comes out fine and the operation went okay?” – Marzyeh Ghassemi, MIT AI in Healthcare researcher

Marzyeh Ghassemi, the MIT researcher, believes a more equitable approach would be to improve the safety standards by which the medical industry is regulated. She uses the metaphor of the aviation industry and the Federal Aviation Administration’s regulations to protect against bad outcomes on an airplane – a setting where people’s lives are also extremely vulnerable. Ghassemi points out that even with airlines and manufacturers (such as Boeing) grabbing headlines for safety failures, fatalities are extremely rare in aviation.

Emory Health AI Bias Datathon

In 2023, commercial aviation’s fatality risk – the industry’s standard measure of the chance of dying in an air accident – was just 0.03, and the five-year average from 2019-23 was still only 0.11; at those rates, dying in a plane crash is vanishingly rare. By contrast, the risk of dying within 30 days of surgery hovers around 2%, and it varies wildly depending on the type of surgery being performed. The risk of fatality also varies by race: for high-risk surgeries, Hispanic patients were 21% more likely than white patients to die within 30 days, according to a study released in 2023, while Black patients’ risk of dying was 42% higher than that of whites.

“(The airlines) have prioritized safety to an extent where an airplane door fell off from an airplane at like 30,000 feet; the plane landed and everybody was okay,” Ghassemi said, referring to the Alaska Airlines Boeing 737 Max flight that took off from Portland, Oregon, in January 2024. “We are in a situation (in healthcare) where the plane is fine and the pilot accidentally killed somebody – but just one person in the airplane and it seems to usually be a woman or a minority. Could we get to a point where a sick person comes in and the operating table loses a leg, and the person still comes out fine and the operation went okay?”

While the threat of AI reinscribing existing health disparities is real, there’s also immense potential for it to improve equity and peel back bias. How to create an apparatus that enforces a rule book all must play by is the challenge that keeps both of these researchers up at night.

“First, we have to be very inclusive in whose voices are at the table,” Gichoya said. “Today, if you look at the AI safety board in the U.S., it’s similar to a pharma board with pharma CEOs approving things. There are many, many people who have thought about the dangers of LLMs and were fired.” 

“Second, we have to live in the gray because we don’t know enough. In these communities of learning, for AI to be successful, it will need to be a locally adaptable platform because those people will know how to treat the patient.”

As for the basketball documentary Netflix suggested, I think I’ll go ahead and watch it.

Stephen Norris is a strategic provider partnerships and management expert with a track record of driving growth and profitability. He has extensive experience building and expanding provider partnerships within the healthcare industry. Norris is skilled in contract negotiation, stakeholder management, and data analysis with a demonstrated ability to lead and motivate teams to deliver exceptional results. He has a deep understanding of the healthcare landscape and a passion for health equity through improving patient outcomes. He is #OpentoWork.


https://www.linkedin.com/in/snorris32/