In 2023, the widespread availability of Large Language Models, such as ChatGPT, accelerated interest in using Artificial Intelligence (AI) to solve societal problems. Indeed, AI has abundant potential to improve quality of life, in healthcare and beyond, including by improving healthcare productivity through autonomous AI, as shown by our study.
Healthcare, while often of high quality, is increasingly costly, leading to growing access and affordability issues, which in turn lead to subpar clinical outcomes and health inequity, including in the US. In an attempt to address these issues, health authorities, governments, and other agencies have widely implemented rationing, cost controls, and substitution with lower quality services.
In our paper, we argue and show that these efforts do not get at the core of the problem, which is declining healthcare productivity. We point out how other industries, such as agriculture (ag), where high cost and lack of access caused similar problems in the past (famines, in the case of ag), were able to solve them, not through rationing, but by improving productivity. I see the effects of those massive ag productivity gains every day in my home state of Iowa. Growing productivity has led to massive increases in our quality of life across the board.
Maybe authors have not considered healthcare productivity in the past because they thought it could not be improved, and therefore focused on rationing, substitution, and other capacity constraints instead.
Intending to increase productivity is not itself a solution. This is illustrated by previous attempts to increase healthcare productivity through information technology, including the introduction of electronic health records, which, evidence shows, paradoxically often decreased productivity.
Proposing that autonomous AI can improve healthcare productivity is one thing; there also needs to be a way to measure such an effect, if any. Additionally, if the metrics being measured are to be relevant to decision makers, we need to measure them in a real-world setting, with real-world AI applications, not in a lab setting with a prototype that has never been exposed to ethical concerns, regulation, or the market.
Still too often, healthcare AI studies focus on what I have called ‘glamour AI’, where AI systems are compared to some theoretical norm, and any potential effects on patient outcome are not considered. And if outcomes are considered, the AI system is in prototype form. The journey of such a system to the real world (from prototype algorithm, through rigorous validation and ISO and GMLP certification, regulatory clearance by US FDA, the need for inclusion in standards of care, sustainable business models and reimbursement, widespread support by all healthcare stakeholders, and product/market fit) will alter that original prototype AI so much that in the end it is almost unrecognizable from when its effect on outcome was studied. Thus, the external validity of such studies for real-world productivity improvements, if any, will be severely limited. In fact, while multiple AI prototypes have been evaluated on some form of clinic productivity, none have received reimbursement, let alone sustainable widespread implementation, leaving the results with little relevance for decision makers.
Accordingly, we used a real-world, commercially available, widely implemented autonomous AI in our study. It has been developed under a strict ethical framework, has been rigorously validated against clinical outcomes for safety, efficacy, and lack of undesirable bias, has been shown to be more accurate than experienced clinicians (thus guaranteeing at least equal if not higher quality of care), and has received support from all healthcare stakeholders, including patient organizations, bioethicists, physician organizations, regulators, payors, and investors, as explained in our paper and many preceding publications.
In addition, we had to develop a metric and methods to measure clinic productivity in the real world. This is challenging in an outpatient setting, because most clinics use a scheduled system with specific regular slots for each patient. Designing a randomized controlled trial with and without autonomous AI in such a setting is complex, as doctors in the AI arm of the study are unlikely to finish their clinic earlier: the slots were already filled and cannot be adjusted dynamically depending on any effect the AI might have. The study would essentially be measuring the *expectation* of schedulers for what the productivity effect would be, and would thus be severely biased.
Rather, we designed the study to be unbiased, in a clinic setting with a so-called overloaded queue: in practice, there is a pool of patients waiting to be seen without formal appointments. In the US, only emergency departments have such a workflow, but in low-income countries, specialty clinics often also operate in this manner. That is why it was so exciting to be able to do this study, according to the mathematical model we developed, together with the Bangladesh clinical specialists who run such clinics. This uniquely allowed us to estimate productivity with minimal bias.
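To make the difference between the two clinic designs concrete, here is a minimal sketch in Python. It is not the mathematical model from our paper, and every number in it (a four-hour clinic, 20 booked slots, 12 versus 10 minutes per patient) is invented purely for illustration. The function names are likewise hypothetical. It only shows why a fixed schedule can hide a time saving that an overloaded queue reveals.

```python
# Illustrative sketch only: all numbers below are made up, not study data.

def patients_seen_scheduled(slots_booked: int, minutes_per_patient: int,
                            clinic_minutes: int) -> int:
    """Scheduled clinic: the patient count is capped by the slots booked in advance."""
    capacity = clinic_minutes // minutes_per_patient
    return min(slots_booked, capacity)


def patients_seen_queue(minutes_per_patient: int, clinic_minutes: int) -> int:
    """Overloaded queue: there is always another patient waiting, so only time limits the count."""
    return clinic_minutes // minutes_per_patient


def productivity_per_hour(patients: int, clinic_minutes: int) -> float:
    """Productivity expressed as patients seen per clinic hour."""
    return patients / (clinic_minutes / 60)


CLINIC_MINUTES = 240     # hypothetical four-hour clinic session
SLOTS_BOOKED = 20        # hypothetical number of pre-booked appointments
MINUTES_WITHOUT_AI = 12  # hypothetical physician time per patient
MINUTES_WITH_AI = 10     # hypothetical physician time per patient with AI

scheduled_without = patients_seen_scheduled(SLOTS_BOOKED, MINUTES_WITHOUT_AI, CLINIC_MINUTES)
scheduled_with = patients_seen_scheduled(SLOTS_BOOKED, MINUTES_WITH_AI, CLINIC_MINUTES)
queue_without = patients_seen_queue(MINUTES_WITHOUT_AI, CLINIC_MINUTES)
queue_with = patients_seen_queue(MINUTES_WITH_AI, CLINIC_MINUTES)

print("Scheduled clinic, patients per hour:",
      productivity_per_hour(scheduled_without, CLINIC_MINUTES),
      "vs", productivity_per_hour(scheduled_with, CLINIC_MINUTES))
print("Overloaded queue, patients per hour:",
      productivity_per_hour(queue_without, CLINIC_MINUTES),
      "vs", productivity_per_hour(queue_with, CLINIC_MINUTES))
```

In this toy example the scheduled clinic sees the same number of patients with or without the time saving, because the schedule was fixed beforehand, while the queue-based clinic sees more patients per hour when each visit takes less physician time.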
We were excited that the results confirmed our hypothesis, showing that autonomous AI, in a real-world setting, increases the number of patients receiving high quality eye exams per hour, improving clinic productivity (see Figure 1 below). In addition, we found several other advantages beyond the time saved by physicians. First, implementation of the AI system is scalable; in fact, during the pandemic, the AI system was deployed entirely remotely, without the AI creators setting foot in Bangladesh. Second, the introduction of the autonomous AI helped to address burnout among physicians and other providers, so that doctors can focus on what they enjoy, are good at, and were trained for. AI can be left to do the routine work, where human expertise is not needed.
Finally, back to LLM types of AI: a recent productivity study showed that using ChatGPT made BCG business consultants more productive. That study was in consultancy, not in healthcare, where ours is still the first one. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4573321