Review

Towards inclusive biodesign and innovation: lowering barriers to entry in medical device development through large language model tools

Abstract

In this narrative review, we discuss the potential role of large language models (LLMs) in medical device innovation, with examples using generative pretrained transformer-4 (GPT-4). Throughout the biodesign process, LLMs can offer prompt-driven insights, aiding problem identification, knowledge assimilation and decision-making. Intellectual property analysis, regulatory assessment and market analysis emerge as key LLM applications. Through case examples, we underscore LLMs’ ability to democratise access to information and expertise, facilitating inclusive innovation in medical devices, as well as their effectiveness in providing real-time, individualised feedback for innovators of all experience levels. By lowering entry barriers, LLMs may accelerate advancements and foster collaboration among established and emerging stakeholders.

Introduction

Accelerating safe and effective innovation is crucial for the advancement of healthcare and the betterment of patient care. Breakthroughs have the potential to transform diagnosis, treatment, monitoring, health condition management and outcomes, pushing the boundaries of science and medicine. However, device innovation from conception to commercialisation is highly complex, lengthy and costly, posing considerable hurdles for novice entrepreneurs and start-ups. Expenditures can quickly become daunting in such core areas as market analysis, research and development, intellectual property protection, regulatory testing and clinical trials.1 Recent progress in large language models (LLMs), such as generative pretrained transformer (GPT), may expedite innovation and lower the barriers to entry for all potential players: an LLM can wear the hats of many different device-related or innovation-related consultants and readily offer a variety of information drawn from the vast volume of text it was trained on.2

For the purposes of this narrative review, we focus on LLM applications related to GPT-4 (OpenAI, San Francisco, California, USA), an advanced artificial intelligence (AI) language model capable of generating human-like text by leveraging state-of-the-art natural language processing techniques.3 We discuss the feasibility of implementing GPT-4 to accelerate medical device innovation, as well as how GPT-4 can be prompted to assist in the innovation process. Integrating GPT-4 into medical device development can make biodesign more inclusive of novice or resource-limited innovators by easing access to information and lowering financial barriers.

We discuss high-impact applications of LLMs in the biodesign process (figure 1) as described by Zenios et al.4 Analysing the utility of GPT-4 at each step of the biodesign process shows how each subcategory can be enhanced or streamlined, or can save funding for an innovator. GPT-4 has been reported to aid decision-making through forecasting when the answer to a question cannot be deduced from the present context, and sound decision-making can lead to efficient execution of steps through the biodesign process.5 To get the most out of LLMs, careful prompting is crucial. In subsequent sections, we elaborate on example use cases and prompts that innovators can use at each stage of development and illustrate the time and resources they may save by doing so.

Figure 1

Areas of application of large language models-based tools in the early stages of the biodesign process.

Problem identification: LLMs to identify trends

In the traditional biodesign process, problem identification begins with direct clinical exposure or with problems described by a clinical collaborator, an invaluable and irreplaceable resource. Once a problem is identified, an understanding of its context and scope is essential. We suggest accelerating this step with internet browsing-enabled GPT, which can access contemporary data and is not limited to the training cut-off of September 2021. Using a sequence of prompts, innovators can search for potentially meaningful trends across the internet and the large volume of literature the LLM is trained on (figure 2A). Additionally, with appropriate use of retrieval-augmented generation (RAG), the LLM can access financial data, economic history and regulatory guidelines for a specific space. The external resources that provide such information may be public or private databases. Browsing capabilities and RAG can reduce the risk of the LLM producing a hallucination (ie, a made-up response arising from a gap in its training data), helping to ensure that the most applicable and accurate information is shared with the user.6 Identifying trends may validate an innovator’s preconceptions against ongoing published work in the field, while the absence of developments may identify clinical gaps or signal caution. Given the foundational importance of problem identification for successful device innovation, trend analysis with LLMs can help innovators by providing a high-level review of a problem space.7

Figure 2

(A) Generative pretrained transformer-4 (GPT-4) browser tool for problem space identification. (B) GPT-4 direct prompting for problem space identification.

Problem identification is followed by solution design, which requires iterative development with input from a variety of stakeholders, most importantly the primary end-users, or clinical staff. Consequently, access to clinical stakeholders is integral, and their input may significantly alter, obviate or accelerate the identification of end-user needs. To assist innovators in the clinical space, LLMs can offer targeted medical knowledge about a system of focus, giving users a basic understanding of the principles relevant to their innovation and a base of knowledge to build on in conversation with medical experts.8 For instance, querying the anatomical and physiological impacts of a potential medical device (figure 3A,B) lets a non-expert innovator arrive at a clinical consultation better informed, so that the discussion can move to higher-order questions in a time-effective manner. Similarly, LLM tools can be used to understand the market alignment of a problem statement: recent research has demonstrated GPT’s ability to run a market analysis using sound economic logic, although the accuracy of this reported zero-shot approach may be improved by RAG.9 In complex clinical scenarios, employing a RAG system that references the most up-to-date external clinical information allows non-experts to efficiently elevate their understanding of a given problem space.

Figure 3

(A) Example prompt for physiological knowledge acquisition and identifying other knowledge gaps. (B) Generative pretrained transformer-4 (GPT-4) plugin, AskYourPDF, as a tool for acquiring physiological knowledge.

An important caveat within the realm of design concepts is that GPT-4 should not be prompted to create concepts for developers. Because GPT-4 is trained on already-published data, any conceptual ideas it generates may not be truly novel. However, GPT-4 can still be useful in concept screening.

Final concept selection

Once the candidate concepts are narrowed, it is important for innovators to assess the intellectual property landscape through a prior art analysis, understand the general regulatory pathway of the device and align the device with reimbursement practices to evaluate its overall development pathway.

Intellectual property landscape

The first step towards identifying patentable or novel features of a design is a prior art analysis, commonly known as a patentability search, which assesses the novelty or uniqueness of a device against existing patent documentation.10 This is a daunting task that typically requires the engagement of patent agents or lawyers and can cost thousands of dollars, with average attorney hourly rates ranging from US$200 to US$500.11 As such, an initial analysis is typically carried out by the innovator, which also benefits their understanding of the intellectual property landscape within their medical device space.4 While GPT-4 cannot replace the need for a lawyer, it can help the innovator enhance their prior art analysis and confidently work through this stage of development before considering protection options. We demonstrate the use of GPT to generate a table of expanded search terms for prior art analysis (figure 4A). After these search terms have been used and the relevant patent applications downloaded, an LLM tool for querying PDFs can be prompted to quickly identify each device’s independent claims (figure 4B). These outputs can be summarised to formulate an analysis of the prior art in the specific area of interest, which in our example relates to vascular closure devices. After a prior art analysis, a deeper dive into intellectual property may involve a more exhaustive patent search, in which LLMs compare the features an innovator describes against those described in an uploaded patent document, such as that of a comparator device. In doing so, LLMs may help innovators identify and refine device features for patent protection. Understanding the novel elements of a device and articulating those features in patent claims is critically important. With the assistance of LLMs, innovators may broaden their claim language to be as inclusive as possible.

Repeated application of the above techniques can assist innovators in developing a strong initial intellectual property strategy. These methods can familiarise innovators with the intellectual property landscape of a device space, allowing them to efficiently assess the uniqueness of their own solutions, aid in patent drafting and ultimately have more effective interactions with their future patent counsel.

Figure 4

(A) Expanding keywords for a prior art search and initiating UploadYourPDF plugin. (B) Prompting generative pretrained transformer-4 (GPT-4) with downloaded PDFs and retrieving summary of protected devices. LLM, large language model.
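
Once an LLM has produced a table of expanded search terms, the terms can be combined into a Boolean query for a patent search engine. The following is a minimal sketch in Python, not the method used for figure 4. The synonym groups are hypothetical examples for a vascular closure device, the function name `build_boolean_query` is our own, and the generic AND/OR syntax is an assumption, since the operators accepted differ between patent databases.

```python
def build_boolean_query(term_groups):
    """Join synonyms with OR inside each group, then join the groups with AND."""
    clauses = []
    for synonyms in term_groups:
        # Quote multi-word phrases so they are searched as exact phrases.
        quoted = [f'"{term}"' if " " in term else term for term in synonyms]
        clauses.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(clauses)


# Hypothetical synonym groups of the kind an LLM might suggest (illustrative only).
term_groups = [
    ["vascular closure", "arteriotomy closure", "haemostasis device"],
    ["femoral", "arterial access"],
    ["plug", "suture", "clip"],
]

query = build_boolean_query(term_groups)
print(query)
```

Keeping each concept in its own group makes it easy to widen or narrow the search by adding or removing synonyms, and the resulting string should still be checked against the syntax of the database being searched.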

Regulatory landscape

Before settling on a final design, it is important for innovators to fully evaluate the regulations and classification that apply to each concept. Device classification dictates the market route and may influence concept selection because of the additional costs, time and processes involved. For example, a standard premarket approval (PMA) submission fee costs US$441 547, whereas a standard 510(k) submission fee costs US$19 870.12 Innovators must understand the requirements for US Food and Drug Administration (FDA) clearance or approval and listing before moving forward with a specific design. LLMs can be prompted to assess likely regulatory classifications (class I, II or III). Furthermore, readily downloadable regulatory summary documents from the PMA-specific or 510(k)-specific FDA databases for relevant device types can be queried for previously required International Organisation for Standardisation (ISO) testing guidelines (figure 5) and, if applicable, clinical trial characteristics including sample size, blinding and number of arms. Another approach would be to implement a RAG system with access to a database of similar or predicate devices, which would augment the LLM’s ability to determine the most applicable regulatory pathway for a design. Specificity in this area can reduce hallucinations caused by knowledge gaps in the particular device space a designer is working in. LLMs can also assist innovators in understanding the quality standard regulations needed to adhere to good manufacturing practices.

Figure 5

Using comparator devices to prepare for a 510(k) submission. FDA, Food and Drug Administration; LLM, large language model.

Financial landscape

Understanding the financial incentives that drive medical device adoption begins with an understanding of the existing solutions and available market. In prior research, LLMs have demonstrated the ability to provide relevant market information, such as total available market, serviceable available market and serviceable obtainable market, when formulating an initial market analysis.9 13 Subsequently, an in-depth analysis of the market requires an understanding of device reimbursement pathways. Although navigating reimbursement can be complex, LLMs can assist in narrowing down the appropriate codes for existing procedures, whether from the Current Procedural Terminology, the Healthcare Common Procedure Coding System or the International Classification of Diseases, 10th Revision. This is a necessary step that can heavily affect payment for a novel device or procedure. The presence or absence of the necessary procedural codes can drastically affect the financial feasibility of a novel device or start-up because, without sufficient reimbursement, a novel device company may not be able to cover its costs. Careful selection of procedural codes is therefore crucial. LLMs can help innovators readily identify whether current codes align with their solution or whether they will likely have to apply for a new code.4

Final concept selection

The illustrated GPT-4 use cases are by no means exhaustive but serve to demonstrate LLM capabilities. The outputs of these prompts can ultimately be entered into a weighted decision matrix (figure 6) to assist innovators in final concept selection. Used appropriately, LLMs have the potential to significantly de-risk concepts and accelerate them through the initial stages of the biodesign process. Furthermore, the OpenAI GPT-4 API can be used alongside a RAG system to access proprietary data on up-to-date manufacturing and development costs. At this stage, RAG could strengthen the cost consideration within the objective decision matrix. An LLM alone may lack the most accurate market values, so combining GPT-4 with a RAG system built on economic data could help innovators make the most financially sound decision.

Figure 6

Generative pretrained transformer-4 (GPT-4) creating an objective decision matrix when prompted with a problem. LLM, large language model.
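
The arithmetic behind a weighted decision matrix can be sketched in a few lines of Python. This is an illustrative sketch only and does not reproduce figure 6: the criteria, weights, 1 to 5 ratings and concept names are hypothetical placeholders, and the function name `weighted_scores` is our own. In practice, innovators would choose the criteria and weights themselves and fill in ratings informed by the LLM outputs discussed above.

```python
def weighted_scores(weights, ratings):
    """Return the weight-normalised score of each concept across all criteria."""
    total_weight = sum(weights.values())
    scores = {}
    for concept, concept_ratings in ratings.items():
        weighted_sum = sum(weights[c] * concept_ratings[c] for c in weights)
        # Dividing by the total weight keeps the score on the rating scale.
        scores[concept] = weighted_sum / total_weight
    return scores


# Hypothetical criteria weights (higher means more important to the innovator).
weights = {"ip_freedom": 3, "regulatory_burden": 2, "reimbursement_fit": 2, "cost": 1}

# Hypothetical ratings on a 1 (poor) to 5 (good) scale for two candidate concepts.
ratings = {
    "concept_a": {"ip_freedom": 4, "regulatory_burden": 2, "reimbursement_fit": 5, "cost": 3},
    "concept_b": {"ip_freedom": 3, "regulatory_burden": 4, "reimbursement_fit": 3, "cost": 4},
}

scores = weighted_scores(weights, ratings)
best_concept = max(scores, key=scores.get)
print(scores, best_concept)
```

Because the weights encode the innovator's priorities, it is worth rerunning the calculation with a few alternative weightings to see whether the preferred concept changes.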

Limitations

While LLMs have immense potential in expediting medical innovation, the novelty of the technology presents a few limitations that innovators must consider. For example, GPT-4 is trained on data with a cut-off date of September 2021. When LLMs are asked about developments after the cut-off date, their reliance on outdated information may lead to inaccuracies. Furthermore, GPT-4 does not have access to live market research or proprietary databases, meaning it cannot provide data on which devices currently hold the greatest market shares. While live market data can be uploaded to GPT-4 as PDFs or supplied through a RAG system and then analysed, this is contingent on a researcher already having obtained the necessary documents or databases.

While LLMs are hindered by training cut-offs, innovations in AI can mitigate these issues. For example, RAG systems use an embedding language model to retrieve material from external data sources that is relevant to the initial query. The retrieved data are added to the prompt to augment generation, which reduces hallucinations and has outperformed parametric-only language models in various generation tasks.14 However, RAG still presents unique problems with knowledge generation. Ensuring that accurate information with minimal bias is gathered can be difficult, and more complex ‘long-form’ generation requires systems that retrieve data at multiple points throughout the generation process.15
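
The retrieve-then-augment pattern behind RAG can be sketched as follows. This is a toy illustration under stated assumptions, not a production system: the `embed` function is a simple bag-of-words count that stands in for a real embedding language model, all function names are our own, and the two fee passages restate figures quoted earlier in this review while the third passage is a placeholder. The final prompt would be sent to an LLM, a step that is omitted here.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': bag-of-words counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9()]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return the k passages most similar to the query."""
    query_vector = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(query_vector, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=2):
    """Prepend the retrieved passages to the question before it is sent to an LLM."""
    context = "\n".join(f"- {passage}" for passage in retrieve(query, documents, k))
    return f"Answer using only the context below.\nContext:\n{context}\n\nQuestion: {query}"


# Small illustrative knowledge base (placeholder for a real external database).
documents = [
    "A standard premarket approval (PMA) submission fee costs US$441 547.",
    "A standard 510(k) submission fee costs US$19 870.",
    "Reimbursement coding systems include CPT, HCPCS and ICD-10.",
]

question = "What is the 510(k) submission fee?"
top_passage = retrieve(question, documents, k=1)[0]
print(build_prompt(question, documents, k=1))
```

Grounding the prompt in retrieved passages is what allows the model to answer from current or proprietary data, but the answer can only be as accurate as the passages in the underlying database.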

Despite advancements in AI systems, the reliance on proper queries is still a concern. If a researcher uses biased prompts, the generated answer may exacerbate these biases by drawing on training data that confirms them. Furthermore, language processing models may need to be explicitly trained on the definitions of ‘bias’ and ‘fairness’.16

The risk of inaccuracies extends beyond recent data, so while LLM queries can accelerate certain steps, they do not eliminate the need for human skills throughout the design process. LLMs can occasionally output incomplete or false information. For example, when prompted with ‘List the steps involved in the implementation phase of the biodesign process’, ChatGPT-4 fails to include funding and financial strategy, a crucial component of the implementation stages. Therefore, while LLMs may streamline an individual’s workflow, the generated information should not be taken at face value without additional due diligence.

Additionally, the inherent reliance on training data creates a risk of plagiarism, because any responses generated are based on existing data and literature. Browser-enabled and retrieval-augmented LLMs can access contemporary sources of information; however, the accuracy of cited sources should be verified rather than accepted at face value. Innovators must ensure they are using LLMs only to augment their original ideas and facilitate the development of their design concepts, rather than relying on LLMs through each step of the biodesign process. Finally, the significance of the risks of uploading intellectual property into a framework that may use those very inputs to train itself further, and so make them available to the public, is currently unclear.

Conclusion

Medical innovation is a lengthy and demanding process, requiring a vast amount of information and resources for innovators to navigate each step effectively. For novice engineers and entrepreneurs, the use of LLMs can fill the gaps in knowledge and experience that might otherwise impede their innovation. For large and small developers alike, LLMs create the potential to expedite the biodesign process through the time and resources that can be saved. With proper prompting and the use of techniques such as RAG, LLM query techniques can reduce barriers to product commercialisation and may alter the medical landscape by promoting innovation at an unprecedented rate.