I read the article by Haemmerli et al on the performance of ChatGPT-3.5 in generating treatment recommendations for central nervous system (CNS) tumours, which were then evaluated by tumour board (TB) experts. While the study highlighted promising aspects of the artificial intelligence (AI) model, the design of the prompt used to interact with ChatGPT warrants further consideration.
In the study, the prompt employed was a brief patient history followed by two questions, which appears to have limited the model’s performance. As a sophisticated large language model (LLM), GPT-3.5 relies heavily on the context and specificity of the prompt it is given.1 2 Based on the cited literature, an alternative prompt structure could have included context, a specific intent, a question and an expected response format. Moreover, priming the LLM with examples of the expected answer (few-shot prompting) significantly improves the quality of its responses.2 3 Finally, GPT-4, introduced in March 2023, has shown considerable improvements in understanding and generating responses compared with ChatGPT-3.5.4 5
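To make this structure concrete, the sketch below shows, purely as an illustrative assumption, how such a prompt could be assembled and sent to the model through the OpenAI Python client. The system message carries the context, intent, question and expected response format, while the few-shot examples are supplied as prior chat turns; the patient histories and board responses shown here are placeholders, not the cases from the study, and the model names and client call reflect the current openai library rather than the interface used by the original authors.

```python
# Minimal sketch (not the authors' protocol) of a context-driven prompt:
# context, specific intent, question and expected response format in the
# system message, plus few-shot examples supplied as prior chat turns.
# Requires the `openai` package (>=1.0) and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

SYSTEM_PROMPT = (
    "Context: You are assisting a neuro-oncology tumour board.\n"
    "Intent: Provide a treatment recommendation for the case described.\n"
    "Question: What diagnosis and management would the board most likely propose?\n"
    "Expected format: (1) presumed oncological diagnosis, (2) proposed treatment, "
    "(3) relevance of the patient's functional status."
)

# Placeholder pairs standing in for patients 1-8 of the study
# (the real histories and TB responses are in its supplemental material).
FEW_SHOT_EXAMPLES = [
    ("<patient history 1>", "<tumour board response 1>"),
    ("<patient history 2>", "<tumour board response 2>"),
]

def build_messages(new_case: str) -> list[dict]:
    """Assemble the structured prompt, the example pairs and the new case."""
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    for history, board_response in FEW_SHOT_EXAMPLES:
        messages.append({"role": "user", "content": history})
        messages.append({"role": "assistant", "content": board_response})
    messages.append({"role": "user", "content": new_case})
    return messages

response = client.chat.completions.create(
    model="gpt-4",  # or "gpt-3.5-turbo" to mirror the comparison discussed here
    messages=build_messages("<patient history 9>"),
)
print(response.choices[0].message.content)
```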
With the application of these techniques, researchers could have guided the model towards more relevant and contextually nuanced responses. This could have been particularly helpful in areas where the model underperformed, such as the precise identification of glioma subtypes and the consideration of patient functional status.
As an illustration, both ChatGPT-3.5 and GPT-4 were primed with eight examples (patients 1–8, each comprising the patient history followed by the TB response) taken from the study’s online supplemental material. A more context-specific prompt was then used with the histories of patients 9 and 10. Table 1 displays the main output obtained with this technique, revealing enhanced precision in the oncological diagnosis, treatment discussion and assessment of patient functional status from ChatGPT-3.5 compared with what was presented in the paper. GPT-4 appeared to align even more closely with the board’s opinion, which was defined as the gold standard. The full discussion with the chatbot is available in online supplemental material 1.
It is critical to acknowledge that the performance of LLM applications depends heavily on the prompt used and the quality of the data provided. Future research should employ a refined, context-driven approach when interacting with these models, and the development and sharing of prompt engineering techniques should continue to be prioritised.
In conclusion, the exploration of LLMs in CNS oncology research is commendable, but it is essential to optimise the methodology to unlock the full potential of AI tools in such a complex and challenging clinical landscape.