ChatGPT: rumors of human providers’ demise have been greatly exaggerated
2023; Elsevier BV; Volume: 119; Issue: 6; Language: English
10.1016/j.fertnstert.2023.03.010
ISSN 1556-5653
Abstract

As an artificial intelligence (AI) language model, ChatGPT can be a valuable tool in the field of reproductive endocrinology and infertility. Although it cannot disrupt the field on its own, it can contribute to its advancement and improvement in several ways. ChatGPT can assist health care professionals in diagnosing and treating various disorders, including infertility. By analyzing patient data, such as medical history, symptoms, and laboratory results, ChatGPT can help identify the underlying cause of a patient's infertility.

I have to apologize. I have taken up a few seconds of your valuable time by participating in what is by now a cliché in the literature of the medical, legal, nursing, business, and myriad other fields. The first 4 sentences of this piece were written by ChatGPT (GPT is an acronym for Generative Pretrained Transformer), and I am obligated under the new Elsevier policy to inform you when I use the platform (https://chat.openai.com/auth/login), which is meant only to improve existing prose and research and never to be cited as an author.

OpenAI is the parent company of ChatGPT and released the new platform in November 2022, leading to dreams of a new world order by techies in Silicon Valley and gnashing of teeth by professionals and white-collar workers who feared for their professions, and their jobs. As is almost always true with these supposedly "disruptive" technologies, the truth is somewhere in between. For those unfamiliar, ChatGPT is a so-called large language model (LLM) that is trained on multiple relevant data sets, including all of the English-language Wikipedia platform, and therefore supposedly has a snapshot of the entire internet up to 2021 (1: Nath S, Marie A, Ellershaw S, Korot E, Keane PA. New meaning for NLP: the trials and tribulations of natural language processing with GPT-3 in ophthalmology. Br J Ophthalmol. 2022;106:889-892).
In fairness, the creators acknowledged that the November release was early, would not have accompanying references, and would primarily be a user-friendly interface so that the world could take the platform for a test drive and give OpenAI feedback. If you have not tried it, I would encourage you to do so. Actually, the experience is pleasant and interesting, more like a conversation with an acquaintance than a rigorous search of the PubMed literature you might be used to. And therein lies one of the concerns.

A collaboration between the computational departments at Yale and the National University of Ireland tested ChatGPT's performance on steps 1 and 2 of the United States Medical Licensing Examinations (2: Gilson A, Safranek CW, Huang T, Socrates V, Chi L, Taylor RA, et al. How does ChatGPT perform on the United States Medical Licensing Exams? The implications of large language models for medical education and knowledge assessment. JMIR Med Educ. 2023;9:e45312). Using 2 standard medical student question banks as the gold standard, the researchers challenged the platform with standard questions and even used the "attending tip" function of one of the data sets to see whether the additional guidance improved the response. For the step 1 questions, ChatGPT answered 64% and 44% of the questions in the 2 data sets correctly, and for the step 2 questions, 58% and 42%, respectively. When scored by difficulty, ChatGPT scored 0 on the most challenging questions. Nevertheless, the investigators' conclusion was that ChatGPT had passed step 1 on one of the data sets. The message my medical students and residents were sharing with each other dropped the qualifiers: it said simply that ChatGPT had passed the United States Medical Licensing Examinations, and they were now very concerned about their future careers.
Apocryphal stories of the platform achieving top scores on the bar examination are also based on an exaggeration of similarly mediocre results. I hear that the lawyers are nevertheless concerned.

Humans are clever, and many also saw the opportunities to get a leg up on their critical tasks. An informal survey of 4497 Stanford undergraduates revealed that 17% of them admitted to using ChatGPT for their final examinations, and 5% actually turned in work written by the platform (3: Cu MA, Hochman S. Scores of Stanford students used ChatGPT on final exams, survey suggests. https://stanforddaily.com/2023/01/22/scores-of-stanford-students-used-chatgpt-on-final-exams-survey-suggests/). In an unusual clash (or perhaps conflation) of the STEM (Science/Technology/Engineering/Math) and humanities worlds, educators at all levels are raising alarm about their fields as well (4: Marche S. The college essay is dead. https://www.theatlantic.com/technology/archive/2022/12/chatgpt-ai-writing-college-student-essays/672371/; 5: Gupta K. Stanford and MosaicML researchers announce the release of PubMed GPT, a purpose-built AI model trained to interpret biomedical language. https://www.marktechpost.com/2022/12/17/stanford-and-mosaicml-researchers-announce-the-release-of-pubmed-gpt-a-purpose-built-ai-model-trained-to-interpret-biomedical-language/).

In all the hype and angst, we are possibly forgetting that we have been down this path before. When I was a medical student and resident, getting medical references meant trucking down to the medical library and pulling out a ponderous Index Medicus tome; now, I can do a PubMed search from my desk and get a more comprehensive review of the literature than I ever could the old way. That is a good thing. On the other hand, the nice lady who used to type my clinic and discharge notes is gone, replaced by a platform that requires an endless number of mouse clicks and staring at a screen for hours at a time.
I probably get paid faster because charges are dropped almost immediately, but I have been less happy with the eye strain and lateral epicondylitis I got from all those mouse clicks. We have more or less embraced these changes, almost without noticing them or, with rare exception, having any input in their creation or adoption, and I really cannot say I remember the transition. The roll-out of the smartphone in 2007 has more of a time stamp, but even here, most of us remember Steve Jobs's presentation but not when we started seeing people walking around looking down at their devices, occasionally crashing into lampposts. Less obvious has been a wave of teenage suicides, primarily by young women who were the victims of social media bullying permitted by ubiquitous smartphones. But these days, our smartphones are possibly our most critical medical instrument. We cannot get onto our medical records without them; we use them to look up arcane facts on the fly; they have replaced the pager; and, through them, we are available at all times, including after hours. We have adopted these and other technologies without protest, mindlessly, and usually without much critical appraisal of their impacts.

The current version of ChatGPT, in a way, means to please. The way that it replies to queries is by structuring its responses on the basis of what it anticipates that the appropriate pattern of words should be, informed by the data set that has trained it. When using the entire internet, this could include biased or inaccurate information, or outright disinformation. It imitates the existing language. In fact, OpenAI had to put in filters to prevent it from producing hate speech. ChatGPT also has no understanding of the physical world. These shortcomings have led some pundits to call ChatGPT nothing more than a meme machine, and generating them has become another common trope. It is hard to resist participating in this cliché a second time, so I asked it to write a haiku on REI.
It returned:

Hormones dance within
Conceiving life's great joy
Hope blooms in the heart

Adorable, but I had to point out that the second line had 6 syllables. It cheerfully replied: "You are correct! I apologize for this mistake. Here is a revised haiku …"

This raises another of the concerns many have: the platform as it currently performs needs to be extensively supervised, and the process by which AI creates the response is opaque, densely so. Those who hype seem to imply that ChatGPT can simply summarize the universe of all information and, like Mr. Spock when he asks for the coordinates to a distant galaxy, "the computer" will be infallible. This is far from the truth. For an LLM to truly respond reliably to medical questions, it requires training in specific knowledge domains, and this can be very expensive and complex. Stanford researchers, in collaboration with MosaicML, an AI stalwart, are working to create a PubMed GPT but note that because of the "large number of parameters in PubMed GPT, there are trade-offs between model complexity, cost, and specialized architecture" (5: Gupta K. Stanford and MosaicML researchers announce the release of PubMed GPT, a purpose-built AI model trained to interpret biomedical language. https://www.marktechpost.com/2022/12/17/stanford-and-mosaicml-researchers-announce-the-release-of-pubmed-gpt-a-purpose-built-ai-model-trained-to-interpret-biomedical-language/). An underappreciated aspect of AI is that the massive volumes of data have to be annotated for them to be useful to the platform. Many, perhaps most, of the humans required for this labeling work in sweatshop-like conditions in the Global South.

Large language models are intriguing, but they are not ready for prime time and certainly are not going to replace any of us anytime soon. However, there is no question that AI and LLMs are destined to become a part of our future. And we want them to help us.
Our field will have to depend on the automation of many of its processes (coding, translating, educating, researching, possibly fluently translating for nonnative English speakers, and maybe someday helping a human embryologist pick from among three 4AA embryos) to free providers at all levels from the burden of rote duties and allow them to practice at the top of their license. We need to make sure that incorporating the platforms into our workflows liberates us from work rather than adding more of it to our already busy days. Unlike other times, when we have allowed techies to add to our work without the input of domain specialists, we do need to be part of this process. This will happen in all fields, and English professors may need to rethink the college essay. However, a new paradigm for academic excellence will emerge, and perhaps LLMs can help up the game when used appropriately. And if it reduces the number of mouse clicks in EPIC, that alone will be a major improvement.