Is using ChatGPT as an aid to the production of academic texts safe?
Leandro Villela de Azevedo
PhD in Medieval History (USP) / Corporate Historian / Teacher in basic and higher education / Researcher of AI and its applicability in the humanities
PhD in Social History from the University of São Paulo
Specialist in the use of technology for teaching humanities
Part 1 - The Evolution of the Use of Technology and the Rise of Artificial Intelligence
The reader's position on the advance of artificial intelligence matters little here, whether it is the most alarmist one, which holds that these advances should be shut down as soon as possible lest we be “dominated by them”, or the most enthusiastic one, which holds that they have come to transform our society once and for all, making exponential and almost magical progress in the coming years. At either extreme there is one certainty: like mechanical engines, electricity, mass media and the internet, this technology is here to stay, and it is wise for people, especially academics, to learn as soon as possible how it works, so that enthusiasts and critics alike can ground their positions in real data and with the necessary care.
But what exactly is Artificial Intelligence? The term artificial intelligence (AI) refers to the ability of a machine or computing system to perform tasks that normally require human intelligence. This includes activities such as speech recognition, decision-making, language translation and even text generation, as is the case with GPT models. AI can be programmed to learn and adapt to new information, improving its performance over time. According to Stuart Russell and Peter Norvig, authors of the book "Artificial Intelligence: A Modern Approach", AI can be defined as "the study and design of intelligent agents", where an intelligent agent is a system that perceives its environment and takes actions that maximize its chances of success.
In this way, AIs are not exactly new, and readers of this article have already used them, probably even before the name became popular. To make this clearer, here are some examples.

Reactive Machines: these are AI systems that react to different types of stimuli based on pre-programmed rules. They do not use memory and therefore cannot learn from new data. IBM's Deep Blue, which defeated chess champion Garry Kasparov in 1997, is an example of a reactive machine: it was programmed to analyze possible moves in chess and choose the best one based on the rules of the game and pre-established scores. The same occurs, for example, in a simple calculator, or in a system like Waze or Google Maps that indicates the best route based on route calculations and GPS positioning data.

Indeed, there has been a lot of evolution since the computer that played chess and the first GPS-based urban route systems, and one of the main advances is AI with Limited Memory, which is allowed, in a certain way, to “learn” for itself the rules it will use, but in a controlled way. In other words, it is allowed to analyze a certain amount of text or images and to try to extract from that material rules that were not previously programmed. After this “pre-training” phase, it begins to take actions based not on rules programmed by a human, but on the ones it derived itself during training. Note that this type of AI (which is the current phase we are in with GPTs) has no autonomy and is not “general”: it cannot self-improve indefinitely, but is limited to the materials given to it at training time.
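To make the contrast concrete, here is a minimal sketch in Python, with invented rules and data, of the difference between a reactive machine (behavior fixed in advance) and a limited-memory system (behavior derived from examples in a training phase and then frozen):

```python
# 1. Reactive machine: behaviour fixed by pre-programmed rules.
#    Like a calculator or a chess evaluation table, it never learns.
def reactive_route(traffic_level: int) -> str:
    """Chooses a route from a hard-coded rule; no memory, no learning."""
    return "highway" if traffic_level < 5 else "side streets"

# 2. Limited-memory AI: the rule is *derived from data* during training,
#    then frozen. This trivial "model" learns a traffic threshold.
def train_threshold(examples: list[tuple[int, str]]) -> float:
    """Learns a decision boundary from (traffic_level, best_route) pairs."""
    highway = [t for t, route in examples if route == "highway"]
    side = [t for t, route in examples if route == "side streets"]
    return (max(highway) + min(side)) / 2  # midpoint between the classes

training_data = [(1, "highway"), (3, "highway"), (7, "side streets"), (9, "side streets")]
threshold = train_threshold(training_data)  # the "pre-training" phase

def learned_route(traffic_level: int) -> str:
    # After training, the model only applies what it learned; it cannot
    # improve itself further or handle cases absent from its data.
    return "highway" if traffic_level < threshold else "side streets"

print(reactive_route(4), learned_route(4))  # -> highway highway
```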
Beyond the GPT phase there are, still only as theory or as first development attempts, systems that would be based on Theory of Mind and that could ultimately become an Artificial General Intelligence, or even acquire a certain self-awareness; but that is as theoretical as a flight to the Moon was in 1800. Maybe it will be our reality one day, but it is not what we are discussing now.
What are GPTs?
Among these innovations, the development of Generative Pre-trained Transformers (GPTs) stands out. GPTs are AI-based language models that have been trained on vast amounts of textual data to predict and generate text in a coherent and contextually relevant manner. OpenAI's GPT model series, in particular, has attracted significant attention. These models are called "pre-trained" because they were initially trained on a large corpus of internet text, allowing them to understand and generate natural language text with high accuracy.
ChatGPT, a practical application of GPT models, exemplifies the power and versatility of this technology. ChatGPT is a text generation model that can hold conversations, answer questions, write articles, and more. Its ability to generate text in a coherent and contextually appropriate manner results from extensive prior training on various types of text, which gives it a broad base of knowledge.
The term "pre-trained" refers to the process by which the model is initially exposed to a large amount of textual data before being fine-tuned for specific tasks. This process allows the model to learn linguistic and contextual patterns, making it effective at generating text in a variety of contexts. In these texts, each word is assigned a numerical value based on the number of times it is associated with each other word and thus creates “probability” values?? with which it can be found in a text, thus allowing GPT to create texts that seem very understandable and realistic, but in fact somehow just using the knowledge and probabilities of being similar to texts he has already been trained with (that is, he creates a new text just based on the ones he already knows, being able to recombine but not give a real new evolutionary line in this text) Perhaps it may even seem “too intelligent” when compared with the average knowledge of anyone in its field, since it was fed by high-level texts (in addition to texts much simpler) but any specialist who tries to talk to the GPT about his specialty will realize that he knows more than him about this specific area (and if his text was used to feed the GPT he may even be able to recognize his own words)
Part 2 - Before using ChatGPT to help you produce an academic article, it is necessary to understand the differences between a GPT and a human being in thought processing
While language models like GPT are advanced and impressive tools, it is crucial to understand that there are fundamental differences between the way these models work and the way humans process thought and information. This section explores these differences, highlighting the implications for responsible and critical use of ChatGPT in content creation.
A - Probability-Based Processing
GPTs, including ChatGPT, generate text based on word combination probabilities. These models are trained on vast amounts of textual data, learning patterns and statistical associations between words and phrases. When asked to generate text, they select the next words based on the probability that these words will follow the previous ones, given the combinations seen during training.
On the other hand, humans process information in a qualitative and contextual way. Human decisions are influenced by a deep understanding of logic, personal experience, expert knowledge and emotions. While GPT may simulate this understanding in its responses, it does not have true cognition or awareness of the topics it writes about.
B - Limitations in Understanding and Decision
A crucial difference is that GPT has no genuine awareness or understanding of the texts it generates. It does not have the capacity for introspection or for evaluating the veracity of its own responses autonomously. For example, if a question is beyond the knowledge built into its training, the model can still generate an answer based on statistical patterns, even if that answer is incorrect or fanciful.
A human being, on the other hand, can recognize when they don't know the answer to a question and can make a conscious decision to seek out more information or to declare their ignorance. This ability to recognize limitations and seek validation is essential for the accuracy and reliability of information.
In layman's terms, we can say that a human being can choose to lie, inventing an answer to a question they do not know how to answer; they can choose to hide information for their own benefit; they can choose to manipulate information. A GPT cannot do any of this intentionally. This should not be considered an advantage, however, since not doing it intentionally does not mean it will not do it. Quite the contrary: it will do it, and the result will be completely unrecognizable to an inattentive eye, since to GPT this invented information appears as “natural” as any other text it generates by probability.
This is because, by its nature, GPT should be understood as a text generation tool. It is effective for creating drafts, inspiring ideas, and offering suggestions based on a wide range of textual information. However, the answers provided by GPT should not be taken as absolute truth without verification. Users should treat responses as initial suggestions that need to be confirmed through reliable, verifiable sources. If you ask a question it does not have the answer to, it will simply do what it was programmed to do: generate a text based on the word combinations it studied, even if this text is completely unreal or fantastical. It may generate references to non-existent books or cite non-existent historical facts, and it will not know how to distinguish these from real ones until its user points this out (just like a GPS guidance system fed outdated data: believing that a route that no longer exists is still there, it will insist that the driver take it, and it falls to the human driving the car to realize that the path should not be followed).
To better understand this situation, let's look at some classic errors associated with the use of GPTs, as well as some found in daily use by the author of this article.
Part 3 - Classic Errors Associated with the Use of GPTs and Translation Systems
Although AI technologies such as GPTs and machine translation systems have advanced significantly, they are not infallible. Errors made by these systems can have important consequences, especially when they involve sensitive historical information or translations that alter the original meaning of a text. In this section, we will discuss some notorious errors that have occurred when using these technologies.
Errors Related to the Holocaust
Recently, there have been cases where GPT language models made serious errors when generating texts about the Holocaust. These errors are particularly sensitive due to the historical and emotional nature of the topic. Accuracy and sensitivity when dealing with historical events such as the Holocaust are crucial, and failures in these systems can lead to the spread of incorrect and offensive information. Recent news (June 2024) indicates that such errors are still occurring even in version 4, as the report below shows:
https://oglobo.globo.com/mundo/noticia/2024/06/18/inteligencia-artificial-inventa-e-distorce-memoria-do-holocausto-diz-unesco-que-alerta-para-impacto-entre-jovens.ghtml
Many of these errors, once noticed, are corrected and “blocked” by the teams at OpenAI, Google and other artificial intelligence developers. Here are some that have already been corrected:
Ayrton Senna case
A notable example of an error occurred with older versions of ChatGPT, which incorrectly indicated that Brazilian driver Ayrton Senna had died at the Interlagos race track in São Paulo instead of the Imola track in Italy. This error is due to the fact that Senna, being Brazilian, was associated with a "Brazilian race track", and Interlagos is the best known in Brazil. This type of erroneous association highlights how language models can generate incorrect information by not accurately understanding historical or factual context.
https://www.uol.com.br/esporte/colunas/flavio-gomes/2023/02/13/para-o-chatgpt-senna-morreu-em-interlagos-depois-de-bater-em-nakajima.htm
Google Translation Errors:
The nurse and the doctor:
Machine translation systems, like Google Translate, also have their history of errors. A classic example is the mistranslation of gender when sentences go from Portuguese into English and back into Portuguese. A sentence like "O enfermeiro passou a ferramenta para a médica" (the male nurse passed the tool to the female doctor) is correctly translated into English as "The nurse passed the tool to the doctor", since English does not mark gender in these nouns. However, when translated back into Portuguese, the phrase often came back as "A enfermeira passou a ferramenta para o médico" (the female nurse passed the tool to the male doctor). This is due to the predominance of genders in certain professional roles in the training corpus, where "enfermeira" (female nurse) and "médico" (male doctor) are more common than "enfermeiro" (male nurse) and "médica" (female doctor).
The impossibility of understanding that a banana can exist outside the bunch
Another famous error refers to an image-generation GPT, also from OpenAI: DALL-E. Similarly to ChatGPT, it is pre-trained, but with images and their text captions (indicating what the images show), so that it can “read” and “understand” images and then generate its own from textual commands. Apparently, however, version 2.0 was not capable of drawing a single solitary banana: whenever it was asked to draw a banana it would draw a bunch, perhaps simply because it had not been fed enough images of individual bananas and did not “understand” the idea of a banana outside the bunch.
Part 4 - Errors Found by the Author of This Article
An inattentive reader may not yet have realized that the author of this article is a historian, specialized in the use of AI to assist teachers in humanities classes; this text, therefore, is not a work of historiography, but an analysis of the errors and uses of GPTs. Part of the initial motivation for producing the article came precisely from the large number of errors found when using these tools, even though the author is an enthusiast of the technology and sees no reason to abandon it. To provide examples more tangible than those reported in the media, some of them are shared here:
When using ChatGPT version 4, integrated with ScholarGPT, to produce an article on the relationship between the papacy, the antipapacy and the Protestant Reformation, the GPT constantly made references to a book, “Giandoso, M. (2005). The Politics of Antipopes: Power and Dispute in the Medieval Church. Medieval Publisher.”, which had not been fed into the system by the user and was not even known to him. We can say that “out of nowhere” the system decided to “invent” such a book, probably due to previous work carried out with the same tool, which mentioned a partnership with Daniel Giandoso in a project in 1998 (although there is no way to be sure that this was the origin of the “intruder” book in the bibliography).
In addition to intruding on the list of bibliographic references, GPT even invented a complete citation from the book, suggesting that it be inserted into the body of the article.
In other words, using GPT to “create” an article can be as inefficient as trying to use a calculator to “file your income tax return”: we all know the calculator can be very useful for computing tax amounts, but it does not file the return on its own, and a person with the necessary knowledge must apply it to the parts it “knows how to do”. In the same way, GPT can indeed be very useful, as we will see below, but not when asked to “create” the text on its own, especially in an academic context.
A second example refers to the use of DALL-E, with errors still present in the most current version (June 2024), DALL-E 3.0. When preparing a handout, an image was requested, in the style of the Mesoamerican codices, of the contact between Cortez, Malinche and the Aztecs. However, the system was not capable of representing Cortez off his horse, having understood that, for an image to represent the presence of the Spanish conquerors, horses had to be present; and even when asked more than once for the horses to be removed, it went so far as to state categorically that there were no horses in the image.
Only at the end, when the order was given to delete “Hernan Cortez” and replace him with an ordinary person, did it place a person without the horse, demonstrating that, having (probably) been trained only on images of Cortez riding horses, it understood that “Cortez” was a kind of figure composed of the sum of a horse and a person.
Part 5 and Final - Tips for Effectively Using ChatGPT in Creating Academic Articles
A - Verification and Review of Information
The first and most fundamental tip is to rigorously verify all information generated by ChatGPT. As discussed earlier, ChatGPT creates text based on statistical patterns and has no contextual or factual understanding. Therefore, each piece of information must be carefully reviewed and compared with reliable sources before being used in an academic article. Proofreading the text is equally important to ensure that the content is coherent, clear and error-free.
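Part of this verification can even be scripted. Here is a hedged sketch, assuming the free Crossref API (api.crossref.org) and the Python requests library, of checking whether a citation ChatGPT generated resembles any real indexed publication. A match is only a hint, while the absence of one is a strong signal that the reference deserves manual checking; the suspect reference below is the invented “Giandoso” book from Part 4:

```python
import requests

def crossref_candidates(reference: str, rows: int = 3) -> list[str]:
    """Returns the closest bibliographic matches Crossref can find."""
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"query.bibliographic": reference, "rows": rows},
        timeout=30,
    )
    resp.raise_for_status()
    items = resp.json()["message"]["items"]
    return [(item.get("title") or ["(no title)"])[0] for item in items]

suspect = "Giandoso, M. (2005). The Politics of Antipopes. Medieval Publisher."
for title in crossref_candidates(suspect):
    print("-", title)
# If nothing resembling the citation appears, treat it as likely invented
# and verify it by hand in library catalogues before using it.
```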
B - Prior Feeding with Relevant Articles
A useful practice is to provide ChatGPT with the relevant articles and documents that you want to use as the basis for your own writing. This can be done by pasting excerpts from these articles into the conversation or by referencing their main ideas. This way, the model can generate text that is more aligned with the specific content and topics you are covering, increasing the accuracy and relevance of the generated text.
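As a minimal sketch of what this looks like in practice, assuming the official openai Python library (version 1.x or later), an OPENAI_API_KEY environment variable, and an illustrative model name; the excerpt is a placeholder to be replaced with your real source:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

excerpt = """[paste here the passage of the real article you are citing]"""

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative model name
    messages=[
        {"role": "system",
         "content": "You are a drafting assistant. Use ONLY the excerpt "
                    "provided by the user; if it does not contain the "
                    "answer, say so instead of inventing one."},
        {"role": "user",
         "content": f"Source excerpt:\n{excerpt}\n\n"
                    "Draft one paragraph summarizing this excerpt's "
                    "argument for the literature review of my article."},
    ],
)
print(response.choices[0].message.content)
```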
C - Division of the Article into Parts
Another valuable tip is to divide the article into smaller, more manageable parts. ChatGPT tends to produce deeper, more detailed text when working on specific segments of the article, rather than trying to generate a large text all at once. Dividing the work into introduction, development and conclusion, for example, allows the model to focus on each part more effectively, resulting in more cohesive and well-structured content.
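Continuing the sketch from tip B (same assumptions about the openai library and the illustrative model name), the division into parts can be organized as a simple loop in which each approved section is fed back as context for the next, keeping the whole cohesive:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

outline = {
    "Introduction": "present the research question and why it matters",
    "Development": "discuss the evidence and arguments, point by point",
    "Conclusion": "return to the research question and state the findings",
}

draft: dict[str, str] = {}
for section, goal in outline.items():
    previous = "\n\n".join(draft.values())  # the sections approved so far
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative model name
        messages=[{
            "role": "user",
            "content": (
                f"Text already written:\n{previous}\n\n"
                f"Now draft ONLY the section '{section}', whose goal is: {goal}."
            ),
        }],
    )
    # Review and edit each section yourself before it becomes context
    # for the next one; never accept the model's text as final.
    draft[section] = response.choices[0].message.content
```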
D - Use of Human Guidance
It is crucial to approach using ChatGPT as a support tool, not as a complete replacement for the human creative process. Having a clear idea of what you want to cover in the article, including the conclusion or main points, before starting to generate the text, is essential. This ensures that the final text is aligned with your vision and goals. Use ChatGPT to expand your ideas, seek new perspectives, and enrich your content, but stay in control of the creative process.
E - Understand What GPT Is and What It Is Not
It is essential to understand that GPT is not a thinking, intelligent being: it follows a logic very different from human thought, limited to what it was originally fed (and to subsequent updates). It is therefore neither possible to trust it blindly nor pertinent or productive to try to “argue” with it. You can certainly “teach” it, but what looks like learning will be just a recombination of your own words, and in the near future the same error will recur. So never use it for final versions; use it only to “talk” with an “other version of yourself”, knowing exactly what you want from that conversation.
F - Other Useful Tips
Use Direct and Clear Questions: Ask ChatGPT specific and direct questions to get more focused and relevant answers.
Incorporate Iterative Feedback: Review generated text in stages and provide feedback to the model to continually improve content quality.
Integration with Other Tools: Combine the use of ChatGPT with other research and editing tools to complement content creation. Grammar checking tools like Grammarly and academic databases like Google Scholar can be extremely helpful.
Exploring Different Formats: Try asking ChatGPT to rewrite passages in different styles or formats, such as summaries, critical analysis, or reviews, to diversify the content and ensure it meets the specific needs of the article.
References
BENDER, Emily M.; GEBRU, Timnit; McMILLAN-MAJOR, Angelina; SHMITCHELL, Shmargaret. On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? In: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency. ACM, 2021.
GOODFELLOW, Ian; BENGIO, Yoshua; COURVILLE, Aaron. Deep Learning. Cambridge: MIT Press, 2016.
MACK, David. Google Translate's Gender Bias Pairings Draws Ire Online. BuzzFeed News, 2018. Available at: https://www.buzzfeednews.com/article/davidmack/google-translate-genderbias. Accessed on: 20 June 2024.
MARCUS, Gary; DAVIS, Ernest. Rebooting AI: Building Artificial Intelligence We Can Trust. New York: Pantheon Books, 2019.
MITCHELL, Melanie. Artificial Intelligence: A Guide for Thinking Humans. New York: Farrar, Straus and Giroux, 2019.
POST, Matt. A Call for Clarity in Reporting BLEU Scores. In: Proceedings of the Third Conference on Machine Translation: Research Papers. ACL, 2018. p. 186-191.
RADFORD, Alec et al. Language Models are Unsupervised Multitask Learners. OpenAI, 2019. Available at: https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf. Accessed on: 20 June 2024.
RUSSELL, Stuart; NORVIG, Peter. Artificial intelligence. 3rd ed. Rio de Janeiro: Elsevier, 2013.
SCHWARTZ, Roy et al. Green AI. Communications of the ACM, vol. 63, no. 12, p. 54-63, 2020.