Will ChatGPT kill the art of essay writing?
ChatGPT is now three months old. And, as with all truly disruptive technologies, it’s hard to remember what the world was like BC (Before Chat). Remember life before the iPhone, when we sent text messages by bashing the phone's numerical keypad over and over, and stressfully navigated car journeys using paper road maps? Me neither. To all intents and purposes, GPS-enabled touchscreen phones just are. It’s impossible to imagine otherwise. It won’t be long before GPT models feel the same way.
That said, we don’t yet fully understand the impact Large Language Models (LLMs) like ChatGPT will have on how we live and learn. We’ve had a taste, but it’s still early days. We’ve seen publications like The Atlantic and Forbes declaring that ‘the college essay is dead’, and New York City banning ChatGPT from its public schools, motivated by concerns over integrity, student privacy, and cyberbullying. These knee-jerk, district-wide bans felt a bit like shutting the stable door when the horse was never really in the stable in the first place. After all, you can ban college network access, but what about the device in the student’s pocket that’s been there practically since birth?
But if we don’t ban, should we attempt control? A handful of schools are allowing limited ChatGPT use, as “a brainstorming tool, a writing assistant or a source of feedback, but not as a substitute for [students’] own work”. (1) This seems more sensible. The problem is that most teachers remain unaware that ChatGPT exists, let alone know how to use it effectively. It will be some time before there is a more system-wide response. And in the meantime, many students will use ChatGPT to write their essays and will likely improve their grades as a result.
The ChatGPT Style
I am an English teacher by profession. In recent years I moved into school management, where I’m responsible for the overall academic operation of a group of international schools. But at heart I’m a literature buff. I love books, plays, poems, language. Interestingly, when I asked ChatGPT to write me a GCSE level essay on the role of the Witches in Macbeth, my first reaction was not panic, or concern. It was relief. Because what it wrote (which I explored in more depth in my first LinkedIn post on ChatGPT back in December) was quite good. But not very good. It felt formulaic, mechanical, devoid of nuance. And it got me thinking that maybe ChatGPT isn't best suited for 'off the peg' essay writing.
Because what ChatGPT produces, even if you work hard on engineering prompts, is a distinctly ‘ChatGPT’ style of writing. You can almost sense it laying one word after another in an accurate, grammatically correct, but ultimately pedestrian way. To read essays produced by ChatGPT is not to get a feel for the personality that sits behind the writing. In general, teachers get to know their students’ essays over time. This is why plagiarised work tends to stand out like a sore thumb. When it happens, we speak to the student, explain that if they do it again they’ll likely be removed from all their exams, not only the subject they attempted to cheat in. It’s a game schools play: scare the student, word gets out to their peers, and they don’t do it again. However, what I’ve noticed is that ChatGPT essays don’t sound like either student work or plagiarised work. They sound like ChatGPT work.
There are of course ways to tweak the model to make it sound more like you, or a particular writer you admire. You can upload sections of your writing so it learns your style and applies this learning to GPT-generated writing. The thing is, I’ve tried this time and again and I can never get over the fact that the writing doesn’t really sound like me. No doubt it’s accurate and easy to read. Perhaps more so than my own writing. But I don’t feel myself at the heart of the writing. I’m just not there.
LLM writing as Simulacrum
What I’ve concluded is that ChatGPT writing is an approximation of my style, what the postmodern philosopher Jean Baudrillard referred to as a simulacrum: a copy of something without original or referent. (2) And, as Bing Chat opined when I asked it to compare Baudrillard with ChatGPT: “if ChatGPT generates a poem based on a given prompt, is it creating a copy of an existing poem or a new poem that has no original? Is it simulating the style or content of a specific poet or genre, or is it producing something unique and original? How does ChatGPT’s writing affect our perception of reality and meaning?” (3)
The reason for this feeling of simulation lies in how LLMs work. They are not intelligent in the way we think of intelligence (which makes the I of AI rather inaccurate). They do not have sentience. They do not think things through nor do they have opinions on what they’re writing. They are not much more than guessing machines, predicting the next word or punctuation mark in a sentence based on the data and rules that have previously been inputted. The more users interact with them the better they get, but this is really only refining the rules within which they have been programmed. OpenAI likens training an LLM to training a dog.
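To make the 'guessing machine' idea concrete, here is a deliberately tiny sketch in Python. It is not how ChatGPT is built: real LLMs use neural networks with billions of parameters, whereas this toy simply counts which word most often follows another in a sample sentence. The function names and the sample text are my own illustrative assumptions.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for every word, which words follow it in the text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def predict_next(model, word):
    """Guess the next word: the one seen most often after `word`."""
    candidates = model.get(word.lower())
    if not candidates:
        return None  # the model has never seen anything follow this word
    return candidates.most_common(1)[0][0]


# Hypothetical sample text, invented purely for illustration.
corpus = (
    "the witches meet macbeth and the witches greet banquo "
    "and the witches vanish"
)
model = train_bigram_model(corpus)

print(predict_next(model, "the"))  # prints "witches"
print(predict_next(model, "and"))  # prints "the"
```

Even at this scale the point holds: the model has no thesis about Macbeth, only a tally of what tends to come next.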
And I think this is where the human art of essay and article writing is very different from the linear guessing game LLMs undertake. When humans write, we are never only thinking about the next word. We hold the entire thesis in our head and are constantly testing it, examining it, refining it. Writing is a multi-directional process: we write forwards, dig downwards, look outwards towards research and examples from our lives. We move backwards to earlier sections and tweak based on how our argument develops. We are everywhere at the same time. With the very best writing, we inhabit the ideas and the words flow. This is why, when we read back our writing at a later date, we often cannot remember having written it.
Therefore, what Bing Chat referred to above as ‘reality and meaning’ lies in the holistic process of creation. Reality is not one word following another. You can produce meaning in this way, but there is little reality that sits behind it. This is the reason why these essays sound quite hollow. They are all style but no substance.
GPTs as experts in filler writing
Based on this less than favourable comparison, it would be easy at this point to dismiss LLMs as gimmicks, with the initial wow factor soon wearing off into weary resignation that they will never deliver what they initially promised. However, perhaps we are looking at them in the wrong way. Maybe we should instead focus on the sort of writing they’re good at. Because in a certain sector of content production they excel. And are already saving me a lot of time on a daily basis.
A large part of what I do is operational. I spend more time writing manuals, policies and job descriptions than I do being creative. And where I have found ChatGPT shines is in writing this sort of material. Last week I asked it to write a job description for an Operations Manager. It needed a few additional prompts to ensure it ticked every box, but what it produced was clear, precise, and had no extraneous detail. It’s the same with policies: ask it to write an exemplar safeguarding policy for an international school in a local context and it produces a clear outline that the relevant detail can be added to. It may not write the entire policy, word for word, but the framework is there. I can then ask it to flesh out each section. It takes time, and iterations, but it gets the job done more accurately and quickly than if I were to do it myself from scratch.
This is where I believe LLMs like ChatGPT will continue to offer significant value, through producing what essayist Stephen Marche, in a recent Intelligence Squared podcast, calls ‘filler writing’: the sort of content that needs low creativity but is nonetheless vital for keeping the world turning. Policies, person specifications, job manuals, reports, even (I would perhaps controversially argue) resumes and personal statements. Students often find it hard to write about themselves. However, by listing their achievements and asking ChatGPT to turn them into a polished piece of writing, these achievements can stand out, freed from poor spelling, punctuation or grammar. It’s a little like using a ghost writer. There is nothing inherently wrong in this in my opinion - it doesn’t change the achievements and successes. It just presents them in a more coherent way.
How does this feed back into how we should be using LLMs in schools and colleges? First of all, the college essay will most likely die (whether we like it or not). However, by shifting away from written assignments towards a focus on “presentations, multimedia projects, or experiential learning” (4) we can better assess those skills that are relevant to 21st Century ways of working. Other options include using ChatGPT writing itself as an opportunity to learn: students could be asked to criticise AI-generated texts, “or to improve them with their own revisions”. (5) As this is likely the way in which most of us will use LLMs in the future, it makes sense to train students to use them in this way whilst in school.
The Evolution of Research
It’s not only ChatGPT as a generator of ideas and content that we should be taking notice of. The role of the search engine is also being turned on its head. Bing Chat, recently released to beta testers, has as its engine GPT3.5, the large language model that also drives ChatGPT. Using Bing Chat enabled me to write the article you’re reading faster than I’ve ever written before. Whenever I needed evidence to back up my ideas, Bing Chat instantly found it for me. There was no need to hunt through Google articles. It gave me exactly what I needed, including references. It even summarised and added its own opinions (as is shown in the quote above on simulacra and simulation). Using Bing Chat for this article was like having a super-fast research assistant sitting beside me. As a result I was able to develop and evidence my ideas in real time, rather than being constantly slowed by the research process. It was something of a revelation and this article would not have been the same without it.
I’ll admit that my initial thoughts on Bing Chat weren’t hugely warm: I initially compared it too closely with ChatGPT, which was unfair. It may be powered by the same AI engine, but the bodywork is totally different. It’s like comparing a Porsche with a BMW 5 Series estate. ChatGPT is speedy, flash, but at times utterly impractical. Bing Chat is sensible, a little dull, but incredibly useful. I can already see it becoming a fundamental part of my workflow. Microsoft may have already cut off its ability to say weird stuff (following some interesting articles on the subject), but I don’t actually want Bing Chat to be weird. I want it to find me information fast and give me different ways of looking at it. And it’s efficient at both. Do I trust it one hundred percent? Of course not. But then I never fully trust websites full stop. There’s no difference as far as I’m concerned.
Perhaps this is the future. A combination of ChatGPT to spark the initial ideas, give the frameworks, map the way ahead, and Bing Chat to provide the speedy research evidence once the writing process kicks off. It’s a powerful combination.
But what about the future?
Of course, we’ve seen nothing yet. The current GPT model (GPT3.5) is built on 175 billion parameters (think of a parameter as an input and output processing mechanism - a little like a synapse in the human brain). Google’s LLM, PaLM, has 540 billion parameters. (6) As there are around one quadrillion synapses in the human brain (7), it would appear that LLMs are still a long way off our brain’s natural computing power. Whilst it is hard to compare synapses and AI parameters like for like, in terms of order of magnitude we can broadly say that PaLM has a little over three orders of magnitude fewer parameters than there are synapses in the human brain: roughly one parameter for every 1,850 synapses. (8)
It has been extrapolated, based on the increase in OpenAI’s GPT model parameters over time, that the soon-to-be-released GPT4 may have as many as one hundred trillion parameters. That’s around two hundred times PaLM’s size and almost six hundred times GPT3’s. Based on my rudimentary maths (and I am very happy to be proven incorrect here as I am no mathematician), a GPT4 of that size would still have only around a tenth as many parameters as the human brain has synapses, but the gap would have shrunk from more than three orders of magnitude to just one. And what will it then be capable of? No one knows. But if we believe that the world will continue as normal from this point onwards, we are in a dream. Because language models like ChatGPT and text to image models like Dall-E, Stable Diffusion and Midjourney have only just arrived. We are still in the MS DOS and dial up modem stage of AI. Perhaps even in the Charles Babbage Difference Engine stage. Just wait until we move into the quantum computing stage.
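For anyone who wants to check the sums, the comparison can be sketched in a few lines of Python. The figures are the ones quoted in this article; the one hundred trillion figure for GPT4 is a rumoured extrapolation rather than a confirmed specification, and treating one parameter as equivalent to one synapse is a loose analogy, not a measurement. The variable and function names are my own.

```python
import math

# Figures as quoted in the article (assumptions, not verified specifications).
GPT3_PARAMS = 175e9            # 175 billion
PALM_PARAMS = 540e9            # 540 billion
GPT4_RUMOURED_PARAMS = 100e12  # one hundred trillion (rumoured extrapolation)
BRAIN_SYNAPSES = 1e15          # around one quadrillion


def orders_of_magnitude(a, b):
    """Return log10(a / b): positive if a is larger than b, negative if smaller."""
    return math.log10(a / b)


models = [
    ("GPT3", GPT3_PARAMS),
    ("PaLM", PALM_PARAMS),
    ("GPT4 (rumoured)", GPT4_RUMOURED_PARAMS),
]

for name, params in models:
    ratio = params / BRAIN_SYNAPSES
    gap = orders_of_magnitude(params, BRAIN_SYNAPSES)
    print(f"{name}: {ratio:.5f} of brain synapses ({gap:.2f} orders of magnitude)")

print(f"GPT4 (rumoured) vs PaLM: {GPT4_RUMOURED_PARAMS / PALM_PARAMS:.0f}x")
print(f"GPT4 (rumoured) vs GPT3: {GPT4_RUMOURED_PARAMS / GPT3_PARAMS:.0f}x")
```

On these figures PaLM comes out at about 0.00054 of the brain's synapse count (-3.27 orders of magnitude), and a one hundred trillion parameter model at 0.1 (-1 order of magnitude).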
We have to prepare ourselves and our students, or the machines will leave us behind. That still may happen. We must remain ‘humans in the loop’, continually shaping these systems as they develop and ensuring they are controlled and used ethically. Because once AI far exceeds the capabilities of the human mind, the Singularity no longer feels the stuff of fiction.
2 https://www.press.umich.edu/9900/simulacra_and_simulation
3 This is the result of a question put to Bing Chat: in itself a challenge to reference, as it is not taken from a webpage but is rather Bing Chat’s GPT3.5 AI making inferences after I asked it to compare Baudrillard’s ideas on simulacra and simulation with ChatGPT.
6 https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html
7 https://www.nature.com/articles/d41586-019-02208-0
8 This calculation is based on Bing Chat’s calculation as follows: “An order of magnitude is a factor of ten. For example, 10 is one order of magnitude larger than 1, and 100 is two orders of magnitude larger than 1. To compare two numbers by their orders of magnitude, we can use the logarithm function with base 10. For example, log10(100) = 2 and log10(1000) = 3, so 1000 is one order of magnitude larger than 100.
To compare Google’s PaLM with the human brain by their orders of magnitude, we can use the following formula:
log10(PaLM parameters / brain synapses) = log10(540 billion / 10^15) = log10(0.00054) = -3.27
This means that Google’s PaLM has about -3.27 orders of magnitude fewer parameters than there are synapses in the human brain. Alternatively, we can say that the human brain has about 3.27 orders of magnitude more synapses than Google’s PaLM has parameters.”