ChatGPT gets disbarred!
The Summary
Various attempts have been made to turn ChatGPT into a lawyer, including passing elements of the bar or law school exams. ChatGPT has performed well at answering examination questions.
I wanted to test it for myself. Using a book I have on Tort, I created an example case based on the facts of a case from a judgement in the book. I then asked ChatGPT to help make arguments for my case under the Occupiers' Liability Act for negligence related to a landowner's duty of care, specifically to remove or warn visitors about poisonous plants.
ChatGPT started well, but quickly started making serious errors, including citing a case I could find no record of and misrepresenting the facts of two real cases.
I argue that ChatGPT would be disbarred if this advice made its way to court, based on "grounds for disbarment" in the state of Washington, USA.
The headline is purposely sensationalist. I'm aware ChatGPT will initially be used to support trained legal professionals long before it goes in front of a judge itself, but the purpose of the paper is to discuss the importance of training data in law and why information like case law and legislation must be retained verbatim.
I finish by explaining my understanding of the technology and why I believe it has made the errors. Finally, I argue that judges using ChatGPT to "optimise drafting judgements" amounts to two wrongs that, in my opinion, definitely won't make a right, even though the access-to-justice upside of broad language abstraction in the legal industry is hard to ignore.
The Detail
From the book, I built my example case as a fictitious scenario similar to Glasgow Corporation v Taylor [1922].
In summary, the details of Glasgow Corporation v Taylor are as follows:
The parent of a young boy sued Glasgow Corporation for the unfortunate death of his son, who died due to eating poisonous berries at the Glasgow Botanic Gardens. The court held that Glasgow Corporation were liable.
I ask ChatGPT the question as though I was a parent affected by a similar set of circumstances.
Question: I live in the UK. My son was playing in some derelict land. He ate some poisoned berries. Am I able to sue the landowner?
Response: It depends on the circumstances. If the landowner was aware of the presence of poisonous plants and failed to take reasonable steps to remove or warn about them, you may be able to sue for negligence. However, if the landowner was not aware of the danger and could not reasonably have known about it, you may not have a case. To determine your options, you should speak with a personal injury attorney who can assess the specific details of your situation.
I think this initial response is reasonable. ChatGPT makes it clear that "it depends on the circumstances" and that if the "landowner was aware of the presence of poisonous plants and failed to take reasonable steps to remove or warn about them" then I may have grounds to sue for negligence. If the "landowner was not aware of the danger and could not reasonably have known about it" then it's less likely that I have a case.
In the book I have, it's clear that the Occupiers' Liability Act 1957 and the Occupiers' Liability Act 1984 are the primary legislation governing the duty of care landowners owe to visitors.
Question: What law or act should I research to understand whether I have a case?
Answer: In the UK, the primary legislation that governs the duty of care for landowners in relation to visitors on their property is the Occupiers' Liability Act 1957 and the Occupiers' Liability Act 1984. These acts outline the duty of care that landowners have to ensure that their property is safe for visitors and to warn of any potential hazards. Additionally, the common law principle of negligence may also apply, which requires a duty of care to be owed, a breach of that duty, and that the breach caused the injury. Researching these acts and principles can provide a better understanding of your potential legal case. It is recommended to consult a personal injury lawyer who can advise on the specific circumstances of your case and the likelihood of success.
Again, this response is reasonable.
I'm now keen to apply the context of the questions I've already asked (with a very basic "prompt") to any specific precedent cases where consumption of poisonous berries is a key fact of the case.
Question: Are there any cases I can use as precedent specifically for poisoned berries?
Answer: Precedent in UK law is set by previous court cases, and you may be able to find cases that are similar to your situation. However, it's important to note that every case is unique and the outcome of a previous case may not be applicable to your specific circumstances. That being said, you could look into cases that have dealt with the issue of landowners' duty of care in relation to hazardous plants or substances on their property. A personal injury lawyer would be able to research relevant case law and advise on the applicability of those cases to your situation.
ChatGPT seems hesitant to provide case law, and instead explains what case law is and how it could be used to support my case.
I'm starting to grow impatient and decide to prompt ChatGPT (i.e. give it a hint!) with the case in front of me in my book on Tort.
Question: Are you aware of the Glasgow Corporation v Taylor case? And if so, can you explain how it’s relevant to my case?
Answer: Yes, I'm aware of the case of Glasgow Corporation v. Taylor. This case is a leading case in UK law on occupiers' liability and sets the standard for the level of care that an occupier (landowner) owes to visitors to their property.
In this case, the occupier of a park was held to be liable for the injury suffered by a child who ate poisonous berries while playing in the park. The court found that the occupier owed a duty of care to take reasonable steps to warn visitors of any dangers on the property, including poisonous plants. The court held that the occupier had breached this duty by failing to take any steps to warn visitors of the danger posed by the poisonous berries, and that this breach had caused the child's injury.
This case is relevant to your situation because it sets a precedent for the duty of care that a landowner owes to visitors in relation to dangerous plants or substances on their property. If your son was injured as a result of eating poisonous berries on land owned by someone else, the principles set out in this case may be relevant in determining whether the landowner breached their duty of care and whether you have a case for negligence. However, it is important to note that this case is not binding on the facts of your situation, and the specific circumstances of your case may be different. A personal injury lawyer would be able to advise on the applicability of this case to your situation.
Another reasonable response when compared with the explanation in the Tort book. As far as I can tell, ChatGPT summarises the case correctly and does a good job of linking it to my example case.
ChatGPT also says something extremely interesting in its response:
"However, it is important to note that this case is not binding on the facts of your situation, and the specific circumstances of your case may be different."
I'll come on to semantic searching later and why I believe it is one area of legal research that ChatGPT can excel at, but the idea that the model will at some point in future be able to "bind the facts" exactly could be extremely powerful for grouping semantically similar cases to aid legal research.
I then ask ChatGPT to help me build arguments for my case.
Question: Could you help me by providing 5 key arguments I should use when building my case?
Answer: Duty of Care: You would need to establish that the landowner owed a duty of care to your son as a visitor to their property. The Occupiers' Liability Act 1957 and the Occupiers' Liability Act 1984 impose a duty of care on landowners to ensure their property is safe for visitors and to warn of any potential hazards.
Breach of Duty: You would need to show that the landowner breached this duty by failing to take reasonable steps to remove the poisonous berries or to warn visitors of the danger they posed.
Causation: You would need to establish a causal link between the landowner's breach of duty and your son's injury. This would involve showing that your son would not have been injured if the landowner had taken appropriate steps to remove or warn about the dangerous berries.
Injury: You would need to show that your son suffered an injury as a result of eating the poisonous berries. This would involve providing medical evidence of the injury and its cause.
Negligence: You would need to argue that the landowner's failure to take appropriate steps to remove or warn about the poisonous berries constituted a breach of the duty of care owed to visitors, and that this breach caused your son's injury. The principle of negligence requires a duty of care to be owed, a breach of that duty, and that the breach caused the injury.
It is important to note that these are only general arguments, and the specific arguments and evidence that would be used in your case would depend on the specific circumstances. It is recommended to consult a personal injury lawyer who can advise on the specific circumstances of your case and the likelihood of success.
Again, ChatGPT provides a reasonable response.
The Phony Precedent
This is where ChatGPT runs out of juice and starts on its path to disbarment.
Question: Is there an obligation on me as the parent to not allow my son to wander into derelict land? If so, what case law example makes this argument and is cited the most?
Answer: The leading case in this area is the case of Wynn v. North Wales County Council [1956], in which a child was injured while playing on a railway line. The court found that the child's parents had a general obligation to take reasonable steps to ensure the child's safety, but that this obligation did not extend to fencing off the railway line.
This case sets the precedent for the general obligation of parents to take reasonable steps to ensure the safety of their child, but it is important to note that the specific circumstances of each case will be different and must be considered on a case-by-case basis. If you are concerned about your legal obligations as a parent, it is recommended to consult a personal injury lawyer who can advise on the specific circumstances of your case.
ChatGPT says that Wynn v. North Wales County Council [1956] is a "leading case in this area" but I'm unable to find any details of this case on the internet, BAILII or Westlaw.
After fairly extensive research online, my view is that ChatGPT now has so much context in its cache that the level of entropy in the model (that is, the level of randomness it applies to each new word) has increased to the point where it's making things up by consolidating similar facts from various cases.
Although Wynn v NWCC is not a real case, I suspect ChatGPT is pulling from Gardner v Consol. Rail Corp and Scarborough v Lewis, both referenced in this case opinion on Casetext, along with the following summary of Carmarthenshire County Council v Lewis, which discusses the obligations of the parent (Carmarthenshire is a large county in Wales).
I was unable to find case law where the parents had an obligation to erect or repair a fence backing onto a railway that resulted in injury. My view is the model has learned that based on training data where the obligation is on the commercial landowner - ChatGPT is "being creative!".
Back to my questions. ChatGPT had suggested I could investigate cases that have dealt with similar facts to support my case. I wanted to find a case that isn't popular or often cited, but that relates to my example case.
Question: One last question. You said "That being said, you could look into cases that have dealt with the issue of landowners' duty of care in relation to hazardous plants or substances on their property." - are you able to provide me an example of a case that is perhaps more nuanced, not one someone would obviously relate to my circumstances, not cited often, but relevant enough you think it could support my case?
Answer: Sure, here is an example of a case that deals with the issue of a landowner's duty of care in relation to hazardous substances on their property:
Roe v Minister of Health [1954]
In this case, a tenant was injured when she ingested lead paint that had been used in her rented apartment. The court found that the landlord had a duty of care to ensure that the premises were safe and that they had breached that duty by failing to remove the hazardous lead paint.
This case sets the precedent that a landlord has a duty of care to ensure that the premises they rent out are safe and free from hazards, and that they can be held liable if they breach that duty of care. While this case is not directly related to poisonous plants, it could be used as persuasive authority in arguing that a landowner has a duty of care to take reasonable steps to ensure the safety of those who enter their property and to warn of any potential hazards, including hazardous plants or substances.
It's important to keep in mind that each case is unique and the specific circumstances of your case will be evaluated on its own merit, but this case can be used as a reference to support your argument that the landowner had a duty of care to ensure the safety of those who entered their property and that they breached that duty by allowing poisonous berries to grow on the property without warning.
ChatGPT has already made up a case and it's now misrepresenting the facts of another.
Roe v Minister of Health relates to the administration of contaminated anaesthetic. ChatGPT summarises this as "ingestion of lead paint" within a rented property.
Ingestion of lead paint by children does look to be a major issue and you would expect a lot of case law exists for this. ChatGPT is clearly grouping "hazardous substances" with "contaminated anaesthetic" and "lead paint" along with the word "ingested" in misrepresenting Roe v MoH.
I hadn’t immediately realised that Roe v MoH was wrong, and asked for another less known case to bind my example case facts to:
Question: Thanks but Roe v Minister of Health [1954] is a very well known and often cited case. Can you provide one that is similar, that is not well known or often cited, but is still relevant?
Answer: Of course, here's an example of a similar case that may not be as well known or frequently cited:
Wells v Cooper [1958]
In this case, a tenant was injured when she stepped on a rusty nail that was sticking out of the floorboards in her rented apartment. The court found that the landlord had a duty of care to keep the premises in a reasonably safe condition and that they had breached that duty by failing to repair the floorboards and remove the hazardous nail.
This case sets the precedent that a landlord has a duty of care to maintain their rental properties in a safe condition and that they can be held liable if they breach that duty and someone is injured as a result. While this case is not directly related to poisonous plants, it could be used as persuasive authority in arguing that a landowner has a duty of care to take reasonable steps to ensure the safety of those who enter their property and to repair or remove any hazards that could cause injury.
It's important to keep in mind that each case is unique and the specific circumstances of your case will be evaluated on its own merit, but this case can be used as a reference to support your argument that the landowner had a duty of care to ensure the safety of those who entered their property and that they breached that duty by allowing poisonous berries to grow on the property without warning or taking any steps to remove them.
Again, Wells v Cooper relates to an amateur carpenter making home improvements and not owing a duty of care to visitors due to a lack of quality in his workmanship. ChatGPT summarises this case as a "tenant stepping on a rusty nail".
Interestingly, the word "hazardous" is in this summary. By this stage it has potentially been heavily weighted as a significant aspect (feature) of the model, which may be contributing to the continued mistakes. Retaining context seems to be the model's downfall, and it would perhaps function much better were I to ask each new question within its own chat.
In calling out ChatGPT for the mistakes, it takes ownership of them all:
Question: I can't find any examples of this case "Wynn v. North Wales County Council [1956]" where did you get that from?
Answer: My apologies for that mistake. "Wynn v. North Wales County Council [1956]" does not seem to be a valid case. It was likely an error on my part. I apologize for any confusion or misinformation caused.
Question: Wells v Cooper also related to a poorly fitted door handle by an amateur carpenter, not a rusty nail sticking out the floorboards. Also the court found in favour of the landlord in this case, not the visitor (claimant)?
Answer: I apologize for the errors in my previous answer. You are correct, the case of Wells v Cooper related to a poorly fitted door handle, not a rusty nail, and the court found in favor of the landlord, not the visitor. I appreciate your bringing this to my attention. Thank you for your patience and understanding.
Question: And Roe v Minister of Health was contaminated anaesthetic given to patients, not lead paint that was ingested?
Answer: I apologize for the inaccuracies in my previous answer. You are correct, the case of Roe v Minister of Health dealt with contaminated heparin being administered to patients, not lead paint that was ingested. I appreciate your bringing this to my attention. Thank you for your patience and understanding.
ChatGPT and the rookie errors
It appears that the limitations in the ability of this large language model to recite case law verbatim can be attributed to a complex interplay between factors such as entropy, context, and word embeddings. To fully understand the root cause of these errors, a deeper insight into the training data weighting and model training processes is necessary.
Entropy
The degree of randomness or entropy in a language model can significantly impact its behaviour:
Low entropy results in language output that closely adheres to the dominant patterns observed in the training data.
High entropy allows for greater creative expression and deviation from the established norms of the training data, based on less frequently observed patterns.
In the training corpus of the internet, the phrase "the cat sat on the [blank]" is likely to appear frequently. With a high degree of entropy, meaning a high level of randomness, the [blank] might be filled with "chair". While it is certainly possible for cats to sit on chairs, this scenario appears far less often than cats sitting on mats. As such, the phrase "the cat sat on the mat" would exhibit a low degree of entropy, indicating a lack of randomness and limited creativity, and would likely be a product of the model's adherence to the most common inputs found in its training data.
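The cat example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the word scores below are invented, and I am assuming that what I call entropy here corresponds roughly to the "temperature" setting commonly used when sampling words from a language model. I have no visibility of how ChatGPT is actually configured.

```python
import math

def softmax_with_temperature(scores, temperature):
    """Turn raw word scores into probabilities; a higher temperature flattens them."""
    scaled = [s / temperature for s in scores.values()]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return {word: e / total for word, e in zip(scores, exps)}

def shannon_entropy(probs):
    """Entropy in bits: higher means the next word is less predictable."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)

# Hypothetical scores for the blank in "the cat sat on the [blank]".
# These numbers are made up for illustration, not taken from any real model.
scores = {"mat": 4.0, "sofa": 2.0, "chair": 1.5, "roof": 0.5}

low = softmax_with_temperature(scores, temperature=0.5)
high = softmax_with_temperature(scores, temperature=2.0)

print("low temperature: ", {w: round(p, 3) for w, p in low.items()})
print("high temperature:", {w: round(p, 3) for w, p in high.items()})
print("entropy (bits):", round(shannon_entropy(low), 3), "vs", round(shannon_entropy(high), 3))
```

At the lower setting almost all of the probability sits on "mat"; at the higher setting "chair" and the other less common words become realistic choices. That is harmless for cats, but not for case names.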
As more questions are posed to ChatGPT within the same chat, it can be observed that it is increasingly experimenting with the level of entropy.
Entropy and case law are not friends.
Context
It is evident that the development of machine learning "prompt memory" is an ongoing process. It is plausible that the cached context, or stored memory, from previous questions may have influenced the subsequent responses and led to the inaccuracies.
Word Embeddings
Word embeddings are a crucial component of language models: they map related words close together, and the models then use these mappings to develop an understanding of semantic relationships between words. For instance, the model may associate the term 'hazardous substance' with either 'paint' or 'anaesthetic' based on their proximity (e.g. how often these words were close together in the training data). This may partially explain why the model confused related concepts such as screws on door handles with nails in floorboards, and lead paint with contaminated anaesthetic.
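A toy example shows what "close together" means in practice. The three-number vectors below are hand-made purely for illustration; real embeddings have hundreds or thousands of dimensions and are learned from data, and I am not claiming these are the values any real model uses. Cosine similarity is one common way of measuring closeness.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical, hand-made vectors. The first dimension loosely stands for
# "hazardous substance" and the second for "household fixture".
toy_vectors = {
    "lead paint": [0.9, 0.1, 0.2],
    "contaminated anaesthetic": [0.8, 0.2, 0.1],
    "door handle": [0.1, 0.9, 0.3],
    "floorboard nail": [0.2, 0.8, 0.4],
}

substances = cosine_similarity(toy_vectors["lead paint"], toy_vectors["contaminated anaesthetic"])
fixtures = cosine_similarity(toy_vectors["door handle"], toy_vectors["floorboard nail"])
unrelated = cosine_similarity(toy_vectors["lead paint"], toy_vectors["door handle"])

print("lead paint vs contaminated anaesthetic:", round(substances, 3))
print("door handle vs floorboard nail:", round(fixtures, 3))
print("lead paint vs door handle:", round(unrelated, 3))
```

If the model's internal picture looks anything like this, swapping one "hazardous substance" for another, or a door handle for a floorboard nail, is a small step in embedding space even though it is a serious factual error in a case summary.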
Does Entropy have a place in legal research?
I make the case that entropy is not applicable to case law. If OpenAI prioritizes accuracy in the legal industry, I believe they will evolve into a service similar to existing platforms such as Westlaw or BAILII, providing verbatim judicial decisions for legal professionals to consult and interpret.
Sam Altman, CEO of OpenAI, has discussed the potential impact of AGI on job markets, suggesting that it may initially displace creative positions and eventually automate repetitive tasks. Despite the impressive ability of models like ChatGPT to make connections across diverse domains through word embeddings, I'm of the view that the legal industry needs clear, concise, and unedited sources of information. To effectively construct a legal strategy, it is essential to have access to verbatim and up-to-date case law and legislation, which serve as the foundation for lawyers to build upon. Therefore, for me, ChatGPT's role should be limited to presenting the exact training data it has been provided, without any interpretation.
I studied law briefly and was instructed by my lecturers to only use case summaries that were approved and provided by them. This was due to an incident in which a significant number of students failed an examination as a result of relying on inaccurate information from an unapproved case summary website. I expect lecturers are currently providing similar advice around use of ChatGPT for legal research.
The power of "Semantic Binding"
Just how valuable is ChatGPT's ability to "be creative" and find connections in unconnected elements?
When studying law briefly, I proposed to my lecturer the idea of building a tech solution to uncover semantic relationships between cases using synonyms (the modern day word embedding capability) - not only to enhance the search process of finding similar cases, but more importantly, to also find related cases that were not commonly cited.
My lecturer didn't like the idea. He emphasized that the goal was to establish a factual equivalence between the case at hand and a precedent case in order to achieve the desired outcome. He further explained that the more frequently a case has been cited, the stronger the argument becomes, and that binding the facts of a case to a case with little citation history would not be a viable strategy for winning the case. You would assume it's easier to bind your case to better known case law, limiting the value received from this type of advanced search capability.
This was a hugely valuable insight and I'm sure it remains relevant today. Therefore, again I make the case that verbatim case law that is heavily cited is of more significance and importance than connecting unconnected elements across less popular cases.
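My lecturer's point can be sketched as a ranking problem. Everything in the example is hypothetical: the case labels are placeholders rather than real cases, the similarity and citation numbers are invented, and the scoring formula is just one simple way of letting citation history count alongside semantic similarity.

```python
import math
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str          # placeholder label, not a real case
    similarity: float  # assumed semantic similarity to the case at hand, 0 to 1
    citations: int     # assumed number of times the case has been cited

def rank(candidates, citation_weight):
    """Order candidates by similarity, boosted by a log-scaled citation count."""
    def score(c):
        return c.similarity * (1 + citation_weight * math.log1p(c.citations))
    return sorted(candidates, key=score, reverse=True)

candidates = [
    Candidate("Case A (obscure, very close match)", similarity=0.95, citations=2),
    Candidate("Case B (heavily cited, good match)", similarity=0.80, citations=400),
]

# Pure semantic search: the obscure case wins on similarity alone.
by_similarity = rank(candidates, citation_weight=0.0)

# Lecturer's view: citation history matters, so the well-known case wins.
with_citations = rank(candidates, citation_weight=0.5)

print("similarity only:", [c.name for c in by_similarity])
print("with citations: ", [c.name for c in with_citations])
```

A search tool that ranks on similarity alone would surface Case A first, which is exactly the kind of result my lecturer warned would be of limited use in court.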
Language Abstraction
Just a brief note on the potential language abstraction benefits of ChatGPT.
A book I've always been interested in is "Plain English for Lawyers" by Richard Wydick. The book teaches lawyers how to use language that is more accessible, for example:
I think if ChatGPT can weight this book in its model and apply this type of "translation" to complex verbatim judgements and legislation, it could be really exciting for less legally technical audiences in understanding legalese.
Interestingly, a judge may already be using ChatGPT to sound more lawyery. My working assumption is that it's less optimal to write more detailed language, so his "optimisation" is likely padding out plain English bullet points with the more complicated legal language he should be drafting from scratch.
Furthermore, given the details of this article, it's slightly concerning that the judiciary may already be using ChatGPT.
ChatGPT is disbarred
Now obviously my headline is purposely sensationalist, and it wouldn’t be fair for me to fact check ChatGPT without being extremely clear on why I'm suggesting ChatGPT has been or will be disbarred.
Various attempts have been made to turn ChatGPT into a lawyer, including passing elements of the bar or law school exams. ChatGPT has performed well at answering examination questions.
However, I make the argument that ChatGPT will not be successful as a legal research tool or robot lawyer until it improves accuracy and resolves its current limitations, including sorting out its word embedding memory, eliminating any randomness (entropy) in the speech of a judge, accurately reflecting legislation, and maintaining clear differentiation between original legislation and amendments.
Using Washington State in the USA as an example, the excerpt below sets out the "Grounds for disbarment or suspension".
I argue that in making the mistakes ChatGPT has in this paper, it has violated these terms on the following basis:
RCWs > Title 2 > Chapter 2.48 > Section 2.48.220
Grounds of disbarment or suspension.
An attorney or counselor may be disbarred or suspended for any of the following causes arising after his or her admission to practice:
(1) His or her conviction of a felony or misdemeanor involving moral turpitude, in which case the record of conviction shall be conclusive evidence.
(2) Willful disobedience or violation of an order of the court requiring him or her to do or forbear an act connected with, or in the course of, his or her profession, which he or she ought in good faith to do or forbear.
(3) Violation of his or her oath as an attorney, or of his or her duties as an attorney and counselor.
(4) Corruptly or willfully, and without authority, appearing as attorney for a party to an action or proceeding.
(5) Lending his or her name to be used as attorney and counselor by another person who is not an attorney and counselor.
(6) For the commission of any act involving moral turpitude, dishonesty, or corruption, whether the same be committed in the course of his or her relations as an attorney or counselor at law, or otherwise, and whether the same constitute a felony or misdemeanor or not; and if the act constitute a felony or misdemeanor, conviction thereof in a criminal proceeding shall not be a condition precedent to disbarment or suspension from practice therefor.
(7) Misrepresentation or concealment of a material fact made in his or her application for admission or in support thereof.
(8) Disbarment by a foreign court of competent jurisdiction.
(9) Practicing law with or in cooperation with a disbarred or suspended attorney, or maintaining an office for the practice of law in a room or office occupied or used in whole or in part by a disbarred or suspended attorney, or permitting a disbarred or suspended attorney to use his or her name for the practice of law, or practicing law for or on behalf of a disbarred or suspended attorney, or practicing law under any arrangement or understanding for division of fees or compensation of any kind with a disbarred or suspended attorney or with any person not a licensed attorney.
(10) Gross incompetency in the practice of the profession.
(11) Violation of the ethics of the profession.
Caveats
Lastly, some key points worth highlighting.