Mission Possible: The Assessment Framework ChatGPT Can't Beat
"This comprehensive approach ensures the authenticity of students' work and encourages genuine skill development, making it the assessment that even ChatGPT can't beat." - Source: ChatGPT
Introduction
For twenty years I hacked and developed language and tech systems in universities around the world to prevent plagiarism and encourage good academic work. Having concluded that our human 'techne' would produce sophisticated ways to trick most education assessments, ten years ago I delved into what AI could do in the future and developed a college admissions and assessment framework that students using "generative AI or complex language software" could not crack. I published it in my Doctor of Education dissertation at the University of Exeter. We implemented it first as an experiment at UCL Innovation & Enterprise in Kazakhstan, then for the Masters Bridge in English within my private company Synergetics Education in Japan, India, and Germany.
In this article I employ ChatGPT to evaluate whether ChatGPT or similar tools can hack my system, called "Portfolio-based multitriangulated assessment". After failing to hack the test hacker, ChatGPT suggests this is a highly effective approach to combating cheating attempts with ChatGPT and similar tools, while empowering students and sharply increasing document validation.
Mission Possible: Creating an Unbeatable Assessment
Validated assessments rely on more data than variables and judge it historically. The multiple-triangulated assessment system I propose below expands out from the concepts of "trustworthiness" and "transferability" in case research, which are related to notions of 'validity' and 'reliability'. My "each learner is a multi-faceted case" approach to assessment addresses situations with multiple variables, data points, and interpretations interacting simultaneously.
With generative AI systems, language and learning are constantly in transformation, necessitating a multi-faceted approach that is adaptable in real time. My "multiple-triangulation" of human and machine language to determine (in)consistency was an adaptation of Denzin's Triangulation techniques, drawing on insights ranging from Fuller's Synergetics, Popper's Falsification, Chomsky's Generative Grammar, and Gardner's Multiple Intelligences Theory to Heidegger's Hermeneutics, Stiegler's Grammatization, and Hinton's Neural Networks.
For fun, the text throughout blends ChatGPT's voice with my own, making it impossible for even software to recognize the difference. This is the basic way people hack the system for essays. My own "human" sophistication can be seen now, because I told you. But when I describe "multiple triangulation" below, neither plagiarism checkers nor "gpt detectives" can distinguish my text from the computer's. And if they can, so what?
Multiple triangulation in case study research typically refers to the use of multiple data sources, methods, researchers, theoretical perspectives and other data points to validate and enhance the credibility of the research findings. Think of it like "entry points" into a room. Whether a door or window, an architect can design the size, shape, and position of the entry point. Now imagine that triangulation (the furniture) helps to ensure that the research results (size, shape, positioning) are more comprehensive, accurate, and reliable by cross-verifying the data and reducing the impact of potential biases or limitations associated with a single data source or method. In other words, it is a science in which one form of information verifies or challenges the accuracy of another.
To beat ChatGPT, this system goes beyond the triangulation of two points and requires three or more evaluation components that work together to ensure the credibility of a student's work:
- Written work
- Oral exams
- Tutorial feedback
- Discrete tasks
- Advanced language comparison
These categories include a large array of "data points" called artefacts that can be collected and triangulated. With the large amount of data we pull from each student, even micro or macro artefacts can be triangulated. For example, written work can include not only essays and long-form responses but also simple calculations or sentence strings, discussion posts, and other content suggestive of an 'author.' Tutorial feedback can be expanded to any 'human to human spoken interaction about a topic between a more knowledgeable person and a less knowledgeable person', which could be a coaching, mentoring, or employer session. By integrating these 'active language creation' components, the assessment system effectively discourages cheating and promotes genuine skill development and understanding of course material.
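To make the idea of triangulating three or more components concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the framework's actual implementation: the `Artefact` class, the 0 to 100 rubric scores, the `triangulate` function, and the 20-point tolerance are all hypothetical. It simply flags any component whose average sits far from the others.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Artefact:
    component: str  # e.g. "written", "oral", "tutorial", "discrete" (labels assumed)
    score: float    # hypothetical 0-100 rubric score


def triangulate(artefacts, tolerance=20.0):
    """Return the components whose mean score differs from the median of
    all component means by more than `tolerance` (an assumed threshold)."""
    by_component = {}
    for artefact in artefacts:
        by_component.setdefault(artefact.component, []).append(artefact.score)
    if len(by_component) < 3:
        raise ValueError("triangulation needs three or more components")
    means = {name: sum(s) / len(s) for name, s in by_component.items()}
    centre = median(means.values())
    return sorted(name for name, m in means.items() if abs(m - centre) > tolerance)


# Invented example data: polished essays, noticeably weaker live performance.
portfolio = [
    Artefact("written", 92), Artefact("written", 88),
    Artefact("oral", 61), Artefact("tutorial", 58), Artefact("discrete", 64),
]
flagged = triangulate(portfolio)
print(flagged)  # the "written" component stands apart from the rest
```

A flag here is only a prompt for a human conversation with the student, not a verdict; the value of the approach lies in having several independent artefacts to compare.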
ChatGPT suggests that, to further strengthen the system and make it even more resilient against AI-assisted cheating, the following additional measures can be implemented:
- Dynamic assessments: Adaptive, real-time assessments that adjust question difficulty and content based on student performance, making it challenging for students to rely on AI-generated responses.
- Time-limited tasks: Timed tasks or quizzes that require students to answer questions or solve problems within a short period, making it difficult for them to consult AI tools during the assessment.
- Randomized question banks: Large banks of questions from which the system randomly selects during assessments, making it challenging for students to prepare AI-generated responses in advance.
- Keystroke analysis: Monitoring and analyzing students' typing patterns to detect inconsistencies that may suggest the use of AI-generated content.
- Proctored exams: Supervised exams, either in-person or through remote proctoring services, to minimize the opportunity for students to use AI tools like ChatGPT.
- Peer evaluation: Incorporating peer review and evaluation components into the assessment process to encourage collaboration and make cheating more difficult.
Notice that many of these measures can be implemented either "during" or "after" a course of study. But the problem is this draconian command-and-control approach of surveilling our kids instead of teaching them! That's why, by triangulating three or more pieces of information from an existing course, assessment, or experiential activity, a more holistic (and accurate and stackable) portfolio of work is established without Big Brother-like tactics, which are first of all unnecessary and which also create undue test pressure as well as teaching and administrative problems.
By combining some of these additional measures with the existing triangulation assessment system, an even more robust evaluation process can be created that effectively prevents cheating attempts using ChatGPT or other AI-generated content.
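As one illustration of the additional measures listed above, a randomized question bank might be sketched in Python as follows. The bank contents, the `draw_questions` name, and the choice to seed the draw per student (so that a paper can be reproduced later for review) are assumptions made for this example only.

```python
import random


def draw_questions(bank, k, seed=None):
    """Pick k distinct questions from the bank.

    Passing the same seed reproduces the same paper, which is an
    assumed convenience for later review, not a stated requirement."""
    rng = random.Random(seed)
    return rng.sample(bank, k)


# Hypothetical bank of 50 placeholder questions.
bank = [f"Q{i}" for i in range(1, 51)]

paper_a = draw_questions(bank, 5, seed="student-a")
paper_b = draw_questions(bank, 5, seed="student-b")
print(paper_a)
print(paper_b)
```

The larger the bank relative to the number of questions drawn, the less two students' papers overlap.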
It would be extremely challenging for someone to beat this multi-faceted triangulation assessment system using ChatGPT. The system is designed to make cheating considerably more difficult and time-consuming than genuinely engaging with the material and developing the required skills.
Mission Impossible: The Highly Determined Cheater
It is possible that a highly determined individual like Fake Tom Cruise might still attempt to use ChatGPT in subtle ways, such as:
- Idea generation: The student might use ChatGPT to help brainstorm or outline their work, which could be difficult to detect since it is not a direct copy of generated text. However, this use may be considered less of a cheating attempt and more of a creative support tool.
- Language improvement: A student could use ChatGPT to refine grammar, syntax, or vocabulary in their work. By using the AI's suggestions to polish their writing, they could enhance the quality without raising suspicion. Nonetheless, the advanced language comparison employed by the system might still identify inconsistencies.
- Adaptive learning: A highly resourceful student could attempt to "train" ChatGPT over time to match their writing style by providing it with samples of their work. In this way, the student might try to generate responses that are more consistent with their previous submissions. This approach would require substantial effort and may still not guarantee success in evading the system's detection methods.
Even though these attempts might be possible, the comprehensive nature of the assessment system would make it very difficult for students to rely on AI-generated content consistently. The triangulation of written work, oral assessments, tutorial feedback, and discrete tasks ensures that inconsistencies in a student's performance would likely be identified, making it highly challenging to cheat using ChatGPT or other means.
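The "advanced language comparison" component is not specified in detail here. One simple way to picture it, purely as an assumption, is to compare how often two samples from the same student use common function words, since such habits tend to persist across topics. The word list, the function names, and the use of cosine similarity below are illustrative choices, not the system's actual method.

```python
import math
import re
from collections import Counter

# A small, assumed list of function words; a real analysis would use far more features.
FUNCTION_WORDS = ("the", "of", "and", "to", "a", "in", "that", "is", "it", "but", "so", "i")


def profile(text):
    """Relative frequency of each function word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty text
    return [counts[w] / total for w in FUNCTION_WORDS]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def style_similarity(text_a, text_b):
    """Score near 1 when two samples use function words in similar proportions."""
    return cosine(profile(text_a), profile(text_b))


essay = "I think that the answer is in the data, but it is hard to see."
spoken = "So I looked at the data and it is clear that the answer is there."
print(round(style_similarity(essay, spoken), 2))
```

A low score between, say, a submitted essay and a transcript of the same student's oral exam would be one more data point to triangulate, never proof on its own.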
Mission Repossible: Using Science with Existing Software
Josh prompt to ChatGPT: "Do any of these measures require special technology that school systems don't have?"
ChatGPT:
"Most of the measures mentioned in the portfolio-based triangulated assessment system do not require specialized technology beyond what is commonly available in school systems. However, there are a few measures that may necessitate additional resources or technology to be effectively implemented:
- Advanced language comparison: This measure may require the use of specialized software or tools that can perform sophisticated linguistic analysis to detect inconsistencies in students' work.
- Dynamic assessments: Implementing adaptive, real-time assessments could require the use of specific software or platforms that support this type of evaluation method.
- Keystroke analysis: Monitoring and analyzing students' typing patterns may necessitate the use of specialized software capable of capturing and processing this type of data.
- Remote proctoring services: If a school system chooses to implement remote proctoring for exams, they would need to invest in software or services specifically designed for this purpose. Such services often use webcam monitoring and various anti-cheating measures to maintain the integrity of the exam.
While some of these measures may require additional resources or technology, many school systems are increasingly incorporating advanced tools and software into their educational infrastructure. These measures could be adopted gradually, depending on the availability of resources and the priorities of the school system."
Conclusion
Josh prompt to ChatGPT: "but the main ways of triangulation can beat ChatGPT usually without extra tech?"
ChatGPT:
"Yes, the main triangulation components of the portfolio-based assessment system can effectively counteract ChatGPT's potential use in cheating without requiring extra technology. The primary methods of triangulation include:
- Written work
- Oral exams
- Tutorial feedback
- Discrete tasks
These evaluation components work together to provide a comprehensive view of a student's progress, understanding, and skill development. By incorporating different assessment methods that capture various aspects of a student's performance, it becomes difficult for students to rely on AI-generated content like ChatGPT consistently.
For example, inconsistencies between a student's written work and their oral exam performance or tutorial feedback would raise suspicion and make it more challenging for them to cheat. This approach encourages genuine engagement with the course material and reduces the likelihood of successful cheating attempts using ChatGPT or similar AI tools without necessitating any specialized technology. In addition, transforming this content into stackable digital microcredentials builds an extra layer of trustworthiness through a portfolio of work, transferable to future schools and employers."
Need we say more?
Contact: josh@synergetics.us