Unraveling the Paradox of Large Language Models: An Investigative Look at the Promising Advances and Risks of GPT
As the product of relentless research and development, large language models (LLMs) like ChatGPT, constructed on the Generative Pre-trained Transformer (GPT) architecture, are making notable strides. Yet, in the shadow of these remarkable advances loom potential dangers, particularly misuse by malicious actors, including cybercriminals.(1)
Despite OpenAI’s unveiling of GPT-4, aimed at enhancing the functionality of ChatGPT and minimising its propensity to generate harmful output, experts, including those at Europol workshops, warn that the risks identified in GPT-3.5, the version ChatGPT used at its release, persist in GPT-4.(2)
ChatGPT, launched in November 2022, belongs to a class of artificial intelligence (AI) systems specialising in Natural Language Processing (NLP), known as LLMs. Built on deep learning techniques and trained on massive datasets, these models can comprehend and produce natural language text.(3) This capacity has advanced significantly in recent years, thanks to the evolution of supercomputers, deep learning algorithms and the explosion of data used for training.
However, the potential misuse of ChatGPT by cybercriminals is a significant concern.(4) An incident revealed by an Israeli cybersecurity firm sheds light on this issue, highlighting how malefactors can exploit the ChatGPT application programming interface (API) for WormGPT(5), a variant sold on hacking forums for malicious purposes.(6) The absence of ethical constraints in such variants compounds the danger, enabling even technically unskilled criminals to execute large-scale attacks.
OpenAI introduced GPT-4 for ChatGPT Plus subscribers in March 2023, featuring superior problem-solving abilities, enhanced API integration, and improved image processing and classification. The new model is less likely to generate harmful content while providing more accurate responses, supporting investigative efforts.(7)
Traditional investigative techniques can benefit from GPT and similar language models. GPT can help investigators gather background knowledge, understand technical concepts and patterns, and summarise digital traces such as emails and browser history.(8) Investigators can also discuss their case details and challenges with ChatGPT, which can provide guidance and suggest methodologies or tools based on its extensive training.
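To make this concrete, below is a minimal sketch of how an investigator’s tooling might call the model to summarise a digital trace, assuming the 2023-era openai Python library (pre-1.0). The model name, prompt wording and helper function are illustrative assumptions, not a prescribed workflow.

```python
# A minimal sketch of summarising a digital trace (e.g. an email thread)
# with the 2023-era openai Python library (v0.x). The model name and
# prompt wording below are illustrative assumptions.
import openai

openai.api_key = "YOUR_API_KEY"  # never hard-code keys in real casework

def summarise_trace(trace_text: str) -> str:
    """Ask the model for a neutral, factual summary of an evidential text."""
    response = openai.ChatCompletion.create(
        model="gpt-4",
        temperature=0,  # deterministic-leaning output for repeatability
        messages=[
            {"role": "system",
             "content": "You are an assistant that summarises investigative "
                        "material factually, without speculation."},
            {"role": "user",
             "content": f"Summarise the key points of this email thread:\n\n{trace_text}"},
        ],
    )
    return response["choices"][0]["message"]["content"]

print(summarise_trace("From: alice@example.com ..."))
```

Setting the temperature to 0 keeps the output as repeatable as the API allows, which matters if a summary may later need to be re-examined.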
The model’s misuse potential, however, cannot be ignored. For instance, despite the two-phase training process, unsupervised pre-training to learn language structure and patterns followed by fine-tuning with Reinforcement Learning from Human Feedback (RLHF)(9), safety measures can be bypassed if questions are segmented into individual steps. Moreover, ChatGPT’s proficiency in generating code and producing convincingly authentic text at scale makes it a potent tool for online fraud and phishing, capable of mimicking the language patterns of specific individuals or groups.(10) It is also well suited to spreading propaganda and disinformation, and the emergence of ‘dark LLMs’ hosted on the dark web without safeguards amplifies the threat. Recent activity on dark web forums shows the emergence of FraudGPT, which has been circulating on Telegram channels(11) since July 2023.(12)
An ‘InvestigativeGPT’, just as with other GPT models, could be trained by fine-tuning on a dataset focused explicitly on investigative and detective work. While the model could potentially generate contextually relevant responses, it is essential to emphasise that these AI-driven tools are not intended to replace human expertise and judgment. Instead, they serve as a powerful aid to investigators, providing them with timely insights and analysis. However, the ultimate decision-making power and responsibility reside with human investigators. Their expertise, innovative thinking and specialist knowledge form the backbone of effective investigations.
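As a hedged sketch of what such fine-tuning might look like in practice, the snippet below uses the 2023-era openai fine-tuning endpoints (pre-1.0 Python library). The dataset file name, example record and choice of base model are illustrative assumptions only.

```python
# A hedged sketch of creating an 'InvestigativeGPT' with the 2023-era
# openai fine-tuning API (v0.x library). File name, base model and the
# example record are illustrative assumptions.
import openai

openai.api_key = "YOUR_API_KEY"

# Each training record is one JSON line of chat-formatted messages, e.g.:
# {"messages": [
#   {"role": "system", "content": "You assist criminal investigators."},
#   {"role": "user", "content": "What does this browser artefact indicate?"},
#   {"role": "assistant", "content": "A typed-URL entry suggests ..."}]}

# 1. Upload the curated investigative dataset.
upload = openai.File.create(
    file=open("investigative_training.jsonl", "rb"),
    purpose="fine-tune",
)

# 2. Start the fine-tuning job against a base model.
job = openai.FineTuningJob.create(
    training_file=upload["id"],
    model="gpt-3.5-turbo",
)
print(job["id"], job["status"])
```

Each line of the JSONL file is one curated question-and-answer exchange; the quality of these records, as discussed below, largely determines the quality of the resulting model.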
However, while these models bring a wealth of opportunities, they also present some challenges, especially when considering the disclosure process in investigations and court cases. Legal proceedings demand meticulous documentation and verifiable evidence. The inherent ‘black box’ nature of AI models, where the decision-making process is not always transparent or easily understood, could raise concerns over the admissibility and reliability of evidence generated by these tools.(13) Moreover, additional challenges include storing, managing and appropriately disclosing the vast data used to train these models.
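One way to mitigate the documentation problem is to keep an append-only, hash-sealed log of every model interaction, so that each AI-assisted step can later be produced and verified. The sketch below rests on our own assumptions: the record fields and log format are illustrative, not an established evidential standard.

```python
# A minimal sketch of an audit trail for AI-assisted investigative steps.
# The record fields and log format are illustrative assumptions, not an
# established evidential standard.
import hashlib
import json
from datetime import datetime, timezone

def log_interaction(prompt: str, response: str, model: str,
                    logfile: str = "ai_audit_log.jsonl") -> str:
    """Append a timestamped, hash-sealed record of one model interaction."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model": model,
        "prompt": prompt,
        "response": response,
    }
    # Hash the canonical record so later tampering is detectable.
    record["sha256"] = hashlib.sha256(
        json.dumps(record, sort_keys=True).encode("utf-8")
    ).hexdigest()
    with open(logfile, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return record["sha256"]

digest = log_interaction("Summarise exhibit A ...", "The thread shows ...", "gpt-4")
print(f"Logged with digest {digest}")
```

Because the digest is computed over the canonical record, any later alteration of a logged prompt or response can be detected by recomputing the hash.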
The quality and quantity of data fed into these models also represent a significant challenge. Erroneous outputs can directly result from poor data quality, requiring investigators to ensure that their AI tools are trained on reliable and representative datasets. Moreover, due to ethical considerations, AI models are restricted from generating inappropriate or aggressive language.(14) This may limit their utility in investigations involving explicit conversations between criminals or between victims/witnesses and criminals.
The core problem is that AI will radically increase the scale and efficiency of attackers in ways we are not yet ready to combat. We have some time to develop solutions, just not a lot. That said, observers have long argued that we will always be playing catch-up to criminals, who often use cutting-edge technology better and faster than we do, and certainly quicker than we can address the risks.(15)
As we look ahead, the double-edged nature of LLMs demands careful consideration. Investigators and law enforcement agencies must prepare for these challenges by raising awareness of potential security loopholes and addressing them promptly. The fast pace at which these models evolve also requires investigators to stay abreast of new developments, preventing misuse while leveraging the potential benefits of LLMs. The onus falls on subject matter experts to delve deeper into the research and better understand this emerging technology, driving responsible and practical use in investigative and legal contexts.
Article by:
Darren Mullins - Partner at Accuracy
Paul Wright MSc - Senior Adviser at Accuracy
1 Europol (2023), "ChatGPT - The Impact of Large Language Models on Law Enforcement", a Tech Watch Flash report from the Europol Innovation Lab, Publications Office of the European Union. https://shorturl.at/wGUY3
2 Europol (2023), "ChatGPT - The Impact of Large Language Models on Law Enforcement", a Tech Watch Flash report from the Europol Innovation Lab, Publications Office of the European Union. https://shorturl.at/wGUY3
3 OpenAI (2023), "Release Notes". https://help.openai.com/en/articles/6825453-chatgpt-release-notes
4 OPWNAI, 06 January 2023, "Cybercriminals Starting to Use ChatGPT". https://research.checkpoint.com/2023/opwnai-cybercriminals-starting-to-use-chatgpt/
5 Check Point Research Team, 07 February 2023, “Cybercriminals Bypass Chat GPT Restrictions to Generate Malicious Content” https://blog.checkpoint.com/2023/02/07/cybercriminals-bypass-chatgpt-restrictions-to-generate-malicious-content/
6 Tushar Subhra Dutta, 14 July 2023, "Hackers Use WormGPT to Launch Sophisticated Cyberattacks". https://cybersecuritynews.com/wormgpt-ai-tool/
7 OpenAI (2023). “ChatGPT Plus: New Features and Enhancements 2023 [Updated].” https://chatgpt4.uk/chatgpt-plus-features-and-benefits-2023-updated/?utm_content=cmp-true
8 EclipseForensic, 25 February 2023, "How Will AI Transform Digital Forensics 2023 and Beyond?". https://shorturl.at/gNOW1
9 OpenAI (2023), "Release Notes". https://help.openai.com/en/articles/6825453-chatgpt-release-notes
10 Hacker News, 26 July 2023, "New AI Tool 'FraudGPT' Emerges, Tailored for Sophisticated Attacks", https://thehackernews.com/2023/07/new-ai-tool-fraudgpt-emerges-tailored.html?m=1
11 Channels are a tool for broadcasting your public messages to large audiences. They offer a unique opportunity to reach people directly, sending a notification to their phones with each post. https://telegram.org/tour/channels
12 Edward Gately, 25 July 2023, "Netenrich Tracks Emergence of FraudGPT AI Bot that Accelerates Cyberattacks", https://netenrich.com/blog/fraudgpt-the-villain-avatar-of-chatgpt
13 SEON Technologies, 2021, "What is Blackbox Learning".
14 Ankita, updated 25 July 2023, "Does Character AI allow NSFW Content?". https://www.mlyearning.org/does-character-ai-allow-nsfw-content/
15 Study commissioned by the European Parliament's Policy Department for Citizens' Rights and Constitutional Affairs at the request of the LIBE Committee, 2015, "The law enforcement challenges of cybercrime: are we really playing catch-up?". https://www.europarl.europa.eu/RegData/etudes/STUD/2015/536471/IPOL_STU(2015)536471_EN.pdf