AI in Cyber Threats


By: Taylor Green


Over the past few years, the field of artificial intelligence has seen significant advancements. With the emergence of generative AI systems, most notably ChatGPT, and breakthroughs across scientific disciplines driven by innovative uses of machine learning, the landscape has evolved dramatically.

One of the most remarkable things about these new generative AI models is that the race for internet-facing LLM chatbots is not limited to big hitters like OpenAI, Microsoft, and Google. A bustling community of developers, new and established, is working with both closed and open-source models like the ones these companies use. Some of these models can even be run locally on a reasonably capable home computer.

Protecting systems like these, of course, comes with its own set of challenges. In October 2023, OWASP published its Top 10 for Large Language Model Applications. While I don't want to spend much time on how to secure an internet-facing AI model, discussing some of these vulnerabilities may highlight how they can affect you as well.

These developments bring new challenges in cybersecurity. Key among them are how to defend internet-facing generative AI chatbots and how to safeguard against the outputs these systems produce. The latter leads us into a complex domain of both perceived and actual threats. In this article, I examine the real threats that have already been observed and explore potential threats that, while not yet realized, are plausible in the future.

Forgetting is hard

The ability to forget is difficult for current large language models (LLMs), just as it is for humans. In 2023, Samsung banned the use of generative AI after sensitive data was accidentally leaked to ChatGPT. Since companies that offer services such as ChatGPT may use their users' input as training data, this could expose intellectual property, personally identifiable information, or other sensitive information to AI models. Using this data allows the model to learn from the vast number of conversations it has had. While these companies may have architectures and preventative measures in place, nothing is foolproof.

For this reason, you need to be very cognizant of what you share with external AI models such as these. Removing, or "forgetting," data after training is not something current models can do; it would generally mean retraining the model, which could cost a company millions of dollars in compute time. Companies such as OpenAI have updated their policies to address this; however, as users of these technologies, we are responsible for what we put in. Be careful what you tell an AI chatbot, and review the chatbot's data retention policy so you know what is safe to put into the chat.
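If you do need to run text through an external chatbot, a little pre-processing on your own machine can reduce what leaks. The Python sketch below is purely illustrative: the redact() helper and its handful of regex patterns are assumptions made for this example, and a real data-loss-prevention tool would cover far more categories (names, addresses, API keys, customer records, source code) than this.

import re

# Minimal sketch: strip obvious PII from a prompt before it ever leaves your machine.
# These patterns are illustrative only; real DLP tooling covers far more cases.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\b(?:\+?1[-. ]?)?\(?\d{3}\)?[-. ]?\d{3}[-. ]?\d{4}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(prompt: str) -> str:
    # Replace anything matching a known PII pattern with a placeholder tag.
    for label, pattern in PATTERNS.items():
        prompt = pattern.sub(f"[{label} REDACTED]", prompt)
    return prompt

print(redact("Email john.doe@example.com or call 555-867-5309 about the contract."))
# -> Email [EMAIL REDACTED] or call [PHONE REDACTED] about the contract.

Even a crude filter like this catches the accidental paste of an email thread or a customer record, which is exactly the kind of leak that led Samsung to ban generative AI.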

Training Data Poisoning and Hallucinations

Current LLMs cannot differentiate truth from falsehood; they only know what they have been taught by their training data. LLMs also frequently hallucinate, making up details to complete their output without regard for whether what is being said is true. There have been real-world examples of this in cases where truth and accuracy mattered. For example, I recently spoke with a partner who asked ChatGPT for Microsoft's support phone number and was given the number of a threat actor posing as Microsoft support.

These instances had severe consequences for those who trusted the model's output without verifying it. Hallucinations are not always bad and may even be desirable in certain applications, but not when you intend to state facts. Research is being done on the truthfulness and explainability of AI models so that they can provide references for their statements, but currently the responsibility is on the user to check statements for accuracy.

One factor that can lead to incorrect data when prompting an LLM is data poisoning. Training data poisoning is also a top vulnerability for generative AI according to OWASP. Data poisoning is a technique in which attackers introduce inaccurate data into the training dataset to tamper with the integrity of the model's output, causing the model to give bad or misleading information. Defending against these vulnerabilities is something the owners of internet-facing AI models must consider. For the rest of us, the current best defense against hallucinations and false information is to not trust the output of an AI model without checking its accuracy yourself.
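To make the idea concrete, here is a toy Python sketch (using scikit-learn, and not modeled on any specific real-world attack) showing how flipping a fraction of training labels, one crude form of data poisoning, degrades a simple classifier.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Toy illustration: flip a fraction of training labels and watch test accuracy drop.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

def accuracy_with_poisoning(flip_fraction: float) -> float:
    rng = np.random.default_rng(0)
    y_poisoned = y_train.copy()
    idx = rng.choice(len(y_poisoned), size=int(flip_fraction * len(y_poisoned)), replace=False)
    y_poisoned[idx] = 1 - y_poisoned[idx]  # the "poison": deliberately wrong labels
    model = LogisticRegression(max_iter=1000).fit(X_train, y_poisoned)
    return model.score(X_test, y_test)

for frac in (0.0, 0.1, 0.3, 0.45):
    print(f"{frac:.0%} of labels flipped -> test accuracy {accuracy_with_poisoning(frac):.2f}")

Real attacks on LLM training data are subtler than label flipping, poisoned web pages or tainted fine-tuning sets for example, but the principle is the same: corrupt what the model learns from and you corrupt what it says.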

Deepfakes

Back in 2017, we were first exposed to deepfakes: pictures, videos, and audio created using generative adversarial networks (GANs). Deep learning is a subset of machine learning that uses artificial neural networks to learn patterns from large datasets, and GANs use two machine learning models, typically deep learning models, trained on the same set of images.

These two models are the generator and the discriminator. The generator creates images while the discriminator tries to tell generated images from real ones; through this adversarial back-and-forth, the generator's output becomes increasingly difficult to distinguish from real images. Technology like this gives adversaries the ability to modify or generate audio and video in several ways, from swapping one person's face and voice with another's to impersonating an individual, or even generating an entirely new person with whatever voice or face they want. What's more, the hardware required to run models like this can be bought off the shelf, negating the need for massive cloud compute. When deepfakes were introduced, they were predicted to become a major problem in the phishing arena and elsewhere. So much so that in late 2018, the US Senate introduced the Malicious Deep Fake Prohibition Act of 2018 to combat the creation or distribution of fake electronic media records that appear realistic.
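For readers curious what this generator-versus-discriminator loop looks like in code, here is a minimal PyTorch sketch on a toy one-dimensional task. The network sizes, optimizer settings, and the toy target distribution are arbitrary choices for illustration and are nowhere near the scale of a real deepfake model.

import torch
import torch.nn as nn

# Toy GAN: the generator learns to mimic samples from a Gaussian (mean 4, std 1.25)
# while the discriminator learns to tell real samples from generated ones.
generator = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
discriminator = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())

g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
loss_fn = nn.BCELoss()
batch = 64

for step in range(2000):
    real = torch.randn(batch, 1) * 1.25 + 4.0   # "real" data
    fake = generator(torch.randn(batch, 8))      # generated data

    # Discriminator: label real samples 1 and generated samples 0.
    d_loss = loss_fn(discriminator(real), torch.ones(batch, 1)) + \
             loss_fn(discriminator(fake.detach()), torch.zeros(batch, 1))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator: try to make the discriminator call its output real (label 1).
    g_loss = loss_fn(discriminator(fake), torch.ones(batch, 1))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()

print("mean of generated samples:", generator(torch.randn(1000, 8)).mean().item())

The same adversarial loop, scaled up to convolutional networks and large collections of face images, is essentially what makes generated faces and voices so convincing.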

Deepfakes are expected to contribute increasingly to crimes such as extortion, harassment, blackmail, and document fraud, and I expect them to play a growing role in phishing attacks among many others. The ability to impersonate someone inside a company not just through email, but by phone and video, makes it harder to discern what is real from what is fake in the digital realm.

Recently, a North Korean hacker used deepfake technology to pose as an IT worker and deceived the security firm KnowBe4 into hiring them. This case serves as a stark reminder of how advanced and convincing deepfake technology has become, making it imperative for organizations to implement stringent verification processes to prevent such breaches.

Although this sounds very bleak, there are still ways to defend against these types of attacks. There are several tips for identifying deepfakes: look for artifacts such as blurring around the face, a lack of blinking, odd eye or mouth movement, inconsistencies in hair, and inconsistencies or an odd depth of field in the background. There are also tools developed to help spot AI-generated images, such as the Deepware scanner. One of the most important things to remember is that deepfakes are possible and a real threat. Knowing that threat actors may use these techniques means we must all scrutinize messages that come from external sources, even when we recognize the face or voice in them.

AI Chatbots

While attackers could use ChatGPT, Bard, Midjourney, or other AI models to help them create malicious content, they would generally need to use prompt injection to get around the safeguards that exist. AI chat services such as these have safeguards in place to prevent their tools from being used to generate malicious content such as phishing emails or malware. This gave rise to systems like WormGPT, a chatbot similar to models like ChatGPT but built specifically for cybercrime. The biggest difference is that WormGPT does not have the safeguards these legitimate services do and is, as its developer put it, uncensored. This allows users of the platform to generate new malware on demand, phishing emails for business email compromise without the telltale bad translation, and more. WormGPT is not new in 2023, but the existence of chatbots like it lowers the bar of knowledge needed to execute attacks such as building malware, crafting emails, and even automating extortion and ransomware attacks.

As attackers add this tool to their toolbox, if they have not already, I expect that in 2024 we will see a rise in phishing emails that contain fewer spelling errors and feel more authentic in tone. The speed at which new vulnerabilities are weaponized, new malware is built or upgraded, and campaigns and attacks are carried out will also rise. Hope is not lost, though. While these new threats have been on the rise, defenders in cybersecurity around the world recognize the threat these systems present and are already using machine learning to analyze and defend against these advancements.

Defending Yourself

Defending against phishing is much the same as it has always been. The phishing emails of the future will have fewer spelling errors and better images matching the content the attacker is trying to spoof, but they will still need you to click a malicious link, download a malicious attachment, or call a number. These messages will still contain the other hallmarks of phishing, such as urgency and scare tactics designed to make you act without thinking and just click a link or call a number. Remember that instead of following a link in an alert about an account you may have, you can always open a new tab and navigate to the website manually. Always verify any alerts or urgent messages you receive. Never be afraid to call a friend or colleague to verify that they sent you an email you were not expecting. Trust your gut, and if you think you have received a phishing message, check with your security team.
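One simple technical habit that still works is comparing where a link says it goes with where it actually points. The Python sketch below is a rough illustration of that single check, not a substitute for a real mail filter; the LinkAuditor class and the sample email are made up for this example.

from html.parser import HTMLParser
from urllib.parse import urlparse

# Rough sketch of one classic phishing tell: the visible link text names one domain
# while the href actually points somewhere else.
class LinkAuditor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.current_href = None
        self.findings = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.current_href = dict(attrs).get("href", "")

    def handle_data(self, data):
        text = data.strip()
        if self.current_href and "." in text:  # link text that looks like a URL
            shown = urlparse(text if "//" in text else "//" + text).hostname
            actual = urlparse(self.current_href).hostname
            if shown and actual and shown != actual:
                self.findings.append((text, self.current_href))

    def handle_endtag(self, tag):
        if tag == "a":
            self.current_href = None

email_html = '<p>Your account is locked! <a href="http://evil.example.net/login">www.microsoft.com</a></p>'
auditor = LinkAuditor()
auditor.feed(email_html)
for shown, actual in auditor.findings:
    print(f"Suspicious link: text says {shown!r} but points to {actual!r}")

Hovering over a link in your mail client accomplishes the same check by eye before you ever click.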

The benefits of machine learning have not been lost on defenders. In fact, much of the scary-sounding malware comes from research: IBM's DeepLocker in 2018 carried a deep neural network (DNN) onboard the malware, and in 2023 HYAS Labs developed a proof of concept called EyeSpy, which it describes as fully autonomous, AI-synthesized, polymorphic malware. Malware like this is developed by researchers to understand the potential TTPs that attackers and their malware may use in the future given these advancements. The prevention tools in place in many ConnectWise environments, such as SentinelOne and Bitdefender, already use multiple forms of AI to detect threats. Most modern EDR solutions use deep learning and can also leverage large language models, either to assist the EDR in detecting threats or to help users spot phishing through a scam detection service such as Scamio.

Moving Forward

As the field of artificial intelligence continues to bring us faster and smarter tools, we must remain cognizant of their potential for malicious use. The tools used by threat actors are very similar to the ones you have already been exposed to. Just as generative AI can write a piece of code for you from a simple prompt, threat actors enjoy the same benefit.

AI has lowered the bar to entry for programmers writing both good programs and bad ones; you just may have to use a different chatbot for the latter. Because these tools also lower the knowledge needed to accomplish tasks the user is not highly skilled in, people who over-rely on them could unwittingly introduce vulnerabilities into programs or spread misinformation if we are not careful about how we use their output. The responsibility to combat these threats lies not just with the owners of the AI models, but with all of us.

