Tagging schemas are a prerequisite to successful use of AI for cybersecurity

Anthony Morris

Building an AI powered Virtual Security Analyst

Published: May 10, 2023

I'm going to apologize to Jacob Williams (known by many as @MalwareJake on Twitter) right up front, because I will likely misquote his opening IANS presentation from two days ago. It's not on purpose, and the gist of the quote is still accurate. It went something like this: "Data normalization is the challenge that is inhibiting the successful application of artificial intelligence in cybersecurity."

In context, Jake was talking about the importance of developing a tagging schema. As a simple example, consider two tags that might be applied to the same application:

app_requires_admin_access
admin_access_required

In other words, it isn't sufficient just to have tags, or even to apply them (as in this example, where two different tags express the same fact). You must also follow a defined tagging schema.
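To make the idea concrete, here is a minimal Python sketch of what enforcing a tagging schema could look like. Everything in it is an assumption for illustration: the snake_case naming rule, the choice of app_requires_admin_access as the canonical tag, the extra requires_admin alias, and the names CANONICAL_TAGS and normalize_tag are all hypothetical, not part of any real product or of Jake's talk.

```python
import re

# Hypothetical schema: each canonical tag maps to the free-form variants
# it replaces. Which spelling is "canonical" is an arbitrary choice here.
CANONICAL_TAGS = {
    "app_requires_admin_access": {"admin_access_required", "requires_admin"},
}

# Assumed naming rule: lowercase words joined by single underscores.
TAG_PATTERN = re.compile(r"^[a-z]+(_[a-z]+)*$")

# Reverse index so a variant can be looked up directly.
ALIASES = {
    alias: canonical
    for canonical, aliases in CANONICAL_TAGS.items()
    for alias in aliases
}


def normalize_tag(raw: str) -> str:
    """Return the canonical tag for raw, or raise ValueError if it is not in the schema."""
    # Fold case and turn runs of spaces or hyphens into single underscores.
    tag = re.sub(r"[\s\-]+", "_", raw.strip().lower())
    if not TAG_PATTERN.match(tag):
        raise ValueError(f"tag {raw!r} does not follow the naming rule")
    if tag in CANONICAL_TAGS:
        return tag
    if tag in ALIASES:
        return ALIASES[tag]
    raise ValueError(f"tag {raw!r} is not defined in the schema")


if __name__ == "__main__":
    for candidate in ("admin_access_required", "App-Requires Admin Access"):
        print(candidate, "->", normalize_tag(candidate))
```

With a check like this, both tags from the example above resolve to one value, and a tag nobody has defined is rejected rather than silently added.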

I asked ChatGPT itself what role data normalization plays in its own use, and it told me:

Data normalization plays a crucial role in AI tools like ChatGPT. Normalization refers to the process of scaling and transforming data to improve the performance and stability of machine learning models. Specifically, normalization aims to bring the input data into a consistent range, so that the model can learn effectively from the data and make accurate predictions.

In the case of ChatGPT, which is a language model designed for natural language processing tasks such as conversation generation, data normalization can help to ensure that the model can learn from a wide range of input texts with varying lengths, structures, and formats. By normalizing the input data, ChatGPT can effectively process and analyze the text, identify patterns and relationships in the language, and generate coherent and meaningful responses to user queries.

Additionally, data normalization can help to prevent overfitting, which occurs when a model is too closely tuned to the training data and performs poorly on new or unseen data. By normalizing the input data, the model can learn to generalize patterns in the language rather than memorizing specific examples from the training data.

And to put an exclamation point on the whole topic, consider this article from NBC News, which reports that OpenAI (the startup behind ChatGPT) actually employs an "army of contractors" to assist with the necessary task of data labeling [normalization].

Quoting one of the interviewed contractors:

“We are grunt workers, but there would be no AI language systems without it,” one worker, Alexej Savreux, told NBC. “You can design all the neural networks you want, you can get all the researchers involved you want, but without labelers, you have no ChatGPT. You have nothing.”

Now, where the rubber meets the road: here's a practical call to action.

At a minimum, start developing a tagging schema for the documents, assets and information in your organization, and start now to find a way to apply those labels. AI *is* coming and it won't be stopped. I strongly encourage you to create a plan now so you're ready and positioned for success.

Are you a dreamer and a leader? Let's collaborate and see if we can create a cybersecurity-specific schema for AI systems. Get in touch with me through my LinkedIn profile and let's build something together.

FULL DISCLOSURE: The author of this article is a co-founder of a shadow startup that is looking at how artificial intelligence can improve current security processes. While AI has arguably been overhyped and exaggerated in the past, that doesn't change the fact that a properly designed and implemented AI system can radically benefit your security position [emphasis on properly]. Contact me privately on LinkedIn if you want to have a non-sales conversation about what that might look like.
