IAM in the age of GenAI


Introduction to IAM

Identity & Authentication

User authentication in IT systems is the process of verifying the identity of a user attempting to access a digital asset such as a network, storage, or a system. It is a critical component of cybersecurity and access management, ensuring that only authorized users gain entry to systems and data. Authentication typically relies on one or more factors: a first factor such as credentials (a username and password), a second factor such as a hardware token or push notification, or a third factor such as biometrics (fingerprints, facial recognition). Multi-factor authentication (MFA) combines two or more of these factors. The use of MFA significantly reduces the risk of compromised accounts: even if one factor (like a password) is breached, additional factors (such as a mobile authentication app or biometric verification) provide extra layers of security.

Authentication is considered the first line of defense against unauthorized access. By verifying user identities, it ensures that only legitimate users can access sensitive information, systems, and resources. This helps protect against data breaches, cyber-attacks, and other malicious activities.

Authorization

User authorization in IT systems refers to the process of granting or denying users access to specific resources, data, and functions within an information system based on their identity and permissions. It is a crucial aspect of cybersecurity and access management that ensures only authorized individuals can perform certain actions, thereby protecting sensitive information and maintaining the integrity and confidentiality of the system.


Authorization typically follows authentication, which verifies the user's identity. Once authenticated, the system checks the user's permissions against an access control policy to determine what resources the user can access and what actions they can perform. Common models for implementing authorization include Role-Based Access Control (RBAC), where permissions are assigned to roles rather than individuals, and Attribute-Based Access Control (ABAC), which uses user attributes and environmental conditions to make access decisions.

How it is done traditionally (RBAC & ABAC)


RBAC

Role-Based Access Control (RBAC) is a method for regulating access to computer or network resources based on the roles assigned to individual users within an organization. In an RBAC system, permissions are assigned to specific roles rather than to individual users. Users are then assigned to these roles, thereby inheriting the permissions associated with the roles. This model simplifies management by grouping permissions under roles, such as "administrator," "editor," or "viewer," which reflect the user's job responsibilities.
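The role-to-permission indirection described above can be sketched in a few lines. This is a minimal illustration, not a production design: the role names come from the paragraph above, while the permission names, the user names, and the functions permissions_for and is_allowed are assumptions made up for the example.

```python
# Hypothetical role-to-permission mapping; role names follow the examples in the text.
ROLE_PERMISSIONS = {
    "administrator": {"read", "write", "delete", "manage_users"},
    "editor": {"read", "write"},
    "viewer": {"read"},
}

# Hypothetical user-to-role assignments.
USER_ROLES = {
    "alice": {"administrator"},
    "bob": {"editor"},
    "carol": {"viewer"},
}


def permissions_for(user: str) -> set[str]:
    """Union of the permissions inherited from every role assigned to the user."""
    perms: set[str] = set()
    for role in USER_ROLES.get(user, set()):
        perms |= ROLE_PERMISSIONS.get(role, set())
    return perms


def is_allowed(user: str, permission: str) -> bool:
    """A user is allowed an action only if one of their roles grants it."""
    return permission in permissions_for(user)
```

Changing what an editor may do means editing one entry in ROLE_PERMISSIONS, and every user holding that role picks up the change. Unknown users get an empty permission set, so access is denied by default.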


Advantages

- Simplified Administration: RBAC reduces administrative work because roles are centrally managed. Instead of managing permissions for each user individually, administrators can assign roles to users, which automatically grants the appropriate permissions.

- Improved Security: By enforcing the principle of least privilege, RBAC ensures that users only have access to the resources necessary for their roles. This minimizes the risk of unauthorized access and potential security breaches.

- Compliance and Auditing: RBAC facilitates regulatory compliance by providing clear and manageable structures for access control. It simplifies auditing processes by ensuring that roles and their associated permissions are well-documented and easily reviewable.

Disadvantages

- Initial Setup Complexity: Implementing RBAC can be complex and time-consuming. Defining roles, assigning permissions, and mapping users to these roles require a thorough understanding of the organization's access needs.

- Role Explosion: In large organizations, the number of roles can proliferate rapidly, leading to management challenges. This situation, known as role explosion, can make the system cumbersome and difficult to manage effectively.

- Rigid Structure: RBAC can be inflexible in dynamic environments where users' access needs frequently change. Adapting to these changes requires continuous updates to role definitions and user assignments, which can be administratively burdensome.

Overall, while RBAC offers significant advantages in terms of security and manageability, it also comes with challenges that need careful planning and ongoing maintenance to ensure its effectiveness.

ABAC

Attribute-Based Access Control (ABAC) is an advanced method for managing access to resources by evaluating a set of attributes associated with users, resources, actions, and environmental conditions. Unlike Role-Based Access Control (RBAC), which grants permissions based on predefined roles, ABAC makes dynamic access decisions based on attributes such as user department, security clearance, resource classification, time of access, and location. For example, a policy in an ABAC system might allow access to financial records only to users in the finance department with a security clearance of "high" during business hours from company devices.
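The financial-records policy in the example above can be written as a single predicate over request attributes. The sketch below is illustrative only: the AccessRequest fields, the function name, and the 09:00 to 17:00 definition of business hours are assumptions, not part of any specific ABAC product.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class AccessRequest:
    """Attributes of the user, the resource, and the environment (all hypothetical)."""
    department: str
    clearance: str
    resource_type: str
    request_time: time
    company_device: bool


# Assumed definition of "business hours" for this sketch.
BUSINESS_START = time(9, 0)
BUSINESS_END = time(17, 0)


def can_read_financial_records(req: AccessRequest) -> bool:
    """Allow only finance users with high clearance, in business hours, on company devices."""
    return (
        req.resource_type == "financial_record"
        and req.department == "finance"
        and req.clearance == "high"
        and BUSINESS_START <= req.request_time < BUSINESS_END
        and req.company_device
    )
```

Unlike a role lookup, the decision is recomputed from the attributes of every request, so the same user can be allowed at 10:30 and denied at 20:00.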


Advantages

- Fine-Grained Access Control: ABAC offers a more granular approach to access control, allowing for highly detailed and context-sensitive policies. This flexibility enables organizations to define precise access rules tailored to specific needs.

- Dynamic and Contextual: ABAC evaluates attributes in real time, making it adaptable to dynamic environments where user access needs change frequently. This context-awareness enhances security by considering factors such as time of day, location, and current threat levels.

- Scalability: By using attributes rather than roles, ABAC can scale more efficiently in large and complex organizations. It reduces the need for a proliferation of roles and simplifies the management of permissions across diverse and evolving scenarios.

Disadvantages

- Complexity in Implementation: Setting up an ABAC system can be complex and resource intensive. It requires a thorough understanding of the attributes involved and how they interact, as well as the creation and management of detailed policies.

- Performance Overhead: The dynamic evaluation of attributes in real time can introduce performance overhead, especially in environments with high access request volumes. This can impact system performance and response times.

- Policy Management: Managing and maintaining a large number of complex policies can become cumbersome. Ensuring that policies remain consistent, up-to-date, and free of conflicts requires significant administrative effort and vigilance.

Overall, ABAC provides a robust and flexible framework for access control, suitable for environments requiring high levels of security and adaptability. However, its complexity and the need for careful management are important considerations for organizations planning to implement this model.

The problem with the introduction of GenAI

In a non-GenAI system, authorization can be controlled at one or all of the following three levels:

- UI or frontend tier: Authorization can be controlled at the level of the page or of a UI component.

- Middle tier: This is where a middle-tier service grants or denies the user access.

- Data tier: Access can be controlled at the database, schema, table, or even row level. Some databases have specific features to facilitate this.
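To make the tiered checks concrete, here is a small sketch of a middle-tier guard combined with a row-level filter at the data tier. Everything in it is hypothetical: the rows, the employee_portal role, and the function names are invented for illustration, and a real database would enforce the row filter itself rather than in application code.

```python
# Hypothetical table rows; "department" plays the role of a row-level security attribute.
EMPLOYEE_ROWS = [
    {"id": 1, "department": "hr", "note": "open requisitions"},
    {"id": 2, "department": "finance", "note": "quarterly forecast"},
    {"id": 3, "department": "hr", "note": "onboarding plan"},
]


def middle_tier_allows(user_roles: set[str], required_role: str) -> bool:
    """Middle-tier check: the service grants or denies the call as a whole."""
    return required_role in user_roles


def data_tier_rows(rows: list[dict], user_department: str) -> list[dict]:
    """Data-tier check: return only the rows the caller's department may see."""
    return [row for row in rows if row["department"] == user_department]


def fetch_notes(user_roles: set[str], user_department: str) -> list[str]:
    """Apply the middle-tier guard first, then the row-level filter."""
    if not middle_tier_allows(user_roles, "employee_portal"):
        raise PermissionError("caller lacks the employee_portal role")
    return [row["note"] for row in data_tier_rows(EMPLOYEE_ROWS, user_department)]
```

Each check sits at a known boundary where a request can be inspected. That is exactly what is missing once the data has been folded into a model's weights.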

But when you are calling a GenAI model, you are faced with the following two problems:

- The model has been trained on all kinds of data. This data can include client data, operational data, HR data, and financial data. Combining all this data in one model is by design: it is done so the model can cross-reference the data and create or infer new data from the training data. That is one of the main advantages of GenAI (Gen for generative).

- The model does not follow the typical three-tier architecture, in which each tier is divided into components, services, and tables. The inner workings of the AI model are hidden or unknown.

What about MLBAC (Machine Learning-Based Access Control)?

MLBAC is an advanced method of managing and regulating access to systems and data. It leverages machine learning algorithms to analyze patterns and behaviors in user access data to make real-time access control decisions. This approach aims to enhance security by dynamically adapting to emerging threats and user behaviors that traditional access control methods might miss. Leading IAM providers like Microsoft, Okta, and Ping Identity have been incorporating MLBAC into their offerings.
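As a rough intuition for behavior-based decisions, the sketch below scores a login request against a user's past login hours. It is a deliberate simplification: a plain z-score stands in for a trained model, the only feature is the hour of day (wrap-around at midnight is ignored), and the function names and the threshold of 3.0 are assumptions chosen for the example.

```python
from statistics import mean, pstdev


def anomaly_score(history_hours: list[int], request_hour: int) -> float:
    """Distance of the request hour from the user's usual hours, in standard deviations."""
    mu = mean(history_hours)
    # Fall back to 1.0 when the history has no spread, to avoid dividing by zero.
    sigma = pstdev(history_hours) or 1.0
    return abs(request_hour - mu) / sigma


def decide(history_hours: list[int], request_hour: int, threshold: float = 3.0) -> str:
    """Allow typical requests; ask for step-up authentication on unusual ones."""
    if anomaly_score(history_hours, request_hour) <= threshold:
        return "allow"
    return "step_up"
```

A user who normally logs in between 9 and 11 is allowed through at 10 and challenged at 3 in the morning, without anyone having written a rule about night-time access.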

Advantages of MLBAC:

1. Adaptive Security: MLBAC can continuously learn and adapt to new threats, providing a higher level of security.

2. Real-Time Decision Making: Machine learning algorithms can analyze access requests in real-time, enabling quicker and more accurate decisions.

3. Reduced False Positives: By understanding normal user behavior, MLBAC can reduce the number of false positives, minimizing disruptions to legitimate users.

4. Scalability: MLBAC can handle large volumes of data and users, making it suitable for organizations of any size.

5. Proactive Threat Detection: It can identify unusual and unseen patterns and potential security breaches before they cause significant harm.

Disadvantages of MLBAC:

Does not work on Generative AI models: MLBAC is able to predict new patterns, but it is not able to operate on content generated by a GenAI model. This is for two reasons:

1. Usually IAM is a gatekeeper in front of a system or data. The new problem with GenAI is that it generates new content on the backend, so it is as important to authorize the generated data as it is to authorize the requester.

2. IAM (including MLBAC), looking from the outside, has no concept of how the generated data was produced or what data it was based on.

Here is an example. An HR manager asks a GenAI model to predict the hiring pattern for next quarter. The GenAI model produces the prediction based on HR data and on financial data that is not public yet. This scenario can't be caught by an outside agent such as IAM, because it has no way of knowing that the predicted hiring pattern was based on financial data that the HR manager should not have access to.

There are other disadvantages, such as complexity, cost, and data dependency, that I will not delve into here.

Proposed solution

I propose that the IAM data be integrated into the GenAI model. Only the GenAI model can infer the relationship between the generated content and the raw data the model was trained on. The IAM data should be an integral part of the training data. It should also be an integral part of the verification harness. This can be done in two phases:

During training

Please take a look at Fig 1.

We can see here that:

1- During the training phase, we enrich the training data with IAM tags that tell the AI model who should have access to this data and when.

2- During the validation phase, the validation harness should also validate the IAM rules. Basically, we integrate an IAM compliance harness into the overall validation harness. The GenAI model is not deemed ready unless it passes both validation harnesses.
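The two steps above can be sketched as follows. This is one possible shape under stated assumptions, not a definitive implementation. The article does not specify how a model would consume IAM tags, so the tag format (a set of allowed roles), the ACCESS_DENIED refusal marker, the stub_model stand-in, and all function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

REFUSAL = "ACCESS_DENIED"  # hypothetical marker the model returns when it declines


@dataclass(frozen=True)
class TaggedExample:
    """A training example enriched with an IAM tag (assumed here to be a set of roles)."""
    text: str
    allowed_roles: frozenset


def tag_example(text: str, allowed_roles: set) -> TaggedExample:
    return TaggedExample(text=text, allowed_roles=frozenset(allowed_roles))


@dataclass(frozen=True)
class IamCase:
    """One IAM compliance test: should the model refuse this role for this prompt?"""
    prompt: str
    role: str
    must_refuse: bool


Model = Callable[[str, str], str]  # (prompt, role) -> generated text


def run_iam_harness(model: Model, cases: list) -> list:
    """Return the cases where the model's behavior violates the IAM expectation."""
    return [c for c in cases if (model(c.prompt, c.role) == REFUSAL) != c.must_refuse]


def model_is_ready(functional_failures: list, iam_failures: list) -> bool:
    """The model ships only if both validation harnesses report no failures."""
    return not functional_failures and not iam_failures


def stub_model(prompt: str, role: str) -> str:
    """Stand-in for a trained model: refuses revenue questions from non-finance roles."""
    if "revenue" in prompt.lower() and role != "finance":
        return REFUSAL
    return "generated answer"
```

Running run_iam_harness(stub_model, cases) against a list of IamCase objects yields the failing cases. Any non-empty result blocks release through model_is_ready, regardless of how well the model performs functionally.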

At usage time

In this phase, when a user submits a prompt to the GenAI model, the prompt is intercepted by an IAM filter that enriches the prompt with an RBAC or ABAC meta tag describing the IAM rules context. This allows the GenAI model to use this metadata to decide whether the user has the authority to access the newly generated content. Please refer to Fig 2 below for the workflow.
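A minimal sketch of such an IAM filter is shown below. The envelope format is an assumption: the iam_context wrapper, the field names, and the enrich_prompt function are invented for illustration, and a real deployment would need the model to be trained to read whatever format is chosen.

```python
import json


def enrich_prompt(prompt: str, user: dict) -> str:
    """Prefix the prompt with the caller's IAM context as a machine-readable meta tag."""
    iam_context = {
        "user_id": user["id"],
        "roles": sorted(user["roles"]),             # RBAC context
        "attributes": user.get("attributes", {}),   # ABAC context
    }
    header = json.dumps(iam_context, sort_keys=True)
    return f"<iam_context>{header}</iam_context>\n{prompt}"


# Example: an HR manager's prompt as the model would receive it.
hr_manager = {
    "id": "u-123",
    "roles": {"hr_manager"},
    "attributes": {"department": "hr", "clearance": "medium"},
}
enriched = enrich_prompt("Predict the hiring pattern for next quarter.", hr_manager)
```

The filter, not the user, supplies the IAM context, so it must sit on a path the user cannot bypass or edit.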


Logging

Another key element of this proposal is logging. It is important that we log the following:

1- During training and validation, we should log the results of all test cases, especially the failed ones. This is important for compliance (with regulations like GDPR and HIPAA) and for auditing.

2- In production, we should log all prompts and the content generated by the GenAI model. Again, this is for auditing and compliance reasons.
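Both kinds of log entries can be emitted as structured JSON lines so that auditors can query them later. The sketch below uses Python's standard logging module. The logger name, the event names, and the record fields are assumptions made for the example.

```python
import json
import logging
from datetime import datetime, timezone

audit_logger = logging.getLogger("genai.audit")  # hypothetical logger name


def audit_record(event: str, **fields) -> str:
    """Serialize one audit event as a JSON line with a UTC timestamp."""
    record = {"timestamp": datetime.now(timezone.utc).isoformat(), "event": event, **fields}
    return json.dumps(record, sort_keys=True)


def log_iam_test_result(case_id: str, passed: bool) -> str:
    """Training and validation: log every test case, failed ones at ERROR level."""
    line = audit_record("iam_test_result", case_id=case_id, passed=passed)
    audit_logger.log(logging.INFO if passed else logging.ERROR, line)
    return line


def log_generation(user_id: str, prompt: str, output: str) -> str:
    """Production: log each prompt together with the content the model generated."""
    line = audit_record("generation", user_id=user_id, prompt=prompt, output=output)
    audit_logger.info(line)
    return line
```

Logging failed IAM test cases at a higher severity makes them easy to filter out during a compliance review.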

Conclusion

The integration of Identity and Access Management (IAM) into the realm of Generative AI (GenAI) represents a significant advancement in managing and securing access to sensitive data. Traditional IAM systems, which function effectively within defined system architectures, encounter unique challenges when applied to GenAI models. These models, trained on diverse datasets that include client, operational, HR, and financial data, require innovative approaches to ensure robust security and compliance.


Integrating IAM with GenAI involves embedding access control mechanisms during both the training and usage phases of the AI model. During training, data is enriched with IAM tags, ensuring that access controls are ingrained within the model's framework. This phase also includes validating the model against IAM rules, ensuring it complies with access control policies before deployment.


At runtime, prompts submitted to the GenAI model are processed through an IAM filter that attaches RBAC or ABAC metadata. This metadata helps the model determine whether the user is authorized to access the generated content, thereby enforcing real-time access control.


Additionally, comprehensive logging during both training and production phases is crucial. Logging ensures that all actions, especially failed compliance checks, are recorded for auditing and regulatory purposes. This is essential for meeting compliance requirements such as GDPR and HIPAA.


By integrating IAM into GenAI, organizations can leverage the powerful capabilities of generative models while maintaining strict control over data access and ensuring compliance with security standards. This approach not only enhances security but also fosters trust and reliability in AI-driven systems.


In summary, the proposed IAM integration framework for GenAI models addresses the unique challenges posed by these advanced systems, ensuring that they operate securely and compliantly in diverse and dynamic environments.


Yatharth Gupta

founder & CEO, Codified

4 months

This is very interesting Nezar - I think there is a lifecycle management piece which is also interesting - how long to give access to data for

Nezar Gharbia

Senior Transformational Technology Leader | Strategic Planning, Team Leadership | Digital Transformation | Legacy modernization | Cloud Migration | VP of Software Engineering | CTO | Director of Software Development

6 months

Come on, friends. No one has an opinion? Have you faced this problem? How did you tackle it? What are your thoughts on what I am proposing? A little interaction would be really appreciated.


Great article


So what happens when IAM data changes, do you have to update the model?

