Putting Generative Agents behind authentication and controlling their access. From sandbox to production. Part 1
In the world of “Catch Me If You Can,” Frank Abagnale Jr. masterfully impersonates a pilot, a doctor, and a lawyer, exploiting vulnerabilities in systems of trust to achieve his goals. His story serves as a cautionary tale in the age of generative AI, where the potential for AI-generated content to mislead or blur the lines between reality and fabrication demands a vigilant and thoughtful approach to authentication, authorization, and auditability, so that it is possible to track, record, and review the actions and decisions made by AI systems.
Moreover, the novelty and complexity of these systems, coupled with the nascent state of best practices, patterns, and standards for their implementation and management, necessitate a cautious and iterative approach.
If you’re new to authentication, authorization, and auditability, here’s a simple way to think about them:
Authentication: It’s like showing your ID at the door to prove you are who you say you are.
Authorization: Once you’re in, it’s about checking if you have permission to enter specific rooms or use certain equipment.
Auditability: It’s like having security cameras and a logbook to keep track of who went where and what they did, just in case something goes wrong.
To ensure these powerful AI tools are used responsibly and effectively, we must adopt a multi-faceted strategy built on these three pillars.
This approach is particularly critical as organizations transition from traditional human-agent interactions to AI-powered solutions. Understanding how users adapt to these new systems and ensuring their seamless integration into existing workflows is paramount.
In this blog post, we’ll provide a technical deep-dive into building Generative Agents on Google Cloud, with a strong focus on incorporating user identity, access control, and auditing mechanisms. We’ll explore how these elements help to maintain security, transparency, and user trust.
While authentication and authorization are well-established domains with mature standards like OpenID Connect and OAuth2, these concepts can often seem daunting to GenAI developers. One of the ambitions of this article is to help bridge that gap and provide a practical, intuitive understanding of how these protocols work and how you can use them to build secure and user-friendly generative AI applications.
By following the step-by-step instructions and clarifications in this article, you’ll be able to build an authenticated agent that executes actions while respecting the permissions of the logged-in user. This article marks the first milestone in my series dedicated to the journey of transforming generative AI applications from sandboxed demos into robust, production-ready solutions on Google Cloud. In subsequent articles, I will build upon the foundational concepts established here, showing how to add more capabilities to this multi-agent generative AI application presented to users as [Knowledge Chat].
When implemented, our Agent is going to respond with a personalized welcome message for the logged-in user:
When designing the security layers for this generative AI application, I had to make some key decisions regarding authentication and authorization. Here’s a breakdown of the options:
Enterprise environments: For internal applications used by employees and partners, an identity provider tied to your organization's existing accounts, such as Google Cloud Identity, keeps user management in one place.
Modern web and mobile applications: For large-scale, consumer-facing applications with millions of users accessing your services via web and mobile platforms, cloud-based identity providers like Firebase Authentication, Google Identity Platform, or Keycloak (open source) offer better scalability and flexibility.
Our application will be used within an enterprise environment by employees, partners, and so on, rather than by consumers, so we will use Google Cloud Identity and consider it the Identity Provider for our internal application. What do I mean by Identity Provider? Selecting an Identity Provider boils down to a simple question: do you want users to create brand-new accounts specifically for your application, or would you prefer to let them leverage their existing accounts? Here we will leverage the user accounts already created for our Google Cloud organization.
What if your application is consumer-focused? Use Google Identity Platform for user management and enable social login, e.g. Sign in with Google, to allow users to sign in to your application using their Google accounts. This solution can scale to millions of users.
Cloud Identity and Cloud Identity Platform rely on Google’s authentication infrastructure and open protocols (OAuth 2.0, OpenID Connect). So let’s dive a little into these concepts:
These protocols are essential building blocks for modern authentication and authorization systems, enabling seamless and secure integration between applications and services. Understanding them will help you build more robust and user-friendly generative AI applications.
If you remember, in this article we will be building an authenticated agent that executes actions while respecting the permissions of the logged-in user. Your application (agent) will need to ask the user to grant it authority, so that it can obtain a special token, the access token, which works as a temporary, limited-access pass to act on user data whenever the agent asks external systems for help using so-called Tools. This is where we will use OAuth 2.0.
You might be wondering, “How do I connect my application to Cloud Identity so I can offload the complexities of authentication and authorization to Google?”
In essence, you need to build a trusted relationship between your application and Google’s comprehensive identity and access management infrastructure. This goes beyond just a simple user database; Google provides a robust system for not only storing user accounts but also handling the intricate processes of authentication, authorization, and secure communication.
The key to establishing this relationship lies in creating an “OAuth Client ID” within the Google Cloud Console’s “Credentials” section.
Choose [Web application] as Application Type and give it a name:
This unique identifier essentially acts as your application’s digital badge (with Client ID and Client secret as credentials), granting it the necessary permissions to interact securely with Google’s authentication system:
Importantly, this doesn’t mean your application gains unrestricted access. The specific scope of access, meaning the data your application can request and the actions it can perform on behalf of the user, is carefully defined during the authorization process, where the user explicitly grants consent. As developers, we need to remember that even though we’re building powerful AI agents, we can’t just assume access to user data. For instance, the agent we are building will need the email address of the logged-in user to execute actions.
We can’t just take it; we need to ask nicely! Once the user is logged in securely, our application will request permission to access the email address from the user’s Google Account profile.
As an application developer, you have the crucial responsibility of specifying the exact permissions your app requires. This determines the actions your application is authorized to perform on behalf of the user.
This list of permissions is presented to the user through a “consent screen”, where they can review and grant or deny access.
Here is how you declare “consent screen” for your application in Google Cloud console:
The consent screen serves as a transparent communication channel with your users. It’s where you provide essential details to help them understand why your application needs specific permissions:
Next, you’ll need to specify the precise permissions, or “scopes,” that your application needs to access or act on specific private user data from their Google Account. This list is presented to users on the consent screen, allowing them to grant or deny access.
These scopes are categorized based on their sensitivity, ranging from non-sensitive to restricted. In the upcoming article, we’ll expand our agent’s capabilities to include querying the Agent Builder Search application, which operates on Confluence and Jira. Naturally, this will necessitate user authorization to use that search application on their behalf. The required permissions fall under a sensitive scope, highlighting a crucial aspect of modern authorization: user consent. Even if your application integrates with multiple services, it can only access user data within those services if the user has explicitly granted permission. From a security standpoint, this approach aligns with the principle of least privilege: applications should only request the minimum necessary permissions to function correctly.
Also, remember: as a user, you can revoke permissions at any time if you change your mind or no longer trust an application.
One last thing I want to highlight here is that user prompts will sometimes require access to data that is not owned by the user. Think about salary reports, Jira tickets, and so on. Users generally have constraints on which data they can access; our agent should respect these constraints and generate answers using only data the user is authorized to access. This ensures transparency and respects data privacy.
Our conversational agent, deployed on Google Cloud, operates under the identity of a technical user known as a service account. This service account is responsible for executing actions and accessing Google Cloud resources on behalf of the agent. Google IAM roles and permissions govern which actions this service account is authorized to perform.
However, there are certain actions we don’t want to delegate to this service account. Specifically, when our agent needs to look up user details our application is authorized to access, or relay queries to the Agent Builder Search application, we want these actions to be performed directly on behalf of the logged-in user, not the agent’s service account.
Here’s where the user’s granted permissions come into play. Behind the scenes, our application will make API requests to the Google People API and the Agent Builder Search app, acting not as the service account but directly as the logged-in user. This is accomplished by including an access token in the API request headers (an Authorization: Bearer header), which essentially functions as a key representing the authenticated user.
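To make this concrete, here is a minimal sketch of what such a call looks like from Python, assuming you already hold a valid user access token and have the requests library installed; the userinfo endpoint is the same one our agent tool will call later.

import requests

def get_user_profile(access_token: str) -> dict:
    """Call Google's userinfo endpoint as the logged-in user."""
    response = requests.get(
        "https://www.googleapis.com/oauth2/v1/userinfo",
        # The Authorization: Bearer header is what makes the call run with
        # the user's identity instead of the agent's service account.
        headers={"Authorization": f"Bearer {access_token}"},
        timeout=10,
    )
    response.raise_for_status()
    return response.json()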
The OAuth2 standard defines the process of generating this access token. In fact, it defines a few such processes, called flows, that apply to different setups. The two that I think are most relevant to GenAI developers are the Authorization Code flow, in which the application acts on behalf of a logged-in user, and the service account flow, in which the application acts under its own identity. The first one is what we need here, and it plays out roughly like this:
You try to access my application [1]. I delegated authentication and authorization to Google, so you are redirected to the Google Sign-in screen [2], and once you provide your credentials and are authenticated, you are also asked to authorize the application to perform some actions on your behalf [3]. Once this is done, you are redirected to the so-called callback endpoint, which in my case is the welcome page of my application, with access to a chat window where I can communicate with the agent. Google appends to this callback request something called an AUTH CODE [4]. You can think of it as a temporary badge your application got from Google: whenever your application comes back for a drink included in your all-inclusive package, it can show the badge instead of going through registration again. In reality, your application will exchange the AUTH CODE for an access token [5, 6], which functions as a key representing the authenticated user and is understood by the services where our application needs to act on behalf of the logged-in user [7]. Our application will include this access token in the API request headers (Authorization: Bearer) when calling these services.
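For intuition, here is a bare-bones sketch of steps [5, 6], the code-for-token exchange, written directly against Google’s token endpoint with the requests library. This is roughly what the google_auth_oauthlib library we use later does for us under the hood; auth_code, client_id, client_secret and redirect_uri are placeholders you would supply.

import requests

def exchange_code_for_token(auth_code: str, client_id: str,
                            client_secret: str, redirect_uri: str) -> dict:
    """Exchange the AUTH CODE received on the callback for an access token."""
    response = requests.post(
        "https://oauth2.googleapis.com/token",
        data={
            "code": auth_code,                   # one-time code from step [4]
            "client_id": client_id,              # identifies our OAuth Client
            "client_secret": client_secret,
            "redirect_uri": redirect_uri,        # must match the registered callback
            "grant_type": "authorization_code",  # tells Google which flow this is
        },
        timeout=10,
    )
    response.raise_for_status()
    # The JSON payload contains the access_token (plus expires_in and,
    # for offline access, a refresh_token).
    return response.json()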
Let’s transition from theory to practice! It’s time to roll up our sleeves and start building our Agent application within Agent Builder.
Agent Builder is a low-code service available on Google Cloud for building multi-agent and search applications. Give your agent a name and select the region where it should be deployed. You are then ready to establish clear goals for the agent and define the reasoning rules it will follow to effectively address user queries.
For now, let’s start with a simple rule: greet the user and simultaneously utilize the get_user_profile TOOL to retrieve their name and email address.
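As an illustration only (the exact wording is up to you, and the ${TOOL: ...} reference syntax may differ in your version of Agent Builder), the agent’s goal and instructions could be phrased roughly like this:

Goal:
  You are Knowledge Chat, a helpful assistant for employees.

Instructions:
  - When the conversation starts, greet the user.
  - Use ${TOOL: get_user_profile} to retrieve the logged-in user's name and
    email address, and address the user by name in your greeting.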
To gather the necessary user information for a personalized greeting, we’ll leverage the Google People API, which is readily available on Google Cloud. This API allows us to securely access profile data for authenticated users. However, before proceeding, ensure that you’ve enabled the People API in your Google Cloud project.
Sample output from this API is as follows:
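For illustration, a response from this endpoint has roughly the following shape (all values below are made-up placeholders; the exact fields match the schema we define next):

{
  "id": "1234567890",
  "email": "jane.doe@example.com",
  "verified_email": true,
  "name": "Jane Doe",
  "given_name": "Jane",
  "family_name": "Doe",
  "picture": "https://lh3.googleusercontent.com/a/example-photo",
  "hd": "example.com"
}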
Now that we have the People API enabled, let’s define an OpenAPI Tool within Agent Builder to represent this API and allow our agent to interact with it securely.
The corresponding OpenAPI schema is as follows (no need to modify anything here if you want to use it in your agent):
openapi: 3.0.0
info:
  title: Google OAuth2 Userinfo API
  description: API to retrieve user information using Google OAuth2
  version: 1.0.0
servers:
  - url: https://www.googleapis.com/oauth2/v1
paths:
  /userinfo:
    get:
      operationId: getUserInfo
      summary: Get user information
      description: Retrieves user information associated with the provided OAuth2 access token
      responses:
        '200':
          description: Successful retrieval of user information
          content:
            application/json:
              schema:
                type: object
                properties:
                  id:
                    type: string
                    description: The user's unique Google ID
                  email:
                    type: string
                    description: The user's email address
                  verified_email:
                    type: boolean
                    description: Whether the user's email address has been verified
                  name:
                    type: string
                    description: The user's full name
                  given_name:
                    type: string
                    description: The user's given name
                  family_name:
                    type: string
                    description: The user's family name
                  picture:
                    type: string
                    description: A URL to the user's profile picture
                  hd:
                    type: string
                    description: The hosted G Suite domain of the user (if applicable)
        '401':
          description: Unauthorized - Invalid or missing OAuth2 access token
Here’s the crucial part: we need to configure the authentication method for this tool. To ensure that the API calls are made on behalf of the logged-in user, we’ll choose the “Bearer Token” option. This directly corresponds to the OAuth2 Authorization Code flow we discussed earlier, allowing us to securely inject the user’s access token into the API requests.
The “Token” input field requires a bit of explanation. Essentially, we need to provide the value of a session parameter. Think of a session as a representation of your ongoing conversation with the agent. For now, let’s input $session.params.token into this field. We'll clarify the reasoning behind this shortly.
To provide a user interface for our agent, we’ll leverage Dialogflow Messenger. Integrating it into your application is straightforward: simply copy the provided code snippet into your HTML page.
To seamlessly integrate Dialogflow Messenger into your application, ensuring it leverages Google services for authentication and authorization, we need to make this page a part of your broader application structure. This will allow us to securely manage user sessions and access tokens, enabling personalized and context-aware interactions with the AI agent.
I used PyCharm on my laptop and created a new Python project for this application. This is going to be a Flask application. Within this project, I established a subfolder named “templates” and placed the code for Dialogflow Messenger into a file named “index.html”:
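If you don’t have that snippet handy, it looks roughly like the example below. Treat it as a sketch only and copy the exact code, with your own project-id, agent-id and location values, from the Dialogflow Messenger integration page of your agent; the chat title simply reuses the Knowledge Chat name from earlier.

<link rel="stylesheet" href="https://www.gstatic.com/dialogflow-console/fast/df-messenger/prod/v1/themes/df-messenger-default.css">
<script src="https://www.gstatic.com/dialogflow-console/fast/df-messenger/prod/v1/df-messenger.js"></script>
<df-messenger
  location="YOUR_AGENT_REGION"
  project-id="YOUR_PROJECT_ID"
  agent-id="YOUR_AGENT_ID"
  language-code="en">
  <df-messenger-chat-bubble chat-title="Knowledge Chat"></df-messenger-chat-bubble>
</df-messenger>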
I’ve enhanced this HTML file with the following script section. It’s designed to capture a value dynamically injected into the template by our Flask application, which we’ll develop shortly. This variable represents the user’s access_token.
<script>
  var access_token = "{{ my_access_token }}";
</script>
Now comes a crucial element that connects the dots. This is where we assign the user’s access token to the agent’s session parameters. Remember the $session.params.token we placed in the Bearer Token field earlier? This code snippet is where we create that parameter and inject the user's access token, enabling the agent to act on their behalf when making API calls.
<script>
  const dfMessenger = document.querySelector('df-messenger');
  dfMessenger.addEventListener("df-user-input-entered", function(event) {
    const queryParameters = {
      parameters: {
        token: access_token,
      }
    };
    dfMessenger.setQueryParameters(queryParameters);
    console.log(queryParameters);
  });
</script>
This code snippet defines a listener that triggers whenever the user clicks the send button in the chat UI. It captures the user's input together with our parameters (including the access_token) and prepares them for processing by the agent.
Now, let's move on to the final piece of the puzzle: a simple Flask application that will handle the OAuth2 authentication flow. This will enable us to securely obtain the user's access token and inject it into the agent's session parameters, as we discussed earlier.
Here is where you will need the Client ID and Client secret generated when you created your OAuth Client credentials at the beginning of this article.
from flask import Flask, redirect, request, render_template
from google_auth_oauthlib.flow import Flow
from googleapiclient.discovery import build
import os

app = Flask(__name__)

GOOGLE_CLIENT_ID = "XXXXXXXXXXXX"
GOOGLE_CLIENT_SECRET = "XXXXXXXXXXXXXXXXXX"

SCOPES = [
    'openid',
    'https://www.googleapis.com/auth/cloud-platform',
    'https://www.googleapis.com/auth/userinfo.email',
    'https://www.googleapis.com/auth/userinfo.profile'
]

flow = Flow.from_client_config(
    client_config={
        "web": {
            "client_id": GOOGLE_CLIENT_ID,
            "client_secret": GOOGLE_CLIENT_SECRET,
            "auth_uri": "https://accounts.google.com/o/oauth2/auth",
            "token_uri": "https://oauth2.googleapis.com/token",
        }
    },
    scopes=SCOPES,
    redirect_uri="https://127.0.0.1:5000/callback",
)
The main entry point of our application is just the code that initiates the OAuth Flow.
@app.route('/')
def login():
    authorization_url, state = flow.authorization_url(
        access_type='offline',
        include_granted_scopes='true'
    )
    return redirect(authorization_url)
That covers the login route, but one small piece is still needed before you can run this code: an entry point that actually starts the Flask development server.
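A minimal sketch, assuming you use Flask’s built-in development server with an ad-hoc TLS certificate so that the https://127.0.0.1:5000 redirect URI works (the 'adhoc' option requires the pyopenssl package); place it at the bottom of the file, after the callback route defined below:

if __name__ == "__main__":
    # Serve over HTTPS locally so the https redirect URI registered with
    # Google matches; 'adhoc' generates a self-signed certificate on the fly.
    app.run(host="127.0.0.1", port=5000, ssl_context="adhoc", debug=True)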
Now, when you navigate to https://127.0.0.1:5000, you'll be instantly directed to the familiar "Sign in with Google" screen.
As we covered previously, once the user successfully authenticates, the authorization server redirects the flow back to a pre-configured endpoint (callback URL) within your application. This is a key step in the OAuth2 process, where your application receives an authorization code that it can then exchange for an access token.
redirect_uri="https://127.0.0.1:5000/callback",
Now, let’s implement that callback endpoint to handle the authorization code and complete the OAuth2 flow.
@app.route('/callback')
def callback():
    flow.fetch_token(authorization_response=request.url)
    if not flow.credentials:  # Check if credentials were fetched successfully
        return "Authentication failed"
    credentials = flow.credentials
    # Now you have the user's access token:
    access_token = credentials.token
    print(f"Access Token: {access_token}")
    return render_template('index.html', my_access_token=access_token)
It’s imperative to add this redirect URL to the list of “Authorized redirect URIs” within your OAuth Client ID configuration in the Google Cloud Console. This ensures that Google’s authorization server knows where to send the user after successful authentication, completing the OAuth2 flow securely.
Now, when you run the code and access the application, you should be greeted with the Dialogflow Messenger UI. Upon typing “HI,” you’ll initiate a conversation with the AI agent. Leveraging the Google People API and the granted permissions, the agent will look up the logged-in user’s profile information and craft a personalized welcome message tailored just for this user.
Finally, let’s not forget about auditability. If you’ve enabled Conversation History in your Agent Settings, you can review a complete record of all conversations, including our recent interaction. This capability is invaluable for understanding user behavior, troubleshooting issues, and ensuring compliance.
Summary:
This article demonstrates how to build an authenticated generative AI agent on Google Cloud. It explains the importance of user identity, access control, and auditing in managing AI applications. The article provides a step-by-step guide on setting up authentication using Google as the identity provider, implementing authorization through OAuth2 scopes, and enabling auditability through conversation history tracking. Additionally, it touches on the importance of user consent and transparency when accessing user data.
Please clap for this article if you enjoyed reading it. For more about Google Cloud, data science, data engineering, and AI/ML, follow me on LinkedIn.
This article is authored by Lukasz Olejniczak, AI Specialist at Google Cloud. The views expressed are those of the author and don’t necessarily reflect those of Google.