Unlocking Enterprise Image Descriptions: Harnessing the Power of Nasuni's S3 Edge API and LLMs for Precision Naming
In a prior article I showed how to use Llamafile with a local LLM to give images descriptive names, making them easier to find.
Today we will do something very similar, but this time we will use Nasuni's S3-compatible Edge API (for unstructured file data managed by Nasuni) and Anthropic's Claude to interrogate the images.
Imagine a company with directories full of images whose names are simply letters, numbers, or a combination of both (as they often are). This makes it very difficult for users to find and identify an image without opening each one.
We will use Claude's multimodal AI capabilities, via its API, to identify each image and rename it accordingly:
Requirements:
Starting Point:
An image directory on a Nasuni Edge Appliance. Because Nasuni provides a compatible S3 API, the directory can be browsed with any S3-compatible client, such as Cyberduck:
As Nasuni's S3 Edge is compatible with the S3 API, we can use Boto3, the Python library for interacting with S3. It enables us to perform various operations on S3 buckets and objects.
Code:
import boto3
import base64
import os
import io
from botocore.exceptions import ClientError
from anthropic import Anthropic

# Nasuni S3 edge configuration
NEA_ADDR = '<enter_address>'
A_KEY = 'S3_Edge_Key'
S_KEY = 'S3_Edge_Secret_key'
BUCKET_NAME = 'enter_bucket_name'
IMAGE_DIRECTORY = '<image_DIR>'  # Specify the directory (prefix) where images are located

# Initialize the Anthropic client
anthropic = Anthropic(api_key='anthropic_api_key')

# Create an S3 client pointed at the Nasuni Edge endpoint
s3_client = boto3.client(
    's3',
    endpoint_url=f'https://{NEA_ADDR}/',
    aws_access_key_id=A_KEY,
    aws_secret_access_key=S_KEY
)

# Map image extensions to the media type Claude expects
MEDIA_TYPES = {
    '.jpg': 'image/jpeg',
    '.jpeg': 'image/jpeg',
    '.png': 'image/png',
    '.gif': 'image/gif',
}

def test_connection():
    try:
        response = s3_client.list_buckets()
        print("Connection successful!")
        print("Available buckets:")
        for bucket in response['Buckets']:
            print(f"  {bucket['Name']}")
        return True
    except Exception as e:
        print(f"Connection failed: {e}")
        return False

def get_object_from_s3(key):
    try:
        response = s3_client.get_object(Bucket=BUCKET_NAME, Key=key)
        return response['Body'].read()
    except ClientError as e:
        print(f"Error reading object from S3: {e}")
        return None

def sanitize_filename(filename):
    if filename is None:
        print("Warning: Filename is None")
        return "unknown_file"
    filename = ''.join(c for c in filename if c.isalnum() or c in (' ', '_', '-'))
    return filename[:50]  # Limit to 50 characters to avoid overly long filenames

def get_image_description(image_data, media_type='image/jpeg'):
    if image_data is None:
        print("Error: image_data is None")
        return None
    try:
        encoded_image = base64.b64encode(image_data).decode('utf-8')
        response = anthropic.messages.create(
            model="claude-3-opus-20240229",
            max_tokens=300,
            messages=[
                {
                    "role": "user",
                    "content": [
                        {
                            "type": "image",
                            "source": {
                                "type": "base64",
                                "media_type": media_type,
                                "data": encoded_image
                            }
                        },
                        {
                            "type": "text",
                            "text": "Describe this image in a brief, concise manner suitable for a filename."
                        }
                    ]
                }
            ]
        )
        return response.content[0].text
    except Exception as e:
        print(f"Error getting description: {e}")
        return None

def process_image(key):
    try:
        print(f"Processing image: {key}")
        image_data = get_object_from_s3(key)
        if not image_data:
            print(f"Failed to get image data for {key}")
            return False
        extension = os.path.splitext(key)[1].lower()
        description = get_image_description(image_data, MEDIA_TYPES.get(extension, 'image/jpeg'))
        if description:
            sanitized_description = sanitize_filename(description)
            new_key = os.path.join(os.path.dirname(key), sanitized_description + extension).replace('\\', '/')
            print(f"Uploading object with new name: {new_key}")
            try:
                s3_client.put_object(Bucket=BUCKET_NAME, Key=new_key, Body=io.BytesIO(image_data))
                print(f"Successfully uploaded: {new_key}")
            except ClientError as e:
                print(f"Error uploading new object: {e}")
                return False
            print(f"Deleting original object: {key}")
            try:
                s3_client.delete_object(Bucket=BUCKET_NAME, Key=key)
                print(f"Successfully deleted: {key}")
            except ClientError as e:
                print(f"Error deleting original object: {e}")
                return False
            print(f'{key} renamed to {new_key}')
            return True
        else:
            print(f'No description returned for {key}')
            return False
    except ClientError as e:
        print(f'An error occurred processing {key}: {e}')
        return False

def list_objects(bucket, prefix=''):
    try:
        paginator = s3_client.get_paginator('list_objects_v2')
        page_iterator = paginator.paginate(Bucket=bucket, Prefix=prefix, Delimiter='/')
        for page in page_iterator:
            if 'Contents' in page:
                for obj in page['Contents']:
                    yield obj
            if 'CommonPrefixes' in page:
                for common_prefix in page['CommonPrefixes']:
                    # Recurse into subdirectories
                    yield from list_objects(bucket, common_prefix['Prefix'])
    except ClientError as e:
        print(f"Error listing objects: {e}")

def main():
    if not test_connection():
        print("Failed to connect to S3. Exiting.")
        return
    processed_count = 0
    try:
        print(f"\nListing objects in bucket: {BUCKET_NAME}, directory: {IMAGE_DIRECTORY}")
        for obj in list_objects(BUCKET_NAME, IMAGE_DIRECTORY):
            key = obj['Key']
            if key.lower().endswith(('.jpg', '.jpeg', '.png', '.gif')):
                print(f'Processing file: {key}')
                if process_image(key):
                    processed_count += 1
        print(f'Completed processing {processed_count} images')
        # Verify the results
        print("\nVerifying results...")
        remaining_images = list(list_objects(BUCKET_NAME, IMAGE_DIRECTORY))
        print(f"Number of objects remaining in {IMAGE_DIRECTORY}: {len(remaining_images)}")
        for obj in remaining_images:
            print(f"  {obj['Key']}")
    except Exception as e:
        print(f"An error occurred: {e}")
        import traceback
        traceback.print_exc()

if __name__ == "__main__":
    main()
Before running the script, be sure to add the Nasuni Edge address, the Nasuni (S3) bucket name, the Nasuni (S3) keys, and your Anthropic Claude API key.
Once this is done, we can run the script. It interrogates the bucket, finds the images in the directory, and sends each one to the multimodal Claude LLM for identification. Each image is then written back to the S3 bucket under its new name and the original object is deleted (as S3 does not have a rename capability).
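Since S3 has no native rename, the script re-uploads each image's bytes under the new key and deletes the old one. An alternative is a server-side copy followed by a delete, which avoids re-uploading the data (the bytes still have to be downloaded once for Claude). A minimal sketch, assuming the Nasuni S3 Edge endpoint supports the standard CopyObject operation (worth verifying on any S3-compatible store); `rename_object` is a hypothetical helper name:

```python
def rename_object(s3_client, bucket, old_key, new_key):
    """'Rename' an S3 object: server-side copy to the new key, then delete the original."""
    s3_client.copy_object(
        Bucket=bucket,
        Key=new_key,
        CopySource={'Bucket': bucket, 'Key': old_key},
    )
    s3_client.delete_object(Bucket=bucket, Key=old_key)
```

In `process_image`, this could replace the `put_object`/`delete_object` pair once the description is known.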
The Cyberduck Output after the script has been run:
Image example: Lifeguard tower on a beach at sunset.jpg
In summary, we successfully utilized Claude, Anthropic's multimodal model (also available through Amazon Bedrock), with Nasuni's S3-compatible API to satisfy an image-renaming use case.
It would be straightforward to extend this script by adding richer image metadata tagging (in addition to renaming the images).
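As a sketch of that extension, the LLM-generated description could be attached to the object as an S3 tag instead of (or alongside) renaming it, which keeps the original key stable and makes the description queryable. This assumes the Nasuni S3 Edge endpoint implements the S3 object-tagging API (`put_object_tagging`), which should be verified first; `tag_image` is a hypothetical helper:

```python
def tag_image(s3_client, bucket, key, description):
    """Attach an LLM-generated description to an existing object as an S3 tag.

    S3 tag values are limited to 256 characters, so the description is truncated.
    """
    s3_client.put_object_tagging(
        Bucket=bucket,
        Key=key,
        Tagging={'TagSet': [{'Key': 'description', 'Value': description[:256]}]},
    )
```

This could be called from `process_image` right after the description comes back from Claude.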
So what use cases could this be good for?
Media and Entertainment:
Content Organization:
Automate the process of identifying and renaming images in media libraries, making it easier for production teams to search and organize visual assets.
Metadata Enrichment:
Enhance image metadata by adding descriptive tags and accurate filenames, improving asset discoverability and reusability.
E-Commerce and Retail:
Product Catalog Management:
Automatically recognize and rename product images based on their content, ensuring consistent and accurate product listings.
Visual Search Optimization:
Improve the effectiveness of visual search tools by ensuring images are correctly labeled and tagged, enhancing the customer shopping experience.
Marketing and Advertising:
Campaign Asset Management:
Streamline the management of marketing campaign assets by automatically identifying and renaming images, making it easier to retrieve and deploy visual content.
Brand Consistency:
Ensure all marketing materials are consistently labeled and organized, maintaining brand integrity across various platforms.
Publishing and Media:
Editorial Asset Management:
Facilitate the organization of editorial images by automatically identifying and renaming them according to their content, aiding in the efficient production of publications.
Archive Management:
Improve the management of historical image archives by accurately tagging and renaming images, making it easier to retrieve and use archived content.
Healthcare:
Medical Imaging Management:
Automate the identification and renaming of medical images (e.g., X-rays, MRIs) based on their content, improving the organization and retrieval of patient records.
There is a lot of room for small AI use cases such as this to provide value within the enterprise.