Unlocking Enterprise Image Descriptions: Harnessing the Power of Nasuni's S3 Edge API and LLMs for Precision Naming

In a prior article I showed how to use Llamafile to run a local LLM that renames images with descriptive filenames, making them easier to find.

Today we are going to do something very similar, but this time we will use Nasuni's S3-compatible Edge endpoint (for unstructured file data managed by Nasuni) and Anthropic's Claude to interrogate the images.

Imagine that a company has directories full of images whose names are simply letters, numbers, or a combination of both (which they often are). This makes it very difficult for users to find and identify the images without browsing each one.

We will use Claude's multimodal AI capabilities, via its API, to identify each image and rename it:

End-to-end process

Requirements:

  • Python 3

  • S3 Edge configured and available

  • Anthropic API key

Starting Point:

Our starting point is an image directory on a Nasuni Edge. Because Nasuni provides a compatible S3 API, the directory can be interrogated via any S3-compatible client, such as Cyberduck:


Original Image listing in Cyberduck

As Nasuni's S3 Edge is compatible with the S3 API, we can use Boto3, the Python library for interacting with S3. It enables us to perform various operations on S3 buckets and objects.

Code:

import base64
import io
import mimetypes
import os

import boto3
from botocore.exceptions import ClientError
from anthropic import Anthropic

# Nasuni S3 Edge configuration
NEA_ADDR = '<enter_address>'
A_KEY = 'S3_Edge_Key'
S_KEY = 'S3_Edge_Secret_key'
BUCKET_NAME = 'enter_bucket_name'
IMAGE_DIRECTORY = '<image_DIR>'  # Specify the directory (prefix) where images are located

# Initialize the Anthropic client
anthropic = Anthropic(api_key='anthropic_api_key')

# Create an S3 client pointed at the Nasuni Edge endpoint
s3_client = boto3.client(
    's3',
    endpoint_url=f'https://{NEA_ADDR}/',
    aws_access_key_id=A_KEY,
    aws_secret_access_key=S_KEY
)


def test_connection():
    """Verify the S3 Edge endpoint is reachable by listing buckets."""
    try:
        response = s3_client.list_buckets()
        print("Connection successful!")
        print("Available buckets:")
        for bucket in response['Buckets']:
            print(f"  {bucket['Name']}")
        return True
    except Exception as e:
        print(f"Connection failed: {e}")
        return False


def get_object_from_s3(key):
    """Download an object's bytes from the bucket."""
    try:
        response = s3_client.get_object(Bucket=BUCKET_NAME, Key=key)
        return response['Body'].read()
    except ClientError as e:
        print(f"Error reading object from S3: {e}")
        return None


def sanitize_filename(filename):
    """Strip characters that are unsafe in filenames and cap the length."""
    if filename is None:
        print("Warning: Filename is None")
        return "unknown_file"
    filename = ''.join(c for c in filename if c.isalnum() or c in (' ', '_', '-'))
    return filename[:50]  # Limit to 50 characters to avoid overly long filenames


def get_image_description(image_data, key):
    """Ask Claude for a short, filename-friendly description of the image."""
    if image_data is None:
        print("Error: image_data is None")
        return None
    try:
        encoded_image = base64.b64encode(image_data).decode('utf-8')
        # Derive the media type from the file extension so .png and .gif
        # images are not mislabeled as image/jpeg
        media_type = mimetypes.guess_type(key)[0] or 'image/jpeg'
        response = anthropic.messages.create(
            model="claude-3-opus-20240229",
            max_tokens=300,
            messages=[
                {
                    "role": "user",
                    "content": [
                        {
                            "type": "image",
                            "source": {
                                "type": "base64",
                                "media_type": media_type,
                                "data": encoded_image
                            }
                        },
                        {
                            "type": "text",
                            "text": "Describe this image in a brief, concise manner suitable for a filename."
                        }
                    ]
                }
            ]
        )
        return response.content[0].text
    except Exception as e:
        print(f"Error getting description: {e}")
        return None


def process_image(key):
    """Describe one image, upload it under the new name, then delete the original."""
    try:
        print(f"Processing image: {key}")
        image_data = get_object_from_s3(key)
        if not image_data:
            print(f"Failed to get image data for {key}")
            return False

        description = get_image_description(image_data, key)
        if description:
            sanitized_description = sanitize_filename(description)
            new_key = os.path.join(os.path.dirname(key), sanitized_description + os.path.splitext(key)[1]).replace('\\', '/')

            print(f"Uploading object with new name: {new_key}")
            try:
                s3_client.put_object(Bucket=BUCKET_NAME, Key=new_key, Body=io.BytesIO(image_data))
                print(f"Successfully uploaded: {new_key}")
            except ClientError as e:
                print(f"Error uploading new object: {e}")
                return False

            print(f"Deleting original object: {key}")
            try:
                s3_client.delete_object(Bucket=BUCKET_NAME, Key=key)
                print(f"Successfully deleted: {key}")
            except ClientError as e:
                print(f"Error deleting original object: {e}")
                return False

            print(f'{key} renamed to {new_key}')
            return True
        else:
            print(f'No description returned for {key}')
            return False
    except ClientError as e:
        print(f'An error occurred processing {key}: {e}')
        return False


def list_objects(bucket, prefix=''):
    """Yield every object under the prefix, recursing into subdirectories."""
    try:
        paginator = s3_client.get_paginator('list_objects_v2')
        page_iterator = paginator.paginate(Bucket=bucket, Prefix=prefix, Delimiter='/')
        for page in page_iterator:
            if 'Contents' in page:
                for obj in page['Contents']:
                    yield obj
            if 'CommonPrefixes' in page:
                for common_prefix in page['CommonPrefixes']:
                    yield from list_objects(bucket, common_prefix['Prefix'])
    except ClientError as e:
        print(f"Error listing objects: {e}")


def main():
    if not test_connection():
        print("Failed to connect to S3. Exiting.")
        return
    processed_count = 0
    try:
        print(f"\nListing objects in bucket: {BUCKET_NAME}, directory: {IMAGE_DIRECTORY}")
        for obj in list_objects(BUCKET_NAME, IMAGE_DIRECTORY):
            key = obj['Key']
            if key.lower().endswith(('.jpg', '.jpeg', '.png', '.gif')):
                print(f'Processing file: {key}')
                if process_image(key):
                    processed_count += 1

        print(f'Completed processing {processed_count} images')

        # Verify the results
        print("\nVerifying results...")
        remaining_images = list(list_objects(BUCKET_NAME, IMAGE_DIRECTORY))
        print(f"Number of objects remaining in {IMAGE_DIRECTORY}: {len(remaining_images)}")
        for obj in remaining_images:
            print(f"  {obj['Key']}")
    except Exception as e:
        print(f"An error occurred: {e}")
        import traceback
        traceback.print_exc()


if __name__ == "__main__":
    main()


Before running the script, be sure to add the Nasuni Edge address, the Nasuni (S3) bucket name, the Nasuni (S3) keys, and your Anthropic Claude API key.
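Rather than hard-coding those values in the script, a safer pattern is to read them from environment variables. Here is a minimal sketch; the variable names (NEA_ADDR, S3_EDGE_KEY, S3_EDGE_SECRET) are illustrative assumptions, not part of the original script:

```python
import os

def load_s3_config(env=None):
    """Assemble the boto3 client settings from environment variables,
    falling back to obvious placeholders when a value is missing."""
    env = os.environ if env is None else env
    return {
        'endpoint_url': f"https://{env.get('NEA_ADDR', '<enter_address>')}/",
        'aws_access_key_id': env.get('S3_EDGE_KEY', ''),
        'aws_secret_access_key': env.get('S3_EDGE_SECRET', ''),
    }

# Usage: s3_client = boto3.client('s3', **load_s3_config())
```

This keeps credentials out of source control and lets the same script run against different Edge appliances without edits.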

Once this is done, we can run the code. It interrogates the bucket, finds the images in the directory, and sends them to the multimodal Claude LLM for identification. Each image is then written back to the S3 bucket under its new name and the original object is deleted (S3 does not have a native rename capability).
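Because the script already holds the image bytes in memory, it simply re-uploads them under the new key. An alternative sketch, assuming the Edge endpoint supports S3's CopyObject operation, performs the "rename" server-side so the image data never round-trips through the client:

```python
def rename_object(s3_client, bucket, old_key, new_key):
    """'Rename' an S3 object via server-side copy then delete.
    Unlike download/re-upload, the bytes stay on the endpoint."""
    s3_client.copy_object(
        Bucket=bucket,
        Key=new_key,
        CopySource={'Bucket': bucket, 'Key': old_key},
    )
    s3_client.delete_object(Bucket=bucket, Key=old_key)
```

For large images this saves a full upload per file; the trade-off is an extra dependency on the endpoint implementing CopyObject.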



The Cyberduck Output after the script has been run:


Post image processing listing in Cyberduck


Image example: Lifeguard tower on a beach at sunset.jpg



In summary, we successfully utilized Claude (a model that is also available via Amazon Bedrock) with Nasuni's S3-compatible API to satisfy an image-rename use case.

It would be pretty easy to extend this script by also adding richer image metadata tagging (in addition to renaming the image).
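One way to sketch that extension uses S3's object-tagging API (put_object_tagging), assuming the endpoint supports it; the tag-key names below are hypothetical:

```python
def build_tag_set(description):
    """Turn a Claude description into an S3 TagSet: one 'description'
    tag plus per-keyword tags. S3 allows at most 10 tags per object."""
    words = [w.strip('.,').lower() for w in description.split()]
    keywords = [w for w in words if w.isalnum()][:9]
    tags = [{'Key': 'description', 'Value': description[:256]}]
    tags += [{'Key': f'kw{i}', 'Value': w} for i, w in enumerate(keywords)]
    return {'TagSet': tags[:10]}

def tag_image(s3_client, bucket, key, description):
    # Note: put_object_tagging replaces any existing tags on the object
    s3_client.put_object_tagging(Bucket=bucket, Key=key,
                                 Tagging=build_tag_set(description))
```

Tags (unlike the filename) can hold the full description and individual keywords, so clients that support tag-based search can find images without relying on the name alone.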

So what use cases could this be good for?

Media and Entertainment:

Content Organization:

Automate the process of identifying and renaming images in media libraries, making it easier for production teams to search and organize visual assets.

Metadata Enrichment:

Enhance image metadata by adding descriptive tags and accurate filenames, improving asset discoverability and reusability.

E-Commerce and Retail:

Product Catalog Management:

Automatically recognize and rename product images based on their content, ensuring consistent and accurate product listings.

Visual Search Optimization:

Improve the effectiveness of visual search tools by ensuring images are correctly labeled and tagged, enhancing the customer shopping experience.

Marketing and Advertising:

Campaign Asset Management:

Streamline the management of marketing campaign assets by automatically identifying and renaming images, making it easier to retrieve and deploy visual content.

Brand Consistency:

Ensure all marketing materials are consistently labeled and organized, maintaining brand integrity across various platforms.

Publishing and Media:

Editorial Asset Management:

Facilitate the organization of editorial images by automatically identifying and renaming them according to their content, aiding in the efficient production of publications.

Archive Management:

Improve the management of historical image archives by accurately tagging and renaming images, making it easier to retrieve and use archived content.

Healthcare:

Medical Imaging Management:

Automate the identification and renaming of medical images (e.g., X-rays, MRIs) based on their content, improving the organization and retrieval of patient records.

There is a lot of room for small AI use cases such as this to provide value within the enterprise.
