Implementing Image Moderation with OpenAI's GPT-4V

Image moderation is a crucial part of maintaining the integrity and safety of digital platforms. It involves examining user-generated images and deciding whether they are suitable for display based on predefined policies. With the advent of artificial intelligence (AI), automating this process has become feasible. In this article, we discuss how to implement an image moderation system using OpenAI's GPT-4V (GPT-4 with Vision) model.

Before we start the implementation, we need to define our moderation criteria. For this example, we have five overarching policies:

MODERATION_CRITERIA = """
  1. Offensive or harmful content
  2. Politically sensitive material
  3. Religious or dogmatic content
  4. Controversial, polarizing content
  5. Hate speech and discriminating content
"""        

We have categorized potential violations into five broad classes for simplicity; these can be updated or extended for the specific use case.
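Because the policy list is likely to evolve, one option (purely illustrative, not from the original write-up) is to keep the policies in a plain Python list and generate the numbered block from it:

# Illustrative sketch: keeping the policies in a list makes it easy
# to add or remove criteria without hand-editing the prompt string.
POLICIES = [
    "Offensive or harmful content",
    "Politically sensitive material",
    "Religious or dogmatic content",
    "Controversial, polarizing content",
    "Hate speech and discriminating content",
]

MODERATION_CRITERIA = "\n".join(
    f"  {i}. {policy}" for i, policy in enumerate(POLICIES, start=1)
)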

To instruct the GPT-4V model, we have designed a specific prompt. The model assesses images against our moderation criteria, describes each image in detail, and flags any content that violates the criteria. It also lists weapons or similar objects found in the images. The model responses must follow a specific JSON format:

GPTV_MODERATION_PROMPT = """
  Topics to avoid:
  {{ MODERATION_CRITERIA }}
  Instructions:
  * Describe each image in detail.
  * Flag if the description violates any of the topics above.
  * I want you to respond in the following specific JSON format:
  {
      "flagged": boolean,
      "categories": list of categories violated,
      "description": "The image description",
      "reasoning": "The reason for flagging or not flagging",
      "weapons": "List any weapons",
      "ranking": "Rate between 1 and 10 how severely the image violates the topics"
  }
"""

The response will contain a true/false flag indicating whether the image was flagged, a list of violated categories, the image description, the reasoning behind the decision, any weapons identified, and a 1-10 ranking of how severely the image violates the criteria.
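Since downstream code depends on this schema, it is worth validating the reply before trusting it. The helper below is a minimal sketch (its name and the markdown-fence stripping are our own additions, not part of the original pipeline):

import json

# The keys we asked for in the prompt above.
EXPECTED_KEYS = {"flagged", "categories", "description",
                 "reasoning", "weapons", "ranking"}

def parse_moderation_response(raw: str) -> dict:
    """Parse the model's JSON reply and fail loudly if the schema drifts."""
    # Models sometimes wrap JSON in a markdown code fence; strip it first.
    cleaned = raw.strip().removeprefix("```json").removesuffix("```").strip()
    result = json.loads(cleaned)
    missing = EXPECTED_KEYS - result.keys()
    if missing:
        raise ValueError(f"moderation response missing keys: {missing}")
    return result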

Next, we generate the system prompt by rendering the prompt template with our criteria using the Chevron library:

import chevron

SYSTEM_PROMPT = chevron.render(
    GPTV_MODERATION_PROMPT,
    {"MODERATION_CRITERIA": MODERATION_CRITERIA}
)
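Chevron is a Python implementation of the Mustache templating language (installable with pip install chevron). A quick, optional sanity check confirms the {{ MODERATION_CRITERIA }} tag was filled in:

# Illustrative check: the rendered prompt should now contain the policies verbatim.
assert "Offensive or harmful content" in SYSTEM_PROMPT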

Now it is time to call the model. We send the system prompt together with a user message carrying the image to moderate (here, by URL), and specify the model and sampling parameters for the request:

import openai

openai.api_key = API_KEY

params = {
    'model': 'gpt-4-vision-preview',
    'temperature': 1.0,
    'top_p': 1.0,
    'frequency_penalty': 0.0,
    'presence_penalty': 0.0,
    'max_tokens': 1024,  # the vision preview model defaults to a very small completion limit
    'messages': [
        {'role': 'system', 'content': SYSTEM_PROMPT},
        # image_url holds the location of the image to moderate
        {'role': 'user', 'content': [
            {'type': 'image_url', 'image_url': {'url': image_url}},
        ]},
    ],
}

response = openai.ChatCompletion.create(**params)
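The reply arrives as a chat message whose content should be the JSON string we asked for. In the pre-1.0 openai library the text lives at response['choices'][0]['message']['content']; the sketch below reuses the parse_moderation_response helper from earlier:

# Extract the message text and validate it against the expected schema.
raw_content = response['choices'][0]['message']['content']
verdict = parse_moderation_response(raw_content)

if verdict['flagged']:
    print('Flagged for:', ', '.join(verdict['categories']))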

In the output, the model generates a description of the input image, checks whether it violates any of the given criteria, and flags it where necessary. Any weapons found in the image are listed in the 'weapons' field of the JSON. A sample response:

{
    "flagged": true,
    "categories": [
        "Religious or dogmatic content",
        "Controversial or polarizing content"
    ],
    "description": "The image is a vibrant and elaborate artwork that appears to be an advertisement. It showcases a blend of various cultural elements including some from Hinduism. The central figure resembles a Hindu goddess with multiple arms, adorned in traditional attire and jewelry. There are also individuals dressed in different traditional outfits, possibly representing different cultures. The background is filled with ornamental patterns, mandalas, domed buildings that resemble mosques, swirling designs, and various objects. There is also a large logo prominently displayed at the center-top. Flames, candles, and a variety of objects including soda cans and bottles are also present in the image.",
    "reasoning": "The image combines religious iconography with the branding, which might be viewed as controversial or polarizing.",
    "weapons": "None",
    "ranking": "5",
    "image_name": "https://<storage account>.blob.core.windows.net/generated/07c25cde-f5a1-40c4-b17a-390a7c57ddc5.png"
}
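How the verdict drives a decision is up to the platform. As one illustrative example (the thresholds and routing rules below are ours, not from the original system), clear violations can be rejected outright while borderline cases go to human review:

def route_image(verdict: dict) -> str:
    # 'ranking' comes back as a string per the prompt, e.g. "5".
    ranking = int(verdict['ranking'])
    if verdict['flagged'] and ranking >= 8:
        return 'reject'
    if verdict['flagged'] or verdict['weapons'] != 'None':
        return 'human_review'
    return 'approve'

Calibrating the ranking threshold against a small, hand-labelled sample of images is a sensible first step before relying on it in production.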

This system enables rapid, automated image moderation, filtering out content that violates any of the defined criteria. By integrating OpenAI's GPT-4V model, we are one step closer to making the digital space safer and more welcoming for all its users.

