Apple’s MGIE, Multimodal AI Glasses, and Top AI Advancements

Here we cover some of the most exciting recent updates in AI. Read on to find out how Apple is innovating with AI, what having AI superpowers in the form of glasses feels like, what is new in Microsoft Copilot, and much more.


The AI Highlights

  • Apple has released MGIE, an open-source model that edits images based on natural language instructions. MGIE can crop, resize, rotate, flip, add filters, and more.


  • Brilliant Labs announced “Frame”, a pair of glasses that are the first to have a built-in multimodal AI assistant capable of interacting with the digital and physical worlds. Perplexity also announced that it will integrate its chatbot into Frame.



  • Microsoft just pushed new Copilot updates, including a visual makeover, inline image editing, and more. The company also showed off its first Super Bowl ad in 4 years, putting AI center stage.



  • The Midjourney alpha is now available to users who have generated 10,000 images or more.



The Next Chapter in the Google Gemini Era: Gemini Advanced

The Google Gemini model has been a major talking point in the AI world lately, with attention on its language abilities, its performance in various tasks, and how it compares to other models. The discussion highlights several strengths of Gemini, including its ability to handle complex reasoning tasks, generate text in non-English languages, and outperform other models in certain benchmarks.

Gemini has now taken a step forward with its Advanced version, and Bard has been rebranded as Google Gemini. The rebranding brings a brand new landing page, new apps, and a lot more, along with subtle warnings about the existing Google Assistant.

Here are some details:

  • Gemini Advanced is a paid upgrade of Gemini, offering more features for $19.99/month with the Google One AI Premium Plan, which also gives you 2TB of storage and other perks.
  • If you're on a Google One family plan with AI Premium, your family members do not get access to Gemini Advanced, which may slow its growth as a product.
  • Gemini comes in three versions: Nano, Pro, and Ultra, each designed for different uses.
  • Gemini works with many Google tools like the Pixel 8 phone, Bard chatbot, and apps like Gmail, Docs, Slides, and Sheets.
  • Gemini is cheaper and easier to use than before, making it available to more people.
  • There's a Gemini mobile app that lets you do things like create photo captions and get answers to questions about articles.

While the results achieved by Gemini are impressive, there is also discussion about whether its capabilities have been exaggerated in certain areas. Overall, Gemini's results give a useful picture of its language abilities and its performance across various tasks.


Natural Language-Prompted Image Editing with MGIE

Apple and UC Santa Barbara researchers have introduced MGIE, an open-source AI system that enables image editing through natural language commands. Users describe the changes they want in plain language, and MGIE carries out the edit.

The system can handle common Photoshop-style adjustments like cropping, rotating, and filtering, as well as more advanced object manipulations, background replacement, and photo blending. MGIE can also optimize an image globally by adjusting properties such as brightness, contrast, and sharpness.

How does MGIE use natural language prompts to improve image editing?

MGIE uses multimodal large language models (MLLMs) to interpret text prompts and make pixel-level changes to photos.

The MLLMs are capable of cross-modal reasoning and responding appropriately to text, allowing MGIE to translate user commands into concise, unambiguous editing guidance.

For example, "make the sky more blue" becomes "increase the saturation of the sky region by 20%."
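To make the "increase the saturation of the sky region by 20%" guidance concrete, here is a minimal Python sketch of what such an edit means at the pixel level. It is an illustration only and not MGIE's actual method: the function name `boost_saturation`, the tiny hand-written image, and the hard-coded sky mask are all assumptions made for this example, whereas a real system would have to locate the sky region itself.

```python
import colorsys


def boost_saturation(pixels, mask, factor=1.2):
    """Return a copy of `pixels` with saturation scaled by `factor` where `mask` is True.

    pixels: rows of (r, g, b) tuples with channel values in 0-255.
    mask:   rows of booleans with the same shape as `pixels`.
    """
    edited = []
    for row, mask_row in zip(pixels, mask):
        new_row = []
        for (r, g, b), selected in zip(row, mask_row):
            if selected:
                # Work in HSV so saturation can change without touching hue or brightness.
                h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
                s = min(1.0, s * factor)  # clamp so saturation never exceeds 100%
                r2, g2, b2 = colorsys.hsv_to_rgb(h, s, v)
                new_row.append((round(r2 * 255), round(g2 * 255), round(b2 * 255)))
            else:
                new_row.append((r, g, b))
        edited.append(new_row)
    return edited


# Hypothetical 2x2 image: the top row is "sky", the bottom row is "grass".
image = [
    [(120, 160, 220), (120, 160, 220)],
    [(90, 140, 60), (90, 140, 60)],
]
sky_mask = [
    [True, True],
    [False, False],
]

edited = boost_saturation(image, sky_mask, factor=1.2)
```

Only the masked sky pixels change, and the grass row is returned untouched. This is why explicit guidance is useful: it names both the region to edit and the size of the adjustment, which a vague request like "more blue" does not.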

This design lets MGIE cover a wide range of image editing use cases, from simple adjustments to complex manipulations of objects and backgrounds.

MGIE can understand a wide range of natural language prompts for image editing, including basic prompts like:

  • "Crop the image"
  • "Rotate the image"
  • "Apply a filter"

It can also handle more complex tasks like "remove the background", "replace the sky", and "add a person to the photo".

The system can even understand ambiguous commands like "make it look better" and make appropriate edits based on the context of the image.
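As a rough illustration of the basic operations behind prompts like these, the Python sketch below implements crop, rotate, and flip on a small grid of pixel values and maps two fixed command strings to them. The lookup table `COMMANDS` and the helper `apply_command` are hypothetical stand-ins invented for this example. MGIE interprets free-form language with an MLLM rather than matching exact strings, so this shows only the editing step, not the language understanding.

```python
def crop(grid, top, left, height, width):
    """Keep a height x width window whose top-left corner is at (top, left)."""
    return [row[left:left + width] for row in grid[top:top + height]]


def rotate_clockwise(grid):
    """Rotate the grid 90 degrees clockwise."""
    return [list(column) for column in zip(*grid[::-1])]


def flip_horizontal(grid):
    """Mirror the grid left to right."""
    return [row[::-1] for row in grid]


# Hypothetical command table: exact-match strings stand in for real language understanding.
COMMANDS = {
    "rotate the image": rotate_clockwise,
    "flip the image": flip_horizontal,
}


def apply_command(grid, command):
    """Look up a supported command (case-insensitive) and apply it to the grid."""
    key = command.strip().lower()
    if key not in COMMANDS:
        raise ValueError(f"unsupported command: {command!r}")
    return COMMANDS[key](grid)


# A 2x3 "image" where each number stands for one pixel.
grid = [
    [1, 2, 3],
    [4, 5, 6],
]

rotated = apply_command(grid, "Rotate the image")
flipped = apply_command(grid, "Flip the image")
cropped = crop(grid, top=0, left=1, height=2, width=2)
```

Cropping needs extra parameters (which window to keep), so it is called directly here instead of going through the command table. Working out those missing details from a short request is the part a system like MGIE is meant to handle.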

Benefits of natural language prompts over traditional image editing methods

  1. Ease of Use: Natural language prompts make image editing more user-friendly. People can simply describe what they want, making the process more intuitive than traditional methods that require technical knowledge of editing software.
  2. Enhanced Flexibility: With text-guided manipulation, users aren't limited to predefined tasks (like colorization or inpainting). They can ask for a wide range of edits, from simple adjustments to complex transformations, reflecting their specific needs more accurately.
  3. More Accessible: By using natural language, these systems lower the barrier to advanced image editing, making powerful tools available to a wider audience without the need for specialized training.
  4. Intuitive Interaction: The conversational approach of using language to guide edits creates a more natural and engaging user experience, making it easier for users to achieve their desired outcomes.


So, this is it.

Thanks for reading the PixelBin Newsletter! There is much more to these updates. Read the newsletter to discover the most useful AI tools for you.

Subscribe to our newsletter here and get more insights.

We’ll be back next week with fresh updates on AI, which we think you’ll love.

Meanwhile, if you have something to tell us, we are all ears.

Have suggestions or questions for us? Reach out to us at [email protected].

Follow us for everyday highlights on Twitter, LinkedIn, and Instagram. Join our PixelBin Discord community and engage in conversation with fellow AI enthusiasts.
