AI Avatar - A Brief Analysis of Photo-to-Image AI Models

Chenxi Wang, Ph.D.

Investor, Cyber expert, Fortune 500 board member, Venturebeat Women-in-AI award winner. I talk about #cybersecurity #venturecapital #diversity #womenintech #boardgovernance

Published: October 7, 2024

Recently, I used Meta AI to generate an avatar image with the prompt "Imagine me as a rock chick with wind-tossed hair."

Prompt: "Imagine me as a rock chick with wind-tossed hair" (Meta AI)

As soon as I posted the image on Facebook, my DMs blew up.

"This is fantastic!" my friends exclaimed!

"Which service did you use to generate this?" many asked.

Meta AI was indeed easy to use. You upload a selfie and, seconds later, you can start using prompts to generate images.

This image is probably my favorite. The background and the apparel are on point, down to the details of the hair. Of course my arms are thinner and the body is leaner than in reality, but hey, who is checking?

I tried a few other prompts, like "Imagine me as a conference speaker", which generated the image below.

Prompt: "Imagine me as a conference speaker" (Meta AI)

Hmm... It does look like I am giving a talk, but why is the audience facing away from me?

And here is another interesting one. The prompt is: "Imagine me as Carrie Bradshaw". Carrie is perhaps my favorite TV show character. And we got this:

Prompt: Imagine me as Carrie Bradshaw (Meta AI)

I was actually fairly impressed by this one. With the signature curls and the whimsical outfit, the image captured the essence of being Carrie, albeit with my face.

This got me thinking: what else is out there that can generate images based on a photo and prompts? So I went on a bit of a research mission, and this is what I found:

Text to image services (prompt only)

Microsoft Designer and DALL·E both fall into this category. These apps, built on text-to-image models such as DALL·E or Stable Diffusion, generate images from prompts alone; there is no option to input an image, so they cannot generate images with your likeness.

Image editing apps (image only)

Apps in this category take images as input, but they only let you modify certain aspects of the image, such as the background or your eye color. Some of them are scenario-based: for example, they can create professional headshots, put you in a Christmas photo, or place you in an anime setting. But there is no general prompt flexibility. Examples include Aragon.ai, fotor.com, aiavatar.com, lightxeditor.com, and Canva.com. In my opinion, these are AI for editing, not true image generation.

Image generation with limits (image only)

Lensa.ai is an example in this category. These apps will generate many images based on an uploaded photo but don't allow prompts. In other words, you can't tweak the images they generate.

For my purposes, I was targeting apps that can take both images and prompts. I also avoided paid services, trying only free trials or free services. I did find that many apps, like Photoleapapp.com, advertise free trials, but when you click through, they ask you to buy credits before you can generate even a single image.
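The categories above can be sketched as a small lookup in Python. This is only an illustration: the `Service` class, its two flags, and the `photo_plus_prompt` helper are hypothetical names, and the flags simply reflect how each service is described in this article (Imagineme.ai is covered next), not any official feature list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    """Hypothetical record: which inputs a service accepts, per this article."""
    name: str
    accepts_image: bool
    accepts_prompt: bool


# Flags mirror the article's descriptions only; they are not vendor-verified.
SERVICES = [
    Service("Microsoft Designer", accepts_image=False, accepts_prompt=True),
    Service("DALL·E", accepts_image=False, accepts_prompt=True),
    Service("Aragon.ai", accepts_image=True, accepts_prompt=False),
    Service("Lensa.ai", accepts_image=True, accepts_prompt=False),
    Service("Meta AI", accepts_image=True, accepts_prompt=True),
    Service("Imagineme.ai", accepts_image=True, accepts_prompt=True),
]


def photo_plus_prompt(services):
    """Return the names of services that take both a photo and a free-form prompt."""
    return [s.name for s in services if s.accepts_image and s.accepts_prompt]


if __name__ == "__main__":
    print(photo_plus_prompt(SERVICES))
```

Only the services with both flags set can produce a prompted image that carries your likeness, which is why the comparison that follows is limited to those two.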

The one that caught my eye was Imagineme.ai. From the website, it looks like a service that is comparable to Meta AI. The image quality looks superb. Its tagline: "Generating stunning images of yourself with one line of text."

Imagineme.ai is a paid service, but a friend of mine had some credits and he let me use them. Here is my analysis of Imagineme vs. Meta AI.

- Meta AI works on a single selfie. Of course, Meta may have access to many photos of you, so the output might not be based on a single photo. Imagineme, on the other hand, asks you to upload 20 photos.

- Meta AI works almost instantaneously after you upload the photo. Imagineme, however, takes 10-20 hours to train a model using your photos. Fine-tuning takes a bit less time, but can still take several hours.

Here are some side-by-side comparisons between Meta AI's generations and those of Imagineme.

Prompt: Imagine me as a rock chick with wind-tossed hair. Left: Meta AI, Right: Imagineme

Left: Meta AI, Right Imagineme (2000 steps)

While Meta AI smoothed the lines on my face and gave me a more youthful look, Imagineme accentuated the wrinkles on my face and made me look like an aging Asian Bon Jovi. "At least it gave me some serious guns on the arms," I chuckled.

Prompt: Imagine me at a cocktail party with a fancy dress. Left: Meta AI, Right: Imagineme

Left: Meta AI, Right: Imagineme AI (2000 steps)

The left is a slightly younger version of me in a 1920s dress and headpiece. Why the 1920s? It was not in the prompt. I guess the AI decided a 1920s-style dress was desirable. The right is ... my aunt!

Prompt: Imagine me very sad about something (Left: Meta AI, Right: Imagineme)

Meta AI didn't quite get "sadness": my sad face was more like a "subdued" face. Imagineme's sad face, on the other hand, was frightening. Am I 80?

I also asked Imagineme to generate me as Carrie Bradshaw. I fully expected an aging version of me as Carrie. However, I did not quite get that.

Prompt: Imagine me as Carrie Bradshaw (Imagineme 1800 steps)

What I got was something rather interesting -- instead of my face (aged or not) with Carrie's hair and clothing, the service had combined Asian features with Sarah Jessica Parker's distinct facial structure. It is neither I nor Sarah Jessica Parker.

At this point, I was determined to find out why Imagineme's models, after getting 20 of my photos, were this far off the mark. So I looked around in the app and eventually found that I could change the number of fine-tuning steps the model goes through to improve the image quality.

By default, the model fine-tunes in 2000 steps, but the user can manually configure a different number of steps. I experimented with it -- below 1500 steps, the images bore no resemblance to me. Above 1500, the images started to take on some of my features. After quite a few tweaks, 1800 seemed to be the optimum, where I didn't look too old and the images still looked somewhat like me.

Here are some of the images Imagineme generated after I changed the fine-tuning steps.

Imagine me as Carrie Bradshaw:

This version of the generated Carrie Bradshaw is better than the previous one. It still does not look like me, but at least it is not an Asian Sarah Jessica Parker.

I repeated the "Imagine me as a rock chick with wind-tossed hair" prompt with the 1800-step-tuned model. The result was slightly better and less frightening than the one from the 2000-step model.

Left: Meta AI, Right: Imagineme AI (1800 steps)

The rocker image the 1800-step model generated was more youthful and looked more like me than the one from the 2000-step model, but it was not as attractive as the Meta AI one.

Another prompt I tried was "Imagine me as a high school student" to see how well the models could generate a younger version of me.

Left: Meta AI, Right: Imagineme (1800 steps)

Both models rendered a younger woman in a classroom setting. The Meta AI image looked more like me, while the Imagineme image looked like someone else, though that smile was definitely mine.

It's been interesting experimenting with the different photo+prompt image generation apps. I have to say that between Imagineme and Meta AI, there is a clear winner. With soft edges and gentle lighting, Meta AI's images have an overall more sophisticated feel and higher quality than Imagineme's. They also highlight the subject's good features and smooth out the sharp edges. I did like the fact that Imagineme lets you customize the number of fine-tuning steps. However, its images overall were hit and miss.

While AI can generate all kinds of scenarios at the beck and call of a prompt, I have not found a model that is a match for the real thing. Below is a photo I took three weeks ago in Zurich: no filters, no AI touch-ups, wrinkles and all. No AI can generate that sparkle in the eyes of the real photo. Well, not yet anyway.

Thibault de Becdelievre

Business Developer at Seelab.ai | I help companies and creative teams save time on their visual creation using artificial intelligence to boost their communication and productivity

3 months ago

Such a fun and insightful experiment, Chenxi! It’s interesting to see how different AI models interpret prompts and features. At Seelab.ai, we often think about balancing consistency and realism in AI-generated images. What’s your take on where these tools should draw the line between enhancing reality and altering identity?

Sujata G.

Digital Creative Designer | Researcher | Prompt Engineer | Staff Engineer

5 months ago

Very interesting creations using AI Chenxi ! And quite a fun project to get started with using AI tools.

Shelly Liu

Senior Cloud Engineer at CIBC

5 months ago

Definitely gorgeous

Ronit Polak

Engineering Executive | Diversity & Inclusion Leader | Board Director | Silicon Valley Business Journal Women of Influence 2019

5 months ago

Fascinating!
