How Far Are We From Being Able to Generate Whatever 3D Objects On the Fly?
Welcome to my bi-weekly newsletter, “I’ll Keep This Short,” where I navigate the less-traveled paths of AI, building new insight beyond the banal, mainstream chatter.
Step Into a New Dimension
Walking across your living room floor, coffee in hand, you ready yourself to sit down in your nice, relaxing avocado-shaped easy chair for some well-earned rest after a hard day’s work of writing prompts and generating images.
Dall-E 2 Prompt: “3d render of a chair that looks like an avocado digital art”
While you sit there drinking your very real coffee, staring off into space, probably what you don’t think to yourself is, “Whew, I sure am glad that this chair in fact exists in physical reality.”
But in fact that’s precisely where we’re at with the vast majority of AI-generated content on the internet today. We’re a heck of a long way off from creating actual 3D content on the fly. Even the avocado chair above, while it certainly looks three-dimensional, is really a two-dimensional rendering produced by a model trained on two-dimensional snapshots of 3D renders that humans made.
For those who have used 3D modeling software, it’s likely eminently clear what I am talking about. 3D CAD software has been ubiquitous since the 1980s, used to model virtually everything, from furniture here on Earth to furniture on the International Space Station.
Since I’m not sure how familiar most people reading this article are with the nuances of 3D objects vs. 3D pictures in 2D space, I drew a little demonstration below to show what I mean by the difference between illusory 3D objects and real 3D objects.
If you rotate a real 3D object in some kind of software environment, you should be able to see the other side of it. If you rotate a picture of a 3D object, an illusory 3D object, you just see the other side of the picture frame, and the illusory object itself does not change.
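To make that concrete, here is a minimal sketch of my own (not from any of the tools discussed here) using Python and NumPy: rotating a genuine 3D object moves points through depth, so the far side comes around to face you, which is something a flat picture simply cannot do.

```python
import numpy as np

# A toy "real" 3D object: the eight corner points of a cube centered at the origin.
cube = np.array([[x, y, z] for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)], dtype=float)

def rotate_y(points, degrees):
    """Rotate 3D points around the vertical (y) axis."""
    t = np.radians(degrees)
    rot = np.array([
        [ np.cos(t), 0.0, np.sin(t)],
        [ 0.0,       1.0, 0.0      ],
        [-np.sin(t), 0.0, np.cos(t)],
    ])
    return points @ rot.T

# Spin the cube 180 degrees: corners that started at the "back" (z = +1)
# end up at the "front" (z = -1). A flat picture has no z coordinate to swap.
back_corners = cube[cube[:, 2] > 0]
print(rotate_y(back_corners, 180)[:, 2])  # approximately [-1. -1. -1. -1.]
```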
So where the heck are we as a species in terms of being able to generate some sweet, sweet, real 3D objects? There’s got to be tons of uses for text-generative 3D objects, from being able to generate and 3D-print your own personal toe-door-opener things you see in bars, to a plastic bust of Karl Marx that fits over the end of your toothpaste tubes, so that Karl Marx can spit toothpaste onto your brush every night.
What Peak Performance Looks Like
This is what peak three-dimensional performance looks like. You may not like it, but this is one of the earliest and most iconic 3D objects defined purely with mathematics: the Utah Teapot, first modeled in 1975 by Martin Newell, a researcher at the University of Utah.
Short and stout, with a handle and a spout; when you tip it over, you realize it’s actually defined via Bézier curves rather than just a bunch of points manually configured by hand in a grid. Bézier curves are smooth parametric curves defined by simple polynomial functions of a handful of control points, like these. You can imagine how stitching several of these together in a defined way can be used to create objects.
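For the curious, here is a quick sketch (in plain Python/NumPy rather than any particular CAD package) of evaluating a single cubic Bézier curve from four control points; the teapot is built from patches of exactly this kind of math.

```python
import numpy as np

def cubic_bezier(p0, p1, p2, p3, n=50):
    """Evaluate a cubic Bézier curve defined by four control points.

    B(t) = (1-t)^3 P0 + 3(1-t)^2 t P1 + 3(1-t) t^2 P2 + t^3 P3,  t in [0, 1]
    """
    p0, p1, p2, p3 = map(np.asarray, (p0, p1, p2, p3))
    t = np.linspace(0.0, 1.0, n)[:, None]
    return ((1 - t) ** 3 * p0
            + 3 * (1 - t) ** 2 * t * p1
            + 3 * (1 - t) * t ** 2 * p2
            + t ** 3 * p3)

# A single curved, spout-like profile in 2D; the teapot stitches together
# patches of such curves (Bézier surfaces) in 3D.
curve = cubic_bezier([0, 0], [1, 2], [3, 2], [4, 0])
print(curve.shape)  # (50, 2)
```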
Let’s contrast this to an illusory 3D avocado teapot, as interpreted by Dall-E 2, just for kicks:
While cool, it’s a hallucination without any real physical embodiment. That is to say, there isn’t really a 3D point cloud dictating how those shadows fall and how that light bounces off the surface. There would be no way to “rotate” these on the screen; they are purely illusory 3D objects, not “real” 3D objects.
The above gives us a foundational understanding for where 3D graphics came from in the first place. So how about generative 3D objects?
Enter Shap-E
Perhaps you’ve heard about Dall-E; how about Shap-E? Recently, a paper came out from OpenAI researchers called Shap-E, which is a 3D object generator. From the paper, Shap-E is an improvement over a previous model called Point-E. Whereas Point-E modeled point clouds, Shap-E uses something called Neural Radiance Fields (NeRF), which represent a scene as an implicit function. Never mind exactly what NeRF is for a moment.
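To give a rough flavor of what “represents a scene as an implicit function” means, here is a toy Python stand-in of my own, a hard-coded sphere occupancy function rather than the actual learned NeRF network: instead of storing an explicit list of points, you store a function you can query at any 3D coordinate.

```python
import numpy as np

def sphere_occupancy(points, center=(0.0, 0.0, 0.0), radius=1.0):
    """Toy implicit representation of a scene: a *function* that answers,
    for any 3D coordinate, "is there stuff here?", instead of an explicit
    list of points. A NeRF swaps this hand-written rule for a learned
    neural network that also returns color and density for rendering."""
    points = np.asarray(points, dtype=float)
    dist = np.linalg.norm(points - np.asarray(center), axis=-1)
    return (dist <= radius).astype(float)

# The "scene" can be queried anywhere, at any resolution -- there is no fixed grid.
print(sphere_occupancy([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]]))  # [1. 0.]
```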
What you get as a result of NeRF, in contrast to point clouds, is something like this:
As opposed to point cloud images, which are very detailed like the following but lack a realistic interpretation of the surface:
My Attempt At Running Shap-E in a Colab Notebook
I was able to render an image of a dog with a HuggingFace demo:
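For anyone who wants to try this themselves, the sketch below shows roughly what that looks like with the Hugging Face diffusers library’s Shap-E pipeline (as of mid-2023; exact arguments and model IDs may have changed, so treat this as a starting point rather than gospel).

```python
# pip install diffusers transformers accelerate
import torch
from diffusers import ShapEPipeline
from diffusers.utils import export_to_gif

# Load the Shap-E text-to-3D pipeline from the Hugging Face Hub.
pipe = ShapEPipeline.from_pretrained("openai/shap-e", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

# Generate a 3D asset from a text prompt and save a turntable GIF of the rendered views.
frames = pipe(
    "a dog",
    guidance_scale=15.0,
    num_inference_steps=64,
    frame_size=256,
).images[0]
export_to_gif(frames, "shap_e_dog.gif")
```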
Mathematical Background
Point-E Math
I'm skipping the math section for the LinkedIn article. To view the math, go to the Substack version of this article.
Point-E Result
As a result, Point-E was able to generate models which are very detailed, like the following avocado chair:
Shap-E Math
I'm skipping the math section for the LinkedIn article. To view the math, go to the Substack version of this article.
Shap-E Result
Shap-E was able to generate models which were “pleasing” and “smooth,” and which, unlike Point-E’s, did not skip out on parts of the model, like the following:
What About Just Writing Rendering Code with a Large Language Model?
As I have mentioned in a previous post, large language models have a problem with factual knowledge alignment, and this is especially true for more specific, niche topics.
We can observe that the best-in-class LLM as of May 2023, GPT-4, does not deliver even the simplest everyday object:
Create a house in OpenScad
Imagine trying to build an actual house with this technology. Gah! What happened to my roof? I appreciate that my car stays dry, but really, it would have been much better to protect my living room.
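For contrast, here is what a correct code-built house could look like. This is my own sketch using Python’s trimesh library rather than OpenSCAD, with made-up dimensions, just to show the idea of composing primitives so that the roof actually sits over the walls.

```python
# pip install trimesh
import trimesh

# A box for the walls and a square pyramid (4-sided cone) for the roof.
walls = trimesh.creation.box(extents=[4.0, 4.0, 3.0])             # centered at the origin
roof = trimesh.creation.cone(radius=3.0, height=2.0, sections=4)  # wide enough to overhang
roof.apply_translation([0.0, 0.0, 1.5])                           # sit the roof on top of the walls

# Combine the two primitives and export a mesh a slicer / 3D printer can use.
house = trimesh.util.concatenate([walls, roof])
house.export("house.stl")
```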
There’s a Market for That