AI and Image Understanding
Brian Myers
Design Technology Leader | AEC & BIM Innovator | Speaker on Human-Centered Digital Transformation
Many people have used ChatGPT for a variety of tasks; I have even used it to pull information out of images. But the test I ran this morning is, I believe, interesting enough to share, because it helps us understand what can be accomplished NOW. In this trial, I asked ChatGPT a simple question: "Tell me about this image".
ChatGPT had no information (not even the title of the photo) to go by. Here was the result:
This photo shows a cozy and well-organized home office space. A high-back office chair is positioned in front of a desk with a large monitor displaying a vibrant image, perhaps of a digital artwork or a scene from a video game. To the right of the desk is a tablet device mounted upright. The room is well-lit with natural light coming in from the left, and there's an ambient warmth to the space, possibly from the sunlight.
The walls are adorned with framed pictures and certificates, suggesting a space that belongs to someone who takes pride in their achievements or interests. On the right wall, there are certificates indicating professional qualifications or commendations. The left wall has a framed photo or poster related to the entertainment industry, hinting at personal interests or affiliations.
Bookshelves are filled with neatly arranged books and binders, showing a level of organization and perhaps a hint at a studious or meticulous personality. There's also a small fireplace visible on the monitor of a smaller device on the left side of the desk, adding a touch of homeliness to the office.
Overall, this workspace seems to combine functionality with personal touches, making it a personalized and inviting area for work and study.
It was not 100% correct, but I cannot say it definitively got anything wrong, other than misinterpreting a hard-to-see image as a fireplace... and honestly, it still got the feel of the image right, as well as most of the items in it. I do have to laugh that it thought the bookshelf on my curtain was real. But it did get right why I chose that curtain: the feel I wanted in the space. If the bookshelf had been real, it would have been organized the same way. Let's try again, this time with an image I created in Midjourney and enjoy.
This image captures a serene and intimate moment. It features a person seated by a window, deeply engrossed in playing a stringed instrument, possibly a guitar. The setting sun casts a warm, golden light into the room, creating a cozy atmosphere and highlighting the subject's concentration and the details of the surroundings.
The room is filled with personal touches: framed artworks or photographs on the wall, house plants that bring a bit of nature indoors, and a bouquet of sunlit flowers on the window sill, which adds to the homey feel. There's also a collection of items and possibly books on a table beside the window, suggesting a space that's lived in and cherished.
The reflection in the window adds a layer of depth to the scene, blending the interior's warmth with the exterior's urban backdrop, possibly an apartment overlooking a city street. This reflective view, along with the framing of the shot, gives the impression of a private moment where music and the artist's passion are the world's focus.
The overall composition of the image, with its interplay of light and shadow, as well as the framing of the subject within the window, evokes a sense of tranquility and artistic dedication. It's as if the viewer has stumbled upon a moment of creation and reflection, a personal oasis amidst the bustle visible beyond the glass.
It was at this moment I realized ChatGPT understood what it was analyzing. While the meaning is not exactly elusive, I doubt many people who enter my office today and see this as my Windows background would have extrapolated all that. But it's EXACTLY why I currently have it as my Windows background.
Now, let's go with something more difficult (maybe). Brad Booker and I went to high school together. He recently won an Academy Award as a producer of the Best Animated Short Film, "WAR IS OVER! Inspired by the Music of John & Yoko". This is a still image extracted from the film.
The image you've shared is a digital artwork that depicts a dramatic and tense scene centered around a game of chess. It features two individuals deeply focused on the chessboard, surrounded by a group of onlookers who are intently observing the game. The setting appears to be in a dimly lit room, likely a tent or a makeshift shelter, given the canvas-like material in the background, which suggests a military or wartime environment.
The characters in the image are drawn with expressive features and seem to be soldiers, given their uniforms. The intense concentration and varied expressions on the faces of the onlookers convey the gravity of the game, possibly serving as a metaphor for strategy and the mental battles that accompany wartime situations. The players and the crowd are illuminated by a soft, central light source that appears to be coming from above the chessboard, drawing the viewer's focus to the central action.
There's a palpable sense of camaraderie and shared suspense among the group. The scene could be reflecting a moment of respite from the hardships of war, where the game of chess provides a brief escape and a test of wits. The artwork might be illustrating the idea that, even in the darkest of times, people seek connection, distraction, and a way to engage their minds.
It may not have known this recent animated short film (there is a good chance you didn't either), but it understood the context of the single image. I was impressed.
Finally, I uploaded a picture of the Robie House. The image file name contained only the word "house". This is what it came back with:
This image showcases the Robie House, a famous building designed by the renowned American architect Frank Lloyd Wright. The Robie House is considered one of the best examples of Wright's Prairie School architectural style, characterized by its emphasis on horizontal lines, flat or hipped roofs with broad overhanging eaves, windows grouped in horizontal bands, solid construction, craftsmanship, and discipline in the use of ornament.
The photo captures the building's distinctive long, horizontal profile, and its use of Roman brick, with a palette harmonizing with the natural environment. The house is notable for its dramatic overhangs and cantilevered roofs, which create sheltered spaces and strong geometric forms.
The Robie House, located in the Hyde Park neighborhood of Chicago, Illinois, was built between 1909 and 1910 and has since become an important piece of architectural history. It is often studied by architecture students and professionals for its innovative design that was far ahead of its time. In this building, Wright famously sought to create a "domestic symphony" which was echoed by the harmonious relationship between the building's design and its furnishings.
Wright's work, including the Robie House, has been influential in shaping the direction of modern American architecture. The house is now a designated U.S. National Historic Landmark and is open to the public for tours.
Public statements suggest that ChatGPT 5 is within months of release. Sam Altman, CEO of OpenAI, calls the technology you just saw above "extremely limited" compared to what ChatGPT 5 and eventually ChatGPT 6 will produce.
It can be argued (even ChatGPT itself will argue this) that ChatGPT is not intelligent, though I would say the word that argument is really about is "sentient" rather than "intelligent". That's fine, I won't debate that. But it also misses the point. These AI models have developed to a point where they can correctly interpret what they see and explain why the things they identify are there and how they are experienced.
In the field of architecture, AI can take visual stock of a space through a headset and place generic digital equivalents of the objects it sees into the correct places in the design model. That's not a future capability; I just watched a video of it being done. Knowing the above, we can conclude that we could feed AI an image of a proposed space (or one that already exists), and the software could provide feedback about the space and even recommend ways to improve it, based on the function, layout, and mood you want to achieve.
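For anyone curious what that kind of request might look like in practice, here is a minimal sketch in Python. It only packages an image together with the function, layout, and mood you are aiming for; it does not call any AI service. The function name `build_space_review_request` and the dictionary keys `prompt` and `image_data_url` are my own placeholders, not any vendor's actual API, so treat it as an illustration of the idea rather than a working integration.

```python
import base64
import json


def build_space_review_request(image_bytes, function, layout, mood,
                               mime_type="image/jpeg"):
    """Bundle an image and design goals into one JSON-serializable dict.

    The dict shape here is hypothetical; a real service would define its own.
    """
    # Encode the raw image bytes as base64 text so they can travel in JSON.
    encoded = base64.b64encode(image_bytes).decode("ascii")

    # Start from the same open question used in the tests above,
    # then add the goals the feedback should be measured against.
    prompt = (
        "Tell me about this image. Then suggest ways to improve the space.\n"
        f"Intended function: {function}\n"
        f"Desired layout: {layout}\n"
        f"Desired mood: {mood}"
    )

    return {
        "prompt": prompt,
        "image_data_url": f"data:{mime_type};base64,{encoded}",
    }


if __name__ == "__main__":
    # Stand-in bytes for the demo; in practice you would read a real photo,
    # e.g. open("office.jpg", "rb").read() (file name assumed).
    placeholder = b"\xff\xd8\xff\xe0"
    request = build_space_review_request(
        placeholder, "home office", "desk facing the window", "warm and focused"
    )
    print(json.dumps(request, indent=2))
```

The design choice worth noting is that the goals travel with the image as plain text, so the same photo could be reviewed against different functions or moods simply by changing the three arguments.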
Interesting, eh?
I'd love to know what you think the future will hold and where all this will go.