SI Art With Short Prompts
Larry Van Stone
Biology and Data specialist at Delaware Museum of Nature and Science (PT)
<SI = Simulated Intelligence. True AI is still out of reach>
While preparing a review of a story with a feudalistic setting, I tried one short prompt with six generative art engines. The image above was generated in Bing by the Bing Image Creator, powered by Dall-E3. I'll state up front that Dall-E3 typically produces the most human-like images of the group. Or perhaps I should say, the images that come closest to what I had in mind. The prompt was as shown in the caption, "The essence of feudalism". This is my favorite of the four images presented, as re-rendered in wide format by Dall-E3. It can do that now, but the result is not just an expanded copy of the original; the engine takes certain liberties.
A few anomalies indicate that Dall-E3 doesn't "think" like we do. First, there would be no forest of yew trees surrounding a castle like that. They breach the defenses automatically! Second, the livery of the mounted knights varies. The castle rooftops are blue for a reason, and the primary color of the livery should be a similar blue. I see three colors on the doublets of the knights, and only one of the two squires depicted is wearing blue. Smaller points include
Now, as for other engines:
Gemini produced nothing of interest, just random interiors of ramshackle barns. No humans or animals visible; Google is still working out how to depict people without getting in trouble for the giant historical anomalies the Bard imager produced.
Strangely, Leonardo AI also didn't produce anything I wanted to show off.
Playground has three engines. The newest, Playground 3.0, is claimed to have the strictest adherence to a prompt. There are no controls or filters in the free version. Here are a couple of results:
I call this over-adherence to the prompt! A table from a textbook, and a bestiary of armor styles.
Onward to the Playground 2.5 engine, which has a variety of controls, including variable prompt adherence. I used the Delicate Detail filter for this:
This engine produced architecture. This baronial interior is fetching, but nothing like what I wanted for the illustration.
Playground's third engine is Stable Diffusion XL, which produced this:
This has a manor house rather than a castle, and focuses on the nearby village. Another offering by this engine was an overview of a large village of thatch-roofed cottages.
I hesitate to say that these engines have "personalities", but the differences in the ways they depict an idea reflect significant differences in their training datasets, at the very least.
I could have saved time by going with the Dall-E3 image, which I did use for the review, but checking with the other engines has been instructive.