Dall-E vs. Imagine with Meta AI: Three image creation prompts – easy, medium, and hard

Generate three images with each of Dall-E and Imagine with Meta AI: one from a simple prompt, one from a medium-complexity prompt, and one from a complex prompt. Where a generator offers multiple images, select the best one. Compare the two image generators across the three prompts.

Business need

AI-generated images are increasingly used by businesses for marketing, branding, and web content. For example, the “Infinitive AI Labs” image shown on this website was generated using Dall-E. Choosing the image generator best suited to the enterprise’s purposes reduces the time spent creating these visuals.

Issues encountered and solved

Getting the specific image imagined, including any text added to the image, is an ongoing problem. Dall-E seems to have solved the text problem, while Imagine with Meta AI has not.

Spatial relationships can also be problematic. Neither generator provided a reasonable spatial relationship between the fisherman and the fish in the complex prompt.

Image accuracy can be a challenge. The request for a particular type of fish, in this case a wahoo, was disregarded by both generators, although the fish from Imagine with Meta AI came closer to a real wahoo.

Results

Simple prompt: “A starry night in the desert”.

Analysis: Meta wins. While the Dall-E image is imaginative, it looks more like daytime than a “starry night”. Its stars are almost entirely a depiction of what I assume is the Milky Way. Kudos to Dall-E for the cacti, but the Meta image more closely represents what the prompt requested.

Medium complexity prompt: “A suburban high school during the day with a sign reading ‘Groveton High School’”

Analysis: Dall-E wins. Meta is unable to accurately incorporate the text into the image. Not only did Dall-E get the text right, but it also cleverly incorporated a large “G” next to the name in a manner one might expect at a high school.

Complex prompt: “A middle aged man on a sportfishing boat in a fishing chair fighting a wahoo which is jumping out of the water. The Sun is high and the water is deep blue and calm.”

Analysis: Both images have significant problems, but Dall-E wins. In both cases, the fish is misplaced and is not a wahoo. However, Dall-E renders the boat and the fisherman better. In the Meta version the man isn’t sitting, although its fish looks more like a wahoo than the one in the Dall-E image.
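The three verdicts above can be tallied in a few lines of Python. This is only a sketch for keeping score: the labels `simple`, `medium`, and `complex` and the `tally` helper are illustrative names, not part of either product, and the winners are copied from the analyses in this post.

```python
from collections import Counter

# Winners as reported in the three analyses above.
# The keys are illustrative labels for the three prompts.
winners = {
    "simple": "Imagine with Meta AI",
    "medium": "Dall-E",
    "complex": "Dall-E",
}


def tally(results):
    """Count how many prompts each generator won."""
    return Counter(results.values())


scores = tally(winners)
for name, wins in scores.most_common():
    print(f"{name}: {wins} of {len(winners)} prompts")
```

A count like this hides how close each call was. The complex prompt, for instance, was a win for Dall-E even though both images had significant problems.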

The Bottom Line

While AI-based image generation has made great strides, it remains difficult and time-consuming to get an accurate image from a prompt. However, a month ago Dall-E could not properly include text in a generated image, and it now seems quite accurate. Expect more rapid improvements.