I wanted to test out using AI content generators to see how close I could get to something that was watchable and theoretically usable; not just an animatic or a storyboard, but something that felt real.
I tried two approaches: first using a short non-fiction story I wrote many years ago as the creative foundation, and then a music video concept.
For the short story, generating AI imagery and video clips that conveyed the correct narrative was challenging. Precision and consistency felt impossible. This isn’t dissimilar to trying to use stock imagery and footage to tell a story. In some ways it was more powerful than stock and closer to what I wanted, and in other ways it fell short: the people in the images had too many fingers, or walked like drunk people, or their heads turned into blue coffee cups when I just wanted them to hold a coffee cup.
This led me to try something surreal. The AI video generators would come up with crazy ideas based on my prompts, and it felt right to go with that, work with it, and play with the inspiration from the content that came back. I was still trying to follow the narrative concept I had previously written for the Panda Bear album “Panda Bear Meets the Grim Reaper”, but the dreamlike, surrealist quality of a lot of the video content pushed it in new directions. The psychedelic music track from Panda Bear for Tropic of Capricorn felt at home with the AI imagery.
With the short about Mohammed, I was aiming for realism and tried everything to keep Mohammed’s look consistent, even feeding Midjourney the image that looked most like him over and over to get more images of the same man. It would not work. The first image shown of Mohammed is the one that looks most like the real Mohammed.
Unintentionally, in this short Mohammed becomes many middle-aged Muslim men operating coffee carts in midtown Manhattan: a generalization and an amalgamation instead of a specific person. I am conscious that this can be interpreted in ways that are completely unintended and unwanted.
There are many people operating coffee carts in New York, but Mohammed was one man; a special person in my daily life who was a rock for me even if he didn’t know it, and even if I also wasn’t totally aware of it at the time. I missed him when I switched jobs, and later went back to his cart’s spot, and there was another man there.
Putting in terms such as “tall 55 year old Muslim man” felt biased, and I thought it was only fair to put myself through the same treatment, though “38 year old average weight jewish-english woman” led to model-like results, the kind of image you might see at the top of a location search on Instagram. My makeup and cheekbones in real life were never 25% as sharp and perfect. I don’t even have cheekbones. Adding “uncool” didn’t help the results; it gave them all sunglasses. Just describing myself as “jewish” gave me a Hasidic hat and vaguely payot-like curls at the sides of my hair.
It’s important not to forget how biased these AI systems still are. They are created by humans and fed biased images and content from the internet, which is not a real representation of what the world looks like. “Person” or “office worker”, for example, will return men much more often than women.
In a curious twist, the AI also could not imagine how dirty and gritty New York actually is. Everything looked shinier, nicer, more quality-of-life-friendly, as if it were in Europe. There were far fewer people, even when given prompts like “super crowded” or “rush hour” or “filthy” or “gritty”. The coffee carts all had lovely cafe chairs around them and were shiny and full of lovely objects. There was sunshine and there were smiling people, nothing like the New York I experienced even on the best of days.
Ultimately, the feedback I got from friends on the short about Mohammed was that maybe the story wasn’t well served by AI imagery, and could be better with real photos and film. I think that’s telling for where AI is now. It does an okay job, and it’s powerful because this is the easiest and cheapest way for me to create a visual film for this story, but it’s not quite there yet, and the AI issues are distracting.