A little over four years ago, we featured here on Open Culture a set of realistic images of people who don’t actually exist. They were, as we would now assume, wholly generated by an artificial-intelligence system, but back in 2018, there were still those who doubted that such a thing could be done without furtive human intervention. Now, after the release of tools like OpenAI’s ChatGPT and DALL‑E, few such doubters remain. In recent weeks, another OpenAI product has caused quite a stir despite having yet to be properly released: Sora, which can use text prompts to create not just replies in kind or still images, but minute-long video clips.
“This is simultaneously really impressive and really frightening,” says YouTuber Marques Brownlee in his introduction to Sora above. He examines some of the demo videos released so far by OpenAI, highlighting both their strengths and their weaknesses.
It would be difficult not to feel at least a little astonishment at the results Sora has produced from OpenAI’s sample prompts.
There’s something Blade Runner going on here, in more senses than one. The not-quite-human qualities of this “footage” do stand out on closer inspection, and in any case make the whole thing feel, as Brownlee puts it, “a little bit… off.” But as he also emphasizes, repeatedly, it was just a year ago that the bizarre AI-generated video of Will Smith eating spaghetti made the social-media rounds as a representation of the state of the art. The underlying technology has clearly come a long, long way since then, and though the material so far released by OpenAI may feel faintly awkward and “video-gamey,” it clearly shows Sora’s capability to create videos plausible at first and even second glance.
This may spell trouble not just for those currently in the stock-footage business, but also for those who happen to believe everything they watch. Brownlee calls the implications “insanely sketchy during an election year in the US,” but he may take some comfort in the fact that Sora is not, at the moment, available to the general public. There are also explainers, like the one from the Wall Street Journal video above, in which AI-industry professional Stephen Messer points out the telltale glitches of AI-generated video, many of which have to do with the finer details of physics and anatomy. And if you find yourself paying unusual attention to the number of digits on Messer’s hands, just tell yourself that this is how it feels to live in the future.
Based in Seoul, Colin Marshall writes and broadcasts on cities, language, and culture. His projects include the Substack newsletter Books on Cities, the book The Stateless City: a Walk through 21st-Century Los Angeles and the video series The City in Cinema. Follow him on Twitter at @colinmarshall or on Facebook.