DALL-E 2: The world's seen nothing like it, but can AI spark a creative renaissance?
How will text-to-image AI change the way artists work? Four Canadian creatives weigh in
Think of anything. Seriously, anything. A turtle wearing a cowboy hat. Spider-Man blowing out birthday candles. Maybe a landscape painting of Algonquin Park, rendered in the style of Monet or Takashi Murakami.
If you have the words to describe your vision, no matter how strange — or banal, for that matter — it's now possible to generate a picture in an instant, a file ready to download and share on your platform of choice.
It's not magic, it's AI. And in the last few months, the world's become increasingly aware of systems that are capable of conjuring original images from a few simple keywords. The results are often startling — though not always because of their realism.
That's especially true in the case of Craiyon (a tool previously known as DALL-E Mini), which is arguably the best-known system of its sort. Free to use and available to all, the open-source image generator was swiftly adopted by the public earlier this year, and it's become a meme-maker's fantasy, spawning endless threads of jokey one-upmanship.
Craiyon was trained on millions of captioned images, and it draws on what it learned from them when deciphering a user's text-based request. But the pictures it delivers have never existed before; every request for a portrait of Snoop Dogg eating a cheeseburger is wholly one-of-a-kind. The pictures lack photorealistic clarity, and there's something about their grainy aesthetic, punctuated by spun-out faces, that suggests a nightmare seen through a car-wash window. But more powerful options are waiting in the wings.
On Thursday, Meta revealed it's developing a text-to-image AI called Make-A-Scene, and earlier this year, Google revealed the existence of its own text-to-image generator, Imagen. Neither of those tools is currently available to the public, but other companies have opened their projects to outside users.
Midjourney is one example: there's a waitlist to access its current test version, but admitted users can opt to join paid subscription tiers, with perks including friend invites and unlimited image requests.
But DALL-E 2 is the system that's probably drawn the most attention so far. No relation to Craiyon/DALL-E Mini, it was developed by OpenAI, a San Francisco-based company whose other projects include AI capable of writing copy (GPT-3) and code (Codex, the model behind GitHub Copilot). DALL-E 2 is the latest iteration of a text-to-image tool the company first revealed in January 2021.
At that time, OpenAI's system could produce a small image (256 by 256 pixels) in response to a text prompt. By April of this year, however, the AI (DALL-E 2) was capable of delivering 1,024-by-1,024-pixel files (four times the resolution), while offering users the option of "inpainting," or editing specific elements within their results: details including shadows, textures or whole objects.
Never mind the particulars of how any of this is possible (there are other resources out there that parse the science) — the results are astounding in their detail and slick enough for a glossy magazine cover. (Just check out the latest issue of Cosmopolitan, which features "the world's first artificially intelligent magazine cover.")
As such, DALL-E 2's developers have taken a few steps to prevent users from dabbling in evil. Deepfake potential would seem to be a concern. Want a picture of a real person? That's a no-no, though the technology is technically capable of doing it. And there are safeguards against images promoting violence, hate, pornography and other very bad things. Various tell-tale keywords have apparently been blocked; for example, a picture of a "shooting" would be a non-starter.
Access to the tool remains limited to a group of private beta testers, and though OpenAI says it wants to add 1,000 new users each week, there are reportedly more than a million requests in its waitlist queue.
Still, even at this stage, the technology's existence is raising plenty of questions, if not quite as many as the images DALL-E 2 can produce. Are creative industries facing obsolescence? When anyone can generate a professional-quality image with a few keystrokes, how will that affect graphic design, commercial illustration, stock photography — even modelling? What if the AI's been trained to be racist or sexist? (Right now, it sure seems to be.) Who owns the images that are spat out by one of these systems — the AI, the company that developed it or the human who typed "sofa made of meatball subs" and made it happen?
The future remains as murky as a Craiyon render, but for now, there's at least one question we can begin to answer. If these tools have indeed been developed as a way to "empower people to express themselves creatively" — as the creators of DALL-E 2 claim — then what's it like to actually use them? And what if your job is all about creativity? How are artists using AI right now?
CBC Arts reached out to a few Canadian artists who've been dabbling with AI systems, including DALL-E 2. (OpenAI's research program for the tool currently includes more than 3,000 artists and creatives.) How do they see this technology changing the way they work?
Here are some of the early thoughts they had to share.
Bridget Moser: 'It's helping me realize what I want to make real'
A 2017 finalist for the Sobey Art Award, Bridget Moser is a Toronto-based artist who's renowned for her performance and video-based work. In April, she joined DALL-E 2's waitlist, and spent the next few months brainstorming what to feed it first. When reached by CBC Arts, she had been playing with it for just two weeks.
What are you doing with DALL-E 2?
At the start I was a little bit addicted, I would say. With DALL-E 2, you get 50 prompts per 24 hours. Each prompt generates six images, so you're technically getting 300 images per 24-hour period. For the first five days, I think I just maxed it out and then had to slow down a little bit. I went too far. (laughs)
Right now I'm just saving the images. It really feels like a process of sketching or brainstorming, and I feel like I've learned something from what it's producing that will lead to something else.
There are tons of rules you have to follow with DALL-E 2, and it will refuse certain prompts if it thinks they're going to violate the rules. No gore, no violence: nothing like that. And nothing shocking.
It's very interesting to me to try and work within those constraints and see what feels sort of like that, but actually isn't.
Some of the images I'm quite in love with, but I don't know what will happen to them. Some of them are on Instagram. I've tried not to share things that are really unsettling. Even the ones I've posted, some people have been like, "This is very disturbing."
What's the first thing you made?
What was I doing at the start? I guess it was about making impossible photographs, in some respect.
One of my favourites from the very first day was probably "12 rubber gloves in the air + in the woods at night + disposable camera + flash photo." It generated this kind of ghostly looking photo, and I was just so pleased with that.
Is there an art to generating prompts?
Yeah, totally. I think it's a skill that you fine-tune the more you use it. You learn these little idiosyncrasies you didn't know about until you ask it to do something or change something.
Why should artists have access to AI tools like DALL-E 2?
I wish more artists had access to it because I think it's going to be really important for a lot of people, and it would be really disappointing if the only people who are able to use it are the Bored Ape NFT guys or something.
I can also imagine a lot of people on 4chan are salivating at the prospect of using something like this, so I think it's important that artists have early access to try and hopefully mitigate some of the more problematic aspects of technology like this. It's inevitably problematic. There are tons of inherent biases when you're training an AI just because humans are inherently biased and we're the ones training it. I would hope the way artists use it would be not-evil, but I certainly know of evil artists, so there's no guarantee there either.
How is AI going to change what you do?
I'm still not totally sure, but it makes me want to experiment more materially, which is something that I feel lost doing a lot of the time.
I feel like I'm pretty good at performing and I'm pretty good at making videos. I have things I want to do sculpturally, but just can't figure out on my own. And I feel like in seeing hundreds and hundreds of DALL-E 2 variations, it's helping me realize what I want to make real.
I go back and forth a little bit. In some ways, these images feel kind of complete on their own; maybe they don't need to be made into any kind of physical iteration.
Part of me also feels like this technology is inevitable. It's coming for us no matter what. And so there is something kind of reassuring about being able to use it in a way that feels generative and creative. It's going to create new possibilities instead of the sense of doom that I think sometimes comes with this kind of technology.
Winston Hacking: 'I wouldn't even claim it as my own artwork'
A filmmaker and animator, Winston Hacking has a signature collage-based style that can be seen in music videos for Flying Lotus, Andy Shauf and BadBadNotGood. (A clip for the latter was honoured at the 2022 Prism Prize this month.) Previously based in Toronto, Hacking now lives in Portland, Ore., where he's been experimenting with Midjourney, feeding it 10 prompts per day and regularly sharing the faux-vintage results on Instagram.
What are you doing with Midjourney?
When I first saw what was happening, I was curious. Making collage artwork, the first thing I started thinking about was like, oh — I can create my own vintage magazines to source from.
I go through archives — I go through Flickr, Creative Commons, public domain photos. And sometimes it's really hard to find something specific that you're looking for. Sometimes you don't know what you're looking for.
So that's the first thing that I thought about: what if I don't think about one of these generated images as a finished artwork — like, it's just an asset, it's just an element, like a piece of cut-out paper? That's kind of where I'm coming from as an artist, potentially using it for an animated project.
Can I create my own magazines and what would they look like — and how does that influence decision-making in collage?
First impressions?
It's totally captivating. I would almost relate it to playing Nintendo for the first time when I was a kid — that kind of quality of, "Wow, I've never seen anything like this before."
I'm not sold on it as this game-changing thing that I'm going to embrace. Right now, I'm just kind of playing with it.
I think what we're really looking at is a whole new way of communicating — visual communication. I'm not so freaked out by it. I see a lot of potential for it to help people communicate ideas.
Is there an art to generating prompts?
It's strange because I wouldn't even claim it as my own artwork, you know what I mean? I don't really claim ownership of the images that are being generated. I mean, I entered in prompts, but I don't know if that makes it my work or not.
I'm just describing something; it's not like I saw that image in my head.
I just know that certain things combined are beautiful — things I find beautiful in images, like textures and anomalies, aberrations.
What types of images do you like and why do you like those images? If you can answer that, then you might find something that impresses you.
What can artists get out of using these AI systems?
I think it's definitely important to at least embrace it and actually see what can be done. As artists, that's what we do, right? We take something — we take this new medium or this new technology — and we try to break it. (laughs)
How is AI going to change what you do?
I really haven't made a decision on whether I would integrate it into a project or not. I definitely know there are certain times when I'm looking for something really specific and I can't find it, and I think that's where it's a great tool. It can fill a gap; it can fill in a missing piece. I definitely see it as a tool.
Ginette Lapalme: 'A lot of it feels like it's straight out of my brain'
Based in Toronto, where she runs Toutoune Gallery on Bathurst Street, illustrator Ginette Lapalme has been dabbling with DALL-E 2 since early June. "Initially, I wasn't really sure what I could get out of it," she says. Now, her Instagram is full of blobby digital renders that bear an eerie resemblance to her IRL work.
What are you doing with DALL-E 2?
I'm inspired by finding strange objects — so, old tchotchkes, you know? Bootleg images or novelty toys. Initially, I was trying to see whether the machine could create, out of whole cloth, these things that I get a lot of joy finding online or in real life.
I'm just kind of playing around with building different forms, feeding it a lot of different words that kind of explain the aesthetic I'm usually playing with — so different colours, different materials, different shapes.
It's kind of mind-bending because a lot of it feels like it's straight out of my brain, or like it's already fitting in with the work that I make. It's awe-inspiring. It's very bizarre.
First impressions?
I got access on June 7th or 8th, so I've been diving pretty deep on this thing every day.
A comedian and artist who I really like, Alan Resnick, posted about having access to it in early June. I was just really floored by what he was doing with it. I joined the waitlist then, and I think maybe he was able to recommend me.
He's done some really cool stuff with it in terms of video art. It's interesting to see different artists have access to these things. You see what people make with DALL-E Mini, and to me it's all kind of a little bit boring — like Mad Libs. But seeing artists I like have access to it? It's mind-blowing the way that people figure out how to use it to their benefit.
Why should artists have access to AI tools like DALL-E 2?
When I first heard about this thing, I was super wary. Having this machine, this AI, be able to fake a specific artist's style is off-putting. But it's a tool, and I think artists can benefit from it. It's an amazing tool that can kind of create images from your dreams. And I think everybody who has access to it who's an artist already has a specific style or way of thinking. They can do a lot of creative stuff with it.
How is AI going to change what you do?
I took a long break from making work with resin and plastics the last few years, and this machine is actually making me want to dive back into craft. I'm very inspired by it.
It's a really funny thing to think about because this is an AI feeding me digital images. It's making unreal objects that look very textured, like they could be real objects. I'm trying to find a way to make them become real for me.
Clint Enns: 'It feels like I could actually make the type of images I want to make'
While he waits for DALL-E 2 access, Clint Enns is busy investigating whatever AI systems are available to him, but two free and open-source tools have been his go-to options so far: Craiyon and Disco Diffusion. Originally from Winnipeg, Enns is the artist behind Internet Vernacular, an ongoing found-photography project that traces the evolution of visual communication throughout the digital age. Its last public exhibition focused on the year 2004, and the birth of the shareable "social photo." It's tempting to think 2022, the dawn of DALL-E 2, will feature in a future chapter of the series.
First impressions?
Everybody's doing it right now, which is kind of fun. And I think it's really captured the imagination of a lot of artists.
Although machine learning has been around for a little while, it felt like the results were simply in the realm of science fiction or fantasy art. Kind of cheesy. But it feels like something's broken open. It feels like there's real potential in the technology. It feels like I could actually make the type of images I want to make.
The images look flawed. I think that's what was magical about them for me — that they look glitchy and broken down. Like DALL-E Mini: all of the faces are just melting, right?
How are you using AI?
I'm still using DALL-E Mini. It's really informative to see what other artists are feeding into the machine as prompts. You can learn a lot about an artist's practice just by seeing what their prompts are, similar to the way the machine is sort of learning from us. The prompts they use usually reflect what they're trying to make without the computer.
I was trying to see how the machine would interpret my prompts, in particular where it failed. I like to give it what I consider impossible tasks: things I thought would be funny, or things that are self-referential. Like, "an AI-generated face." You put in that prompt and see what it thinks an AI-generated face looks like.
When I start to understand this technology a little better, I'm hoping to put out a chapbook where I think through the images. Like, thinking about what art is in an era where this machine is making art. I already have a line for it: something like, "I'm just waiting for the machine to become sentient enough to sue artists using its results for copyright infringement."
Why should artists have access to AI tools?
Well, somebody like me, what I'm trying to do is explore where technologies break down or fail. I've been doing this throughout my practice — like making glitch art. I think artists are really good at finding those failures and exploiting them.
You can generate perfect landscapes with this technology, but you can only see so many of those perfect landscapes. That's where these technologies start to break down and open up. I really think that's the artist's job in all of this.
How is AI going to change what you do?
I think this technology raises a lot of challenges to the artist. Can a machine do it better than us? But whenever a new technology comes up, it always poses a challenge to the artist. Think about photography, right? Why paint when you can just take a photo of something? Artists always find a way of both using the technology in innovative ways and responding to it.
It feels really exciting, like the birth of a new type of computing, you know? It's like when I was a kid and I got my first computer and I could do things that I couldn't do before. It feels like this tool is going to allow that. It has lots of potential.
These conversations have been edited and condensed.