
Pakko De La Torre // Creative Director

A.I. open thread: Black people also underrepresented for image creation


Facial recognition is often wrong about black people, frequently misidentifying them, and young black women especially. Maybe other applications of artificial intelligence, like image creation, do better by black people. Or maybe image creation is also beset by the problem of willfully small reference pools.

The way an artificial intelligence image creation program works is that you give it a text “prompt” in a natural language, like English, and then the program tries to create an image that corresponds to that prompt.

An image of a cow jumping very high at night generated by Night Café Studio

For example, if I give an image generation program the prompt “cow jumped over the moon,” it might generate an image in which we see a cow jumping very high at night. The program might add some onlooking cows for good measure.

Image creation is sort of like facial recognition in reverse. A facial recognition program takes in an image and tries to identify the people, animals or objects in it, giving a name (hopefully but not always the correct name) or at least a description. An image creation program can take a description and create an image based on that description.

Facial recognition works very well for white people. One time I posted on Facebook a dark and blurry picture, and it correctly identified one of my friends, white guy, nice guy, lurking in a corner of the photo.

Incorrect facial recognition has unfortunately had tragic consequences for black people. Exact numbers are not available, the American Civil Liberties Union (ACLU) has complained. Lawmakers like U.S. Rep. Ted Lieu (D-California 33) are concerned, and are crafting legislation to rein in the police’s use of facial recognition programs.

The problems that image generation has with black people might shed some light on the problems facial recognition has with black people.

I first heard about image creation by artificial intelligence from David Bruce, a British composer of music under the wide umbrella of “contemporary classical music.” Bruce concluded that composers don’t have to worry about their jobs being taken away by artificial intelligence, but visual artists do need to worry.

If an odd bunch of musicians need a new composition specifically for their unusual ensemble, they should turn to a human composer like David Bruce, because artificial intelligence won’t be able to produce a suitable musical composition, except with a heck of a lot of human intervention.

But if those musicians then need artwork for their demo CD, artificial intelligence image generation might just be the ticket.

Instead of paying who knows how much to an artist, they might try putting in the title of the composition they commissioned from David Bruce as a text prompt into an image generation program, and the result, royalty-free, might be so good that the only thing missing is text for the composer, performer and title.

After a few Google searches, I found Night Café Studio. To have it create an image, you need one or more “credits” (the exact price depends on which algorithm you choose). You can buy credits for 10¢ or less each, but you have to buy at least a hundred at a time.

Or you can sign up and wait less than twenty-four hours to get five daily credits (I think those are issued at midnight GMT each day). You can also get one-time credit bumps for doing things like completing your Night Café profile, publishing an image created by Night Café for the first time, etc.
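To put those numbers side by side, here is a small sketch of the arithmetic in Python. The constants come straight from the figures above (10¢ is the upper bound on the price per credit); the function names are made up for illustration, and the assumption that unused free credits carry over from day to day is mine, not something Night Café states here.

```python
import math

# Figures from the text above; 10 cents is the *upper bound* per credit.
MAX_PRICE_PER_CREDIT_CENTS = 10
MIN_PURCHASE_CREDITS = 100
FREE_CREDITS_PER_DAY = 5


def max_minimum_purchase_cents() -> int:
    """Most you could pay (in cents) for the smallest purchasable pack."""
    return MAX_PRICE_PER_CREDIT_CENTS * MIN_PURCHASE_CREDITS


def days_of_free_credits(credits_needed: int) -> int:
    """Days of claiming free credits needed to reach a target.

    Assumes (hypothetically) that unused free credits carry over.
    """
    return math.ceil(credits_needed / FREE_CREDITS_PER_DAY)


if __name__ == "__main__":
    cents = max_minimum_purchase_cents()
    print(f"Smallest pack costs at most ${cents / 100:.2f}")
    print(f"Days to collect {MIN_PURCHASE_CREDITS} free credits: "
          f"{days_of_free_credits(MIN_PURCHASE_CREDITS)}")
```

In other words, the smallest paid pack costs at most ten dollars, and under the carry-over assumption it would take about twenty days of daily claims to collect the same hundred credits for free.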

When I first started looking into this, I was drafting my article “All scientific research is important” (I stand by that, though with clarifications and caveats). I had decided that for the article’s top photo, I would take an existing photo of Stephen Colbert wearing one of his usual gray or blue suits and fiddle with the colors in Adobe Photoshop to make the suit jacket and tie look pink. It would be very easy.

I was thinking about how to make that image of Stephen Colbert wearing pink when I got a notification from Night Café Studio to claim my five daily free credits. So I thought, why not let Night Café Studio take a crack at this Stephen Colbert assignment? The worst that could happen is that the AI-generated image would be useless and I would have to go with my original plan.

Two images of Stephen Colbert. Left, a photo from the set of his late night show. Right, an image created by an artificial intelligence

But when I saw the image Night Café Studio created, I thought it was perfect. And besides, I had misplaced my license number for Adobe Photoshop Elements and I was not looking forward to figuring out how to do the same thing in the GNU Image Manipulation Program (GIMP). Although GIMP is said to be almost as powerful as Adobe Photoshop, I have often found GIMP’s user interface to be confusing. But that’s a rant for another day and maybe another venue.

Using artificial intelligence to generate images can also be a great way to brainstorm. I’ve been wanting to rethink the royal cards in the standard deck of playing cards (the pip cards should stay the same, I think).

So I thought, Camila Cabello as the Queen of Spades. I put that into Night Café Studio. The results were not quite what I wanted, but the woman in the image was recognizably Camila Cabello. Next, I tried Camila Cabello as the Queen of Hearts. I failed to notice that they introduced the option to generate four images at a time (you get a little bit of a discount that way).

Four images of Camila Cabello as the Queen of Hearts

The one in the lower left corner looks more like Wonder Woman than like the Queen of Hearts. Nevertheless, she does look like Camila Cabello. I got similar results for clubs and diamonds.

Since I do want to retain that traditional diagonal symmetry of the standard royal cards, these images are going to need a lot of work if I choose any of them for this purpose.

Also, I probably shouldn’t use Camila Cabello for all of the queens. What I’m trying to get at here is that I also want some diversity here. So I thought of Amber Ruffin, a writer on Late Night with Seth Meyers and the host of her own late night show on Peacock (NBC’s streaming service).

Notice the diagonal symmetry of the traditional royal cards in a standard deck.

On one of her “Amber Says What?” segments, Ruffin mentioned how a Jumbotron confused Sam Jay for Ziwe Fumudoh. To be fair, I don’t know who either of those women are. But Ruffin also mentioned how a sports announcer confused Dionne Warwick for Gladys Knight.

For me it would have been more understandable if instead he had confused Dionne Warwick for Ego Nwodim from Saturday Night Live; at least that would have been like confusing Sarah Palin for Tina Fey. Ruffin joked that the Jumbotron guy might confuse Michelle Obama for Harriet Tubman.

Some people say “I don’t see color,” but that’s probably wrong, “because, clearly, that’s all you see,” Ruffin said. Who knows what wrong name a facial recognition program or a clueless sports announcer would come up with spotting Ruffin at a sports event.

I believe that Amber Ruffin would appreciate being the queen of one of the four suits. I thought of asking her on Twitter, but that was before Elon Musk started removing Twitter’s plumbing and parading it around.

But maybe before I ask, I should have some kind of sketch or mock-up to show her. And since Night Café Studio did so well with Stephen Colbert, I entrusted to it the task of imagining Amber Ruffin as the Queen of Spades.

How do you suppose a human artist would go about this assignment? If he or she doesn’t actually know what Amber Ruffin looks like, there’s Google, or Bing, or DuckDuckGo, etc., which he or she would go to for reference images.

Amber Ruffin as the Queen of Hearts? Image created by Night Café Studio, an artificial intelligence image generator.

Presumably Night Café Studio did something similar for Camila Cabello: it looked up an image of her, looked up an image of a queen and then got to work. So when I tried doing the same thing with Amber Ruffin, I was kind of surprised by the results.

For Amber Ruffin as the Queen of Spades, the result (at the top) looks more like Viola Davis in The Woman King, a movie that came out at around the time I put in that text prompt. But I’m saying that more because of the costuming, because when I looked Viola Davis up on Google, I realized the woman in the image is not Viola Davis either. She looks very familiar, like I should know who she is.

It’s as if I had given this assignment to a human artist, and he or she had reasoned that since Amber Ruffin is a black woman, any random black woman will do for the assignment.

The algorithm seems to only see race and gender. To be fair, it also sees weight (try these text prompts with Lizzo to see what I mean).

Maybe I’m not supposed to use Night Café Studio’s Stable Diffusion algorithm for real-life celebrities. Except that it did so well with Stephen Colbert. So Stephen Colbert is unique but black women are interchangeable? That’s not right.

I also tried white women. The results with Leisha Hailey were (except for the Queen of Spades) recognizably her, but not exactly flattering. Let’s just say that if I wanted to found the Leisha Hailey Fan Club, I would not use the images Night Café generated for the fan club website.

The results with Katherine Moennig as the King of Spades, etc., were more flattering (yeah, yeah, I know about Daniel Sea).

The DALL-E 2 algorithm sidesteps that problem by simply forbidding names. Give it a name and it’ll refuse to start. But give it a more generic description, like “black woman as the Queen of Spades,” and it’ll get to work immediately (provided, of course, you have sufficient Night Café credits; DALL-E 2 costs more than Stable Diffusion).

What about men and women of other races? I tried “Sikh man playing French horn.”

An image of a Sikh man playing the horn generated by Night Café Studio.

I don’t like calling it a “French horn,” but didn’t feel like trying to argue with the artificial intelligence. It didn’t matter: the musical instrument in the resulting image looks more like some kind of valveless euphonium.

The Sikh man has a beard, which is good (Hollywood often forgets this detail), but he’s got no turban. Still, a pretty good image generated with very little effort on my part.

I still have quite a bit to learn about artificial intelligence image creation. I see other “artists” using text prompts with more words, including “modifiers.” Sometimes those seem to help; other times I wonder if the result would have been the same without them.

Night Café Studio is not the only game in town. A couple of weeks ago I learned about Midjourney Bot, available on some Discord servers (Discord is like Slack, but considered somewhat cooler by some). I tried Midjourney with Amber Ruffin. The women in the images Midjourney generated look more like Amber Ruffin… but still not quite as precisely as with Camila Cabello or Katherine Moennig.

The open thread question: Have you experimented with artificial intelligence image generation, and if so, have you noticed anything strange regarding images of black people?
