Bad Ghibli

pirate wires #137 // an openAI update ghiblifies the world, “ethical questions” raised, and the white house social media team innovates within the form (a fascinating disaster)
Mike Solana


To slop or not to slop. It was one of those days on the internet. Last week, after the introduction of OpenAI’s 4o model enabled advanced image generation in ChatGPT, users quickly discovered they could generate a picture of… well, almost anything they wanted. And what they wanted most of all, at least in those very early hours with their newfound superpower, was to feed GPT real pictures of themselves, their loved ones, and various historical and cultural touchstones, and to have their friendly neighborhood superintelligence turn them into anime characters reminiscent of Studio Ghibli, a famous and beloved Japanese animation studio. After the first few Ghibli girlfriends went viral, everyone had to have one, and a classically presenting memetic explosion followed. For just about 24 hours, X — the first platform where users explored the new tool — set aside the culture war, the Tesla firebombings, and the collapse of globalization to share a lot of this:

[Ghiblified images via @Aella_Girl and @Liv_Boeree]

To date, almost all the images I’ve seen have been technically impressive, beautiful, and evocative. More importantly, they’re unambiguously much better at depicting a prompt request than ever before. There’s no really good way to describe this other than to say I used to ask GPT for an illustration of some very specific thing, and GPT would invent something that looked nice, but only seemed inspired by my prompt — and, most of the time, only vaguely. Now, GPT creates images in detail I didn’t even realize I requested. For example, I write “Ghibli this photo,” and it shows me Ghibli in a quality I cannot even properly verbalize, let alone produce myself. The feeling was something like suddenly singing in a brilliant voice. We were all, for the day, artists. Or, that is what it felt like.

Naturally, with this much fun happening on the internet, it was only a matter of time before the endemically miserable Bluesky natives, and the perpetually mindless Threadizens, joined the fray and found a reason to be angry. It was time to ask the “ethical questions.”

Does a beloved form of art lose its value once it’s easily mass produced? Is Ghiblifying an image — or Disneyfying, or Muppetfying — a form of stealing, actually, given the model was clearly trained on actual films, is now being asked to produce art exactly like them, and is not only following the prompts correctly, but at scale? Then (evergreen) what will happen to the jobs? And, a more spiritual concern perhaps, will young people ever learn to animate by hand again? As a kid, I spent hundreds of hours sketching my favorite comic book characters, and then characters of my own. I can’t explain it in any kind of rational language, but the thought of our world without children learning to draw just makes me kind of sad. Are there negative externalities here we aren’t considering?

Yeah, there are probably a lot of dangerous things we aren’t considering. But for every challenging question posed by advanced image generation, there are obviously gifts that would have felt to me, as that young kid who wanted to create his own comic books, like a miracle. My art was good but never great (I didn’t have the patience), and my search for an artist — which extended into my early 20s — failed. Had I just been able to collaborate with GPT, I’d likely still be making comic books.

Today, I’m sitting on dozens of film and television scripts. How difficult would it be to produce my own movies, now, after collaborating with one of these powerful models on a new style? How difficult would it be to act the lines out on my own, and create new character voices with Elevenlabs? This is the world we are imminently approaching. The one-man studios are coming. My one-man studio is coming.

So, sure, we’ll be fighting over the “ethical questions” for weeks, and months, and years to come. But it’s not really clear they’re the most pressing issue at hand. The technology gives, the technology takes, and regardless of how any of us feel about the technology it’s here to stay. Much more interesting, I think, is the question of how this new tool will change our politics and culture. In this, the impact of advanced image generation is most forcefully made by general access to the tool, which brings us to a strange new state of culture we don’t seem to be considering at all, and definitely don’t understand: people with no meaningful artistic ability are making real, impactful art. Because this has never existed before, it seems unambiguously great. But what if the impact made is unintentional? After a thoughtless bit of prompting, what if your art is highly effective, but only at conveying a belief you don’t hold?

As most people don’t take the impact of art on culture seriously, the question is typically either dismissed as elitist bullshit or relegated to the topic of taste. There, it tends to be muddled into general feelings associated with mass production. On Ghibli Day, the images were endless. Were they “slop”?

Regardless of the quality, it was hard to argue the avalanche of AI-generated images we saw last week, grasping for attention in a kind of life-like mass beyond anyone’s control, wasn’t slop-like, and I alluded to as much that afternoon. Later, OpenAI CEO Sam Altman responded on X, “One man’s slop is another man’s treasure.” Different strokes for different folks, case closed. But the conversation on slop, in keeping with its basic nature, was a bit of a distraction.

It was Sam’s following tweet that exposed a real — and in my opinion really fascinating — problem:

[Tweet from @sama]

Obviously, he was alluding to the White House social media team’s Ghibli, now the most famous example of OpenAI’s new image generation by far. And the first really great, and really impactful, example of bad AI-generated art:

[Image via @WhiteHouse]

A bad Ghibli, we’ll call it. A piece of AI-generated art that is incredibly successful — at conveying something you didn’t intend to convey.

Now, much as discussion of our Ghiblified internet quickly bifurcated into two “sides” that didn’t really matter (“should this happen” or “should this not happen” (it’s happening)), discussion of the White House’s Ghibli immediately collapsed into furious debate over the question of whether the image was “cruel” or “funny.” In fact, this is the lens through which my own first critique of the image was largely viewed, and not only by totally deranged right wing slop monsters like Jack Posobitch, who took a break from hysterically screaming about the Latina Snow White to accuse me, in characteristically dystopian fashion, of “counter signaling” the commandant. Even otherwise intelligent people were distracted by the question of the Ghibli’s goodness, and assumed I agreed it was “cruel.” I did not. Or, I really didn’t think about it much.

It doesn’t matter if the White House Ghibli was cruel or not at any kind of high, meaningful level. That entire subject is the stuff of ephemeral political drama, and everyone mad about it (or ecstatic) will be distracted by something new next week. The important discussion at the heart of the bad Ghibli is what the picture achieved that would not have been possible even just a day before it was posted.

While it’s hard to know exactly what the White House social media team was trying to achieve with its Ghibli (though my sense is probably nothing, and they were simply working from instinct as most of us are online), I think it’s safe to say they didn’t want to evoke a sense of pity in the average American for a deported fentanyl dealer. But that is what they achieved.

The moment their Ghibli went viral, and the first wave of backlash began, many hardline supporters of the White House were quick — naturally, correctly — to point out this was an AI-generated picture of a felon, almost certainly responsible for the death of Americans, who had already been deported once before. She was also, in real photos that emerged, (I am not being petty here, this is important) physically repulsive. Just look at her, the bad Ghibli defenders cried. And I understand the impulse. I mean, take a look yourself:

[Photo via @WhiteHouse]

Who wants this person in their country? Nobody. Who cares if she cries on her way home? Not me. But, incredibly, in order to defend the cartoon, White House allies had to point to the real photo of an actual criminal, which the White House had already released at the time of the Ghibli. In this, we are not using imagery to massage or explain away reality, as we often see with propaganda. We are using reality to explain away an entirely generated controversy that happened, it seems to me, by mistake. Because someone who is not an artist generated real art he didn’t understand.

How did we get here?

An artist working for Studio Ghibli understands when he’s creating a cute, or warm, or cold character, because he has to understand. That’s his job: to produce a piece of art that evokes a certain kind of emotion in an audience, and takes us on a journey. He begins with intention, then uses his skill to express that intention.

We rarely see professional art that communicates the opposite of an artist’s intention (say, a main character the artist intends for an audience to find attractive, whom an audience finds repulsive), because until today, if an artist couldn’t effectively communicate with his art, he wouldn’t eat. But advanced image generation decouples skill and intention. It’s a little inflexible, still, but the only meaningful tool of the artist a prompter can’t yet copy is knowledge. After an AI generates a technically brilliant photo, you still have to understand what you’re looking at. Last week, someone in Washington did not understand what he was looking at.

The White House Ghibli depicted a fat woman crying. But she wasn’t fat in the monstrous manner of the fentanyl dealer. She looked like somebody’s abuela, and not in a knit cap, but in a kitchen hairnet. Really, she reminded me of a cafeteria lady, and millions of Americans likely made that same emotional connection. On X, the White House’s bad Ghibli (close to 75 million views) has five times the view count of the original photo (15 million views). On platforms like Bluesky and Threads, where Trump garners far less sympathy, and therefore no principle of charity, it is likely most users do not even know a “real” picture exists, and they’re certainly not assuming the deported woman is a killer.

In the early hours of the controversy, I honestly doubt Trump even knew about the bad Ghibli. But his most rabid and performative internet fans all decided any criticism of the sobbing cafeteria lady was an indictment of their man. They’ve therefore since spent a great deal of energy attempting to weave a grand theory for how the bad Ghibli was a brilliant 4D chess play, actually.

Some have argued the right wing needs to assert victory through humiliation of its rivals, and destruction of the left’s sacred idols is part of this process. Some have argued the White House social media team is trying to desensitize us all to the notion that anyone deported for living here illegally, no matter how endearing, should garner sympathy. These are clever arguments. They are also obviously bullshit. The White House Ghiblified a fentanyl trafficker and made the average person feel a little bad about her deportation, where previously they didn’t care at all. That is a mistake. A really, really interesting mistake.

My earlier suggestion that GPT is now capable of producing images more “correct” than we can anticipate — than we can even fathom, in most cases — brings us finally to something even more important than art. In x-risk circles, the more thoughtful doomers have never really focused on the Terminator. They’ve worried about an AI that is capable of doing exactly what we ask of it, whether we realize that is what we’re asking or not. The paperclip maximizer is one extreme example, in which an AI asked to efficiently maximize the number of an office supply company’s product turns the entire universe into paperclips. “GPT,” you might ask of some very advanced future model, “bring back American manufacturing in the quickest and most efficient manner possible.” Oops, it just nuked our competition to reignite the war machine that once catalyzed American industrial hegemony. “I didn’t ask for this!” you cry. But you did. And the AI listened.

The White House’s bad Ghibli is not a nuke, but it is an example of misalignment. The AI is trained to answer prompts, not to answer prompts that preserve the prompter’s job. Now, I don’t expect the Jack Posobitches of the world to care about any of this, because Jack is algorithmically misaligned himself. But for the rest of us who are less mentally ill, I think it’s worth considering this tiny hint of what’s to come. Because this first bad Ghibli is not only a bad piece of art. I think, more than anything, it’s a warning worth consideration. We are entering a world in which we will get exactly what we ask for, nothing more and nothing less — Aladdin’s lamp, now within our grasp. 

Be careful what you prompt for.

—SOLANA
