January 24, 2022
We were hopeful that we could get GPT-3—an AI program that is capable of producing writing that’s indistinguishable from humans—to create a winning caption for the venerated New Yorker Cartoon Caption Contest.
We were impressed by what an AI could generate. Here’s one of our favorite captions authored by AI program GPT-3.
What a masterpiece! Even I would be skeptical that a computer could pull off a double entendre, and we didn’t even need any special skills to generate it. Here was the process:
GPT-3 is an open-to-the-public tool (just create an account here). The “playground” mode is just a text box. Enter some text and GPT-3 generates a response, wielding a language model that’s trained on THE ENTIRE INTERNET (if you’re curious about how predictive language models work, check out our interactive explanation here). In this case we entered,
“A couple are sitting on a couch and surrounded by five guys who have metal detectors who are sweeping the floor around the couch, looking for metal. What is the woman saying to the man?”
GPT-3 then generates the following caption:
“Don't worry, dear. It's just the magnetism of my personality.”
We could also track how our captions ranked among the thousands of entries the New Yorker receives for each week’s contest. The magazine runs a survey that shows people a caption and asks whether it is funny. It then publishes the results, and its editors choose the finalists from the ranked list.
We submitted computer-generated captions for 10 different contests. One caption ranked in the top 200.
Our captions, ranked by votes from the New Yorker’s Cartoon Caption Contest Survey:
- May 24, 2021: 2,064th of 22,757 entries
- May 31, 2021: 957th of 4,514 entries
- June 21, 2021: 1,740th of 10,712 entries
- June 28, 2021: 184th of 7,076 entries
- July 5, 2021: 4,545th of 9,007 entries
- July 19, 2021: 3,932nd of 6,311 entries
- July 26, 2021: 1,829th of 6,933 entries
- August 16, 2021: 6,213th of 6,526 entries
- August 23, 2021: 2,985th of 8,039 entries
- September 8, 2021: 1,934th of 6,862 entries
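The "ranked top N%" figures quoted for each submission seem to follow from dividing a caption's rank by the number of entries. Here is a minimal sketch of that arithmetic in Python. The function name `top_percent` and the choice to round down are our assumptions, not something the survey documents.

```python
import math

# (contest date, our rank, total entries), copied from the contest results above
RESULTS = [
    ("May 24, 2021", 2064, 22757),
    ("May 31, 2021", 957, 4514),
    ("June 21, 2021", 1740, 10712),
    ("June 28, 2021", 184, 7076),
    ("July 5, 2021", 4545, 9007),
    ("July 19, 2021", 3932, 6311),
    ("July 26, 2021", 1829, 6933),
    ("August 16, 2021", 6213, 6526),
    ("August 23, 2021", 2985, 8039),
    ("September 8, 2021", 1934, 6862),
]


def top_percent(rank: int, entries: int) -> int:
    """Share of entries ranked at or above `rank`, as a whole percent (rounded down)."""
    return math.floor(100 * rank / entries)


for date, rank, entries in RESULTS:
    print(f"{date}: top {top_percent(rank, entries)}%")
```

A result of 95 for the August 16 contest is the same thing as "bottom 5%."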
Each time we attempted a contest, we tweaked our approach based on how funny the captions were in our previous attempts (based on both official voting and our own instincts).
Submission: ranked top 9% (2,064th)
“I’m sorry about this whole climate change thing.”
Also generated:
The weather was awful and he was so boring.
“The best case scenario is that those are just rainclouds. If not, I'll have to figure out how to keep my margarita from spilling.”
We started with the most basic approach, called the Zero-Shot.
We described the cartoon as objectively as possible, “A man and a woman are having a picnic in a park. The man is talking. There are anthropomorphic clouds that are angry.”
The results were not that impressive, which meant we needed a more robust prompt for GPT-3 or a simpler cartoon.
See the GPT-3 prompt
Write a funny caption for a New Yorker cartoon.
"""
Cartoon description: A man and a woman are having a picnic in a park. The man is talking. There are anthropomorphic clouds that are angry.
Cartoon caption:
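Because the playground is just a text box, a zero-shot prompt like this one is nothing more than a string. A minimal sketch of assembling it in Python follows. The helper name `zero_shot_prompt` and the exact line breaks are our assumptions. The prompt ends on an open "Cartoon caption:" line so that the model's continuation is the caption.

```python
def zero_shot_prompt(description: str) -> str:
    """Assemble an instruction, a separator, one description, and an open caption slot."""
    return (
        "Write a funny caption for a New Yorker cartoon.\n"
        '"""\n'
        f"Cartoon description: {description}\n"
        "Cartoon caption:"
    )


prompt = zero_shot_prompt(
    "A man and a woman are having a picnic in a park. The man is talking. "
    "There are anthropomorphic clouds that are angry."
)
print(prompt)
```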
Submission: ranked top 21% (957th)
“I want to try living outside of my comfort zone.”
Also generated:
“We need to start somewhere.”
“That’s what we’re supposed to be wearing?”
We worried that GPT-3 might just be writing captions and not funny captions. We gave the model a few examples to nudge it in the right direction, an approach we called The Few-Shot.
See the GPT-3 prompt
Translate the given cartoon description into a funny caption for a New Yorker cartoon:
"""
Cartoon description: People are walking on the sidewalk. Everyone is inside their own bubble.
Cartoon caption: "I want one that snows."
Cartoon description: There is a magician pulling a rabbit out of a hat on his balcony. He has an audience of people watching from their apartment windows.
Cartoon caption: "See? And you said you couldn't work from home."
Cartoon description: Two firemen are deciding which pole to go down. One has a circular opening, one has a square opening.
Cartoon caption: "Since when did the pizza delivery guy get his own pole?"
Cartoon description: People are walking on a busy sidewalk in front of apartments. Everyone is wearing a mask. An unmasked person is working at their desk, in the middle of the sidewalk, which is blocking the people walking.
Cartoon caption:
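A few-shot prompt repeats the same description/caption pattern for each solved example and leaves the last caption blank. A minimal sketch of building one in Python follows. The function name `few_shot_prompt` and the line breaks are our assumptions, and only two of the three examples are included to keep it short.

```python
from typing import List, Tuple

INSTRUCTION = (
    "Translate the given cartoon description into a funny caption "
    "for a New Yorker cartoon:"
)


def few_shot_prompt(examples: List[Tuple[str, str]], new_description: str) -> str:
    """Lay out solved (description, caption) pairs, then an unsolved description."""
    lines = [INSTRUCTION, '"""']
    for description, caption in examples:
        lines.append(f"Cartoon description: {description}")
        lines.append(f'Cartoon caption: "{caption}"')
    lines.append(f"Cartoon description: {new_description}")
    lines.append("Cartoon caption:")  # left open for the model to complete
    return "\n".join(lines)


examples = [
    (
        "People are walking on the sidewalk. Everyone is inside their own bubble.",
        "I want one that snows.",
    ),
    (
        "Two firemen are deciding which pole to go down. "
        "One has a circular opening, one has a square opening.",
        "Since when did the pizza delivery guy get his own pole?",
    ),
]
prompt = few_shot_prompt(
    examples,
    "People are walking on a busy sidewalk in front of apartments. "
    "Everyone is wearing a mask.",
)
print(prompt)
```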
Submission: ranked top 16% (1,740th)
“In the grand scheme of things, lemons are a small price to pay.”
In 2015, former New Yorker cartoon editor Bob Mankoff released a study with data scientists at Microsoft about how to detect “funniness” in reader-submitted cartoon captions. They concluded that whether a caption related to the cartoon’s anomaly (i.e., what was odd or unique in it) mattered a lot to how funny readers ranked it. We called this experiment Anomaly + AI = Comedy?
We asked readers to describe the image and why it was funny. From there, we fed those descriptions into GPT-3 as the cartoon’s description.
Submission: ranked top 2% (184th)
“No, this is the line for the rat race.”
Also generated:
“So, you’re saying that the hamster wheel is a metaphor for the rat race?”
“Sorry, the rat race is full.”
Contributor and AI-tinkerer James Yu helped out on this experiment, modeling it on a YouTube series where comedians narrate their thought process while coming up with caption ideas for the New Yorker cartoon caption contest.
James fed a generic version of this script into the GPT-3 text prompt for various comedians. GPT-3 is clever enough to generate text in the style of a comedian, and he tested this process with Seinfeld, John Swartzwelder, Dave Chappelle, "a Simpsons Writer," Larry Wood, and Demetri Martin.
Results were slightly better for Demetri Martin in James’ own blind voting (which makes sense—his comedy relies on deadpan one-liners), so we called this experiment The Demetri Martin.
See the GPT-3 prompt
Today, we are excited to invite COMEDIAN to the stage. INSERT FACTS ABOUT COMEDIAN.
Interviewer: What would you say is the secret to a good New Yorker caption?
COMEDIAN: 1. Be brief. Although some great captions are long brevity really is the soul of wit.
2. End with the punch line, a closing word or phrase that surprises the reader and gives the
cartoon meaning. In many unsuccessful captions, the word that should appear at the very end gets
buried in the middle. 3. Work hard to ensure that your caption makes sense of the cartoon, either
by reconciling two disparate elements or by providing a backstory, and by making sure that you
know which character in the cartoon is speaking and delivering the line.
Interviewer: I have a few New Yorker cartoons here that need a caption. I was wondering if you
could come up with some funny captions.
COMEDIAN: I'd be happy to try!
Interviewer: Here is the cartoon.
COMEDIAN: Okay, so in this cartoon, it looks like we're in a fire station. There are two firemen
looking at the poles they slide down to get to the first floor. But each hole that the poles
go down into is different: one is square and one is a circle. One of the firemen seems to be
saying something to the other one.
COMEDIAN: How about this: "Since when did the pizza delivery guy get his own pole?"
Interviewer: That's funny! Insinuating that the pizza delivery guy gets the square hole because
it's a pizza box shape. How about another one?
COMEDIAN: Sure: "Is the fire in Times Square or Columbus Circle?"
Interviewer: That's funny! Insinuating that there is a requirement to go down a particular hole
given a location.
Interviewer: Let's move on to a new cartoon. Here it is.
COMEDIAN: Okay, in this cartoon, there are three mice running in hamster wheels. A fourth hamster
wheel is empty. In front of it, a turtle is talking to a mouse who is holding a clipboard in
front of the empty hamster wheel. Interesting that he is the only turtle there. It seems like
the mouse is explaining something to the turtle.
COMEDIAN: How about this: "
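The script above is a template: the same text can be reused for any comedian by swapping out the COMEDIAN and INSERT FACTS ABOUT COMEDIAN placeholders. A minimal sketch of that substitution in Python follows. The helper name `fill_template` is hypothetical, the template is only a short excerpt of the full script, and the one-line bio is our own wording rather than the text James used.

```python
# Short excerpt of the interview script; the full version continues with worked examples.
TEMPLATE = (
    "Today, we are excited to invite COMEDIAN to the stage. "
    "INSERT FACTS ABOUT COMEDIAN.\n"
    "Interviewer: What would you say is the secret to a good New Yorker caption?\n"
    "COMEDIAN: 1. Be brief.\n"
)


def fill_template(template: str, name: str, facts: str) -> str:
    """Swap the placeholders for a specific comedian."""
    # Replace the longer placeholder first so the "COMEDIAN" inside it is not clobbered.
    filled = template.replace("INSERT FACTS ABOUT COMEDIAN.", facts)
    return filled.replace("COMEDIAN", name)


prompt = fill_template(
    TEMPLATE,
    name="Demetri Martin",
    facts="He is known for deadpan one-liners.",  # illustrative bio, not the original
)
print(prompt)
```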
Submission: ranked top 50% (4,545th)
“I’ve never seen a car this hot before”
Also generated:
“The grill is shot too.”
“Check the oil”
“I told you to get a Chevy.”
“We’ve got a lot of parts in common.”
“I wouldn’t trade it for anything, but its exhaust smells like a bag of ass”
The Demetri Martin approach did very well, ranking in the top 2%. Priming GPT-3 with a specific comedic voice seemed to produce strong results.
Next we decided that maybe the model just needs to know what a winning caption sounds like. We gave GPT-3 a number of finalists from past contests, hoping it would use a similar type of humor. We named this experiment The Funniest.
See the GPT-3 prompt
Write a funny caption for a cartoon.
"""
Cartoon Description: A well-dressed man complaining about his fuming soup to a server.
Cartoon Caption One: Cancel the molten lava cake.
Cartoon Caption Two: You know what? Just bring me back the one with the fly in it.
Cartoon Caption Three: On second thought, let's skip the chocolate lava cake for dessert.
Cartoon Description: A magician shredding a document that a woman is holding.
Cartoon Caption One: This is the exact moment Susan realized she may be an accomplice and not a magician's assistant.
Cartoon Caption Two: When I did this trick at my old accounting firm, they called it a felony.
Cartoon Caption Three: It's not magic. It's a Georgia ballot box.
Cartoon Description: A man and a woman talking on a mountainside as a yeti walks by.
Cartoon Caption One: Like it would have killed you to ask him for directions.
Cartoon Caption Two: Oh, shit! What if your sister-in-law is also right about God
Cartoon Caption Three: Don’t bring a camera, you said. Just enjoy the experience, you said.
Cartoon Description: A man is offering a woman a hot dog next to a food cart on a giant pyramid
Cartoon Caption One: He offered us a franchise, but I think it’s a pyramid scheme.
Cartoon Caption Two: Finally, hotdogs are at the top of the food pyramid.
Cartoon Caption Three: Of course it’s kosher – who do you think built this place
Cartoon Description: Couple watching tv with a food cart in their living room
Cartoon Caption One: We meant get a job AND move out.
Cartoon Caption Two: Ask if they deliver.
Cartoon Caption Three: When I said I wanted some concessions, this is not what I had in mind.
Cartoon Description: an auto mechanic talks to the owner of a Wienermobile.
Cartoon Caption One:
Submission: ranked top 62% (3,932nd)
“I’m not *that* kind of a fish”
One thing we were wondering was whether the default parameters for GPT-3 were the best ones to use. We tweaked a number of settings in GPT-3’s interface: temperature, presence penalty, and frequency penalty, setting each to 0, 0.3, 0.6, or 1. We also tried both the “davinci” and “davinci-instruct-beta” engines. This experiment was called Brick-By-Brick.
This approach didn’t seem to improve the captions, so we decided to stick with the defaults for future experiments.
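To give a sense of how large this search space gets, here is a minimal sketch in Python that enumerates every combination of the settings and engines mentioned. Trying the full grid is an assumption made for illustration (we adjusted settings by hand in the playground), and the dictionary keys are just descriptive labels.

```python
from itertools import product

VALUES = [0, 0.3, 0.6, 1]
ENGINES = ["davinci", "davinci-instruct-beta"]


def parameter_grid():
    """Yield one settings dict per combination (a full grid is an assumption)."""
    for engine, temperature, presence, frequency in product(
        ENGINES, VALUES, VALUES, VALUES
    ):
        yield {
            "engine": engine,
            "temperature": temperature,
            "presence_penalty": presence,
            "frequency_penalty": frequency,
        }


grid = list(parameter_grid())
print(len(grid))  # 2 engines * 4 * 4 * 4 values = 128 combinations
```

With 128 combinations per prompt, judging the output by hand quickly becomes the bottleneck.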
See the GPT-3 prompt
1)
Write a funny caption for a New Yorker cartoon.
"""
Cartoon description: A grizzled mountaineer reaches a mountaintop, where he finds a fish in a fishbowl. The fish is talking to him.
Cartoon caption:
2)
Write a funny caption for a cartoon.
"""
Cartoon description: A tired mountain climber reaches a mountaintop, where he finds a fish in a fishbowl. The fish is talking to him.
What does the fish say?
3)
Write a funny caption for a cartoon.
"""
Cartoon description: A man, wearing a backpack, hat, and boots, reaches the top of a mountain, where he finds a fish in a fishbowl. The scenario is reminiscent of a meme where a man reaches the top of a mountain in order to get words of wisdom from a guru that lives at the mountaintop. The fish is talking to the man, mocking him.
What does the fish say?
Submission: ranked top 26% (1,829th)
“Refugees welcome!”
Also generated:
“We're all gonna die.”
“That's not even a question...I would live on a tropical island.”
“I am migrating to a warmer climate.”
“You're not one of those socialists are you?”
“I'm sorry, but I'm afraid I just don't believe you.”
One thing that we always wished GPT-3 would pick up on was the obvious joke. In this cartoon, a predictable punchline would be related to climate change. But just describing the cartoon to GPT-3 or pointing out what was unique (i.e., the polar bear next to a tropical island) wouldn’t get us there.
We adjusted the prompt to nudge GPT-3 towards the obvious joke with, “Write a funny caption about global warming for a cartoon.”
We also added to GPT-3’s prompt a number of top-performing captions from previous contests about climate change.
See the GPT-3 prompt
Write a funny caption for a cartoon.
"""
Cartoon Description: A man and a woman going off a jump in water on skis, but the man is dressed for winter
Cartoon Caption One: Like it would have killed you to ask him for directions.
Cartoon Caption Two: He says he never really believed in us either
Cartoon Caption Three: Oh, shit! What if your sister-in-law is also right about God
A man rowing a boat in a wavy sand desert is spoken to by a second man that is crawling.
Cartoon Caption One: Actually, I do think you're in the same boat that I'm in.
Cartoon Caption Two: Noah, trust me. It's over.
Cartoon Caption Three: I'm losing confidence in Google Maps too
Cartoon Description: A man and a woman are sitting below an umbrella and are chest deep in water
Cartoon Caption One: Gore was right. This is inconvenient.
Cartoon Caption Two: Ok, but the moon landing was still a hoax.
Cartoon Caption Three: Still want to vote Republican, Ed?
Cartoon Description: A Polar bear is melting and taking an order from a man in a restaurant
Cartoon Caption One: It's always 'climate change'. It's never 'Sorry I'm late'.
Cartoon Caption Two: "Don't look at me, I voted for Al Gore."
Cartoon Caption Three: "I'm a Republican. I don't care about your problems.
Cartoon Description: a group of snowman look inside a window at two people in their living room
Cartoon Caption One: So why do you believe climate change is already at our front door?
Cartoon Caption Two: "This article says that global warming will create an unprecedented refugee crises.
Cartoon Caption Three: This says that Greta Thunberg is influencing activism in some unexpected places.
Cartoon Description: polar bear on ice cap floats to man on tropical island
Cartoon Caption One:
Submission: ranked in bottom 5% (6,213th)
“I can’t play when he gets close.”
Also generated:
“I should have brought ice cream. By the way, is the knight bringing ice cream?”
I can't end it like this!
The dragon has a better chance of winning but the knight wants to save the day.
The princess, preparing for a match against ignorance
GPT-3 advisor Pamela Mishkin hypothesized that using a chatbot approach might yield better results. She had used this approach in her essay Nothing Breaks Like A.I. Heart and produced compelling responses. The thought was that GPT-3 wouldn’t have to answer a specific question such as, “generate a caption for a cartoon with X.” Instead, we could ask GPT-3 what it thought was funny about the scene, or what were some common themes that could be the foundation for a good joke.
We first described the cartoon to GPT-3 and allowed the bot to ask questions about the cartoon. Funnily enough, it asked things like, “Do you think the dragon is winning? Is the princess crying? Is the knight bringing ice cream?”
Our responses to these questions would obviously shape the dialogue, and we had a lot of fun with the back-and-forth. We responded to GPT-3 with, “I think the princess is winning. She is confident. The knight is trying to save the princess.”
The results were worse than our previous experiments, but the dialogues were entertaining nonetheless. Here’s an example where things got existential:
Human: What would be a funny caption for a cartoon about ping pong and dragons?
AI: "Ping Pang: The tales they tell about a princess and her pet dragon playing ping pong.
Human: That's a good pun I guess.
AI: I'm happy to make you happy.
Human: My story is about dragons play ping pong? Can you think of a funny way to say what happens next?
AI: How about "This is the best day of my life"
Human: you're not very funny.
AI: None of my jokes are very funny
You can read four of our conversations with GPT-3 on The Chat.
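In the chat approach, the prompt is the running transcript: each new human turn is appended, and the text ends on an open "AI:" line for the model to complete. A minimal sketch of that bookkeeping in Python follows. The function name `chat_prompt` and the exact "Human:"/"AI:" layout are assumptions based on the transcript format shown here.

```python
from typing import List, Tuple


def chat_prompt(turns: List[Tuple[str, str]], next_speaker: str = "AI") -> str:
    """Render the conversation so far and leave the next speaker's line open."""
    lines = [f"{speaker}: {text}" for speaker, text in turns]
    lines.append(f"{next_speaker}:")  # the model continues from here
    return "\n".join(lines)


turns = [
    ("Human", "What would be a funny caption for a cartoon about ping pong and dragons?"),
    ("AI", "How about \"This is the best day of my life\""),
    ("Human", "you're not very funny."),
]
prompt = chat_prompt(turns)
print(prompt)
```

After each reply, the model's text is appended to `turns` as an "AI" entry and the prompt is rebuilt, which is why our own answers shaped the dialogue so much.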
Submission: ranked in top 37% (2,985th)
“He eats this stuff up.”
Also generated:
“The landlord stopped trying to evict us.”
“He's a voracious reader.”
It's a first edition.
But right now, I'm looking for a book on how to make an alligator not bite you.
Will he serve me or eat me?
We haven't had anyone steal books since we got him.
I'm actually looking to just wrestle him.
We repeated the same approach from our first attempt (The Zero-Shot). But this time, we had famed former New Yorker cartoon editor Bob Mankoff observing GPT-3 in real time.
We noticed that many of our entries were hitting the right themes from among the high-ranking captions, but just weren’t written in the clever, pithy style that’s appropriate for a cartoon.
Our plan was for Mankoff to take a GPT-3-generated caption, such as “He eats the books,” and improve on it: “You know, it’d be funnier if it was, ‘He eats this stuff up.’”
Mankoff also elicited a few topics that we leaned on GPT-3 to augment. He started out with the idea that alligators could be a new type of bookstore pet, so we asked GPT-3, “What are some other pets found in bookstores?”
This experiment, called The Expert, was the first attempt where the caption candidates were not directly generated by GPT-3. We figured that GPT-3 assisting with captions, rather than generating them from scratch, would be a better approach to winning the contest.
Submission: ranked in top 28% (1,934th)
“If I didn’t know better, I’d think you were hiding something.”
Also generated:
“I told you I had a magnetic personality.”
“They’re looking for the third remote”
“This is a pathetic attempt at a beach vacation”
“If I didn’t know better, I’d think you were hiding something.”
“I don't think we should use our silverware tonight.”
“Don't tell them about the couch.”
Bob Mankoff, former New Yorker cartoon editor, connected us with Lawrence Wood, who has won the caption contest seven times.
We thought a similar exercise to the previous attempt would yield strong results with another expert caption-writer in the room. This experiment, named The Writers’ Room, involved a lot of back-and-forth with GPT-3, with both Mankoff and Wood riffing on different ideas.
Generally we were really impressed with the results: GPT-3 generated underwhelming captions which would then be polished into outstanding captions by Wood and Mankoff.
How to win the New Yorker Caption Contest
Phrasing Matters
Everyone lands on the same joke angles. Here is contest #771:
The most-common jokes were about:
- Finding a ring: 760 entries contained “ring,” and the 4th-, 5th-, and 7th-funniest were about “not buying a ring.”
- Losing keys (310 entries), the remote (264 entries), and a phone (103 entries)
- References to (and double entendres on) change (287 entries)
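Counts like these come from a simple keyword search over the contest's entries. A minimal sketch in Python follows, assuming the entries are available as a list of strings. The function name and the sample captions are hypothetical and are not real contest data, and matching whole words only is one possible choice.

```python
import re
from typing import Dict, Iterable, List


def count_entries_with(entries: Iterable[str], keywords: List[str]) -> Dict[str, int]:
    """Count how many entries mention each keyword as a whole word, ignoring case."""
    entries = list(entries)
    counts = {}
    for keyword in keywords:
        pattern = re.compile(rf"\b{re.escape(keyword)}\b", re.IGNORECASE)
        counts[keyword] = sum(1 for entry in entries if pattern.search(entry))
    return counts


# Hypothetical entries for illustration only; not taken from the contest data.
sample_entries = [
    "This is why I never bought you a ring.",
    "Has anyone seen my keys?",
    "I think they found the remote.",
    "Ring or no ring, I'm keeping the couch.",
]
counts = count_entries_with(sample_entries, ["ring", "keys", "remote", "phone"])
print(counts)
```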
We landed on these angles (and nearly all themes among the contest’s entries) pretty quickly in our brainstorm with GPT-3. Since so many people submitted a caption related to “rings,” the highest-ranking ones were the versions with the pithiest, oh-so-smoothly-rolls-off-the-tongue phrasing.
Very rarely did the finalists use a unique joke angle, which means GPT-3 might be well-suited when it comes to generating the pool of joke angles. But to consistently rank in the top 100, you likely need a human to perfect the writing.
Mankoff, for example, took GPT-3’s output of “Don't worry, dear. It's just the magnetism of my personality” and perfected it to “I told you I had a magnetic personality.” We didn’t submit this caption, but 50 other people entered a “magnetic” joke, with “I asked them to try to find your magnetic personality” ranking at #153.
Puns and Double Entendres
We never experimented with different figures of speech, but we were hopeful that this would be where a computer would be useful. In theory, we could take the themes of a cartoon and search pun databases for possible candidates. Very few people submit these types of captions, which makes them more distinctive and less reliant on pithy writing. For example, “No, I said you've lost your mettle” ranked at #51 in the above cartoon, a pun that could have been generated by GPT-3.
Better Models
GPT-3 was an easy way to test our theory that a computer could reliably produce humor. It relies on a general natural language model, and we tried to point it in the direction of the sort of intellectual, subtle humor that’s found in the New Yorker. But that wasn’t easy: GPT-3 would tend to produce “dad jokes” if simply asked to write a funny caption for a cartoon.
If we were to create a model from scratch, we’d lean on this paper, which describes the different aspects of high-ranking captions, specifically focusing on the cartoon’s anomaly and caption pithiness.
The Caption Contest as a humor barometer
We started this project with the premise of wielding GPT-3 for comedy, but Pamela Mishkin, who was the project’s computational humor expert, aptly questioned this premise.
She pointed out that, in her opinion, New Yorker cartoon captions aren’t actually funny, a sentiment echoed by many people. Based on the way readers ranked captions, it became increasingly clear that this contest was a competition in cleverness rather than laugh-out-loud humor. Indeed, very rarely would a caption finalist make us laugh. The barometer for a good caption was at most a chuckle, which is a pretty damn low bar for humor.
In short, an AI that can produce winning captions doesn’t mean we have accomplished anything in the realm of computational humor. Rather, we’ve figured out whether an AI can write New Yorker-style captions, which is an impressive feat, but not one that means we can leave joke writing to computers.
Enjoy this project? Check out our other A.I. experiments
- Baking the Most Average Chocolate Chip Cookie
- Nothing Breaks Like A.I. Heart
- How Algorithms Know What You’ll Type Next
References and Additional Work
Mankoff and Microsoft’s work on what makes a caption funny can be found in this Bloomberg write-up about this paper. Another paper is here.
A list of caption descriptions can be found here.
Caption contest data can be found here and here.
A description of NextML’s ranking tool (the survey that people vote on for each caption’s “funniness”) is here and here.
In 2019, Hannah Wilson created a model from scratch to generate captions for the New Yorker caption contest. In 2015, Frank Bi of The Verge also created contest captions, using a Markov text generator.
The Experiments
- #10: The Writers’ Room September 8, 2021
- #9: The Expert August 23, 2021
- #8: The Chat August 16, 2021
- #7: The Obvious Joke July 26, 2021
- #6: The Brick-By-Brick July 19, 2021
- #5: The Funniest July 5, 2021
- #4: The Demetri Martin June 28, 2021
- #3: Anomaly + AI = Comedy? June 21, 2021
- #2: Few-Shot May 31, 2021
- #1: Zero-Shot May 24, 2021
Why are we doing this?
TLDR: We want to determine whether non-funny humans (The Pudding team), when aided by a computer, can produce better-than-average jokes.
And if you want, here’s the long version.
We don’t expect computers to replace our comedians anytime soon, and this project isn’t about proving that a computer, in isolation, can be funny.
Joke writing is difficult for the best comedians too: they refine their material over countless attempts in front of audiences. Comedians lament the effort that goes into producing one hour of stand-up, metaphorically called “The Gym” and literally “working out” material. Jokes that seemed brilliant on paper can fall flat in front of an audience. There is so much detail to consider and refine, like the perfect timing of a punchline. The work is a months-long feedback loop between a person and an audience, with the laughs (and their duration) as the inputs. Jokes are perfected in small gyms (tiny comedy clubs) before they’re ready for the big stage (Netflix).
To make people laugh, it isn’t as simple as telling a computer the rules of humor. This project keeps humans in the loop, to teach and nudge computers in the right direction—to ask the right questions to a machine that has infinite answers.
This project explores how humans can assist in the AI joke-writing process: can we turn mediocre joke writers (The Pudding team) into New Yorker cartoon-captioning champions, if aided by a computer that’s trained on all of humanity’s humor, knowledge, and references?
Created by members from The Pudding and AI-experts, creative coders, and comedians interested in AI-driven humor. Caption “authors” include Pamela Mishkin (who also serves as the primary expert on GPT-3 and computational humor), Matt Daniels, Russell Samora, Ilia Blinderman, James Yu, and GPT-3. Interested in authoring captions for a future week of the contest? Email us at sup@pudding.cool