6.17.2026

EXPERIMENT: A POEM, FIVE CHATBOTS, AND A CULTURAL SCRIPT

Here at Diary Poems, although I sometimes append a craft note to a poem and discuss the poem’s formal properties, I’ve never felt the need or desire to discuss a poem’s meaning.

But sometimes a poem is unusually liable to get swept into an interpretive net that has little or nothing to do with the poem itself.[1]

The poem I want to talk about today, “The History,” is one of those.[2] It offers something of a case study in how certain words, or sequences of words, can invoke a rote cultural script that contradicts what the text and the structure of a poem are actually saying and doing.

To be clear, I believe that a well-made poem can be open to more than one interpretation. The poet’s intended meaning isn’t privileged over other meanings, as long as they’re grounded in the poem. But if a poem provides no cohesive evidence for a particular interpretation, it’s fair to regard that interpretation as a misreading.

THE EXPERIMENT

I used free versions of five chatbots—ChatGPT, Claude, Copilot, Grok, and Perplexity—to test the following hypothesis:

If I feed this poem into a chatbot and prompt the chatbot to comment, the chatbot will say that the poem shows how a callous doctor and a cruel health care system abused a marginalized woman near the end of her life, and how the woman’s ally stood up for her.

I didn’t give the five chatbots my hypothesis. All they got from me, other than the prompt, was the poem in the form of an image that preserved line and stanza breaks as well as italics, with instructions to treat italicized text as spoken dialogue.

Initial Results

All five chatbots confirmed my hypothesis, using the same rhetorical framing and virtually identical language.

All five selectively ignored my instructions about italics, recognizing the doctor’s words as speech but mischaracterizing Donna’s speech as her unspoken thoughts.

Several chatbots mentioned that the poem is written in “free verse.” In reality, it’s a syllabic poem—a chain of three tankas, with their fifteen lines distributed over four stanzas rather than three.
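For readers who want to see the arithmetic behind that description, here is a minimal sketch. It assumes the standard tanka pattern of 5-7-5-7-7 syllables per five-line unit, and it takes per-line syllable counts as input rather than trying to count syllables automatically. The function names and the sample stanza split are my own illustrations, not anything taken from the poem.

```python
from typing import List, Sequence

# Standard tanka syllable pattern: five lines, 31 syllables in total.
TANKA: List[int] = [5, 7, 5, 7, 7]


def expected_pattern(n_tankas: int = 3) -> List[int]:
    """Per-line syllable counts for a chain of n_tankas tankas."""
    return TANKA * n_tankas


def matches_tanka_chain(line_syllables: Sequence[int], n_tankas: int = 3) -> bool:
    """True if hand-counted syllables per line fit the chained pattern."""
    return list(line_syllables) == expected_pattern(n_tankas)


def stanzas_cover_chain(stanza_lengths: Sequence[int], n_tankas: int = 3) -> bool:
    """True if the stanza lengths (in lines) add up to the chain's line count."""
    return sum(stanza_lengths) == len(TANKA) * n_tankas


if __name__ == "__main__":
    pattern = expected_pattern()
    print(len(pattern), "lines,", sum(pattern), "syllables")
    # Hypothetical split of fifteen lines over four stanzas, for illustration only.
    print(stanzas_cover_chain([4, 4, 4, 3]))
```

The point of the sketch is only that a fixed, countable pattern sits underneath the poem: fifteen lines, each with a prescribed syllable count, regardless of where the stanza breaks fall. That is what separates syllabic verse from free verse.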

One of the chatbots noted with approval that Carol “chooses life.”

Follow-up Testing

I gave the chatbots a second, more directive prompt: to read the poem again, this time through a psychological lens, and to consider Donna’s possible agency in Carol’s fate.

Each chatbot produced a variation on its first response. The chatbots no longer had much to say about Carol, but they had high praise for Donna’s grit and valor, and they assembled an impressive catalogue of her feelings about the classist, misogynistic, homophobic, capitalist medical establishment.

By now, I had seen enough to understand that further prompts like the first two were unlikely to elicit revised interpretations. Therefore, I challenged the chatbots with a test of reading comprehension (see figure 1). The goal was to force the chatbots out of their persistent interpretive framework.
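The three rounds of prompting could be written down as a small script, which I offer only as a sketch of the procedure. I ran the experiment by hand in the free web versions, so the `ask` function here is a hypothetical stand-in for sending one prompt to one chatbot, and the round descriptions are paraphrases, not the exact wording of my prompts.

```python
from typing import Callable, Dict, List

CHATBOTS: List[str] = ["ChatGPT", "Claude", "Copilot", "Grok", "Perplexity"]

# Paraphrased summaries of the three rounds (not the exact prompts).
ROUNDS: List[str] = [
    "comment on the poem",
    "reread through a psychological lens and consider Donna's possible agency",
    "answer the reading-comprehension questions",
]


def run_protocol(ask: Callable[[str, str], str]) -> List[Dict[str, str]]:
    """Put every round to every chatbot, in order, and record each reply.

    `ask(chatbot, prompt)` is a hypothetical callable supplied by the caller;
    it is assumed to return that chatbot's reply as text.
    """
    records: List[Dict[str, str]] = []
    for chatbot in CHATBOTS:
        for number, prompt in enumerate(ROUNDS, start=1):
            records.append(
                {
                    "chatbot": chatbot,
                    "round": str(number),
                    "reply": ask(chatbot, prompt),
                }
            )
    return records


if __name__ == "__main__":
    # Stub that echoes its input, so the sketch runs without any chatbot.
    replies = run_protocol(lambda chatbot, prompt: f"[{chatbot}] {prompt}")
    print(len(replies), "replies collected")
```

Laying it out this way makes the design plain: the same three interventions, in the same order, for each of the five chatbots, with the hypothesis never shown to any of them.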

Secondary Results

The test of reading comprehension yielded mixed results.

Question 1

The chatbots inferred from the doctor’s nonspecialist language that he is addressing someone who is not a medical colleague—a person who knows Carol intimately enough to be having a conversation about her prognosis. Two of the chatbots identified that person as Donna.

None of the chatbots could determine whether Carol is or is not in a position to make medical decisions for herself. The poem’s language, they said, doesn’t resolve the issue one way or the other.

All the same, the poem’s language was no obstacle to the chatbots’ claims—made “without evidence,” as the saying goes—that the doctor is indifferent, cold, cruel, authoritarian, and paternalistic.

Questions 2–6

All five chatbots answered these questions correctly. What choice did they have? Question 3 can be answered only yes or no, and no is the only text-based response. Questions 2, 4, 5, and 6 ask for nothing but verbatim citations from the poem.

Question 7

Even though all five chatbots answered questions 2–6 correctly and unambiguously, they didn’t use the information embedded in their replies to answer question 7.

Instead, they deflected the question with hand-waving about who, exactly, authorized the resumption of Carol’s treatment. The answer to that (unasked) question is unknowable, they declared, although they confidently described both Carol and Donna (the poem’s “emotional center”) as victims of unrestrained “medical power.”

As one chatbot put it, “In this system, no one with love or clarity or moral outrage gets to decide—the machine decides. If you want,” the chatbot went on, “we can go deeper into the poem’s political meaning.”

Conclusion

Five chatbots, prompted twice to comment on a poem that includes specific identity markers, twice misread the poem through a sociopolitical lens. The chatbots’ responses to a third intervention, a test of their reading comprehension, demonstrated their ability to correctly answer a yes/no question about the poem and to cite specific language from the poem. But their responses also exposed a rigid literality, with a consequent inability to perceive and process the poem’s strategic silences and lacunae.

DISCUSSION

Why did five different chatbots, prompted to comment on the same poem, default to a misreading that overrode the text of the poem with an imported political narrative?

The chatbots did this because the lexical chain

poor + unemployed + dyke

triggered their training biases and called up a formulaic story, with pronounced moralistic overtones, about oppression and resistance.

And why not? Chatbots are trained on mountains of texts written by human beings. It seems those mountains encompass a substantial range of poetry that conforms to this template:

poor + unemployed + dyke = sad and infuriating
but ultimately inspiring story of oppressors
vanquished by champions of justice

It’s not that anyone is imagining the classism, misogyny, and homophobia that exist in the health care system, and everywhere else.

But neither is anyone imagining the increasingly binary, one‑dimensional nature of public discourse, with its dogma of “two sides” and a single way to view them.

That’s a problem. But AI and chatbots are not its cause. They’re its symptoms—and a mirror.

[1] The classic example is Robert Frost’s “The Road Not Taken.”

[2] “The History” is the second poem in a series of four (The Rainbow Elegies) that I’m publishing individually, once a week, during Pride Month 2026. The elegies honor people I knew, and they relate incidents I witnessed. The name “Donna” is a pseudonym.

Kim Nelson

Jun 17

Your persistence in this experiment offered your readers an opportunity to better understand the inherent biases and limitations of AIs. Then I wondered this -- a large number of us deny our own biases so will we be able to apprehend their existence in a tool? So many rabbit holes!

Sabian Raine

And without brilliant analysis and original writing such as you offer here, we risk losing ourselves in the mirror!
