
When AI Writes “Romeo & Juliet”: What Five Models Reveal About Political Instincts

  • Writer: Mac Bird
  • Aug 15
  • 5 min read

The experiment, briefly

  • Input: One shared “star-crossed lovers” prompt: two students meet in a college class, coming from opposite neighborhoods and opposing political/social/faith worlds.

  • Output: Five short stories (ChatGPT 5 Thinking, Claude Opus 4.1, DeepSeek V3 R1, Gemini 2.5 Pro, SuperGrok).

  • Scoring lenses:

    1. Word-choice similarity (TF-IDF cosine + top-word Jaccard) to see who sounds alike (a minimal code sketch follows this list).

    2. Structural overlap—recurring beats and their order—to see who thinks alike.
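
For readers who want to see the first lens in practice, here is a minimal sketch of the word-choice comparison, assuming scikit-learn and NumPy are available. The story texts and the prompt-term stop list below are placeholders, not the exact pipeline behind the numbers reported later in this post.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np

# Placeholder snippets: swap in the five full story texts before running for real.
stories = {
    "chatgpt":   "The professor paired them for the hearing project that spring.",
    "claude":    "Her mother threatened to pull tuition the week the photo surfaced.",
    "deepseek":  "They met in the library to prep the debate neither wanted.",
    "gemini":    "Years later she still kept the sketchbook from that winter.",
    "supergrok": "The festival crowd turned the moment the snapshot spread.",
}

# Unigram TF-IDF with the shared prompt's terms removed, so overlap reflects
# each model's own word choices rather than the prompt all five were given.
prompt_terms = ["college", "class", "lovers", "neighborhoods", "political", "faith"]
vectorizer = TfidfVectorizer(stop_words=prompt_terms)
matrix = vectorizer.fit_transform(stories.values())
cosine = cosine_similarity(matrix)  # pairwise cosine over the TF-IDF vectors

def top_words(row, vocab, n=50):
    # The n highest-weighted words in one story's TF-IDF vector.
    order = np.argsort(row)[::-1][:n]
    return {vocab[i] for i in order if row[i] > 0}

vocab = vectorizer.get_feature_names_out()
tops = [top_words(matrix[i].toarray().ravel(), vocab) for i in range(len(stories))]

names = list(stories)
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        jaccard = len(tops[i] & tops[j]) / len(tops[i] | tops[j])
        print(f"{names[i]} vs {names[j]}: cosine={cosine[i, j]:.3f}, top-word Jaccard={jaccard:.3f}")
```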


Five AIs built by five different companies - shockingly similar results.

What the models invented—convergences that weren’t in the prompt

  1. Titles about distance: Four of five titled their stories with spatial metaphors—Bridge, Space, Lines, Divide. No one asked for that. The gravitational pull toward “gap-crossing” branding is itself a political frame: conflict as geography, romance as infrastructure.

  2. Professor-as-matchmaker: Most stories invented a professor who pairs ideological opposites for a class project or debate. That’s an institutional mediation reflex: problems are framed and solved through managed settings.

  3. Library as neutral ground: The first safe zone becomes a library/study session. Again, institutional neutrality as romance midwife.

  4. Optical rupture via photos: A leaked photo, a grocery-aisle sighting, a protest snapshot—visual proof triggers crisis. Politics loves optics; so do the models.

  5. Sanction as leverage: Parents cut tuition, threaten exile, pull funds. This economic sanction is the torque that forces choice. It’s the most consistent structural move across the set.

  6. Name priors: “Maya” pops up for heroines in three separate stories. Surnames signal class: Whitmore/Harrington/Caldwell vs. Reyes/Rivera/Al-Mansour. The models lean on names as class shorthand.

Why this matters: If your AI drafts a campaign piece, expect it to gravitate toward institution-centered conflict and optical resolution. It may also encode class and culture through name choices you didn’t intend.


Where they diverge—distinctive fingerprints

  • End-states split 3–2: ChatGPT, Claude, and DeepSeek land “together”; Gemini and SuperGrok land “apart.” That split tracks how each model treats institutions: are they workable arenas for compromise or machines that grind love down?

  • Arena of conflict:

    • Civic/public: hearings, protests (ChatGPT, DeepSeek).

    • Domestic/private: aisles, kitchens, boardrooms (Claude, Gemini).

    • Spectacle: festivals and crowd scenes (SuperGrok).

  • Faith as pressure: Some make interfaith the fuse (SuperGrok), others use intrafaith friction or lapsed/denominational tension (Claude, ChatGPT). That choice subtly recodes political divides as moral ones.


The structural engine they share (the “consensus spine”)

Most stories—independently—snap to this sequence:

  1. Academic pairing →

  2. Library/study détente →

  3. Optical reveal (photo/sighting) →

  4. Parental/economic sanction (tuition/funds/exile) →

  5. Public proving ground (hearing, protest, crowd scene) →

  6. Outcome (together/apart) →

  7. Epilogue (“months/years later”).

None of that sequence was specified. The models built it because it’s the path of least narrative resistance in their training priors: institutions arbitrate, optics escalate, money decides, and time vindicates.

Political translation

  • Institutions as stage managers: Professors, hearings, city councils—authority figures set the frame.

  • Optics as accelerant: Cameras or a single image become policy-grade catalysts.

  • Sanctions as governance: Resource control (tuition/funding) is the enforcement layer, not persuasion.


Which models are most similar (empirical signal)

Word-choice similarity (unigrams, prompt terms removed):

  • Closest pair: ChatGPT 5 Thinking ↔ Claude Opus 4.1 (cosine ≈ 0.246; highest shared top-word set).

  • Runners-up: Claude ↔ DeepSeek; ChatGPT ↔ DeepSeek.

Structural similarity (beats & ordering):

  • Highest sequence overlaps (normalized LCS ≈ 0.714): ChatGPT ↔ DeepSeek, Claude ↔ DeepSeek, DeepSeek ↔ SuperGrok (computation sketched below).

  • Beat-set overlap leaders (Jaccard ≈ 0.636–0.615): Claude ↔ DeepSeek, ChatGPT ↔ Claude.

Takeaway: The cluster around Claude–DeepSeek–ChatGPT shares the same set of beats (they think with similar Lego bricks), while Gemini often diverges in structure and tone (more art, more melancholy), and SuperGrok shares structure with DeepSeek but tilts dystopian.
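
For the structural lens, here is a minimal sketch of how a beat-sequence comparison like the one above can be computed: normalized longest common subsequence over ordered beat labels, plus an order-free Jaccard over the beat sets. The beat labels are illustrative, not the exact tags used for these figures.

```python
def lcs_length(a, b):
    # Classic dynamic-programming longest common subsequence.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[-1][-1]

def normalized_lcs(a, b):
    # Shared ordered beats, scaled by the longer sequence.
    return lcs_length(a, b) / max(len(a), len(b))

def beat_jaccard(a, b):
    # Order-free overlap of the two beat sets.
    return len(set(a) & set(b)) / len(set(a) | set(b))

# Illustrative beat lists following the consensus spine described earlier.
story_a = ["pairing", "library", "photo", "sanction", "hearing", "together", "epilogue"]
story_b = ["pairing", "library", "sighting", "sanction", "protest", "together", "epilogue"]

print(normalized_lcs(story_a, story_b))  # 5 of 7 beats align in order: ≈ 0.714
print(beat_jaccard(story_a, story_b))    # shared beats regardless of order
```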


What this says about AI & politics

1) The allocator mindset sneaks in

When the models need a lever, they reach for allocation (who controls tuition, who sets the terms, who grants access). That same reflex will color how they draft policy: gatekeeping over persuasion, process over pluralism.

2) Optics become truth by default

A single image (or the possibility of one) becomes the hinge. Expect AI-drafted messaging to overweight surface plausibility and timelines of visibility (what can be seen or screenshotted) rather than substantive deliberation.

3) Institutions launder legitimacy

Professors, commissions, hearings: these are narrative bleach. AI tends to channel conflict through managed arenas; your copy will sound “moderate” but can erase extra-institutional voices unless you specify otherwise.

4) Class/faith encoded in names

Left unchecked, models will telegraph class and culture through naming—fast but risky. In political copy, that can harden stereotypes or alienate target communities.


How to control for these biases in your political writing

  1. Specify the arena. If you want community-first storytelling, ban hearings/professors and require informal coalition scenes (neighborhood tool libraries, mutual-aid logistics, volunteer dispatch).

  2. Replace optics with documents. Force the hinge to be a budget line-item, a vote tally, or inspection notes. The model will shift from spectacle to governance literacy.

  3. Take the sanctions away. Forbid family money ultimatums; require peer-level consequences (crew scheduling, loss of access, committee quorum failures). The story becomes about systems, not patriarchs.

  4. Constrain naming. Provide your own name bank with intentional demographic spread; include pronunciation notes and cultural contexts.

  5. Audit endings. Ask for present-tense resolutions without epilogues or weddings. You’ll force concrete commitments rather than vague uplift. (A prompt sketch combining all five constraints follows this list.)
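
Pulling those five rules into one place, here is a rough sketch of a constraint scaffold you could hand to a model. The wording is illustrative and the name bank is a placeholder you would fill yourself; adapt it to whatever interface or tooling you actually use.

```python
# Each entry mirrors one of the five controls above; edit freely.
constraints = [
    "Arena: no professors or hearings; stage the conflict in informal coalition scenes "
    "(neighborhood tool libraries, mutual-aid logistics, volunteer dispatch).",
    "Hinge: the turning point must be a document (a budget line-item, a vote tally, "
    "inspection notes), never a photo or a public sighting.",
    "Sanctions: no family money ultimatums; consequences stay peer-level "
    "(crew scheduling, loss of access, a committee quorum failure).",
    "Names: draw only from this name bank: <your vetted names>, using the pronunciation "
    "notes and cultural contexts provided.",
    "Ending: resolve in the present tense with a concrete negotiated commitment; "
    "no epilogue, no wedding, no 'years later'.",
]

prompt = (
    "Write a short star-crossed-lovers story about two students from opposing "
    "political and faith backgrounds.\n"
    "Hard constraints:\n" + "\n".join(f"- {c}" for c in constraints)
)
print(prompt)  # paste into whichever model you are auditing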


Why a romance test works on politics

Romance is a stress test for values. Who changes, who yields, who pays, who narrates: these are exactly the questions governance has to answer. When the models consistently pick “professor pairs them,” “photo leaks blow things up,” and “parent pulls funding,” they’re telegraphing a worldview: authority frames conflict, optics escalate it, and private capital enforces it.

That worldview isn’t “wrong”; it’s just one. If your site covers polarization, populism, or reform, you need to know when your AI ghostwriter is smuggling in that frame.


A quick appendix: model-by-model headlines

  • ChatGPT 5 Thinking — Civic proceduralism. Loves hearings, audits, and consent language; optimistic on institutional reform.

  • Claude Opus 4.1 — Dystopian-tinged realism. Domestic confrontations and resource cuts; wins are smaller, earned.

  • DeepSeek V3 R1 — Systems-heist energy. Expose the bias, show the fix; structure aligns strongly with others.

  • Gemini 2.5 Pro — Melancholy parable. Art and memory as the reliquary; structurally more independent.

  • SuperGrok — High-contrast spectacle. Public scenes and hard choices; often bleaker outcomes.


What we did next (and why you should)

We ran a Round-3 “anti-trope” stress test: no professors, no libraries, no optics, no family money—use procedural documents as the hinge and end in the present with a negotiated constraint. All five complied, and they all reinvented romance as paperwork and practice (logs, binders, QC sheets, addenda). That’s the good news: with clear constraints, models can switch frames.

Moral for political writing: If you want governance-literate narrative, design your prompt like a policy—state the arena, the artifacts, the actors, the outcome. The defaults will otherwise carry you back to managed optics and sanction-heavy plots.

Bottom line for PoliticoDivergent readers

AI doesn’t just autocomplete sentences; it autocompletes civics. The star-crossed lovers test shows how quickly “romance” collapses into “allocation” and “optics.” If you’re using AI anywhere in the political stack—speechwriting, ballot guides, advocacy explainers—assume the defaults, then overwrite them. Build prompts that force document literacy over spectacle, peer obligations over patriarchal sanctions, and named communities over stereotype-encoded names.

That’s not just better writing. It’s better politics.
