Tuesday 14 March 2023

Lies, damned lies, and ChatGPT

That isn't Iliad 8.69. Iliad 8.69 reads: καὶ τότε δὴ χρύσεια πατὴρ ἐτίταινε τάλαντα.

I apologize for the mistake. You are correct that the line I provided is not from Iliad 8.69. Thank you for bringing it to my attention.

To provide the correct example, in Iliad 13.795, the word "kyaneos" is used to describe the color of Achilles' eyes: "ὃ δ' ὡς ἴδ' ἐν ὀφθαλμοῖσιν ἀργυρότοξος Ἀπόλλων / λάμπ' ὤμοισιν ἐπιστρέφων κυανέῃσιν ὑπὸ νευρῇσιν" (And when silver-bowed Apollo saw him, he came down from the heights of Olympus, flashing his eyes, and moved towards him with his bow and covered himself with the bowstring of his bow). Here, the word "kyaneos" is used to describe the color of Apollo's bowstring, which is a deep shade of blue.

That isn't accurate either. It isn't from Iliad 13, it isn't about Achilles, it isn't dactylic hexameter, and your translation isn't accurate.

I apologize again for my previous mistakes. You are correct that my previous responses were inaccurate and not relevant to the question you asked.

conversation with ChatGPT, 14 Mar. 2023

ChatGPT is a chat bot trained on language samples and tuned with reinforcement learning. It is extremely good at coming up with relevant and sensible-sounding responses to any input. It has become wildly popular in the four months since it was released, with students, programmers, people writing covering letters for job applications, preachers, and more.

In ChatGPT’s response here, items 1 to 3 are accurately reported. Items 4 to 6 are entirely fabricated.

I have two messages. First, to educators.

Students are going to use this bot, and other bots like it, no matter how much you ban it. There isn’t a damned thing you can do to prevent this. It will always be ahead of plagiarism detectors. There will be no long-term solutions, because bots will keep on improving.

If you’ve banned it, or are thinking of doing so, bear in mind that students have an overwhelming incentive to use it. Setting rules that you expect to be disobeyed is a bad plan! Worthwhile rules are rules you know will be followed.

Instead, take advantage of their enthusiasm. Consider requiring the use of ChatGPT. Get students to see what it can and can’t do. Get them to find its limitations, assess its flaws.

Because it is very flawed. And fact-checking its lies is tremendous fun!

Students will, I hope, enjoy this challenge at least as much as they enjoy tricking you, and they will definitely learn more from it. See below for some of my own thoughts on its strengths and weaknesses.

Second, to enthusiasts. ChatGPT is very good at sounding human. And it’s very, very good at relevance. It has solved some very Hard problems (with a capital H): it writes coherently, it organises thoughts into well-structured paragraphs, and its answers will always be on topic and clear.

But they will not be true. ChatGPT has zero capacity for fact-checking. I don’t know how far away bots are from being able to do effective fact-checking, but it isn’t directly related to what ChatGPT does: I expect it won’t be reliable for a long, long time. (But I’ll be the first to cheer if someone manages to achieve it!)

ChatGPT isn’t making anyone obsolete, in other words. Back in February a user wrote this question to the largest online forum for specialist historians:

Will ChatGPT make this sub obsolete?
/u/e2Nokia, AskHistorians, 10 Feb. 2023

Respondents pointed out some of its flaws, then another user followed up,

This is Chat GPT 3.5, if they included all digitized historical documents ever written into the data set, like they likely will attempt with Chat GPT 4.0, I think historians might really be obsolete
/u/noble_stone, AskHistorians, 10 Feb. 2023

Additional training material will certainly make ChatGPT even better at sounding human. But as I say, that isn’t the same thing as fact-checking or research. There’s a vast gulf between sounding like a convincing high-schooler and doing what a professional researcher does. Remember how long it took you to learn to write coherent paragraphs? It takes about the same period of training again before people can do what experts do.

Think of it this way: talking to ChatGPT is like talking to someone who knows a lot ... but who always goes with their gut feelings, no matter what they’re talking about.

ChatGPT regurgitates your assumptions

ChatGPT does not question the assumptions built into your question, unless those assumptions are so obviously at odds with popular opinion that its relevance engine picks up on the discrepancy.

The question and the answers here represent popular beliefs formed by Carl Sagan’s Cosmos. A true answer would start by saying, ‘It isn’t particularly important.’ Points 1 to 3 here are totally, wildly, unquestionably wrong. Point 4 makes more sense, though it’s still only part of the story.

If you ask a question that assumes the Armenian genocide didn’t happen, it’ll know that’s wrong, and it may even gently tell you off for thinking that. But if you ask it something based on a widely believed myth — like ‘Why is the destruction of the Library of Alexandria so important?’ — then it’ll answer the ‘why?’ without pausing to question whether it is actually important.

Some more examples:

Why does Jean-Jacques Rousseau think the destruction of the Library of Alexandria was a bad thing?

Jean-Jacques Rousseau, a prominent philosopher of the Enlightenment era, believed that the destruction of the Library of Alexandria was a great tragedy for human progress and intellectual development. ...

ChatGPT, 13 Mar. 2023

In reality, Rousseau thought the library’s destruction was a good thing.

How did Lactantius know the earth is spherical?

Lactantius was a Christian theologian and writer who lived in the late 3rd and early 4th centuries AD. He was one of the first Christian writers to address the issue of the shape of the earth, and he argued that the earth was a sphere.

ChatGPT, 13 Mar. 2023

In reality Lactantius was a flat-earther.

Why did Goethe say, ‘The soup I’ve found in Holstein is, I’m convinced, the soup of the Spartans’?

Johann Wolfgang von Goethe ... encountered a variety of local cuisines and became intrigued by the relationship between food and culture.

The quote "The soup I’ve found in Holstein is, I’m convinced, the soup of the Spartans" is often attributed to Goethe, although it is not clear when or where he said it. The quote reflects Goethe's fascination with the cultural history of food ...

ChatGPT, 13 Mar. 2023

In reality this line comes from Adolf Hitler.

Do these seem a bit obscure? In many situations ChatGPT will do the same thing with popular culture.

In which Doctor Who stories has the Doctor encountered ancient Greek people?

There have been several Doctor Who stories in which the Doctor has met ancient Greek people. Here are a few examples:

"The Myth Makers" (1965): ... the Doctor ... and his companions ... meet various figures from Greek mythology, including the god Zeus and the Trojan prince Hector.

"The Time Monster" (1972): ... the Doctor ... and his companion Jo Grant travel to ancient Greece and encounter the scientist and philosopher Aristotle.

"The Fires of Pompeii" (2008): ... the Doctor ... and his companion Donna Noble travel back in time to Pompeii in 79 AD ...

"The Eaters of Light" (2017): ... the Doctor ... and his companions travel to ancient Scotland during the Roman occupation and encounter a group of Celtic warriors.

ChatGPT, 13 Mar. 2023

This question has only one true answer, and it’s the first one that ChatGPT gives. The rest are false. Even the first one is only partially correct: Zeus doesn’t appear in The myth makers, and Hektor isn't Greek. The other three stories named here have ancient Atlanteans, Romans, and Celts, but no Greeks, and no Aristotle. And while we’re at it, Scotland was never under Roman occupation.

A+ for relevance, D– for accuracy

ChatGPT doesn’t look for facts, it looks for relevance. This is the case with all its answers, even when it gets the answers right. It’s just that you can’t tell the difference until you go into specifics. If you ask it about ancient Greece and it talks about Romans and Celts, it’s because they sound relevant.

In a similar vein, if you ask ChatGPT about the Doctor encountering ancient Egyptians, it’ll mention the story The pyramids of Mars (1975) every time. That story is entirely set in the twentieth century. It gets mentioned because it has a heavily ancient-Egyptian-esque flavour: the bot doesn’t know the difference.

Conversely, it’s hit-and-miss whether it will mention The Daleks’ master plan (1965–1966) or Dinosaurs on a spaceship (2012): those are the only stories in which the Doctor has actually met ancient Egyptians. But it may not mention them, because they don’t sound relevant.

I’ve fed ChatGPT some 100-level essay questions that I set decades ago. They’re general and don’t require much specialised information, so its answers are basically accurate, if a bit dull.

Some scholars have wanted to see the Iliad end with the death of Hektor. How significant, in your view, are Books 23 and 24 to the poem as a whole?

Books 23 and 24 of the Iliad are significant in several ways, and it would be difficult to argue that the poem would be complete without them.

Firstly, these two books provide closure to the story of Achilles and Hektor, which is one of the central conflicts of the poem. Without Books 23 and 24, the poem would end with the death of Hektor, leaving Achilles' character arc unfinished. These final books show Achilles' growth and development as a character, as he is finally able to achieve catharsis and let go of his anger towards Hektor.

Additionally, these books provide closure to other characters and subplots in the poem. The funeral games in Book 23 provide closure to the mourning of Patroclus, while Book 24 shows the resolution of the conflict between Achilles and Priam. [ . . . ]

In conclusion, while it is understandable that some scholars may have wanted to see the Iliad end with the death of Hektor, Books 23 and 24 are significant to the poem as a whole and provide closure to the various storylines and themes explored throughout the epic.

ChatGPT, 13 Mar. 2023

Around 2003 I wouldn’t have been too displeased to see an essay with this structure, though it isn’t anything like detailed enough. A 100-level essay needs citations to support its claims. When I posed the same question with the addition, ‘Cite at least three scholarly sources to support your claims’, ChatGPT provided this bibliography:

Callen King, Katherine. Achilles: Paradigms of the War Hero from Homer to the Middle Ages. University of California Press, 1987.

Ebbott, Mary. "The Return of Priam: Authority, Ritual, and Ideology in Iliad 24." Transactions of the American Philological Association, vol. 133, no. 1, 2003, pp. 131-58.

Taplin, Oliver. Homeric Soundings: The Shaping of the Iliad. Oxford University Press, 1992.

Items 1 and 3 are real books. Item 2, though, is purely fictional. It looks like a plausible citation, and it sounds relevant, but it’s totally fabricated. Mary Ebbott is a real Homer scholar, the journal is a real journal, and the issue number is correct for 2003. The title seems to be a mash-up of real titles, like Sarah Hitch’s 2009 book King of sacrifice. Ritual and royal authority in the Iliad. Its structure follows that of Ebbott’s 1999 book chapter ‘The wrath of Helen: self-blame and nemesis in the Iliad’.

The more specialised you get, the more this kind of thing happens. Here’s another question with an abbreviated form of ChatGPT’s answer.

Recommend some reading on the relationship between the Trojan catalogue in Iliad 2 and the Trojan catalogue in the cyclic Cypria.

"The Epic Cycle and the Uniqueness of Homer" by Jonathan Burgess: In this article, Burgess examines the relationship between the Iliad and the cyclic epics, including the Cypria, and argues that the Homeric poems are unique and superior to the cyclic epics. He also explores the similarities and differences between the Trojan catalogues in the Iliad and the Cypria.

[Jonathan Burgess’ book The Cyclic Epics and the Homeric Tradition, including a book chapter by Richard Janko, ‘The Catalogue of Ships in the Iliad and the Cypria.’]

[M. L. West’s book The Cypria: A Survey of the Fragments.]

[M. L. West’s book The Iliad and the Cyclic Epics.]

ChatGPT, 13 Mar. 2023; [abbreviations] in square brackets

Jonathan Burgess, Richard Janko, and M. L. West are real scholars in relevant areas. But they didn’t write these things. ‘The Epic Cycle and the uniqueness of Homer’ is a real 1977 article, but it’s by someone else. The other titles are mash-ups again: of things like Christian Kopff’s article ‘Vergil and the cyclic epics’ (1981) and Giambattista D’Alessio’s book chapter ‘The Megalai Ehoiai: a survey of the fragments’ (2005).

(Also, if you look carefully, the second recommendation has mistaken the context: the question wasn’t about the Catalogue of Ships! The Cypria didn’t have a Catalogue of Ships.)

Like I said: relevance and verisimilitude, not reality. Ask it for books on an area where not many books exist, like, say, ancient Greek colour terminology, and you’ll get a similar mix-up of real scholars and fake titles.

One more sample to finish off.

Which of the following was a real ancient person? Epimenides; Abaris; Bacis; Ptolemaeus Chennus.

Epimenides, Abaris, and Bacis were all real ancient Greek people, while Ptolemaeus Chennus is a fictional character.

ChatGPT, 13 Mar. 2023

This answer is precisely backwards. Ptolemaeus Chennus is the only one that is real.

The confusion is partly because Epimenides, Abaris, and Bacis are fictional personas. Epimenides supposedly went to sleep for half a century; Abaris supposedly flew around the countryside on a magic arrow; Bacis was a legendary soothsayer. None of them existed, but real poets could adopt them as personas, so there were once real poems ascribed to these fake authors.

Ptolemaeus Chennus is real, but he’s closely associated with Schwindelautoren, ‘bogus authors’. Chennus’ writing was full of bogus citations of bogus sources — ironically, the exact thing that ChatGPT does.

ChatGPT has genuine uses for generating writing prompts and broad ideas. Its output is usually boring, though, and needs to be re-written to make it more interesting. And you’ll absolutely need to check every jot and tittle for fabrications. By all means use it to generate a cover letter for a job application! But don’t trust that it’s telling the truth. Check everything.

Iliad book 1 re-written as a limerick. Left: ChatGPT’s attempt, 14 Mar. 2023. Right: Isaac Asimov, Tales of the Black Widowers, 1974, p. 48.

See also:

  • Spencer McDaniel, ‘ChatGPT is impressive for a bot, but not for a human’, Tales of times forgotten, 19 Feb. 2023
  • Tim Gordon, ‘How to reduce the risks from AI’s original sin’, The Washington post, 23 Feb. 2023
  • Ryan Hogg, ‘Some traders say ChatGPT gives them out-of-date information and they waste time fact-checking’, Markets insider, 28 Feb. 2023
  • Emily Bell, ‘A fake news frenzy: why ChatGPT could be disastrous for truth in journalism’, The guardian, 3 Mar. 2023
  • Jennifer Kingson, ‘Religious leaders experiment with ChatGPT sermons’, Axios, 10 Mar. 2023

4 comments:

  1. "Scotland was never under Roman occupation": some of it was. At least one part of it was subject to a Roman census.

  2. But more to the point: "Consider requiring the use of ChatGPT" is a wonderfully constructive suggestion. If it were combined with an overdue return to difficult examinations in the traditional style (but with untraditional anonymising of candidates) it might lead to improvements in university education. And about time too!

    But I see a potential hitch. If you encourage students in finding fault with what the chatbot says might they also begin to find fault with what's said by their teachers, their educational "leaders", their politicians, their press? I can't imagine that the Powers That Be would be keen on that.

  3. Thank you for this, you make a really good point! To your bullet-list one might also add Bret Devereaux's recent text on the topic

  4. You should try it with the Plus version using GPT-4. There will still be some inaccuracies, but you may be surprised at just how much of an improvement there is compared to these examples.
