By understanding the limitations of using large language models for fact-checking, writers and editors can learn to use these tools productively.

As large language models (LLMs) like ChatGPT become more capable and more widely used, many writers and editors may be concerned about how these models will affect their careers. One question of particular interest is whether ChatGPT can be used to verify the accuracy of information. The following study set out to determine whether writers and editors—the traditional fact-checkers in publishing—could be replaced by emerging LLMs when it comes to verifying facts.

THE RESEARCH

K. Węcel, M. Sawiński, M. Stróżyna, W. Lewoniewski, E. Księżniak, P. Stolarski, and W. Abramowicz conducted a study entitled “Artificial Intelligence—Friend or Foe in Fake News Campaigns.” The study examines the viability of using ChatGPT as a fact-checker. ChatGPT is an LLM that analyzes vast amounts of text from the internet to identify patterns and generate new text. Given the wide knowledge base that a system like ChatGPT should have, the researchers decided to compare ChatGPT’s responses with fact-checking verdicts from actual people.

First, the researchers randomly selected 4,770 claims that had been submitted to fact-checking websites. Then they gave each claim to ChatGPT to fact-check. The researchers also wanted to measure whether ChatGPT would return a different response based on how it was prompted, so they used six different prompts with each claim.
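For readers curious what this kind of setup looks like in practice, here is a minimal sketch of submitting one claim under several prompt templates through the OpenAI Python client. The templates, model name, verdict labels, and example claim are illustrative assumptions, not the six prompts or model version the researchers actually used.

```python
# Illustrative sketch only: the prompt templates and model below are
# assumptions for demonstration, not those used by Węcel et al. (2023).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical prompt templates; the study tested six variants.
PROMPTS = [
    "Is the following claim true, partially false, or false? Claim: {claim}",
    "Fact-check this claim and answer true, partially false, or false: {claim}",
    "As a fact-checker, rate this claim (true / partially false / false): {claim}",
]

def check_claim(claim: str) -> list[str]:
    """Submit one claim under every prompt template and collect the responses."""
    verdicts = []
    for template in PROMPTS:
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": template.format(claim=claim)}],
        )
        verdicts.append(response.choices[0].message.content)
    return verdicts

print(check_claim("The Great Wall of China is visible from the Moon."))
```

Running the same claim through each template and comparing the answers is what lets a study like this separate the effect of the prompt wording from the model’s underlying accuracy.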

The researchers found that the accuracy of ChatGPT’s responses was low and not much different from the accuracy of random guesses. They also found that the wording of the prompts did not affect the overall accuracy of a response, but it did affect the confidence of a response (i.e., whether ChatGPT committed to a verdict of false or partially false or hedged by saying there was no evidence). This variation in confidence suggests that ChatGPT’s responses may contain biases based on the way the user prompts it.
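As a rough illustration of that comparison, the sketch below scores a handful of verdicts against human fact-checker labels and against random guesses. The claims, labels, and values here are invented for demonstration; they are not the study’s data.

```python
# A minimal sketch of comparing model accuracy to a random-guess baseline,
# using invented verdicts (not the study's 4,770 claims).
import random

LABELS = ["true", "partially false", "false"]

def accuracy(predictions: list[str], verdicts: list[str]) -> float:
    """Fraction of claims where the predicted verdict matches the human one."""
    matches = sum(p == v for p, v in zip(predictions, verdicts))
    return matches / len(verdicts)

# Hypothetical ground-truth verdicts from fact-checking websites.
human_verdicts = ["false", "true", "partially false", "false", "true"]
model_verdicts = ["false", "false", "true", "false", "partially false"]

random_guesses = [random.choice(LABELS) for _ in human_verdicts]

print(f"Model accuracy:  {accuracy(model_verdicts, human_verdicts):.2f}")
print(f"Random baseline: {accuracy(random_guesses, human_verdicts):.2f}")
```

The study’s finding, in these terms, is that the first number was not meaningfully higher than the second.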

“Large language models are optimized for plausibility, not accuracy.”

Węcel et al. (2023)

The researchers’ findings supported the conclusions of previous research: ChatGPT’s accuracy is low because it produces responses that resemble the text it has already seen, and those responses are designed to be convincing rather than accurate. In its current state of development, ChatGPT cannot analyze text for accuracy the way a person can.

THE IMPLICATIONS

These results reveal that LLMs have not eliminated the need for writing and editing jobs—especially when it comes to fact-checking. Fact-checking is a crucial aspect of ethical writing and editing. Published content should relay the most accurate information possible, and LLMs like ChatGPT are not currently capable of analyzing content for accuracy.

However, this does not mean that writers and editors should never use ChatGPT. In fact, learning to use ChatGPT and other LLMs—with their limitations in mind—can increase productivity and give writers and editors an edge in the job market. When writers and editors use ChatGPT in content creation, they should remain cognizant of its inability to detect inaccuracies.

To learn more about the limitations of ChatGPT as a fact-checker, read the full article:

Węcel, K., M. Sawiński, M. Stróżyna, W. Lewoniewski, E. Księżniak, P. Stolarski, and W. Abramowicz. 2023. “Artificial Intelligence—Friend or Foe in Fake News Campaigns.” Economics and Business Review 9, no. 2: 41–70. https://doi.org/10.18559/ebr.2023.2.736.

—Katie Greene, Editing Research

FEATURE IMAGE BY HATICE BARAN

FIND MORE RESEARCH

Learn more about using ChatGPT effectively by taking a look at Parker Cook’s Editing Research article “ChatGPT, Editing, and You.” 

Read Abid Haleem, Mohd Javaid, and Ravi Pratap Singh’s 2022 article to discover more features and challenges of ChatGPT: “An Era of ChatGPT as a Significant Futuristic Support Tool: A Study on Features, Abilities, and Challenges.”