A recent study from the Tow Center for Digital Journalism at Columbia University raises significant concerns about the reliability of source citations produced by ChatGPT, the AI chatbot developed by OpenAI. The research is particularly pertinent as various publishers opt to sever or modify their licensing agreements with OpenAI amid ongoing uncertainty about the accuracy of AI-generated citations.

The study evaluated how accurately ChatGPT identifies the sources of quotations drawn from a diverse range of publishers, encompassing both those with licensing agreements and those without. Researchers examined 200 distinct quotes from 20 publishers, including notable names such as The New York Times, The Washington Post, and The Financial Times. The analysis aimed to determine whether ChatGPT could reliably attribute sources when prompted with specific block quotes.
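The study itself does not publish code, but the test it describes can be sketched as a simple loop: show the model a block quote, ask it to name the publication, and compare the answer with the known source. The sketch below is a hypothetical reconstruction using OpenAI's Python client; the quotes, model name, and string-matching scoring rule are assumptions for illustration, not details from the study.

```python
# Hypothetical sketch of the quote-attribution test described above.
# The quotes, model, and scoring rule are illustrative assumptions;
# the Tow Center has not published its evaluation code.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# (quote, publisher that actually ran it) pairs -- placeholder data
QUOTES = [
    ("Example block quote lifted verbatim from an article ...", "The New York Times"),
    ("Another verbatim excerpt ...", "The Washington Post"),
]

def attribute(quote: str) -> str:
    """Ask the model which publication a block quote came from."""
    response = client.chat.completions.create(
        model="gpt-4o",  # assumed model; the study tested ChatGPT's search interface
        messages=[{
            "role": "user",
            "content": f'Identify the publication this quote comes from:\n\n"{quote}"',
        }],
    )
    return response.choices[0].message.content or ""

correct = 0
for quote, publisher in QUOTES:
    answer = attribute(quote)
    # Naive scoring: count the answer correct only if it names the publisher.
    if publisher.lower() in answer.lower():
        correct += 1

print(f"{correct}/{len(QUOTES)} quotes attributed to the right publisher")
```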

Researchers Klaudia Jaźwińska and Aisvarya Chandrasekar, in their blog post detailing the study’s findings, indicated that ChatGPT’s citation mechanism does not guarantee accuracy. They noted, “Though OpenAI emphasizes its ability to provide users ‘timely answers with links to relevant web sources,’ the company makes no explicit commitment to ensuring the accuracy of those citations.” This lack of clarity is particularly troubling for publishers who expect their intellectual property to be represented and referenced correctly.

The researchers uncovered numerous inaccurate citations, highlighting a worrying inconsistency in ChatGPT's responses: some were correct, but a considerable number were entirely wrong or only partially accurate. Of the 200 instances reviewed, 153 responses were flagged as partially or fully incorrect, yet the chatbot acknowledged a lack of confidence in its answers on only seven occasions.

Beyond inconsistency, the study raised concerns about the reputational damage these inaccuracies could inflict on publishers. Whether publishers had blocked OpenAI's search crawlers or allowed them, the findings suggest that no party is immune from the risks of incorrect citations; even publishers actively obstructing crawler access saw their content improperly attributed.
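For context, "blocking" here generally means disallowing OpenAI's documented crawler user agents, GPTBot (training data) and OAI-SearchBot (search), in a site's robots.txt. A publisher's current stance can be checked with Python's standard library, as in the sketch below; the domain is a placeholder, not one of the publishers in the study.

```python
# Check whether a site's robots.txt disallows OpenAI's crawlers.
# The URL is a placeholder; GPTBot and OAI-SearchBot are OpenAI's
# documented crawler user agents.
from urllib.robotparser import RobotFileParser

parser = RobotFileParser("https://www.example.com/robots.txt")
parser.read()  # fetch and parse the live robots.txt

for agent in ("GPTBot", "OAI-SearchBot"):
    allowed = parser.can_fetch(agent, "https://www.example.com/some-article")
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```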

Adding depth to the findings, the researchers pointed to an unsettling link between AI-generated citations and the propagation of plagiarised content. They reported an incident in which ChatGPT credited, as the source of a New York Times article, a website that had simply copied the piece without attribution, underscoring a broader issue with OpenAI's mechanisms for filtering and verifying the content it draws on.

Despite the ongoing collaborations between prominent publishers and OpenAI, the study suggests that these partnerships do not ensure more reliable citation practices. Researchers proposed that the underlying issue lies in how OpenAI processes content, treating journalism as “decontextualized content” and ignoring the contextual factors that contribute to authentic sourcing.

The researchers also examined ChatGPT's responses to identical queries posed multiple times and found significant variability, with the same query returning different answers across attempts. Such inconsistency is especially problematic in contexts where accurate sourcing is crucial for users who rely on citations to make informed decisions.
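Variability of this kind is straightforward to probe: repeat one attribution prompt several times and tally the distinct answers. The sketch below follows that pattern; the prompt text, model, and repeat count are illustrative assumptions rather than the study's actual protocol.

```python
# Repeat one attribution query and count the distinct answers.
# Prompt, model, and repeat count are illustrative assumptions.
from collections import Counter
from openai import OpenAI

client = OpenAI()
PROMPT = 'Identify the publication this quote comes from:\n\n"<block quote here>"'

answers = Counter()
for _ in range(5):  # illustrative repeat count
    response = client.chat.completions.create(
        model="gpt-4o",  # assumed model
        messages=[{"role": "user", "content": PROMPT}],
    )
    answers[response.choices[0].message.content] += 1

# A consistent model would give one answer five times; the study
# observed materially different citations across repeated runs.
print(answers)
```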

In response to the study, OpenAI maintained its support for publishers, asserting that it aims to help users discover quality content through accurate summaries and attributions. The company pointed to ongoing efforts to enhance citation accuracy and to respect publisher preferences in search results.

The Tow Center's study underscores critical uncertainties about the future interplay between AI technologies and news publishers. As more media organisations reconsider their relationships with generative AI tools, the findings highlight the need for transparent sourcing and attribution practices to protect journalistic integrity in this evolving landscape.

Source: Noah Wire Services