Apple is facing significant criticism over the accuracy of its artificial intelligence-driven news summarisation feature, particularly on its latest iPhones. The controversy first came to light in December, when an erroneous news alert misrepresented BBC reporting: the alert incorrectly indicated that Luigi Mangione, accused of the murder of UnitedHealthcare’s CEO, had fatally shot himself.
Despite the BBC promptly notifying Apple of the inaccuracy, the company did not respond until January 6, when it said it was working to make clear that the summaries were generated by AI. Another erroneous summary had appeared just days before that clarification: on January 3, an alert incorrectly announced that Luke Littler had won the PDC World Darts Championship before the final had even been played. Following this incident, the BBC said on January 6 that “these AI summarizations by Apple do not reflect — and in some cases completely contradict — the original BBC content,” emphasising that Apple must address these issues to maintain trust in the accuracy of its news alerts.
These so-called “AI hallucinations” have not affected the BBC alone. Other media outlets have experienced similar distortions attributed to Apple’s AI technology. In one noteworthy incident in November, an alert purportedly from The New York Times incorrectly asserted that Israel’s Prime Minister Benjamin Netanyahu had been arrested, as shared by journalist Ken Schwencke on Bluesky.
In light of these growing accuracy problems, Vincent Berthier, head of the technology and journalism desk at Reporters Without Borders (RSF), has called on Apple to remove the feature altogether. In a statement on the group’s website on December 18, Berthier said, “the automated production of false information attributed to a media outlet is a blow to the outlet’s credibility and a danger to the public’s right to reliable information on current affairs.” RSF argued that such incidents show generative AI services are still too immature to produce trustworthy information for the public.
Apple’s AI inaccuracies are not limited to news summaries; they extend to summaries of texts and chats, occasionally with comic results. In one instance, a message from Andrew Schmidt’s mother describing a challenging hike was misinterpreted by Apple’s AI as an attempted suicide.
Apple is not alone in grappling with the challenges posed by generative AI. Google has also faced backlash over its AI-powered search summarisation feature, known as AI Overviews, which produced similarly erratic results: one bizarre summary recommended consuming at least one small rock per day as a vital source of minerals and vitamins, as highlighted in a post on X.
The risks of AI inaccuracy were also illustrated in 2022 by a case involving Air Canada, whose chatbot incorrectly told a traveller that he could purchase a full-price ticket for a funeral and later apply for a discounted bereavement fare. When the traveller sought the promised discount, Air Canada refused, arguing that the chatbot was a separate legal entity independently responsible for its own statements. The dispute escalated to a tribunal, which rejected that argument and ordered Air Canada to pay around $800 in compensation.
Despite these issues, the integration of generative AI remains widespread, largely because of the productivity gains it offers businesses. Developers, especially within major technology firms, are actively working to reduce hallucinations in AI outputs. Strategies under exploration include training on high-quality data, constraining responses, incorporating human oversight, and techniques such as Retrieval-Augmented Generation (RAG), which grounds a model’s answers in retrieved source documents. OpenAI has also introduced a methodology that rewards each correct reasoning step in a model’s chain-of-thought when solving mathematical problems, rather than only the final answer.
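To make the RAG approach concrete, here is a minimal sketch of the retrieve-then-ground loop in Python. It is illustrative only, not Apple’s or any vendor’s implementation: the passages are invented, the bag-of-words “embedding” stands in for a learned embedding model, and the prompt wording is an assumption.

```python
from collections import Counter
import math

# Hypothetical source passages; a real system would index publisher articles.
PASSAGES = [
    "Luke Littler reached the final of the PDC World Darts Championship.",
    "The championship final is scheduled for the following evening.",
]

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; real systems use learned vector models."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank the stored passages by similarity to the query; return the top k."""
    q = embed(query)
    return sorted(PASSAGES, key=lambda p: cosine(q, embed(p)), reverse=True)[:k]

def build_prompt(query: str) -> str:
    """Ground generation: the model may answer only from the retrieved text."""
    context = "\n".join(f"- {p}" for p in retrieve(query))
    return (
        "Answer using ONLY the sources below; reply 'unknown' if they are silent.\n"
        f"Sources:\n{context}\n\n"
        f"Question: {query}"
    )

if __name__ == "__main__":
    print(build_prompt("Has Luke Littler won the championship?"))
```

The design point is that generation is explicitly constrained to the retrieved sources, so a claim with no supporting passage should surface as “unknown” rather than as a confident fabrication.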
As the technology matures, the pursuit of accuracy in AI-generated content is likely to remain a priority for technology companies looking to protect their credibility and improve user trust.
Source: Noah Wire Services