The concept of prompt engineering, the practice of refining instructions to get the best results from generative AI systems, has become a crucial part of the artificial intelligence landscape since the success of ChatGPT in 2022 and 2023. ZDNet reports that the method has garnered both enthusiasm and criticism, igniting a debate on its efficacy as a user interface.

Prompt engineering rose to prominence by 2024, as users recognised that well-crafted prompts could significantly influence the accuracy and relevance of outputs from large language models (LLMs). Companies like Uber have fully embraced the practice, establishing dedicated teams to focus on the optimal structuring of prompts, and it has grown into a specialised field in its own right.

However, criticisms of this approach have been articulated by experts such as Meredith Ringel Morris, a principal scientist for Human-AI Interaction at Google’s DeepMind research unit. In an article for the December issue of Communications of the ACM, Morris argues that prompting might not be the best interface for generative AI systems. She states, "It is my professional opinion that prompting is a poor user interface for generative AI systems, which should be phased out as quickly as possible."

Morris highlights the inherent limitations of prompts, stating that they are fundamentally not "natural language interfaces," but rather "pseudo" natural language interfaces shaped by specific formats that often lack intuitive grounding. She points out that minor alterations in prompt phrasing, such as synonyms and punctuation changes, can lead to significantly different outputs from the models. This inconsistency, she argues, causes confusion, particularly for average users who may not be equipped to navigate these complexities.
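Morris's point about sensitivity to surface wording is straightforward to probe. The short sketch below is a hypothetical illustration rather than anything from her article: it assumes the OpenAI Python SDK and a placeholder model name, sends near-identical phrasings of one task to a model, and measures how much the replies diverge.

```python
# Minimal sketch of probing prompt sensitivity: the same request, phrased with
# small surface variations (synonyms, punctuation), is sent to a model and the
# outputs are compared. The OpenAI SDK and the model name are assumptions made
# for illustration; any chat-completion endpoint would serve.
from difflib import SequenceMatcher

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Near-identical phrasings of one task; only wording and punctuation differ.
variants = [
    "Summarise the following paragraph in one sentence: {text}",
    "Summarize the following paragraph in one sentence: {text}",
    "In one sentence, sum up the paragraph below. {text}",
]

text = ("Prompt engineering is the practice of refining instructions "
        "to generative AI systems.")

def ask(prompt: str) -> str:
    """Send a single-turn prompt and return the model's text reply."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model name; substitute whatever is available
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # suppress sampling noise so differences reflect the prompt
    )
    return response.choices[0].message.content

outputs = [ask(v.format(text=text)) for v in variants]

# Pairwise string similarity: low scores suggest the surface wording of the
# prompt, not its meaning, is driving the answer.
for i in range(len(outputs)):
    for j in range(i + 1, len(outputs)):
        ratio = SequenceMatcher(None, outputs[i], outputs[j]).ratio()
        print(f"variant {i} vs variant {j}: similarity {ratio:.2f}")
```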

The difference between prompting and human-to-human communication is stark, as Morris emphasises. Human interactions are nuanced, relying on shared understanding and contextual cues, whereas prompting lacks these elements. That gap has led to reliance on dedicated experts termed "prompt engineers" and to the emergence of marketplaces such as PromptBase that specialise in curated prompts.

Morris also draws attention to the implications of prompting for AI research, suggesting that reliance on varied prompting, which she terms "prompt-hacking," can distort the validity of research outcomes. Because different studies operationalise their test prompts differently, benchmarks used to evaluate advances in AI models can produce unreliable or incomparable results.
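To make the benchmarking concern concrete, the following sketch, again a hypothetical illustration rather than anything drawn from Morris's article, scores the same multiple-choice item under two prompt templates using the assumed ask helper from the earlier sketch. Because the measurement depends on how the task is phrased, two evaluations that nominally test the same capability can report different numbers.

```python
# Minimal sketch of how the "operationalisation" of a prompt changes a
# benchmark score. The same item is graded under two templates; models
# evaluated with different templates are not being measured on the same task.
# Reuses the assumed `ask` helper from the earlier sketch.
questions = [
    {"question": "Which planet is closest to the Sun?",
     "options": ["A) Venus", "B) Mercury", "C) Mars"],
     "answer": "B"},
]

templates = {
    "terse": "{question}\n{options}\nAnswer with a single letter.",
    "verbose": ("You are a careful assistant. Read the question, reason step "
                "by step, and finish with 'Final answer: <letter>'.\n"
                "{question}\n{options}"),
}

def accuracy(template: str) -> float:
    """Fraction of items whose reply contains the expected letter (a crude grader)."""
    correct = 0
    for item in questions:
        prompt = template.format(question=item["question"],
                                 options="\n".join(item["options"]))
        if item["answer"] in ask(prompt):
            correct += 1
    return correct / len(questions)

for name, template in templates.items():
    print(f"{name} template: accuracy {accuracy(template):.2f}")
```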

In response to these challenges, Morris proposes alternative methods for interacting with AI systems. Among her suggestions are constrained user interfaces featuring familiar buttons, "true" natural language interfaces, and high-bandwidth approaches involving gesture and affective interfaces. She argues that such formats would facilitate user engagement without the steep learning curve associated with prompt engineering.
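As a rough illustration of the first of these alternatives, the sketch below shows a "constrained" interface in miniature: the user picks from fixed controls and the system assembles the prompt itself, so the output no longer depends on how a request happens to be worded. This is entirely hypothetical; the article does not describe a specific implementation.

```python
# Minimal sketch of a constrained interface: fixed choices (standing in for
# buttons and dropdowns) are composed into one canonical prompt, removing the
# free-form wording that makes prompting brittle. Entirely hypothetical.
from typing import Literal

Task = Literal["summarise", "translate into French"]
Tone = Literal["neutral", "friendly", "formal"]
Length = Literal["one sentence", "one paragraph"]

def build_prompt(task: Task, tone: Tone, length: Length, text: str) -> str:
    """Compose a single canonical prompt from constrained choices."""
    return (f"Please {task} the following text in a {tone} tone, "
            f"as {length}:\n{text}")

# Every user who selects the same controls produces exactly the same prompt.
print(build_prompt("summarise", "friendly", "one sentence",
                   "Prompt engineering emerged alongside ChatGPT."))
```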

Morris concludes her discourse by stating that the current reliance on prompt-based interactions could hinder the evolution of AI technology. She posits that future innovations will likely render prompt engineering a transient phenomenon, asserting, "I expect we will look back on prompt-based interfaces to generative AI models as a fad of the early 2020s—a flash in the pan on the evolution toward more natural interactions with increasingly powerful AI systems."

The ongoing discourse indicates that while prompt engineering remains a significant component of user engagement with AI, the search for more intuitive and effective methods of interaction is set to continue as the landscape of AI technology evolves.

Source: Noah Wire Services