Microsoft has entered into a three-year agreement with HarperCollins to train an undisclosed AI model on the publisher’s catalogue of nonfiction books. Bloomberg reports that under the terms of the deal, Microsoft will pay $5,000 per eligible nonfiction title, with the proceeds divided equally between the author and HarperCollins. Notably, the agreement is separate from any existing publishing arrangements and does not count against authors' previous advances.

The agreement, which came to light through a report by 404 Media, stipulates that only select nonfiction works published prior to this agreement are eligible for inclusion. Authors who wish to have their works included in this AI training programme must actively opt in; the titles of those who decline will be excluded from the dataset, and those authors will forgo the associated payout. Not all HarperCollins authors will receive the offer, as Microsoft is selecting specific books for the model's training.

To alleviate concerns about potential misuse of content, the terms include specific limitations on how the authors' works can be used. The agreement specifies that the AI model can utilise “no more than 200 consecutive words and/or five percent of a book’s text” from the selected works, alongside a commitment from Microsoft that it will refrain from scraping text from any illegal piracy sites.

This move highlights a growing trend in which tech companies seek large datasets to train their AI models, particularly large language models (LLMs). Because suitable training material in the public domain is limited, Microsoft's arrangement with HarperCollins significantly broadens the scope of data available for the development of its AI capabilities.

Previous agreements between tech firms and publishers often remained under wraps; however, the details of the HarperCollins deal not only illuminate Microsoft's intentions but also establish a financial baseline for what technology companies may be willing to invest in similar initiatives. Although the specific purpose of the AI model remains undisclosed, Bloomberg noted that it will not be designed for book generation.

As businesses seek to harness emerging technologies, the Microsoft-HarperCollins collaboration marks a pivotal moment in the relationship between content creation and artificial intelligence within the publishing industry.

Source: Noah Wire Services