HarperCollins Asks Its Authors to License Their Books for AI Training
Earlier this month, Daniel Kibblesmith received an emailed memo from HarperCollins, one of the world's largest publishing companies, offering $2,500 to license his 2017 children's book Santa's Husband for a period of three years. The catch? The title would be licensed to a technology company to help train an AI model. “Disgusting,” the writer wrote about the offer in a post on the microblogging site Bluesky.
With their wealth of high-quality content, book publishers have emerged as an attractive target for AI companies that need data to improve the skills and knowledge of their AI systems. HarperCollins, a British-American publishing company and a member of the “Big Five” publishing group, recently inked a partnership with Microsoft that will see some of its nonfiction books used to help the company train a new model, as reported by Bloomberg. In a statement to the Observer, HarperCollins confirmed that it had “reached an agreement with an artificial intelligence technology company to allow limited use of select nonfiction backlist titles for training AI models.” Microsoft (MSFT) declined requests for comment.
HarperCollins noted that writers will be given the option to take or pass on the opportunity. “Part of our role is to present writers with opportunities to be considered while at the same time protecting the fundamental value of their work and our shared profits and royalties,” the publisher said. “This agreement, with its limitations and clear guidelines about the model that respects the author's rights, does that.”
The agreement's safeguards include limiting the output of AI models to no more than 5 percent of a book's text, according to a statement from the Authors Guild, the largest organization of authors in the US. HarperCollins' AI partnership will pay $5,000 per title, split equally between the publisher and the author, the organization said. Although the Authors Guild described the arrangement as giving “too much to the publisher,” it praised the fact that HarperCollins would seek individual consent from authors and described licensing as a way of “returning control of use to authors and their partners.”
Along with authors such as George R.R. Martin, Jonathan Franzen and Jodi Picoult, the Authors Guild last year sued OpenAI for allegedly using their work to train models without permission. Various authors have also filed similar copyright cases against the likes of Anthropic, Meta (META) and Microsoft for training AI models on data sets of pirated books.
AI deals for publishers are on the rise
Such concerns have not stopped publishers from securing lucrative deals with major technology companies. Educational publishers Wiley and Taylor & Francis earlier this year partnered with various AI developers to provide AI training content, with Microsoft reportedly offering $10 million to the latter for access to its data. Oxford University Press also said it is working with AI companies, and MIT Press recently told 404 Media that it is close to several AI training deals.
As they run out of high-quality data accessible online, AI developers are increasingly looking for new ways to get their hands on reliable and accurate content. News Corp, the parent company of HarperCollins, struck an agreement in May to provide content from its newspapers, such as the Wall Street Journal, Barron's and the New York Post, to OpenAI, which has similar agreements with a number of publications including the Atlantic, Vox Media, the Associated Press, the Financial Times and Time magazine. Microsoft, too, has content licensing arrangements with the likes of Reuters, Hearst Magazines and Axel Springer.
Microsoft's data access could increase significantly in the near future, as HarperCollins has already sent book licensing requests to thousands of authors, according to the Authors Guild. How many writers will opt in, however, remains to be seen. In responses to his Bluesky post, Kibblesmith jokingly said he wouldn't take the deal unless it was worth $1 billion. “I would do it for an amount of money that would mean I no longer need to work, as that is the end goal of this technology,” he wrote.