Where We Go From Here: AI in Publishing

Scan the pages of any current affairs magazine today and you are certain to find an article on Artificial Intelligence (AI). We are transfixed by it! It is a subject just as likely to instill fear and loathing as it is to inspire optimism, because there is so much about it that we do not yet know and cannot quite grasp. So we are adding this article to the mix to summarize the state of the art for publishers, and to gaze into the crystal ball at what it may yet become.

We are focused on generative AI: the ability to “create” somewhat intelligent text from human prompts, via an LLM (Large Language Model) trained on vast amounts of text (e.g., GPT-4, the model behind ChatGPT). And we are interested in its application to trade and scholarly publishing.

Stephen Hawking warned the world in 2014 that AI could be the end of humanity, such was the great man’s deep concern about the ability of machines to surpass human intelligence. Movies like “2001: A Space Odyssey” (1968) and “Blade Runner” (1982) alarmed and entertained us. Now, LLMs can generate computer code, write text in any style we choose, and even pass the bar exam.

Uses of AI in Publishing Today

In this context, we can think of publishing as consisting of two kinds of workflows: those that cannot easily be automated because they demand human creativity, and those that can, because they involve mundane sorting, categorizing, and summarizing activities that can be defined by logical rules. The latter are ripe for AI applications. Some of the first AI tools edited and translated incoming manuscripts. Copy editing and proofreading are good examples, and the use of AI has now even spread to more artistic tasks like cover design.

Marketing is the biggest single user of AI in publishing. Marketing teams use it to analyze reader preferences and behaviors, to build audiences, and to generate automated, personalized campaigns (email and social media). It saves a lot of time and tends to be much more effective than a one-size-fits-all approach. Text auto-tagging and SEO (Search Engine Optimization) techniques, as further examples, can easily create the right keywords and hashtags to enhance discovery. They are the building blocks of a good digital marketing strategy.

In customer service, a great deal of goodwill can be lost while customers wait to speak to a human representative. A well-prepared chatbot is reliable, factual, and inexhaustible, and when the going gets tough, it can easily transfer control to a human. Reader retention is critical.

Finance departments benefit from AI applications for royalty contract analysis and creation, for the generation of public reports, and for copyright and plagiarism checking. Intellectual property is the publisher’s lifeblood. In production planning, publishers can identify trends in reading data to modify content or to manage their publication plan. AI also plays a big part in format conversion (e.g., from print to e-book or audio) and in metadata management.

The time saved by automating repetitive, low-level but essential chores can be transformative for publishers. Moving the mundane tasks to the computer yields a higher-quality, lower-cost process and frees humans to focus on the more insightful tasks at which they excel.

Where do we go from here?

The fear around AI arises because we often cannot understand how LLMs create the output they do. Some researchers describe them as merely “stochastic parrots,” while others present evidence that LLMs can produce results they were not explicitly trained for, going beyond simple statistical pattern matching. It would certainly eliminate much of this doubt if the developers were more transparent about how their models are built and trained, but that seems unlikely given the competitive race among the participants and the enormous stakes involved.

Today, most experts agree that even though LLMs can offer reasonably good, grammatically correct prose, humans still write more interesting and more expressive language. But what happens to publishers and publishing when generative AI can create inspiring and creative works (of fiction and non-fiction) all by itself? Then the direct human connection with words would be irrevocably changed, and publishing with it. To head off those fears, publishing needs to address three significant issues.

Copyright

The US Copyright Office has made it clear that copyright protection requires human authorship and does not extend to purely machine-created material. However, that position is being contested in the courts. What is less clear is the rights status of all the copyrighted material ingested to train LLMs. If the output from the LLM is not sufficiently transformed from the ingested material, then that is not fair use according to plaintiffs, of which there are many. If it is fair use, how can rights possibly be assigned when the LLM does not itemize any of the ingested material? Other legal questions abound, and before publishers can commit to AI, they need to understand the risks and how to protect themselves.

Hallucinations

LLMs ingest massive quantities of textual material from the internet, along with all its flaws, inaccuracies, and downright lies. As a result, they often produce inaccurate or fictitious output, referred to as “hallucinations,” where the LLM simply makes things up or parrots fake data. Meticulous human editing is therefore necessary. This may be palatable for a marketing group at a trade publisher that is generating emails or product literature, for example, but it is a different question for scholarly publishers, who must be held to a higher standard of authority and review. They cannot use AI until they know how it works.

Bias

LLMs ingest internet-based materials, and so they inherit any biases against marginalized communities that are present in the source. This concern, combined with the copyright and hallucination concerns noted above, has led many publishers to ban the use of generative AI altogether.

Summary

Overall, many publishing leaders are encouraging their staff to experiment with the use of generative AI in appropriate workflows. Management is ultimately charged with distinguishing between the benign workflows that add to productivity when passed to AI and those that may eventually corrupt the creative publishing process by reducing it all to pushing a computer button. For the future, the decisive question is where the rights to the content lie and how that content can be verified. The book, as we know it, is here to stay. Most observers agree that there will be publishers and editors in the future who will ensure the quality of curated content, but your guess is as good as ours as to who will be writing it.

knk Blog Team
