There are a lot of things not to like about generative AI. Like any tool in the hands of human beings, it will almost certainly be used for ill. But it will probably not be used only for ill.
As the industry slouches past generative text and images toward AI-generated video, it occurs to me that one of the (potentially happy?) effects might be to further reduce friction for storytellers. Or, to put it another way, maybe it will decouple storytelling from the technical skills needed for story visualization.
Children’s book authors who lack artistic skills might, for example, use an AI to generate the illustrations for their stories. Similarly, written stories might be turned into videos, with AI doing the visual work. Some examples of AI-generated video can be found at the bottom of this post.
Ultimately, it seems that the production of media for storytelling might be placed within reach of people who have stories to tell but have historically lacked the technical skills to platform them. In a sense, Substack is already doing exactly that for the written word: it provides a platform for writers that eliminates the need to master web servers, content management, storage, reliability, networking, access control, etc., etc., ad nauseam.
It is not too soon to start asking questions about the limiting principles surrounding the use of AI for storytelling. After all, if it can create the images, why not have it create the stories themselves? Well, so far, experience with AI-generated stories is markedly underwhelming. While a model might synthesize some stories from its training data, to say such stories are formulaic and platitudinous would be an insult to platitudes everywhere. Furthermore, the propagandistic nature of these models’ training leaves the generated stories both tendentious and lame. From a storytelling perspective, then, AI language models are kind of like watching the dancing bear at the circus: you’re not surprised that the bear doesn’t dance very well; you’re surprised that he dances at all. You also suspect, as you sit there watching, that the bear is never really going to dance well. He is - there’s no way to get around it - just a bear.
Similarly, much of the excitement surrounding language models comes from the surprise that their responses are sometimes less absurd than one might expect, though I hasten to point out that the bar for what constitutes “success” in this field has been set pretty low. That’s because AI models are not capable of actually thinking; they can only recapitulate their training data in ways that reflect a statistical affinity for the text of the user’s prompt. Notwithstanding their limited success, early indicators are that language models will degrade, and perhaps collapse outright, without a sufficient supply of original content generated by real human beings - a phenomenon researchers have taken to calling “model collapse.” Which is another way of saying that the statistical characteristics of AI-generated content are inadequate for training AI models themselves. Thus, even in the world of AI, there is no such thing as a self-licking ice cream cone.
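For the curious, the dynamic is easy to demonstrate in miniature. Here is a minimal sketch - a toy of my own devising, not any lab’s actual setup - in which the “model” is nothing but a fitted mean and standard deviation, and each generation is trained solely on samples from the previous one:

```python
import random
import statistics

# Toy illustration of model collapse. The "model" is just a fitted
# mean and standard deviation; each generation is trained only on
# synthetic samples drawn from the previous generation's fit.
# Sampling error compounds: each generation's estimate of the spread
# is noisy and, on average, slightly too small, so the distribution's
# tails tend to wither over time.

random.seed(1)

SAMPLES = 25       # small on purpose, to make the drift visible
GENERATIONS = 200

# Generation 0: "human" data from the true distribution N(0, 1).
data = [random.gauss(0.0, 1.0) for _ in range(SAMPLES)]

for gen in range(GENERATIONS + 1):
    mu = statistics.fmean(data)
    sigma = statistics.stdev(data)
    if gen % 25 == 0:
        print(f"generation {gen:3d}: mean = {mu:+.3f}, stdev = {sigma:.3f}")
    # The next generation never sees human data, only the model's output.
    data = [random.gauss(mu, sigma) for _ in range(SAMPLES)]
```

Nothing here depends on the Gaussian in particular; the point is simply that a model trained on its own output inherits its own estimation error, generation after generation, until the richness of the original data is gone.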
The intent of this post has been more to raise questions than to predict ultimate outcomes. It seems manifestly obvious that storytelling is going to be affected by the possibility of reducing the technical burden on storytellers. Whether the cumulative effect of that reduced burden is a happy one remains to be seen. But here’s hoping.