Cayley's image is an apt illustration of an essay that's also a work of 'digital language art.' Although Cayley incorporates new material and newly contextualized examples, referring chiefly to his own work, what follows is also a reconfigured rewrite of a recent essay for a series of conferences and a peer-reviewed online journal, Political Concepts, which can be found online at: https://www.youtube.com/watch?v=LDJRQYRWpvQ.
When addressing the modelit in my title, I might point to its partial derivation from literature via the North American college-jargon abbreviation, lit. This would briefly raise the question of what literature is, in pragmatic and Foucauldian terms: as, for example, a discourse variously determined and policed by implicated constituencies. Whatever literature is, viewed thus, comes to be determined by the discourse-based power and knowledge struggles of these constituencies. In the case of eliterature, the broader constituencies of its students, on the one hand, and its practitioners, on the other, are particularly tightly integrated, for largely historical reasons that might be summarized by saying, students of eliterature created eliterature. The underlying stakes in asking the corresponding question, what is eliterature? are doubly displaced: to quite distinct technological considerations – having important formal implications – and then back toward the ‘larger’ question of what literature is. Underlying this non-unitary – and, by some accounts, self-contradictory – originary ontology of eliterature there are anxious, if not paranoid, considerations as to whether or not literature’s gatekeepers will admit eliterary artifacts into the world of letters on which the more evaluative students of literature discourse.
When considering the question of what literature is, pragmatically, I assume that there are two constituencies which particularly concern us: students (broadly conceived to include ‘scholars’) and practitioners. Which of these categories is more variegated? I take students of literature to include, amongst other roles: readers, publishers, booksellers, critics, and scholars. We usually take literary practitioners to be writers and, insofar as they may also be amateurs, they may be otherwise highly variegated with respect to their non-literary worldliness. They appear, however, uniform to the extent that they share a medium. They compose in the faculty of language that allows us to be what we are, language animals. And they aestheticize some part of our linguistic animation. For the purposes of this essay, let’s call this medium language. When it comes to eliterature, we may concede that practitioners also compose in other digit(al)ized media, more or less integrated with their compositional practice in language. Or they may compose in the medium of computation itself (à la Nick Montfort), or, indeed, in certain transmedial manifestations that digitization allows.
What I mean by this can be demonstrated by reference to a very recent account of creative gestures incorporating precisely the kind of Generative Pre-trained Transformers (GPT) associated with computational systems that co-produce the “Large Models” (usually, of course, Large Language Models, LLMs) referenced by the modelit of my title. Scott Rettberg et al. set out a claim relating to prompted GPT image creation in a recent conference presentation, ‘AIwriting: Relations Between Image Generation and Digital Writing.’1Scott Rettberg et al., ‘AIwriting: Relations between Image Generation and Digital Writing’ (paper presented at the ISEA 2023 Conference, Paris, 2023). The authors – all except Jill Rettberg – are exemplary as student-practitioners of eliterature, with fully integrated scholarly and creative practices. They suggest that this kind of GPT-assisted image creation is eliterature. It is a practice of language – prompt writing – which instigates a transmedial creative practice, resulting in the production of visual images. The images manifest or retain their relationship to the prompt language, to language as such. Rettberg et al. outline a compelling practice which could only ever be achieved with computation and the transmedial affordances of digitalization. I am taken aback by the extent of the new practice’s transmedial specificity when, for example, comparing it with what has been a more typical practice of eliterature, one in which a practitioner – let’s think of J.R. Carpenter – uses digital affordances, not always including generative computation, to manipulate either or both media, in order to bring together the maker’s facility with language and their facility with, say, image creation or design. Prompted GPT image generation seems to be, on the other hand, an actual (e)writing of the image, in which the ‘e’ of the writing is performed by the transformer, working on image sources which have not (usually) been created by the ewriter.
What is compelling for me is the tight integration of media in this inherently transmedial practice, and the fact that actual language (our shared medium) is its impetus and, to a great extent, its shaping energy. GPT-assisted image generation really does deserve to be considered as eliterature and as something new in terms of practice, especially as and when it is deliberately so regarded. Whether it will ever be read as literature is, of course, another matter, perhaps one that will ultimately prove aesthetically and culturally irrelevant. It has been important for me to bring up and begin to parse this new practice here in order to distinguish what GPT-assisted image generation performs from what is produced by certain other practices of digital language art, and because it may help to put the role of GPTs, as digital, ‘e-’ affordances, into certain perspectives on which I will elaborate.
For now, we might ask ourselves: What quantities and qualities of linguistic art go into the prompt language? And why? This could be asking, how important is an art of language for this practice? Or it could simply be inquiring as to the constraints of a new genre or form of writing. More significantly, for my purposes, I want to ask what happens as and when the language is entered into the model. All the authorities, so far, reply that they do not really know. Many aspects of the engineering of these processes can be described, but the model’s overall operation of a function – and the ‘inner’ workings that find this function – are hermetically, epistemologically sealed from both practitioners and their human readers. Then, there is the question of the image sources for the generated image, for the image that is produced and ‘read.’ Where did these source images come from and who made them? Who is invested in their ownership and moral rights (of association and integrity)? Finally, the GPTs in this practice are applying themselves to manually token-tagged or machine learning-tagged image data, not to transcribed (natural) language (linguistic data) as such. In what follows, I try to stay focused on transcribed language and/or language as such.
The three issues that we will conclude by taking somewhat further are thus already evoked: questions concerning language and linguistic data; questions concerning the hermeticism of compositional process; and (in a more limited way) those surrounding the moral rights relating to source data and generated output. Questions for which transmediality is crucial must be left aside to be considered by other commentators.
Not only are these circumstances of interest in helping us to parse out crucial aspects of new practices which may (how soon?) prove an existential challenge to eliterature as we know it, they are also useful in highlighting the Foucauldian integration of students and practitioners, as contrasted with the situation of the differently configured constituencies in literature more generally. As and when language is the primary or sole medium of our practice, a singular problematic characteristic of literature is evoked. All the diverse members of literature’s two main constituencies use language – both students and practitioners – but it is the non-practicing members who determine what part of this use is literary. As for what is eliterary, since its students are a specialist minority with a narrower range of roles in the world of culture, and since many of these students are practitioners, and since many of the practitioners of eliterature are scholars and commentators, the discourse of eliterature is differently structured in a Foucauldian sense.
I am simplifying a much more complex state of affairs. Nonetheless, literary practitioners must, typically, submit to friends, mentors, agents, editors, publishers, marketeers, readers, journalist critics, and scholar critics in order to have their linguistic artifacts approved as literary. On the other hand, student-practitioners of eliterature typically evoke something akin to extraordinary, often formal, or indeed technical, originality as both definitive and evaluative of their practice. For at least as long as I have been in the field this has boiled down, pragmatically, to a contrast between needing and wanting to publish a book and, basically, insisting that publishing a book might be the last thing – formally or technically – that a principled eliterary student-practitioner might ever think of doing.2Clearly, this is a light-hearted, somewhat exaggerated characterization, especially given the fact that the author, who self-identifies as a language artist devoted to computational, dynamic, non-deterministic, time-based, and/or aural linguistic artifacts, sees no contradiction or problem with publishing both supply texts and particular textual outcomes from otherwise unprintable work. John Cayley, Image Generation: Augmented and Reconfigured, (Denver: Counterpath, 2023). It is worth noting that there will be a future publication based on a panel – for which the author is one participant – at the 2023 conference of the Electronic Literature Organization, ‘Print Manifestations and Materiality: On computer-generated books in Electronic Literature.’ Experimenting with the latest technology, by contrast, is very much in the eliterary student-practitioner’s wheelhouse.
Meanwhile, underlying all this practice, both literary and eliterary, is the problem of text. For in all this practice text is instantiated typographically. This is the default for all of us. But by text in this particular context, I mean the central object of Garrett Stewart’s recent Book, Text, Medium: Cross-Sectional Reading for a Digital Age (2021).3Garrett Stewart, Book, Text, Medium: Cross-Sectional Reading for a Digital Age, (Cambridge: Cambridge University Press, 2021). This is text as something delivered by a physical medium, but its essential and defining characteristic is that it is readable. It is visible and present to us, ‘in’ the book and ‘as’ printed letters, but, significantly, text in this sense is non-identical with any of its physical manifestations or their storage and display devices (the book, the screen, the underlying ‘disks’). It is, rather, something we can, in principle, scan and read, and which establishes, for us, a relationship with language. Text is something that relates to language and, when agreeing to this relation, we should be prepared to concede that language is, similarly, non-identical with text.
Both literature and eliterature have ontologies grounded by both pragmatic (e.g., Foucauldian discourses, structures of power) and non-pragmatic criteria. I am taking the textual aspect of both literature and eliterature as tending toward the non-pragmatic, language philosophical end of this continuum. We might propose that text itself is – regardless of text’s relationship with whatever language is – constitutive of literature, substantively and/or essentially. Language is the medium, we have said, of artists who make both literature and eliterature (at least as one essential constituent in transmedial work). But readable text provides us with an object that relates, regularly, functionally, to language and, it follows, to those subsets of linguistic artifacts that we consider to be (e)literature. As an object, which is more or less empirically accessible, text also offers itself as, potentially, an object of modern science, something that can be studied methodically, theorized, known.
And here is the poststructuralist rub. The predominant scientific paradigm for the study of language is structural linguistics. Despite claiming, in its founding gestures, to give primacy to speech, it based its science on differences which can, by definition, be textually transcribed. Linguistics did, and still does, transcribe these differences, and then takes them in as empirical evidence, as linguistic data. As Derrida pointed out, structural analyses could thus only ever be referred to (an archi-)writing, that is to what we perceive and then read, in the world as, precisely, text. Whereas it is demonstrable – for now let’s say that puns and ellipses are enough for a demonstration-of-concept – that events of language may hinge, ontologically, on differances, on creative, language-constitutive differences. These may be traced and transcribed but they will ‘always already’ be subject to erasure and deconstruction. Moreover, the (Romantic) intuition that writing, in one version of philosophical truth, is unable to fully transcribe human experience (which may be “sayable” although as yet unsaid) is equally true for the (Structuralist) acoustic images of speech.4I don’t want to further overburden this essay with attempts to justify my underlying philosophy of language, but readers may nonetheless wish to know that I am following on from my reading of Garrett Stewart to some of the thinking in Giorgio Agamben, What Is Philosophy?, trans. Lorenzo Chiesa, (Stanford: Stanford University Press, 2018). What Derrida called the logic of the supplement prevails, resisting any scientism based purely on transcription, encoding, and formulation, since these latter require empirical, ‘primary’ data. The point is that the logic of the supplement prevails for language, for our medium, for the medium of (e)literature.5Interestingly, poststructuralism is sometimes credited with a position more or less diametrically opposed to my own reading here. In an excellent recent article on the ‘mind’ of AI, N. Katherine Hayles, for example, lumps the poststructuralism(s) of “Barthes, Foucault, Jacques Derrida, etc.” together as proposing a “null strategy” approach to the critique of synthetic textuality. The strategy is ‘null’ in the sense that any text has the same status with respect to interpretation, including LLM-generated text. N. Katherine Hayles, ‘Inside the Mind of an AI: Materiality and the Crisis of Representation,’ New Literary History 54, no. 1 (2023). Poststructuralism, in Hayles’ account, threatens to ignore or set aside the effects of embodiment that Hayles and I would agree in finding missing from current synthetic language. Jacques Derrida’s “There is nothing outside the text” is the tagline of this ‘null strategy’ position, if interpreted, that is, without pragmatic grammatological deconstruction. Of the poststructuralists cited by Hayles, however, I take Derrida to provide the most serious language-philosophical position. I read the logic of the supplement (which I briefly outlined) as allowing us to discover the traces of language’s human embodiment in the grain of human-generated text itself. Moreover, for Derrida and myself both the streams of acoustic images that we call speech and the sequences of signs in sign languages (the only other socialized language practice on the planet) are also (the) text.
One of Derrida’s points is that text in this sense is all we have, but it is also always a supplement, and a “not all.” Our embodied linguistic social interactions most especially, but also our embodied self-present – inwardly vocalized – readings of the text – for which the memory of social, linguistic, and literary experiences are constitutive – allow us to discover what Derrida famously called “differance(s),” phenomena that may be absent or erased in one or other supplement but are nonetheless present-as-readable to ourselves and/or other interlocutors. That is what the ‘a’ in ‘differance’ is. Text for us is not the same as text for the LLMs. This language philosophical strain of poststructuralism agrees with Hayles on this score and also, in my reading, with her commitment to embodiment. Language as such is always already embodied. If it is not embodied, it is not language.
Language-philosophically, then, the text may be all we have but it proves itself to be “not all” (as some philosophers put it), not all of whatever language is. And for (e)literary reading it was never, and never will be, ultimately, enough. We can return to the work of Garrett Stewart, for example, to find a theorized methodology of reading that regularly recognizes, in the course of its hermeneutic practice, that every text is simultaneously a phonotext.6Garrett Stewart, Reading Voices: Literature and the Phonotext, (Berkeley: University of California Press, 1990). Stewart’s readings insist that what the text evokes is always already an evocalization, an encounter with what the text says to us when actually reading out loud or in our interior vocalizations. The interplay involved is essential to the reading of anything literary, typically manifesting as style. You don’t even necessarily have to go along with a Derridean language philosophy to read as Stewart does. But you do have to acknowledge that there is more to the text than what is spelled out in its orthographies and punctuation.7I’d like to cite here – particularly because these interventions are made by an important practitioner-student of eliterature and also because they engage with linguistic embodiment in a manner that integrates with contemporary critical practice – two papers by Allison Parrish. Allison Parrish, ‘Nothing Survives Transcription, Nothing Doesn’t Survive Transcription,’ in Iona University, Data Science Symposium (2023); ‘Material Paratexts,’ in ICCC 2022 (2022). The first of these is directly relevant since it deals with the problem of transcription, including text-as-transcription, reminding us – and my Derridean persona agrees (see above) – that we are always obliged to be working with transcriptions whenever we are working with the abstractions and idealities of computation.
The most extensive linguistic domain for what is beyond text in this sense is aurality. Can we not safely assert that the extent and variety of linguistic practice which is vocal and aurally perceived exceeds that which is textual? How much of this practice, exceeding the textual, is also motivated by aesthetics? And by a shared sense of the value of these linguistic artifacts within the human cultures where they are performed? And by a shared sense that they are worthy of being not only remembered, but that these remembrances should be preserved over generations?
These are rhetorical questions, given that language is an evolved faculty and that the history of text and of text-as-history is a flash in the diachronic pan relative to the prehistory to which I’ve just alluded. And text is also a ‘minority practice,’ synchronically, relative to the human world’s less- or non-literate, if often no less ‘literary,’ practices of language, globally speaking. Textual eliterature is no better at hearing these other voices.
But this rhetoric does not get us far with respect to another politics of (e)literature which confronts us in the present moment. I agree, with other student-practitioners of eliterature, that literature is misdirected to some extent; that it is selective, arguably biased; and that its relative and literal ignorance of aesthetic practice in programmable media is philosophically and ethically questionable. I reaffirm that, pragmatically, literature can and should be constructed differently for a variety of reasons, not least with regard to diversity and equity. Some of this reconstruction is, happily, happening as we speak. On the other hand, eliterature is as much subject as literature is to what I think of as textual idolatry. And at this particular historical moment certain consequences of textual idolatry threaten to become dire, perhaps even existential for both literature and eliterature.
By ‘textual idolatry’ I mean at least two things. Firstly, the continued over-valorization of reading and writing held captive by the gravity of text, addicted to textual practices, and addicted in particular to what I want to further qualify as our default orthotextual practices. Orthotextuality here corresponds with the more familiar orthography – correct(ed) spelling – which is its figurative backbone. When we speak of text, in the context of literary and documentary practice, all of us – including scholarly textual critics – take more or less for granted the historically very recent propensity to establish a text in standardized spellings and punctuation. For LLMs the problem of ‘cleaning the data’ – raw internet text for example – resolves to establishing the correct(ed) text, the orthotext, that will then be used for training. Secondly, once the orthotext has been carved out, its idolatrous image is handed over to digitally formulated techniques, including current LLMs, that take this orthotext as their object – without questioning its relationship to language as such. What, of language, has been lost in the ‘clean up’ and by the ‘corrections’?
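To make the stakes of this ‘cleaning’ concrete, consider a deliberately naive sketch, in Python, of the kind of normalization pass it implies. Every rule and example here is my own illustrative invention, not any actual pipeline’s, but each rule erases something that a reader of language, as opposed to a reader of orthotext, might have wanted to keep.

```python
import re
import unicodedata

def orthotext(raw):
    """A naive 'clean up': each correction erases something of language."""
    text = unicodedata.normalize('NFKC', raw)        # collapse typographic variants: '…' becomes '...'
    text = re.sub(r"([a-z])\1{2,}", r"\1\1", text)   # 'soooo' becomes 'soo': expressive lengthening lost
    text = re.sub(r"\s+", " ", text).strip()         # line breaks and spacing, hence lineation, lost
    return text

print(orthotext('it was  soooo  quiet\nthat night…'))
# -> 'it was soo quiet that night...'
```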
Undergirding this textual idolatry, in similarly historical and pragmatic circumstances, we have further taken for granted the fact that text can be digitized. This is one demonstration of a point made earlier, one that resonates with a foundational tenet for the methodology of textual criticism as a literary studies specialism. Text is not to be identified with its physical instantiations. Text in a computer is the same object of textual critical study as text in a book. And alphabetic text is, in fact, foundational for modern computation. It is determinant of important characteristics for encoding conventions which still predominate. But as we have also already pointed out, text implies nothing more than some kind of regular relation with language as such, and it does not take much research – think English spelling (orthography) – to be able to affirm that there is no one regular relation between text and language, that any such relation, if treated as singular, must be multi-dimensional. It is more likely a case of a plurality of relationships which are contingent on both the characteristics of the text and independently determined characteristics of the language to which it is referred.8An even more complex relation when considered translingually and with respect to differing systems of inscription, such as sinography. None of this is automatically accounted for by computation itself, or in the adaptation of textual critical practices to computation. Hence the literary professional anxiety concerning “close reading with computers,” to quote the title of an excellent book on the subject by Martin Paul Eve.9Martin Paul Eve, Close Reading with Computers: Textual Scholarship, Computational Formalism, and David Mitchell's Cloud Atlas, (Stanford: Stanford University Press, 2019). See also the author’s review: John Cayley, ‘Differences That Make No Difference and Ambiguities That Do,’ review of Martin Paul Eve, Close Reading with Computers, Stanford: Stanford UP, 2019, Novel: a forum on fiction 54, no. 2 (2021). Whereas textual scholarship as such can only be enhanced by computational techniques, their bearing on evaluative, hermeneutic reading remains at issue. In general, the statements of textual scholarship concerning text are taken to require further reading in order to be understood and set in the context of full-blown literary scholarship.
No fault in this. Typically, the techniques deployed are what I will characterize as heuristic.10I have to thank my colleague and sometime collaborator, Daniel C. Howe, for the use of this term in contexts like this one. They may even be quite simple and straightforward – based on the relative frequency of key words for example – but at least they, the techniques, can themselves be known. They can be understood and reapplied in other contexts for comparative discoveries or other textual critical statements, and when this is done the student of literature or eliterature will know what they are doing and how to compare. Until recently, in fact, those student-practitioners with an interest in computational techniques applied to language have, necessarily, devoted themselves to precisely these heuristic techniques. There was no other way to engage language using computation. Since the turn of the millennium, however, in our digitalized culture – which takes in eliterature by definition and now encompasses any profession of literature – we have access to computational techniques which, by contrast, I would characterize as occult or, let’s just say, hermetic.11I realize that there will be nuanced responses to this claim of hermeticism: that differing training processes lead to differing models, that differing GPTs, for example, have distinct structures for differing purposes or data, or that certain parameters built into GPT interfaces may be meaningfully construable to humans, etc., etc. Anecdotally, however, most experts say that they do not know what the models are doing, and high-dimensionality is, I suppose, the fact that guarantees this situation. Two recent papers by leading researchers that I have seen (if not fully absorbed) challenging the critique of hermeticism are Jack Merullo, Carsten Eickhoff, and Ellie Pavlick, ‘Language Models Implement Simple Word2vec-Style Vector Arithmetic,’ arXiv (2023); Ellie Pavlick, ‘Symbols and Grounding in Large Language Models,’ Philosophical Transactions of the Royal Society A (2023).
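To show what I mean, above, by a technique that ‘can itself be known’: relative keyword frequency, the workhorse of much computational textual criticism, fits in a few lines of Python. This is a generic sketch, not any particular scholar’s tool, but every step of it can be stated, checked, and reapplied to another corpus for comparison.

```python
import re
from collections import Counter

def keyword_frequencies(text, top=10):
    """Relative frequencies of the most common words: fully inspectable."""
    words = re.findall(r"[a-z']+", text.lower())  # a crude, but stated, tokenization
    total = len(words) or 1
    return [(word, count / total) for word, count in Counter(words).most_common(top)]
```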
We can now chat with Large Language Models, LLMs, a term that has been settled on since the elaboration of what Adrian Mackenzie describes as the discourse of Machine Learners, which he further identifies as a “data practice.”12Adrian Mackenzie, Machine Learners: Archaeology of a Data Practice, (Cambridge: MIT Press, 2017). At the time of writing, the particular data practice of LLMs which is causing a sensation verging on the apocalyptic is text generation in response to human-composed linguistic prompts. This process of prompted response is the same as that cited above with respect to the image generation that Scott Rettberg and others have proposed as a new eliterary practice. At a high level of abstraction, the models concerned are remarkably similar, differing more in terms of the source of the data on which they operate – transcribed text in one case, digitized images in the other – than in terms of the trained data itself, or the operations of the function-finding neural networks concerned, or the details of how these networks are engineered, layered, and structured.
It must be borne in mind, however, that all these ‘machine learners’ are engineered as decision-making tools, as motivated function finders which take input from a domain of data and classify or map it onto a structured co-domain. Humans, including their engineers, see machine learning models as performing evaluative, classifying decisions, and thus text generation, for example, reduces to solving the problem of which ‘best’ next word. It is an evaluative, decision-making process.
The properties and methods of the decisions made by these machine learners are not, however, heuristic. An example of text generation based on what I’m calling a heuristic algorithm is the Markov chain, beloved of many eliterature students and practitioners. The next word of these chains will be quasi-randomly chosen from those that most frequently – within a pre-determined corpus – follow a pre-determined number of words in the immediately previous sequence. This kind of algorithm does not generate text that is as convincing or as ‘compelling’ (whatever that may mean) as that produced by the LLMs, but its operation can be called heuristic because we can know what it is doing. Whether a Markov chain produces the ‘best’ next word is not really and not necessarily at issue.
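A minimal Python sketch of such a chain makes the point; the supply file name is a stand-in, but the process itself is fully legible. Choosing uniformly from a list that repeats frequent followers is exactly the frequency-weighted, quasi-random choice just described.

```python
import random
from collections import defaultdict

def build_chain(words, order=2):
    """Map each `order`-word context to every word observed to follow it."""
    chain = defaultdict(list)
    for i in range(len(words) - order):
        chain[tuple(words[i:i + order])].append(words[i + order])
    return chain

def generate(chain, order=2, length=60, seed=None):
    """Seeded, hence repeatable, quasi-random generation from the chain."""
    rng = random.Random(seed)
    output = list(rng.choice(list(chain.keys())))
    for _ in range(length):
        followers = chain.get(tuple(output[-order:]))
        if not followers:  # dead end: no observed continuation in the corpus
            break
        output.append(rng.choice(followers))
    return ' '.join(output)

corpus = open('supply_text.txt').read().split()  # stand-in for a pre-determined corpus
print(generate(build_chain(corpus), seed=1))
```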
By contrast, LLMs may be built from many thousands of possible, mathematically studied, relatively specialist operations. These are based on those used for data management and statistical analysis, enfolding these earlier practices into the creation of generalized ‘vector spaces.’ These are then processed and computed using, for the most part, linear algebra. The operations are difficult to think through other than in the formulations of certain branches of mathematics, and they are, literally, impossible to visualize on 2D surfaces, or in 3D for that matter, due to the vector spaces’ high dimensionality. The choice of ‘best’ next word is being made in relation to many, many dimensions. This means that qualitative judgement or thought with respect to contemporary ‘generated text’ is, for most human readers, occult. It is as hidden with respect to articulation as the judgement or thought of, for example, a traditional student of literature.
All this is rapidly becoming common knowledge. Readers of this essay will likely have read many similar summary accounts in many journalistic and documentary registers.13My own is necessarily a very cursory account. The most useful of the more detailed, non-specialist accounts that I have read, apart from Mackenzie (op. cit.), are Hayles; Stephen Wolfram, ‘What Is ChatGPT Doing … and Why Does It Work?,’ Writings (2023). The summary here was composed originally for scholar critics of literature who are less likely to have the same degree of technical interest that I can assume in the case of student-practitioners of eliterature. But all human readers may find instances of LLM-generated modelit (and other artifacts) attractive because they present us with experiences which are analogous to our experiences of appreciative hermeneutic reading, during which we may remain in radical doubt concerning the articulation of our thoughts and judgments, or how these ‘operate.’ We may even be pleased or excited to believe that the workings and the style of a human or artificial literary artifact are somehow beyond us.
If I am addressing all readers, however, especially human, close readers, I must go on to reiterate that, as opposed to what all of us are doing when we read as humans, the LLMs are only working with text, and orthographic text at that. They are interrogating only relationships between text and whatever is sayable as significance and affect. Then also I must remind us, again, that what the LLMs are doing is complex and beyond the typical reader’s human comprehension. Despite this, the whatever-they-are-doing is nonetheless known to be formulated.14Formulated in terms of computation, for which Hayles (op. cit.) quotes an “exemplary definition” by M. Beatrice Fazi, “To compute involves a systemisation of [some aspects] of the real through quantitative abstractions.” And we also know that the whatever-it-is is basing its formulations exclusively on differences established by the ‘tokenized’ idealities of transcribed, orthographic text.15For more on the understanding of text in this essay, please see: John Cayley, ‘The Language That Machines Read,’ in Attention à la Marche = Mind the Gap: Penser la Littérature Électronique in a Digital Culture = Thinking Electronic Literature in a Digital Culture, ed. Bertrand Gervais and Sophie Marcotte (Montreal: Les Presses de l’Écureuil, 2020). This piece is also available, presented in a somewhat unusual way, on one of my websites at https://nllf.net/lmr.
When practicing new (e)literatures with and against current LLMs, what then is missing from our practice? Having given a summary account of what I currently understand of the workings of the LLMs, we reencounter three issues already evoked. The first concerns the distinct ontologies of text and language and orbits around the intuition (based on human reading of responses) that individual style is missing from our engagements with generative transformers.16Some readers may object that ‘style’ is, on the contrary, something that distinguishes the new generations of GPTs, something of which they are suddenly capable. Cf. Hayles, op. cit. “[GPT-3] is able not only to create semantic coherence and syntactic correctness but also to capture high-level qualities such as style and genre.” When I discuss style here, I am referring to individual style, as akin to and co-constituted by embodied human voice, the voice and style which is indicative of a particular individual, as expressed by acoustic features such as timbre, but also by peculiarities of language use: paradigmatic, syntagmatic, and also with respect to paralinguistic features such as intonation, etc. Style, as Hayles uses it here, is the kind of style emergent in literature: ornate, plain, 18th-century, demotic, etc. These can be ‘captured’ and regurgitated by the transformers because they have been ingested and ‘tagged’ by associated critical discourses. It is remarkable that contemporary transformers can respond to style and genre related prompts but this is not the same thing as having a voice and style of their or its own, which is my focus here. The absence of this individual ‘style’ is particularly sensible to humans transacting with AI during everyday, functional prompt and response sessions. I claim that one reason for this is ignorance – on the part of the LLMs – of human linguistic practice. They do not have significant data, for example, from vocal and aural linguistic practice, nor from the evocalized experiences that Garrett Stewart discovers in literary reading. In sum, what they do not have is data pertaining to the human embodiment of language. Language as such is inseparable from human embodiment at any and all levels of linguistic structure. The LLMs are working with text not language. Secondly, the heuristic approach to our practice is occluded by the hermeticism of the LLMs’ operations. Eliterature, particularly eliterature that deploys computational affordances for the manipulation and generation of texts (particularly texts that are attributed to the creative practitioner) needs to revisit the heuristic potential of its practice and also – thanks to insights offered by reflection on our first issue, the ontology of language as such – be prepared to explain whether and how a heuristic engagement with text has been and/or can be (closely) read into literature as such. Why is OuLiPo literature, when GPT text generation may not be? Why do the transformers have trouble with (OuLiPian) heuristic constraint? Finally, I must at least attempt to deal briefly with issues surrounding the moral rights, of association and integrity, that pertain to both the trained data and the generated output of Large (Language) Models. The way that practitioners must approach this issue is very different, however, because, given LLMs as they have been released, certain modes of potential pragmatic response have been foreclosed.
‘Data,’ morally associated with many individuals, in principle anyone who has published their linguistic creations on network-accessible platforms, has already been ‘acquired,’ and its integrity has been, by neural network procedural definition, compromised. There is little for the eliterary practitioner to do other than embrace this revolutionary overturning and erasure of moral rights, or protest its consequences. In principle though, with regard to the other two issues that I am raising, engagement may be nuanced, productive. Eliterary student-practitioners may wish, for example, to contribute to models that have other and better data given to us by language as such, or they might be prepared to get ‘deep’ into the obscurity of the mathematical, functional, computational workings of the models and transformers.17The work of Allison Parrish and David Jhave Johnston comes to mind in this context. Parrish is one of the few practitioner-students working seriously with computational approaches to language-as-phonology, including – with characteristic generosity – work to provide infrastructural support for other existing and potential artists. See, in particular, her Pincelate project (https://portfolio.decontextualize.com/; https://github.com/aparrish/pincelate) and Allison Parrish, ‘Poetic Sound Similarity Vectors Using Phonetic Features,’ in AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (2017). Johnston, already well-known for his monumental ReRites project, is and will be a highly influential practice-based researcher in the field of machine learning and language art. David Jhave Johnston, ReRites, 2017–19. Anteism.
We have already engaged with the problem of individual style by pointing to something that hardly seems to have been noticed let alone questioned. Large Language Models are not what they claim to be. They are, at best, Large Text Models. For literary style as we know it, we would, minimally, need to evoke and, as Garrett Stewart would put it, evocalize the text and bring it closer to language as such. Literature has ways of doing this. To date, the models do not.
Style is often construed in terms of voice, relating the two metaphorically in a way that is difficult to formulate. As a matter of record – at least anecdotal – style is often admitted to be missing from model-generated text. I can’t usefully say too much more about what style is, but we can say that it is closely associated with something that we know is, as I’ve suggested, missing from the training data of the LLMs. They do not have experience of or data from the kind of regularly related sequences of acoustic images (phonemes with appropriate, aurally indicated punctuation) that human readers do have, and that human readers are processing when they read any text, since this ‘text’ is also, for them – citing Garrett Stewart once again – a phonotext. Could the LLMs be given this data? Yes, they could. A robotic, humanoid version of it is, in principle, available to them as text-to-speech. Are they using this now? I do not know, but I have never seen it mentioned in the admittedly small part of the literature on LLMs that has reached my attention. If such data is ever enfolded into the models’ vector spaces, I will be initially surprised, and then even more concerned about the future of literature, depending on who advises the use of data like this, from aurality, in the LLMs’ discourse. I devoutly, if somewhat despairingly, hope it will be humanists who do so because, as opposed to the default position of current engineers, they may agree with me that style and (authorial) voice are, amongst other features, ontologically essential to whatever language is. And voice and style are both a function of practices and events requiring language-animal embodiment.
The computational supposition that the relations between text and language, between acoustic image and language, and between acoustic image and text can all be formulated and meaningfully subjected to evaluative computation will still be at issue, however. Based on what we know about the simplest of these relations – that between acoustic image and text – humanism and human reading still have a great deal to say in and about the discourses of literature and of eliterature insofar as language is still a medium that is important to the latter.
To begin to deal with some practitioner-pertinent issues surrounding the LLMs’ hermetic processing of language for ostensibly aesthetic purposes, I will revisit one of my own long-term projects, one having collaborative aspects and outcomes. The underlying proposal is that the processes applied in this past project were and are heuristic. Comparisons with very recent exercises deploying various flavors of LLMs linked to chat versions of their GPTs have been put together in a supplement to this essay, inevitably, rather hastily and cursorily. The following section of the main text cannot really be considered a paper or a comparative study, but my sense is that, together with the supplement, it will nonetheless prove suggestive and indicative at an important moment in the history of eliterature and, indeed, (digital) language art. My remarks on the chat exchanges are very brief. They are contingent on, and mostly tied to, the examples concerned, which are given with the full text of prompts and GPT responses.
For some years now, I have been fascinated by what is considered Samuel Beckett’s last novel, Comment c’est (1961), translated by Beckett himself as How It Is (1964), and, in particular, I have used twenty-two paragraphs from the first part of this novel (as well as the entirety of the novel text) as a supply text for the generation of pieces that I consider to be works of language art with computation (or, if you prefer, eliterature, or digital language art).18More details in John Cayley, ‘Beginning with ‘the Image’ in How It Is When Translating Certain Processes of Digital Language Art,’ in Colloques de l’Université Paris 8: Translating E-Literature = Traduire la littérature numérique, ed. Arnaud Regnauld and Yves Abrioux (Paris: Université Paris 8, 2015); ‘The Translation of Process,’ Amodern, no. 8 (2018). These twenty-two paragraphs were also separately published by Beckett in French as a sort of narrative prose poem, ‘l’Image’ (1956), ‘The Image’ in English. Thus, this sequence of paragraphs has a certain integrity of its own within the novel. A number of the resultant pieces – the outcomes of algorithmic processes with quasi-random, non-deterministic aspects – are published in the ‘Images’ section of my recently ‘augmented and reconfigured’ Image Generation (2023).19Image Generation: Augmented and Reconfigured.
Notes on the processes of the Image Generation pieces are provided online in the supplement already mentioned. Other notes on the site for the book as a whole give more information on processes, as used in different contexts for other supply texts. Texts of this stamp – along with much else in eliterature to date – set out their processes and typically make them explicit or discoverable. These processes often incorporate quasi-random algorithmic elements rendering them nondeterministic, but there will also be an underlying procedural constraint that could be applied deterministically. These are comparable to OuLiPian constraint or, for example, to the instructions-as-art of Sol LeWitt. I call them heuristic because they are explicit and humanly readable (at least in principle). In the course of this kind of algorithmically implicated creative practice, an ‘art’ – in more or less conventional understanding – of its author or artist, as expected or demanded by evaluative critics, can still be found in one of two places. Before execution of the algorithm, in the preparation and choice of the material supplied; and in the choice and design of the algorithm itself, anticipating this material. Or after execution, which may or may not result in the author/artist working manually on the result of the algorithmic generation or rejecting the result and using the experience to feed back and adjust supply material and/or algorithm for subsequent iterations. All of this, in principle at least, is motivated and driven by human reading (reading in a broad transmedial sense) and rereading. In my own case, I tend to work on the preprocessing of supply material and the algorithm, aiming to reiterate as necessary until the resulting generated or manipulated text has reached what one of my teachers in another discipline would call a ‘satisfaction point.’ Again, in my own case, I am often satisfied with generated texts that are quite arbitrary, texts that require to be read as testing language at limits of syntax and word choice. They may be highlighting, in a discomforting way, linguistic or sociolinguistic phenomena at horizons where sense and style, significance and affect are at some kind of risk. At best, the language concerned, and its heuristic processes, discover humanly appreciable, language art aesthetics at these horizons – a sunset or sunrise with strange but articulated, artificial weathers.
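The processes of the Image Generation pieces themselves are documented in the supplement, but a toy example, sketched here in Python and of my own devising for illustration only, shows the stamp of such a practice: an explicit, readable constraint together with a seeded quasi-random element, nondeterministic across seeds yet exactly repeatable for any given seed.

```python
import random

def erasure(supply_text, keep=0.4, seed=None):
    """A toy heuristic process: keep each word with probability `keep`.

    The rule is humanly readable, and the seed makes the quasi-random
    choices repeatable, so the 'same' piece can always be regenerated.
    """
    rng = random.Random(seed)
    return ' '.join(w for w in supply_text.split() if rng.random() < keep)
```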
None of this is currently possible, at least not in ways that can be articulated, when working with GPTs and their LLMs. The situation may change, or artists may develop and deploy their own models, with or without some understanding of their function. These, however, will be little models, not large, although insofar as they are networked with their larger cousins, or rely on their pre-trained models, they will still present us with the problem of hermeticism.
For the moment, we need to address the models’ and transformers’ actual-existing hermeticism and certain of its characteristics. We need to address what we are encountering now. One of the reasons to do so is to highlight a crucial misapprehension that tends to be performed and adopted by all of us, as and whenever we interact with current large models. As some of the tech-leader doomsayers – those advising government regulation or a pause in development – have pointed out, one of the most plausible existential risks of artificial super-‘intelligence’ derives from the fact that, in principle, it is a property of something that is more of a singular entity than a property of particular individuals belonging to an intelligent species (like homo sapiens).20Alex Hern, ‘Why the Godfather of AI Fears for Humanity,’ The Guardian (2023). As we tend to do with technology in general, we anthropomorphize in this sense: when we encounter new things or interface with things that we have newly made, we may perceive and relate to them without necessarily showing the philosophical and sociopolitical ‘reserve’ that might be advisable, the ‘bracketing’ or epoché of phenomenology. If something seems to chat, we assume that it is a something that chats. If it uses language that passes the Turing test, we assume it is human or at least like a human. And yet, when we pause and bracket this supposition in the case of ChatGPT, we recall that we know that it is not chatting, and we also know that it is not ‘like a human.’ It is not separately using language with each of us, for example. Each of us may well be talking with it, many of us more or less simultaneously (even though, of course, actual simultaneity is philosophically impossible for linguistic interaction) and definitely simultaneously to the extent that thousands or millions of us are addressing one entity at more or less the same time and doing so in a manner which this entity can ‘appreciate,’ in the sense of making some use of all these transactions in a way that is impossible for any of us, individually. QED, GPTs are not individuals in anything approaching the sense that we are. And our treating them as such is a crucial, consequential problem of misapprehension, which some commentators associate with existential risk. For me, this includes the problem of using – not ‘them’ but – it, as ‘an’ assistant or collaborator, for eliterary practice. More properly stated, we may be trying to use it as the assistant or the collaborator. As an author/artist/creator, you may still want to go ahead with this (I will admit) but, given a number of other considerations, I hope you will concede that doing so is seriously problematic.
Disregarding, for the moment, what I have just said about the nature of the entity with which one is collaborating, by some lights, the overall ‘workflow’ when practicing eliterature with LLMs using GPT interfaces is not that different. The author still has some control over supply text, with or without additional curated training. There is skill and imaginative engagement in the composition of prompts, although there are hidden infrastructural constraints, at this point, of prompt length and triggering phrases for example. We have already addressed the hermeticism of the algorithm, but once the functions have been found and executed by the transformer, the author can still take the result and do more with it. They may also reject certain responses and feed back, although only really in terms of adjusting supply or prompt, plus minor tweaks, typically of ‘temperature,’ which seems to resolve to adjustment in the range of probabilities – allowing a range of responses for each ‘best’ next word. This, almost by definition, will have its strongest effects on what we otherwise call word choice, diction, vocabulary – more variation in the paradigmatic as opposed to syntagmatic dimension.
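For what it is worth, the standard engineering account of ‘temperature’ can be sketched in a few lines; the function below is illustrative, not any vendor’s API. Dividing the model’s output scores (logits) by the temperature before the softmax flattens or sharpens the distribution from which the next word is drawn, which is why its effects fall mainly on word choice.

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, rng=None):
    """Draw one token id: low temperature approaches the single 'best' next
    word; higher temperature widens the range of admissible words."""
    rng = rng or np.random.default_rng()
    scaled = np.asarray(logits, dtype=float) / max(temperature, 1e-8)
    exp = np.exp(scaled - scaled.max())  # subtract max for numerical stability
    probs = exp / exp.sum()              # softmax over the vocabulary
    return int(rng.choice(len(probs), p=probs))
```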
Fundamentally, the artist cannot change the algorithm(s) of their singular assistant and collaborator: the language model assistant. The engineering of this assistant is not intended to test or experiment with language, whatever language may be. It has been and is being built and made and developed – and given vast resources to this end – to ‘compute’ or to ‘solve’ language, based on the data available. In fact, as Artificial General Intelligence, AGI, it is made and built and developed to solve everything that is subject to data capture and formulation. And this ‘everything’ is absolutely everything for some scientists and philosophers of science. I might be persuaded to agree that if AGI had access to all the data that the world can give us with respect to language, then it might listen and speak as we do. I know that, for now, it does not, since it has only a lot of text and relatively little data, so far, pertaining to our humanly embodied practices of language.
Apart from this more or less ‘quantitative’ objection, I am also able to assert, with a degree of plausibility, that language is, ontologically, not computable. It is not subject to any ‘best’ or ‘correct’ transcription, for one thing, and thus not subject, ultimately, to formulations that generate language as such. As it happens, I believe that language constitutes not so much ‘me’ but the ‘sayability’ of what I am, and does so in necessarily – also co-constitutive – social interactions with other animals like myself. I round out this paragraph by claiming to know that the LLM (the entity we will ultimately be presented with) is not an animal like myself and thus cannot, like Wittgenstein’s lion, ‘speak’ the same language. I am terrified by the prospect that I might have to learn to speak its text-degraded pseudo-language and that I might also be tempted to do so for the practice of a pseudo-language art of my own.
I have now tried to say what I wanted to say with regard to both the – scientistic, engineering – motivation for and – aesthetic, social, cultural – consequences of the hermeticism of transacting with GPTs and their LLMs (on the way to the one LLM). To fill out this section I ask the reader to consult the commented supplement to this essay. This presents a number of dated but relatively haphazard and cursory prompt-and-response sessions with a couple of GPT platforms. The intention is not so much to bolster the arguments I have set out here with respect to philosophical or art-pragmatic principle. Instead I hope that the supplement will provide some points of comparison with the processes and results of my earlier practices with the same material, Beckett’s How It Is and the twenty-two paragraphs of ‘The Image.’
Finally, this essay leaves you with a few sentences that speak to the moral rights of language makers, and the creation of language art with computation, in light of the LLMs’ advent. From my own personal and pragmatic point of view, copyright was and is the misapplication of already suspect liberal individualist real property rights to instances of linguistic performance recast as ‘copies’ of intellectual property. This was bound to founder in actual-existing commerce once it was no longer necessary to accumulate significant capital in order to manufacture physical, distributable copies, once there was no friction to overcome in the sharing of texts, for copyright as applied to literature. This has also proved a problem for copyright in the market for music, where what is copied still requires additional investment in both trained skills – time – and instruments or equipment – money – on the part of the artist-creator. In the singular case of linguistic creation – although talent, time, and acquired skill in using language may be pertinent – the facility is immediately available to all of us, and so a market-based trade in language as such seems to us absurd and wrong. We are creatures defined by our language use. It was history and technology that made possible a trade in transcribed language, but only as long as the capitalist friction, mentioned above, applies. Now, more recent history and technology – the internet – puts paid to the pragmatic motivation of that market’s property-based exchanges. Or should do. Instead, pure legislation and powerful interests reify an enclosure of the linguistic commons which should never have been allowed to establish itself. And in a more recent ‘now’ – since LLMs began to ‘generate’ their own linguistic artifacts – if we continue to concede that copyright is status quo, we find ourselves in an even stranger wonderland-for-the-IP-rich, for the chiefly corporate legal owners of whatever intellectual property is judged to be – with hardly a shred of intellectual or related economic justice.
Before my final sentences, which attempt to make clearer what is obviously wrong with copyright since the advent of LLMs, recall that within the law of copyright, the moral rights – to be associated with something you have made, and to have the integrity of its form-as-composed respected – are considered separately from the property and commercial-value rights of the legal framework.21Invoking Saussure at this point and also the self-evident fact that events of language are produced and received by humans, and machines for that matter, as linear sequences, it is interesting to consider that, in the case of language, integrity amounts to the integrity of particular sequences of linguistic elements, or strings of tokens in computational terms. It is also interesting to consider that very short, unique – and thus, arguably, original? – sequences of such tokens of transcribed natural language can be relatively easy to compose (a vocabulary of 50,000 words yields on the order of 10^28 possible 6-token sequences, so many, many 5- or 6-token sequences will be unique) and, since the advent of internet search, easy to find. Vast numbers of these sequences now exist ‘virtually’ in the latent space of the LLMs (or, more properly, in the latent space of the LTM, the Large Text Model). How long before the owner(s) of these latent sequences claim(s) to own all the so-far unrecorded sequences themselves? As creator-owner you may, and sometimes must, assert your moral rights. Regardless, you always already ‘possess’ them. If they are not worth anything, however, says the law, you won’t necessarily be able to receive, legally, a commercial remedy, unless you can show, for example, that an infringement of your asserted moral rights has harmed or reduced your (intellectual) property.
But let’s say that everything I say or write on the internet is worth something, even some very small amount. It is. Say this is true of everyone else on the network. What then, if some corporate individual builds a machine to harvest everything we say and uses it – with no regard for anyone’s moral rights – to make another machine that generates the most valuable language in the world that we all share? A machine to generate, that is, all the so-called language as it so-called should be according to ‘super-intelligent’ formulations, the “best” sequences of words in every instance? Well, who deserves to get paid for what the machine has written? The following is my own opinion of these circumstances in terms of current law. Original creations made by millions (if not billions) of authors have been taken from their owners in copies transcribed from the creators’ language. This so-called data has then been, literally, hermetically enclosed. Other words that one might use for this action might be: captured, seized, extracted, stolen. The creators’ moral right of association has been infringed and the integrity of their creations has been entirely disregarded. Their intellectual property, originally language as such, has been harmed. They are owed proportional compensation in respect of any profit that has been or will be derived. In the short term, and in practice, it seems that the law will find, by contrast, that LLMs produce ‘original’ creations that are owned by the neo- or vectoral-capitalist owners of GPT infrastructure. This is a monstrous, anti-human injustice.
Again, in my creator-layman’s opinion, the LLMs’ only defense of these circumstances in current law would come from the still largely untested principle of fair use: the principle that subsequent creators can fairly and freely use previous creations if the derivative work has transformative or educational value. Even if this is a plausible interpretation of what the LLMs are doing, should we then enrich corporate individuals for the ‘services’ entailed? Here I maintain – while also fundamentally rejecting copyright as it was and is – that the hermetic processing of the LLMs is pertinent. How do we know whether or not the LLMs are making fair use of our creations?22Contrast the critique of copyright that I undertook with my collaborator Daniel C. Howe for the multimodal project How It Is in Common Tongues, one outcome of which was a printed book: John Cayley and Daniel C. Howe, How It Is in Common Tongues, (Providence: NLLF Press, 2012). The process was heuristic: find all of what we called the ‘longest common phrases,’ sequentially, in the entire text of Samuel Beckett’s How It Is. We showed that the phrase is ‘common’ – that is, it is a sequence of linguistic tokens from the commons of language – by finding it on the internet, but not attributed to (associated with) Samuel Beckett, and we provide our citations in the book. We thus end up generating the entire text from the commons of language (“in common tongues”) without disturbing its integrity or its (other chief) moral rights of association. We believe that in this instance we can claim fair use because the heuristic procedures we employed offer a transformative and educational critique of copyright (along with other characteristics of the text) in our post-internet literary and technologically-transformed culture. We can also claim that the manner of reading the text that we have openly, heuristically offered up provides a transformative, and arguably aesthetic, experience for reading How It Is … in common tongues. Any LLM ‘use’ of Beckett’s words or style (see the appendix) is not heuristically construable in these ways. How would they/it or its owners know? Why should the owners of these entities be allowed to profit from this appropriation and ‘use’ – without discrimination or integrity – of our everyday and our finest and most extensive linguistic creations? The original creative, social acts and events of language are not only proper to us, but they make us something that we know the LLMs are not. Reading and speaking with one another make us what we are, language animals.
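As a coda to the note above, the shape of that heuristic procedure can be given in a few lines of Python. This is a reconstruction for illustration, not the project’s code, and is_common stands in for the actual test: an internet search confirming that the phrase circulates without attribution to Beckett.

```python
def longest_common_phrases(words, is_common):
    """Greedily carve a text, sequentially, into its longest 'common' phrases."""
    phrases, i = [], 0
    while i < len(words):
        j = i + 1
        # extend the phrase while it can still be found 'in common tongues'
        while j < len(words) and is_common(' '.join(words[i:j + 1])):
            j += 1
        phrases.append(' '.join(words[i:j]))
        i = j
    return phrases
```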
Bibliography
Agamben, Giorgio. What Is Philosophy? Translated by Lorenzo Chiesa. Crossing Aesthetics. Stanford: Stanford University Press, 2018.
Cayley, John. ‘Beginning with ‘the Image’ in How It Is When Translating Certain Processes of Digital Language Art.’ In Colloques de l’Université Paris 8: Translating E-Literature = Traduire la littérature numérique, edited by Arnaud Regnauld and Yves Abrioux, http://www.bibliotheque-numerique-paris8.fr/fre/ref/168452/COLN11_4/. Paris: Université Paris 8, 2015.
Cayley, John. ‘Differences That Make No Difference and Ambiguities That Do.’ Review of Martin Paul Eve, Close Reading with Computers, Stanford: Stanford UP, 2019. Novel: a forum on fiction 54, no. 2 (August 2021): 315-320.
Cayley, John. Image Generation: Augmented and Reconfigured. Denver: Counterpath, 2023.
Cayley, John. ‘The Language That Machines Read.’ In Attention à la Marche = Mind the Gap: Penser la Littérature Électronique in a Digital Culture = Thinking Electronic Literature in a Digital Culture, edited by Bertrand Gervais and Sophie Marcotte, 105-113. Montreal: Les Presses de l’Écureuil, 2020.
Cayley, John. ‘The Translation of Process.’ Amodern, no. 8 (2018) https://amodern.net/article/the-translation-of-process/ (accessed July 31, 2018).
Cayley, John, and Daniel C. Howe. How It Is in Common Tongues. Providence: NLLF Press, 2012.
Eve, Martin Paul. Close Reading with Computers: Textual Scholarship, Computational Formalism, and David Mitchell's Cloud Atlas. Stanford: Stanford University Press, 2019.
Hayles, N. Katherine. ‘Inside the Mind of an AI: Materiality and the Crisis of Representation.’ New Literary History 54, no. 1 (Winter 2023): 635-666 (accessed June 7, 2023).
Hern, Alex. ‘Why the Godfather of AI Fears for Humanity.’ The Guardian (May 5, 2023) https://www.theguardian.com/technology/2023/may/05/geoffrey-hinton-godfather-of-ai-fears-for-humanity (accessed June 7, 2023).
Johnston, David Jhave. ReRites. Montreal: Anteism, 2017–19. http://glia.ca/2017/rerites/ (accessed July 9, 2018).
Mackenzie, Adrian. Machine Learners: Archaeology of a Data Practice. Cambridge: MIT Press, 2017.
Merullo, Jack, Carsten Eickhoff, and Ellie Pavlick. ‘Language Models Implement Simple Word2vec-Style Vector Arithmetic.’ arXiv (May 25, 2023): Preprint. Under review. arXiv:2305.16130v1 (accessed June 2, 2023).
Parrish, Allison. ‘Material Paratexts.’ In ICCC 2022, 2022.
Parrish, Allison. ‘Nothing Survives Transcription, Nothing Doesn’t Survive Transcription.’ In Iona University, Data Science Symposium, 2023.
Parrish, Allison. ‘Poetic Sound Similarity Vectors Using Phonetic Features.’ In AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, 2017.
Pavlick, Ellie. ‘Symbols and Grounding in Large Language Models.’ Philosophical Transactions of the Royal Society A 381 (April 20, 2023): 20220041.
Rettberg, Scott, Talan Memmott, Jill Walker Rettberg, Jason Nelson, and Patrick Lichty. ‘AIwriting: Relations between Image Generation and Digital Writing.’ Paper presented at the ISEA 2023 Conference, Paris, 2023.
Stewart, Garrett. Book, Text, Medium: Cross-Sectional Reading for a Digital Age. Cambridge: Cambridge University Press, 2021.
Stewart, Garrett. Reading Voices: Literature and the Phonotext. Berkeley: University of California Press, 1990.
Wolfram, Stephen. ‘What Is ChatGPT Doing … and Why Does It Work?’ Writings (2023) https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/ (accessed March 31, 2023).