Off Center Episode 5: AI, Computational Creativity, and Media Production with Drew Keller

Drew Keller; Scott Rettberg

doi:10.7273/k9rb-wp11

interview

Sunday, January 7th 2024

Drew Keller, Microsoft employee and graduate of the Digital Culture program at the University of Bergen, joins Scott Rettberg to talk about the potential role of AI in our media production. From the Jacquard loom to the PowerPoint Designer, human creativity has always been intertwined with technology, but is the rapid increase in AI a revolution in the way we produce media, or just another tool?

SR: Welcome to Off Center, the podcast about digital narrative and algorithmic narrativity. My name is Scott Rettberg, and I’m the director of the Center for Digital Narrative at the University of Bergen. In this podcast, I’ll have conversations with the researchers at the Center, as well as other experts in the field, to discuss topics revolving around digital storytelling and its impact on contemporary culture. In this episode, I’ll be talking with Drew Keller about AI and computational creativity in media production.

SR: Today I’m here with Drew Keller, and Drew is actually a recent graduate of our master’s program in digital culture. Drew, you have a unique background. Could you say a little bit about where you come from?

DK: Sure. First of all, thanks for letting me be here. I really appreciate this. I’m what I think you could consider a nontraditional student. I’m not exactly the youngest member of the cohort. This is my second master’s degree. I’m originally from the States, from Seattle, and I work for a large technology company.

SR: Oh, you can’t say which one?

DK: Well, I work for Microsoft, although this is not an official Microsoft representation. And one of the things about working for Microsoft is they have this great culture of making certain that the employees are lifelong students, that we’re always inquisitive, that we’re always looking for things. So, while I had my first degree in digital media, in how we share things, specifically videos online, I wanted to look at artificial intelligence. And the company was very supportive of the fact that I wanted to look at something that may not be directly applicable, at least at the time, to my work. The idea was that I wanted to take a deep dive into specifically artificial intelligence, storytelling, and video production.

SR: When I saw your application to the program, I said, “We must be doing something right if people are taking sabbaticals from Microsoft to come and work with us at the Center.”

DK: I will say that the company at first didn’t necessarily understand, because this was over two years ago, and they really did not understand the potential of AI, specifically the applications in storytelling and media production. And it wasn’t until six months ago that they went, “I get it.” And so it turned out to be very fortuitous that I’m here in this program at this time, doing the work I’m doing.

SR: Yeah. And what a great time to be writing on this topic. So much has happened over the last couple of years.

DK: And my background originally is in media production, documentary production, writing, directing, editing, motion graphics, whatever I needed to do to tell video stories. And it was clear that the tools that were being developed were going to have direct applications into how I did my work. So, in many ways, wanting to look at this was occupational survival.

SR: Great. Well, let’s get into that a little bit later. But let’s talk first about the title of your study, “Computational Creativity and Media Production at the Crossroads of Progress and Peril.” So this is a really intriguing title. Can you explain how you’re highlighting both progress and peril here, and how AI fits into those two poles?

DK: Well, let’s start with progress and peril. I think that’s maybe the easiest way to enter into this. With almost every significant technological leap, not just AI, the technology, the iterative change presents opportunities, things that are unlocked that before were often unimagined. But there are almost always consequences to those iterations. And so, this development of AI shows that we are in yet another significant technological iterative leap. We’re taking a huge jump forward. But I think there’s peril, not just the media hype that we see about sentient machines and Skynets coming and that sort of thing. I think it’s a little more nuanced than that. There is what I would consider peril in terms of the individual ownership of creative ideas. I think there’s peril in terms of who’s going to stay employed, who’s not going to stay employed, who’s going to have requirements placed upon them to either be more efficient or to move to stay employed, or to have to completely reinvent themselves or just get left behind.

SR: Some jobs will go away.

DK: No doubt about it. And that’s been true with almost every large technological leap, this transformation leaves some folks behind.

SR: Yeah, so I think this is the case, although AI has exploded into the public consciousness right now, new technologies have always historically been received in complex ways when they first began impacting society. Maybe we should go back to that and think about how other technologies have historically been received, going back even to the Industrial Revolution.

DK: I wanted to get a little bit of insight into our potential future or at least some potential scenarios by looking back at previous technological changes. And one of the first places I looked at was lace production. Now, of all things, lace, how does this fit into AI? But there are some incredible parallels in using lace as a way to examine the outcomes.

DK: Lace was seen as a very finely crafted piece of artwork, and it was available to the super wealthy. There’s a reason why all of those paintings of kings and queens and nobility have the big lace collars: it was a big luxury item. And the production of lace was performed by craftspeople, and it didn’t require a lot of overhead. So it was often women and young girls who were living on basically subsistence farms. It wasn’t a big factory, it was literally a cottage industry. They were in their cottage being industrial. And folks who were manufacturing textiles saw an opportunity to make a lot of money because there was so much demand for lace, so they started looking for ways to automate, and that led to attempts to create looms and other machines that could do it.

SR: In some ways, the connection to computation is the Jacquard loom.

DK: You’re spot on. The idea of giving a loom frame instructions through a sequence of basically on-off switches is found in the punch card of a Jacquard loom. And the Jacquard loom, that punch card, it comes back over and over and over again in the development of AI, because it, in essence, is binary. It’s on-off. And so whether it’s a player piano or Jacquard loom, it works the same way. In lace production it took a while to invent machines that actually could perform these very detailed tasks of creating lace. And ironically, the first successful model was created by a guy who sat and watched a woman making lace and recorded all of her movements, and she basically put herself out of a job sitting there. The guy recording what she did, he then replicated it as instructions. And ten years later, there was no work.

SR: Sort of the first artificial intelligence.

DK: Exactly. Whether it was the steam age or whether it was lace, what happened was machines would come in and they were about efficiency of labor, they were about creating things at scale, reducing the cost of production, and almost always at the expense of the people who originally were creating that craft.

SR: And people got upset about that.

DK: And—

SR: They burned things down.

DK: You bet they did.

SR: Factories.

DK: Yeah. So the rise of Luddism was a direct response. When hydropower arrived, there were a lot of folks who were in horrible working conditions and, frankly, were undervalued in what they were doing and were out of a job. And so there were a group of folks who used a fellow by the name of Ned Ludd as their sort of spiritual leader, and they decided to fight the rush towards industrialization. And they started, as you said, breaking into factories, breaking the looms, and they worked really hard to target the factories, the mill owners and the people who supported this rush to industrialization. Unfortunately, those are also the folks who had the money and the power, which—

SR: Maybe that’s the same case today.

DK: Exactly. Well, the consequence of that was laws in England where if you were convicted of basically breaking these things, they killed you. It was punishable by death. And it was this huge effort to try to fight industrialization, and ultimately it failed because the amount of money involved—

SR: Progress marches on.

DK: Exactly. And, frankly, the market liked the fact that we had all these cheap goods and ultimately, the folks who were protesting the rise of industrialization fell by the wayside.

SR: And now we’re looking at situations where people like graphic designers, people like copywriters, some secretarial functions, some middle level management, different media producers of various kinds, which is something you get into, and a lot of other people are potentially displaced by this technology.

DK: Oh, there’s no doubt about it. When we look at the rise of industrialization at the beginning of the 20th century, the invention of the assembly line, and then even as we got into the late 20th century with robotics and those sorts of things, the people who were being displaced were laborers. These were folks who may or may not be college educated. They had practical skills, but you could create much more efficiency in your factory if you welded with a robot than with 20 people. And frankly, it was more cost effective. What’s interesting about AI is the folks who are likely to feel it most are those of us who are college educated professionals, because AI is coming after the accountants, the project managers, the middle managers, the folks who do work that is specific but replicating patterns, things that you do over and over again. It’s making these connections with things. And that’s what AI excels at, is managing large, massive blocks of data and putting things together quickly. So the folks who are most likely to feel the initial push of AI are likely to be the folks who have not felt it in any of the previous revolutions. It’s going to be significant.

SR: It’s not the blue-collar workers.

DK: The white-collar folks are really going to get hit pretty hard. And it’s interesting, because a lot of AI is currently being woven into applications, and you can compare the expected monetary valuation of AI with something like the digitization of the economy. The digital economy, I believe, was at $17 trillion. And the estimates are that this is going to be over $80 trillion in half the time it took us to create the Facebooks, the Metas, the Googles, all of those things. We’re looking at an exponentially larger slice of pie with this money. And a lot of those applications are being woven in on the backside or inside the shells of the applications we use every day, whether it’s productivity apps, whether it’s analysis apps, whether it’s trading stock, whether it’s searching on the web. You may not necessarily see a manifestation of AI, but suddenly your grammar is corrected, or you get a suggestion on a rewrite of a sentence.

SR: PowerPoint now does most of the design of my PowerPoint slides.

DK: Which is interesting that you bring that up because Designer, which is the app that they are rolling out, leverages a lot of AI to help folks become not only more efficient, but I think a lot of folks feel empowered to achieve things that previously felt unattainable. And whether that’s graphic design, whether that’s solid writing, whether that’s some sort of analysis, that’s really the potential part of this. That you can create efficiencies which eliminate a lot of the mundane stuff of our daily lives.

SR: The boring stuff.

DK: Just the drudgery, and it opens you up to spend more time on the things that interest you or the problems that are harder to solve. But really you don’t necessarily see evidence of it. We had a bike race here two weeks ago. I was out taking photographs, and I had a photograph that I really liked a lot. But the background, well, a lot of the apps have had generative fill. Generative fill within Photoshop is astronomical because, literally the ability to not just remove the sign, but to recreate a tree that looks like the bottom half of the tree. It’s not just blurred pixels. It opened up the opportunity to achieve the look of the photograph that I wanted, even though I couldn’t control the background.

SR: I’m going to use it to take my kid out, he’s always photobombing pictures.

DK: Yeah, that’s great.

SR: To come back to AI, even that term, artificial intelligence, that’s kind of been controversial in the history of computing. What’s the difference between human intelligence and artificial intelligence?

DK: Developers have been working since the very beginning to recreate human intelligence. I mean, that’s been the goal, to create a model that is comparable, and the easiest path for them to do that has been modeling how we think. The difference between artificial intelligence and human intelligence is beginning to blur. But one of the main differences that I think still gives us a leg up on the machines is our ability to make random connections. Most of the AI that we work with is still narrowly focused on a particular task. I can’t ask Alexa to drive my car. That’s not within the skill set of that AI.

SR: You need to ask Tesla. Ask Elon Musk instead.

DK: Exactly. So the AI that we’re working with, the models tend to be trained to accomplish a very narrowly focused and specific task. They’re learning different things within it, but it’s still narrowly focused. Our strength is that we can make connections. That may be a personal experience 20 years ago that has no relationship to the problem we’re solving, but in our brain, we see a parallel or a piece of insight or an observation, which we apply to create something new. And that is one of the benefits of human intelligence, is our ability to do sort of random pattern matching. And AI is very good at understanding patterns, but it’s not very good at connecting patterns from a number of different sets. That’s likely to change. But right now, that’s a limitation.

SR: Yeah, it doesn’t have childhood memories that drive the metaphors behind it.

DK: And the other thing about AI that we tend to lose sight of is our brain does more than just that sort of analytical, clinical processing. It’s sensory. We know how it feels, we see things, we touch things, we know how to move through the world. It is a very complex organ that manages a lot of different inputs other than just this cognitive part. And so even in the fact that it is only emulating the intellectual part of what we’re doing, it doesn’t have the ability to manage and receive input from a variety of sources, which again, gives us the opportunity to create new ideas, put things forward that have not been put out there before. It’s a wonderful way to process new ideas.

SR: And as far as I understand it, most of these systems, certainly large language models or foundational models as they’re calling them now, these are probabilistic. It’s math behind it. So it’s trying to find the most likely answer, not the original answer.

DK: Right. I mean, when we look at when Google dropped the transformer model, I think in 2017 and that’s the T in GPT, essentially, they are looking at past datasets and how words fit together previously to then predict what’s the best choices for the next word. It’s like laying bricks in a wall. You do one and then the next one, and the next word to the next word to the next word. And that’s how it’s forming this. It has no context of facts, it has no context of referents. When I say the word cow to you in your head, you see it.

SR: I get a picture of a cow. I think of milk.

DK: Yeah, or a memory or what you associate the reference in your head—

SR: Or a poem: “How now brown cow…”

DK: And these tools don’t have any of these referents and that’s part of the reason why they generate faulty answers. We’ve heard a lot about these hallucinations and there are usually four reasons for hallucinations, or four types of hallucinations. One is giving you an answer that doesn’t have anything to do with your prompt. One is, and this is the most common: faulty datasets. Are you aware of two of the largest datasets in GPT? Reddit and Wikipedia. Not exactly known for their fidelity and their rigor and facts.

SR: Fan fiction has a big chunk in there too.

DK: So, they are often generating these hallucinations, these sorts of drawings with six fingers or weird answers, because they have no reference, they have no idea. These are just tokens. They’re stringing one after another. We assign meaning to it, but to them it’s just a sort of numeric string. It’s a math equation.
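[Editorial illustration] The “one word after the next” idea described above can be sketched with a toy model. This is only a minimal sketch under loud assumptions: it counts which word followed which in a tiny made-up corpus and always picks the most frequent follower. Real large language models use neural networks over sub-word tokens and far larger datasets, so the function names and the corpus here are hypothetical and chosen purely for illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words followed it in the training text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def generate(follows, start, length):
    """Lay one 'brick' at a time: always append the most frequent follower."""
    out = [start]
    for _ in range(length - 1):
        options = follows.get(out[-1])
        if not options:
            break  # the model has never seen anything after this word
        out.append(options.most_common(1)[0][0])
    return " ".join(out)


# Hypothetical toy corpus; the "model" only knows these word sequences.
corpus = "the cow eats grass . the cow gives milk . the cow eats grass ."
model = train_bigrams(corpus)
sentence = generate(model, "the", 4)
print(sentence)  # the cow eats grass
```

The sketch has no picture of a cow and no notion of milk. It only knows that “cow” was most often followed by “eats” in what it was shown, which is the sense in which the output is a likely continuation rather than a checked fact.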

SR: Yeah. In some ways, one of the things I love about these sort of text to image generators we’re talking about are the six fingers. This sort of really strange, Uncanny Valley type of thing that, to me, is way more artistic than once we get to total photorealism or things like Midjourney.

DK: Taking a leap over to generative models, there are a number of different ways to do it, but one of the most popular is referred to as a generative adversarial network. Without getting into too much detail, basically the model is trained on, does this look like a cat? Yes. No. Yes. No. And it keeps learning, oh, now I know how to draw a cat. Creative adversarial networks, CANs, are really about not trying to replicate a specific pattern, but instead taking basic parameters and really going outside the bounds of what the cat looks like. And there’s a lot of interesting work that’s being done in this sort of liminal space. This is cubism, but what can I do within cubism? And there’s really interesting creative voices that are being developed within that, not just trying to replicate a cat but beginning to think differently about a particular style of painting or the sort of in-between space between genres in storytelling.

SR: Latent space.

DK: The latent space is absolutely right.

SR: I want to come back to what you do, to video production and to storytelling, and this is something you deal with later in your study, some ways that you see this affecting production and not just sort of replacing people who make video productions now but enhancing our abilities. And you even propose a new model that might eventually lead to an application. Can you say a little bit about these positive effects that you see coming down the road?

DK: I think there are a couple of things. One of the things that I like to do with generative models, be it text or image, is ideation, having a concept of what I want to accomplish. And do you know what the spaghetti theory is? Throw it against the wall, see what sticks. So often I use AI as sort of my rapid prototyping, my sort of spaghetti theory, if I’m thinking of this or this and coming up with solutions that I may not have come up with as quickly as it does. And this sort of ideation is wonderful for me because it’s got a depth of resources, a depth of reference that I don’t necessarily have readily available. The other thing that I see as a positive is what I talked about before with sort of workplace efficiency and that is eliminating some of the mundane, some of the drudgery that we do in creative endeavors.

SR: Right.

DK: And whether it is the photograph that I fixed in Photoshop at the bike race, I absolutely could have taken the time to do that. I could pull the masks, I could draw it out, I could do all of that. But the fact that I could do it in 4 seconds?

SR: Instead of 3 hours.

DK: Exactly. And to get the light just right. And all of those textures and the focus saved me boatloads of time. And I think that in video storytelling, there is a tremendous opportunity to expedite some of the drudgery without losing the creator’s voice. And I think that’s an important criterion.

SR: And maybe that frees you up creatively, in a lot of ways.

DK: Without a doubt, having worked in production for so long and having lost hours, days, years, doing a lot of this drudgery of logging files, transcribing interviews, doing a lot of the things that we have to do, the sort of foundational inventory before you can begin to craft a video story, all of these things can be automated. There are models already in place, and in fact, we’re beginning to see these applications already dropped into Resolve, Premiere Pro. A lot of these things are beginning to get—

SR: Runway—

DK: Runway, well, Runway’s generation is really interesting, but what I see is there are opportunities to expedite the process for professionals. But moreover, just like you talked about using Designer and PowerPoint to not only expedite things, but to create a PowerPoint that—

SR: Looks good.

DK: It looks good. I think there are a lot of people who would like to create video stories who, frankly, just don’t have either the time or the interest to develop the skills.

SR: Two years in a master’s program.

DK: Oh, my God. Even if you just do it on your own, learning the software, learning how to do visual storytelling, you’re managing four voices, you’re managing the words, you’re managing the pictures, you’re managing the sound, you’re managing the music. And your job in this sort of video storytelling is really abstract because you’re managing how those things coincide when they’re not coincidental, and when they are coincidental, how do you make a point and how does it get you to where you want to be?

DK: And frankly, you feel like Sisyphus pushing a rock up a hill as you try to learn the software, learn the technique, all those sorts of things. And AI can help somebody who wants to be able to do that but can’t do it. That’s why the model that I proposed was really to give creators a starting point to go through. And the idea is you ingest your footage, and with a prompt, you describe what you want to do, and the model analyzes the footage for sentiment, shots, what’s available, what’s not, looks for keywords, does a whole bunch of stuff on the back end and ultimately presents a story that fits within the genre, the type that you want to create. And you can look at the information and go, no, that sucks. Change your prompt. And iterate again. But the idea is to minimize a lot of that drudge.

SR: I love how much of this is driven by language. As somebody coming from the humanities, I know there’s a lot of math and computation behind this, but a lot of the things that we’re doing now in the practical applications of AI are writing.

DK: I think that’s a really important observation. And strictly from the mechanics, I think that’s an important observation because the amount of math it takes to process, whether it’s spoken word or written word, is exponentially less than things that are visual, and especially with video. So being able to ingest something, speech to text, look for keywords, and then associate that information as, in essence, a keyframe or a marker within a video file, oh, we want to look here, do more analysis here. The ability to process audio, which then becomes language, becomes the sprocket holes within the entire process. Same with large language models. When that’s ingested and you begin to create, fabricate, synthesize a story, that becomes the scaffolding, the instructions by which you compile the story. And the language is the key here.
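[Editorial illustration] The step described above, going from a speech-to-text transcript to keyword markers on a video timeline, can be sketched roughly as follows. This is a minimal sketch, not the model proposed in the thesis: it assumes the transcript already exists as timed segments, and the function name `find_markers`, the sample segments, and the keyword list are all hypothetical.

```python
def find_markers(segments, keywords):
    """Return (start_seconds, keyword) pairs for each segment mentioning a keyword."""
    wanted = {k.lower() for k in keywords}
    markers = []
    for start, text in segments:
        # Normalize each word: drop surrounding punctuation and lowercase it.
        words = {w.strip(".,!?").lower() for w in text.split()}
        for hit in sorted(wanted & words):
            markers.append((start, hit))
    return markers


# Hypothetical speech-to-text output: (start time in seconds, spoken text).
segments = [
    (0.0, "We started the race at dawn."),
    (12.5, "The crowd was cheering near the finish."),
    (31.0, "After the race, the winner thanked the crowd."),
]

markers = find_markers(segments, ["race", "crowd"])
for start, keyword in markers:
    print(f"{start:>6.1f}s  {keyword}")
```

Each marker is a place on the timeline where heavier analysis of the picture could be focused, which is one way to read the “sprocket holes” comparison: the cheap text pass indexes the expensive video.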

SR: And narrative, I mean, this is one of the things that’s so exciting about having this center now is that storytelling in video, storytelling in text, storytelling through images, all these new potentialities with these models really, in a way, have narrative at their core.

DK: Without a doubt. And that’s actually one of the things that I started my report with, was, we are all storytellers. We do it at the dinner table. We do it standing around the coffee machine. We do it when we’re out hiking through the mountains. And some of us do it professionally. But that’s how we learn. That’s how we share. That’s—

SR: How we think.

DK: How we dream. And a good story, the storyteller understands what I call the emotional destination, which is, how is somebody supposed to feel when you’re done? Are they supposed to be outraged? Are they supposed to be charmed? Are they supposed to laugh? You have some sort of payoff, a beginning, a middle, and end. And we do it intuitively. We learn as kids how to tell a story and how to impart information. At least most learn. There are some who still can’t. My wife, I love her, but she cannot tell a story.

SR: And maybe our students who are beginning to try to write their essays with ChatGPT.

DK: So you’re right. It’s all about language, and that’s how we create ideas, Scott, and that process of forming words, that sort of the mechanics of one to the other, to the other, that sort of internal struggle we have, whether we’re writing, whether we’re shooting video, whether we’re compiling, whether we’re making sculpture, whatever, that sort of internal process is sort of a vital internal exploration we have. And I don’t want to lose that with AI, because—

SR: I don’t think we will.

DK: I don’t either. And I think that all of these technological things, they go in fits and starts. You’ve got this, this will change everything, and a year later, okay, well, it really didn’t change much. And even when we look at the 80 years of development of AI, big surge, and then it goes down, a big surge and it goes down. Even GPT-5. I just read that they’re not going to do training for a while because they’ve got things they want to solve.

SR: Probably they want to make sure that they won’t cut into their profit margins, unintentionally.

DK: Well there’s a political side, or they violate any sort of potential international agreements.

SR: Well, one of the things that a story does have is a sense of ending, and our time is basically up. There’s so much that we could keep talking about. I’m sure you and I could talk about ChatGPT and large language models and other forms of AI for hours and hours and it would be fascinating, but we’re restricted to 30 minutes for our podcast. And I want to thank you, Drew. I know that you’re planning on sticking around Bergen and we’re looking forward to having you in our milieu for years to come.

DK: I’m looking forward to it too. I am absolutely in love with the intellectual questions that are being asked within the department and within the university. I think there’s a tremendous opportunity to do some really interesting work and I love the fact that I can continue to work for Microsoft, for my technology company, but also be a vital part of asking hard questions and trying to find some interesting answers.

SR: Great. Well, thank you very much. We’ve been talking to Drew Keller, scholar, media producer, Microsoft employee.

References

Adobe Inc. 2023. Photoshop. macOS and Windows.

Freedgood, Elaine. 2003. “Fine Fingers: Victorian Handmade Lace and Utopian Consumption.” Victorian Studies 45 (4): 625-647. Accessed November 14, 2022. http://www.jstor.com/stable/3829530.

Keller, Drew. 2023. Computational Creativity in Media Production: At the Crossroad of Progress and Peril. [Master’s Thesis]. Download: https://bora.uib.no/bora-xmlui/handle/11250/3071882

Microsoft. 2016. PowerPoint. macOS and Windows.

OpenAI. 2023. ChatGPT [Large language model]. https://chat.openai.com/chat.

This research is partially supported by the Research Council of Norway Centres of Excellence program, project number 332643, Center for Digital Narrative and project number 335129, Extending Digital Narrative.

Cite this interview

Rettberg, Scott and Drew Keller. "Off Center Episode 5: AI, Computational Creativity, and Media Production with Drew Keller" electronic book review, 7 January 2024, https://doi.org/10.7273/k9rb-wp11