Podcast: AI and video games (with Julian Togelius)

On this week’s episode of the podcast, I am joined by Julian Togelius, a professor of computer science at NYU and director of the NYU Game Innovation Lab. We delve into how games serve as the perfect sandbox for AI development, the potential for generative tools to revolutionize production workflows, and how the future of play might involve infinite, procedurally generated worlds that adapt to every player’s unique style. Among other things, we discuss:
- How AI researchers leverage the inherent fun and learning mechanisms within video game environments
- Whether the financial markets function as a complex game requiring the same reinforcement learning strategies used in digital play
- If consumer backlash against AI-generated assets will ultimately give way to the promise of faster and more ambitious development cycles
- What possibilities exist for truly open-ended games that utilize real-time world generation to create infinite, personalized player journeys
- How the role of the game developer shifts toward system architecture and critical thinking as coding becomes increasingly automated
- When and if world models will transition from impressive tech demos to foundational tools for interactive entertainment
Thanks to the sponsors of this week’s episode of the Mobile Dev Memo podcast:
- INCRMNTAL. True attribution measures incrementality, always on.
- Xsolla. With the Xsolla Web Shop, you can create a direct storefront, cut fees down to as low as 5%, and keep players engaged with bundles, rewards, and analytics.
- Branch. Branch is an AI-powered MMP, connecting every paid, owned, and organic touchpoint so growth teams can see exactly where to put their dollars to bring users in the door and keep them coming back.
Interested in sponsoring the Mobile Dev Memo podcast? Contact Mobile Dev Memo advertising.
Transcript
Eric Seufert: Hello, and welcome to the Mobile Dev Memo podcast. I am your host, Eric Seufert, and I am joined today by Julian Togelius. Julian, welcome to the podcast.
Julian Togelius: Hey, thanks, Eric. Good to be here.
ES: It is great to have you. I know you through having the great pleasure of working alongside you at Unity, where we are both on the AI Council. You also do a number of other things and have a prestigious CV, so I will let you introduce yourself to the audience before we dive into the conversation.
JT: I am a professor at NYU, and I also have a habit of coming up with startups. My main research is artificial intelligence for games and games for artificial intelligence. It is about how you make games better with AI and how you use games to test AI or develop better AI.
When I say games, I mean anything that people play: not so much traditional board games, but more like StarCraft, Minecraft, Candy Crush, Super Mario Bros, Doom, and so on. I founded a game testing company called modl.ai and recently joined another startup called End of One as head of AI, where we build AI for finance. Finance is just another game, isn’t it? By way of background, I am from Sweden originally but have lived in a bunch of places, including Switzerland and England. An academic career tends to take you places, some more interesting than others.
ES: I actually had Christopher Holmgård on the podcast in October 2024. He is the CEO of modl.ai. It was tough to come up with questions because there are so many interesting directions we could take this conversation. Given our mutual interest in games, that is probably the most logical place to start. As a big-picture question to set the tone, what makes games a particularly rich area of research for AI?
JT: Games are fun, and that seems like an inane thing to say, but there is something very deep about games being fun. What does fun mean? Fun is a very undefined concept, but it means that something is interesting and engages our minds in a particular way.
There is a theory popularized by Raph Koster, the designer of several well-known games, including Star Wars Galaxies. It comes from developmental psychology and the idea of zones of proximal development. It converges on the idea that games, among other things, are fun because we learn as we engage with them. They push our limits, keep us in a zone of proximal development, and force us to learn what is coming next. A well-designed game is an extremely effective pedagogical machine: a machine for teaching itself.
It is not only that. Almost all games are an abstraction of some interesting real-life activity. Chess is about battles, Tetris is about stacking things, SimCity is about building cities, and Super Mario Bros is about playing on a playground. I was thinking about this when my son, who is a big Super Mario Bros fan, was jumping into a cover sheet on a playground and said he was going into the pipe. The platform game is a playground, or the playground is a platform game.
There are many good reasons why games are so interesting to study AI with, and this has been recognized for a long time. However, for a long time, the only games that were considered acceptable to work with in AI were things like chess, checkers, and Go. After AlphaGo, which was ten years ago, people said we were done with games and could go on to more interesting stuff. I think we are just getting started because there are so many different games and challenges.
Chess, checkers, and Go are all zero-sum, discrete, perfect information games with relatively few turns and a relatively low branching factor. They are not very indicative of the fascinating video games out there. It is not just about playing games well to win them; it is also about playing games in the style of a particular player, generating game content, generating complete games, and understanding players.
ES: Raph Koster is someone I look up to a lot. I saw him speak at an event in Helsinki. His book, A Theory of Fun, is a must-read for anyone who works in gaming. You mentioned Star Wars is the most famous of his games, but I think of Ultima Online. What is interesting about MMORPGs is that they were a fundamental deviation from a platform game like Mario. In earlier games, there were sets of paths you could take, but there was an endpoint and you would beat the game. When you got to MMORPGs, there was no beating it. It was a living society that persisted.
The incentive structure was different and social. There was essentially no score, though you might consider the stats of your character or the amount of money you had as a way of keeping score. In Ultima Online, you could have a house or a castle, and that was a way of comparing yourself against other people, but there was no singular quantitative benchmark.
JT: Isn’t this the case for many other games as well? My son and I play Super Mario Wonder. We beat the game and kicked Bowser out of the castle, but that was just the beginning. Even though it is as traditional a game as it gets, there are so many goals to set. Sometimes my son wants to replay various levels for no particular reason other than inventing new things like carrying each other around. There is so much game invention that goes on inside the gameplay.
ES: That was my point. In A Theory of Fun, Koster notes that a game is not one defined thing. With interactive video games, there is not one template. What you are trying to do is materialize that primitive of fun. How can we define that in a way where we can apply it in different settings that may or may not look like what you would consider a game? In many MMORPGs, I spent a lot of time being frustrated and irritated. Now you think about a much more abstract idea. An AI agent could probably beat the original Super Mario Bros very quickly, but that is not a great use case for AI. The more applicable and valuable use case is summoning that primitive of fun and allowing it to explore. Is that a topic of research for you?
JT: Definitely. Super Mario Bros is a ubiquitous model organism in game AI research, partly because Notch, the original creator of Minecraft, did a little Java-based version called Infinite Mario Bros before Minecraft. One of my students and I turned this into an AI competition back in 2009. It has been proliferating ever since.
It is funny to ask if you could make an agent that plays Super Mario Bros perfectly. It depends on what kind of inputs you give the agent. If you have a simulator, an A* agent can do extremely well. If you do not, it depends on whether it is a level the agent has seen before. Perfect play on unseen levels is something we are not even close to yet. We do not even have an agent that can reliably win unseen levels in the absence of a simulator.
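The "A* with a simulator" approach can be sketched in a few lines: when the agent can query a forward model for the consequences of its actions, winning a level reduces to graph search. The grid, the move set, and the heuristic below are toy assumptions for illustration, not Mario's actual dynamics.

```python
import heapq

# Toy level: '.' = air, '#' = solid, 'G' = goal. The agent starts at the left.
LEVEL = [
    "..........",
    "......#...",
    "###.####.G",
]

# Simplified action set: right, jump-right, fall, drop-right (no real physics).
MOVES = [(1, 0), (1, -1), (0, 1), (1, 1)]

def neighbors(pos):
    """The 'simulator': which states each action leads to from here."""
    x, y = pos
    for dx, dy in MOVES:
        nx, ny = x + dx, y + dy
        if 0 <= ny < len(LEVEL) and 0 <= nx < len(LEVEL[0]) and LEVEL[ny][nx] != "#":
            yield (nx, ny)

def astar(start, goal):
    """Plain A*; the heuristic is horizontal distance to the goal (admissible here)."""
    frontier = [(abs(goal[0] - start[0]), 0, start, [start])]
    seen = set()
    while frontier:
        _, cost, pos, path = heapq.heappop(frontier)
        if pos == goal:
            return path
        if pos in seen:
            continue
        seen.add(pos)
        for nxt in neighbors(pos):
            h = abs(goal[0] - nxt[0])
            heapq.heappush(frontier, (cost + 1 + h, cost + 1, nxt, path + [nxt]))
    return None  # level not winnable from this state

path = astar((0, 1), (9, 2))
```

Take away the `neighbors` function (the simulator) and the planner has nothing to search over, which is exactly why unseen levels without a forward model remain hard.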
This is a game that has been studied to death. A lot of work with this model organism—similar to fruit flies in genetics or snails in neuroscience—is about level generation. We generate levels that bring out various types of challenge, incentivize various types of fun, and allow for as many different playstyles as possible. A lot of research is about measuring interesting aspects of a player’s play, generating new content and variations, and using AI for that.
ES: You mentioned on your website a connection to finance. The stock market is essentially a game with real-life consequences. Is that a fundamental motivation for people who do things like high-frequency trading at firms like Jane Street or Jump Trading? I imagine people go to those firms because it is a way to say you are the best. These people likely did the Math Olympiad and are competing against people they know from those competitions. If you implement an algorithm and have a better year than someone else, you beat them. That has to be part of the motivation, other than just the money.
JT: Lots of people go into quantitative finance because they just like the competition. They treat it as a game. On another level, trading itself is already highly agentic. Even before LLMs, there were lots of little snippets of code executing buy and sell actions all the time. That is how Wall Street works in cyberspace.
They are game-playing agents acting in reinforcement learning environments and learning to get better. The issue here is that the game is changing all the time. You have the non-stationarity of the market. The market conditions are different today than they were yesterday, and the policy you had yesterday that made you money may actually become counterproductive today. How are you going to adapt to that?
You can look at it such that everything you can observe in the world and all the actions you could take—shorting Bitcoin, buying Volvo, or holding options—create an enormous action and observation space. You cannot play that game because you do not have anything near the amount of data you need to learn a policy for it.
What everybody does is go the other way. They try to find very limited subgames, like this kind of option based on these inputs. They learn this and then keep retraining it. That is why the financial industry tends to work with very small models. People who come from deep learning into finance are often shocked to see that they are still using linear models, logistic regression, and decision trees. The finance people will look at you and say that it works and they can retrain it extremely fast.
The connection to games is that there is no agent that can play Super Mario Bros proficiently in the absence of a good simulator. If you wanted an agent that could play any 2D platformer in the style of Super Mario Bros, we definitely do not have that. The problem of general game playing is not only unsolved; we have barely made any progress on it. Some people will take issue with this statement, but I know better. General game playing is like general finance, which is why people focus on playing one game, like Go or Super Mario Bros, or trading one type of option.
ES: I want to talk about generative tools for game production. You have done a lot of research there. Is the gaming industry fighting an uphill battle in terms of consumer sentiment around the use of AI in development? I have the Crimson Desert experience in my mind, where there was a Reddit backlash because obvious placeholder graphics made by AI were shipped in the game. How does the gaming industry position the use of AI as a benefit for consumers? How do they make the case that if it costs less, they can create more with the same budget, and you will get more content as a result? Is it too late, or have we missed the boat on that?
JT: I find this fascinating. Most game developers are just staying away from it. We have run a game AI summer school every year since 2018. We invite people doing AI from within the games industry in different roles and companies. Weirdly enough, it has become harder to get the speakers we really want. We have a great lineup this year, but a lot of people said no because they do not want to be associated with AI. Their companies say they cannot go out and talk at an AI conference because of what their players might think.
I do not know where this is going or if we are going to get into an even more polarized environment. The AI debate is polarized in a very unproductive way. There are boosters who think agentic everything is the future and you should outsource your brain to Claude as soon as you can. Then there are Luddites who will not touch generative AI with a ten-foot pole. These are very unproductive positions.
The games industry as a whole is just staying away from it and trying not to mention it. I think there are things in game dev that could be made more efficient. No one wanted GTA VI to take so many years to develop. Clearly, things could be made more efficient. But then there are all the things that AI can make possible in terms of new game experiences that we could not build otherwise, from non-player characters to environment generation to player adaptation. I fear that potential is going to get lost because people will not want to experiment with it, since it looks bad.
ES: Or it gets leaked that they used it and there is a moral panic about it. This is bizarre to me because if you want indies to make commercially successful games that reach your phone or your PlayStation, what better pathway is there than these AI tools? It is such a net good. I do not even understand the perceived downside from the gamer’s perspective, other than thinking they are getting an algorithmically generated experience versus something handcrafted and artisanal. But I do not think anyone is intending to use AI for that purpose. In the Crimson Desert episode, it was just textures. Does it really matter that the textures were made by AI?
JT: It is kind of silly. I tend to think that what people feel is a proxy for a lot of people’s reactions to AI in society. People dislike generative AI in general and they fear that it is taking over things they were good at or enjoyed doing. I can relate. I think I am a good writer and I like writing. I do not like that models are so good at writing. People see this and perceive that there is nothing they can do about it, so they get angry at data centers or textures generated in a game. It is proxy anger.
It is also true that I do not like reading something if someone sends me an AI-generated piece of text. Why do I have to read this if you did not write it? It is the same if I am playing a level in a game I paid money for. I expect intentionality. Part of playing a game is that you are interacting with a system that someone built for people like you.
ES: I get offended if someone sends me obvious ChatGPT output. At the same time, I also get offended if someone sends me sloppy, grammatically incorrect text riddled with spelling errors. I expect perfection now because you have all the tools at your disposal. If something is unclear, you could have sent it to ChatGPT and had it critique it for clarity and readability. Get good. You have no excuse for that.
I take the point that I paid $70 for this game and there is AI slop visible in the textures. But going back to the idea of the theory of fun and MMORPGs, at some point, you have to expect that is the only way a truly open-world game is going to be possible. Otherwise, you have a 20-year dev cycle and it is still going to be a constrained thing.
I read an interesting thought on Twitter the other day that GTA VI is the last pre-AI open-world game. From now on, these games will have much shorter dev cycles and much lower costs. GTA VI is a billion-dollar game. They recover that in presales, but nonetheless, that is probably true. My reaction was that it is great news because it means I do not have to wait more than a decade for the next GTA.
JT: I want better games sooner. There are too many games already, but can we get better games? I like that as a game consumer. As a game and AI researcher, what I am really interested in is whether we can put runtime AI in here to enable new experiences. Imagine some GTA X where you are in some place and you decide to go four hours in a random direction. It would basically create a new world or new regions where there is a city that no one has ever been to before.
It would have new architecture, new quest lines, new people, and new scenery. It would be a bit like No Man’s Sky, but it fits together with storylines from the rest of the world. Maybe some of these challenges are what the game thinks you may enjoy doing next, such as elaborations on the kind of quests you like. Will people like it or will people hate it? On one hand, it is generative AI on steroids where no human was involved in creating this new city. On the other hand, there is a long tradition of procedural content generation in rogue-likes. A significant stream of my research has been how we can make this better, more personalized, or more controllable for online co-creation in games.
ES: What does the optimal AI-enabled content generation pipeline look like? Is there a human in the loop for the reason that you cannot just let this procedural train run away? That might induce some kind of psychosis or lead to a loss of connectivity with the experience other people are having. If I have this totally unique experience, who do I commiserate with about it? Does the optimal pipeline include a human in the loop, with AI serving as productivity tools that make us faster and let one person do the work of ten?
JT: Games have always been a multifaceted and diverse thing. The future will hold all of this and much more. There will definitely be games made like they are done today, where you use Claude or Cursor to write your code and some generative AI tool for assets, but it is mostly the pattern of game development as today.
Then there will be other games where the design work of the game designer shifts. Most game designers are perfectionists who want you to have the illusion of your own agency while they have designed everything ahead of time. But there will also be room for a type of game designer who designs the experience machine and sets up tool chains for things to change as you are playing the game.
As a designer, you never touch the actual level or the actual rooms. You design the rules for how this is going to be automatically designed, some of it during design time, but much of it at runtime. Memory is expensive and the world is enormous. There is always going to be more world to model than we can fit into memory. This type of game will have to have some kind of runtime generation going on.
Regarding the social aspects, I used to be an asocial gamer. When I was working at the IT University of Copenhagen, I sat next to T.L. Taylor, a world-famous MMO anthropologist. She was talking about how games were always social, and I said no, I pull the curtains, disconnect my Xbox from Wi-Fi, and sit there playing all alone. But then she asked if I ever read strategy guides. I just had to shut up.
I think it might be about inviting other people into the world that the game has created for you or that you have co-created with the game. It is up to the designer to figure out what that framework is. Maybe it is sharing generated worlds or some new framework for making it social. That is the job now: figuring out how to connect that to the theory of fun. The actual format or the visual output is almost secondary to that.
ES: I gave a talk at an event recently and was having lunch with the CEO and CPO of a company. One of the guys has a 17-year-old son thinking about colleges, and I said I do not envy him. I would not know what to tell my son to study now. I probably could not recommend computer science, which is unfortunate because I loved coding. I found it really relaxing.
I cannot listen to music that has lyrics if I am writing; it is distracting. But when I am coding, I can. Coding is rules-based and things work or they do not. What isn’t rules-based is architecture design. Now coding is essentially just architecture design and system design. I cannot have lyrics in the music if I am actually thinking through the prompt that I am going to send to Codex or Claude. There is a lesson there in cognitive load. If the computer scientist or systems architect of the future really is just mediating systems design, what should they be educated in? I think you are talking about critical thinking, the humanities, philosophy, and literature. How can you read something and truly parse meaning from it? That feels like the skill set you need even if you are working at a tech company building products.
JT: My bachelor’s degree is technically majoring in philosophy. It has shaped my thinking so much. If my son or daughter wanted to study philosophy, I would say it is a great idea, though they should take some math classes as side dishes. The thing about coding is that I think there are many different ways you can do it. When you draw or paint, you often need music to occupy some part of your brain while you are doing the rest with the other part. I think the same applies to coding.
The reason LLMs are so good at coding, and the reason reinforcement learning with verifiable rewards has worked so well, is that coding is a game. That is also why coding is so fun. You are playing a game as you write code, like a classic single-player game with clear rewards. A corollary is that it is not clear how well coding skill transfers to things that are not like coding. There is no observable transfer between Super Mario Bros and chess or StarCraft. That is a perspective on what foundation models know and do not know.
ES: Talk to me about the promise of world models. What do people hope world models accomplish that language models cannot? When Yann LeCun says we are going to hit a wall with LLMs and world models are the next big leg of progress, what is he talking about? How does this relate to gaming?
JT: World models is a very broad term. In a technical sense in reinforcement learning, it is the transition model: if you are in this state of the world and this action is taken, what is the next state of the world going to be? What people tend to think of is something like Google’s Project Genie, essentially a steerable video model. You can prompt it to say you want to play a racing game set in Greenwich Village, and you get something that looks nice and plays a bit like a racing game.
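The reinforcement-learning sense of a world model, the transition model that answers "if I take this action in this state, what state comes next?", can be made concrete with a tiny count-based estimator learned from observed (state, action, next-state) triples. The corridor environment and class API below are invented for illustration.

```python
from collections import Counter, defaultdict

class TabularWorldModel:
    """Learns p(next_state | state, action) by counting observed transitions."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def observe(self, s, a, s_next):
        """Record one transition seen in the environment."""
        self.counts[(s, a)][s_next] += 1

    def predict(self, s, a):
        """Most frequently observed next state, or None if (s, a) was never seen."""
        seen = self.counts[(s, a)]
        return seen.most_common(1)[0][0] if seen else None

# Train it on a 1-D corridor with cells 0..4: action +1/-1 moves the agent,
# and the walls at each end stop it.
model = TabularWorldModel()
for s in range(5):
    for a in (+1, -1):
        model.observe(s, a, max(0, min(4, s + a)))

print(model.predict(2, +1))  # 3
print(model.predict(0, -1))  # 0 (the wall stops the agent)
```

A steerable video model like Genie is, in effect, this same object scaled up: the state is pixels, the estimator is a large neural network, and the prediction can be rolled forward to plan or to play.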
It currently kind of sucks and decoheres after a bit. Even within the one minute you are allowed to play, you see the physics start behaving strangely and the other cars start looking wonky. But it is fascinating that it works at all. When Google went with the big publicity about Genie, stocks in game companies tanked. This was a strange overreaction because that is not a replacement for game creation or game engines. It has lots of disadvantages. It is really hard to edit things except through a prompt, which is inexact. It is slow and hard to predict because it is fundamentally non-deterministic.
However, it is very good at something. How can we fit this into the content workflow or think of new types of experiences? OpenAI had their Sora video creator. You could imagine something like Sora but for interactive experiences and game creation social networks where you create little games for each other. Someone is going to succeed with this. It is also a cool prototyping thing. You can prompt your way into a weird world you can interact with, and then you can remodel it in a game engine.
You could also imagine using this as forward models if you tune a neural game engine on a particular game and then create a very low-dimensional version of it that runs really fast. An issue with current game engines is that they do not simulate fast enough, so you cannot run planning in them. Maybe you can use neural game engines for that. I do not know where it is going, but it is very fascinating.
ES: Prototyping makes a lot of sense. If I want to render a full environment and see if a minute of gameplay works, I can test theme, tone, and setting. Those three things in combination are a pretty heavy lift if I need to radically alter them, such as switching from a photo-realistic noir crime game to a cartoony Viking battle game. If I could just render those with a prompt and see which best captures what I am trying to do, I could test fifty variations in a day and see which fits my vision. I could even test them as ads and see who clicks on what.
JT: Ads are content that is time-limited enough that the limited coherence and interactivity of neural game engines would actually be good for them. The problem there is the size of the trained network, but someone will figure that out. You could get a lot of hyper-personalized ads that are generated game worlds for you to interact with.
ES: I am sure we are not too far away from that. Julian, this was fantastic. How can people consume more of your wisdom and stay in touch with you?
JT: Just type my last name, Togelius, into your favorite social network, search engine, or LLM. It is an almost unique last name; there are only ten of us in the world. Togelius, that is me on X and everything else.
ES: I want to highlight a great post you wrote a couple of months ago about how you had a productive career in AI research without knowing any math. I really appreciated that post because a lot of this can be very inaccessible. People wrongly get turned off from AI because they look at a paper and have no idea what it is saying. To read these papers, there is some math you need to know, but you could teach that to yourself. Once you get familiar with it, you realize a lot of the math in research papers is completely gratuitous and you can just focus on the ideas. The real problem is that many research papers are badly written because people do not know how to write. We should demand that they get good at this, but it is perfectly fine to ask an LLM to explain it like you are five years old.
JT: Thank you so much, Eric. This was great. Enjoy your weekend.
ES: Thank you, Julian.