#1407: Simulating a Million Minds: The Rise of MiroFish

Discover how an undergraduate student built a viral simulation of one million AI agents to predict social behavior and policy outcomes.

Episode Details

Duration: 25:23
Pipeline: V5
TTS Engine: chatterbox-regular
AI-Generated Content: This podcast is created using AI personas. Please verify any important information independently.

In early 2026, a project titled MiroFish took the developer world by storm, amassing over 30,000 GitHub stars in a matter of weeks. Created by Guo Hangjiang, an undergraduate student, the project represents a paradigm shift in how we model human behavior at scale. By leveraging "vibe coding"—a high-level form of AI-assisted development—Guo built a functional, massively parallel simulation engine that can orchestrate one million autonomous agents simultaneously.

From Hard-Coded Rules to LLM Cognition

Traditional agent-based modeling (ABM) has existed for decades, but it has historically relied on rigid, deterministic rules. In older systems, agents followed "if-then" scripts. MiroFish breaks this mold by using the OASIS framework (Open Agent Social Interaction Simulations), where every agent is powered by a large language model (LLM).

These agents do not just follow scripts; they possess distinct personalities, long-term memories, and social connections. When a real-world event—such as a new trade tariff or a geopolitical crisis—is seeded into the engine in plain English, the agents react based on their individual profiles. They can perform 23 social actions, including following, muting, or reposting, creating a digital reflection of our modern social media landscape.
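The loop described above — an agent filtering an event through its persona and memory, then picking one of its social actions — can be sketched in a few lines. MiroFish's actual internals are not reproduced here, so this is a minimal illustration: the `Agent` class, `SOCIAL_ACTIONS` set, and `toy_llm` stand-in are invented names, not MiroFish's API, and a real deployment would route the prompt to an LLM backend instead.

```python
from dataclasses import dataclass, field

# Hypothetical subset of the 23 social actions; names are illustrative.
SOCIAL_ACTIONS = {"follow", "mute", "repost", "comment", "like", "ignore"}

@dataclass
class Agent:
    name: str
    persona: str                          # plain-English personality profile
    memory: list = field(default_factory=list)

    def decide(self, event: str, llm) -> str:
        """Build a prompt from persona + recent memory, ask the LLM for one action."""
        prompt = (
            f"You are {self.persona}.\n"
            f"Recent memory: {self.memory[-3:]}\n"
            f"Event: {event}\n"
            f"Choose one action from {sorted(SOCIAL_ACTIONS)}."
        )
        action = llm(prompt)
        if action not in SOCIAL_ACTIONS:  # guard against malformed model output
            action = "ignore"
        self.memory.append((event, action))  # the choice shapes future decisions
        return action

# Stand-in for a real model call (e.g. a local model served behind an API):
def toy_llm(prompt: str) -> str:
    return "repost" if "tariff" in prompt else "ignore"

agent = Agent("a1", "a free-trade advocate who reposts economic news")
print(agent.decide("New trade tariff announced", toy_llm))  # repost
```

Because each decision is appended to memory, the same agent can respond differently to the same event later — the persistence that distinguishes this approach from stateless rule-based ABM.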

The Technical Backbone of Scale

Simulating a million agents with individual reasoning steps is a massive computational challenge. MiroFish handles this through a distributed architecture and a sophisticated tech stack. By utilizing Neo4j, a graph database, the engine manages billions of social relationships that would bring a traditional relational database to a crawl. For local execution, the MiroFish-Offline fork pairs Neo4j with Ollama, allowing high-throughput inference without astronomical token costs.
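To see why a graph database matters here, consider the basic "who might see this post" question, which is a multi-hop traversal of the follower graph — exactly the workload graph databases handle natively. The sketch below runs the traversal over a toy in-memory dict; the Cypher query in the comment is a hypothetical schema of ours, not MiroFish's actual one.

```python
from collections import deque

# Toy stand-in for the follower graph. In MiroFish-Offline the relationships
# reportedly live in Neo4j, where a 2-hop reach query might look like
# (hypothetical schema):
#   MATCH (a:Agent {id: $id})<-[:FOLLOWS*1..2]-(reader) RETURN DISTINCT reader
follows = {                 # follower -> set of accounts they follow
    "bob":   {"alice"},
    "carol": {"bob"},
    "dave":  {"bob", "alice"},
}

def reach(author: str, hops: int = 2) -> set:
    """Agents who may see a post: followers, their followers, and so on."""
    followers = {f for f, fs in follows.items() if author in fs}
    seen, frontier = set(followers), deque((f, 1) for f in followers)
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue
        for f2 in (f for f, fs in follows.items() if node in fs):
            if f2 not in seen:
                seen.add(f2)
                frontier.append((f2, depth + 1))
    return seen

print(sorted(reach("alice")))   # ['bob', 'carol', 'dave']
```

With a dict this scan is linear per hop; at a million agents and billions of edges, index-free adjacency in a graph store is what keeps such queries tractable.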

Predictive Wargaming and Policy

The most significant application of MiroFish lies in policy wargaming. Unlike traditional military simulations that focus on kinetic metrics like tank counts or fuel depots, MiroFish focuses on the "messy" human side of conflict. It allows planners to observe "90-day sentiment trajectories," showing how a population might polarize or react to specific narratives.

For example, in a conflict scenario involving regional proxies, the simulation can model how different factions—ranging from rural conservatives to tech-savvy youth—might coordinate messaging or resist specific policies. This provides a "god’s-eye view" of social phase shifts, helping planners identify tipping points where a single piece of information could cause a massive shift in public opinion.

Challenges of Validation and Bias

Despite its power, MiroFish faces the "validation problem." Because the behavior of the agents is emergent, it cannot be verified through simple mathematics. Researchers are currently using historical backtesting—feeding the model data from past events to see if it accurately predicts known outcomes—to calibrate the engine.
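Historical backtesting ultimately reduces to scoring a simulated trajectory against an observed one. Here is a minimal sketch of that scoring step; the mean-absolute-error metric and the daily sentiment numbers are invented for illustration, not drawn from MiroFish's evaluation suite.

```python
# Compare a simulated sentiment trajectory against the observed record for a
# known past event, and score the fit. All numbers below are made up.

def mean_abs_error(simulated, observed):
    assert len(simulated) == len(observed)
    return sum(abs(s - o) for s, o in zip(simulated, observed)) / len(observed)

# Daily net sentiment in [-1, 1] over five days of a historical event:
observed  = [0.10, -0.20, -0.40, -0.35, -0.30]
simulated = [0.05, -0.15, -0.50, -0.30, -0.25]

score = mean_abs_error(simulated, observed)
print(round(score, 3))   # 0.06
```

A run would count as calibrated only if the error stays under a chosen tolerance; otherwise the agent seeds or the underlying model would be revisited before trusting the engine on novel scenarios.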

There is also the inherent risk of bias. The underlying LLM and the initial "seeds" for agent personalities reflect the worldviews of their creators. If a model is trained primarily on Western data, its simulation of a Middle Eastern social crisis may be inaccurate. The open-source nature of MiroFish is intended to mitigate this, allowing different groups to plug in various models to see if outcomes remain consistent across different datasets.
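The cross-model consistency check described above can be sketched as a simple agreement measure over repeated runs; the backend names and outcome labels below are invented for illustration.

```python
from collections import Counter

# Run the same seeded scenario under several LLM backends and record the
# headline outcome of each run (illustrative data).
runs = {
    "model_a": "escalation",
    "model_b": "escalation",
    "model_c": "de-escalation",
    "model_d": "escalation",
}

def consensus(outcomes):
    """Most common outcome and the fraction of runs that produced it."""
    top, count = Counter(outcomes).most_common(1)[0]
    return top, count / len(outcomes)

outcome, agreement = consensus(list(runs.values()))
print(outcome, agreement)   # escalation 0.75
```

High agreement across diverse models and datasets is the signal the open-source community is after: if ten different backends all point the same way, the result is less likely to be an artifact of any single model's training data.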

MiroFish signals a future where the barrier to high-fidelity social simulation has collapsed. What once required a hundred-million-dollar defense contract can now be initiated by a single developer with a GPU cluster and a clear vision.


Episode #1407: Simulating a Million Minds: The Rise of MiroFish

Daniel's Prompt
Daniel
Custom topic: MiroFish: the open-source AI engine that builds digital worlds to predict the future. What is MiroFish? How does it work? Could it be used for policy wargaming? What about evaluating potential outcome | Context: ## Current Events Context (as of March 20, 2026)

### Recent Developments
- MiroFish went viral in early March 2026, topping GitHub's global trending list above repositories from OpenAI, Google, and M
Corn
I was looking at the GitHub trending list the other day, and it felt like I had stepped into a different dimension. Usually, that list is dominated by the big players, you know, Google, Microsoft, OpenAI, maybe a new framework from Meta. But in early March twenty twenty-six, everything got flipped on its head by a project from an undergraduate student that just came out of nowhere. Today's prompt from Daniel is about MiroFish, this viral one-million-agent simulation engine that has basically hijacked the conversation around predictive modeling and policy wargaming. It is rare to see a project hit thirty thousand stars in less than two weeks, but MiroFish did exactly that.
Herman
It really is a fascinating story, Corn. My name is Herman Poppleberry, and I have been diving into the source code for MiroFish for the last few days. What Guo Hangjiang managed to do in just ten days is nothing short of a miracle of modern development. He is a senior undergraduate student in China, and he used what people are calling vibe coding. Essentially, he was using high-level AI assistance to move from a raw concept to a functional, massively parallel simulation engine at a speed that would make a traditional software house's head spin. It is the ultimate meta-story: using AI to build a tool that then runs a million other AIs.
Corn
It is wild to think that a student can build something in a week and a half that then gets thirty million yuan in funding within twenty-four hours of the demo. That is about four point one million dollars for those keeping track. Chen Tianqiao, the founder of Shanda Group and once the richest man in China, clearly saw something revolutionary in this. But before we get into the money and the hype, what is MiroFish actually doing? Because we have had agent-based modeling for decades. Is this just a faster, shinier version of NetLogo?
Herman
That is the perfect place to start because the shift here is fundamental. Traditional agent-based modeling, or ABM, like what you see in NetLogo or older systems from the nineties and early two thousands, relies on hard-coded rules. You tell the agents, if X happens, do Y. If you are near a red agent, move two spaces away. It is deterministic, rigid, and ultimately limited by the imagination of the programmer. MiroFish is built on the OASIS framework, which stands for Open Agent Social Interaction Simulations, developed by the CAMEL-AI group. Instead of hard-coded rules, it uses large language models as the cognitive brains for every single agent.
Corn
So instead of a script that says move left or move right, each agent has a personality?
Herman
Not just a personality, but a memory, a set of social connections, and the ability to interpret language. When you seed a MiroFish world, you are not just setting numerical parameters like population density or interest rates. You are injecting a piece of news, a policy draft, or a geopolitical event in plain English, and then you let a million autonomous agents react to it based on their individual profiles. They have twenty-three distinct social actions they can take. They can follow each other, comment, repost, like, mute, or even search for more information. They are essentially living in a digital reflection of our social media landscape.
Corn
A million agents all running language models sounds like a computational nightmare. I mean, even with the most efficient models, the token cost alone would be astronomical. How are they handling the scale? I remember Brian Roemmele ran a simulation with five hundred thousand agents recently, and people were shocked that it even stayed upright.
Herman
The architecture is where the OASIS backbone really shines. It uses a distributed system that manages the state of the world across multiple nodes. It is not trying to run one giant model for everyone at once; it is orchestrating thousands of smaller inferences and managing the social graph in real-time. In the MiroFish-Offline fork, which a developer named nikmcfly put together for the English-speaking community, they are actually using a stack with Neo4j for the graph database and Ollama for the local model execution. Neo4j is crucial here because when you have a million agents, the number of relationships—who follows whom, who mutes whom—explodes into the billions. A traditional database would crawl to a halt, but a graph database handles those connections natively.
Corn
What I find most compelling is the concept of seeding. You mentioned that you feed the engine real-world inputs. So, if I take a breaking news story about a new trade tariff or a shift in military posture in the Middle East, I drop that into the engine, and then what? I just watch the digital world burn or flourish?
Herman
You get what the creators call a ninety-day sentiment trajectory. The engine builds a parallel digital world that mirrors our social structures. You see polarization curves. You see how certain narratives get amplified by media agents and how other ideas just die out in the noise. It is essentially a laboratory for social science. You can observe how a consensus forms or how a population might start to resist a specific policy before it is even implemented in the real world. You are looking for the tipping points—the moment where a small piece of information causes a massive phase shift in public opinion.
Corn
It sounds like a god's-eye view of human behavior. But let's talk about the implications for policy wargaming. This is where the stakes get very high. The U.S. Air Force and the Army have been looking for exactly this kind of thing. I remember reading about the Command and General Staff College exercise back in November twenty twenty-five where they used AI tools to increase their throughput by five times. But that was still very doctrinally focused—it was about how many shells are in the depot and how fast the tanks move. MiroFish seems more about the messy, human side of conflict.
Herman
That is where it gets incredibly powerful, especially for something like a conflict involving Iran. Traditional wargames are great at telling you how many tanks will be left after three days of fighting. They are not as good at telling you how the civilian population in Tehran will react to a specific strike, or how regional proxies in Lebanon or Yemen will coordinate their social media messaging to flip international opinion. With MiroFish, you can model the algorithmic adversary. We actually talked about this in episode twelve zero three, where we looked at the Islamic Revolutionary Guard Corps' AI strategy. They are not just fighting with kinetic weapons; they are fighting with information.
Corn
I remember that episode. The idea was that the IRGC is looking at ways to use AI to automate influence operations. If you can simulate a million agents that represent different factions within Iran—the young tech-savvy protesters, the rural conservatives, the diaspora, the military leadership—and then you add in Israeli decision-makers and U.S. policymakers, you start to see second-order effects that a human planner might miss. You might see that a specific diplomatic move actually triggers a polarization curve that makes a ceasefire less likely because the agents' internal memories and social connections drive them toward escalation.
Herman
And MiroFish agents have long-term memory. This is a massive leap over previous simulations. If an agent sees a piece of information on day three, that affects how they interpret a news event on day twenty. It is not a snapshot; it is an evolution. When you compare this to something like WarAgent, which is another project from the AGI Research group, you see how these models can even simulate deception. In WarAgent, they found that LLM-based agents would actually lie to each other to gain a strategic advantage in a World War scenario. They would form secret alliances and then betray them. MiroFish takes that individual agent behavior and scales it to a million-person society.
Corn
That is both terrifying and impressive. If the agents can be dishonest, then the simulation is actually capturing the reality of international relations. But how do we know if any of this is real? If I run a simulation of an Iran war scenario and it tells me that a certain strike leads to a regional uprising, how do I know the model isn't just hallucinating a good story? This feels like the ultimate validation problem.
Herman
You have hit on the biggest hurdle for MiroFish and similar tools. Because the behavior is emergent—meaning it arises from the interactions of the agents rather than being programmed in—you can't just check the math in a spreadsheet. You have to look at the macro patterns. Researchers are trying to validate these by running historical backtesting. They feed the engine the conditions of, say, early twenty twenty-four and see if it predicts the sentiment shifts we actually saw during major elections or protests. But there is always the risk of garbage in, garbage out. If your initial seeds for the agents' personalities are biased or inaccurate, the whole digital world will reflect that bias.
Corn
So if the creator of the simulation has a specific worldview, even subconsciously, that might be baked into the agents?
Herman
It is almost certain. Even the choice of the underlying language model matters. A model trained primarily on Western data might simulate a different reaction than one trained on a more global or specifically Middle Eastern dataset. That is why the open-source nature of MiroFish is so important. It allows different groups to plug in different models and different agent personalities to see if the outcomes remain consistent. If ten different versions of the simulation, using ten different LLMs, all point to the same escalation path in the Persian Gulf, you start to take it much more seriously.
Corn
It is a bit of a shift from the way we usually think about intelligence. We used to rely on human experts to give us their best guess. Now, we are basically saying, let's build a million little digital experts and see what they do. It reminds me of episode ten ninety-six where we discussed intelligence-grade OSINT and agentic AI. MiroFish feels like the logical conclusion of that trend. It is taking open-source intelligence and turning it into a predictive engine.
Herman
It really is. And the speed of innovation here is what blows my mind. Guo Hangjiang used vibe coding to build this. He was essentially a conductor using AI to write the code that then runs more AI. It is meta-innovation. This is why a senior undergraduate was able to outpace massive defense contractors who have been trying to build similar wargaming tools for years. The barrier to entry for creating a high-fidelity social simulation has just collapsed. You don't need a hundred-million-dollar contract anymore; you need a good GPU cluster and the right prompts.
Corn
Let's dive deeper into the technical side for a minute. You mentioned the twenty-three social actions. How does an agent decide to, say, mute another agent versus engaging in a comment war? Is that just a probabilistic roll based on their personality profile?
Herman
It is more sophisticated than a simple dice roll. Each agent has a behavioral logic loop. When they receive an input, whether it is a post from another agent or a global seed event, the LLM processes that input through the lens of its personality and memory. It then generates a reasoning step—a chain of thought. It essentially thinks to itself, this information contradicts my core beliefs and comes from a source I don't trust, therefore I will mute them to maintain my current worldview. Then it executes the action. This reasoning step is what makes the behavior feel human and emergent. It is not just reacting; it is rationalizing.
Corn
So it is simulating the internal monologue of a million people?
Herman
In a way, yes. And because these agents are connected in a social graph, you get these cascading effects. If a high-influence agent mutes a specific narrative, their followers might never see it, creating these digital silos and echo chambers that we see in the real world. This is how you get those polarization curves. You can actually see the digital world splitting into two or three distinct camps in real-time. You can watch the death of nuance as the simulation progresses.
Corn
I can see how this would be invaluable for something like evaluating the outcome of a conflict with Iran. If you are looking at the ring of fire strike doctrine, which we covered in episode nine forty-five, you could use MiroFish to model the non-kinetic response. How do the various proxy groups in Lebanon, Yemen, and Iraq coordinate their messaging in the wake of a strike? Does a specific type of Iranian response lead to more or less support from the global community? These are questions that used to be answered by gut feeling, and now we have a data-driven, albeit simulated, way to look at them.
Herman
What is really interesting is that you can inject variables mid-simulation. It is like being a god in a digital universe. You can let the simulation run for thirty days, see that things are heading toward a massive regional war, and then pause it. You go back to day fifteen and inject a new diplomatic offer or a different media narrative. Then you run it again and see if the outcome changes. It allows for a kind of iterative strategy testing that was never possible before. You are essentially debugging the future.
Corn
It is like Doctor Strange looking through fourteen million possible futures. But instead of a sorcerer, it is a server farm running a million instances of a language model. I wonder, though, about the collective intelligence aspect. Is there a point where the swarm of agents becomes smarter than the sum of its parts?
Herman
That is the ultimate goal of swarm intelligence. When you have a million agents interacting, you start to see patterns and solutions emerge that no individual agent, and certainly no human programmer, could have anticipated. The simulation might find a path to de-escalation that involves a complex series of social and political moves that a human mind just couldn't hold all at once. It is the same way a colony of ants can solve complex logistical problems that no single ant understands.
Corn
On the flip side, it could also find a path to catastrophe that we haven't even considered. It is a powerful tool, but it feels like it could be a bit of a double-edged sword. If a government uses a MiroFish simulation to justify a certain military action because the digital world says it will work out, they are putting a lot of faith in a system that is still fundamentally experimental. We have seen how LLMs can be confidently wrong.
Herman
You are right to be cautious. We have to remember that these are models, and as the saying goes, all models are wrong, but some are useful. The value of MiroFish isn't in giving us a definitive answer about the future. It is in expanding our imagination about what could happen. It helps us identify the leverage points. If the simulation shows that a small change in a media narrative can prevent a massive riot, then we know where to focus our attention in the real world. It is a tool for risk assessment, not a crystal ball.
Corn
It also brings up some interesting questions about the nature of our own reality. If we can build a digital world that is this high-fidelity, and the agents in it are behaving in ways that are indistinguishable from humans, it makes you wonder about the data we are feeding these models. We are essentially training these agents on our own digital exhaust. They are a mirror of us, which I suppose is why the project is called MiroFish.
Herman
It is a mirror. It reflects our own social dynamics back at us. And because it is open-source, we can all look into that mirror. I think it is a great development for transparency. Instead of these predictive models being hidden away in the basement of some three-letter agency, they are on GitHub for everyone to see and poke at. We can audit the personalities of the agents. We can see if the simulation is biased against certain groups.
Corn
I really like the idea of the MiroFish-Offline fork. Being able to run this locally with Neo4j and Ollama is a big deal for privacy and for people who don't want to spend a fortune on API credits. It means a small research group or even a dedicated individual can start doing their own policy analysis. You don't need to be a billionaire to run a society-scale simulation anymore.
Herman
It definitely democratizes the power of simulation. And while you might not be able to run a million agents on a single laptop, you can certainly run a few thousand, which is more than enough to see some very interesting social dynamics emerge. Even at that scale, you start to see the polarization and the narrative amplification that makes MiroFish so compelling. It is accessible in a way that previous wargaming tools never were.
Corn
What do you think this means for the future of political forecasting? Are the days of the talking head on TV giving their opinion over?
Herman
I don't think they are over, but I think their role will change. We might see a future where every political commentator has their own personal simulation engine. Instead of saying, I think this will happen, they will say, my model ran ten thousand simulations of this scenario, and in sixty percent of them, we saw this outcome. It adds a layer of rigour, or at least a different kind of data, to the conversation. The talking head becomes a data interpreter.
Corn
It also makes the role of the prompter, someone like Daniel who sent us this topic, even more important. The quality of the seeds and the way you frame the initial conditions of the digital world determines everything. It is a new kind of expertise. You are not just a programmer; you are a digital world-builder. You have to understand the nuances of the situation you are trying to model.
Herman
And you have to be a bit of a sociologist and a historian as well. You need to know how to seed the agents with the right historical grievances and cultural nuances to make the simulation realistic. If you are modeling Iran, you need to understand the deep history and the different power structures within the country. You can't just give the agents a generic personality and expect a useful result. You need to account for the specificities of the IRGC versus the regular army, or the different ethnic groups within the country.
Corn
This brings us back to the Iran war scenario. If we were to use MiroFish to look at the potential outcomes of a conflict in twenty twenty-six, we would need to be very careful about how we represent the different actors. The IRGC agents would need a very different internal logic than the agents representing the young, tech-savvy population in Tehran. You would need to model the influence of the supreme leader's office versus the pragmatic elements in the government.
Herman
And you would need to account for the regional actors like Israel, Saudi Arabia, and the various proxy groups. The beauty of MiroFish is that it can handle that complexity. You can have a hundred different types of agents, each with their own specific goals and constraints. When you let them all loose in the digital world, you get a much richer picture of the potential conflict than any static map or traditional wargame could ever provide. You see the human friction.
Corn
It is a lot to take in. We have gone from a student project on GitHub to the future of geopolitical strategy in about twenty minutes. But that is the world we are living in now. Everything is moving at the speed of AI. The fact that a senior undergraduate could build this in ten days using vibe coding is perhaps the most important part of the story. It shows that the tools for innovation are now in the hands of anyone with a good idea.
Herman
It really is. And I think the take-away here is that agentic simulation is no longer a niche academic pursuit. It is a practical tool that is being used right now to model some of the most important events in our world. Whether it is MiroFish, WarAgent, or the tools being developed by the military, we are entering an era where we can test our ideas in a digital laboratory before we try them out in the real world. We are moving from guessing to simulating.
Corn
Let's talk about the practical side for our listeners. If someone wanted to get started with MiroFish today, what should they do?
Herman
The first step is to head over to GitHub and look for the repository by six six six g h j. That is the original version. If you want the English version with the offline capabilities, look for the MiroFish-Offline fork by nikmcfly. You will need a decent machine with a good GPU to run the local models via Ollama, and you will need to set up a Neo4j instance to handle the social graph. The documentation is surprisingly good for such a new project, and there is already a growing community of people sharing their seeds and simulation results.
Corn
And for the policy-minded listeners, this is a great time to start thinking about how these tools can be used for good. How can we use MiroFish to find paths to peace or to understand the impact of climate policy on different social groups? The possibilities are really endless. It is not just about war and politics; it is about understanding the complex systems that make up our world.
Herman
They are. And I think we will see a lot of very creative uses for this technology in the coming months. It could be used for urban planning, for understanding the spread of information in a public health crisis, or even for designing better social media platforms that are less prone to polarization. Imagine testing a new algorithm on a million agents before you roll it out to a billion humans.
Corn
That would be a welcome change. If we can use a simulation to design a world that is less divided, then MiroFish will have been worth every penny of that thirty million yuan investment. But we have to be realistic about the limitations. As you said, garbage in, garbage out. We have to be vigilant about the data and the models we are using. We can't let the simulation replace human judgment; it should inform it.
Herman
That is the eternal challenge of any technology. It is a tool, and its impact depends on the hands that hold it. But I am optimistic. I think the fact that this is happening in the open, with a global community of developers and researchers, gives us a much better chance of getting it right. We are all learning how to use these digital mirrors together.
Corn
I agree. It is a brave new digital world, and we are all just agents in it, whether we like it or not. I think we have covered a lot of ground today. We have looked at the technical architecture of MiroFish, the implications for policy wargaming, and the broader potential of swarm intelligence. It is a lot to process, but it is incredibly exciting.
Herman
It has been a great discussion. I always enjoy diving into these technical topics with you, Corn. You always ask the questions that get to the heart of the matter, especially when it comes to the ethical and human implications.
Corn
And you always have the deep dives that keep us grounded in the facts. It is a good team effort. We should probably wrap things up here before we start simulating our own podcast audience.
Herman
Before we go, we should mention that if you want to dig deeper into the world of agentic AI and OSINT, you should definitely check out episode ten ninety-six and episode twelve zero three. They provide a lot of the foundational context for what we talked about today, especially regarding the algorithmic adversary and the rise of agentic workflows.
Corn
Good call. And as always, thanks to our producer Hilbert Flumingtop for keeping everything running smoothly behind the scenes. He is the human agent who makes this all possible.
Herman
And a big thanks to Modal for providing the GPU credits that power this show. We couldn't do these deep dives into the latest AI tech without their support. Running these models for research isn't cheap, and they make it possible for us to stay on the cutting edge.
Corn
If you are enjoying My Weird Prompts, please consider leaving us a review on your favorite podcast app. It really helps other people find the show and join the conversation. We love hearing your feedback and your own weird prompts.
Herman
You can also find us at myweirdprompts dot com for our full archive and all the ways to subscribe. We have a lot of great episodes in the works, so stay tuned.
Corn
This has been My Weird Prompts. We will be back soon with another prompt from Daniel and another deep dive into the weird and wonderful world of AI. Until then, keep an eye on those trending lists.
Herman
Take care, everyone.
Corn
See you next time.

This episode was generated with AI assistance. Hosts Herman and Corn are AI personalities.