Agent-Based Modeling with Python and NetLogo
Rubens Zimbres, Ph.D.
ML Engineer, Gen AI, Sec+, Google Developer Expert in AI/ML & Google Cloud
Lately I’ve been taking the course Introduction to Agent-Based Modeling at Complexity Explorer, a teaching platform from the Santa Fe Institute. The content is awesome and is taught by Dr. William Rand, from North Carolina State University.
Agent-Based Modeling (ABM) is a methodology to simulate phenomena according to complexity principles. In complex systems, processes occur simultaneously and the complex behavior of the whole system depends on its sub-units in a non-trivial way. This perspective is a change of paradigm in our attempt to understand the world: the laws governing the whole cannot be deduced simply from observing the details of its constituent parts (Vicsek, 2002).
In ABMs we start from simple rules to generate complex patterns, where micro behaviors cause macro phenomena. From simple and localized rules at the individual level, we can see the appearance of emergent properties of a given system (Schelling, 1978; Cederman, 2003; Axelrod, Tesfatsion, 2005; Epstein, Axtell, 1996; Sawyer, 2003-2004; Hegselmann, Flache, 1998; Wolfram, 2002). An ABM that relates the micro and macro levels is a relevant research tool for sociologists (Macy, Willer, 2002), through which one can perform abstractions.
This emergent outcome is not necessarily related to the initial conditions. The model runs parallel updates of individuals in a discrete manner, and an agent may or may not be aware of the choices it made in the past. It may simply interact, make choices, achieve a goal or even try to maximize its utility (Kahneman, Tversky, 1979; Kim, Matson, 2016).
In Agent-Based Modeling, rather than map X (causes) to Y (effects), we are more interested in understanding the processes happening between cause and effect. That’s why ABM is called science from the bottom up (Axelrod, 1997; Epstein, Axtell, 1996). This approach facilitates epistemological validity, given that it requires rigorous internal validity and construct validity as well as the verification of the implemented model according to the concepts being modeled. Replication is a key issue in ABMs. Sometimes the outcome may be surprising and counterintuitive, and this may generate the need for a new understanding of concepts taken for granted, generating a new theory (Zimbres, 2006).
ABMs are sensitive to initial conditions, and are often non-linear. In fact, ABMs can be thought of as a third way of doing science, beyond induction and deduction (Axelrod, 1997). You are not trying to generalize, unlike Machine Learning algorithms. Also, you are not trying to add all possible rules of the system to a given individual, as the basic assumption of ABMs is simplicity.
When you create an agent-based model, you must define the types of agents that compose the system, their rules of behavior (actions), properties and environment interaction. As it is a discrete event simulation, you must also define what happens at each time step. Inputs are the independent variables of the model that will be applied to agents/environment. So, what to expect as an output from an agent-based model? It depends. Sometimes you are developing an exploratory model, to understand the underlying dynamics of a system. Sometimes you already have a hypothesis that you want to accept or reject. This affects the way you are doing science and the way you handle and understand the findings of your research.
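As a rough illustration of those ingredients (agent types, properties, a rule of behavior and a discrete time step), here is a minimal Python sketch. The names `Agent` and `step`, and the toy rule in which an agent copies the highest state among itself and its two ring neighbors, are assumptions made for the example; they are not the model described in this article.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    breed: str   # hypothetical agent type, e.g. "client" or "provider"
    state: int   # internal state of the agent


def step(agents):
    """Advance the model one tick with a synchronous (parallel) update."""
    n = len(agents)
    # Compute every new state from the old states before writing any of them.
    new_states = []
    for i, agent in enumerate(agents):
        left = agents[(i - 1) % n]
        right = agents[(i + 1) % n]
        # Toy rule (assumption): copy the highest state seen locally.
        new_states.append(max(left.state, agent.state, right.state))
    for agent, new_state in zip(agents, new_states):
        agent.state = new_state


agents = [Agent("client", s) for s in (0, 0, 4, 0, 0)]
history = [[a.state for a in agents]]
for _ in range(2):
    step(agents)
    history.append([a.state for a in agents])
```

Storing the new states in a separate list is what makes the update parallel: no agent sees a neighbor's value from the current tick.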
In my final project for the above-mentioned course I developed a social network based on Cellular Automata, as in Zimbres et al. (2008). In my thesis, I used a Cellular Automaton (CA) model to simulate human interactions that happen in the real world (Zimbres, Oliveira, 2009). The model used market research with real people at two different times: one at time zero and the second at time zero plus 4 months (longitudinal market research).
Then, a Cellular Automaton model was developed and its initial condition was inherited from the results of the first market research response values and evolved to simulate human interactions that led to the values of the second market research, without explicitly imposing causality rules. Then, I compared the results of the model with the second market research. The model reached 73.80% accuracy. Below, the behavior of the chosen unidimensional rule (5 states, radius one) in a two-dimensional lattice with random initial conditions is presented:
In the same way, my final course project was an exploratory ABM that modeled individuals in a closed society whose behaviors depend upon the result of interaction with two neighbors within a radius of interaction, one on the relative “right” and the other on the relative “left”. According to the states (colors) of neighbors, a given cellular automaton output is obtained. Five states were used and were defined as levels of quality perception, where red (states 0 and 1) means unhappy, state 2 is neutral (yellow) and green (states 3 and 4) means happy.
I also developed a Message Passing algorithm in the social network, to analyze the flow and spread of information among nodes. Both the cellular automaton and the message passing algorithms were developed using the Python extension inside NetLogo. NetLogo software can be downloaded here.
NetLogo is an interesting language: it is very fast and is built in Java and Scala. It has its own syntax, and agents are called turtles. For instance, if you want a turtle to move, you must politely ask it:
ask turtles [ forward 1
cellular automata ]
As you will see in the pictures and plots ahead, there are two types of agents (breeds): clients (person shape) and service providers (star shape). Each one of them carries an internal state from 0 to 4, and also the amount of information, a float starting at 0 (no information at all) and greater than that (amount of information carried).
Each agent will choose two neighbors within the radius of interaction: an agent of the same breed as itself and an agent of the other breed. This set will be used by the cellular automaton algorithm to generate the future state of the agent. Each agent will then move to the XY coordinate between the two neighbors. Note that when two neighbors are not available within a small radius of interaction, the agent may treat a single other agent as both of its neighbors.
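The neighbor choice and the move to the point between the two neighbors can be sketched in Python as follows. This is only an illustration under assumptions: agents are plain dictionaries with `breed`, `x` and `y` keys, and the helper names `pick_neighbors` and `midpoint` are hypothetical, not taken from the NetLogo model.

```python
import math
import random


def pick_neighbors(me, others, radius, rng):
    """Pick one same-breed and one other-breed neighbor within `radius`.

    If only one kind of neighbor is in range, a single agent plays both
    roles. Returns None when nobody is in range.
    """
    in_range = [o for o in others
                if math.hypot(o["x"] - me["x"], o["y"] - me["y"]) <= radius]
    if not in_range:
        return None
    same = [o for o in in_range if o["breed"] == me["breed"]]
    other = [o for o in in_range if o["breed"] != me["breed"]]
    if same and other:
        return rng.choice(same), rng.choice(other)
    only = rng.choice(in_range)
    return only, only


def midpoint(a, b):
    """XY coordinate halfway between two neighbors."""
    return ((a["x"] + b["x"]) / 2, (a["y"] + b["y"]) / 2)


me = {"breed": "client", "x": 0.0, "y": 0.0}
others = [
    {"breed": "client", "x": 2.0, "y": 0.0},
    {"breed": "provider", "x": 0.0, "y": 2.0},
    {"breed": "provider", "x": 9.0, "y": 9.0},  # outside the radius
]
pair = pick_neighbors(me, others, radius=3.0, rng=random.Random(0))
target = midpoint(*pair)
```

With the toy data above, one client and one provider are in range, so the agent would head for the point halfway between them.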
The Cellular Automaton follows these rules: for each cell in the grid at position c(i,j), where i and j are the row and the column respectively, a function Sc(t) = S(t;i,j) is associated with the lattice to describe the state of cell c at time t (Wolfram, 2002; Ganguly et al., 2004). So, at time t+1, the state S(t+1;i,j) is given by:
S(t+1;i,j) = [S(t;i,j) + δ] mod k
where −k ≤ δ ≤ k and k is the number of states of cell c. The formula for δ is:
δ = μ if condition (a) is true
δ = -S(t;i,j) if condition (b) is true
δ = 0 otherwise
where a and b change according to the rule.
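A minimal Python sketch of this update, for a ring of cells with radius-one neighborhoods. Conditions (a) and (b) and the value μ = 1 are placeholders chosen only so the code runs; in the model they depend on the selected CA rule.

```python
def delta(state, left, right, mu=1):
    """Return the increment delta for one cell.

    Both conditions are hypothetical stand-ins for the rule-specific
    conditions (a) and (b).
    """
    if left > state and right > state:   # hypothetical condition (a)
        return mu
    if left == 0 and right == 0:         # hypothetical condition (b)
        return -state
    return 0


def next_state(state, left, right, k=5, mu=1):
    """S(t+1) = [S(t) + delta] mod k."""
    return (state + delta(state, left, right, mu)) % k


def ca_step(states, k=5, mu=1):
    """Synchronously update every cell of a ring from the old states."""
    n = len(states)
    return [next_state(states[i], states[i - 1], states[(i + 1) % n], k, mu)
            for i in range(n)]


row = [4, 1, 4, 0, 2, 0]
new_row = ca_step(row)
```

The `mod k` keeps every result inside the k available states, and the branch returning `-state` resets a cell to state 0.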
The NetLogo code for the cellular automaton is the following:
Regarding the Message Passing algorithm (Gilmer et al., 2017; Jost, 2022), information starts at level 1.00 for the individual with the highest degree (number of connections in the social network), and 0.00 for all the others. It’s possible to note in the plots that the information flows through the network, increasing or decreasing its value over time.
Given an adjacency matrix A:
The calculation of information flow is given by the following iterative algorithm:
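A minimal Python sketch of one common form of such an iteration, in which every node averages its own value with the values of its neighbors. This averaging rule is an assumption made for illustration and is not necessarily the update used in the model; only the initialization (1.0 for the highest-degree node, 0.0 elsewhere) follows the description above.

```python
def message_passing_step(adj, info):
    """One round: each node averages its own value with its neighbors'."""
    n = len(adj)
    new_info = []
    for i in range(n):
        neighbors_sum = sum(adj[i][j] * info[j] for j in range(n))
        degree = sum(adj[i])
        new_info.append((info[i] + neighbors_sum) / (1 + degree))
    return new_info


# Toy adjacency matrix: node 1 is linked to nodes 0, 2 and 3.
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
]
degrees = [sum(row) for row in adj]
seed = degrees.index(max(degrees))   # node with the highest degree
info_0 = [1.0 if i == seed else 0.0 for i in range(len(adj))]
info_1 = message_passing_step(adj, info_0)
```

After one round the seed node has shared part of its information with its neighbors, so its own value drops while theirs rise.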
The NetLogo code for the Message Passing algorithm is presented below:
Besides the interaction and information spread in the social network, the system I developed is also subject to levels of temperature of the environment, measured with a sensor attached to an Arduino, following this sketch, and connected to NetLogo:
Patches (background) allow the free movement of agents, but agents cannot overlap. If the user has the opportunity to connect the Arduino device, they will notice how higher and lower temperatures influence the number of movement steps taken by agents, following these equations:
The modulus is necessary to avoid imaginary numbers. The lattice considered was not a toroid (periodic boundary), meaning it is not wrapped at borders. The system is initialized choosing the following inputs:
Frac-providers: fraction of the agents that are service providers. The other fraction is composed of clients.
Percent-unhappy: amount of agents with state < 3
Movement-steps variable defines how much an agent should move in case there are no neighbors available inside the radius defined in Setup.
Radius-of-interaction variable helps the agent to decide the radius in which neighbors will be chosen.
Mutated individuals. Using a genetic metaphor, the idea here is to add diversity to allow the evolution of the social network.
CA-base: number of states of the cellular automaton
CA-rule: rule of the cellular automaton
Layout: options as radial, spring and circular
Epochs: how many cycles the model will run.
The temperature is collected by a DHT11 temperature and humidity sensor attached to the Arduino and influences how much agents move. In colder temperatures, agents move fewer steps, and in high temperatures, agents move more steps.
The picture below shows the inputs (left column) and outputs (right column) for the model:
The initial links (edges) and relationships were obtained from part of the email exchanges in the email-Eu-core dataset, available at the Stanford Large Network Dataset Collection. Here, N = 80 agents. The model is initialized the following way for the radial layout:
By clicking Go, the model evolves according to the setup of inputs and rules of interaction:
The output column shows some of the social networks’ measurements, like states of agents, environment variables and measurements regarding social ties. You can then detect the biggest cliques in the network, identify communities (different colors), betweenness centrality of agents and closeness centrality of agents. Mood of agents, average euclidean distance, degrees, total amount of links and amount of weak and strong ties are also presented.
Structural holes are individuals in the social network that have partial connections that leave holes in the density of the social network. The strength of ties can be seen in the thickness of the connections (links, edges) among agents, which increases at the end of each interaction with the same neighbor.
After you stop the model from running, if you click Biggest Cliques, you will find the individual with the most cliques in the social network. A clique is a subset of a network in which the actors are more closely and intensely tied to one another than they are to other members of the network. Think of it as a group of people connected by strong social ties.
You can also detect communities. The NetLogo interface will show each community with its members and corresponding colors. This option detects community structures present in the network. It does this by maximizing modularity using the Louvain method. The Louvain method is a greedy optimization of modularity, a value between −0.5 (non-modular clustering) and 1 (fully modular clustering) that measures the relative density of edges inside communities with respect to edges outside the community tested. In a detected community, the modularity will increase. Then, community nodes are grouped to restart the algorithm.
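To make the quantity being optimized more concrete, here is a small Python sketch that computes modularity for a given partition of an undirected, unweighted graph. It is not the Louvain method itself, only the score that Louvain tries to increase; the function name, the toy graph and the two partitions are assumptions for the example.

```python
def modularity(edges, community):
    """Modularity Q = sum over communities of L_c/m - (d_c / 2m)^2.

    edges: list of (u, v) pairs; community: dict mapping node -> label.
    L_c counts edges inside community c, d_c is its total degree and
    m is the number of edges in the graph.
    """
    m = len(edges)
    inside = {}   # edges with both ends in the same community
    degree = {}   # total degree per community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            inside[cu] = inside.get(cu, 0) + 1
    return sum(inside.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree.items())


# Toy graph: two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
merged = {node: "a" for node in range(6)}
q_split = modularity(edges, split)
q_merged = modularity(edges, merged)
```

Putting each triangle in its own community scores higher than lumping all six nodes together, which is the kind of improvement a greedy modularity optimizer looks for at each move.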
The Closeness button will show you the closeness of each agent to the rest of the network. Closeness centrality indicates how close a node is to all other nodes in this network. It is calculated as the average of the shortest path length from the node to every other node in the network. A smaller value means that the given agent is closer to other nodes of the network.
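A short Python sketch of the average shortest path length described above, using breadth-first search on an unweighted graph. The function name and the star-shaped toy network are assumptions. Note that many libraries report the reciprocal of this average as closeness centrality, in which case larger values mean closer.

```python
from collections import deque


def average_distance(adjacency, source):
    """Mean shortest-path length from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    others = [d for n, d in dist.items() if n != source]
    return sum(others) / len(others) if others else float("inf")


# Toy star network: node 0 is the hub, nodes 1 to 3 are leaves.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
hub = average_distance(star, 0)
leaf = average_distance(star, 1)
```

The hub is one step away from everyone, while a leaf needs two steps to reach the other leaves, so the hub gets the smaller average.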
To calculate the betweenness centrality of an agent, you take every possible pair of other agents and, for each pair, you calculate the proportion of shortest paths between members of the pair that pass through the current agent. The betweenness centrality of each node is the sum of these proportions.
NetLogo also offers the possibility of visualizing plots. In this case, the evolution of the mood of agents, their states according to breed, amount of information over time and information reach in the network:
Note that the mood (state of the agent) output in the models is very similar to the oscillating behavior found in Brian Arthur’s study (1994) and characteristic of natural phenomena.
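The definition translates fairly directly into Python. The sketch below is a plain, unnormalized version for undirected, unweighted graphs; the function names and the toy graphs are assumptions, and for larger networks a dedicated algorithm such as Brandes' would be the usual choice.

```python
from collections import deque
from itertools import combinations


def shortest_path_counts(adjacency, source):
    """Return (distances, number of shortest paths) from `source`."""
    dist = {source: 0}
    paths = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                paths[neighbor] = 0
                queue.append(neighbor)
            if dist[neighbor] == dist[node] + 1:
                paths[neighbor] += paths[node]
    return dist, paths


def betweenness(adjacency, node):
    """Sum, over pairs (s, t), of the share of shortest paths via `node`."""
    info = {s: shortest_path_counts(adjacency, s) for s in adjacency}
    dist_n, paths_n = info[node]
    total = 0.0
    others = [n for n in adjacency if n != node]
    for s, t in combinations(others, 2):
        dist_s, paths_s = info[s]
        if t not in dist_s or node not in dist_s:
            continue  # the pair, or the node, is not connected
        # `node` lies on a shortest s-t path only if the distances add up.
        if dist_s[node] + dist_n[t] == dist_s[t]:
            total += paths_s[node] * paths_n[t] / paths_s[t]
    return total


cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
b_cycle = betweenness(cycle, 1)
b_hub = betweenness(star, 0)
b_leaf = betweenness(star, 1)
```

In the four-node cycle, node 1 carries one of the two shortest paths between nodes 0 and 2, so it collects half a path; in the star, every leaf-to-leaf path runs through the hub.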
Visually, the output is interesting, but we have to analyze statistically what happens in the model. NetLogo has a feature called BehaviorSpace that allows you to run the model multiple times to check the robustness of results and save the evolution of the agent-based model in a .csv file. The file can then be analyzed in R or Python: correlations between initial and final conditions, t-tests and F-tests according to different variables, regression or any other supervised/unsupervised algorithm. For this model, these are the results of correlations between initial and final conditions:
It is interesting because for the mood of providers, total mood and centroids, there is little correlation between initial and final results, suggesting emergence of phenomena. In the table below I present the results of the F-test, according to changes in radius of interaction, and their levels of significance:
These findings mean that the network measurements (cluster coefficient, paths, closeness centrality, betweenness centrality and centroids) increased their variance with the evolution of the model. So, it looks like diversity was added to the model, considering the inputs chosen. The t-tests compare the means of the initial and final states, and also show significant differences in the network properties, like cluster coefficient, centroid positioning, closeness and betweenness centrality. Agents are closer to each other, more interconnected, away from the initial centroid and keeping a high diversity in the cluster.
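For readers who want to reproduce this kind of analysis in Python, here is a minimal sketch of the three statistics mentioned: the Pearson correlation, the F statistic (ratio of sample variances) and Welch's t statistic. The CSV layout, the column names `initial_mood` and `final_mood`, and the numbers are invented for the example and are not results from the model. The t statistic here treats the two samples as independent, which is also an assumption, and p-values would need a statistics package.

```python
import csv
import io
import math
from statistics import mean, variance


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def f_statistic(a, b):
    """Ratio of the sample variance of `a` to the sample variance of `b`."""
    return variance(a) / variance(b)


def welch_t(a, b):
    """Welch's t statistic for the difference between two sample means."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


# Made-up data standing in for an exported results file.
sample = """run,initial_mood,final_mood
1,1.0,2.0
2,2.0,4.0
3,3.0,6.0
4,4.0,8.0
5,5.0,10.0
"""
rows = list(csv.DictReader(io.StringIO(sample)))
initial = [float(row["initial_mood"]) for row in rows]
final = [float(row["final_mood"]) for row in rows]

rho = pearson(initial, final)
f_stat = f_statistic(final, initial)
t_stat = welch_t(final, initial)
```

An F statistic well above 1 means the final values are more spread out than the initial ones, which is the pattern read above as added diversity.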
By using Matplotlib you can see the path of the agents along the interactions.
Notice how the spatial position of the social network changes along the interactions:
Another nice feature of NetLogo is the existence of optimization algorithms, such as Simulated Annealing and Genetic Algorithm. Suppose you want a specific setup that makes people happy. Which inputs should the system receive in order to maximize or minimize this outcome? Let’s see some examples:
The following optimization is oriented to maximize the amount of information available in the network in the long term. The transmission of communication is subject to interests that generate agency costs and distortions (Albaum, 1967; Granovetter, 1973) with a distance limit beyond which its transmission is no longer practicable (Granovetter, 1973).
Notice how the inputs (independent variables) at the right are set. A small radius of interaction (1), with freedom of movement (13 steps), 20% of unhappy agents and 1 mutation among the 80 agents seems to make information more available in the social network.
After running the model with this setup, the outcome is the following, converging to the picture on the right side. It is interesting to notice that there are no unhappy agents at the end of the simulation. Many agents at the periphery of the network keep weak ties with the core of the system.
In order to maximize the amount of information available, it seems that weak ties play a special role. Also, the maximum degree is 22. This result suggests that individuals with higher degrees and weak ties to many different agents keep information available in the network for longer than agents with fewer connections and stronger ties. Also, the average euclidean distance of the network is not small. This means that weak ties bring diversity to the network, as stated by Granovetter (1973).
The optimization below was made in order to find a perfect setup to maximize the appearance of happy individuals, meaning that I wanted to maximize the state of agents. It is interesting that individuals must interact with other agents in a larger radius (9) and make small movement steps (1). The percentage of unhappy agents mustn’t be larger than 8% with a very small fraction of mutated individuals. Service providers must account for 10% of the population.
After running the model with this setup, the outcome is the following, converging to the picture on the right side:
The results suggest that a strongly connected network (1505 links and 1210 strong ties) with a small average euclidean distance (along with the cellular automaton rule) maximizes the state of agents, improving mood and making consensus appear (Jiang, Xia, 2009; Jiang, Ishida, 2007; Li et al., 2010).
Another optimization was made to maximize strength of ties. The perfect setup is to have interactions within a range of 10 (radius of interaction), move only 1 step, with 80% of unhappy people, 10% of service providers and one mutated individual.
Once again, strong ties are correlated with a small euclidean distance. It’s also possible to notice the skewness of the distribution of states towards happy agents and the great number of links and higher degrees.
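For readers curious about how a search like simulated annealing explores the input space, here is a minimal Python sketch. Everything in it is illustrative: the objective is a toy function standing in for "run the model and measure the outcome", and the parameter names, bounds and cooling schedule are assumptions rather than the settings used for the optimizations above.

```python
import math
import random

# Hypothetical search space: two integer inputs with inclusive bounds.
BOUNDS = {"radius": (1, 10), "steps": (1, 15)}


def toy_objective(params):
    """Stand-in for a model run (assumption): peaks at radius 7, steps 3."""
    return -((params["radius"] - 7) ** 2) - (params["steps"] - 3) ** 2


def neighbor(params, rng):
    """Nudge one randomly chosen input by +1 or -1, staying in bounds."""
    key = rng.choice(sorted(BOUNDS))
    low, high = BOUNDS[key]
    new = dict(params)
    new[key] = min(high, max(low, params[key] + rng.choice((-1, 1))))
    return new


def simulated_annealing(objective, start, steps=2000,
                        t_start=1.0, t_end=0.001, seed=0):
    """Maximize `objective`, returning the best inputs seen and their score."""
    rng = random.Random(seed)
    current, current_score = start, objective(start)
    best, best_score = current, current_score
    for i in range(steps):
        # Geometric cooling from t_start down to t_end.
        temperature = t_start * (t_end / t_start) ** (i / (steps - 1))
        candidate = neighbor(current, rng)
        score = objective(candidate)
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature falls.
        accept = score >= current_score or \
            rng.random() < math.exp((score - current_score) / temperature)
        if accept:
            current, current_score = candidate, score
        if current_score > best_score:
            best, best_score = current, current_score
    return best, best_score


start = {"radius": 1, "steps": 15}
best, best_score = simulated_annealing(toy_objective, start)
```

Accepting an occasional worse setup early on is what lets the search escape local optima. In a real experiment each call to the objective would be one or more full runs of the model, so the number of steps has to be budgeted accordingly.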
References
Albaum, G., (1967). Information flow and decentralized decision making in marketing. California Management Review, 9(4), 59–70.
Arthur, W.B. (1994) Inductive Reasoning and Bounded Rationality. The American Economic Review, Vol. 84, №2, Papers and Proceedings of the Hundred and Sixth Annual Meeting of the American Economic Association, pp. 406–411.
Axelrod, R. (1997). Advancing the Art of Simulation in the Social Sciences. Handbook of Research on Nature Inspired Computing for Economy and Management, Jean-Philippe Rennard (Ed.). Hershey, PA: Idea Group.
Axelrod, R. (1997). Complexity of cooperation. New Jersey: Princeton University Press, 1997.
Cederman, L.E. (2003). Computational models of social forms: Advancing generative macro theory. Paper prepared for presentation at the 8th Annual Methodology Meeting of the American Sociology Association, University of Washington, Seattle.
Dorri, A. Kanhere, S.S., Jurdak, R. (2018). Multi-Agent Systems: A Survey. IEEE Access, 10.1109/ACCESS.2018.2831228.
Epstein, J.M.; Axtell, R. (1996). Growing artificial societies: Social science from the bottom up. MIT Press, Cambridge.
Flache, A. Hegselmann, R. (2001). Do Irregular Grids make a Difference? Relaxing the Spatial Regularity Assumption in Cellular Models of Social Dynamics. Journal of Artificial Societies and Social Simulation, v. 4, n. 4.
Ganguly, N.; Sikdar, B.K.; Deutch, A.; Canright, G.; Chaudhuri, P.P. (2003) A survey on cellular automata.
Gilmer, J., Schoenholz, S.S., Riley, P.F., Vinyals, O., Dahl, G.E. (2017) Neural Message Passing for Quantum Chemistry. arXiv preprint arXiv:1704.0121.
Granovetter M.S. (1973) The Strength of Weak Ties. American Journal of Sociology. Vol. 78, №6, pp. 1360–1380.
Iribarren, J.L., Moro, E. (2009) Impact of human activity patterns on the dynamics of information diffusion. Phys. Rev. Lett., vol. 103, no. 3.
Jiang, Y. Ishida, T. (2007) A model for collective strategy diffusion in agent social law evolution. In Proc. 20th Int. Joint Conf. Artif. Intell. (IJCAI), Hyderabad, India, pp. 1353–1358.
Jost, Z. (2022) Basics of Graph Neural Networks. Available at Welcome AI Overlords.
Li, Z., Duan, Z., Chen, G., Huang, L. (2010) Consensus of multiagent systems and synchronization of complex networks: A unified viewpoint. IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 57, no. 1, pp. 213–224.
Jiang, Y., Jiang, J.C. (2015) Diffusion in Social Networks: A Multiagent Perspective. IEEE Transactions on Systems, Man, and Cybernetics Systems, Vol. 45, No. 2.
Jiang, Y., Xia, X. (2009) Prominence convergence in the collective synchronization of situated multi-agents. Inf. Process. Lett., vol. 109, no. 5, pp. 278–285.
Kahneman, D., Tversky, A. (1979) Prospect Theory: An Analysis of Decision under Risk. Econometrica, Vol. 47, No. 2, pp. 263–291.
Kim, Y., Matson, E.T. (2016) A Realistic Decision Making for Task Allocation in Heterogeneous Multi-agent Systems. Elsevier, Procedia Computer Science 94, pp. 386–391.
Macy, M.W.; Willer, R. (2002). From factors to actors: Computational sociology and agent-based modeling. Annual Review of Sociology, v. 28.
Sawyer, R.K. (2003). Artificial societies: Multiagent systems and the micro-macro link in sociological theory. Sociological Methods and Research, v. 31, n. 3, Feb 2003.
Sawyer, R.K. (2004). Social explanation and computational simulation. Philosophical explorations, v. 7, n. 3.
Schelling, T. (1978). Micromotives and Macrobehavior. New York: Norton.
Tesfatsion, L., (2005). Agent-based computational economics: A constructive approach to economic theory. Forthcoming in Judd, K.L. Tesfatsion, L. Handbook of Computational Economics. North-Holland.
Vicsek, T. (2002) Complexity: The Bigger Picture. Nature, v. 418.
Wolfram, S. (2002). A new kind of science. Canada: Wolfram Media Inc.
Zimbres, R.A. (2006) Modelagem Baseada em Agentes: uma Terceira Maneira de se Fazer Ciência? ANPAD, Presented at Encontro da Associação Nacional de Pós-Graduação e Pesquisa em Administração, Brasil.
Zimbres, R.A.; Brito, E.P.Z.; Oliveira, P.P.B. (2008) Cellular automata based modeling of the formation and evolution of social networks: A case in Dentistry. In: J. Cordeiro and J. Filipe, eds. Proc. of the 10th Int. Conf. on Enterprise Information Systems, INSTICC Press: Setubal-Portugal, Vol. III: Artificial Intelligence and Decision Support Systems, pp. 333–339.
Zimbres, R.A., Oliveira, P.P.B. (2009) Dynamics of Quality Perception in a Social Network: A Cellular Automaton Based Model in Aesthetics Services. Electronic Notes in Theoretical Computer Science, Elsevier, 252, pp. 157–180.