dc.description.abstract | Very large scale multi-agent systems occur both naturally and in engineering applications. In many of these systems, the agents are either selfish or act with varying levels of coordination. Understanding the evolution of large populations of such agents has been of interest in diverse domains such as biology, ecology, sociology, economics, transportation engineering, robotics and control engineering. One specific area that has not been studied enough is the evolution of large populations on networks of choices. This specific setting has potential applications in the contexts of fleet redistribution of ride-sharing services, evolution of transportation mode choices of a population, opinion dynamics, human and insect swarm migrations and robotics swarms.
In this thesis, we consider a population composed of a continuum of agents that seek to maximize a payoff function by moving on a network. The nodes in the network may represent physical locations or abstract choices. The population is stratified and hence agents opting for the same choice may not get the same payoff. In particular, we assume payoff functions that model diminishing returns, that is, agents in "newer" strata of a node receive a smaller payoff compared to "older" strata. Moreover, at each time instant, the network imposes constraints on the set of choices that an agent can revise to.
We first model the population dynamics under three choice revision policies, each having varying levels of coordination - (i). no coordination and the agents are selfish, (ii). coordination among agents in each node and (iii). coordination across the entire population. To model the case with selfish agents, we generalize the Smith dynamics to our setting, where we have a stratified population and network constraints. We refer to this dynamics as stratified Smith dynamics or SSD. To model nodal coordination, we allow the fraction of population in a node, as a whole, to take the 'best response' to the state of the population in the node's neighborhood. We call this as nodal best response dynamics or NBRD. For the case of population-wide coordination, we explore a dynamics where the population evolves according to centralized gradient ascent of the social utility, though constrained by the network. We call this dynamics as network restricted payoff maximization or NRPM. In each case, we show that the dynamics has existence and uniqueness of solutions. We also show that the solutions from any initial condition asymptotically converge to the set of Nash equilibria and the social utility converges to a constant.
We then study the steady state of the population and the steady state social utility for the three dynamics SSD, NBRD and NRPM. We provide sufficient conditions on the network based on a maximum payoff density parameter of each node under which there exists a unique Nash equilibrium. We then utilize positive correlation properties of the dynamics to reduce the flow graph, in order to provide an upper bound on the steady state social utility. Finally we extend the idea behind the sufficient condition for the existence of a unique Nash equilibrium to partition the graph appropriately in order to provide a lower bound on the steady state social utility. We also illustrate interesting cases as well as our results using simulations.
Towards the end, we discuss some preliminary ideas to utilize this framework to control the population state to a desired configuration. We try to achieve this by changing the payoff functions at a much slower rate than the rate at which the population converges. We also look at ways to compute an optimal control action that achieves this. | en_US |