“Equilibria under Dynamic Benchmark Consistency in Non-Stationary Multi-Agent Systems”
Bio
Yonatan is a researcher at Netflix, where he develops and applies data-driven economic modeling with machine learning and optimization methods to study the design and analysis of content platforms and marketplaces. Yonatan’s research has been recognized by several academic awards, including the INFORMS Lanchester Prize. Prior to joining Netflix, Yonatan was an Associate Professor at Stanford Graduate School of Business. He received his PhD in Decision, Risk, and Operations from Columbia Business School. He holds a B.Sc. from the School of Physics and Astronomy and an M.Sc. from the School of Mathematical Sciences, both at Tel Aviv University.
Paper Abstract
We formulate and study a general time-varying multi-agent system where players repeatedly compete under incomplete information. Our work is motivated by scenarios commonly observed in online advertising and retail marketplaces, where agents and platform designers optimize algorithmic decision-making in dynamic competitive settings. In these systems, no-regret algorithms that provide guarantees relative to static benchmarks can perform poorly, and the distributions of play that emerge from their interaction no longer correspond to static solution concepts such as coarse correlated equilibria. Instead, we analyze the interaction of dynamic benchmark consistent policies that have performance guarantees relative to dynamic sequences of actions, and, through a novel tracking error notion, we delineate when their empirical joint distribution of play can approximate an evolving sequence of static equilibria. In systems that change sufficiently slowly (sub-linearly in the horizon length), we show that the resulting distributions of play approximate the sequence of coarse correlated equilibria, and apply this result to establish improved welfare bounds for smooth games. In a similar vein, we formulate internal dynamic benchmark consistent policies and establish that they approximate sequences of correlated equilibria. Our findings therefore suggest that, in a broad range of multi-agent systems where non-stationarity is prevalent, algorithms designed to compete with dynamic benchmarks can improve both individual and welfare guarantees, and their emerging dynamics approximate a sequence of static equilibrium outcomes.