Self-Supervised Multi-Agent Learning Algorithm for Automated Supply Chain Coordination and Disruption Recovery
Keywords:
Multi-Agent Reinforcement Learning, Self-Supervised Learning, Supply Chain Resilience
Abstract
Modern supply chains span sprawling multi-tier networks with many links between stages, which leaves them susceptible to cascading failures caused by natural disasters, geopolitical shocks, demand variability, and supplier bankruptcies. Heuristic, rule-based, and single-agent reinforcement learning approaches struggle to model global supply chains because these systems are distributed, partially observable, and nonstationary. This paper introduces SSMASC (Self-Supervised Multi-Agent Supply Chain), a decentralized learning algorithm that allows heterogeneous autonomous agents (suppliers, manufacturers, distributors, and retailers) to coordinate without a centralized authority. SSMASC follows a two-phase methodology: contrastive self-supervised pre-training first learns rich latent representations of supply chain states from unlabelled operational data, and a cooperative multi-agent reinforcement learning (MARL) phase then trains the agents using graph-attention-based communication. We also introduce a disruption-aware value decomposition mechanism with adaptive credit assignment, which enables rapid recovery behavior even under partial observability. Experiments on three publicly available benchmark supply chain environments, including a new 128-node global trade simulation, show that SSMASC achieves higher resilience scores, faster recovery, and higher total supply chain profit than the state of the art, with improvements of up to 31.4% in resilience score, a 43.7% reduction in average recovery time, and a 22.1% increase in total profit. Ablation studies confirm that both the self-supervised pre-training and the graph-attention-based communication module are critical components of SSMASC.




