Decentralized reinforcement learning for scalable embodied intelligence in robotic swarms
Realizing scalable embodied intelligence in robotic swarms remains a fundamental challenge in robotics and artificial intelligence, and progress is hindered by the limitations of conventional decentralized multi-agent reinforcement learning approaches. This paper introduces an integrated framework for decentralized reinforcement learning built on three components: a dynamic hypergraph convolutional communication protocol for bandwidth-efficient coordination, a hierarchical policy network with recurrent state estimation that handles partial observability and supports scaling, and a federated learning-inspired training paradigm that improves robustness and sim-to-real transfer. In simulation with swarms of up to 50 agents, the approach reduces communication overhead by 78% relative to state-of-the-art baselines while maintaining superior task performance. The framework is also robust to the reality gap: performance drops by only 16–22% when policies are transferred from simulation to physical robots, compared with 45–68% for baseline methods. Physical validation on a swarm of 15 nano drones yields an 85% success rate in dynamic target pursuit tasks. Statistical analysis confirms that these improvements are significant (p < 0.001). Together, these results establish a new state of the art for deploying scalable, communication-efficient, and robust embodied intelligence in robotic swarms.
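To make two of the named components more concrete, the following is a minimal NumPy sketch of a generic hypergraph convolution step over a proximity-based hypergraph, together with a plain federated-averaging step. It is not the authors' implementation: the abstract gives no architectural details, so the k-nearest-neighbour hyperedge construction, the HGNN-style degree normalization, the ReLU activation, and all function and parameter names (`knn_incidence`, `hypergraph_conv`, `federated_average`, `k`, `Theta`) are assumptions chosen for illustration.

```python
import numpy as np


def knn_incidence(positions, k):
    """Build an incidence matrix H of shape (n_agents, n_edges).

    Assumption: one hyperedge per agent, containing that agent and its
    k nearest neighbours. Rebuilding H as agents move is one simple way
    to obtain a "dynamic" hypergraph.
    """
    n = positions.shape[0]
    diffs = positions[:, None, :] - positions[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    H = np.zeros((n, n))
    for e in range(n):
        members = np.argsort(dists[e])[: k + 1]  # agent e plus k neighbours
        H[members, e] = 1.0
    return H


def hypergraph_conv(X, H, Theta, edge_weights=None):
    """One hypergraph convolution step with symmetric degree normalization.

    Computes relu(Dv^-1/2 H W De^-1 H^T Dv^-1/2 X Theta), where
    X is (n_agents, d_in), H is (n_agents, n_edges), Theta is (d_in, d_out).
    """
    n_edges = H.shape[1]
    if edge_weights is None:
        w = np.ones(n_edges)
    else:
        w = np.asarray(edge_weights, dtype=float)
    dv = H @ w                          # weighted vertex degrees
    de = H.sum(axis=0).astype(float)    # hyperedge degrees

    dv_inv_sqrt = np.zeros_like(dv)
    dv_inv_sqrt[dv > 0] = dv[dv > 0] ** -0.5
    de_inv = np.zeros_like(de)
    de_inv[de > 0] = 1.0 / de[de > 0]

    Xn = dv_inv_sqrt[:, None] * X
    edge_msg = (w * de_inv)[:, None] * (H.T @ Xn)   # aggregate per hyperedge
    out = dv_inv_sqrt[:, None] * (H @ edge_msg)     # scatter back to agents
    return np.maximum(out @ Theta, 0.0)


def federated_average(param_list, weights=None):
    """Weighted average of per-agent parameter arrays (FedAvg-style)."""
    if weights is None:
        weights = np.ones(len(param_list))
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return sum(w * p for w, p in zip(weights, param_list))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    positions = rng.uniform(0.0, 10.0, size=(6, 2))  # 6 agents in a 2-D plane
    features = rng.normal(size=(6, 3))               # per-agent local features
    Theta = rng.normal(size=(3, 4))                  # stand-in for learned weights

    H = knn_incidence(positions, k=2)
    messages = hypergraph_conv(features, H, Theta)
    print(messages.shape)  # (6, 4)
```

In this sketch each agent's output row depends only on agents that share a hyperedge with it, which is the property that lets group-level messages replace pairwise exchanges. The averaging function stands in for the aggregation step of a federated scheme, where locally trained policy parameters are combined without sharing raw experience.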
