Online First | DOI: 10.36922/EIR025380008
ORIGINAL RESEARCH ARTICLE

Decentralized reinforcement learning for scalable embodied intelligence in robotic swarms

Sanjay Agal1*, Niyati Dhirubhai Odedra2
1 Department of Artificial Intelligence and Data Science, Faculty of Engineering and Technology, Parul University, Vadodara, Gujarat, India
2 Department of Computer Science and Engineering, College of Engineering and Technology, Dr V. R. Godhania Institute of Engineering, IT, and Management, Porbandar, Gujarat, India
Received: 21 September 2025 | Revised: 10 October 2025 | Accepted: 14 October 2025 | Published online: 31 October 2025
© 2025 by the Author(s). This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
Abstract

Realizing scalable embodied intelligence in robotic swarms is a fundamental challenge in robotics and artificial intelligence, hindered by the limitations of conventional decentralized multi-agent reinforcement learning approaches. This paper introduces an integrated framework for decentralized reinforcement learning that addresses these challenges through three key innovations: a dynamic hypergraph convolutional communication protocol for bandwidth-efficient coordination, a hierarchical policy network with recurrent state estimation for managing partial observability and enabling scalability, and a federated learning-inspired training paradigm for enhanced robustness and sim-to-real transfer. Extensive experimental evaluation demonstrates that our approach achieves a 78% reduction in communication overhead compared to state-of-the-art baselines while maintaining superior task performance in swarms of up to 50 agents in simulation. Crucially, the framework is robust to sim-to-real transfer, showing only a 16–22% performance drop when moved from simulation to physical hardware, compared to 45–68% for baseline methods. Physical validation on a swarm of 15 nano drones confirms the practical efficacy of our approach, with an 85% success rate in dynamic target pursuit tasks. Statistical analysis confirms the significance of these improvements (p < 0.001). Together, these results establish a new state of the art for deploying scalable, communication-efficient, and robust embodied intelligence in robotic swarms.
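
The abstract does not specify the aggregation rule behind the federated learning-inspired training paradigm. As an illustration only, the sketch below assumes plain sample-weighted federated averaging (FedAvg-style), in which each agent trains locally and only parameter vectors are combined. The function name `federated_average`, the parameter key `policy.w`, and the sample counts are hypothetical and are not taken from the paper.

```python
from typing import Dict, List

# Hypothetical representation: parameter name -> flat list of weights.
Params = Dict[str, List[float]]


def federated_average(local_params: List[Params], num_samples: List[int]) -> Params:
    """Sample-weighted average of per-agent parameters (FedAvg-style assumption)."""
    if not local_params or len(local_params) != len(num_samples):
        raise ValueError("need one sample count per agent")
    total = sum(num_samples)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Params = {}
    for name in local_params[0]:
        length = len(local_params[0][name])
        # Each coordinate is the weighted mean over all agents.
        averaged[name] = [
            sum(p[name][i] * n for p, n in zip(local_params, num_samples)) / total
            for i in range(length)
        ]
    return averaged


# Three hypothetical agents, each holding a two-weight policy layer.
agents = [
    {"policy.w": [1.0, 2.0]},
    {"policy.w": [3.0, 4.0]},
    {"policy.w": [5.0, 6.0]},
]
global_params = federated_average(agents, num_samples=[10, 10, 20])
print(global_params)  # {'policy.w': [3.5, 4.5]}
```

Weighting by local sample count lets agents that gathered more experience contribute more to the shared policy. Whether the framework weights agents this way, or aggregates differently, is not stated in the abstract.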

Keywords
Robotic swarms
Embodied intelligence
Decentralized reinforcement learning
Multi-agent systems
Hypergraph neural networks
Federated learning
Sim-to-real transfer
Scalable autonomy
Funding
None.
Conflict of interest
The authors declare that they have no competing interests.
Embodied Intelligence and Robotics, Published by AccScience Publishing