Data science in embodied artificial intelligence and robotics: A comprehensive study of models, methods, and applications

© 2025 by the Author(s). This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/).


Abstract

Embodied intelligence sits at the confluence of robotics, machine learning, and cognitive science, and promises systems that perceive, adapt, and act with context-aware reasoning. This study analyzes recent advances in data-driven approaches that give embodied agents intelligent behaviors, focusing on hybrid models that integrate symbolic and sub-symbolic reasoning, multimodal robotic perception, and adaptive decision-making. We critically examine the role of data science in integrating deep learning, causal inference, and uncertainty handling across diverse robotic applications. The paper also explores challenges in human-robot interaction, the ethical design of artificial intelligence, and scalable deployment in real-world environments. By synthesizing interdisciplinary perspectives, we identify research gaps and propose a unifying roadmap to advance responsible, explainable, and autonomous systems. To demonstrate these concepts, we introduce the neuro-symbolic hybrid controller with adaptive fusion model, which fuses multimodal data using a transformer, extracts symbolic predicates, and applies differentiable reasoning for action selection. In experiments on MetaWorld, the Yale-CMU-Berkeley (YCB) object set, and real-world tasks, the model achieved 92–96% success rates with robust sim-to-real transfer, outperforming proximal policy optimization, soft actor-critic, and behavioral cloning while maintaining low latency and interpretable decision-making. This work serves as a foundation for scholars and practitioners seeking to bridge theoretical insights with practical deployment of intelligent robotics.
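The three stages named in the abstract (attention-based fusion, symbolic predicate extraction, differentiable reasoning for action selection) can be illustrated with a toy sketch. This is not the authors' model: the single-head attention pooling, the product t-norm for soft conjunction, the predicate names (`object_visible`, `gripper_open`), the rules, and every weight below are hypothetical choices made only to show how such a pipeline might fit together.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def attention_fuse(tokens, query):
    """Pool modality tokens with single-head scaled dot-product attention."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, t) / scale for t in tokens])
    fused = [sum(w * t[i] for w, t in zip(weights, tokens))
             for i in range(len(tokens[0]))]
    return fused, weights


def soft_predicates(fused, heads):
    """Map the fused feature to one truth value in (0, 1) per predicate."""
    return {name: sigmoid(dot(w, fused) + b) for name, (w, b) in heads.items()}


def soft_and(*values):
    """Product t-norm: a smooth stand-in for logical AND."""
    out = 1.0
    for v in values:
        out *= v
    return out


def soft_not(value):
    return 1.0 - value


def select_action(preds, temperature=0.1):
    """Score hand-written (hypothetical) rules, then softmax over actions."""
    scores = {
        "grasp": soft_and(preds["object_visible"], preds["gripper_open"]),
        "open_gripper": soft_and(preds["object_visible"],
                                 soft_not(preds["gripper_open"])),
        "search": soft_not(preds["object_visible"]),
    }
    probs = softmax([s / temperature for s in scores.values()])
    return dict(zip(scores.keys(), probs))


# Hypothetical inputs: one vision token and one proprioception token.
tokens = [[2.0, 0.0], [0.0, 2.0]]
query = [1.0, 1.0]
heads = {
    "object_visible": ([4.0, 0.0], -1.0),
    "gripper_open": ([0.0, 4.0], -1.0),
}

fused, weights = attention_fuse(tokens, query)
preds = soft_predicates(fused, heads)
probs = select_action(preds)
print(preds)
print(probs)
```

Every operation here is smooth, so in a real system the predicate heads and attention query could be trained end to end with gradients, while the intermediate predicate values remain readable as an explanation of why an action was chosen.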

Keywords

Embodied intelligence

Hybrid artificial intelligence

Robotic perception

Adaptive decision-making

Human-robot collaboration

Causal inference in artificial intelligence

Autonomous intelligent systems

Funding

None.

Conflict of interest

The authors declare that they have no competing interests.



Embodied Intelligence and Robotics, Published by AccScience Publishing