DOI: 10.36922/IJAMD025480050
ORIGINAL RESEARCH ARTICLE

A multi-agent deep reinforcement learning framework for the generative design of alloys and processing routes

Bilal Muhammed¹, Akash Bhattacharjee¹, B. P. Gautham¹*, Amol Joshi¹
¹ TCS Research and Innovation, Tata Consultancy Services, Pune, Maharashtra, India
Received: 26 November 2025 | Revised: 24 December 2025 | Accepted: 5 January 2026 | Published online: 28 January 2026
© 2026 by the Author(s). This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/).
Abstract

The design of alloys and their manufacturing processes requires exploring a broad design space of compositional and processing variables, much of which remains inadequately explored in practice. The existence of multiple viable processing routes for achieving the desired alloy properties further complicates the design process. This paper presents a multi-agent deep reinforcement learning (DRL) framework for the in silico design of alloys and their processing routes and conditions, tailored to specific property targets. The framework consists of distinct decentralized DRL agents, each responsible for decisions on composition selection or on one of the individual manufacturing steps in the process. Each agent interacts with its own environment, which represents its assigned process, and the agents share responsibility for both process-specific outcomes and overall property satisfaction, as governed by the reward functions. The reward functions integrate considerations of sustainability, cost, and manufacturability into the decision-making process. A generative design step is proposed that uses the trained DRL agents to produce multiple design alternatives for a given requirement. The framework is applied to the design of a hot-rolled steel sheet, exploring two feasible processing routes, conventional casting and thin slab casting, and yields several alternatives for each route. Its performance is evaluated on two experimental cases from the literature, indicating that it successfully biases the sampled designs toward the preferred solution space. A benchmark study against designs produced by materials engineers for three distinct use cases demonstrates the superior performance of the proposed framework.
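
As an illustration only, the shared-reward idea in the abstract (each agent is rewarded for its own process outcome and for overall property satisfaction, with sustainability, cost, and manufacturability taken into account) might be sketched as follows. The weighted-sum form, the weight values, and every name here (`RewardWeights`, `property_satisfaction`, `agent_reward`) are assumptions made for this sketch; the abstract does not specify the actual reward functions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    # Hypothetical weights; the abstract does not state any values.
    local: float = 0.4
    target: float = 0.4
    cost: float = 0.1
    sustainability: float = 0.05
    manufacturability: float = 0.05


def property_satisfaction(achieved: dict, target: dict) -> float:
    """Score in [0, 1]; 1.0 means every target property is met exactly.

    Assumes all target values are nonzero (relative error is used).
    """
    errors = [
        min(abs(achieved[name] - goal) / abs(goal), 1.0)
        for name, goal in target.items()
    ]
    return 1.0 - sum(errors) / len(errors)


def agent_reward(
    local_score: float,
    achieved: dict,
    target: dict,
    cost_penalty: float = 0.0,
    sustainability_penalty: float = 0.0,
    manufacturability_penalty: float = 0.0,
    w: RewardWeights = RewardWeights(),
) -> float:
    """Combine an agent's process-specific score with the shared property score.

    The shared term is identical for every agent, so all agents are
    credited (or penalized) together for the final property outcome.
    """
    shared = property_satisfaction(achieved, target)
    return (
        w.local * local_score
        + w.target * shared
        - w.cost * cost_penalty
        - w.sustainability * sustainability_penalty
        - w.manufacturability * manufacturability_penalty
    )


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the paper.
    target = {"yield_strength_mpa": 300.0}
    achieved = {"yield_strength_mpa": 270.0}
    print(agent_reward(0.8, achieved, target, cost_penalty=0.5))
```

In this sketch the same `property_satisfaction` term enters every agent's reward, which is one simple way to make decentralized agents jointly accountable for the final properties while still scoring each on its own process step.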

Keywords
Alloy and processing design
In silico design
Multi-agent systems
Deep reinforcement learning
Manufacturing process routes
Funding
None.
Conflict of interest
The authors declare that they have a pending patent titled ‘Methods and systems for automated design of materials and its manufacturing process for desired properties’ assigned to Tata Consultancy Services Ltd.
International Journal of AI for Materials and Design, Electronic ISSN: 3029-2573 Print ISSN: 3041-0746, Published by AccScience Publishing