M2Echem: A multilevel dual encoder-based model for predicting organic chemistry reactions

² Department of Electrical and Computer Engineering, State Key Laboratory of Internet of Things for Smart City, Faculty of Science and Technology, University of Macau, Macau, China

AIH, 025260058 https://doi.org/10.36922/AIH025260058

Received: 26 June 2025 | Revised: 21 July 2025 | Accepted: 23 July 2025 | Published online: 5 August 2025

© 2025 by the Author(s). This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/).


Abstract

Chemical reaction prediction is an important application of artificial intelligence. Transformer models are widely used for this task, but they often overlook deeper-level semantic information. The standard Transformer also loses prediction accuracy and generalizes poorly when the same molecule is written in different representations. To address these challenges, we propose a multilevel, dual encoder-based method for predicting organic chemistry reactions. The method introduces a synergistic dual-encoder architecture: the atomic encoder focuses on inter-atomic attention weights, whereas the molecular encoder employs a molecular maximum dimension reduction algorithm to identify key chemical features. The outputs of the atomic and molecular encoders are then combined through multilevel feature fusion. Finally, an optimized contrastive loss is applied to enhance the model’s robustness. The method outperformed existing models on all four datasets and significantly improved generalization performance, contributing to advances in artificial intelligence-driven drug development and research.
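The pipeline summarized in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes that "maximum dimension reduction" behaves like element-wise max pooling over atom vectors, that fusion is a simple concatenation, and that the contrastive objective is InfoNCE-style. The embedding table `EMBED` and the helpers `encode`, `max_pool`, `fuse`, and `info_nce` are hypothetical names introduced only for illustration. The character-level `encode` handles only single-letter atoms.

```python
import math

# Hypothetical static atom embeddings; a real model would learn contextual ones.
EMBED = {"C": [1.0, 0.0, 0.2], "O": [0.0, 1.0, 0.2], "N": [0.0, 0.2, 1.0]}

def encode(smiles):
    """Toy atom-level encoder: one fixed vector per single-letter atom."""
    return [EMBED[ch] for ch in smiles]

def max_pool(atom_vectors):
    """Molecule-level summary: element-wise maximum over the atom vectors (assumed reading)."""
    return [max(column) for column in zip(*atom_vectors)]

def fuse(atom_vectors, molecule_vector):
    """Assumed fusion: concatenate each atom vector with the molecule-level summary."""
    return [atom + molecule_vector for atom in atom_vectors]

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss: small when the anchor is closer to the positive than to the negatives."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    top = max(logits)  # subtract the maximum for numerical stability
    log_sum = top + math.log(sum(math.exp(x - top) for x in logits))
    return log_sum - logits[0]

# "CCO" and "OCC" are two SMILES strings for ethanol; "CCN" is ethylamine.
ethanol_a = max_pool(encode("CCO"))
ethanol_b = max_pool(encode("OCC"))
ethylamine = max_pool(encode("CCN"))

fused = fuse(encode("CCO"), ethanol_a)
loss_matched = info_nce(ethanol_a, ethanol_b, [ethylamine])
loss_mismatched = info_nce(ethanol_a, ethylamine, [ethanol_b])
```

In this toy setting the pooled vectors for the two ethanol strings coincide because max pooling ignores atom order, so `loss_matched` is smaller than `loss_mismatched`. With a contextual encoder the two representations would generally differ, which is the gap a contrastive term is meant to close.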


Keywords

Forward reaction prediction

Multilevel feature fusion

Machine learning

Simplified molecular input line entry system code

Transformer

Funding

This research was funded by the National Natural Science Foundation of China (62406153, 62471259, and 62371261), the General Program of the Natural Science Research of Higher Education of Jiangsu Province (23KJB520031), and the Postgraduate Research and Practice Innovation Program of Jiangsu Province (SJCX25_2007).

Conflict of interest

Jiashuang Huang is a Youth Editorial Board Member of this journal but was not involved, directly or indirectly, in the editorial and peer-review process conducted for this paper. The other authors declare that they have no known competing financial interests or personal relationships that could have influenced the work reported in this paper.



Artificial Intelligence in Health, Electronic ISSN: 3029-2387 Print ISSN: 3041-0894, Published by AccScience Publishing