Artificial Intelligence in Health (AIH), Online First. DOI: 10.36922/AIH025120021
ORIGINAL RESEARCH ARTICLE

Large language models-in-the-loop: Leveraging expert small artificial intelligence models for multilingual anonymization and de-identification of protected health information

Murat Gunay1*, Bunyamin Keles2, Raife Hizlan1
1 Department of Research and Development, AI Handed LLC, Lewes, Delaware, United States of America
2 Department of Health Management, Hacettepe University Institute of Social Sciences, Ankara, Turkey
Received: 19 March 2025 | Revised: 21 August 2025 | Accepted: 26 August 2025 | Published online: 19 September 2025
© 2025 by the Author(s). This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/).
Abstract

The rise of chronic diseases and pandemics such as COVID-19 has emphasized the need to process patient data effectively while protecting privacy through anonymization and de-identification of protected health information. Anonymized data facilitate research without compromising patient confidentiality. This paper introduces expert small artificial intelligence (AI) models, developed using the large language model (LLM)-in-the-loop methodology, to meet the demand for domain-specific named entity recognition (NER) models for de-identification. These models avoid the privacy risks associated with LLMs accessed through application programming interfaces by eliminating the need to transmit or store sensitive data. More importantly, they consistently outperform LLMs in de-identification tasks. Our de-identification NER models, developed in eight languages (English, German, Italian, French, Romanian, Turkish, Spanish, and Arabic), achieved average F1-macro scores of 0.931, 0.960, 0.955, 0.937, 0.930, 0.963, 0.957, and 0.922, respectively. These results establish our de-identification NER models as the most accurate healthcare anonymization solutions, surpassing existing small models and even general-purpose LLMs such as GPT-4o. While Part I of this series introduced the LLM-in-the-loop methodology for biomedical document translation, this second paper demonstrates its success in developing cost-effective expert small NER models for de-identification tasks. Our findings lay the groundwork for future healthcare AI innovations, including biomedical entity and relation extraction, and demonstrate the value of specialized models for domain-specific challenges.
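For readers unfamiliar with the metric reported above, the following is a minimal sketch of how an F1-macro score can be computed: the F1 score is calculated separately for each label and the per-label scores are then averaged without weighting, so rare entity types count as much as frequent ones. The sketch assumes token-level labels; whether the authors score at the token or entity level is not stated in the abstract. The function name `macro_f1`, the label names, and the sample data are hypothetical and are not taken from the paper.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-label F1 over two aligned label sequences."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1   # correct prediction for label g
        else:
            fp[p] += 1   # label p was predicted but is wrong
            fn[g] += 1   # label g was missed

    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN), equivalent to the harmonic mean
        # of precision and recall for this label.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)

    # Macro average: every label contributes equally.
    return sum(scores) / len(scores)


# Hypothetical token labels for one short note (illustrative only).
gold = ["NAME", "NAME", "O", "DATE", "O", "O"]
pred = ["NAME", "O", "O", "DATE", "O", "DATE"]
score = macro_f1(gold, pred)
print(f"F1-macro: {score:.3f}")
```

Because each label is weighted equally, a model that handles common identifiers well but misses a rare one is penalized more under this average than under a frequency-weighted one.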

Keywords
De-identification
Health Insurance Portability and Accountability Act
Protected health information
Patient safety
Large language models-in-the-loop
Anonymization
Funding
None.
Conflict of interest
The authors declare that they have no competing interests.
Artificial Intelligence in Health. Electronic ISSN: 3029-2387. Print ISSN: 3041-0894. Published by AccScience Publishing.