Efficient schema-less text-to-SQL conversion using large language models
Large language models (LLMs) are increasingly applied to a range of tasks, including text-to-SQL, the task of converting natural language questions into SQL queries. While most studies train LLMs on large SQL corpora for better generalization and then rely on prompt engineering at inference time, we investigate training LLMs for schema-less prompting. In particular, our approach takes a plain natural language question as input, without any additional knowledge of the database schema. We demonstrate that smaller models paired with simpler prompts yield considerable improvements in SQL generation. Our model, based on the Flan-T5 architecture, achieves a logical form accuracy (LFA) of 0.85 on the MIMICSQL dataset, significantly outperforming current state-of-the-art models such as Defog-SQL-Coder, GPT-3.5-Turbo, LLaMA-2-7B, and GPT-4. This approach reduces model size, lowering the data and infrastructure costs required for training and serving, while improving performance enough to generate more complex SQL queries.
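To make the schema-less setup concrete, the sketch below shows inference with a Flan-T5 checkpoint via Hugging Face transformers, where the prompt is only the natural language question and no table or column definitions are supplied. The checkpoint name, example question, and generation settings are illustrative assumptions, not the authors' exact configuration; in practice the model would first be fine-tuned on (question, SQL) pairs.

```python
# Minimal sketch of schema-less text-to-SQL inference with a Flan-T5 checkpoint.
# Assumptions: checkpoint name, example question, and decoding settings are
# placeholders; the actual model is fine-tuned on question/SQL pairs beforehand.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "google/flan-t5-base"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Schema-less input: only the question itself, no database schema in the prompt.
question = "How many patients were diagnosed with newborn jaundice?"
inputs = tokenizer(question, return_tensors="pt")

# Beam search decoding of the SQL query.
outputs = model.generate(**inputs, max_new_tokens=128, num_beams=4)
sql = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(sql)
```

Because the prompt omits the schema entirely, prompt length stays short and the same input format can be reused across databases whose structure the fine-tuned model has already learned.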