Efficient schema-less text-to-SQL conversion using large language models
Large language models (LLMs) are increasingly applied to a range of tasks, including text-to-SQL, the task of converting natural language questions into SQL queries. While most studies train LLMs on large SQL corpora for better generalization and then rely on prompt engineering at inference time, we investigate training LLMs for schema-less prompting. In particular, our approach takes a plain natural language question as input, without any additional knowledge of the database schema. We demonstrate that smaller models paired with simpler prompts yield considerable improvements in SQL generation. Our model, based on the Flan-T5 architecture, achieves a logical form accuracy (LFA) of 0.85 on the MIMICSQL dataset, significantly outperforming current state-of-the-art models such as Defog-SQL-Coder, GPT-3.5-Turbo, LLaMA-2-7B, and GPT-4. This approach reduces model size, lowering the data and infrastructure costs required for training and serving, while improving performance enough to generate more complex SQL queries.
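To make the schema-less setup concrete, the sketch below shows inference with a Flan-T5 checkpoint via Hugging Face transformers, where the prompt is only the natural language question and no table or column definitions are supplied. The checkpoint name, example question, and generation settings are illustrative assumptions, not the authors' exact configuration; in practice the model would first be fine-tuned on (question, SQL) pairs.

```python
# Minimal sketch of schema-less text-to-SQL inference with a Flan-T5 checkpoint.
# Assumptions: checkpoint name, example question, and decoding settings are
# placeholders; the actual model is fine-tuned on question/SQL pairs beforehand.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "google/flan-t5-base"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Schema-less input: only the question itself, no database schema in the prompt.
question = "How many patients were diagnosed with newborn jaundice?"
inputs = tokenizer(question, return_tensors="pt")

# Beam search decoding of the SQL query.
outputs = model.generate(**inputs, max_new_tokens=128, num_beams=4)
sql = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(sql)
```

Because the prompt omits the schema entirely, prompt length stays short and the same input format can be reused across databases whose structure the fine-tuned model has already learned.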