Testing ChatGPT on Terminology Generation, Definitions: Translation, and Ontology Creation in German, English and Polish
DOI: https://doi.org/10.18778/1731-7533.23.20
Keywords: ChatGPT, cluster equivalence, corpus, LLM, ontology, ontoterminology, prompt, term, translation
Abstract
The paper presents the generation and analysis of ChatGPT-produced terminology, definitions, translations and ontologies for content from the restricted domain of electrotechnology, based on a German manufacturing instruction (Fertigungsvorschrift) for the assembly and functioning of thermal switches. The first part of the study gives a brief historical sketch leading from corpus linguistics to the development of LLMs and their applications. The research steps include the generation of terminology and a basic restricted-domain ontology in German, with their equivalents in English and Polish as identified in general web resources, followed by the same tasks based on the analysed tool instructions. The tests were first performed on the earlier ChatGPTPro and then on the more recent ChatGPT4, also with regard to the English and Polish equivalents, for different tasks, including the building of relevant thesauri. An Assistant called SLA (Special Language Assistant) was created for the purpose of analysing prompts, recognising context and intent, and processing data using language models.
The tests were carried out using four prompts. First, specialist terms typical of the product and its parts were identified in German, and these terms and other relevant specialist phrases were translated into English and Polish. Finally, the ontological categorisation and its visualisation were generated. The results indicate areas of fair correspondence, with manual intervention needed to refine term definitions and specify the ontology. The conclusions emphasise the effects of AI tasks as a contribution to lexicography, translation and foreign language education. The study can also serve as a reference for NLP researchers seeking to improve the functioning of LLM tools.
License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
