Creation of Query Expansion Based on the Subject Domain Thesaurus in the Ontology of Knowledge of the Semantic Library

Olga Muratovna Ataeva
Vladimir Alekseevich Serebriakov
Natalia Pavlovna Tuchkova

Abstract

This paper discusses query expansion using a subject-area thesaurus. The context defined by the links between thesaurus terms serves both to refine the query and to enlarge the result set it returns. Query expansion is particularly important for scientific subject areas, where search relies on specialized terminology. In this case, subject-area thesauri must be used to minimize information noise. The proposed approach takes into account that similar terminology is used in different subject areas. Examples based on thesauri for selected sections of the equations of mathematical physics and related fields demonstrate the effectiveness of the approach. By linking to concepts from information resources in other areas of knowledge, the expanded query reaches search fields in remote subject areas and various types of data: texts, symbolic data, and audio and video archives. The research shows that expanding the query on the basis of context semantics improves the quality of search for scientific publications in digital information resources and increases the effectiveness of interdisciplinary research.
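The idea of expanding a query through thesaurus term links can be illustrated with a minimal sketch. It is not the authors' implementation: the `Term` structure, the `expand_query` function, the link types, and the four toy thesaurus entries are all assumptions made for illustration. The sketch follows links up to a chosen depth and, optionally, keeps only terms from selected subject areas, which is one simple way to limit information noise when similar terminology occurs in several areas.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Term:
    """Hypothetical thesaurus entry: a subject area plus typed links."""
    domain: str
    synonyms: list[str] = field(default_factory=list)
    broader: list[str] = field(default_factory=list)
    narrower: list[str] = field(default_factory=list)
    related: list[str] = field(default_factory=list)


# Toy data for illustration only; not taken from the authors' thesaurus.
THESAURUS = {
    "tricomi problem": Term(
        domain="mathematical physics",
        broader=["mixed type equation"],
        related=["eigenfunction"],
    ),
    "mixed type equation": Term(
        domain="mathematical physics",
        narrower=["tricomi problem"],
        related=["transonic flow"],
    ),
    "eigenfunction": Term(
        domain="mathematical physics",
        synonyms=["proper function"],
    ),
    "transonic flow": Term(domain="gas dynamics"),
}


def expand_query(term, thesaurus, domains=None, depth=1):
    """Return the query term followed by terms reachable within `depth` links.

    If `domains` is given, linked terms from other subject areas are skipped.
    """
    start = term.lower()
    seen = [start]
    frontier = [start]
    for _ in range(depth):
        next_frontier = []
        for current in frontier:
            entry = thesaurus.get(current)
            if entry is None:
                continue
            links = entry.synonyms + entry.broader + entry.narrower + entry.related
            for linked in links:
                target = thesaurus.get(linked)
                # A synonym may lack its own entry; it then inherits
                # the subject area of the entry that lists it.
                linked_domain = target.domain if target else entry.domain
                if domains is not None and linked_domain not in domains:
                    continue
                if linked not in seen:
                    seen.append(linked)
                    next_frontier.append(linked)
        frontier = next_frontier
    return seen


if __name__ == "__main__":
    # Broad expansion across subject areas.
    print(expand_query("Tricomi problem", THESAURUS, depth=2))
    # Expansion restricted to a single subject area.
    print(expand_query("Tricomi problem", THESAURUS,
                       domains={"mathematical physics"}, depth=2))
```

In this sketch, a larger `depth` enlarges the result set, while the `domains` filter narrows it to the chosen subject areas; a real system would draw the links and subject areas from the library's ontology rather than from a hard-coded dictionary.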

Author Biographies

Olga Muratovna Ataeva

Researcher at the Dorodnicyn computing center FRC SCS RAS; expert in system programming and databases.

Vladimir Alekseevich Serebriakov

Expert in the theory of formal languages and its applications; Doctor of Sciences, professor, head of a department at the Dorodnicyn computing center FRC SCS RAS. He has led and taken part in the development of a number of well-known software projects, in particular ISIR RAS and the Scientific portal RAS.

Natalia Pavlovna Tuchkova

Senior researcher at the Dorodnicyn computing center FRC SCS RAS; PhD in physics and mathematics; graduate of the CS Faculty of Lomonosov MSU. Expert in algorithmic languages and information technologies.
