Method for Automatic Classification of Full-Text Descriptions of Cores Using Dictionaries


Alexey Petrovich Antonov
Sergey Alexandrovich Afonin
Alexander Sergeevich Kozytsyn
Vladimir Mikhailovich Staroverov

Abstract

Automatic text processing methods, including methods for classifying full-text descriptions, can significantly reduce the labor required to process experimental data. This paper applies automatic text classification to the processing and classification of core elements and to the determination of lithofacies. Lithofacies are coeval geological bodies (deposits) that differ in composition or structure from adjacent layers. Assessing the oil and gas potential of a field requires maps and diagrams of lithofacies distribution, which in turn requires classifying a large number of full-text descriptions of core sections prepared by specialists. The algorithm presented in the article performs this classification on the basis of specified rules and dictionaries, taking into account the order and significance of keywords in sentences. The advantages of this approach are the ability to distinguish between closely related lithofacies, the ability to use archival data, ease of adjustment to new classes, adaptation to Russian-language core descriptions, and the possibility of local use without transferring core descriptions to third-party applications.
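The idea of dictionary-based classification that accounts for keyword order and significance can be illustrated with a minimal sketch. This is not the authors' algorithm: the dictionary entries, the class labels, the weights, the `position_factor` rule, and the `classify` function are all hypothetical, chosen only to show how a keyword's significance and its position in a sentence might both contribute to a class score. English keywords are used for readability; the letter-matching pattern also accepts Cyrillic text.

```python
import re
from collections import defaultdict

# Hypothetical dictionary: keyword stem -> (lithofacies class, significance weight).
# Stems, classes, and weights are illustrative assumptions, not data from the article.
DICTIONARY = {
    "sandstone": ("channel", 2.0),
    "cross": ("channel", 1.5),
    "siltstone": ("floodplain", 2.0),
    "root": ("floodplain", 1.5),
    "coal": ("swamp", 3.0),
}


def position_factor(index, total):
    """Assumed ordering rule: 1.0 for the first word, falling linearly to 0.5 for the last."""
    if total <= 1:
        return 1.0
    return 1.0 - 0.5 * index / (total - 1)


def classify(description):
    """Return (best class or None, per-class scores) for one core description."""
    scores = defaultdict(float)
    # Split into sentences so that word order is measured within each sentence.
    for sentence in re.split(r"[.!?;]+", description.lower()):
        # Runs of letters only; the pattern is Unicode-aware, so Cyrillic words match too.
        words = re.findall(r"[^\W\d_]+", sentence)
        for i, word in enumerate(words):
            for stem, (label, weight) in DICTIONARY.items():
                if word.startswith(stem):
                    scores[label] += weight * position_factor(i, len(words))
    if not scores:
        return None, {}
    best = max(scores, key=scores.get)
    return best, dict(scores)


label, scores = classify("Sandstone, fine-grained, cross-bedded. Rare coal fragments.")
print(label, scores)
```

In this sketch a description that opens with "sandstone" is assigned to the hypothetical "channel" class even though "coal" carries a higher dictionary weight, because "coal" appears later in its sentence and in a secondary role. Adding a new class amounts to adding dictionary entries, which mirrors the ease of adjustment noted in the abstract.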

Article Details

How to Cite
Antonov, A. P., S. A. Afonin, A. S. Kozytsyn, and V. M. Staroverov. “Method for Automatic Classification of Full-Text Descriptions of Cores Using Dictionaries”. Russian Digital Libraries Journal, vol. 29, no. 1, Feb. 2026, pp. 3-23, doi:10.26907/1562-5419-2026-29-1-3-23.


