Method for Automatic Classification of Full-Text Descriptions of Cores Using Dictionaries
Abstract
Automatic text processing methods, including methods for classifying full-text descriptions, can significantly reduce the labor required to process experimental data. This paper applies automatic text classification to the processing and classification of core elements and to the determination of lithofacies. Lithofacies are coeval geological bodies (deposits) that differ in composition or structure from adjacent layers. Assessing the oil and gas potential of fields requires maps and diagrams of lithofacies distribution, which in turn requires classifying a large number of full-text descriptions of core sections prepared by specialists. The algorithm presented in the article classifies these descriptions using specified rules and dictionaries, taking into account the order and significance of keywords in sentences. The advantages of this approach are the ability to distinguish between closely related lithofacies, the ability to use archival data, ease of adjustment to new classes, adaptation to Russian-language core descriptions, and the possibility of local use without transferring core descriptions to third-party applications.
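As a rough illustration of what classification "taking into account the order and significance of keywords" could look like, the following minimal Python sketch scores each class by summing keyword weights, discounted by the keyword's position in its sentence. It is not the authors' algorithm: the class names, keyword stems, weights, the `decay` parameter, and the function name `classify` are all hypothetical, and English terms are used only for readability (the same code accepts Cyrillic text).

```python
import re

# Hypothetical dictionaries: class -> {keyword stem: significance weight}.
# Classes, stems, and weights are illustrative and not taken from the article.
DICTIONARIES = {
    "channel sandstone": {"sandstone": 3.0, "cross-bedd": 2.0, "coarse": 1.0},
    "floodplain mudstone": {"mudstone": 3.0, "root": 2.0, "sandstone": 0.5},
}


def classify(description, dictionaries=DICTIONARIES, decay=0.8):
    """Return (best_class_or_None, scores) for a free-text core description.

    Assumption: a keyword that appears earlier in a sentence matters more,
    so its weight is multiplied by decay ** position_in_sentence.
    """
    scores = {label: 0.0 for label in dictionaries}
    # Split the description into sentences on simple punctuation.
    for sentence in re.split(r"[.!?;]+", description.lower()):
        tokens = re.findall(r"[\w-]+", sentence)
        for position, token in enumerate(tokens):
            order_factor = decay ** position
            for label, keywords in dictionaries.items():
                for stem, weight in keywords.items():
                    # Prefix match stands in for real morphological analysis.
                    if token.startswith(stem):
                        scores[label] += weight * order_factor
    best = max(scores, key=scores.get)
    # Report no class at all when no dictionary keyword was found.
    return (best if scores[best] > 0 else None), scores


label, scores = classify(
    "Sandstone, coarse-grained, cross-bedded. Rare mudstone clasts."
)
print(label, scores)
```

Because the dictionaries are plain data, adding a class in this sketch only means adding an entry, and everything runs locally with the standard library. A production system for Russian-language descriptions would likely need proper lemmatization rather than prefix matching.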

This work is licensed under a Creative Commons Attribution 4.0 International License.