The system for the automatic generation, processing, and management of document metadata in digital collections

Almaz Rustamovich Khamedzhanov

Abstract

The publishing cycle is undergoing significant technological change: automated publication management systems are being introduced, neural network technologies are being applied to content processing, and tools for the intelligent analysis of scientific data are under active development. A key trend is the automation of the publishing cycle, which aims to accelerate manuscript processing, improve metadata quality, and ensure the interoperability of information resources. In this context, metadata serves as the connecting element for machine processing and navigation within the scientific knowledge space; it supports the structuring, interpretation, and integration of information in digital library systems. However, the metadata of scientific publications is often erroneous, inaccurate, or incomplete, and creating and refining it manually is time-consuming and does not ensure high accuracy. The aim of this work is to design and develop a system for the automatic generation, processing, and management of metadata for scientific documents, based on data obtained from scientific publication search services and open knowledge bases. The system can be used to automate the extraction, refinement, and supplementation of scientific publication metadata for the subsequent creation of electronic collections of scientific documents.
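
The "refine and supplement" step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that a partially filled metadata record has already been extracted from a document and that candidate records have already been fetched from external sources. The field names, the `supplement` function, and the sample values are hypothetical and are not taken from the system described in the article.

```python
from dataclasses import dataclass, field


@dataclass
class MergeResult:
    record: dict                                    # merged metadata record
    conflicts: dict = field(default_factory=dict)   # field -> disagreeing values


def normalize(value: str) -> str:
    """Collapse whitespace and case so trivially different strings compare equal."""
    return " ".join(value.split()).casefold()


def supplement(local: dict, external_records: list[dict]) -> MergeResult:
    """Fill empty fields of `local` from external records and note disagreements.

    Existing local values are never overwritten; a differing external value
    is only recorded as a conflict for later review.
    """
    result = MergeResult(record=dict(local))
    for source in external_records:
        for key, value in source.items():
            if not value:
                continue  # the external source has nothing to offer here
            current = result.record.get(key)
            if not current:
                result.record[key] = value  # supplement a missing field
            elif normalize(str(current)) != normalize(str(value)):
                result.conflicts.setdefault(key, []).append(value)
    return result


# Hypothetical sample data, for illustration only.
local = {"title": "An  Example Title", "doi": "", "year": "2013"}
external = [
    {"title": "an example title", "doi": "10.0000/example"},
    {"year": "2012"},
]
merged = supplement(local, external)
print(merged.record)
print(merged.conflicts)
```

Keeping conflicting values separate instead of overwriting the local record is one possible design choice: it leaves the decision about which source to trust to a later refinement step or to a human editor.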

Article Details

How to Cite
Khamedzhanov, A. R. “The System for the Automatic Generation, Processing, and Management of Document Metadata in Digital Collections”. Russian Digital Libraries Journal, vol. 29, no. 3, June 2026, pp. 937-59, doi:10.26907/1562-5419-2026-29-3-937-959.
