Algorithms for Formation of Metadata Mathematical Retro Collections Based on Analysis of Structural Features of Documents
Abstract
This paper presents solutions to the main problems that arise when forming digital mathematical collections from documents published in the pre-digital period; such collections are referred to here as retro collections. It gives algorithms for creating metadata descriptions of retro collections, based on analysis of the structure of mathematical documents and on software tools for metadata extraction. It describes the retro collections that were formed with these algorithms and included in the metadata factory of the digital mathematical library Lobachevskii-DML. It also specifies the schemas used to form the metadata and the methods for normalizing the extracted metadata to meet the schemas and requirements of the integrating mathematical libraries.
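As a rough illustration of what normalizing extracted metadata can involve, the sketch below cleans one raw record. It is not the authors' algorithm and does not follow any particular library schema. The function name `normalize_record`, the field names (`title`, `authors`, `pages`, `year`), and the semicolon-separated author format are all assumptions made for this example.

```python
import re


def normalize_record(raw):
    """Normalize one raw metadata record (field names are hypothetical)."""
    # Collapse whitespace runs left over from OCR or text extraction.
    clean = {k: re.sub(r"\s+", " ", v).strip() for k, v in raw.items()}

    # Assumed convention: authors are separated by ';'. Drop empty fragments.
    authors = [a.strip() for a in clean.get("authors", "").split(";") if a.strip()]

    # Accept a hyphen, en dash, or em dash as the page-range separator.
    pages = re.match(r"(\d+)\s*[-\u2013\u2014]\s*(\d+)$", clean.get("pages", ""))

    # Take the first standalone four-digit number as the publication year.
    year = re.search(r"\b(\d{4})\b", clean.get("year", ""))

    return {
        "title": clean.get("title", ""),
        "authors": authors,
        "first_page": int(pages.group(1)) if pages else None,
        "last_page": int(pages.group(2)) if pages else None,
        "year": int(year.group(1)) if year else None,
    }
```

A record whose page range or year cannot be parsed keeps `None` in that field, so it can be flagged for manual review rather than silently filled with a guess.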
By submitting an article for publication in the Russian Digital Libraries Journal (RDLJ), the authors automatically consent to grant Kazan (Volga) Federal University (KFU) a limited license to use the materials (only if the article is accepted for publication). This means that KFU has the right to publish the article in the next issue of the journal (on the website or in print), to reprint it in the RDLJ archives on CD, and to include it in any information system or database produced by KFU.
All copyrighted materials are published in RDLJ with the consent of the authors. If any author objects to the publication of their materials on this site, the material can be removed upon written notification to the Editor.
Documents published in RDLJ are protected by copyright, and all rights are reserved by the authors. Authors themselves monitor compliance with their rights regarding the reproduction or translation of their papers published in the journal. If material published in RDLJ is reprinted with permission by another publisher or translated into another language, a reference to the original publication must be included.
When submitting an article for publication in RDLJ, authors should bear in mind that publication on the Internet, on the one hand, provides unique opportunities for access to their content but, on the other hand, is a new form of information exchange in the global information society, in which authors and publishers are not always protected against unauthorized copying or other use of copyrighted materials.
RDLJ is copyrighted. Anyone using materials from the journal must indicate the URL: index.phtml page = elbib / rus / journal?. No changes, additions, or edits to the author's text are allowed. Others may copy individual fragments of articles from the journal and may distribute, remix, adapt, and build upon an article, even commercially, as long as they credit that article for the original creation.
Requests for the right to reproduce or use any of the materials published in RDLJ should be addressed to the Editor-in-Chief, A.M. Elizarov, at the following address: amelizarov@gmail.com.
The publishers of RDLJ are not responsible for the views expressed in published opinion articles.
We ask the authors of articles to download from this page the copyright agreement on the transfer of non-exclusive rights to use the work, sign it, and send a scanned copy to the journal publisher's address by e-mail.