Scientific Publications and the Embedding Space of Knowledge

Abstract

The article examines current challenges in scientometrics arising from the surge in publication activity and the widespread adoption of generative artificial intelligence. It reviews the existing scientometric toolkit for analyzing research activity, dividing it into quantitative metrics and science mapping methods (citation network analysis, academic genealogy, semantic analysis, etc.). It then addresses the limitations of traditional citation analysis, such as “semantic blindness” and vulnerability to manipulation. As a potential solution, a conceptual model is proposed in which the unit of analysis shifts from the publication as a whole to an individual “key statement”. The approach records not only the statement’s content but also its type, area of relevance, and logical relationship to other claims (confirmation, refutation, clarification, generalization, etc.). Within this framework, principles for calculating modified scientometric metrics are introduced.
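One way to picture the “key statement” as a unit of analysis is a small record type. The sketch below is illustrative only and is not the authors' implementation: the class names, field names, sample statements, placeholder DOIs, and the `relation_counts` helper are all assumptions introduced here. Only the recorded attributes (content, type, area of relevance, typed relationship to other claims) come from the abstract.

```python
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Relation(Enum):
    # Relationship types named in the abstract; the list there ends with "etc."
    CONFIRMATION = "confirmation"
    REFUTATION = "refutation"
    CLARIFICATION = "clarification"
    GENERALIZATION = "generalization"


@dataclass(frozen=True)
class Link:
    target_id: str      # identifier of the statement being referred to
    relation: Relation  # how the source statement relates to the target


@dataclass
class KeyStatement:
    statement_id: str    # hypothetical identifier scheme
    publication: str     # placeholder for the publication's identifier
    text: str            # the statement's content
    statement_type: str  # free-text label; the actual taxonomy is not given here
    area: str            # area of relevance
    links: List[Link] = field(default_factory=list)


def relation_counts(statements: List[KeyStatement], target_id: str) -> Counter:
    """Count incoming links to one statement, grouped by relation type."""
    counts: Counter = Counter()
    for statement in statements:
        for link in statement.links:
            if link.target_id == target_id:
                counts[link.relation] += 1
    return counts


# Invented sample data, used only to exercise the structure.
s1 = KeyStatement("S1", "doi:placeholder-1", "Method A improves recall.",
                  "empirical result", "information retrieval")
s2 = KeyStatement("S2", "doi:placeholder-2", "Method A improves recall on corpus B.",
                  "empirical result", "information retrieval",
                  links=[Link("S1", Relation.CONFIRMATION)])
s3 = KeyStatement("S3", "doi:placeholder-3", "Method A does not improve recall.",
                  "empirical result", "information retrieval",
                  links=[Link("S1", Relation.REFUTATION)])

incoming = relation_counts([s1, s2, s3], "S1")
```

Unlike a plain citation count, a tally of this kind separates confirmations from refutations of the same claim, which is the sort of distinction that “semantic blindness” hides. How the modified metrics are actually computed is defined in the article itself, not here.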


The proposed model was tested on a corpus of 728 articles from the Russian journal Informatics and Education (2016–2025). An analysis conducted with large language models showed that retrospective extraction of statements faces significant hurdles due to established cultures of scientific communication. Consequently, the study highlights the advantages of having authors formulate key statements themselves as a distinct type of metadata. In conclusion, the paper outlines development paths for the concept of an “embedding space of knowledge,” which could eventually complement existing approaches to analyzing the evolution of scientific ideas and theories.

Article Details

How to Cite
Marinosyan, A. K., and S. G. Grigoriev. “Scientific Publications and the Embedding Space of Knowledge”. Russian Digital Libraries Journal, vol. 29, no. 2, Apr. 2026, pp. 565-94, doi:10.26907/1562-5419-2026-29-2-565-594.
