Russian Digital Libraries Journal

Published since 1998
ISSN 1562-5419
16+

Search Results

Sorting problem on graphs in programming contests

Mihail Ivanovich Kinder, Andrei Kazantsev
384-391
Abstract: The paper analyzes the problem of sorting data whose order relation is given by the adjacency of vertices in an arbitrary graph. Subtasks and questions in the 'neighborhood' of the problem are highlighted; solving them marks the level of 'immersion' in the general problem. Algorithms for individual subtasks on graphs of a special kind are discussed, as well as various approaches to the sorting problem in the general case. A sorting task of this type was proposed at the ISI-Junior School Programming Cup in July 2019 (Innopolis).
Keywords: mathematical olympiads, programming contests, informatics olympiads, multilevel tasks in mathematics, multilevel tasks in informatics contests, sorting problem on graphs.

On the Synonym Search Model

Olga Muratovna Ataeva, Vladimir Alekseevich Serebriakov, Natalia Pavlovna Tuchkova
1006-1022
Abstract:

The problem of finding the most relevant documents as a result of an extended and refined query is considered. For this, a search model and a text preprocessing mechanism are proposed, as well as the joint use of a search engine and a neural network model built on the basis of an index using word2vec algorithms to generate an extended query with synonyms and refine search results based on a selection of similar documents in a digital semantic library. The paper investigates the construction of a vector representation of documents based on paragraphs, applied to the data array of the digital semantic library LibMeta. Each piece of text is labeled; both the whole document and its separate parts can be marked. The problem of enriching user queries with synonyms is solved. When building the search model together with word2vec algorithms, an "indexing first, then training" approach was used to cover more information and give more accurate search results. The model was trained on the library's mathematical content. Examples of training, extended queries and search quality assessment using training and synonyms are given.

Keywords: search model, word2vec algorithm, synonyms, information query, query extension.

The Problem of the Existence of a Tree with a Characteristic Vector of Node Vertices

Ivan Nikolaevich Popov
474-484
Abstract:

The paper presents the problem of the existence of a tree with given numerical characteristics. Given a tree, one can determine the number of its node vertices and leaves, as well as their degrees. Thus, for a tree, one can define a set of pairs whose coordinates are the number of node vertices and their degree. The inverse problem can then be posed: given pairs of natural numbers whose second coordinates are greater than 1, determine whether there is at least one tree whose numbers of node vertices and their degrees coincide with these pairs. The solution to this problem is presented in this paper.

Keywords: algorithm, Python, graph-tree, Prufer code of the tree.
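
A note on the Prufer-code keyword: in a Prufer sequence, a vertex of degree d appears exactly d - 1 times, so a list of degrees for the non-leaf vertices can be turned into a tree by writing such a sequence and decoding it. The Python sketch below illustrates only this standard fact. It is not the paper's algorithm, and it assumes that "node vertices" means non-leaf vertices and that the number of leaves is unconstrained. All function names are hypothetical.

```python
import heapq


def prufer_from_pairs(pairs):
    """Build a Prufer sequence from (count, degree) pairs, degree >= 2.

    Assumption: each pair means 'count non-leaf vertices of this degree'.
    Non-leaf vertices get labels 0, 1, 2, ...; leaves get the remaining labels.
    """
    seq = []
    label = 0
    for count, degree in pairs:
        for _ in range(count):
            # A vertex of degree d occurs d - 1 times in the sequence.
            seq.extend([label] * (degree - 1))
            label += 1
    n = len(seq) + 2  # a Prufer sequence of a tree on n vertices has length n - 2
    return n, seq


def decode_prufer(n, seq):
    """Standard Prufer decoding: return the edge list of the labeled tree."""
    degree = [1] * n
    for v in seq:
        degree[v] += 1
    leaves = [v for v in range(n) if degree[v] == 1]
    heapq.heapify(leaves)
    edges = []
    for v in seq:
        leaf = heapq.heappop(leaves)  # smallest current leaf
        edges.append((leaf, v))
        degree[v] -= 1
        if degree[v] == 1:
            heapq.heappush(leaves, v)
    # Exactly two vertices of degree 1 remain; join them.
    edges.append((heapq.heappop(leaves), heapq.heappop(leaves)))
    return edges


# One vertex of degree 3 and two vertices of degree 2.
n, seq = prufer_from_pairs([(1, 3), (2, 2)])
edges = decode_prufer(n, seq)
```

Under these assumptions the construction always succeeds, which suggests that the paper's problem carries additional constraints not visible in the abstract.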

Analysis of the Distribution of Key Terms in Scientific Articles

Svetlana Aleksandrovna Vlasova, Nikolay Evgenievich Kalenov, Irina Nikolaevna Sobolevskaya
35-51
Abstract:

One of the main components of the Common Digital Space of Scientific Knowledge (CDSSK) is the set of subject ontologies of individual thematic subspaces, which include the basic concepts of the given scientific area. At its initial phase, constructing a subject ontology requires forming an array of key terms in the given scientific area and then establishing links between them. A similar task arises when compiling encyclopedias: generating the list of articles (slots) that determines their content. One source for the array of key terms can be the metadata of articles published in leading scientific journals, namely the authors' key terms ("keywords" in the terminology of journal editors) supplied with each article. To judge whether this approach can be used to build subject ontologies, the array of authors' key terms must first be analyzed, both for its actual correspondence to the main research areas of the given branch of science and for the frequency distribution of individual terms. This article presents the results of a frequency analysis of authors' key terms in Russian and English, carried out by software processing of several thousand articles from leading Russian journals in mathematics, computer science and physics indexed in the MathNet database. The correspondence of the distribution of key terms (as phrases) and of individual words to Bradford's law was assessed, and the cores of key terms within each thematic direction were identified.

Keywords: digital space of scientific knowledge, subject ontologies, encyclopedia articles, key terms, article metadata, frequency analysis.

On Serious and Funny in Science (Based on Materials of Digital Libraries)

Yuri Evgenievich Polak
215-249
Abstract:

Digital libraries (DL) and archives accumulate gigantic volumes of varied information. Without attempting to cover everything, this work uses a relatively small number of striking examples to trace how issues of scientific creativity are reflected in DL; to discuss and dispel stereotypical ideas about scientists as unsociable, pedantic formalists or eccentric, absent-minded persons; and to show how the peculiarities of their thought processes, combined with high intelligence, can cause misunderstanding in everyday life. At the same time, these qualities, combined with originality of thinking that sometimes turns into paradox, are manifested in non-standard approaches to problems, non-trivial solutions, and an ironic attitude towards the surrounding reality. As a result, along with serious results, unexpected associations and analogies, jokes, witticisms, and anecdotes are born. The paper provides examples of the creativity of scientists in the professional field, as well as works in such genres as science fiction, utopia, humor, and art song. Materials from more than 20 electronic libraries were used.

Keywords: digital libraries, image of a scientist, scientific creativity, humor, art song.

Tactical Sorting of Managerial Tasks During Their Administration by Means of Priority, Specifications and Affiliations Labels

Felix Osvaldovich Kasparinsky
733-745
Abstract: The article analyzes the specifics of functional programs for managing strategic, tactical and operational tasks. A technique is proposed for prefixing operational task names with tactical labels of Priorities, Specifications and Affiliations. Label abbreviations are formed so as to ensure correct prioritization when tasks are sorted in alphabetical order. The quadrants of the D. Eisenhower Priorities matrix are indicated by two-letter marks: important and urgent (IF – Important, Fast); important but not urgent (IS – Important, Slow); not important but urgent (UF – Unimportant, Fast); neither important nor urgent (US – Unimportant, Slow). The labels of the Specifications matrix for the information environment (RA, RI, SA, SI) are composed of mutually exclusive properties: the availability of the Network (I – Internet and A – Autonomous) and the presence of reduced or special functionality (R – Reduced and S – Special). Labels of the transport specification (TA, TB, TC, TP) allow sorting of tasks that require travel (T – Translocation) by airplane (A), bus (B), car (C) or on foot (P – Pedestrian), respectively. Three-letter marks of Affiliations (belonging to an individual or legal entity) are formed from the first letters of the first name, middle name and last name, or of the name of the laboratory, company or project. Tactical marks accelerate decision-making when forming a daily list of operational tasks.
Keywords: task, planning, management, priority, specification, affiliation, label, operational, tactical.
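
The prefixing idea in this abstract can be illustrated with a few lines of Python. This is a minimal sketch: only the two-letter priority labels (IF, IS, UF, US) come from the abstract, while the task names and the helper function are hypothetical.

```python
# Hypothetical task names; only the label codes IF, IS, UF, US come from the abstract.
tasks = [
    "US Sort old photos",
    "IF Submit grant report",
    "UF Reply to newsletter survey",
    "IS Draft next-year research plan",
]


def sort_by_label(names):
    """Plain alphabetical sort; the label prefix alone carries the priority."""
    return sorted(names)


ordered = sort_by_label(tasks)
for name in ordered:
    print(name)
```

Because "I" precedes "U" and "F" precedes "S" in the alphabet, an ordinary alphabetical sort yields the quadrant order IF, IS, UF, US without any custom comparison logic.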

The Rating of the Journal in the Bibliographic Database

Mikhail Mikhailovich Gorbunov-Posadov, Tatyana Alekseevna Polilova
1060-1089
Abstract:

The tool for building ratings of scientific journals is one of the popular services of bibliographic databases. The task of building a rating is usually divided into two main subtasks: determining the reference group of journals and calculating the rating indicator for journals of this reference group. Practice shows that for the correct comparison of journals, a necessary condition is to limit the reference group to exclusively journals of a certain subject. In the case of methodological errors made at the stage of selecting a reference group, the values of the journal index in the rating may differ greatly from the expected ones.


For example, in the ranking of journals in the Russian Science Citation Index (RSCI) according to the two-year impact factor in the thematic area “Mathematics”, classical fundamental mathematical journals, contrary to expectations, do not reach the first positions of the rating. The first positions were taken by journals for which mathematics is not the dominant profile discipline. Analysis of statistical data on the subject of published articles and citations in journals that occupy leading positions in the RSCI rating shows that the multidisciplinary nature of these journals significantly influenced the rating indicators.


The noted misunderstanding leads to the idea that in this case, not all the articles of the journal should have been involved in the calculation of the rating, but only those related to this thematic area. At the same time, the existing scheme of thematic classification of directions also raises questions. The "bottom-up" classification, which is gaining popularity and works on a representative array of articles, seems to be more promising. Here thematic clusters are isolated on the basis of the concept of proximity of articles, interpreted as the proximity of their bibliographic links. And further, the thematic affiliation of the article is not assigned by the volitional decision of the author or the editorial board, but is strictly formally calculated on the basis of its bibliographic list.

Keywords: scientific publication, citation, rating of journals, thematic classification, impact factor, multidisciplinary, bibliographic reference, co-citation, bottom-up classification, thematic clustering, Citation Topics.
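
One simple way to make "proximity of bibliographic links" concrete is the Jaccard similarity of two reference lists. The Python sketch below is an illustrative assumption, not the measure used by the authors or by any particular database; the reference identifiers and the function name are hypothetical.

```python
def reference_proximity(refs_a, refs_b):
    """Jaccard similarity of two reference lists: shared / all distinct references."""
    a, b = set(refs_a), set(refs_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical reference identifiers for two articles.
article_1 = ["ref-01", "ref-02", "ref-03"]
article_2 = ["ref-02", "ref-03", "ref-04"]
similarity = reference_proximity(article_1, article_2)  # 2 shared out of 4 distinct
```

Articles whose pairwise similarity is high could then be grouped into one thematic cluster by any standard clustering method.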

Recommender system of text analytics of legal documents

Денис Сергеевич Зуев, Марат Фаритович Насрутдинов, Айрат Фаридович Хасьянов
435-449
Abstract:

The paper discusses the use of machine learning, natural language analysis and intelligent search in the field of jurisprudence. The main expected result is a methodology for applying text analytics and semantic natural language processing (NLP) algorithms to knowledge management cases in different types of legal practice. The results can be applied in education and knowledge management in a wider context, since the study lies at the intersection of jurisprudence and mathematical and computational linguistics.

We describe a prototype of a multi-agent system for intelligent analysis of legal texts that is capable of identifying general dependencies in an existing database of legal documents, retrieving legal cases on similar topics, and recommending the most likely outcomes of judicial review.
Keywords: data analytics and data mining, data intensive domains, digital libraries, clustering, classification of judicial acts, recommender system, micro-service architecture.

From the originator

Наталья Валентиновна Лукашевич
86-87
Abstract: -

Ontology of the Universal Subspace of Common Digital Space of Scientific Knowledge

Svetlana Aleksandrovna Vlasova, Nikolay Evgenievich Kalenov, Alexander Nikolaevich Sotnikov
22-42
Abstract:

The work continues the authors' research on creating a Common Digital Space of Scientific Knowledge (CDSSK). Previous research proposed a unified structure for representing the ontology of CDSSK elements (subspaces, classes and attributes of objects, relationships between objects or attributes). Modeling the ontology on the example of the universal subspace and a number of thematic subspaces of the CDSSK revealed the need to adjust the ontology structure for CDSSK directories so that nested data attributes can be described. In addition, the concept of a "data attribute value dictionary type" was introduced into the ontology; two types of dictionaries were defined, "static" and "dynamic". This information makes it possible to simplify formal-logical control algorithms when generating CDSSK content. An indication of the dictionary type has been introduced into the structure of object attribute directories. The presented work describes the modified structure of the ontology using the example of 11 auxiliary and 10 subject classes of the CDSSK universal subspace (USS). Examples of directories of each class, built in accordance with the ontology structure model, a list of object attributes and examples of static dictionaries are given.

Keywords: digital space of scientific knowledge, ontologies, structuring, linked data.

Creating Pseudowords Generator and Classifier of Their Similarity with Words from Russian Dictionary using Machine Learning

Kirill Alekseevich Romadanskiy, Artemii Evgenyevich Akhaev, Tagmir Radikovich Gilyazov
145-162
Abstract:

In this article, a pseudoword is defined as a unit of speech or text that appears to be a real word in Russian but actually has no meaning. A real or natural word is a unit of speech or text that has an interpretation and is presented in a dictionary. The paper presents two models for working with the Russian language: a generator that creates pseudowords that resemble real words, and a classifier that evaluates the degree of similarity between the entered sequence of characters and real words. The classifier is used to evaluate the results of the generator. Both models are based on recurrent neural networks with long short-term memory layers and are trained on a dataset of Russian nouns. As a result of the research, a file was created containing a list of pseudowords generated by the generator model. These words were then evaluated by the classifier to filter out those that were not similar enough to real words. The generated pseudowords have potential applications in tasks such as name and branding creation, layout design, art, crafting creative works, and linguistic studies for exploring language structure and words.

Keywords: word generation, pseudoword, neural network, recurrent neural network, long short-term memory.

On the Invariance of Indefinite Integral on the Method of Its Calculation

Sergey Vyacheslavovich Kostin
627-635
Abstract: The invariance of the indefinite integral with respect to the method of its calculation is noted. A model problem that can be used to demonstrate this invariance is treated and solved by three different methods. The significance of forming and developing students' mathematical culture is noted.
Keywords: primitive, indefinite integral, methods of integration.

Mechanisms for using mobile devices in distributed computing

Нуршат Рушанович Низамов, Ирина Сергеевна Шахова
200-213
Abstract: The paper describes a system with mechanisms for using mobile devices in distributed computing. Emphasis is placed on the components of the system that control tasks and distribute resources.
Keywords: distributed computing, mobile applications, Android, mobile devices.

Generation of academic groups and project teams based on learners data acquisition

Наталья Александровна Коргутлова, Светлана Юрьевна Басаргина, Михаил Михайлович Абрамский, Марат Альбертович Солнцев, Таисия Сергеевна Бузукина
193-208
Abstract:

The use of learners' data in solutions for generating student academic groups, electives and project teams is considered. The application of machine learning clustering algorithms to these tasks is illustrated. The possibility of using social network data is shown.

Keywords: personal portrait of student, clustering, competence distribution, social networking analysis.

Semantic analysis of documents in the control system of digital scientific collections

Шамиль Махмутович Хайдаров
61-85
Abstract: Methods for semantic parsing of documents in a digital control system for scientific collections, including electronic journals, are proposed. Methods for processing documents containing mathematical formulas and for converting documents from the OpenXML format to the TeX format are considered. A search algorithm for mathematical formulas in collections of documents stored in the OpenXML format is designed. The algorithm is implemented as an online service on the platform science.tatarstan.
Keywords: semantic analysis, publishing systems.

Methods and Tools Used for Preparation Scientific Articles Publications in HTML Format

Rimma Yuryevna Skornyakova
252-302
Abstract:

Along with PDF, the traditional format for the electronic presentation of full-text scientific articles, the HTML format has become increasingly widespread in recent years. It has a number of advantages for online publications thanks to its means for better content structuring, adding multimedia and implementing various interactive and dynamic features. In this regard, the task of obtaining an HTML version of a scientific article from the original format sent by the author becomes highly topical. The article discusses various approaches to preparing HTML versions of full-text scientific articles and describes the software used in this process. The main attention is paid to the tools used for source materials in the Word format.


The paper also outlines the basics of the JATS XML standard, which is widely used in the preparation of online publications of journal articles.

Keywords: HTML version of a scientific article, XML version of a scientific article, standard for the exchange of scientific articles, JATS, conversion of scientific article formats.

Application of the Douglas-Peucker Algorithm in Online Authentication of Remote Work Tools for Specialist Training in Higher Education Group of Scientific Specialties (UGSN) 10.00.00

Anton Grigorievich Uymin, Vladimir Sergeyevich Grekov
679-694
Abstract:

In today's world, digital technologies are penetrating all aspects of human activity, including education and labor. Since 2019, when the world's educational systems began actively shifting to distance learning in response to global challenges, there has been an urgent need to develop and implement reliable identification and authentication technologies. These technologies are necessary to ensure the authenticity of work and to protect against falsification of academic achievements, especially in higher education within the group of specialties and directions (UGSN) 10.00.00 - Information Security, where laboratory and practical work play a key role in the educational process.


The problem lies in the need to optimize the flow of incoming data, which, first, can affect the retraining of the neural network core of the recognition system, and second, impose excessive requirements on the network's bandwidth. To solve this problem, efficient preprocessing of gesture data is required to simplify their trajectories while preserving the key features of the gestures.


This article proposes the use of the Douglas–Peucker algorithm for preliminary processing of mouse gesture trajectory data. This algorithm significantly reduces the number of points in the trajectories, simplifying them while preserving the main shape of the gestures. The data with simplified trajectories are then used to train neural networks.


The experimental part of the work showed that the application of the Douglas–Peucker algorithm allows for a 60% reduction in the number of points in the trajectories, leading to an increase in gesture recognition accuracy from 70% to 82%. Such data simplification contributes to speeding up the neural networks' training process and improving their operational efficiency.


The study confirmed the effectiveness of using the Douglas–Peucker algorithm for preliminary data processing in mouse gesture recognition tasks. The article suggests directions for further research, including the optimization of the algorithm's parameters for different types of gestures and exploring the possibility of combining it with other machine learning methods. The obtained results can be applied to developing more intuitive and adaptive user interfaces.

Keywords: authentication, biometric identification, remote work, distance learning, Douglas–Peucker algorithm, data preprocessing, neural network, HID devices, mouse gesture trajectories, data optimization.
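
For readers unfamiliar with the Douglas–Peucker algorithm named in this abstract, the following Python sketch shows the standard recursive form of the technique. It is a generic illustration, not the authors' implementation; the function names, the sample trajectory and the tolerance value `epsilon` are assumptions chosen for the example.

```python
import math


def perpendicular_distance(point, start, end):
    """Distance from point to the straight line through start and end."""
    (x, y), (x1, y1), (x2, y2) = point, start, end
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        # Degenerate segment: fall back to point-to-point distance.
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / math.hypot(dx, dy)


def douglas_peucker(points, epsilon):
    """Simplify a polyline, dropping points closer than epsilon to the chord."""
    if len(points) < 3:
        return list(points)
    start, end = points[0], points[-1]
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], start, end)
        if d > max_dist:
            max_dist, index = d, i
    if max_dist > epsilon:
        # Keep the farthest point and simplify both halves recursively.
        left = douglas_peucker(points[: index + 1], epsilon)
        right = douglas_peucker(points[index:], epsilon)
        return left[:-1] + right  # avoid duplicating the split point
    return [start, end]


# Hypothetical mouse trajectory: a nearly flat stroke followed by a sharp peak.
trajectory = [(0, 0), (1, 0.05), (2, 0), (3, 3), (4, 0)]
simplified = douglas_peucker(trajectory, epsilon=0.5)
```

A larger `epsilon` removes more points at the cost of shape fidelity, which is the parameter the abstract proposes tuning for different gesture types.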

We are looking for the creative students… where do children grow, learnt for the main competitive – possible to think

Eldar Reshatovich Yanbarisov, Elzara Reshatovich Yuzlikaeva
492-500
Abstract: The global problem of pedagogical science is ensuring continuous education during a period of rapid development in innovative branches of the economy, where a pupil must become a creative specialist able to advance social progress. A person who can model, analyze and critically immerse themselves in a new area of knowledge will always be a specialist in demand. This is the main goal of the advanced training school.
Keywords: the competitive specialist preparation, student must have creative, logic and critical thinking.

Modeling an Adaptive Interface using Semantic Ontology Relations

Olga Muratovna Ataeva, Vladimir Alekseevich Serebriakov, Natalia Pavlovna Tuchkova
2-17
Abstract:

The work is devoted to the problem of customizing the user interfaces of an information system that integrates data. An adaptive interface serves as one of the means of organizing the presentation of subject domain data. The issue of using the semantic relations of ontology to select data corresponding to the objectives of the study is investigated. A model of an adaptive interface is considered, which allows the most accurate reflection of the needs of a researcher within a particular subject domain. It is shown how the adaptive interface is formed by means of the semantic library model.

Keywords: ontology, adaptive interface, subject domain, data model.

Electronic Information Resources of the Library of Perm State Humanitarian Pedagogical University: on Subscription and Produced by own

А.В. Костицина
Abstract: The article focuses on the experience of choice and connection of electronic-library systems in the Perm State Humanitarian Pedagogical University. The efficient use of the information resources is analyzed. The foundation of the electronic library, its aims and objectives, presented resources are described. The main emphasis is given to the reflection of the university intellectual products and students’ and teachers’ access to them.
Keywords: University libraries, electronic library system, electronic educational resources.

Web application development based on technologies, resources and services of the Geoportal of the Institute of Computational Modelling SB RAS

О.Э. Якубайлик, А.А. Кадочников, А.В. Токарев
Abstract: A geoportal is a mapping web site; it can be described as specialized software and technologies for spatial data processing. The geoportal's main task is to provide the user with tools and services for storing, cataloguing, publishing and downloading spatial (geographic) data, searching and filtering by metadata, interactive web visualization, and direct access to geodata through web mapping services. The geoportal developed at ICM SB RAS, with its set of components and services, has become a GIS platform for creating a number of applied GIS web applications. The article describes the experience of designing and developing these systems.
Keywords: spatial data processing, geodata, web mapping services, geoportal, GIS web applications.

Experimental Study of Cognitive Function of Generating Elliptical Sentences in Planimetric Tasks

Vladimir Andreevich Parkhomenko, Xenia Aleksandrovna Naidenova, Tan’yana Aleksandrovna Martirova, Alexander Valentinovich Schukin
316-335
Abstract:

The paper is devoted to the study of the cognitive function associated with the generation of elliptical sentences in the Russian language. The study is conducted by testing this cognitive ability using a computer system specially developed by the authors for this purpose. Testing of this cognitive ability is proposed and implemented for the first time. The system is an extension of Moodle and is openly hosted in a GitHub repository. Elliptical constructions are limited to verbal and nominal ellipses, which can in theory be completely reconstructed from the context of the sentence. The study is conducted with SPbPU students as respondents. The texts of planimetric tasks are chosen as the subject area. The analysis of the testing data yields the following results: the influence of the respondent's knowledge of the subject area (planimetry) on the test results is established; a tendency of respondents towards self-study is discovered, manifested in shorter times and higher scores as they pass tests; and it is shown that respondents are poorly motivated if they do not see feedback on the answer to a completed task. The paper discusses the problems of further development of the testing system and its use in adapting questionnaires (tasks) to assess the knowledge of SPbPU students in the field of automated bug detection in programs, as well as for diagnosing the functional state of operator specialists and for express diagnosis of dementia. It also seems promising to use the system to improve syntactic parsing of elliptical sentences and to automate the restoration of ellipses in the subject area of planimetry.

Keywords: online testing system, development, experiments, cognitive function, ellipsis, planimetry.

The Third All-Russian Symposium "Infrastructure of scientific information resources and systems"

Е.Б. Кудашев, В.А. Серебряков
Abstract: This article analyzes the work of the Third All-Russian Symposium "Infrastructure of scientific information resources and systems", held in Sukhum, Abkhazia, 5-8 October 2013. The avalanche growth of electronic content has required new approaches to the storage of and continuous access to digital scientific data. Of particular interest are the current scientific tasks of creating spatial data infrastructures. The Symposium traditionally discusses issues related to the integration of geographic information resources and free access to them, research e-Infrastructures for forming distributed scientific information resources, the development of related directories, and the creation of a network of integrated, interoperable databases. The development of e-Science Infrastructures should be the basis of emerging systems for the collective work of researchers based on virtual integration of information and computing resources. The main focus of the Third Symposium was the use of modern approaches to information systems development for the informational support of scientific research.
Keywords: digital content, scientific data, the formation of digital infrastructure, continuous access and long-term storage of data.

Analysis of Word Embeddings for Semantic Role Labeling of Russian Texts

Leysan Maratovna Kadermyatova, Elena Victorovna Tutubalina
1026-1043
Abstract: Currently, there are a huge number of works dedicated to semantic role labeling of English texts [1–3]. However, semantic role labeling of Russian texts remained an unexplored area for many years due to the lack of training and test corpora. It became widespread after the appearance of the FrameBank corpus [4]. In this work, we analyzed the influence of word embedding models on the quality of semantic role labeling of Russian texts. Micro- and macro-F1 scores were calculated for the word2vec [5], fastText [6] and ELMo [7] embedding models. The experiments have shown that fastText models performed slightly better on average than word2vec models on the Russian FrameBank corpus. Higher micro- and macro-F1 scores were obtained with the deep contextualized word representation model ELMo than with the classical shallow embedding models.
Keywords: machine learning, ML-model, natural language processing, word embedding, semantic role labeling.
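
The difference between the micro- and macro-F1 scores mentioned in this abstract can be shown with a short Python sketch for single-label classification. It is a generic illustration and does not reproduce the authors' evaluation code; the role labels and the function name are hypothetical.

```python
from collections import Counter


def micro_macro_f1(gold, predicted):
    """Return (micro_f1, macro_f1) for two equal-length label sequences."""
    labels = set(gold) | set(predicted)
    if not labels:
        return 0.0, 0.0
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label was wrong
            fn[g] += 1  # gold label was missed

    def f1(t, f_pos, f_neg):
        denom = 2 * t + f_pos + f_neg
        return 2 * t / denom if denom else 0.0

    # Macro: average of per-label F1, so rare labels weigh as much as frequent ones.
    macro = sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels)
    # Micro: F1 over pooled counts, so frequent labels dominate.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return micro, macro


# Hypothetical semantic-role labels.
gold = ["agent", "agent", "agent", "patient", "theme"]
pred = ["agent", "agent", "patient", "patient", "agent"]
micro, macro = micro_macro_f1(gold, pred)
```

In this toy example the rare label "theme" is never predicted correctly, which pulls the macro score well below the micro score.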

Digital Library of Satellite Data and Development of Information Infrastructure for access to Space data

Е.Б. Кудашев, А.Н. Филонов
Abstract: The article addresses the development of information support for space research in the Earth sciences and for satellite environmental monitoring. The main focus is on developing a digital library of satellite environmental monitoring. The pressing problem of integrating information resources into global space-based environmental monitoring systems is analyzed. It is shown that building a large-scale geoinformation infrastructure holds a leading place among the tasks of informatics for aerospace remote sensing of the Earth.
© 2015-2025 Kazan Federal University; Institute of the Information Society