Russian Digital Libraries Journal

Published since 1998
ISSN 1562-5419

Search Results

Semantic analysis of documents in the control system of digital scientific collections

Shamil Makhmutovich Khaydarov
61-85
Abstract: Methods for the semantic parsing of documents in a control system for digital scientific collections, including electronic journals, are proposed. Methods for processing documents that contain mathematical formulas and for converting documents from the OpenXML format to the TeX format are considered. An algorithm for searching for mathematical formulas in collections of documents stored in the OpenXML format has been designed. The algorithm is implemented as an online service on the science.tatarstan platform.
Keywords: semantic analysis, publishing systems.

Taking into Account the Structure of the Document in the Method of Automatic Annotation of Mathematical Concepts in Educational Texts

Konstantin Sergeevich Nikolaev
558-577
Abstract:

The enrichment of educational texts with semantic content (in particular, adding hyperlinks to the pages of the service that displays detailed information about concepts in the text) helps to increase the efficiency of students' assimilation of the material. The existing methods of semantic markup of educational texts do not take into account the structural features of such documents, which leads to excessive recognition of concepts. This article describes the development of the method of automatic annotation of mathematical concepts in educational mathematical texts by adding functionality to account for the structure of an educational document. The main purpose of the method is to process educational materials of the distance education course "Technology for solving planimetric problems". Following a single template when creating course pages allows you to apply an analysis of the web page markup and keywords used by the course creators. The main task in this process is to determine the type of table cell containing text fragments of educational materials. In accordance with the recommendations of the course creators, definitions should be highlighted in the cells containing the task statement, as well as in those blocks where the input data of the task is indicated. The type of table cells is determined by analyzing their attributes and searching for keywords in their contents. This limitation of recognizable text fragments will improve the student's perception of the course pages and improve the quality of learning.

Keywords: semantic analysis, mathematical ontology, didactic relations, mathematical education, document markup.
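
The cell-typing step described in this abstract can be illustrated with a minimal sketch. It is not the author's implementation: the CSS class names (`task`, `given`) and the marker words are hypothetical stand-ins for whatever attributes and keywords the course template actually uses, and nested tables are not handled.

```python
from html.parser import HTMLParser

# Hypothetical markers; the real course template may use different ones.
ANNOTATE_CLASSES = {"task", "given"}
ANNOTATE_KEYWORDS = ("problem", "given")


class CellCollector(HTMLParser):
    """Collects (attributes, text) for every <td> cell in a page."""

    def __init__(self):
        super().__init__()
        self.cells = []
        self._attrs = None  # attributes of the cell being read, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._attrs = dict(attrs)
            self._text = []

    def handle_data(self, data):
        if self._attrs is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._attrs is not None:
            self.cells.append((self._attrs, "".join(self._text).strip()))
            self._attrs = None


def should_annotate(attrs, text):
    """A cell qualifies by its class attribute or by a leading keyword."""
    classes = set((attrs.get("class") or "").split())
    if classes & ANNOTATE_CLASSES:
        return True
    return text.lower().startswith(ANNOTATE_KEYWORDS)


def cells_to_annotate(html):
    """Return the text of the cells in which concepts should be highlighted."""
    parser = CellCollector()
    parser.feed(html)
    parser.close()
    return [text for attrs, text in parser.cells if should_annotate(attrs, text)]
```

Only the text returned by `cells_to_annotate` would then be passed to the concept annotator, so cells holding solutions or commentary stay unmarked.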

Technology for Filling Subject Ontologies of the Scientific Knowledge Space

Nikolay Evgenievich Kalenov
101-115
Abstract:

In this article, a subject ontology is understood as a set of key concepts related to a certain field of science, together with their semantic connections, supplemented by indexes of the various classification systems that describe this field. Subject ontologies are a necessary component of each subspace of the Unified digital space of scientific knowledge (DSSK). This article presents the results of research on constructing subject ontologies with the help of an automated system for supporting terminological dictionaries, and suggests a methodology for identifying new key terms in a particular field of science. The proposed methodology uses existing classification systems in conjunction with citation databases, such as Web of Science and Scopus for English-language publications and the Russian citation index for Russian-language publications. The methodology involves dividing the scientific field into sections in accordance with the selected classification system, extracting from the citation database the core of articles related to each section, and extracting from those articles new author's keywords. In combination with the corresponding sections of the classification systems, these keywords should form the basis of the subject ontologies of the scientific field.

Keywords: scientific digital space, subject ontology, citation databases, keywords, thesaurus, classification systems.

The NewsAgent for Libraries Project: A Personalised Current-Awareness Information Service

R. Yeates
Abstract: There are three main ways of obtaining information: searching, browsing and alerting. The first two are being widely developed by libraries using the Web, but the last has been somewhat neglected. The NewsAgent for Libraries project was originally funded under the eLib Programme by JISC (Joint Information Systems Committee of the UK higher education funding councils) as a two-year collaborative project started in April 1996.
Several small publishers of library and information science journals worked with network specialists, market evaluators and commercial software developers to design an open, distributed architecture for disseminating information via email and personalised Web pages. Dublin Core metadata was used, enhanced by NewsAgent specific keywords, to map stored user subject profiles against information feeds. Metadata was harvested using software robots to build an Oracle database where both user profiles and document attributes were stored.
Users can join the service via a Web page, to receive information updates by email or as a personalised Web page. Users can select predefined Topics in which they are interested, or create new named ones (stored queries). They can also modify existing Topics. Topics are presented in groups, called Channels.
A major part of the project was an extensive study of the potential end users of the service, before and after a prototype service was created. The project was considered a success, although further development of both the software and the marketing strategy was needed before a full-scale launch could be planned. This is now expected in autumn 1999. In addition to this service, the software is being applied to other services by different organisations, targeted at areas such as small businesses, medical information and environmental information. It is expected that a commercial software package will be available from Fretwell-Downing Informatics as a result of the project.

Automated Students' Short Answers Grading using Language Models

Chulpan Bakievna Minnegalieva, Ilnur Ilhamovich Kashapov, Olga Dmitrievna Morozova
278-293
Abstract:

Methods for assessing student answers using language models are currently being studied by various specialists. The results of automated assessment depend on the subject area and the characteristics of the academic discipline. This paper analyzes students' answers received during the course "Computer Graphics and Design". It is proposed to compute the cosine similarity of document vectors obtained using language models and to refine the resulting estimates by checking keywords. The results can be used for preliminary assessment of students' answers and serve as a basis for further research.

Keywords: language model, knowledge control, natural text processing, student answer keyword, automated short answer grading, cosine similarity, document vector representation, BERT, word2vec, open-ended question.
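
The scoring idea in this abstract (cosine similarity of document vectors, refined by a keyword check) can be sketched as follows. This is an illustration under stated assumptions, not the authors' method: a bag-of-words count vector stands in for BERT or word2vec embeddings, and the equal weighting in `grade` and the parameter `keyword_weight` are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Toy document vector: lowercase token counts.

    Assumption: a stand-in for embeddings from a language model.
    """
    return Counter(text.lower().split())


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def grade(answer, reference, keywords, keyword_weight=0.5):
    """Blend similarity to the reference answer with keyword coverage.

    keyword_weight is a hypothetical tuning parameter in [0, 1].
    """
    similarity = cosine_similarity(bag_of_words(answer), bag_of_words(reference))
    tokens = set(answer.lower().split())
    hits = sum(1 for k in keywords if k.lower() in tokens)
    coverage = hits / len(keywords) if keywords else 0.0
    return (1 - keyword_weight) * similarity + keyword_weight * coverage
```

Tokens are split on whitespace only, so punctuation attached to a word prevents a keyword match; a real pipeline would tokenize and normalize more carefully.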

Analysis of the Distribution of Key Terms in Scientific Articles

Svetlana Aleksandrovna Vlasova, Nikolay Evgenievich Kalenov, Irina Nikolaevna Sobolevskaya
35-51
Abstract:

One of the main components of the Common Digital Space of Scientific Knowledge (CDSSK) is the set of subject ontologies of individual thematic subspaces, which include the basic concepts of the corresponding scientific area. At the initial phase, constructing a subject ontology requires forming an array of key terms in a given scientific area and then establishing links between them. A similar task arises when compiling encyclopedias, where the list of articles (slots) that determines their content must be generated. One source for the array of key terms can be the metadata of articles published in leading scientific journals, namely the author's key terms ("keywords" in the terminology of journal editors) supplied with each article. To judge whether this approach is suitable for building subject ontologies, the array of author's key terms must first be analyzed, both for how well it corresponds to the main research areas of the given branch of science and for how the occurrence frequencies of individual terms are distributed. This article presents the results of a frequency analysis of author's key terms in Russian and English, based on software processing of several thousand articles from leading Russian journals in mathematics, computer science and physics that are indexed in the MathNet database. The correspondence of the distribution of key terms (as phrases) and of individual words to Bradford's law was assessed, and the cores of key terms within each thematic direction were identified.

Keywords: digital space of scientific knowledge, subject ontologies, encyclopedia articles, key terms, article metadata, frequency analysis.
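
A minimal sketch of the frequency analysis described in this abstract is given below. It assumes a common reading of Bradford's law applied to terms: rank terms by frequency and split the ranking into zones that each account for roughly the same number of occurrences, with the first zone taken as the core. The function names and the three-zone default are illustrative, not taken from the article.

```python
from collections import Counter


def term_frequencies(articles):
    """Count occurrences of normalized author keywords across articles."""
    counts = Counter()
    for keywords in articles:
        counts.update(k.strip().lower() for k in keywords)
    return counts


def bradford_zones(freqs, n_zones=3):
    """Split frequency-ranked terms into zones of roughly equal total occurrences.

    zones[0] is the core: the few terms that carry the first share of all
    occurrences. Ties in frequency are broken alphabetically.
    """
    ranked = sorted(freqs.items(), key=lambda kv: (-kv[1], kv[0]))
    total = sum(freqs.values())
    zones = [[] for _ in range(n_zones)]
    cumulative = 0
    for term, count in ranked:
        # Zone index is set by the occurrences accumulated before this term.
        index = min(cumulative * n_zones // total, n_zones - 1)
        zones[index].append(term)
        cumulative += count
    return zones
```

If zone sizes grow roughly geometrically from the core outward, the term distribution is consistent with a Bradford-type law; a flat split would suggest no pronounced core.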

Authors Identification within the Subject Area in the Semantic Library

Olga Muratovna Ataeva, Vladimir Alekseevich Serebriakov, Natalia Pavlovna Tuchkova
198-217
Abstract:

The specifics of identifying authors and determining an author's contribution to publications in digital bibliographic codes are considered. The problem of insufficient identification manifests itself in repeated information, duplication, authors with fully coinciding names, self-citation, self-plagiarism and plagiarism proper. It is proposed to use publication information that has already been accumulated in the digital library in the form of related subject area data and a variety of target thesaurus data, for both the author and the user of the library. This information contains links through which keyword contexts, sets of co-authors, and term associations in dictionaries and thesauri can be used to identify authorship. It is important that the array under consideration consists of scientific publications, since they have an established traditional structure that allows fixed text elements (annotations, keywords, classifier codes, etc.) to be compared. Thus, even if the names in the publications match fully, authorship can be questioned if the publications in the digital library belong to different subject areas. Such contradictions are resolved by evaluating the set of links among all elements of the secondary publication information. The comparison may result in the author being added to a specific area, i.e. an extension of the addressee's thesaurus and the author's personal thesaurus, or in full namesakes from different areas of knowledge appearing in the library. It is shown that modern data analysis tools make it possible to evaluate an author's contribution to a publication, although only the scientific community can evaluate the real contribution to scientific research.

Keywords: comparison of scientific texts, semantic search, thesaurus for the ontology of knowledge information, query using the thesaurus methods of authors identification, addressee thesaurus, secondary information, individual frequency dictionary, LibMeta.
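
One simple way to illustrate the comparison of secondary publication information is an overlap score across fields. The sketch below is an assumption for illustration, not the LibMeta algorithm: the field names (`keywords`, `coauthors`, `codes`) and their weights are hypothetical, and Jaccard overlap stands in for whatever link evaluation the authors use.

```python
def jaccard(a, b):
    """Share of common elements between two collections (0.0 if both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def same_author_score(pub_a, pub_b, weights=None):
    """Weighted overlap of secondary information for two publications
    that carry the same author name.

    The fields and weights are hypothetical and would need tuning on real data.
    """
    weights = weights or {"keywords": 0.5, "coauthors": 0.3, "codes": 0.2}
    return sum(
        weight * jaccard(pub_a.get(field, ()), pub_b.get(field, ()))
        for field, weight in weights.items()
    )
```

A low score for two publications with identical author names would suggest full namesakes from different subject areas; a high score would support treating them as one author.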
© 2015-2025 Kazan Federal University; Institute of the Information Society