Russian Digital Libraries Journal

Published since 1998
ISSN 1562-5419
16+

Search Results

Image Classification Using Reinforcement Learning

Artem Aleksandrovich Elizarov, Evgenii Viktorovich Razinkov
1172-1191
Abstract:

Reinforcement learning, a branch of machine learning, has been developing rapidly in recent years. As a consequence, attempts are being made to apply it to computer vision problems, in particular to image classification. Computer vision tasks are currently among the most pressing in artificial intelligence.


The article proposes an image classification method in the form of a deep neural network trained with reinforcement learning. The method treats classification as a contextual multi-armed bandit problem and solves it using various strategies for balancing exploration and exploitation together with reinforcement learning algorithms. The strategies considered are ε-greedy, ε-softmax, ε-decay-softmax, and the UCB1 method; the reinforcement learning algorithms considered are DQN, REINFORCE, and A2C. The influence of various parameters on the method's performance is analyzed, and options for further development of the method are proposed.
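
As a rough illustration of two of the strategies named in this abstract, the sketch below shows ε-greedy and UCB1 arm selection in the bandit setting, where each class label plays the role of an arm. The function names, the plain-list value estimates, and the exploration constant are assumptions made for illustration; they are not taken from the article, which uses a deep neural network to produce the estimates.

```python
import math
import random


def epsilon_greedy(q_values, epsilon, rng):
    """With probability epsilon pick a random arm, otherwise the best-valued arm."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda arm: q_values[arm])


def ucb1(q_values, counts, t):
    """Pick the arm with the highest mean reward plus UCB1 exploration bonus.

    q_values[a] is the mean reward observed for arm a, counts[a] is how many
    times it was played, and t is the total number of plays so far.
    """
    # Every arm is played once before the bonus term is defined.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    return max(
        range(len(q_values)),
        key=lambda arm: q_values[arm] + math.sqrt(2.0 * math.log(t) / counts[arm]),
    )
```

In a classification setting the reward would typically be 1 for a correct label and 0 otherwise, so a rarely tried class keeps a large bonus under UCB1 until it has been sampled enough times.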

Keywords: machine learning, image classification, reinforcement learning, contextual multi-armed bandit problem.

Automatic Annotation of Training Datasets in Computer Vision using Machine Learning Methods

Aleksey Konstantinovich Zhuravlev, Karen Albertovich Grigorian
718-729
Abstract:

This paper addresses the issue of automatic annotation of training datasets in the field of computer vision using machine learning methods. Data annotation is a key stage in the development and training of deep learning models, yet the process of creating labeled data often requires significant time and labor. This paper proposes a mechanism for automatic annotation based on the use of convolutional neural networks (CNN) and active learning methods.


The proposed methodology includes the analysis and evaluation of existing approaches to automatic annotation. The effectiveness of the proposed solutions is assessed on publicly available datasets. The results demonstrate that the proposed method significantly reduces the time required for data annotation, although operator intervention is still necessary.
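
The interplay between automatic labeling and remaining operator work can be sketched as a confidence-based triage step, a common form of active learning. This is a minimal sketch under assumptions: the function name, the 0.9 threshold, and the use of the top class probability as the uncertainty measure are hypothetical and are not described in the abstract.

```python
def split_by_confidence(probabilities, threshold=0.9):
    """Split model predictions into auto-accepted labels and items for review.

    probabilities is a list of per-class probability lists, one per image.
    Returns (auto, review): auto holds (index, predicted_class) pairs whose
    top probability reaches the threshold; review holds the remaining
    indices, ordered so the least confident predictions come first.
    """
    auto, review = [], []
    for idx, probs in enumerate(probabilities):
        best = max(range(len(probs)), key=lambda c: probs[c])
        if probs[best] >= threshold:
            auto.append((idx, best))
        else:
            review.append(idx)
    # The operator sees the most uncertain items first.
    review.sort(key=lambda i: max(probabilities[i]))
    return auto, review
```

Labels confirmed by the operator would then be added to the training set before the model is retrained, which is what shrinks the review queue over successive rounds.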


The literature review includes an analysis of modern annotation methods and existing automatic systems, providing a better understanding of the context and advantages of the proposed approach. The conclusion discusses achievements, limitations, and possible directions for future research in this field.

Keywords: computer vision, machine learning, automatic data annotation, training datasets, image segmentation.

AI in Cancer Prevention: a Retrospective Study

Petr Aleksandrovich Philonenko, Vladimir Nikolaevich Kokh, Pavel Dmitrievich Blinov
1253-1266
Abstract:

This study investigates the feasibility of effectively solving population-scale cancer screening problems using artificial intelligence (AI) methods that predict malignant neoplasm risk based on minimal electronic health record (EHR) data – medical diagnosis and service codes. To address the formulated problem, we considered a broad spectrum of modern approaches, including classical machine learning methods, survival analysis, deep learning, and large language models (LLMs). Numerical experiments demonstrated that gradient boosting using survival analysis models as additional predictors possesses the best ability to rank patients by cancer risk level, enabling consideration of both population-level and individual risk factors for malignant neoplasms. Predictors constructed from EHR data include demographic characteristics, healthcare utilization patterns, and clinical markers. This solution was tested in retrospective experiments under the supervision of specialized oncologists. In the retrospective experiment involving more than 1.9 million patients, we established that the risk group captures up to 5.4 times more patients with cancer at the same level of medical examinations. The investigated method represents a scalable solution using exclusively diagnosis and service codes, requiring no specialized infrastructure and integrable into oncological vigilance processes, making it applicable for population-scale cancer screening.
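
One plausible reading of the "5.4 times more patients" figure is a lift at a fixed examination budget: the cancer rate among the top-ranked patients divided by the rate in the whole population. The sketch below computes that quantity; the function name and the interpretation itself are assumptions, since the abstract does not define the metric.

```python
def lift_at_k(scores, labels, k):
    """Ratio of the positive rate among the k highest-scored items to the base rate.

    scores are predicted risk values, labels are 1 for a confirmed case and
    0 otherwise, and k is the number of patients that can be examined.
    """
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and the number of patients")
    base_rate = sum(labels) / len(labels)
    if base_rate == 0:
        raise ValueError("lift is undefined when there are no positive cases")
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top_rate = sum(labels[i] for i in ranked[:k]) / k
    return top_rate / base_rate
```

A lift of 1.0 means the ranking does no better than examining patients at random.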

Keywords: AI in medicine, cancer prevention, retrospective experiments.

Neural Network Architecture of Embodied Intelligence

Ayrat Rafkatovich Nurutdinov
598-655
Abstract:

In recent years, progress in artificial intelligence (AI) and machine learning has been driven by advances in large language models (LLMs) based on deep neural networks. At the same time, despite their substantial capabilities, LLMs have fundamental limitations: spontaneous unreliability in facts and judgments; simple errors that are at odds with their generally high competence; credulity, manifested as a willingness to accept a user's knowingly false claims as true; and a lack of knowledge about events that occurred after training was completed.


Probably the key reason is that, in bioinspired intelligence, learning occurs through the assimilation of implicit knowledge by an embodied form of intelligence in order to solve interactive real-world physical problems. Bioinspired studies of the nervous systems of organisms suggest that the cerebellum, which coordinates movement and maintains balance, is a prime candidate for uncovering methods for realizing embodied physical intelligence. Its simple repetitive structure and ability to control complex movements offer hope for the possibility of creating an analog to adaptive neural networks.


This paper explores the bioinspired architecture of the cerebellum as a form of analog computational network capable of modeling complex real-world physical systems. As a simple example, a realization of embodied AI in the form of a multi-component model of an octopus tentacle is presented, demonstrating the potential for creating adaptive physical systems that learn and interact with the environment.

Keywords: artificial neural network, large language model, implicit learning, cerebellum model, analog computing, embodied cognition, soft robotics, octopus.

Research of Data Processing, Detection and Protection Algorithms to Minimize the Impact of Malware and Phishing Attacks on Users of Digital Platforms

Tatiana Sergeevna Volokitina, Maxim Olegovich Tanygin
187-206
Abstract:

The article is devoted to the development of a scientific and methodological apparatus for improving the effectiveness of protecting digital platforms from cyber threats by creating processing and detection algorithms that take into account the cognitive characteristics of users. A conceptual model of a three-stage protection system is proposed, integrating technical security mechanisms with cognitive decision-making models. A heuristic detection algorithm based on Random Forest machine learning with analysis of 47 features, including technical URL characteristics and cognitive-semantic content characteristics, has been developed. A methodology for dynamic integration of four threat data sources has been created, reducing response time from 12–14 hours to two hours. An algorithm for recursive analysis of redirection chains up to ten levels deep to detect masked threats is proposed. Experimental validation on an empirical base of approximately one million records confirmed detection accuracy of 87% when processing one hundred thousand records per hour. The developed solutions ensure compliance with the requirements of GOST R 57580.1-2017 and Russian legislation in the field of personal data protection.
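
The bounded analysis of redirection chains mentioned in this abstract can be sketched as follows. This is an illustration under assumptions: the function name, the `resolve` callback (which returns the next URL or `None`), and the status strings are hypothetical, and the traversal is written as a loop rather than recursively. No network access is performed; the caller supplies the redirect lookup.

```python
def follow_redirects(url, resolve, max_depth=10):
    """Follow a redirect chain for at most max_depth hops.

    resolve(url) returns the URL that url redirects to, or None if url is
    a final destination. Returns (chain, status) where status is one of
    "resolved", "loop", or "depth_exceeded".
    """
    chain = [url]
    seen = {url}
    current = url
    for _ in range(max_depth):
        nxt = resolve(current)
        if nxt is None:
            return chain, "resolved"
        if nxt in seen:
            # A repeated URL means the chain cycles back on itself.
            chain.append(nxt)
            return chain, "loop"
        chain.append(nxt)
        seen.add(nxt)
        current = nxt
    # The hop budget is spent; check whether the chain actually ended here.
    status = "resolved" if resolve(current) is None else "depth_exceeded"
    return chain, status
```

Every URL in the returned chain, not only the final one, can then be passed to the detector, and a "loop" or "depth_exceeded" status can itself be treated as a suspicious signal.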

Keywords: heuristic threat detection, machine learning, cognitive security, phishing attacks, social engineering, data protection, threat source integration.

VR-Telecontrol of Multi-Arm Devices: Problems, Hypotheses, Problem Statement

Vlada Vladimirovna Kugurakova, Igor Dmitrievich Sergunin, Evgeniy Yurevich Zykov, Oleg Dmitrievich Sergunin, Alexey Valerievich Ulanov, Dinara Rustamovna Gabdullina, Artem Shamilevich Gilemyanov
441-471
Abstract:

The article discusses existing solutions for the remote control of robotic devices equipped with manipulators. New approaches are presented for organizing joint telecontrol of multiple manipulators using various user inputs. Usage scenarios are considered for a system architecture with many manipulators and user control interfaces, including such promising areas as deep machine learning and neural interfaces.

Keywords: virtual reality, telecontrol, robot, co-bot, robotics, joint telecontrol, teleimpedance, cognitive radio.

Analysis of Word Embeddings for Semantic Role Labeling of Russian Texts

Leysan Maratovna Kadermyatova, Elena Victorovna Tutubalina
1026-1043
Abstract: There is currently a large body of work on semantic role labeling of English texts [1–3]. However, semantic role labeling of Russian texts remained unexplored for many years due to the lack of training and test corpora. It became widespread after the appearance of the FrameBank corpus [4]. In this work, we analyze the influence of word embedding models on the quality of semantic role labeling of Russian texts. Micro- and macro-F1 scores were calculated for the word2vec [5], fastText [6], and ELMo [7] embedding models. The experiments showed that, on the Russian FrameBank corpus, fastText models performed slightly better on average than word2vec models. The highest micro- and macro-F1 scores were obtained with the deep contextualized word representation model ELMo, compared with the classical shallow embedding models.
Keywords: machine learning, ML-model, natural language processing, word embedding, semantic role labeling.
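
Since the abstract above reports both micro- and macro-F1, the difference between the two averages can be shown with a small sketch. It assumes single-label classification with one gold and one predicted role per argument; the function name and the example role labels are hypothetical.

```python
from collections import Counter


def micro_macro_f1(gold, pred):
    """Return (micro_f1, macro_f1) for parallel lists of gold and predicted labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # p was predicted but is wrong
            fn[g] += 1  # g was missed

    def f1(t, p, n):
        denom = 2 * t + p + n
        return 2 * t / denom if denom else 0.0

    labels = set(gold) | set(pred)
    # Macro: average the per-label F1, so rare roles weigh as much as frequent ones.
    macro = sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels)
    # Micro: pool the counts over all labels, so frequent roles dominate.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return micro, macro
```

For example, with gold labels `["agent", "agent", "agent", "patient"]` and predictions `["agent", "agent", "patient", "patient"]`, the micro-F1 is 0.75 while the macro-F1 is about 0.733, because the less frequent `patient` role has the lower per-label score.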

Title extraction from English scientific books in PDF format

Dmitry Sergeevich Filippov
392-411
Abstract:

The study is motivated by the weakness of methods proposed by other researchers, which rely on simple heuristics or machine learning algorithms. The purpose of the article is to provide a better way to extract titles from scientific PDF documents and to offer a better-grounded approach to title selection in general. The approach is to consider as many of the cases and problems that arise during extraction as possible and to find a method that handles all of them. The results show that the chosen approach is effective on a document set exhibiting all of the considered problems. The research highlights that in-depth analysis of the problems of the task at hand is a promising route to better solutions and tools. The article may be useful to researchers and developers who encounter document structural analysis or title detection as a secondary task within a larger program workflow.

Keywords: PDF processing, title extraction, header extraction, strategy-based approach, title heuristic, structural analysis, style information, text analysis, document analysis, information extraction.

© 2015-2026 Kazan Federal University; Institute of the Information Society