Russian Digital Libraries Journal

Published since 1998
ISSN 1562-5419

Search Results

An Innovative Approach to the Design of Integrated Tasks in Computer Modeling Training

Olga Aleksandrovna Shirokova, Tatyana Yurievna Gainutdinova
378-393
Abstract:

The article discusses the use of LMS Moodle in developing the course “Using Computer Modeling in Education”. The course is built on the interdisciplinary integration of higher mathematics, computer modeling, and programming into the educational process, and involves the use of computer mathematics systems and software environments. Examples of specific integrated tasks are presented.


When designing the course “Using Computer Modeling in Education” in LMS Moodle, the following set of elements was used: “lesson”, “assignment”, “quiz”, “forum”, “resource”, “wiki”, “chat”, and “glossary”.


Applying this methodology for compiling integrated tasks in LMS Moodle showed that: integrated tasks using information technology raise the level at which students master complex sections of higher mathematics; the content of the higher mathematics course is the fundamental basis of the material studied in the proposed course and contributes to a deep understanding of mathematical disciplines; and integrated design tasks build practical skills in computer modeling using programming in various software environments.
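
A hypothetical illustration of such an integrated task (invented here, not taken from the course): model a second-order differential equation from the higher mathematics syllabus numerically and check the result against the analytic solution.

    # Hypothetical integrated task: compare a numerical model of
    # y'' = -y (simple harmonic motion) with its analytic solution.
    import numpy as np
    from scipy.integrate import solve_ivp

    def rhs(t, state):
        y, v = state                      # y' = v, v' = -y
        return [v, -y]

    t = np.linspace(0.0, 10.0, 200)
    sol = solve_ivp(rhs, (0.0, 10.0), [1.0, 0.0], t_eval=t)
    analytic = np.cos(t)                  # exact solution for y(0)=1, y'(0)=0
    print("max |numeric - analytic| =", np.abs(sol.y[0] - analytic).max())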

Keywords: integrated tasks, higher mathematics, computer modeling, programming, LMS Moodle, computer mathematics systems.

Exploring Post-Training Quantization of Large Language Models with a Focus on Russian Evaluation

Dmitrii Romanovich Poimanov, Mikhail Sergeevich Shutov
1138-1163
Abstract:

The rapid adoption of large language models (LLMs) has made quantization a central technique for enabling efficient deployment under real-world hardware and memory constraints. While English-centric evaluations of low-bit quantization are increasingly available, much less is known about its effects on morphologically rich and resource-diverse languages such as Russian. This gap is particularly important given the recent emergence of high-performing Russian and multilingual LLMs. In this work, we conduct a systematic study of 2-, 3-, and 4-bit post-training quantization (PTQ) for state-of-the-art Russian LLMs across different model scales (4B and 32B). Our experimental setup covers both standard uniform quantization and specialized low-bit formats, as well as lightweight finetuning for recovery in the most extreme 2-bit setting. Our findings highlight several important trends: (i) the tolerance of Russian LLMs to quantization differs across model families and scales; (ii) 4-bit quantization is generally robust, especially when advanced formats are used; (iii) 3-bit models expose sensitivity to calibration data and scaling strategies; and (iv) 2-bit models, while severely degraded under naive PTQ, can be partially restored through short finetuning. Empirical results show that the model's domain must be considered when using different quantization techniques.
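
A reference point for the uniform formats mentioned above: a minimal sketch of symmetric round-to-nearest n-bit weight quantization (the paper's actual methods, calibration procedures, and specialized low-bit formats are not reproduced here).

    # Minimal symmetric per-tensor round-to-nearest quantization, the
    # simplest member of the uniform PTQ family; real PTQ adds
    # calibration data, per-group scales, and specialized formats.
    import numpy as np

    def quantize_dequantize(w, bits):
        qmax = 2 ** (bits - 1) - 1        # e.g. 7 for signed 4-bit
        scale = np.abs(w).max() / qmax    # per-tensor scale
        q = np.clip(np.round(w / scale), -qmax - 1, qmax)
        return q * scale                  # dequantized weights

    w = np.random.randn(4096).astype(np.float32)
    for bits in (4, 3, 2):
        err = np.abs(quantize_dequantize(w, bits) - w).mean()
        print(f"{bits}-bit mean abs error: {err:.4f}")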

Keywords: neural network quantization, compression and optimization of large language models.

Normalization of Text Recognized by Optical Character Recognition Using Lightweight LLMs

Vladislav Konstantinovich Vershinin, Ivan Vladimirovich Khodnenko, Sergey Vladimirovich Ivanov
1036-1056
Abstract:

Despite recent progress, Optical Character Recognition (OCR) on historical newspapers still leaves 5–10% character errors. We present a fully automated post-OCR normalization pipeline that combines lightweight 7–8B instruction-tuned LLMs quantized to 4-bit (INT4) with a small set of regex rules. On the BLN600 benchmark (600 pages of 19th-century British newspapers), our best model YandexGPT-5-Instruct Q4 reduces Character Error Rate (CER) from 8.4% to 4.0% (–52.5%) and Word Error Rate (WER) from 20.2% to 6.5% (–67.8%), while raising semantic similarity to 0.962. The system runs on consumer hardware (RTX-4060 Ti, 8 GB VRAM) at about 35 seconds per page and requires no fine-tuning or parallel training data. These results indicate that compact INT4 LLMs are a practical alternative to large checkpoints for post-OCR cleanup of historical documents.
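
A schematic of the two-stage pipeline described above, with regex normalization followed by an LLM pass; the rules and the llm callable are illustrative placeholders, not the authors' implementation.

    # Two-stage post-OCR normalization: (1) cheap regex rules,
    # (2) a line-level pass through an instruction-tuned LLM.
    import re

    RULES = [
        (re.compile(r"(\w)-\s+(\w)"), r"\1\2"),  # rejoin hyphenated breaks
        (re.compile(r"\s{2,}"), " "),            # collapse whitespace runs
    ]

    def apply_rules(line):
        for pattern, repl in RULES:
            line = pattern.sub(repl, line)
        return line

    def normalize(line, llm):
        line = apply_rules(line)
        prompt = ("Correct OCR errors in this 19th-century newspaper "
                  "line. Change nothing else:\n" + line)
        return llm(prompt)                # e.g. a local 4-bit 7-8B model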

Keywords: optical character recognition, post-OCR correction, historical newspapers, large language models, quantization, INT4, normalization pipeline, character error rate, semantic similarity, regex rules, YandexGPT-5, lightweight models, natural language processing, digital humanities, document digitization.

Instruments Supporting Role-Based Exercises Using the STAD Strategy in E-Learning Systems

Vladislav Vladimirovich Matyunin, Anton Aleksandrovich Marchenko
209-221
Abstract:

This paper describes one possible implementation of a cooperative learning model based on the STAD (Student Teams-Achievement Divisions) strategy in an LMS (Learning Management System). This methodology develops the teamwork skills needed in later professional activity, and integrating it into an LMS can help automate and optimize some processes and open opportunities for implementing new instruments.
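
For concreteness, one common variant of Slavin's STAD improvement-point scoring, the kind of calculation an LMS plugin could automate (thresholds vary between sources; this is a sketch, not the paper's implementation).

    # STAD improvement points: each student is scored against their own
    # base score; the team score is the mean of members' points.
    def improvement_points(base, quiz, perfect=100):
        if quiz == perfect:
            return 30                     # perfect paper, regardless of base
        diff = quiz - base
        if diff > 10:
            return 30
        if diff >= 0:
            return 20
        if diff >= -10:
            return 10
        return 5

    def team_score(members):              # members: [(base, quiz), ...]
        points = [improvement_points(b, q) for b, q in members]
        return sum(points) / len(points)

    print(team_score([(70, 85), (60, 58), (90, 75)]))  # (30+10+5)/3 = 15.0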

Keywords: cooperative learning, STAD, LMS, e-learning.

Post-Correction of Weak Transcriptions by Large Language Models in the Iterative Process of Handwritten Text Recognition

Valerii Pavlovich Zykov, Leonid Moiseevich Mestetskiy
1385-1414
Abstract:

This paper addresses the problem of accelerating the construction of accurate editorial annotations for handwritten archival texts within an incremental training cycle based on weak transcription. Unlike our previously published results, the present work focuses on integrating automatic post-correction of weak transcriptions using large language models (LLMs). We propose and implement a protocol for applying LLMs at the line level in a few-shot setup with carefully designed prompts and strict output format control (preservation of pre-reform orthography, protection of proper names and numerals, prohibition of structural changes to lines). Experiments are conducted on the corpus of diaries by A.V. Sukhovo-Kobylin. As the base recognition model, we use the line-level variant of the Vertical Attention Network (VAN). Results show that LLM post-correction, exemplified by the ChatGPT-4o service, substantially improves the readability of weak transcriptions and significantly reduces the word error rate (in our experiments by about 12 percentage points), without degrading the character error rate. Another service tested, DeepSeek-R1, demonstrated less stable behavior. We discuss practical prompt engineering, limitations (context length limits, risk of “hallucinations”), and provide recommendations for the safe integration of LLM post-correction into an iterative annotation pipeline to reduce expert annotators’ workload and speed up the digitization of historical archives.
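
A sketch of what the line-level few-shot protocol with strict format control could look like; the instruction wording and the example pair are invented for illustration and are not the authors' prompts.

    # Build a line-level correction prompt with strict output control:
    # keep pre-reform orthography, protect names and numerals, one line out.
    FEW_SHOT = [
        # placeholder pair: fixes a misread while keeping "ъ"/"ѣ" intact
        ("кннга лежитъ на столѣ", "книга лежитъ на столѣ"),
    ]

    def build_prompt(ocr_line):
        rules = ("Fix recognition errors in one line of a 19th-century "
                 "Russian diary. Keep pre-reform orthography. Do not alter "
                 "proper names or numerals. Return exactly one line, with "
                 "no explanations.\n")
        shots = "".join(f"Input: {src}\nOutput: {tgt}\n"
                        for src, tgt in FEW_SHOT)
        return rules + shots + f"Input: {ocr_line}\nOutput:"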

Keywords: handwritten text recognition, weak markup, Vertical Attention Network (VAN), large language models (LLM), post-correction, iterative retraining.

Neural Network Architecture of Embodied Intelligence

Ayrat Rafkatovich Nurutdinov
598-655
Abstract:

In recent years, advances in artificial intelligence (AI) and machine learning have been driven by the development of large language models (LLMs) based on deep neural networks. At the same time, despite their substantial capabilities, LLMs have fundamental limitations: spontaneous unreliability in facts and judgments; simple errors that are dissonant with their generally high competence; credulity, manifested as a willingness to accept a user's knowingly false claims as true; and a lack of knowledge about events that occurred after training was completed.


A likely key reason is that bioinspired intelligence learns by assimilating implicit knowledge through an embodied form of intelligence solving interactive, real-world physical problems. Bioinspired studies of the nervous systems of organisms suggest that the cerebellum, which coordinates movement and maintains balance, is a prime candidate for uncovering how embodied physical intelligence can be realized. Its simple repetitive structure and its ability to control complex movements offer hope that an analog of adaptive neural networks can be created.


This paper explores the bioinspired architecture of the cerebellum as a form of analog computational networks capable of modeling complex real-world physical systems. As a simple example, a realization of embodied AI in the form of a multi-component model of an octopus tentacle is presented, demonstrating the potential in creating adaptive physical systems that learn and interact with the environment.
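
A minimal Marr-Albus-style sketch of the cerebellar idea: a fixed random expansion ("granule cells") feeding a linear readout ("Purkinje cell") trained by an LMS error rule. This is one conventional reading of the cerebellum-as-adaptive-filter view, not the architecture proposed in the paper.

    # Cerebellum-like adaptive filter: fixed expansion + plastic readout.
    import numpy as np

    rng = np.random.default_rng(0)
    n_in, n_granule = 8, 256
    W_mf = rng.normal(size=(n_granule, n_in)) / np.sqrt(n_in)  # fixed
    w_pc = np.zeros(n_granule)                                 # plastic

    def granule(x):
        return np.tanh(W_mf @ x)          # granule-cell recoding

    def lms_step(x, target, lr=0.01):
        global w_pc
        g = granule(x)
        err = target - w_pc @ g           # "climbing fiber" teaching signal
        w_pc += lr * err * g
        return err

    for _ in range(2000):                 # learn a toy motor mapping
        x = rng.normal(size=n_in)
        lms_step(x, x.sum())
    x = rng.normal(size=n_in)
    print("prediction:", w_pc @ granule(x), "target:", x.sum())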

Keywords: artificial neural network, large language model, implicit learning, cerebellum model, analog computing, embodied cognition, soft robotics, octopus.

Hiding in Meaning: Semantic Encoding for Generative Text Steganography

Oleg Yurievich Rogov, Dmitrii Evgenievich Indenbom, Dmitrii Sergeevich Korzh, Darya Valeryaevna Pugacheva, Vsevolod Alexandrovich Voronov, Elena Viktorovna Tutubalina
1165-1185
Abstract:

We propose a novel framework for steganographic text generation that hides binary messages within semantically coherent natural language using latent-space conditioning of large language models (LLMs). Secret messages are first encoded into continuous vectors via a learned binary-to-latent mapping, which is used to guide text generation through prefix tuning. Unlike prior token-level or syntactic steganography, our method avoids explicit word manipulation and instead operates entirely within the latent semantic space, enabling more fluent and less detectable outputs. On the receiver side, the latent representation is recovered from the generated text and decoded back into the original message. As a key theoretical contribution, we provide a robustness guarantee: if the recovered latent vector lies within a bounded distance of the original, exact message reconstruction is ensured, with the bound determined by the decoder’s Lipschitz continuity and the minimum logit margin. This formal result offers a principled view of the reliability–capacity trade-off in latent steganographic systems. Empirical evaluation on both synthetic data and real-world domains such as Amazon reviews shows that our method achieves high message recovery accuracy (above 91%), strong text fluency, and competitive capacity of up to 6 bits per sentence element, while maintaining resilience against neural steganalysis. These findings demonstrate that latent-conditioned generation offers a secure and practical pathway for embedding information in modern LLMs.
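
A toy round trip for the binary-to-latent idea: bits are mapped to a latent vector by a linear code and recovered from a perturbed latent by logit sign, mimicking the margin-based robustness claim. The learned mapping, prefix tuning, and the text channel itself are not modeled here; the random linear encoder stands in for the paper's learned one.

    # Toy bits -> latent -> (noise) -> bits round trip.
    import numpy as np

    rng = np.random.default_rng(1)
    n_bits, dim = 6, 64                   # cited capacity: up to 6 bits
    E = rng.normal(size=(dim, n_bits))    # stand-in for the learned encoder

    def encode(bits):
        return E @ (2 * bits - 1)         # {0,1} -> {-1,+1} -> latent

    def decode(z):
        logits = E.T @ z                  # stand-in linear decoder
        return (logits > 0).astype(int)   # sign -> bits

    bits = rng.integers(0, 2, n_bits)
    z_noisy = encode(bits) + 0.1 * rng.normal(size=dim)  # bounded error
    print("exact recovery:", np.array_equal(decode(z_noisy), bits))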

Keywords: steganography, semantic encoding, language models, prefix tuning, knowledge graphs, natural language generation, latent conditioning, neural steganalysis.

Measuring Uncertainty in Transformer Circuits with Effective Information Consistency

Anatoly Anatolievich Krasnovsky
1103-1119
Abstract:

Mechanistic interpretability has identified functional subgraphs within large language models (LLMs), known as Transformer Circuits (TCs), that appear to implement specific algorithms. Yet we lack a formal, single-pass way to quantify when an active circuit is behaving coherently and thus likely trustworthy. Building on the author’s prior sheaf-theoretic formulation of causal emergence (Krasnovsky, 2025), we specialize it to transformer circuits and introduce the single-pass, dimensionless Effective-Information Consistency Score (EICS). EICS combines (i) a normalized sheaf inconsistency computed from local Jacobians and activations, with (ii) a Gaussian EI proxy for circuit-level causal emergence derived from the same forward state. The construction is white-box, single-pass, and makes units explicit so that the score is dimensionless. We further provide practical guidance on score interpretation, computational overhead (with fast and exact modes), and a toy sanity-check analysis.

Keywords: mechanistic interpretability, transformer circuits, sheaf theory, causal emergence, uncertainty quantification, large language models (LLMs).

A Tool for Rapid Diagnostics of Memory in Neural Network Architectures of Language Models

Pavel Andreevich Gavrikov, Azamat Komiljon ugli Usmanov, Dmitriy Revayev, Sergey Nikolaevich Buzykanov
1346-1367
Abstract:

Large Language Models (LLMs) have evolved from simple n-gram systems to modern universal architectures; however, a key limitation remains the quadratic complexity of the self-attention mechanism with respect to input sequence length. This significantly increases memory consumption and computational costs, and with the emergence of tasks requiring extremely long contexts, creates the need for new architectural solutions. Since evaluating a proposed architecture typically requires long and expensive full-scale training, it is necessary to develop a tool that allows for a rapid preliminary assessment of a model’s internal memory capacity.


This paper presents a method for quantitative evaluation of the internal memory of neural network architectures based on synthetic tests that do not require large data corpora. Internal memory is defined as the amount of information a model can reproduce without direct access to its original inputs.


To validate the approach, a software framework was developed and tested on the GPT-2 and Mamba architectures. The experiments employed copy, inversion, and associative retrieval tasks. Comparison of prediction accuracy, error distribution, and computational cost enables a fast assessment of the efficiency and potential of various LLM architectures.
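
A sketch of the kind of synthetic probes named above: copy, inversion, and associative-retrieval sequences over a small token alphabet. The task names follow the abstract; the exact sequence formats and separator tokens are assumptions.

    # Synthetic memory probes; a model sees the prefix up to the
    # separator/query and is scored on reproducing the expected suffix.
    import random

    VOCAB = list(range(2, 34))            # 0 = SEP token, 1 = QUERY token

    def copy_task(n):
        seq = random.choices(VOCAB, k=n)
        return seq + [0] + seq            # ... SEP ... expected echo

    def inversion_task(n):
        seq = random.choices(VOCAB, k=n)
        return seq + [0] + seq[::-1]      # expected reversed echo

    def assoc_task(n_pairs):
        keys = random.sample(VOCAB, n_pairs)        # unique keys
        vals = random.choices(VOCAB, k=n_pairs)
        pairs = [t for k, v in zip(keys, vals) for t in (k, v)]
        q = random.randrange(n_pairs)
        return pairs + [1, keys[q], vals[q]]        # QUERY key -> value

    print(copy_task(5))

Sweeping accuracy as a function of sequence length then yields the kind of memory-capacity curve such a diagnostic tool is meant to produce.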

Keywords: large language models, neural network architecture, internal memory, long-term information retention, sequence processing, functional memory measurement, architecture comparison.

Development of an Adaptive System for Generating Game Quests and Dialogues Based on Large Language Models

Vsevolod Tarasovich Trofimchuk, Vlada Vladimirovna Kugurakova
953-993
Abstract:

This article addresses the problem of creating dynamic narrative systems for video games with real-time interactivity. It presents the development and testing of a GPT integration component for dialogue generation, which revealed a critical limitation of cloud-based solutions: a 30-second latency unacceptable for gameplay. A hybrid architecture of an adaptive system is proposed, combining LLMs with reinforcement learning mechanisms. Particular attention is given to solving the problems of game-world consistency and managing the long-term context of NPC interactions through a RAG approach. The transition to the Edge AI paradigm, with quantization methods applied to reach a target latency of 200–500 ms, is substantiated. Metrics for evaluating personalization and dynamic content adaptation have been developed.
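
A minimal sketch of the RAG idea for long-term NPC context: embed past interactions, retrieve the most relevant ones, and prepend them to the generation prompt. The embedding function and the downstream LLM call are placeholders, not the system described in the article.

    # Retrieve top-k past interactions by cosine similarity and build
    # an NPC prompt that carries long-term context.
    import numpy as np

    def retrieve(query_vec, memory_vecs, memory_texts, k=3):
        sims = memory_vecs @ query_vec / (
            np.linalg.norm(memory_vecs, axis=1) * np.linalg.norm(query_vec))
        top = np.argsort(-sims)[:k]
        return [memory_texts[i] for i in top]

    def npc_prompt(player_line, embed, memory_vecs, memory_texts):
        context = retrieve(embed(player_line), memory_vecs, memory_texts)
        return ("NPC memory:\n- " + "\n- ".join(context)
                + f"\nPlayer: {player_line}\nNPC:")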

Keywords: video games, large language models, LLM, dialogue generation, quest generation, adaptive quests, procedural content generation, agent behavior, game AI, machine learning in games.

Detection of Hallucinations Based on the Internal States of Large Language Models

Timur Rustemovich Aisin, Tatiana Vyacheslavovna Shamardina
1282-1305
Abstract:

In recent years, large language models (LLMs) have achieved substantial progress in natural language processing tasks and have become key instruments for addressing a wide range of applied and research problems. However, as their scale and capabilities grow, the issue of hallucinations, i.e., the generation of false, unreliable, or nonexistent information presented in a credible manner, has become increasingly acute. Consequently, analyzing the nature of hallucinations and developing methods for their detection has acquired both scientific and practical significance.


This study examines the phenomenon of hallucinations in large language models, reviews their existing classification, and investigates potential causes. Using the Flan-T5 model, we analyze differences in the model’s internal states when generating hallucinations versus correct responses. Based on these discrepancies, we propose two approaches for hallucination detection: one leveraging attention maps and the other utilizing the model’s hidden states. These methods are evaluated on data from HaluEval and Shroom 2024 benchmarks in tasks such as summarization, question answering, paraphrasing, machine translation, and definition generation. Additionally, we assess the transferability of the trained detectors across different hallucination types, in order to evaluate the robustness of the proposed methods.
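
A sketch of the hidden-state detector idea: a linear probe trained on per-response hidden-state features to separate hallucinated from correct outputs. Feature extraction from Flan-T5 is elided; the file names are hypothetical, and X is assumed to hold, for example, mean-pooled decoder hidden states.

    # Linear probe over hidden-state features (hypothetical inputs).
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X = np.load("hidden_states.npy")      # (n_samples, d) features
    y = np.load("labels.npy")             # 1 = hallucination, 0 = correct
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    print("held-out accuracy:", probe.score(X_te, y_te))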

Keywords: large language models, hallucinations, detection, Flan-T5, natural language processing, attention maps, hidden states, HaluEval, Shroom.

AI in Cancer Prevention: a Retrospective Study

Petr Aleksandrovich Philonenko, Vladimir Nikolaevich Kokh, Pavel Dmitrievich Blinov
1253-1266
Abstract:

This study investigates the feasibility of effectively solving population-scale cancer screening problems using artificial intelligence (AI) methods that predict malignant neoplasm risk based on minimal electronic health record (EHR) data – medical diagnosis and service codes. To address the formulated problem, we considered a broad spectrum of modern approaches, including classical machine learning methods, survival analysis, deep learning, and large language models (LLMs). Numerical experiments demonstrated that gradient boosting using survival analysis models as additional predictors possesses the best ability to rank patients by cancer risk level, enabling consideration of both population-level and individual risk factors for malignant neoplasms. Predictors constructed from EHR data include demographic characteristics, healthcare utilization patterns, and clinical markers. This solution was tested in retrospective experiments under the supervision of specialized oncologists. In the retrospective experiment involving more than 1.9 million patients, we established that the risk group captures up to 5.4 times more patients with cancer at the same level of medical examinations. The investigated method represents a scalable solution using exclusively diagnosis and service codes, requiring no specialized infrastructure and integrable into oncological vigilance processes, making it applicable for population-scale cancer screening.
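
The winning recipe, as described, feeds survival-model risk scores to gradient boosting as additional predictors. Below is a minimal sketch of that stacking pattern with mocked inputs; in the study, the risk column comes from survival analysis models fit on EHR-derived features.

    # Gradient boosting with a survival-model risk score as an extra
    # predictor (all inputs mocked for illustration).
    import numpy as np
    from sklearn.ensemble import GradientBoostingClassifier

    rng = np.random.default_rng(0)
    n = 1000
    ehr = rng.normal(size=(n, 8))         # stand-in EHR-derived features
    surv_risk = rng.normal(size=(n, 1))   # stand-in survival risk score
    y = rng.integers(0, 2, n)             # 1 = later cancer diagnosis

    X = np.hstack([ehr, surv_risk])       # survival output as a predictor
    model = GradientBoostingClassifier().fit(X, y)
    risk = model.predict_proba(X)[:, 1]   # rank patients by predicted risk
    screen = np.argsort(-risk)[: n // 100]  # e.g. top-1% screening group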

Keywords: AI in medicine, cancer prevention, retrospective experiments.

Open Archives of Ground-Based Shortwave Ionospheric Radiosounding Data

Andrey Olegovich Schiriy, Alina Alexandrovna Pisarenko
992-1005
Abstract:

Radiosounding of the ionosphere with short-wave signals yields information about processes in the ionospheric plasma and about its structure and state; these data are also extremely important for radio engineering systems operating in the short-wave range. To date, a large amount of experimental data has been accumulated for various geo- and heliophysical, spatial, and temporal conditions. Interest in large volumes of ionospheric radiosounding data is also motivated by the possibility of constructing statistical models with machine learning methods. The paper presents some Internet resources offering ionospheric radiosounding data, shows the prospects for their application, and identifies some problems, such as insufficient documentation of some data formats and the availability of ionograms only as raster images, most of which are also scanned from photographic films.

Keywords: ionosphere, propagation of radio waves, radiosounding, vertical sounding of the ionosphere, ionogram, ionogram processing.

© 2015-2026 Kazan Federal University; Institute of the Information Society