FROM THE EDITORS

This thematic issue of the journal «Электронные библиотеки» (Electronic Libraries) consists of two parts and comprises articles prepared by their authors from material presented at the scientific conference «Научный сервис в сети Интернет» (Scientific Service on the Internet).

The conference took place on 23–28 September 2019 near Novorossiysk. It was organized by the Keldysh Institute of Applied Mathematics of the Russian Academy of Sciences and brought together about 140 participants from cities across Russia, including Moscow, Saint Petersburg, Irkutsk, Kazan, Krasnoyarsk, Novosibirsk, Rostov-on-Don, Tomsk, and others.

The scope of the conference is fairly broad: from digital libraries, bibliographic databases, and scientometrics to various specialized areas in which the internet is used for scientific research.

The first part of the thematic issue appears in No. 3 of the journal «Электронные библиотеки», and the second part in No. 4.

M. M. Gorbunov-Posadov, A. M. Elizarov

Published: 09.05.2020

Progress in DVM-System

Valery Fedorovich Aleksahin, Vladimir Aleksandrovich Bakhtin, Olga Fedorovna Zhukova, Dmitry Aleksandrovich Zakharov, Victor Alekseevich Krukov, Nataliya Victorovna Podderyugina, Olga Antonievna Savitskaya
247-270
Abstract: DVM-system is designed for developing parallel programs for scientific and technical calculations in the C-DVMH and Fortran-DVMH languages. These languages share a single parallel programming model, the DVMH model, and extend the standard C and Fortran languages with parallelism specifications in the form of compiler directives. The DVMH model makes it possible to create efficient parallel programs for heterogeneous computing clusters whose nodes can use accelerators, graphics processors or Intel Xeon Phi coprocessors as computing devices alongside general-purpose multi-core processors. The article presents recently developed features of DVM-system.

Creation of Query Expansion Based on the Subject Domain Thesaurus in the Ontology of Knowledge of the Semantic Library

Olga Muratovna Ataeva, Vladimir Alekseevich Serebriakov, Natalia Pavlovna Tuchkova
271-291
Abstract: Possibilities of query expansion using a subject area thesaurus are discussed. The context defined by the links between thesaurus terms serves both to refine the query and to increase the size of the result set. Query expansion is particularly important for scientific subject areas, where search is based on special terminology. In this case, subject area thesauri must be used to minimize information noise. The proposed approach takes into account the use of similar terminology in different subject areas. Examples using thesauri for individual sections of the equations of mathematical physics and related fields demonstrate the effectiveness of the chosen approach. By linking to concepts of information resources from other areas of knowledge, the expanded query reaches search fields of remote subject areas and various types of data, including text, symbolic, audio and video archives. The research shows that expanding the query on the basis of context semantics improves the quality of search for scientific publications in digital information and increases the effectiveness of interdisciplinary research.
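
The idea of expanding a query along thesaurus term links can be illustrated with a minimal Python sketch. The toy thesaurus, the function name `expand_query` and the `max_depth` parameter are assumptions made for illustration; they are not taken from the article and do not reproduce the authors' ontology-based implementation.

```python
from collections import deque

# Hypothetical toy thesaurus: term -> directly linked terms.
THESAURUS = {
    "heat equation": ["diffusion equation", "parabolic equation"],
    "diffusion equation": ["heat equation"],
    "parabolic equation": ["partial differential equation"],
}

def expand_query(terms, thesaurus, max_depth=1):
    """Return the query terms plus thesaurus terms reachable within max_depth links."""
    expanded = list(terms)
    seen = set(terms)
    queue = deque((term, 0) for term in terms)
    while queue:
        term, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not follow links beyond the allowed depth
        for related in thesaurus.get(term, []):
            if related not in seen:
                seen.add(related)
                expanded.append(related)
                queue.append((related, depth + 1))
    return expanded

narrow = expand_query(["heat equation"], THESAURUS)
wide = expand_query(["heat equation"], THESAURUS, max_depth=2)
```

Limiting the depth is one simple way to trade recall against information noise: a larger `max_depth` pulls in more distant terms.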

Building a Publishing Toolkit for Multimedia Science Journals

Nikolay Valentinovich Borisov, Valentina Valentinovna Zakharkina, Irina Anatiljevna Mbogo, Dmitry E. Prokudin, Pavel Shcherbakov
292-314
Abstract: This article discusses approaches to creating a tool platform for an electronic scientific journal that supports the publication of multimedia materials through a web interface. The problems arising from the need to include multimedia data of different types are described, and a working prototype of a multimedia scientific journal is presented.

Graph Self-Transformation Model Based on the Operation of Changing the End of an Edge

Igor Borisovich Burdonov
315-335
Abstract: We consider a distributed network whose topology is described by an undirected graph. The network can change its own topology using special “commands” issued by its nodes. The work proposes an extremely local atomic transformation acb that changes the end c of the edge ac by “moving” it along the edge cb from vertex c to vertex b. As a result of this operation, the edge ac is removed and the edge ab is added. The transformation is performed by a “command” from the common vertex c of the two adjacent edges ac and cb. It is shown that any tree can be transformed into any other tree with the same set of vertices using only atomic transformations. If the degrees of the tree vertices are bounded by a number d (d ≥ 3), the transformation does not violate this restriction. As an example of the purpose of such a transformation, the problems of maximizing and minimizing the Wiener index of a tree with bounded vertex degree, without changing its set of vertices, are considered. The Wiener index is the sum of pairwise distances between the vertices of a graph. A linear tree (a tree with two leaf vertices) has the maximum Wiener index. For a rooted tree with a minimum Wiener index, its form and a method for calculating the number of vertices in the branches of the neighbors of the root are determined. Two distributed algorithms are proposed: one transforms a tree into a linear tree, the other transforms a linear tree into a tree with a minimum Wiener index. It is proved that both algorithms have complexity no higher than 2n–2, where n is the number of tree vertices. We also consider the transformation of arbitrary undirected graphs, which may contain cycles, multiple edges and loops, without restricting the degree of the vertices. It is shown that any connected graph with n vertices can be transformed into any other connected graph with k vertices and the same number of edges in no more than 2(n+k)–2.
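
The atomic transformation and the Wiener index described in this abstract can be sketched in a few lines of Python. This is only an illustration for simple trees stored as adjacency sets; the function names are assumptions, and the sketch covers neither the distributed algorithms of the article nor graphs with multiple edges and loops.

```python
from collections import deque

def move_edge_end(adj, a, c, b):
    """Atomic transformation acb: replace edge a-c with edge a-b, where c-b is an edge."""
    if a == b or c not in adj[a] or b not in adj[c]:
        raise ValueError("need two distinct adjacent edges a-c and c-b")
    adj[a].remove(c)
    adj[c].remove(a)
    adj[a].add(b)
    adj[b].add(a)

def wiener_index(adj):
    """Sum of pairwise distances, computed with one breadth-first search per vertex."""
    total = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    return total // 2  # every unordered pair was counted twice

# A star with center 0 and leaves 1, 2, 3.
tree = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
before = wiener_index(tree)
move_edge_end(tree, a=1, c=0, b=2)   # edge 1-0 becomes edge 1-2
after = wiener_index(tree)
```

In this small case a single transformation turns the star into the linear tree 1-2-0-3, and the Wiener index grows from 9 to 10, in line with the statement that a linear tree has the maximum index.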

Basic Services of the Metadata Factory of the Digital Mathematical Library Lobachevskii-DML

Polina Gafurova, Alexander Elizarov, Evgeny Konstantinovich Lipachev
336-381
Abstract: A number of problems related to the construction of the metadata factory of the digital mathematical library Lobachevskii-DML have been solved. By metadata factory we mean a system of interconnected software tools for creating, processing, storing and managing metadata of digital library objects, which allows the created electronic collections to be integrated into aggregating digital scientific libraries. In order to select the most suitable of the existing software tools and to modernize them: we discussed the features of the presentation of the metadata of documents of various electronic collections, related both to the formats used and to changes in the composition and completeness of the metadata set over the entire publication history of the corresponding scientific journal; we presented and characterized software tools for managing scientific content and methods for organizing automated integration of repositories of mathematical documents with other information systems; we discussed such an important function of the digital library metadata factory as the normalization of metadata in accordance with the formats of other aggregating libraries. As a result of the development of the metadata factory of the digital mathematical library Lobachevskii-DML, we proposed a system of services for the automated generation of metadata for electronic mathematical collections; we developed an XML metadata presentation language based on the Journal Archiving and Interchange Tag Suite (NISO JATS); we created software tools for normalizing metadata of electronic collections of scientific documents in formats developed by international organizations that aggregate resources in mathematics and computer science; we developed an algorithm for converting metadata to the oai_dc format and generating the archive structure for import into the DSpace digital repository; we proposed and implemented methods for integrating electronic mathematical collections of Kazan University into domestic and foreign digital mathematical libraries.
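
As a rough illustration of what a conversion to the oai_dc format involves, the following Python sketch builds a Dublin Core record from a small dictionary. The input structure, the function name `to_oai_dc` and the choice of fields are assumptions made for this example; the sketch does not reproduce the authors' algorithm or the DSpace archive structure.

```python
import xml.etree.ElementTree as ET

# Standard namespaces of the OAI-PMH Dublin Core format.
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)

def to_oai_dc(record, fields=("title", "creator", "date")):
    """Serialize a dict of metadata (hypothetical input layout) as an oai_dc record."""
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for field in fields:
        values = record.get(field, [])
        if isinstance(values, str):
            values = [values]  # allow a single value or a list of values
        for value in values:
            ET.SubElement(root, f"{{{DC_NS}}}{field}").text = value
    return ET.tostring(root, encoding="unicode")

# Sample record assembled for illustration from this issue's listing.
sample = {
    "title": "Basic Services of the Metadata Factory",
    "creator": ["Polina Gafurova", "Alexander Elizarov", "Evgeny Konstantinovich Lipachev"],
    "date": "2020",
}
xml_text = to_oai_dc(sample)
```

Repeatable elements such as `dc:creator` are emitted once per value, which is how Dublin Core represents multiple authors.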

Russian Scientific Publication — 2019

Mikhail Mikhailovich Gorbunov-Posadov
382-389
Abstract: The article presents the events that took place last year in the world of Russian scientific publications. Some academic journals that switched to open access in 2018 are slowly sliding back toward paid access. The European Union has announced Plan S for the mass transition of scientific journals to open access. New models of scientific publication are being introduced. The reporting on publications requested by the Ministry of Education and Science in 2019 does not take into account the size of an article's readership. Neither the Ministry of Education and Science nor the Higher Attestation Commission (HAC) encourages publication in the public domain. The Russian Science Citation Index has begun to fight the widespread fraudulent trade in references to articles, but the HAC is not interested in this activity. The contradictory term "self-plagiarism" has come into wide use. This label is widely used to stigmatize authors and journals for repeated publications.

Building Subject Domain Ontology on the Base of a Logical Data Model

Alexander M. Gusenkov, Naille R. Bukharaev, Evgeny V. Biryaltsev
390-417
Abstract: The technology of automated construction of a subject domain ontology, based on information extracted from the comments in the relational databases of the TATNEFT oil company, is considered. The technology is based on building a converter (compiler) that translates the Epicenter logical data model of the Petrotechnical Open Software Corporation (POSC), presented in the form of ER diagrams and a set of descriptions in the EXPRESS object-oriented language, into the OWL ontology description language recommended by the W3C consortium. The basic syntactic and semantic aspects of the transformation are described.

Digital 3D-Objects Visualization in Forming Virtual Exhibitions

Nikolay Evgenvich Kalenov, Sergey Alexandrovich Kirillov, Irina Nikolaevna Sobolevskaya, Aleksandr Nikolaevich Sotnikov
418-432
Abstract: The paper presents approaches to creating realistic interactive 3D web collections of museum exhibits. The representation of 3D models of objects based on oriented polygonal structures is considered. A method of creating a virtual collection of 3D models using interactive animation technology is described. It is also shown how a full-fledged 3D model is constructed from individual exposure frames using photogrammetry methods. The paper assesses the computational complexity of constructing realistic 3D models. To create 3D models that can be provided to a wide range of users via the Internet, the so-called interactive animation technology is used. The paper presents the differences between full-fledged 3D models and 3D models presented in the form of interactive animation. The paper also describes the technology of creating 3D models of objects from the holdings of the K. A. Timiryazev State Biological Museum and of forming on their basis, in the digital library “Scientific Heritage of Russia”, a virtual exhibition dedicated to the scientific activities of M.M. Gerasimov and his anthropological reconstructions; the exhibition vividly demonstrates the possibility of integrating information resources by means of an electronic library. The format of virtual exhibitions makes it possible to combine the resources of partners to provide a wide range of users with collections held by museums, archives and libraries.

Formalization of Processes for Forming User Collections in the Digital Space of Scientific Knowledge

Nikolay Evgenvich Kalenov, Irina Nikolaevna Sobolevskaya, Aleksandr Nikolaevich Sotnikov
433-450
Abstract: The task of forming a digital space of scientific knowledge (DSSK) is analyzed in the paper. The difference of this concept from the general concept of the information space is considered. DSSK is presented as a set containing objects verified by the world scientific community. The form of a structured representation of the digital knowledge space is a semantic network, the basic organization principle of which is based on the classification system of objects and the subsequent construction of their hierarchy, in particular, according to the principle of inheritance. The classification of the objects that make up the content of the DSSK is introduced. A model of the central data collection system is proposed as a collection of disjoint sets containing digital images of real objects and their characteristics, which ensure the selection and visualization of objects in accordance with multi-aspect user requests. The concept of a user collection is defined, and a hierarchical classification of types of user collections is proposed. The use of the concepts of set theory in the construction of DSSK allows you to break down information into levels of detail and formalize the algorithms for processing user queries, which is illustrated by specific examples.

Audiovisual Recording of Synchronous Lessons During Full-Time and Distance Learning

Felix Osvaldovich Kasparinsky
451-472
Abstract: The modern information environment provides unprecedented opportunities for combining high-tech and high-touch learning approaches. It can be expected that in the near future the general trend will be the use of audiovisual recordings of synchronous classes for subsequent consolidation, repetition, control, generalization and systematization of knowledge. The article summarizes the results of 10 years of experience in creating and using audiovisual recordings of full-time and distance learning in university and school classrooms.

Investigation of Data Dependencies by Dynamic Analysis of SAPFOR

Nikita Andreevich Kataev, Alexander Andreevich Smirnov, Andrey Dmitrievich Zhukov
473-493
Abstract: The use of pointers and indirect memory accesses in a program, as well as complex control flow, are among the main weaknesses of static program analysis. The program properties determined by this analysis are too conservative to describe program behavior accurately, and hence they prevent parallel execution of the program. Dynamic analysis makes it possible to expand the capabilities of semi-automatic parallelization. In the SAPFOR system (System FOR Automated Parallelization), a dynamic analysis tool has been implemented, based on the instrumentation of the LLVM representation of an analyzed program, which allows the system to explore programs in both the C and Fortran programming languages. The capabilities of the static analysis implemented in SAPFOR are used to reduce the overhead of program execution while maintaining the completeness of the analysis. Static analysis makes it possible to reduce the number of analyzed memory accesses and to ignore scalar variables that can be explored statically. The developed tool was tested on performance tests from the NAS Parallel Benchmarks package for the C and Fortran languages. In addition to the traditional types of data dependencies (flow, anti, output), the implementation of dynamic analysis allows us to determine privatizable variables and the possibility of pipeline execution of loops. Together with the capabilities of DVM and OpenMP, this greatly facilitates program parallelization and simplifies insertion of the appropriate compiler directives.
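
How a trace of memory accesses reveals loop-carried dependencies and privatizable variables can be shown with a small Python sketch. It works on a hand-written trace of `(iteration, variable, kind)` events rather than on instrumented LLVM code, and the function name and trace layout are assumptions for illustration; it is not the SAPFOR implementation and it ignores whether a variable is still needed after the loop.

```python
from collections import defaultdict

def analyze_trace(trace):
    """Classify loop-carried dependencies from (iteration, variable, 'r' or 'w') events."""
    last_write = {}                  # variable -> iteration of its most recent write
    readers = defaultdict(set)       # variable -> iterations that read it since that write
    written, exposed, deps = set(), set(), set()
    for iteration, var, kind in trace:
        prev = last_write.get(var)
        if kind == "r":
            if prev != iteration:
                exposed.add(var)     # the value read comes from outside this iteration
                if prev is not None:
                    deps.add((var, "flow"))
            readers[var].add(iteration)
        else:
            if readers[var] - {iteration}:
                deps.add((var, "anti"))
            if prev is not None and prev != iteration:
                deps.add((var, "output"))
            readers[var].clear()
            last_write[var] = iteration
            written.add(var)
    # Written variables whose every read follows a write in the same iteration.
    return deps, written - exposed

# Trace of the loop body:  t = a[i] * 2;  s = s + t
trace = []
for i in range(3):
    trace += [(i, f"a[{i}]", "r"), (i, "t", "w"),
              (i, "s", "r"), (i, "t", "r"), (i, "s", "w")]

deps, privatizable = analyze_trace(trace)
```

Here `t` shows only anti and output dependencies and is reported as privatizable, while `s` carries a flow dependency between iterations (it is a reduction variable), so the loop cannot be parallelized by privatization alone.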

Leveraging Semantic Markups for Incorporating External Resources Data to the Content of a Web Page

Evgeny L’vovich Kitaev, Rimma Yuryevna Skornyakova
494-513
Abstract: The semantic markups of the World Wide Web have accumulated a large amount of data, and their number continues to grow. However, the potential of these data is, in our opinion, not fully utilized. The contents of semantic markups are widely used by search systems and partly by social networks, but the usual approach of application developers to using these data is based on converting them to the RDF standard and executing SPARQL queries, which requires good knowledge of this language and programming skills. In this paper, we propose to leverage the semantic markups available on the Web to automatically incorporate their contents into the content of other web pages. We also present a software tool for implementing such incorporation that does not require a web page developer to know any programming languages other than HTML and CSS. The developed tool does not require installation; the work is performed by JavaScript plugins. Currently, the tool supports semantic data contained in the popular types of semantic markup “microdata” and JSON-LD, in the tags of HTML documents, and in the properties of Word and PDF documents.
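
To make the notion of reading semantic markup from a page concrete, here is a minimal Python sketch that extracts JSON-LD blocks from HTML with the standard library. The class name and the sample page are assumptions for illustration; the tool described in the abstract is implemented as JavaScript plugins and is not reproduced here.

```python
import json
from html.parser import HTMLParser

class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> elements."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.items.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False

# Hypothetical page carrying one JSON-LD description.
PAGE = """<html><head><script type="application/ld+json">
{"@context": "https://schema.org", "@type": "ScholarlyArticle", "name": "Example article"}
</script><script>var ignored = 1;</script></head><body></body></html>"""

extractor = JsonLdExtractor()
extractor.feed(PAGE)
first = extractor.items[0]
```

The extracted dictionaries could then be rendered into another page; microdata and document properties would need separate handling.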

Determining the Thematic Proximity of Scientific Journals and Conferences Using Big Data Technologies

Alexander Sergeevich Kozitsin, Sergey Alexandrovich Afonin, Dmitiy Alekseevich Shachnev
514-525
Abstract: The number of scientific journals published in the world is very large. It is therefore necessary to create software tools for analyzing the thematic links between journals. The algorithm presented in this paper uses co-authorship graphs to analyze the thematic proximity of journals. It is insensitive to the language of the journal and can find similar journals in different languages, a task that is difficult for algorithms based on the analysis of full-text information. The algorithm was tested in the scientometric system IAS ISTINA. Using a special interface, a user can select a journal of interest, and the system automatically generates a selection of journals that may also be of interest to the user. In the future, the developed algorithm can be adapted to search for similar conferences, collections of publications and research projects. The use of such tools will increase the publication activity of young employees, the citation of articles and cross-citation between journals. In addition, the results of the algorithm for determining thematic proximity between journals, collections, conferences and research projects can be used to build rules in ontology models for access control systems.
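
A language-independent proximity measure of this kind can be illustrated with a deliberately simple Python sketch. It scores journals by the Jaccard similarity of their author sets, which is a stand-in technique chosen for brevity and not the co-authorship graph algorithm of the article; the journal names and author labels are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def similar_journals(target, journal_authors):
    """Rank other journals by author overlap with the target, most similar first."""
    scores = [
        (jaccard(journal_authors[target], authors), name)
        for name, authors in journal_authors.items()
        if name != target
    ]
    ranked = sorted(scores, key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in ranked if score > 0]

# Hypothetical data: journal -> set of author identifiers.
JOURNAL_AUTHORS = {
    "J1": {"A", "B", "C"},
    "J2": {"B", "C", "D"},
    "J3": {"E"},
    "J4": {"A", "X", "Y", "Z"},
}
recommended = similar_journals("J1", JOURNAL_AUTHORS)
```

Because only author identities are compared, the text language of the journals plays no role, which mirrors the property emphasized in the abstract.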

Strong and Weak Relations in the Academic Web

Andrey Anatolievich Pechnikov
526-542
Abstract: The web graph is the most popular model of real Web fragments used in Web science. The study of communities in the web graph contributes to a better understanding of the organization of a fragment of the Web and of the processes occurring in it. We propose to extract from the web graph a communication graph containing only those vertices (and the arcs between them) that are joined by arcs in both directions, and to investigate the problem of partitioning it into communities. By analogy with social studies, the links realized through edges of the communication graph are called "strong" and all others "weak". Thematic communities with meaningful interpretations are built on strong links. At the same time, weak links facilitate communication between sites that have no common features in field of activity, geography, subordination, etc., and largely preserve the connectivity of the fragments of the Web even in the absence of strong links. Experiments conducted on a fragment of the scientific and educational Web of Russia show that the results can be meaningfully interpreted and that the approach is promising.
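
The split into strong and weak links can be sketched directly from the definition: an arc is strong when its counter arc is also present. The following Python sketch assumes the web graph is given as a list of `(source, target)` site pairs; the function names and the sample sites are hypothetical, and community detection itself is not shown.

```python
def strong_edges(arcs):
    """Undirected edges of the communication graph: pairs linked in both directions."""
    arc_set = set(arcs)
    return {frozenset((u, v)) for u, v in arc_set if u != v and (v, u) in arc_set}

def weak_arcs(arcs):
    """Arcs that have no counter arc."""
    arc_set = set(arcs)
    return {(u, v) for u, v in arc_set if (v, u) not in arc_set}

# Hypothetical hyperlinks between four sites.
ARCS = [("a", "b"), ("b", "a"), ("a", "c"), ("c", "d"), ("d", "c"), ("b", "d")]
strong = strong_edges(ARCS)
weak = weak_arcs(ARCS)
```

In this example the strong links form two separate pairs, a-b and c-d, and only the weak arcs a→c and b→d keep the fragment connected.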

RSCI as a Mirror of Publication Activity of RAE Members

Yuri Evgenievich Polyak
543-562
Abstract: Based on information from open sources, a table was compiled reflecting the indicators of 128 full members of the Russian Academy of Education (RAE) in the Russian Science Citation Index (RSCI). The main results are given in a condensed form and compared with the results of a similar study performed several years earlier. The conclusions and features of the RSCI as an analytical tool are discussed.