Search

Digital preservation in Karelian Research Centre of RAS

Andrey Anatolyevich Krizhanovsky, Anatoly Dmitrievich Sorokin, Viktor Alekseevich Lebedev, Elvira Viktorovna Yamsa, Valentina Gennadyevna Starkova, Yulia Andreevna Novikova, Alexander Vladimirovich Chirkov, Natalia Borisovna Krizhanovskaya, Yulia Vasilyevna Chirkova

305-367

Abstract:

Digital resources of the Karelian Research Centre of RAS (KarRC RAS) related to digital libraries, repositories and search systems are described in this paper. These resources are devoted to the gathering, organization and dissemination of scientific and technical information (scientific papers, archival documents) for use in theoretical and applied research. The stages of development of these resources are presented in connection with the history of the Karelian Research Centre of RAS subdivisions (Scientific library, Scientific archives). Future directions for the development of these resources are outlined. This work is licensed under the Creative Commons Attribution-ShareAlike 4.0 license.

Keywords: Karelian Research Centre of RAS, digital preservation, digital library.

Business Process of Library Electronic Catalog Integration and Samara University Repository

Mariya Mishanina, Oksana Petrova

963-969

Abstract: Institutional repositories (IR) help to increase the public value, rating, prestige and visibility of both individual researchers and universities as a whole. Repositories are filled with the institution’s own content and provide access to it for researchers around the world. The number of IRs is growing because university libraries are becoming involved in creating them. Libraries want all materials in the university’s IR to be in demand by users and used in the educational and scientific process. Therefore, in addition to the repository’s own search engine and the ‘Google’ and ‘Yandex’ search indexes, IR resources should appear in the electronic catalog, which brings them as close as possible to the reader. The article describes the business processes that the Samara University library has introduced into its practice of working with electronic resources of the university repository.

Keywords: academic library, institutional repository, business process, electronic resource, electronic publication, electronic edition, information technologies, electronic catalogue, database, open access repository.

Study results for the detection of matching content using citation analysis

Vadim Nikolaevich Gureev, Nikolay Alekseevich Mazov

322-331

Abstract:

Translated plagiarism has spread widely in the scientific world and poses a serious problem because it is difficult to detect automatically. However, some progress has been observed in this area in the last five years. The authors of this paper and a foreign research team from several universities independently proposed an approach to detecting plagiarism based on citation analysis: the initial source of a suspected paper is sought among papers with the same or similar references. The methods developed for detecting illegal use of borrowed text have successfully passed several tests. The report presents the results obtained in the last four years.

Keywords: detection of matching content, translated plagiarism, plagiarism detection, citation analysis, bibliographic database.
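
The citation-based comparison described in this abstract can be pictured as an overlap score between reference lists. The sketch below is only an illustration and is not the authors' method: the Jaccard measure, the function names, the 0.5 threshold and the idea of pre-normalized reference identifiers are all assumptions made for the example.

```python
def reference_overlap(refs_a, refs_b):
    """Jaccard similarity between two collections of reference identifiers.

    Identifiers are assumed to be already normalized (e.g. DOIs), so that
    the same cited work is represented identically in both lists.
    """
    a, b = set(refs_a), set(refs_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def candidate_sources(suspect_refs, corpus, threshold=0.5):
    """Rank papers whose reference lists overlap with the suspect paper's.

    corpus maps a paper id to its list of reference identifiers.
    threshold is a hypothetical cut-off chosen for illustration only.
    """
    scored = [(pid, reference_overlap(suspect_refs, refs))
              for pid, refs in corpus.items()]
    hits = [(pid, score) for pid, score in scored if score >= threshold]
    # Highest overlap first: these are the most likely initial sources.
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

Because reference lists largely survive translation, a score of this kind does not depend on the language of the body text. A real system would still need to resolve differently formatted citations to the same identifier.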

On the Synonym Search Model

Olga Muratovna Ataeva, Vladimir Alekseevich Serebriakov, Natalia Pavlovna Tuchkova

1006-1022

Abstract:

The paper considers the problem of finding the most relevant documents in response to an extended and refined query. It proposes a search model and a text preprocessing mechanism, together with the joint use of a search engine and a neural network model built on the index using word2vec algorithms. This combination generates an extended query with synonyms and refines search results by selecting similar documents in a digital semantic library. The paper investigates the construction of paragraph-based vector representations of documents for the data array of the digital semantic library LibMeta. Each piece of text is labeled; both a whole document and its separate parts can be labeled. To enrich user queries with synonyms, the search model was built with word2vec algorithms using an "indexing first, then training" approach, which covers more information and gives more accurate search results. The model was trained on the library's mathematical content. Examples of training, extended queries, and search quality assessment using training and synonyms are given.

Keywords: search model, word2vec algorithm, synonyms, information query, query extension.
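
Query extension with synonyms, as outlined in this abstract, can be sketched as a nearest-neighbour lookup in a word-vector space. The example below is a minimal illustration under stated assumptions and is not the LibMeta implementation: the toy three-dimensional vectors stand in for vectors a trained word2vec model would supply, and `expand_query`, `top_n` and `min_sim` are hypothetical names and parameters.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def expand_query(terms, vectors, top_n=1, min_sim=0.8):
    """Append to the query the nearest neighbours of each known term.

    vectors maps a word to its embedding; terms missing from the
    vocabulary are kept in the query but not expanded.
    """
    expanded = list(terms)
    for term in terms:
        if term not in vectors:
            continue
        neighbours = sorted(
            ((cosine(vectors[term], vec), word)
             for word, vec in vectors.items() if word != term),
            reverse=True,
        )
        for sim, word in neighbours[:top_n]:
            if sim >= min_sim and word not in expanded:
                expanded.append(word)
    return expanded


# Toy vectors invented for illustration; a trained model would supply real ones.
TOY_VECTORS = {
    "equation": [0.9, 0.1, 0.0],
    "formula": [0.85, 0.2, 0.05],
    "library": [0.0, 0.1, 0.95],
}
```

With these toy vectors, the query `["equation"]` gains the neighbour `"formula"`, while `["library"]` is left unchanged because no other word passes the similarity threshold. The extended query would then be passed to the search engine in place of the original one.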

Semantic analysis of documents in the control system of digital scientific collections

Shamil Makhmutovich Khaydarov

61-85

Abstract: Methods for the semantic parsing of documents in a management system for digital scientific collections, including electronic journals, are proposed. Methods for processing documents containing mathematical formulas and for converting documents from the OpenXML format to the TeX format are considered. A search algorithm for mathematical formulas in collections of documents stored in the OpenXML format is designed. The algorithm is implemented as an online service on the science.tatarstan platform.

Keywords: semantic analysis, publishing systems.
