Abstract:
This article considers an approach to the intelligent search of complex objects in different types of texts with structural markup, which can be used for Big Data processing. We study two types of input data: relational databases, whose schemas serve as structural markup, and full-text scientific documents containing mathematical expressions (formulae). For such full-text documents we propose additional automated markup to enable formula search. In both cases we use natural language texts, which are semi-structured data, as the data source for building an ontology and for the subsequent search. For relational databases these texts are the comments on tables and the names of table attributes; for scientific documents (articles, monographs, etc.) they are the text content of the marked-up documents.
Keywords:
Big Data, semantic search, semi-structured data, ontology, relational databases, scientific texts, mathematical expression markup.