Abstract:
This paper describes our approach to sentiment analysis of tweets within SentiRuEval, an open evaluation of sentiment analysis systems for the Russian language. We took part in the task on Russian tweets concerning two types of organizations: banks and telecommunications companies. On both datasets, participants were required to classify each tweet into one of three classes: positive, negative, or neutral.
We used various statistical methods as the basis for our machine learning algorithms. The classifiers rely on linguistic features produced by our morpho-syntactic analyzer. Syntactic relations proved to be a crucial feature for every statistical method we evaluated, and SVM-based classification performed better than the other methods. Normalized words are another important feature for the algorithm.
The evaluation showed that our method was rather successful: we ranked first on three of the four evaluation measures.
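
As an illustration only, the following Python sketch shows one way the two feature types named above, normalized words and syntactic relations, could be turned into sparse feature counts for a classifier such as an SVM. The `Token` structure, the relation labels, and the feature-name format are assumptions made for this example; they are not the actual output format of our analyzer or the feature set used in the evaluation.

```python
from collections import Counter
from typing import List, NamedTuple


class Token(NamedTuple):
    """One analyzed token (hypothetical analyzer output format)."""
    lemma: str     # normalized word
    head: int      # index of the syntactic head in the tweet, -1 for the root
    relation: str  # label of the dependency linking the token to its head


def extract_features(tokens: List[Token]) -> Counter:
    """Count normalized-word and syntactic-relation features for one tweet."""
    features = Counter()
    for token in tokens:
        # Normalized-word feature.
        features[f"lemma={token.lemma}"] += 1
        # Syntactic-relation feature: label plus the lemmas of head and dependent.
        if token.head >= 0:
            head_lemma = tokens[token.head].lemma
            features[f"rel={token.relation}({head_lemma},{token.lemma})"] += 1
    return features


# Hand-made analysis of a toy tweet ("bank raised fees"); labels are illustrative.
tweet = [
    Token("bank", 1, "subj"),
    Token("raise", -1, "root"),
    Token("fee", 1, "obj"),
]
features = extract_features(tweet)
```

Feature counts of this kind can be mapped to a sparse vector and passed to a linear classifier; the choice of vectorizer and learner is left open here.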