Peer-reviewed article

Senti-lexicon and improved Naïve Bayes algorithms for sentiment analysis of restaurant reviews

2011; Elsevier BV; Volume: 39; Issue: 5; Language: English

10.1016/j.eswa.2011.11.107

ISSN

1873-6793

Authors

Hanhoon Kang, Seong Joon Yoo, Dongil Han

Topic(s)

Text and Document Classification Technologies

Abstract

Existing senti-lexicons do not sufficiently cover the sentiment words used in restaurant reviews. This work therefore proposes a new senti-lexicon for sentiment analysis of restaurant reviews. When a supervised learning algorithm classifies review documents as positive or negative, the positive classification accuracy tends to be up to approximately 10% higher than the negative classification accuracy. This imbalance lowers the average accuracy when the accuracies of the two classes are averaged. To mitigate this problem, an improved Naïve Bayes algorithm is proposed. Experiments showed that, with this algorithm and unigrams + bigrams as features, the gap between positive and negative accuracy narrowed to 3.6% compared with the original Naïve Bayes, and the 28.5% gap was narrowed compared with SVM. Additionally, using this algorithm with the senti-lexicon improved recall by up to 10.2% and precision by up to 26.2% compared with SVM, and recall by up to 5.6% and precision by up to 1.9% compared with Naïve Bayes.
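
To make the quantities in the abstract concrete, the following is a minimal sketch of a standard multinomial Naïve Bayes classifier over unigram + bigram features, together with the per-class accuracy gap it produces. It is not the improved algorithm or the senti-lexicon proposed in the paper, whose details the abstract does not give. The toy reviews, the whitespace tokenizer, add-one smoothing, and every function and variable name are hypothetical choices made for illustration only, and per-class accuracy is assumed here to mean the fraction of each class's reviews that are classified correctly.

```python
import math
from collections import Counter, defaultdict


def features(text):
    """Return unigram and bigram tokens for a whitespace-tokenized review."""
    words = text.lower().split()
    bigrams = [f"{a}_{b}" for a, b in zip(words, words[1:])]
    return words + bigrams


class NaiveBayes:
    """Standard multinomial Naive Bayes with add-one smoothing (baseline only)."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.feature_counts[label].update(features(doc))
        self.vocab = {f for c in self.feature_counts.values() for f in c}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(self.class_counts[label] / total_docs)
            for f in features(doc):
                if f in self.vocab:  # features unseen in training are ignored
                    score += math.log((counts[f] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def per_class_accuracy(model, docs, labels):
    """Fraction of each class's documents that the model labels correctly."""
    correct, total = Counter(), Counter()
    for doc, label in zip(docs, labels):
        total[label] += 1
        correct[label] += model.predict(doc) == label
    return {label: correct[label] / total[label] for label in total}


# Hypothetical toy data, not taken from the paper.
train_docs = [
    "the food was delicious",
    "great service and tasty food",
    "the food was bland",
    "terrible service and cold food",
]
train_labels = ["pos", "pos", "neg", "neg"]

test_docs = [
    "delicious food and great service",
    "cold food and terrible service",
    "not tasty",
]
test_labels = ["pos", "neg", "neg"]

model = NaiveBayes().fit(train_docs, train_labels)
acc = per_class_accuracy(model, test_docs, test_labels)
gap = abs(acc["pos"] - acc["neg"])
print(acc, gap)
```

On this toy data the negated review "not tasty" is labeled positive because only "tasty" was seen in training, so the negative class ends up less accurate than the positive class. This is the kind of imbalance between the two classes that the abstract describes, shown here on a deliberately tiny example rather than on the paper's data.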
