Article, Open access, Peer-reviewed

Peringkasan Sentimen Esktraktif di Twitter Menggunakan Hybrid TF-IDF dan Cosine Similarity (Extractive Sentiment Summarization on Twitter Using Hybrid TF-IDF and Cosine Similarity)

2016; Gadjah Mada University; Volume: 10; Issue: 2; Language: English

10.22146/ijccs.16625

ISSN

2460-7258

Authors

Devid Haryalesmana Wahid, Azhari Sn

Topic(s)

Data Mining and Machine Learning Applications

Abstract

The use of Twitter by celebrities has become a new trend in impression management strategy. Mining public reaction on social media is a good way to obtain feedback, but extracting it is not a trivial matter. Reading hundreds of tweets while determining their sentiment polarity is time consuming. An extractive sentiment summarization system is needed to address this issue. Previous research generally does not include the sentiment information contained in a tweet as a weight factor; as a result, only general topics of discussion are extracted. This research aims to perform extractive sentiment summarization of both positive and negative sentiment in tweets mentioning an Indonesian celebrity, Agnes Monica, by combining SentiStrength, Hybrid TF-IDF, and Cosine Similarity. SentiStrength is used to obtain a sentiment strength score and to classify each tweet as positive, negative, or neutral. Positive and negative sentiment are then summarized by ranking tweets using Hybrid TF-IDF summarization with the sentiment strength score as an additional weight, and then removing similar tweets using Cosine Similarity. The test results show that the combination of SentiStrength, Hybrid TF-IDF, and Cosine Similarity performs better than Hybrid TF-IDF alone, achieving an average accuracy of 60% and an average F-measure of 62%. This is attributed to the addition of the sentiment score as a weight factor in sentiment summarization.
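
The ranking-and-filtering step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions made for illustration: the Hybrid TF-IDF formulation (term frequency over the whole collection, inverse document frequency per tweet, and a length normalization floor `min_len`), the multiplicative way the sentiment score is combined with the base score, the whitespace tokenizer, and the `sim_threshold` value. The sentiment scores are passed in as plain numbers standing in for SentiStrength output, which is not called here.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: naive lowercase whitespace tokenizer, for illustration only.
    return text.lower().split()


def hybrid_tfidf_scores(tweets, min_len=5):
    """Score each tweet with a Hybrid TF-IDF style weight.

    Assumed formulation: TF is counted over the whole collection,
    IDF treats each tweet as a document, and the tweet score is
    divided by max(min_len, tweet length).
    """
    docs = [tokenize(t) for t in tweets]
    all_words = [w for d in docs for w in d]
    total = len(all_words)
    tf = Counter(all_words)
    df = Counter(w for d in docs for w in set(d))
    n = len(docs)

    def word_weight(w):
        return (tf[w] / total) * math.log2(n / df[w])

    return [sum(word_weight(w) for w in d) / max(min_len, len(d)) for d in docs]


def cosine(a, b):
    # Cosine similarity between bag-of-words count vectors of two texts.
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[w] * cb[w] for w in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0


def summarize(tweets, sentiment, k=3, sim_threshold=0.7):
    """Pick up to k high-scoring tweets, skipping near-duplicates.

    `sentiment` holds one strength score per tweet (hypothetical input,
    e.g. produced by a sentiment tool). Assumption: the score boosts the
    base weight multiplicatively via (1 + |score|).
    """
    base = hybrid_tfidf_scores(tweets)
    combined = [b * (1 + abs(s)) for b, s in zip(base, sentiment)]
    ranked = sorted(range(len(tweets)), key=lambda i: combined[i], reverse=True)
    picked = []
    for i in ranked:
        # Keep a tweet only if it is dissimilar to everything already picked.
        if all(cosine(tweets[i], tweets[j]) < sim_threshold for j in picked):
            picked.append(i)
        if len(picked) == k:
            break
    return [tweets[i] for i in picked]
```

In this sketch, `summarize` would be called once on the tweets classified as positive and once on those classified as negative, so that each polarity gets its own summary. The threshold and the weighting scheme would need tuning against labelled data.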

Reference(s)