Book chapter; National production; Peer-reviewed

Towards Automatic Content Analysis of Rhetorical Structure in Brazilian College Entrance Essays

2021; Springer Science+Business Media; Language: English

10.1007/978-3-030-78270-2_29

ISSN

1611-3349

Authors

Rafael Ferreira Mello, Giuseppe Fiorentino, Péricles Miranda, Hilário Oliveira, Mladen Raković, Dragan Gašević

Topic(s)

Sentiment Analysis and Opinion Mining

Abstract

Essay scorers manually look for the presence of required rhetorical categories to evaluate coherence, which is a time-consuming task. Several attempts to automate the identification of rhetorical categories in essays with machine learning have been reported in the literature. However, existing machine learning algorithms are mostly trained on content features, which can lead to overfitting and hinder model generalizability. Thus, this paper proposed a set of content-independent features to identify rhetorical categories. The best-performing classifier, XGBoost, achieved performance comparable to human annotation and outperformed previous models.
