Computational Information Retrieval
2003; Language: English
DOI: 10.1007/978-3-642-57410-8_3
ISSN: 2628-8966
Topic(s): Information Retrieval and Search Behavior
Abstract: The main goal of this note is to introduce the notion of collection-dependent “same context words”. Two (or more) words are “same context words” if they occur in the same (or a similar) context across a given text collection. Each word w in the collection is associated with a profile P(w), the set of words occurring in sentences that contain w. We introduce a distance function on the set of profiles and use it to cluster words. Words contained in the same cluster are “same context words”. We select “same context words” for several text collections and briefly discuss possible further applications of the introduced concepts to a number of information-retrieval-related problems.
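The pipeline described in the abstract (profiles, a distance on profiles, clustering) can be illustrated with a minimal sketch. The abstract does not specify the distance function or the clustering method, so the choices here are assumptions made only for illustration: Jaccard distance between profiles, and single-linkage grouping under a hypothetical `threshold` parameter. The function names `build_profiles`, `jaccard_distance`, and `cluster_words` are likewise hypothetical, and the sketch takes the definition of P(w) literally, so w itself belongs to its own profile.

```python
import re
from itertools import combinations


def build_profiles(sentences):
    """Map each word w to P(w): the set of words in sentences containing w."""
    profiles = {}
    for sentence in sentences:
        # Assumption: lowercase alphabetic tokens are the "words".
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        for w in words:
            profiles.setdefault(w, set()).update(words)
    return profiles


def jaccard_distance(a, b):
    """Assumed distance on profiles: 1 - |a & b| / |a | b|."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cluster_words(profiles, threshold):
    """Assumed clustering: link two words when their profile distance is at
    most `threshold`, then return the connected groups (single linkage)."""
    parent = {w: w for w in profiles}

    def find(w):
        # Union-find lookup with path halving.
        while parent[w] != w:
            parent[w] = parent[parent[w]]
            w = parent[w]
        return w

    for u, v in combinations(sorted(profiles), 2):
        if jaccard_distance(profiles[u], profiles[v]) <= threshold:
            parent[find(u)] = find(v)

    clusters = {}
    for w in profiles:
        clusters.setdefault(find(w), set()).add(w)
    return list(clusters.values())
```

For example, calling `build_profiles` on a list of sentences and passing the result to `cluster_words` with a small threshold groups only words whose profiles nearly coincide, while a larger threshold merges words that merely share part of their context. A real collection would also need sentence splitting and probably stop-word handling, neither of which is shown here.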