
Query log mining for detecting polysemy and spam

by: Carlos Castillo, Claudio Corsi, Debora Donato, Paolo Ferragina, Aristides Gionis
(August 2008)



Abstract

Through their interaction with search engines, users provide implicit feedback that can be used to extract useful knowledge and improve the quality of the search process. This feedback is encoded in the form of a query log that consists of a sequence of search actions, which contain information about submitted queries, documents viewed, and documents clicked by the users. In this paper, we propose characterizing documents and queries via the information available within a query log, with the goal of detecting either query polysemy or spam-hosts and spam-queries, i.e., queries that exhibit the undesirable property of showing a higher rate of spam pages in their list of results than other queries. The main contribution of our paper consists of exploiting user feedback and query-log mining to combat spam and identify query polysemy. Our experiments attest to the effectiveness of our approach for the applications we consider.

