Peer-reviewed article

Privacy‐preserving data‐mining through micro‐aggregation for web‐based e‐commerce

2010; Emerald Publishing Limited; Volume: 20; Issue: 3; Language: English

10.1108/10662241011050759

ISSN

2054-5657

Authors

Guillermo Navarro‐Arribas, Vicenç Torra

Topic(s)

Internet Traffic Analysis and Secure E-voting

Abstract

Purpose: The purpose of this paper is to anonymize web server log files used in e‐commerce web mining processes.

Design/methodology/approach: The paper applies statistical disclosure control (SDC) techniques to achieve this goal. More precisely, it introduces the micro‐aggregation of web access logs.

Findings: The experiments show that the proposed technique gives good results in general and performs especially well on relatively small websites.

Research limitations/implications: As with all SDC techniques, there is a trade‐off between privacy and utility or, in other words, between disclosure risk and information loss. The proposal takes this trade‐off into account, providing k‐anonymity while preserving acceptable information accuracy.

Practical implications: Web server logs are a valuable source of information for user profiling and general data‐mining analysis of a website in e‐commerce and e‐services. The proposal allows such logs to be anonymized so that they can be outsourced to other companies for marketing purposes, stored for further analysis, or made publicly available without risking customer privacy.

Originality/value: Current solutions to the problem addressed here are scarce and limited. They usually amount to removing sensitive information from the query strings of URLs. Moreover, to the authors' knowledge, SDC techniques have not previously been applied to the anonymization of web logs.
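
To make the idea of micro‐aggregation concrete, the following is a minimal sketch, not the authors' algorithm. The abstract does not say how log records are represented, grouped, or averaged, so everything here is an assumption for illustration: each user session is reduced to a tuple of numeric attributes (hypothetically, pages viewed and seconds on site), records are ordered by a simple one‐dimensional key, and consecutive groups of at least k records are replaced by their group mean. The function name `microaggregate` is also hypothetical.

```python
from collections import Counter
from statistics import mean


def microaggregate(records, k):
    """Replace each numeric record with the mean of a group of >= k records.

    Illustrative fixed-size micro-aggregation; an assumption, not the
    method used in the paper.
    """
    n = len(records)
    if k < 1 or n < k:
        raise ValueError("need k >= 1 and at least k records")
    dims = len(records[0])

    # Order record indices by a simple one-dimensional key (sum of attributes).
    order = sorted(range(n), key=lambda i: sum(records[i]))

    # Cut the ordering into consecutive groups of size k. The last group
    # absorbs the remainder, so every group holds between k and 2k-1 records.
    n_groups = n // k
    released = [None] * n
    for g in range(n_groups):
        start = g * k
        end = n if g == n_groups - 1 else start + k
        members = order[start:end]
        centroid = tuple(
            mean(records[i][d] for i in members) for d in range(dims)
        )
        # Every member of the group is published as the same centroid.
        for i in members:
            released[i] = centroid
    return released


# Hypothetical sessions: (pages viewed, seconds on site).
sessions = [(3, 40), (4, 55), (20, 600), (22, 640), (5, 60)]
released = microaggregate(sessions, k=2)

# Each published record is shared by at least k sessions.
print(min(Counter(released).values()) >= 2)
```

In this sketch the trade‐off described in the abstract is visible directly: a larger k makes each published record indistinguishable among more sessions (lower disclosure risk) but pulls the centroids further from the original values (higher information loss).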
