Email spam filtering


Abstract

In recent years, email spam has become an increasingly serious problem with a significant economic impact on society. In this work, we present the problem of spam, how it affects us, and how we can fight it. We discuss the legal, economic, and technical measures used to stop these unsolicited emails. Among the technical measures, those based on content analysis have been particularly effective at filtering spam, so we focus on them and explain in detail how they work. Specifically, we explain the structure and operation of the different Machine Learning methods used for this task, and how they can be made cost-sensitive through methods such as threshold optimization, instance weighting, or MetaCost. We also discuss how to evaluate spam filters using basic metrics, TREC metrics, and the receiver operating characteristic convex hull method, which best suits classification problems in which the target conditions are not known, as is the case here. Finally, we describe how actual filters are used in practice, the methods spammers use to attack spam filters, and what we can expect in the coming years in the battle between spam filters and spammers.


Bibliographic reference

Puertas Sanz, E., Gómez Hidalgo, J. M., & Cortizo Pérez, J. C. (2008). Email spam filtering. Advances in Computers, 74, 45-114.
