Evaluating cost-sensitive unsolicited bulk email categorization

Publisher

ACM

Abstract

In recent years, Unsolicited Bulk Email (UBE) has become an increasingly important problem, with a large economic impact. In this paper, we discuss cost-sensitive Text Categorization methods for UBE filtering. Specifically, we have evaluated a range of Machine Learning methods for the task (C4.5, Naive Bayes, PART, Support Vector Machines and Rocchio), made cost-sensitive through several methods (Threshold Optimization, Instance Weighting, and MetaCost). We have used the Receiver Operating Characteristic Convex Hull method for the evaluation, which best suits classification problems in which target conditions are not known, as is the case here. Our results do not show a dominant algorithm or a dominant method for making algorithms cost-sensitive, but they are the best reported on the test collection used, and they approach the accuracy of real-world hand-crafted classifiers.
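
As a rough illustration of the evaluation idea named in the abstract, the sketch below computes the convex hull of a set of ROC points and then picks the hull point with the lowest expected cost once a class distribution and misclassification costs are fixed. It is a minimal sketch and not the paper's implementation: the function names, the sample points and the cost values are all assumptions made for illustration, with spam treated as the positive class.

```python
from typing import List, Tuple

# A ROC point is (false positive rate, true positive rate).
Point = Tuple[float, float]


def _cross(o: Point, a: Point, b: Point) -> float:
    """Cross product of vectors o->a and o->b (positive means a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def roc_convex_hull(points: List[Point]) -> List[Point]:
    """Return the upper convex hull of the ROC points, left to right.

    The trivial classifiers (0, 0) and (1, 1) are always included.
    """
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull: List[Point] = []
    for p in pts:
        # Drop the last hull point while it lies on or below the
        # segment joining its predecessor to the new point.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def best_for_conditions(hull: List[Point], p_spam: float,
                        cost_fp: float, cost_fn: float) -> Point:
    """Pick the hull point with minimum expected cost.

    p_spam, cost_fp and cost_fn are hypothetical target conditions:
    the prior of the positive class and the cost of each error type.
    """
    def expected_cost(pt: Point) -> float:
        fpr, tpr = pt
        return (1.0 - p_spam) * cost_fp * fpr + p_spam * cost_fn * (1.0 - tpr)

    return min(hull, key=expected_cost)


# Hypothetical ROC points for four classifier configurations.
sample_points = [(0.1, 0.6), (0.2, 0.5), (0.3, 0.7), (0.6, 0.8)]
sample_hull = roc_convex_hull(sample_points)
```

Only classifiers on the hull can be optimal for some combination of class distribution and costs, which is why the hull is useful when those target conditions are not known in advance. Calling `best_for_conditions(sample_hull, p_spam, cost_fp, cost_fn)` with different values shows how the preferred operating point shifts as false positives become more expensive.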

Bibliographic reference

Gómez Hidalgo, J. M. (2002). Evaluating cost-sensitive unsolicited bulk email categorization. In Proceedings of the 2002 ACM symposium on Applied computing, March 11-14 (pp. 615-620). Madrid: ACM.

Type of document