Email Classification Research Trends: Review and Open Issues
Personal and business users rely on email as one of their crucial means of communication. The usage and importance of email continue to grow despite the prevalence of alternatives such as electronic messages, mobile applications, and social networks. As the volume of business-critical email grows, so does the need to automate email management for tasks such as spam email classification, phishing email classification, and multi-folder categorization, among others. This study comprehensively reviews articles on email classification published in 2006–2016 by applying methodological decision analysis to five aspects, namely, email classification application areas, datasets used in each application area, feature space utilized in each application area, email classification techniques, and use of performance measures. A total of 98 articles (56 from the Web of Science Core Collection databases and 42 from the Scopus database) are selected. To achieve the objective of the study, a comprehensive review and analysis is conducted to explore the various areas where email classification has been applied. Moreover, the public datasets, feature sets, classification techniques, and performance measures used in each identified application area are examined. This review identifies five application areas of email classification and reports the most widely used datasets, feature sets, classification techniques, and performance measures in each. The extensive use of these popular datasets, feature sets, classification techniques, and performance measures is discussed and justified. The research directions, research challenges, and open issues in the field of email classification are also presented for future researchers.
Email Classification, Spam Detection, Phishing Detection, Multi-Folder Categorization, Machine Learning Techniques