Support Vector Machines and Kernel Functions for Text Processing

Authors

  • Celso Antonio Alves Kaestner, Informatics Department, Federal University of Technology - Paraná (UTFPR), Av. Sete de Setembro, 3165, Curitiba, Paraná, Brazil, 80.230-901. Phone: +55 (41) 3310-4644. Fax: +55 (41) 3310-4646

DOI:

https://doi.org/10.22456/2175-2745.39702

Abstract

This work presents kernel functions that can be used with the Support Vector Machine (SVM) learning algorithm to solve the automatic text classification task. First, the Vector Space Model for text processing is presented. According to this model, text is represented as a set of vectors in a high-dimensional space; extensions and alternative models are then derived, and some preprocessing procedures are discussed. The SVM learning algorithm, widely used for text classification, is outlined: its decision procedure is obtained as the solution of an optimization problem. The “kernel trick”, which allows the algorithm to be applied to non-linearly separable cases, is presented, along with some kernel functions currently used in text applications. Finally, text classification experiments with the SVM classifier are conducted to illustrate some of the text preprocessing techniques and the kernel functions presented.
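
To make the vector-space and kernel ideas above concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the article's own method: the abstract does not name the specific kernels it studies, so the linear, cosine, polynomial, and Gaussian (RBF) kernels below are simply standard textbook choices. The function names, the whitespace tokenizer, and the default parameter values are all hypothetical.

```python
import math
from collections import Counter


def tf_vector(text):
    """Map a text to a sparse term-frequency vector (bag of words).

    Assumption: naive lowercase + whitespace tokenization, no stemming
    or stop-word removal.
    """
    return Counter(text.lower().split())


def linear_kernel(u, v):
    """Inner product <u, v> of two sparse term-frequency vectors."""
    if len(u) > len(v):
        u, v = v, u  # iterate over the shorter vector
    return sum(count * v[term] for term, count in u.items())


def cosine_kernel(u, v):
    """Linear kernel normalized by vector lengths (cosine similarity)."""
    norm = math.sqrt(linear_kernel(u, u) * linear_kernel(v, v))
    return linear_kernel(u, v) / norm if norm else 0.0


def polynomial_kernel(u, v, degree=2, coef0=1.0):
    """Polynomial kernel (<u, v> + coef0) ** degree."""
    return (linear_kernel(u, v) + coef0) ** degree


def rbf_kernel(u, v, gamma=0.5):
    """Gaussian kernel exp(-gamma * ||u - v||^2), using only inner products."""
    sq_dist = linear_kernel(u, u) - 2 * linear_kernel(u, v) + linear_kernel(v, v)
    return math.exp(-gamma * sq_dist)


# Two toy documents, chosen only for illustration.
doc_a = tf_vector("the cat sat on the mat")
doc_b = tf_vector("the dog sat")

print(linear_kernel(doc_a, doc_b))      # 3
print(polynomial_kernel(doc_a, doc_b))  # 16.0
print(cosine_kernel(doc_a, doc_b))
print(rbf_kernel(doc_a, doc_b))
```

Each kernel is written purely in terms of inner products between document vectors. That is the property the kernel trick relies on: an SVM that accesses its inputs only through such inner products can swap in a different kernel without ever building the higher-dimensional feature space explicitly.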

Author Biography

Celso A. A. Kaestner holds a Ph.D. in Electrical Engineering (Information Systems) from the Federal University of Santa Catarina (1993). He also completed postdoctoral studies at the École de Technologie Supérieure of the University of Québec in Montreal (1999) and at York University in Toronto, Canada (2012-2013). He is currently an Associate Professor in the Informatics Department at the Federal University of Technology - Paraná (UTFPR), working in the following areas: computational intelligence, knowledge discovery and data mining, information retrieval, and text mining.

Published

2013-11-06

How to Cite

Kaestner, C. A. A. (2013). Support Vector Machines and Kernel Functions for Text Processing. Revista De Informática Teórica E Aplicada, 20(3), 130–154. https://doi.org/10.22456/2175-2745.39702