Toronto Metropolitan University

Optimum Regularization Parameter C in Support Vector Machine (SVM) Binary Classification

thesis
Posted on 2023-08-29, 16:08; authored by Ishtiaque Ahmed

Support Vector Machines (SVMs) are widely used learning algorithms for data classification. In machine learning algorithms such as the SVM, one or more parameters control the smoothness of the solution and must be tuned to reach the optimum solution. Such parameters are called regularization parameters, and they are critical in building robust and accurate algorithms that avoid both overfitting and underfitting. In SVM, the regularization parameter, denoted by C, weights the training loss of misclassified data. Traditionally, the value of C is first set to one, and if misclassifications are observed after training, C is tuned by the K-fold Cross-Validation (CV) method, which is a time-consuming process. This thesis aims to rigorously analyze the behavior of the C value in SVM. The analysis shows that, for a linearly separable dataset, setting C to one does not always provide the optimum solution; in addition, it shows that there exists a Minimum Acceptance Value (MAV) for C as a function of Separability and Scatteredness (S&S). S&S is a new notion defined in this thesis, inspired by the definition of the Signal-to-Noise Ratio (SNR), and is shown to be a critical parameter in the analysis of SVM classifiers. The study is further extended to linearly non-separable datasets, where it is shown that a lookup table based on the analysis of the bias-variance tradeoff (BVB C-Table) provides the optimum value of C, and that this approach not only outperforms the existing K-fold CV but is also much faster. For example, in a simple binary classification scenario, a typical K-fold cross-validation can take more than two hours, whereas the proposed method requires only a couple of minutes in a Python-based environment. Owing to its efficiency, the proposed method of choosing the regularization parameter enables online binary classification and has potential benefits for One-vs-All and One-vs-One SVM classification.
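For readers unfamiliar with the baseline that the abstract compares against, the following is a minimal sketch of choosing C by K-fold cross-validation for a linear soft-margin SVM. It is not the thesis's method: it implements neither the MAV nor the BVB C-Table. The toy data, the candidate grid `C_GRID`, the plain subgradient-descent trainer, and all function names are assumptions made purely for illustration.

```python
import random


def hinge_objective(w, b, X, y, C):
    """Soft-margin primal objective: 0.5*||w||^2 + C * (sum of hinge losses)."""
    reg = 0.5 * sum(wj * wj for wj in w)
    loss = sum(
        max(0.0, 1.0 - yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b))
        for xi, yi in zip(X, y)
    )
    return reg + C * loss


def train_linear_svm(X, y, C, epochs=200, lr=0.01):
    """Batch subgradient descent on the primal objective (illustrative, not an optimized solver)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        gw = list(w)  # gradient of the 0.5*||w||^2 term
        gb = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # point violates the margin, so its hinge loss is active
                gw = [g - C * yi * xj for g, xj in zip(gw, xi)]
                gb -= C * yi
        w = [wj - lr * g for wj, g in zip(w, gw)]
        b -= lr * gb
    return w, b


def predict(w, b, x):
    """Classify x as +1 or -1 by the sign of the decision function."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0.0 else -1


def k_fold_accuracy(X, y, C, k=5, seed=0):
    """Mean held-out accuracy over k folds for one candidate value of C."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        w, b = train_linear_svm([X[i] for i in train], [y[i] for i in train], C)
        correct = sum(1 for i in held_out if predict(w, b, X[i]) == y[i])
        scores.append(correct / len(held_out))
    return sum(scores) / k


# Hypothetical toy data: two overlapping Gaussian clusters with labels +1 and -1.
rng = random.Random(1)
X, y = [], []
for label, center in ((1, 1.5), (-1, -1.5)):
    for _ in range(40):
        X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        y.append(label)

C_GRID = [0.01, 0.1, 1.0, 10.0]  # assumed candidate values, not taken from the thesis
cv_scores = {C: k_fold_accuracy(X, y, C) for C in C_GRID}
best_C = max(cv_scores, key=cv_scores.get)
print("CV accuracy per C:", cv_scores)
print("selected C:", best_C)
```

Each candidate in the grid requires k separate trainings, so the total cost grows with both the grid size and the number of folds. This repeated retraining is the expense that a precomputed choice of C would avoid.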

Language

English

Degree

  • Master of Applied Science

Program

  • Electrical and Computer Engineering

Granting Institution

Ryerson University

LAC Thesis Type

  • Thesis

Thesis Advisor

Dr. Soosan Beheshti

Year

2021
