Image Classification: Feature Selection, Data Augmentation, and Transfer Learning


Introduction


Computer vision uses computers to evaluate visual input for a range of tasks, one of which is image classification. A common limitation when classifying visual data is the scarcity of labeled images for training models. This project explores three methods for working with limited image data: feature modification and selection, data augmentation, and transfer learning. Where applicable, these methods were applied to convolutional neural networks (CNNs) and support vector machines (SVMs) in experiments aimed at improving image classification accuracy. The project covers two classification domains: medical imaging, using an MRI brain tumor dataset, and emotion recognition, using the FER-2013 facial emotion recognition dataset.

Objective


To explore feature selection, data augmentation, and transfer learning as techniques for improving the image classification accuracy of machine learning and deep learning models (SVM and CNN) on medical and emotion recognition data.


Tools and Technologies


Fig [1]: Tools and Technologies Utilized Throughout the Project

Datasets


All datasets are available on Kaggle:

Fig [2]: MRI Brain Tumor Image Class Distribution
Fig [3]: FER-2013 Image Class Distribution

Experiment Results


The following figures and tables show our experiment results for the SVM and CNN models.

MRI Brain Tumor Dataset

SVM

Fig [4]: MRI - SVM Performance On Raw Pixel Features

Raw pixel features (without PCA): 196608

PCA Variance    # of Features
100%            279
99%             245
97%             206
95%             179
90%             131
80%             77
70%             46
Table [1]: Raw Pixel Data and PCA Features for MRI Dataset
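
The relationship in Table [1] between a retained-variance threshold and a feature count can be illustrated with a minimal NumPy sketch. It is not the project's code: the helper name `components_for_variance` and the synthetic data are assumptions made for illustration, and the project may well have used a library PCA implementation instead.

```python
import numpy as np

def components_for_variance(X, threshold):
    """Smallest number of principal components whose cumulative
    explained-variance ratio reaches `threshold` (0 < threshold <= 1)."""
    X = np.asarray(X, dtype=np.float64)
    Xc = X - X.mean(axis=0)                   # center each feature
    s = np.linalg.svd(Xc, compute_uv=False)   # singular values, descending
    explained = s ** 2                        # proportional to per-component variance
    ratio = np.cumsum(explained) / explained.sum()
    # First index where the cumulative ratio reaches the threshold
    # (small tolerance guards against floating-point round-off at 100%).
    k = int(np.searchsorted(ratio, threshold - 1e-12)) + 1
    return min(k, len(s))

rng = np.random.default_rng(0)
# Hypothetical stand-in for flattened images: 60 samples, 400 "pixels",
# generated from 10 latent factors plus a little noise.
latent = rng.normal(size=(60, 10))
X = latent @ rng.normal(size=(10, 400)) + 0.05 * rng.normal(size=(60, 400))

for t in (0.70, 0.80, 0.90, 0.95, 0.97, 0.99, 1.00):
    print(f"{t:.0%} variance -> {components_for_variance(X, t)} components")
```

The component count can never exceed the smaller of the sample count and the feature count, so a small dataset caps the 100% row well below the raw pixel dimension.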

Fig [5]: MRI - SVM PCA Experiment Accuracy Progress

Raw pixel features (without PCA): 196608

# of features retained after PCA:

PCA Variance    Raw Pixels    LBP (P: 8, R: 1)    LBP (P: 16, R: 2)    LBP (P: 24, R: 3)
100%            279           279                 279                  279
99%             245           272                 273                  273
97%             206           262                 264                  265
95%             179           253                 255                  257
90%             131           232                 235                  239
80%             77            193                 200                  205
70%             46            158                 166                  173
Table [2]: Raw Pixel Data and LBP Handcrafted Features and PCA

Fig [6]: SVM with PCA for LBP (P: 8, R: 1)

Fig [7]: SVM with PCA for LBP (P: 16, R: 2)

Fig [8]: SVM with PCA for LBP (P: 24, R: 3)

Fig [9]: MRI - SVM Experiment Accuracy Progress

CNN

Fig [10]: MRI - CNN Structure

Fig [11]: MRI - CNN Structure
Fig [12]: MRI - DenseNet169 Fine Tuning

Fig [13]: MRI - DenseNet169 Pre and Post Augmentation Confusion Matrix



Fig [14]: MRI - CNN Experiment Accuracy Progress


FER-2013 Facial Dataset

SVM

Fig [15]: FER-2013 - SVM Performance On Raw Pixel Features

Raw pixel features (without PCA): 2304

PCA Variance    # of Features
100%            2304
99%             904
97%             425
95%             256
90%             104
80%             32
70%             13
Table [3]: Raw Pixel Data and PCA Features for FER-2013 Dataset

Fig [16]: FER-2013 - SVM PCA Experiment Accuracy Progress

Raw pixel features (without PCA): 2304

# of features retained after PCA:

PCA Variance    LBP (P: 8, R: 1)    LBP (P: 16, R: 2)    LBP (P: 24, R: 3)
100%            10                  18                   26
99%             8                   13                   18
97%             7                   10                   13
95%             4                   6                    7
86%             2                   2                    2
Table [4]: FER-2013 - Raw Pixel Data and LBP Handcrafted Features and PCA
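
The 100% row of Table [4] (10, 18, and 26 features) equals P + 2 for each setting, which is what a "uniform" LBP histogram produces. Below is a minimal NumPy sketch of that descriptor for P = 8, R = 1 only. It is an illustration, not the project's implementation: it samples the square 3x3 neighborhood rather than interpolating points on a circle as library implementations typically do, and the function names are hypothetical.

```python
import numpy as np

# 8 neighbors of a pixel, listed in circular order (row offset, column offset).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def uniform_lbp_8_1(img):
    """Uniform LBP labels (0..9) for every interior pixel of a 2-D image."""
    img = np.asarray(img, dtype=np.float64)
    h, w = img.shape
    center = img[1:-1, 1:-1]
    # bits[i] is 1 where the i-th neighbor is >= the center pixel.
    bits = np.stack(
        [img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx] >= center for dy, dx in OFFSETS]
    ).astype(np.int64)
    # Number of 0/1 changes when walking once around the circular pattern.
    transitions = np.abs(bits - np.roll(bits, 1, axis=0)).sum(axis=0)
    ones = bits.sum(axis=0)
    # Uniform patterns (at most 2 transitions) are labeled by their count of ones;
    # all other patterns share the single label P + 1 = 9.
    return np.where(transitions <= 2, ones, 9)

def lbp_histogram(img):
    """Normalized 10-bin (P + 2) histogram used as the feature vector."""
    labels = uniform_lbp_8_1(img)
    hist = np.bincount(labels.ravel(), minlength=10).astype(np.float64)
    return hist / hist.sum()

# Hypothetical 48x48 grayscale image standing in for a FER-2013 sample.
image = np.random.default_rng(0).integers(0, 256, size=(48, 48))
print(lbp_histogram(image).shape)
```

Because the histogram discards pixel positions, the descriptor is far more compact than the raw pixel vector, which is why PCA leaves so few components for the LBP columns.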

Fig [17]: SVM with PCA for LBP (P: 8, R: 1)

Fig [18]: SVM with PCA for LBP (P: 16, R: 2)

Fig [19]: SVM with PCA for LBP (P: 24, R: 3)

Fig [20]: FER-2013 - SVM With HoG Accuracy
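
For readers unfamiliar with HoG, the sketch below shows the core idea in plain NumPy: gradient orientations are binned per cell and weighted by gradient magnitude. It is a simplified illustration under stated assumptions (8x8 cells, 9 unsigned orientation bins, hard binning, one global L2 normalization instead of overlapping block normalization), not the configuration used in the experiments, and `simple_hog` is a hypothetical name.

```python
import numpy as np

def simple_hog(img, cell=8, bins=9):
    """Simplified HoG: per-cell orientation histograms, globally L2-normalized."""
    img = np.asarray(img, dtype=np.float64)
    gy, gx = np.gradient(img)                      # derivatives along rows, columns
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 180   # unsigned orientation
    bin_idx = np.minimum((angle / (180 / bins)).astype(int), bins - 1)

    rows, cols = img.shape[0] // cell, img.shape[1] // cell
    feats = np.zeros((rows, cols, bins))
    for r in range(rows):
        for c in range(cols):
            sl = (slice(r * cell, (r + 1) * cell), slice(c * cell, (c + 1) * cell))
            # Each pixel votes for its orientation bin with its gradient magnitude.
            feats[r, c] = np.bincount(
                bin_idx[sl].ravel(), weights=magnitude[sl].ravel(), minlength=bins
            )
    vec = feats.ravel()
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

# A 48x48 image (the FER-2013 resolution) yields 6 * 6 cells * 9 bins = 324 values.
ramp = np.tile(np.arange(48.0), (48, 1))
print(simple_hog(ramp).shape)
```
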
Fig [21]: FER-2013 - The Effect of K in SIFT
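
The source does not define K. One common reading, assumed here, is that K is the size of a visual vocabulary: SIFT descriptors from all training images are clustered into K centers, and each image is then encoded as a K-bin histogram of its nearest centers before being passed to the SVM. The sketch below illustrates that pipeline with a small NumPy k-means. Random 128-dimensional vectors stand in for real SIFT descriptors, and all function names are hypothetical.

```python
import numpy as np

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns the k cluster centers."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members):                 # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers

def bovw_histogram(descriptors, centers):
    """Encode one image's descriptors as a normalized K-bin histogram."""
    dists = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    words = dists.argmin(axis=1)             # nearest visual word per descriptor
    hist = np.bincount(words, minlength=len(centers)).astype(np.float64)
    return hist / hist.sum()

rng = np.random.default_rng(1)
all_desc = rng.random((600, 128))    # stand-in for descriptors pooled over training images
vocab = kmeans(all_desc, k=16)       # K = 16 is an arbitrary illustrative value
image_desc = rng.random((40, 128))   # stand-in for one image's descriptors
feature = bovw_histogram(image_desc, vocab)
print(feature.shape)
```

Under this reading, a larger K gives a finer vocabulary but a longer, sparser feature vector per image.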

Fig [22]: FER-2013 - SVM With SIFT Feature Accuracy
Fig [23]: FER-2013 - SVM Experiment Accuracy Progress

CNN

Fig [24]: FER - CNN Structure



Fig [25]: FER-2013 - ResNet50 With Different Trainable Layers

Fig [26]: FER-2013 - VGG16 With Different Trainable Layers

Fig [27]: MRI - VGG16 Pre Augmentation

Fig [28]: MRI - VGG16 Post Augmentation



Fig [29]: FER-2013 - VGG16 Per-Class Performance Utilizing 48×48 Pixels vs 224×224 Pixels

No Augmentation    Reflection and Rotation    Reflection and Translation
64.11%             62.52%                     64.43%
Table [6]: FER-2013 - CNN Accuracy Per Augmentation Methods
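
As an illustration of the reflection and translation combination in Table [6], here is a minimal NumPy sketch. The flip probability, the maximum shift of 4 pixels, and zero-filling of uncovered pixels are assumptions chosen for the example. The source does not state the parameters actually used, and the experiments may have relied on a framework's built-in augmentation layers instead.

```python
import numpy as np

def reflect(img):
    """Horizontal reflection (mirror left-right)."""
    return img[:, ::-1]

def translate(img, dy, dx):
    """Shift an image by (dy, dx) pixels, filling uncovered pixels with zeros."""
    h, w = img.shape[:2]
    out = np.zeros_like(img)
    src_r = slice(max(0, -dy), min(h, h - dy))
    dst_r = slice(max(0, dy), min(h, h + dy))
    src_c = slice(max(0, -dx), min(w, w - dx))
    dst_c = slice(max(0, dx), min(w, w + dx))
    out[dst_r, dst_c] = img[src_r, src_c]
    return out

def augment(img, rng, max_shift=4):
    """Randomly mirror the image, then shift it by up to max_shift pixels."""
    if rng.random() < 0.5:
        img = reflect(img)
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    return translate(img, int(dy), int(dx))

rng = np.random.default_rng(0)
sample = np.arange(48 * 48).reshape(48, 48)   # hypothetical 48x48 grayscale image
augmented = augment(sample, rng)
print(augmented.shape)
```

Both transforms preserve the image size, so augmented samples can be mixed with the originals in the same training batches.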

Fig [30]: FER-2013 - CNN Experiment Accuracy Progress

Reports



Team



Micheal Trzaskoma

Data Preprocessing
Data Augmentation
CNN Experiments

Hui (Henry) Chen

Data Preprocessing
Data Augmentation
SVM Experiments