GMM-Based Synthetic Samples for Classification of Hyperspectral Images With Limited Training Data
Abstract
The amount of training data required to train a classifier scales with the dimensionality of the feature data. In hyperspectral remote sensing (HSRS), feature data can become very high dimensional, yet the amount of training data is often limited. Thus, one of the core challenges in HSRS is how to perform multiclass classification using only relatively few training data points. In this letter, we address this issue by enriching the feature matrix with synthetically generated sample points. These synthetic data are sampled from a Gaussian mixture model (GMM) fitted to each class of the limited training data. Although the fitted GMM may not perfectly model the true distribution of the features, we demonstrate that a moderate augmentation with these synthetic samples can effectively replace a part of the missing training samples. With this augmentation, the median gain in classification performance is 5% on two datasets. This performance gain is stable under variations in the number of added samples, which makes the method easy to apply in real-world applications.
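The augmentation step described in the abstract (fit one GMM per class, then sample synthetic points from it) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the number of mixture components, the covariance structure, or the classifier, so the function name `augment_with_gmm`, the diagonal covariance, the regularization value, and the toy data are all assumptions. It relies on scikit-learn's `GaussianMixture`.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def augment_with_gmm(X, y, n_synthetic_per_class, n_components=2, seed=0):
    """Return (X_aug, y_aug): the original samples plus GMM-sampled points per class.

    Hypothetical helper; all hyperparameters are illustrative assumptions.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    X_parts, y_parts = [X], [y]
    for label in np.unique(y):
        X_c = X[y == label]
        # Cap the component count so it never exceeds the class sample count.
        k = min(n_components, len(X_c))
        gmm = GaussianMixture(
            n_components=k,
            covariance_type="diag",  # assumption: fewer parameters for few samples
            reg_covar=1e-3,          # assumption: keeps variances away from zero
            random_state=seed,
        ).fit(X_c)
        # sample() returns (points, component indices); only the points are needed.
        X_syn, _ = gmm.sample(n_synthetic_per_class)
        X_parts.append(X_syn)
        y_parts.append(np.full(n_synthetic_per_class, label))
    return np.vstack(X_parts), np.concatenate(y_parts)


# Toy stand-in for a small labeled set: 2 classes, 10 samples each, 5 features.
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(0.0, 1.0, (10, 5)), rng.normal(3.0, 1.0, (10, 5))])
y_train = np.repeat([0, 1], 10)

X_aug, y_aug = augment_with_gmm(X_train, y_train, n_synthetic_per_class=20)
print(X_aug.shape, y_aug.shape)
```

The augmented arrays can then be passed to any classifier in place of the original training set. Because the abstract reports that the gain is stable with respect to the number of added samples, `n_synthetic_per_class` should not need fine tuning, although that claim is the paper's and is not verified by this sketch.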









