Clustering with density based initialization and Bhattacharyya based merging

Publisher

Tubitak Scientific & Technological Research Council Turkey

Access Rights

info:eu-repo/semantics/openAccess

Abstract

Centroid-based clustering approaches, such as k-means, are relatively fast but inaccurate for arbitrarily shaped clusters. Fuzzy c-means with the Mahalanobis distance can accurately identify clusters if the data set can be modelled by a mixture of Gaussian distributions. However, these methods require the number of clusters a priori, and a bad initialization can cause poor results. Density-based clustering methods, such as DBSCAN, overcome these disadvantages, but they may perform poorly when the data set is imbalanced. This paper proposes a method based on fuzzy clustering, named clustering with density initialization and Bhattacharyya based merging. The initialization is carried out by density estimation with adaptive bandwidth, using the k-Nearest Orthant-Neighbor algorithm to avoid the effects of imbalanced clusters. The local peaks of the point clouds constructed by the k-Nearest Orthant-Neighbor algorithm are used as initial cluster centers for the fuzzy clustering. We use the Bhattacharyya measure and the Jensen inequality to find overlapping Gaussians and merge them into a single cluster. We carried out experiments on a variety of data sets and show that the proposed algorithm has remarkable advantages, especially for imbalanced and arbitrarily shaped data sets.
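
The merging step described in the abstract relies on measuring how strongly two Gaussian components overlap. As a rough illustration only, the sketch below computes the standard closed-form Bhattacharyya distance between two multivariate Gaussians and turns it into a merge decision. The function names, the `threshold` parameter and its default value are assumptions made for this example; the abstract does not state the paper's actual merging criterion, and the sketch does not reproduce the paper's use of the Jensen inequality.

```python
import numpy as np


def bhattacharyya_distance(mu1, cov1, mu2, cov2):
    """Closed-form Bhattacharyya distance between two Gaussians.

    D_B = 1/8 (mu1 - mu2)^T S^{-1} (mu1 - mu2)
          + 1/2 ln( det(S) / sqrt(det(cov1) det(cov2)) ),  S = (cov1 + cov2) / 2
    """
    mu1 = np.asarray(mu1, dtype=float)
    mu2 = np.asarray(mu2, dtype=float)
    cov1 = np.atleast_2d(np.asarray(cov1, dtype=float))
    cov2 = np.atleast_2d(np.asarray(cov2, dtype=float))

    cov = 0.5 * (cov1 + cov2)   # averaged covariance S
    diff = mu1 - mu2

    # Mahalanobis-like term; solve() avoids forming an explicit inverse.
    mahal = diff @ np.linalg.solve(cov, diff)

    # Log-determinants are numerically safer than raw determinants.
    _, logdet = np.linalg.slogdet(cov)
    _, logdet1 = np.linalg.slogdet(cov1)
    _, logdet2 = np.linalg.slogdet(cov2)

    return 0.125 * mahal + 0.5 * (logdet - 0.5 * (logdet1 + logdet2))


def should_merge(mu1, cov1, mu2, cov2, threshold=0.5):
    """Hypothetical merge rule: merge when the Bhattacharyya coefficient
    exp(-D_B) reaches `threshold`. The threshold value is an assumption
    for illustration, not a value taken from the paper."""
    coefficient = np.exp(-bhattacharyya_distance(mu1, cov1, mu2, cov2))
    return bool(coefficient >= threshold)
```

For two unit-covariance Gaussians whose means are 2 apart, the covariance term vanishes and the distance is 4/8 = 0.5, so the coefficient is exp(-0.5), about 0.61. Under the assumed threshold of 0.5 the two components would be merged, while components whose means are 10 apart would not.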

Keywords

Infinite mixture models, density estimation, Jensen inequality, bandwidth selection, optimal number of

Source

Turkish Journal of Electrical Engineering and Computer Sciences

Volume

30

Issue

3
