Status | Published |
Title | An automated robust algorithm for clustering multivariate data |
Creator | |
Date Issued | 2023-09-01 |
Source Publication | Journal of Computational and Applied Mathematics |
ISSN | 0377-0427 |
Volume | 429 |
Abstract | Clustering analysis is widely used in applications such as marketing, biology, medical science, finance, data mining, image processing, data analysis and pattern recognition. For instance, clustering can be used to characterize customer groups by their purchasing patterns, discovering the distinct groups in a customer base; to derive plant and animal taxonomies, categorize genes with similar functionalities and gain insight into structures inherent to populations; to identify cancer cells; and to detect credit card fraud. The k-means, hierarchical and self-organizing (Kohonen) map methods are widely used clustering algorithms. Practice has shown that these algorithms have significant limitations and drawbacks. This manuscript gives an automated robust algorithm for clustering multivariate data without prior information about the number of clusters. Robust estimates of location and of the covariance matrix are used to define the Mahalanobis distance and the corresponding radius of the clustering algorithm. The algorithm is designed to control both masking and swamping effects. It automatically divides a given data set into a number of clusters. Some properties of the algorithm are demonstrated that help in finding clusters that accommodate observations with large deviations. A method that avoids the use of a fixed cutoff for determining outliers is discussed. The performance of the proposed algorithm is compared with existing clustering algorithms and robust multiple outlier detection methods. |
Keyword | Clustering; Mahalanobis distance; Multiple regression; Outliers; Robust estimate; Self-organizing maps |
DOI | 10.1016/j.cam.2023.115219 |
Indexed By | SCIE |
Language | English |
WOS Research Area | Mathematics |
WOS Subject | Mathematics, Applied |
WOS ID | WOS:000968327700001 |
Scopus ID | 2-s2.0-85151265196 |
Document Type | Journal article |
Identifier | http://repository.uic.edu.cn/handle/39GCC9TT/10821 |
Collection | Faculty of Science and Technology |
Corresponding Author | Paul, Chinmoy |
Affiliation | 1. Department of Mathematics and Computing, Indian Institute of Technology Dhanbad, Dhanbad, 826004, India; 2. Institute of Actuarial Science and Data Analytics, UCSI University, Kuala Lumpur, 56000, Malaysia; 3. Department of Statistics, PDUAM, Eraligool, Karimganj, 788723, India; 4. Department of Mathematics and Actuarial Science, The American University in Cairo, New Cairo, 11835, Egypt; 5. Department of Statistics and Data Science, Cornell University, 14853-3901, United States; 6. Department of Statistics and Data Science, Faculty of Science and Technology, Beijing Normal University-Hong Kong Baptist University United International College, Zhuhai, 519087, China; 7. Guangdong Provincial Key Laboratory of Interdisciplinary Research and Application for Data Science, BNU-HKBU United International College, Zhuhai, 519087, China; 8. Department of Mathematics, Faculty of Science, Zagazig University, Zagazig, 44519, Egypt |
Recommended Citation GB/T 7714 | Vishwakarma, Gajendra K., Paul, Chinmoy, Hadi, Ali S., et al. An automated robust algorithm for clustering multivariate data[J]. Journal of Computational and Applied Mathematics, 2023, 429. |
APA | Vishwakarma, Gajendra K., Paul, Chinmoy, Hadi, Ali S., & Elsawah, A. M. (2023). An automated robust algorithm for clustering multivariate data. Journal of Computational and Applied Mathematics, 429. |
MLA | Vishwakarma, Gajendra K., et al. "An automated robust algorithm for clustering multivariate data". Journal of Computational and Applied Mathematics 429 (2023). |
Files in This Item: | There are no files associated with this item. |
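
The abstract states that robust estimates of location and covariance define the Mahalanobis distance used by the algorithm, and that the design guards against masking. The following is a minimal sketch of that idea only, not the authors' algorithm: it assumes 2-D data, and it swaps in a simple trimming scheme (repeatedly refitting on the closest fraction of points) as a stand-in robust estimator. The function names `mean_and_cov`, `sq_mahalanobis` and `robust_fit`, and the parameters `keep_fraction` and `n_iter`, are hypothetical and chosen for illustration.

```python
from statistics import median


def mean_and_cov(points):
    """Classical sample mean and covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def sq_mahalanobis(point, center, cov):
    """Squared Mahalanobis distance, using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = point[0] - center[0], point[1] - center[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


def robust_fit(points, keep_fraction=0.7, n_iter=10):
    """Illustrative trimmed estimator (an assumption, not the paper's method).

    Start from the coordinate-wise median with an identity covariance, then
    repeatedly refit mean and covariance on the h points closest to the
    current fit.
    """
    h = max(3, int(keep_fraction * len(points)))
    center = (median(x for x, _ in points), median(y for _, y in points))
    cov = (1.0, 0.0, 1.0)  # identity: the first pass ranks by Euclidean distance
    for _ in range(n_iter):
        ranked = sorted(points, key=lambda p: sq_mahalanobis(p, center, cov))
        center, cov = mean_and_cov(ranked[:h])
    return center, cov
```

With a tight group of points plus a few distant ones, distances computed from `mean_and_cov` on the full data tend to stay small for the distant points because they inflate the covariance (masking), whereas distances computed from `robust_fit` separate them clearly. How the radius is chosen without a fixed cutoff is specific to the paper and is not reproduced here.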