Research Output Details

Publication status: Published
Title: An automated robust algorithm for clustering multivariate data
Authors
Publication date: 2023-09-01
Journal: Journal of Computational and Applied Mathematics
ISSN/eISSN: 0377-0427
Volume: 429
Abstract

Clustering analysis is widely used in various applications, such as marketing, biology, medical science, finance, data mining, image processing, data analysis and pattern recognition. For instance, clustering can be used to: characterize customer groups based on their purchasing patterns by discovering the distinct groups in the customer base; derive plant and animal taxonomies, categorize genes with similar functionalities and gain insight into structures inherent to populations; identify cancer cells; and detect credit card fraud. The k-means, hierarchical and self-organizing (Kohonen) map algorithms are widely used for clustering. Practice has shown that these algorithms have some significant limitations and drawbacks. This manuscript gives an automated robust algorithm for clustering multivariate data without prior information about the number of clusters. Robust estimates of location and of the covariance matrix are used to define the Mahalanobis distance and the corresponding radius of the clustering algorithm. The algorithm is designed so that it controls both masking and swamping effects. It automatically divides a given data set into a number of clusters. Some properties of the algorithm are demonstrated, which help in finding clusters that accommodate observations with large deviations. A method to avoid the use of a fixed cutoff for determining outliers is discussed. The performance of the proposed algorithm is compared with existing clustering algorithms and robust multiple outlier detection methods.
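As an illustration of the idea of a Mahalanobis distance built from robust estimates, the following is a minimal sketch for two-dimensional data. It is not the authors' algorithm: the estimator used here (coordinate-wise median as an initial centre, then mean and covariance of the closest half of the points) is a simple stand-in chosen for brevity, and the function names, the `keep_fraction` parameter and the fixed `radius` of 3.0 are all assumptions made for this example. The abstract notes that the paper discusses a way to avoid a fixed cutoff; that method is not reproduced here.

```python
import math
from statistics import median


def robust_location_scatter(points, keep_fraction=0.5):
    """Crude robust location/scatter for 2-D data (illustrative stand-in).

    Initial centre: coordinate-wise median. The returned location and
    covariance are the mean and sample covariance of the keep_fraction
    of points closest (Euclidean distance) to that initial centre.
    """
    cx = median(p[0] for p in points)
    cy = median(p[1] for p in points)
    ranked = sorted(points, key=lambda p: math.hypot(p[0] - cx, p[1] - cy))
    core = ranked[: max(3, int(len(points) * keep_fraction))]
    mx = sum(p[0] for p in core) / len(core)
    my = sum(p[1] for p in core) / len(core)
    n = len(core) - 1
    sxx = sum((p[0] - mx) ** 2 for p in core) / n
    syy = sum((p[1] - my) ** 2 for p in core) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in core) / n
    return (mx, my), (sxx, sxy, syy)


def mahalanobis(point, location, scatter):
    """Mahalanobis distance of a 2-D point for a symmetric 2x2 scatter."""
    sxx, sxy, syy = scatter
    det = sxx * syy - sxy ** 2  # must be non-zero (core not collinear)
    dx, dy = point[0] - location[0], point[1] - location[1]
    # d^2 = [dx dy] S^{-1} [dx dy]^T with the 2x2 inverse written out.
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)


# Made-up data: eight points near the origin and two distant points.
points = [
    (0.0, 0.1), (0.2, -0.1), (-0.1, 0.3), (0.3, 0.2),
    (-0.2, -0.2), (0.1, -0.3), (-0.3, 0.0), (0.0, -0.1),
    (10.0, 10.0), (11.0, 9.0),
]

location, scatter = robust_location_scatter(points)
radius = 3.0  # hypothetical fixed radius, for illustration only
flagged = [p for p in points if mahalanobis(p, location, scatter) > radius]
print(flagged)
```

Because the location and covariance are computed only from the central half of the data, the two distant points do not inflate the scatter estimate, which is the mechanism that limits masking in this toy setting.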

Keywords: Clustering; Mahalanobis distance; Multiple regression; Outliers; Robust estimate; Self-organizing maps
DOI: 10.1016/j.cam.2023.115219
Indexed in: SCIE
Language: English
WOS research area: Mathematics
WOS category: Mathematics, Applied
WOS accession number: WOS:000968327700001
Scopus accession number: 2-s2.0-85151265196
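Citation statistics
Times cited: 6 [WOS]
Document type: Journal article
Item identifier: https://repository.uic.edu.cn/handle/39GCC9TT/10821
Collection: Faculty of Science and Technology
Corresponding author: Paul, Chinmoy
Author affiliations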
1. Department of Mathematics and Computing, Indian Institute of Technology Dhanbad, Dhanbad, 826004, India
2. Institute of Actuarial Science and Data Analytics, UCSI University, Kuala Lumpur, 56000, Malaysia
3. Department of Statistics, PDUAM, Eraligool, Karimganj, 788723, India
4. Department of Mathematics and Actuarial Science, The American University in Cairo, New Cairo, 11835, Egypt
5. Department of Statistics and Data Science, Cornell University, 14853-3901, United States
6. Department of Statistics and Data Science, Faculty of Science and Technology, Beijing Normal University-Hong Kong Baptist University United International College, Zhuhai, 519087, China
7. Guangdong Provincial Key Laboratory of Interdisciplinary Research and Application for Data Science, BNU-HKBU United International College, Zhuhai, 519087, China
8. Department of Mathematics, Faculty of Science, Zagazig University, Zagazig, 44519, Egypt
Recommended citation
GB/T 7714
Vishwakarma, Gajendra K., Paul, Chinmoy, Hadi, Ali S., et al. An automated robust algorithm for clustering multivariate data[J]. Journal of Computational and Applied Mathematics, 2023, 429.
APA Vishwakarma, Gajendra K., Paul, Chinmoy, Hadi, Ali S., & Elsawah, A. M. (2023). An automated robust algorithm for clustering multivariate data. Journal of Computational and Applied Mathematics, 429.
MLA Vishwakarma, Gajendra K., et al. "An automated robust algorithm for clustering multivariate data." Journal of Computational and Applied Mathematics 429 (2023).
Files in this item
There are no files associated with this item.
Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.