Details of Research Outputs

Status: Published
Title: Variable Selection for Distributed Sparse Regression Under Memory Constraints
Creator: Wang, Haofeng; Jiang, Xuejun; Zhou, Min; Jiang, Jiancheng
Date Issued: 2024-06-01
Source Publication: Communications in Mathematics and Statistics
ISSN: 2194-6701
Volume: 12, Issue: 2, Pages: 307-338
Abstract

This paper studies variable selection by penalized likelihood for distributed sparse regression when the sample size n is large and memory is limited, a problem of growing importance in the big data era. A naive divide-and-conquer approach splits the data into N parts, fits each part on one of N machines, averages the results from all machines, and then reads off the selected variables. However, this approach tends to select more noise variables, and the false discovery rate may not be well controlled. We improve on it with a specially designed weighted average in the aggregation step. Although the alternating direction method of multipliers can also handle massive data, our proposed method substantially reduces the computational burden and achieves lower mean squared error in most cases. Theoretically, we establish asymptotic properties of the resulting estimators for likelihood models with a diverging number of parameters. Under some regularity conditions, we establish oracle properties in the sense that our distributed estimator shares the same asymptotic efficiency as the estimator based on the full sample. Computationally, a distributed penalized likelihood algorithm is proposed to refine the results in the context of general likelihoods. Furthermore, the proposed method is evaluated by simulations and a real example.
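
To make the divide-and-conquer idea in the abstract concrete, the following is a minimal sketch for the linear model with a lasso penalty. It is not the paper's algorithm: the abstract does not state the weights used in the aggregation step, so the vote-based rule in `vote_weighted_dc` is a hypothetical stand-in chosen only to show how a weighted aggregate can suppress noise variables that a plain average lets through. The function names, the penalty level `lam`, the threshold `min_share`, and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator used by the lasso coordinate update."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=100):
    """Lasso by coordinate descent: minimize (1/2n)||y - Xb||^2 + lam*||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.copy()  # equals y - X @ beta while beta is zero
    for _ in range(n_iter):
        for j in range(p):
            resid += X[:, j] * beta[j]      # drop coordinate j from the fit
            rho = X[:, j] @ resid / n
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
            resid -= X[:, j] * beta[j]      # add the updated coordinate back
    return beta

def naive_dc(X, y, n_parts, lam):
    """Naive divide-and-conquer: fit each block locally, then average."""
    blocks = zip(np.array_split(X, n_parts), np.array_split(y, n_parts))
    local = np.array([lasso_cd(Xk, yk, lam) for Xk, yk in blocks])
    return local.mean(axis=0), local

def vote_weighted_dc(local, min_share=0.5):
    """Hypothetical weighted aggregate (an assumption, not the paper's weights):
    keep a coefficient only if at least min_share of the machines selected it,
    and average it over the machines that did."""
    selected = local != 0
    share = selected.mean(axis=0)
    counts = np.maximum(selected.sum(axis=0), 1)
    avg_nonzero = local.sum(axis=0) / counts
    return np.where(share >= min_share, avg_nonzero, 0.0)

# Simulated data (illustrative only): 3 signal variables, 17 noise variables.
rng = np.random.default_rng(0)
n, p = 2000, 20
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]
X = rng.standard_normal((n, p))
y = X @ beta_true + rng.standard_normal(n)

naive, local = naive_dc(X, y, n_parts=10, lam=0.1)
voted = vote_weighted_dc(local)
print("selected by plain average:   ", np.flatnonzero(naive))
print("selected by vote-based weight:", np.flatnonzero(voted))
```

In the plain average, a coefficient is nonzero as soon as a single machine selects it, which is one way to see why noise variables accumulate. Any weighting that discounts rarely selected coefficients counters this; the weights actually proposed in the paper should be taken from the article itself.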

Keyword: 62H12; 62J12; Distributed penalized likelihood algorithm; Distributed sparse regression; Memory constraints; Variable selection
DOI: 10.1007/s40304-022-00291-w
Indexed By: SCIE
Language: English
WOS Research Area: Mathematics
WOS Subject: Mathematics
WOS ID: WOS:000921784400001
Scopus ID: 2-s2.0-85147185498
Citation statistics
Cited Times: 1 [WOS]
Document Type: Journal article
Identifier: http://repository.uic.edu.cn/handle/39GCC9TT/11774
Collection: Beijing Normal-Hong Kong Baptist University
Corresponding Author: Jiang, Xuejun
Affiliation
1. Department of Mathematics, Harbin Institute of Technology, Harbin, China
2. Department of Statistics and Data Science, Southern University of Science and Technology, Shenzhen, China
3. Beijing Normal University-Hong Kong Baptist University United International College, Zhuhai, China
4. Department of Mathematics and Statistics, University of North Carolina at Charlotte, Charlotte, United States
Recommended Citation
GB/T 7714: Wang, Haofeng, Jiang, Xuejun, Zhou, Min, et al. Variable Selection for Distributed Sparse Regression Under Memory Constraints[J]. Communications in Mathematics and Statistics, 2024, 12(2): 307-338.
APA: Wang, Haofeng, Jiang, Xuejun, Zhou, Min, & Jiang, Jiancheng. (2024). Variable Selection for Distributed Sparse Regression Under Memory Constraints. Communications in Mathematics and Statistics, 12(2), 307-338.
MLA: Wang, Haofeng, et al. "Variable Selection for Distributed Sparse Regression Under Memory Constraints." Communications in Mathematics and Statistics 12.2 (2024): 307-338.

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.