Details of Research Outputs

Status: Published
Title: Fast multivariate empirical cumulative distribution function with connection to kernel density estimation
Creator: Langrené, Nicolas; Warin, Xavier
Date Issued: 2021-10-01
Source Publication: Computational Statistics and Data Analysis
ISSN: 0167-9473
Volume: 162
Abstract

The problem of efficiently computing empirical cumulative distribution functions (ECDFs) on large multivariate datasets is revisited. Computing an ECDF at one evaluation point requires O(N) operations on a dataset composed of N data points. Therefore, a direct evaluation of ECDFs at N evaluation points requires a quadratic O(N^2) operations, which is prohibitive for large-scale problems. Two fast and exact methods are proposed and compared. The first is based on fast summation in lexicographical order, has O(N log N) complexity, and requires the evaluation points to lie on a regular grid. The second is based on the divide-and-conquer principle, has O(N log(N)^((d-1)∨1)) complexity, and requires the evaluation points to coincide with the input points. The two fast algorithms are described in detail for the general d-dimensional case, and numerical experiments validate their speed and accuracy. In addition, a direct connection between cumulative distribution functions and kernel density estimation (KDE) is established for a large class of kernels. This connection paves the way for fast exact algorithms for multivariate kernel density estimation and kernel regression. Numerical tests with the Laplacian kernel validate the speed and accuracy of the proposed algorithms. A broad range of large-scale multivariate density estimation, cumulative distribution estimation, survival function estimation and regression problems can benefit from the proposed numerical methods.
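
As a rough illustration of the ideas in the abstract, the sketch below contrasts a direct ECDF evaluation with a sort-based one, and shows how a Laplacian kernel sum can be rewritten as two cumulative sums. It is a one-dimensional toy version only, not the authors' d-dimensional algorithms; the function names (`ecdf_naive`, `ecdf_1d_sorted`, `laplacian_kde_1d`) and the kernel normalization K(u) = exp(-|u|/h) / (2h) are assumptions made for this example.

```python
import numpy as np

def ecdf_naive(data, points):
    """Direct ECDF: fraction of data rows that are componentwise <= each point.
    Costs O(N * M * d) for N data rows, M evaluation points, d dimensions."""
    data = np.asarray(data, dtype=float)
    points = np.asarray(points, dtype=float)
    # below[m, n] is True when data row n is <= point m in every coordinate
    below = np.all(data[None, :, :] <= points[:, None, :], axis=2)
    return below.mean(axis=1)

def ecdf_1d_sorted(x):
    """1D ECDF at the input points in O(N log N): after sorting, the number
    of samples <= x[i] is found by binary search."""
    x = np.asarray(x, dtype=float)
    xs = np.sort(x)
    return np.searchsorted(xs, x, side="right") / x.size

def laplacian_kde_1d(x, h):
    """1D Laplacian KDE at the input points using two cumulative sums.
    Assumes K(u) = exp(-|u| / h) / (2h). Splitting |x - x_i| at x gives
      sum_i exp(-|x - x_i| / h)
        = exp(-x/h) * sum_{x_i <= x} exp(x_i/h)
        + exp(x/h)  * sum_{x_i >  x} exp(-x_i/h),
    i.e. a weighted CDF-type sum plus a weighted survival-type sum.
    Note: exp(z) can overflow when the data range is large relative to h."""
    x = np.asarray(x, dtype=float)
    n = x.size
    order = np.argsort(x)
    z = (x[order] - x.mean()) / h          # centering cancels in the products
    left = np.cumsum(np.exp(z))            # sum over sorted i <= j of exp(z_i)
    neg_cum = np.cumsum(np.exp(-z))
    right = neg_cum[-1] - neg_cum          # sum over sorted i > j of exp(-z_i)
    dens_sorted = (np.exp(-z) * left + np.exp(z) * right) / (2.0 * h * n)
    dens = np.empty(n)
    dens[order] = dens_sorted              # back to the original ordering
    return dens
```

After the initial sort, both fast functions need only a single pass (or a binary search per point), whereas the direct versions compare every evaluation point with every data point.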

Keyword: Empirical distribution function ; Fast CDF ; Fast KDE ; Fast kernel summation ; Nonparametric copula estimation ; Survival function
DOI: 10.1016/j.csda.2021.107267
Indexed By: SCIE ; SSCI
Language: English
WOS Research Area: Computer Science ; Mathematics
WOS Subject: Computer Science, Interdisciplinary Applications ; Statistics & Probability
WOS ID: WOS:000656685900002
Scopus ID: 2-s2.0-85106438590
Citation statistics
Cited Times: 15 [WOS]
Document Type: Journal article
Identifier: http://repository.uic.edu.cn/handle/39GCC9TT/9644
Collection: Research outside affiliated institution
Corresponding Author: Langrené, Nicolas
Affiliation
1. CSIRO Data61, Australia
2. EDF Lab, FiME, France
Recommended Citation
GB/T 7714: Langrené, Nicolas, Warin, Xavier. Fast multivariate empirical cumulative distribution function with connection to kernel density estimation[J]. Computational Statistics and Data Analysis, 2021, 162.
APA: Langrené, Nicolas, & Warin, Xavier. (2021). Fast multivariate empirical cumulative distribution function with connection to kernel density estimation. Computational Statistics and Data Analysis, 162.
MLA: Langrené, Nicolas, et al. "Fast multivariate empirical cumulative distribution function with connection to kernel density estimation". Computational Statistics and Data Analysis 162 (2021).
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.