Research Output Details

Title: Information splitting for big data analytics
Authors
Publication date: 2017-02-23
Conference name: 2016 INTERNATIONAL CONFERENCE ON CYBER-ENABLED DISTRIBUTED COMPUTING AND KNOWLEDGE DISCOVERY PROCEEDINGS - CYBERC 2016
Proceedings title: Proceedings - 2016 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery, CyberC 2016
Pages: 294-302
Conference date: Oct 13-15, 2016
Conference location: Chengdu
Conference country: People's Republic of China
Abstract

Many statistical models require the estimation of unknown (co)variance parameters. The estimates are usually obtained by maximizing a log-likelihood that involves log-determinant terms. In principle, the Newton method requires the observed information (the negative Hessian matrix, that is, the negative second derivative of the log-likelihood) to obtain an accurate maximum likelihood estimator. Using the Fisher information, the expected value of the observed information, instead yields a simpler algorithm than the Newton method: the Fisher scoring algorithm. With the advance of high-throughput technologies in the biological sciences, recommendation systems, and social networks, the sizes of data sets, and of the corresponding statistical models, have suddenly increased by several orders of magnitude. Neither the observed information nor the Fisher information is easy to obtain for these big data sets. This paper introduces an information splitting technique to simplify the computation. After splitting the mean of the observed information and the Fisher information, a simpler approximate Hessian matrix for the log-likelihood can be obtained. This approximate Hessian matrix can significantly reduce computation and makes the linear mixed model applicable to big data sets. The splitting and the resulting simpler formulas depend heavily on matrix algebra transforms and are applicable to large-scale breeding models and genome-wide association analysis.
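The relationship between the three information quantities named in the abstract can be illustrated on a toy problem. The sketch below is not the paper's linear mixed model: it assumes a hypothetical one-parameter model y_i ~ N(0, theta) with a made-up data vector, and all function names are illustrative. It shows that averaging the observed and Fisher information cancels the term that comes from the log-determinant part of the log-likelihood, and that all three update rules reach the same estimate.

```python
# Toy model (an assumption for illustration): y_i ~ N(0, theta), i = 1..n.
# Log-likelihood up to a constant: l(theta) = -n/2 * log(theta) - s / (2 * theta),
# where s = sum of y_i squared. The maximum likelihood estimate is s / n.

def score(theta, n, s):
    """First derivative of the log-likelihood with respect to theta."""
    return -n / (2 * theta) + s / (2 * theta ** 2)

def observed_info(theta, n, s):
    """Negative second derivative; the n-term stems from the log(theta) part."""
    return -n / (2 * theta ** 2) + s / theta ** 3

def fisher_info(theta, n, s):
    """Expected value of the observed information, using E[s] = n * theta."""
    return n / (2 * theta ** 2)

def average_info(theta, n, s):
    """Mean of observed and Fisher information; the n-terms cancel."""
    return s / (2 * theta ** 3)

def iterate(info, theta, n, s, steps=50, tol=1e-12):
    """Repeat theta <- theta + score / info until the step is negligible."""
    for _ in range(steps):
        step = score(theta, n, s) / info(theta, n, s)
        theta += step
        if abs(step) < tol:
            break
    return theta

y = [0.5, -1.2, 0.3, 2.0, -0.7]          # made-up sample data
n, s = len(y), sum(v * v for v in y)
mle = s / n                               # closed-form answer for this toy model
estimates = {
    f.__name__: iterate(f, 0.8 * mle, n, s)
    for f in (observed_info, fisher_info, average_info)
}
```

In this scalar case the averaged information depends on the data only through a quadratic form and no longer involves the term derived from the log-determinant, which is the kind of simplification the abstract attributes to splitting. The starting value 0.8 * mle is chosen close to the solution; the observed and averaged updates are not guaranteed to converge from arbitrary starting points.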

Keywords: Breeding model; Fisher information matrix; Fisher scoring algorithm; Geno-wide-Association; Linear mixed model; Observed information matrix; Variance parameter estimation
DOI: 10.1109/CyberC.2016.64
Indexed by: CPCI-S
Language: English
WOS research area: Computer Science
WOS category: Computer Science, Interdisciplinary Applications; Computer Science, Theory & Methods
WOS accession number: WOS:000401467600052
Scopus ID: 2-s2.0-85015868026
Citation statistics
Times cited: 9 [WOS]
Document type: Conference paper
Item identifier: https://repository.uic.edu.cn/handle/39GCC9TT/11512
Collection: Individual knowledge output outside this institution
Author affiliation
Laboratory of Computational Physics, Institute of Applied Physics and Computational Mathematics, Beijing, P.O. Box 8009, 100088, China
Recommended citation
GB/T 7714
Zhu, Shengxin, Gu, Tongxiang, Xu, Xiaowen, et al. Information splitting for big data analytics[C], 2017: 294-302.