Title | An improved system for sentence-level novelty detection in textual streams |
Authors | Fu, Xinyu; Ch'ng, Eugene; Aickelin, Uwe; et al. |
Publication Date | 2015 |
Conference Name | 2015 International Conference on Smart and Sustainable City and Big Data, ICSSC 2015 |
Proceedings Title | IET Conference Publications |
Volume | 2015 |
Issue | CP672 |
Pages | 1-6 |
Conference Date | July 26-27, 2015 |
Conference Location | Shanghai |
Abstract | Novelty detection in news events has long been a difficult problem. A number of models perform well on specific data streams, but certain issues remain far from solved, particularly in large data streams from the WWW, where the unpredictability of new terms requires adaptation of the vector space model. We present a novel event detection system based on incremental Term Frequency-Inverse Document Frequency (TF-IDF) weighting combined with Locality Sensitive Hashing (LSH). Our system adapts efficiently and effectively to new terms appearing in the data streams by continually updating the vector space model. In terms of miss probability, our proposed novelty detection framework outperforms a recognised baseline system by approximately 16% when evaluated on a benchmark dataset from Google News. |
Keywords | Big data; First story detection; Locality sensitive hashing; Novelty detection; Text mining |
DOI | 10.2139/ssrn.2828008 |
Language | English |
Scopus Accession Number | 2-s2.0-84964296808 |
Document Type | Conference paper |
Item Identifier | https://repository.uic.edu.cn/handle/39GCC9TT/11008 |
Collection | Individual research output produced outside this institution |
Author Affiliations | 1. International Doctoral Innovation Centre, University of Nottingham, Ningbo, United Kingdom; 2. School of Computer Science, University of Nottingham, United Kingdom |
Recommended Citation (GB/T 7714) | Fu, Xinyu, Ch'ng, Eugene, Aickelin, Uwe, et al. An improved system for sentence-level novelty detection in textual streams[C], 2015: 1-6. |
Files in This Item | There are no files associated with this item. |
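
The abstract describes incremental TF-IDF weighting combined with LSH, but this record gives no implementation details. The following is only a minimal sketch of that general idea, not the authors' system. Everything in it is an assumption made for illustration: the class names `IncrementalTfidf` and `NoveltyDetector`, the smoothed IDF formula, the choice of random-hyperplane LSH (the abstract does not say which LSH family is used), the 8-bit signature, and the 0.5 cosine threshold.

```python
import math
import random
from collections import Counter, defaultdict


class IncrementalTfidf:
    """TF-IDF whose document frequencies are updated as the stream is read."""

    def __init__(self):
        self.n_docs = 0
        self.doc_freq = Counter()

    def update(self, tokens):
        # Count each term at most once per document.
        self.n_docs += 1
        self.doc_freq.update(set(tokens))

    def vector(self, tokens):
        # Assumed smoothed IDF: log((N + 1) / (df + 1)) + 1, always positive.
        tf = Counter(tokens)
        return {
            t: c * (math.log((self.n_docs + 1) / (self.doc_freq[t] + 1)) + 1.0)
            for t, c in tf.items()
        }


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(
        sum(w * w for w in v.values())
    )
    return dot / norm if norm else 0.0


class NoveltyDetector:
    """Hypothetical detector: hash each sentence, compare within its bucket."""

    def __init__(self, n_bits=8, threshold=0.5, seed=0):
        self.tfidf = IncrementalTfidf()
        self.n_bits = n_bits
        self.threshold = threshold
        self.seed = seed
        self.buckets = defaultdict(list)
        self._plane_cache = {}

    def _component(self, bit, term):
        # Hyperplane coordinates are generated lazily per term, so the
        # vocabulary can grow without rebuilding the hash functions.
        key = (bit, term)
        if key not in self._plane_cache:
            rng = random.Random(f"{self.seed}:{bit}:{term}")
            self._plane_cache[key] = rng.gauss(0.0, 1.0)
        return self._plane_cache[key]

    def _signature(self, vec):
        # One bit per hyperplane: which side of the plane the vector lies on.
        return tuple(
            sum(w * self._component(b, t) for t, w in vec.items()) >= 0.0
            for b in range(self.n_bits)
        )

    def is_novel(self, tokens):
        # Update statistics first so unseen terms already have df >= 1.
        self.tfidf.update(tokens)
        vec = self.tfidf.vector(tokens)
        sig = self._signature(vec)
        best = max((cosine(vec, old) for old in self.buckets[sig]), default=0.0)
        self.buckets[sig].append(vec)
        return best < self.threshold
```

In this sketch a sentence is flagged as novel when no previously seen sentence in the same hash bucket reaches the similarity threshold. Stored vectors keep the weights they had when first seen; a fuller system would need a policy for refreshing them as document frequencies drift, and would typically use several hash tables to reduce missed neighbours.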