Research Output Details

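Publication status: Published
Title: ODE: Ontology-assisted data extraction
Authors: Su, Weifeng; Wang, Jiying; Lochovsky, Frederick H.
Publication date: 2009-06-01
Journal: ACM Transactions on Database Systems
ISSN/eISSN: 0362-5915
Volume: 34, Issue: 2
Abstract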

Online databases respond to a user query with result records encoded in HTML files. Data extraction, which is important for many applications, extracts the records from the HTML files automatically. We present a novel data extraction method, ODE (Ontology-assisted Data Extraction), which automatically extracts the query result records from the HTML pages. ODE first constructs an ontology for a domain according to information matching between the query interfaces and query result pages from different Web sites within the same domain. Then, the constructed domain ontology is used during data extraction to identify the query result section in a query result page and to align and label the data values in the extracted records. The ontology-assisted data extraction method is fully automatic and overcomes many of the deficiencies of current automatic data extraction methods. Experimental results show that ODE is extremely accurate for identifying the query result section in an HTML page, segmenting the query result section into query result records, and aligning and labeling the data values in the query result records. © 2009 ACM.
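The labeling step described in the abstract can be illustrated with a minimal sketch. It assumes a tiny hand-written ontology in which each attribute is reduced to a single value pattern; ODE itself builds the ontology automatically from query interfaces and result pages, and the names `TOY_ONTOLOGY`, `label_value`, and `label_record` are hypothetical, not taken from the paper.

```python
import re

# Hypothetical toy ontology: attribute name -> pattern its values should match.
# ODE constructs its ontology automatically; this one is hand-written for illustration.
TOY_ONTOLOGY = {
    "price": re.compile(r"\$\d+(\.\d{2})?"),
    "isbn": re.compile(r"\d{9}[\dX]|\d{13}"),
    "year": re.compile(r"(19|20)\d{2}"),
}


def label_value(value, ontology=TOY_ONTOLOGY):
    """Return the first attribute whose pattern matches the whole value."""
    text = value.strip()
    for attribute, pattern in ontology.items():
        if pattern.fullmatch(text):
            return attribute
    return "unknown"


def label_record(values, ontology=TOY_ONTOLOGY):
    """Pair every data value of one extracted record with its assigned label."""
    return [(label_value(v, ontology), v) for v in values]


if __name__ == "__main__":
    record = ["Database Systems", "2009", "$12.99", "0131873253"]
    for label, value in label_record(record):
        print(f"{label}: {value}")
```

Values that match no pattern are labeled `unknown` rather than forced into an attribute, which keeps the sketch conservative; a real system would presumably draw on richer evidence than one regular expression per attribute.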

Keywords: Data value alignment; Domain ontology; Label assignment
DOI: 10.1145/1538909.1538914
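Indexed in: SCIE
Language: English
WOS research area: Computer Science
WOS categories: Computer Science, Information Systems; Computer Science, Software Engineering
WOS accession number: WOS:000268472600005
Scopus accession number: 2-s2.0-68549083531
Citation statistics
Times cited: 42 [WOS]
Document type: Journal article
Item identifier: https://repository.uic.edu.cn/handle/39GCC9TT/6645
Collection: Faculty of Science and Technology
Corresponding author: Su, Weifeng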
Author affiliations
1. BNU-HKBU United International College
2. Shenzhen Key Laboratory of Intelligent Media and Speech, PKU-HKUST Shenzhen Hong Kong Institution
3. City University of Hong Kong, Kowloon, Tat Chee Avenue, Hong Kong
4. The Hong Kong University of Science and Technology, Kowloon, Clear Water Bay, Hong Kong
First author affiliation: BNU-HKBU United International College
Corresponding author affiliation: BNU-HKBU United International College
Recommended citation
GB/T 7714
Su, Weifeng, Wang, Jiying, Lochovsky, Frederick H. ODE: Ontology-assisted data extraction[J]. ACM Transactions on Database Systems, 2009, 34(2).
APA Su, Weifeng, Wang, Jiying, & Lochovsky, Frederick H. (2009). ODE: Ontology-assisted data extraction. ACM Transactions on Database Systems, 34(2).
MLA Su, Weifeng, et al. "ODE: Ontology-assisted data extraction." ACM Transactions on Database Systems 34.2 (2009).