Details of Research Outputs

Status: Published
Title: ODE: Ontology-assisted data extraction
Creator: Su, Weifeng; Wang, Jiying; Lochovsky, Frederick H.
Date Issued: 2009-06-01
Source Publication: ACM Transactions on Database Systems
ISSN: 0362-5915
Volume: 34, Issue: 2
Abstract

Online databases respond to a user query with result records encoded in HTML files. Data extraction, which is important for many applications, extracts the records from the HTML files automatically. We present a novel data extraction method, ODE (Ontology-assisted Data Extraction), which automatically extracts the query result records from the HTML pages. ODE first constructs an ontology for a domain according to information matching between the query interfaces and query result pages from different Web sites within the same domain. Then, the constructed domain ontology is used during data extraction to identify the query result section in a query result page and to align and label the data values in the extracted records. The ontology-assisted data extraction method is fully automatic and overcomes many of the deficiencies of current automatic data extraction methods. Experimental results show that ODE is extremely accurate for identifying the query result section in an HTML page, segmenting the query result section into query result records, and aligning and labeling the data values in the query result records. © 2009 ACM.
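As a rough illustration of the labeling step mentioned in the abstract, the sketch below assigns attribute labels to the data values of one extracted record by looking them up in a small domain ontology. It is not the ODE algorithm from the paper: the `ONTOLOGY` contents, the book-search domain, and the helper names `label_value` and `label_record` are all assumptions made for this example, and the real method builds its ontology automatically from query interfaces and result pages rather than by hand.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical hand-written ontology for a book-search domain (an assumption,
# not taken from the paper). Each attribute has known values and/or a pattern.
ONTOLOGY = {
    "format": {"values": {"hardcover", "paperback"}, "pattern": None},
    "price": {"values": set(), "pattern": re.compile(r"\$\d+(\.\d{2})?")},
    "year": {"values": set(), "pattern": re.compile(r"(19|20)\d{2}")},
}


def label_value(value: str) -> Optional[str]:
    """Return the first ontology attribute that matches the value, else None."""
    text = value.strip()
    for attribute, concept in ONTOLOGY.items():
        # Match against known instance values, ignoring case.
        if text.lower() in concept["values"]:
            return attribute
        # Otherwise fall back to the attribute's value pattern, if it has one.
        pattern = concept["pattern"]
        if pattern is not None and pattern.fullmatch(text):
            return attribute
    return None


def label_record(values: List[str]) -> List[Tuple[Optional[str], str]]:
    """Pair every data value in a record with its label (None if unknown)."""
    return [(label_value(v), v) for v in values]


if __name__ == "__main__":
    for label, value in label_record(["Paperback", "$49.99", "2009"]):
        print(label, value)
```

Values that match no attribute are left unlabeled (`None`) instead of being forced into a category, which keeps the sketch conservative about what the ontology actually covers.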

Keyword: Data value alignment; Domain ontology; Label assignment
DOI: 10.1145/1538909.1538914
Indexed By: SCIE
Language: English
WOS Research Area: Computer Science
WOS Subject: Computer Science, Information Systems; Computer Science, Software Engineering
WOS ID: WOS:000268472600005
Scopus ID: 2-s2.0-68549083531
Citation statistics
Cited Times: 42 [WOS]
Document Type: Journal article
Identifier: http://repository.uic.edu.cn/handle/39GCC9TT/6645
Collection: Faculty of Science and Technology
Corresponding Author: Su, Weifeng
Affiliation
1. BNU-HKBU United International College
2. Shenzhen Key Laboratory of Intelligent Media and Speech, PKU-HKUST Shenzhen Hong Kong Institution
3. City University of Hong Kong, Kowloon, Tat Chee Avenue, Hong Kong
4. The Hong Kong University of Science and Technology, Kowloon, Clear Water Bay, Hong Kong
First Author Affiliation: Beijing Normal-Hong Kong Baptist University
Corresponding Author Affiliation: Beijing Normal-Hong Kong Baptist University
Recommended Citation
GB/T 7714: Su, Weifeng, Wang, Jiying, Lochovsky, Frederick H. ODE: Ontology-assisted data extraction[J]. ACM Transactions on Database Systems, 2009, 34(2).
APA: Su, Weifeng, Wang, Jiying, & Lochovsky, Frederick H. (2009). ODE: Ontology-assisted data extraction. ACM Transactions on Database Systems, 34(2).
MLA: Su, Weifeng, et al. "ODE: Ontology-assisted data extraction." ACM Transactions on Database Systems 34.2 (2009).
Files in This Item:
There are no files associated with this item.

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.