Title | Automatic hierarchical classification of structured deep web databases |
Creator | Su, Weifeng; Wang, Jiying; Lochovsky, Frederick |
Date Issued | 2006 |
Conference Name | 7th International Conference on Web Information Systems Engineering |
Source Publication | Web Information Systems – WISE 2006 |
Editor | Karl Aberer, Zhiyong Peng, Elke A. Rundensteiner, Yanchun Zhang, Xuhui Li |
ISBN | 3540481052 |
ISSN | 0302-9743 |
Volume | Lecture Notes in Computer Science, vol 4255 |
Pages | 210-221 |
Conference Date | October 23-26, 2006 |
Conference Place | Wuhan, China |
Publication Place | Berlin |
Publisher | Springer |
Abstract | We present a method that automatically classifies structured deep Web databases according to a pre-defined topic hierarchy. We assume that there are some manually classified databases, i.e., training databases, in every node of the topic hierarchy. Each training database is probed using queries constructed from the node titles of the topic hierarchy and the query result counts reported by the database are used to represent the content of the database. Hence, when adding a new database it can be probed by the same set of queries and classified to a node whose training databases are most similar to the new one. Specifically, a support vector machine classifier is trained on each internal node of the topic hierarchy with these training databases and the new database can be classified into the hierarchy top-down level by level. A feature extension method is proposed to create discriminant features. Experiments run on real structured Web databases collected from the Internet show that this classification method is quite accurate. © Springer-Verlag Berlin Heidelberg 2006. |
DOI | 10.1007/11912873_23 |
Indexed By | SCIE ; CPCI-S |
Language | English |
WOS Research Area | Computer Science |
WOS Subject | Computer Science, Artificial Intelligence ; Computer Science, Information Systems ; Computer Science, Theory & Methods |
WOS ID | WOS:000241624200020 |
Scopus ID | 2-s2.0-33845241508 |
Document Type | Conference paper |
Identifier | http://repository.uic.edu.cn/handle/39GCC9TT/6843 |
Collection | Research outside affiliated institution |
Affiliation | 1. Hong Kong University of Science and Technology, Hong Kong; 2. City University, Hong Kong |
Recommended Citation GB/T 7714 | Su, Weifeng, Wang, Jiying, Lochovsky, Frederick. Automatic hierarchical classification of structured deep web databases[C]//Karl Aberer, Zhiyong Peng, Elke A. Rundensteiner, Yanchun Zhang, Xuhui Li. Web Information Systems – WISE 2006. Berlin: Springer, 2006: 210-221. |
Files in This Item: | There are no files associated with this item. |
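
The abstract describes representing each database by the result counts it reports for a fixed set of probe queries, then assigning a new database to the hierarchy top-down, one level at a time. The following is a minimal sketch of that flow, not the authors' implementation. It makes several assumptions that are not in the record: the hierarchy, query list, and counts are invented for illustration; training databases are attached to leaf nodes only; every leaf has at least one training database; and a nearest-centroid rule with cosine similarity stands in for the support vector machine trained at each internal node. The paper's feature extension step is not modelled.

```python
import math

# Hypothetical topic hierarchy: internal node -> child nodes (assumption).
HIERARCHY = {
    "root": ["books", "autos"],
    "books": ["fiction", "science"],
}

# Hypothetical probe queries built from node titles; each count vector
# below holds one reported result count per query, in this order.
QUERIES = ["books", "autos", "fiction", "science"]


def normalize(counts):
    """Scale counts to unit length so database size does not dominate."""
    norm = math.sqrt(sum(c * c for c in counts))
    return [c / norm for c in counts] if norm else list(counts)


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity of two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def leaves_under(node):
    """All leaf nodes in the subtree rooted at node."""
    children = HIERARCHY.get(node)
    if not children:
        return [node]
    return [leaf for child in children for leaf in leaves_under(child)]


def train(training):
    """Build one model per internal node: child -> centroid of the
    training databases found anywhere below that child.
    A stand-in for the per-node SVM mentioned in the abstract."""
    models = {}
    for node, children in HIERARCHY.items():
        models[node] = {
            child: centroid([normalize(v)
                             for leaf in leaves_under(child)
                             for v in training[leaf]])
            for child in children
        }
    return models


def classify(models, counts, node="root"):
    """Descend from the root, picking the most similar child at each level."""
    vec = normalize(counts)
    while node in models:
        node = max(models[node],
                   key=lambda child: cosine(vec, models[node][child]))
    return node


# Invented result counts for manually classified training databases.
training = {
    "autos":   [[2, 900, 1, 3], [5, 700, 0, 2]],
    "fiction": [[800, 3, 600, 20], [500, 1, 450, 10]],
    "science": [[700, 2, 15, 650], [400, 4, 10, 380]],
}
models = train(training)
print(classify(models, [300, 1, 8, 270]))  # new database probed with QUERIES
```

Because each decision is local to one internal node, replacing the centroid rule with a trained classifier per node would leave the top-down loop in `classify` unchanged.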