The objective of this project was to develop techniques for
building highly scalable and effective metasearch engines.
A metasearch engine interacts with multiple local search
engines so that users can search them all with a single
query.
We study two types of metasearch engines. The first type combines
multiple document search engines and will be called document
metasearch engines. The second type combines multiple database
driven search engines and will be called database metasearch
engines. For both types of metasearch engines, the issues
that we study include how to discover and classify search engines,
how to build wrappers for search engines, how to identify potentially
useful local search engines for each user query, and how to merge
the results from multiple local search engines. For database
metasearch engines, we also study how to integrate the search
interfaces of multiple search engines into a unified interface
and how to annotate the retrieved results (see
http://www.cs.binghamton.edu/~meng/DMSE.html
for the homepage of our Web Database Metasearch
Project).
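The two query-time issues named above, identifying useful local search engines for a query and merging their results, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy in-memory engines, the function names, the term-overlap selection rule, and the normalized-score merging rule are simple stand-ins, not the methods developed in this project or described in the publications below.

```python
from collections import defaultdict


def make_engine(index):
    """Build a toy local search engine over a {doc_id: text} index (hypothetical)."""
    def search(query):
        terms = query.lower().split()
        hits = []
        for doc_id, text in index.items():
            words = text.lower().split()
            score = sum(words.count(t) for t in terms)  # raw term-frequency score
            if score > 0:
                hits.append((doc_id, score))
        # Highest score first; ties broken by document id for determinism.
        return sorted(hits, key=lambda h: (-h[1], h[0]))
    return search


def select_engines(query, summaries, k):
    """Pick up to k engines whose vocabulary summary shares terms with the query."""
    terms = set(query.lower().split())
    ranked = sorted(summaries, key=lambda name: (-len(terms & summaries[name]), name))
    return [name for name in ranked[:k] if terms & summaries[name]]


def merge_results(result_lists):
    """Normalize each engine's scores by its top score, then sum per document."""
    merged = defaultdict(float)
    for results in result_lists:
        if not results:
            continue
        top = max(score for _, score in results)
        for doc_id, score in results:
            merged[doc_id] += score / top
    return sorted(merged.items(), key=lambda h: (-h[1], h[0]))


def metasearch(query, engines, summaries, k=2):
    """Select engines, forward the query to each, and merge the returned lists."""
    chosen = select_engines(query, summaries, k)
    return merge_results([engines[name](query) for name in chosen])


# Toy data: three local engines, one document ("n1") indexed by two of them.
indexes = {
    "news": {"n1": "metasearch engine launches", "n2": "stock market news"},
    "papers": {"p1": "scalable metasearch engine design", "n1": "metasearch engine launches"},
    "recipes": {"r1": "chocolate cake recipe"},
}
engines = {name: make_engine(idx) for name, idx in indexes.items()}
summaries = {
    name: {w for text in idx.values() for w in text.lower().split()}
    for name, idx in indexes.items()
}

if __name__ == "__main__":
    print(metasearch("metasearch engine", engines, summaries))
```

In this sketch the "recipes" engine is never contacted for the query "metasearch engine", which is the point of engine selection when the number of local engines is large, and the document returned by two engines is ranked above the one returned by a single engine.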
For document metasearch engines, our WebScales
project aims to create a metasearch engine on top of essentially
all useful search engines on the Web. Due to the scale of the
problem (it is estimated that there are hundreds of thousands of
search engines), we are developing scalable solutions and building
automated tools to construct metasearch engines.
This research is funded by the following grants from the National
Science Foundation: IIS-9902872, IIS-9902792, EIA-9911099, IIS-0208574,
and IIS-0208434. The Principal Investigators of these projects are
Prof. Weiyi Meng at SUNY Binghamton (BU) and Prof. Clement Yu at
the University of Illinois at Chicago (UIC). Zonghuan Wu and Vijay
Raghavan of the University of Louisiana at Lafayette are also collaborators.
Any opinions, findings, and conclusions or recommendations expressed
on this site are those of the PIs and do not necessarily reflect
the views of the National Science Foundation (NSF).
The following students have participated or are participating
in this project:
- King-Lup Liu (UIC, graduated with a PhD degree, now the chief
software architect at Webscalers)
- Wensheng Wu (UIUC, graduated with a PhD degree, now with IBM)
- Fang Liu (UIC, graduated with a PhD degree, now with Microsoft)
- Shuang Liu (UIC, graduated with a PhD degree, now with Ask.com)
- Wei Zhang (UIC)
- George Philip (UIC, graduated with a Master's degree)
- Chaojing Sun (UIC, graduated with a Master's degree)
- Zonghuan Wu (BU, graduated with a PhD degree, now a research
faculty member at the University of Louisiana at Lafayette)
- Hongkun Zhao (BU, graduated with a PhD degree, now at Bloomberg)
- Yiyao Lu (BU)
- Reza Hemayati (BU)
- Xian Li (BU)
- Wanjing Zhang (BU, graduated with a Master's degree, now with Ask.com)
- Wenxian Wang (BU, graduated with a Master's degree)
- Zhuogang Li (BU, graduated with a Master's degree)
- Yohan Mammen (BU, graduated with a Master's degree)
- Chintan Gandhi (BU, graduated with a Master's degree)
- Hongyu Sun (BU, graduated with a Master's degree)
- Jing Qiu (BU, graduated with a Master's degree)
Related Publications and Technical Reports
- Weiyi Meng, King-Lup Liu, Clement Yu, Xiaodong Wang,
Yu-Hsi Chang, and Naphtali Rishe.
Determining
Text Databases to Search in the Internet.
Proc. of the 24th International Conference on Very Large Data
Bases (VLDB'98), New York City, August 1998, pp.14-25.
- Weiyi Meng, King-Lup Liu, Clement Yu, Wensheng Wu, and
Naphtali Rishe. Estimating
the Usefulness of Search Engines. Proc. of
the 15th International Conference on Data Engineering
(ICDE'99), Sydney, Australia, March 1999, pp.146-153.
- Clement Yu, King-Lup Liu, Wensheng Wu, Weiyi Meng, and
Naphtali Rishe. Finding the
Most Similar Documents across Multiple Text Databases.
Proc. of the IEEE Conference on Advances in Digital
Libraries (ADL'99), Baltimore, Maryland, May 1999, pp.150-162.
- King-Lup Liu, Clement Yu, Weiyi Meng, and Naphtali Rishe.
Discovery of Similarity Computation on the Internet. Proc.
of the ACM Conference on Digital Libraries (DL'99) (poster paper),
University of California, Berkeley, August 1999, pp.232-233.
- Weiyi Meng, Clement Yu, and King-Lup Liu.
Detection of Heterogeneities in a Multiple Text Database
Environment. Proc. of the Fourth IFCIS
Conference on Cooperative Information Systems (CoopIS'99),
Edinburgh, Scotland, September 1999, pp.22-33.
- Clement Yu, Weiyi Meng, King-Lup Liu, Wensheng Wu, and
Naphtali Rishe. Efficient and
Effective Metasearch for a Large Number of Text Databases.
Proc. of Eighth ACM International Conference on Information
and Knowledge Management (CIKM'99), Kansas City, November
1999, pp.217-224.
- Wenxian Wang, Weiyi Meng, and Clement Yu.
Concept Hierarchy Based Text Database Categorization
in a Metasearch Engine Environment.
Proc. of First International Conference on Web Information
Systems Engineering (WISE'2000), Hong Kong, June 2000,
pp.283-290.
- King-Lup Liu, Weiyi Meng, and Clement Yu.
Discovery of Similarity Computations of Search Engines.
Proc. of Ninth ACM International Conference on
Information and Knowledge Management (CIKM'00), Washington,
D.C., November 2000, pp.290-297.
- Zonghuan Wu, Weiyi Meng, Clement Yu, and Zhuogang Li.
Towards a Highly-Scalable and
Effective Metasearch Engine. Proc. of Tenth
World Wide Web Conference (WWW10), Hong Kong, May 2001, pp.386-395.
- Clement Yu, Weiyi Meng, Wensheng Wu, and King-Lup Liu.
Efficient and Effective
Metasearch for Text Databases Incorporating Linkages among Documents.
ACM SIGMOD Conference, May 2001, pp.187-198.
- Weiyi Meng, Zonghuan Wu, Clement Yu, and Zhuogang Li.
A Highly-Scalable and
Effective Method for Metasearch. ACM Transactions
on Information Systems 19(3), pp.310-335, July 2001.
- King-Lup Liu, Clement Yu, Weiyi Meng, A. Santoso, and C. Zhang.
Discovering the Representative of a Search Engine. Tenth
ACM International Conference on Information and Knowledge
Management (CIKM'01), (poster paper), Atlanta, Georgia,
November 2001, pp.577-579.
- Weiyi Meng, Wenxian Wang, Hongyu Sun, and Clement Yu.
Concept Hierarchy Based
Text Database Categorization. International
Journal on Knowledge and Information Systems, Vol. 4, No. 2,
pp.132-150, March 2002.
- Weiyi Meng, Clement Yu, and King-Lup Liu.
Building Efficient and Effective Metasearch Engines.
ACM Computing Surveys, Vol. 34, No. 1,
March 2002, pp.48-89.
- Fang Liu, Clement Yu, and Weiyi Meng.
Personalize Web Search by Mapping User Queries to Categories.
Proc. of Eleventh ACM International Conference
on Information and Knowledge Management (CIKM'02),
McLean, Virginia, November 2002, pp.558-565.
- King-Lup Liu, Clement Yu, and Weiyi Meng.
Discovering the Representative of a
Search Engine. Proc. of Eleventh ACM International
Conference on Information and Knowledge Management (CIKM'02),
(poster paper), pp.652-654, McLean, Virginia, November 2002.
- Clement Yu, King-Lup Liu, Weiyi Meng, Zonghuan Wu, and Naphtali Rishe.
A Methodology to Retrieve Text
Documents from Multiple Databases.
IEEE Transactions on Knowledge and Data Engineering,
Vol.14, No.6, November/December 2002, pp.1347-1361.
- King-Lup Liu, Clement Yu, Weiyi Meng, Wensheng Wu, and
Naphtali Rishe. A Statistical
Method for Estimating the Usefulness of Text Databases.
IEEE Transactions on Knowledge and Data Engineering,
14(6), pp.1422-1437, November/December 2002.
- Zonghuan Wu, Vijay Raghavan, Chun Du, M. Sai C, Weiyi Meng,
Hai He, and Clement Yu.
SE-LEGO: Creating Metasearch Engine on Demand.
Proc. of 26th ACM SIGIR Conference, Demo paper, p.464,
Toronto, Canada, July 2003.
- Zonghuan Wu, Vijay Raghavan, Chun Du, Weiyi Meng, Hai He,
and Clement Yu. Creating
Customized Metasearch Engines on Demand Using SE-LEGO.
Proc. of Fourth International Conference on Web-Age Information
Management (WAIM'03), Demo paper, pp.503-505, Chengdu, China,
August 2003.
- Zonghuan Wu, Vijay Raghavan, Hua Qian, V. Rama K, Weiyi Meng,
Hai He, and Clement Yu. Towards
Automatic Incorporation of Search Engines into a Large-Scale
Metasearch Engine. 2003 IEEE/WIC International
Conference on Web Intelligence, pp.658-661, Halifax, Canada,
October 2003.
- Clement Yu and Weiyi Meng.
Web Search Technology. In The Internet
Encyclopedia edited by Hossein Bidgoli, Wiley Publishers,
pp.738-753, 2003.
- Fang Liu, Clement Yu, and Weiyi Meng.
Personalized Web Search for Improving Retrieval Effectiveness.
IEEE Transactions on Knowledge and Data Engineering,
Vol.16, No.1, pp.28-40, January 2004.
- Wensheng Wu, Clement Yu, and Weiyi Meng.
Database Selection for Longer Queries.
Proceedings of the 2004 Meeting of the International Federation of
Classification Societies, pp.575-584, Chicago, July 2004 (invited).
- Hongkun Zhao, Weiyi Meng, Zonghuan Wu, Vijay Raghavan,
and Clement Yu.
Fully Automatic Wrapper Generation for Search Engines.
Proc. of 14th International World Wide Web
Conference (WWW14), pp.66-75, Chiba, Japan, May 2005.
- Yiyao Lu, Weiyi Meng, Liangcai Shu, Clement Yu, and King-Lup Liu.
Evaluation of Result Merging
Strategies for Metasearch Engines.
6th International Conference on Web Information Systems
Engineering (WISE05), pp.53-66, New York City, November 2005.
- Dheerendranath Mundluru, Zonghuan Wu, Vijay Raghavan, Weiyi Meng,
Hongkun Zhao.
Automatically Extracting Subsequent Response Pages from Web Search
Sources. IEEE Workshop on Knowledge Acquisition
from Distributed, Autonomous, Semantically Heterogeneous Data and
Knowledge Sources, Houston, Texas, November 2005.
- Yiyao Lu, Weiyi Meng, Wanjing Zhang, King-Lup Liu, and
Clement Yu. Automatic
Extraction of Publication Time from News Search Results.
2nd International Workshop on Challenges in Web Information
Retrieval and Integration (WIRI2006), pp.141-150,
Atlanta, Georgia, April 2006.
- Yanyan Ling, Xiaofeng Meng, and Weiyi Meng.
Automated Extraction
of Hit Numbers From Search Result Pages. Seventh
International Conference on Web-Age Information Management
(WAIM 2006), pp.73-84, Hong Kong, June 2006.
- Wei Liu, Xiaofeng Meng, Weiyi Meng.
Vision-based Web Data Records Extraction.
Ninth International Workshop on the Web and Databases
(WebDB 2006), pp.20-25, Chicago, June 2006.
- Hongkun Zhao, Weiyi Meng, Clement Yu.
Automatic Extraction of Dynamic Record Sections From Search
Engine Result Pages. 32nd International Conference
on Very Large Data Bases (VLDB06), pp.989-1000, Seoul, Korea,
September 2006.
- Ronak Desai, Qi Yang, Zonghuan Wu, Weiyi Meng, Clement Yu.
Identifying Redundant
Search Engines in a Very Large Scale Metasearch Engine Context.
Proc. of 8th ACM International Workshop on Web
Information and Data Management (WIDM 2006), pp.51-58, November 2006.
- Reza T. Hemayati, Weiyi Meng, Clement Yu.
Semantic-based Grouping of Search Engine Results Using WordNet.
Joint Conference of the 9th Asia-Pacific Web Conference
and the 8th International Conference on Web-Age Information Management
(APWeb/WAIM'07), pp.678-686, HuangShan, China, June 2007.
- Yiyao Lu, Zonghuan Wu, Hongkun Zhao, Weiyi Meng, King-Lup
Liu, Vijay Raghavan, Clement Yu.
MySearchView: A Customized Metasearch Engine Generator.
26th ACM SIGMOD International Conference on Management of Data (SIGMOD 2007),
Demo paper, pp.1113-1115, Beijing, China, June 2007.
- King-Lup Liu, Weiyi Meng, Jing Qiu, Clement Yu, Vijay
Raghavan, Zonghuan Wu, Yiyao Lu, Hai He, Hongkun Zhao.
AllInOneNews: Development and
Evaluation of a Large-Scale News Metasearch Engine.
26th ACM SIGMOD International Conference on Management of Data
(SIGMOD 2007), Industrial track, pp.1017-1028, Beijing, China, June 2007.
- Hongkun Zhao, Weiyi Meng, and Clement Yu.
Mining Templates
from Search Result Records of Search Engines. 13th
ACM International Conference on Knowledge Discovery and Data
Mining (SIGKDD 2007), pp.884-893, San Jose, California, August 2007.
- Weiyi Meng and Hai He.
Data Search Engine. In Encyclopedia of Computer Science
and Engineering (Benjamin Wah, ed.), John Wiley & Sons, pp.826-834,
January 2009.
- Weiyi Meng.
Metasearch Engines
. In