CS 634 - Web Data Management Syllabus
Name : Weiyi Meng
Office : Q08, Engineering Building
Telephone : 777-4311
Fax : 777-4729
Advanced topics in web data management. Techniques for retrieving
and analyzing text documents, including basic text retrieval
methods and sentiment analysis issues. Modern search engine
technology, including the use of links and user behavior knowledge.
Advanced metasearch engine technology, including search engine
selection, wrapper generation and result fusion. Web database
integration techniques, including query interface extraction
and understanding and query interface integration. Some interesting
applications. Entrepreneurship issues, including information on
how to start a company, IP and funding issues. The topics may
vary when offered in different years.
When and where
Time: 8:30am --- 9:55am, Tuesday and Thursday
Classroom: SW 311
Prerequisite and Co-requisite
- Prerequisite: CS432/CS532 (Database Systems) or equivalent
- Co-requisite: CS533 (Information Retrieval) or equivalent
Office Hours
4:30pm --- 5:30pm, Tuesday and Thursday, or by appointment
Part of the materials will be covered by the following books:
Weiyi Meng, Clement Yu. Advanced Metasearch Engine Technology.
Morgan & Claypool Publishers, December 2010.
This book can be downloaded for free.
Eduard Dragut, Weiyi Meng, Clement Yu. Deep Web Query Interface
Understanding and Integration. Morgan & Claypool Publishers, 2012.
This book can be downloaded for free.
Other course materials will come from published research
papers and lecture/tutorial notes.
The following reference book is also worth reading:
Search Engines: Information Retrieval in Practice by Bruce Croft,
Donald Metzler, and Trevor Strohman, Pearson Education, 2009.
Planned Topics (not necessarily covered in the following order)
- Topic 0-1: Introduction to the Instructor's Research
- Topic 0-2: Introduction to Course Projects
- Topic 0-3: How to Do Research?
- Topic 1: Brief Introduction to Text Retrieval
- Topic 2: Search Engine Technology
- Topic 3: Text Mining Topics
- Topic 4: Metasearch Engine Technology
- Topic 5: Web Database Integration System
Course Format
- One week will be used to introduce course projects.
- One to two weeks will be used for students to make
presentations about their projects. The instructor will
lead discussions after the presentations, and all students are
expected to actively participate in the discussions.
- The rest of the time will be lectures given by the
instructor.
Course Project
Every student will do an individual course project.
A number of suggested projects will be provided by the
instructor and these projects will be briefly discussed
in the class. Students are encouraged to propose their
own course projects.
Grading
- Midterm Exam: 10%
- Final Exam: 10%
- Class Participation: 10%. Class participation includes
attendance and participation in class discussions. Student
attendance is required and will be checked regularly by the
instructor. Each missed class will result in a penalty of
0.5 points unless a compelling reason for the absence is
presented in writing to the instructor. Class participation
will also be graded by how actively a student participates
in the discussions.
- Presentation: 10%. Each student is required to present his/her
course project to the entire class near the end of the semester.
The presentation will be graded on the quality of the content, the
quality of the slides and the smoothness and clarity of the
presentation.
- Course Project: 60%. Several progress reports will be
required by specified dates before the final report is handed in.
Every progress report, as well as the final report, will be
graded separately on the quality of its content (originality,
creativity and technical content) and the quality of its writing
(organization, logic, clarity and readability).
Academic Honesty
Academic honesty and integrity are expected of every student.
Dishonesty and cheating in all academic work related to this
course, when discovered, will be severely punished. Please read
the Student Academic Honesty Code.
Students must write their reports themselves and in their
own words. All referenced works (including ideas,
algorithms, programs, etc.) must be clearly cited within the
main body of the report and their full citations must be
listed at the end of the report. Students' own contributions
(new ideas, algorithms, programs, etc.) must be clearly
identified.
Classroom Policy
- Cell phone: Cell phones must be turned off or in vibrate
mode.
- Computer: Laptop/notebook computers should generally not be
used in class, and never for unrelated activities.
Journals and Conference Proceedings
The following are some of the leading journals and conferences
related to the subject of this course:
- IEEE Transactions on Knowledge and Data Engineering (IEEE TKDE)
- ACM Transactions on Information Systems (ACM TOIS)
- Very Large Data Base Journal (VLDB Journal)
- World Wide Web Journal
- World Wide Web Conference
- International ACM SIGIR Conference on Research and Development
in Information Retrieval (ACM SIGIR)
- International Conference on Very Large Data Bases (VLDB)
- International Conference on the Management of Data (ACM SIGMOD)
- IEEE International Conference on Data Engineering (ICDE)
Last change: December 27, 2013