Relational Data Community Discovery and Learning
Sponsor: National Science Foundation

This project is a three-year integrated research and education program that investigates in depth a series of fundamental, open, and important issues in relational data community discovery and learning, building on our existing strength in state-of-the-art research on this topic.
The intellectual merit of this project lies in a substantially improved understanding of unsupervised general relational data clustering and learning, and in the expected breakthroughs in community discovery and learning methodologies, which are expected to advance the data mining and machine learning literature and to have a profound impact on related areas.
The broader impacts of this project are twofold. Educationally, the development, implementation, and evaluation of the innovative community outreach activities proposed in this project shall promote timely and effective dissemination of knowledge on relational data mining and machine learning and shall further enrich the pedagogical literature. The knowledge disseminated to the collaborating parties, especially the collaborating high school, shall enhance high school education services and syllabi and provide a model for high schools' research and service to society. Technologically, the expected breakthrough in developing a novel theory of relational data community discovery and learning is expected to enable new technology across a wide range of applications and, in particular, shall benefit the collaborating organizations in developing and advancing their domain expertise in applications related to social network mining in general and Web data mining in particular.
It is well observed that the world is full of data and that these data are highly related through different types of data objects, such as people, organizations, and events. Many applications aim to discover the hidden structures formed by such relationships among different types of data objects, in addition to "clusters" of data objects of the same type. For example, in financial services, it is often necessary to identify potential fraudulent activities hidden among normal transactions that involve people and financial institutions; in commercial sales, it is often necessary to link customer purchase patterns to potential sales promotion strategies in order to identify what kinds of customers are related to what kinds of commercial products through what kinds of service providers; in the Web search industry, it is extremely desirable to identify what kinds of users use what kinds of Web pages and are highly influenced by what kinds of advertisements related to what kinds of commercial industries.
On the other hand, training data with ground truth are too often unavailable for knowledge discovery. Consequently, unsupervised relational data learning is needed in all these situations.
In this research, we focus on the most general scenario of relational data: the data objects may have attributes, homogeneous relations (among data objects of the same type), and/or heterogeneous relations (between data objects of different types). Practical situations are special cases of this general scenario, so the unified theory and the related methodologies we aim to develop in this research are intended to apply to a broad range of real-world relational data knowledge discovery problems, potentially leading to new technology development; this generality is what distinguishes the proposed work from the existing literature. Accordingly, we define a relational data community in a broad sense that includes not only local clusters of data objects of the same type but also, more importantly, the global, hidden structures that incorporate relationships among different types of data objects.
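As an illustration of the general scenario described above, the following is a minimal Python sketch of one way such data could be held in memory. It is not the project's model or code: the class name `RelationalData`, its methods, and the toy user/page example are all hypothetical and chosen only to show typed objects with attributes, homogeneous relations, and heterogeneous relations side by side.

```python
from dataclasses import dataclass, field


@dataclass
class RelationalData:
    """Hypothetical container for general relational data (illustration only)."""

    # attributes[object_type][object_id] -> tuple of attribute values
    attributes: dict = field(default_factory=dict)
    # relations[(type_a, type_b)][(id_a, id_b)] -> relation weight
    relations: dict = field(default_factory=dict)

    def add_object(self, obj_type, obj_id, features=()):
        # Register a data object of a given type, with optional attributes.
        self.attributes.setdefault(obj_type, {})[obj_id] = tuple(features)

    def add_relation(self, type_a, id_a, type_b, id_b, weight=1.0):
        # Record a weighted relation between two (typed) data objects.
        self.relations.setdefault((type_a, type_b), {})[(id_a, id_b)] = weight

    def homogeneous(self):
        # Relations among data objects of the same type.
        return {k: v for k, v in self.relations.items() if k[0] == k[1]}

    def heterogeneous(self):
        # Relations between data objects of different types.
        return {k: v for k, v in self.relations.items() if k[0] != k[1]}


# Toy example (assumed data): two users, one Web page.
example = RelationalData()
example.add_object("user", "u1", [0.2, 0.8])
example.add_object("user", "u2")
example.add_object("page", "p1")
example.add_relation("user", "u1", "user", "u2")           # homogeneous
example.add_relation("user", "u1", "page", "p1", 3.0)      # heterogeneous
```

In this sketch, attribute-only data, a single graph, and a bipartite graph are each special cases obtained by leaving some of the dictionaries empty, which mirrors the claim that practical situations are special cases of the general scenario.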
Relational data community discovery and learning is a fairly new area in which many challenging and fundamentally new issues remain open.
At the same time, solutions to these issues may lead to new technology with significant societal impact.
The work in this project builds on innovative preliminary research and addresses a set of fundamentally new problems with fundamentally new solutions. These solutions aim not only at a deeper understanding of this area but, more importantly, at technology development with significant societal impact. Specifically, this project pursues the following three objectives synergistically: (1) to address a series of challenging, fundamentally new, and important issues in relational data community discovery and learning, leading to a unified theory on this topic and a deeper understanding of this area; (2) to extensively evaluate the theory and methodologies to be developed, in collaboration with domain experts in the Web search industry, as a specific application to social network mining; and (3) to develop and evaluate innovative community outreach and education activities through the existing partnership with a local high school to further promote the dissemination of knowledge from this research.

NSF Project Manager: Dr. Maria Zemankova

Project Personnel:
PI: Prof. Zhongfei (Mark) Zhang
PhD students:
NSF REU Students:
- Greg Stoddard
- Bryan Bernard
- Philip Dexter
- Kevin Hannon
- Daniel F. McFadden

Partners:

Publications:
-
Ming Yang, Yingming Li, and Zhongfei (Mark) Zhang,
Multi-Task Learning with Gaussian Matrix Generalized Inverse Gaussian Model,
Proc. International Conference on Machine Learning, Atlanta, Georgia, USA,
June, 2013, (20% acceptance rate)
-
Xinyan Lu, Fei Wu, Siliang Tang, Zhongfei (Mark) Zhang, Xiaofei He, and
Yueting Zhuang,
A Low Rank Structural Large Margin Method for Cross-Modal Ranking,
Proc. The 36th Annual ACM SIGIR Conference, Dublin, Ireland,
July, 2013, (19.9% acceptance rate)
-
Yingming Li, ZhongAng Qi, Zhongfei (Mark) Zhang, and Ming Yang,
Learning with Limited and Noisy Tagging,
Proc. The ACM International Conference on Multimedia, Barcelona,
Catalunya, Spain,
October, 2013, (20% acceptance rate)
-
Xinyan Lu, Fei Wu, Siliang Tang, Zhongfei (Mark) Zhang, Xiaofei He, and
Yueting Zhuang,
Cross-Media Semantic Representation via Bi-directional Learning to Rank,
Proc. The ACM International Conference on Multimedia, Barcelona,
Catalunya, Spain,
October, 2013, (20% acceptance rate)
-
Xi Li, Anthony R. Dick, Chunhua Shen, Zhongfei Zhang, Anton van den
Hengel, and Hanzi Wang,
Visual Tracking With Spatio-Temporal Dempster-Shafer Information Fusion,
IEEE Transactions on Image Processing,
IEEE Signal Processing Society Press, Volume 22, Number 8, Pages
3028-3040, 2013
-
Zhen Guo, Zhongfei Zhang, Shenghuo Zhu, Yun Chi, Yihong Gong,
A Two-Level Topic Model towards Knowledge Discovery from Citation Networks,
IEEE Transactions on Knowledge and Data Engineering,
IEEE Computer Society Press, accepted, 2013
-
Xi Li, Weiming Hu, Chunhua Shen, Zhongfei Zhang, Anthony R. Dick, and Anton van den Hengel,
Context-Aware Hypergraph Construction for Robust Spectral Clustering,
IEEE Transactions on Knowledge and Data Engineering,
IEEE Computer Society Press, accepted, 2013
-
Zheng Fang and Zhongfei (Mark) Zhang,
Discriminant Transfer Learning on Manifold,
Proc. SIAM International Conference on Data Mining, Austin, TX, USA,
May, 2013, (25.5% acceptance rate)
[pdf]
-
Xi Li, Weiming Hu, Chunhua Shen, Zhongfei Zhang, Anthony Dick, and Anton
van den Hengel,
A Survey of Appearance Models in Visual Object Tracking,
ACM Transactions on Intelligent Systems and Technology,
ACM Press, accepted
[pdf]
-
Zheng Fang and Zhongfei (Mark) Zhang,
Simultaneously Combining Multi-View Multi-Label Learning with Maximum
Margin Classification,
Proc. IEEE International Conference on Data Mining, Brussels, Belgium,
December, 2012, (9.3% acceptance rate)
- Xi Li, Weiming Hu, Zhongfei Zhang, and David Suter,
An Incremental DPMM-Based Method for Trajectory Clustering, Modeling and Retrieval,
IEEE Transactions on Pattern Analysis and Machine Intelligence,
IEEE Computer Society Press, accepted, 2012
-
Fei Wu, Yahong Han, Xiang Liu, Jian Shao, Yueting Zhuang, Zhongfei Zhang,
The heterogeneous feature selection with structural sparsity for multimedia annotation and hashing: a survey,
International Journal of Multimedia Information Retrieval, Springer, Volume 1,
Number 1, Pages 3-15, April, 2012, DOI 10.1007/s13735-012-0001-9
-
Zhouzhou He, Zhongfei Zhang, Philip S. Yu,
Overlapping Community Detection Combining Content and Link,
Journal of Zhejiang University --- Part C: Computer Science, accepted, 2012
-
Weiming Hu, Haiqiang Zuo, Ou Wu, Yunfei Chen, Zhongfei Zhang, and David Suter,
Single and Multiple Object Tracking Using Log-Euclidean Riemannian Subspace and Block-Division Appearance Model,
IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE Computer Society Press, accepted, 2012
-
Tianbing Xu, Zhongfei Zhang, Philip S. Yu, and Bo Long, Generative Models for Evolutionary Clustering,
ACM Transactions on Knowledge Discovery from Data, ACM Press, accepted, 2012
-
Zhongang Qi, Ming Yang, Zhongfei Zhang, Zhengyou Zhang, Multi-View Learning from Imperfect Tagging,
Proc. the 20th ACM International Conference on Multimedia, Nara, Japan, 2012, (20% acceptance rate)
-
Zhongang Qi, Ming Yang, Zhongfei Zhang, Zhengyou Zhang, Mining Noisy Tagging from Multi-Label Space,
Proc. the 21st ACM Conference on Information and Knowledge Management, Maui, HI, USA, 2012, (14.3% acceptance rate)
-
Zhongfei (Mark) Zhang, Zhengyou Zhang, Ramesh Jain, and Yueting Zhuang, Guest Editors' Introduction: Special Section on Connected Multimedia,
Journal of Multimedia, Academy Publisher, Volume 7, Number 1, 2012
-
Zhongfei Zhang, Zhengyou Zhang, Ramesh Jain, Yueting Zhuang, Noshir Contractor, Alexander G. Hauptmann, Alejandro (Alex) Jaimes, Wanqing Li,
Alexander C. Loui, Tao Mei, Nicu Sebe, Yonghong Tian, Vincent S. Tseng, Qing Wang, Changsheng Xu, Huimin Yu, Shiwen Yu, Societally Connected Multimedia across Cultures,
Journal of Zhejiang University --- Part C (Computer Science), Springer, accepted, 2012
-
Mingyu Fan, Xiaoqin Zhang, Zhouchen Lin, Zhongfei Zhang, and Hujun Bao, Geodesic based semi-supervised multi-manifold feature extraction,
Proc. IEEE International Conference on Data Mining, Brussels, Belgium, December, 2012, (9.3% acceptance rate)
-
Jun Gao, Weiming Hu, Zhongfei (Mark) Zhang, Ou Wu, Unsupervised Ensemble Learning for Mining Top-n Outliers,
Proc. the 16th Pacific-Asia Conference on Knowledge Discovery and Data Mining, Kuala Lumpur, Malaysia, May, 2012, (8.3% acceptance rate)
-
Yi Xu, Zhongfei (Mark) Zhang, Bo Long, and Philip S. Yu, Pattern Change Discovery between High Dimensional Datasets, Proc. the 20th ACM Conference on Information and Knowledge Management, Glasgow, UK, October, 2011, (15% acceptance rate)
- Bo Long and Zhongfei (Mark) Zhang, A General Model for Relational
Clustering, in Data Mining, Edited by Kimito Funatsu, IN-TECH, 2011
- Zhongfei (Mark) Zhang and Ruofei Zhang, Multimedia Information
Mining, in Multimedia Image and Video Processing, 2nd Edition, Edited
by Ling Guan, Yifeng He, and Sun-Yuan Kung, CRC Press, 2011
- Weiming Hu, Xi Li, Xiaoqin Zhang, Xinchu Shi, Stephen Maybank,
and Zhongfei Zhang, Incremental Tensor Subspace Learning and Its
Applications to Foreground Segmentation and Tracking, International
Journal of Computer Vision, Springer, Volume 91, No. 3, pages 303-327,
2011
- Zhongang Qi, Ming Yang, Zhongfei (Mark) Zhang, and Zhengyou
Zhang,
Mining Partially Annotated Images, Proc. the 17th ACM Conference on
Knowledge Discovery and Data Mining, San Diego, CA, USA, August, 2011,
(6.9% acceptance rate)
-
Jun Gao, Weiming Hu, Zhongfei (Mark) Zhang, Ou Wu, RKOF: Robust Kernel-Based Local Outlier Detection, Proc. the 15th Pacific-Asia Conference on Knowledge Discovery and Data Mining, Shenzhen, China, 2011, (17.5% acceptance rate)
- Zhongfei Zhang and Haroon Khan, A Holistic, In-Compression
Approach to Mining Independent Motion Segments for Massive Surveillance
Video Collections, in Video Search and Mining, Edited by Dan Schonfeld,
Caifeng Shan, Dacheng Tao, and Liang Wan, Springer, ISBN
978-3-642-12899-8, 2010
- Zhongfei (Mark) Zhang and Ruofei Zhang, Multimedia Data Mining,
in Data Mining and Knowledge Discovery Handbook, 2nd Ed., Edited by
Oded Maimon and Lior Rokach, Springer, 2010
- Zhongfei (Mark) Zhang, Bo Long, Zhen Guo, Tianbing Xu, and Philip
S. Yu, Machine Learning Approaches to Link-Based Clustering, in Link
Mining: Models, Algorithms and Applications, Edited by Philip S. Yu,
Christos Faloutsos, and Jiawei Han, Springer, 2010
- Zhen Guo, Zhongfei (Mark) Zhang, Eric P. Xing, and Christos
Faloutsos, A Max Margin Framework on Image Annotation and Multimodal
Image Retrieval, in Multimedia, Edited by Vedran Kordic, IN-TECH, 2010
- Zhongfei (Mark) Zhang, Zhen Guo, Christos Faloutsos, Eric P.
Xing, and Jia-Yu Pan, On the scalability and adaptability for
multimodal image retrieval and image annotation, in Machine Learning
Techniques for Adaptive Multimedia Retrieval: Technologies Applications
and Perspectives, Edited by Roger Wei, Idea Group Inc., 2010
- Xi Li, Weiming Hu, Hanzi Wang, and Zhongfei (Mark) Zhang, Robust
Object Tracking Using A Spatial Pyramid Heat Kernel Structural
Information Representation, Neurocomputing, Elsevier Science, Volume 73,
No. 16-18, pages 3179-3190, October, 2010
- Bo Long, Zhongfei (Mark) Zhang, and Philip S. Yu, A General
Framework for Relation Graph Clustering, Knowledge and Information
Systems, Springer, Volume 24, Number 3, Pages 393-413, September, 2010
- Bo Long, Zhongfei (Mark) Zhang, and Philip S. Yu, Relational Data
Clustering: Models, Algorithms, and Applications, Taylor &
Francis/CRC Press, 2010, ISBN: 9781420072617
- Zhongfei (Mark) Zhang, Zhen Guo, and Jia-Yu (Tim) Pan, A
Multiple-Instance Learning Based Approach to Multimodal Data Mining,
International Journal of Digital Library Systems, IGI Global, Volume 1,
Number 2, Pages 23-41, April-June, 2010
- Xi Li, Weiming Hu, Hanzi Wang, and Zhongfei (Mark) Zhang, Linear
Discriminant Analysis Using Rotational Invariant L1 Norm,
Neurocomputing, Elsevier Science, Volume 73, No. 13-15, pages 2571-2579,
August, 2010
- Xi Li, Weiming Hu, Zhongfei (Mark) Zhang, and Hanzi Wang, Heat
Kernel Based Local Binary Pattern for Face Representation and
Classification, IEEE Signal Processing Letters, IEEE Signal Processing
Society Press, Volume 17, Issue 3, Pages 308-311, March, 2010
- Jun Gao, Weiming Hu, Zhongfei (Mark) Zhang, Ou Wu, Local Outlier
Detection Based on Kernel Regression, Proc. International Conference on
Pattern Recognition, Istanbul, Turkey, August, 2010, (54% acceptance
rate)
- Zhen Guo, Shenghuo Zhu, Yun Chi, Zhongfei (Mark) Zhang, and
Yihong Gong, Unsupervised Learning from Linked Documents, Proc.
International Conference on Pattern Recognition, Istanbul, Turkey,
August, 2010, (54% acceptance rate)
- Zhen Guo, Shenghuo Zhu, Zhongfei (Mark) Zhang, Yun Chi, and
Yihong Gong, A Topic Model for Linked Documents and Update Rules for
its Estimation, Proc. AAAI, Atlanta, GA, USA, July, 2010, (26.9%
acceptance rate)
- Bo Long, Zhongfei (Mark) Zhang, and Philip S.
Yu, Relational Data Clustering: Models, Algorithms, and Applications,
Taylor & Francis/CRC Press, 2009, ISBN: 9781420072617
- Zhongfei Zhang and Haroon Khan, A Holistic,
In-Compression Approach to Mining Independent Motion Segments for
Massive Surveillance Video Collections, in Video Search and Mining,
Edited by Dan Schonfeld, Caifeng Shan, Dacheng Tao, and Liang Wan,
Springer, 2009
- Zhongfei (Mark) Zhang and Ruofei Zhang,
Multimedia Data Mining, in Data Mining and Knowledge Discovery
Handbook, 2nd Ed., Edited by Oded Maimon and Lior Rokach, Springer,
2009
-
Zhongfei (Mark) Zhang, Bo Long, Zhen Guo,
Tianbing Xu, and Philip S. Yu, Machine Learning Approaches to
Link-Based Clustering, in Link Mining: Models, Algorithms and
Applications, Edited by Philip S. Yu, Christos Faloutsos, and Jiawei
Han, Springer, 2009
-
Zhen Guo, Zhongfei (Mark) Zhang, Eric P.
Xing, and Christos Faloutsos, A Max Margin Framework on Image
Annotation and Multimodal Image Retrieval, in Multimedia, Edited by
Vedran Kordic, IN-TECH, 2009
-
Bo Long, Zhongfei (Mark) Zhang, and Philip
S. Yu, A General Framework for Relation Graph Clustering, Knowledge and
Information Systems Journal, Elsevier Science Press, Accepted, 2009
-
Zhen Guo, Zhongfei (Mark) Zhang, Shenghuo
Zhu, Yun Chi, and Yihong Gong, Knowledge Discovery from Citation
Networks, Proc. IEEE International Conference on Data Mining, Miami,
FL, USA, December, 2009
-
Zhen Guo, Shenghuo Zhu, Yun Chi, Zhongfei
(Mark) Zhang, and Yihong Gong, A latent topic model for linked
documents, Proc. ACM International Conference SIGIR, Boston, MA, USA,
July, 2009
-
Xi Li, Weiming Hu, Zhongfei (Mark) Zhang,
and Yang Liu, Spectral Graph Partitioning Based on A Random Walk
Diffusion Similarity Measure, Proc. Asian Conference on Computer
Vision, Xi'an, China, September, 2009

Code Release:
-
Yi Xu, Zhongfei (Mark) Zhang, Bo Long, and Philip S. Yu, Pattern Change Discovery between High Dimensional Datasets, Proc. the 20th ACM Conference on Information and Knowledge Management, Glasgow, UK, October, 2011, (15% acceptance rate) [code]
-
Zhen Guo, Zhongfei (Mark) Zhang, Shenghuo
Zhu, Yun Chi, and Yihong Gong, Knowledge Discovery from Citation
Networks, Proc. IEEE International Conference on Data Mining, Miami,
FL, USA, December, 2009 [code]

Data Release:
-
Yi Xu, Zhongfei (Mark) Zhang, Bo Long, and Philip S. Yu, Pattern Change Discovery between High Dimensional Datasets, Proc. the 20th ACM Conference on Information and Knowledge Management, Glasgow, UK, October, 2011, (15% acceptance rate) [data]
-
Zhen Guo, Zhongfei (Mark) Zhang, Shenghuo
Zhu, Yun Chi, and Yihong Gong, Knowledge Discovery from Citation
Networks, Proc. IEEE International Conference on Data Mining, Miami,
FL, USA, December, 2009 [data]
This material is based upon the work supported by the National Science Foundation under Award No. 0812114.
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.