Data Intensive Computing for General Relational Data Learning
Sponsor: National Science Foundation

This project is a three-year integrated research and education program that develops novel parallel frameworks for a wide spectrum of state-of-the-art solutions to fundamental problems in relational data learning. The PIs focus on unsupervised relational data community discovery and analysis, building on their existing strength in relational data mining and in parallel computation and scheduling. The technologies developed in this project have immediate applications with broad societal impact, such as social network analysis, biological information discovery, financial and economic development analysis and prediction, natural disaster prediction, and military intelligence analysis.
It is widely observed that real-world data are highly related and involve diverse types of data objects, such as people, organizations, and events. Many applications aim to discover the hidden structures formed by relationships among different types of data objects, in addition to "clusters" of data objects of the same type. At the same time, training data with ground truth are often unavailable for knowledge discovery. Consequently, unsupervised relational data community discovery is desired for all of these applications.
Unsupervised relational data learning typically involves a large collection of data objects, so its algorithms are computation-intensive. This calls for massively parallel solutions that make the algorithms scalable to large collections of data. Advances in data center technology make it possible and cost-effective to use hundreds of thousands of commodity machines for massively parallel, data-intensive computation. Yet the system architecture and the emerging parallel programming paradigms of data centers pose many challenges in designing parallel solutions.
The intellectual merit of this project includes a substantially advanced understanding of the distributed implementation of a wide spectrum of state-of-the-art solutions to the fundamental problems in the relational data learning literature, as well as expected breakthroughs across the interdisciplinary research communities of parallel computation and scheduling, data mining and machine learning, and pattern analysis, which should advance the literature in these areas.
The broader impacts include the societal impact of the expected breakthroughs in parallel computing paradigms for general relational data learning, which can be deployed immediately in important applications such as social network analysis, biological information discovery, financial and economic development analysis and prediction, natural disaster prediction, and military intelligence analysis. The integrated community outreach component is intended to contribute substantially to high school curricula specifically and to the K-12 education of the nation in general.
With these motivations, this project pursues the following four objectives synergistically: (1) to develop novel theory and methodologies for a series of challenging, open problems in unsupervised general relational data community discovery; (2) to develop novel parallel computing paradigms tailored to that theory and those methodologies; (3) to evaluate the developed parallel computing paradigms extensively, working closely with industrial collaborators, and to showcase the technology in real-world applications; and (4) to develop and evaluate an innovative community outreach component, working closely with local high schools to develop a curriculum for high school students' independent scientific research.

NSF Project Manager: Dr. Jie Yang

Project Personnel:
PI: Prof. Zhongfei (Mark) Zhang
PhD students:
- Yi Xu
- Pin Zhao
- Shuangfei Zhai
NSF REU Students:
- Philip Dexter
- Kevin Hannon
- Daniel F. McFadden

Partners:

Publications:
-
Yi Xu, Yilin Zhu, Zhongfei (Mark) Zhang, Yaqing Zhang, Philip S. Yu (2015).
Convex Approximation to the Integral Mixture Models Using Step Functions. IEEE International Conference on Data Mining.
Status = ACCEPTED; Acknowledgement of Federal Support = Yes
-
Shuangfei Zhai and Zhongfei (Mark) Zhang (2015). Dropout Training of Matrix Factorization and Autoencoder for Link Prediction in Sparse Graphs. SIAM International Conference on Data Mining. Status = PUBLISHED; Acknowledgement of Federal Support = Yes
-
Zhong Ji, Yunlong Yu, Yanwei Pang, Yingming Li, Zhongfei Zhang (2015). Marginal Fisher Regression Classification for Face Recognition. Pacific-Rim Conference on Multimedia. Status = ACCEPTED; Acknowledgement of Federal Support = Yes
-
Xueyi Zhao, Xi Li, and Zhongfei Zhang (2015). Multimedia Retrieval via Deep Learning to Rank. IEEE International Conference on Image Processing. Status = PUBLISHED; Acknowledgement of Federal Support = Yes
-
Yi Xu, Zhongfei (Mark) Zhang, Yaqing Zhang, Philip S. Yu (2015). Sensor Network Partitioning based on Homogeneity. IEEE International Conference on Data Science and Analytical Analysis. Status = ACCEPTED; Acknowledgement of Federal Support = Yes
-
Siliang Tang, Fei Wu, Si Li, and Zhongfei (Mark) Zhang (2015). Sketch the Storyline with CHARCOAL: a Non-parametric Approach. International Joint Conference on Artificial Intelligence. Status = PUBLISHED; Acknowledgement of Federal Support = Yes
-
Fei Wu, Jun Song, Xi Li, Yi Yang, Zhongfei (Mark) Zhang, and Yueting Zhuang (2015). Structured embedding via pairwise relations and long-range interactions. AAAI. Status = PUBLISHED; Acknowledgement of Federal Support = Yes
-
Chunhua Shen, Xi Li, Anthony R. Dick, Zhongfei Zhang, Yueting Zhang (2015). Online Metric-Weighted Linear Representations for Robust Visual Tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence. Status = ACCEPTED; Acknowledgment of Federal Support = Yes; Peer Reviewed = Yes
-
Fei Wu, Xinyang Jiang, Xi Li, Siliang Tang, Weiming Lu, Zhongfei Zhang, and Yueting Zhuang (2015). Cross-Modal Learning to Rank via Latent Joint Representation. IEEE Transactions on Image Processing. Status = ACCEPTED; Acknowledgment of Federal Support = Yes; Peer Reviewed = Yes
-
Peiguang Jing, Zhong Ji, Yunlong Yu, Zhongfei Zhang (2015). Visual Search Reranking with Relevant Local Discriminant Analysis. Neurocomputing. Status = ACCEPTED; Acknowledgment of Federal Support = Yes; Peer Reviewed = Yes
-
Shengkang Yu, Xi Li, Xueyi Zhao, Zhongfei Zhang, Fei Wu (2015). Tracking News Article Evolution by Dense Subgraph Learning. Neurocomputing. Status = ACCEPTED; Acknowledgment of Federal Support = Yes; Peer Reviewed = Yes
-
Xueyi Zhao, Chenyi Zhang, Zhongfei Zhang (2015). Distributed cross-media multiple binary subspace learning. International Journal of Multimedia Information Retrieval. Status = ACCEPTED; Acknowledgment of Federal Support = Yes; Peer Reviewed = Yes
-
Xueyi Zhao, Xi Li, and Zhongfei Zhang (2015). Multimedia Retrieval via Deep Learning to Rank. IEEE Signal Processing Letters. Status = ACCEPTED; Acknowledgment of Federal Support = Yes; Peer Reviewed = Yes
-
Xueyi Zhao, Xi Li, Zhongfei Zhang (2015). Joint Structural Learning to Rank with Deep Linear Feature Learning. IEEE Transactions on Knowledge and Data Engineering. Status = ACCEPTED; Acknowledgment of Federal Support = Yes; Peer Reviewed = Yes
-
Yaqing Zhang, Xi Li, Zhongfei Zhang, and Fei Wu (2015). Deep Learning Driven Blockwise Moving Object Detection with Binary Scene Modeling. Neurocomputing. Status = ACCEPTED; Acknowledgment of Federal Support = Yes; Peer Reviewed = Yes
-
Zhen Guo, Zhongfei Zhang, Eric P. Xing, and Christos Faloutsos (2015). Multimodal Data Mining in a Multimedia Database Based on Structured Max Margin Learning. ACM Transactions on Knowledge Discovery and Data Mining. Status = ACCEPTED; Acknowledgment of Federal Support = Yes; Peer Reviewed = Yes
-
Mingyu Fan, Xiaoqin Zhang, Zhouchen Lin, Zhongfei Zhang, and Hujun Bao,
A Regularized Approach for Geodesic-Based Semisupervised Multimanifold Learning,
IEEE Transactions on Image Processing, IEEE Signal Processing Society Press, Volume 23, Issue 5, May, Pages 2133-2147, 2014
-
Ismail El Sayad, Jean Martinet, Zhongfei (Mark) Zhang, and Peter Eisert,
Multilayer Semantic Analysis In Image Databases,
in Real World Data Mining Applications,
Edited by Mahmoud Abou-Nasr, Robert Stahlbock, Stefan Lessmann, and Gary M. Weiss,
Springer,
2014
-
Zhongfei Zhang, Yueting Zhuang, Ramesh Jain, Jia-Yu Pan,
Editorial of the Special Issue on the Cross-Media Analysis,
International Journal of Multimedia Information Retrieval,
Springer, Volume 3, Number 3, October, 2014, DOI 10.1007/s13735-014-0060-1
-
Ming Yang, Yingming Li, Zhongfei Zhang,
Scientific Articles Recommendation with Topic Regression and Relational Matrix Factorization,
Journal of Zhejiang University --- Part C: Computer Science,
2014
-
Xi Li, Weiming Hu, Chunhua Shen, Zhongfei Zhang, Anthony R. Dick, and Anton van den Hengel,
Context-Aware Hypergraph Construction for Robust
Spectral Clustering,
IEEE Transactions on Knowledge and Data Engineering,
IEEE Computer Society Press, Volume 26, Issue 10, October, Pages 2588-2597, 2014
-
Zhen Guo, Zhongfei Zhang, Shenghuo Zhu, Yun Chi, Yihong Gong,
A Two-Level Topic Model towards Knowledge Discovery from Citation Networks,
IEEE Transactions on Knowledge and Data Engineering,
IEEE Computer Society Press, Volume 26, Issue 4, April, Pages 780-794, 2014
-
Fei Wu, Jun Song, Xi Li, Yi Yang, Zhongfei (Mark) Zhang, and Yueting Zhuang,
Structured embedding via pairwise relations and long-range interactions,
Proc. AAAI, (AAAI 2015), Austin, TX, USA, January, 2015, (26.67% acceptance rate)
-
Te Pi, Xi Li, and Zhongfei (Mark) Zhang,
Structural Bregman Distance Functions Learning to Rank with Self-Reinforcement,
Proc. IEEE International Conference on Data Mining, (ICDM 2014), Shenzhen, Guangdong, China, December, 2014, (9.8% acceptance rate)
-
Jiangtao Yin, Lixin Gao, and Zhongfei (Mark) Zhang,
Scalable Nonnegative Matrix Factorization with Block-wise Updates,
Proc. European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, (ECML/PKDD 2014), Nancy, France, September, 2014, (23.8% acceptance rate)
-
Zheng Fang and Zhongfei (Mark) Zhang,
Cross Domain Shared Subspace Learning for Unsupervised Transfer Classification,
Proc. International Conference on Pattern Recognition, Stockholm, Sweden, August, 2014, (18% acceptance rate)
-
Xueyi Zhao, Chenyi Zhang, and Zhongfei (Mark) Zhang,
Distributed Binary Subspace Learning on Large-Scale Cross-Media Data,
Proc. IEEE International Conference on Multimedia and Expo, Chengdu, China, July, 2014, (33% acceptance rate)
-
Zheng Fang and Zhongfei (Mark) Zhang,
Discriminative Feature Selection for Multi-View Cross-Domain,
ACM International Conference on Information and Knowledge Management,
San Francisco, CA, USA, October, 2013, (16.8% acceptance rate)
-
Yingming Li, Ming Yang, Zhongfei (Mark) Zhang,
Scientific Articles Recommendation,
ACM International Conference on Information and Knowledge Management,
San Francisco, CA, USA, October, 2013, (16.8% acceptance rate)
-
Ming Yang, Yingming Li, and Zhongfei (Mark) Zhang,
Multi-Task Learning with Gaussian Matrix Generalized Inverse Gaussian Model,
Proc. International Conference on Machine Learning, Atlanta, Georgia, USA,
June, 2013, (20% acceptance rate)
-
Xinyan Lu, Fei Wu, Siliang Tang, Zhongfei (Mark) Zhang, Xiaofei He, and
Yueting Zhuang,
A Low Rank Structural Large Margin Method for Cross-Modal Ranking,
Proc. The 36th Annual ACM SIGIR Conference, Dublin, Ireland,
July, 2013, (19.9% acceptance rate)
-
Yingming Li, ZhongAng Qi, Zhongfei (Mark) Zhang, and Ming Yang,
Learning with Limited and Noisy Tagging,
Proc. The ACM International Conference on Multimedia, Barcelona,
Catalunya, Spain,
October, 2013, (20% acceptance rate)
-
Xinyan Lu, Fei Wu, Siliang Tang, Zhongfei (Mark) Zhang, Xiaofei He, and
Yueting Zhuang,
Cross-Media Semantic Representation via Bi-directional Learning to Rank,
Proc. The ACM International Conference on Multimedia, Barcelona,
Catalunya, Spain,
October, 2013, (20% acceptance rate)
-
Xi Li, Anthony R. Dick, Chunhua Shen, Zhongfei Zhang, Anton van den
Hengel, and Hanzi Wang,
Visual Tracking With Spatio-Temporal Dempster-Shafer Information Fusion,
IEEE Transactions on Image Processing,
IEEE Signal Processing Society Press, Volume 22, Number 8, Pages
3028-3040, 2013
-
Zhen Guo, Zhongfei Zhang, Shenghuo Zhu, Yun Chi, Yihong Gong,
A Two-Level Topic Model towards Knowledge Discovery from Citation Networks,
IEEE Transactions on Knowledge and Data Engineering,
IEEE Computer Society Press, accepted, 2013
-
Xi Li, Weiming Hu, Chunhua Shen, Zhongfei Zhang, Anthony R. Dick, and Anton van den Hengel,
Context-Aware Hypergraph Construction for Robust Spectral Clustering,
IEEE Transactions on Knowledge and Data Engineering,
IEEE Computer Society Press, accepted, 2013
-
Zheng Fang and Zhongfei (Mark) Zhang,
Discriminant Transfer Learning on Manifold,
Proc. SIAM International Conference on Data Mining, Austin, TX, USA,
May, 2013, (25.5% acceptance rate)
[pdf]
-
Xi Li, Weiming Hu, Chunhua Shen, Zhongfei Zhang, Anthony Dick, and Anton
van den Hengel,
A Survey of Appearance Models in Visual Object Tracking,
ACM Transactions on Intelligent Systems and Technology,
ACM Press, accepted
[pdf]
-
Zheng Fang and Zhongfei (Mark) Zhang,
Simultaneously Combining Multi-View Multi-Label Learning with Maximum
Margin Classification,
Proc. IEEE International Conference on Data Mining, Brussels, Belgium,
December, 2012, (9.3% acceptance rate)
- Xi Li, Weiming Hu, Zhongfei Zhang, and David Suter,
An Incremental DPMM-Based Method for Trajectory Clustering, Modeling and Retrieval,
IEEE Transactions on Pattern Analysis and Machine Intelligence,
IEEE Computer Society Press, accepted, 2012
-
Fei Wu, Yahong Han, Xiang Liu, Jian Shao, Yueting Zhuang, Zhongfei Zhang,
The heterogeneous feature selection with structural sparsity for multimedia annotation and hashing: a survey,
International Journal of Multimedia Information Retrieval, Springer, Volume 1,
Number 1, Pages 3-15, April, 2012, DOI 10.1007/s13735-012-0001-9
-
Zhouzhou He, Zhongfei Zhang, Philip S. Yu,
Overlapping Community Detection Combining Content and Link,
Journal of Zhejiang University --- Part C: Computer Science, accepted, 2012
-
Weiming Hu, Haiqiang Zuo, Ou Wu, Yunfei Chen, Zhongfei Zhang, and David Suter,
Single and Multiple Object Tracking Using Log-Euclidean Riemannian Subspace and Block-Division Appearance Model,
IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE Computer Society Press, accepted, 2012
-
Tianbing Xu, Zhongfei Zhang, Philip S. Yu, and Bo Long, Generative Models for Evolutionary Clustering,
ACM Transactions on Knowledge Discovery from Data, ACM Press, accepted, 2012
-
Zhongang Qi, Ming Yang, Zhongfei Zhang, Zhengyou Zhang, Multi-View Learning from Imperfect Tagging,
Proc. the 20th ACM International Conference on Multimedia, Nara, Japan, 2012, (20% acceptance rate)
-
Zhongang Qi, Ming Yang, Zhongfei Zhang, Zhengyou Zhang, Mining Noisy Tagging from Multi-Label Space,
Proc. the 21st ACM Conference on Information and Knowledge Management, Maui, HI, USA, 2012, (14.3% acceptance rate)
-
Zhongfei (Mark) Zhang, Zhengyou Zhang, Ramesh Jain, and Yueting Zhuang, Guest Editors' Introduction: Special Section on Connected Multimedia,
Journal of Multimedia, Academy Publisher, Volume 7, Number 1, 2012
-
Zhongfei Zhang, Zhengyou Zhang, Ramesh Jain, Yueting Zhuang, Noshir Contractor, Alexander G. Hauptmann, Alejandro (Alex) Jaimes, Wanqing Li,
Alexander C. Loui, Tao Mei, Nicu Sebe, Yonghong Tian, Vincent S. Tseng, Qing Wang, Changsheng Xu, Huimin Yu, Shiwen Yu, Societally Connected Multimedia across Cultures,
Journal of Zhejiang University --- Part C (Computer Science), Springer, accepted, 2012
-
Mingyu Fan, Xiaoqin Zhang, Zhouchen Lin, Zhongfei Zhang, and Hujun Bao, Geodesic based semi-supervised multi-manifold feature extraction,
Proc. IEEE International Conference on Data Mining, Brussels, Belgium, December, 2012, (9.3% acceptance rate)
-
Jun Gao, Weiming Hu, Zhongfei (Mark) Zhang, Ou Wu, Unsupervised Ensemble Learning for Mining Top-n Outliers,
Proc. the 16th Pacific-Asia Conference on Knowledge Discovery and Data Mining, Kuala Lumpur, Malaysia, May, 2012, (8.3% acceptance rate)
-
Yi Xu, Zhongfei (Mark) Zhang, Bo Long, and Philip S. Yu, Pattern Change Discovery between High Dimensional Datasets, Proc. the 20th ACM Conference on Information and Knowledge Management, Glasgow, UK, October, 2011, (15% acceptance rate)
- Bo Long and Zhongfei (Mark) Zhang, A General Model for Relational
Clustering, in Data Mining, Edited by Kimito Funatsu, IN-TECH, 2011
- Zhongfei (Mark) Zhang and Ruofei Zhang, Multimedia Information
Mining, in Multimedia Image and Video Processing, 2nd Edition, Edited
by Ling Guan, Yifeng He, and Sun-Yuan Kung, CRC Press, 2011
- Weiming Hu, Xi Li, Xiaoqin Zhang, Xinchu Shi, Stephen Maybank,
and Zhongfei Zhang, Incremental Tensor Subspace Learning and Its
Applications to Foreground Segmentation and Tracking, International
Journal of Computer Vision, Springer, Volume 91, No. 3, pages 303-327,
2011
- Zhongang Qi, Ming Yang, Zhongfei (Mark) Zhang, and Zhengyou
Zhang,
Mining Partially Annotated Images, Proc. the 17th ACM Conference on
Knowledge Discovery and Data Mining, San Diego, CA, USA, August, 2011,
(6.9% acceptance rate)
-
Jun Gao, Weiming Hu, Zhongfei (Mark) Zhang, Ou Wu, RKOF: Robust Kernel-Based Local Outlier Detection, Proc. the 15th Pacific-Asia Conference on Knowledge Discovery and Data Mining, Shenzhen, China, 2011, (17.5% acceptance rate)
- Zhongfei Zhang and Haroon Khan, A Holistic, In-Compression
Approach to Mining Independent Motion Segments for Massive Surveillance
Video Collections, in Video Search and Mining, Edited by Dan Schonfeld,
Caifeng Shan, Dacheng Tao, and Liang Wan, Springer, ISBN
978-3-642-12899-8, 2010
- Zhongfei (Mark) Zhang and Ruofei Zhang, Multimedia Data Mining,
in Data Mining and Knowledge Discovery Handbook, 2nd Ed., Edited by
Oded Maimon and Lior Rokach, Springer, 2010
- Zhongfei (Mark) Zhang, Bo Long, Zhen Guo, Tianbing Xu, and Philip
S. Yu, Machine Learning Approaches to Link-Based Clustering, in Link
Mining: Models, Algorithms and Applications, Edited by Philip S. Yu,
Christos Faloutsos, and Jiawei Han, Springer, 2010
- Zhen Guo, Zhongfei (Mark) Zhang, Eric P. Xing, and Christos
Faloutsos, A Max Margin Framework on Image Annotation and Multimodal
Image Retrieval, in Multimedia, Edited by Vedran Kordic, IN-TECH, 2010
- Zhongfei (Mark) Zhang, Zhen Guo, Christos Faloutsos, Eric P.
Xing, and Jia-Yu Pan, On the scalability and adaptability for
multimodal image retrieval and image annotation, in Machine Learning
Techniques for Adaptive Multimedia Retrieval: Technologies Applications
and Perspectives, Edited by Roger Wei, Idea Group Inc., 2010
- Xi Li, Weiming Hu, Hanzi Wang, and Zhongfei (Mark) Zhang, Robust
Object Tracking Using A Spatial Pyramid Heat Kernel Structural
Information Representation, Neurocomputing, Elsevier Science, Volume 73,
No. 16-18, pages 3179-3190, October, 2010
- Bo Long, Zhongfei (Mark) Zhang, and Philip S. Yu, A General
Framework for Relation Graph Clustering, Knowledge and Information
Systems, Springer, Volume 24, Number 3, Pages 393-413, September, 2010
Code Release:
-
Yi Xu, Zhongfei (Mark) Zhang, Bo Long, and Philip S. Yu, Pattern Change Discovery between High Dimensional Datasets, Proc. the 20th ACM Conference on Information and Knowledge Management, Glasgow, UK, October, 2011, (15% acceptance rate) [code]

Data Release:
-
Yi Xu, Zhongfei (Mark) Zhang, Bo Long, and Philip S. Yu, Pattern Change Discovery between High Dimensional Datasets, Proc. the 20th ACM Conference on Information and Knowledge Management, Glasgow, UK, October, 2011, (15% acceptance rate) [data]
This material is based upon work supported by the National Science Foundation under Award No. 1017828. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.