Attention-driven Factor Model for Explainable Personalized Recommendation

Jingwu Chen, Fuzhen Zhuang, SIGIR, 2018.

[pdf] [bib]

Latent Factor Models (LFMs) based on Collaborative Filtering (CF) have been widely applied in many recommendation systems because of their good prediction accuracy. In addition to users’ ratings, auxiliary information such as item features is often used to improve performance, especially when ratings are very sparse. ...
Nonlinear Maximum Margin Multi-view Learning with Adaptive Kernel

Jia He, Changying Du, Changde Du, Fuzhen Zhuang, Qing He, Guoping Long, IJCAI, 2017.

[pdf] [bib]

Existing kernel-based multi-view learning methods either require the user to select and tune a single predefined kernel or have to compute and store many Gram matrices to perform multiple kernel learning. Apart from the huge consumption of manpower, computation, and memory resources, most of these models seek point estimates of their parameters, ...
Mining Precise-positioning Episode Rules from Event Sequences

Xiang Ao, Ping Luo, Jin Wang, Fuzhen Zhuang, Qing He, ICDE, 2017.

[pdf] [bib]

Episode rule mining is a popular framework for discovering sequential rules from event sequence data. However, traditional episode rule mining methods only tell that the consequent event is likely to happen within a given time interval after the occurrence of the antecedent events. As a result, they cannot satisfy the requirements of many time-sensitive applications, ...
Sequential Transfer Learning: Cross-domain Novelty Seeking Trait Mining for Recommendation

Fuzhen Zhuang, Yingmin Zhou, Fuzheng Zhang, Xiang Ao, Xing Xie, Qing He, WWW, 2017.

[pdf] [bib]

Recent studies in psychology suggest that the novelty-seeking trait is highly related to consumer behavior, which has a profound business impact on online recommendation. This paper studies the problem of mining the novelty-seeking trait across domains to improve recommendation performance in the target domain. ...
Representation Learning with Pair-wise Constraints for Collaborative Ranking

Fuzhen Zhuang, Dan Luo, Nicholas Jing Yuan, Xing Xie, Qing He, WSDM, 2017.

[pdf] [bib]

The last decades have witnessed a vast amount of interest and research in recommendation systems. Collaborative filtering, which uses the known preferences of a group of users to make recommendations or predictions of the unknown preferences of other users, is one of the most successful approaches to building recommendation systems. ...
Transfer Learning with Manifold Regularized Convolutional Neural Network

Fuzhen Zhuang, Lang Huang, Jia He, Jixin Ma, Qing He, KSEM, 2017.

[pdf] [bib]

Deep learning has recently been proposed to learn robust representations for various tasks and has delivered state-of-the-art performance in the past few years. Most researchers attribute such success to the substantially increased depth of deep learning models. However, training a deep model is time-consuming and needs a huge amount of data. ...
Recommendation in heterogeneous information network via dual similarity regularization

Jing Zheng, Jian Liu, Fuzhen Zhuang, Jingzhi Li, Bin Wu, International Journal of Data Science and Analytics, 2017.

[pdf] [bib]

Recommender systems have attracted much attention from multiple disciplines, and many techniques have been proposed to build them. Recently, social recommendation has become a hot research direction. Social recommendation methods tend to leverage social relations among users obtained from social networks to alleviate data sparsity and cold-start problems in recommender systems. ...
Self-organizing Weighted Incremental Probabilistic Latent Semantic Analysis

Ning Li, Wenjuan Luo, Kun Yang, Fuzhen Zhuang, Qing He, Zhongzhi Shi, International Journal of Machine Learning and Cybernetics, 2017.

[pdf] [bib]

PLSA (Probabilistic Latent Semantic Analysis) is a popular topic modeling technique that has been widely applied in text mining to discover the underlying topics embedded in a data corpus. However, due to the variability of increasing data, it is necessary to discover the dynamic topics and process the large dataset incrementally. ...
Supervised Representation Learning with Double Encoding-layer Autoencoder for Transfer Learning

Fuzhen Zhuang, Xiaohu Cheng, Ping Luo, Sinno Jialin Pan, Qing He, ACM Transactions on Intelligent Systems and Technology, 2017.

[pdf] [bib]

Transfer learning has gained a lot of attention and interest in the past decade. One crucial research issue in transfer learning is how to find a good representation for instances of different domains such that the divergence between domains can be reduced with the new representation. Recently, deep learning has been proposed to learn more robust or higher-level features for transfer learning. ...
Representation learning via Dual-Autoencoder for recommendation

Fuzhen Zhuang, Zhiqiang Zhang, Mingda Qian, Xing Xie, Qing He, Neural Networks, 2017.

[pdf] [bib]

Recommendation has attracted a vast amount of attention and research in recent decades. Most previous works employ matrix factorization techniques to learn the latent factors of users and items, and many subsequent works consider external information, e.g., social relationships of users and items’ attributes, ...
Semantic Feature Learning for Heterogeneous Multi-task Classification via Non-negative Matrix Factorization

Fuzhen Zhuang, Xuebing Li, Xin Jin, Dapeng Zhang, Lirong Qiu, Qing He, IEEE Transactions on Cybernetics, 2017.

[pdf] [bib]

Multitask learning (MTL) aims to learn multiple related tasks simultaneously instead of separately to improve the generalization performance of each task. Most existing MTL methods assumed that the multiple tasks to be learned have the same feature representation. However, this assumption may not hold for many real-world applications. ...
Online Bayesian Max-Margin Subspace Multi-View Learning

Jia He, Changying Du, Fuzhen Zhuang, Xin Yin, Qing He, Guoping Long, IJCAI, 2016.

[pdf] [bib]

The last decades have witnessed a number of studies devoted to multi-view learning algorithms; however, few efforts have been made to handle online multi-view learning scenarios. In this paper, we propose an online Bayesian multi-view learning algorithm to learn a predictive subspace with the max-margin principle. ...
Supervised Representation Learning: Transfer Learning with Deep Autoencoders

Fuzhen Zhuang, Xiaohu Cheng, Ping Luo, Sinno Jialin Pan, Qing He, IJCAI, 2015.

[pdf] [bib]

Transfer learning has attracted a lot of attention in the past decade. One crucial research issue in transfer learning is how to find a good representation for instances of different domains such that the divergence between domains can be reduced with the new representation. Recently, deep learning has been proposed to learn more robust or higher-level features for transfer learning. ...
Heterogeneous Multi-task Semantic Feature Learning for Classification

Xin Jin, Fuzhen Zhuang, Sinno Jialin Pan, Changying Du, Ping Luo, Qing He, CIKM, 2015.

[pdf] [bib]

Multi-task Learning (MTL) aims to learn multiple related tasks simultaneously instead of separately to improve the generalization performance of each task. Most existing MTL methods assumed that the multiple tasks to be learned have the same feature representation. However, this assumption may not hold for many real-world applications. ...
Representation Learning via Semi-supervised Autoencoder for Multi-task Learning

Fuzhen Zhuang, Dan Luo, Xin Jin, Hui Xiong, Ping Luo, Qing He, ICDM, 2015.

[pdf] [bib]

Multi-task learning aims at learning multiple related but different tasks. In general, there are two ways for multitask learning. One is to exploit the small set of labeled data from all tasks to learn a shared feature space for knowledge sharing. In this way, the focus is on the labeled training samples while the large amount of unlabeled data is not sufficiently considered. ...
Collaborating between Local and Global Learning for Distributed Online Multiple Tasks

Xin Jin, Ping Luo, Fuzhen Zhuang, Jia He, Qing He, CIKM, 2015.

[pdf] [bib]

This paper studies the novel learning scenario of Distributed Online Multi-tasks (DOM), where learning individuals with continuously arriving data are distributed separately and yet need to learn individual models collaboratively. It has three characteristics: distributed learning, ...
Bayesian Maximum Margin PCA

Changying Du, Shandian Zhe, Fuzhen Zhuang, Yuan Qi, Qing He, Zhongzhi Shi, AAAI, 2015.

[pdf] [bib]

Supervised dimensionality reduction has shown great advantages in finding predictive subspaces. Previous methods rarely consider the popular maximum margin principle and are prone to overfitting to usually small training data, especially for those under the maximum likelihood framework. In this paper, ...
Festival, Date and Limit Line: Predicting Vehicle Accident Rate in Beijing

Xinyu Wu, Ping Luo, Qing He, Tianshu Feng, Fuzhen Zhuang, SDM, 2015.

[pdf] [blog] [bib]

Thousands of vehicle accidents happen every day in Beijing, leading to huge losses. The government traffic management bureau, hospitals, and insurance companies devote massive manpower and material resources to dealing with accidents. For more reasonable resource assignment, in this study we focus on the prediction of the daily Vehicle Accident Rate (VAR), ...
Online Frequent Episode Mining

Xiang Ao, Ping Luo, Chengkai Li, Fuzhen Zhuang, Qing He, ICDE, 2015.

[pdf] [bib]

Frequent episode mining is a popular framework for discovering sequential patterns from sequence data. Previous studies on this topic usually process data offline in a batch mode. However, for fast-growing sequence data, old episodes may become obsolete while new useful episodes keep emerging. More importantly, ...
QPLSA: Utilizing quad-tuples for aspect identification and rating

Wenjuan Luo, Fuzhen Zhuang, Weizhong Zhao, Qing He, Zhongzhi Shi, Physica A, 2015.

[pdf] [bib]

Aspect level sentiment analysis is important for numerous opinion mining and market analysis applications. In this paper, we study the problem of identifying and rating review aspects, which is the fundamental task in aspect level sentiment analysis. Previous review aspect analysis methods seldom consider entity or rating but only 2-tuples, ...
Combining supervised and unsupervised models via unconstrained probabilistic embedding

Xiang Ao, Ping Luo, Xudong Ma, Fuzhen Zhuang, Qing He, Zhongzhi Shi, Zhiyong Shen, Information Sciences, 2014.

[pdf] [bib]

In this study, we consider an ensemble problem in which we combine outputs from models developed in the supervised and unsupervised modes. By jointly considering the grouping results from unsupervised models, we aim to improve the classification accuracy of the supervised model ensemble. Here, ...
Triplex transfer learning: exploiting both shared and distinct concepts for text classification

Fuzhen Zhuang, Ping Luo, Changying Du, Qing He, Zhongzhi Shi, Hui Xiong, IEEE Transactions on Cybernetics, 2014.

[pdf] [bib]

Transfer learning focuses on learning scenarios in which the test data from target domains and the training data from source domains are drawn from similar but different data distributions with respect to the raw features. Some recent studies argued that high-level concepts (e.g., word clusters) can help model the data distribution difference, ...
Nonparametric Bayesian Multi-Task Large-margin Classification

Changying Du, Jia He, Fuzhen Zhuang, Yuan Qi, Qing He, ECAI, 2014.

[pdf] [bib]

In this paper, we present a nonparametric Bayesian multi-task large-margin classification model which can cluster tasks into the most appropriate number of groups and induce flexible model sharing within each task group simultaneously. Specifically, we first show a very simple method to integrate large margin learning with hierarchical Bayesian models by employing an important variant of the standard SVM. ...
Transfer Learning with Multiple Sources via Consensus Regularized Autoencoders

Fuzhen Zhuang, Xiaohu Cheng, Sinno Jialin Pan, Wenchao Yu, Qing He, Zhongzhi Shi, ECML/PKDD, 2014.

[pdf] [bib]

Knowledge transfer from multiple source domains to a target domain is crucial in transfer learning. Most existing methods focus on learning weights for different domains based on the similarities between each source domain and the target domain, or on learning more precise classifiers from the source domain data jointly by maximizing their consensus of predictions on the target domain data. ...
Discovering and learning sensational episodes of news events

Xiang Ao, Ping Luo, Chengkai Li, Fuzhen Zhuang, Qing He, Zhongzhi Shi, WWW, 2014.

[pdf] [bib]

This paper studies the problem of discovering and learning sensational 2-episodes, i.e., pairs of co-occurring news events. To find all frequent episodes, we propose an efficient algorithm, MEELO, which significantly outperforms conventional methods. Given many frequent episodes, we rank them by their sensational effect. ...
Concept Learning for Cross-Domain Text Classification: A General Probabilistic Framework

Fuzhen Zhuang, Ping Luo, Peifeng Yin, Qing He, Zhongzhi Shi, IJCAI, 2013.

[pdf] [bib]

Cross-domain learning aims at leveraging the knowledge from source domains to train accurate models for the test data from target domains with different but related data distributions. To tackle the challenge of data distribution difference in terms of raw features, previous works proposed to mine high-level concepts (e. ...
Parallel sampling from big data with uncertainty distribution

Qing He, Haocheng Wang, Fuzhen Zhuang, Tianfeng Shang, Zhongzhi Shi, Fuzzy Sets and Systems, 2015.

[pdf] [bib]

Data are inherently uncertain in most applications. Uncertainty is encountered when an experiment such as sampling is carried out whose result is not known in advance and which can lead to a variety of potential outcomes. With the rapid development of data collection and distributed storage technologies, ...
Embedding with Autoencoder Regularization

Wenchao Yu, Guangxiang Zeng, Ping Luo, Fuzhen Zhuang, Qing He, Zhongzhi Shi, ECML/PKDD, 2013.

[pdf] [bib]

The problem of embedding arises in many machine learning applications under the assumption that a small number of variabilities can guarantee the “semantics” of the original high-dimensional data. Most existing embedding algorithms aim to maintain the locality-preserving property. ...
Shared Structure Learning for Multiple Tasks with Multiple Views

Xin Jin, Fuzhen Zhuang, Shuhui Wang, Qing He, Zhongzhi Shi, ECML/PKDD, 2013.

[pdf] [bib]

Real-world problems usually exhibit dual-heterogeneity, i.e., every task in the problem has features from multiple views, and multiple tasks are related to each other through one or more shared views. To solve these multi-task problems with multiple views, we propose a shared structure learning framework, ...
Triplex transfer learning: exploiting both shared and distinct concepts for text classification

Fuzhen Zhuang, Ping Luo, Changying Du, Qing He, Zhongzhi Shi, WSDM, 2013.

[pdf] [bib]

Transfer learning focuses on learning scenarios in which the test data from target domains and the training data from source domains are drawn from similar but different data distributions with respect to the raw features. Some recent studies argued that high-level concepts (e.g., word clusters) can help model the data distribution difference, ...
Parallel Feature Selection Using Positive Approximation Based on MapReduce

Qing He, Xiaohu Cheng, Fuzhen Zhuang, Zhongzhi Shi, International Conference on Fuzzy Systems and Knowledge Discovery, 2014.

[pdf] [bib]

Over the last few decades, feature selection has been a hot research area in pattern recognition and machine learning, and many well-known feature selection algorithms have been proposed. Among them, feature selection using positive approximation (FSPA) is an accelerator for traditional rough set based feature selection algorithms, ...
Scalable Bootstrap Clustering for Massive Data

Haocheng Wang, Fuzhen Zhuang, Qing He, SNPD, 2014.

[pdf] [bib]

The bootstrap provides a simple and powerful means of improving the accuracy of clustering. However, for today’s increasingly large datasets, the computation of bootstrap-based quantities can be prohibitively demanding. In this paper we introduce the Bag of Little Bootstraps Clustering (BLBC), a new procedure which utilizes the Bag of Little Bootstraps technique to obtain a robust, ...
Energy model for rumor propagation on social networks

Shuo Han, Fuzhen Zhuang, Qing He, Zhongzhi Shi, Xiang Ao, Physica A: Statistical Mechanics and its Applications, 2013.

[pdf] [blog] [bib]

With the development of social networks, the impact of rumor propagation on human lives has become more and more significant. Due to the change in propagation mode, traditional rumor propagation models designed for the word-of-mouth process may not be suitable for describing rumor spreading on social networks. ...
Ratable Aspects over Sentiments: Predicting Ratings for Unrated Reviews

Wenjuan Luo, Fuzhen Zhuang, Xiaohu Cheng, Qing He, Zhongzhi Shi, ICDM, 2014.

[pdf] [bib]

Most existing ratable aspect generating methods for aspect mining focus on identifying and rating aspects of reviews with overall ratings, while the huge amount of unrated reviews is beyond their ability. This drawback motivates the research problem in this paper: predicting aspect ratings and overall ratings for unrated reviews. ...
Multi-task Multi-view Learning for Heterogeneous Tasks

Xin Jin, Fuzhen Zhuang, Hui Xiong, Changying Du, Ping Luo, Qing He, CIKM, 2014.

[pdf] [bib]

Balanced Seed Selection for Budgeted Influence Maximization in Social Networks

Shuo Han, Fuzhen Zhuang, Qing He, Zhongzhi Shi, PAKDD, 2014.

[pdf] [bib]

Given a budget and a network where different nodes have different costs to be selected, the budgeted influence maximization problem is to select seeds within the budget so that the number of final influenced nodes is maximized. In this paper, we propose three strategies to solve this problem. First, the Billboard strategy chooses the most influential nodes as the seeds. ...
Clustering in extreme learning machine feature space

Qing He, Xin Jin, Changying Du, Fuzhen Zhuang, Zhongzhi Shi, Neurocomputing, 2014.

[pdf] [bib]

Extreme learning machine (ELM), used for the “generalized” single-hidden-layer feedforward networks (SLFNs), is a unified learning platform that can use a widespread type of feature mappings. In theory, ELM can approximate any target continuous function and classify any disjoint regions; in application, ...
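PDMiner: A Cloud Computing-Based Parallel Distributed Data Mining Tool Platform

Qing He, Fuzhen Zhuang, Li Zeng, Scientia Sinica Informationis (中国科学: 信息科学), 2014.

[pdf] [bib]

With the development of information technology and the Internet, information of all kinds is growing explosively and contains rich knowledge. Mining useful knowledge from massive data remains a challenging problem. In recent decades, data mining, as a key technology for extracting useful information from massive data, has attracted wide interest and research. However, as data scale grows, much previous work cannot handle large-scale data effectively; therefore, designing new algorithms or extending existing ones to handle large-scale datasets has become a very important research topic in data mining. In recent years, cloud computing-based data mining has become a hot topic. In this paper we develop PDMiner, a parallel distributed data mining tool platform built on Hadoop, a large-scale data processing platform. In PDMiner, various parallel data mining algorithms are implemented, ...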
Parallel extreme learning machine for regression based on MapReduce

Qing He, Tianfeng Shang, Fuzhen Zhuang, Zhongzhi Shi, Neurocomputing, 2013.

[pdf] [bib]

Regression is one of the most basic problems in data mining. For regression problems, the extreme learning machine (ELM) can achieve better generalization performance at a much faster learning speed. However, the growing volume of datasets makes regression by ELM on very large-scale datasets a challenging task. ...
Exploiting relevance, coverage, and novelty for query-focused multi-document summarization

Wenjuan Luo, Fuzhen Zhuang, Qing He, Zhongzhi Shi, Knowledge-Based Systems, 2013.

[pdf] [bib]

Summarization plays an increasingly important role with the exponential document growth on the Web. Specifically, for query-focused summarization, there exist three challenges: (1) how to retrieve query relevant sentences; (2) how to concisely cover the main aspects (i.e., topics) in the document; and (3) how to balance these two requests. ...
Particle swarm optimization using dimension selection methods

Xin Jin, Yongquan Liang, Dongping Tian, Fuzhen Zhuang, Applied Mathematics and Computation, 2013.

[pdf] [bib]

Particle swarm optimization (PSO) has undergone many changes since its introduction in 1995. Because PSO is a stochastic algorithm, its randomness presents a formidable challenge for theoretical analysis, and few of the existing PSO improvements have made an effort to eliminate the random coefficients in the PSO updating formula. ...
A parallel incremental extreme SVM classifier

Qing He, Changying Du, Qun Wang, Fuzhen Zhuang, Zhongzhi Shi, Neurocomputing, 2011.

[pdf] [bib]

The recently proposed classification algorithm extreme SVM (ESVM) has been shown to provide very good generalization performance in a relatively short time; however, it is ill-suited to large-scale data sets because of its highly intensive computation. Thus we propose to implement an efficient parallel ESVM (PESVM) based on the current and powerful parallel programming framework MapReduce. ...
Inductive Transfer Learning for Unlabeled Target-domain via Hybrid Regularization

Fuzhen Zhuang, Ping Luo, Qing He, ZhongZhi Shi, Chinese Science Bulletin, 2009.

[pdf] [bib]

Recent years have witnessed an increasing interest in transfer learning. This paper deals with the classification problem in which the target domain, with a different distribution from the source domain, is totally unlabeled, and aims to build an inductive model for unseen data. Firstly, we analyze the problem of class ratio drift in the previous work of transductive transfer learning, ...
Multi-Task Semi-Supervised Semantic Feature Learning for Classification

Changying Du, Fuzhen Zhuang, Qing He, Zhongzhi Shi, ICDM, 2012.

[pdf] [bib]

Multi-task learning has proven to be useful to boost the learning of multiple related but different tasks. Meanwhile, latent semantic models such as LSA and NMF are popular and effective methods to extract discriminative semantic features of high dimensional dyadic data. In this paper, we present a method to combine these two techniques together by introducing a new matrix tri-factorization based formulation for semi-supervised latent semantic learning, ...
Multi-view learning via probabilistic latent semantic analysis

Fuzhen Zhuang, George Karypis, Xia Ning, Qing He, Zhongzhi Shi, Information Sciences, 2012.

[pdf] [bib]

Multi-view learning has aroused a vast amount of interest in the past decades, with numerous real-world applications in web page analysis, bioinformatics, image processing, and so on. Unlike most previous works, which follow the idea of co-training, in this paper we propose a new generative model for Multi-view Learning via Probabilistic Latent Semantic Analysis, ...
Mining Distinction and Commonality across Multiple Domains Using Generative Model for Text Classification

Fuzhen Zhuang, Ping Luo, Zhiyong Shen, Qing He, Yuhong Xiong, Zhongzhi Shi, Hui Xiong, IEEE Transactions on Knowledge and Data Engineering, 2012.

[pdf] [bib]

The distribution difference among multiple data domains has been considered for the cross-domain text classification problem. In this study, we show two new observations along this line. First, the data distribution difference may come from the fact that different domains use different key words to express the same concept. ...
Combining Supervised and Unsupervised Models via Unconstrained Probabilistic Embedding

Xudong Ma, Ping Luo, Fuzhen Zhuang, Qing He, Zhongzhi Shi, Zhiyong Shen, IJCAI, 2011.

[pdf] [bib]

Ensemble learning with output from multiple supervised and unsupervised models aims to improve the classification accuracy of supervised model ensemble by jointly considering the grouping results from unsupervised models. In this paper we cast this ensemble task as an unconstrained probabilistic embedding problem. ...
D-LDA: A Topic Modeling Approach without Constraint Generation for Semi-defined Classification

Fuzhen Zhuang, Ping Luo, Zhiyong Shen, Qing He, Yuhong Xiong, Zhongzhi Shi, ICDM, 2010.

[pdf] [bib]

We study what we call semi-defined classification, which deals with the categorization tasks where the taxonomy of the data is not well defined in advance. It is motivated by the real-world applications, where the unlabeled data may also come from some other unknown classes besides the known classes for the labeled data. ...
Collaborative Dual-PLSA: mining distinction and commonality across multiple domains for text classification

Fuzhen Zhuang, Ping Luo, Zhiyong Shen, Qing He, Yuhong Xiong, Zhongzhi Shi, Hui Xiong, CIKM, 2010.

[pdf] [bib]  Best Paper Candidate

The distribution difference among multiple data domains has been considered for the cross-domain text classification problem. In this study, we show two new observations along this line. First, the data distribution difference may come from the fact that different domains use different key words to express the same concept. ...
Exploiting Associations between Word Clusters and Document Classes for Cross-Domain Text Categorization

Fuzhen Zhuang, Ping Luo, Hui Xiong, Qing He, Yuhong Xiong, Zhongzhi Shi, SDM, 2010.

[pdf] [bib]  Best Paper Candidate

Cross-domain text categorization aims at adapting the knowledge learnt from a labeled source domain to an unlabeled target domain, where the documents from the source and target domains are drawn from different distributions. However, in spite of the different distributions in raw-word features, the associations between word clusters (conceptual features) and document classes may remain stable across different domains. ...
Cross-Domain Learning from Multiple Sources: A Consensus Regularization Perspective

Fuzhen Zhuang, Ping Luo, Hui Xiong, Yuhong Xiong, Qing He, Zhongzhi Shi, IEEE Transactions on Knowledge and Data Engineering, 2010.

[pdf] [bib]

Classification across different domains studies how to adapt a learning model from one domain to another domain which shares similar data characteristics. While there are a number of existing works along this line, many of them are only focused on learning from a single source domain to a target domain. ...
Transfer learning from multiple source domains via consensus regularization

Ping Luo, Fuzhen Zhuang, Hui Xiong, Yuhong Xiong, Qing He, CIKM, 2008.

[pdf] [bib]

Recent years have witnessed an increased interest in transfer learning. Despite the vast amount of research performed in this field, there are remaining challenges in applying the knowledge learnt from multiple source domains to a target domain. First, data from multiple source domains can be semantically related, ...