Multi‐transfer: Transfer learning with multiple views and multiple sources
Author(s) -
Tan Ben,
Zhong Erheng,
Xiang Evan Wei,
Yang Qiang
Publication year - 2014
Publication title -
Statistical Analysis and Data Mining: The ASA Data Science Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.381
H-Index - 33
eISSN - 1932-1872
pISSN - 1932-1864
DOI - 10.1002/sam.11226
Subject(s) - computer science , leverage (statistics) , transfer of learning , exploit , complement (music) , domain (mathematical analysis) , knowledge transfer , artificial intelligence , machine learning , transfer (computing) , information retrieval , data mining , biochemistry , chemistry , knowledge management , computer security , mathematics , complementation , mathematical analysis , parallel computing , gene , phenotype
Transfer learning, which aims to help learning tasks in a target domain by leveraging knowledge from auxiliary domains, has been demonstrated to be effective in applications such as text mining and sentiment analysis. In many real‐world applications, auxiliary data are described from multiple perspectives and are usually carried by multiple sources. For example, to help classify videos on YouTube, which involve three perspectives (image, voice and subtitles), one may borrow data from Flickr, Last.FM and Google News. Although any single instance in these domains covers only part of the views available on YouTube, the pieces of information they carry may complement one another. If we can exploit these auxiliary domains collectively and transfer the knowledge to the target domain, we can improve the target model from multiple perspectives. In this article, we refer to this problem as Transfer Learning with Multiple Views and Multiple Sources. Because different sources may have different probability distributions, and different views may complement or be inconsistent with each other, simply merging all the data will not give an optimal result. We therefore propose a novel algorithm that leverages knowledge from different views and sources collaboratively: different views from different sources complement each other through a co‐training style framework, while the algorithm also corrects for the distribution differences across domains. Empirical studies on several real‐world datasets show that the proposed approach improves classification accuracy by up to 8% over several state‐of‐the‐art baselines.
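The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes binary labels, one labeled source per view, and unlabeled target rows that have every view. The nearest-centroid classifier, the distance-based `source_weights` heuristic, and all function and parameter names are hypothetical stand-ins chosen for brevity.

```python
import numpy as np

def fit_centroids(X, y, w):
    """Weighted mean of each class (labels 0 and 1)."""
    return np.stack([np.average(X[y == c], axis=0, weights=w[y == c]) for c in (0, 1)])

def margins(centroids, X):
    """Positive when a row is closer to the class-1 centroid than to class 0."""
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return d[:, 0] - d[:, 1]

def source_weights(X_src, X_tgt):
    """Assumed heuristic: source rows far from the target mean get less weight."""
    dist = np.linalg.norm(X_src - X_tgt.mean(axis=0), axis=1)
    return np.exp(-dist / (dist.mean() + 1e-12))

def co_train(sources, tgt_views, rounds=5, per_round=10):
    """sources[v] = (X, y): labeled data from the source that covers view v.
    tgt_views[v]: unlabeled target rows in view v, aligned across views."""
    n_tgt = tgt_views[0].shape[0]
    weights = [source_weights(X, T) for (X, _), T in zip(sources, tgt_views)]
    pseudo = np.full(n_tgt, -1)  # -1 marks target rows without a pseudo-label

    def train(v):
        # Fit view v on its reweighted source plus all pseudo-labeled target rows.
        X_s, y_s = sources[v]
        done = pseudo >= 0
        X = np.vstack([X_s, tgt_views[v][done]])
        y = np.concatenate([y_s, pseudo[done]])
        w = np.concatenate([weights[v], np.ones(done.sum())])
        return fit_centroids(X, y, w)

    for _ in range(rounds):
        for v in range(len(sources)):
            m = margins(train(v), tgt_views[v])
            open_rows = np.flatnonzero(pseudo < 0)
            # This view labels the target rows it is most confident about;
            # the other views train on those labels in their next turn.
            top = open_rows[np.argsort(-np.abs(m[open_rows]))[:per_round]]
            pseudo[top] = (m[top] > 0).astype(int)

    # Final prediction: average the per-view margins over all target rows.
    avg = np.mean([margins(train(v), tgt_views[v]) for v in range(len(sources))], axis=0)
    return (avg > 0).astype(int)

if __name__ == "__main__":
    rng = np.random.default_rng(0)

    def view(y, shift):
        # Synthetic 2-D view: class 0 near (-2, -2), class 1 near (2, 2), plus a shift.
        return rng.normal(size=(len(y), 2)) + (4 * y[:, None] - 2) + shift

    y_t = rng.integers(0, 2, 200)
    sources = []
    for _ in range(2):
        y_s = rng.integers(0, 2, 100)
        sources.append((view(y_s, 0.0), y_s))
    pred = co_train(sources, [view(y_t, 0.5), view(y_t, 0.5)])
    print("target accuracy:", (pred == y_t).mean())
```

In this sketch each view takes turns pseudo-labeling the target rows it is most sure of, so a view backed by one source can benefit from labels produced by a view backed by another source. The reweighting step is a crude proxy for distribution correction; a density-ratio estimator would be a more principled choice.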