Semi-Supervised Discriminative Transfer Learning in Cross-Language Text Classification

Document Type

Conference Proceeding

Publication Date


Publication Title

2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA)


Institute of Electrical and Electronics Engineers

Publisher Location

Boca Raton, FL

First page number:


Last page number:



Cross-Language Text Classification (CLTC) has attracted increasing attention as the volume of multilingual data grows exponentially. CLTC aims to classify text documents in a label-scarce language by leveraging classification information from a label-rich language. We propose a novel semi-supervised Discriminative Transfer Learning (DTL) method for the CLTC problem. A small number of paired, labeled bilingual documents is used to construct a discriminative transfer model that maximizes the correlation between the documents in the two languages, while a large amount of unlabeled data is used for accurate data reconstruction. The discriminative transfer model minimizes the discrepancy between the bilingual subspaces while prioritizing discriminative features, which improves text classification performance without the automatic machine translation that most state-of-the-art methods require. The performance of DTL is assessed empirically and statistically through extensive experiments on the publicly available Reuters RCV1/RCV2 collections. The experimental results demonstrate that DTL outperforms several representative state-of-the-art CLTC methods in terms of accuracy and efficiency.


Cross-language text classification; Semi-supervised learning; Subspace learning


Computer Sciences


