Self-learning Recursive Neural Networks for Structured Data Classification

© 2014 IEEE. Automatic classification of structured data is a challenging task, and its relevance to many domains is evident. However, collecting labeled data can be expensive and is sometimes prone to mislabeling. One technical solution is to train a classifier on a few labeled samples combined with a large number of unlabeled samples. Accordingly, this paper deals with the classification of partially labeled tree-structured data. To carry out this task, we propose an adapted variant of recursive neural networks (RNNs) equipped with semi-supervision mechanisms capable of learning from both labeled and unlabeled tree-structured data. Specifically, the RNNs rely on self-learning to actively pre-label unlabeled data, which are then combined with the originally labeled data during the learning process. The semi-supervised RNN approach is presented and evaluated on a real-world collection of Extensible Markup Language (XML) documents in the context of digital libraries. Initial empirical experiments show high-quality results.
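
The self-learning idea described above (pre-label unlabeled samples with the current model, then retrain on the enlarged labeled set) can be illustrated with a minimal sketch. This is not the paper's method: a simple nearest-centroid classifier on fixed-length feature vectors stands in for the recursive neural network, and the function names, the confidence measure, and the `threshold` and `max_rounds` parameters are all assumptions made for illustration.

```python
import math


def centroids(X, y):
    """Fit the stand-in model: one mean feature vector per class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}


def predict_with_confidence(model, x):
    """Return (label, confidence); confidence is a normalized exp(-distance)."""
    scores = {c: math.exp(-math.dist(x, mu)) for c, mu in model.items()}
    total = sum(scores.values())
    label = max(scores, key=scores.get)
    return label, scores[label] / total


def self_train(X_lab, y_lab, X_unlab, threshold=0.8, max_rounds=10):
    """Generic self-learning loop (illustrative, not the authors' algorithm)."""
    X_train, y_train = list(X_lab), list(y_lab)
    pool = list(X_unlab)
    model = centroids(X_train, y_train)
    for _ in range(max_rounds):
        confident, remaining = [], []
        for x in pool:
            label, conf = predict_with_confidence(model, x)
            # Only confidently pre-labeled samples join the training set.
            (confident if conf >= threshold else remaining).append((x, label))
        if not confident:
            break  # nothing new to learn from
        for x, label in confident:
            X_train.append(x)
            y_train.append(label)
        pool = [x for x, _ in remaining]
        # Retrain on original labels plus the new pseudo-labels.
        model = centroids(X_train, y_train)
    return model, pool
```

Calling `self_train(X_lab, y_lab, X_unlab)` returns the final model and the samples that were never confidently pre-labeled. In the setting of the paper, the stand-in classifier would be replaced by an RNN operating on tree-structured inputs, and the confidence criterion would depend on that model's outputs.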

