A Bi-Criteria Active Learning Algorithm for Dynamic Data Streams

Authors: Mohamad, S., Bouchachia, A. and Sayed-Mouchaweh, M.

Journal: IEEE Transactions on Neural Networks and Learning Systems

Volume: 29

Issue: 1

Pages: 74-86

eISSN: 2162-2388

ISSN: 2162-237X

DOI: 10.1109/TNNLS.2016.2614393

Abstract:

Active learning (AL) is a promising way to efficiently build up training sets with minimal supervision. A learner deliberately queries specific instances to tune the classifier's model using as few labels as possible. The challenge for streaming is that the data distribution may evolve over time, and therefore the model must adapt. Another challenge is the sampling bias where the sampled training set does not reflect the underlying data distribution. In the presence of concept drift, sampling bias is more likely to occur as the training set needs to represent the whole evolving data. To tackle these challenges, we propose a novel bi-criteria AL (BAL) approach that relies on two selection criteria, namely, label uncertainty criterion and density-based criterion. While the first criterion selects instances that are the most uncertain in terms of class membership, the latter dynamically curbs the sampling bias by weighting the samples to reflect on the true underlying distribution. To design and implement these two criteria for learning from streams, BAL adopts a Bayesian online learning approach and combines online classification and online clustering through the use of online logistic regression and online growing Gaussian mixture models, respectively. Empirical results obtained on standard synthetic and real-world benchmarks show the high performance of the proposed BAL method compared with the state-of-the-art AL methods.

https://eprints.bournemouth.ac.uk/24782/

Source: Scopus

A Bi-Criteria Active Learning Algorithm for Dynamic Data Streams.

Authors: Mohamad, S., Bouchachia, A. and Sayed-Mouchaweh, M.

Journal: IEEE Trans Neural Netw Learn Syst

Volume: 29

Issue: 1

Pages: 74-86

eISSN: 2162-2388

DOI: 10.1109/TNNLS.2016.2614393

https://eprints.bournemouth.ac.uk/24782/

Source: PubMed

A Bi-Criteria Active Learning Algorithm for Dynamic Data Streams

Authors: Mohamad, S., Bouchachia, A. and Sayed-Mouchaweh, M.

Journal: IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS

Volume: 29

Issue: 1

Pages: 74-86

eISSN: 2162-2388

ISSN: 2162-237X

DOI: 10.1109/TNNLS.2016.2614393

https://eprints.bournemouth.ac.uk/24782/

Source: Web of Science (Lite)

A Bi-Criteria Active Learning Algorithm for Dynamic Data Streams

Authors: Mohamad, S., Bouchachia, A., Sayed-Mouchaweh, M. and Mohamad, M.

Journal: IEEE Transactions on Neural Networks and Learning Systems

Publisher: Institute of Electrical and Electronics Engineers (IEEE)

eISSN: 2162-2388

https://eprints.bournemouth.ac.uk/24782/

Source: Manual

A Bi-Criteria Active Learning Algorithm for Dynamic Data Streams.

Authors: Mohamad, S., Bouchachia, A. and Sayed-Mouchaweh, M.

Journal: IEEE transactions on neural networks and learning systems

Volume: 29

Issue: 1

Pages: 74-86

eISSN: 2162-2388

ISSN: 2162-237X

DOI: 10.1109/TNNLS.2016.2614393

https://eprints.bournemouth.ac.uk/24782/

Source: Europe PubMed Central

A Bi-Criteria Active Learning Algorithm for Dynamic Data Streams

Authors: Mohamad, S., Bouchachia, A. and Sayed-Mouchaweh, M.

Journal: IEEE Transactions on Neural Networks and Learning Systems

Volume: 29

Issue: 1

Pages: 74-86

eISSN: 2162-2388

Abstract:

Active learning (AL) is a promising way to efficiently build up training sets with minimal supervision. A learner deliberately queries specific instances to tune the classifier’s model using as few labels as possible. The challenge for streaming is that the data distribution may evolve over time, and therefore the model must adapt. Another challenge is the sampling bias where the sampled training set does not reflect the underlying data distribution. In the presence of concept drift, sampling bias is more likely to occur as the training set needs to represent the whole evolving data. To tackle these challenges, we propose a novel bi-criteria AL approach (BAL) that relies on two selection criteria, namely, label uncertainty criterion and density-based criterion. While the first criterion selects instances that are the most uncertain in terms of class membership, the latter dynamically curbs the sampling bias by weighting the samples to reflect on the true underlying distribution. To design and implement these two criteria for learning from streams, BAL adopts a Bayesian online learning approach and combines online classification and online clustering through the use of online logistic regression and online growing Gaussian mixture models, respectively. Empirical results obtained on standard synthetic and real-world benchmarks show the high performance of the proposed BAL method compared to the state-of-the-art AL methods.

https://eprints.bournemouth.ac.uk/24782/

Source: BURO EPrints