Correntropy-based density-preserving data sampling as an alternative to standard cross-validation

Authors: Budka, M. and Gabrys, B.

Journal: Proceedings of the International Joint Conference on Neural Networks

ISBN: 9781424469178

DOI: 10.1109/IJCNN.2010.5596717

Abstract:

Estimating the generalization ability of a predictive model is an important issue, as it indicates expected performance on previously unseen data and is also used for model selection. Commonly used generalization error estimation procedures such as cross-validation (CV) and the bootstrap are stochastic and thus require multiple repetitions to produce reliable results, which can be computationally expensive, if not prohibitive. The correntropy-based Density Preserving Sampling (DPS) procedure proposed in this paper eliminates the need to repeat the error estimation procedure by dividing the available data into subsets that are guaranteed to be representative of the input dataset. This makes it possible to produce low-variance error estimates with accuracy comparable to 10 times repeated cross-validation at a fraction of the computation required by CV, as investigated using a set of publicly available benchmark datasets and standard classifiers. © 2010 IEEE.
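
The baseline the abstract compares against, 10 times repeated 10-fold cross-validation, can be sketched as follows. This is only an illustration of why CV is stochastic and needs repetition; it is not the DPS procedure, whose details are not given in this record. The toy data generator, the nearest-class-mean classifier, and all function names are assumptions made for the example.

```python
import random
import statistics


def make_data(n_per_class, rng):
    """Two 1-D Gaussian classes: a toy stand-in for a benchmark dataset (assumption)."""
    data = [(rng.gauss(0.0, 1.0), 0) for _ in range(n_per_class)]
    data += [(rng.gauss(2.0, 1.0), 1) for _ in range(n_per_class)]
    return data


def nearest_mean_error(train, test):
    """Error rate of a nearest-class-mean classifier (hypothetical 'standard classifier')."""
    means = {}
    for label in (0, 1):
        xs = [x for x, y in train if y == label]
        means[label] = sum(xs) / len(xs)
    wrong = sum(
        1 for x, y in test
        if min(means, key=lambda c: abs(x - means[c])) != y
    )
    return wrong / len(test)


def kfold_error(data, k, rng):
    """One run of k-fold CV on a random partition; returns the mean fold error."""
    idx = list(range(len(data)))
    rng.shuffle(idx)  # the random partition is the source of stochasticity
    folds = [idx[i::k] for i in range(k)]
    errors = []
    for fold in folds:
        held_out = set(fold)
        test = [data[j] for j in fold]
        train = [data[j] for j in idx if j not in held_out]
        errors.append(nearest_mean_error(train, test))
    return sum(errors) / k


def repeated_cv(data, k=10, repeats=10, seed=0):
    """Repeat k-fold CV with a fresh random partition each time."""
    rng = random.Random(seed)
    return [kfold_error(data, k, rng) for _ in range(repeats)]


data = make_data(50, random.Random(42))
estimates = repeated_cv(data)
print("per-repetition estimates:", [round(e, 3) for e in estimates])
print("mean:", round(statistics.mean(estimates), 3),
      "spread (population stdev):", round(statistics.pstdev(estimates), 4))
```

Each repetition trains and tests the model k times, so 10 repetitions of 10-fold CV cost 100 model fits. A deterministic split into representative subsets, as the abstract describes for DPS, would need only a single pass over the folds.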

http://eprints.bournemouth.ac.uk/16572/

Source: Scopus

Correntropy-based density-preserving data sampling as an alternative to standard cross-validation

Authors: Budka, M. and Gabrys, B.

Journal: 2010 International Joint Conference on Neural Networks (IJCNN 2010)

ISSN: 2161-4393

http://eprints.bournemouth.ac.uk/16572/

Source: Web of Science (Lite)

Correntropy-based density-preserving data sampling as an alternative to standard cross-validation

Authors: Budka, M. and Gabrys, B.

Conference: World Congress on Computational Intelligence (WCCI 2010)

http://eprints.bournemouth.ac.uk/16572/

Source: Manual

Correntropy-based density-preserving data sampling as an alternative to standard cross-validation

Authors: Budka, M. and Gabrys, B.

Pages: 1-8

Publisher: IEEE

ISBN: 978-1-4244-6916-1

DOI: 10.1109/IJCNN.2010.5596717

http://eprints.bournemouth.ac.uk/16572/

http://ieeexplore.ieee.org/search/srchabstract.jsp?tp=&arnumber=5596717&queryText%3DBudka%26openedRefinements%3D*%26searchField%3DSearch+All

Source: Manual

Correntropy-based density-preserving data sampling as an alternative to standard cross-validation

Authors: Budka, M. and Gabrys, B.

Conference: World Congress on Computational Intelligence (WCCI 2010)

Dates: 18-23 July 2010

Pages: 1-8

Publisher: IEEE

ISBN: 9781424481262

ISSN: 1098-7576

DOI: 10.1109/IJCNN.2010.5596717

http://eprints.bournemouth.ac.uk/16572/

http://ieeexplore.ieee.org/search/srchabstract.jsp?tp=&arnumber=5596717&queryText%3DBudka%26openedRefinements%3D*%26searchField%3DSearch+All

Source: Manual

Preferred by: Marcin Budka

Correntropy-based density-preserving data sampling as an alternative to standard cross-validation.

Authors: Budka, M. and Gabrys, B.

Journal: IJCNN

Pages: 1-8

Publisher: IEEE

ISBN: 978-1-4244-6916-1

http://eprints.bournemouth.ac.uk/16572/

http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=5581822

Source: DBLP

Correntropy-based density-preserving data sampling as an alternative to standard cross-validation

Authors: Budka, M. and Gabrys, B.

Pages: 1-8

Publisher: IEEE

ISBN: 978-1-4244-6916-1

ISSN: 1098-7576

http://eprints.bournemouth.ac.uk/16572/

http://ieeexplore.ieee.org/search/srchabstract.jsp?tp=&arnumber=5596717&queryText%3DBudka%26openedRefinements%3D*%26searchField%3DSearch+All

Source: BURO EPrints