Meta-level learning for the effective reduction of model search space.
Authors: Abbas, A.
Conference: Bournemouth University, Faculty of Science and Technology
Abstract:
The exponential growth in the volume, variety and velocity of data is raising the need for intelligent ways to extract useful patterns from it. Finding the mapping of learning methods that leads to optimized performance on a given task requires deep expert knowledge and extensive computational resources. Moreover, the numerous configurations of these learning algorithms add another level of complexity. This creates the need for an intelligent recommendation engine that can advise the best learning algorithm and its configuration for a given task. The techniques commonly used by experts include trial-and-error and reliance on prior experience in the specific domain. These techniques sometimes work for less complex tasks that require learning only thousands of parameters. However, state-of-the-art models, e.g. deep learning models, require well-tuned hyper-parameters to learn millions of parameters, which demands specialized skills and numerous computationally expensive and time-consuming trials. In that scenario, meta-level learning is a potential solution that can recommend the most appropriate options efficiently and effectively regardless of the complexity of the data. However, meta-learning brings several challenges of its own, the most critical being model selection and hyper-parameter optimization.
The goal of this research is to investigate the model selection and hyper-parameter optimization approaches of automatic machine learning in general, and the challenges associated with them. In a machine learning pipeline there are several phases where meta-learning can be used to facilitate the best recommendations, including 1) pre-processing steps, 2) learning algorithms or their combinations, 3) adaptivity mechanism parameters, 4) recurring concept extraction, and 5) concept drift detection. The scope of this research is limited to feature engineering for problem representation, and to the learning strategy for recommending an algorithm and its hyper-parameters at the meta-level.
Three studies are conducted around two different approaches of automatic machine learning: model selection using meta-learning, and hyper-parameter optimization. The first study evaluates the situations in which additional data from a different domain can improve the performance of a meta-learning system for time-series forecasting, with a focus on cross-domain meta-knowledge transfer. Although the experiments revealed limited room for improvement over the overall best base-learner, the meta-learning approach turned out to be a safe choice, minimizing the risk of selecting the least appropriate base-learner: in only 2% of cases did meta-learning recommend the worst-performing base-learning method. The second study proposes another efficient and accurate domain adaptation approach, using a different meta-learning approach. This study empirically confirms the intuition that there is a relationship between the similarity of two different tasks and the depth of the network that needs to be fine-tuned in order to achieve accuracy comparable with that of a model trained from scratch. However, the approach is limited to a single hyper-parameter, namely the network depth to fine-tune, chosen on the basis of task similarity. The final study expands the set of hyper-parameters while implicitly considering task similarity through the intrinsic dynamics of the training process. The study presents a framework that automatically finds a good set of hyper-parameters resulting in reasonably good accuracy, by framing hyper-parameter selection and tuning within the reinforcement learning regime. The effectiveness of a recommended tuple can be tested very quickly rather than waiting for the network to converge. This approach produces accuracy close to that of the state-of-the-art approach and is found to be 20% less computationally expensive than previous approaches.
The methods proposed in these studies, which belong to different areas of automatic machine learning, have been thoroughly evaluated on a number of benchmark datasets, and the results confirmed the great potential of these methods.
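
As an illustration only, the general idea of meta-level model selection described for the first study can be sketched as follows: each past task is summarized by a few meta-features, and a new task receives the base-learner that worked best on the most similar past task. This is a minimal sketch under stated assumptions, not the method used in the thesis. The meta-features (standard deviation, lag-1 autocorrelation, trend slope), the 1-nearest-neighbour meta-learner, the base-learner labels and the toy series are all hypothetical.

```python
import math
import statistics


def meta_features(series):
    """Summarize a time series with three simple, hypothetical meta-features."""
    n = len(series)
    mean = statistics.fmean(series)
    std = statistics.pstdev(series)
    # Lag-1 autocorrelation (defined as 0.0 for a constant series).
    var = sum((x - mean) ** 2 for x in series)
    acf1 = 0.0 if var == 0 else sum(
        (series[i] - mean) * (series[i + 1] - mean) for i in range(n - 1)
    ) / var
    # Trend: slope of the least-squares line against the time index.
    t_mean = (n - 1) / 2
    slope = sum((i - t_mean) * (x - mean) for i, x in enumerate(series)) / sum(
        (i - t_mean) ** 2 for i in range(n)
    )
    return (std, acf1, slope)


def recommend(meta_knowledge, series):
    """1-nearest-neighbour meta-learner: reuse the label of the closest past task."""
    target = meta_features(series)
    _, label = min(meta_knowledge, key=lambda item: math.dist(item[0], target))
    return label


# Hypothetical meta-knowledge: (meta-features of a past task, best base-learner on it).
# The labels are illustrative names, not base-learners taken from the thesis.
meta_knowledge = [
    (meta_features([1, 2, 3, 4, 5, 6, 7, 8]), "linear_trend"),
    (meta_features([5, 1, 5, 1, 5, 1, 5, 1]), "seasonal_naive"),
    (meta_features([3, 3, 3, 3, 3, 3, 3, 3]), "mean_forecast"),
]

if __name__ == "__main__":
    new_task = [10, 12, 14, 16, 18, 20, 22, 24]
    print(recommend(meta_knowledge, new_task))
```

The meta-features here are left unscaled to keep the sketch short. A realistic system would normalize them and would likely use a richer meta-learner, since raw Euclidean distance lets the feature with the largest range dominate the recommendation.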