Using a meta-model to compensate for training-evaluation mismatches

Lamprecht, Dylan; Barnard, Etienne

Date

2020

Abstract

A fundamental assumption of machine learning is that learnt models are applied to data distributed identically to the training data. This assumption is often unrealistic: for example, data collected from a single source at different times may not be identically distributed, owing to sampling bias or changes in the environment. We propose a new architecture, called a meta-model, which predicts performance for unseen models. The approach is applicable when several ‘proxy’ datasets are available to train a model that will be deployed on a ‘target’ test set; the architecture is used to identify which regression algorithms should be used, as well as which datasets are most useful for training for a given target dataset. Finally, we demonstrate the strengths and weaknesses of the proposed meta-model on artificially generated datasets, produced with a variation of the Friedman method 3 for generating artificial regression datasets, and discuss real-world applications of our approach.
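
For readers unfamiliar with the Friedman method 3 mentioned in the abstract, the following is a minimal sketch of the standard benchmark as it is commonly defined (for instance in scikit-learn's `make_friedman3`). The abstract does not describe the authors' variation, so nothing here should be read as their method. The function names and the `x1_range` parameter are hypothetical, introduced only to illustrate how a ‘proxy’ dataset and a ‘target’ dataset might be drawn from different input distributions.

```python
import math
import random


def friedman3(x1, x2, x3, x4):
    """Noise-free Friedman #3 response: arctan((x2*x3 - 1/(x2*x4)) / x1).

    atan2 is used so that x1 == 0 does not raise; for x1 > 0 (the standard
    range) it equals atan(numerator / x1).
    """
    numerator = x2 * x3 - 1.0 / (x2 * x4)
    return math.atan2(numerator, x1)


def sample_friedman3(n_samples, noise=0.0, x1_range=(0.0, 100.0), seed=None):
    """Draw n_samples ((x1, x2, x3, x4), y) pairs.

    x1_range is a hypothetical knob (an assumption, not taken from the paper)
    for shifting the input distribution between datasets. The other features
    use the standard ranges.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_samples):
        x1 = rng.uniform(*x1_range)
        x2 = rng.uniform(40.0 * math.pi, 560.0 * math.pi)
        x3 = rng.uniform(0.0, 1.0)
        x4 = rng.uniform(1.0, 11.0)
        # Additive Gaussian noise with standard deviation `noise`.
        y = friedman3(x1, x2, x3, x4) + noise * rng.gauss(0.0, 1.0)
        rows.append(((x1, x2, x3, x4), y))
    return rows


# A proxy set on the standard range and a target set with shifted x1,
# illustrating a training-evaluation mismatch.
proxy = sample_friedman3(200, noise=0.1, seed=0)
target = sample_friedman3(200, noise=0.1, x1_range=(100.0, 200.0), seed=1)
```

A model fitted on `proxy` and evaluated on `target` would then face the kind of distribution mismatch the abstract describes; how the paper itself constructs such mismatches is not stated in this record.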

URI

http://hdl.handle.net/10394/36640

Collections

Faculty of Engineering