A Submodularity Framework for Data Subset Selection
Author(s) - Katrin Kirchhoff, Jeff Bilmes, Kai Wei, Yuzong Liu, Arindam Mandal, Chris Bartels
Publication year - 2013
Language(s) - English
Resource type - Reports
DOI - 10.21236/ada595011
Subject(s) - selection (genetic algorithm) , computer science , artificial intelligence
This report describes the outcome of the project "A Submodularity Framework for Data Subset Selection". The goal of the project was to develop and evaluate novel submodular functions for selecting subsets of large acoustic and text data sets. The selected subsets were used to train acoustic models for automatic speech recognition and translation models for machine translation, respectively. The submodular selection techniques were evaluated against random data selection and against the best comparable data selection technique previously reported in the literature. The results demonstrate that submodular data selection outperforms all baseline techniques: for a fixed data subset size, submodular selection yielded systems with better performance. Additionally, submodular selection was applied to feature selection, where it outperformed standard modular feature selection techniques.
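
To make the idea of selecting a fixed-size subset by maximizing a submodular function concrete, the following is a minimal sketch and not the method used in the report. It assumes a facility-location objective, a standard monotone submodular function chosen here purely for illustration, and maximizes it with the standard greedy algorithm, which is known to achieve a (1 - 1/e) approximation for monotone submodular functions under a cardinality constraint. The names `facility_location` and `greedy_select`, and the toy similarity matrix, are hypothetical; the abstract does not state which functions or similarity measures the project used.

```python
from typing import List, Sequence


def facility_location(similarity: Sequence[Sequence[float]], subset: List[int]) -> float:
    """Illustrative objective: f(S) = sum_i max_{j in S} similarity[i][j], with f({}) = 0.

    Assumes non-negative similarities. This is one example of a monotone
    submodular function, not necessarily one used in the report.
    """
    if not subset:
        return 0.0
    return sum(max(row[j] for j in subset) for row in similarity)


def greedy_select(similarity: Sequence[Sequence[float]], budget: int) -> List[int]:
    """Greedily pick up to `budget` items, each time adding the item with the largest gain."""
    selected: List[int] = []
    current_value = 0.0
    remaining = set(range(len(similarity)))
    while remaining and len(selected) < budget:
        best_item, best_gain = -1, float("-inf")
        for j in sorted(remaining):  # sorted for deterministic tie-breaking
            gain = facility_location(similarity, selected + [j]) - current_value
            if gain > best_gain:
                best_item, best_gain = j, gain
        selected.append(best_item)
        remaining.remove(best_item)
        current_value += best_gain
    return selected


# Toy data (made up): items 0-2 form one cluster, items 3-4 another.
similarity = [
    [1.0, 0.9, 0.5, 0.0, 0.0],
    [0.9, 1.0, 0.9, 0.0, 0.0],
    [0.5, 0.9, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.7],
    [0.0, 0.0, 0.0, 0.7, 1.0],
]
picked = greedy_select(similarity, budget=2)
print(picked)
```

In this toy setting the greedy procedure tends to pick one representative per cluster, which is the intuition behind preferring a submodular objective over random selection at a fixed subset size: items that duplicate already-covered data contribute little additional gain.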
