Identification of Critical Batch Operating Parameters in Fed‐Batch Recombinant E. coli Fermentations Using Decision Tree Analysis
Author(s) - Buck Kristan K. S., Subramanian Venkatanarayanan, Block David E.
Publication year - 2002
Publication title - Biotechnology Progress
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.572
H-Index - 129
eISSN - 1520-6033
pISSN - 8756-7938
DOI - 10.1021/bp020112p
Subject(s) - identification (biology) , decision tree , categorical variable , batch processing , computer science , information gain ratio , production (economics) , data mining , microbiology and biotechnology , mathematics , machine learning , biology , botany , macroeconomics , economics , programming language
To develop a useful fermentation process model, it is first necessary to identify which batch operating parameters are critical in determining the process outcome. To identify critical processing inputs in large databases, we have explored the use of Decision Tree Analysis with the decision metrics of Gain (i.e., Shannon Entropy changes), Gain Ratio, and a multiple hypergeometric distribution. The usefulness of this approach lies in its ability to treat “categorical” variables, which are typical of archived fermentation databases, as well as “continuous” variables. In this work, we demonstrate the use of Decision Tree Analysis for the problem of optimizing recombinant green fluorescent protein production in E. coli. A database of 85 fermentations was generated to examine the effect of 15 process input parameters on final biomass yield, maximum recombinant protein concentration, and productivity. The use of Decision Tree Analysis led to a considerable reduction in the fermentation database through the identification of the significant as well as insignificant inputs. However, different decision metrics selected different inputs and different numbers of inputs to classify the data for each output.
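As an illustration of two of the decision metrics named in the abstract, the sketch below computes Gain (the reduction in Shannon entropy) and Gain Ratio for a categorical input, using the standard textbook definitions. It is not the authors' implementation, and it does not cover the multiple hypergeometric metric. The function names and the toy data (`medium`, `operator`, `yield_class`) are hypothetical and chosen only to show how two inputs can be ranked against a classified output.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_and_gain_ratio(values, labels):
    """Return (gain, gain_ratio) for splitting `labels` on categorical `values`."""
    total = len(labels)
    # Group the output labels by the value of the candidate input.
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy remaining after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Split information penalizes inputs with many distinct values.
    split_info = entropy(values)
    ratio = gain / split_info if split_info > 0 else 0.0
    return gain, ratio


# Hypothetical toy data: six batches, two categorical inputs, one classified output.
medium = ["A", "A", "B", "B", "C", "C"]
operator = ["x", "y", "x", "y", "x", "y"]
yield_class = ["high", "high", "low", "low", "high", "low"]

gain_medium, ratio_medium = gain_and_gain_ratio(medium, yield_class)
gain_operator, ratio_operator = gain_and_gain_ratio(operator, yield_class)
print("medium:   gain=%.3f  gain ratio=%.3f" % (gain_medium, ratio_medium))
print("operator: gain=%.3f  gain ratio=%.3f" % (gain_operator, ratio_operator))
```

In this toy case `medium` separates the yield classes far better than `operator`, so both metrics rank it higher. Because Gain Ratio divides by the split information, the two metrics can disagree when inputs have very different numbers of categories, which is consistent with the abstract's remark that different metrics selected different inputs.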