Using Breath Metabolic Profiling to Identify Genetic Variants in Alcohol Metabolism
Author(s) - Kang Sunwoo, White Joseph Robert, Gross Eric Richard
Publication year - 2020
Publication title - The FASEB Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.709
H-Index - 277
eISSN - 1530-6860
pISSN - 0892-6638
DOI - 10.1096/fasebj.2020.34.s1.05815
Subject(s) - aldh2, acetaldehyde, aldehyde dehydrogenase, metabolomics, artificial intelligence, machine learning, chemistry, computer science, ethanol, chromatography, biochemistry, gene
Aldehyde dehydrogenase 2 (ALDH2) metabolizes acetaldehyde to acetic acid. An ALDH2 genetic variant, ALDH2*2, limits acetaldehyde metabolism. As a result, after alcohol consumption, carriers of the ALDH2*2 variant experience acetaldehyde accumulation, facial flushing, and tachycardia, a response commonly referred to as Asian glow. Here we describe how we developed a machine learning algorithm to distinguish between genetic variants of ALDH2 using data collected from the human breath mass spectrum. To develop this algorithm, we obtained IRB approval and recruited human volunteers, who were genotyped and given a 0.25 g/kg alcohol challenge. During the alcohol challenge, we measured the breath mass spectra of 16 people (8 ALDH2*1*1 and 8 ALDH2*1*2) using selective ion flow mass spectrometry. This created a metabolic profile for each person consisting of the full mass spectrum (1,155 ion measurements) complemented with 14 metabolite panel measurements. To predict ALDH2 genotype, we applied a standard scaler (SS) to normalize the variance in ion counts and principal component analysis (PCA) to reduce the dimensionality of the input. Visualization of the metabolomics data showed two distinct clusters that depend on ethanol intake (Figure 1A) and on ALDH2 genotype (Figure 1B). Based on this cluster distinction, we verified that our data could characterize ethanol metabolism. Next, we applied ~20 classifier algorithms, including Support Vector Machine, Perceptron, Gaussian Naïve Bayes, Stochastic Gradient Descent, and Decision Trees. Prediction with Random Forest identified three novel features in the raw mass spectra that are not included in the metabolite panel. These algorithms achieved a training accuracy above a 90% threshold for distinguishing between ALDH2 genotypes.
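The preprocessing and classification steps described above can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' code: the data are synthetic random numbers shaped like the study (16 people, 1,155 ion measurements), the genotype-dependent shift is invented, and the choice of three principal components, a linear SVM, and 4-fold cross-validation are assumptions made only for this example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for the breath data (NOT real measurements):
# 16 people x 1,155 ion measurements, 8 per genotype group.
n_per_group, n_ions = 8, 1155
X = rng.normal(size=(2 * n_per_group, n_ions))
y = np.array([0] * n_per_group + [1] * n_per_group)  # 0 = ALDH2*1*1, 1 = ALDH2*1*2
X[y == 1, :20] += 2.0  # hypothetical genotype-dependent shift in 20 ions

# Standard scaling, then PCA to 3 components, then a linear SVM.
pipeline = make_pipeline(StandardScaler(), PCA(n_components=3), SVC(kernel="linear"))

# Cross-validation refits the scaler and PCA inside each fold,
# so held-out samples never influence the preprocessing.
cv = StratifiedKFold(n_splits=4, shuffle=True, random_state=0)
cv_scores = cross_val_score(pipeline, X, y, cv=cv)

# Training accuracy, the metric reported in the abstract.
pipeline.fit(X, y)
train_accuracy = pipeline.score(X, y)

# 3D coordinates of each sample, suitable for a scatter plot.
projected = pipeline[:-1].transform(X)

print("cross-validated accuracy per fold:", cv_scores)
print("training accuracy:", train_accuracy)
```

With only 16 samples and over a thousand features, training accuracy can be high even for a model that generalizes poorly, which is why the sketch also reports cross-validated scores alongside it.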
These findings suggest a non‐invasive method to determine ALDH2 genotype that may be used to complement existing methods for determining a person's response to alcohol.
Support or Funding Information
This research received grant funding from Stanford Chem‐H and NIH NIGMS GM119522.
Visualization of Temporal HEB Metabolomics Profiling.
Figure 1A: 3D PCA on Ethanol Catabolism with SVM distinction kernel. Blue=baseline, red=after ethanol exposure.
Figure 1B: 3D Visualization of Partial Gradient of Acetaldehyde and Ethanol. Blue=ALDH2 cohort, red=ALDH2*2 cohort.
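The abstract reports that Random Forest surfaced three features in the raw mass spectra that the metabolite panel does not cover. One common way to obtain such a ranking is through impurity-based feature importances, sketched below with scikit-learn. The data are synthetic, the three "informative" ion indices are invented, and `panel_ions` is a hypothetical set of ion indices standing in for the 14 panel measurements; whether the authors ranked features this way is an assumption.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)

# Synthetic stand-in for the breath data (NOT real measurements).
n_people, n_ions = 16, 1155
X = rng.normal(size=(n_people, n_ions))
y = np.array([0] * 8 + [1] * 8)  # 0 = ALDH2*1*1, 1 = ALDH2*1*2

# Hypothetical ions made genotype-dependent for the illustration.
informative = [10, 200, 900]
X[np.ix_(y == 1, informative)] += 3.0

# Hypothetical ion indices already covered by the metabolite panel.
panel_ions = set(range(14))

forest = RandomForestClassifier(n_estimators=500, random_state=0)
forest.fit(X, y)

# Impurity-based importances: one value per ion, summing to 1.
importances = forest.feature_importances_

# Rank ions from most to least important, skipping those in the panel.
ranked = np.argsort(importances)[::-1]
top_novel = [int(i) for i in ranked if int(i) not in panel_ions][:3]

print("top ions outside the panel:", top_novel)
```

Impurity-based importances can be unstable when features far outnumber samples, so a ranking like this is best treated as a list of candidates to confirm on independent data.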