A Design-of-Experiments-Based Approach for Efficient Estimation of Bimodal Gaussian Mixture Weights
Author(s) -
Gustavo S. Leal,
Lupercio F. Bessegato,
Yasmin S. M. Xavier,
Farid Melgani,
Pedro P. Balestrassi
Publication year - 2025
Publication title -
IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3614023
Subject(s) - aerospace; bioengineering; communication, networking and broadcast technologies; components, circuits, devices and systems; computing and processing; engineered materials, dielectrics and plasmas; engineering profession; fields, waves and electromagnetics; general topics for engineers; geoscience; nuclear engineering; photonics and electrooptics; power, energy and industry applications; robotics and control systems; signal processing and analysis; transportation
Normal mixture models are widely used to represent data arising from latent subpopulations. We propose a Design-of-Experiments (DOE) and Response Surface Methodology (RSM) framework to estimate the weights of a bimodal Gaussian mixture when the component families are known. The procedure is non-iterative: instead of alternating Expectation-Maximization (EM) steps, it uses a two-stage method that fits a quadratic response surface to the sample log-likelihood over the weight simplex and solves a single constrained optimization, followed by a final maximum-likelihood re-estimation of the means and variances. This yields a predictable runtime (driven by design size) and reduced sensitivity to initialization. The pipeline (i) uses k-medians to obtain preliminary component parameters and 99% confidence intervals (CIs) for the component proportions; (ii) builds a simplex-lattice mixture design within those CI bounds; (iii) fits a quadratic response surface to the log-likelihood; and (iv) optimizes this surface under the sum-to-one constraint. We validate the method in 27 Monte Carlo scenarios (n = 100, 500, 1000; low, medium, and high separation; and three weight settings). Under medium and high separation, it attains likelihoods comparable to EM, with more favorable BIC in multiple scenarios and indistinguishable AIC in many, whereas EM is preferable under low separation. Two real data sets, Old Faithful (Waiting variable) and Photovoltaic Energy (Production variable), further confirm applicability, with lower AIC and BIC for Old Faithful and lower BIC for PV; clustering agreement is high (κ ≈ 0.99 to 1.00). Overall, DOE-RSM offers a simple, interpretable, and often more parsimonious non-iterative alternative for mixture-weight estimation.
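The four pipeline steps in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the 1-D k-medians routine with quartile initialization, the use of cluster means and sample standard deviations as preliminary parameters, a Wald interval for the 99% CI, and a saturated {2,2} simplex lattice (two vertices and the midpoint) expressed in pseudo-components over the CI bounds. All function names are hypothetical, and the final maximum-likelihood re-estimation of means and variances is omitted.

```python
import math
import random
import statistics


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def k_medians_two_clusters(data, n_iter=50):
    """1-D k-medians with k = 2 (assumed variant); returns (low, high) clusters.

    Assumes both clusters stay non-empty, which holds for clearly bimodal data.
    """
    ordered = sorted(data)
    # Assumption: initialize the two centers at the sample quartiles.
    centers = [ordered[len(ordered) // 4], ordered[(3 * len(ordered)) // 4]]
    for _ in range(n_iter):
        low = [v for v in data if abs(v - centers[0]) <= abs(v - centers[1])]
        high = [v for v in data if abs(v - centers[0]) > abs(v - centers[1])]
        new_centers = [statistics.median(low), statistics.median(high)]
        if new_centers == centers:
            break
        centers = new_centers
    return low, high


def mixture_loglik(data, w1, params):
    """Sample log-likelihood of a two-component Gaussian mixture with weight w1."""
    (mu1, s1), (mu2, s2) = params
    return sum(
        math.log(w1 * normal_pdf(v, mu1, s1) + (1.0 - w1) * normal_pdf(v, mu2, s2))
        for v in data
    )


def estimate_weight(data, z=2.576):
    """Return (w1, (lo, hi)): the estimated first weight and its 99% CI bounds."""
    # (i) Preliminary parameters and a Wald CI for the first proportion.
    low, high = k_medians_two_clusters(data)
    params = [(statistics.mean(c), statistics.stdev(c)) for c in (low, high)]
    p = len(low) / len(data)
    half = z * math.sqrt(p * (1.0 - p) / len(data))
    lo, hi = max(p - half, 1e-6), min(p + half, 1.0 - 1e-6)

    # (ii) {2,2} simplex lattice in pseudo-components: u = 0, 1/2, 1,
    #      mapped to w1 = lo + u * (hi - lo), with w2 = 1 - w1 throughout.
    y0 = mixture_loglik(data, lo, params)
    y_mid = mixture_loglik(data, 0.5 * (lo + hi), params)
    y1 = mixture_loglik(data, hi, params)

    # (iii) Scheffe quadratic y(u) = b1*u + b2*(1-u) + b12*u*(1-u);
    #       with three points the fit is an exact interpolation.
    b1, b2 = y1, y0
    b12 = 4.0 * y_mid - 2.0 * (y0 + y1)

    def surface(u):
        return b1 * u + b2 * (1.0 - u) + b12 * u * (1.0 - u)

    # (iv) Maximize over u in [0, 1]: compare the endpoints with the
    #      stationary point of the quadratic when it lies inside the interval.
    candidates = [0.0, 1.0]
    if b12 != 0.0:
        u_star = (b1 - b2 + b12) / (2.0 * b12)
        if 0.0 < u_star < 1.0:
            candidates.append(u_star)
    u_best = max(candidates, key=surface)
    return lo + u_best * (hi - lo), (lo, hi)


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic example: 30% from N(0, 1) and 70% from N(6, 1).
    sample = [rng.gauss(0.0, 1.0) for _ in range(300)]
    sample += [rng.gauss(6.0, 1.0) for _ in range(700)]
    w1_hat, bounds = estimate_weight(sample)
    print("estimated w1:", w1_hat, "within CI bounds:", bounds)
```

Because the three-point lattice is saturated, this sketch leaves no degrees of freedom for checking lack of fit; a denser lattice with a least-squares fit would be needed for that, and the design size actually used in the paper is not stated in the abstract.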