## An improved data-driven stochastic model for reduced errors in output statistics computed using adaptive stochastic collocation

## Citation

"An improved data-driven stochastic model for reduced errors in output statistics computed using adaptive stochastic collocation", *11th US National Congress on Computational Mechanics*, July 25-28, 2011

## Abstract

Uncertainty quantification in engineering systems involves constructing a stochastic model for all the variable input parameters and estimating the variability in system performance under this model. Given a particular stochastic model, uncertainty can be propagated through a numerical model of the system efficiently and with high precision using well-known techniques such as generalized polynomial chaos and stochastic collocation, along with variants of these methods that use local adaptive refinement. To take advantage of the high fidelity that these methods offer, we seek greater accuracy in the input stochastic model so as to reduce the overall error in the predicted output uncertainties. In this paper, we describe a method to construct a data-driven stochastic model that can be used to compute, with greater accuracy, the statistics of an output parameter that quantifies device performance.

Assuming that the uncertain inputs can be represented by continuous random variables, we can construct a stochastic model by defining a probability density function (PDF) on the range of values that each variable can take. When the nature of the randomness is known a priori, the uncertainty can be described by a standard PDF. When the actual form of the PDF is unknown, we use a non-parametric method to estimate the PDF from a finite sample of values of the variable. This is most commonly done by minimizing the mean integrated squared error (MISE) of the estimated PDF or by maximizing the likelihood function. However, there is no guarantee that PDFs estimated in this way will be optimal for accurately computing the output statistics. To obtain accurate statistics of the output (e.g. mean and variance), we develop a stochastic model that takes the nature of the system into account.
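
As a point of reference for the conventional approach mentioned above, the following is a minimal sketch of non-parametric PDF estimation by likelihood maximization. It assumes a Gaussian kernel density estimate whose bandwidth is chosen by leave-one-out log-likelihood; the function names, the synthetic sample, and the candidate bandwidth grid are all illustrative assumptions and do not come from the paper.

```python
import numpy as np

def loo_log_likelihood(samples, h):
    """Leave-one-out log-likelihood of a Gaussian KDE with bandwidth h."""
    x = np.asarray(samples, dtype=float)
    n = x.size
    u = (x[:, None] - x[None, :]) / h
    k = np.exp(-0.5 * u**2) / (h * np.sqrt(2.0 * np.pi))
    np.fill_diagonal(k, 0.0)            # exclude each point's own kernel
    dens = k.sum(axis=1) / (n - 1)      # density at x_i estimated without x_i
    return np.sum(np.log(dens + 1e-300))  # small offset guards against log(0)

def select_bandwidth(samples, candidates):
    """Return the candidate bandwidth with the highest leave-one-out likelihood."""
    scores = [loo_log_likelihood(samples, h) for h in candidates]
    return candidates[int(np.argmax(scores))]

def kde_pdf(samples, h, grid):
    """Evaluate the Gaussian KDE built from `samples` on the points in `grid`."""
    x = np.asarray(samples, dtype=float)
    u = (np.asarray(grid, dtype=float)[:, None] - x[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (x.size * h * np.sqrt(2.0 * np.pi))

# Synthetic input data (hypothetical; stands in for measured parameter values).
rng = np.random.default_rng(0)
samples = rng.normal(loc=1.0, scale=0.5, size=200)

candidates = np.logspace(-2, 0, 30)     # assumed search range for the bandwidth
h_best = select_bandwidth(samples, candidates)
grid = np.linspace(-2.0, 4.0, 2001)
pdf = kde_pdf(samples, h_best, grid)
```

The bandwidth selected here is tuned to fit the input density itself, which is the limitation the abstract points out: nothing in this criterion refers to the output quantity whose statistics are ultimately needed.
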
We model the system in question by a class of functions that propagate the input uncertainty into an output parameter of interest. The choice of the specific class of functions is motivated by the method used to propagate the uncertainty, in this case adaptive stochastic collocation. We pick a Reproducing Kernel Hilbert Space (RKHS) that is associated with these functions and use the Kernel Moment Matching (KMM) method to formulate an objective function for an optimization problem whose solution converges to the PDF that minimizes the error in any statistical moment of the output, e.g. the mean or variance. A leave-one-out cross-validation scheme is used to estimate the generalization error of the objective function, thereby ensuring that the estimate does not overfit the given data. We use this procedure to modify existing PDF estimation methods and show that the resulting estimate of the PDF can be used to compute the statistics of the output with greater accuracy.
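
The abstract does not state the KMM objective explicitly, so the following is only a sketch of one plausible reading, not the authors' formulation. It relies on a standard RKHS fact: for any function f in the space, the error in its mean under two densities is bounded by the norm of f times the distance between the two kernel mean embeddings. The sketch assumes a Gaussian RKHS kernel of width `s` (a stand-in for the space tied to adaptive stochastic collocation, which the abstract does not specify) and a Gaussian KDE of bandwidth `h`, for which the embedding has a closed form. The leave-one-out objective equals the squared embedding distance to the data up to a term that does not depend on `h`. All names and parameter values are hypothetical.

```python
import numpy as np

def gauss_gram(x, y, var):
    """Matrix of exp(-(x_i - y_j)^2 / (2 * var)) over all pairs (1-D inputs)."""
    d = np.asarray(x, dtype=float)[:, None] - np.asarray(y, dtype=float)[None, :]
    return np.exp(-0.5 * d**2 / var)

def kde_embedding(y, samples, h, s):
    """Kernel mean embedding of a Gaussian KDE, evaluated at the points y.

    Closed form of the integral of exp(-(x - y)^2 / (2 s^2)) * p_h(x) dx,
    where p_h is the Gaussian KDE of `samples` with bandwidth h.
    """
    v = s**2 + h**2
    return (s / np.sqrt(v)) * gauss_gram(y, samples, v).mean(axis=1)

def loo_kmm_objective(samples, h, s):
    """Leave-one-out estimate of the squared embedding distance (up to a constant)."""
    x = np.asarray(samples, dtype=float)
    n = x.size
    # Squared RKHS norm of the embedding of the full KDE.
    v2 = s**2 + 2.0 * h**2
    self_term = (s / np.sqrt(v2)) * gauss_gram(x, x, v2).mean()
    # Cross term: embedding of the KDE built without x_i, evaluated at x_i.
    v1 = s**2 + h**2
    g = (s / np.sqrt(v1)) * gauss_gram(x, x, v1)
    np.fill_diagonal(g, 0.0)
    cross_term = g.sum() / (n * (n - 1))
    return self_term - 2.0 * cross_term

# Synthetic input data and assumed settings (all hypothetical).
rng = np.random.default_rng(0)
samples = rng.normal(loc=1.0, scale=0.5, size=200)
s = 0.5                                  # assumed RKHS kernel width
candidates = np.logspace(-2, 0, 30)      # assumed bandwidth search range

scores = [loo_kmm_objective(samples, h, s) for h in candidates]
h_kmm = candidates[int(np.argmin(scores))]
```

Unlike a likelihood-based criterion, this objective measures the density estimate only through the functions in the chosen RKHS, so the choice of kernel determines which output statistics the selected bandwidth is tuned for.
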