Modern Adaptive Experiment Design: Machine Learning Perspective

Mutny, Mojmir. “Modern Adaptive Experiment Design: Machine Learning Perspective.” PhD Thesis, ETH Zurich, 2024.

Abstract: A key challenge in science and engineering is to design experiments to learn about some unknown quantity of interest. Optimal experimental design is a branch of statistics addressing the most informative allocation of experiments in order to infer an unknown quantity of interest. In this thesis, we fundamentally revisit and extend the methods and approaches of optimal design of experiments when our unknown quantity is a member of a reproducing kernel Hilbert space (RKHS). Estimation in RKHS is a versatile, yet sample-efficient non-parametric statistical method that allows us to capture non-linear natural phenomena using a notion of similarity known as the kernel. Setting up a statistical experiment design pipeline begins by establishing a mathematical model and defining the experiment’s goal as a utility. We consider the available experiments and adapt to uncertain outcomes. This thesis specifically tackles adaptive experiment design, where previous experiment outcomes inform future utility design and selection strategies. We first outline a general methodology for adaptive experiment design, addressing uncertainty and combinatorial issues through relaxation techniques. We introduce a master algorithm and its linearized form based on the Frank-Wolfe algorithm, applicable to general experimental design problems. This algorithm is demonstrated in applications involving various utilities, including reward minimization and information gathering. Subsequent sections delve into the impact of structural assumptions about the unknown quantity in RKHS, such as additivity and projection pursuit structures. We analyze these models and additionally introduce novel mathematical structures in RKHS, aiming to enhance the richness and resource efficiency of experiment design. Adaptive design requires statistical estimation and confidence estimates for the unknown quantity in the presence of randomness. We thoroughly analyze this issue from a worst-case perspective, developing confidence sets for probability distributions from parametrized families. Departing from the worst-case perspective, we provide likelihood-based confidence sets for experiments with well-specified likelihood models that improve significantly on the worst-case analysis. Additionally, we explore complex experiment design scenarios, where the design involves executing a policy in a sequence of steps generating trajectories. We provide a tractable reformulation of this problem using Markov chains, providing examples and modifications to traditional experimental design methods. Lastly, we demonstrate the application and versatility of modern adaptive experiment design in various domains, including enzyme optimization, adaptive Poisson sensing for spatio-temporal events, learning differential equations, and classical problems related to inferring functionals of unknown quantities.
