Masters Theses

Orcid ID

https://orcid.org/0009-0001-3670-0813

Date of Award

8-2024

Degree Type

Thesis

Degree Name

Master of Science

Major

Computer Science

Major Professor

Hairong Qi

Committee Members

Hairong Qi, Catherine Schuman, Dan Wilson, Amir Sadovnik

Abstract

Reinforcement Learning (RL) has made significant strides in various domains, yet developing effective control policies for environments with complex, nonlinear dynamics remains a challenge, particularly for policy gradient methods. These methods often struggle with high variance in gradient estimates, non-convex optimization landscapes, and sample inefficiency, resulting in unstable learning, suboptimal policies, and trade-offs between performance and reproducibility. The quest for more robust, stable, and effective methods has led to numerous innovations and remains a critical area of research. Proximal Policy Optimization (PPO) has gained popularity in recent years due to its balance of performance, training stability, and computational efficiency. In contrast with their nonlinear counterparts, linear systems are simpler, more predictable, and easier to analyze. Koopman theory has emerged as a powerful framework for studying nonlinear systems through a globally linear operator that acts on a higher-dimensional space of measurement functions. Combining these two ideas, Koopman-Inspired Proximal Policy Optimization (KIPPO) extends PPO to learn a simplifying representation of the underlying system's dynamics while retaining the features essential for effective policy learning. This is achieved through a Koopman-approximation auxiliary network and carefully designed constraints that balance the complexity of the latent dynamics. Evaluated on diverse continuous control tasks, KIPPO improves on the PPO baseline, with performance gains of 8-60% and variability reduced by up to 91%. The study also examines the effects and interactions of key hyperparameters, and the impact of individual loss components through an ablation study, providing a comprehensive analysis of the approach.
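
The Koopman idea mentioned in the abstract, a nonlinear system becoming linear once it is viewed through a larger set of measurement functions, can be illustrated with a small textbook-style example. The sketch below is not taken from the thesis and is not KIPPO: the system, the constants `A` and `B`, and all function names are assumptions chosen only for illustration. Here the lifting is known in closed form, whereas the abstract describes learning such a representation with an auxiliary network.

```python
# Illustrative only: a textbook-style discrete-time system, not from the thesis.
A, B = 0.9, 0.5  # hypothetical constants chosen for this example


def step_nonlinear(x1, x2):
    """One step of the nonlinear map in the original two-dimensional state."""
    return A * x1, B * x2 + (A**2 - B) * x1**2


def lift(x1, x2):
    """Measurement functions (observables): the state itself plus x1 squared."""
    return [x1, x2, x1**2]


# In the lifted coordinates y = lift(x), the same dynamics are exactly linear:
# y_next = K y, with K a constant 3x3 matrix.
K = [
    [A, 0.0, 0.0],
    [0.0, B, A**2 - B],
    [0.0, 0.0, A**2],
]


def step_lifted(y):
    """One step of the linear map in the three-dimensional lifted space."""
    return [sum(k * v for k, v in zip(row, y)) for row in K]


def max_lifting_error(x1, x2, steps=20):
    """Largest gap between lifting the nonlinear rollout and the linear rollout."""
    y = lift(x1, x2)
    worst = 0.0
    for _ in range(steps):
        x1, x2 = step_nonlinear(x1, x2)
        y = step_lifted(y)
        worst = max(worst, max(abs(a - b) for a, b in zip(lift(x1, x2), y)))
    return worst
```

Calling `max_lifting_error(1.0, -0.5)` compares the two rollouts step by step; the gap should stay at the level of floating-point rounding, because adding the single observable `x1**2` closes the system under a linear update. Most systems of practical interest admit no such finite closed-form lifting, which is why approximations are learned instead.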
