You can find my published work on my Google Scholar page.
Research videos are available on YouTube.

2017

  1. Safe Model-based Reinforcement Learning with Stability Guarantees
    Felix Berkenkamp, Matteo Turchetta, Angela P. Schoellig, Andreas Krause
    Technical report, arXiv.

    Reinforcement learning is a powerful paradigm for learning optimal policies from experimental data. However, to find optimal policies, most reinforcement learning algorithms explore all possible actions, which may be harmful for real-world systems. As a consequence, learning algorithms are rarely applied on safety-critical systems in the real world. In this paper, we present a learning algorithm that explicitly considers safety in terms of stability guarantees. Specifically, we extend control theoretic results on Lyapunov stability verification and show how to use statistical models of the dynamics to obtain high-performance control policies with provable stability certificates. Moreover, under additional regularity assumptions in terms of a Gaussian process prior, we prove that one can effectively and safely collect data in order to learn about the dynamics and thus both improve control performance and expand the safe region of the state space. In our experiments, we show how the resulting algorithm can safely optimize a neural network policy on a simulated inverted pendulum, without the pendulum ever falling down.

    @techreport{Berkenkamp2017SafeRL,
      title = {Safe Model-based Reinforcement Learning with Stability Guarantees},
      publisher = {ArXiv},
      author = {Berkenkamp, Felix and Turchetta, Matteo and Schoellig, Angela P. and Krause, Andreas},
      year = {2017},
      url = {\href{https://arxiv.org/abs/1705.08551}{arXiv:1705.08551 [stat.ML]}}
    }
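
    A minimal Python sketch of the certification idea in this abstract, under stated assumptions and not the paper's implementation: a GP is fit to observations of a toy one-dimensional closed-loop system, and the largest level set of a quadratic Lyapunov candidate on which the GP confidence bounds certify a decrease is reported. The dynamics, kernel, confidence scaling, and the tolerance around the equilibrium are all assumptions made for this illustration.

    import numpy as np

    def gp_posterior(X, y, Xs, kernel, noise=1e-4):
        """Standard GP regression posterior mean and standard deviation."""
        K = kernel(X, X) + noise * np.eye(len(X))
        L = np.linalg.cholesky(K)
        mean = kernel(X, Xs).T @ np.linalg.solve(L.T, np.linalg.solve(L, y))
        v = np.linalg.solve(L, kernel(X, Xs))
        var = 1.0 - np.sum(v**2, axis=0)  # unit prior variance
        return mean, np.sqrt(np.maximum(var, 0))

    rbf = lambda A, B: np.exp(-0.5 * (A[:, None] - B[None, :])**2 / 0.3**2)

    # Toy stable closed-loop dynamics x+ = 0.8 x, observed with noise (assumed).
    rng = np.random.default_rng(0)
    X_data = rng.uniform(-1, 1, 60)
    y_data = 0.8 * X_data + 0.01 * rng.standard_normal(60)

    # Quadratic Lyapunov candidate V(x) = x^2 on a grid of states.
    grid = np.linspace(-1, 1, 201)
    V = grid**2
    mean, std = gp_posterior(X_data, y_data, grid, rbf)
    beta = 2.0  # confidence scaling (assumed)
    # Worst-case V at the next state over the confidence interval.
    V_next_ucb = (np.abs(mean) + beta * std)**2
    # Require a certified decrease, except in an assumed small tolerance
    # region around the equilibrium.
    ok = (V_next_ucb < V) | (V <= 0.05)

    # Largest level set {V(x) <= c} on which the decrease is certified everywhere.
    for c in np.sort(np.unique(V))[::-1]:
        if ok[V <= c].all():
            print(f"certified level set: V(x) <= {c:.2f}")
            break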
    
          
  2. Constrained Bayesian optimization with Particle Swarms for Safe Adaptive Controller Tuning
    Rikky R.P.R. Duivenvoorden, Felix Berkenkamp, Nicolas Carion, Andreas Krause, Angela P. Schoellig
    in Proc. of the IFAC (International Federation of Automatic Control) World Congress.

    Tuning controller parameters is a recurring and time-consuming problem in control. This is especially true in the field of adaptive control, where good performance is typically only achieved after significant tuning. Recently, it has been shown that constrained Bayesian optimization is a promising approach to automate the tuning process without risking system failures during optimization. However, this approach is computationally too expensive for tuning more than a couple of parameters. In this paper, we provide a heuristic to efficiently perform constrained Bayesian optimization in high-dimensional parameter spaces, using an adaptive discretization based on particle swarms. We apply the method to the tuning problem of an L1 adaptive controller on a quadrotor vehicle and show that we can reliably and automatically tune parameters in experiments.

    @inproceedings{Duivenvoorden2017SafeOptSwarm,
      author = {Duivenvoorden, Rikky R.P.R. and Berkenkamp, Felix and Carion, Nicolas and Krause, Andreas and Schoellig, Angela P.},
      booktitle = {Proc. of the IFAC (International Federation of Automatic Control) World Congress},
      title = {Constrained {B}ayesian optimization with Particle Swarms for Safe Adaptive Controller Tuning},
      year = {2017}
    }
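
    The following Python sketch illustrates the scalability heuristic: a particle swarm replaces the exhaustive candidate grid when maximizing an acquisition function. The acquisition below is a stand-in (a quadratic performance surrogate plus an uncertainty bonus, with violated safety bounds mapped to negative infinity), not the actual SafeOpt criterion, and all hyperparameters are assumptions.

    import numpy as np

    def pso_maximize(acquisition, lb, ub, n_particles=30, iters=50, seed=0):
        """Basic particle swarm optimization in a box [lb, ub]^d."""
        rng = np.random.default_rng(seed)
        d = len(lb)
        x = rng.uniform(lb, ub, (n_particles, d))  # positions
        v = np.zeros_like(x)                       # velocities
        p_best, p_val = x.copy(), acquisition(x)   # personal bests
        g_best = p_best[np.argmax(p_val)]          # global best
        for _ in range(iters):
            r1, r2 = rng.random((2, n_particles, 1))
            v = 0.7 * v + 1.5 * r1 * (p_best - x) + 1.5 * r2 * (g_best - x)
            x = np.clip(x + v, lb, ub)
            val = acquisition(x)
            improved = val > p_val
            p_best[improved], p_val[improved] = x[improved], val[improved]
            g_best = p_best[np.argmax(p_val)]
        return g_best

    # Stand-in acquisition: quadratic performance plus an uncertainty bonus,
    # with an assumed safety bound that rules out part of the space.
    def acquisition(x):
        performance = -np.sum((x - 0.3)**2, axis=1)
        uncertainty = 0.1 * np.sin(5 * x).sum(axis=1)
        safe = np.all(x > -0.9, axis=1)  # assumed safety constraint
        return np.where(safe, performance + uncertainty, -np.inf)

    # Ten-dimensional parameter space, far beyond what a dense grid could cover.
    best = pso_maximize(acquisition, lb=np.full(10, -1.0), ub=np.full(10, 1.0))
    print("suggested parameters:", np.round(best, 2))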
    
          
  3. Virtual vs. Real: Trading Off Simulations and Physical Experiments in Reinforcement Learning with Bayesian Optimization
    Alonso Marco, Felix Berkenkamp, Philipp Hennig, Angela P. Schoellig, Andreas Krause, Stefan Schaal, Sebastian Trimpe
    in Proc. of the International Conference on Robotics and Automation (ICRA).

    In practice, the parameters of control policies are often tuned manually. This is time-consuming and frustrating. Reinforcement learning is a promising alternative that aims to automate this process, yet often requires too many experiments to be practical. In this paper, we propose a solution to this problem by exploiting prior knowledge from simulations, which are readily available for most robotic platforms. Specifically, we extend Entropy Search, a Bayesian optimization algorithm that maximizes information gain from each experiment, to the case of multiple information sources. The result is a principled way to automatically combine cheap, but inaccurate information from simulations with expensive and accurate physical experiments in a cost-effective manner. We apply the resulting method to a cart-pole system, which confirms that the algorithm can find good control policies with fewer experiments than standard Bayesian optimization on the physical system only.

    @inproceedings{Marco17VirtualvsReal,
      author = {Marco, Alonso and Berkenkamp, Felix and Hennig, Philipp and Schoellig, Angela P. and Krause, Andreas and Schaal, Stefan and Trimpe, Sebastian},
      booktitle = {Proc. of the International Conference on Robotics and Automation (ICRA)},
      title = {Virtual vs. Real: Trading Off Simulations and Physical Experiments in Reinforcement Learning with {B}ayesian Optimization},
      year = {2017}
    }
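
    As a rough illustration only: the sketch below trades off a cheap simulator against costly hardware experiments using a simple proxy (GP predictive uncertainty per unit cost) rather than the paper's multi-fidelity Entropy Search, and it models the two sources with independent GPs instead of a joint model. All costs, noise levels, and kernels are assumptions.

    import numpy as np

    def rbf(A, B, ls=0.3):
        return np.exp(-0.5 * (A[:, None] - B[None, :])**2 / ls**2)

    def posterior_std(X, Xs, noise):
        """GP predictive standard deviation (zero mean, unit prior variance)."""
        if len(X) == 0:
            return np.ones(len(Xs))
        L = np.linalg.cholesky(rbf(X, X) + noise * np.eye(len(X)))
        v = np.linalg.solve(L, rbf(X, Xs))
        return np.sqrt(np.maximum(1.0 - np.sum(v**2, axis=0), 0))

    costs = {"sim": 1.0, "real": 20.0}   # assumed relative experiment costs
    noises = {"sim": 0.1, "real": 1e-3}  # the simulator is cheap but inaccurate
    X = {"sim": np.empty(0), "real": np.empty(0)}
    grid = np.linspace(0, 1, 100)        # candidate controller parameters

    for step in range(10):
        # Uncertainty each possible evaluation would resolve, per unit cost.
        scores = {s: posterior_std(X[s], grid, noises[s]) / costs[s] for s in costs}
        source = max(scores, key=lambda s: scores[s].max())
        x_next = grid[np.argmax(scores[source])]
        X[source] = np.append(X[source], x_next)
        print(f"step {step}: query {source} at x = {x_next:.2f}")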
    
          

2016

  1. Safe Learning of Regions of Attraction for Uncertain, Nonlinear Systems with Gaussian Processes
    Felix Berkenkamp, Riccardo Moriconi, Angela P. Schoellig, Andreas Krause
    in Proc. of the IEEE Conference on Decision and Control.

    Control theory can provide useful insights into the properties of controlled, dynamic systems. One important property of nonlinear systems is the region of attraction (ROA), which is a safe subset of the state space in which a given controller renders an equilibrium point asymptotically stable. The ROA is typically estimated based on a model of the system. However, since models are only an approximation of the real world, the resulting estimated safe region can contain states outside the ROA of the real system. This is not acceptable in safety-critical applications. In this paper, we consider an approach that learns the ROA from experiments on a real system, without ever leaving the ROA of the real system. This approach enables us to find an estimate of the real ROA, without risking safety-critical failures. Based on regularity assumptions on the model errors in terms of a Gaussian process prior, we determine a region in which an equilibrium point is asymptotically stable with high probability, according to an underlying Lyapunov function. Moreover, we actively select areas of the state space to evaluate in order to expand the ROA. We demonstrate the effectiveness of this method in simulated experiments.

    @inproceedings{Berkenkamp2016ROA,
      title = {Safe Learning of Regions of Attraction for Uncertain, Nonlinear Systems with {G}aussian Processes},
      booktitle = {Proc. of the IEEE Conference on Decision and Control},
      author = {Berkenkamp, Felix and Moriconi, Riccardo and Schoellig, Angela P. and Krause, Andreas},
      year = {2016},
      pages = {4661--4666},
      url = {\href{http://arxiv.org/abs/1603.04915}{arXiv:1603.04915 [cs.SY]}}
    }
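
    A small Python sketch of the active-selection rule from this abstract: evaluate the most uncertain state inside the currently certified level set, so that the ROA estimate can grow without leaving it. The GP uncertainty is faked here; the Lyapunov sketch under the 2017 arXiv entry above shows how the certified set itself could be computed.

    import numpy as np

    grid = np.linspace(-1, 1, 201)
    V = grid**2                    # Lyapunov candidate
    c = 0.25                       # currently certified level set {V <= c} (assumed)
    # Stand-in for the GP posterior standard deviation over the state space:
    sigma = 0.05 + 0.2 * np.abs(np.sin(4 * grid))

    inside = V <= c
    x_next = grid[inside][np.argmax(sigma[inside])]
    print(f"next experiment at x = {x_next:.2f} (stays inside the certified set)")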
    
          
  2. Safe Exploration in Finite Markov Decision Processes with Gaussian Processes
    Matteo Turchetta, Felix Berkenkamp, Andreas Krause
    in Proc. of the Conference on Neural Information Processing Systems (NIPS).

    In classical reinforcement learning, when exploring an environment, agents accept arbitrary short-term loss for long-term gain. This is infeasible for safety-critical applications, such as robotics, where even a single unsafe action may cause system failure. In this paper, we address the problem of safely exploring finite Markov decision processes (MDPs). We define safety in terms of an a priori unknown safety constraint that depends on states and actions. We aim to explore the MDP under this constraint, assuming that the unknown function satisfies regularity conditions expressed via a Gaussian process prior. We develop a novel algorithm for this task and prove that it is able to completely explore the safely reachable part of the MDP without violating the safety constraint. To achieve this, it cautiously explores safe states and actions in order to gain statistical confidence about the safety of unvisited state-action pairs from noisy observations collected while navigating the environment. Moreover, the algorithm explicitly considers reachability when exploring the MDP, ensuring that it does not get stuck in any state with no safe way out. We demonstrate our method on digital terrain models for the task of exploring an unknown map with a rover.

    @inproceedings{Turchetta2016SafeMDP,
      title = {Safe Exploration in Finite {M}arkov {D}ecision {P}rocesses with {G}aussian Processes},
      booktitle = {Proc. of the Conference on Neural Information Processing Systems (NIPS)},
      author = {Turchetta, Matteo and Berkenkamp, Felix and Krause, Andreas},
      year = {2016},
      pages = {4305--4313},
      url = {\href{http://arxiv.org/abs/1606.04753}{arXiv:1606.04753 [cs.LG]}}
    }
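
    A minimal Python sketch of one safe-set expansion step in the spirit of this abstract (not the paper's algorithm): each cell of a 1-D chain has a GP lower confidence bound on its safety, and a cell joins the safe set only if that bound clears the threshold and the cell is reachable from, with a safe return to, the current safe set. All bound values are assumptions.

    import numpy as np
    from collections import deque

    h = 0.0                                  # safety threshold on the constraint
    # GP lower confidence bounds on the safety of each cell (assumed values):
    lower = np.array([0.5, 0.4, 0.3, 0.1, -0.2, 0.2, 0.6])
    safe = np.zeros(len(lower), dtype=bool)
    safe[0] = True                           # initial safe seed

    def expand(safe_set):
        """Cells reachable from the safe set through cells believed safe.
        On this 1-D chain, safe paths are reversible, so every cell found
        this way also has a safe way back (the returnability requirement)."""
        seen = set(np.flatnonzero(safe_set))
        queue = deque(seen)
        while queue:
            i = queue.popleft()
            for j in (i - 1, i + 1):
                if 0 <= j < len(lower) and j not in seen and lower[j] >= h:
                    seen.add(j)
                    queue.append(j)
        return seen

    safe[list(expand(safe))] = True
    # Cells 5 and 6 look safe in isolation but stay excluded: reaching them
    # would require crossing cell 4, whose safety cannot be certified.
    print("safe cells after expansion:", np.flatnonzero(safe))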
    
          
  3. Bayesian Optimization for Maximum Power Point Tracking in Photovoltaic Power Plants
    Hany Abdelrahman, Felix Berkenkamp, Jan Poland, Andreas Krause
    in Proc. of the European Control Conference (ECC).

    • Best Application Paper Award

    The amount of power that a photovoltaic (PV) power plant generates depends on the DC voltage that is applied to the PV panels. The relationship between this control input and the generated power is non-convex and has multiple local maxima. Moreover, since the generated power depends on time-varying environmental conditions, such as solar irradiation, the location of the global maximum changes over time. Maximizing the amount of energy that is generated over time is known as the maximum power point tracking (MPPT) problem. Traditional approaches to solve the MPPT problem rely on heuristics and data-based gradient estimates. These methods typically converge to local optima and thus waste energy. Our approach formalizes the MPPT problem as a Bayesian optimization problem. This formalization admits algorithms that can find the maximum power point after only a few evaluations at different input voltages. Specifically, we model the power-voltage curve as a Gaussian process (GP) and use the predictive uncertainty information in this model to choose control inputs that are informative about the location of the maximum. We extend the basic approach by including operational constraints and making it computationally tractable so that the method can be used on real systems. We evaluate our method together with two standard baselines in experiments, which show that our approach outperforms both.

    @inproceedings{Abdelrahman16Bayesian,
      author = {Abdelrahman, Hany and Berkenkamp, Felix and Poland, Jan and Krause, Andreas},
      booktitle = {Proc. of the European Control Conference (ECC)},
      title = {Bayesian Optimization for Maximum Power Point Tracking in Photovoltaic Power Plants},
      year = {2016},
      pages = {2078--2083}
    }
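
    A toy Python sketch of the approach: a GP models the power-voltage curve and a GP-UCB rule picks the next operating voltage. The two-peaked curve (mimicking partial shading), kernel, and parameters are assumptions for this illustration; the paper additionally handles operational constraints and time variation.

    import numpy as np

    def rbf(A, B, ls=30.0):
        return np.exp(-0.5 * (A[:, None] - B[None, :])**2 / ls**2)

    def gp(X, y, Xs, noise=1e-3):
        """GP posterior mean and standard deviation (unit prior variance)."""
        L = np.linalg.cholesky(rbf(X, X) + noise * np.eye(len(X)))
        v = np.linalg.solve(L, rbf(X, Xs))
        mean = v.T @ np.linalg.solve(L, y)
        std = np.sqrt(np.maximum(1.0 - np.sum(v**2, axis=0), 0))
        return mean, std

    def power(V):
        """Synthetic P-V curve with two local maxima (partial shading)."""
        return np.exp(-((V - 180) / 40)**2) + 1.3 * np.exp(-((V - 320) / 30)**2)

    volts = np.linspace(50, 450, 200)
    X, y = np.array([250.0]), np.array([power(250.0)])  # start between the peaks
    for step in range(8):
        mean, std = gp(X, y, volts)
        V_next = volts[np.argmax(mean + 2.0 * std)]     # GP-UCB rule
        X, y = np.append(X, V_next), np.append(y, power(V_next))
    print(f"best voltage found: {X[np.argmax(y)]:.0f} V, power {y.max():.2f}")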
    
          
  4. Safe Controller Optimization for Quadrotors with Gaussian Processes
    Felix Berkenkamp, Angela P. Schoellig, Andreas Krause
    in Proc. of the IEEE International Conference on Robotics and Automation (ICRA).

    One of the most fundamental problems when designing controllers for dynamic systems is the tuning of the controller parameters. Typically, a model of the system is used to obtain an initial controller, but ultimately the controller parameters must be tuned manually on the real system to achieve the best performance. To avoid this manual tuning step, methods from machine learning, such as Bayesian optimization, have been used. However, as these methods evaluate different controller parameters on the real system, safety-critical system failures may happen. In this paper, we overcome this problem by applying, for the first time, a recently developed safe optimization algorithm, SafeOpt, to the problem of automatic controller parameter tuning. Given an initial, low-performance controller, SafeOpt automatically optimizes the parameters of a control law while guaranteeing safety. It models the underlying performance measure as a Gaussian process and only explores new controller parameters whose performance lies above a safe performance threshold with high probability. Experimental results on a quadrotor vehicle indicate that the proposed method enables fast, automatic, and safe optimization of controller parameters without human intervention.

    @inproceedings{Berkenkamp2016SafeOpt,
      title = {Safe Controller Optimization for Quadrotors with {G}aussian Processes},
      booktitle = {Proc. of the IEEE International Conference on Robotics and Automation (ICRA)},
      author = {Berkenkamp, Felix and Schoellig, Angela P. and Krause, Andreas},
      year = {2016},
      pages = {493--496},
      url = {\href{http://arxiv.org/abs/1509.01066}{arXiv:1509.01066 [cs.RO]}}
    }
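
    A simplified Python sketch of the SafeOpt selection step described above, on a grid of candidate gains with stand-in GP confidence bounds: only parameters whose lower bound clears the performance threshold are considered, and the most uncertain potential maximizer is evaluated next. The paper's full rule additionally tracks expanders, safe points that could enlarge the safe set, which are omitted here.

    import numpy as np

    fmin = 0.0                                    # minimum acceptable performance
    gains = np.linspace(0.1, 2.0, 50)             # candidate controller gains
    # Stand-in GP confidence bounds on the performance of each gain (assumed):
    mean = 1.0 - (gains - 1.2)**2
    width = 0.2 + 0.3 * np.abs(gains - 0.5)
    lcb, ucb = mean - width, mean + width

    safe = lcb >= fmin                            # only these may be evaluated
    maximizers = safe & (ucb >= lcb[safe].max())  # safe points that could be optimal
    x_next = gains[maximizers][np.argmax((ucb - lcb)[maximizers])]
    print(f"evaluate gain {x_next:.2f} next; performance stays above {fmin} w.h.p.")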
    
          
  5. Bayesian Optimization with Safety Constraints: Safe and Automatic Parameter Tuning in Robotics
    Felix Berkenkamp, Andreas Krause, Angela P. Schoellig
    Technical report, arXiv.

    Robotics algorithms typically depend on various parameters, the choice of which significantly affects the robot’s performance. While an initial guess for the parameters may be obtained from dynamic models of the robot, parameters are usually tuned manually on the real system to achieve the best performance. Optimization algorithms, such as Bayesian optimization, have been used to automate this process. However, these methods may evaluate parameters during the optimization process that lead to safety-critical system failures. Recently, a safe Bayesian optimization algorithm, called SafeOpt, has been developed and applied in robotics, which guarantees that the performance of the system never falls below a critical value; that is, safety is defined based on the performance function. However, coupling performance and safety is not desirable in most cases. In this paper, we define separate functions for performance and safety. We present a generalized SafeOpt algorithm that, given an initial safe guess for the parameters, maximizes performance but only evaluates parameters that satisfy all safety constraints with high probability. It achieves this by modeling the underlying and unknown performance and constraint functions as Gaussian processes. We provide a theoretical analysis and demonstrate in experiments on a quadrotor vehicle that the proposed algorithm enables fast, automatic, and safe optimization of tuning parameters. Moreover, we show an extension to context- or environment-dependent, safe optimization in the experiments.

    @techreport{Berkenkamp2016BayesianSafety,
      title = {Bayesian Optimization with Safety Constraints: Safe and Automatic Parameter Tuning in Robotics},
      publisher = {ArXiv},
      author = {Berkenkamp, Felix and Krause, Andreas and Schoellig, Angela P.},
      year = {2016},
      url = {\href{http://arxiv.org/abs/1602.04450}{arXiv:1602.04450 [cs.RO]}}
    }
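
    A minimal Python sketch of the paper's key generalization relative to the original SafeOpt: performance and safety are modeled by separate functions, and a parameter is admissible only if every safety constraint's lower confidence bound is non-negative. The bounds below stand in for two GP models, and the optimistic selection within the safe set is a simplification of the paper's rule.

    import numpy as np

    params = np.linspace(0, 1, 100)
    # Stand-in GP bounds: performance f and one safety constraint g (assumed).
    f_mean, f_width = np.sin(3 * params), 0.3 * np.ones_like(params)
    g_lcb = 0.5 - np.abs(params - 0.4)        # lower bound on the constraint

    safe = g_lcb >= 0.0                        # safety no longer tied to f
    f_ucb = f_mean + f_width
    x_next = params[safe][np.argmax(f_ucb[safe])]  # optimistic pick in the safe set
    print(f"evaluate parameters {x_next:.2f} (all safety constraints hold w.h.p.)")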
    
          

2015

  1. Safe and Robust Learning Control with Gaussian Processes
    Felix Berkenkamp, Angela P. Schoellig
    in Proc. of the European Control Conference (ECC).

    This paper introduces a learning-based robust control algorithm that provides robust stability and performance guarantees during learning. The approach uses Gaussian process (GP) regression based on data gathered during operation to update an initial model of the system and to gradually decrease the uncertainty related to this model. Embedding this data-based update scheme in a robust control framework guarantees stability during the learning process. Traditional robust control approaches have not previously considered online adaptation of the model and its uncertainty. As a result, their controllers do not improve performance during operation. Typical machine learning algorithms that have achieved similar high-performance behavior by adapting the model and controller online do not provide the guarantees presented in this paper. In particular, this paper considers a stabilization task, linearizes the nonlinear, GP-based model around a desired operating point, and solves a convex optimization problem to obtain a linear robust controller. The resulting performance improvements due to the learning-based controller are demonstrated in experiments on a quadrotor vehicle.

    @inproceedings{Berkenkamp2015Robust,
      title = {Safe and Robust Learning Control with {G}aussian Processes},
      booktitle = {Proc. of the European Control Conference (ECC)},
      author = {Berkenkamp, Felix and Schoellig, Angela P.},
      year = {2015},
      pages = {2501--2506}
    }
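
    A Python sketch of the model-update step this abstract describes, under assumed numbers: a GP learns the residual between a nominal linear model and observed dynamics, yielding a correction and a shrinking uncertainty bound at the operating point. The robust synthesis step that would consume these quantities (a convex program in the paper) is only indicated by a comment.

    import numpy as np

    def rbf(A, B, ls=0.5):
        return np.exp(-0.5 * (A[:, None] - B[None, :])**2 / ls**2)

    # Observed residuals between the true dynamics and the nominal model (assumed).
    rng = np.random.default_rng(0)
    X = rng.uniform(-1, 1, 20)
    resid = 0.3 * np.sin(2 * X) + 0.01 * rng.standard_normal(20)

    x_op = np.array([0.0])                     # operating point to stabilize
    L = np.linalg.cholesky(rbf(X, X) + 1e-4 * np.eye(len(X)))
    v = np.linalg.solve(L, rbf(X, x_op))
    mean = float(v.T @ np.linalg.solve(L, resid))
    std = float(np.sqrt(max(1.0 - (v**2).sum(), 0)))

    print(f"model correction at operating point: {mean:+.3f}")
    print(f"residual uncertainty for robust synthesis: +/- {2 * std:.3f}")
    # A robust controller would now be computed for the corrected linearization
    # with this (smaller) uncertainty bound, e.g. via a convex program.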
    
          
  2. Safe and Automatic Controller Tuning with Gaussian Processes
    Felix Berkenkamp, Angela P. Schoellig, Andreas Krause
    in Proc. of the Workshop on Machine Learning in Planning and Control of Robot Motion, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

    One of the most fundamental problems when designing controllers for dynamic systems is the tuning of the controller parameters. Typically, a model of the system is used to design an initial controller, but ultimately, the controller parameters must be tuned manually on the real system to achieve the best performance. To avoid this manual tuning, methods from machine learning, such as Bayesian optimization, have been used. However, as these methods evaluate different controller parameters, safety-critical system failures may happen. We overcome this problem by applying, for the first time, a recently developed safe optimization algorithm, SafeOpt, to the problem of automatic controller parameter tuning. Given an initial, low-performance controller, SafeOpt automatically optimizes the parameters of a control law while guaranteeing system safety and stability. It achieves this by modeling the underlying performance measure as a Gaussian process and only exploring new controller parameters whose performance lies above a safe performance threshold with high probability. Experimental results on a quadrotor vehicle indicate that the proposed method enables fast, automatic, and safe optimization of controller parameters without human intervention.

    @inproceedings{Berkenkamp2015SafeTuning,
      title = {Safe and Automatic Controller Tuning with {G}aussian Processes},
      booktitle = {Proc. of the Workshop on Machine Learning in Planning and Control of Robot Motion, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
      author = {Berkenkamp, Felix and Schoellig, Angela P. and Krause, Andreas},
      year = {2015}
    }
    
          

2014

  1. Learning-based Robust Control: Guaranteeing Stability while Improving Performance
    Felix Berkenkamp, Angela P. Schoellig
    in Proc. of the Workshop on Machine Learning in Planning and Control of Robot Motion, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

    To control dynamic systems, modern control theory relies on accurate mathematical models that describe the system behavior. Machine learning methods have proven to be an effective way to compensate for initial errors in these models and to achieve high-performance maneuvers by adapting the system model and control online. However, these methods usually do not guarantee stability during the learning process. On the other hand, the control community has traditionally accounted for model uncertainties by designing robust controllers. Robust controllers use a mathematical description of the uncertainty in the dynamic system derived prior to operation and guarantee robust stability for all uncertainties. Unlike machine learning methods, robust control does not improve the control performance by adapting the model online. This paper combines machine learning and robust control theory for the first time with the goal of improving control performance while guaranteeing stability. Data gathered during operation is used to reduce the uncertainty in the model and to learn systematic errors. Specifically, a nonlinear, nonparametric model of the unknown dynamics is learned with a Gaussian process. This model is used for the computation of a linear robust controller, which guarantees stability around an operating point for all uncertainties. As a result, the robust controller improves its performance online while guaranteeing robust stability. A simulation example illustrates the performance improvements due to the learning-based robust controller.

    @inproceedings{Berkenkamp2014LearningBased,
      title = {Learning-based Robust Control: Guaranteeing Stability while Improving Performance},
      booktitle = {Proc. of the Workshop on Machine Learning in Planning and Control of Robot Motion, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
      author = {Berkenkamp, Felix and Schoellig, Angela P.},
      year = {2014}
    }
    
          
  2. Hybrid Model Predictive Control of Stratified Thermal Storages in Buildings
    Felix Berkenkamp, Markus Gwerder
    Energy and Buildings, vol. 84.

    In this paper, a generic model predictive control (MPC) algorithm for the management of stratified thermal storage tanks in buildings is proposed that can be used independently of the building’s heat/cold generation, consumption, and consumption control. The main components of the considered storage management are the short-term load forecasting (STLF) of the heat/cold consumer(s) using weather forecasts, and the MPC algorithm using a bilinear dynamic model of the stratified storage and operating modes of the heat/cold generator(s), modeled as static operating points. The MPC algorithm chooses between these operating modes to satisfy the predicted cold demand with minimal costs. By considering the generator(s) in terms of operating modes, the bilinearity in the storage model is resolved, which leads to a hybrid MPC problem. For computational efficiency, this problem is approximated by an iterative algorithm that converges to a close-to-optimal solution. Simulation results suggest that the approach is well suited for use in buildings with a limited number of heat/cold generators. Additionally, the approach is promising for practical use because of its independence from the heat/cold consumer’s control and because it requires limited information and instrumentation on the plant, i.e., low costs for control equipment.

    @article{Berkenkamp2014Hybrid,
      title = {Hybrid Model Predictive Control of Stratified Thermal Storages in Buildings},
      volume = {84},
      issn = {0378-7788},
      journal = {Energy and Buildings},
      author = {Berkenkamp, Felix and Gwerder, Markus},
      year = {2014},
      pages = {233--240}
    }
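
    An illustrative Python sketch, not the paper's method: the storage is reduced to a single energy state, the generator to two static operating modes, and the hybrid MPC problem is solved by enumerating mode sequences over a short horizon. The paper instead uses a bilinear stratified-storage model and an iterative close-to-optimal algorithm; all numbers here are assumptions.

    import itertools
    import numpy as np

    modes = {"off": (0.0, 0.0), "on": (5.0, 2.0)}   # (cooling kWh/step, cost/step)
    demand = np.array([1.0, 4.0, 6.0, 2.0])          # forecast cold demand (kWh)
    E0, E_max, horizon = 3.0, 10.0, len(demand)      # storage state and capacity

    def rollout(seq):
        """Simulate the storage energy under a mode sequence; return its cost."""
        E, cost = E0, 0.0
        for mode, d in zip(seq, demand):
            q, c = modes[mode]
            E = E + q - d                  # simple energy balance
            cost += c
            if E < 0 or E > E_max:         # demand unmet or tank overfull
                return np.inf
        return cost

    best = min(itertools.product(modes, repeat=horizon), key=rollout)
    print("optimal mode sequence:", best, "cost:", rollout(best))
    # In closed loop, only the first mode is applied and the problem re-solved
    # at the next step with an updated forecast (receding horizon).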