Welcome! I am currently a researcher at Toyota Research Institute, working on new approaches for learning and control under uncertainty for cars operating beyond their limits of stability. In Fall 2024, I will join the Mechanical, Aerospace & Nuclear Engineering Department at Rensselaer Polytechnic Institute as an assistant professor.
More information about my research group will be posted soon. Feel free to reach out if you are a prospective Ph.D. student or would like to collaborate.
Research Interests. My research lies broadly at the intersection of scientific machine learning, reinforcement learning, control theory, formal methods, and optimization. It aims to build autonomous systems that can learn how to operate in the real world while respecting underlying data, computational, safety, and perception limitations. A recurring theme in my work is that data is never the only source of knowledge available for learning. By deriving formal techniques to incorporate prior knowledge into learning, I have shown significant improvements in data efficiency, explainability, and generalization to unseen tasks or environments. Furthermore, I have demonstrated how to use such knowledge for the formal verification of these systems, particularly in safety-critical applications such as vehicle control at the limits, aircraft control, and robotics.
Download my most recent CV.
Ph.D. in Electrical and Computer Engineering, August 2023
The University of Texas at Austin, US
B.S. and M.S. in Aerospace Engineering, 2018
ISAE-SUPAERO, France
M.S. (COMASIC) in Computer Science, 2017
École Polytechnique, France
Preparatory classes (classes préparatoires, junior undergraduate level) in Mathematics and Physics, 2014
Lycée Fénelon, France
We present a framework and algorithms for learning uncertainty-aware models of controlled dynamical systems that leverage a priori physics knowledge. Through experiments on a hexacopter and other simulated robotic systems, we demonstrate that the proposed approach yields data-efficient models that generalize beyond the training dataset, and that these learned models result in performant model-based reinforcement learning and model predictive control policies. A journal version of this paper, together with open-source code, will be submitted soon.
We propose a data-efficient, learning-based control approach, deployed on a Toyota Supra, to achieve autonomous drifting on various trajectories. Experiments show that a scarce amount of driving data (less than three minutes) is sufficient to achieve high-performance drifting on various trajectories at speeds up to 45 mph, with a 4× improvement in tracking performance, smoother control inputs, and faster computation compared to baselines.
We propose an approach to accelerate the training and integration of neural ordinary differential equations. We demonstrate that the approach achieves integration and training times that are an order of magnitude faster than the current state of the art, without any loss in accuracy.
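For readers unfamiliar with the model class, here is a minimal neural ODE sketch with a fixed-step RK4 integrator in PyTorch. It illustrates only the baseline setup, not the acceleration technique from the paper; the network size, step count, and state dimension are illustrative assumptions.

```python
# Minimal neural ODE sketch: an MLP vector field integrated with fixed-step RK4.
# This shows the base model class only; the paper's acceleration technique is
# not reproduced here. Network size and step count are illustrative choices.
import torch
import torch.nn as nn

class ODEFunc(nn.Module):
    """Learned vector field f_theta(x) approximating dx/dt."""
    def __init__(self, dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.Tanh(), nn.Linear(hidden, dim)
        )

    def forward(self, x):
        return self.net(x)

def rk4_integrate(f, x0, t0, t1, steps=20):
    """Classical fourth-order Runge-Kutta with a fixed step size."""
    x, h = x0, (t1 - t0) / steps
    for _ in range(steps):
        k1 = f(x)
        k2 = f(x + 0.5 * h * k1)
        k3 = f(x + 0.5 * h * k2)
        k4 = f(x + h * k3)
        x = x + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return x

# Training then amounts to matching integrated predictions to observed states.
f = ODEFunc(dim=2)
x0 = torch.randn(16, 2)            # batch of initial states
x_pred = rk4_integrate(f, x0, 0.0, 1.0)
```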
Effective inclusion of physics-based knowledge into deep neural network models of dynamical systems can greatly improve data efficiency and generalization. We develop a framework to train dynamics models while incorporating a priori system knowledge as inductive bias. The proposed approach learns to predict the system dynamics two orders of magnitude more accurately than a baseline approach that does not include prior knowledge, given the same training dataset.
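One common way to encode such a prior, sketched below as a simplified illustration rather than the exact formulation in the paper, is to learn only a residual correction on top of a known nominal physics model. The pendulum nominal dynamics and network sizes here are hypothetical examples.

```python
# Sketch of a physics-informed dynamics model: a known nominal model plus a
# learned residual. The pendulum nominal dynamics are a hypothetical example;
# the paper's framework encodes richer forms of prior knowledge.
import torch
import torch.nn as nn

def nominal_pendulum(x, u, g=9.81, l=1.0):
    """Known (approximate) physics: x = [theta, theta_dot], u = torque."""
    theta, theta_dot = x[..., 0], x[..., 1]
    theta_ddot = -(g / l) * torch.sin(theta) + u[..., 0]
    return torch.stack([theta_dot, theta_ddot], dim=-1)

class ResidualDynamics(nn.Module):
    """dx/dt = f_known(x, u) + f_theta(x, u), with f_theta learned from data."""
    def __init__(self, x_dim=2, u_dim=1, hidden=32):
        super().__init__()
        self.residual = nn.Sequential(
            nn.Linear(x_dim + u_dim, hidden), nn.Tanh(), nn.Linear(hidden, x_dim)
        )

    def forward(self, x, u):
        return nominal_pendulum(x, u) + self.residual(torch.cat([x, u], dim=-1))
```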
We develop a learning-based control algorithm with performance guarantees under streaming, noisy data from only a single, ongoing trial. Despite the scarcity of data, we show that the algorithm provides performance comparable to reinforcement learning algorithms trained over millions of environment interactions, while outperforming existing techniques that combine system identification and model predictive control.
We develop an algorithm for inverse reinforcement learning (IRL) in POMDPs that can incorporate high-level task specifications into the learning process. The algorithm significantly improves the scalability of existing IRL techniques for POMDPs thanks to our efficient approach for computing optimal policies in POMDPs. Furthermore, by leveraging the side information, we show that our algorithm achieves improved data efficiency over existing approaches.
We develop a probabilistic control algorithm, GTLProCo, for swarms of agents with heterogeneous dynamics and objectives, subject to high-level task specifications. The resulting algorithm, agnostic to the number of agents comprising the swarm, not only achieves decentralized control of the swarm but also significantly improves scalability over existing state-of-the-art algorithms.
We provide a case study for DaTaReach and DaTaControl, algorithms for the reachability analysis and control of systems with a priori unknown nonlinear dynamics. In a scenario involving an F-16 aircraft diving toward the ground from a low altitude with a high downward pitch angle, we show how DaTaControl prevents a ground collision using only the measurements obtained during the dive and elementary laws of physics as side information.
We develop a control algorithm that guarantees the safety, in the sense of confinement within a given safe set, of a control-affine system with unknown nonlinear dynamics. The algorithm integrates robust nonlinear feedback control laws with on-the-fly, data-driven approximations to output a control signal that keeps the closed-loop system within the given set.
A shield is attached to a system to guarantee safety by correcting the system's behavior at runtime. Existing methods that employ design-time synthesis of shields do not scale to multi-agent systems. Moreover, such shields are typically implemented in a centralized manner, requiring global information on the state of all agents in the system. We address these limitations through a new approach in which the shields are synthesized at runtime and do not require global information. There is a shield onboard every agent, which can only modify the behavior of that agent. In this fundamentally decentralized approach, the shield on every agent has two components: a pathfinder that corrects the agent's behavior and an ordering mechanism that dynamically modifies the agent's priority. The current priority determines whether the shield uses the pathfinder to modify the agent's behavior. We derive an upper bound on the maximum deviation of any agent from its original behavior. We prove that the worst-case synthesis time at runtime is quadratic in the number of agents, as opposed to exponential at design time for existing methods. We test the performance of the decentralized, runtime shield synthesis approach on a collision-avoidance problem. For 50 agents in a 50×50 grid, the runtime synthesis requires a few seconds per agent whenever a potential collision is detected. In contrast, the centralized design-time synthesis of shields for a similar setting is intractable beyond 4 agents in a 5×5 grid.
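A highly simplified sketch of the per-agent runtime logic described above, assuming a grid world with tuple-valued cells; the placeholder pathfinder, conflict check, and priority update rule are illustrative, not the paper's exact synthesis construction.

```python
# Simplified per-agent runtime shield sketch for a grid world. The pathfinder,
# conflict check, and priority rule are illustrative placeholders, not the
# exact synthesis procedure from the paper.
from dataclasses import dataclass

@dataclass
class AgentShield:
    agent_id: int
    priority: int  # updated at runtime by the ordering mechanism

    def pathfind(self, state, blocked):
        """Placeholder local pathfinder: wait in place. A real implementation
        would replan a short collision-free detour around the blocked cells."""
        return state

    def step(self, state, planned_step, neighbors):
        """Correct this agent's next move using only local information.
        `neighbors` is a list of (neighbor_shield, planned_next_cell) pairs."""
        neighbor_cells = {cell for _, cell in neighbors}
        if planned_step not in neighbor_cells:
            return planned_step                     # no potential collision: keep the plan
        if any(n.priority > self.priority for n, _ in neighbors):
            # Lower priority: defer to higher-priority agents and correct the move.
            return self.pathfind(state, neighbor_cells)
        self.priority += 1                          # illustrative priority update
        return planned_step                         # highest-priority agent keeps its move
```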
We develop a control algorithm that ensures the safety, in terms of confinement within a set, of a system with unknown, second-order nonlinear dynamics. The algorithm removes a series of standard but limiting assumptions made in the related literature: it does not require global boundedness, growth conditions, or a priori approximations of the unknown dynamics' terms.
We develop DaTaReach and DaTaControl for the reachability analysis and control of systems with a priori unknown nonlinear dynamics. The resulting algorithms are not only suitable for settings with real-time requirements but also provide provable performance guarantees. To this end, they merge noisy data from only a single finite-horizon trajectory with, when available, various forms of side information (from elementary principles) on the underlying dynamics.
We develop two data-driven algorithms, DaTaReach and DaTaControl, for the on-the-fly over-approximation of the reachable set and the constrained, near-optimal control of systems with unknown dynamics. By leveraging various forms of side information on the underlying unknown dynamics, these algorithms remain effective in scenarios with severely limited data.
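As intuition for the over-approximation step, here is a toy one-step interval bound computed from a single noise-free state/derivative sample and a Lipschitz bound used as side information. This is a coarse sketch of the idea only; DaTaReach uses a tighter and more general construction, and the function name and norms below are illustrative choices.

```python
# Toy one-step interval over-approximation of the reachable set of an unknown
# system xdot = f(x), using a single noise-free sample (x0, f(x0)) and a
# Lipschitz bound on f as side information. Intuition only, not DaTaReach.
import numpy as np

def one_step_reach_box(x0, xdot0, L, dt):
    """Axis-aligned box guaranteed to contain x(t0 + dt), assuming
    ||f(x) - f(y)||_inf <= L * ||x - y||_inf and L * dt < 1."""
    assert L * dt < 1.0, "time step too large for this coarse bound"
    # Radius r such that the trajectory stays within r of x0 on [t0, t0 + dt]:
    # r >= dt * (||xdot0||_inf + L * r)  =>  r = dt * ||xdot0||_inf / (1 - L * dt)
    r = dt * np.linalg.norm(xdot0, ord=np.inf) / (1.0 - L * dt)
    # Within that ball, each component of f deviates from xdot0 by at most L * r.
    dev = L * r
    lo = x0 + dt * (xdot0 - dev)
    hi = x0 + dt * (xdot0 + dev)
    return lo, hi

lo, hi = one_step_reach_box(
    x0=np.array([0.0, 1.0]), xdot0=np.array([1.0, -0.5]), L=2.0, dt=0.05
)
```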
Safety and performance are often two competing objectives in sequential decision-making problems. Our goal is to blend a performant controller and a safe controller into a single controller that is safer than the performant controller and accumulates higher rewards than the safe controller. To this end, we propose a blending algorithm based on the framework of contextual multi-armed multi-objective bandits.
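A minimal sketch of the blending idea, assuming two fixed controllers, discrete contexts, and a simple epsilon-greedy bandit over a scalarized reward/safety signal; the paper's multi-objective bandit formulation is more involved, and the class and parameter names here are illustrative.

```python
# Minimal sketch of blending a safe and a performant controller with a
# contextual bandit. Epsilon-greedy over a scalarized reward/safety signal is
# an illustrative simplification of the paper's multi-objective formulation.
import numpy as np

class ControllerBlender:
    def __init__(self, n_contexts, epsilon=0.1, safety_weight=2.0):
        self.q = np.zeros((n_contexts, 2))       # value estimate per (context, arm)
        self.counts = np.zeros((n_contexts, 2))
        self.epsilon = epsilon
        self.w = safety_weight

    def select(self, context):
        """Arm 0 = safe controller, arm 1 = performant controller."""
        if np.random.rand() < self.epsilon:
            return np.random.randint(2)
        return int(np.argmax(self.q[context]))

    def update(self, context, arm, reward, safety_violation):
        """Scalarize the two objectives and take an incremental average."""
        scalarized = reward - self.w * safety_violation
        self.counts[context, arm] += 1
        lr = 1.0 / self.counts[context, arm]
        self.q[context, arm] += lr * (scalarized - self.q[context, arm])

# Usage: at each step, map the state to a discrete context, pick an arm,
# apply the corresponding controller, then feed back reward and safety signals.
```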
As the number of agents comprising a swarm increases, individual-agent-based control techniques for collective task completion become computationally intractable. Instead, we develop an algorithm, agnostic to the size of the swarm, to control, in a decentralized and probabilistic manner, a collective property of the swarm: its density distribution, subject to high-level task specifications expressed in temporal logic.
We study the synthesis of mode-switching protocols for a class of discrete-time switched linear systems in which the mode jumps are governed by Markov decision processes (MDPs). We propose efficient convex optimization-based formulations to find a stabilizing policy for the MDP.
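For intuition about the flavor of such convex formulations (analysis only, not the paper's policy synthesis), mean-square stability of a Markov jump linear system under a fixed mode-jump chain reduces to a coupled-LMI feasibility problem, sketched here with cvxpy; the system matrices and transition probabilities are illustrative.

```python
# Coupled-LMI certificate for mean-square stability of a Markov jump linear
# system x_{k+1} = A_{m_k} x_k under a fixed mode-jump chain. Standard analysis
# condition for intuition only, not the paper's synthesis formulation.
import cvxpy as cp
import numpy as np

A = [np.array([[0.9, 0.3], [0.0, 0.8]]),   # mode 0 dynamics (illustrative)
     np.array([[0.8, 0.0], [0.2, 0.7]])]   # mode 1 dynamics (illustrative)
P_trans = np.array([[0.7, 0.3],            # P_trans[i, j] = Pr(next mode j | mode i)
                    [0.4, 0.6]])
n, M = 2, 2

P = [cp.Variable((n, n), symmetric=True) for _ in range(M)]
eps = 1e-6
constraints = []
for i in range(M):
    constraints.append(P[i] >> eps * np.eye(n))
    EP = sum(P_trans[i, j] * P[j] for j in range(M))  # expected P at the next mode
    constraints.append(A[i].T @ EP @ A[i] - P[i] << -eps * np.eye(n))

prob = cp.Problem(cp.Minimize(0), constraints)
prob.solve()
print("Mean-square stability certificate found:", prob.status == cp.OPTIMAL)
```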
We investigate approaches and algorithms for both offline and on-the-fly verification of low-level drone control algorithms. Specifically, we seek to verify that low-level control algorithms for drones satisfy the performance requirements specified during the conception of the system. To this end, we perform an in-depth analysis of quadrotor dynamics and its uncertainties. Then, we build on Taylor-based methods to estimate the reachable sets of the systems, which are then exploited to verify whether the system could reach unsafe states. The results obtained during this master's thesis are as follows: (a) we develop a hardware- and software-in-the-loop, Gazebo-based swarm simulator for the Crazyflie drones; (b) we leverage the Crazyflie simulator to investigate the safety of dynamical systems through Taylor-based methods and abstract interpretation. Specifically, we investigate on-the-fly, lightweight, real-time verification algorithms (for reach and safety properties) to be embedded on the Crazyflie drones.