
Featured Research

This website is obsolete and is being migrated. Please check our new website!
 
The ultimate goal of our research is to build trustworthy, interactive, and human-centered autonomous agents that can perceive, understand, and reason about the physical world; safely interact and collaborate with humans and other agents; and clearly explain their behaviors to build trust with humans, so that they can benefit society in daily life. To achieve this goal, we pursue interdisciplinary research that unifies techniques and tools from robotics, machine learning, reinforcement learning, explainable AI, control theory, optimization, and computer vision.
 

Explainable Relational Reasoning and Multi-Agent Interaction Modeling (Social & Physical)

We investigate relational reasoning and interaction modeling in the context of the trajectory prediction task, which aims to generate accurate, diverse future trajectory hypotheses or state sequences based on historical observations. Our research introduced the first unified relational reasoning toolbox that systematically infers the underlying relations/interactions between entities at different scales (e.g., pairwise, group-wise) and different abstraction levels (e.g., multiplex) by learning dynamic latent interaction graphs and hypergraphs from observable states (e.g., positions) in an unsupervised manner. The learned latent graphs are explainable and generalizable, significantly improving the performance of downstream tasks, including prediction, sequential decision making, and control. We also proposed a physics-guided relational learning approach for physical dynamics modeling.
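As a toy illustration of the general idea behind unsupervised latent-graph inference (in the spirit of neural relational inference, not the exact architecture of our toolbox), the sketch below embeds each agent's observed trajectory and scores every ordered agent pair with a softmax over discrete edge types. All weights, dimensions, and function names are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def embed(traj, W):
    # flatten each agent's observed trajectory and apply a toy linear encoder
    return np.tanh(traj.reshape(traj.shape[0], -1) @ W)

def infer_edges(emb, V, n_types=2):
    # score every ordered agent pair; softmax over discrete edge types
    n = emb.shape[0]
    probs = np.zeros((n, n, n_types))
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # no self-edges in the latent interaction graph
            logits = np.concatenate([emb[i], emb[j]]) @ V
            e = np.exp(logits - logits.max())
            probs[i, j] = e / e.sum()
    return probs

N, T, D, H = 3, 10, 2, 4            # agents, time steps, state dim, embed dim
traj = rng.normal(size=(N, T, D))   # observed positions for each agent
W = rng.normal(size=(T * D, H))     # hypothetical encoder weights
V = rng.normal(size=(2 * H, 2))     # hypothetical edge-type scorer
edge_probs = infer_edges(embed(traj, W), V)   # (N, N, n_types) edge posteriors
```

In the actual models, the encoder and edge scorer are trained end-to-end with a decoder that predicts future states, so the latent graph is inferred without any edge-type labels.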
 
Related Publications:
6. Multi-Agent Dynamic Relational Reasoning for Social Robot Navigation, submitted to IEEE Transactions on Robotics (T-RO), under review.
 

Interaction-Aware Decision Making and Model-Based Control

Although autonomous navigation in simple, static environments has been well studied, it remains challenging for robots to navigate highly dynamic, interactive scenarios (e.g., intersections, narrow corridors) where humans are involved. Robots must learn a safe and efficient behavior policy that can model the interactions, coordinate with surrounding static/dynamic entities, and generalize to out-of-distribution (OOD) situations. Our research introduced a novel interaction-aware decision making framework for autonomous vehicles based on reinforcement learning, which systematically integrates human internal state inference, domain knowledge, trajectory prediction, and counterfactual reasoning. We also investigate model-based control methods that leverage the learned pairwise and group-wise relations for social robot navigation around human crowds. Both methods achieve superior performance on their respective tasks across a wide range of evaluation metrics and provide explainable, human-understandable intermediate representations to build both users’ and developers’ trust.

Related Publications:
6. Interactive Autonomous Navigation with Internal State Inference and Interactivity Estimation, submitted to IEEE Transactions on Robotics (T-RO), under review.
7. Multi-Agent Dynamic Relational Reasoning for Social Robot Navigation, submitted to IEEE Transactions on Robotics (T-RO), under review.
 

Vision and Language Models for Embodied Intelligence

We investigate foundation models and vision language models (VLMs) for robotics and autonomous systems to enhance their reasoning capability and reliability. For example, inferring the short-term and long-term intentions of traffic participants and understanding the contextual semantics of scenes are key to scene understanding and situational awareness for autonomous vehicles. Moreover, how to enable autonomous agents (e.g., self-driving cars) to explain their reasoning, prediction, and decision making processes to human users (e.g., drivers, passengers) in a human-understandable form (e.g., natural language) to build humans’ trust remains largely underexplored. Therefore, we created the first multimodal dataset for a new risk object ranking and natural language explanation task in urban scenarios, as well as a rich dataset for intention prediction in autonomous driving, establishing benchmarks for the corresponding tasks. Meanwhile, our research introduced novel methods that achieve superior performance on these problems.

Improving Generalizability by Learning Context Relations

How to generalize trajectory prediction across different scenarios remains largely underexplored. In contrast to recent works that directly use the Cartesian coordinate system and global context images as input, we propose to leverage human prior knowledge, including the comprehension of pairwise relations between agents and pairwise context information extracted by self-supervised learning approaches, to attain an effective Frenet-based representation. We demonstrate that our approach achieves superior overall accuracy as well as zero-shot and few-shot transferability across different traffic scenarios with diverse layouts.
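As a minimal sketch of what a Frenet-based representation involves (an illustrative helper, not our model's actual pipeline), the function below projects a Cartesian point onto a polyline reference path to obtain the longitudinal arc length s and the signed lateral offset d:

```python
import numpy as np

def to_frenet(point, path):
    """Project a 2-D point onto a polyline path; return (s, d).

    s: arc length along the path to the projection point
    d: signed lateral offset (positive to the left of the travel direction)
    """
    diffs = np.diff(path, axis=0)
    seg_len = np.linalg.norm(diffs, axis=1)
    s_cum = np.concatenate([[0.0], np.cumsum(seg_len)])
    best = (np.inf, 0.0, 0.0)       # (distance, s, d)
    for k in range(len(diffs)):
        # clamp the projection onto segment k to the segment itself
        t = np.clip((point - path[k]) @ diffs[k] / seg_len[k] ** 2, 0.0, 1.0)
        proj = path[k] + t * diffs[k]
        d_vec = point - proj
        dist = np.linalg.norm(d_vec)
        if dist < best[0]:
            # 2-D cross product tells which side of the path the point lies on
            cross = diffs[k][0] * d_vec[1] - diffs[k][1] * d_vec[0]
            sign = 1.0 if cross >= 0.0 else -1.0
            best = (dist, s_cum[k] + t * seg_len[k], sign * dist)
    return best[1], best[2]

path = np.array([[0.0, 0.0], [10.0, 0.0]])     # straight reference path
s, d = to_frenet(np.array([3.0, 1.0]), path)   # s = 3.0, d = +1.0 (left of path)
```

Because s and d are defined relative to the local road geometry, the same model can transfer across intersections and corridors with very different global layouts.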
 

Continual / Lifelong Learning from Incremental Data

Current mainstream research focuses on achieving accurate prediction on a single large dataset. However, whether a multi-agent trajectory prediction model can be trained on a sequence of datasets, i.e., in a continual learning setting, remains an open question. Can current prediction methods avoid catastrophic forgetting? Can continual learning strategies be applied to multi-agent trajectory prediction? Motivated by generative replay methods in the continual learning literature, we propose a multi-agent interaction behavior prediction framework with a graph neural network-based conditional generative memory system to mitigate catastrophic forgetting. To the best of our knowledge, this work is the first attempt to study continual learning for multi-agent interaction behavior prediction. We empirically show that several approaches in the literature indeed suffer from catastrophic forgetting, while our approach maintains a low prediction error when datasets arrive sequentially.
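The generative replay loop can be sketched as follows; here a simple Gaussian fit stands in for the graph neural network-based conditional generative memory, and all names and ratios are illustrative:

```python
import numpy as np

class GaussianMemory:
    """Toy stand-in for the conditional generative memory (a Gaussian fit)."""
    def fit(self, data):
        self.mu, self.sigma = data.mean(0), data.std(0) + 1e-8
        return self
    def sample(self, n, rng):
        return rng.normal(self.mu, self.sigma, size=(n, len(self.mu)))

def continual_update(memory, new_data, replay_ratio, rng):
    # replay pseudo-samples from the old memory, mix them with the new
    # dataset, then refit the memory on the combined data (generative replay)
    if memory is None:
        return GaussianMemory().fit(new_data)
    n_replay = int(replay_ratio * len(new_data))
    replayed = memory.sample(n_replay, rng)
    combined = np.vstack([new_data, replayed])
    return GaussianMemory().fit(combined)

rng = np.random.default_rng(0)
memory = None
for center in (0.0, 5.0):                      # two datasets arriving in sequence
    data = rng.normal(center, 1.0, size=(1000, 2))
    memory = continual_update(memory, data, replay_ratio=1.0, rng=rng)
# the refit memory retains probability mass near both task centers
```

The key design choice is that old datasets are never stored: the memory regenerates pseudo-samples of past data, so training on a new dataset does not overwrite what was learned before.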


Diverse Prediction and Generation with Deep Generative Models

The objective of generative models is to approximate the true data distribution, so that one can generate new samples that resemble real data points with appropriate variance. Generative models have been widely employed in representation learning and distribution approximation. We designed novel trajectory and human skeleton motion prediction methods based on deep generative models, which generate accurate and diverse prediction hypotheses. These methods can be broadly applied to time series prediction problems.
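As a toy sketch of the sampling side (a stand-in linear decoder, not one of our actual models), each latent draw decodes to a distinct future-trajectory hypothesis, which is how a generative predictor produces a diverse hypothesis set:

```python
import numpy as np

rng = np.random.default_rng(0)

def decode(history_feat, z, W):
    # toy decoder: linear map from [history feature, latent code] to a trajectory
    out = np.concatenate([history_feat, z]) @ W
    return out.reshape(-1, 2)                 # (T_future, 2) future positions

H, Z, T_fut = 8, 4, 6
W = rng.normal(size=(H + Z, T_fut * 2))       # hypothetical decoder weights
history_feat = rng.normal(size=H)             # encoded observation history

# draw K latent samples -> K diverse trajectory hypotheses
K = 5
hypotheses = np.stack(
    [decode(history_feat, rng.normal(size=Z), W) for _ in range(K)]
)
```

In a trained model (e.g., a conditional VAE), the latent prior and decoder are fit so that the sample spread reflects the genuine multimodality of future behavior rather than arbitrary noise.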

State Estimation with Learning-Based Models

We proposed a constrained mixture sequential Monte Carlo method that mitigates mode collapse in sequential Monte Carlo methods for tracking multiple targets and significantly improves tracking accuracy. Since prediction is a core step in state estimation, we also showed that the prior update in the state estimation framework can be implemented with any learning-based, interaction-aware prediction model. Results in complex traffic scenarios show that the learned prediction model outperforms purely physics-based models by a large margin thanks to its relational reasoning capability. In particular, our method performs significantly better when handling missing or noisy sensor measurements.
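For illustration, the predict-weight-resample cycle that the constrained mixture method builds on can be sketched as a vanilla bootstrap particle filter on a 1-D target; the random-walk motion prior below marks exactly the slot a learned interaction-aware prediction model would fill, and all parameters are illustrative:

```python
import numpy as np

def particle_filter(obs, n_particles, rng, q=0.3, r=0.5):
    # bootstrap SMC: predict (prior update) -> weight by likelihood -> resample
    particles = rng.normal(obs[0], 1.0, size=n_particles)
    estimates = []
    for z in obs:
        # prior update: a learned interaction-aware prediction model
        # could replace this simple random-walk motion prior
        particles = particles + rng.normal(0.0, q, size=n_particles)
        w = np.exp(-0.5 * ((z - particles) / r) ** 2)   # Gaussian likelihood
        w /= w.sum()
        estimates.append(np.sum(w * particles))          # posterior mean
        idx = rng.choice(n_particles, size=n_particles, p=w)
        particles = particles[idx]                       # multinomial resampling
    return np.array(estimates)

rng = np.random.default_rng(0)
true_states = np.linspace(0.0, 5.0, 50)                  # slowly drifting target
obs = true_states + rng.normal(0.0, 0.5, size=50)        # noisy measurements
est = particle_filter(obs, n_particles=500, rng=rng)     # smoothed estimates
```

Multinomial resampling concentrates particles on likely states but is also what drives mode collapse when several targets are tracked at once; the constrained mixture formulation addresses that by keeping the particle set partitioned across modes.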