You can access all of my papers at 🔗Google Scholar.

2025

Residual Reward Models for Preference-based Reinforcement Learning

Chenyang Cao, Miguel Rogel-García, Mohamed Nabai, Xueqian Wang, Nicholas Rhinehart† († corresponding author)

arXiv 2025

We propose a residual reward model for reward learning that takes advantage of human prior knowledge.

FOSP: Fine-tuning Offline Safe Policy through World Models

Chenyang Cao, Yuchen Xin, Silang Wu, Longxiang He, Zichen Yan, Junbo Tan, Xueqian Wang† († corresponding author)

ICLR 2025 Conference

We propose a safe offline-to-online reinforcement learning algorithm that leverages world models. It ensures that the agent moves safely in the environment during online fine-tuning.

2024

A Wristband Haptic Feedback System for Robotic Arm Teleoperation

Silang Wu, Huayue Liang, Chenyang Cao, Chongkun Xia, Xueqian Wang, Houde Liu† († corresponding author)

ROBIO 2024 Conference

We build a wristband teleoperation system that provides haptic feedback from the end-effector.

Offline Goal-Conditioned Reinforcement Learning for Safety-Critical Tasks with Recovery Policy

Chenyang Cao, Zichen Yan, Renhao Lu, Junbo Tan†, Xueqian Wang† († corresponding author)

ICRA 2024 Conference

We propose an offline goal-conditioned reinforcement learning algorithm that solves the planning problem in constrained environments without interacting with them. The algorithm combines efficient planning with safe obstacle avoidance and balances the optimization of both objectives.
