I am a Ph.D. candidate at IRMV Lab, Shanghai Jiao Tong University, advised by Prof. Hesheng Wang.

I am interested in robot learning from human videos, egocentric hand-object interaction prediction, LiDAR place recognition, and occupancy forecasting.

🔥 News

  • May 2026: Selected as an Outstanding Reviewer of IEEE Robotics and Automation Letters (RA-L).
  • May 2026: Uni-Hand has been accepted by T-PAMI.
  • Nov. 2025: MADiff has been accepted by T-PAMI.
  • Jun. 2025: Four papers have been accepted by IROS 2025.
  • Feb. 2025: Spatiotemporal Decoupling for Efficient Vision-Based Occupancy Forecasting has been accepted by CVPR 2025.

📝 Publications

#: Equal contribution, *: Corresponding author.

Learning from Human Videos

arXiv 2026
Robot Learning from Human Videos survey

Robot Learning from Human Videos: A Survey

Junyi Ma, Erhang Zhang, Haoran Yang, Ditao Li, Chenyang Xu, Guangming Wang, Hesheng Wang*

T-PAMI 2026
Uni-Hand

Uni-Hand: Universal Hand Motion Forecasting in Egocentric Views

Junyi Ma, Wentao Bao, Jingyi Xu, Guanzhong Sun, Yu Zheng, Erhang Zhang, Xieyuanli Chen, Hesheng Wang*

T-PAMI 2025
MADiff

MADiff: Motion-Aware Mamba Diffusion Models for Hand Trajectory Prediction on Egocentric Videos

Junyi Ma#, Xieyuanli Chen#, Wentao Bao, Jingyi Xu, Hesheng Wang*

IROS 2025
Zero-Shot Temporal Interaction Localization

Zero-Shot Temporal Interaction Localization for Egocentric Videos

Erhang Zhang#, Junyi Ma#, Yin-Dong Zheng, Yixuan Zhou, Hesheng Wang*

Preprint
EgoLoc

EgoLoc: A Generalizable Solution for Temporal Interaction Localization in Egocentric Videos

Junyi Ma#, Erhang Zhang#, Yin-Dong Zheng, Yuchen Xie, Yixuan Zhou, Hesheng Wang*

HOI Prediction

IROS 2025
MMTwin

MMTwin: Novel Diffusion Models for Multimodal 3D Hand Trajectory Prediction

Junyi Ma, Wentao Bao, Jingyi Xu, Guanzhong Sun, Xieyuanli Chen, Hesheng Wang*

IROS 2025
Diff-IP2D

Diff-IP2D: Diffusion-Based Hand-Object Interaction Prediction on Egocentric Videos

Junyi Ma, Jingyi Xu, Xieyuanli Chen, Hesheng Wang*

Place Recognition and SLAM

RA-L / IROS 2022
OverlapTransformer

OverlapTransformer: An Efficient and Yaw-Angle-Invariant Transformer Network for LiDAR-Based Place Recognition

Junyi Ma, Jun Zhang, Jintao Xu, Rui Ai, Weihao Gu, Xieyuanli Chen*

TIE 2022
SeqOT

SeqOT: A Spatial-Temporal Transformer Network for Place Recognition Using Sequential LiDAR Data

Junyi Ma, Xieyuanli Chen, Jingyi Xu, Guangming Xiong*

TII 2023
CVTNet

CVTNet: A Cross-View Transformer Network for Place Recognition Using LiDAR Data

Junyi Ma, Guangming Xiong, Jingyi Xu, Xieyuanli Chen*

Point Cloud and Occupancy Forecasting

CVPR 2024
Cam4DOcc

Cam4DOcc: Benchmark for Camera-Only 4D Occupancy Forecasting in Autonomous Driving Applications

Junyi Ma#, Xieyuanli Chen#, Jiawei Huang, Jingyi Xu, Zhen Luo, Jintao Xu, Weihao Gu, Rui Ai, Hesheng Wang*

RA-L / ICRA 2024
PCPNet

PCPNet: An Efficient and Semantic-Enhanced Transformer Network for Point Cloud Prediction

Zhen Luo, Junyi Ma, Zijie Zhou, Guangming Xiong

Mentorship

🏆 Honors and Awards

  • Outstanding Master's Thesis, Beijing Institute of Technology, 2023.
  • National Scholarship for Graduate Students, Ministry of Education of China, 2022.
  • National Scholarship for Undergraduate Students, Ministry of Education of China, 2019.
  • Outstanding Master's Graduates in Beijing, 2023.
  • Outstanding Bachelor's Graduates in Beijing, 2020.
  • Best Paper Award at IEEE International Conference on Unmanned Systems (ICUS), 2021.
  • Outstanding Paper Presented at the Autonomous Robotic Technology Seminar (ARTS), 2023.

🎓 Education

  • Shanghai Jiao Tong University, Ph.D. candidate at IRMV Lab. Supervisor: Prof. Hesheng Wang.
  • Beijing Institute of Technology, M.S. in Mechanical Engineering, 2023. Supervisors: Prof. Guangming Xiong and Prof. Xieyuanli Chen.
  • Beijing Institute of Technology, B.S. in Mechanical Engineering, 2020. Bachelor thesis advisor: Prof. Oliver Dürr.

📦 Datasets

  • Haomo Dataset: mobile-robot LiDAR dataset collected in urban Beijing.
  • Cues-Poses Dataset: a toy dataset about mapping multiple cues to mutual poses of robots.
  • Cam4DOcc: benchmark for camera-only 4D occupancy forecasting.
  • CABH Benchmark: egocentric videos capturing human hands performing simple object manipulation tasks.

📄 Patents

  • [China Utility Model] Huilong Yu, Ziang Tian, Junyi Ma, Haotian Dong, Junqiang Xi, and Guangming Xiong. A multifunctional unmanned platform for subterranean space. ZL202123083457.8
  • [China Appearance Design] Huilong Yu, Ziang Tian, Junyi Ma, Haotian Dong, Junqiang Xi, and Guangming Xiong. A multifunctional unmanned caterpillar for subterranean space. ZL202130813635.4
  • [China Invention Publication] Guangming Xiong, Junyi Ma, Jingyi Xu, and Jiarui Song. A reliability analysis-based multi-robot cooperative localization and mapping method. ZL202110318362.5

🀝 Service