About Me
I’m a Ph.D. candidate in the MCG Group, School of Computer Science, Nanjing University, under the supervision of Prof. Limin Wang. I’m currently a research intern at Tencent ARC Lab, working on research related to world models. Before that, I was a research intern at ByteDance, focusing on discrete autoregressive video generation. I also have a long-standing focus on Multi-Object Tracking.
Earlier, I spent wonderful years as an undergraduate in the Department of Computer Science and Technology, Nanjing University, and received my Bachelor of Science in June 2021. During this period, I studied the impact propagation of code changes with Jiaming Xu and Prof. Liang Wang.
News
- 🎉 Two papers are accepted by ECCV 2026.
- 🔭 We propose HATReID-MOT, rethinking and improving the ReID cues in MOT tasks.
- 🎉 One paper is accepted by CVPR 2025.
- 🔭 We propose MOTIP, a streamlined yet effective method that treats Multiple Object Tracking as an ID prediction problem.
- 🎉 One paper is accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence.
- 🚀 MeMOTR is released on arXiv, a simple but effective long-term memory-augmented multi-object tracker.
- 🎉 One paper is accepted by ICCV 2023; the paper and code are released.
- 🚀 Fengyuan Shi and I release Dynamic MDETR, a sparse and light decoder for visual grounding.
- 🚀 I release a modern and user-friendly personal homepage template.
Publications
Internships
Research on world models, specifically causal diffusion video generation.
Focused on discrete autoregressive video generation and a motion-aware visual tokenizer.
Worked as an intern on mini-app development.
Education
Research in Computer Vision and Deep Learning, supervised by Prof. Limin Wang.
Focused on Computer Vision and Deep Learning, mainly about RGB-D scene recognition.
Worked on how the effects of code commits propagate.
Honors
Academic Service
Journal Review
- IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
- IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)
- IEEE Transactions on Multimedia (TMM)
- Computer Vision and Image Understanding (CVIU)
Conference Review
- IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- European Conference on Computer Vision (ECCV)
- Annual Conference on Neural Information Processing Systems (NeurIPS)
- ACM SIGGRAPH Conference and Exhibition on Computer Graphics and Interactive Techniques in Asia (SIGGRAPH Asia)
- AAAI Conference on Artificial Intelligence (AAAI)
Contact
Friends
- Xuyang Cao www.xuyangcao.com/