News
2025-06: Two papers are accepted to ICCV 2025.
2024-12: Two papers are accepted to ACM MM 2024 and AAAI 2025.
2024-02: Two papers are accepted to CVPR 2024.
2023-02: Two papers are accepted to CVPR 2023.
2022-05: I won the championship of the PIC Challenge HCVG Track at ACM MM 2022.
2021-06: I received the Best Paper Award at the CVPR 2021 PIC Workshop.
Selected Publications
* indicates equal contribution. See the full list on Google Scholar.
SynopGround: A Large-Scale Dataset for Multi-Paragraph Video Grounding from TV Dramas and Synopses
Chaolei Tan*, Zihang Lin*, Junfu Pu, Zhongang Qi, Wei-Yi Pei, Zhi Qu, Yexin Wang, Ying Shan, Wei-Shi Zheng, Jian-Fang Hu
ACM International Conference on Multimedia (ACM MM), 2024
Paper / Website
A large-scale video dataset with densely annotated paragraph timestamps to enable the new research direction of multi-paragraph video grounding on both long-form videos and long-term queries.
Siamese Learning with Joint Alignment and Regression for Weakly-Supervised Video Paragraph Grounding
Chaolei Tan, Jianhuang Lai, Wei-Shi Zheng, Jian-Fang Hu
Computer Vision and Pattern Recognition (CVPR), 2024
Paper
First attempt to explore the weakly-supervised setting of video paragraph grounding, proposing a siamese learning framework that jointly conducts feature alignment and boundary regression.
Hierarchical Semantic Correspondence Networks for Video Paragraph Grounding
Chaolei Tan, Zihang Lin, Jian-Fang Hu, Wei-Shi Zheng, Jianhuang Lai
Computer Vision and Pattern Recognition (CVPR), 2023
Paper
Introducing hierarchical modeling into video paragraph grounding by hierarchically aligning semantic correspondence across videos and paragraphs for temporal decoding at multiple granularities.
ReferDINO: Referring Video Object Segmentation with Visual Grounding Foundations
Tianming Liang, Kun-Yu Lin, Chaolei Tan, Jianguo Zhang, Wei-Shi Zheng, Jian-Fang Hu
International Conference on Computer Vision (ICCV), 2025
Paper / Website
Our first attempt to adapt pretrained foundational visual grounding models to Referring Video Object Segmentation (RVOS).
Ranking Distillation for Open-Ended Video Question Answering with Insufficient Labels
Tianming Liang, Chaolei Tan, Beihao Xia, Wei-Shi Zheng, Jian-Fang Hu
Computer Vision and Pattern Recognition (CVPR), 2024
Paper
Tackles the incomplete-annotation issue in open-ended video question answering with ranking distillation.
Collaborative Static and Dynamic Vision-Language Streams for Spatio-Temporal Video Grounding
Zihang Lin, Chaolei Tan, Jian-Fang Hu, Zhi Jin, Tiancai Ye, Wei-Shi Zheng
Computer Vision and Pattern Recognition (CVPR), 2023
Paper
Models collaborative static and dynamic vision-language streams for better spatio-temporal video grounding.
Honors and Awards
HKUST RedBird PhD Scholarship
CSIG Outstanding Master's Dissertation Honorable Mention Award
SYSU Outstanding Master's Dissertation Award
1st Place Award of PIC Challenge HCVG Track at ACM MM 2022
Best Paper Award of PIC Workshop at CVPR 2021
SYSU Guangdong Guangda Further Study Scholarship
Guangdong Soong Ching-ling Scholarship