Publications

— [!NEW] —

Kaifeng Gao, Jiaxin Shi, Hanwang Zhang, Chunping Wang, Jun Xiao, Long Chen. Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing. In International Conference on Machine Learning (ICML), 2025. [arXiv][Code]

Kaifeng Gao, Siqi Chen, Hanwang Zhang, Jun Xiao, Yueting Zhuang, and Qianru Sun. Generalized Visual Relation Detection with Diffusion Models. Accepted for publication in IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2025.
Wenqing Wang, Kaifeng Gao, Yawei Luo, Tao Jiang, Fei Gao, Jian Shao, Jianwen Sun, Jun Xiao. Triple Correlations-Guided Label Supplementation for Unbiased Video Scene Graph Generation. In Proceedings of the 31st ACM International Conference on Multimedia (ACM MM), 2023. [Paper][arXiv]
Kaifeng Gao, Long Chen, Hanwang Zhang, Jun Xiao, Qianru Sun. Compositional Prompt Tuning with Motion Cues for Open-vocabulary Video Relation Detection. In International Conference on Learning Representations (ICLR), 2023. [Paper][Code]
Kaifeng Gao, Long Chen, Yulei Niu, Jian Shao, and Jun Xiao. Classification-Then-Grounding: Reformulating Video Scene Graphs as Temporal Bipartite Graphs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022. [Paper][Code]
Shaoning Xiao, Long Chen, Kaifeng Gao, Zhao Wang, Yi Yang, Jun Xiao. Rethinking Multi-Modal Alignment in Video Question Answering from Feature and Sample Perspectives. In The 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022. [Paper]
Kaifeng Gao, Long Chen, Yifeng Huang, and Jun Xiao. Video Relation Detection via Tracklet based Visual Transformer. In Proceedings of the 29th ACM International Conference on Multimedia (ACM MM), 2021. [Paper][arXiv][Code]
