Publications

2025

  • Multi-subject open-set personalization in video generation
    Tsai-Shien Chen, Aliaksandr Siarohin, Willi Menapace, Yuwei Fang, Kwot Sin Lee, Ivan Skorokhodov, Kfir Aberman, Jun-Yan Zhu, Ming-Hsuan Yang, Sergey Tulyakov
    In CVPR, 2025.
  • Mind the time: Temporally-controlled multi-event video generation
    Ziyi Wu, Aliaksandr Siarohin, Willi Menapace, Ivan Skorokhodov, Yuwei Fang, Varnith Chordia, Igor Gilitschenski, Sergey Tulyakov
    In CVPR, 2025.

2024

  • VIMI: Grounding Video Generation through Multi-modal Instruction
    Yuwei Fang, Willi Menapace, Aliaksandr Siarohin, Tsai-Shien Chen, Kuan-Chien Wang, Ivan Skorokhodov, Graham Neubig, Sergey Tulyakov
    In EMNLP, 2024.
    [PDF]
  • VIA: A Spatiotemporal Video Adaptation Framework for Global and Local Video Editing
    Jing Gu, Yuwei Fang, Ivan Skorokhodov, Peter Wonka, Xinya Du, Sergey Tulyakov, Xin Eric Wang
    Under Review, NeurIPS 2024.
  • Evaluating very long-term conversational memory of LLM agents
    Adyasha Maharana, Dong-Ho Lee, Sergey Tulyakov, Mohit Bansal, Francesco Barbieri, Yuwei Fang
    In ACL, 2024.
    [PDF] [Project]
  • PLUG: Leveraging pivot language in cross-lingual instruction tuning
    Zhihan Zhang, Dong-Ho Lee, Yuwei Fang, Wenhao Yu, Mengzhao Jia, Meng Jiang, Francesco Barbieri
    In ACL, 2024.
  • MoA: Mixture-of-Attention for Subject-Context Disentanglement in Personalized Image Generation
    Jackson Wang, Daniil Ostashev, Yuwei Fang, Sergey Tulyakov, Kfir Aberman
    In SIGGRAPH Asia, 2024.
  • Panda-70M: Captioning 70M videos with multiple cross-modality teachers
    Tsai-Shien Chen, Aliaksandr Siarohin, Willi Menapace, Ekaterina Deyneka, Hsiang-wei Chao, Byung Eun Jeon, Yuwei Fang, Hsin-Ying Lee, Jian Ren, Ming-Hsuan Yang, Sergey Tulyakov
    In CVPR, 2024.
    [PDF] [Code]
  • Snap Video: Scaled spatiotemporal transformers for text-to-video synthesis
    Willi Menapace, Aliaksandr Siarohin, Ivan Skorokhodov, Ekaterina Deyneka, Tsai-Shien Chen, Anil Kag, Yuwei Fang, Aleksei Stoliar, Elisa Ricci, Jian Ren, Sergey Tulyakov
    In CVPR, 2024.
    [PDF]
  • i-Code Studio: A Configurable and Composable Framework for Integrative AI
    Yuwei Fang, Mahmoud Khademi, Chenguang Zhu, Ziyi Yang, Reid Pryzant, Yichong Xu, Yao Qian, Takuya Yoshioka, Lu Yuan, Michael Zeng, Xuedong Huang
    In EMNLP (System Demonstrations), 2024.
    [PDF]
  • i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data
    Ziyi Yang, Mahmoud Khademi, Yichong Xu, Reid Pryzant, Yuwei Fang, Chenguang Zhu, Dongdong Chen, Yao Qian, Mei Gao, Yi-Ling Chen, Robert Gmyr, Noel Codella, Naoyuki Kanda, Bin Xiao, Lu Yuan, Takuya Yoshioka, Michael Zeng, Xuedong Huang
    In NAACL, 2024.
    [PDF]

2023

  • i-Code: An Integrative and Composable Multimodal Learning Framework
    Ziyi Yang*, Yuwei Fang*, Chenguang Zhu, Reid Pryzant, Dongdong Chen, Yu Shi, Yichong Xu, Yao Qian, Mei Gao, Yi-Ling Chen, Liyang Lu, Yujia Xie, Robert Gmyr, Noel Codella, Naoyuki Kanda, Bin Xiao, Lu Yuan, Takuya Yoshioka, Michael Zeng, Xuedong Huang
    In AAAI, 2023.
    [PDF]
  • Unifying Vision, Text, and Layout for Universal Document Processing
    Zineng Tang, Ziyi Yang, Guoxin Wang, Yuwei Fang, Yang Liu, Chenguang Zhu, Michael Zeng, Cha Zhang, Mohit Bansal
    In CVPR, 2023.
    [PDF] [Code]
  • MACSum: Controllable Summarization with Mixed Attributes
    Yusen Zhang, Yang Liu, Ziyi Yang, Yuwei Fang, Yulong Chen, Dragomir Radev, Chenguang Zhu, Michael Zeng, Rui Zhang
    In TACL, 2023.
    [PDF] [Code]

2022

  • Retrieval Augmentation for Commonsense Reasoning: A Unified Approach
    Wenhao Yu, Chenguang Zhu, Zhihan Zhang, Shuohang Wang, Zhuosheng Zhang, Yuwei Fang, Meng Jiang
    In EMNLP, 2022.
    [PDF] [Code] [Leaderboard]
  • Task Compass: Scaling Multi-task Pre-training with Task Prefix
    Zhuosheng Zhang, Shuohang Wang, Yichong Xu, Yuwei Fang, Wenhao Yu, Yang Liu, Hai Zhao, Chenguang Zhu, Michael Zeng
    In Findings of EMNLP, 2022.
    [PDF]
  • Leveraging Knowledge in Multilingual Commonsense Reasoning
    Yuwei Fang, Shuohang Wang, Yichong Xu, Ruochen Xu, Siqi Sun, Chenguang Zhu, Michael Zeng
    In Findings of ACL, 2022.
    [PDF] [Leaderboard]
  • Training Data is More Valuable than You Think: A Simple and Effective Method by Retrieving from Training Data
    Shuohang Wang, Yichong Xu, Yuwei Fang, Yang Liu, Siqi Sun, Ruochen Xu, Chenguang Zhu, Michael Zeng
    In ACL, 2022.
    [PDF] [Code]
  • KG-FiD: Infusing Knowledge Graph in Fusion-in-Decoder for Open-Domain Question Answering
    Donghan Yu, Chenguang Zhu, Yuwei Fang, Wenhao Yu, Shuohang Wang, Yichong Xu, Xiang Ren, Yiming Yang, Michael Zeng
    In ACL, 2022.
    [PDF]
  • Dict-BERT: Enhancing Language Model Pre-training with Dictionary
    Wenhao Yu, Chenguang Zhu, Yuwei Fang, Donghan Yu, Shuohang Wang, Yichong Xu, Michael Zeng, Meng Jiang
    In Findings of ACL, 2022.
    [PDF] [Code]
  • RetGen: A Joint Framework for Retrieval and Grounded Text Generation Modeling
    Yizhe Zhang, Siqi Sun, Xiang Gao, Yuwei Fang, Chris Brockett, Michel Galley, Jianfeng Gao, Bill Dolan
    In AAAI, 2022.
    [PDF] [Code]

2021

  • FILTER: An enhanced fusion method for cross-lingual language understanding
    Yuwei Fang*, Shuohang Wang*, Zhe Gan, Siqi Sun, Jingjing Liu
    In AAAI, 2021.
    [PDF] [Code] [Leaderboard]
  • LightningDOT: Pre-training Visual-Semantic Embeddings for Real-Time Image-Text Retrieval
    Siqi Sun*, Yen-Chun Chen*, Linjie Li, Shuohang Wang, Yuwei Fang, Jingjing Liu
    In NAACL, 2021.
    [PDF] [Code]
  • Cluster-former: Clustering-based sparse transformer for long-range dependency encoding
    Shuohang Wang, Luowei Zhou, Zhe Gan, Yen-Chun Chen, Yuwei Fang, Siqi Sun, Yu Cheng, Jingjing Liu
    In Findings of ACL, 2021.
    [PDF] [Leaderboard]

2020

  • Hierarchical graph network for multi-hop question answering
    Yuwei Fang, Siqi Sun, Zhe Gan, Rohit Pillai, Shuohang Wang, Jingjing Liu
    In EMNLP, 2020.
    [PDF] [Code] [Leaderboard]
  • Cross-Thought for Sentence Encoder Pre-training
    Shuohang Wang, Yuwei Fang, Siqi Sun, Zhe Gan, Yu Cheng, Jing Jiang, Jingjing Liu
    In EMNLP, 2020.
    [PDF] [Code]
  • Contrastive Distillation on Intermediate Representations for Language Model Compression
    Siqi Sun, Zhe Gan, Yu Cheng, Yuwei Fang, Shuohang Wang, Jingjing Liu
    In EMNLP, 2020.
    [PDF] [Code]