I am a Senior Researcher at Microsoft Dynamics 365 AI Research, primarily working on Vision-and-Language Representation Learning, Generative Pre-training, and Adversarial Machine Learning. I also have broad interests in other machine learning topics. I received my Ph.D. degree from Duke University in Spring 2018. Before that, I received my Master's and B.Sc. from Peking University in 2013 and 2010, respectively. My Ph.D. advisor is Lawrence Carin. I can be reached at zhe.gan@microsoft.com.
I am serving (or have served) as an Area Chair for ICML 2021, ACL 2021, ICLR 2021, and NeurIPS 2020/2019, and as a Senior Program Committee (SPC) member for AAAI 2021/2020. I received the AAAI-20 Outstanding SPC Award.
Research Highlights:
- [2021/01] Two papers accepted by ICLR 2021. Topics include using information-theoretic tools for (i) improved robustness of language models (BERT and RoBERTa) and (ii) zero-shot voice style transfer.
- [2021/01] Our Meta Module Network wins the Best Student Paper Honorable Mention Award at WACV 2021.
- [2020/09] Our VILLA paper was accepted to NeurIPS 2020 as a Spotlight paper with review scores of 8/8/8/7. It is the first known effort to study large-scale adversarial training for vision-and-language representation learning in both the pre-training and finetuning stages.
- [2020/09] 7 long papers accepted by EMNLP 2020: 6 to the main conference and 1 to Findings of EMNLP 2020. Topics include: (i) Large-scale LM compression; (ii) Sentence embedding pre-training; (iii) Video+language pre-training; (iv) Constrained text generation via pre-training; (v) Summarization; (vi) Multi-hop reasoning for QA; and (vii) Text style transfer.
- [2020/09] We achieved #1 on the XTREME and XGLUE leaderboards for cross-lingual language understanding. See our FILTER paper for details.
- [2020/09] 2 papers accepted by ACCV 2020. Topics include: (i) face image editing; and (ii) unsupervised domain adaptation.
- [2020/08] Will serve as an Area Chair for ICLR 2021, and a SPC member (i.e., Meta-Reviewer) for AAAI 2021.
- [2020/07] Two papers were accepted to ECCV 2020: (i) UNITER, a state-of-the-art pre-trained Vision+Language (V+L) model; and (ii) VALUE (ECCV Spotlight), the first work on probing pre-trained V+L models.
- [2020/06] Two papers were accepted to ICML 2020: (i) CLUB, a novel upper bound of mutual information that is deeply connected with contrastive learning; and (ii) GOT, a graph optimal transport framework for cross-domain alignment that can be used for V+L and NLP problems, such as VQA and NMT.
- [2020/06] At this year's CVPR, we will give a tutorial on "Recent Advances in Vision-and-Language Research", covering recent popular multi-modal pre-training methods and other topics. More details are provided on the tutorial website here.
- [2020/04] Two CVPR and three ACL papers were accepted. The CVPR papers cover: (i) high-resolution image synthesis from salient object layout, and (ii) a new dataset for video-and-language understanding. The ACL papers cover: (i) text summarization based on discourse units, (ii) BERT for text generation, and (iii) text generation that models the distant future.
- [2020/03] Will serve as an Area Chair for NeurIPS 2020.
- [2020/01] I received the AAAI-20 Outstanding SPC Award.
- [2019/09] Our new work UNITER achieves state-of-the-art results on 6 Vision-and-Language tasks across 9 datasets (VQA, VCR, NLVR, Image-Text Retrieval, Visual Entailment, Referring Expression).
- [2019/09] Our latest Adversarial Training model has beaten Facebook's RoBERTa on the GLUE benchmark. The paper is available here.
- [2019/08] 4 papers were accepted to EMNLP. Topics include: (i) BERT model compression, (ii) domain adaptation for MRC, (iii) domain adaptation for text style transfer, and (iv) image caption evaluation.
© January 2021 Zhe Gan