Xianjie Dai

I am a Research Assistant at SJTU, where I work with Prof. Peilin Zhao and Dr. Chenghao Liu.

Previously, I was a Research Assistant at KAUST, where I worked with Prof. Mohamed Elhoseiny at Vision-CAIR and collaborated closely with Dr. Jian Ding on generative models and embodied AI. My research focused on end-to-end vision-language and vision-language-action models.

I received my M.S. degree in Computer Science from EPFL. At EPFL, I did my semester project at the eCEO Lab with Prof. Devis Tuia and Dr. Li Mi. Before that, I obtained my bachelor's degree from PolyU.

Email  /  Scholar  /  LinkedIn  /  GitHub


Research

My current work focuses on in-context learning and generalization in embodied AI. More broadly, I am interested in multimodal learning, vision-language models, representation learning, and reliable decision-making systems.

Publications

Recent Work on Embodied AI and Robot Learning
Manuscripts under review.
Knowledge-aware Text-Image Retrieval for Remote Sensing Images
Li Mi, Xianjie Dai, Javiera Castillo-Navarro, Devis Tuia
IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2024
paper
ConVQG: Contrastive Visual Question Generation with Multimodal Guidance
Li Mi, Syrielle Montariol, Javiera Castillo-Navarro, Xianjie Dai, Antoine Bosselut, Devis Tuia
AAAI Conference on Artificial Intelligence (AAAI), 2024
paper / project page

Education

École Polytechnique Fédérale de Lausanne (EPFL)
M.Sc. in Computer Science

The Hong Kong Polytechnic University
B.Eng. (Hons) in Electronic and Information Engineering

Misc

My research experience spans embodied AI, vision-language learning, biomedical imaging, and multimodal representation learning.

Website template adapted from Jon Barron's academic website.