Xianjie Dai

I am a Research Assistant at SJTU, where I work with Prof. Peilin Zhao and Dr. Chenghao Liu.

Previously, I was a Research Assistant at KAUST, where I worked with Prof. Mohamed Elhoseiny at Vision-CAIR and collaborated closely with Dr. Jian Ding on generative models and embodied AI. My research focused on end-to-end vision-language and vision-language-action models.

I received my M.S. degree in Computer Science from EPFL. At EPFL, I did my semester project at the eCEO Lab with Prof. Devis Tuia and Dr. Li Mi. Before that, I obtained my bachelor's degree from PolyU.

Email  /  Scholar  /  LinkedIn  /  GitHub


Research

My current work focuses on in-context learning and generalization in embodied AI. More broadly, I am interested in multimodal learning, vision-language models, representation learning, and reliable decision-making systems.

Publications

Recent Work on Embodied AI and Robot Learning
Manuscripts under review.
Knowledge-aware Text-Image Retrieval for Remote Sensing Images
Li Mi, Xianjie Dai, Javiera Castillo-Navarro, Devis Tuia
IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2024
paper
ConVQG: Contrastive Visual Question Generation with Multimodal Guidance
Li Mi, Syrielle Montariol, Javiera Castillo-Navarro, Xianjie Dai, Antoine Bosselut, Devis Tuia
AAAI Conference on Artificial Intelligence (AAAI), 2024
paper / project page

Education

École Polytechnique Fédérale de Lausanne (EPFL)
M.Sc. in Computer Science

The Hong Kong Polytechnic University
B.Eng. (Hons) in Electronic and Information Engineering

Misc

My research experience spans embodied AI, vision-language learning, biomedical imaging, and multimodal representation learning.

Website template adapted from Jon Barron's academic website.