Biography
Hello! I am a final-year PhD candidate in Computer Science at Hong Kong Baptist University (HKBU), advised by Dr. Jing Ma. My research interests lie in the development of novel foundation models and their applications.
Currently, I am visiting the National University of Singapore (NUS), under the supervision of Professor Mohan Kankanhalli.
I plan to graduate in the fall of 2025 and am seeking a full-time researcher/postdoc position. If you are interested, please feel free to contact me (email: cszyluo AT comp.hkbu.edu.hk; WeChat ID: chiyeunglaw).
If you are interested in working with me, please feel free to email me. Remote collaboration is also welcome!
Representative Works
- Code Intelligence and LLMs: WizardCoder: Empowering Code Large Language Models with Evol-Instruct (ICLR 2024). This is the first work to truly close the gap between open-source and closed-source Code LLMs, and it is widely used in follow-up works.
  - [2024/01/04] We released WizardCoder-33B-V1.1, the SOTA open-source Code LLM on the EvalPlus Leaderboard.
  - [2023/08/26] We released WizardCoder-Python-34B-V1.0, which achieves 73.2 pass@1 on HumanEval, surpassing GPT-4 (2023/03/15), ChatGPT-3.5, and Claude 2.
  - [2023/06/16] Our WizardCoder-15B-V1.0 achieves a 57.3 pass@1 score on HumanEval, more than 20 points higher than the SOTA open-source LLMs.
- Multimodal Representation Learning: LexLIP: Lexicon-Bottlenecked Language-Image Pre-Training for Large-Scale Image-Text Sparse Retrieval (ICCV 2023). LexLIP introduces a new paradigm for image-text retrieval that outperforms CLIP with 5.8x faster retrieval speed and 19.1x less index storage memory. (Code)
- Language Models: Positional Artefacts Propagate Through Masked Language Model Embeddings (ACL 2021). We were the first to discover the outlier-neuron phenomenon in Transformer-based LMs, a finding widely used in recent work on transformer quantization.
Work Experience
- Ongoing, Research Intern, Rhymes AI.
- Feb. 2024 - Jun. 2024, Research Intern, Language Technology Lab, Alibaba DAMO Academy - Singapore.
- May 2023 - Jan. 2024, WizardCoder Core Contributor, WizardLM Team, Microsoft.
- Nov. 2022 - Jan. 2024, Research Intern, Data, Knowledge, and Intelligence Group, Microsoft Research Lab - Asia (MSRA).
- Jan. 2022 - Jul. 2022, Research Intern, NLP Research Group, Fuxi AI Lab, NetEase, Inc.
- Jan. 2021 - Jul. 2021, Research Intern, NLP Research Group, Fuxi AI Lab, NetEase, Inc.
- May 2020 - Aug. 2020, Engineering Intern, Jovi Chatbot Group, VIVO Communication Technology Co., Ltd.
Academic Services
- Conference Reviewer: EMNLP 2022-23, ACL 2023, ACL Rolling Review 2023-24, AAAI 2023-24, ACCV 2022, ECCV 2024
- Workshop Reviewer: Instruction Workshop @NeurIPS 2023
- Student Volunteer: EMNLP 2022
Invited Talks
- "WizardCoder: Empowering Code Large Language Models with Evol-Instruct" at Neurosymbolic Reading Group, MIT. (Slides)