Keynote Speeches

Keynote 1

Title: Navigating Challenges in LLMs: Problems and Directions

Speaker: Prof Wei Lu, Singapore University of Technology and Design (SUTD)


Wei Lu is currently an Associate Professor and Associate Head (Research) of the Information Systems Technology and Design Pillar of the Singapore University of Technology and Design (SUTD). He is also the Director of the StatNLP Research Group, which focuses on fundamental research on Natural Language Processing (NLP) and Large Language Models (LLMs). He is currently serving as an Action Editor for Computational Linguistics (CL) and Transactions of the Association for Computational Linguistics (TACL). He served as a Senior Area Chair for ACL, EMNLP, and NAACL. He also served as a PC Chair for NLPCC 2022 and IJCNLP/AACL 2023. He received the Best Paper Award at EMNLP 2011 (top 0.16%), Best System Paper Award at SemEval 2022 (top 0.45%), and Area Chair Award (‘Resources and Evaluation’ Area) at ACL 2023 (top 0.42% within the area).


I will discuss some challenges associated with the current state of LLMs and offer our perspectives on them. Specifically, I will share some of our recent endeavors that center around the following research themes:

Structures: Modern chat models are great at many things, but it is recognized that they may struggle with tasks such as structured prediction. I’ll discuss some theoretical limitations of such models while offering some observations on the strong potential of alternative models, such as masked language models, for such tasks. Based on that, I’ll share some perspectives on how to improve chat models’ capabilities on such tasks.

Reasoning: The chain-of-thought (CoT) prompting method demonstrates LLMs’ capabilities to carry out step-by-step reasoning. We argue that LLMs may be able to perform structured multi-dimensional reasoning, which is demonstrated by our Tab-CoT prompting mechanism. We discuss how this may serve as a step towards better understanding the emergent behaviors of LLMs, one of the most fundamental research problems in LLMs.

Fine-tuning: While various parameter-efficient fine-tuning (PEFT) methods have been successfully proposed, whether further storage reduction is feasible remains a research question. We have identified an approach that can be applied to all existing PEFT techniques and have demonstrated its effectiveness in significantly reducing the storage requirements for additional parameters.

Pre-training: I’ll discuss some of our ongoing efforts to investigate better ways to effectively pre-train LLMs. One of our current focuses is to build effective yet relatively small LLMs. I will elaborate on the significance of such a direction, how such models may benefit the community, and what their practical implications could be.

Keynote 2

Title: Speech Processing in the Era of GPT

Speaker: Prof Lei Xie, Northwestern Polytechnical University (NPU), China


Lei Xie is a professor at the School of Computer Science, Northwestern Polytechnical University (NPU) in Xi’an, China. He serves as the leader of the Audio, Speech, and Language Processing Group (ASLP@NPU). Prior to joining NPU, he held positions at renowned institutions such as Vrije Universiteit Brussel (VUB), City University of Hong Kong, and The Chinese University of Hong Kong. With a prolific academic career, Lei Xie has authored or co-authored more than 300 papers published in esteemed journals and conferences such as IEEE/ACM Transactions on Audio, Speech and Language Processing, IEEE Transactions on Multimedia, Interspeech, ICASSP, ASRU, SLT, ACL, AAAI and ACM Multimedia. His contributions have earned him several prestigious best paper awards at flagship conferences. Lei Xie’s research endeavors encompass a broad range of speech processing topics, including speech enhancement, speech recognition, speaker recognition and speech synthesis. His team maintains extensive collaborations with industry leaders, including Microsoft, Alibaba, Tencent, Huawei, Xiaomi, Bytedance, and Meituan. In addition to his research achievements, Dr. Xie serves as a Senior Area Editor (SAE) for the IEEE/ACM Transactions on Audio, Speech, and Language Processing. He is also a member of the IEEE Speech and Language Technical Committee (SLTC) and the vice-chair of the ISCA Special Interest Group on Chinese Spoken Language Processing (SIG-CSLP). Furthermore, he actively contributes to the academic community by serving as a chair for numerous conferences and technical committees.


In the era of GPT, speech processing continues to play a pivotal role. With the recent integration of auditory functionalities into ChatGPT, the super assistant can now hear and speak. This natural user interface empowers such a super assistant to operate seamlessly from any location. In this presentation, I will delve into the emerging opportunities and challenges within the field of speech processing. Specifically, I will highlight the significance of speech front-end processing in mitigating interference and enhancing system robustness. Furthermore, I will explore how the advent of GPT shapes the trajectory of speech processing tasks, especially speech recognition and generation, and address the imperative of safeguarding our speech in light of the imminent zero-shot voice cloning capability.