Program Details

Day 1, Sunday, 11 Dec 2022

Kindly note that the tutorials and grand challenges will be conducted online only.

Tutorials

Tutorial 1: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Speech Processing 
Presenters: Yu Zhang, Bo Li, Daniel Park, Google
Time: 9:30-11:30, Sunday, 11 Dec 2022
Download T1 material (Alternative links: Part1, Part2, Part3)

Tutorial 2: TorchAudio Tutorial
Presenters: Xiaohui Zhang, Zhaoheng Ni, Jeff Hwang, Caroline Chen, Meta
Time: 9:30-11:30, Sunday, 11 Dec 2022
Download T2 material

Tutorial 3: Towards Solving Cocktail Party Problem with Artificial Intelligence
Presenter: Dr. Chenglin Xu, Kuaishou Technology
Time: 13:00-15:00, Sunday, 11 Dec 2022
Download T3 material

Tutorial 4: Quantum Machine Learning for Speech Processing: from Theoretical Foundations to Practices
Presenters: Prof. Jun Qi, Fudan University, Shanghai, China; Huck Yang, Ph.D. candidate, Georgia Institute of Technology, Atlanta, GA, USA
Time: 13:00-15:00, Sunday, 11 Dec 2022
Download T4 material

Tutorial 5: Recent Advances on Automatic Dialogue Evaluation
Presenters: Luis Fernando D’Haro, Universidad Politécnica de Madrid; Chen Zhang, National University of Singapore
Time: 15:30-17:30, Sunday, 11 Dec 2022
Download T5 material

Grand Challenges

Challenge 1: Conversational Short-Phrase Speaker Diarization Challenge (CSSD)

Time: 9:30-11:30, Sunday, 11 Dec 2022
Chair: Qingqing Zhang

GC1.1 (#126) The Conversational Short-phrase Speaker Diarization (CSSD) Task: Dataset, Evaluation Metric and Baselines
Gaofeng Cheng, Yifan Chen, Runyan Yang, Qingxuan Li, Zehui Yang, Lingxuan Ye, Pengyuan Zhang, Qingqing Zhang, Lei Xie, Yanmin Qian, Kong Aik Lee and Yonghong Yan

GC1.2 (#129) Spectral Clustering Based EEND-vector Clustering: A Robust System Fine-tuned on Simulated Conversations
Kai Li

GC1.3 (#130) The X-Lance Speaker Diarization System for the Conversational Short-phrase Speaker Diarization Challenge 2022
Tao Liu, Xu Xiang, Zhengyang Chen, Bing Han, Kai Yu and Yanmin Qian

GC1.4 (#132) TSUP Speaker Diarization System for Conversational Short-phrase Speaker Diarization Challenge
Bowen Pang, Huan Zhao, Gaosheng Zhang, Xiaoyue Yang, Yang Sun, Li Zhang, Qing Wang and Lei Xie

Challenge 2: Intelligent Cockpit Speech Recognition Challenge (ICSRC)

Time: 13:00-15:00, Sunday, 11 Dec 2022
Chair: Lei Xie

GC2.1 (#142) The ISCSLP 2022 Intelligent Cockpit Speech Recognition Challenge (ICSRC): Dataset, Tracks, Baseline and Results
Ao Zhang, Fan Yu, Kaixun Huang, Lei Xie, Longbiao Wang, Eng Siong Chng, Hui Bu, Binbin Zhang, Wei Chen and Xin Xu

GC2.2 (#139) The FawAI ASR System for the ISCSLP 2022 Intelligent Cockpit Speech Recognition Challenge
Yujia Sun, Bing Ge, Bo Chen, Zhen Fu, Jinxin He, Hongwei Gao and Xue Wang

GC2.3 (#140) LeVoice ASR Systems for the ISCSLP 2022 Intelligent Cockpit Speech Recognition Challenge
Yan Jia, Mi Hong, Jingyu Hou, Kailong Ren, Sifan Ma, Jin Wang, Yinglin Ji, Fangzhen Peng, Lin Yang and Junjie Wang

GC2.4 (#141) Efficient Conformer-Based CTC Model for Intelligent Cockpit Speech Recognition
Hanzhi Guo, Yunshu Chen, Xukang Xie, Gaopeng Xu and Wei Guo

Challenge 3: Chinese-English Code-Switching Automatic Speech Recognition (CSASR)

Time: 15:30-17:30, Sunday, 11 Dec 2022
Chair: Qingqing Zhang

GC3.1 (#138) Summary on the ISCSLP 2022 Chinese-English Code-switching ASR Challenge
Shuhao Deng, Chengfei Li, Jinfeng Bai, Qingqing Zhang, Wei-Qiang Zhang, Runyan Yang, Gaofeng Cheng, Pengyuan Zhang and Yonghong Yan

GC3.2 (#135) The NPU-ASLP System for The ISCSLP 2022 Magichub Code-Switching ASR Challenge
Yuhao Liang, Peikun Chen, Fan Yu, Xinfa Zhu, Tianyi Xu, Yingying Gao and Lei Xie

GC3.3 (#136) Hybrid CTC Language Identification Structure for Mandarin-English Code-Switching ASR
Hengxin Yin, Guangyu Hu, Fei Wang and Pengfei Ren

Day 2, Monday, 12 Dec 2022

Opening Session

Time: 8:30-9:00, Monday, 12 Dec 2022

Keynote Speech 1

Time: 9:00-10:00, Monday, 12 Dec 2022
Title: Advancing end-to-end automatic speech recognition and beyond
Speaker: Dr Jinyu Li, Partner Applied Science Manager, Microsoft
Chair: Nancy Chen

Oral 1: Speech Recognition I

Session Chairs: Jen-Tzung Chien, Siqi Cai
Time: 10:30-12:30, Monday, 12 Dec 2022

OS1.1 (#99) An Ensemble Teacher-Student Learning Approach with Poisson Sub-sampling to Differential Privacy Preserving Speech Recognition
Chao-Han Huck Yang, Jun Qi, Sabato Marco Siniscalchi and Chin-Hui Lee

OS1.2 (#25) Adaptive Attention Network with Domain Adversarial Training for Multi-Accent Speech Recognition
Yanbing Yang, Hao Shi, Yuqin Lin, Meng Ge, Longbiao Wang, Qingzhi Hou and Jianwu Dang

OS1.3 (#26) Multilingual Zero Resource Speech Recognition Base on Self-Supervise Pre-Trained Acoustic Models
Haoyu Wang, Wei-Qiang Zhang, Hongbin Suo and Yulong Wan

OS1.4 (#47) Towards Language-universal Mandarin-English Speech Recognition with Unsupervised Label Synchronous Adaptation
Song Li, Haoneng Luo, Wenxuan Hu, Yuan Liu, Shiliang Zhang, Lin Li and Qingyang Hong

OS1.5 (#49) Sequence Distribution Matching for Unsupervised Domain Adaptation in ASR
Qingxuan Li, Han Zhu, Liuping Luo, Gaofeng Cheng, Pengyuan Zhang, Jiasong Sun and Yonghong Yan

OS1.6 (#86) Improving Rare Words Recognition through Homophone Extension and Unified Writing for Low-resource Cantonese Speech Recognition
Ho-Lam Chung, Junan Li, Pengfei Liu, Wai Kim LEUNG, Xixin Wu and Helen Meng

Oral 2: Speech Production and Perception I

Session Chairs: Aijun Li, Changhuai You
Time: 10:30-12:30, Monday, 12 Dec 2022

OS2.1 (#54) Perception and Production of Mandarin Vowels by Teenagers – Blind and Sighted
Moyu Chen, Jing Qi, and Xiyu Wu

OS2.2 (#70) The Production of Contrastive Focus by Children Learning Mandarin Chinese
Jing Lu and Ping Tang

OS2.3 (#77) Production Characteristics of Vowels in the Standard Chinese by Preschool Bilingual Teachers
Jiao Lin Pan and Yuan Jia

OS2.4 (#81) Effects of Aspiration on Tone Production and Perception in Standard Chinese
Chong Cao and Aijun Li

OS2.5 (#84) The Disyllabic Tone Production and Tone Context Effect in Mandarin-speaking Children with Cochlear Implants
Jingwen Cheng, Yingming Gao, Yuchen Yan, Xiaoli Feng, Binghuai Lin, and Jinsong Zhang

OS2.6 (#03) A Preliminary Ultrasonic Investigation of Tenseness in Northern Yi
Shuwen Chen

Oral 3: Speech Synthesis

Session Chair: Yuan-Fu Liao
Time: 14:00-16:00, Monday, 12 Dec 2022

OS3.1 (#02) Style-Label-Free: Cross-Speaker Style Transfer by Quantized VAE and Speaker-wise Normalization in Speech Synthesis
Chunyu Qiang, Peng Yang, Hao Che, Xiaorui Wang and Zhongyuan Wang

OS3.2 (#37) Multi-speaker Multi-style Text-to-speech Synthesis with Single-speaker Single-style Training Data Scenarios
Qicong Xie, Tao Li, Xinsheng Wang, Zhichao Wang, Lei Xie, Guoqiao Yu and Guanglu Wan

OS3.3 (#51) Robust MelGAN: A robust universal neural vocoder for high-fidelity TTS
Kun Song, Jian Cong, Xinsheng Wang, Yongmao Zhang, Lei Xie, Ning Jiang and Haiying Wu

OS3.4 (#52) AccentSpeech: Learning Accent from Crowd-sourced Data for Target Speaker TTS with Accents
Yongmao Zhang, Zhichao Wang, Peiji Yang, Hongshen Sun, Zhisheng Wang and Lei Xie

OS3.5 (#28) CorrectSpeech: A Fully Automated System for Speech Correction and Accent Reduction
Daxin Tan, Liqun Deng, Nianzu Zheng, Yu Ting Yeung, Xin Jiang, Xiao Chen and Tan Lee

OS3.6 (#62) HILvoice: Human-in-the-Loop Style Selection for Elder-Facing Speech Synthesis
Xueyuan Chen, Qiaochu Huang, Xixin Wu, Zhiyong Wu and Helen Meng

Special Session 1: Data Augmentation in Speech Technologies

Session Chair: Rohan Kumar Das
Time: 14:00-16:00, Monday, 12 Dec 2022

SS1.1 (#103) Dynamic Thresholding on FixMatch with Weak and Strong Data Augmentations for Sound Event Detection
Tanmay Khandelwal and Rohan Kumar Das

SS1.2 (#118) Data Augmentation for Infant Cry Classification
Aastha Kachhi, Shreya Chaturvedi, Hemant A. Patil and Dipesh Kumar Singh

SS1.3 (#122) Low Pass Filtering and Bandwidth Extension for Robust Anti-spoofing Countermeasure Against Codec Variabilities
Yikang Wang, Xingming Wang, Hiromitsu Nishizaki and Ming Li

SS1.4 (#08) Improving Speech Recognition with Augmented Synthesized Data and Conditional Model Training
Shaofei Xue, Jian Tang and Yazhu Liu

SS1.5 (#85) Speaking Style Compensation on Synthetic Audio for Robust Keyword Spotting
Houjun Huang and Yanmin Qian

SS1.6 (#45) A Study on Joint Modeling and Data Augmentation of Multi-Modalities for Audio-Visual Scene Classification
Qing Wang, Jun Du, Siyuan Zheng, Yunqing Li, Yajian Wang, Yuzhong Wu, Hu Hu, Chao-Han Huck Yang, Sabato Marco Siniscalchi, Yannan Wang and Chin-Hui Lee

Oral 4: Voice Conversion & Spoofing Speech Detection

Session Chair: Junichi Yamagishi
Time: 16:30-18:45, Monday, 12 Dec 2022

OS4.1 (#38) End-to-End Voice Conversion with Information Perturbation
Qicong Xie, Shan Yang, Yi Lei, Lei Xie and Dan Su

OS4.2 (#42) Mix-Guided VC: Any-to-many Voice Conversion by Combining ASR and TTS Bottleneck Features
Zeqing Zhao, Sifan Ma, Yan Jia, Jingyu Hou, Lin Yang and Junjie Wang

OS4.3 (#76) A New Spoken Language Teaching Tech: Combining Multi-attention and AdaIN for One Shot Cross Language Voice Conversion
Dengfeng Ke, Wenhan Yao, Ruixin Hu, Qi Luo, Liangjie Huang, Qi Luo and Wentao Shu

OS4.4 (#116) The Impact of Room Acoustics on Replay Speech Signal
Madhu R. Kamble and Hemant A. Patil

OS4.5 (#124) Effect of Speaker-Microphone Proximity on Pop Noise: Continuous Wavelet Transform-Based Approach
Priyanka Gupta and Hemant A. Patil

OS4.6 (#127) Synthetic Voice Detection and Audio Splicing Detection using SE-Res2Net-Conformer Architecture
Lei Wang, Benedict Yeoh and Jun Wah Ng

OS4.7 (#144) Audio Splicing Localization: Can We Accurately Locate the Splicing Tampering?
Zhiping Zeng and Zhizheng Wu

Oral 5: Speech Enhancement and Separation

Session Chairs: Fei Chen, Xiaohai Tian
Time: 16:30-18:45, Monday, 12 Dec 2022

OS5.1 (#71) Masking-based Neural Beamformer for Multichannel Speech Enhancement
Shuai Nie, Shan Liang, Zhanlei Yang, Longshuai Xiao, Wenju Liu and Jianhua Tao

OS5.2 (#36) Deep Multi-task Cascaded Acoustic Echo Cancellation and Noise Suppression
Junjie Li, Meng Ge, Longbiao Wang and Jianwu Dang

OS5.3 (#30) Boosting the Performance of SpEx+ by Attention and Contextual Mechanism
Chenyi Li, Zhiyong Wu, Wei Rao, Yannan Wang and Helen Meng

OS5.4 (#16) Assessing the Effect of Temporal Misalignment between the Probe and Processed Speech Signals on Objective Speech Quality Evaluation
Shangdi Liao and Fei Chen

OS5.5 (#06) Speech-enhanced and Noise-aware Networks for Robust Speech Recognition
Hung-Shin Lee, Pin-Yuan Chen, Yao-Fei Cheng, Yu Tsao and Hsin-Min Wang

OS5.6 (#57) Separate-to-Recognize: Joint Multi-target Speech Separation and Speech Recognition for Speaker-attributed ASR
Yuxiao Lin, Zhihao Du, ShiLiang Zhang, Fan Yu, Zhou Zhao and Fei Wu

OS5.7 (#133) Speech Enhancement Based on CycleGAN with Noise-informed Training
Wen-Yuan Ting, Syu-Siang Wang, Hsin-Li Chang, Borching Su and Yu Tsao

Day 3, Tuesday, 13 Dec 2022

Keynote Speech 2

Time: 8:30-9:30, Tuesday, 13 Dec 2022
Title: Recent progress in code-switch Singapore English+Mandarin large vocabulary continuous speech recognition
Speaker: Prof Eng Siong Chng, Associate Professor, Nanyang Technological University
Chair: Rong Tong

Oral 6: Speech Recognition II

Session Chair: Huck Yang
Time: 10:00-12:00, Tuesday, 13 Dec 2022

OS6.1 (#10) Incorporating VAD into ASR System by Multi-task Learning
Meng Li, Yan Xia and Feng Lin

OS6.2 (#20) Improving ASR in Reverberant Environments
Yen-Lun Liao, Chi-Han Lin, Ren-Yuan Lyu and Jyh-Shing Roger Jang

OS6.3 (#23) 3M: Multi-loss, Multi-path and Multi-level Neural Networks for speech recognition
Zhao You, Shulin Feng, Dan Su and Dong Yu

OS6.4 (#32) Multi-Level Modeling Units for End-to-End Mandarin Speech Recognition
Yuting Yang, Binbin Du and Yuke Li

OS6.5 (#72) Exploiting Single-Channel Speech for Multi-Channel End-to-End Speech Recognition: A Comparative Study
Keyu An, Ji Xiao and Zhijian Ou

OS6.6 (#73) Ensemble and Re-ranking based on Language Models to Improve ASR
Shu-Fen Tsai, Shih-Chan Kuo, Ren-Yuan Lyu and Jyh-Shing Roger Jang

Oral 7: Speech Production and Perception II

Session Chairs: Shuwen Chen, Yanfeng Lu
Time: 10:00-12:00, Tuesday, 13 Dec 2022

OS7.1 (#19) Acoustic and Perceptual Study of Tones in Jin Chinese (Togtoh Variety)
Yue Wang and Wen Liu

OS7.2 (#53) Acoustic-perceptual correlates of whispered Mandarin consonants
Min Xu, Jing Shao, Hongwei Ding and Lan Wang

OS7.3 (#55) Bilingual Advantage? Perception of the Japanese Consonant Length Contrast by Monolingual vs Bilingual Speakers of Mongolian
Kimiko Tsukada, Yurong Yurong and Badmaavanchin Munguntsetseg

OS7.4 (#66) Multichannel Emotional Perception in Chinese Female: Faces, Voices and Bodies
Ruiqi Ge and Xiyu Wu

OS7.5 (#128) Coda Nasal Perception in Wenzhou Wu and Rugao Mandarin by Native Speakers of Standard Mandarin
Yanyang Chen, Xinya Zhang, Ying Chen and Jiazheng Wang

OS7.6 (#22) Objective Hand Complexity Comparison between Two Mandarin Chinese Cued Speech Systems
Li Liu, Gang Feng, Xiaoxi Ren and Xianping Ma

Oral 8: Speech Synthesis & Speaker Embedding

Session Chair: Xiaoxiao Miao
Time: 13:00-15:00, Tuesday, 13 Dec 2022

OS8.1 (#34) Rhythm-controllable Attention with High Robustness for Long Sentence Speech Synthesis
Dengfeng Ke, Yayue Deng, Yukang Jia, Jinlong Xue, Qi Luo, Ya Li, Jianqing Sun, Jiaen Liang and Binghuai Lin

OS8.2 (#74) AdaptiveFormer: A Few-shot Speaker Adaptative Speech Synthesis Model based on FastSpeech2
Dengfeng Ke, Ruixin Hu, Qi Luo, Liangjie Huang, WenHan Yao, Wentao Shu, Jinsong Zhang and Yanlu Xie

OS8.3 (#27) ECAPA-TDNN for Multi-speaker Text-to-speech Synthesis
Jinlong Xue, Yayue Deng, Yichen Han, Ya Li, Jianqing Sun and Jiaen Liang

OS8.4 (#78) Low-Resource Speech Synthesis with Speaker-Aware Embedding
Li-Jen Yang, I-Ping Yeh and Jen-Tzung Chien

OS8.5 (#44) A Phone-Level Speaker Embedding Extraction Framework with Multi-Gate Mixture-of-Experts Based Multi-Task Learning
Zhijunyi Yang, Mengjie Du, Rongfeng Su, Xiaokang Liu, Nan Yan and Lan Wang

OS8.6 (#120) Shuffle is What You Need
Wan Lin, Lantian Li and Dong Wang

Special Session 2: Deep Noise Reduction

Session Chairs: Xueliang Zhang, Lei Wang
Time: 13:00-15:00, Tuesday, 13 Dec 2022

SS2.1 (#104) On the Use of Absolute Threshold of Hearing-based Loss for Full-band Speech Enhancement
Rohith Mars and Rohan Kumar Das

SS2.2 (#106) RAT: RNN-Attention Transformer for Speech Enhancement
Tailong Zhang, Shulin He, Hao Li and Xueliang Zhang

SS2.3 (#109) A Speech-Noise-Equilibrium Loss Function for Deep Learning-Based Speech Enhancement
Weitong Zhao, Fushi Xie, Kang Ouyang and Nengheng Zheng

SS2.4 (#101) Speakerfilter-Pro: an Improved Target Speaker Extractor Combines the Time Domain and Frequency Domain
Shulin He, Hao Li and Xueliang Zhang

SS2.5 (#105) Two-Branch Network with Selective Kernel Convolution for Time-Domain Speech Enhancement
Hui Li, Zhihua Huang and Chuangjian Guo

SS2.6 (#11) Optimizing Shoulder to Shoulder: A Coordinated Sub-Band Fusion Model for Full-Band Speech Enhancement
Guochen Yu, Andong Li, Wenzhe Liu, Chengshi Zheng, Yutian Wang and Hui Wang

Day 4, Wednesday, 14 Dec 2022

Keynote Speech 3

Time: 8:30-9:30, Wednesday, 14 Dec 2022
Title: Automated Assessment and Feedback: the Role of Spoken Grammatical Error Correction
Speaker: Kate Knill, Principal Research Associate, University of Cambridge
Chair: Junichi Yamagishi

Oral 9: Multimodality

Session Chairs: Ming-Hsiang Su, Ya Li
Time: 10:00-12:00, Wednesday, 14 Dec 2022

OS9.1 (#46) Deep Learning Based Audio-Visual Multi-Speaker DOA Estimation Using Permutation-Free Loss Function
Qing Wang, Hang Chen, Ya Jiang, Zhe Wang, Yuyang Wang, Jun Du and Chin-Hui Lee

OS9.2 (#108) Multi-Task Joint Learning for Embedding Aware Audio-Visual Speech Enhancement
Chenxi Wang, Hang Chen, Jun Du, Baocai Yin and Jia Pan

OS9.3 (#80) Multimodal Automatic Speech Fluency Evaluation Method for Putonghua Proficiency Test Propositional Speaking Section
Jiajun Liu, Huazhen Meng, Yunfei Shen, Linna Zheng and Aishan Wumaier

OS9.4 (#114) Cantonese Neural Speech Synthesis from Found Newscasting Video Data and its Speaker Adaptation
Raymond Chung

OS9.5 (#107) A Preliminary Study on Taiwanese OCR for Assisting Textual Database Construction from Historical Documents
Yuan-Fu Liao, Yu-Hsuan Huang, Matus Pleva, Daniel Hládek and Ming-Hsiang Su

OS9.6 (#18) Reconstruction of Speech Spectrogram based on Non-invasive EEG Signal
Di Zhou, Masashi Unoki, Gaoyan Zhang and Jianwu Dang

Oral 10: Speech Prosody

Session Chairs: Bin Li, Huayun Zhang
Time: 10:00-12:00, Wednesday, 14 Dec 2022

OS10.1 (#07) J-TranPSP: A Joint Transition-based Model for Prosodic Structure Prediction, Word Segmentation and PoS Tagging
Binbin Shen, Jian Luan, Shengyan Zhang, Quanbo Shen and Yujun Wang

OS10.2 (#12) A Mandarin Prosodic Boundary Prediction Model Based on Multi-Source Semi-Supervision
Peiyang Shi, Zengqiang Shang and Pengyuan Zhang

OS10.3 (#59) English lexical stresses in non-native speech under adverse conditions
Mosi He, Ting Zhang, Bin Li and Kin Cheung

OS10.4 (#35) Stress Gravity of Neutral Tone Words in Different Information Structures
Jingwen Huang and Aijun Li

OS10.5 (#67) Prosodic Encoding of Mandarin Chinese Intonation by Uygur Speakers in Declarative and Interrogative Sentences
Tong Li, Hui Feng and Yuan Jia

OS10.6 (#102) In-group Advantage for Chinese and English Emotional Prosody in Quiet and Noise Conditions
Yuhan Yan, Shanpeng Li and Ying Chen

Oral 11: Lightweight Model & Knowledge Distillation

Session Chairs: Yanmin Qian, Yi Zhou
Time: 13:00-15:00, Wednesday, 14 Dec 2022

OS11.1 (#48) Multi-Resolution Stacked 1D-CNN for Small-Footprint Keyword Spotting with Two-Stage Detection
Jian Tang and Shaofei Xue

OS11.2 (#65) Lightweight End-to-End Deep Learning Model for Music Source Separation
Yao-Ting Wang, Yi-Xing Lin, Kai-Wen Liang, Tzu-Chiang Tai and Jia-Ching Wang

OS11.3 (#97) AdaVITS: Tiny VITS for Low Computing Resource Speaker Adaptation
Kun Song, Heyang Xue, Xinsheng Wang, Jian Cong, Yongmao Zhang, Lei Xie, Bing Yang, Xiong Zhang and Dan Su

OS11.4 (#43) Label-free Knowledge Distillation with Contrastive Loss for Light-weight Speaker Recognition
Zhiyuan Peng, Xuanji He, Ke Ding, Tan Lee and Guanglu Wan

OS11.5 (#121) Improving Speech Separation with Knowledge Distilled from Self-supervised Pre-trained Models
Bowen Qu, Chenda Li, Jinfeng Bai and Yanmin Qian

OS11.6 (#111) Text-Informed Knowledge Distillation for Robust Speech Enhancement and Recognition
Wei Wang, Wangyou Zhang, Shaoxiong Lin and Yanmin Qian

Oral 12: Speech Technology for Health

Session Chairs: Nan Yan, Jeremy Wong
Time: 13:00-15:00, Wednesday, 14 Dec 2022

OS12.1 (#94) Prediction of Depression Severity Based on Transformer Encoder and CNN Model
Jiahao Lu, Bin Liu, Zheng Lian, Cong Cai, Jianhua Tao and Ziping Zhao

OS12.2 (#05) Depressive Tendency Recognition by Fusing Speech and Text Features: A Comparative Analysis
Xiaoyong Lu, Yimin He, Jingyi Yuan, Tao Pan and Yafan Wang

OS12.3 (#17) Medical Difficult Airway Detection using Speech Technology
Zhikai Zhou, Shuang Cao, Zhengyang Chen, Bei Liu, Ming Xia, Hong Jiang and Yanmin Qian

OS12.4 (#88) CUEMPATHY: A Counseling Speech Dataset for Psychotherapy Research
Dehua Tao, Harold Chui, Sarah Luk and Tan Lee

OS12.5 (#24) Aphasia Detection for Cantonese-Speaking and Mandarin-Speaking Patients Using Pre-Trained Language Models
Ying Qin, Tan Lee, Anthony Pak Hin Kong and Feng Lin

OS12.6 (#39) Respiratory and laryngeal influences on voice in post-stroke dysarthria: a pilot study
Tinghao Zhao, Xiaoxia Du, Juan Liu, Rongfeng Su, Nan Yan and Lan Wang

Oral 13: Listening Comprehension of Machines and Humans

Session Chairs: Wei-Qiang Zhang, Yanfeng Lu
Time: 15:30-17:30, Wednesday, 14 Dec 2022

OS13.1 (#110) End-to-end speech topic classification based on pre-trained model Wavlm
Tengfei Cao, Liang He and Fangjing Niu

OS13.2 (#79) BERT-based Chinese Medicine Named Entity Recognition Model Applied to Medication Reminder Dialogue System
Tsung-Hsien Yang, Matus Pleva, Daniel Hládek and Ming-Hsiang Su

OS13.3 (#29) Dialogue scenario classification based on social factors
Yuning Liu, Di Zhou, Masashi Unoki, Jianwu Dang and Aijun Li

OS13.4 (#112) BERT-LID: Leveraging BERT to Improve Spoken Language Identification
Yuting Nie, Junhong Zhao, Wei-Qiang Zhang and Jinfeng Bai

OS13.5 (#123) An Exploratory Study for Quantifying the Contextual Information for Successful Chinese L2 Speech Comprehension
Rian Bao, Linkai Peng, Yuchen Yan and Jinsong Zhang

OS13.6 (#92) The Contribution of Phonological and Fluency Factors to Chinese L2 Comprehensibility Ratings: A Case Study of Urdu-speaking Learners
Rian Bao, Linkai Peng, Yingming Gao and Jinsong Zhang

Oral 14: Acoustic Phonetics & Prosody

Session Chairs: Wen Liu, Bo Li
Time: 15:30-17:30, Wednesday, 14 Dec 2022

OS14.1 (#21) An Acoustic Study on Fricative Vowel [iʑ] in Zhongwei Chinese
Xinyi Zhang and Wen Liu

OS14.2 (#69) Acoustic Features of Consonants of Standard Chinese and English by Uyghur Native Speakers
Yuan Jia and Xintong Zuo

OS14.3 (#33) A Study on Mandarin Chinese “Bu” Tone Sandhi Followed by English Words
Kaige Gao and Xiyu Wu

OS14.4 (#68) An Entropy-based Study on the Acquisition of Mandarin Initial Consonants by Korean Learners
Xiaoli Feng, Yingming Gao, Jinsong Zhang and Yanchun Cao

OS14.5 (#58) Impacts of Aging on Suprasegmental and Segmental Encoding of Vocally-Expressed Confidence in Wuxi Dialect
Yujie Ji, Qiqi Sun, Zhikang Peng and Xiaoming Jiang

OS14.6 (#31) Acceptance of Tonal and Segmental Variability Correlates to Inventory Size in Mandarin Chinese
Julie Siying Chen and Stephen Politzer-Ahles

SIG-CSLP Assembly

Time: 17:30-18:00, Wednesday, 14 Dec 2022

Closing Session

Time: 18:00-18:30, Wednesday, 14 Dec 2022