Grand Challenge – CSSD

ISCSLP 2022 Conversational Short-phrase Speaker Diarization Challenge (CSSD)

Summary

On July 6, 2022, the ISCSLP 2022 Conversational Short-phrase Speaker Diarization Challenge (CSSD), jointly sponsored by the Institute of Acoustics CAS, Northwestern Polytechnical University, the Institute for InfoComm Research of A*STAR Singapore, Shanghai Jiaotong University and Magic Data (Beijing Aishu Smart Technology Co., Ltd.), officially opened for registration. Groups and individuals from academia and industry are welcome to register for the competition.

Challenge Background

Dialogue is one of the most essential and challenging scenarios for speech processing technology. In daily conversations, people respond to each other casually and keep the conversation going with coherent questions and comments rather than bluntly answering each other’s questions. Accurately detecting the speech activity of each person in a conversation is critical for many downstream tasks such as natural language processing and machine translation. The diarization error rate (DER) has long been the standard evaluation metric for speaker diarization systems. However, it fails to pay enough attention to short conversational phrases, which are brief but play an essential role at the semantic level. The speech community also lacks evaluation metrics that effectively assess the accuracy of short-phrase diarization in conversations.
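To make the limitation concrete, the following minimal sketch computes DER in its usual duration-weighted form (false alarm, missed speech and speaker confusion time divided by total reference speech time) for a made-up conversation. The durations, the function name and the phrase-level count at the end are illustrative assumptions only; in particular, the phrase-level count is not the metric used in this challenge.

```python
def diarization_error_rate(false_alarm, missed, confusion, total_speech):
    """Duration-weighted DER; all arguments are durations in seconds."""
    if total_speech <= 0:
        raise ValueError("total_speech must be positive")
    return (false_alarm + missed + confusion) / total_speech


# Hypothetical conversation: one long 58 s turn plus four 0.5 s short phrases.
long_turn = 58.0
short_phrases = [0.5, 0.5, 0.5, 0.5]
total = long_turn + sum(short_phrases)

# Assume the system labels the long turn correctly but assigns every
# short phrase to the wrong speaker (pure speaker confusion).
confusion = sum(short_phrases)
der = diarization_error_rate(0.0, 0.0, confusion, total)

# Illustrative phrase-level view: fraction of phrases attributed wrongly.
phrase_error = len(short_phrases) / (1 + len(short_phrases))

print(f"DER: {der:.3f}, phrase-level error: {phrase_error:.2f}")
```

In this toy case four of the five phrases are attributed to the wrong speaker, yet DER stays near 3% because the short phrases account for very little of the total speech time.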

To address this problem, we have open-sourced the MagicData-RAMC Chinese conversational speech dataset, which contains 180 hours of manually annotated conversational speech data. For the CSSD evaluation, we have also prepared 20 hours of dialogue data for testing, with manually annotated speaker timestamps, and designed a new evaluation metric that measures the accuracy of sentence-level speaker diarization. By advancing research on segmentation and clustering techniques for dialogue data, we aim to further promote reproducible research in this field.

Timeline

  • 2022-07-04, Open Registration.
  • 2022-07-22 12:00 am, Registration Deadline.
  • 2022-07-24 12:00 am, Open Training Set and Evaluation Metrics.
  • 2022-09-13 12:00 am, Open Evaluation Set.
  • 2022-09-15 12:00 am, Final Submission Deadline.
  • 2022-09-16 12:00 am, Announcement of Results and Rankings.
  • 2022-09-24, Paper Submission Deadline.

Registration and Details

https://magichub.com/competition/sec-competition/