Challenge Rules
- Papers or submissions that do not address one of the two sub-tasks will not be included in this challenge.
- Participants may use audio data beyond what is released as part of this challenge, as long as that data is publicly available. Participants are responsible for ensuring that the blind test utterances do not appear in any training data. (Please note that if we find evidence of this when validating the leaderboard, the submission in question will be discarded.) Use of pre-trained acoustic models available in the public domain is also allowed. Participants are free to use the data provided for both sub-tasks to build models for either sub-task.
- Participants need not limit themselves to the transcription text in the released datasets when building a language model or lexicon. Use of pre-trained language models and data-driven G2P is also allowed. However, participants need to provide a comprehensive description of the resources used for this purpose, and they will be required to make those resources publicly available so that the results can be reproduced.
- Registered participants need to follow the license for each dataset for its further use and distribution.
- Word Error Rate (WER) averaged across the languages in the blind test set will be used as the evaluation metric for sub-task 1. For sub-task 2, average WER will be used to evaluate the Hindi-English and Bengali-English code-switching ASR systems.
- Submitted systems are expected to beat the baseline system in terms of WER; however, innovative systems that come close to the baseline may be considered.
- Only the audio of the blind test sets will be released for both sub-tasks. Each team is expected to run its systems on the blind test sets and submit the ASR hypotheses to us for evaluation. A maximum of five submissions is allowed. Teams will need to share their final ASR model, along with the paper and the resources needed to reproduce the hypotheses on the blind test sets.
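
For teams that want to sanity-check their hypotheses before submitting, the averaged WER metric named in the rules can be sketched as follows. This is only an illustration and not the organizers' scoring script: it assumes whitespace tokenization, no text normalization, a per-language WER computed as total word errors divided by total reference words, and an unweighted mean across languages. The function names and the toy language keys are hypothetical.

```python
from typing import Dict, List, Tuple


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Word-level Levenshtein distance (substitutions + deletions + insertions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def corpus_wer(pairs: List[Tuple[str, str]]) -> float:
    """WER over (reference, hypothesis) pairs: total errors / total reference words."""
    errors = 0
    ref_words = 0
    for ref, hyp in pairs:
        ref_tokens = ref.split()  # assumption: whitespace tokenization
        errors += edit_distance(ref_tokens, hyp.split())
        ref_words += len(ref_tokens)
    if ref_words == 0:
        raise ValueError("reference transcripts contain no words")
    return errors / ref_words


def average_wer(by_language: Dict[str, List[Tuple[str, str]]]) -> float:
    """Unweighted mean of per-language WERs (assumed averaging scheme)."""
    if not by_language:
        raise ValueError("no languages provided")
    wers = [corpus_wer(pairs) for pairs in by_language.values()]
    return sum(wers) / len(wers)


# Toy data with hypothetical language keys; real transcripts would come
# from the team's own held-out set, since blind test references are not released.
toy = {
    "lang_a": [("a b c d", "a x c d")],  # 1 substitution over 4 words
    "lang_b": [("a b", "a b")],          # exact match
}
print(average_wer(toy))  # 0.125
```

Because the exact normalization and averaging used for the leaderboard are not stated here, numbers from a sketch like this may differ from the official scores.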