2011 NIST Language Recognition Evaluation Test Set

URL: https://dss2.princeton.edu/data/128/
Description: Contains selected training data and the evaluation test set for the 2011 NIST Language Recognition Evaluation. It consists of approximately 204 hours of conversational telephone speech and broadcast audio collected by the Linguistic Data Consortium (LDC) in the following 24 languages and dialects: Arabic (Iraqi), Arabic (Levantine), Arabic (Maghrebi), Arabic (Standard), Bengali, Czech, Dari, English (American), English (Indian), Farsi, Hindi, Lao, Mandarin, Punjabi, Pashto, Polish, Russian, Slovak, Spanish, Tamil, Thai, Turkish, Ukrainian and Urdu.
Format: Single study
Title: 2011 NIST Language Recognition Evaluation Test Set
