The project covers the collection and annotation of audio recordings in both Chinese and English. The recordings come from speakers across diverse demographic groups, so that the data captures variation in dialect, accent, and other linguistic nuances. The dataset combines naturally occurring speech with scripted recordings, making it a resource for speech and language processing work that needs to handle this kind of variation.