Phone Conversations in Irish English: Enhance Data Annotation

Phone Conversations in Irish English

Project Overview

Objective

Our project aimed to develop a comprehensive dataset of Irish English phone conversations. This dataset is vital for training and enhancing speech recognition models, making them more adaptable and accurate in understanding the nuances of Irish English dialects.

Scope

We undertook the extensive task of collecting and annotating a wide range of phone conversations in Irish English. This effort was directed towards creating a dataset that captures the diverse accents, colloquialisms, and speech patterns unique to Irish English speakers.

Sources

  • The collection involved gathering audio recordings of phone conversations across various contexts, including customer service calls, business communications, casual conversations, and emergency services.
  • There was a focus on ensuring a diverse representation of speakers in terms of age, gender, socioeconomic background, and regional accents within Ireland.
  • The result was a comprehensive set of audio recordings that gives a rich and varied representation of phone conversations, covering diverse contexts and speaker characteristics in Ireland.

Data Collection Metrics

  • Total Conversations Recorded: 120,000
  • From Urban Areas: 70,000
  • From Rural Regions: 50,000

Annotation Process

Stages

  1. Transcription Accuracy: Each conversation was meticulously transcribed to ensure a high level of accuracy, capturing every nuance of the spoken word.
  2. Dialect Identification: Conversations were classified based on regional dialects, enriching the dataset with linguistic diversity.
  3. Contextual Tagging: We tagged conversations with contextual data such as the topic, emotional tone, and speaking pace.
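The three stages above can be pictured as filling in one record per conversation. The following is a minimal sketch only: the field names, the `annotate` helper, and the words-per-minute measure of speaking pace are illustrative assumptions, not the project's actual schema or tooling.

```python
from dataclasses import dataclass


@dataclass
class ConversationAnnotation:
    """One annotated conversation (hypothetical schema, for illustration)."""
    conversation_id: str
    transcript: str           # stage 1: transcription
    dialect: str              # stage 2: dialect identification
    topic: str                # stage 3: contextual tags
    emotional_tone: str
    speaking_pace_wpm: float


def words_per_minute(transcript: str, duration_seconds: float) -> float:
    """Estimate speaking pace from the transcript and the call duration."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return len(transcript.split()) * 60 / duration_seconds


def annotate(conversation_id: str, transcript: str, duration_seconds: float,
             dialect: str, topic: str, emotional_tone: str) -> ConversationAnnotation:
    """Combine the outputs of the three stages into a single record."""
    return ConversationAnnotation(
        conversation_id=conversation_id,
        transcript=transcript,
        dialect=dialect,
        topic=topic,
        emotional_tone=emotional_tone,
        speaking_pace_wpm=words_per_minute(transcript, duration_seconds),
    )


# Example values are made up; the dialect label is a placeholder.
record = annotate("call-0001", "how are you keeping today", 3.0,
                  dialect="dialect-A", topic="casual", emotional_tone="neutral")
```

Keeping all tags on one record per conversation makes it straightforward to filter the dataset by dialect or context when assembling training sets.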

Annotation Metrics

  • Conversations Annotated: 120,000
  • Dialect Categories Identified: 5 major dialects
  • Contextual Tags Applied: 120,000

Quality Assurance

  • Continuous Evaluation: We regularly assessed the dataset’s effectiveness in training speech recognition models.
  • Privacy Compliance: All conversations were anonymized and handled in compliance with privacy standards.
  • Feedback Integration: Feedback from linguists and speech recognition experts was incorporated to refine the dataset.

QA Metrics

  • Accuracy in Speech Recognition Models: 95%
  • Annotation Consistency Rate: 99%
  • User Satisfaction Rate: 96%
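The page does not say how the annotation consistency rate was calculated. One common way to obtain such a figure is simple percent agreement between two annotators who label the same conversations; the sketch below assumes that definition, and the function name and sample labels are hypothetical.

```python
from typing import Sequence


def consistency_rate(labels_a: Sequence[str], labels_b: Sequence[str]) -> float:
    """Percent of items on which two annotators assigned the same label.

    Assumed definition (simple percent agreement); the source does not
    state which measure was actually used.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both annotators must label the same items")
    if not labels_a:
        raise ValueError("at least one labelled item is required")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return 100.0 * matches / len(labels_a)


# Made-up dialect labels from two annotators for four conversations.
annotator_1 = ["A", "B", "A", "C"]
annotator_2 = ["A", "B", "B", "C"]
rate = consistency_rate(annotator_1, annotator_2)  # 3 of 4 match
```

Percent agreement is easy to read but does not correct for agreement that would occur by chance, so a chance-corrected statistic may be preferable when a few labels dominate.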

Conclusion

Our Irish English phone conversation dataset has significantly improved the performance of speech recognition models. The models trained with our dataset demonstrate enhanced understanding and accuracy in deciphering Irish English accents and dialects. This advancement is crucial for businesses and services aiming to provide better voice recognition solutions to customers in Ireland.
