Bank Cheque Dataset: Empowering Financial Technology

Project Overview:

Objective

As a leading data collection and annotation company, we set out to build a comprehensive dataset for bank cheque processing. Our aim was to support advances in Optical Character Recognition (OCR), fraud detection, and automated cheque clearance.

Scope

We collected a diverse range of bank cheque images covering a variety of handwriting styles, bank designs, and physical conditions, including wear and tear. Our expert team annotated the critical components of each cheque, such as payee names, amounts, dates, signatures, and MICR codes, so that the data can be used to train and evaluate machine learning models.

Sources

  • Personal cheques
  • Business cheques
  • Cheques from various banks and financial institutions
  • Photocopies of cheques (to simulate fraud scenarios)
  • Old and worn-out cheques

Data Collection Metrics

  • Total Cheques Collected: 250,000 (an increase from the initial 200,000)
  • Personal Cheques: 150,000
  • Business Cheques: 70,000
  • Photocopied Cheques: 15,000
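
The figures above can be tallied with a few lines of Python. This is only an arithmetic sketch over the numbers quoted in the list; the dictionary keys and the `summarize` helper are names chosen for illustration. The three listed categories add up to less than the stated total, and the source does not say how the remainder is categorised, so the code reports it without labelling it.

```python
# Figures quoted in the Data Collection Metrics list.
TOTAL_CHEQUES = 250_000
BREAKDOWN = {
    "personal": 150_000,
    "business": 70_000,
    "photocopied": 15_000,
}


def summarize(total, breakdown):
    """Return each category's share of the total and the count not itemised."""
    shares = {name: count / total for name, count in breakdown.items()}
    remainder = total - sum(breakdown.values())
    return shares, remainder


shares, remainder = summarize(TOTAL_CHEQUES, BREAKDOWN)
for name, share in shares.items():
    print(f"{name}: {share:.0%}")
print(f"not itemised in the list: {remainder:,}")
```

A check like this is cheap to run whenever the collection counts are updated, and it makes any gap between the total and the per-category figures visible.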

Annotation Process

Stages

  1. Cheque Pre-processing: We standardized images for brightness and alignment, ensuring consistency.
  2. Component Annotation: Our team marked the key cheque elements, such as payee names, amounts, dates, signatures, and MICR codes, so that each field is labelled precisely.
  3. Text Transcription: We manually transcribed the written elements to provide accurate ground truth for OCR training.
  4. Forgery Tagging: We tagged photocopied and potentially fraudulent cheques to enhance fraud detection capabilities.
  5. Validation: We used existing OCR tools for preliminary verification, followed by thorough manual reviews.
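
To make the stages more concrete, here is a minimal sketch of what one annotated record and the OCR cross-check in the validation stage might look like. Everything in it is an assumption for illustration: the `ChequeAnnotation` field names, the `needs_manual_review` helper, the 0.9 threshold, and the sample values are hypothetical and do not describe the dataset's actual schema or tooling. The string comparison uses `difflib.SequenceMatcher` from the Python standard library.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional, Tuple

# Hypothetical bounding box: (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]


@dataclass
class ChequeAnnotation:
    """One annotated cheque; field names are illustrative, not the real schema."""
    image_id: str
    payee_name: str
    amount_words: str
    amount_figures: str
    date: str
    signature_box: Optional[Box] = None  # component annotation
    micr_code: Optional[str] = None
    is_photocopy: bool = False           # forgery tagging
    suspected_fraud: bool = False


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio in [0, 1] between two strings."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def needs_manual_review(transcribed: str, ocr_text: str,
                        threshold: float = 0.9) -> bool:
    """Flag a field when OCR output disagrees with the manual transcription."""
    return similarity(transcribed, ocr_text) < threshold


# Made-up sample record, for demonstration only.
record = ChequeAnnotation(
    image_id="cheque_000001",
    payee_name="Jane Doe",
    amount_words="One thousand two hundred only",
    amount_figures="1,200.00",
    date="2023-01-15",
    is_photocopy=True,
)

print(needs_manual_review(record.payee_name, "jane doe"))  # same text, different case
print(needs_manual_review(record.payee_name, "Jame Dce"))  # OCR misread two letters
```

Keeping the threshold as a parameter lets reviewers trade off the number of fields sent to manual review against the risk of accepting a wrong transcription.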

Annotation Metrics

  • Total Annotations: 1,250,000
  • Payee Names: 250,000
  • Amounts in Words: 250,000
  • Amounts in Figures: 250,000
  • Date Annotations: 250,000
  • Signature Annotations: 175,000
  • MICR Annotations: 75,000
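
The annotation counts can be cross-checked against the stated total in the same way. The sketch below uses only the figures quoted above; the variable names are illustrative, and the coverage ratios assume at most one annotation of each type per cheque, which the source does not state.

```python
# Figures quoted in the Annotation Metrics list.
ANNOTATIONS = {
    "payee_names": 250_000,
    "amounts_in_words": 250_000,
    "amounts_in_figures": 250_000,
    "dates": 250_000,
    "signatures": 175_000,
    "micr": 75_000,
}
STATED_TOTAL = 1_250_000
TOTAL_CHEQUES = 250_000

computed_total = sum(ANNOTATIONS.values())
per_cheque = computed_total / TOTAL_CHEQUES

# Assumption: each cheque carries at most one annotation of a given type,
# so count / cheques can be read as the fraction of cheques with that field.
coverage = {name: count / TOTAL_CHEQUES for name, count in ANNOTATIONS.items()}

print(f"computed total matches stated total: {computed_total == STATED_TOTAL}")
print(f"average annotations per cheque: {per_cheque:.1f}")
for name, frac in coverage.items():
    print(f"{name}: {frac:.0%} of cheques")
```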

Quality Assurance

  • Existing OCR Verification: We cross-checked transcriptions against current OCR solutions.
  • Banking Expert Review: Consultation with banking professionals ensured real-world relevance and accuracy.
  • Inter-annotator Agreement: Multiple annotators reviewed the same images for consistent annotations.
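
Inter-annotator agreement is often summarised with a chance-corrected statistic. The source does not say which measure was used, so the sketch below uses Cohen's kappa for two annotators as one common choice; the function name and the sample labels are made up for illustration.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty set")
    n = len(labels_a)

    # Observed agreement: fraction of items given identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected agreement if each annotator labelled at random
    # according to their own label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Made-up forgery tags from two hypothetical annotators on six cheques.
annotator_1 = ["genuine", "genuine", "photocopy", "genuine", "photocopy", "genuine"]
annotator_2 = ["genuine", "genuine", "photocopy", "photocopy", "photocopy", "genuine"]

kappa = cohens_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.3f}")
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance, so images with disagreeing labels are natural candidates for a further review pass.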

QA Metrics:

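  • Annotations Reviewed by Banking Experts: 50,000 (equal to 20% of the 250,000 cheques collected, or 4% of the 1,250,000 total annotations)
  • Inconsistencies Identified and Rectified: 25,000 (equal to 10% of the 250,000 cheques collected, or 2% of the 1,250,000 total annotations)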

Conclusion

Our proficiency in data collection and annotation has culminated in the Bank Cheque Dataset, which is designed to support banking automation, particularly cheque clearance and fraud detection. With an extensive range of cheque samples and detailed annotations, it is a valuable resource for fintech development.
