Bank Cheque Dataset

Project Overview:

Objective

As a leading data collection and annotation company, we embarked on a project to create a comprehensive dataset for bank cheque processing. Our aim was to support advancements in Optical Character Recognition (OCR), fraud detection, and the automation of cheque clearance processes.

Scope

We gathered a wide variety of bank cheque images covering different handwriting styles, bank designs, and physical conditions such as wear and tear. Our expert team annotated the key cheque components, including payee names, amounts, signatures, and MICR codes, so that the annotations can be used directly to train machine learning models.


Sources

  • Personal cheques
  • Business cheques
  • Cheques from various banks and financial institutions
  • Photocopies of cheques (to simulate fraud scenarios)
  • Old and worn-out cheques

Data Collection Metrics

  • Total Cheques Collected: 250,000 (an increase from the initial 200,000)
  • Personal Cheques: 150,000
  • Business Cheques: 70,000
  • Photocopied Cheques: 15,000

Annotation Process

Stages

  1. Cheque Pre-processing: We standardized images for brightness and alignment, ensuring consistency.
  2. Component Annotation: Our team marked the key elements of each cheque so that the data fed into downstream systems is precise.
  3. Text Transcription: We manually transcribed the written elements to provide accurate ground truth for OCR training, then reviewed and double-checked each transcription for fidelity to the original.
  4. Forgery Tagging: We tagged photocopied and potentially fraudulent cheques so that the dataset can support fraud detection.
  5. Validation: We first ran existing OCR tools for preliminary verification, then conducted thorough manual reviews.
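
As an illustration only, the validation stage could be sketched in Python as follows. The record schema (`FieldAnnotation`), the use of `difflib` string similarity, and the 0.9 threshold are assumptions made for this sketch, not the tooling actually used in the project.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class FieldAnnotation:
    """One annotated cheque field (hypothetical schema, not from the project)."""
    cheque_id: str
    field: str          # e.g. "payee_name" or "amount_in_words"
    transcription: str  # text entered by the human annotator


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio, ignoring case and surrounding whitespace."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def needs_manual_review(annotation: FieldAnnotation, ocr_text: str,
                        threshold: float = 0.9) -> bool:
    """Flag a field when OCR output and manual transcription disagree too much.

    The threshold is an assumed value chosen for illustration.
    """
    return similarity(annotation.transcription, ocr_text) < threshold


# Example: OCR agrees with the annotator in one case and diverges in the other.
payee = FieldAnnotation("chq-0001", "payee_name", "John Smith")
agrees = needs_manual_review(payee, "john smith")    # False: texts match
diverges = needs_manual_review(payee, "Acme Corp")   # True: texts differ
```

Fields flagged this way would then go to the manual review pass described in stage 5.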

Annotation Metrics

  • Total Annotations: 1,250,000
  • Payee Names: 250,000
  • Amounts in Words: 250,000
  • Amounts in Figures: 250,000
  • Date Annotations: 250,000
  • Signature Annotations: 175,000
  • MICR Annotations: 75,000
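
The reported total can be checked against the per-field counts with a few lines of Python. This is a minimal sketch that uses only the figures listed above; the coverage ratio (annotations per collected cheque) is our own derived quantity, assuming the 250,000 cheques reported under Data Collection Metrics as the base.

```python
# Per-field annotation counts as reported in the Annotation Metrics list.
annotation_counts = {
    "Payee Names": 250_000,
    "Amounts in Words": 250_000,
    "Amounts in Figures": 250_000,
    "Date Annotations": 250_000,
    "Signature Annotations": 175_000,
    "MICR Annotations": 75_000,
}

reported_total = 1_250_000
total_cheques = 250_000  # from Data Collection Metrics

# The per-field counts should add up to the reported total.
computed_total = sum(annotation_counts.values())

# Share of collected cheques that carry each annotation type.
coverage = {name: count / total_cheques for name, count in annotation_counts.items()}

for name, share in coverage.items():
    print(f"{name}: {share:.0%} of cheques")
```

The per-field counts sum to the reported 1,250,000, and the ratios show that signature and MICR annotations cover 70% and 30% of the collected cheques respectively.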

Quality Assurance

Stages

  • Existing OCR Verification: We compared our transcriptions against the output of existing OCR solutions to check accuracy.
  • Banking Expert Review: Banking professionals were consulted to confirm that the content is relevant and precise in real-world contexts.
  • Inter-annotator Agreement: Multiple annotators reviewed the same images to ensure consistent and reliable annotations.
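
Inter-annotator agreement is often quantified with a chance-corrected statistic. The sketch below computes Cohen's kappa for two annotators in plain Python. The choice of kappa and the "genuine"/"photocopy" labels are assumptions for illustration; the specific agreement measure used in this project is not stated here.

```python
from collections import Counter
from typing import Sequence


def cohens_kappa(labels_a: Sequence[str], labels_b: Sequence[str]) -> float:
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("both annotators must label the same non-empty set of items")

    n = len(labels_a)
    # Observed agreement: fraction of items where both annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected agreement by chance, from each annotator's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical forgery tags from two annotators on the same four cheques.
annotator_1 = ["genuine", "genuine", "photocopy", "photocopy"]
annotator_2 = ["genuine", "genuine", "photocopy", "genuine"]
kappa = cohens_kappa(annotator_1, annotator_2)  # 0.5
```

A kappa of 1.0 means perfect agreement and 0 means agreement no better than chance; images with low agreement are natural candidates for another review round.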

QA Metrics

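  • Annotations Reviewed by Banking Experts: 50,000 (4% of the 1,250,000 total annotations, or 20% relative to the 250,000 cheques collected)
  • Inconsistencies Identified and Rectified: 25,000 (2% of the 1,250,000 total annotations, or 10% relative to the 250,000 cheques collected)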

Conclusion

Our expertise in data collection and annotation has led to the creation of the Bank Cheque Dataset, which is designed to support banking automation, in particular cheque clearance and fraud detection. With a wide variety of cheque samples and detailed annotations, the dataset is a practical resource for fintech development.

  • Quality Data Creation
  • Guaranteed TAT
  • ISO 9001:2015, ISO/IEC 27001:2013 Certified
  • HIPAA Compliance
  • GDPR Compliance
  • Compliance and Security

Let's Discuss Your Data Collection Requirements

To get a detailed estimate for your requirements, please contact us.
