Bank Cheque Dataset

Project Overview

Objective

As a data collection and annotation company, we set out to build a comprehensive dataset for bank cheque processing. The dataset is intended to support work on Optical Character Recognition (OCR), fraud detection, and automated cheque clearance.

Scope

We gathered bank cheque images covering a range of handwriting styles, bank designs, and physical conditions such as wear and tear. Our annotators labelled the key components of each cheque, including payee names, amounts, dates, signatures, and MICR (Magnetic Ink Character Recognition) codes, so that the data can be used directly to train machine learning models.

Sources

  • Personal cheques
  • Business cheques
  • Cheques from various banks and financial institutions
  • Photocopies of cheques (to simulate fraud scenarios)
  • Old and worn-out cheques

Data Collection Metrics

  • Total Cheques Collected: 250,000 (an increase from the initial 200,000)
  • Personal Cheques: 150,000
  • Business Cheques: 70,000
  • Photocopied Cheques: 15,000
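
As a quick sanity check on the figures above, the short sketch below sums the itemized categories and compares them with the stated total. The variable names are ours and purely illustrative. The three listed categories add up to 235,000, so 15,000 cheques (6% of the total) are not itemized in this breakdown.

```python
# Figures taken from the collection metrics above.
total_collected = 250_000
itemized = {
    "personal": 150_000,
    "business": 70_000,
    "photocopied": 15_000,
}

itemized_total = sum(itemized.values())
not_itemized = total_collected - itemized_total

# Percentage share of each itemized category, rounded to one decimal place.
shares = {
    name: round(100 * count / total_collected, 1)
    for name, count in itemized.items()
}

print(f"Itemized: {itemized_total:,}  Not itemized: {not_itemized:,}")
for name, share in shares.items():
    print(f"{name}: {share}%")
```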

Annotation Process

Stages

  1. Cheque Pre-processing: We standardized image brightness and alignment so that all cheques are presented consistently.
  2. Component Annotation: Our team marked the key elements of each cheque (payee name, amounts, date, signature, and MICR code) so that each field is precisely located.
  3. Text Transcription: We manually transcribed the written elements to provide accurate ground truth for OCR training.
  4. Forgery Tagging: We tagged photocopied and potentially fraudulent cheques so that the dataset can be used to train fraud detection models.
  5. Validation: We first ran existing OCR tools as a preliminary check, then performed thorough manual reviews.
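
To make the output of these stages more concrete, here is a minimal sketch of what a single annotated cheque record could look like. It is an illustration only: the class names, field names, and bounding-box format are hypothetical and do not describe the dataset's actual schema. Signature and MICR fields are optional in this sketch because the annotation metrics report fewer of those than of the other components.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Region:
    """Axis-aligned bounding box in pixel coordinates (hypothetical format)."""
    x: int
    y: int
    width: int
    height: int


@dataclass
class AnnotatedField:
    """One annotated cheque component: where it is and what it says."""
    region: Region
    text: Optional[str] = None  # transcription; None for non-text fields such as signatures


@dataclass
class ChequeAnnotation:
    image_id: str
    payee_name: AnnotatedField
    amount_in_words: AnnotatedField
    amount_in_figures: AnnotatedField
    date: AnnotatedField
    signature: Optional[AnnotatedField] = None
    micr: Optional[AnnotatedField] = None
    is_photocopy: bool = False        # set during forgery tagging
    suspected_forgery: bool = False   # set during forgery tagging

    def annotation_count(self) -> int:
        """Number of component annotations present on this cheque."""
        fields = [self.payee_name, self.amount_in_words, self.amount_in_figures,
                  self.date, self.signature, self.micr]
        return sum(f is not None for f in fields)


# Made-up example record; all values are invented for illustration.
example = ChequeAnnotation(
    image_id="cheque-000001",
    payee_name=AnnotatedField(Region(120, 80, 400, 40), "Jane Doe"),
    amount_in_words=AnnotatedField(Region(120, 130, 500, 40), "One hundred only"),
    amount_in_figures=AnnotatedField(Region(650, 130, 120, 40), "100.00"),
    date=AnnotatedField(Region(650, 30, 120, 30), "01/02/2023"),
    signature=AnnotatedField(Region(560, 220, 200, 60)),
)
print(example.annotation_count())
```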

Annotation Metrics

  • Total Annotations: 1,250,000
  • Payee Names: 250,000
  • Amounts in Words: 250,000
  • Amounts in Figures: 250,000
  • Date Annotations: 250,000
  • Signature Annotations: 175,000
  • MICR Annotations: 75,000
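
The per-component counts can be checked against the stated total with a few lines of arithmetic. The sketch below also derives the average number of annotations per cheque and, assuming at most one annotation of each type per cheque (our assumption, not something stated above), the share of cheques carrying each component.

```python
cheques = 250_000  # total cheques collected

# Counts taken from the annotation metrics above.
annotation_counts = {
    "payee_name": 250_000,
    "amount_in_words": 250_000,
    "amount_in_figures": 250_000,
    "date": 250_000,
    "signature": 175_000,
    "micr": 75_000,
}

total_annotations = sum(annotation_counts.values())
per_cheque = total_annotations / cheques

# Share of cheques with each component, assuming one annotation of a type per cheque.
coverage = {name: count / cheques for name, count in annotation_counts.items()}

print(f"Total annotations: {total_annotations:,}")
print(f"Average per cheque: {per_cheque}")
for name, share in coverage.items():
    print(f"{name}: {share:.0%}")
```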

Quality Assurance

Stages

  • Existing OCR Verification: We compared our transcriptions against the output of existing OCR tools to catch transcription errors.
  • Banking Expert Review: Banking professionals reviewed annotations to confirm that they are accurate and relevant in real-world contexts.
  • Inter-annotator Agreement: Multiple annotators reviewed the same images so that their annotations could be checked against each other for consistency and reliability.
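
One common way to quantify inter-annotator agreement on categorical labels is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. We do not state above which measure was used, so the sketch below is only an example of the general technique. The function name and the sample labels are made up for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of each annotator's label frequencies, summed over labels.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Invented forgery tags from two annotators for the same six cheques.
annotator_1 = ["genuine", "genuine", "photocopy", "genuine", "photocopy", "genuine"]
annotator_2 = ["genuine", "photocopy", "photocopy", "genuine", "photocopy", "genuine"]

kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance. Items on which annotators disagree are natural candidates for a further manual review.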

QA Metrics

  • Annotations Reviewed by Banking Experts: 50,000 (equal to 20% of the 250,000 cheques, or 4% of the 1,250,000 annotations)
  • Inconsistencies Identified and Rectified: 25,000 (equal to 10% of the 250,000 cheques, or 2% of the 1,250,000 annotations)

Conclusion

The Bank Cheque Dataset brings together 250,000 cheque samples and 1,250,000 detailed annotations. It is designed to support banking automation, in particular cheque clearance and fraud detection, and to serve as a practical resource for fintech development.
