Bank Cheque Dataset
Project Overview
Objective
As a leading data collection and annotation company, we embarked on a project to create a comprehensive dataset for bank cheque processing. Our aim was to support advancements in Optical Character Recognition (OCR), fraud detection, and the automation of cheque clearance processes.
Scope
We gathered a wide variety of bank cheque images covering different handwriting styles, bank designs, and physical conditions such as wear and tear. Our expert team then annotated the key components of each cheque, including payee names, amounts, dates, signatures, and MICR codes, so that the annotations can be used directly to train machine learning models.
Sources
- Personal cheques
- Business cheques
- Cheques from various banks and financial institutions
- Photocopies of cheques (to simulate fraud scenarios)
- Old and worn-out cheques
Data Collection Metrics
- Total Cheques Collected: 250,000 (an increase from the initial 200,000)
- Personal Cheques: 150,000
- Business Cheques: 70,000
- Photocopied Cheques: 15,000
Annotation Process
Stages
- Cheque Pre-processing: We standardized images for brightness and alignment, ensuring consistency.
- Component Annotation: Our team marked the key elements of each cheque so that the data fed into models is precisely labeled.
- Text Transcription: We manually transcribed the written elements to provide accurate ground truth for OCR training, then reviewed and double-checked each transcription against the cheque image.
- Forgery Tagging: We tagged photocopied and potentially fraudulent cheques to support the training of fraud detection models.
- Validation: We first used existing OCR tools for preliminary verification, then conducted thorough manual reviews.
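The stages above can be illustrated with a minimal Python sketch. It shows one way an annotated cheque might be represented and how a manual transcription could be cross-checked against OCR output during validation. The schema (`FieldAnnotation`, `ChequeRecord`), the field labels, the sample values, and the 0.9 similarity threshold are all hypothetical and chosen for illustration; they are not the format or tooling used in the project.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher


@dataclass
class FieldAnnotation:
    """One annotated cheque component (hypothetical schema)."""
    label: str           # e.g. "payee_name", "amount_figures", "micr"
    bbox: tuple          # (x, y, width, height) in pixels
    transcription: str   # text typed by the human annotator


@dataclass
class ChequeRecord:
    """All annotations for one cheque image (hypothetical schema)."""
    image_id: str
    fields: list = field(default_factory=list)
    is_photocopy: bool = False  # set during the forgery tagging stage


def similarity(manual: str, ocr: str) -> float:
    """Return a 0..1 similarity ratio, ignoring case and extra whitespace."""
    def normalize(text: str) -> str:
        return " ".join(text.lower().split())
    return SequenceMatcher(None, normalize(manual), normalize(ocr)).ratio()


def fields_needing_review(record: ChequeRecord, ocr_output: dict,
                          threshold: float = 0.9) -> list:
    """List the labels whose manual transcription disagrees with the OCR text."""
    flagged = []
    for annotation in record.fields:
        ocr_text = ocr_output.get(annotation.label, "")
        if similarity(annotation.transcription, ocr_text) < threshold:
            flagged.append(annotation.label)
    return flagged


# Made-up example data, not taken from the dataset.
record = ChequeRecord(
    image_id="cheque_000001",
    fields=[
        FieldAnnotation("payee_name", (120, 80, 400, 40), "John Smith"),
        FieldAnnotation("amount_figures", (620, 150, 160, 40), "1,250.00"),
    ],
)
ocr_output = {"payee_name": "john  smith", "amount_figures": "7,250.00"}

print(fields_needing_review(record, ocr_output))
```

In this sketch a disagreement only flags the field for manual review; it does not decide whether the annotator or the OCR tool is right.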
Annotation Metrics
- Total Annotations: 1,250,000
- Payee Names: 250,000
- Amounts in Words: 250,000
- Amounts in Figures: 250,000
- Date Annotations: 250,000
- Signature Annotations: 175,000
- MICR Annotations: 75,000
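As a quick consistency check, the per-type counts above can be summed and compared with the stated total. The sketch below also expresses each count as a share of the 250,000 cheques collected; reading that share as "coverage" assumes at most one annotation of each type per cheque, which the figures suggest but do not state. The dictionary keys are illustrative names.

```python
# Counts copied from the Annotation Metrics list; key names are illustrative.
TOTAL_CHEQUES = 250_000
STATED_TOTAL_ANNOTATIONS = 1_250_000

annotation_counts = {
    "payee_names": 250_000,
    "amounts_in_words": 250_000,
    "amounts_in_figures": 250_000,
    "dates": 250_000,
    "signatures": 175_000,
    "micr": 75_000,
}

computed_total = sum(annotation_counts.values())

# Share of cheques carrying each annotation type, assuming at most one
# annotation of that type per cheque (an assumption, not a stated fact).
coverage = {name: count / TOTAL_CHEQUES
            for name, count in annotation_counts.items()}

print("Totals match:", computed_total == STATED_TOTAL_ANNOTATIONS)
for name, share in coverage.items():
    print(f"{name}: {share:.0%}")
```

Under that assumption, signatures are annotated on 70% of cheques and MICR codes on 30%, while the other four fields are annotated on every cheque.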
Quality Assurance
Stages
- Existing OCR Verification: We compared our transcriptions against the output of existing OCR solutions to catch discrepancies.
- Banking Expert Review: Banking professionals reviewed annotations to confirm that they are relevant and accurate in real-world contexts.
- Inter-annotator Agreement: Multiple annotators reviewed the same images to check that annotations are consistent and reliable.
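Inter-annotator agreement is often quantified with a chance-corrected statistic. The sketch below computes Cohen's kappa for two annotators who tag the same cheques as genuine or photocopied. The choice of kappa, the label names, and the sample labels are assumptions made for illustration; the case study does not state which agreement measure was used.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected agreement if each annotator labeled independently at random
    # with their own label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label]
                   for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Made-up labels for six cheques, not taken from the dataset.
annotator_1 = ["genuine", "genuine", "photocopy",
               "genuine", "photocopy", "genuine"]
annotator_2 = ["genuine", "genuine", "photocopy",
               "photocopy", "photocopy", "genuine"]

kappa = cohens_kappa(annotator_1, annotator_2)
print(f"Cohen's kappa: {kappa:.3f}")
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance, so low values point to images or guidelines that need another look.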
QA Metrics
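- Annotations Reviewed by Banking Experts: 50,000 (4% of the 1,250,000 total annotations, equivalent to 20% of the 250,000 cheques collected)
- Inconsistencies Identified and Rectified: 25,000 (2% of the 1,250,000 total annotations, equivalent to 10% of the 250,000 cheques collected)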