KHATT-Arabic Dataset
Use Case: Computer Vision
Description
Explore the KHATT-Arabic Dataset, a unique collection of unconstrained handwritten Arabic texts from 1,000 diverse writers.
KHATT-Arabic is a comprehensive database for research on unconstrained Arabic handwriting. It was developed in a collaboration led by Professor Sabri Mahmoud at King Fahd University of Petroleum and Minerals (KFUPM) in Dhahran, Saudi Arabia, with Professor Fink of TU-Dortmund, Germany, and Dr. Märgner of TU-Braunschweig, Germany. The dataset consists of handwritten Arabic texts from 1,000 diverse writers, which makes it a rich resource for handwriting recognition studies, including text recognition and writer identification.
Dataset Features:
Contributors: Handwritten forms from 1,000 unique writers.
Resolution Quality: Images scanned at 200, 300, and 600 DPI to accommodate different research needs.
Diversity: Contributors vary by nationality, age, gender, handedness, and educational background.
Writing Styles: Includes natural, unrestricted handwriting styles.
Content Variety:
Unique Texts: 2,000 paragraphs on varied topics such as arts, education, health, nature, and technology, along with their line-segmented images.
Similar Texts: 2,000 paragraphs covering all Arabic characters and shapes, each with line-segmented images.
Free Texts: Paragraphs on topics freely chosen by the writers.
Annotation: All paragraph and line images come with manually verified ground truths and Latin transliterations of the Arabic texts.
Dataset Splits: The dataset is organized into training (70%), validation (15%), and testing (15%) sets.
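The 70/15/15 proportions above can be illustrated with a short sketch. The description does not say how the official split was drawn, so the writer-disjoint strategy, the function name split_writers, and the writer IDs 1 to 1,000 used here are all assumptions made for illustration, not the dataset's actual procedure.

```python
import random


def split_writers(writer_ids, train=0.70, val=0.15, seed=0):
    """Partition writer IDs into train/validation/test groups.

    Assumption: the split is made per writer rather than per image, so
    no writer's handwriting appears in more than one subset. The source
    states only the 70/15/15 proportions.
    """
    ids = sorted(set(writer_ids))
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(ids)
    n_train = round(len(ids) * train)
    n_val = round(len(ids) * val)
    return {
        "train": ids[:n_train],
        "validation": ids[n_train:n_train + n_val],
        "test": ids[n_train + n_val:],  # remainder goes to the test set
    }


# Hypothetical writer IDs 1..1000, one per contributor.
splits = split_writers(range(1, 1001))
for name, ids in splits.items():
    print(name, len(ids))
```

When the dataset ships with its own predefined split files, those should be preferred over a fresh random split so that results stay comparable across studies.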
Research Applications:
The KHATT-Arabic dataset is designed to support advancements in several areas of handwriting analysis, including writer identification, line segmentation, noise removal, and binarization techniques, in addition to general handwritten text recognition.
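As one example of the binarization task mentioned above, the sketch below applies Otsu's method, a common global-thresholding baseline. The dataset description does not prescribe this or any other algorithm. The sketch assumes a scanned page has already been loaded as a flat list of 8-bit grayscale values; the toy pixel list stands in for real image data.

```python
def otsu_threshold(pixels):
    """Return the 8-bit threshold that maximizes between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_dark = 0    # number of pixels with value <= t
    sum_dark = 0  # sum of those pixel values
    for t in range(256):
        w_dark += hist[t]
        sum_dark += t * hist[t]
        if w_dark == 0:
            continue  # no dark pixels yet
        w_light = total - w_dark
        if w_light == 0:
            break  # every pixel is already in the dark class
        mean_dark = sum_dark / w_dark
        mean_light = (sum_all - sum_dark) / w_light
        # Between-class variance, up to a constant factor of 1 / total**2.
        var = w_dark * w_light * (mean_dark - mean_light) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(pixels, threshold):
    """Map pixels at or below the threshold to 0 (ink), others to 255 (paper)."""
    return [0 if p <= threshold else 255 for p in pixels]


# Toy data: 40 dark "ink" pixels and 60 bright "paper" pixels.
page = [20] * 30 + [30] * 10 + [220] * 60
t = otsu_threshold(page)
binary = binarize(page, t)
print(t, binary.count(0), binary.count(255))
```

A single global threshold is only a starting point. Scans with uneven illumination or bleed-through usually call for locally adaptive methods instead.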