Hoda Dataset

Comprehensive Dataset for Persian Handwriting Recognition

About the Dataset

The Hoda Dataset is the first large collection of Persian handwritten digits, comprising 102,352 black-and-white handwritten samples. It was developed during a master’s project on the recognition of handwritten forms. The samples were extracted from approximately 12,000 registration forms for the 2014 national master’s exam and the continuous associate exam of the University of Applied Sciences.

Dataset Features:

Resolution: 200 DPI (dots per inch)

Total Samples: 102,352

Training Samples: 6,000 per class

Test Samples: 2,000 per class

Additional Samples: 22,352
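
The split sizes listed above can be checked with a little arithmetic. The sketch below assumes ten classes (the digits 0 to 9); that count is not stated in the list itself, but it is the value that makes the per-class figures add up to the stated total. The constant and function names are illustrative only and do not come from any official loader.

```python
# Per-class and additional counts as listed in the dataset features.
TRAIN_PER_CLASS = 6_000
TEST_PER_CLASS = 2_000
ADDITIONAL_SAMPLES = 22_352

# Assumption: ten classes, one per digit 0-9.
NUM_CLASSES = 10


def split_sizes(num_classes: int = NUM_CLASSES) -> dict:
    """Return the size of each split and the overall total."""
    train = num_classes * TRAIN_PER_CLASS
    test = num_classes * TEST_PER_CLASS
    return {
        "train": train,
        "test": test,
        "additional": ADDITIONAL_SAMPLES,
        "total": train + test + ADDITIONAL_SAMPLES,
    }


sizes = split_sizes()
for name, count in sizes.items():
    print(f"{name}: {count:,}")
```

Under that assumption, the training, test, and additional samples together account for the full 102,352 samples.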

Purpose and Applications:

This dataset is well suited to training and evaluating handwriting recognition models for Persian script. Because the samples come from roughly 12,000 forms, they cover a wide variety of handwriting, which can improve the robustness and accuracy of machine learning models trained on them.

Additional Content:

The Dataset serves as a valuable resource for researchers and developers in the field of AI and machine learning, providing a solid foundation for advancing handwritten text recognition technologies. This collection not only facilitates academic research but also supports practical applications in automated form processing and digital archiving.
