Pneumonia & Tuberculosis with Normal & Non-X-ray Dataset

Use Case

Computer Vision

Description

Explore a chest X-ray dataset for AI training that covers Normal, Pneumonia, Tuberculosis, and Non-X-ray images and is suited to classification tasks.

The Pneumonia & Tuberculosis with Normal & Non-X-ray Dataset is a curated collection of chest X-rays and non-X-ray images for developing AI models. It supports classification into four categories: Normal, Pneumonia, Tuberculosis (TB), and Non-X-ray (anomaly). This makes it suited to medical research, open-set recognition, and anomaly detection, particularly in resource-constrained environments.

Key Features of the Dataset

  1. Diverse and Reliable Data Sources
  • Chest X-rays: Gathered from global medical datasets including Kermany, RSNA, NIAID, NLM, and Belarus, offering a comprehensive representation of Normal, Pneumonia, and TB cases.
  • Non-X-ray Images: Non-medical images, such as animals and objects, sourced from the Pavan Sanagapati dataset, for testing model robustness and open-set recognition.
  2. Balanced Dataset Splits
  • Training Set: 85% of the data for model learning.
  • Validation Set: 5% for hyperparameter tuning.
  • Test Set: 10% for performance evaluation.
  • Includes a CSV file containing metadata (image name, label, source, split) for organized access and analysis.
  3. Multi-Class Categorization

The dataset encompasses the following categories:

  • Normal: Healthy chest X-rays.
  • Pneumonia: Chest X-rays diagnosed with pneumonia.
  • Tuberculosis (TB): X-rays of patients with TB.
  • Non-X-ray: Non-medical images for open-set anomaly detection.
  4. Optimized for AI Training and Research
  • Built for evaluating classification, anomaly detection, and generalization.
  • Supports supervised learning, transfer learning, and custom algorithm testing.

Advantages of This Dataset

  1. Diverse Applications
    • Train AI models for multiclass classification.
    • Evaluate open-set recognition performance by detecting anomalies.
    • Analyze model generalization across diverse imaging conditions and sources.
  2. High Clinical Relevance
    • Offers a real-world dataset for addressing medical challenges in resource-limited settings.
    • Enables research into global healthcare disparities with data from diverse countries.
  3. Robust Evaluation Metrics
    • Designed for fairness in evaluation with proportional splits across sources.
    • Allows in-depth study of model performance across Normal, Pneumonia, TB, and non-medical categories.

Dataset Composition and Distribution

Split       Normal    Pneumonia    Tuberculosis    Non-X-ray
Train        4,667        3,633           3,573        1,155
Validate       274          213             207           67
Test           584          427             417          135
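
As a quick check on the counts above, the sketch below sums each split and computes its share of the whole dataset, plus the class mix inside a split. The numbers are copied from the composition table; the function names are illustrative and not part of the dataset.

```python
# Image counts copied from the composition table above.
COUNTS = {
    "Train":    {"Normal": 4667, "Pneumonia": 3633, "Tuberculosis": 3573, "Non-X-ray": 1155},
    "Validate": {"Normal": 274,  "Pneumonia": 213,  "Tuberculosis": 207,  "Non-X-ray": 67},
    "Test":     {"Normal": 584,  "Pneumonia": 427,  "Tuberculosis": 417,  "Non-X-ray": 135},
}


def split_totals(counts):
    """Total number of images in each split."""
    return {split: sum(per_class.values()) for split, per_class in counts.items()}


def split_shares(counts):
    """Fraction of the whole dataset that falls into each split."""
    totals = split_totals(counts)
    grand_total = sum(totals.values())
    return {split: n / grand_total for split, n in totals.items()}


def class_shares(counts, split):
    """Fraction of one split that belongs to each class."""
    per_class = counts[split]
    total = sum(per_class.values())
    return {label: n / total for label, n in per_class.items()}


if __name__ == "__main__":
    totals = split_totals(COUNTS)
    for split, share in split_shares(COUNTS).items():
        print(f"{split}: {totals[split]} images ({share:.1%})")
```

The totals come to 13,028 training, 761 validation, and 1,563 test images (15,352 overall), which is roughly 84.9%, 5.0%, and 10.2% and matches the stated 85/5/10 split. Non-X-ray images make up about 9% of each split, so the classes are not evenly sized; per-class metrics may be more informative than overall accuracy.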

How to Use This Dataset

  1. Download the dataset and extract it into your working directory.
  2. Use the dataset_splits_with_metadata.csv file for image metadata (e.g., image name, label, source, and split).
  3. Preprocess the images by resizing, normalizing, or applying augmentation techniques as needed.
  4. Train, validate, and test your models using the pre-defined splits for reproducible results.
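
Steps 2 and 4 above can be sketched with Python's standard `csv` module. The file name `dataset_splits_with_metadata.csv` comes from the steps above, but the column headers used here (`split`, `label`) are assumptions; check them against the actual header row of the CSV before relying on this.

```python
import csv
from collections import Counter, defaultdict

# Assumed column headers; the real CSV may name them differently.
SPLIT_COLUMN = "split"
LABEL_COLUMN = "label"


def read_metadata(lines):
    """Parse CSV text (an open file or any iterable of lines) into dict rows."""
    return list(csv.DictReader(lines))


def load_metadata(path="dataset_splits_with_metadata.csv"):
    """Open the metadata file from the extracted dataset and parse it."""
    with open(path, newline="", encoding="utf-8") as handle:
        return read_metadata(handle)


def group_by_split(rows):
    """Group metadata rows by the value of their split column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[SPLIT_COLUMN]].append(row)
    return dict(groups)


def label_counts(rows):
    """Count how many rows carry each label."""
    return Counter(row[LABEL_COLUMN] for row in rows)
```

Calling `group_by_split(load_metadata())` would give one list of rows per pre-defined split, and `label_counts` on each list can be compared against the composition table to confirm the files were extracted completely.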

Applications of This Dataset

  • Multiclass classification for detecting and distinguishing Normal, Pneumonia, TB, and Non-X-ray anomalies.
  • Training and evaluating models for open-set recognition and anomaly detection.
  • Advancing healthcare research in settings with limited medical imaging resources.
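
One common baseline for the open-set use case is to train a classifier on the three chest X-ray classes only and reject inputs whose highest softmax probability is low. The sketch below illustrates that idea in plain Python; it is not a method prescribed by the dataset, and the threshold value of 0.6 is an arbitrary placeholder that would need tuning on the validation split. Training a four-class model with Non-X-ray as an ordinary class is an equally valid alternative.

```python
import math

# Known classes a hypothetical closed-set model was trained on.
KNOWN_CLASSES = ["Normal", "Pneumonia", "Tuberculosis"]
REJECT_LABEL = "Non-X-ray"


def softmax(logits):
    """Convert raw model scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_open_set(logits, threshold=0.6):
    """Return (label, confidence); reject when the top probability is below threshold.

    The threshold is an assumption for illustration, not a recommended value.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return REJECT_LABEL, probs[best]
    return KNOWN_CLASSES[best], probs[best]
```

A confident score vector such as `[4.0, 0.5, 0.2]` is assigned to Normal, while a near-uniform one such as `[0.1, 0.0, 0.05]` falls below the threshold and is rejected as Non-X-ray. The Non-X-ray images in the test split can then be used to measure how often such rejections are correct.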

Conclusion

The Pneumonia & Tuberculosis with Normal & Non-X-ray Dataset is intended for researchers and AI developers building accurate, generalizable, and robust models for healthcare. It combines medical and non-medical data, so models can be tested on disease classification and on recognizing inputs that are not chest X-rays at all.
