100 Sports Image Classification


Use Case

Computer Vision


This version updates the earlier release. I ran a dataset quality evaluation program on the previous version, and it found a considerable number of duplicate and near-duplicate images.

About Dataset


I’ve updated this dataset to make it better for you to use. The previous version contained many duplicate or very similar images. When the same image appears in more than one split, a model can seem more accurate than it really is. So, I’ve removed those duplicates in this new version.

I collected these images from the internet and checked them for duplicates with a program I wrote. Any duplicates I found were deleted so that no image is shared between the training, testing, and validation sets. After that, I resized all the images to the same size and converted them to a standard format.
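The author's duplicate-checking program is not included, so the following is only a minimal sketch of one way to check a local copy for exact duplicates. It assumes the images are `.jpg` files somewhere under a single root folder, and it catches only byte-for-byte identical files. Near-duplicates (resized or re-encoded copies) would need a different technique, such as perceptual hashing, which is not shown here. The function names are illustrative, not part of the dataset.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's raw bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def find_exact_duplicates(root: Path) -> list[list[Path]]:
    """Group .jpg files under `root` that are byte-for-byte identical."""
    groups: dict[str, list[Path]] = defaultdict(list)
    # Walk every split and class folder below the (assumed) dataset root.
    for path in sorted(root.rglob("*.jpg")):
        groups[file_digest(path)].append(path)
    # Keep only digests shared by two or more files.
    return [paths for paths in groups.values() if len(paths) > 1]
```

Calling `find_exact_duplicates(Path("sports"))`, where `sports` is a hypothetical folder name, returns a list of groups. A group whose paths fall in different split folders would indicate leakage between splits.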

There’s also a file included that lists each image’s location, what it shows, and which part of the dataset it belongs to (like training, testing, or validation).

This dataset is clean and ready to use. A reasonably well-designed model should reach at least 95% accuracy on the test set. With techniques such as transfer learning, you might even achieve over 98% accuracy on the test set. If you find this dataset helpful, please give it an upvote. Thanks!


This is a collection of pictures of 100 different sports. Each picture is in JPG format and has dimensions of 224×224 pixels with 3 color channels. The data is divided into three main folders: one for training, one for testing, and one for validation.

Additionally, there’s a CSV file included. This file lists all the image filenames along with their corresponding sports category. You can use this CSV file to organize your own training, testing, and validation datasets if you want to.
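As a rough sketch of how the CSV file could be used to build your own splits, the function below groups rows by the split they belong to. The column names are assumptions, since the description does not give them: `data set` for the split column, with `filepaths` and `labels` used only in the usage note. Check the header row of the actual file and adjust accordingly.

```python
import csv
from collections import defaultdict


def rows_by_split(csv_path, split_column="data set"):
    """Group CSV rows by the value found in `split_column`.

    `split_column` is an assumed column name; pass the real one
    from the file's header row if it differs.
    """
    splits = defaultdict(list)
    with open(csv_path, newline="", encoding="utf-8") as handle:
        # DictReader uses the header row as the keys of each row dict.
        for row in csv.DictReader(handle):
            splits[row[split_column]].append(row)
    return dict(splits)
```

With the assumed column names, `rows_by_split("sports.csv")["train"]` would be a list of dictionaries, each holding one training image's `filepaths` and `labels` values. Here `sports.csv` is a placeholder filename.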


This dataset is made to be clean and easy to use. It has no bad images, and no duplicates are shared between the training, testing, and validation sets, which makes it well suited to testing your models.

I designed it so you can easily apply Keras preprocessing functions like ImageDataGenerator.flow_from_directory. If you prefer using the CSV file, you can use ImageDataGenerator.flow_from_dataframe.

I made sure that in each image, the main part showing the sport takes up about half of the picture. Because of this, even a fairly simple model should perform well. You can expect training and validation accuracies in the high 90s (percent) with this dataset.
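A minimal sketch of loading the three folders with ImageDataGenerator.flow_from_directory might look like the following. It assumes TensorFlow is installed, that the split folders are named `train`, `valid`, and `test`, and that each contains one subfolder per sport. None of these folder names are confirmed by the description, and the helper names are illustrative. Recent TensorFlow documentation also marks ImageDataGenerator as deprecated in favor of `tf.keras.utils.image_dataset_from_directory`, so treat this as one option rather than the recommended one.

```python
from pathlib import Path

IMAGE_SIZE = (224, 224)  # the dataset's stated image dimensions


def split_directories(root):
    """Map each split name to its folder (folder names are assumptions)."""
    root = Path(root)
    return {name: root / name for name in ("train", "valid", "test")}


def make_generators(root, batch_size=32):
    """Build one Keras directory iterator per split."""
    # Imported here so the path helper works without TensorFlow installed.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    datagen = ImageDataGenerator(rescale=1.0 / 255)  # scale pixels to [0, 1]
    return {
        name: datagen.flow_from_directory(
            str(folder),
            target_size=IMAGE_SIZE,
            batch_size=batch_size,
            class_mode="categorical",  # one-hot labels for the 100 classes
            shuffle=(name == "train"),  # keep evaluation order stable
        )
        for name, folder in split_directories(root).items()
    }
```

The returned dictionary can be passed to a model as, for example, `model.fit(gens["train"], validation_data=gens["valid"])`, where `gens` is the result of `make_generators` on your local dataset root.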



