How can I classify sports images?
Tuesday, May 29, 2018
By Jovica Turcinovic
Image classification is one of the fields where AI and neural networks are achieving fantastic results!
I grew up in the former Yugoslavia. All Yugoslav republics, later independent countries, are recognized as nations that excel in numerous sports. Yugoslavia was among the global leaders in sports, especially when it comes to team sports.
The above countries are strong competitors in water polo, handball, volleyball etc.
Let's imagine we have all sports images from the last ~70 years in one place. How would we classify the images by each sport?
Do we need days or months of work? Or can it be done within a day, or even hours? It turns out that this is possible. Wait, it gets better: it can be done with just a few lines of code. Hm, how?
Thanks to fast.ai, the best DL course, and its fast.ai library, we are able to achieve that goal. I highly recommend fast.ai as the first and best pick for those who wish to dive quickly into the fascinating world of AI and DL.
In order to demonstrate the solution for the classification of sports images, I chose five sports for the classification:
-> Basketball, Soccer, Volleyball, Water-polo and American football <-
As you'll see soon, with a corpus of only ~40 images per sport I achieved world-class accuracy in classifying sports images: more than 99% on the validation data.
How did I start?
Actually, I still don't have all sports images of Yugoslavian sports from within the last 70 years. But, eventually, I can be prepared for that day.
So, what can I do? As you can presume, I can pick some random images for each sport using the Google search engine and put them into the proper folders (classified training data set can be downloaded from here, the validation set can be downloaded from here).
Random American football
Random water polo
Once I have grabbed enough images from the Google search engine, I can sort them into the "train" and "valid" folders.
As part of this task, we should also prepare some test images.
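The folder preparation described above can also be scripted. The following is a minimal sketch, assuming the downloaded images sit in one folder per sport under a hypothetical `raw` directory; the function name `split_train_valid`, the 20% validation share, and the fixed seed are my assumptions, not part of the original setup. It copies each sport's images into the `train/<class>` and `valid/<class>` layout used in this post.

```python
import random
import shutil
from pathlib import Path


def split_train_valid(raw_dir, out_dir, valid_fraction=0.2, seed=42):
    """Copy raw_dir/<class>/* into out_dir/train/<class> and out_dir/valid/<class>."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    counts = {}
    for class_dir in sorted(p for p in Path(raw_dir).iterdir() if p.is_dir()):
        files = sorted(f for f in class_dir.iterdir() if f.is_file())
        rng.shuffle(files)
        # Keep at least one validation image per class when the class has 2+ images.
        n_valid = max(1, int(len(files) * valid_fraction)) if len(files) > 1 else 0
        for i, src in enumerate(files):
            split = "valid" if i < n_valid else "train"
            dest = Path(out_dir) / split / class_dir.name
            dest.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dest / src.name)
        counts[class_dir.name] = (len(files) - n_valid, n_valid)
    return counts  # {class_name: (n_train, n_valid)}
```

Calling `split_train_valid("raw", "data/sports")` would return the number of training and validation images per sport, which is a quick way to check that no class ended up empty.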
Implementation and the testing process
Let's import the necessary libraries:
from fastai.imports import *
from fastai.transforms import *
from fastai.conv_learner import *
from fastai.model import *
from fastai.dataset import *
from fastai.sgdr import *
from fastai.plots import *
The next step is to define the path to the train/valid/test folders and also the image size (sz) to which the images will be resized:
PATH = "d:\\Work\\Learning\\FastAI2018\\fastai\\tutorials\\data\\sports\\"
sz=128
Let's check the results after 5 epochs with the learning rate set to 0.01 by running the following lines of code:
arch=resnet34
data = ImageClassifierData.from_paths(PATH, tfms=tfms_from_model(arch, sz))
learn = ConvLearner.pretrained(arch, data, precompute=True)
learn.fit(0.01, 5)
After training the network, I was surprised that accuracy reached ~98% in just 5 seconds.
It's always a good idea to plot some correctly classified images, some incorrectly classified images, and some of the most uncertain results.
First, let's define some methods which may be of help:
log_preds = learn.predict()
preds = np.argmax(log_preds, axis=1)  # from log probabilities to the predicted class index
probs = np.exp(log_preds[:,1])        # P(class 1), which is basketball here

def rand_by_mask(mask):
    # np.where returns a tuple; [0] takes the array of matching indices
    return np.random.choice(np.where(mask)[0], 4, replace=True)

def rand_by_correct(is_correct):
    return rand_by_mask((preds == data.val_y)==is_correct)

def plot_val_with_title(idxs, title):
    # val_ds[x] is an (image, label) pair; [0] keeps the image
    imgs = np.stack([data.val_ds[x][0] for x in idxs])
    title_probs = [probs[x] for x in idxs]
    print(title)
    return plots(data.val_ds.denorm(imgs), rows=1, titles=title_probs)

def plots(ims, figsize=(12,6), rows=1, titles=None):
    f = plt.figure(figsize=figsize)
    for i in range(len(ims)):
        sp = f.add_subplot(rows, len(ims)//rows, i+1)
        sp.axis('Off')
        if titles is not None: sp.set_title(titles[i], fontsize=16)
        plt.imshow(ims[i])

def load_img_id(ds, idx):
    return np.array(PIL.Image.open(PATH+ds.fnames[idx]))

# This second definition replaces the one above: it loads the original image files
def plot_val_with_title(idxs, title):
    imgs = [load_img_id(data.val_ds,x) for x in idxs]
    title_probs = [probs[x] for x in idxs]
    print(title)
    return plots(imgs, rows=1, titles=title_probs, figsize=(16,8))
Let's plot a few random correctly classified images using the following function:
plot_val_with_title(rand_by_correct(True), "Correctly classified")
Let's plot some incorrectly classified images using the following command:
plot_val_with_title(rand_by_correct(False), "Incorrectly classified")
As we can see, there are actually two images which were not classified properly (sorry Bogdan Bogdanovic :) ).
Let's check how we can find the most confidently classified ones.
Define the following functions:
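def most_by_mask(mask, mult):
    idxs = np.where(mask)[0]  # array of indices where mask is True
    return idxs[np.argsort(mult * probs[idxs])[:4]]

def most_by_correct(y, is_correct):
    # probs holds P(class 1): mult = -1 sorts from the highest P(class 1), mult = 1 from the lowest.
    # For classes other than 1 this is only a proxy for the model's confidence.
    mult = -1 if (y==1)==is_correct else 1
    return most_by_mask(((preds == data.val_y)==is_correct) & (data.val_y == y), mult)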
In order to see the most relevant ones, use the following command (American football):
plot_val_with_title(most_by_correct(0, True), "Most correct american football")
Do the same for basketball:
plot_val_with_title(most_by_correct(1, True), "Most correct basket")
What are the most incorrect basketball images? Let's run the following command:
plot_val_with_title(most_by_correct(1, False), "Most incorrect basket")
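Since `probs` holds only the probability of class 1, the ranking above is exact for basketball and approximate for the other sports. A multi-class variant could rank validation images by the probability the model assigns to its own prediction. The sketch below assumes `log_preds` is a list of per-image log-probability rows shaped like the output of `learn.predict()`; the helper name `most_confident_for_class` and the sample numbers are hypothetical.

```python
import math


def most_confident_for_class(log_preds, y_true, cls, correct=True, k=4):
    """Return up to k indices of images whose true label is `cls`, most confident first.

    correct=True  -> correctly classified images, ranked by P(cls).
    correct=False -> misclassified images, ranked by P(predicted class).
    """
    scored = []
    for i, (row, y) in enumerate(zip(log_preds, y_true)):
        if y != cls:
            continue
        pred = max(range(len(row)), key=row.__getitem__)  # argmax over classes
        if (pred == y) == correct:
            scored.append((math.exp(row[pred]), i))
    scored.sort(reverse=True)  # highest probability first
    return [i for _, i in scored[:k]]


# Hypothetical log-probabilities for 4 validation images and 3 classes.
log_preds = [
    [math.log(0.7), math.log(0.2), math.log(0.1)],
    [math.log(0.1), math.log(0.8), math.log(0.1)],
    [math.log(0.6), math.log(0.3), math.log(0.1)],  # true class 1, predicted 0
    [math.log(0.05), math.log(0.9), math.log(0.05)],
]
y_true = [0, 1, 1, 1]
best = most_confident_for_class(log_preds, y_true, cls=1, correct=True)
worst = most_confident_for_class(log_preds, y_true, cls=1, correct=False)
print(best, worst)
```

With real data, the returned indices could be passed to `plot_val_with_title` in the same way as the output of `most_by_correct`.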
What else would be useful? Ah, yes - the most uncertain images.
This will help us see which kinds of images we should add to the training corpus in order to improve the accuracy.
Execute the following command in order to see the desired results:
most_uncertain = np.argsort(np.abs(probs -0.5))[:4]
plot_val_with_title(most_uncertain, "Most uncertain predictions")
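The distance from 0.5 is a natural measure when there are two classes. With five sports, one common alternative is to look at the margin between the two most probable classes: the smaller the gap, the less sure the model is. A minimal sketch under that assumption (the function name `most_uncertain_by_margin` and the sample rows are hypothetical):

```python
def most_uncertain_by_margin(probs, k=4):
    """Return up to k row indices with the smallest gap between the top-2 class probabilities."""
    margins = []
    for i, row in enumerate(probs):
        top1, top2 = sorted(row, reverse=True)[:2]
        margins.append((top1 - top2, i))
    margins.sort()  # smallest margin = most uncertain
    return [i for _, i in margins[:k]]


# Hypothetical per-class probabilities for three images and five sports.
class_probs = [
    [0.90, 0.04, 0.03, 0.02, 0.01],  # confident prediction
    [0.35, 0.33, 0.12, 0.10, 0.10],  # two sports nearly tied
    [0.50, 0.20, 0.10, 0.10, 0.10],
]
uncertain = most_uncertain_by_margin(class_probs, k=2)
print(uncertain)
```

In this toy data the second row comes out first, because its two best candidates are almost tied.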
Ok, the results are pretty good, but how could we further improve our model?
The secret is in using data augmentation. This means we can produce a lot of new images from the existing ones by zooming, flipping, rotating, etc.
The fast.ai library contains powerful functions for this matter.
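To make the idea concrete, here is a toy sketch of two such transformations, a horizontal flip and a random crop, applied to an image stored as a nested list of pixel values. It is a plain-Python illustration of the concept, not how fast.ai implements its transforms; `hflip`, `random_crop`, and `augment` are hypothetical helper names.

```python
import random


def hflip(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def random_crop(img, out_h, out_w, rng):
    """Cut a random out_h x out_w window out of the image."""
    top = rng.randint(0, len(img) - out_h)
    left = rng.randint(0, len(img[0]) - out_w)
    return [row[left:left + out_w] for row in img[top:top + out_h]]


def augment(img, out_h, out_w, rng):
    """Return one randomly flipped and cropped variant of img."""
    if rng.random() < 0.5:
        img = hflip(img)
    return random_crop(img, out_h, out_w, rng)


rng = random.Random(0)
image = [[r * 4 + c for c in range(4)] for r in range(4)]  # 4x4 dummy "image"
variants = [augment(image, 3, 3, rng) for _ in range(6)]   # six 3x3 variants
```

Each call produces a slightly different view of the same picture, which is why a small corpus can go a long way.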
In order to see this in action, let's define a function that returns augmented data:
tfms = tfms_from_model(resnet34, sz, aug_tfms=transforms_side_on, max_zoom=1.2)

def get_augs():
    data = ImageClassifierData.from_paths(PATH, bs=2, tfms=tfms, num_workers=1)
    x,_ = next(iter(data.aug_dl))
    return data.trn_ds.denorm(x)[1]  # keep a single image from the batch of 2
Plot several examples of data augmented images:
ims = np.stack([get_augs() for i in range(6)])
plots(ims, rows=2)
Define the model again, this time with data augmentation in place, and train for 50 epochs:
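arch=resnet34
data = ImageClassifierData.from_paths(PATH, tfms=tfms)
# Note: while precompute=True, training uses cached activations, so the augmented
# images only come into play after setting learn.precompute = False
learn = ConvLearner.pretrained(arch, data, precompute=True)
sz=512  # assigned after data and tfms were built, so it does not affect this run
learn.fit(1e-2, 50)
You'll end up with 99.5% accuracy!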
fast.ai has powerful features for analyzing models. One of them is plotting the confusion matrix. I personally like visual representations a lot.
The possibilities of deep learning are infinite. In future posts I will try to present some of the experiments I am working on, which may give you ideas for the problems you are working on.
Share your ideas/attempts/projects with me in the comments section of this blog post.
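For readers who want to see what such a matrix contains, the sketch below builds one by hand: rows are the true classes, columns are the predicted classes. The labels and predictions are made-up sample data, and `confusion_matrix` here is a hypothetical plain-Python helper, not the fast.ai or scikit-learn function.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Count how often each true class (row) was predicted as each class (column)."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


classes = ["american_football", "basketball", "soccer"]  # assumed class order
y_true = [0, 0, 1, 1, 1, 2, 2]
y_pred = [0, 0, 1, 1, 2, 2, 2]  # one basketball image mistaken for soccer

cm = confusion_matrix(y_true, y_pred, len(classes))
for name, row in zip(classes, cm):
    print(f"{name:>18}: {row}")

# Accuracy is the share of predictions that fall on the diagonal.
accuracy = sum(cm[i][i] for i in range(len(classes))) / len(y_true)
print(f"accuracy: {accuracy:.3f}")
```

Off-diagonal cells show exactly which sports get confused with which, and that tells you what kind of images to add to the training set.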
Let's code a wonderful world :)