Here I’ll show you how to collect, preprocess, and augment the data required for our model training.
Introduction
In the previous article of this series, we talked about the different approaches you can take to create a face mask detector. In this article, we’ll prepare a dataset for the mask detector solution.
The procedure of gathering images, preprocessing them, and augmenting the resulting dataset is essentially the same for any image dataset. We’ll take the long way through to cover real-life scenarios where data is scarce. I’ve obtained the images from two different sources, and I will show you how to standardize and augment them for future labeling.
Although there are several automated tools that make this process painless, we’ll do it the hard way to learn more.
We'll be using a Roboflow dataset that contains 149 images of people wearing face masks, all of them black-padded and roughly the same dimensions, and another set of images obtained from a completely different source at Kaggle that contains only human faces (without masks). With these two datasets representing our two classes – faces with masks and faces without masks – let's go through the steps to produce a standardized and augmented dataset.
Roboflow Dataset Normalization
I'll be using Kaggle Notebooks to run the code in this article because they provide easy access to computing power, and they're pre-configured with all the tools we'll need, so we won't have to install Python, TensorFlow, or anything else. They're not mandatory, though; you can achieve the same results by running a Jupyter Notebook locally if you prefer.
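If you do go the local route, you'll need roughly the same libraries the Kaggle environment ships with. A minimal setup along these lines should be enough (exact package versions are up to you):
# Approximate local setup; adjust versions to your environment
pip install numpy matplotlib opencv-python pillow tensorflow keras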
In this case, I manually downloaded the dataset, zipped it, and uploaded it to a Kaggle Notebook. To launch a Kaggle Notebook, go to https://kaggle.com, log in, go to Notebooks in the left panel, and click New notebook. Once it's running, upload the zip file and run the following cells.
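If you want to double-check that the uploaded data is visible to the notebook, a quick listing like the one below will do. It assumes the files end up under /kaggle/input, which is where Kaggle mounts uploaded datasets:
import os

# Walk the input directory and show how many files each folder contains
for dirname, _, filenames in os.walk('/kaggle/input'):
    print(dirname, ':', len(filenames), 'files')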
Basic library imports:
import os
import matplotlib.pyplot as plt
import cv2
from PIL import Image, ImageOps
Let’s explore the images’ dimensions. We’ll read each image, get its shape, and get the unique dimensions in our dataset:
shapes = []
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        if filename.endswith('.jpg'):
            shapes.append(cv2.imread(os.path.join(dirname, filename)).shape)

print('Unique shapes at imageset: ', set(shapes))
Here is where I got something I was not expecting to see. This is the output:
Unique shapes at imageset: {(415, 415, 3), (415, 416, 3), (416, 415, 3), (416, 416, 3)}
As you may know, we cannot feed images of different dimensions into a model. Let's normalize them all to a single size (415x415):
def make_square(image, minimum_size=256, fill=(0, 0, 0, 0)):
    # Pad the image with black borders to make it square
    x, y = image.size
    size = max(minimum_size, x, y)
    new_image = Image.new('RGB', (size, size), fill)
    new_image.paste(image, (int((size - x) / 2), int((size - y) / 2)))
    return new_image
counter = 0
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        if filename.endswith('.jpg'):
            counter += 1
            new_image = Image.open(os.path.join(dirname, filename))
            new_image = make_square(new_image)
            new_image = new_image.resize((415, 415))
            new_image.save("/kaggle/working/" + str(counter) + "-roboflow.jpg")
            if counter == 150:
                break
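As an optional sanity check, you can confirm that every saved copy now has a single, uniform shape (this assumes the resized images were written to /kaggle/working, as in the loop above):
# Verify that all normalized images now share the same shape
saved_shapes = set()
for filename in os.listdir('/kaggle/working'):
    if filename.endswith('.jpg'):
        saved_shapes.add(cv2.imread(os.path.join('/kaggle/working', filename)).shape)
print('Unique shapes after normalization: ', saved_shapes)
If everything went well, the only shape reported should be (415, 415, 3).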
The convenient directory to save files in Kaggle, and to get them back as output, is /kaggle/working.
Before downloading the normalized dataset, run this cell to zip all the images; that way, you'll find the final archive more easily:
!zip -r /kaggle/working/output.zip /kaggle/working/
!rm -rf /kaggle/working/*.jpg
Now you can look for the output.zip file in the directory explorer on the right-hand side:
Normalization of the Human Face Dataset
The approach to this task is slightly different from the one we took for the Roboflow dataset above. This time, the dataset contains 4,000+ images, all of them with widely varying dimensions. Go to the dataset link and launch a Kaggle Notebook from there. We'll use only the first 150 images.
Basic imports:
import os
import matplotlib.pyplot as plt
import cv2
from PIL import Image
If you want to explore the dataset:
counter = 0
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        if filename.endswith('.jpg'):
            counter += 1

print('Images in directory: ', counter)
%matplotlib inline
plt.figure()
image = cv2.imread('/kaggle/input/human-faces/Humans/1 (719).jpg')
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
plt.imshow(image)
plt.show()
shapes = []
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        if filename.endswith('.jpg'):
            shapes.append(cv2.imread(os.path.join(dirname, filename)).shape)

print('Unique shapes at imageset: ', set(shapes))
This last cell returns a huge variety of dimensions, so normalization is imperative. Let's resize all the images to 415x415, black-padded:
def make_square(image, minimum_size=256, fill=(0, 0, 0, 0)):
    # Pad the image to a square; the padding becomes black once converted to RGB
    x, y = image.size
    size = max(minimum_size, x, y)
    new_image = Image.new('RGBA', (size, size), fill)
    new_image.paste(image, (int((size - x) / 2), int((size - y) / 2)))
    return new_image
counter = 0
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        if filename.endswith('.jpg'):
            counter += 1
            test_image = Image.open(os.path.join(dirname, filename))
            new_image = make_square(test_image)
            new_image = new_image.convert("RGB")
            new_image = new_image.resize((415, 415))
            new_image.save("/kaggle/working/" + str(counter) + "-kaggle.jpg")
            if counter == 150:
                break
To download the dataset:
!zip -r /kaggle/working/output.zip /kaggle/working/
!rm -rf /kaggle/working/*.jpg
Now you will find it easily in the right-hand panel.
Dataset Augmentation
Once you have both datasets normalized, it's time to join them and augment the resulting set. Data augmentation gives us a way to artificially generate more training data from a relatively small dataset. Augmentation is often necessary because any model needs a large amount of data to achieve good results during training. In our case, each source image will yield 16 images: the original plus 15 transformed copies.
Unzip both files on your computer, place all images in the same folder, zip them, launch a new Kaggle Notebook (mine is here), and upload the resulting file.
Next, let's see what we need to do to augment the data. We could cut some corners by using automated services, but we've decided to do everything ourselves to learn more.
Basic imports:
import numpy as np
from numpy import expand_dims
import os
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import cv2
from keras.preprocessing import image
from keras.preprocessing.image import ImageDataGenerator
from PIL import Image
Let's go straight to the augmentation. We'll use the ImageDataGenerator class from Keras, which is widely used in the computer vision community:
def data_augmentation(filename):
    # Generate 15 augmented variants (zoom, brightness, rotation) of the input image
    image_data = []
    image = cv2.imread(filename, 3)
    samples = expand_dims(image, 0)
    datagen1 = ImageDataGenerator(zoom_range=[0.5, 1.2])
    datagen2 = ImageDataGenerator(brightness_range=[0.2, 1.0])
    datagen3 = ImageDataGenerator(rotation_range=20)
    it1 = datagen1.flow(samples, batch_size=1)
    it2 = datagen2.flow(samples, batch_size=1)
    it3 = datagen3.flow(samples, batch_size=1)
    image_data.append(image)  # keep the original image as well
    for i in range(5):
        batch1 = it1.next()
        batch2 = it2.next()
        batch3 = it3.next()
        image1 = batch1[0].astype('uint8')
        image2 = batch2[0].astype('uint8')
        image3 = batch3[0].astype('uint8')
        image_data.append(image1)
        image_data.append(image2)
        image_data.append(image3)
    return image_data
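To eyeball what the function produces, you can plot the 16 resulting images for a single file. The path below is just a placeholder; point it at any .jpg from your merged dataset:
# Visualize the original image plus its 15 augmented variants
sample_path = '/kaggle/input/1-roboflow.jpg'  # placeholder path; use any image from your dataset
augmented = data_augmentation(sample_path)

plt.figure(figsize=(12, 12))
for i in range(len(augmented)):
    plt.subplot(4, 4, i + 1)
    plt.imshow(cv2.cvtColor(augmented[i], cv2.COLOR_BGR2RGB))
    plt.axis('off')
plt.show()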
To apply it, let's iterate over every image in the /kaggle/input directory and save all the results in /kaggle/working for later download:
counter = 0
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        if filename.endswith('.jpg'):
            print(os.path.join(dirname, filename))
            result = data_augmentation(os.path.join(dirname, filename))
            counter += 1
            # Save the original and its 15 augmented variants under unique names
            for i in range(16):
                cv2.imwrite('/kaggle/working/' + str(counter) + '-' + str(i) + '.jpg', result[i])
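Before zipping, it doesn't hurt to confirm how many images were actually written; with the loop above, each input image should produce 16 files:
# Count the augmented images saved to the working directory
output_files = [f for f in os.listdir('/kaggle/working') if f.endswith('.jpg')]
print('Augmented images written: ', len(output_files))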
Again, before the download, just run the next two lines to make the files easier to find in the right-hand panel:
!zip -r /kaggle/working/output.zip /kaggle/working/
!rm -rf /kaggle/working/*.jpg
Now you can download the output.zip file.
Next Step
In the next article, we’ll see how to properly label the resulting images in order to train a YOLO model. Stay tuned!
Sergio Virahonda grew up in Venezuela, where he obtained a bachelor's degree in Telecommunications Engineering. He moved abroad 4 years ago and has since been focused on building a meaningful data science career. He's currently living in Argentina, writing code as a freelance developer.