What is Image Classification and Recognition?
AI Code for Business
Image classification and recognition is a computer vision technique that allows machines to identify and categorize elements of images and videos. Image recognition models are trained to take an image as input and output one or more labels describing the image. This core function is a foundational component in solving many computer-vision-based machine learning business problems.
Why is Image Classification and Recognition Important?
Image classification and recognition is one of the most foundational and widely applicable computer vision tasks for many technology companies. From building better content to competing more effectively in the industrial market, there are several avenues that large and small businesses can follow to benefit from image classification and recognition. Its methodology of detecting patterns and extracting useful features has become a building block of many more complex computer vision techniques (e.g., object detection and image segmentation). It also has numerous standalone applications, which makes it a broad and highly generalizable artificial intelligence task. The benefits of image classification include, but are not limited to, automated image organization, user-generated content moderation, enhanced visual search, automated photo and video tagging, assistance with security threats, support for hiring processes, and identification of variables in images.
Benefits of Image Classification and Recognition in Businesses
Automation
Image processing and machine vision systems play an increasingly important role in quality assurance in an ever-growing number of industries. Image recognition uses images or image data to identify objects within an image, and it can recognize, annotate, label, and organize image content. Small and large businesses can use it to recognize images of products, build content lists for eCommerce sites, solve gender classification problems, support digital photography, and read signs such as restaurant menu items. Small business owners can use image recognition software to automatically populate forms and documents with information about their store, products, and employees without typing in the data manually.
Security Threats
As image recognition technology advances, it continues to improve and become more useful. Of all the AI-based solutions available today, image recognition software has some of the best offerings in this industry. Image algorithms that use machine learning are used to gain a better understanding of the cybersecurity threats faced by customers. A simple screenshot can reveal a lot about a customer’s situation and whether they need further assistance; image recognition software has made this process even easier. The old idiom "a picture is worth a thousand words" has in today's world evolved into "an image is worth a million data points" (to coin a phrase).
Challenges for Image Classification and Recognition
Different perspectives of one person caught on camera can confuse image classification solutions. For example, the system can list one person as several individuals. It is important for businesses to provide solutions that consolidate all media into a single-person media profile. Image classification tasks can be almost instantaneous or take hours to process, depending on how the models are constructed and deployed and on the number of categories that need to be referenced in the result. Maintaining processing speeds fast enough for time-sensitive investigations is an important challenge.
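Before tuning for speed, it helps to measure it. The following is a minimal sketch, not taken from the notebook, of timing a prediction call. `median_latency` and `dummy_predict` are hypothetical names; the dummy function only stands in for a real model call such as a classifier's predict method.

```python
import time

def median_latency(predict_fn, sample, repeats=20):
    """Median wall-clock seconds per call of predict_fn(sample)."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        predict_fn(sample)
        timings.append(time.perf_counter() - start)
    timings.sort()
    return timings[len(timings) // 2]

# Stand-in for a real model call (assumption for illustration only).
def dummy_predict(pixels):
    return sum(pixels) / len(pixels)

latency = median_latency(dummy_predict, list(range(48 * 48)))
print(f"median latency: {latency * 1000:.3f} ms")
```

The median is used rather than the mean so that a single slow call (for example, a first call that warms up caches) does not dominate the figure.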
Working of Image Classification and Recognition
The following example covers a gender-based image classification problem. To construct a model that recognizes persons robustly, we use a readily available image dataset consisting of 27,305 facial images that are labeled by age, gender, and ethnicity.
Follow the steps below and run the code in the Colab notebook linked here. (To run the code, click the round ▶️ button next to each cell.)
Cell 1: Imports the Python libraries needed, then defines and configures the Matplotlib style that will be used for visualization.
!pip install --upgrade --no-cache-dir gdown
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import keras
from keras import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report
from PIL import Image
import matplotlib.image as mpg
import seaborn as sns
plt.style.use('ggplot')
plt.rcParams['font.family'] = 'sans-serif'
plt.rcParams['font.serif'] = 'Ubuntu'
plt.rcParams['font.monospace'] = 'Ubuntu Mono'
plt.rcParams['font.size'] = 14
plt.rcParams['axes.labelsize'] = 12
plt.rcParams['axes.labelweight'] = 'bold'
plt.rcParams['axes.titlesize'] = 12
plt.rcParams['xtick.labelsize'] = 12
plt.rcParams['ytick.labelsize'] = 12
plt.rcParams['legend.fontsize'] = 12
plt.rcParams['figure.titlesize'] = 12
plt.rcParams['image.cmap'] = 'jet'
plt.rcParams['image.interpolation'] = 'none'
plt.rcParams['figure.figsize'] = (10, 10)
plt.rcParams['axes.grid']=True
plt.rcParams['lines.linewidth'] = 2
plt.rcParams['lines.markersize'] = 8
colors = ['xkcd:pale orange', 'xkcd:sea blue', 'xkcd:pale red', 'xkcd:sage green', 'xkcd:terra cotta',
          'xkcd:dull purple', 'xkcd:teal', 'xkcd:goldenrod', 'xkcd:cadet blue', 'xkcd:scarlet']
bbox_props = dict(boxstyle="round,pad=0.3", fc=colors[0], alpha=.5)
Cell 2: Downloads a copy of the dataset, unzips it, and loads it into a pandas dataframe. The rows are then shuffled and the first 1,000 are kept.
!gdown --id 1I0uRE2zK89XT4PqPWHdJNqSTkk7G468J
!unzip age_gender.zip
data = pd.read_csv('age_gender.csv')
# Shuffle the rows; .loc slicing is inclusive, so 0:999 keeps exactly 1,000 rows
data = data.sample(frac=1).reset_index().loc[0:999]
data
Cell 3: This step extracts the pixel data for each image, transforms it into the format and structure needed to train the image classifier model, and loads it into the NEW_PIXELS list variable.
def Convert(string):
    # Split the space-separated pixel string into a list of strings
    li = list(string.split(" "))
    return li
PIXELS=[]
for i in range(len(data)):
    PIXELS.append(Convert(data.pixels[i]))
NEW_PIXELS = []
for p in range(len(PIXELS)):
    new_pixels = []
    for q in range(len(PIXELS[p])):
        new_pixels.append(int(PIXELS[p][q]))
    # Reshape the flat pixel list into a 48x48 single-channel image
    NEW_PIXELS.append(np.array(new_pixels).reshape((48,48,1)))
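The nested loops above can be collapsed into one NumPy conversion per row. This is a minimal sketch rather than part of the original notebook: `parse_pixels` is a hypothetical helper name and the sample string is made up. It assumes each `pixels` entry is a space-separated string of 48 × 48 = 2,304 integers, as Cell 3 implies.

```python
import numpy as np

def parse_pixels(pixel_string, size=48):
    """Turn a space-separated pixel string into a (size, size, 1) integer array."""
    values = np.array(pixel_string.split(), dtype=np.int64)
    if values.size != size * size:
        raise ValueError(f"expected {size * size} pixel values, got {values.size}")
    return values.reshape((size, size, 1))

# Made-up example: a 48x48 image whose pixel values cycle through 0..255.
sample = " ".join(str(v % 256) for v in range(48 * 48))
image = parse_pixels(sample)
print(image.shape)
```

In the notebook, something like `NEW_PIXELS = [parse_pixels(s) for s in data.pixels]` could then stand in for both loops; the explicit length check makes a malformed row fail loudly instead of producing a confusing reshape error.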
Cell 4: Displays a count to show that there is an acceptable balance of Male and Female gender images in the data.
images = np.array(NEW_PIXELS)
data = data.drop(columns=['index','img_name'])
sns.countplot(x=data.gender,palette='plasma')  # recent seaborn versions require the keyword x=
plt.xticks([0,1],['Male','Female'])
plt.grid(True)
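A count plot shows the balance visually; the same check can be expressed as numbers. The sketch below is an illustration under assumptions: `class_shares` is a hypothetical helper, the labels are made up, and it assumes labels are the integers 0 (Male) and 1 (Female) as in the dataset's `gender` column.

```python
import numpy as np

def class_shares(labels, n_classes=2):
    """Return the fraction of samples in each class (labels must be 0..n_classes-1)."""
    counts = np.bincount(np.asarray(labels, dtype=np.int64), minlength=n_classes)
    return counts / counts.sum()

# Made-up labels for illustration: 0 = Male, 1 = Female.
example_labels = [0, 1, 1, 0, 0, 1, 0, 0]
shares = class_shares(example_labels)
print({name: float(share) for name, share in zip(["Male", "Female"], shares)})
```

In the notebook this could be called as `class_shares(data.gender.to_numpy())`. A heavily skewed split would make plain accuracy a misleading metric, which is why the balance is worth checking before training.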
Cell 5: Grabs 4 random images from the data and displays them. Label 0 is for Male and 1 for Female. Feel free to run the cell multiple times.
data.pixels = NEW_PIXELS
for i in range(1,5):
    # Pick a random row index from the sampled data
    J = np.random.randint(len(data))
    plt.subplot(2,2,i)
    plt.title('Label : '+ str(data.gender[J]),fontsize=20)
    plt.imshow(np.array(data.pixels[J]).reshape((48,48)),cmap='gray')
plt.tight_layout()
Cell 6: Defines the machine learning model. We use a convolutional neural network (CNN, or ConvNet), as it works well for image classification. The model is trained on 85% of the data, with the remaining 15% used to test and validate the model. We train it for 20 epochs, as set by the epochs argument in the code.
classifier = Sequential()
size = 48
classifier.add(Conv2D(64, (2, 2), input_shape = (size,size, 1), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))
classifier.add(Conv2D(64, (2, 2), input_shape = (size,size, 1), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))
classifier.add(Flatten())
classifier.add(Dense(units = 1, activation = 'sigmoid'))
labels = np.array(data.gender)
train_images, test_images, train_labels, test_labels = train_test_split(images, labels, test_size=0.15)
classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
history = classifier.fit(train_images, train_labels, epochs=20, batch_size=10,
validation_data=(test_images, test_labels))
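To see how the layers in Cell 6 shrink the image, the output sizes can be worked out by hand. The sketch below assumes the Keras defaults used in that cell: 'valid' padding, a convolution stride of 1, and a pooling stride equal to the pool size. `conv_out` and `pool_out` are hypothetical helper names.

```python
def conv_out(size, kernel, stride=1):
    """Output width of a convolution with 'valid' padding."""
    return (size - kernel) // stride + 1

def pool_out(size, pool):
    """Output width of max pooling with 'valid' padding and stride equal to the pool size."""
    return (size - pool) // pool + 1

size = 48
size = conv_out(size, 2)   # first Conv2D, 2x2 kernel: 48 -> 47
size = pool_out(size, 2)   # first MaxPooling2D:      47 -> 23
size = conv_out(size, 2)   # second Conv2D:           23 -> 22
size = pool_out(size, 2)   # second MaxPooling2D:     22 -> 11
flattened = size * size * 64  # 64 feature maps at each of the 11x11 locations
print(size, flattened)  # 11 7744
```

The final Dense layer therefore sees 7,744 inputs per image. Calling `classifier.summary()` in the notebook should report the same shapes.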
Cell 7: Displays the accuracy achieved by the model during training and validation at each training epoch. The model as constructed and trained achieves a validation accuracy of ~80%, meaning that, given a random facial image, it classifies the gender correctly about 8 out of 10 times. (Note that because the data is randomized, the numbers you get when you run this exercise will differ but should be close.)
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
epochs_range = np.arange(1,len(acc)+1,1)
plt.plot(epochs_range,acc,color='navy', label = 'Accuracy')
plt.plot(epochs_range,val_acc,color='red',label='Validation Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend(fontsize=15)
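Besides reading the curves by eye, the epoch with the highest validation accuracy can be picked out programmatically. This is a sketch under assumptions: `best_epoch` is a hypothetical helper, and the numbers are made up to mimic the shape of Keras' `history.history` dictionary; they are not results from the notebook.

```python
def best_epoch(history_dict, metric="val_accuracy"):
    """Return (1-based epoch, value) where the given metric is highest."""
    values = history_dict[metric]
    index = max(range(len(values)), key=values.__getitem__)
    return index + 1, values[index]

# Made-up numbers shaped like history.history; not results from the notebook.
example_history = {
    "accuracy":     [0.61, 0.72, 0.80, 0.86, 0.91],
    "val_accuracy": [0.60, 0.70, 0.78, 0.77, 0.76],
}
epoch, value = best_epoch(example_history)
print(f"best validation accuracy {value:.2f} at epoch {epoch}")
```

When training accuracy keeps climbing while validation accuracy flattens or drops, as in the made-up numbers above, the model is starting to overfit and further epochs are unlikely to help.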
Cell 8: This cell provides a visualization of the actual labels (male=0, female=1) versus the model's predictions on the test images. While the actual labels are a clear 0 or 1, the model's predictions fall in between, with predicted probabilities lower than 0.5 labeled male and higher ones labeled female.
plt.plot(classifier.predict(test_images),'.',color='red',label='Predicted Probability')
plt.plot(test_labels,'.',color='navy',label='Actual Labels')
plt.xlabel('Instance Number')
plt.ylabel('Probability')
plt.legend()
Cell 9: Visualizes the model's predictions on the test data as a confusion matrix, with actual labels on the rows and predicted labels on the columns. It shows how many the model got right (male predicted as male and female predicted as female) and how many it got wrong (male predicted as female and female predicted as male).
predictions = classifier.predict(test_images)
# Threshold the predicted probabilities at 0.5: 1 = Female, 0 = Male
decision = []
for p in predictions:
    if p>=0.5:
        decision.append(1)
    else:
        decision.append(0)
# confusion_matrix expects (y_true, y_pred): rows are actual labels, columns are predictions
sns.heatmap(confusion_matrix(test_labels,decision),cmap='plasma',annot=True,fmt='d',annot_kws={"size": 32})
plt.xticks([0.50,1.50],['Male','Female'],fontsize=20)
plt.yticks([0.50,1.50],['Male','Female'],fontsize=20)
plt.xlabel('Predicted')
plt.ylabel('Actual')
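The four cells of a confusion matrix can also be condensed into summary metrics. The sketch below is illustrative only: `binary_metrics` is a hypothetical helper, the probabilities and labels are made up, and Female (label 1) is treated as the positive class by assumption.

```python
import numpy as np

def binary_metrics(probabilities, true_labels, threshold=0.5):
    """Threshold probabilities and compute confusion-matrix counts and basic metrics."""
    probs = np.asarray(probabilities, dtype=float).ravel()
    truth = np.asarray(true_labels).ravel().astype(int)
    decided = (probs >= threshold).astype(int)  # vectorized version of the loop above
    tp = int(np.sum((decided == 1) & (truth == 1)))
    tn = int(np.sum((decided == 0) & (truth == 0)))
    fp = int(np.sum((decided == 1) & (truth == 0)))
    fn = int(np.sum((decided == 0) & (truth == 1)))
    return {
        "tp": tp, "tn": tn, "fp": fp, "fn": fn,
        "accuracy": (tp + tn) / len(truth),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }

# Made-up predictions and labels for illustration.
example_probs = [0.9, 0.2, 0.6, 0.4, 0.8, 0.1]
example_truth = [1, 0, 0, 1, 1, 0]
metrics = binary_metrics(example_probs, example_truth)
print(metrics)
```

In the notebook the call would be along the lines of `binary_metrics(classifier.predict(test_images), test_labels)`; the `classification_report` function imported in Cell 1 offers a fuller version of the same summary.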
Cell 10: Downloads a sample image to test the model.
!gdown --id 13V9GPEdi72_3f08h7qujEYubBb8IUI7a
Cell 11: Loads and displays the default image. (To try out the model on a facial image of your choice, upload it. Then, on the first line of the cell, change the filename between the single quotes from test.jpg to the filename of the uploaded image and click run.)
filename = 'test.jpg'
def rgb2gray(rgb):
    # Weighted sum of the R, G, B channels; any alpha channel is ignored
    return np.dot(rgb[...,:3], [0.2989, 0.5870, 0.1140])
def ImagePreProcessing(image_path):
    # Convert to RGB so grayscale or palette images also have three channels
    img = Image.open(image_path).convert('RGB')
    img = img.resize((48,48))
    img_array = np.asarray(img)
    baw_img = rgb2gray(img_array).astype(int)
    final_img = baw_img.reshape((48,48))
    return final_img
plt.imshow(ImagePreProcessing(filename),cmap='gray')
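The grayscale conversion is easiest to understand on a tiny image. The sketch below reuses the weights from `rgb2gray` on a made-up 2×2 image. It rounds to the nearest integer instead of truncating as `astype(int)` does in the cell above; that rounding is a substitution made here for illustration, not the notebook's behavior. `to_gray` is a hypothetical name.

```python
import numpy as np

LUMA_WEIGHTS = np.array([0.2989, 0.5870, 0.1140])  # same weights as rgb2gray above

def to_gray(rgb_array):
    """Weighted sum of the R, G, B channels; any alpha channel is ignored."""
    gray = rgb_array[..., :3] @ LUMA_WEIGHTS
    return np.rint(gray).astype(int)  # round to nearest instead of truncating

# Made-up 2x2 RGB image: red, green, blue, and white pixels.
tiny = np.array([[[255, 0, 0], [0, 255, 0]],
                 [[0, 0, 255], [255, 255, 255]]], dtype=np.uint8)
gray = to_gray(tiny)
print(gray)
```

Green contributes most to perceived brightness and blue least, which is why the weights are unequal. Because the three weights sum to 0.9999, a pure white pixel evaluates to about 254.97, so truncation would give 254 where rounding gives 255.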
Cell 12: Runs the model prediction on the image and outputs whether the result is male or female, along with the probability of the prediction.
image_test = ImagePreProcessing(filename)
# Add batch and channel axes so the input matches the model's (48, 48, 1) input shape
prob = classifier.predict(image_test.reshape((1,48,48,1)))[0][0]
if prob < 0.5:
    # The model outputs the probability of label 1 (female); invert it for male
    prob = 1 - prob
    print("Classifier model predicts male with probability of %.2f" %(prob*100) + '%')
else:
    print("Classifier model predicts female with probability of %.2f" %(prob*100) + '%')
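The decision logic in Cell 12 can be wrapped in a small function so it is reusable and testable on its own. This is a sketch under assumptions: `describe_prediction` is a hypothetical name, and the value 0.27 is a made-up model output, not a result from the notebook.

```python
def describe_prediction(prob_female, threshold=0.5):
    """Map the sigmoid output (probability of label 1, female) to a label and confidence."""
    if not 0.0 <= prob_female <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if prob_female < threshold:
        # Probability of male is the complement of the model output
        return "male", 1.0 - prob_female
    return "female", prob_female

# 0.27 is a made-up model output for illustration.
label, confidence = describe_prediction(0.27)
print(f"Classifier model predicts {label} with probability of {confidence * 100:.2f}%")
```

Keeping the threshold as a parameter makes it easy to experiment with stricter cut-offs if one kind of error is more costly than the other.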
This simple machine learning classifier model predicts the gender of a facial image. Because the dataset used to train this model also contains age and ethnicity labels, the model can be enhanced further to predict these two characteristics as well.
Conclusion
Ongoing improvements in image classification algorithms and technologies mean that the difficulty and cost of implementation will continue to go down. The goal of this article is to present a quick overview of image classification with code and to engender ideas on how your business can benefit from the ability to classify visual data.
Check out the other articles to see more applications and related code on machine learning. If you need support and would like to find out more, get in touch through the contact link.