# MATLAB: Eigenfaces for Facial Recognition

Use MATLAB to construct the following code.

Eigenfaces for Facial Recognition

Facial recognition is the motivation behind the development of eigenfaces. A facial recognition program decides whether or not there is a face in a provided image and, if there is, possibly whether it is a face the program has seen before.

Eigenfaces is the name given to a set of eigenvectors when they are used in face recognition. These eigenfaces are derived from the covariance matrix of the probability distribution over the space of face images. The eigenfaces themselves form a basis set of all images used to construct the covariance matrix.

In this project you will create a facial recognition system. The program reduces each face image to a vector, then uses Principal Component Analysis (PCA) to find the space of faces. This space is spanned by just a few vectors, which means each face can be defined by just a set of coefficients weighting these vectors.

Eigenfaces Resources

Check out the resources below. These sources even contain code that you can use to complete this MATLAB project.

__https://en.wikipedia.org/wiki/Eigenface__

__https://blog.cordiner.net/2010/12/02/eigenfaces-face-recognition-matlab/__

__https://wellecks.wordpress.com/tag/eigenfaces/__

Load training set images from a .mat file

clear;

close all;

clc;

Load the training set of 2414 images into the MATLAB workspace: Each image in the training set is a 48 px by 42 px image of a face. The training set faces are saved in the MAT file training_set_faces.mat. When you load this data file into the MATLAB workspace, you will see a 3D array called yalefaces where the entry yalefaces(:,:,1) is the first image of the training set, yalefaces(:,:,2) is the second image, etc.

**MATLAB functions**: load

% Your code here. Load the training images from the provided

% MAT file training_set_faces.mat

load('training_set_faces.mat');

Plot one of the images from the training set 3D array to demonstrate you’ve correctly imported the training set images.

**MATLAB functions**: imshow

% Plot one of the images you just loaded into MATLAB

imshow(yalefaces(:,:,1234), 'InitialMagnification', 400);

title('Yale Faces Image #1234 Test Upload');

TIPS IF YOU GET PURE WHITE OR BLACK IMAGES:

This will occur once the image data has been converted to the double-precision data type (double). When converting to double precision, the uint8 grayscale color values ranging from 0 (black) to 255 (white) are scaled down to the range 0 (black) to 1 (white). The image plotting function imshow is expecting grayscale values 0-255, so we need to scale back up to this range. Here are two ways to do this.

- imshow: Use the ‘DisplayRange’ option input for imshow. Display range of a grayscale image is specified as a two-element vector of the form [low high]. The imshow function displays the value low (and any value less than low) as black, and it displays the value high (and any value greater than high) as white. Values between low and high are displayed as intermediate shades of gray, using the default number of gray levels. Use the minimum value in the image as black, and the maximum value as white. See the imshow help file for more information.
- imagesc: Use another image-plotting function imagesc and set the colormap to gray. See the imagesc help file for more information.
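As a sketch of the two options above (img is just a placeholder name for any double-valued 48 x 42 image):

```matlab
% Two ways to display a double-valued image whose range may not be 0-255.
img = rand(48, 42) * 255;                 % stand-in data for illustration

% Option 1: imshow with an explicit display range [low high]
imshow(img, [min(img(:)) max(img(:))]);   % or simply imshow(img, [])

% Option 2: imagesc with a gray colormap (scales automatically)
figure;
imagesc(img);
colormap(gray);
axis image off;
```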

Vectorize the images and combine into 2D array

Create a 2D array that contains all the vectorized images in the training set. First, convert each image into a vector simply by concatenating the rows of pixels in the original image, resulting in a single column with 48 × 42 = 2016 elements. Then combine all 2414 training-set image column vectors into a single matrix T, where each column of the matrix is an image. Each row of this matrix T corresponds to a pixel location. The 2D array must be assigned to the variable T. The size of T will be 2016 × 2414.

**MATLAB functions**: reshape (or using a for loop)

% Vectorize training set images into a 2D array called T

% We need to be sure the data type of our T matrix is double for the rest

% of the computations in this code (mean, subtraction, eig, etc.)

T = double(T);
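One way this vectorization step could be written (a sketch, assuming the 3D array is named yalefaces as above):

```matlab
% Sketch: vectorize each 48 x 42 image into a 2016-element column of T.
% Transposing first makes the ROWS of the original image concatenate,
% since MATLAB stores arrays column-major.
[nRows, nCols, nImages] = size(yalefaces);      % 48 x 42 x 2414
T = zeros(nRows*nCols, nImages);
for i = 1:nImages
    img = double(yalefaces(:,:,i))';            % 42 x 48 after transpose
    T(:,i) = img(:);                            % rows concatenated, 2016 x 1
end
```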

Calculate the mean-shifted images

The mean face is found by taking the average pixel value across all images in the training set at each pixel location. So mean-face pixel (2,3) is the average of the values at pixel (2,3) from every image in the training set. The mean face will be a column vector of size 2016 × 1. Remember that the pixel locations correspond to the rows of T.

**MATLAB functions**: mean

% Find the mean image of the training set

Plot the mean face. First you need to reshape the mean face (a column vector) back into a 48 x 42 array. It should look like a face 🙂 Now that we are working with images in the double precision data type, the pixel values have been scaled down to the range 0 – 1. As described earlier, use one of two options to expand the scale back to 0-255 so the face is visible.

**MATLAB functions**: reshape (or using a for loop), imshow

% Mean face plot

Subtract off the mean face column vector from every image in the training set (the columns of T). Call this shifted 2D array shifted_T. shifted_T will be the same size as T, 2016 × 2414.

**MATLAB functions**: -, repmat or for loop

% Subtract off the mean face from every image in the training set
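A minimal sketch of the mean face and the subtraction (mean_face is an illustrative variable name; shifted_T is the name required above):

```matlab
% Sketch: mean face (2016 x 1) and mean-shifted training set
mean_face = mean(T, 2);                           % average across the 2414 columns
shifted_T = T - repmat(mean_face, 1, size(T, 2)); % same size as T, 2016 x 2414
% On R2016b+ implicit expansion also works: shifted_T = T - mean_face;

% Mean face plot (reshape/transpose undoes the row-wise vectorization)
imshow(reshape(mean_face, 42, 48)', []);
title('Mean face');
```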

Calculate covariance of shifted_T and obtain eigenfaces

MATLAB covariance: C = cov(A). If A is a matrix whose columns represent random variables and whose rows represent observations, C is the covariance matrix with the corresponding column variances along the diagonal.

For our application, the random variables are each pixel location and the observations are the 2414 training images. We have 2016 random variables and 2414 observations of each random variable.

Why the covariance matrix? As Wikipedia says on its Eigenface page: “Informally, eigenfaces can be considered a set of ‘standardized face ingredients’, derived from statistical analysis of many pictures of faces.” The statistical analysis we are performing is called principal component analysis (PCA). Wikipedia again: “PCA is mathematically defined as an orthogonal linear transformation that transforms the data to a new coordinate system such that the greatest variance by some projection of the data comes to lie on the first coordinate (called the first principal component), the second greatest variance on the second coordinate, and so on.” The covariance matrix contains these variances. Read more about principal component analysis if you’re interested: __https://en.wikipedia.org/wiki/Principal_component_analysis__

Our random variables are the pixel values, which are organized in rows. So that means we need to take the transpose of shifted_T when we compute the covariance.

S = cov(shifted_T');

The size of S is 2016 x 2016. Each covariance matrix entry represents the joint variability of two pixels; e.g. S(2000,40) represents the joint variability of the pixel at position 2000 in the vectorized 48 x 42 px image and the pixel at position 40.

Obtain the eigenvalues & eigenvectors of S. In this facial recognition application, the eigenvectors are called eigenfaces. The result will be a diagonal matrix containing the eigenvalues and a 2016 x 2016 array containing the corresponding 2016 eigenvectors in columns.

% Find the eigenfaces and associated eigenvalues

Sort the eigenvalues in descending order and then sort their corresponding eigenvectors in that same order. **EIGENVALUES AND EIGENVECTORS COME IN PAIRS AND MUST STAY TOGETHER, I.E. SORT IN THE SAME ORDER.** MATLAB does not consistently return the eigenvalues in ascending order, so you need to write code below that can handle MATLAB returning them in either ascending or descending order. I suggest turning the eigenvalue diagonal matrix into a vector. Then use sort to put the eigenvalues into descending order (largest eigenvalue first). When you call sort, output both the sorted eigenvalues and the indices that tell you how they were sorted. Use these indices to sort the eigenvectors (eigenfaces) into the same order. We need to keep the eigenvalues and their corresponding eigenvectors together.

**MATLAB functions**: diag, sort

% Sort eigenvalues and eigenfaces from largest to smallest (descending order)
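One way this decomposition-and-sort could look (V, D, lambda, and idx are illustrative names):

```matlab
% Sketch: eigen-decomposition of S, then a sort that keeps pairs together
[V, D] = eig(S);                      % columns of V are eigenvectors (eigenfaces)
lambda = diag(D);                     % eigenvalues as a vector
[lambda, idx] = sort(lambda, 'descend');
V = V(:, idx);                        % reorder eigenvector columns to match
```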

Plot the first 9 Eigenfaces

Plot the eigenfaces associated with the largest 9 eigenvalues. You first need to reshape the columns corresponding to the first 9 eigenvalues into 48 x 42 arrays for plotting.

**MATLAB functions**: reshape or for loop, imshow utilizing the ‘DisplayRange’ option because the pixel values in the eigenfaces are so small after converting to double

Try plotting the 9 eigenfaces in a 3×3 figure array using the function subplot.

% Plot the first 9 eigenfaces associated with the largest 9 eigenvalues
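Assuming the sorted eigenvector matrix is called V (an illustrative name from the previous step), the 3×3 figure might be sketched as:

```matlab
% Sketch: plot the 9 eigenfaces with the largest eigenvalues
figure;
for i = 1:9
    subplot(3, 3, i);
    face = reshape(V(:,i), 42, 48)';   % undo the row-wise vectorization
    imshow(face, []);                  % [] maps min -> black, max -> white
    title(sprintf('Eigenface %d', i));
end
```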

Evaluate the number of eigenfaces needed to represent 95% of the total variance

How many eigenfaces do we need to represent 95% of the variance in the data?

BIG HINT: Use Wikipedia as a resource here __https://en.wikipedia.org/wiki/Eigenface#Practical_implementation__

The number of principal components k is chosen by setting a threshold on the total variance.

The total variance is v = λ1 + λ2 + ... + λn, where n = the number of eigenvalues (equal to the number of pixels in each image) and λi is the i-th largest eigenvalue. Your task is to find the smallest k that satisfies

(λ1 + λ2 + ... + λk) / v ≥ 0.95

% Find the number of eigenfaces that represent 95% of the data in the

% training set
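With the eigenvalues sorted descending in a vector (here called lambda, an illustrative name), the threshold test is short:

```matlab
% Sketch: smallest k capturing at least 95% of the total variance
cum_var = cumsum(lambda) / sum(lambda);   % fraction of variance in first i modes
k = find(cum_var >= 0.95, 1);             % first index crossing the threshold
```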

Facial Recognition

Only consider the eigenfaces that represent 95% of the variance by reducing the size of the eigenvectors/eigenfaces matrix. Only include the eigenvectors that you need according to this threshold. The result will be a 2D array of size 2016 x k, where k was found in the previous section.

% Only use eigenfaces that represent 95% of the data by

% reducing the size of the eigenfaces matrix.

Project the training images onto the eigenfaces subspace by taking the dot product of each eigenface (eigenvector) with each mean-subtracted image. The eigenfaces and mean-subtracted training images are vectors (row or column) with 48 x 42 = 2016 elements. The result of each dot product is a scalar (1 x 1 array) representing how much of the input image is “in the direction” of that eigenface. In other words, the dot product is a measure of how similar the input face is to the particular eigenface.

The more similar the input image is to a particular eigenface, the larger the contribution from this eigenface to the reconstructed image when using the eigenfaces as a basis. This is similar to a Fourier coefficient and reconstructing signals from Fourier coefficients and sines and cosines (the basis functions for Fourier series).

Store all these projections for all the training images into a 2D array. The number of elements in each column will equal the number of eigenfaces you are using (k). There will be 2414 columns for the 2414 training set images.

**MATLAB functions**: for loop or matrix multiplication

% Project the training images onto the eigenfaces subspace.
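Assuming V holds the sorted eigenvectors and shifted_T the mean-subtracted images (as above), the truncation and projection can be done with a single matrix multiplication:

```matlab
% Sketch: keep the first k eigenfaces and project the whole training set
eigenfaces = V(:, 1:k);                 % 2016 x k
weights = eigenfaces' * shifted_T;      % k x 2414; column j holds image j's coefficients
```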

So now we have the coefficients (weights) to reconstruct all of the training images from a linear combination of the eigenfaces with the appropriate weights, just like a Fourier series. These coefficient vectors can be thought of as a type of compressed representation of the input image. Instead of 2016 numbers to describe an image (grayscale values at all the pixels), we only need k numbers and the eigenfaces to describe any image.

Reconstruct one of the test images

Compare one original image from the yalefaces 3D array to its reconstruction using the weights you just found and the eigenfaces. A reconstructed image is the sum of the mean face (2016 x 1 column vector) and the k weights (or coefficients) multiplied by their corresponding eigenfaces. Just like a Fourier series!

Plot both the original image from the yalefaces 3D array and its reconstruction from the weights and the eigenfaces. You will need to reshape the vectorized reconstructed image to a 2D array before plotting.

**MATLAB functions**: matrix multiplication or for loop, reshape, imshow

% Visually compare the original and reconstructed images by plotting
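A reconstruction sketch (mean_face, eigenfaces, and weights are the illustrative names from the earlier sketches; the image index is arbitrary):

```matlab
% Sketch: reconstruct training image j as mean face + weighted eigenface sum
j = 1234;                                            % any training image index
recon = mean_face + eigenfaces * weights(:, j);      % 2016 x 1
figure;
subplot(1, 2, 1); imshow(yalefaces(:,:,j), []);        title('Original');
subplot(1, 2, 2); imshow(reshape(recon, 42, 48)', []); title('Reconstruction');
```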

Test facial recognition using a face from the training set; it should be a perfect match!

First test the similarity score and facial recognition ability of your code using one of the images from the training set. This image will have a similarity score of 1 and should be a perfect match to one of the images in the training set (itself).

% Set the input image variable (your choice of what variable to use) to one of the training images.

Calculate the similarity of the ‘test image’ to each training image by taking the dot product of the mean subtracted input image with each eigenface/eigenvector.

In general: First vectorize the input image. Then subtract off the mean image vector. Then take the dot product of the test image (resized to a vector) with each eigenface. The result will be a vector of a length equal to the number of eigenfaces that you are using (k).

% Vectorize if not already done so, subtract off mean from test image if not already done

% so. Find the mean-subtracted image’s eigenfaces coefficients.

Plot the weights of the input image on the eigenfaces using a stem plot. In other words, plot the entries of the vector you just found above.

**MATLAB function**: stem

% Plot of eigenfaces coefficients for the input image

Assign a similarity score to the input image by comparing the coefficients/weights associated with each eigenface found for the input image and the training set images. To perform facial recognition, the similarity score is calculated between an input face image and each of the training images. The matched face is the one with the highest similarity, and the magnitude of the similarity score indicates the confidence of the match (with a unit value indicating an exact match).

Use a similarity score based on the inverse Euclidean distance, defined as

s(n) = 1 / (1 + || w(n) − w_input ||)

where w(n) is the weight/coefficient vector of training image n and w_input is the coefficient vector for the input image. The index n varies from 1 to the number of images in the training set (2414). Note that s(n) = 1 exactly when the two coefficient vectors are identical.

**MATLAB function**: norm

% Calculate the similarity score of the input image with respect to each

% eigenface and store in a vector of length equal to the number

% of training images (2414).
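A sketch of the score loop (weights is the illustrative training-coefficient matrix from earlier; w_input is an illustrative name for the k x 1 coefficient vector of the input image found above):

```matlab
% Sketch: inverse-Euclidean-distance similarity to every training image
n_train = size(weights, 2);                       % 2414
similarity = zeros(1, n_train);
for n = 1:n_train
    similarity(n) = 1 / (1 + norm(weights(:,n) - w_input));
end
```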

Find the image in the training set with the highest similarity score. This is the training set image that is the closest match to the input image. For an input image that is one of the images in the training set, the max similarity score should be 1, indicating a perfect match.

**MATLAB function**: max

% Find closest training set image to the input image according to the

% similarity score

Display the input image and the closest training set image to visually check the accuracy of your facial recognition code. Since this input image came from the training set, you should see the same face twice.

**MATLAB functions**: imshow or imagesc with colormap set to gray

% Visually check the code’s facial recognition ability
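The match-and-check step might look like this (input_img is an illustrative name for the 48 x 42 input image; similarity as sketched above):

```matlab
% Sketch: closest training image by similarity score, plus a visual check
[best_score, best_idx] = max(similarity);
fprintf('Best match: training image %d (score %.3f)\n', best_idx, best_score);
figure;
subplot(1, 2, 1); imshow(input_img, []);               title('Input');
subplot(1, 2, 2); imshow(yalefaces(:,:,best_idx), []); title('Closest match');
```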

Now test facial recognition using a non-human face

Second, test the image of a mandrill baboon (mandrill.bmp). This is a face but not a human face. You will copy and paste a lot of code from the previous section because you will be completing many of the same tasks, just on a different image.

First import the mandrill.bmp file into MATLAB. Convert the image to the double data type. Resize the image to be the same size as the images in the training set: 48 x 42 px.

**MATLAB functions**: imread, im2double, imresize

% Load the mandrill.bmp image. Convert to double. Resize to training set

% image size.
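A sketch of the import, using the functions listed above (the rgb2gray guard is an assumption, in case the bitmap is stored as RGB):

```matlab
% Sketch: load the mandrill image, convert to double, resize to 48 x 42
mandrill = imread('mandrill.bmp');
if size(mandrill, 3) == 3
    mandrill = rgb2gray(mandrill);        % grayscale, if stored as RGB
end
mandrill = im2double(mandrill);
mandrill = imresize(mandrill, [48 42]);   % match the training image size
```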

Calculate the similarity of the input image to each training image by taking the dot product of the mean subtracted input image with each eigenface/eigenvector.

First subtract the mean image off of the input image. Then take the dot product of the input image (resized to a vector) with each eigenface (eigenvector). The result will be a vector of length equal to the number of eigenfaces that you are using (k).

% Subtract off mean from input image. Find the mean-subtracted image’s

% eigenfaces coefficients.

Plot the weights of the input image on the eigenfaces using a stem plot. In other words, plot the entries of the vector you just found above.

**MATLAB function**: stem

% Plot of eigenfaces coefficients for the input image

Assign a similarity score to the input image by comparing the coefficients/weights associated with each eigenface found for the input image and the training set images. Use the same equation for the similarity score as in the previous section.

**MATLAB function**: norm

% Calculate the similarity score of the input image with respect to each

% eigenface and store in a vector of length equal to the number

% of training images.

Find the image in the training set with the highest similarity score. This is the training set image that is the closest match to the input image.

**MATLAB function**: max

% Find closest training set image to the input image according to the

% similarity score

Display the input image and the closest training set image to visually check the accuracy of your facial recognition code.

**MATLAB functions**: imshow or imagesc with colormap gray

% Visually check the code’s facial recognition ability

Extra Credit

Implement morphing. Given two images, compute an animation that shows one being transformed continuously into the other, using the eigenfaces representation. Also, try transforming an image “away” from another one, to exaggerate differences between two faces. Make a video from your morphed images.
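One hedged sketch of morphing in coefficient space (wA and wB are illustrative names for the k x 1 weight vectors of the two images; mean_face and eigenfaces as in the earlier sketches):

```matlab
% Sketch: morph face A into face B by blending eigenface coefficients
frames = 30;
for t = linspace(0, 1, frames)
    w = (1 - t)*wA + t*wB;                    % linear blend; t > 1 exaggerates B
    frame = mean_face + eigenfaces * w;       % reconstruct the blended face
    imshow(reshape(frame, 42, 48)', []);
    drawnow;                                  % capture with getframe for a video
end
```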