Generating EigenFaces with Mahout SVD to recognize person faces
2013/04/17
In this tutorial, we are going to describe how to generate and use eigenfaces to recognize people's faces.
Eigenfaces are a set of eigenvectors derived from the covariance matrix of the probability distribution of the high-dimensional vector space of possible human faces. They can be used to identify a face in a picture against a database of faces very quickly. In this post, we will not go into the mathematical details, but if you are interested in them, you can read the excellent post Face Recognition using Eigenfaces and Distance Classifiers: A Tutorial on the Onionesque Reality blog.
Requirements
To follow this tutorial, you need the following software installed on your machine:
- Java >= 1.6
- Hadoop
- Mahout
- Maven
You can find installation instructions for these in a previous post, Playing with the Mahout recommendation engine on a Hadoop cluster.
Compiling the code
All the source code, the training sets and the testing sets are in the GitHub repository at https://github.com/fredang/mahout-eigenface-example/
You can fetch the files from this repository by typing:
$ git clone https://github.com/fredang/mahout-eigenface-example.git
This repository is structured as follows:
- src/main/java/com/chimpler/example/eigenface/GenerateCovarianceMatrix.java: code to generate the covariance matrix from the images from the training set
- src/main/java/com/chimpler/example/eigenface/ComputeEigenFaces.java: code to compute the eigenfaces
- src/main/java/com/chimpler/example/eigenface/ComputeDistance.java: code to test the model with the testing set
- src/main/java/com/chimpler/example/eigenface/Helper.java: code used to do some matrix operations and image operations
- images/yalefaces-test: some additional images to add to the yalefaces testing set
- images/cats-train: training set for cat faces (not used in this example)
- images/cats-test: testing set for cat faces (not used in this example)
Once you have fetched the project, you can compile it using Maven:
$ mvn clean package assembly:single
It creates a jar file in the target directory which contains all the dependencies and the compiled classes from the src/main/java directory.
Preparing the data set
You can download the Yale Face Database from this page: http://vision.ucsd.edu/content/yale-face-database
Unzip the file:
$ unzip yalefaces.zip
Now we are going to split these files into two sets, a training set and a testing set:
$ mkdir training-set
$ mv yalefaces/* training-set/
For the testing set, we move the images with the sad facial expression out of the training set:
$ mkdir testing-set
$ mv training-set/*.sad testing-set/
We also add to the testing set two non-face images (a hamburger and a cat) and the face of one person who is not in the training set (Bruce Lee):
$ cp [MAHOUT EIGENFACE EXAMPLE DIRECTORY]/images/yalefaces-test/* testing-set
Training the model
The training is implemented in the class GenerateCovarianceMatrix, which generates the covariance matrix.
The arguments of this class are:
- image width: the images are scaled down to this width so that the computation does not take too much memory
- image height: the images are scaled down to this height
- training directory: directory containing the training face images
- output directory
package com.chimpler.example.eigenface;

import java.awt.image.BufferedImage;
import java.awt.image.DataBufferByte;
import java.awt.image.WritableRaster;
import java.io.File;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import javax.imageio.ImageIO;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.mahout.math.DenseVector;
import org.apache.mahout.math.Vector;
import org.apache.mahout.math.VectorWritable;

public class Helper {

    public static void writeImage(String filename, double[] imagePixels,
            int width, int height) throws Exception {
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_BYTE_GRAY);
        WritableRaster raster = image.getRaster();
        // convert the double array to an int array (fractions are truncated)
        int[] pixels = new int[imagePixels.length];
        for (int i = 0; i < imagePixels.length; i++) {
            pixels[i] = (int) imagePixels[i];
        }
        raster.setPixels(0, 0, width, height, pixels);
        ImageIO.write(image, "gif", new File(filename));
    }

    public static double[] readImagePixels(String imageFileName, int width, int height)
            throws Exception {
        BufferedImage colorImage = ImageIO.read(new File(imageFileName));
        // convert to a greyscale image scaled to width x height
        BufferedImage greyImage = new BufferedImage(width, height, BufferedImage.TYPE_BYTE_GRAY);
        greyImage.getGraphics().drawImage(colorImage, 0, 0, width, height, null);
        byte[] bytePixels = ((DataBufferByte) greyImage.getRaster().getDataBuffer()).getData();
        double[] doublePixels = new double[bytePixels.length];
        for (int i = 0; i < doublePixels.length; i++) {
            // bytes are signed in Java: mask to get a value between 0 and 255
            doublePixels[i] = (double) (bytePixels[i] & 255);
        }
        return doublePixels;
    }

    // subtracts the mean image from every column (image) of the matrix
    public static double[][] computeDifferenceMatrixPixels(double[][] matrixPixels,
            double[] meanColumn) {
        int rowCount = matrixPixels.length;
        int columnCount = matrixPixels[0].length;
        double[][] diffMatrixPixels = new double[rowCount][columnCount];
        for (int i = 0; i < rowCount; i++) {
            for (int j = 0; j < columnCount; j++) {
                diffMatrixPixels[i][j] = matrixPixels[i][j] - meanColumn[i];
            }
        }
        return diffMatrixPixels;
    }

    // subtracts the mean image from a single image
    public static double[] computeDifferencePixels(double[] pixels, double[] meanColumn) {
        int pixelCount = pixels.length;
        double[] diffPixels = new double[pixelCount];
        for (int i = 0; i < pixelCount; i++) {
            diffPixels[i] = pixels[i] - meanColumn[i];
        }
        return diffPixels;
    }

    public static double[][] readMatrixSequenceFile(String fileName) throws Exception {
        Configuration configuration = new Configuration();
        FileSystem fs = FileSystem.get(configuration);
        SequenceFile.Reader matrixReader =
                new SequenceFile.Reader(fs, new Path(fileName), configuration);
        List<double[]> rows = new ArrayList<double[]>();
        IntWritable key = new IntWritable();
        VectorWritable value = new VectorWritable();
        while (matrixReader.next(key, value)) {
            Vector vector = value.get();
            double[] row = new double[vector.size()];
            // copy every element by index (works for dense and sparse vectors)
            for (int i = 0; i < vector.size(); i++) {
                row[i] = vector.get(i);
            }
            rows.add(row);
        }
        matrixReader.close();
        return rows.toArray(new double[rows.size()][]);
    }

    public static void writeMatrixSequenceFile(String matrixSeqFileName,
            double[][] matrix) throws Exception {
        int rowCount = matrix.length;
        int columnCount = matrix[0].length;
        Configuration configuration = new Configuration();
        FileSystem fs = FileSystem.get(configuration);
        SequenceFile.Writer matrixWriter = new SequenceFile.Writer(fs, configuration,
                new Path(matrixSeqFileName), IntWritable.class, VectorWritable.class);
        IntWritable key = new IntWritable();
        VectorWritable value = new VectorWritable();
        double[] doubleValues = new double[columnCount];
        for (int i = 0; i < rowCount; i++) {
            key.set(i);
            for (int j = 0; j < columnCount; j++) {
                doubleValues[j] = matrix[i][j];
            }
            Vector vector = new DenseVector(doubleValues);
            value.set(vector);
            matrixWriter.append(key, value);
        }
        matrixWriter.close();
    }

    // weight of each eigenface = scalar product of the diff image with that eigenface
    public static double[] computeWeights(double[] diffImagePixels, double[][] eigenFaces) {
        int pixelCount = eigenFaces.length;
        int eigenFaceCount = eigenFaces[0].length;
        double[] weights = new double[eigenFaceCount];
        for (int i = 0; i < eigenFaceCount; i++) {
            for (int j = 0; j < pixelCount; j++) {
                weights[i] += diffImagePixels[j] * eigenFaces[j][i];
            }
        }
        return weights;
    }

    public static double[] reconstructImageWithEigenFaces(double[] weights,
            double[][] eigenFaces, double[] meanImagePixels) throws Exception {
        int pixelCount = eigenFaces.length;
        int eigenFaceCount = eigenFaces[0].length;

        // reconstruct the image from the weights and the eigenfaces
        double[] reconstructedPixels = new double[pixelCount];
        for (int i = 0; i < eigenFaceCount; i++) {
            for (int j = 0; j < pixelCount; j++) {
                reconstructedPixels[j] += weights[i] * eigenFaces[j][i];
            }
        }

        // add the mean image
        for (int i = 0; i < pixelCount; i++) {
            reconstructedPixels[i] += meanImagePixels[i];
        }

        // rescale the pixel values to the range 0..255
        double min = Double.MAX_VALUE;
        double max = -Double.MAX_VALUE;
        for (int i = 0; i < reconstructedPixels.length; i++) {
            min = Math.min(min, reconstructedPixels[i]);
            max = Math.max(max, reconstructedPixels[i]);
        }
        double[] normalizedReconstructedPixels = new double[pixelCount];
        for (int i = 0; i < reconstructedPixels.length; i++) {
            normalizedReconstructedPixels[i] =
                    (255.0 * (reconstructedPixels[i] - min)) / (max - min);
        }
        return normalizedReconstructedPixels;
    }

    // root mean square of the pixel differences
    public static double computeImageDistance(double[] pixelImage1, double[] pixelImage2) {
        double distance = 0;
        int pixelCount = pixelImage1.length;
        for (int i = 0; i < pixelCount; i++) {
            double diff = pixelImage1[i] - pixelImage2[i];
            distance += diff * diff;
        }
        return Math.sqrt(distance / pixelCount);
    }

    // lists every file of the directory, sorted by name (the extension filter is disabled)
    public static List<String> listImageFileNames(String directoryName) {
        File directory = new File(directoryName);
        List<String> imageFileNames = new ArrayList<String>();
        for (File imageFile : directory.listFiles()) {
            // if (imageFile.getName().endsWith(".gif")) {
            imageFileNames.add(imageFile.getAbsolutePath());
            // }
        }
        Collections.sort(imageFileNames);
        return imageFileNames;
    }

    public static String getShortFileName(String fullFileName) {
        return new File(fullFileName).getName();
    }
}
package com.chimpler.example.eigenface;

import java.io.File;
import java.util.Collection;
import java.util.List;

public class GenerateCovarianceMatrix {

    // builds the matrix M: one column per image, one row per pixel
    private static double[][] convertImagesToMatrix(Collection<String> imageFileNames,
            int width, int height) throws Exception {
        int columnIndex = 0;
        double[][] pixelMatrix = new double[width * height][imageFileNames.size()];
        for (String fileName : imageFileNames) {
            System.out.println("Reading file " + fileName);
            double[] pixels = Helper.readImagePixels(fileName, width, height);
            for (int i = 0; i < pixels.length; i++) {
                pixelMatrix[i][columnIndex] = pixels[i];
            }
            columnIndex++;
        }
        return pixelMatrix;
    }

    private static double[] computeMeanColumn(double[][] pixelMatrix) {
        int pixelCount = pixelMatrix.length;
        double[] meanColumn = new double[pixelCount];
        int columnCount = pixelMatrix[0].length;
        for (int i = 0; i < pixelCount; i++) {
            double sum = 0;
            for (int j = 0; j < columnCount; j++) {
                sum += pixelMatrix[i][j];
            }
            // round down to a whole grey level (as the original integer division did)
            // so that the mean matches the values stored in mean-image.gif
            meanColumn[i] = Math.floor(sum / columnCount);
        }
        return meanColumn;
    }

    // computes transpose(DM) x DM, a matrix of size imageCount x imageCount
    private static double[][] computeCovarianceMatrix(double[][] diffMatrixPixels) {
        int rowCount = diffMatrixPixels.length;
        int columnCount = diffMatrixPixels[0].length;
        double[][] covarianceMatrix = new double[columnCount][columnCount];
        for (int i = 0; i < columnCount; i++) {
            for (int j = 0; j < columnCount; j++) {
                double sum = 0;
                for (int k = 0; k < rowCount; k++) {
                    sum += diffMatrixPixels[k][i] * diffMatrixPixels[k][j];
                }
                covarianceMatrix[i][j] = sum;
            }
        }
        return covarianceMatrix;
    }

    public static void main(String args[]) throws Exception {
        if (args.length != 4) {
            System.out.println("Arguments: width height trainingDirectory outputDirectory");
            System.exit(1);
        }
        int width = Integer.parseInt(args[0]);
        int height = Integer.parseInt(args[1]);
        String imageDirectory = args[2];
        String outputDirectory = args[3];

        File outputDirectoryFile = new File(outputDirectory);
        if (!outputDirectoryFile.exists()) {
            outputDirectoryFile.mkdir();
        }

        List<String> imageFileNames = Helper.listImageFileNames(imageDirectory);
        System.out.println("Reading " + imageFileNames.size() + " images...");
        double[][] pixelMatrix = convertImagesToMatrix(imageFileNames, width, height);

        double[] meanColumn = computeMeanColumn(pixelMatrix);
        Helper.writeImage(outputDirectory + "/mean-image.gif", meanColumn, width, height);

        double[][] diffMatrixPixels =
                Helper.computeDifferenceMatrixPixels(pixelMatrix, meanColumn);
        Helper.writeMatrixSequenceFile(outputDirectory + "/diffmatrix.seq", diffMatrixPixels);

        double[][] covarianceMatrix = computeCovarianceMatrix(diffMatrixPixels);
        Helper.writeMatrixSequenceFile(outputDirectory + "/covariance.seq", covarianceMatrix);
    }
}
$ java -cp target/mahout-eigenface-example-1.0-jar-with-dependencies.jar com.chimpler.example.eigenface.GenerateCovarianceMatrix 80 60 [TRAINING_SET_DIRECTORY] output
This program:
- reads all the n image files from the training directory
- converts each image to greyscale and scales it down
- creates a matrix M in which each column represents an image. Each column has a length of w x h and each of its elements is a shade of grey with a value between 0 (black) and 255 (white)
- computes the mean image by averaging each pixel over all the images and writes it to output/mean-image.gif
- computes the diff matrix DM by subtracting the mean image from each column of M
- computes the covariance matrix transpose(DM) x DM, which is a matrix of size n x n
- writes the diff matrix DM to output/diffmatrix.seq
- writes the covariance matrix to output/covariance.seq
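The program computes transpose(DM) x DM, of size n x n, rather than the much larger DM x transpose(DM), of size (w x h) x (w x h). If v is an eigenvector of transpose(DM) x DM with eigenvalue lambda, then DM x v is an eigenvector of DM x transpose(DM) with the same eigenvalue, and DM x v is the product that ComputeEigenFaces computes later on. The sketch below checks this property numerically on a made-up 4 x 2 matrix. The class name GramTrickDemo and the toy values are assumptions for illustration only and are not part of the repository.

```java
// Illustrative sketch only: GramTrickDemo and its toy values are hypothetical.
final class GramTrickDemo {

    // Multiplies matrix a (r x k) by vector v (length k).
    static double[] multiply(double[][] a, double[] v) {
        double[] result = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < v.length; j++) {
                result[i] += a[i][j] * v[j];
            }
        }
        return result;
    }

    // Multiplies transpose(a) (k x r) by vector u (length r).
    static double[] multiplyTransposed(double[][] a, double[] u) {
        double[] result = new double[a[0].length];
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < a[0].length; j++) {
                result[j] += a[i][j] * u[i];
            }
        }
        return result;
    }

    // Largest eigenvalue of the symmetric 2 x 2 matrix [[a, b], [b, c]].
    static double largestEigenvalue(double a, double b, double c) {
        double half = (a - c) / 2.0;
        return (a + c) / 2.0 + Math.sqrt(half * half + b * b);
    }

    // Returns max |DM x transpose(DM) x u - lambda x u| where u = DM x v.
    static double maxResidual() {
        // toy "diff matrix": 4 pixels (rows) x 2 images (columns)
        double[][] dm = { {2, 1}, {1, 1}, {0, 3}, {1, -1} };

        // small matrix transpose(DM) x DM = [[a, b], [b, c]]
        double a = 0, b = 0, c = 0;
        for (double[] row : dm) {
            a += row[0] * row[0];
            b += row[0] * row[1];
            c += row[1] * row[1];
        }
        double lambda = largestEigenvalue(a, b, c);
        // eigenvector of the small matrix (this formula is valid because b != 0 here)
        double[] v = { b, lambda - a };

        // candidate eigenvector of the big matrix
        double[] u = multiply(dm, v);
        // DM x transpose(DM) x u, computed without building the 4 x 4 matrix
        double[] bigTimesU = multiply(dm, multiplyTransposed(dm, u));

        double residual = 0;
        for (int i = 0; i < u.length; i++) {
            residual = Math.max(residual, Math.abs(bigTimesU[i] - lambda * u[i]));
        }
        return residual;
    }

    public static void main(String[] args) {
        System.out.println("max residual = " + maxResidual());
    }
}
```

The residual should be zero up to floating-point rounding. With the tutorial's sizes, the small matrix is 150 x 150 while the large one would be 4800 x 4800 for 80 x 60 images.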
Now we need to compute the eigenvectors of the covariance matrix. This can be done using the Mahout Singular Value Decomposition (SVD).
To use it, first copy the file covariance.seq to HDFS:
$ hadoop fs -put output/covariance.seq covariance.seq
Then run the Mahout SVD:
$ mahout svd --input covariance.seq --numRows 150 --numCols 150 --rank 50 --output output
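We set --numRows and --numCols to the size of the covariance matrix (150 x 150) and the rank to 50 (we usually set it to one third of the number of images). The size of the covariance matrix is the number of files read from the training directory, which GenerateCovarianceMatrix prints as "Reading 150 images..."; if your training directory contains extra files, use the number it prints instead.
The computed eigenvectors might include extra eigenvectors with invalid eigenvalues. To fix this, we can run mahout cleansvd: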
$ mahout cleansvd -ci covariance.seq -ei output -o output2
We can now copy the clean eigenvectors to the local filesystem:
$ hadoop fs -get output2/cleanEigenvectors output/cleanEigenvectors
Then execute the Java class ComputeEigenFaces to create the eigenfaces.
package com.chimpler.example.eigenface;

import java.awt.image.BufferedImage;
import java.awt.image.WritableRaster;
import java.io.File;
import java.util.List;

import javax.imageio.ImageIO;

public class ComputeEigenFaces {

    // writes one eigenface (one column of the matrix) as an image, rescaled to 0..255
    private static void writeEigenFaceImage(String filename, double[][] eigenFacePixels,
            int width, int height, int columnIndex) throws Exception {
        BufferedImage eigenFaceImage =
                new BufferedImage(width, height, BufferedImage.TYPE_BYTE_GRAY);
        WritableRaster raster = eigenFaceImage.getRaster();

        double min = Double.MAX_VALUE;
        double max = -Double.MAX_VALUE;
        for (int i = 0; i < eigenFacePixels.length; i++) {
            min = Math.min(min, eigenFacePixels[i][columnIndex]);
            max = Math.max(max, eigenFacePixels[i][columnIndex]);
        }

        int[] pixels = new int[eigenFacePixels.length];
        for (int i = 0; i < eigenFacePixels.length; i++) {
            pixels[i] = (int) (255.0 * (eigenFacePixels[i][columnIndex] - min) / (max - min));
        }

        raster.setPixels(0, 0, width, height, pixels);
        ImageIO.write(eigenFaceImage, "gif", new File(filename));
    }

    // eigenface i = DM x eigenvector i, normalized to unit length
    private static double[][] computeEigenFaces(double[][] diffMatrix,
            double[][] eigenVectors) {
        int pixelCount = diffMatrix.length;
        int imageCount = eigenVectors[0].length;
        int rank = eigenVectors.length;
        double[][] eigenFaces = new double[pixelCount][rank];

        for (int i = 0; i < rank; i++) {
            double sumSquare = 0;
            for (int j = 0; j < pixelCount; j++) {
                for (int k = 0; k < imageCount; k++) {
                    eigenFaces[j][i] += diffMatrix[j][k] * eigenVectors[i][k];
                }
                sumSquare += eigenFaces[j][i] * eigenFaces[j][i];
            }
            double norm = Math.sqrt(sumSquare);
            for (int j = 0; j < pixelCount; j++) {
                eigenFaces[j][i] /= norm;
            }
        }
        return eigenFaces;
    }

    public static void main(String args[]) throws Exception {
        if (args.length != 7) {
            System.out.println("Arguments: eigenVectorFileName diffMatrixFileName meanImageFileName width height trainingDirectory outputDirectory");
            System.exit(1);
        }
        String eigenVectorsFileName = args[0];
        String diffMatrixFileName = args[1];
        String meanImageFilename = args[2];
        int width = Integer.parseInt(args[3]);
        int height = Integer.parseInt(args[4]);
        String trainingDirectory = args[5];
        String outputDirectory = args[6];

        File outputDirectoryFile = new File(outputDirectory);
        if (!outputDirectoryFile.exists()) {
            outputDirectoryFile.mkdir();
        }

        double[] meanPixels = Helper.readImagePixels(meanImageFilename, width, height);
        double[][] eigenVectors = Helper.readMatrixSequenceFile(eigenVectorsFileName);
        double[][] diffMatrix = Helper.readMatrixSequenceFile(diffMatrixFileName);
        double[][] eigenFaces = computeEigenFaces(diffMatrix, eigenVectors);

        int rank = eigenVectors.length;
        for (int i = 0; i < rank; i++) {
            writeEigenFaceImage(outputDirectory + "/eigenface-" + i + ".gif",
                    eigenFaces, width, height, i);
        }
        Helper.writeMatrixSequenceFile(outputDirectory + "/eigenfaces.seq", eigenFaces);

        double minDistance = Double.MAX_VALUE;
        double maxDistance = -Double.MAX_VALUE;
        List<String> imageFileNames = Helper.listImageFileNames(trainingDirectory);
        int imageCount = diffMatrix[0].length;
        int pixelCount = width * height;
        double[][] weightMatrix = new double[imageCount][];

        // reconstruct every training image from its eigenface weights
        for (int i = 0; i < imageCount; i++) {
            double[] diffImagePixels = new double[pixelCount];
            for (int j = 0; j < pixelCount; j++) {
                diffImagePixels[j] = diffMatrix[j][i];
            }
            double[] weights = Helper.computeWeights(diffImagePixels, eigenFaces);
            double[] reconstructedImagePixels =
                    Helper.reconstructImageWithEigenFaces(weights, eigenFaces, meanPixels);
            String shortFileName = Helper.getShortFileName(imageFileNames.get(i));
            Helper.writeImage(outputDirectory + "/ef-" + shortFileName,
                    reconstructedImagePixels, width, height);

            double[] imagePixels =
                    Helper.readImagePixels(imageFileNames.get(i), width, height);
            double distance = Helper.computeImageDistance(imagePixels, reconstructedImagePixels);
            minDistance = Math.min(minDistance, distance);
            maxDistance = Math.max(maxDistance, distance);
            System.out.printf("Reconstructed Image distance for %1$s: %2$f\n",
                    shortFileName, distance);
            weightMatrix[i] = weights;
        }

        Helper.writeMatrixSequenceFile(outputDirectory + "/weights.seq", weightMatrix);
        System.out.println("Min distance = " + minDistance);
        System.out.println("Max distance = " + maxDistance);
    }
}
To run the program:
$ java -cp target/mahout-eigenface-example-1.0-jar-with-dependencies.jar com.chimpler.example.eigenface.ComputeEigenFaces output/cleanEigenvectors output/diffmatrix.seq output/mean-image.gif 80 60 [TRAINING_SET_DIRECTORY] output
It creates the eigenfaces matrix in output/eigenfaces.seq and writes the images representing those eigenfaces (eigenface-0.gif, eigenface-1.gif, and so on) to the output directory.
It also tries to reconstruct the faces of the training set using the eigenfaces. To do that, it computes the weight of each eigenface as the scalar product of the image pixel column (with the mean image subtracted) and the eigenface column, which has been normalized to unit length. Then it sums up the eigenfaces, each multiplied by its weight, and adds the mean image back. You can think of this process as superposing the eigenfaces as layers and giving each of them a different transparency value (which can be negative) to try to reconstruct the original image.
After having reconstructed the image, it computes the distance between the original image and the reconstructed image (the root mean square of the pixel differences, that is, the Euclidean distance divided by the square root of the number of pixels):
Reconstructed Image distance for subject01.centerlight: 37.395691
Reconstructed Image distance for subject01.glasses: 32.350212
Reconstructed Image distance for subject01.happy: 27.559056
Reconstructed Image distance for subject01.leftlight: 28.008936
Reconstructed Image distance for subject01.noglasses: 47.047757
Reconstructed Image distance for subject01.normal: 32.627928
Reconstructed Image distance for subject01.rightlight: 25.465009
Reconstructed Image distance for subject01.sleepy: 23.635308
Reconstructed Image distance for subject01.surprised: 45.947206
Reconstructed Image distance for subject01.wink: 32.132286
[...]
Min distance = 14.470855648264822
Max distance = 47.047756576566904
These distances are quite small relative to the 0 to 255 range of the pixel values, which means that our eigenfaces represent the faces efficiently.
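The projection and reconstruction steps described above can be followed by hand on a toy example. The sketch below uses three pixels and two made-up orthonormal eigenfaces. The class name ProjectionDemo and all the values are assumptions for illustration; unlike Helper.reconstructImageWithEigenFaces, it does not rescale the result to the 0 to 255 range.

```java
import java.util.Arrays;

// Illustrative sketch only: ProjectionDemo and its toy values are hypothetical.
final class ProjectionDemo {

    // Element-wise difference a - b (image minus mean image).
    static double[] subtract(double[] a, double[] b) {
        double[] diff = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            diff[i] = a[i] - b[i];
        }
        return diff;
    }

    // Weight of each eigenface: scalar product of the diff image with that eigenface.
    // eigenFaces is indexed [pixel][eigenface], as in the Helper class.
    static double[] computeWeights(double[] diffPixels, double[][] eigenFaces) {
        int eigenFaceCount = eigenFaces[0].length;
        double[] weights = new double[eigenFaceCount];
        for (int i = 0; i < eigenFaceCount; i++) {
            for (int j = 0; j < eigenFaces.length; j++) {
                weights[i] += diffPixels[j] * eigenFaces[j][i];
            }
        }
        return weights;
    }

    // Mean image plus the weighted sum of the eigenfaces (no rescaling to 0..255).
    static double[] reconstruct(double[] weights, double[][] eigenFaces, double[] mean) {
        double[] pixels = mean.clone();
        for (int i = 0; i < weights.length; i++) {
            for (int j = 0; j < pixels.length; j++) {
                pixels[j] += weights[i] * eigenFaces[j][i];
            }
        }
        return pixels;
    }

    // Root mean square of the pixel differences.
    static double rmsDistance(double[] p, double[] q) {
        double sum = 0;
        for (int i = 0; i < p.length; i++) {
            double diff = p[i] - q[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum / p.length);
    }

    public static void main(String[] args) {
        double[][] eigenFaces = { {1, 0}, {0, 1}, {0, 0} }; // 3 pixels x 2 eigenfaces
        double[] mean = { 10, 10, 10 };
        double[] image = { 12, 7, 11 };

        double[] weights = computeWeights(subtract(image, mean), eigenFaces);
        double[] reconstructed = reconstruct(weights, eigenFaces, mean);

        System.out.println("weights       = " + Arrays.toString(weights));
        System.out.println("reconstructed = " + Arrays.toString(reconstructed));
        System.out.println("distance      = " + rmsDistance(image, reconstructed));
    }
}
```

In this toy case the weights are 2 and -3, and the reconstruction matches the original on the first two pixels. The third pixel is not covered by either eigenface, so it falls back to the mean value, which is what produces a non-zero distance.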
Testing the Model
Now that we have trained our model, we are going to test it.
In the testing set, we have some of the same people as in the training set but with a different facial expression. We also have two images which are not human faces (a hamburger and a cat) and one image of a new person (Bruce Lee).
The class ComputeDistance tests whether the images in the testing directory can be recognized as human faces and finds the most similar image in the training set.
package com.chimpler.example.eigenface;

import java.util.List;

public class ComputeDistance {

    // returns { index of the closest training image, weight distance to that image }
    public static Object[] findClosestImage(double[] weights, double[][] weightMatrix) {
        int imageCount = weightMatrix.length;
        int eigenFaceCount = weightMatrix[0].length;
        int closestImageIndex = -1;
        double minWeightSquareDistance = Double.MAX_VALUE;
        for (int i = 0; i < imageCount; i++) {
            double distance = 0;
            for (int j = 0; j < eigenFaceCount; j++) {
                distance += (weightMatrix[i][j] - weights[j]) * (weightMatrix[i][j] - weights[j]);
            }
            if (distance < minWeightSquareDistance) {
                minWeightSquareDistance = distance;
                closestImageIndex = i;
            }
        }
        return new Object[] { closestImageIndex,
                Math.sqrt(minWeightSquareDistance / eigenFaceCount) };
    }

    public static void main(String args[]) throws Exception {
        if (args.length != 8) {
            System.out.println("Arguments: eigenFacesFileName meanImageFileName weightSeqFilename width height trainingDirectory testImageDirectory outputDirectory");
            System.exit(1);
        }
        String eigenFacesFilename = args[0];
        String meanImageFilename = args[1];
        String weightSeqFileName = args[2];
        int width = Integer.parseInt(args[3]);
        int height = Integer.parseInt(args[4]);
        String trainImageDirectory = args[5];
        String testImageDirectory = args[6];
        String outputDirectory = args[7];

        double[] meanPixels = Helper.readImagePixels(meanImageFilename, width, height);
        double[][] eigenFaces = Helper.readMatrixSequenceFile(eigenFacesFilename);
        double[][] weightMatrix = Helper.readMatrixSequenceFile(weightSeqFileName);
        List<String> testImageFileNames = Helper.listImageFileNames(testImageDirectory);
        List<String> trainImageFileNames = Helper.listImageFileNames(trainImageDirectory);

        for (String testImageFileName : testImageFileNames) {
            double[] imagePixels = Helper.readImagePixels(testImageFileName, width, height);
            double[] diffImagePixels = Helper.computeDifferencePixels(imagePixels, meanPixels);
            double[] weights = Helper.computeWeights(diffImagePixels, eigenFaces);
            double[] reconstructedImagePixels =
                    Helper.reconstructImageWithEigenFaces(weights, eigenFaces, meanPixels);
            double distance = Helper.computeImageDistance(imagePixels, reconstructedImagePixels);
            String shortTestFileName = Helper.getShortFileName(testImageFileName);
            System.out.printf("Reconstructed Image distance for %1$s: %2$f\n",
                    shortTestFileName, distance);
            Helper.writeImage(outputDirectory + "/test-ef-" + shortTestFileName,
                    reconstructedImagePixels, width, height);

            Object[] closestImageInfo = findClosestImage(weights, weightMatrix);
            int closestImageIndex = (Integer) closestImageInfo[0];
            double closestImageSimilarity = (Double) closestImageInfo[1];
            System.out.printf("Image %1$s is most similar to %2$s: %3$f\n",
                    shortTestFileName,
                    Helper.getShortFileName(trainImageFileNames.get(closestImageIndex)),
                    closestImageSimilarity);
        }
    }
}
To run the program:
$ java -cp target/mahout-eigenface-example-1.0-jar-with-dependencies.jar com.chimpler.example.eigenface.ComputeDistance output/eigenfaces.seq output/mean-image.gif output/weights.seq 80 60 [TRAINING_SET_DIRECTORY] [TESTING_SET_DIRECTORY] output
For each image of the testing set, it computes the weight that needs to be applied to each eigenface to reconstruct the image and writes the reconstructed image (prefixed with test-ef-) to the output directory.
As expected, the images representing the faces of people from the training set are well reconstructed, but the cat and the hamburger images are not. The reconstructed face of Bruce Lee is not recognizable, but we can see that it is still a face. The program also computes the distance between the original image and the reconstructed image. For each test image, it also tries to find the most similar image in the training set by comparing the eigenface weights using the Euclidean distance (divided by the square root of the number of eigenfaces):
Reconstructed Image distance for brucelee.gif: 51.404904
Image brucelee.gif is most similar to subject03.surprised: 447.574353
Reconstructed Image distance for cat.gif: 65.154281
Image cat.gif is most similar to subject05.centerlight: 638.072675
Reconstructed Image distance for hamburger.gif: 52.313601
Image hamburger.gif is most similar to subject01.rightlight: 684.214467
Reconstructed Image distance for subject01.sad: 32.473280
Image subject01.sad is most similar to subject01.sleepy: 101.895815
Reconstructed Image distance for subject02.sad: 22.418869
Image subject02.sad is most similar to subject02.noglasses: 104.859642
Reconstructed Image distance for subject03.sad: 35.468822
Image subject03.sad is most similar to subject03.noglasses: 120.972063
Reconstructed Image distance for subject04.sad: 30.370102
Image subject04.sad is most similar to subject04.normal: 0.000000
[...]
Those results confirm the visual interpretation we made previously: for the hamburger and the cat, the weight distance to the closest image of the training set is high (684.21 and 638.07), and the distance between the reconstructed image and the original image is also higher than for the known faces (52.31 and 65.15). The image of Bruce Lee is reconstructed about as poorly as the hamburger (51.40), but its weight distance (447.57) is lower than for the two non-face images.
For the other people's faces, the weight distance is much smaller (between 0 and about 121 in the excerpt above) and the program successfully associates them with a face of the same person from the training set.
Using the weight distance, we can define two thresholds:
- T1: weight distance threshold above which the image does not represent a face
- T2: weight distance threshold below which the image represents a face from the training set (T2 is lower than T1)
So if the weight distance is above T1, then the image does not represent a face. Between T2 and T1, it represents an unknown face. And below T2, it represents a face from the training set. Those thresholds are chosen heuristically.
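The two-threshold rule can be written as a small function. This is only a sketch: the class FaceClassifier, the enum names and the threshold values 550 and 200 are assumptions, picked by eye from the output listed above, and they are not part of the repository.

```java
// Illustrative sketch only: FaceClassifier, Category and the thresholds are hypothetical.
final class FaceClassifier {

    enum Category { KNOWN_FACE, UNKNOWN_FACE, NOT_A_FACE }

    // t1: weight distance above which the image is not considered a face
    // t2: weight distance below which the image matches a face from the training set
    static Category classify(double weightDistance, double t1, double t2) {
        if (t2 > t1) {
            throw new IllegalArgumentException("t2 must not exceed t1");
        }
        if (weightDistance > t1) {
            return Category.NOT_A_FACE;
        }
        if (weightDistance < t2) {
            return Category.KNOWN_FACE;
        }
        return Category.UNKNOWN_FACE;
    }

    public static void main(String[] args) {
        double t1 = 550.0; // assumed value, to be tuned on your own data
        double t2 = 200.0; // assumed value, to be tuned on your own data

        // weight distances taken from the program output above
        System.out.println("subject01.sad: " + classify(101.895815, t1, t2));
        System.out.println("brucelee.gif:  " + classify(447.574353, t1, t2));
        System.out.println("cat.gif:       " + classify(638.072675, t1, t2));
    }
}
```

With these assumed thresholds, subject01.sad falls in the known face range, brucelee.gif in the unknown face range and cat.gif in the not a face range. Other data sets or image sizes will need different values.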
Conclusion
We showed in this post how to generate eigenfaces from a training set and then use those eigenfaces to recognize a person's face. We also introduced some metrics to determine whether an image represents a human face and whether it is similar to a face from the training set.
If you are trying this tutorial with other images, make sure that:
- the faces are in the same position in the image
- the faces have the same scale/rotation angle
- the faces have the same brightness/contrast
Some techniques have been developed to alleviate those constraints. You can find several papers about this on the web.
Hi, thank you for your hands-on tutorials with Mahout. I have some recurrent issue and I would like a bit of help:
When typing:
$ java -cp target/mahout-eigenface-example-1.0-jar-with-dependencies.jar com.chimpler.example.eigenface.ComputeDistance output/eigenfaces.seq output/mean-image.gif output/weights.seq 68 68 [TRAINING_SET_DIRECTORY] [TESTING_SET_DIRECTORY] output
I get the following error:
Error: Could not find or load main class com.chimpler.example.eigenface.GenerateCovarianceMatrix
And I don’t know why, I simply followed your instructions.
Thanks in advance!
Hi,
Can you try to recompile the project by typing:
mvn clean package assembly:single
With “mvn assembly:single”, it does not compile the classes from the project and include them in the jar. Sorry for the mistake I’ll update the post.
Let me know if that helps.
Thanks
Thank you, it is now working!
I think I also noticed a few mistakes in your article to correct (but these ones were easy enough to be fixed by myself):
*Instead of:
$ mv yalefaces/*.sad testing-set/
It must be:
$ mv training-set/*.sad testing-set/
(because we first moved every file from yalefaces to training-set)
*When we perform the SVD command, I think it must be "--numRows 152 --numCols 152" instead of 150 (because we added the cat and the hamburger files in our folder?). Maybe I did something wrong before but Mahout did ask for this value.
Glad to hear that it’s working for you now.
Thank you for your feedback. I have updated the post with your change.
For the SVD command, the hamburger, cat and bruce lee image should be in the testing directory and not in the training directory. Not sure where those 2 extra files come from, do you have any hidden files in the directory or any files that get generated by the file explorer(thumbnail, …)?
Thank you
Hi ,
Thanks for the tutorial
I am getting this exception when I run:
java -cp target/mahout-eigenface-example-1.0-jar-with-dependencies.jar com.chimpler.example.eigenface.ComputeDistance output/eigenfaces.seq output/mean-image.gif output/weights.seq 68 68 [TRAINING_SET_DIRECTORY] [TESTING_SET_DIRECTORY] output
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 4624
at com.chimpler.example.eigenface.Helper.computeWeights(Helper.java:158)
at com.chimpler.example.eigenface.ComputeDistance.main(ComputeDistance.java:72)
diffImagePixels.length==4624
pixelCount.length==4800(j value)
In method computeWeights()
weights[i] += diffImagePixels[j] * eigenFaces[j][i];
j value is greater than diffImagePixels.length so we got this exception
could you explain why i am getting low value for diffImagePixels.length?
I tried the example with all pics and with 3 samples same exception occuring.
I am also getting the same error: Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 4624
at com.chimpler.example.eigenface.Helper.computeWeights(Helper.java:158)
at com.chimpler.example.eigenface.ComputeDistance.main(ComputeDistance.java:72)
Arguments: width height trainingDirectory outputDirectory
Arguments: eigenVectorFileName diffMatrixFileName meanImageFileName width height trainingDirectory outputDirectory
Arguments: eigenFacesFileName meanImageFileName weightSeqFilename width height trainingDirectory testImageDirectory outputDirectory
I don't know how this value should be changed to make the code run.
Hi, while trying to run this code I got an error:
$ mahout svd --input covariance.seq --numRows 150 --numCols 150 --rank 50 --output output
14/04/10 10:27:15 INFO mapred.JobClient: Task Id : attempt_201404101005_0001_m_000001_2, Status : FAILED
org.apache.mahout.math.CardinalityException: Required cardinality 152 but got 150
at org.apache.mahout.math.DenseVector.dot(DenseVector.java:241)
at org.apache.mahout.math.hadoop.TimesSquaredJob$TimesSquaredMapper.scale(TimesSquaredJob.java:238)
at org.apache.mahout.math.hadoop.TimesSquaredJob$TimesSquaredMapper.map(TimesSquaredJob.java:229)
at org.apache.mahout.math.hadoop.TimesSquaredJob$TimesSquaredMapper.map(TimesSquaredJob.java:194)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
at org.apache.hadoop.mapred.Child.main(Child.java:264)
Great tutorial!
Can I classify color image with Mahout SVD?
If not, is there any solution? Thanks so much 🙂
Wonderful. Where do you read the papers(or books) that have these algorithms ? Can I do this with R ?