Playing with OpenCV in Scala: face detection with a Haar cascade classifier using a webcam

Detecting objects in images has many applications: auto-tagging pictures (e.g. Facebook, Phototime), counting the number of people in a street (e.g. Placemeter), classifying pictures, and even Bistro, a device that feeds cats.

This post is a short introduction to OpenCV, an open source computer vision library, used here from Scala through the JavaCV wrapper.
The OpenCV library provides features to manipulate images (apply filters, transformations) and to detect and recognize faces in images.
In this post, we are going to implement two small Scala programs:

  • read an image and run the Haar cascade classifier to detect the faces in the image
  • use the webcam and detect faces in real time

To detect faces, we use the Haar feature-based cascade classifier, a classifier that detects objects in an image. You can find a good introduction on the OpenCV website and in the Facial Recognition YouTube video by Tom Neumark.
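
To give an intuition of what a Haar-like feature is, here is a minimal sketch in plain Scala with no OpenCV dependency. It builds an integral image, which lets us sum the pixels of any rectangle with four lookups, and computes one two-rectangle "edge" feature as the difference between two such sums. The object and function names (`HaarFeatureSketch`, `integralImage`, `rectSum`, `horizontalEdgeFeature`) are made up for this illustration; this is not how OpenCV is implemented internally, and a real cascade combines many such features with trained thresholds.

```scala
object HaarFeatureSketch {
  // Integral image with one extra row and column of zeros:
  // ii(y)(x) holds the sum of all pixels strictly above row y and left of column x.
  def integralImage(img: Array[Array[Int]]): Array[Array[Long]] = {
    val h = img.length
    val w = if (h == 0) 0 else img(0).length
    val ii = Array.ofDim[Long](h + 1, w + 1)
    for (y <- 0 until h; x <- 0 until w)
      ii(y + 1)(x + 1) = img(y)(x) + ii(y)(x + 1) + ii(y + 1)(x) - ii(y)(x)
    ii
  }

  // Sum of the pixels in the rectangle whose top-left corner is (x, y),
  // computed with four lookups in the integral image.
  def rectSum(ii: Array[Array[Long]], x: Int, y: Int, w: Int, h: Int): Long =
    ii(y + h)(x + w) - ii(y)(x + w) - ii(y + h)(x) + ii(y)(x)

  // Two-rectangle "edge" feature: sum of the bottom half minus sum of the top half.
  // A large absolute value suggests a horizontal edge inside the window.
  def horizontalEdgeFeature(ii: Array[Array[Long]], x: Int, y: Int, w: Int, h: Int): Long = {
    val half = h / 2
    rectSum(ii, x, y + half, w, half) - rectSum(ii, x, y, w, half)
  }
}
```

For a small image that is dark on top and bright at the bottom, `horizontalEdgeFeature` returns a large positive value, while on a uniform image it returns 0.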

Note that detecting faces (is there a face, and where?) and recognizing faces (whose face is it?) are two different problems that call for different approaches. In this post we only look at face detection.

Prerequisites

You can fetch the code used in this post from GitHub:

git clone https://github.com/chimpler/blog-scala-javacv.git

Detecting faces in an image

In this section we detect the faces in the Skyfall movie cast picture below using a Haar cascade classifier:

[Image: James Bond Skyfall photocall - London]

We start by reading the image with OpenCV:


val mat = opencv_highgui.imread(imageFilename)

Then we convert the image to greyscale:

[Image: skyfall_grey]


val greyMat = new Mat()
opencv_imgproc.cvtColor(mat, greyMat, opencv_imgproc.CV_BGR2GRAY, 1)
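
To make the conversion less of a black box, here is a small sketch in plain Scala of what a BGR to grey conversion computes for each pixel: a weighted sum of the three channels using the usual luma weights (0.299 for red, 0.587 for green, 0.114 for blue). The names `GreySketch`, `bgrToGrey` and `toGrey` are made up for this illustration, and the rounding may differ slightly from OpenCV's fixed-point implementation.

```scala
object GreySketch {
  // Convert one BGR pixel (each channel in 0..255) to a grey level in 0..255
  // using the luma weights 0.299 R + 0.587 G + 0.114 B.
  def bgrToGrey(b: Int, g: Int, r: Int): Int =
    math.round(0.299 * r + 0.587 * g + 0.114 * b).toInt

  // Convert a sequence of (b, g, r) pixels to grey levels.
  def toGrey(pixels: Array[(Int, Int, Int)]): Array[Int] =
    pixels.map { case (b, g, r) => bgrToGrey(b, g, r) }
}
```

Green contributes the most to the result and blue the least, which roughly matches how bright each colour looks to the human eye.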

Then we apply histogram equalization to improve the contrast:

[Image: skyfall_equalized]

val equalizedMat = new Mat()
opencv_imgproc.equalizeHist(greyMat, equalizedMat)
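
As an illustration of what histogram equalization does, here is a simplified sketch in plain Scala working on an array of 8-bit grey levels. It builds the histogram, accumulates it into a cumulative distribution, and remaps every grey level so that the output levels spread over the full 0..255 range. The names `EqualizeSketch` and `equalize` are made up for this example; it follows the textbook formula and is not guaranteed to match OpenCV's output value for value.

```scala
object EqualizeSketch {
  // Equalize an image given as a flat array of grey levels in 0..255.
  def equalize(grey: Array[Int]): Array[Int] = {
    // 1. Histogram: how many pixels have each grey level.
    val hist = new Array[Int](256)
    grey.foreach(v => hist(v) += 1)

    // 2. Cumulative distribution: cdf(i) = number of pixels with level <= i.
    val cdf = hist.scanLeft(0)(_ + _).tail

    // 3. Smallest non-zero cumulative count, so the darkest level present maps to 0.
    val cdfMin = cdf.find(_ > 0).getOrElse(0)
    val range = grey.length - cdfMin

    if (range == 0) grey.clone() // empty or single-level image: nothing to spread
    else {
      // 4. Lookup table that rescales the cumulative counts to 0..255.
      val lut = cdf.map(c => math.round(math.max(0, c - cdfMin) * 255.0 / range).toInt)
      grey.map(v => lut(v))
    }
  }
}
```

For example, an image whose levels are all packed between 10 and 40 comes out with levels spread between 0 and 255, which makes the features the classifier looks for easier to pick up.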

We can now run the face detection using the Haar cascade. We get the Haar cascade file from the OpenCV git repository.

val faceXml = FaceDetectorApp.getClass.getClassLoader.getResource("haarcascade_frontalface_alt.xml").getPath
val faceCascade = new CascadeClassifier(faceXml)
val faceRects = new Rect() // will hold the rectangles surrounding the detected faces
faceCascade.detectMultiScale(equalizedMat, faceRects)

Then we draw rectangles around the detected faces:

[Image: skyfall_faces]

val image = mat.getBufferedImage
val graphics = image.getGraphics
graphics.setColor(Color.RED)
graphics.setFont(new Font(Font.SANS_SERIF, Font.BOLD, 18))

for(i <- 0 until faceRects.limit()) {
    val faceRect = faceRects.position(i)
    graphics.drawRect(faceRect.x, faceRect.y, faceRect.width, faceRect.height)
    graphics.drawString(s"Face $i", faceRect.x, faceRect.y - 20)
}
ImageIO.write(image, "jpg", new File("output_faces.jpg"))

We can see that all the faces were detected.

 

To run the program from the source:

sbt "run-main com.chimpler.javacv.ImageFaceDetectorApp skyfall.jpg"

It generates three images:

  • output_grey.jpg
  • output_equalized.jpg
  • output_faces.jpg

Detecting faces in real time using the webcam

In this section, we describe how to capture images from the webcam with OpenCV, then use Haar cascades to detect the face, the left eye and the right eye.

Using the webcam with OpenCV is pretty straightforward:

val canvas = new CanvasFrame("Webcam")

//Set Canvas frame to close on exit
canvas.setDefaultCloseOperation(javax.swing.JFrame.EXIT_ON_CLOSE)

//Declare FrameGrabber to import output from webcam
val grabber = new OpenCVFrameGrabber(0)
grabber.setImageWidth(640)
grabber.setImageHeight(480)
grabber.setBitsPerPixel(CV_8U)
grabber.setImageMode(ImageMode.COLOR)
grabber.start()

while (true) {
 val img = grabber.grab()
 canvas.showImage(img)
}

It opens a window and displays the images captured by the webcam.

We now update the program to run the classifier on the image only every 200 milliseconds, because the computation is CPU intensive.
We then draw a rectangle around each detected face. In this example, we also detect the eyes in each face: for each face area, we look for the left eye in its top-left quarter and for the right eye in its top-right quarter.
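
The geometry of the eye search can be sketched in plain Scala as follows. `Box` is a hypothetical stand-in for OpenCV's `Rect`, and `EyeRegionSketch`, `topLeftQuarter`, `topRightQuarter` and `toFrame` are names made up for this illustration. The point to notice is that a detection made inside a sub-region is expressed relative to that region, so the region's offset has to be added back before drawing on the full frame.

```scala
object EyeRegionSketch {
  // Hypothetical stand-in for OpenCV's Rect, so the example has no dependencies.
  final case class Box(x: Int, y: Int, width: Int, height: Int)

  // Top-left quarter of the face area: where we look for the left eye.
  def topLeftQuarter(face: Box): Box =
    Box(face.x, face.y, face.width / 2, face.height / 2)

  // Top-right quarter of the face area: where we look for the right eye.
  def topRightQuarter(face: Box): Box =
    Box(face.x + face.width / 2, face.y, face.width / 2, face.height / 2)

  // Translate a box detected inside `region` (coordinates relative to the
  // region's top-left corner) back to coordinates in the full frame.
  def toFrame(region: Box, local: Box): Box =
    local.copy(x = region.x + local.x, y = region.y + local.y)
}
```

With a face at `Box(100, 50, 80, 120)`, the right eye is searched in `Box(140, 50, 40, 60)`, and an eye found at `Box(5, 10, 20, 12)` inside that region sits at `Box(145, 60, 20, 12)` in the frame.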

object FaceWebcamDetectorApp extends App {

  // holder for a single detected face: contains face rectangle and the two eye rectangles inside
  case class Face(id: Int, faceRect: Rect, leftEyeRect: Rect, rightEyeRect: Rect)

  // we need to clone the rect because OpenCV recycles the rectangles created by the detectMultiScale method
  private def cloneRect(rect: Rect): Rect = {
    new Rect(rect.x, rect.y, rect.width, rect.height)
  }

  class FaceDetector() {
    // read the haar classifier xml files for face, left eye and right eye
    val faceXml = FaceWebcamDetectorApp.getClass.getClassLoader.getResource("haarcascade_frontalface_alt.xml").getPath
    val faceCascade = new CascadeClassifier(faceXml)

    val leftEyeXml = FaceWebcamDetectorApp.getClass.getClassLoader.getResource("haarcascade_mcs_lefteye_alt.xml").getPath
    val leftEyeCascade = new CascadeClassifier(leftEyeXml)

    val rightEyeXml = FaceWebcamDetectorApp.getClass.getClassLoader.getResource("haarcascade_mcs_righteye_alt.xml").getPath
    val rightEyeCascade = new CascadeClassifier(rightEyeXml)

    def detect(greyMat: Mat): mutable.Buffer[Face] = {
      val faces = mutable.Buffer.empty[Face]

      val faceRects = new Rect()
      faceCascade.detectMultiScale(greyMat, faceRects)
      for(i <- 0 until faceRects.limit()) {
        val faceRect = faceRects.position(i)

        // the left eye should be in the top-left quarter of the face area
        val leftFaceMat = new Mat(greyMat, new Rect(faceRect.x, faceRect.y, faceRect.width() / 2, faceRect.height() / 2))
        val leftEyeRect = new Rect()
        leftEyeCascade.detectMultiScale(leftFaceMat, leftEyeRect)

        // the right eye should be in the top-right quarter of the face area
        val rightFaceMat = new Mat(greyMat, new Rect(faceRect.x + faceRect.width() / 2, faceRect.y, faceRect.width() / 2, faceRect.height() / 2))
        val rightEyeRect = new Rect()
        rightEyeCascade.detectMultiScale(rightFaceMat, rightEyeRect)

        faces += Face(i, cloneRect(faceRect), cloneRect(leftEyeRect), cloneRect(rightEyeRect))
      }
      faces
    }
  }

  val canvas = new CanvasFrame("Webcam")

  val faceDetector = new FaceDetector
  // Set Canvas frame to close on exit
  canvas.setDefaultCloseOperation(javax.swing.JFrame.EXIT_ON_CLOSE)

  //Declare FrameGrabber to import output from webcam
  val grabber = new OpenCVFrameGrabber(0)
  grabber.setImageWidth(640)
  grabber.setImageHeight(480)
  grabber.setBitsPerPixel(CV_8U)
  grabber.setImageMode(ImageMode.COLOR)
  grabber.start()

  var lastRecognitionTime = 0L
  val cvFont = new CvFont()
  cvFont.hscale(0.6f)
  cvFont.vscale(0.6f)
  cvFont.font_face(FONT_HERSHEY_SIMPLEX)

  // Mat takes (rows, cols, type), so a 640x480 frame is 480 rows by 640 columns
  val mat = new Mat(480, 640, CV_8UC3)
  val greyMat = new Mat(480, 640, CV_8U)
  var faces = mutable.Buffer.empty[Face]
  while (true) {
    val img = grabber.grab()
    cvFlip(img, img, 1)

    // run the detection at most every 200ms to avoid using too much CPU
    if (System.currentTimeMillis() - lastRecognitionTime > 200) {
      mat.copyFrom(img.getBufferedImage)
      opencv_imgproc.cvtColor(mat, greyMat, opencv_imgproc.CV_BGR2GRAY, 1)
      opencv_imgproc.equalizeHist(greyMat, greyMat)
      faces = faceDetector.detect(greyMat)
      lastRecognitionTime = System.currentTimeMillis()
    }

    // draw the face rectangles with the eyes and caption
    for(f <- faces) {
      // draw the face rectangle
      cvRectangle(img,
        opencv_core.cvPoint(f.faceRect.x, f.faceRect.y),
        opencv_core.cvPoint(f.faceRect.x + f.faceRect.width, f.faceRect.y + f.faceRect.height),
        AbstractCvScalar.RED,
        1, CV_AA, 0)

      // draw the left eye rectangle
      cvRectangle(img,
        opencv_core.cvPoint(f.faceRect.x + f.leftEyeRect.x, f.faceRect.y + f.leftEyeRect.y),
        opencv_core.cvPoint(f.faceRect.x + f.leftEyeRect.x + f.leftEyeRect.width, f.faceRect.y + f.leftEyeRect.y + f.leftEyeRect.height),
        AbstractCvScalar.BLUE,
        1, CV_AA, 0)

      // draw the right eye rectangle
      cvRectangle(img,
        opencv_core.cvPoint(f.faceRect.x + f.faceRect.width / 2 + f.rightEyeRect.x, f.faceRect.y + f.rightEyeRect.y),
        opencv_core.cvPoint(f.faceRect.x + f.faceRect.width / 2 + f.rightEyeRect.x + f.rightEyeRect.width, f.faceRect.y + f.rightEyeRect.y + f.rightEyeRect.height),
        AbstractCvScalar.GREEN,
        1, CV_AA, 0)

      // draw the face number
      val cvPoint = opencv_core.cvPoint(f.faceRect.x, f.faceRect.y - 20)
      cvPutText(img, s"Face ${f.id}", cvPoint, cvFont, AbstractCvScalar.RED)
    }
    canvas.showImage(img)
  }
}
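
The throttling logic used in the loop above can be isolated as a small helper. The following is a sketch in plain Scala; `ThrottleSketch` and `Throttled` are names made up for this illustration and are not part of the program above. It caches the last result and only recomputes it when more than a given interval has elapsed. The current time is passed in explicitly so that the behaviour is easy to check.

```scala
object ThrottleSketch {
  // Caches the result of an expensive computation and recomputes it
  // only when more than `intervalMs` milliseconds have elapsed.
  final class Throttled[A](intervalMs: Long, initial: A) {
    private var lastRun: Option[Long] = None
    private var cached: A = initial

    // `compute` is passed by name, so it is only evaluated when a refresh is due.
    def apply(nowMs: Long)(compute: => A): A = {
      if (lastRun.forall(nowMs - _ > intervalMs)) {
        cached = compute
        lastRun = Some(nowMs)
      }
      cached
    }
  }
}
```

In the webcam loop, this could be used as `faces = throttled(System.currentTimeMillis()) { faceDetector.detect(greyMat) }`, assuming `throttled` was created with an interval of 200 and an empty buffer as initial value. Between two detections, the rectangles drawn on each frame are simply the last ones computed.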

To run the program:

sbt "run-main com.chimpler.javacv.FaceWebcamDetectorApp"

If everything works, you should see your face on the screen with a red rectangle around it and blue and green rectangles around your eyes.

[Image: webcam]

Conclusion

We have seen how to use some OpenCV functions to manipulate images and how to use the Haar cascade classifier to detect faces in an image. The Haar cascade classifier is not limited to faces: it can detect other kinds of objects, provided a cascade has been trained for them. You can find freely available cascade classifier files on the OpenCV GitHub repository and at http://alereimondo.no-ip.org/OpenCV/34/
You can also train your own classifier; there are good tutorials at http://note.sonots.com/SciSoftware/haartraining.html and at http://coding-robin.de/2013/07/22/train-your-own-opencv-haar-classifier.htm.
OpenCV provides many more features, and we only described a small fraction of them. It implements several face recognition algorithms as well as many image and video processing algorithms.

About chimpler
http://www.chimpler.com
