Prev Tutorial: YOLO DNNs
Next Tutorial: Custom deep learning layers support
Original author | Dmitry Kurtaev |
Compatibility | OpenCV >= 3.3.1 |
Introduction
This tutorial shows how to run deep learning models in a browser using OpenCV.js. It walks through a sample that chains a face detection model and a face recognition model into one pipeline.
Face detection
The face detection network takes a BGR image as input and produces a set of bounding boxes that might contain faces, each with a confidence value. We only need to keep the boxes whose confidence is high enough.
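The selection step can be sketched in plain JavaScript without OpenCV.js. This is a minimal illustration that assumes the network output is a flat array with seven values per detection, `[imageId, classId, confidence, left, top, right, bottom]`, which is the usual layout for SSD-style detectors; the helper name `selectConfident` and the sample numbers are hypothetical.

```javascript
// Keep only detections whose confidence exceeds the threshold.
// Assumed layout: 7 consecutive floats per detection,
// [imageId, classId, confidence, left, top, right, bottom],
// with box coordinates normalized to the range [0, 1].
function selectConfident(data, threshold) {
  var boxes = [];
  for (var i = 0; i + 7 <= data.length; i += 7) {
    var confidence = data[i + 2];
    if (confidence > threshold) {
      boxes.push({
        confidence: confidence,
        left: data[i + 3],
        top: data[i + 4],
        right: data[i + 5],
        bottom: data[i + 6]
      });
    }
  }
  return boxes;
}

// Two made-up detections: one confident, one weak.
var sample = new Float32Array([
  0, 1, 0.9, 0.1, 0.2, 0.4, 0.6,
  0, 1, 0.2, 0.5, 0.5, 0.7, 0.8
]);
var kept = selectConfident(sample, 0.5);
console.log(kept.length); // 1
```

Multiplying the normalized coordinates by the image width and height gives the pixel rectangle.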
Face recognition
The network is called OpenFace (project: https://github.com/cmusatyalab/openface). The face recognition model receives an RGB face image of size 96x96 and returns a 128-dimensional unit vector that represents the input face as a point on the unit multidimensional sphere. The difference between two faces is therefore the angle between their two output vectors.
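For unit vectors, the dot product equals the cosine of the angle between them, so a higher dot product means more similar faces. The following sketch shows this on plain arrays; the helper names `dot` and `angleDegrees` are hypothetical, and short 2-dimensional vectors stand in for the 128-dimensional ones.

```javascript
// Dot product of two equal-length vectors.
function dot(a, b) {
  var sum = 0;
  for (var i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

// Angle in degrees between two unit vectors.
// The cosine is clamped to [-1, 1] to guard against rounding errors.
function angleDegrees(a, b) {
  var cosine = Math.min(1, Math.max(-1, dot(a, b)));
  return Math.acos(cosine) * 180 / Math.PI;
}

var a = [1, 0];
var b = [0.5, Math.sqrt(3) / 2]; // unit vector rotated away from a

console.log(dot(a, b));          // 0.5
console.log(angleDegrees(a, b)); // approximately 60
```

A dot product of 1 means the same direction (angle 0), and -1 means opposite directions (angle 180 degrees).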
Sample
The whole sample is an HTML page with JavaScript code that uses OpenCV.js functionality. The page is embedded below. Press the Start button to begin the demo. Press Add a person to name a person who is recognized as unknown. Next, we discuss the main parts of the code.
- Run the face detection network to detect faces in the input image. You may adjust the input blob size to balance detection quality against efficiency. The bigger the input blob, the smaller the faces that can be detected.

    function detectFaces(img) {
      // Resize to 192x144 and subtract the per-channel mean; keep BGR order, no crop.
      var blob = cv.blobFromImage(img, 1, {width: 192, height: 144}, [104, 117, 123, 0], false, false);
      netDet.setInput(blob);
      var out = netDet.forward();

      var faces = [];
      // Each detection occupies 7 floats; confidence is at offset 2,
      // normalized box corners are at offsets 3..6.
      for (var i = 0, n = out.data32F.length; i < n; i += 7) {
        var confidence = out.data32F[i + 2];
        var left = out.data32F[i + 3] * img.cols;
        var top = out.data32F[i + 4] * img.rows;
        var right = out.data32F[i + 5] * img.cols;
        var bottom = out.data32F[i + 6] * img.rows;
        // Clamp the box to the image bounds.
        left = Math.min(Math.max(0, left), img.cols - 1);
        right = Math.min(Math.max(0, right), img.cols - 1);
        bottom = Math.min(Math.max(0, bottom), img.rows - 1);
        top = Math.min(Math.max(0, top), img.rows - 1);

        if (confidence > 0.5 && left < right && top < bottom) {
          faces.push({x: left, y: top, width: right - left, height: bottom - top});
        }
      }
      // Release native memory held by the Mats.
      blob.delete();
      out.delete();
      return faces;
    };
- Run the face recognition network to obtain a 128-dimensional unit feature vector for the input face image.

    function face2vec(face) {
      // Scale pixel values to [0, 1], resize to the 96x96 input expected by the model,
      // and swap BGR to RGB.
      var blob = cv.blobFromImage(face, 1.0 / 255, {width: 96, height: 96}, [0, 0, 0, 0], true, false);
      netRecogn.setInput(blob);
      var vec = netRecogn.forward();
      blob.delete();
      return vec;
    };

- Perform recognition. Match the new feature vector against the registered ones and return the name of the best-matching person.

    function recognize(face) {
      var vec = face2vec(face);

      var bestMatchName = 'unknown';
      // The minimum possible score is -1; 0.5 is used as the acceptance threshold.
      var bestMatchScore = 0.5;
      for (var name in persons) {
        var personVec = persons[name];
        var score = vec.dot(personVec);
        if (score > bestMatchScore) {
          bestMatchScore = score;
          bestMatchName = name;
        }
      }
      vec.delete();
      return bestMatchName;
    };
- The main loop. The main loop of the application receives frames from a camera and runs recognition on every face detected in each frame. We start this function once, after OpenCV.js has been initialized and the deep learning models have been downloaded.

    var isRunning = false;
    const FPS = 30;  // Target number of frames processed per second.
    function captureFrame() {
      var begin = Date.now();
      cap.read(frame);  // Read a frame from the camera.
      cv.cvtColor(frame, frameBGR, cv.COLOR_RGBA2BGR);

      var faces = detectFaces(frameBGR);
      faces.forEach(function(rect) {
        cv.rectangle(frame, {x: rect.x, y: rect.y}, {x: rect.x + rect.width, y: rect.y + rect.height}, [0, 255, 0, 255]);

        var face = frameBGR.roi(rect);
        var name = recognize(face);
        face.delete();  // Release the ROI Mat once it has been used.
        cv.putText(frame, name, {x: rect.x, y: rect.y}, cv.FONT_HERSHEY_SIMPLEX, 1.0, [0, 255, 0, 255]);
      });

      cv.imshow(output, frame);

      // Loop this function, waiting only for the time left in the frame budget.
      if (isRunning) {
        var delay = 1000 / FPS - (Date.now() - begin);
        setTimeout(captureFrame, delay);
      }
    };
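The frame pacing used by the main loop can be looked at in isolation. The sketch below computes how long to wait before the next frame so that processing targets a given frame rate; the helper name `nextFrameDelay` is hypothetical and not part of the sample. It clamps the result at zero, which makes explicit what browsers already do when `setTimeout` receives a negative delay.

```javascript
// Time in milliseconds to wait before processing the next frame.
// frameStartMs: timestamp taken when the current frame started processing.
// nowMs: current timestamp.
// fps: target number of frames per second.
function nextFrameDelay(frameStartMs, nowMs, fps) {
  var budget = 1000 / fps;            // time available per frame
  var elapsed = nowMs - frameStartMs; // time already spent on this frame
  return Math.max(0, budget - elapsed);
}

// A frame that took 10 ms leaves about 23.3 ms of the 30 FPS budget.
console.log(nextFrameDelay(0, 10, 30));

// A frame that took longer than the budget is followed immediately.
console.log(nextFrameDelay(0, 50, 30)); // 0
```

When detection and recognition take longer than the per-frame budget, the loop simply runs as fast as processing allows, below the target frame rate.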