[Repost] Face detection with getUserMedia
There are quite a few interesting APIs evolving in the "modern web", but not all of them are going to be things you would use in most projects. I've been very public about my feelings concerning canvas for example. Great for games and charting - but not much else. That doesn't make it a bad feature. It just makes it one I won't use terribly often. Whenever I read about some new cool feature being developed, my mind starts trying to figure out what they could be used for in a practical sense. Obviously what's practical to you may not be practical to me, but figuring out how I would actually use a feature is part of how I learn it.
One such feature is getUserMedia (W3C Spec). This is a JavaScript API that gives you access (with permission) to the user's web cam and microphone. getUserMedia is currently supported in Opera and Chrome. (I believe it is in version 18 now, but you may need to grab Canary. You also need to enable it. Instructions on that here.) Once you get past actually enabling it, the API is rather simple. Here's a quick request for access:
//a video tag
var video = document.getElementById('monitor');

//request it
navigator.webkitGetUserMedia({video:true}, gotStream, noStream);

function gotStream(stream) {
    video.src = webkitURL.createObjectURL(stream);
    video.onerror = function () {
        stream.stop();
        streamError();
    };
}

function noStream() {
    document.getElementById('errorMessage').textContent = 'No camera available.';
}

function streamError() {
    document.getElementById('errorMessage').textContent = 'Camera error.';
}
The first argument to getUserMedia is the type. According to the spec, this is supposed to be an object where you enable audio, video, or both, like so: {audio:true, video:true}. However in my testing, passing a string, "video", worked fine. The demo you will be seeing is based on another demo so that line possibly came from an earlier build that still works with Chrome. The second and third arguments are your success and failure callbacks respectively.
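To make the two call styles concrete, here is a minimal sketch of a helper that accepts either the older string form or the spec's object form and always hands back the object form. The name normalizeMediaRequest is hypothetical and not part of any browser API, and I'm assuming the string form is a comma-separated list such as "audio, video". It also flattens anything truthy to true, so it is only meant for the simple on/off case discussed here.

```javascript
// Hypothetical helper (not part of getUserMedia): converts the older string
// form, e.g. "video" or "audio, video", into the object form {audio, video}.
function normalizeMediaRequest(request) {
    if (typeof request !== 'string') {
        // Already an object such as {video: true}; return a fresh copy
        // with both keys present as booleans.
        return { audio: !!request.audio, video: !!request.video };
    }
    // Assumption: the string form is a comma-separated list of media types.
    var wanted = request.split(',').map(function (part) {
        return part.trim().toLowerCase();
    });
    return {
        audio: wanted.indexOf('audio') !== -1,
        video: wanted.indexOf('video') !== -1
    };
}
```

With something like this in place, normalizeMediaRequest("video") and normalizeMediaRequest({video:true}) produce the same request object, so the rest of the code only has to deal with one shape.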
You can see in the gist where the success handler assigns the video stream to an HTML5 video tag. What's cool then is that once you have that running you can use the Canvas API to take pictures. For a demo of this, check out Greg Miernicki's demo:
http://miernicki.com/cam.html
If this demo doesn't work for you - then stop - and try following the instructions again to enable support. (Although I plan on sharing a few screen shots so if you just want to keep reading, that's fine too.)
Based on Greg's demo, it occurred to me that there is something cool we can do with pictures of our web cams. (Cue the dirty jokes.) I remembered that Face.com had a very cool API for parsing pictures for faces. (I blogged a ColdFusion example back in November.) I wondered then if we could combine Greg's demo with the Face.com API to do some basic facial recognition.
Turns out there are a few significant issues with this. First - while Face.com has a nice REST API, how would we use it from a JavaScript application? Secondly - Face.com requires you to either upload a picture or give it a URL. I know I could send a canvas picture to a server and have my backend upload it to Face.com, but is there a way to bypass the server and send the picture right to the API?
The first issue actually turned out to be a non-issue. Face.com implements CORS (Cross-Origin Resource Sharing). CORS basically allows a server to expose itself to Ajax calls from documents on other domains. It's a great feature and I hope more services enable it.
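As a rough illustration of what "exposing itself" means, the server adds an Access-Control-Allow-Origin header to its response and the browser compares it to the origin of the calling page. The function below is a simplified model of that comparison for a plain request without credentials. It is not what the browser literally runs, and corsAllowsOrigin and the plain headers object are hypothetical names used only for this sketch.

```javascript
// Simplified model of the browser's CORS check for a request without
// credentials. `headers` is a hypothetical plain object keyed by
// lower-cased response header names.
function corsAllowsOrigin(headers, pageOrigin) {
    var allowed = headers['access-control-allow-origin'];
    if (typeof allowed !== 'string') {
        // No header at all: the page's script cannot read the response.
        return false;
    }
    // "*" opens the resource to any origin; otherwise the value must
    // match the calling page's origin exactly.
    return allowed === '*' || allowed === pageOrigin;
}
```

Real CORS handling also covers preflight requests and credentials, which this sketch leaves out.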
The more complex issue then was taking the canvas data and sending it to Face.com. How can I fake a file upload? Turns out there's another cool new trick - FormData. Fellow ColdFusion blogger Sagar Ganatra has an excellent blog entry on the topic. Here's how I used it:
function snapshot() {
    $("#result").html("<p><i>Working hard for the money...</i></p>");

    //size the canvas to the video and copy the current frame onto it
    canvas.width = video.videoWidth;
    canvas.height = video.videoHeight;
    canvas.getContext('2d').drawImage(video, 0, 0);

    //turn the canvas contents into a binary blob
    var data = canvas.toDataURL('image/jpeg', 1.0);
    var newblob = dataURItoBlob(data);

    //build a fake file upload
    var formdata = new FormData();
    formdata.append("api_key", faceKey);
    formdata.append("api_secret", faceSecret);
    formdata.append("filename","temp.jpg");
    formdata.append("file",newblob);

    $.ajax({
        url: 'http://api.face.com/faces/detect.json?attributes=age_est,gender,mood,smiling,glasses',
        data: formdata,
        cache: false,
        contentType: false,
        processData: false,
        dataType:"json",
        type: 'POST',
        success: function (data) {
            handleResult(data.photos[0]);
        }
    });
}
Let's look at this line by line. First off - I need to get the binary data from the canvas object. There are a few ways of doing this, but I wanted a binary Blob specifically. Notice the dataURItoBlob method. This comes from a StackOverflow post I found a few weeks back.
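The helper itself isn't reproduced in this post, so here is a minimal sketch of what a data-URI-to-Blob conversion could look like. This is not the StackOverflow code. It assumes a base64-encoded data URI (which is what canvas.toDataURL returns) and uses the Blob constructor and atob found in current browsers, which may differ from what was available when this was written.

```javascript
// Sketch of a data-URI-to-Blob helper; an illustration, not the original
// StackOverflow code. Assumes the URI is base64-encoded, e.g.
// "data:image/jpeg;base64,<payload>".
function dataURItoBlob(dataURI) {
    var commaIndex = dataURI.indexOf(',');
    var header = dataURI.slice(0, commaIndex);    // "data:image/jpeg;base64"
    var payload = dataURI.slice(commaIndex + 1);  // the base64 text
    var mimeType = header.slice(5).split(';')[0]; // drop "data:" and ";base64"

    // atob() decodes base64 into a string with one character per byte.
    var binary = atob(payload);
    var bytes = new Uint8Array(binary.length);
    for (var i = 0; i < binary.length; i++) {
        bytes[i] = binary.charCodeAt(i);
    }
    return new Blob([bytes], { type: mimeType });
}
```

The returned Blob carries the MIME type from the URI, so appending it to a FormData object makes it look like an ordinary file upload to the receiving server.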
I create a new FormData object and then simply begin setting my values. You can see I pass in a few API requirements but the crucial parts are the filename and file object itself.
Below that you can see the simple jQuery Ajax call. Face.com has a variety of options, but I basically just asked it to return an estimated age, gender, mood, and whether or not the person was smiling and wearing glasses. That's it. I get a nice JSON packet back and format it.
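The formatting step isn't shown above, so here is a hedged sketch of how it might go. Everything about the response shape is an assumption made for illustration: that each photo has a tags array with one entry per detected face, and that each tag has an attributes map whose entries look like {value, confidence}. The name describeFaces is hypothetical and is not the handleResult used in the demo.

```javascript
// Hypothetical formatter. The response shape used here (photo.tags[], each
// with an attributes map of {value, confidence}) is an assumption.
function describeFaces(photo) {
    var tags = (photo && photo.tags) || [];
    return tags.map(function (tag, index) {
        var attributes = tag.attributes || {};
        // One "name: value (N% confidence)" fragment per returned attribute.
        var parts = Object.keys(attributes).map(function (name) {
            var attr = attributes[name];
            return name + ': ' + attr.value + ' (' + attr.confidence + '% confidence)';
        });
        return 'Face ' + (index + 1) + ' - ' + parts.join(', ');
    });
}
```

Each string in the returned array could then be wrapped in a paragraph tag and dropped into the result div. A photo with no detected faces simply yields an empty array.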
Now obviously no API is perfect. I've had different levels of results from using the API. Sometimes it's pretty damn accurate and sometimes it isn't. Overall though it's pretty cool. Here are some scary pictures of yours truly testing it out.
Ok, ready to test it yourself? Just click the demo button below. For the entire source, just view source! This is 100% client-side code.
For another look at getUserMedia, check out these examples:
- It's Curtains for Marital Strife Thanks to getUserMedia
- Testing WebRTC on Chrome
- Bleeding Edge HTML5, WebRTC & Device Access
- Capturing Audio & Video in HTML5
Edit on May 23: Chrome recently modified the getUserMedia API to match the spec (I believe) which requires you to pass an object of media you want, so instead of "video", I used {video:true}.