One of the other machine-learning researchers had built a deep neural network (DNN) in TensorFlow but lacked the number of labelled training images needed to train, test, and validate it. I can't say exactly what the DNN does because it is the subject of a pending patent involving eye gaze. Estimating eye gaze accurately from real photographs of people's faces is notoriously difficult, and we needed tens of thousands of accurately labelled eye-gaze images.
Python description TBD...
Experiences using this skill are shown below:
An earlier version of our machine-learning person-tracking software was too slow to process every video frame. (This was before our team attempted GPU acceleration with more efficient person-tracking software.) The result was jumpy video transitions while tracking someone. I was given the task of finding a way to apply video motion smoothing so that the resulting video framing would look smooth and professional.
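The original doesn't name the smoothing method used, so the following is only a minimal sketch of one common approach: an exponential moving average over the tracked position, which damps frame-to-frame jitter in the crop window. The function name and the `alpha` value are illustrative, not from the source.

```python
# Hypothetical sketch: exponential moving-average (EMA) smoothing of a
# tracked (x, y) centre. Smaller alpha -> smoother but laggier framing.
def smooth_track(centers, alpha=0.2):
    """Return EMA-smoothed (x, y) centres for a list of per-frame centres."""
    smoothed = []
    sx, sy = centers[0]  # seed the average with the first observation
    for x, y in centers:
        sx = alpha * x + (1 - alpha) * sx
        sy = alpha * y + (1 - alpha) * sy
        smoothed.append((sx, sy))
    return smoothed
```

In a real pipeline the smoothed centre would drive the virtual camera's crop window each frame; more sophisticated choices (e.g. a Kalman filter) trade lag against responsiveness.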
Since I had already computed the homography and object segmentation for the text-readability prototype, it was just a matter of using the OpenCV library to apply perspective warping to that video stream (via a perspective transform derived from the homography matrix) so that one video stream could be seamlessly inserted into the other, creating an augmented-reality mashup. The result was a much clearer rendering of the projected content as seen by remote users in the composite video stream.
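A minimal NumPy sketch of the core operation, assuming a 3×3 homography matrix H is already known: mapping points through H is what `cv2.perspectiveTransform` does, and `cv2.warpPerspective` applies the same mapping to every pixel when warping one stream into the other. The function below is illustrative, not the production code.

```python
import numpy as np

def apply_homography(H, pts):
    """Map N x 2 points through a 3x3 homography H (projective transform)."""
    pts = np.asarray(pts, dtype=float)
    # Lift to homogeneous coordinates: (x, y) -> (x, y, 1).
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    # Divide by the third coordinate to return to Cartesian points.
    return mapped[:, :2] / mapped[:, 2:3]
```

For example, a pure-translation homography `[[1, 0, 5], [0, 1, 3], [0, 0, 1]]` shifts every point by (5, 3); a general H additionally encodes the perspective distortion needed to drop one stream into the other's image plane.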
I was given the task of researching automatic white-balance algorithms, with the goal of calibrating multiple video cameras to the same color balance. The problem was that when switching between multiple cameras covering the same scene, a noticeable color shift appeared in the video stream, especially when the cameras came from different manufacturers.
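The original doesn't say which algorithm was ultimately chosen, so as a hedged illustration, here is the classic gray-world method, one of the standard baselines in automatic white-balance research: it assumes the average color of a scene is neutral gray and scales each channel's gain to match.

```python
import numpy as np

def gray_world_balance(img):
    """Gray-world white balance: scale each channel so its mean equals
    the overall mean intensity (assumes the scene averages to gray)."""
    img = img.astype(float)
    channel_means = img.reshape(-1, img.shape[-1]).mean(axis=0)
    gray = channel_means.mean()
    gain = gray / channel_means          # per-channel correction gains
    return np.clip(img * gain, 0, 255)   # keep result in 8-bit range
```

Applying the same reference (e.g. gains computed from a shared gray card or an overlapping view) to every camera is one way to pull streams of different manufacture toward a common color balance.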
I was given the task of researching and prototyping various computer vision algorithms to determine whether the text appearing within a given video frame was readable or not.
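The prototype's actual algorithms aren't described, so the sketch below shows one common proxy for readability: local sharpness measured as the variance of the Laplacian (the same response `cv2.Laplacian` computes in OpenCV), where a low variance suggests the text is too blurred to read. The kernel is standard; the threshold value is purely illustrative.

```python
import numpy as np

# 3x3 Laplacian kernel (the discrete second-derivative operator).
LAPLACIAN = np.array([[0,  1, 0],
                      [1, -4, 1],
                      [0,  1, 0]], dtype=float)

def laplacian_variance(gray):
    """Variance of the Laplacian over a grayscale frame.
    Sharp edges (crisp text) give high variance; blur gives low variance."""
    h, w = gray.shape
    out = np.zeros((h - 2, w - 2))
    # 'Valid' convolution implemented with shifted slices.
    for dy in range(3):
        for dx in range(3):
            out += LAPLACIAN[dy, dx] * gray[dy:dy + h - 2, dx:dx + w - 2]
    return out.var()

def looks_readable(gray, threshold=100.0):
    """Crude readability test; the threshold here is a made-up example value
    that would need tuning against labelled readable/unreadable frames."""
    return laplacian_variance(gray) > threshold
```

A real readability decision would likely combine such a sharpness score with text size and contrast measurements, but the variance-of-Laplacian score alone already separates crisp frames from badly blurred ones.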