homography

Skill level
2 Intermediate skills

In 2019, I worked on an augmented (mixed) reality project. We needed to take an image from one source, then scale and warp it into a live video stream showing the same image in context, so that the image was overlaid seamlessly on the video scene. This required computing the homography that maps one perspective onto the other.

In our use case, the homography is computed only once per camera, during setup, using a calibration image with the video cameras set to known pan, tilt and zoom settings relative to the video scene. Once this initial homography is computed, we apply perspective transforms to adjust the homography matrix values as the cameras change pan, tilt and zoom (the cameras don't move in translation). That way, the image overlay is continuously scaled and warped appropriately within the video scenes as camera angles and fields of view change. After the homography of the calibration overlay image is computed, I can overlay whatever image content I want, as long as the overlay images keep the same dimensions as the calibration image.
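The update step can be sketched for a camera that only rotates and zooms. This is a minimal illustration rather than our production code: it assumes a simple pinhole model with square pixels, a principal point at (cx, cy), pan as a rotation about the camera's vertical axis and tilt about its horizontal axis. The function names and the (pan, tilt, focal length) pose tuple are hypothetical.

```python
import numpy as np

def intrinsics(f, cx, cy):
    """Pinhole intrinsic matrix for focal length f (pixels) and principal point (cx, cy)."""
    return np.array([[f, 0.0, cx], [0.0, f, cy], [0.0, 0.0, 1.0]])

def cam_to_world(pan, tilt):
    """Camera orientation for pan/tilt in radians (x right, y down, z forward)."""
    cp, sp = np.cos(pan), np.sin(pan)
    ct, st = np.cos(tilt), np.sin(tilt)
    r_pan = np.array([[cp, 0.0, sp], [0.0, 1.0, 0.0], [-sp, 0.0, cp]])
    r_tilt = np.array([[1.0, 0.0, 0.0], [0.0, ct, -st], [0.0, st, ct]])
    return r_pan @ r_tilt

def update_homography(H0, pose0, pose1, cx, cy):
    """Adjust the calibration homography H0 (overlay -> frame at pose0) for pose1.

    Each pose is a hypothetical (pan, tilt, focal_length) tuple. For a camera that
    rotates without translating, pixels move by K1 @ R @ inv(K0).
    """
    pan0, tilt0, f0 = pose0
    pan1, tilt1, f1 = pose1
    K0 = intrinsics(f0, cx, cy)
    K1 = intrinsics(f1, cx, cy)
    # Relative rotation taking pose0 camera coordinates to pose1 camera coordinates.
    R = cam_to_world(pan1, tilt1).T @ cam_to_world(pan0, tilt0)
    H = K1 @ R @ np.linalg.inv(K0) @ H0
    return H / H[2, 2]  # normalize so the bottom-right entry is 1

def project(H, x, y):
    """Apply homography H to the pixel (x, y)."""
    u, v, w = H @ np.array([x, y, 1.0])
    return u / w, v / w
```

With an identity H0, doubling the focal length pushes a point 10 pixels right of the principal point out to 20 pixels, and panning the camera to the right slides scene content to the left in the frame, which is the behavior the overlay has to follow.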

The first step in computing the homography is to figure out which features in one image match those in the other. There are a number of feature detector algorithms that can match features between two images by detecting where similar corners appear in both. The SIFT (Scale-Invariant Feature Transform) and SURF (Speeded Up Robust Features) detectors are the best known, but both were patented at the time, and both compare floating-point descriptors using Euclidean distances. Since the end goal was to turn our research into an eventual Barco product, we could not use SIFT or SURF without paying a royalty. Therefore, one of my colleagues looked at alternative feature detectors that were not based on Euclidean distances. The one he chose was the BRISK (Binary Robust Invariant Scalable Keypoints) detector which, like BRIEF and ORB, produces binary descriptors that are compared by Hamming distance. Hamming distances can be computed much faster than floating-point Euclidean distances, at the cost of somewhat lower accuracy. More importantly, BRISK is open-source and royalty free.
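To illustrate why binary descriptors are cheap to compare, here is a small sketch using NumPy. The two descriptors are made-up 64-byte arrays (the size of a BRISK descriptor in OpenCV), not real detector output, and the function names are mine.

```python
import numpy as np

def hamming_distance(a, b):
    """Number of differing bits between two uint8 binary descriptors (XOR, then count bits)."""
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())

def euclidean_distance(a, b):
    """Euclidean distance between two floating-point descriptors, as used for SIFT/SURF."""
    diff = a.astype(np.float32) - b.astype(np.float32)
    return float(np.sqrt(np.sum(diff * diff)))

# Two illustrative 64-byte (512-bit) binary descriptors that differ in four bits.
desc_a = np.zeros(64, dtype=np.uint8)
desc_b = desc_a.copy()
desc_b[0] = 0b00001111

bits_different = hamming_distance(desc_a, desc_b)  # 4
```

The Hamming distance needs only an XOR and a bit count per byte, with no multiplications or square roots, which is where the speed advantage over Euclidean matching comes from.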

I have been using the OpenCV implementation of BRISK together with OpenCV's RANSAC (RANdom SAmple Consensus) in my mixed reality Python code to estimate homographies. RANSAC is the usual second step, keeping only the feature matches that really count. It eliminates outliers by repeatedly fitting a candidate homography to a small random sample of matches and keeping the candidate that the largest number of matches agree with; matches that disagree with it are discarded. A faster alternative to RANSAC is PROSAC (PROgressive SAmple Consensus), but I've never tried it, largely because I did not find an official OpenCV port when I started (OpenCV's findHomography does offer a PROSAC-based RHO method). The 3x3 homography matrix is then computed from the remaining feature matches that were not filtered out by RANSAC.

Experiences using this skill are shown below:



Barco Labs (research)

[I know, this section just echoes the same stuff as on the résumé. I plan to expand later.] Worked with PhDs, staff and university interns researching disruptive technologies. Barco Labs deliverables are research papers, patents and demos. Any research that might become a viable product in 2 to 5 years is then handed off to one of the product divisions. (Due to the trade-secret nature of this research, some details cannot be revealed.) Accomplishments: