Matthew Cutts

Tracking an object's location with inertial sensors works well over short time periods, but sensor drift and integration errors cause the position error to accumulate rapidly. Images, in contrast, work very well for determining a camera's location, provided that the camera moves slowly. I received fellowship funding to work on a hybrid image and inertial tracker. The idea was to use data from the image sensors to correct errors and drift in the inertial sensors.

The computer vision community has expended a great deal of energy in pursuing motion tracking from images. This area of research is also related to compositing multiple images into a single image. As successive images are compared and their differences are computed, the differences yield clues to the camera's movement. For example, imagine snapping pictures of some distant mountains while sidling from left to right. If those pictures were overlaid so as to minimize the differences where they overlap, the offset of each later picture would capture the movement of the camera to the right.

The difficulty in computing movement in this fashion is that, in general, the camera can be looking at moving objects, or even objects that are changing shape (e.g. trees), while the camera moves as well. Also, objects are usually much closer than mountains, and objects at different distances move differently in the camera's view because of parallax. Most computer vision researchers assume that a scene is static in order to compute camera movement. Even with a static scene (i.e. unmoving objects that don't change shape), computing camera motion from images is still difficult because the shapes of objects in the scene aren't known.

With no three-dimensional information available, most techniques assume that the scene is a flat panorama. Information about the camera movement is found by first solving for translation. Once the translation that gives the minimum difference between the overlaid images is found, the solver attempts to use an affine warp to fit additional movement.
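The translation step described above can be illustrated with a minimal sketch. It is not the solver used in this project: it assumes small grayscale images stored as lists of rows, a brute-force search over integer shifts, and a mean-squared-difference score. The function names, the `max_shift` and `min_overlap` parameters, and the synthetic `scene` function are all hypothetical and chosen only for illustration.

```python
def mean_sq_diff(a, b, dx, dy, min_overlap):
    """Mean squared difference between a[y][x] and b[y + dy][x + dx],
    taken over the pixels where both images have data."""
    total, count = 0.0, 0
    for y in range(len(a)):
        for x in range(len(a[0])):
            yy, xx = y + dy, x + dx
            if 0 <= yy < len(b) and 0 <= xx < len(b[0]):
                d = a[y][x] - b[yy][xx]
                total += d * d
                count += 1
    # Reject shifts with too little overlap; they can score well by accident.
    return total / count if count >= min_overlap else float("inf")


def estimate_translation(a, b, max_shift=3):
    """Return the integer shift (dx, dy) that best aligns a with b."""
    min_overlap = (len(a) * len(a[0])) // 4  # assumed threshold
    best = None
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            err = mean_sq_diff(a, b, dx, dy, min_overlap)
            if best is None or err < best[0]:
                best = (err, dx, dy)
    return best[1], best[2]


def scene(x, y):
    """Hypothetical static scene with no repeating structure."""
    return x * x + 3 * y * y + x * y


# The second image sees the same scene displaced by 2 pixels in x and 1 in y.
first = [[scene(x, y) for x in range(10)] for y in range(10)]
second = [[scene(x - 2, y - 1) for x in range(10)] for y in range(10)]
print(estimate_translation(first, second))  # (2, 1)
```

An exhaustive search like this is only practical for small shifts, which fits the observation that image-based tracking works best when the camera moves slowly.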
More sophisticated programs use the results from affine warp solving to guess at a full perspective warp solution. A perspective warp requires eight parameters and can capture the full visual appearance of a plane as a camera moves in three dimensions, as shown in Irani [2].
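To make the eight parameters concrete, the sketch below applies a perspective warp to a single image point. The parameter ordering and the convention of fixing the ninth matrix entry to 1 are a common choice assumed here, not taken from the report or from Irani [2]; the name `warp_point` is hypothetical.

```python
def warp_point(h, x, y):
    """Apply an 8-parameter perspective warp to the point (x, y).

    h = (h0, ..., h7) fills a 3x3 matrix row by row, with the last
    entry fixed to 1 (an assumed normalization):
        [h0 h1 h2]
        [h3 h4 h5]
        [h6 h7  1]
    """
    h0, h1, h2, h3, h4, h5, h6, h7 = h
    w = h6 * x + h7 * y + 1.0  # perspective divide
    return ((h0 * x + h1 * y + h2) / w, (h3 * x + h4 * y + h5) / w)


# Translation uses only h2 and h5; an affine warp uses h0..h5 with
# h6 = h7 = 0; a full perspective warp uses all eight values.
translation = (1, 0, 5, 0, 1, -2, 0, 0)
affine = (2, 0, 1, 0, 3, 1, 0, 0)
perspective = (1, 0, 0, 0, 1, 0, 0.5, 0)

print(warp_point(translation, 3, 4))  # (8.0, 2.0)
print(warp_point(affine, 3, 4))       # (7.0, 13.0)
print(warp_point(perspective, 2, 4))  # (1.0, 2.0)
```

Because translation and affine warps are special cases of the same eight-parameter form, an affine solution can serve as the starting guess for a perspective solve simply by setting the last two parameters to zero.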

Link Foundation Fellowship for the years 1998-1999.

FORM Final Report Matthew Cutts.pdf (141 kB)
Standard cover form for report


