Date of Award
Doctor of Philosophy (PhD)
Computer Engineering and Sciences
In this dissertation, we address the problem of recognizing human actions in videos. Recognition aims to recover action information from image sequences using features such as variations in the human shape. Approaches based on such features often use sequence-alignment methods. We propose two novel methods for human-action recognition. We also propose an elliptical-shaped band for Dynamic Time Warping (DTW) that provides a good compromise between alignment accuracy and computational speed.

First, we study the applicability of pairwise shape-similarity measurements to human-action recognition. Since an action can be seen as a sequence of silhouette poses, there exist similarities between actions of the same class. Based on this observation, we propose a new method for classifying human actions. Given two sequences of silhouettes, each representing an action, we measure their similarity by means of a robust sequence-alignment method. The motion cue is represented implicitly by the variations of the human’s shape over time while an action is performed. We adopt the Longest Common Sub-Sequence (LCSS), a dynamic-programming approach that calculates the minimum cost of aligning the two sequences.

Next, we use information from inter-pose shape variations, as provided by shape descriptors, for recognizing human actions. In contrast to the previous method, where an action is not modeled by itself, we present a method that converts an action into a sequence based on the variations of a human’s shape over time. We construct the sequence using the Inner-Distance Shape-Context as a measurement of variations between shapes. In experiments, our method compares favorably with related methods.

Finally, we develop a new global band for the Dynamic Time Warping algorithm. In contrast with standard rectangular-shaped bands, we propose an elliptical-shaped band that provides flexibility and a good compromise between alignment accuracy and computational speed. The shape of the ellipse is implicitly represented by the length of the time series. The idea of our elliptical band is to speed up DTW and enforce a global constraint on the warping path by using a window size that tolerates a significant amount of noise in the aligned time series.
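The two alignment ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the dissertation's implementation, and several things in it are assumptions: the sequences are plain lists of numbers standing in for per-frame shape descriptors, the matching threshold `eps` and the band parameter `ratio` are hypothetical, and the ellipse is assumed to lie along the diagonal of the alignment matrix after both time axes are rescaled to [0, 1] (the abstract states only that the ellipse's shape follows from the lengths of the time series). The LCSS function returns a similarity in [0, 1], so one minus that value can be read as an alignment cost.

```python
import math


def lcss_similarity(a, b, eps=0.5):
    """Length of the longest common subsequence of a and b, where two
    elements match if they differ by at most eps (assumed threshold),
    normalized by the length of the shorter sequence."""
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("sequences must be non-empty")
    L = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if abs(a[i - 1] - b[j - 1]) <= eps:
                L[i][j] = L[i - 1][j - 1] + 1
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])
    return L[n][m] / min(n, m)


def in_ellipse(i, j, n, m, ratio):
    """True if cell (i, j) lies inside an ellipse whose major axis is the
    diagonal of the n-by-m alignment matrix (assumed parameterization).
    ratio is the minor-to-major axis ratio in rescaled coordinates."""
    x = i / (n - 1) if n > 1 else 0.0
    y = j / (m - 1) if m > 1 else 0.0
    p = x + y - 1.0  # position along the diagonal, in [-1, 1]
    q = y - x        # signed offset from the diagonal
    return p * p + (q / ratio) ** 2 <= 1.0 + 1e-12


def dtw_elliptical(a, b, ratio=0.3):
    """DTW distance with absolute-difference cost, restricted to cells
    inside the elliptical band. Returns math.inf if the band admits no
    warping path (possible for very different lengths and a small ratio)."""
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("sequences must be non-empty")
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if not in_ellipse(i - 1, j - 1, n, m, ratio):
                continue  # cell outside the band is never evaluated
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]
```

In this sketch the band is narrow near the start and end of the alignment and widest in the middle, and setting `ratio=1.0` makes the ellipse cover the whole matrix, which recovers unconstrained DTW. Smaller values of `ratio` skip more cells, trading alignment flexibility for speed.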
Almotairi, Sultan Mohammad, "Using Variations of Shape and Appearance in Alignment Methods for Classifying Human Actions" (2014). Theses and Dissertations. 877.
Copyright held by author