Abstract
A learning-based framework for action representation and recognition is presented, in which an action is described by time series of optical flow motion features. In the learning step, the motion curves representing each action are clustered using Gaussian mixture modeling (GMM). In the recognition step, the optical flow curves of a probe sequence are also clustered using a GMM. The probe sequence is then projected onto the training space, and the probe curves are matched to the learned curves using a non-metric similarity function based on the longest common subsequence (LCSS), which is robust to noise and provides an intuitive notion of similarity between curves. Alignment between the mean curves is performed using canonical time warping. Finally, the probe sequence is assigned to the learned action with the maximum similarity using a nearest neighbor classification scheme. We also present a variant of the method in which the length of the time series is reduced by dimensionality reduction in both the training and test phases, in order to smooth out the outliers that are common in this type of sequence. Experimental results on the KTH, UCF Sports, and UCF YouTube action databases demonstrate the effectiveness of the proposed method.
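As an illustrative sketch only (not the authors' implementation), the LCSS-based matching of two motion curves described above can be realized with a standard dynamic program over the two time series; the function name `lcss_similarity`, the per-sample matching threshold `eps`, and the normalization by the shorter sequence length are assumptions made for this example.

```python
import numpy as np

def lcss_similarity(a, b, eps=0.5):
    """Normalized longest-common-subsequence similarity between two
    1-D motion-feature curves a and b (illustrative sketch).

    Two samples are considered matching when they differ by at most
    eps; the LCSS length is normalized by the shorter curve, giving a
    value in [0, 1]. The threshold and normalization are assumptions,
    not the paper's exact choices.
    """
    n, m = len(a), len(b)
    # dp[i][j] = LCSS length of a[:i] and b[:j]
    dp = np.zeros((n + 1, m + 1), dtype=int)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if abs(a[i - 1] - b[j - 1]) <= eps:
                dp[i, j] = dp[i - 1, j - 1] + 1
            else:
                dp[i, j] = max(dp[i - 1, j], dp[i, j - 1])
    return dp[n, m] / min(n, m)
```

Because LCSS only rewards sufficiently close samples and skips the rest, isolated outliers in a curve lower the score gracefully rather than dominating it, which is the robustness property the abstract appeals to.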
Original language | English (US) |
---|---|
Pages (from-to) | 27-40 |
Number of pages | 14 |
Journal | Computer Vision and Image Understanding |
Volume | 119 |
DOIs | |
State | Published - Feb 2014 |
Keywords
- Clustering
- Dimensionality reduction
- Gaussian mixture modeling (GMM)
- Human action recognition
- Longest common subsequence
- Motion curves
- Optical flow
ASJC Scopus subject areas
- Software
- Signal Processing
- Computer Vision and Pattern Recognition