Action recognition based on histogram of oriented gradients and spatio-temporal interest points


  • P. A. Dhulekar, Sandip Institute of Technology and Research Centre, Nashik, Savitribai Phule Pune University, Pune, India
  • S. T. Gandhe, Sandip Institute of Technology and Research Centre, Nashik, Savitribai Phule Pune University, Pune, India





Action Recognition, Histogram of Oriented Gradients, K-Nearest Neighbor, Support Vector Machine, Spatio-Temporal Interest Points.


In recent years, a large body of work has been devoted to recognizing human actions, owing to the wide range of applications in surveillance, human-machine interaction and video analysis. Researchers have proposed several methods to address action recognition challenges such as variations in viewpoint, occlusion, cluttered backgrounds and camera motion. To address these challenges, we propose a method comprising feature extraction using histograms of oriented gradients (HOG) and classification using the k-nearest neighbor (k-NN) and support vector machine (SVM) classifiers. Six experiments were carried out using hybrid combinations of feature extractors and classifiers. Two gold-standard datasets, KTH and Weizmann, were used for training and testing. Recognition accuracy, training time and prediction speed were used as quantitative evaluation parameters. To validate the applicability of the proposed algorithm, its performance was compared with the spatio-temporal interest points (STIP) technique, a state-of-the-art method in the domain.
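The pipeline described above (HOG features fed to a nearest-neighbor classifier) can be sketched roughly as follows. This is an illustrative toy implementation, not the authors' code: it computes per-cell orientation histograms in the spirit of Dalal-Triggs HOG but omits block normalization, and it pairs the descriptor with a plain k-NN vote rather than a trained SVM. All function names and parameter values here are assumptions made for the sketch.

```python
import numpy as np

def hog_descriptor(frame, cell=8, bins=9):
    """Simplified HOG sketch: per-cell histograms of unsigned gradient
    orientation, weighted by gradient magnitude (no block normalization)."""
    gy, gx = np.gradient(frame.astype(float))          # image gradients
    mag = np.hypot(gx, gy)                             # gradient magnitude
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0       # unsigned orientation in [0, 180)
    h, w = frame.shape
    feats = []
    for y in range(0, h - cell + 1, cell):
        for x in range(0, w - cell + 1, cell):
            m = mag[y:y + cell, x:x + cell].ravel()
            a = ang[y:y + cell, x:x + cell].ravel()
            hist, _ = np.histogram(a, bins=bins, range=(0.0, 180.0), weights=m)
            feats.append(hist)
    v = np.concatenate(feats)
    n = np.linalg.norm(v)
    return v / n if n > 0 else v                       # L2-normalized descriptor

def knn_predict(train_X, train_y, x, k=1):
    """Classify descriptor x by majority vote among its k nearest
    training descriptors (Euclidean distance)."""
    d = np.linalg.norm(train_X - x, axis=1)
    idx = np.argsort(d)[:k]
    labels, counts = np.unique(train_y[idx], return_counts=True)
    return labels[np.argmax(counts)]
```

In the paper's setting the frames would come from the KTH and Weizmann video datasets and the k-NN stage could be swapped for an SVM; this sketch only shows how an orientation-histogram descriptor separates patterns with different dominant edge directions.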



