Enhancing Surveillance Vision-Based Human Action Recognition Using Skeleton Joint Swing and Angle Feature and Modified AlexNet-LSTM

Riky Tri Yunardi, Tri Arief Sardjono, Ronny Mardiyanto

Research output: Contribution to journal › Article › peer-review

Abstract

Human action recognition (HAR) identifies and classifies human activities by analyzing activity patterns, with feature extraction and classification accuracy being key challenges. In the field of vision-based surveillance, systems must accurately detect suspicious human activities to ensure public safety. A classifier model for action recognition proceeds from feature extraction through to classification. The classification step uses recurrent neural network architectures such as LSTM to handle sequential data. However, this approach struggles to process the spatial information in video data, necessitating a model that can learn spatiotemporal patterns from the feature data. To address these issues, this study proposes a novel method for classifying activities based on the pattern of 2D skeleton joint swing and angle features for each activity. Additionally, it introduces a modified AlexNet architecture with two LSTM layers, called AlexNet-2LSTM, to improve the accuracy of human activity classification. In the performance experiments, the proposed method was evaluated on the KTH and Weizmann datasets, both of which include videos of several people performing different actions. To demonstrate the accuracy of the proposed classifier model, it was compared against other state-of-the-art (SOTA) deep learning classifiers, namely Optimized-LSTM, Triple Parallel LSTM, Hybrid CNN-LSTM, LCSWnet, and CC-LSTM-CNN. AlexNet-2LSTM achieved a precision of 0.95, recall of 0.95, F1-score of 0.94, and accuracy of 0.96 on the KTH dataset, and 0.95, 0.94, 0.94, and 0.93, respectively, on the Weizmann dataset. These results highlight the proposed model's contribution to improving feature extraction and classification accuracy in vision-based HAR systems.
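The abstract does not give the exact layer configuration, so the sketch below is only a minimal illustration of the two ideas it names: an angle feature computed at a 2D skeleton joint, and a convolutional front end followed by two stacked LSTM layers for sequence classification. The layer sizes, the `joint_angle` helper, the 32x32 feature-map input, and the six output classes (matching the six KTH actions) are assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn

def joint_angle(parent, joint, child):
    # Angle (radians) at `joint` between the parent->joint and child->joint
    # segments of a 2D skeleton; inputs are (x, y) tensors. Illustrative only.
    v1, v2 = parent - joint, child - joint
    cos = torch.clamp(torch.dot(v1, v2) / (v1.norm() * v2.norm() + 1e-8), -1.0, 1.0)
    return torch.acos(cos)

class AlexNet2LSTM(nn.Module):
    # Compact AlexNet-style convolutional stack (sizes assumed) followed by
    # two stacked LSTM layers, applied to a sequence of per-frame feature maps.
    def __init__(self, num_classes=6, feat_channels=1, hidden=128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(feat_channels, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.Conv2d(64, 128, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        self.lstm = nn.LSTM(128 * 4 * 4, hidden, num_layers=2, batch_first=True)
        self.fc = nn.Linear(hidden, num_classes)

    def forward(self, x):                      # x: (batch, time, C, H, W)
        b, t = x.shape[:2]
        f = self.cnn(x.flatten(0, 1))          # fold time into the batch for the CNN
        f = f.flatten(1).view(b, t, -1)        # restore the time axis
        out, _ = self.lstm(f)                  # two stacked LSTM layers
        return self.fc(out[:, -1])             # classify from the last time step

# Example: a batch of 2 clips, 16 frames each, of 32x32 single-channel feature maps.
model = AlexNet2LSTM(num_classes=6)
logits = model(torch.randn(2, 16, 1, 32, 32))
print(logits.shape)  # torch.Size([2, 6])
```

The folding of the time dimension into the batch before the convolution, and unfolding it before the LSTM, is a common way to combine a per-frame CNN with a recurrent classifier; the paper's actual feature encoding and network depth may differ.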

Original language: English
Pages (from-to): 754-768
Number of pages: 15
Journal: International Journal of Intelligent Engineering and Systems
Volume: 18
Issue number: 1
Publication status: Published - 2025

Keywords

  • Deep learning classifier
  • Long short-term memory
  • Skeleton joint
  • Vision-based human action recognition

