International Research Journal of Engineering and Technology (IRJET)
e-ISSN: 2395-0056 | p-ISSN: 2395-0072
Volume: 11 Issue: 09 | Sep 2024
www.irjet.net
Ensembled Pre Trained Convolutional Neural Network Techniques for Human Activity Detection and Recognition
Suhruth B¹
¹Tata Consultancy Services
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - Human Activity Recognition (HAR) plays a pivotal role across a broad spectrum of applications, ranging from healthcare monitoring to security systems and human-computer interaction. This research introduces an ensemble approach that integrates multiple Convolutional Neural Networks (CNNs), each engineered to extract distinct features and representations from benchmark datasets encompassing a variety of human activities. Combining these CNNs within our ensemble framework led to a marked improvement in recognizing a diverse array of activities, underscoring the efficacy of CNNs for HAR. Moreover, we propose a methodological framework that harnesses the collective strength of ensemble CNNs, aiming to boost the accuracy and robustness of activity recognition. This approach not only targets high precision in HAR but also opens new avenues for deploying more dependable human activity recognition systems in real-life scenarios.
Key Words: Human activity recognition (HAR), Image, Deep Learning, Kinetics dataset, Inception V2, Convolutional Neural Network.
1. INTRODUCTION
The study and recognition of human activities play a foundational role in understanding how individuals interact and communicate within their environments. This process involves a detailed analysis of actions depicted in images or videos, ranging from simple movements such as walking or running to more intricate activities like peeling an apple. The ability to interpret these actions accurately is pivotal to understanding the context and nuances of a situation, offering valuable insights into human behavior and interactions.
Developing an automated system capable of recognizing human activities in visual media presents many challenges. Factors such as background disturbances, obstructions, variations in lighting conditions, and the overall quality of the image or video significantly complicate the task of action identification [1]. The complexity is further amplified by the diverse ways in which individuals from different cultural backgrounds or with unique habits might perform the same action, leading to potential ambiguities in interpretation.
In addressing these challenges, researchers have predominantly pursued two methodological approaches: unimodal methods, which rely on data from a single source such as photographs or video frames, and multimodal methods, which combine information from various sources to form a more comprehensive understanding of human activities. Multimodal approaches, in particular, examine different facets of human behavior, including emotional states and social interactions, offering a more nuanced perspective on activity recognition [2].
The advent of deep learning technologies, especially neural networks, has markedly advanced the field of human activity recognition. These technologies excel at learning complex patterns from large datasets, thereby enhancing both the accuracy and efficiency of recognition systems. This advancement allows researchers and practitioners to explore new algorithms, network architectures, and methodologies that could reshape how human actions are understood and analyzed.
This work underscores the importance of enhancing accuracy and reliability in HAR. It proposes a framework that not only improves recognition efficacy but also allows for the seamless integration of diverse CNN architectures, with the aim of making HAR systems more adaptable and better performing. Furthermore, it lays the groundwork for future exploration of ensemble approaches within HAR, including the potential inclusion of additional sensory data and the application of interpretability mechanisms. Through such efforts, the research community can continue to build upon the existing body of knowledge, identifying areas for improvement and moving closer to human-centric technologies that accurately reflect and respond to complex behaviors and interactions.
The motivation behind this research lies in the growing importance of HAR across various critical and everyday applications, such as healthcare monitoring, security systems, and human-computer interaction. Aiming to push the boundaries of what is currently achievable in HAR, we recognize the need for more sophisticated and reliable methods that can accurately interpret and classify a wide range of human activities from visual data. Traditional single-model approaches, while effective to a certain extent, often fall short in dealing with the complexity and variability of human actions. This gap presents a significant opportunity, leading us to explore an ensemble approach that leverages multiple CNNs. By integrating several CNNs, each tailored to capture distinct features and
© 2024, IRJET | Impact Factor value: 8.315 | ISO 9001:2008 Certified Journal | Page 713