
International Research Journal of Engineering and Technology (IRJET)

e-ISSN: 2395-0056

Volume: 11 Issue: 04 | Apr 2024

p-ISSN: 2395-0072

www.irjet.net

ENHANCING LIVE CCTV SYSTEMS: OBJECT DETECTION, TRACKING, AND EXPRESSIVE FOOTAGE STORAGE

Ankit Sharma1, Arnav Zutshi2, Mridul Sharma3, Anshika Sharma4, Prof. Pramila M. Chawan5

1, 2, 3, 4BTech Student, Dept. of Computer Engineering and IT, VJTI College, Mumbai, Maharashtra, India

5Associate Professor, Dept. of Computer Engineering and IT, VJTI College, Mumbai, Maharashtra, India

---------------------------------------------------------------------***---------------------------------------------------------------------

Abstract - As the volume of collected data grows, handling and storing it efficiently has become paramount. Advanced storage technologies offer promising solutions for enterprises and organizations dealing with large-scale data collection. This project focuses on enhancing the efficiency of CCTV storage for traffic cameras by selectively recording essential data. Leveraging pretrained weights from YOLOv3, the system employs frame-to-frame distance measurement to identify significant changes, thus avoiding unnecessary storage redundancy. The system loads the YOLOv3 weights through OpenCV's DNN module and uses ResNet18 for feature vector extraction, enabling real-time object detection and classification. The proposed model not only identifies the objects responsible for changes but also outlines them with bounding boxes, providing an optimized solution for robust and efficient traffic surveillance systems.

Cosine similarity is a metric commonly used to measure the similarity between two vectors by computing the cosine of the angle between them. By applying a threshold to the cosine similarity values, we can identify frames with low similarity, indicating significant changes or new objects entering the scene. The main objective of our project is to develop a video surveillance system that intelligently selects and saves only those frames that contain meaningful changes or events. This selective recording approach not only optimizes storage utilization but also ensures that important information is preserved for subsequent analysis or retrieval. In this paper, we describe the methodology used for integrating the YOLOv3 and ResNet18 models, the process of calculating cosine similarities, and the implementation of our selective frame-saving mechanism. We also present experimental results and performance evaluations that compare the proposed system with traditional continuous recording methods.
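The thresholding step described above can be illustrated with a minimal sketch. It assumes each frame has already been reduced to a feature vector (for example by ResNet18) and works on plain Python lists. The function names `cosine_similarity` and `select_frames`, the threshold of 0.9, the handling of zero vectors, and the choice to always keep the first frame are illustrative assumptions, not values or design decisions taken from this paper.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between vectors a and b."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: a zero vector is treated as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def select_frames(features: List[Sequence[float]],
                  threshold: float = 0.9) -> List[int]:
    """Return indices of frames whose similarity to the previous frame
    falls below `threshold` (hypothetical value), i.e. frames to save."""
    if not features:
        return []
    kept = [0]  # Assumption: the first frame is always saved.
    for i in range(1, len(features)):
        # Compare each frame with the one immediately before it.
        if cosine_similarity(features[i - 1], features[i]) < threshold:
            kept.append(i)
    return kept


# Toy feature vectors: frame 1 barely differs from frame 0,
# frame 2 changes sharply, frame 3 repeats frame 2.
toy_features = [[1.0, 0.0], [0.99, 0.01], [0.0, 1.0], [0.0, 1.0]]
print(select_frames(toy_features))  # [0, 2]
```

In this toy run only frames 0 and 2 are kept, since the other two frames are nearly identical to their predecessors. In practice the threshold would need to be tuned on real footage.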

Key Words: YOLOv3, object detection, OpenCV, ResNet

1. INTRODUCTION

Overall, this research contributes to the advancement of video surveillance technologies by combining cutting-edge deep learning techniques with intelligent data selection strategies, paving the way for more resource-efficient and responsive surveillance systems in various domains.

In recent years, advancements in computer vision and deep learning have revolutionized the field of video surveillance. The ability to accurately detect and track objects in real-time has become crucial for various applications, including security monitoring, traffic management, and behavior analysis. One of the key challenges in video surveillance systems is the efficient handling and storage of vast amounts of data generated by continuous video streams.

2. LITERATURE REVIEW

The evolution of real-time object detection in computer vision has seen significant advancements with the introduction of the You Only Look Once (YOLO) framework. Redmon, Divvala, Girshick, and Farhadi (2016) introduced the YOLO framework, which revolutionized object detection by providing a unified approach for real-time processing. This framework streamlined object detection tasks, making them more efficient and accessible for various applications requiring immediate and accurate object identification.

This paper addresses this challenge by leveraging state-of-the-art techniques in object detection and feature extraction. Specifically, we use pretrained weights for YOLOv3, a popular deep learning model known for fast and accurate object detection. Additionally, we integrate the ResNet18 model for feature vector extraction, which enables us to capture rich representations of objects in video frames.

Building upon the success of YOLO, Redmon and Farhadi (2018) proposed YOLOv3 as an incremental improvement, further enhancing the capabilities of real-time object detection. YOLOv3 represents a refinement of the original

The novelty of our approach lies in using these feature vectors to calculate cosine similarities between consecutive frames in a video sequence.

Impact Factor value: 8.226

ISO 9001:2008 Certified Journal

Page 2062