Comparison of Various RCNN techniques for Classification of Object from Image


International Research Journal of Engineering and Technology (IRJET)

e-ISSN: 2395-0056 | p-ISSN: 2395-0072

Volume: 04 Issue: 07 | July 2017

www.irjet.net

Radhamadhab Dalai1, Kishore Kumar Senapati2

1PhD Student, Dept. of Computer Science & Engineering, BIT Mesra, Ranchi, Jharkhand, India

2Professor, Dept. of Computer Science & Engineering, BIT Mesra, Ranchi, Jharkhand, India

Abstract - Object recognition is a well-known problem domain in computer vision and robot vision. In earlier years, CNNs played a key role in neuroscience in solving many problems related to the identification and recognition of objects. Because the visual system of the brain shares many properties with CNNs, CNNs are a natural way to model and test the problem of object classification and identification. A CNN is typically a feed-forward architecture; the visual system, on the other hand, is based on a recurrent CNN (RCNN), which incorporates recurrent connections into each convolutional layer. In the middle layers, each unit is modulated by the activities of its neighboring units. Here, various RCNN techniques (RCNN, Fast RCNN, Faster RCNN) are implemented for identifying bikes using the Caltech-101 database, and their performances are then compared.

Key Words: DNN, CNN, RCNN, Fast RCNN, Faster RCNN

1. INTRODUCTION

Recently, deep neural network (DNN) techniques, including convolutional neural networks (CNN) [1] and residual neural networks, have shown high recognition accuracy compared to traditional methods (artificial neural networks, decision trees, etc.). However, experience reveals that a number of factors still keep scientists from deriving the full performance benefits of large DNNs. We summarize these challenges as follows: (1) the large number of hyperparameters that have to be tuned during the training phase, leading to repeated recomputation over a large design space; (2) the sheer volume of data used for training, resulting in prolonged training time; and (3) the question of how to effectively utilize the underlying hardware (compute, network and storage) to achieve maximum performance during training.

Fast R-CNN is an object detection algorithm proposed by Ross Girshick in 2015. It builds on previous work to efficiently classify object proposals using deep convolutional networks. Compared to previous work, Fast R-CNN employs a region of interest (RoI) pooling scheme that allows the computations from the convolutional layers to be reused. In just 3 years, the research community has progressed from Krizhevsky et al.'s original result to R-CNN, and finally to such powerful results as Faster R-CNN [2].
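The region of interest pooling idea can be illustrated with a minimal sketch. This is not the Fast R-CNN implementation: it assumes a single-channel feature map stored as nested lists, RoIs given directly in feature-map coordinates as end-exclusive (row0, col0, row1, col1) tuples, and integer bin edges. The names `roi_max_pool`, `shared_features` and `proposals` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

FeatureMap = List[List[float]]  # a single-channel H x W feature map (assumption)


def roi_max_pool(feature_map: FeatureMap,
                 roi: Tuple[int, int, int, int],
                 out_h: int = 2,
                 out_w: int = 2) -> FeatureMap:
    """Max-pool the region roi = (row0, col0, row1, col1), end-exclusive,
    into a fixed out_h x out_w grid, whatever the size of the region."""
    r0, c0, r1, c1 = roi
    h, w = r1 - r0, c1 - c0
    if h < out_h or w < out_w:
        raise ValueError("RoI is smaller than the output grid")
    pooled = []
    for i in range(out_h):
        # Integer bin edges: consecutive bins tile the RoI without gaps.
        rs = r0 + (i * h) // out_h
        re = r0 + ((i + 1) * h) // out_h
        row = []
        for j in range(out_w):
            cs = c0 + (j * w) // out_w
            ce = c0 + ((j + 1) * w) // out_w
            row.append(max(feature_map[r][c]
                           for r in range(rs, re)
                           for c in range(cs, ce)))
        pooled.append(row)
    return pooled


# The convolutional feature map is computed once per image
# (here a toy 4 x 6 grid stands in for it) ...
shared_features = [[float(r * 6 + c) for c in range(6)] for r in range(4)]
# ... and every proposal is pooled from that same map, so the
# convolutional work is not repeated per proposal.
proposals = [(0, 0, 4, 4), (1, 2, 4, 6)]
pooled_regions = [roi_max_pool(shared_features, p) for p in proposals]
```

Both proposals yield a 2 x 2 output even though their sizes differ, which is what lets a fixed-size classifier head sit on top of arbitrarily shaped regions.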

© 2017, IRJET | Impact Factor value: 5.181

Seen in isolation, results like Faster R-CNN can seem like incredible leaps of genius that would be unapproachable. Yet such advancements are really the sum of intuitive, incremental improvements made through years of hard work and collaboration, with R-CNN, Fast R-CNN [1,3], and Faster R-CNN each contributing ideas of their own.

We describe how we compose hardware, software and algorithmic components to derive DNN models that are not only efficient and optimized but can also be rapidly repurposed for other tasks, such as identification of objects in motion or assignment of transverse momentum to these motions. This work extends previous work on designing a generalized hardware-software framework that simplifies the use of deep learning techniques in big data problems.

1.1 CNN Fundamentals

Convolutional neural networks are an important class of learnable representations applicable to numerous computer vision problems, among others. Deep CNNs, in particular, are composed of several layers of processing, each involving linear as well as non-linear operators, which are learned jointly, in an end-to-end manner, to solve a particular task. These methods are now the dominant approach for feature extraction from audiovisual and textual data.
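As a rough illustration of one such processing layer, the sketch below chains a linear operator (a 2-D convolution, implemented as cross-correlation) with two non-linear ones (ReLU and max pooling). It assumes a single-channel input stored as nested lists, and the kernel is set by hand; in a trained CNN the kernel weights would be learned. All function and variable names here are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Linear step: 'valid' 2-D cross-correlation (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def relu(x: Matrix) -> Matrix:
    """Non-linear step: clamp negative responses to zero."""
    return [[max(0.0, v) for v in row] for row in x]


def max_pool2x2(x: Matrix) -> Matrix:
    """Non-overlapping 2 x 2 max pooling; an odd trailing row or column is dropped."""
    return [[max(x[i][j], x[i][j + 1], x[i + 1][j], x[i + 1][j + 1])
             for j in range(0, len(x[0]) - 1, 2)]
            for i in range(0, len(x) - 1, 2)]


# Toy 5 x 5 image: dark on the left, bright on the right.
image = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]

# Hand-set kernel that responds to dark-to-bright vertical edges
# (an assumption for illustration; a real CNN learns these weights).
kernel = [[-1.0, 1.0],
          [-1.0, 1.0]]

features = max_pool2x2(relu(conv2d_valid(image, kernel)))
```

The pooled output is non-zero only in the column that covers the edge, which shows how a stack of such layers turns raw pixels into localized feature responses.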

Figure 1: Diagram of common convolutional network

ISO 9001:2008 Certified Journal | Page 3147

