A Novel Text Detection System

International Research Journal of Engineering and Technology (IRJET)

e-ISSN: 2395-0056

Volume: 04 Issue: 04 | Apr-2017

p-ISSN: 2395-0072

www.irjet.net

Prof. Harish Barapatre, Prashant Athavale, Sameer Suryavanshi, Abhishek Deware

Dept. of Computer Engineering, Yadavrao Tasgaonkar Institute of Engineering and Technology

Abstract - The proposed method is a novel approach that uses three new features to detect text objects comprising two or more isolated characters in images. Each character is a part in the model, and every two neighboring characters are connected by a link. Two characters and the link connecting them are defined as a text unit. For every candidate part we compute a character energy and a link energy. The character energy is based on the observation that a character stroke forms two edges with high similarity in length, curvature, and orientation; the link energy is based on the similarities in color, size, stroke width, and spacing between characters. We combine the character and link energies to compute a text unit energy, which measures the likelihood that the candidate is a text object. The proposed system inherits the properties of characters and discriminates text from other objects effectively.

1. INTRODUCTION

Identifying text in an image is almost effortless for a human being but very difficult for a machine. Detecting text in natural images, as opposed to scans of printed pages, faxes, and business cards, is an important step for a number of computer vision applications, such as computerized aids for the visually impaired, automatic geocoding of businesses, and robotic navigation in urban environments.

Visual saliency is fundamental to the human visual system because it reduces the amount of incoming visual data that has to be processed. As such, it has been a well-studied problem within multiple disciplines, including cognitive psychology, neurobiology, and computer vision. The aim of salient object detection is to highlight the whole attention-grabbing object with a well-defined boundary. Previous saliency detection approaches can be broadly classified into local and global methods. Measures of ‘objectness’ build upon saliency detection in order to identify windows within an image that are likely to contain an object of interest.

Although saliency detection models facilitate scene text detection, they share a common inherent limitation: they are distracted by other salient objects in the scene. The approach here differs from these existing methods in that it proposes a text-specific saliency detection model (i.e., a characterness model) and demonstrates its robustness when applied to scene text detection. This text detection approach is capable of identifying individual, bounded units of text, rather than areas with text-like characteristics. The unit in the case of text is the character, and much like the ‘object’, it has a particular set of characteristics, including a closed boundary. Text is made up of a set of interrelated characters. Therefore, effective text detection should be able to compensate for, and exploit, these dependencies between characters.

Three new characterness cues are developed. Instead of a simple linear combination, a Bayesian approach is used to model the joint probability that a candidate region represents a character. The probability distributions of the cues on both characters and non-characters are obtained from training samples. In order to model and exploit the inter-dependencies between characters, we use the graph cuts algorithm to carry out inference over an MRF designed for the purpose.

This approach is the first to present a saliency detection model that measures the characterness of image regions. This text-specific saliency detection model is less likely to be distracted by other objects that are usually considered salient in general saliency detection models. Promising experimental results on benchmark datasets demonstrate that the characterness approach outperforms the state of the art.
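The Bayesian combination of characterness cues can be illustrated with a small sketch. It is not the system's actual implementation: it assumes the cues are conditionally independent given the class, that each cue is scaled to [0, 1], and that the class-conditional distributions are simple smoothed histograms built from training samples. The function names, the bin count, the prior, and the toy training values are all hypothetical.

```python
from typing import List, Sequence


def bin_index(value: float, bins: int) -> int:
    """Map a cue value in [0, 1] to a histogram bin, clamping out-of-range values."""
    return max(0, min(int(value * bins), bins - 1))


def fit_histogram(values: Sequence[float], bins: int = 10) -> List[float]:
    """Estimate P(cue bin | class) from training values, with add-one smoothing."""
    counts = [1.0] * bins  # add-one smoothing avoids zero probabilities
    for v in values:
        counts[bin_index(v, bins)] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def characterness(cues: Sequence[float],
                  char_hists: Sequence[List[float]],
                  bg_hists: Sequence[List[float]],
                  prior_char: float = 0.5) -> float:
    """Posterior probability that a region is a character, by Bayes' rule.

    Assumes (hypothetically) that the cues are independent given the class.
    """
    p_char = prior_char
    p_bg = 1.0 - prior_char
    for value, h_char, h_bg in zip(cues, char_hists, bg_hists):
        p_char *= h_char[bin_index(value, len(h_char))]
        p_bg *= h_bg[bin_index(value, len(h_bg))]
    return p_char / (p_char + p_bg)


# Toy training samples (made up): one list of values per cue.
char_train = [[0.80, 0.90, 0.70, 0.85], [0.70, 0.75, 0.90, 0.80], [0.90, 0.60, 0.80, 0.70]]
bg_train = [[0.10, 0.20, 0.30, 0.15], [0.20, 0.10, 0.25, 0.30], [0.05, 0.20, 0.10, 0.30]]

char_hists = [fit_histogram(v) for v in char_train]
bg_hists = [fit_histogram(v) for v in bg_train]

score_text_like = characterness([0.85, 0.80, 0.75], char_hists, bg_hists)
score_background = characterness([0.10, 0.20, 0.20], char_hists, bg_hists)
```

Multiplying per-cue likelihoods, rather than adding weighted cue values, lets a single strongly non-character-like cue pull the posterior down even when the other cues look favorable.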

© 2017, IRJET | Impact Factor value: 5.181

1.1 EXISTING SYSTEM

Previous segmentation methods segment only a single object, and most of them apply only to binary or gray-scale images. In existing retrieval systems, various algorithms have been proposed to improve the matching between two images. Techniques such as BoW (bag of words), GIST, MSER, GMM, and SIFT are used to extract features. Among these, mixture models are commonly used in all detectors. Each model improves at least one parameter that affects the matching performance. Pixels from the segmented object are known as feature pixels. Depending on the features of these pixels, pixel matching is performed between multiple images, and the matching image is then retrieved. The datasets used are commonly available real-world datasets such as Oxford Buildings and INRIA Holidays.

A Maximally Stable Extremal Region (MSER) is a connected component of an appropriately thresholded image. The word ‘extremal’ refers to the property that all pixels inside the MSER have either higher (bright extremal regions) or lower (dark extremal regions) intensity than all the pixels on its outer boundary. The ‘maximally stable’ in MSER describes the property optimized in the threshold selection process.
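The ‘extremal’ property can be checked directly on a toy image. The sketch below is only an illustration, not a full MSER detector: it grows a dark region from a seed pixel at a given threshold, verifies that every pixel inside is darker than every pixel on the outer boundary, and records the region area over several thresholds (a region whose area barely changes across thresholds is the kind a stability criterion would favor). The helper names, the 4-connectivity choice, and the 5x5 image are assumptions made for the example.

```python
from collections import deque

NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # 4-connectivity (assumed)


def component_at(image, seed, threshold):
    """Return the connected set of pixels with intensity <= threshold that contains seed."""
    rows, cols = len(image), len(image[0])
    if image[seed[0]][seed[1]] > threshold:
        return set()
    region, queue = {seed}, deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in NEIGHBORS:
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region and image[nr][nc] <= threshold:
                region.add((nr, nc))
                queue.append((nr, nc))
    return region


def outer_boundary(image, region):
    """Pixels outside the region that touch it."""
    rows, cols = len(image), len(image[0])
    boundary = set()
    for r, c in region:
        for dr, dc in NEIGHBORS:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in region:
                boundary.add((nr, nc))
    return boundary


def is_dark_extremal(image, region):
    """True if every region pixel is darker than every outer-boundary pixel."""
    boundary = outer_boundary(image, region)
    if not region or not boundary:
        return False
    darkest_outside = min(image[r][c] for r, c in boundary)
    brightest_inside = max(image[r][c] for r, c in region)
    return brightest_inside < darkest_outside


# Toy gray-scale image (made up): a dark 2x2 blob on a bright background.
image = [
    [200, 200, 200, 200, 200],
    [200,  20,  30, 200, 200],
    [200,  25,  35, 200, 200],
    [200, 200, 200, 200, 200],
    [200, 200, 200, 200, 200],
]

region = component_at(image, (1, 1), 50)
areas = [len(component_at(image, (1, 1), t)) for t in (50, 100, 150)]
```

Here the blob keeps the same area at every tested threshold, which is the behavior the stability criterion rewards.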

ISO 9001:2008 Certified Journal | Page 487

