Nowadays, there is an enormous demand for storing information printed on paper, such as in books or newspapers.
One existing way to store this information is to scan the desired text, but the result is stored as an image, which does not
help with further processing. For instance, text stored as scanned images cannot be read word by word or line by line, and it
cannot be reused unless we retype the whole content ourselves. Detection of text from documents in which text
is embedded in complex colored document images is a very challenging problem. Many potential users want to
extract text from images, archive documents, and so on. For this reason, users need Optical Character Recognition (OCR). It
aims at detecting textual regions in the document and separating them from the graphics portion. Extracting information
directly from application forms in this way also saves a lot of time.