International Research Journal of Engineering and Technology (IRJET)

e-ISSN: 2395-0056

Volume: 12 Issue: 12 | Dec 2025

p-ISSN: 2395-0072

www.irjet.net

Automated document processing combining OCR and Generative AI for efficient text extraction and summarization

Prof. Rani Prakash¹, Neelamma²

¹ Professor, Master of Computer Application, VTU, Kalaburagi, Karnataka, India
² Student, Master of Computer Application, VTU, Kalaburagi, Karnataka, India

ABSTRACT - In recent years, the growing volume of unstructured textual data in scanned documents, images, and handwritten notes has created a pressing need for automated, accurate, and scalable document processing solutions. This study presents a hybrid framework that combines Optical Character Recognition (OCR) for precise text extraction with Generative Artificial Intelligence (Generative AI) for intelligent summarization and contextual understanding. The OCR module converts diverse document formats, including printed, handwritten, and multilingual content, into machine-readable text. The Generative AI model then processes the extracted text to generate concise, coherent, and context-aware summaries that preserve the semantic essence of the original content. The results indicate that combining OCR with Generative AI offers a powerful, domain-adaptable solution for end-to-end document automation, enabling organizations to transform large-scale unstructured data into actionable insights efficiently. This integration reduces manual workload, enhances information accessibility, and supports faster decision-making in domains such as law, healthcare, finance, and education.

1. INTRODUCTION

Automated document processing has become an essential requirement in today’s digital ecosystem due to the massive volume of unstructured data being generated across industries. Manual processing is slow, error-prone, and expensive, which has led to the adoption of intelligent solutions that integrate text recognition and artificial intelligence for knowledge extraction and summarization [1].

The foundation of such automation lies in Optical Character Recognition, which enables the conversion of printed or handwritten text into machine-readable formats. Early OCR engines demonstrated the potential to handle multilingual documents and different layouts, thereby establishing a reliable basis for digital text extraction [2]. With the advancement of deep learning techniques, OCR systems have achieved remarkable improvements in accuracy and efficiency. The integration of recurrent neural networks and sequence models has made it possible to process even degraded or noisy documents, expanding the usability of OCR beyond traditional constraints [3].

Once the text is extracted, the challenge shifts toward interpreting the vast amount of data. Transformer-based architectures have emerged as powerful tools for understanding textual context, allowing automated systems to perform classification, extraction, and semantic analysis with high accuracy. These models provide the contextual foundation required for efficient document processing [4].

Further extending these capabilities, unified text-to-text models have enabled the treatment of multiple language tasks under a single framework. This approach has proved particularly useful in abstractive summarization, where extracted text is not merely shortened but also rephrased into coherent, human-like summaries suitable for decision-making [5].

2. PROBLEM STATEMENT

In most organizations and institutions, documents are produced, stored, and shared in a wide variety of formats, including printed papers, scanned PDFs, handwritten notes, and image-based records. At the same time, the exponential growth of digital content has created a pressing need for summarization. Users are frequently overwhelmed by large volumes of information and require quick, concise, and contextually accurate summaries rather than raw text dumps. Existing extractive summarization techniques often fail to capture the essence of documents, while traditional rule-based methods lack adaptability across diverse domains and languages.

3. OBJECTIVES

The main objective of this study is to design and develop an automated system that combines Optical Character Recognition (OCR) and Generative Artificial Intelligence (Gen AI) to streamline text extraction and summarization from diverse document formats. Ultimately, the objective is not just to digitize documents but to elevate them into meaningful knowledge assets. By combining OCR and Generative AI, the study seeks to deliver an end-to-end solution that enhances accessibility, reduces processing time, improves accuracy, and empowers organizations to manage information intelligently and effectively.
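The two-stage flow this paper describes, OCR text extraction followed by summarization, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`clean_ocr_text`, `frequency_summary`, `process_document`) are hypothetical, the OCR engine is passed in as a callable, and the default summarizer is a simple word-frequency extractive stand-in rather than a generative model, used only so the sketch runs with the standard library.

```python
import re
from collections import Counter
from typing import Callable


def clean_ocr_text(raw: str) -> str:
    """Normalize raw OCR output before summarization."""
    # Heuristic: rejoin words hyphenated across a line break
    # ("summa-\nrization" -> "summarization"). It will also join
    # genuine hyphenated compounds that happen to break at a line end.
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", raw)
    # Collapse runs of whitespace, including newlines, to single spaces.
    return re.sub(r"\s+", " ", text).strip()


def frequency_summary(text: str, max_sentences: int = 2) -> str:
    """Placeholder summarizer (assumption, NOT generative).

    Scores each sentence by the summed frequency of its longer words and
    keeps the top-scoring sentences in their original order.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text) if s]
    freq = Counter(w for w in re.findall(r"[a-z']+", text.lower()) if len(w) > 3)

    def score(sentence: str) -> int:
        return sum(freq[w] for w in re.findall(r"[a-z']+", sentence.lower()))

    ranked = sorted(range(len(sentences)), key=lambda i: -score(sentences[i]))
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)


def process_document(
    source: object,
    ocr: Callable[[object], str],
    summarize: Callable[[str], str] = frequency_summary,
) -> dict:
    """Run the pipeline: OCR -> cleanup -> summarization."""
    text = clean_ocr_text(ocr(source))
    return {"text": text, "summary": summarize(text)}
```

In practice the `ocr` callable could wrap an engine such as Tesseract (for example via `pytesseract.image_to_string`), and `summarize` could wrap a text-to-text model; the paper excerpt does not name the specific engines it uses, so those choices are left open here.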

© 2025, IRJET | Impact Factor value: 8.315 | ISO 9001:2008 Certified Journal | Page 293

