International Journal of Technology and Applied Science
E-ISSN: 2230-9004
Impact Factor: 10.31
A Widely Indexed Open Access Peer Reviewed Multidisciplinary Bi-monthly Scholarly International Journal
A Comprehensive Study on Text Detection and Extraction from Images and PDF Documents
| Author(s) | Mayank Deshmukh, Saloni Rabde, Priyanka Makode, Sourabh Jasuja, Prof. Bhavesh Khasdev |
|---|---|
| Country | India |
| Abstract | The growing need for digitization and intelligent document processing has led to significant advancements in text detection and extraction technologies. This paper reviews methodologies and tools employed for extracting textual information from images and Portable Document Format (PDF) files. Both traditional Optical Character Recognition (OCR) techniques and modern deep learning-based approaches are discussed. Five major research contributions in this area are analyzed in detail. The paper further explores challenges in handling complex document layouts, multilingual text, and low-quality images, and highlights research gaps and future directions that emphasize the potential of artificial intelligence and multimodal learning to enhance text extraction accuracy and efficiency. |
| Keywords | OCR, Text Extraction, Deep Learning, LayoutLM, Scene Text Detection, Document Analysis |
| Field | Engineering |
| Published In | Volume 17, Issue 4, April 2026 |
| Published On | 2026-04-03 |
| Cite This | A Comprehensive Study on Text Detection and Extraction from Images and PDF Documents - Mayank Deshmukh, Saloni Rabde, Priyanka Makode, Sourabh Jasuja, Prof. Bhavesh Khasdev - IJTAS Volume 17, Issue 4, April 2026. |
A CrossRef DOI is assigned to each research paper published in this journal. The IJTAS DOI prefix is 10.71097/IJTAS.
All research papers published on this website are licensed under Creative Commons Attribution-ShareAlike 4.0 International License, and all rights belong to their respective authors/researchers.