Original Research

HYBRID APPROACH FOR FRAGMENTING DEVNAGARI DOCUMENT IMAGES TO CHARACTER

Dr. SARIKA T DEOKATE

Vol 17, No 07 ( 2022 )   |  DOI: 10.5281/zenodo.6891896   |   Author Affiliation: Computer Engineering, Dr. D. Y. Patil Institute of Technology Pimpri Pune, India.   |   Licensing: CC 4.0   |   Pg no: 683-689   |   To cite: Dr. SARIKA T DEOKATE. (2022). HYBRID APPROACH FOR FRAGMENTING DEVNAGARI DOCUMENT IMAGES TO CHARACTER. 17(07), 683–689. https://doi.org/10.5281/zenodo.6891896   |   Published on: 20-07-2022

Abstract

In any OCR system, segmenting the manuscript needs special attention, as the Devnagari script raises numerous issues. Printed and handwritten manuscripts each pose distinct problems that need to be studied in depth. Fragmenting a handwritten manuscript requires many pre-processing tasks, because such manuscripts contain diverse strokes, ink variations, extra drawings, slanted lines and words, and many other irregularities. Printed manuscripts pose difficulties of their own, including many font types and sizes, degraded manuscripts, and dataset availability. In this work, we tried a fragmentation method that generally works for both printed and handwritten documents. With our system, lines and words are fragmented to a superior extent, and in character fragmentation approximately 85% of characters are fragmented correctly. Some characters may not be fragmented correctly and may remain partially joined because of noise, overlapping characters, and similar effects. In future, we will work on these issues.


Keywords

Classification, Dilation, Erosion, NLP, OCR, Segmentation, Vertical Projection
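
The abstract describes fragmenting a page into lines, then words, then characters, and the keywords mention vertical projection. The hybrid method itself is not specified on this page, so the following is only a minimal sketch of generic projection-profile segmentation, not the author's implementation. It assumes a binarized page stored as a NumPy array with 1 for ink and 0 for background. All function names, the `min_gap` parameter, and the 0.8 density threshold used to locate the header line are hypothetical choices made for illustration.

```python
import numpy as np

def runs(mask):
    """Return (start, end) pairs, end exclusive, for each run of True in a 1-D mask."""
    padded = np.concatenate(([False], mask, [False]))
    changes = np.flatnonzero(padded[1:] != padded[:-1])
    return list(zip(changes[0::2].tolist(), changes[1::2].tolist()))

def split_rows(ink):
    """Horizontal projection: consecutive rows that contain ink form one text line."""
    return runs(ink.sum(axis=1) > 0)

def split_cols(ink, min_gap=1):
    """Vertical projection: consecutive columns that contain ink form one segment.

    Segments separated by fewer than min_gap blank columns are merged, so a
    larger min_gap can be used to keep narrow intra-word gaps together.
    """
    merged = []
    for start, end in runs(ink.sum(axis=0) > 0):
        if merged and start - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged

def strip_header_line(word, density=0.8):
    """Blank the densest rows, assumed here to be the header line joining the characters."""
    row_ink = word.sum(axis=1)
    out = word.copy()
    out[row_ink >= density * row_ink.max()] = 0
    return out

# Synthetic page: two text lines; the first holds two words, and the first
# word has two characters joined by a header line.
page = np.zeros((12, 20), dtype=int)
page[1, 2:11] = 1      # header line of word 1
page[2:5, 2:5] = 1     # character 1
page[2:5, 8:11] = 1    # character 2
page[1:5, 14:18] = 1   # word 2
page[7:10, 3:6] = 1    # second text line

lines = split_rows(page)                     # [(1, 5), (7, 10)]
first_line = page[lines[0][0]:lines[0][1]]
words = split_cols(first_line)               # [(2, 11), (14, 18)]
first_word = first_line[:, words[0][0]:words[0][1]]
chars = split_cols(strip_header_line(first_word))  # [(0, 3), (6, 9)]
```

In this sketch the header line has to be blanked before the vertical projection is taken, because it connects the characters of a word and would otherwise leave no empty columns between them. Touching or overlapping characters still produce no gap in the projection, which is consistent with the partially joined characters the abstract reports.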