Towards Full-page Offline Bangla Handwritten Text Recognition using Image-to-Sequence Architecture
Date Issued
2023-01-01
Author(s)
Ghosh, Ayanabha
DOI
10.1109/SILCON59133.2023.10404241
Abstract
In this work, I have developed an offline Handwritten Text Recognition (HTR) model architecture based on neural networks that can be trained to recognise whole pages of handwritten Bangla (Bengali) text without image segmentation. The model aims to extract text from an image and sequence it accurately without imposing any limits on the orientation, layout, or size of the text. Because Bengali is a resource-constrained Indic language, properly annotated datasets of scanned Bangla handwritten scripts are lacking: the publicly available datasets are either not annotated or contain incorrect annotations in some cases. Therefore, in this work, I have introduced a new dataset, 'Bongabdo', which consists of full-page handwritten scripts collected from a wide variety of contributors of various age groups, occupations, and genders. Significant time has also been spent on preparing and modifying the datasets that are currently publicly available. Further, a recently proposed state-of-the-art Image-to-Sequence architecture has been applied to these images under different hyperparameter settings, and the resulting models have been evaluated in terms of Character Error Rate (CER), Word Error Rate (WER), and Sequence Error Rate (SER) to produce a comparative study.
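The abstract names three evaluation metrics without defining them. The following is a minimal sketch of how CER, WER, and SER are commonly computed, not necessarily the exact procedure used in this work. It assumes that CER and WER are Levenshtein edit distances normalised by the total reference length, that SER is the fraction of sequences containing at least one error, and that a "character" is a single Unicode code point (Bangla conjuncts span several code points, so a grapheme-level CER would give different values). The function names `edit_distance` and `error_rates` are hypothetical.

```python
from typing import Dict, Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting the first i reference items
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def error_rates(refs: Sequence[str], hyps: Sequence[str]) -> Dict[str, float]:
    """Corpus-level CER, WER, and SER (assumed definitions, see lead-in).

    Assumes refs is non-empty and contains at least one character and word.
    """
    char_edits = char_total = word_edits = word_total = seq_errors = 0
    for ref, hyp in zip(refs, hyps):
        # Character level: strings are compared code point by code point.
        char_edits += edit_distance(ref, hyp)
        char_total += len(ref)
        # Word level: whitespace tokenisation is an assumption.
        ref_words, hyp_words = ref.split(), hyp.split()
        word_edits += edit_distance(ref_words, hyp_words)
        word_total += len(ref_words)
        # Sequence level: any mismatch counts the whole sequence as wrong.
        seq_errors += int(ref != hyp)
    return {
        "CER": char_edits / char_total,
        "WER": word_edits / word_total,
        "SER": seq_errors / len(refs),
    }
```

With this sketch, `error_rates(["abc def", "xyz"], ["abd def", "xyz"])` counts one character substitution over ten reference characters, one word error over three reference words, and one erroneous sequence out of two.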