
Development of Character Recognition Model Inspired by Visual Explanations

Journal
IEEE Transactions on Artificial Intelligence
ISSN
2691-4581
Date Issued
2024
Author(s)
Chetan Ralekar
Shubham Choudhary
Tapan Kumar Gandhi
Santanu Chaudhury
DOI
10.1109/TAI.2023.3289167
Abstract
Deep neural networks (DNNs) currently constitute the best-performing artificial vision systems. However, humans are still better at recognizing many characters, especially distorted, ornamental, or calligraphic ones, compared with highly sophisticated recognition models. Understanding the mechanism of character recognition by humans may give some cues for building better recognition models. However, the appropriate methodological approach to using these cues has not been much explored for developing character recognition models. Therefore, this paper tries to understand the process of character recognition by humans and DNNs by generating visual explanations for their respective decisions. We have used eye tracking to assay the spatial distribution of information hotspots for humans via fixation maps. We have proposed a gradient-based method for visualizing the reasoning behind the model's decision through visualization maps and have shown that our method is better than other class activation mapping methods. Qualitative comparison between visualization maps and fixation maps reveals that both the model and humans focus on similar regions of a character in the case of correctly classified characters. However, when the focused regions differ for humans and the model, the characters are typically misclassified by the latter. Hence, we propose to use the fixation maps as a supervisory input to train the model, which ultimately results in improved recognition performance and better generalization. As the proposed model gives some insights into the reasoning behind its decision, it can find applications in fields such as surveillance and medical applications, where explainability helps to determine system fidelity.
Impact Statement
Humans and DNNs rely on selective information uptake while classifying a character. This information selection strategy can be understood by visualizing the important, informative character regions that ultimately govern the decisions of humans and DNNs. Our approach of deriving visual explanations from eye-fixation and visualization maps draws an interesting parallel between the processes undertaken to recognize characters by humans and DNNs. Eye-fixation maps, which emphasize that humans focus on highly diagnostic and discriminating character regions, can be used as a supervisory input and have the potential to make artificial character recognition systems more accurate. The experimental results from this research attest that superior machine vision performance can be achieved when augmented with cues from cognitive experiments. This paper therefore motivates the need for substantial efforts in exploring various aspects of cognitive processes to develop the next generation of AI models for mimicking human intelligence in AI-driven engineering systems.
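The abstract's central training idea — using human fixation maps as a supervisory input alongside the classification objective — can be illustrated with a minimal sketch. Note that the paper's actual loss formulation is not given in the abstract; the function names (`fixation_alignment_loss`, `total_loss`), the MSE-based alignment penalty, and the weighting factor `lam` below are all illustrative assumptions, not the authors' method.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over class logits
    e = np.exp(z - z.max())
    return e / e.sum()

def cross_entropy(probs, label):
    # Standard classification loss for a single example
    return -np.log(probs[label] + 1e-12)

def fixation_alignment_loss(model_map, fixation_map):
    # Hypothetical penalty: normalize the model's visualization map and the
    # human eye-fixation map to probability distributions, then compare them
    # with mean squared error (the paper may use a different divergence).
    m = model_map / (model_map.sum() + 1e-12)
    f = fixation_map / (fixation_map.sum() + 1e-12)
    return float(np.mean((m - f) ** 2))

def total_loss(logits, label, model_map, fixation_map, lam=0.5):
    # Combined objective: classification loss plus a weighted term that
    # encourages the model to attend where humans fixate.
    return cross_entropy(softmax(logits), label) + \
        lam * fixation_alignment_loss(model_map, fixation_map)
```

Under this sketch, a model whose visualization map matches the human fixation map pays only the classification loss, while one that attends to different character regions incurs an extra penalty — capturing the abstract's observation that mismatched focus regions correlate with misclassification.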
Subjects
  • Character recognition...
  • cognitive processes
  • explainable architect...
  • eye-tracking

Copyright © 2016-2025  Indian Institute of Technology Jodhpur

Developed and maintained by S. R. Ranganathan Learning Hub, IIT Jodhpur.

Built with DSpace-CRIS software - Extension maintained and optimized by 4Science
