Harit, Gaurav
Preferred name
Harit, Gaurav
Alternative Name
Harit, G.
Main Affiliation
ORCID
Scopus Author ID
6508137887
Researcher ID
CTN-5767-2022
Now showing 1 - 10 of 34
- Publication: Skeletonizing character images using a modified medial axis-based strategy (2011-11-01)
  Bag, Soumen
  In this paper we propose a thinning methodology applicable to character images. It is novel in terms of its ability to adapt to local character shape while constructing the thinned skeleton. Our method does not produce many of the distortions in the character shapes which normally result from the use of existing thinning algorithms. The proposed thinning methodology is based on the medial axis of the character. The skeleton has a width of one pixel. As a by-product of our thinning approach, the skeleton also gets segmented into strokes in vector form. Hence further stroke segmentation is not required. We have conducted experiments with printed and handwritten characters in several scripts such as English, Bengali, Hindi, Kannada and Tamil. We obtain fewer spurious branches compared to other thinning methods. Our method does not use any kind of post-processing.
- Publication: An improved contour-based thinning method for character images (2011-10-15)
  Bag, Soumen
  The digital skeleton of a character image, generated by a thinning method, has a wide range of applications for shape analysis and classification. However, thinning character images is challenging, and removing spurious strokes or deformities during thinning is a difficult problem. In this paper, we propose a contour-based thinning method for skeletonizing printed, noisy, isolated character images. In this method, we use shape characteristics of text to obtain a skeleton that closely matches the true character shape. This approach helps to preserve the local features and true shapes of the character images. As a by-product of our thinning approach, the skeleton also gets segmented into strokes in vector form. Hence further stroke segmentation is not required. Experiments on printed English, Bengali, Hindi, and Tamil characters yield much better results than other thinning methods, without any post-processing.
- Publication: Topological features for recognizing printed and handwritten Bangla characters (2011-10-13)
  Bag, Soumen; Bhowmick, Partha
  In this paper, we present novel topological features based on the structural shape of a character. We detect the convex-shaped segments formed by the various strokes. The convex segments are then represented with shape primitives from a repertoire. The character is represented as a spatial layout of convex segments. We formulate feature templates for Bangla characters. A given character is assigned the label of the best matching feature template. We have tested the method on benchmark datasets of printed and handwritten Bangla basic and compound character images. Our results demonstrate the efficacy of our approach. Copyright © 2011 ACM.
- Publication: Text detection on camera acquired document images using supervised classification of connected components in wavelet domain (2012-12-01)
  Roy, Udit
  In this paper we present an algorithm to detect text on video frames consisting of lecture slides. We begin by performing a multi-channel wavelet transform and then merge the channel components for the high-frequency sub-bands to obtain a composite energy map. Thresholding the energy map results in an edge map consisting of candidate text pixels; some of these correspond to actual text and others correspond to graphics, logos, tables, etc. The connected components in the edge map are then filtered to reject some of the false positives using a trained classifier. Rectangular text blocks compactly surrounding the text regions are then identified using a process of selective dilation and recursive splitting. False positive text blocks still remaining are then rejected using heuristics. Experiments conducted on 890 images show that our scheme has a lower false positive rate and misdetection rate when compared with two existing scene text detection methods. © 2012 ICPR Org Committee.
- Publication: Table detection in document images using header and trailer patterns (2012-12-01)
  Bansal, Anukriti
  This paper presents a new approach to detect tabular structures present in document images and in low-resolution video images. The algorithm for table detection is based on identifying the unique table start pattern and table trailer pattern. We have formulated perceptual attributes to characterize the patterns. The performance of our table detection system is tested on a set of document images picked from the UW-III (University of Washington) dataset, the UNLV dataset, video images of NPTEL videos, and our own dataset. Our approach demonstrates improved detection for different types of table layouts, with or without ruling lines. We have obtained correct table localization on pages with multiple tables aligned side-by-side. © 2012 ACM.
- Publication: Recognition of Bengali handwritten characters using skeletal convexity and dynamic programming (2011-04-18)
  Bag, Soumen; Bhowmick, Partha
  The main challenge in recognizing handwritten characters is to handle large-scale shape variations in the handwriting of different individuals. In this paper, we present a novel handwritten character recognition method based on the structural shape of a character irrespective of the viewing direction on the 2D plane. The structural shape of a character is described by different skeletal convexities of character strokes. Such skeletal convexity acts as an invariant feature for character recognition. Longest common subsequence matching is used for recognition. We have tested our method on a benchmark dataset of handwritten Bengali character images. Preliminary results demonstrate the efficacy of our approach. © 2011 IEEE.
- Publication: Greedy Gaussian Process Regression Applied to Object Categorization and Regression (2018-12-18)
  Dey, Arka Ujjal; Hafez, A. H. Abdul
  In this work we propose an approximation of Gaussian Processes (GP) and apply it to classification and regression tasks. We primarily target the problem of visual object categorization using a greedy variant of Gaussian Processes. To deal with the prohibitive training and inference cost of GP, we devise a greedy approach to subset selection and the inducing input choice to approximate the kernel matrix, resulting in faster retrieval timings. A localized combination of kernel functions is designed and used in a framework of sparse approximations to Gaussian Processes for visual object categorization and generic regression tasks. Through exhaustive experimentation and empirical results we demonstrate the effectiveness of the proposed approach, when compared with other kernel-based methods.
- Publication: Unsupervised Temporal Segmentation of Human Action Using Community Detection (2018-08-29)
  Jain, Hiteshi
  Temporal segmentation of complex human action videos into action primitives plays a pivotal role in building models for human action understanding. Studies in the past have introduced unsupervised frameworks for deriving a known number of motion primitives from action videos. Our work focuses on answering a question: given a set of videos with humans performing an activity, can the action primitives be derived from them without specifying any prior knowledge about the number of constituent sub-action categories? To this end, we present a novel community detection-based human action segmentation algorithm. Our work identifies community structures in human action videos, where the consecutive frames around the key poses group together to form communities similar to those in social networks. We test our proposed technique on the stitched Weizmann dataset and the MHADI01-s motion capture dataset, and our technique outperforms the state-of-the-art techniques for complex action segmentation without the count of actions being pre-specified.
- Publication: Detecting missed and anomalous action segments using approximate string matching algorithm (2018-01-01)
  Jain, Hiteshi
  As amateur performers, we forget action steps and perform some unwanted action movements during our daily exercise routines, dance performances, etc. To improve our proficiency, it is important that we get feedback on our performances in terms of where we went wrong. In this paper, we propose a framework for analyzing and issuing reports of action segments that were missed or anomalously performed. This involves comparing the performed sequence with the standard action sequence and notifying when misalignments occur. We propose an exemplar-based Approximate String Matching (ASM) technique for detecting such anomalous and missing segments in action sequences. We compare the results with those obtained from the conventional Dynamic Time Warping (DTW) algorithm for sequence alignment. It is seen that the alignment of the action sequences under conventional DTW fails in the presence of missed action segments and anomalous segments due to its boundary condition constraints. The performance of the two techniques has been tested on a complex aperiodic human action dataset with warm-up exercise sequences that we developed from correct and incorrect executions by multiple people. The proposed ASM technique shows promising alignment and missed/anomalous notification results over this dataset.
- Publication: An unsupervised sequence-to-sequence autoencoder based human action scoring model (2019-11-01)
  Jain, Hiteshi
  Developing a model for the task of assessing the quality of human action is a key research area in computer vision. The quality assessment task has been posed as a supervised regression problem, where models have been trained to predict a score, given action representation features. However, human proficiency levels can vary widely, and so do their scores. Providing all such performance variations and their respective scores is an expensive solution, as it requires a domain expert to annotate many videos. The question arises: can we exploit the variations of the performances from that of an expert and map the variations to their respective scores? To this end, we introduce a novel sequence-to-sequence autoencoder-based scoring model which learns the representation from only expert performances and judges an unknown performance based on how well it can be regenerated from the learned model. We evaluated our model in predicting scores of a complex Sun-Salutation action sequence, and demonstrate that our model gives remarkable prediction accuracy compared to the baselines.