Arabic text recognition of printed manuscripts. Efficient recognition of off-line printed Arabic text using Hidden Markov Models, Bigram Statistical Language Model, and post-processing.

View/ Open
Thesis (6.904Mb)
Download
Publication date
2010-09-01T14:48:46ZAuthor
Al-Muhtaseb, Husni A.Supervisor
Qahwaji, Rami S.R.Keyword
Arabic text recognitionHidden Markov Models
Feature extraction
Omni font recognition
Minimal Arabic script
Bigram Statistical Language Model
Optical character recognition (OCR)
Statistical and analytical analysis
Rights

The University of Bradford theses are licenced under a Creative Commons Licence.
Institution
University of BradfordDepartment
Digital Imaging, School of Computing, Informatics and MediaAwarded
2010
Metadata
Show full item recordAbstract
Arabic text recognition was not researched as thoroughly as other natural languages. The need for automatic Arabic text recognition is clear. In addition to the traditional applications like postal address reading, check verification in banks, and office automation, there is a large interest in searching scanned documents that are available on the internet and for searching handwritten manuscripts. Other possible applications are building digital libraries, recognizing text on digitized maps, recognizing vehicle license plates, using it as first phase in text readers for visually impaired people and understanding filled forms. This research work aims to contribute to the current research in the field of optical character recognition (OCR) of printed Arabic text by developing novel techniques and schemes to advance the performance of the state of the art Arabic OCR systems. Statistical and analytical analysis for Arabic Text was carried out to estimate the probabilities of occurrences of Arabic character for use with Hidden Markov models (HMM) and other techniques. Since there is no publicly available dataset for printed Arabic text for recognition purposes it was decided to create one. In addition, a minimal Arabic script is proposed. The proposed script contains all basic shapes of Arabic letters. The script provides efficient representation for Arabic text in terms of effort and time. Based on the success of using HMM for speech and text recognition, the use of HMM for the automatic recognition of Arabic text was investigated. The HMM technique adapts to noise and font variations and does not require word or character segmentation of Arabic line images. In the feature extraction phase, experiments were conducted with a number of different features to investigate their suitability for HMM. Finally, a novel set of features, which resulted in high recognition rates for different fonts, was selected. The developed techniques do not need word or character segmentation before the classification phase as segmentation is a byproduct of recognition. This seems to be the most advantageous feature of using HMM for Arabic text as segmentation tends to produce errors which are usually propagated to the classification phase. Eight different Arabic fonts were used in the classification phase. The recognition rates were in the range from 98% to 99.9% depending on the used fonts. As far as we know, these are new results in their context. Moreover, the proposed technique could be used for other languages. A proof-of-concept experiment was conducted on English characters with a recognition rate of 98.9% using the same HMM setup. The same techniques where conducted on Bangla characters with a recognition rate above 95%. Moreover, the recognition of printed Arabic text with multi-fonts was also conducted using the same technique. Fonts were categorized into different groups. New high recognition results were achieved. To enhance the recognition rate further, a post-processing module was developed to correct the OCR output through character level post-processing and word level post-processing. The use of this module increased the accuracy of the recognition rate by more than 1%.Type
ThesisQualification name
PhDCollections
Related items
Showing items related by title, author, creator and subject.
-
Word based off-line handwritten Arabic classification and recognition. Design of automatic recognition system for large vocabulary offline handwritten Arabic words using machine learning approaches.Jiang, Jianmin; Ipson, Stanley S.; AlKhateeb, Jawad H.Y. (University of BradfordDepartment of Electronic Imaging and Media Communications, 2010-10-01)The design of a machine which reads unconstrained words still remains an unsolved problem. For example, automatic interpretation of handwritten documents by a computer is still under research. Most systems attempt to segment words into letters and read words one character at a time. However, segmenting handwritten words is very difficult. So to avoid this words are treated as a whole. This research investigates a number of features computed from whole words for the recognition of handwritten words in particular. Arabic text classification and recognition is a complicated process compared to Latin and Chinese text recognition systems. This is due to the nature cursiveness of Arabic text. The work presented in this thesis is proposed for word based recognition of handwritten Arabic scripts. This work is divided into three main stages to provide a recognition system. The first stage is the pre-processing, which applies efficient pre-processing methods which are essential for automatic recognition of handwritten documents. In this stage, techniques for detecting baseline and segmenting words in handwritten Arabic text are presented. Then connected components are extracted, and distances between different components are analyzed. The statistical distribution of these distances is then obtained to determine an optimal threshold for word segmentation. The second stage is feature extraction. This stage makes use of the normalized images to extract features that are essential in recognizing the images. Various method of feature extraction are implemented and examined. The third and final stage is the classification. Various classifiers are used for classification such as K nearest neighbour classifier (k-NN), neural network classifier (NN), Hidden Markov models (HMMs), and the Dynamic Bayesian Network (DBN). To test this concept, the particular pattern recognition problem studied is the classification of 32492 words using ii the IFN/ENIT database. The results were promising and very encouraging in terms of improved baseline detection and word segmentation for further recognition. Moreover, several feature subsets were examined and a best recognition performance of 81.5% is achieved.
-
Recognition of off-line printed Arabic text using Hidden Markov Models.Al-Muhtaseb, Husni A.; Mahmoud, Sabri A.; Qahwaji, Rami S.R. (Elsevier, 27/06/2008)This paper describes a technique for automatic recognition of off-line printed Arabic text using Hidden Markov Models. In this work different sizes of overlapping and non-overlapping hierarchical windows are used to generate 16 features from each vertical sliding strip. Eight different Arabic fonts were used for testing (viz. Arial, Tahoma, Akhbar, Thuluth, Naskh, Simplified Arabic, Andalus, and Traditional Arabic). It was experimentally proven that different fonts have their highest recognition rates at different numbers of states (5 or 7) and codebook sizes (128 or 256). Arabic text is cursive, and each character may have up to four different shapes based on its location in a word. This research work considered each shape as a different class, resulting in a total of 126 classes (compared to 28 Arabic letters). The achieved average recognition rates were between 98.08% and 99.89% for the eight experimental fonts. The main contributions of this work are the novel hierarchical sliding window technique using only 16 features for each sliding window, considering each shape of Arabic characters as a separate class, bypassing the need for segmenting Arabic text, and its applicability to other languages.
-
A Hybrid Multibiometric System for Personal Identification Based on Face and Iris Traits. The Development of an automated computer system for the identification of humans by integrating facial and iris features using Localization, Feature Extraction, Handcrafted and Deep learning Techniques.Qahwaji, Rami S.R.; Ipson, Stanley S.; Nassar, Alaa S.N. (University of BradfordSchool of Electrical Engineering and Computer Science, 2018)Multimodal biometric systems have been widely applied in many real-world applications due to its ability to deal with a number of significant limitations of unimodal biometric systems, including sensitivity to noise, population coverage, intra-class variability, non-universality, and vulnerability to spoofing. This PhD thesis is focused on the combination of both the face and the left and right irises, in a unified hybrid multimodal biometric identification system using different fusion approaches at the score and rank level. Firstly, the facial features are extracted using a novel multimodal local feature extraction approach, termed as the Curvelet-Fractal approach, which based on merging the advantages of the Curvelet transform with Fractal dimension. Secondly, a novel framework based on merging the advantages of the local handcrafted feature descriptors with the deep learning approaches is proposed, Multimodal Deep Face Recognition (MDFR) framework, to address the face recognition problem in unconstrained conditions. Thirdly, an efficient deep learning system is employed, termed as IrisConvNet, whose architecture is based on a combination of Convolutional Neural Network (CNN) and Softmax classifier to extract discriminative features from an iris image. Finally, The performance of the unimodal and multimodal systems has been evaluated by conducting a number of extensive experiments on large-scale unimodal databases: FERET, CAS-PEAL-R1, LFW, CASIA-Iris-V1, CASIA-Iris-V3 Interval, MMU1 and IITD and MMU1, and SDUMLA-HMT multimodal dataset. The results obtained have demonstrated the superiority of the proposed systems compared to the previous works by achieving new state-of-the-art recognition rates on all the employed datasets with less time required to recognize the person’s identity.Multimodal biometric systems have been widely applied in many real-world applications due to its ability to deal with a number of significant limitations of unimodal biometric systems, including sensitivity to noise, population coverage, intra-class variability, non-universality, and vulnerability to spoofing. This PhD thesis is focused on the combination of both the face and the left and right irises, in a unified hybrid multimodal biometric identification system using different fusion approaches at the score and rank level. Firstly, the facial features are extracted using a novel multimodal local feature extraction approach, termed as the Curvelet-Fractal approach, which based on merging the advantages of the Curvelet transform with Fractal dimension. Secondly, a novel framework based on merging the advantages of the local handcrafted feature descriptors with the deep learning approaches is proposed, Multimodal Deep Face Recognition (MDFR) framework, to address the face recognition problem in unconstrained conditions. Thirdly, an efficient deep learning system is employed, termed as IrisConvNet, whose architecture is based on a combination of Convolutional Neural Network (CNN) and Softmax classifier to extract discriminative features from an iris image. Finally, The performance of the unimodal and multimodal systems has been evaluated by conducting a number of extensive experiments on large-scale unimodal databases: FERET, CAS-PEAL-R1, LFW, CASIA-Iris-V1, CASIA-Iris-V3 Interval, MMU1 and IITD and MMU1, and SDUMLA-HMT multimodal dataset. The results obtained have demonstrated the superiority of the proposed systems compared to the previous works by achieving new state-of-the-art recognition rates on all the employed datasets with less time required to recognize the person’s identity.