A Deep-Learning Approach to Optical Character Recognition for Uighur Language
Abstract
Optical character recognition (OCR) for the Uighur language remains a difficult and open research problem because Uighur script is cursive, even in printed text. Apart from a few differences and modifications, Uighur characters share the characteristics of Arabic characters, so the obstacles and challenges of Arabic OCR research also apply to Uighur OCR. The functional core of Uighur OCR, like that of Arabic OCR, consists of two stages: segmentation and classification. This paper proposes an approach for obtaining segmentation points in the segmentation stage and applies a deep learning classifier that uses three-block characters as the recognition unit in the classification stage. The experimental results show that the word segmentation method achieved 95.68%, the segmentation point approach achieved 94.74%, and the deep learning approach with the three-block character unit reached 99.33%, while the identical approach with a single-character unit reached 91.98% within five epochs. © 2020 Elsevier B.V., All rights reserved.
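To make the idea of a segmentation point in cursive script concrete, the sketch below finds candidate cut columns with a vertical projection profile, a common baseline for connected scripts. The abstract does not describe the paper's own segmentation procedure, so this is not the authors' method. The function names, the `max_ink` threshold, and the toy image are all illustrative assumptions.

```python
# Illustrative baseline only: vertical projection profile segmentation.
# NOT the method from the paper; all names and thresholds are assumptions.

def column_ink_profile(image):
    """Count ink pixels (value 1) in each column of a binary image."""
    if not image:
        return []
    width = len(image[0])
    return [sum(row[x] for row in image) for x in range(width)]


def candidate_cut_columns(image, max_ink=1):
    """Return the centre column of each interior run of thin columns.

    A column is 'thin' when it holds at most max_ink ink pixels, which is
    where a connecting stroke between two cursive letters tends to lie.
    Runs that touch the left or right border are treated as margins.
    """
    profile = column_ink_profile(image)
    cuts, run_start = [], None
    # The extra sentinel value closes a run that reaches the last column.
    for x, ink in enumerate(profile + [max_ink + 1]):
        if ink <= max_ink:
            if run_start is None:
                run_start = x
        elif run_start is not None:
            touches_border = run_start == 0 or x == len(profile)
            if not touches_border:
                cuts.append((run_start + x - 1) // 2)
            run_start = None
    return cuts


# Toy image: two letter bodies joined by a one-pixel baseline stroke.
sample = [
    [0, 1, 1, 0, 0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0, 0, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 1, 1, 0],
]
print(candidate_cut_columns(sample))  # prints [4]
```

In practice a projection profile alone tends to over-segment or under-segment cursive text, which is one reason recognition units larger than a single character can be attractive.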









