This paper presents a two-stage algorithm for automatic text detection and recognition. In the first stage, an edge analysis method that combines a stroke width transform with an improved connected-component analysis detects candidate character regions. Text regions are then located by filtering the candidates and linking characters with similar font sizes and colors. In the second stage, a histogram of oriented gradients (HOG) is employed as the feature descriptor, and a neural network classifier for character recognition is built with dynamic-group-based hybrid particle swarm optimization (DGHPSO). In DGHPSO, each group's similarity threshold depends on the fitness and distance thresholds. In addition, a local search algorithm is used to improve the search for a global optimum. The proposed algorithm was experimentally validated; it achieved a higher text recognition rate than the methods of a number of recently published studies when tested on the ICDAR 2003 database and the Street View Text database.
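To make the optimization step more concrete, the following is a minimal sketch of plain global-best particle swarm optimization, the base technique that DGHPSO extends. It is not the paper's method: the dynamic grouping, the similarity thresholds, and the local search are omitted because the abstract does not specify them. The function names (`pso_minimize`, `sphere`), the parameter values, and the toy objective are illustrative assumptions; in the paper's setting the objective would presumably be the classification error of the neural network as a function of its weights.

```python
import random


def pso_minimize(objective, dim, bounds, n_particles=20, n_iters=200,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Plain global-best PSO (assumed baseline, not DGHPSO itself)."""
    rng = random.Random(seed)
    lo, hi = bounds

    # Random initial positions inside the bounds; particles start at rest.
    positions = [[rng.uniform(lo, hi) for _ in range(dim)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]

    # Each particle remembers its own best position and value so far.
    best_pos = [p[:] for p in positions]
    best_val = [objective(p) for p in positions]

    # The swarm shares one global best.
    g = min(range(n_particles), key=best_val.__getitem__)
    global_pos, global_val = best_pos[g][:], best_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull toward personal best
                # + pull toward global best.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + c_personal * r1 * (best_pos[i][d] - positions[i][d])
                    + c_global * r2 * (global_pos[d] - positions[i][d])
                )
                # Move the particle and clamp it to the search bounds.
                new_x = positions[i][d] + velocities[i][d]
                positions[i][d] = min(hi, max(lo, new_x))

            val = objective(positions[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = positions[i][:], val
                if val < global_val:
                    global_pos, global_val = positions[i][:], val

    return global_pos, global_val


def sphere(x):
    """Toy objective standing in for a classifier's training error."""
    return sum(v * v for v in x)


best_x, best_f = pso_minimize(sphere, dim=3, bounds=(-5.0, 5.0))
```

In this baseline every particle is attracted to a single swarm-wide best position. A grouped variant such as the one the abstract describes would, under the stated thresholds, partition particles into groups instead, which is where the fitness and distance thresholds would enter.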
All Science Journal Classification (ASJC) codes
- Human-Computer Interaction
- Hardware and Architecture
- Library and Information Sciences
- Computational Theory and Mathematics