Skew detection based on vertical projection in Latin character recognition of text document image

 
 
 
  • Abstract


    The accuracy of Optical Character Recognition (OCR) is strongly affected by skew in the input image. Skew detection and correction is an OCR preprocessing step that estimates the skew of a document image and removes it. This research measures the effect of the Combined Vertical Projection skew detection method on OCR accuracy, reported as Character Error Rate, Word Error Rate, and Word Error Rate (Order Independent). It also measures the computational time of Combined Vertical Projection at different iteration values. The experiments use iteration values of 0.5, 1, and 2 with rotation angles from -10 to 10 degrees. The results show that Combined Vertical Projection can lower the Character Error Rate, Word Error Rate, and Word Error Rate (Order Independent) by up to 35.53, 34.51, and 32.74 percent, respectively. A higher iteration value lowers the computational time but also decreases OCR accuracy.
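    To make the trade-off between iteration value and computational time concrete, the following is a minimal sketch of a generic projection-profile angle sweep. It is not the Combined Vertical Projection method itself, whose details are not given in this abstract. It assumes that "iteration" means the angular step in degrees between candidate angles, that the page is represented as a list of foreground pixel coordinates, and that a sum-of-squares score on the vertical projection is an acceptable stand-in criterion. All function names are hypothetical.

```python
import math


def frange(start, stop, step):
    """Return start, start + step, ..., stop (inclusive) as a list of floats."""
    n = int(round((stop - start) / step))
    return [start + i * step for i in range(n + 1)]


def rotate(points, angle_deg):
    """Rotate (x, y) points counter-clockwise about the origin."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [(x * c - y * s, x * s + y * c) for x, y in points]


def vertical_profile_score(points):
    """Sum of squared column counts of the vertical projection.

    The score is large when vertical strokes fall into few columns,
    which is assumed here to indicate a de-skewed page.
    """
    counts = {}
    for x, _ in points:
        col = round(x)
        counts[col] = counts.get(col, 0) + 1
    return sum(v * v for v in counts.values())


def estimate_skew(points, max_angle=10.0, step=0.5):
    """Try every candidate angle in [-max_angle, max_angle] and keep the best.

    `step` plays the role assumed for the abstract's "iteration" value:
    a larger step means fewer candidates (faster) but a coarser estimate.
    """
    best_angle, best_score = 0.0, -1
    for cand in frange(-max_angle, max_angle, step):
        # Undo the candidate skew, then score the resulting projection.
        score = vertical_profile_score(rotate(points, -cand))
        if score > best_score:
            best_angle, best_score = cand, score
    return best_angle


# Synthetic page: ten vertical strokes, 200 pixels tall, skewed by 3 degrees.
strokes = [(x, y) for x in range(0, 100, 10) for y in range(200)]
skewed = rotate(strokes, 3.0)

for step in (0.5, 1.0, 2.0):
    n_candidates = len(frange(-10.0, 10.0, step))
    print(step, n_candidates, estimate_skew(skewed, step=step))
```

    In this sketch, steps of 0.5, 1, and 2 degrees over the -10 to 10 degree range give 41, 21, and 11 candidate angles, so the work per page shrinks roughly in proportion to the step. With a step of 2 the true angle of 3 degrees is not on the candidate grid, which illustrates how a coarser step can cost accuracy.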

     

     


     

  • Keywords


    Optical Character Recognition, Preprocessing, Skew Detection, Projection Profile, Vertical Projection.



 


Article ID: 26983
 
DOI: 10.14419/ijet.v7i4.44.26983




Copyright © 2012-2015 Science Publishing Corporation Inc. All rights reserved.