An overview of different binary methods for documents based on their features
Fazeli, Fatemeh; Sarrafzadeh, Hossein; Shanbehzadeh, Jamshid
Citation: Fazeli, F., Sarrafzadeh, A., and Shanbehzadeh, J. (2013). An overview of different binary methods for documents based on their features. Proceedings of the International MultiConference of Engineers and Computer Scientists, Hong Kong, 1, 491-495.
Permanent link to Research Bank record: http://hdl.handle.net/10652/2728
This paper surveys the binarization of document images. The main role of binarization is to reduce dimensionality and noise, and it is one of the most important preprocessing steps in document image understanding and compression. Image binarization classifies image pixels into two classes, background and foreground. The input to this classification is a feature vector based on the intensity values of the image pixels. New features are extracted from this input vector and, from the extracted features, a cost function that acts as a classifier is constructed. The intensity value that maximizes the cost function is taken as the boundary between the two classes. This paper divides binarization algorithms into three groups. The first uses a single input feature vector containing the intensity value of each pixel. The second uses, for each pixel, an input feature vector based on the intensity of that pixel and of its neighbors. The third combines the schemes of the first and second groups.
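As an illustration of the first two groups, the sketch below shows one possible global scheme and one possible local scheme in plain Python. The abstract does not name specific algorithms, so the choices here are assumptions: the global case uses Otsu's between-class variance as the cost function to maximize, and the local case compares each pixel with the mean of its neighborhood. The function names and the `radius` and `offset` parameters are hypothetical, chosen only for this example.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of integer intensities


def otsu_threshold(pixels: List[int], levels: int = 256) -> int:
    """Global scheme (assumed example): return the intensity t that
    maximizes the between-class variance of {p <= t} and {p > t}."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_score = 0, -1.0
    count_low = 0  # number of pixels with intensity <= t
    sum_low = 0    # sum of intensities of those pixels
    for t in range(levels):
        count_low += hist[t]
        sum_low += t * hist[t]
        count_high = total - count_low
        if count_low == 0 or count_high == 0:
            continue  # one class is empty, so the cost is undefined
        mean_low = sum_low / count_low
        mean_high = (sum_all - sum_low) / count_high
        # Between-class variance up to a constant factor of 1 / total**2.
        score = count_low * count_high * (mean_low - mean_high) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t


def global_binarize(image: Image, t: int) -> Image:
    """Label pixels brighter than t as 1 (light class) and the rest as 0."""
    return [[1 if p > t else 0 for p in row] for row in image]


def local_mean_binarize(image: Image, radius: int = 1, offset: float = 0.0) -> Image:
    """Local scheme (assumed example): compare each pixel with the mean of
    its (2*radius + 1)-wide neighborhood, clipped at the image border."""
    height, width = len(image), len(image[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            ys = range(max(0, y - radius), min(height, y + radius + 1))
            xs = range(max(0, x - radius), min(width, x + radius + 1))
            window = [image[j][i] for j in ys for i in xs]
            local_t = sum(window) / len(window) - offset
            row.append(1 if image[y][x] > local_t else 0)
        out.append(row)
    return out
```

For a document with dark text on a light page, label 0 would correspond to the foreground (text) and label 1 to the background. A scheme from the third group would combine both ideas, for example by using a global threshold to constrain the local decision; that combination is not sketched here.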