See more: C++ Matlab OpenCV
Hi, I'm doing a research project on OCR and I need to segment characters using horizontal and vertical histogram profiles (projection profiles). This is the code I have tried, but I haven't been able to segment the lines of a document by cropping the image at the positions where the histogram bin value drops to zero. Please help me with this problem. Thanks.
 
#include "stdafx.h"
#include <cstdio>	// printf
#include <iostream>
#include <fstream>
#include <opencv/cv.h>
#include <opencv/cxcore.h>
#include <opencv/highgui.h>

 
int _tmain(int argc, _TCHAR* argv[])
{
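	IplImage *img = cvLoadImage("new.jpg");
	if (img == NULL) {	// cvLoadImage returns NULL if the file cannot be read
		printf("Could not load new.jpg\n");
		return 1;
	}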
	CvSize imgSize =cvGetSize(img);
 
	//Gray scale
	IplImage *gray=cvCreateImage(cvSize(img->width,img->height),8,1);
	cvCvtColor(img,gray,CV_BGR2GRAY);	// cvLoadImage stores channels in BGR order
 
	//binary
	IplImage *binary=cvCreateImage(cvSize(img->width,img->height),8,1);
	// With CV_THRESH_OTSU the threshold argument is ignored; Otsu picks it automatically
	cvThreshold(gray, binary, 0, 255, CV_THRESH_BINARY | CV_THRESH_OTSU);
 
	double pixel;
	int count=0;
	int height=binary->height;
	int *linecount = new int[height];
	int width=binary->width;
	int *wordcount = new int[width];
 
	int *HorizontalHistogram = new int[height];
	for(int i = 0; i < height; i++)
    {
        HorizontalHistogram[i] = 0;
    }
 
	//Line segmentation
	printf("Horizontal Bin Values \n");
	for(int j=0;j<(binary->height);j++){
		count=0;
		for(int i=0;i<(binary->width);i++){
			pixel=cvGetReal2D(binary,j,i);
			if( pixel==0 ){
				HorizontalHistogram[j]++;
				count++;	
			}	
		}
		printf("%d \n", count);
	}
	
	int Hhist_w = height; int Hhist_h = 300;
	int Vhist_w = width; int Vhist_h = 300;	// vertical profile has one bin per column
	float range[] = {0,255};
	float *ranges[] = {range};
	int Hhist_size = {binary->height};
	int Vhist_size = {binary->width};
	float min_value,max_value = 0;
	IplImage *histImage1 = cvCreateImage(cvSize(height,300),8,1);
	IplImage *histImage2 = cvCreateImage(cvSize(width,300),8,1);
	cvSet(histImage1,cvScalarAll(255),0);
	cvSet(histImage2,cvScalarAll(255),0);
	CvHistogram *hist = cvCreateHist(1,&Hhist_size,CV_HIST_ARRAY,ranges,1);
	int bin_w1 = cvRound((double)histImage1->width/Hhist_size);
	int bin_w2 = cvRound((double)histImage2->width/Vhist_size);
 
	for(int i = 0; i < height; i++)
    {
		cvLine(histImage1, cvPoint(bin_w1*(i), Hhist_h),
                              cvPoint(bin_w1*(i), Hhist_h - HorizontalHistogram[i]),
             cvScalar(0,0,0), 1, 8, 0);
    }
 
	cvNamedWindow("Image:");
	cvShowImage("Image:", img);
	cvNamedWindow("Binary:");
	cvShowImage("Binary:", binary);
	cvNamedWindow("HorizontalHistogram:");
	cvShowImage("HorizontalHistogram:", histImage1);
 
	cvWaitKey(0);
 
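	cvDestroyWindow("Image:");
	cvReleaseImage(&img);
	cvDestroyWindow("Binary:");
	cvReleaseImage(&binary);
	cvDestroyWindow("HorizontalHistogram:");
	cvReleaseImage(&histImage1);

	// Release everything else that was allocated above
	cvReleaseImage(&gray);
	cvReleaseImage(&histImage2);
	cvReleaseHist(&hist);
	delete[] linecount;
	delete[] wordcount;
	delete[] HorizontalHistogram;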
 
	return 0;
}
Posted 19-Mar-13 8:41am
Edited 19-Mar-13 8:47am
v2
Comments
nv3 at 19-Mar-13 15:58pm
   
Could you explain a little more what the actual problem is, please? Does the histogram that you output show the line separations? Can you tell by looking at the binary image whether the binarization has delivered a reasonable result?
 
What I can see is that you don't do any rotational correction of the image. If it is just slightly rotated, you won't see deep depressions in the histogram for the line separations. Instead the histogram will look more or less homogeneous. Is that the case?
 
Just to mention on the side, your program has a lot of memory leaks and could use some improvements in other places as well. But we can get to that as soon as you have solved the major problem you are stuck with.
123ezone at 20-Mar-13 4:38am
   
The output histogram is generated by scanning the image horizontally, and the places where the histogram drops to zero are where I should segment. Then I can segment the lines.
Binarization, image enhancement, and rotation are done by another group member, and I have implemented this assuming the input image is already enhanced.
nv3 at 20-Mar-13 4:45am
   
So what exactly is your problem then? Doesn't the histogram show the line gaps? Or is the histogram ok, and you simply don't know how to implement the segmentation?
123ezone at 20-Mar-13 4:51am
   
The histogram shows the line gaps, but I have no idea how to segment the lines at those places and crop them into a separate set of images.
nv3 at 20-Mar-13 5:29am
   
So if you detect two neighboring gaps at y=15 and y=35, then create a new image with height 20 and copy the rows between those two gaps from your binarized image into this new image.
 
What you want to do with all those line image strips is a question of the interface between your function and the other system components. You could for example return an array of such line images to your caller.
 
If your task is to write the main program, you could for example process those line images one at a time in a loop and run your OCR engine on each one.
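
The cropping step described above can be sketched roughly as follows. This is only an illustration in plain C++, not the OP's code: the `Image` typedef and the helpers `horizontalProfile`, `findRuns` and `cropRows` are hypothetical names, and it assumes ink pixels have value 0, as in the binary image of the question.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Assumption: a binary image stored as rows of pixels, 0 = ink, 255 = background.
typedef std::vector<std::vector<unsigned char> > Image;

// Count the ink pixels in every row (the horizontal projection profile).
std::vector<int> horizontalProfile(const Image& img)
{
    std::vector<int> profile(img.size(), 0);
    for (std::size_t y = 0; y < img.size(); ++y)
        for (std::size_t x = 0; x < img[y].size(); ++x)
            if (img[y][x] == 0)
                ++profile[y];
    return profile;
}

// Return half-open [start, end) ranges of consecutive bins above `threshold`.
// Each range is one text line; the zero bins between ranges are the gaps.
std::vector<std::pair<int, int> > findRuns(const std::vector<int>& profile,
                                           int threshold = 0)
{
    std::vector<std::pair<int, int> > runs;
    int start = -1;                       // -1 means "currently inside a gap"
    for (int i = 0; i < (int)profile.size(); ++i) {
        bool ink = profile[i] > threshold;
        if (ink && start < 0) {
            start = i;                    // a text line begins here
        } else if (!ink && start >= 0) {
            runs.push_back(std::make_pair(start, i));
            start = -1;                   // the text line ended at row i - 1
        }
    }
    if (start >= 0)                       // line touches the bottom edge
        runs.push_back(std::make_pair(start, (int)profile.size()));
    return runs;
}

// Copy rows [y0, y1) into a new, smaller image (one line strip).
Image cropRows(const Image& img, int y0, int y1)
{
    assert(0 <= y0 && y0 <= y1 && y1 <= (int)img.size());
    return Image(img.begin() + y0, img.begin() + y1);
}
```

Calling `findRuns(horizontalProfile(img))` and then `cropRows` for each returned pair gives one strip per text line. With the legacy OpenCV C API used in the question, the same crop could probably be done by calling `cvSetImageROI` on the binary image with a `cvRect(0, y0, width, y1 - y0)`, using `cvCopy` into a new image of that size, and then calling `cvResetImageROI`.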
123ezone at 20-Mar-13 10:41am
   
Thank you, I have succeeded in segmenting the lines. Now I'm trying to segment the characters in each of those line images. Thank you very much.
nv3 at 20-Mar-13 10:49am
   
You are welcome. I'll write a solution with a few lines of comment, so we can call the case closed.
Fernando Pacheco Yañez at 13-Jul-13 4:14am
   
Hi, I tried to segment characters using vertical and horizontal projection but could not get it to work.
 
Could you post the complete code for character segmentation using vertical and horizontal projection, please?
 
Thanks!
Best regards!
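
For the character step asked about here, the same idea can be applied to the columns of a single line strip. The following is only a minimal sketch, not complete OCR segmentation code: `characterColumns` and its `minGap` parameter are hypothetical, and it assumes ink pixels are 0 and all rows of the strip have the same width. Touching or overlapping characters will not be separated by a plain projection like this.

```cpp
#include <utility>
#include <vector>

// Assumption: one text-line strip stored as rows of pixels, 0 = ink.
typedef std::vector<std::vector<unsigned char> > Image;

// Scan the columns of a line strip and return half-open [start, end) column
// ranges that contain ink. A range is closed only after `minGap` consecutive
// empty columns, so small breaks inside a character do not split it.
std::vector<std::pair<int, int> > characterColumns(const Image& line,
                                                   int minGap = 1)
{
    std::vector<std::pair<int, int> > spans;
    if (line.empty())
        return spans;

    const int width = (int)line[0].size();
    int start = -1;   // first column of the current character, -1 if none
    int gap = 0;      // empty columns seen since the last ink column

    for (int x = 0; x < width; ++x) {
        // Vertical projection: number of ink pixels in column x.
        int ink = 0;
        for (std::size_t y = 0; y < line.size(); ++y)
            if (line[y][x] == 0)
                ++ink;

        if (ink > 0) {
            if (start < 0)
                start = x;
            gap = 0;
        } else if (start >= 0) {
            ++gap;
            if (gap >= minGap) {
                // The last ink column was x - gap, so the span ends after it.
                spans.push_back(std::make_pair(start, x - gap + 1));
                start = -1;
                gap = 0;
            }
        }
    }
    if (start >= 0)   // character touches the right edge of the strip
        spans.push_back(std::make_pair(start, width - gap));
    return spans;
}
```

A larger `minGap` merges neighbouring spans, which may be useful for finding words instead of single characters.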

1 solution


Solution 1

The actual problem was that the OP was unsure how to extract the line images from the main image after having calculated the positions of the line gaps. The case was resolved in the discussion in the comment section (see above).
Comments
Jochen Arndt at 20-Mar-13 11:05am
   
5ed for your efforts and congratulations for reaching platinum authority.
nv3 at 20-Mar-13 11:14am
   
Thanks Jochen, that is very kind of you.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Last Updated 20 Mar 2013