
Detect image skew angle and deskew image

By Mukit, Ataul and David_Pollard, 13 Feb 2013
 

deskewimg/screenshot.jpg

Introduction

This article discusses some very basic, general techniques for detecting image skew. There are certainly more advanced skew detection algorithms, but they are not covered here.

So! What is image skew?

deskewimg/skew.jpg

Using the figure above as a reference, the angle theta indicated by the bluish arrow is the image skew.

To put it properly, the image skew is the angle by which the image appears to deviate from its perceived upright position.

Some pre-requisites for detecting the skew

Before detecting the skew, the first step is to differentiate between a text image and a regular image. By text image, I mean a scanned document or a screenshot of a text document; in other words, an image that contains letters and text.

A regular image, in my terms, is a picture, a photo of some scenery, or something drawn by Mr. Van Gogh and Co.

In one of my other articles, I covered a basic algorithm for differentiating between regular images and text images.

Skew detection is relatively easy to implement for a text image, but for a regular image it is quite difficult, and next to impossible in some cases. Suppose I have a picture showing only the head of my favourite celebrity, Ms. Julia Roberts. There is no way to know from the image whether she really has her head tilted to one side, or whether the cameraman who took the beautiful picture had a problem with the camera angle.

Therefore, for regular images, we have to make some assumptions. The case discussed here assumes that the pictures to be deskewed are framed. That is, the pictures we are dealing with are pictures taken of another picture that has a frame or border around it. You can refer to the application screenshot at the very top as an example: it shows a camera-captured image of another picture that has a thick whitish border around it.

The basic concept of detecting the skew

The main idea for detecting skew is the same for a regular image and a text image. We first have to convert the picture to gray scale.
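As a rough illustration of the gray scale step, here is a minimal sketch that converts tightly packed 24-bit BGR pixels to 8-bit gray using the common luminance weights. The function name ToGrayScale and the packed BGR layout are assumptions made for this example. It is not the GetGrayScaleImage function from ImageFunctions.h, which works on an HBITMAP instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: converts packed BGR pixels (3 bytes per pixel, no row
// padding) to one gray byte per pixel.
std::vector<std::uint8_t> ToGrayScale(const std::vector<std::uint8_t>& bgr,
                                      int width, int height)
{
    assert(width >= 0 && height >= 0);
    const std::size_t pixelCount =
        static_cast<std::size_t>(width) * static_cast<std::size_t>(height);
    assert(bgr.size() == pixelCount * 3);

    std::vector<std::uint8_t> gray(pixelCount);
    for (std::size_t i = 0; i < pixelCount; ++i)
    {
        const unsigned b = bgr[3 * i];
        const unsigned g = bgr[3 * i + 1];
        const unsigned r = bgr[3 * i + 2];
        // Integer approximation of 0.299 R + 0.587 G + 0.114 B.
        // The weights 77 + 150 + 29 add up to 256, hence the shift by 8.
        gray[i] = static_cast<std::uint8_t>((77 * r + 150 * g + 29 * b) >> 8);
    }
    return gray;
}
```

Real bitmaps usually have padded rows (the pitch), so a production version would have to step through the source row by row rather than assume a packed buffer.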

Then, for regular (non-text) images, we have to find the edges or feature lines of the image. There are many edge detection algorithms; in our case, we used the Canny edge detection algorithm. This step can be skipped for text images.

The next and most important step is to cast rays from one side of the picture to the other. By using the information about where the rays intersect the various parts of the image, we arrive at a good estimate of the skew angle.

deskewimg/out_Album1-WW1-LochHavenPA-117.jpg

As you can see from the image above, we first reduced the image (the one in the application screenshot) to its basic feature lines/edges and then cast horizontal rays from one side to the other. The rays (the horizontal lines) intersect the border/frame of the image. If I plot the points where the rays intersect the left edge of the picture on an X, Y plot (where X is horizontal and Y is vertical), I end up with a roughly straight line whose slope angle is close to 2 degrees with respect to the Y axis. So, there you have it: the skew angle for the image in the application screenshot is close to 2 degrees, or 0.035 radians.
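One way to turn the plotted intersection points into an angle is a least-squares line fit. The sketch below fits x = a + b*y through the hit points and returns atan(b), which is the angle of the fitted line measured from the Y axis. The Point struct and the function name are hypothetical, and the uploaded SkewManager may well compute the slope differently (for example from pairs of neighbouring points).

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical type: one ray/edge intersection in image coordinates.
struct Point { double x; double y; };

// Fits x = a + b*y by least squares and returns atan(b), i.e. the angle
// (in radians) between the fitted line and the Y axis.
double SkewFromLeftEdgeHits(const std::vector<Point>& hits)
{
    assert(hits.size() >= 2);
    double sumX = 0.0, sumY = 0.0, sumYY = 0.0, sumXY = 0.0;
    for (const Point& p : hits)
    {
        sumX += p.x;
        sumY += p.y;
        sumYY += p.y * p.y;
        sumXY += p.x * p.y;
    }
    const double n = static_cast<double>(hits.size());
    const double denom = n * sumYY - sumY * sumY;
    assert(denom != 0.0); // the hits must come from at least two different rows
    const double b = (n * sumXY - sumX * sumY) / denom;
    return std::atan(b);
}
```

Fitting x as a function of y (rather than y as a function of x) keeps the slope small and well defined for a nearly vertical edge such as the left border.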

This is more or less the basic concept used for detecting the skew of a regular framed image. Keep in mind that this procedure applies only to regular images, not text images.

By the way, the skew detection algorithm in the uploaded source code does a little more than what I have explained above. It casts rays from left to right to intersect with the left edge, then from top to bottom, then from right to left, and then from bottom to top. The algorithm takes as its reference edge the one that yields the most intersection points and the most consistently occurring slopes. For example, if the bottom-to-top rays produce the most intersections with the bottom edge of the image and the most frequently occurring slopes are observed there, then that edge has priority, and the points from the bottom-to-top rays are the ones put in the X, Y plot. All this extra effort is needed because, in some images, the border edges may be broken rather than framing the image as neatly as in the application screenshot. For such images, we get a set of rays with different slopes, because wherever the edge line is broken, a ray passes beyond the border and hits some point inside the framed image instead, which creates a different angle. So, the most frequently occurring slopes are used to detect the skew (two slopes that differ by less than 0.0349 radians / 2.0 degrees are considered to have the same value).
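The "most frequently occurring slope" idea can be sketched as a simple voting scheme. In the minimal example below, every measured angle is tried as a candidate, the angles within the 0.0349 radian tolerance of it are counted, and the mean of the largest group is returned. The function name DominantAngle is hypothetical, and this quadratic-time grouping is only an assumption about how such a vote could be done, not the code from the uploaded source.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns the mean of the largest group of angles that lie within
// `tolerance` radians of one of the measured angles.
double DominantAngle(const std::vector<double>& angles, double tolerance = 0.0349)
{
    assert(!angles.empty());
    std::size_t bestCount = 0;
    double bestMean = 0.0;
    for (double candidate : angles)
    {
        std::size_t count = 0;
        double sum = 0.0;
        for (double a : angles)
        {
            if (std::fabs(a - candidate) < tolerance)
            {
                ++count;
                sum += a;
            }
        }
        if (count > bestCount)
        {
            bestCount = count;
            bestMean = sum / static_cast<double>(count);
        }
    }
    return bestMean;
}
```

Outliers caused by rays that slipped through a broken border end up in small groups of their own and are simply outvoted.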

To detect the skew of text images, we don't have to worry much about whether the picture has border edges. As before, we cast rays from left to right and count how many blackish pixels the rays intersect. Then we rotate the rays by a small angle, cast them again, and repeat the process again and again.

The angle at which the most white space is encountered (in other words, at which the most black pixels are shot down, i.e., intersected) is taken to be the skew angle. Most likely, we would start the ray angle at 0 degrees and go down to 90 degrees, and then start from 0 degrees again and go up to 90 degrees. As soon as the white ratio reaches its peak and starts to come down again, we stop, and we know we have found our skew angle. This is the basic idea of detecting skew for text images.
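A minimal sketch of this search is shown below. It assumes the image has already been thresholded into a buffer where a non-zero byte means a dark pixel, and it scores each angle by the number of rays that cross no dark pixel at all (one possible reading of the "white ratio"). For simplicity it tries every angle in a fixed range instead of stopping at the first peak. The function names and parameters are hypothetical and are not taken from SkewManager.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Counts the rays, cast from the left side at the given angle, that cross
// no dark pixel. `black` holds width*height bytes, non-zero meaning dark.
int CountClearRays(const std::vector<std::uint8_t>& black,
                   int width, int height, double angle)
{
    const double slope = std::tan(angle);
    int clear = 0;
    for (int y0 = 0; y0 < height; ++y0)
    {
        bool blocked = false;
        for (int x = 0; x < width && !blocked; ++x)
        {
            const long y = std::lround(y0 + x * slope);
            if (y < 0 || y >= height)
                break; // the ray has left the image
            blocked = black[static_cast<std::size_t>(y) * width + x] != 0;
        }
        if (!blocked)
            ++clear;
    }
    return clear;
}

// Tries every multiple of `step` in [-maxAngle, +maxAngle] (radians) and
// returns the angle at which the most rays stay clear.
double EstimateTextSkew(const std::vector<std::uint8_t>& black,
                        int width, int height, double maxAngle, double step)
{
    assert(width > 0 && height > 0);
    assert(step > 0.0 && maxAngle >= 0.0);
    assert(black.size() ==
           static_cast<std::size_t>(width) * static_cast<std::size_t>(height));

    const int steps = static_cast<int>(std::floor(maxAngle / step));
    double bestAngle = 0.0;
    int bestClear = -1;
    for (int i = -steps; i <= steps; ++i)
    {
        const double angle = i * step;
        const int clear = CountClearRays(black, width, height, angle);
        if (clear > bestClear)
        {
            bestClear = clear;
            bestAngle = angle;
        }
    }
    return bestAngle;
}
```

The returned angle follows image coordinates (Y grows downward), so the sign convention should be checked against whatever rotation routine is used afterwards.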

Deskewing the image is less trouble. All you need to do is rotate the image by the same amount as the skew, but in the opposite direction.
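To make the "rotate by the opposite angle" step concrete, here is a minimal nearest-neighbour rotation sketch for an 8-bit gray buffer. RotateGray and Deskew are hypothetical names. This is not the rotation code from ImageFunctions.h, and it does no antialiasing.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Rotates a gray image about its centre by `angle` radians using inverse
// mapping with nearest-neighbour sampling. Pixels that map outside the
// source are filled with `background`.
std::vector<std::uint8_t> RotateGray(const std::vector<std::uint8_t>& src,
                                     int width, int height,
                                     double angle, std::uint8_t background)
{
    assert(width > 0 && height > 0);
    assert(src.size() ==
           static_cast<std::size_t>(width) * static_cast<std::size_t>(height));

    std::vector<std::uint8_t> dst(src.size(), background);
    const double cx = (width - 1) / 2.0;
    const double cy = (height - 1) / 2.0;
    const double c = std::cos(angle);
    const double s = std::sin(angle);
    for (int y = 0; y < height; ++y)
    {
        for (int x = 0; x < width; ++x)
        {
            // For each destination pixel, find the source pixel it came from.
            const double dx = x - cx;
            const double dy = y - cy;
            const long sx = std::lround(cx + dx * c + dy * s);
            const long sy = std::lround(cy - dx * s + dy * c);
            if (sx >= 0 && sx < width && sy >= 0 && sy < height)
                dst[static_cast<std::size_t>(y) * width + x] =
                    src[static_cast<std::size_t>(sy) * width + sx];
        }
    }
    return dst;
}

// Deskewing is a rotation by the negated skew angle (white background).
std::vector<std::uint8_t> Deskew(const std::vector<std::uint8_t>& src,
                                 int width, int height, double skewAngle)
{
    return RotateGray(src, width, height, -skewAngle, 255);
}
```

Whether a positive angle turns the picture clockwise or counter-clockwise depends on the coordinate convention, so the sign passed to Deskew has to match the convention of the skew detector.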

The code for the rotation is available in the ImageFunctions.h header file of the uploaded source.

However, if you want a really high-quality rotated image, you can use the wonderful aarot class by Mark Gordon (http://codeguru.earthweb.com/cpp/cpp/algorithms/math/article.php/c10677/Anti-Aliased-Image-Rotation-Aarot.htm). Since the rotation code inside this class performs antialiasing, it is a little slow, but the output image quality is very good.

Using the code

The uploaded source code mainly uses two classes to detect the skew. The main class, SkewManager, contains the skew detection code. The other class, ImageData, is just a wrapper for the bitmap bits; it also keeps the width and height of the image so that SkewManager can apply its algorithms to the bitmap bits.

The SkewManager expects the bitmap that will be passed to be gray scaled. The blurring of the image to its edges (if required) is done inside the ImageData class.

The following example shows how the two classes are used (the m_ prefixed variables and the skewMan variable are member variables of the CScannedDocTestDoc class).

BOOL CScannedDocTestDoc::OnOpenDocument(LPCTSTR lpszPathName)
{
    m_Image.Destroy();
    HRESULT hr = m_Image.Load(lpszPathName);

    if(SUCCEEDED(hr))
    {
        int nPitch = m_Image.GetPitch();
        int nWidth = m_Image.GetWidth();
        int nHeight = m_Image.GetHeight();

        int nBytesPerPixel = m_Image.GetBPP() / 8;
        if(nBytesPerPixel)
        { 
            byte* pGrayScaleBits = NULL;
            // Get a gray scaled bitmap and bitmap bits from the original image
            m_hBmpGrayScale = ImageFunctions::GetGrayScaleImage((HBITMAP)m_Image, 
            nWidth, nHeight, nBytesPerPixel, nPitch, &pGrayScaleBits);
            if(pGrayScaleBits != NULL)
            {
                // Find out if the image is a regular image or text image
                m_bTextImage = ImageFunctions::IsTextImage(pGrayScaleBits, nWidth, nHeight);
                m_imgData.Load(pGrayScaleBits, m_Image.GetWidth(), m_Image.GetHeight(), 1);

                // get the skew angle
                double angle = 0.0;
                skewMan.GetSkewAngle(m_bTextImage, m_imgData, angle);
                m_dSkewAngle = angle;
            }
        }
        return TRUE;
    }
    return FALSE;
}

Points of interest

To achieve skew detection and deskewing of images, I had to go through a lot of trouble to perform basic image operations such as creating a gray scaled image, rotating an image, etc. I have put all that functionality inside one header, ImageFunctions.h. Most of it was improvised from material already available on the Internet and MSDN. The Canny edge detection algorithm can be found in the header canny_edge.h.

CSharpClient

This is a cut-down version of the deskewimage code, built to compile into a DLL. A small C# demo application is included in the solution, showing how to call the code contained in the DLL: pass in the name of the file to be analysed, and the skew angle in radians is returned and displayed on the screen.

Acknowledgements

  • Kathey Marsden and Professor Richard J. Fateman, for the OCRchie project, the foundation on which the text image skew detection algorithms in this article's code are based. Thanks also to Archie Russell, James Hopkin, and Cynthia Tian, who contributed significantly to the original design.
  • The authors of the following papers, for the excellent Canny edge detection algorithm:
    Heath, M., Sarkar, S., Sanocki, T., and Bowyer, K. Comparison of edge detectors: a methodology and initial study, Computer Vision and Image Understanding 69 (1), 38-54, January 1998.
    Heath, M., Sarkar, S., Sanocki, T., and Bowyer, K.W. A Robust Visual Method for Assessing the Relative Performance of Edge Detection Algorithms, IEEE Transactions on Pattern Analysis and Machine Intelligence 19 (12), 1338-1359, December 1997.
  • Shahadatul Hakim, for providing reading material related to deskewing text and framed images.

History

  • Article uploaded: 24 August 2010.
  • Article updated: 29 May 2012. CSharpClient added.
  • Article updated: 14 February 2013. CSharpClient.zip 1.1 added:
    Updated to compile in Visual Studio 2008 (Professional Edition)
    Combined into a single VS solution containing two projects (for easier debugging)
    Minor bug fixes
    An angle of 999 is now returned if the document could not be analysed (not an 8bpp image)
    Added a text box to show the name of the file analysed

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Authors

Mukit, Ataul
Chief Technology Officer, Rational Technologies
Bangladesh
You don't learn patterns, you just code it.

David_Pollard
Network Administrator
Australia
No Biography provided


Comments and Discussions

 
Question: Memory Leak (member David_Pollard, 4 Apr '13 - 14:11)
Hi Mukit,
My document handling project, of which skew detection is part, is going very well. I have the first release in production. :)
 
My program processes hundreds of documents at a time and calling the deskew code over and over has revealed a memory leak. I can process 200 or so pages before the program stops with an out of memory error.
 
If I test with the demo application detecting skew of the same file over and over I can see the memory usage going up by 7 or 8Mb every iteration.
 
I have had an initial look and stepped through the code to try and determine where the problem is. I’ll keep looking but I’m struggling at the moment.
 
David
Answer: Re: Memory Leak (member Mukit, Ataul, 5 Apr '13 - 7:57)
If you are using a C# DLL then you might need to call upon the garbage collector to collect the un-referenced memory chunks by GC.Collect();
 
If you are using a C++ module then most likely you are not destroying the (old) bitmap after deskewing it before loading a new bitmap.
 
Using a good debugger should help you detect the memory leak.
Valgrind might be a good tool to identify the memory leak.
 
Hope you'd come up with a solution soon.
 
Best,
M
General: Re: Memory Leak (member David_Pollard, 5 Apr '13 - 12:57)
Hi Mukit,
I see this problem when testing using the CSharp Client 1.1 demo.
I believe my code implements the skew test in exactly the same way as the CSharp Client 1.1 demo.
I'm not actually deskewing, just reporting the skew angle back to the calling C# program.
 
So I guess as you suggest something is not being destroyed correctly within the C++ code.
 
I'm using Visual Studio 2008 and stepping through the code but my limited understanding of C++ is making it difficult to spot the problem.
 
I'll post again if I find the likely problem.
 
Thanks
David
General: Re: Memory Leak (member Mukit, Ataul, 5 Apr '13 - 18:40)
Are you loading the image with C#? Then it might be the case of not removing the reference from the image object that you are loading in C#. Sometimes you need to invoke the garbage collector to remove some unreferenced instances as well.
General: Re: Memory Leak (member David_Pollard, 5 Apr '13 - 18:54)
I'm only testing now with the C# demo application that is attached to this article....
 
OK, I found the problem. There may be others, but this is certainly the biggest leak to start with.
 
The file "ImageFunctions.h" contains the method named GetGrayScaleImage().
The line within the if statement shown below
byte* p8Bit = new byte [grayScaleSize];
allocates one byte of memory for every pixel in the image being tested.
I can't just deallocate this at the end of the method because obviously the calling method wants to use it.
 
I have done some reading today about constructors and destructors and they don't appear to be used.
It seems that the method that calls GetGrayScaleImage() should create the p8Bit thing and then destroy it when it is finished. I'm easily confused by the C++ syntax.
 
How can I destroy the p8Bit thing? Even doing it just before it is created again would be good enough.
 
Any ideas?
 
Thanks David
 

 
static HBITMAP GetGrayScaleImage(HBITMAP hBmp, int width, int height, unsigned int bytesPerPixel, int pitch, byte** ppGrayScaleBits = NULL)
	{	
		if(bytesPerPixel == 0 || bytesPerPixel == 2)
			return NULL;	
		
		int width_in_bytes = width * bytesPerPixel;
		
		int unusedBytes = absolute(pitch) - width_in_bytes;						
		if(!(unusedBytes % 3))
			unusedBytes = 0;
		
		HBITMAP hBmp8 = Create8bppBitmap(NULL, width + unusedBytes, height, NULL); 
 
		HDC hdcScr = ::GetDC(NULL);
 
		HDC hdc = ::CreateCompatibleDC(hdcScr);
		HBITMAP hBmpOld = (HBITMAP)::SelectObject(hdc, hBmp);
 
		HDC hdc8 = ::CreateCompatibleDC(hdcScr);
		HBITMAP hBmpOld8 = (HBITMAP)::SelectObject(hdc8, hBmp8);
 
		::BitBlt(hdc8, 0, 0, width, height, hdc, 0, 0, SRCCOPY);
		if(ppGrayScaleBits)
		{	
			int grayScaleSize = width * height;		
 
			byte* p8Bit = new byte [grayScaleSize];		
			::GetBitmapBits(hBmp8, grayScaleSize, p8Bit);
 
			*ppGrayScaleBits = p8Bit;
 
		}		
		
		::SelectObject(hdc, hBmpOld);
		::SelectObject(hdc8, hBmpOld8);
 
		::DeleteDC(hdc);
		::DeleteDC(hdc8);
 
		::ReleaseDC(NULL, hdcScr);				
		
		return hBmp8;		
	}

General: Re: Memory Leak (member Mukit, Ataul, 6 Apr '13 - 8:22)
I can provide a temporary / shortcut solution; after all, the C# client and the C++ DLL are by no means a finished product, rather a hack.
 
Use a C++ array (such as MFC CArray or std::vector ) of void pointers to keep track of the allocated memories (through new).
 
Then after your deskewing operation is done, write another dll function in the C++ dll such as RemoveAllocatedMem which will just loop the array and do a delete allocateditem[itemindex] and after the looping and deleting is over clear the content of the array through allocateditem.RemoveAllItems() /allocatedItem.clear() ...
 
You should invoke this newly created dll function when all your operations are done, and memory can be cleared. I hope the solution is good enough for now.
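The suggestion above could be sketched roughly as follows. AllocationTracker and Allocate are hypothetical names; RemoveAllocatedMem is the function name proposed in the post. One deliberate difference from the post: the tracker stores typed unsigned char pointers and releases them with delete[], because the buffers come from new[] and deleting through a void pointer is not valid C++. This is an untested sketch of the idea under those assumptions, not a verified fix for the leak in the DLL.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Remembers every buffer it hands out so that all of them can be released
// in one call once the skew detection is finished.
class AllocationTracker
{
public:
    AllocationTracker() {}
    ~AllocationTracker() { RemoveAllocatedMem(); }

    // Allocates `size` bytes with new[] and records the pointer.
    unsigned char* Allocate(std::size_t size)
    {
        assert(size > 0);
        // Reserve the slot first so that push_back cannot throw after new[].
        m_buffers.reserve(m_buffers.size() + 1);
        unsigned char* p = new unsigned char[size];
        m_buffers.push_back(p);
        return p;
    }

    // Frees every tracked buffer with delete[] and forgets the pointers.
    void RemoveAllocatedMem()
    {
        for (std::size_t i = 0; i < m_buffers.size(); ++i)
            delete[] m_buffers[i];
        m_buffers.clear();
    }

    std::size_t Count() const { return m_buffers.size(); }

private:
    // Non-copyable: two trackers must never own the same pointers.
    AllocationTracker(const AllocationTracker&);
    AllocationTracker& operator=(const AllocationTracker&);

    std::vector<unsigned char*> m_buffers;
};
```

In this scheme, GetGrayScaleImage() would obtain its buffer from the tracker instead of calling new directly, and an exported DLL function would call RemoveAllocatedMem() after each document. Nothing may still be using the buffers at that point.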
General: Re: Memory Leak (member David_Pollard, 10 Apr '13 - 19:32)
I took the approach of trying to learn a bit more about how C++ works and how to de-allocate the memory correctly. Unfortunately I didn't have any success apart from some really basic test programs and I can't spend any more time on this at the moment.
 
To work around the problem I provided an option to disable skew detection in my application. While skew detection is enabled it can process about 200 pages before the application crashes. While it is disabled there is no problem.
 
I'm keen to know if anyone else picks this up and fixes the leak(s).
General: Compile Errors (member Member 9774961, 6 Feb '13 - 12:12)
Hi Everyone,
Firstly, this looks like a great program and very well written. Unfortunately, I haven't done much C++ in many years. But I think I should just be able to call the functions I need within this DLL from my C# program without any problems. The provided demo application should give me enough clues about how to do that. All I want is to detect the skew angle in documents so I can send staff back to the scanner to do the job properly. :)
 
Like others here, I have run into various hard-to-understand error messages, but after running sxstrace (sxs = side by side) I have determined that the precompiled DLL included in the zip is compiled against a debug version of the C++ runtime that was included with whatever version of Visual Studio Mukit used to build the program. As the debug version is never redistributable and I have a different debug runtime in my version of VS 2008, I will need to recompile the DLL.
 
No problem, all the source is included, so I should be able to rebuild in both debug and release mode in my copy of VS 2008. I found a couple of problems where the locations of two or three header files were specified using a slightly different path than the actual location of the files. This was easy to correct.
 
The rebuild now progresses further but I get several warnings and errors. Some I think I can just ignore for now but others look more serious and there are three fatal errors.
I have tried googling these errors but due to my rusty old c++ knowledge and the fact that it was never that good in the first place I don't understand the solutions.
 
Can anyone here please assist with getting the DeskewDLL to compile on VS 2008?
 
Here are the results when I try and build.
 
Warning 1 Command line warning D9035 : option 'Wp64' has been deprecated and will be removed in a future release  cl  DeskewDLL
Warning 2   warning C4273: 'm_clrBackRotateBmp' : inconsistent dll linkage  c:\Projects\BulkDocumentHandling\BulkDocumentHandling\SDK\CSharpClient\DeskewImage\DeskewDLL\DeskewDLLManager.cpp   19  DeskewDLL
Error   3   error C2491: 'CDeskewDLLManager::m_clrBackRotateBmp' : definition of dllimport static data member not allowed   c:\Projects\BulkDocumentHandling\BulkDocumentHandling\SDK\CSharpClient\DeskewImage\DeskewDLL\DeskewDLLManager.cpp   19  DeskewDLL
Warning 4   warning C4273: 'CDeskewDLLManager::CDeskewDLLManager' : inconsistent dll linkage    c:\Projects\BulkDocumentHandling\BulkDocumentHandling\SDK\CSharpClient\DeskewImage\DeskewDLL\DeskewDLLManager.cpp   24  DeskewDLL
Warning 5   warning C4273: 'CDeskewDLLManager::~CDeskewDLLManager' : inconsistent dll linkage   c:\Projects\BulkDocumentHandling\BulkDocumentHandling\SDK\CSharpClient\DeskewImage\DeskewDLL\DeskewDLLManager.cpp   36  DeskewDLL
Warning 6   warning C4273: 'CDeskewDLLManager::OnOpenDocument' : inconsistent dll linkage   c:\Projects\BulkDocumentHandling\BulkDocumentHandling\SDK\CSharpClient\DeskewImage\DeskewDLL\DeskewDLLManager.cpp   54  DeskewDLL
Warning 7   warning C4273: 'CDeskewDLLManager::OnSaveDocument' : inconsistent dll linkage   c:\Projects\BulkDocumentHandling\BulkDocumentHandling\SDK\CSharpClient\DeskewImage\DeskewDLL\DeskewDLLManager.cpp   92  DeskewDLL
Warning 8   warning C4273: 'CDeskewDLLManager::GetImageSkewAngle' : inconsistent dll linkage    c:\Projects\BulkDocumentHandling\BulkDocumentHandling\SDK\CSharpClient\DeskewImage\DeskewDLL\DeskewDLLManager.cpp   105 DeskewDLL
Warning 9   warning C4273: 'CDeskewDLLManager::GetImageEdgeData' : inconsistent dll linkage c:\Projects\BulkDocumentHandling\BulkDocumentHandling\SDK\CSharpClient\DeskewImage\DeskewDLL\DeskewDLLManager.cpp   126 DeskewDLL
Warning 10  warning C4273: 'CDeskewDLLManager::GetLoadedBitmap' : inconsistent dll linkage  c:\Projects\BulkDocumentHandling\BulkDocumentHandling\SDK\CSharpClient\DeskewImage\DeskewDLL\DeskewDLLManager.cpp   144 DeskewDLL
Warning 11  warning C4273: 'CDeskewDLLManager::GetGrayScaleBitmap' : inconsistent dll linkage   c:\Projects\BulkDocumentHandling\BulkDocumentHandling\SDK\CSharpClient\DeskewImage\DeskewDLL\DeskewDLLManager.cpp   149 DeskewDLL
Warning 12  warning C4273: 'CDeskewDLLManager::GetDeskewedBitmap' : inconsistent dll linkage    c:\Projects\BulkDocumentHandling\BulkDocumentHandling\SDK\CSharpClient\DeskewImage\DeskewDLL\DeskewDLLManager.cpp   154 DeskewDLL
Warning 13  warning C4273: 'CDeskewDLLManager::GetImageEdgesBitmap' : inconsistent dll linkage  c:\Projects\BulkDocumentHandling\BulkDocumentHandling\SDK\CSharpClient\DeskewImage\DeskewDLL\DeskewDLLManager.cpp   171 DeskewDLL
Error   14  error C2491: 'SkewManOpenFile' : definition of dllimport function not allowed   c:\Projects\BulkDocumentHandling\BulkDocumentHandling\SDK\CSharpClient\DeskewImage\DeskewDLL\DeskewDLL.cpp  71  DeskewDLL
Error   15  error C2491: 'SkewManGetImageSkewAngle' : definition of dllimport function not allowed  c:\Projects\BulkDocumentHandling\BulkDocumentHandling\SDK\CSharpClient\DeskewImage\DeskewDLL\DeskewDLL.cpp  76  DeskewDLL
 

UPDATE: After rereading some of the other posts here it seems I too can compile in debug mode but I get the errors only in release mode which I will need before I can deploy my finished product.
 
David
General: Re: Compile Errors (member Mukit, Ataul, 7 Feb '13 - 5:28)
Please make sure the include and library paths are the same in release and debug modes. I have a feeling you haven't included the necessary paths in your release version.
Try to match the release version with the debug version.
 
The main difference between debug and release is,
In debug mode, you generate .pdb file and you have _DEBUG as a define.
In release mode, you don't have _DEBUG defined and, furthermore, you have some optimization settings for speed and memory optimization.
 
I am sorry that I cannot help you too much with the compilation because of my very busy schedule at the moment. However, like I suggested in a previous reply to another question, you can put this project in odesk and ask for a proper build in release which would cost a few dollars.
 
Again apologies and hope somebody else from this forum can help you solve the issue.
General: Re: Compile Errors (member Member 9774961, 7 Feb '13 - 14:56)
Hi Mukit,
Thanks for your assistance, I completely understand that paid work comes well in front of free stuff. It is very generous of you to post this project and it must have taken a lot of work. Your advice has helped and I now have the test App and DLL compiling error free.
 
For anyone else who is reading this
I'm using Visual Studio 2008 on Windows 7 64-bit. I compiled and ran the application as 32-bit.
 
Things that I modified in the Release Mode Configuration are as follows
 
1) Configuration Properties -> C/C++ -> Preprocessor -> Preprocessor Definitions.
Added "SKEWMANAGER_DLL"
 
2) Configuration Properties -> C/C++ -> General -> Additional Include Directories.
Added “.\SkewManager”
 
3) Still received the following warning:
warning LNK4075: ignoring '/INCREMENTAL' due to '/LTCG' specification DeskewDLL DeskewDLL
 
It appears that the Linker -> General -> Enable Incremental Linking = YES (/INCREMENTAL) option is not compatible with Linker -> Optimization -> Link Time Code Generation (/LTCG). As the linker preferred to ignore the /INCREMENTAL setting, I turned it off (/INCREMENTAL:NO) to remove the warning.
 
4) Still Received the warning:
warning C4996: 'fopen': This function or variable may be unsafe. Consider using fopen_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details. c:\projects\bulkdocumenthandling\bulkdocumenthandling\sdk\csharpclient\deskewimage\deskewdll\skewmanager\canny_edge.h 137 DeskewDLL
 
Added the _CRT_SECURE_NO_WARNINGS option to C/C++ -> Preprocessor -> Preprocessor Definitions for both Release and Debug modes.
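For background on step 1: error C2491 and warning C4273 usually mean that the DLL's own sources see their declarations marked as dllimport instead of dllexport. A common header pattern switches between the two with a preprocessor symbol that only the DLL project defines, which is presumably the role SKEWMANAGER_DLL plays here. The sketch below illustrates that pattern. The SKEWMAN_API macro and the SkewManExampleAngle function are hypothetical and are not taken from the article's headers.

```cpp
#include <cassert>

// Normally set in the DLL project's Preprocessor Definitions rather than in
// the source; defined here only so that the sketch is self-contained.
#define SKEWMANAGER_DLL

#if defined(_WIN32)
  #ifdef SKEWMANAGER_DLL
    #define SKEWMAN_API __declspec(dllexport)   // building the DLL itself
  #else
    #define SKEWMAN_API __declspec(dllimport)   // building a client of the DLL
  #endif
#else
  #define SKEWMAN_API                            // no decoration elsewhere
#endif

// Declaration as it would appear in the shared header.
extern "C" SKEWMAN_API double SkewManExampleAngle(const char* fileName);

// Definition inside the DLL project. Without SKEWMANAGER_DLL the declaration
// above would be dllimport, and MSVC rejects the definition with C2491.
double SkewManExampleAngle(const char* fileName)
{
    assert(fileName != 0);
    return 0.0; // placeholder result for the sketch
}
```

If the symbol is defined in the Debug configuration but missing from Release, the same sources build in one mode and fail in the other, which matches the behaviour described in this thread.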
 
I'll try and re-zip the solutions and upload it so that others who wish to use it don't have to go through all of this.
 
Next.
I'll be trying to test the skew angle on multi page tiff files. If I manage to make any modifications to the code to allow for this I'll post again here.
 
Thanks
David

Last Updated 14 Feb 2013
Article Copyright 2010 by Mukit, Ataul, David_Pollard