Many users want to convert documents for personal use into formats suited to the viewing applications they are familiar with. Several conversion tools are available, but for DRM-protected documents the number of free tools is particularly limited. Calibre is one of those free tools and is used, with a DeDRM extension, to convert Kindle books to a variety of other formats. In the case of Kindle books, however, the DeDRM tool requires that the book be downloaded using an older copy of the Amazon Kindle app that can provide a file format supported by the DeDRM extension. gbCapture can capture document text content from the document’s native viewer without the need for DRM removal.
For a simple capture, gbCapture is run in “mini” mode, as shown in this image:
For evaluating all of the gbCapture settings, selectively extracting pages, viewing details of extracted text/images and generally better understanding how gbCapture works, the following user interface is also supported.
I work with low vision users, particularly folks with macular degeneration. Their loss of eyesight is a great disappointment for them, in no small part because it prevents them from reading books, such as those they buy from Amazon. Native book reading apps such as the Kindle app do not allow low vision users to adjust the way the book content is displayed, at least not to the degree that low vision users require. A product of mine called EZReader allows low vision users to reformat books so they can better see the content, as well as have the content read out loud. But EZReader only works with text content, so the Kindle book must be converted to text before it can be displayed in EZReader.
So, the purpose of this article is to demonstrate a straightforward technique for converting protected documents, such as DRM-protected Kindle books, to text. I call the utility gbCapture, and its interface is designed so that even low vision users can successfully convert their DRM-protected books to text for import into EZReader or for conversion to other formats by other conversion applications.
One of the best known free solutions for converting DRM-protected books to other formats is Calibre. Like other available solutions, it requires the use of a DeDRM extension. In the case of Kindle books, it also requires that the book be downloaded with an earlier version of the Kindle app, one that downloads books in a format that DeDRM supports.
With gbCapture, the text is extracted by viewing the book in its native viewer (such as any version of the Kindle app), taking a picture of each page in the document and using OCR (tesseract) to extract the text. gbCapture automates the process, capturing each page one at a time and then turning the page in the native viewer until the end of the document is reached. When the last page has been captured, the extracted text is merged into a single text file.
The Article Body
gbCapture is written in the PowerBASIC language, whose ability to call the Win32 API directly makes it easy to supplement the built-in PowerBASIC statements. gbCapture makes heavy use of the Win32 API.
Let’s go through the basic operation of gbCapture, then I’ll highlight several areas and provide code examples for the more critical gbCapture procedures. I’ll use the Kindle app for the discussion, but gbCapture can also work with other applications that display documents (Word, WordPad, Notepad, etc.).
To begin a book capture, open the Kindle app to display a book. The app should be placed in the center of the desktop and sized to ensure that the pages to be captured have margins around the text.
gbCapture automatically detects the app that covers the center of the desktop. It also detects the window within the app that contains the displayed text.
With gbCapture opened in “mini” mode off to the side of the Kindle app, simply press Capture in gbCapture. gbCapture will take a picture of the currently viewed page, extract the text, turn the page and then repeat the process until it reaches the end of the book. “End of Book” is determined by two consecutive captures returning the same text.
Here’s an image that gbCapture provides to confirm to the user that the window containing the text has been identified.
Once Capture is pressed, gbCapture begins the automated capture/extract text/turn page procedure. The gbCapture status bar indicates progress. Capture stops automatically when the end of the book is reached, and the user can manually stop the capture at any time.
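The stop rule described above (two consecutive captures returning the same text) can be sketched in a few lines of PowerBASIC. This is a minimal illustration under stated assumptions, not the gbCapture source: CaptureAndExtract is a hypothetical stand-in for the capture and OCR step, and it simulates a three-page book whose viewer keeps showing the last page once the end is reached.

```powerbasic
#Compile Exe
#Dim All

' Hypothetical stand-in for "capture the page image and OCR it".
' Assumption: past the last page the viewer stops turning, so the
' same text is returned again.
Function CaptureAndExtract(ByVal PageNo As Long) As String
   Local LastPage As Long
   LastPage = 3                                  'pretend the book has 3 pages
   Function = "Text of page" + Str$(Min&(PageNo, LastPage))
End Function

Function PBMain() As Long
   Local PageNo As Long
   Local ExtractedText, LastText, Document As String
   Do
      Incr PageNo
      ExtractedText = CaptureAndExtract(PageNo)
      If ExtractedText = LastText Then Exit Do   'same text twice in a row: end of book
      LastText = ExtractedText                   'remember for the next comparison
      Document = Document + ExtractedText + $CrLf
   Loop
   MsgBox Document                               'each page's text should appear once
End Function
```

One consequence of this rule is that two genuinely identical consecutive pages (two blank pages, for example) would also stop the capture.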
Key Code Sections
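The majority of the code is straightforward and does not require in-depth discussion. However, I’ve selected a few of the procedures for additional discussion. The window containing the text is found using the Win32 API: WindowFromPoint returns the window at the center of the desktop, and GetParent returns that window's parent, which is taken to be the viewing app.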
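Local pt As Point
Desktop Get Client To pt.x, pt.y       'size of the desktop client area
pt.x = pt.x/2 : pt.y = pt.y/2          'center point of the desktop
hCenterWindow = WindowFromPoint(pt)    'window under the center point (the text container)
hCenterApp = GetParent(hCenterWindow)  'its parent, treated as the viewing app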
gbCapture provides two ways to capture an image of the text: one captures only the container window, and the other captures the entire desktop and then extracts the container window. Not all viewing apps respond to both approaches, so gbCapture allows the user to select which to use. PowerBASIC provides a number of “Graphic” statements to make it easier to work with images. In particular, “Graphic Bitmap” is a PowerBASIC statement that creates an in-memory bitmap, whose DC is used in the capture code. Here’s the code for capturing the container window.
'Capture Image to ImgName
Local hPageDC, hBMP, hBMPDC As Dword, w,h As Long
GetWindowRect hCenterWindow, rcCenter        'bounds of the container window
hPageDC = GetDC(hCenterWindow)               'DC of the container window
w = rcCenter.Right - rcCenter.Left
h = rcCenter.Bottom - rcCenter.Top
Graphic Bitmap New w,h To hBMP               'in-memory bitmap the size of the window
Graphic Attach hBMP, 0
Graphic Get DC To hBMPDC
BitBlt hBMPDC, 0, 0, w, h, hPageDC, 0, 0, %SRCCopy   'copy the window into the bitmap
ReleaseDC hCenterWindow, hPageDC             'release with the same window handle passed to GetDC
Graphic Save ImgName
Graphic Bitmap End
Statusbar Set Text hDlg, %IDC_Statusbar, 1,0, " Image: " + PathName$(Namex,ImgName)
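The second approach, capturing from the desktop, is not listed in the article, so here is a rough sketch of how it could look using the same statements as the code above. It is an assumption-laden illustration rather than gbCapture's actual routine: CaptureFromDesktop and the file name page.bmp are hypothetical, the RECT member names follow the article's usage, and instead of grabbing the whole screen and cropping afterward it copies only the container window's rectangle out of the screen DC.

```powerbasic
#Compile Exe
#Dim All
#Include "Win32API.inc"

' Hypothetical helper: copy the screen area covered by hWnd into a bitmap file.
Sub CaptureFromDesktop(ByVal hWnd As Dword, ByVal ImgName As String)
   Local hDeskDC, hBMP, hBMPDC As Dword, w, h As Long
   Local rc As Rect
   GetWindowRect hWnd, rc                 'window bounds in screen coordinates
   w = rc.Right - rc.Left
   h = rc.Bottom - rc.Top
   hDeskDC = GetDC(%Null)                 'DC for the entire screen
   Graphic Bitmap New w, h To hBMP        'in-memory bitmap the size of the window
   Graphic Attach hBMP, 0
   Graphic Get DC To hBMPDC
   'source origin is the window's top-left corner on the screen
   BitBlt hBMPDC, 0, 0, w, h, hDeskDC, rc.Left, rc.Top, %SRCCopy
   ReleaseDC %Null, hDeskDC               'screen DC was obtained with a null handle
   Graphic Save ImgName
   Graphic Bitmap End
End Sub

Function PBMain() As Long
   Local pt As Point, cx, cy As Long
   Desktop Get Client To cx, cy
   pt.x = cx \ 2 : pt.y = cy \ 2          'center of the desktop
   CaptureFromDesktop WindowFromPoint(pt), "page.bmp"
End Function
```

Because this variant copies whatever is visible on screen, the viewer must not be covered by another window while the capture runs.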
With the page image captured, the tesseract OCR engine is used to extract the text. The time to complete a page image capture and extract the text depends on the capability of the user’s PC, but in general it might take up to 1 second per page. At that rate a complete book of 600 pages takes about 600 seconds, or 10 minutes. That is much slower than other available tools, but it works without removing the document’s DRM protection.
'extract and clean the text using Tesseract
Shell ($Tesseract + " --psm " + IIf$(MultiColumn,"4 ","1 ") + _
   ImgName + " " + PathName$(Path,TextName) + _
   PathName$(Name,TextName), 0) 'wait for it to finish
Statusbar Set Text hDlg, %IDC_Statusbar, 1,0, " Text: " + PathName$(Namex,TextName)
Open TextName For Binary As #1 : Get$ #1, Lof(1), _
   ExtractedText$ : Close #1 'get extracted text
If LastText$ = ExtractedText$ Then 'same text as the previous page: end of document
   If IsFile(ImgName) Then Kill ImgName
   If IsFile(TextName) Then Kill TextName
   StopCapture = 1
Else
   LastText$ = ExtractedText$ 'for comparison, to know when to stop
   Open TextName For Output As #1 : Print #1, _
      ExtractedText$; : Close #1 'save cleaned, but not formatted, text
   'append the extracted text to $Document
   Open $Document For Append As #1
   If AutoParagraphFormatting Then ParagraphFormatting(ExtractedText$)
   Print #1, $CrLf + ExtractedText$ + IIf$(UsePageNumbers, _
      $CrLf + "Page: " + Str$(ActionCount) + $CrLf, "") ;
   Close #1
   Statusbar Set Text hDlg, %IDC_Statusbar, 1,0, " Text: " + PathName$(Namex,TextName)
End If
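To make the Tesseract call easier to follow, the sketch below only builds and displays the command line instead of running it. The output argument is the text file's path plus its name without extension because Tesseract adds the .txt extension itself. The file paths and the plain "tesseract.exe" program name are hypothetical, and the added quotes are my assumption for handling paths that contain spaces; they are not part of the code above.

```powerbasic
#Compile Exe
#Dim All

Function PBMain() As Long
   Local ImgName, TextName, OutBase, Cmd As String
   ImgName  = "C:\My Captures\page001.bmp"       'hypothetical page image
   TextName = "C:\My Captures\page001.txt"       'hypothetical text file to be produced
   'output base = folder + file name without extension
   OutBase = PathName$(Path, TextName) + PathName$(Name, TextName)
   'page segmentation mode 1 is the single-column choice in the code above
   Cmd = "tesseract.exe --psm 1 " + $Dq + ImgName + $Dq + " " + $Dq + OutBase + $Dq
   MsgBox Cmd                                    'inspect the command before passing it to Shell
End Function
```

Displaying the command first is a cheap way to check the path handling before a long capture run depends on it.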
Once the current page is captured and its text extracted, the viewing app is sent a Next Page command by giving focus to the application and sending it a Page Down keystroke. Both the key down and key up events are sent.
keybd_event(%VK_NEXT, 0, 0, 0)
keybd_event(%VK_NEXT, 0, %KEYEVENTF_KEYUP, 0)
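The two calls above cover only the keystroke. The focus step mentioned in the text could be sketched as follows, assuming SetForegroundWindow is used to bring the viewer forward. TurnPage is a hypothetical helper name, and both delay values are guesses that would need tuning for a particular viewer and PC.

```powerbasic
#Compile Exe
#Dim All
#Include "Win32API.inc"

' Hypothetical helper: give the viewer focus, then press and release Page Down.
Sub TurnPage(ByVal hApp As Dword)
   SetForegroundWindow hApp                       'keystrokes go to the foreground window
   Sleep 200                                      'assumed time for the focus change to settle
   keybd_event %VK_NEXT, 0, 0, 0                  'Page Down key down
   keybd_event %VK_NEXT, 0, %KEYEVENTF_KEYUP, 0   'Page Down key up
   Sleep 500                                      'assumed time for the new page to render
End Sub

Function PBMain() As Long
   'For illustration only: sends Page Down to whichever window is currently in front.
   'In gbCapture the handle would be the detected center app.
   TurnPage GetForegroundWindow()
End Function
```

Waiting after the keystroke matters because a capture taken before the viewer has redrawn would repeat the previous page's text and could trigger the end-of-book check early.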
Conclusion and Points of Interest
For capturing text from a document, gbCapture offers an alternative to existing solutions. In particular, it can extract text from protected documents without having to enable document editing, break the document’s DRM protection or use a specific version of the native file viewer. With the text in hand, users can turn to other conversion applications to create documents in other formats.
- 20th March, 2023: Initial version
I'm an electronics engineer. I worked for Texas Instruments in Dallas as part of their Equipment Group, which provided military electronics for the US Military. Raytheon bought that part of TI and I worked for Raytheon about 10 years before retiring in 2007.
In retirement I have started a company, New Vision Concepts, which provides software (EZReader) for folks with eye diseases such as macular degeneration. I got interested in using PowerBASIC for writing applications, using it to develop an app to help my mother-in-law read books. I've since expanded the software suite to about 100 apps for low vision users.
Wife, children, grandkids, tennis and EZReader consume my time. I'm busier in retirement than I was before I "retired"!