Hello community,

I have been working on extracting text from images in .NET, specifically text rendered in
seven-segment fonts. Unfortunately, my attempts with popular libraries such as Tesseract and IronOcr
have been unsuccessful, as they seem to work well only with normal English fonts.


Here's a brief overview of my approach so far:
1. Tesseract: Limited to normal English fonts, unable to recognize seven-segment characters.
2. IronOcr: Similar limitations, not suitable for seven-segment fonts.

Despite these efforts, I'm facing challenges in accurately extracting text from images with
seven-segment fonts.

Link to: Image Dataset Folder

What I have tried:

Additionally, I've experimented with image processing techniques, including:
• Cropping and zooming to the text region.
• Applying gray, black and white, and binarization filters.
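The preprocessing steps listed above (crop, grayscale, binarization) can be sketched as follows. This is only an illustration, not the code used in the question: it is written in Python for brevity, works on plain nested lists instead of a real image type, and the function names, the fixed threshold of 128, and the `invert` flag are all assumptions. The arithmetic carries over directly to a .NET bitmap loop.

```python
def crop(image, top, left, height, width):
    """Return the sub-rectangle of a row-major image (list of rows)."""
    return [row[left:left + width] for row in image[top:top + height]]


def to_gray(pixels):
    """Convert rows of (r, g, b) tuples to rows of 0-255 luminance values.

    Uses the common Rec. 601 luma weights.
    """
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in pixels
    ]


def binarize(gray, threshold=128, invert=False):
    """Map each grayscale value to 1 (segment) or 0 (background).

    threshold is an assumed fixed cut-off; set invert=True when the
    segments are darker than the background, as on many LCD panels.
    """
    result = []
    for row in gray:
        bits = [1 if value >= threshold else 0 for value in row]
        if invert:
            bits = [1 - b for b in bits]
        result.append(bits)
    return result


if __name__ == "__main__":
    # Tiny 2x3 synthetic image: bright pixels stand in for lit segments.
    image = [
        [(255, 255, 255), (10, 10, 10), (200, 200, 200)],
        [(0, 0, 0), (250, 240, 230), (30, 30, 30)],
    ]
    region = crop(image, top=0, left=1, height=2, width=2)
    print(binarize(to_gray(region)))
```

A fixed threshold is the simplest choice; if the lighting varies across the dataset, a per-image threshold may work better.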
Updated 15-Dec-23 5:52am
Comments
[no name] 17-Dec-23 14:31pm    
Use machine learning / pattern recognition. You feed it the font together with the "answers" (a training set, which is a subset of the sample set). Then you run tests against the whole sample set.
https://mdfarragher.medium.com/optical-character-recognition-with-c-ml-net-and-net-core-3cf71864b815
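As a rough illustration of the "training set with answers, then test" idea, here is a minimal sketch that swaps the ML.NET model from the linked article for a much simpler nearest-neighbour lookup. It assumes each digit has already been reduced to seven on/off values, one per segment, in the conventional order a to g; that feature extraction step, and all names below, are hypothetical and not taken from the article. It is written in Python for brevity.

```python
# Segment order (a, b, c, d, e, f, g):
# top, top-right, bottom-right, bottom, bottom-left, top-left, middle.
SEGMENTS = {
    "0": (1, 1, 1, 1, 1, 1, 0),
    "1": (0, 1, 1, 0, 0, 0, 0),
    "2": (1, 1, 0, 1, 1, 0, 1),
    "3": (1, 1, 1, 1, 0, 0, 1),
    "4": (0, 1, 1, 0, 0, 1, 1),
    "5": (1, 0, 1, 1, 0, 1, 1),
    "6": (1, 0, 1, 1, 1, 1, 1),
    "7": (1, 1, 1, 0, 0, 0, 0),
    "8": (1, 1, 1, 1, 1, 1, 1),
    "9": (1, 1, 1, 1, 0, 1, 1),
}

# The "training set": feature vectors paired with their known answers.
TRAIN = [(bits, digit) for digit, bits in SEGMENTS.items()]


def hamming(a, b):
    """Count the positions where two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def predict(train, sample):
    """Return the label of the training example closest to sample."""
    return min(train, key=lambda item: hamming(item[0], sample))[1]


def accuracy(train, labelled):
    """Fraction of (features, label) pairs that predict() gets right."""
    hits = sum(predict(train, features) == label for features, label in labelled)
    return hits / len(labelled)


if __name__ == "__main__":
    print(predict(TRAIN, (1, 1, 0, 1, 1, 0, 1)))
    print(accuracy(TRAIN, TRAIN))
```

With real images the feature vectors would come from sampling the seven segment regions of each binarized digit, and `accuracy` would be run on labelled samples that were held out of the training set.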

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)