I have a PDF file that contains an irregularly laid out table: the title of a cell is on one line, and its value is either on the line below it or follows it on the same line ...
I converted it to text using C# and the IronOCR library.
The result is the data collected in consecutive lines.
How can I extract the field values so that I can store them in a database?

What I have tried:

PDF file:

Title 1
Value 1

Result as text:

title2 title3
title4 value4
Updated 24-Nov-20 7:38am

1 solution

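using System.Collections.Generic;
using System.Web.Script.Serialization;

public Dictionary<string, string> Parser(string jsonTemplate, string data)
{
    // Split the OCR text into rows. The newline escape is "\n", not "/n".
    var arrRow = data.Split('\n');
    JavaScriptSerializer serializer = new JavaScriptSerializer();
    // jsonTemplate describes the titles and where each value sits relative to its title.
    var jsonObject = serializer.DeserializeObject(jsonTemplate);
    // please see here how to use it

    // Then get an item from arrRow by its title, e.g. title3. arrRow is an array, so read the value at the position the template gives for that title.
    return new Dictionary<string, string>(); // placeholder: the lookup itself is not implemented
}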

Sorry, I have no time to make it work. I hope you get the idea.
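To make the idea a bit more concrete, here is a minimal sketch of the lookup step. It is not the code from the answer above: the `ValuePosition` enum, the `OcrFieldParser` class and the in-memory template are hypothetical names made up for illustration, and it assumes only two layouts, a value on the same line as its title or a value on the line directly below the title. Real OCR output will probably need fuzzier matching than this.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical layout kinds: where a field's value sits relative to its title.
public enum ValuePosition { SameLine, NextLine }

public static class OcrFieldParser
{
    // Extracts field values from OCR text, given a template of title -> value position.
    public static Dictionary<string, string> Parse(
        IReadOnlyDictionary<string, ValuePosition> template, string ocrText)
    {
        // Split on Windows and Unix line endings, trim, and drop blank lines.
        string[] lines = ocrText
            .Split(new[] { "\r\n", "\n" }, StringSplitOptions.None)
            .Select(l => l.Trim())
            .Where(l => l.Length > 0)
            .ToArray();

        var result = new Dictionary<string, string>();
        for (int i = 0; i < lines.Length; i++)
        {
            foreach (var field in template)
            {
                string title = field.Key;
                if (field.Value == ValuePosition.SameLine)
                {
                    // "title value": the value is the rest of the line after the title.
                    if (lines[i].StartsWith(title + " ", StringComparison.OrdinalIgnoreCase))
                        result[title] = lines[i].Substring(title.Length).Trim();
                }
                else if (lines[i].Equals(title, StringComparison.OrdinalIgnoreCase)
                         && i + 1 < lines.Length)
                {
                    // The title stands alone on its line: the value is the next non-blank line.
                    result[title] = lines[i + 1];
                }
            }
        }
        return result;
    }
}
```

The returned dictionary holds one entry per title that was found, so each entry can then be written to a database column of the same name. Titles missing from the text are simply absent from the result, which lets the caller decide how to handle incomplete scans.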
Member 15000927 24-Nov-20 17:41pm    
thank you

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)
