I have several documents whose text was extracted from PDF files. Here is the text:

document 1: 2423423kh 23423h asdfswdfe. 23423sd asdf. asdfsadfsd. 134 1345 19%
document 2: 2344jklj 234 sad. 23. 23..23 asdfsadf22323d 3245 19%
document 3: 254kj kj2345 k345j34lk5j34k5 34534 k2k3k3 asdfsadf23d 43 19%

Using one and the same regex, I need to extract:
document 1: 134
document 2: 3245
document 3: 43

All of the text before these numbers (i.e. starting with asdf..) is variable and has no recognizable pattern.

The difficulty is that in document 1 there is an extra number between the number to extract and 19%. In documents 2 and 3 you can simply extract the number directly in front of 19%.

Can you help me?

What I have tried:

I've tried the regular expression:
\d{0,5}(?=\s19%)
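
As a rough illustration of why this attempt falls short, here is a minimal Python sketch (Python is used only for demonstration). It assumes the three sample texts arrive without their "document N:" labels; the names `DOCS`, `ATTEMPT` and `first_match` are made up for this example. The lookahead only accepts digits sitting directly in front of " 19%", so document 1 yields the extra number rather than 134.

```python
import re

# Sample texts from the question, assumed to arrive without the "document N:" label.
DOCS = [
    "2423423kh 23423h asdfswdfe. 23423sd asdf. asdfsadfsd. 134 1345 19%",
    "2344jklj 234 sad. 23. 23..23 asdfsadf22323d 3245 19%",
    "254kj kj2345 k345j34lk5j34k5 34534 k2k3k3 asdfsadf23d 43 19%",
]

# The pattern from the question: up to five digits, followed by " 19%".
ATTEMPT = re.compile(r"\d{0,5}(?=\s19%)")


def first_match(text):
    """Return the first match of the attempted pattern, or None."""
    m = ATTEMPT.search(text)
    return m.group(0) if m else None


attempt_results = [first_match(d) for d in DOCS]
print(attempt_results)  # ['1345', '3245', '43']
```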
Updated 15-Jul-23 0:05am
Comments
Richard MacCutchan 15-Jul-23 3:27am    
Probably easier to use:
string number = stext.Split(" ")[6];
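
The split suggestion can be sketched in Python as follows (for illustration only; `DOCS` and `seventh_token` are hypothetical names). It assumes each text arrives without its "document N:" label, that tokens are separated by single spaces, and that the wanted number is always the seventh token. That holds for the three samples, but since the question says the leading text is variable, a fixed index may not hold for other documents.

```python
# Sample texts from the question, assumed to arrive without the "document N:" label.
DOCS = [
    "2423423kh 23423h asdfswdfe. 23423sd asdf. asdfsadfsd. 134 1345 19%",
    "2344jklj 234 sad. 23. 23..23 asdfsadf22323d 3245 19%",
    "254kj kj2345 k345j34lk5j34k5 34534 k2k3k3 asdfsadf23d 43 19%",
]


def seventh_token(text):
    """Mirror stext.Split(" ")[6]: split on single spaces and take index 6."""
    return text.split(" ")[6]


split_results = [seventh_token(d) for d in DOCS]
print(split_results)  # ['134', '3245', '43']
```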
Rens Hansen 15-Jul-23 3:52am    
Thanks. I think I can use it in combination with the solution from @originalgriff.

Try this:
(?<=\s)(?<first>\d+)(?:\s\d+)?(?=\s\d+%)

The "first" group will contain your number.
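
A minimal sketch of reading that group, written in Python for illustration. Note the assumptions: Python spells named groups `(?P<first>...)` where the .NET pattern above uses `(?<first>...)`, the sample texts are taken without their "document N:" labels, and `DOCS`, `PATTERN` and `extract_first` are names invented for this example.

```python
import re

# Sample texts from the question, assumed to arrive without the "document N:" label.
DOCS = [
    "2423423kh 23423h asdfswdfe. 23423sd asdf. asdfsadfsd. 134 1345 19%",
    "2344jklj 234 sad. 23. 23..23 asdfsadf22323d 3245 19%",
    "254kj kj2345 k345j34lk5j34k5 34534 k2k3k3 asdfsadf23d 43 19%",
]

# Same pattern as above, with the named group in Python syntax.
PATTERN = re.compile(r"(?<=\s)(?P<first>\d+)(?:\s\d+)?(?=\s\d+%)")


def extract_first(text):
    """Return the contents of the "first" group, or None if nothing matches."""
    m = PATTERN.search(text)
    return m.group("first") if m else None


named_results = [extract_first(d) for d in DOCS]
print(named_results)  # ['134', '3245', '43']
```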

If you are going to use regular expressions, you need a helper tool. Get a copy of Expresso: it's free, and it examines and generates regular expressions.
 
 
Comments
Rens Hansen 15-Jul-23 3:51am    
Ah thanks, it comes close to what I need.
Now it contains:
134 1345
3245
43

In document 1 it should only contain 134.

And thanks for Expresso, I'm sure going to check it out!
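
The "134 1345" result is what the whole match looks like for document 1, since the optional `(?:\s\d+)?` part sits outside the "first" group but still inside the match. Possibly the whole match was read instead of the group. A small Python sketch of the difference (for illustration; Python writes the named group as `(?P<first>...)`, and `DOC1`, `whole_match` and `first_group` are names made up here):

```python
import re

# Document 1 from the question, assumed to arrive without its "document 1:" label.
DOC1 = "2423423kh 23423h asdfswdfe. 23423sd asdf. asdfsadfsd. 134 1345 19%"

PATTERN = re.compile(r"(?<=\s)(?P<first>\d+)(?:\s\d+)?(?=\s\d+%)")

m = PATTERN.search(DOC1)
whole_match = m.group(0)        # everything the pattern consumed
first_group = m.group("first")  # only the named group

print(whole_match)  # 134 1345
print(first_group)  # 134
```

In .NET the corresponding values are `Match.Value` for the whole match and `Match.Groups["first"].Value` for the group.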
Just a few interesting links to help with building and debugging RegEx.
Here is a link to RegEx documentation:
perlre - perldoc.perl.org
Here are links to tools that help build and debug RegEx:
.NET Regex Tester - Regex Storm
Expresso Regular Expression Tool
RegExr: Learn, Build, & Test RegEx
Online regex tester and debugger: PHP, PCRE, Python, Golang and JavaScript
This one shows the RegEx as a nice graph, which is really helpful for understanding what a RegEx is doing: Debuggex: Online visual regex tester. JavaScript, Python, and PCRE.
This site also shows the RegEx as a nice graph, but it can't test what the RegEx matches: Regexper
 
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


