I thought I was close on this, but after 2 hours I'm making more of a mess of it than progress. I need to search some strings for parts like "34 page" or "1 pages" and fix the singular/plural form. So a string like this:

>>> Document 1 (p 4 of 12 page).

needs to be converted to this:

>>> Document 1 (page 4 of 12 pages).


I have 2 patterns and am doing 2 replaces. The first looks for a count of 1 followed by the plural "pages" and changes it to "1 page"; that part is working fine.
But changing from singular to plural is tripping me up.


VB
'Here's my code. Like I said, the first part is running just fine. It's the second part that's hanging me up.
Dim myString as string = "Document 1 (page 1 of 12 page)"
Dim rxPat As String
Dim rx As Regex
'first, is there only 1 page with "pages"; if so, take out the "s"
rxPat = "([^1-9][1]\s*?pages)"
rx = New Regex(rxPat, RegexOptions.Compiled Or RegexOptions.IgnoreCase)
If rx.IsMatch(myString) Then
    myString = rx.Replace(myString, rxPat, "1 page")
End If
'second, is there more than one page with just "page"; if so, add an "s"
rxPat = "(([1-9][01]|[2-9])\s*?)(page)([^s])"
rx = New Regex(rxPat, RegexOptions.Compiled Or RegexOptions.IgnoreCase)
If rx.IsMatch(myString) Then
    myString = rx.Replace(myString, rxPat, "$1$2s$3")
End If


I should also mention that the above code slows my process down immensely. A process that normally takes about 9 seconds nearly doubles with this code in the loop. I'm hoping there's a faster way; I can't afford to double the processing time.

Hope someone can help.

Thanks in advance. :-)

[edit]Comment character ' added before Here word of the first line - PES [/edit]
Updated 7-Mar-12 6:28am

This may be helpful to you:

VB
'Keep the Regex declaration outside the loop so the pattern is only built once.
'If you want at least one space around the page numbers, use this pattern instead -> "\(\s*p[ages]*\s+(\d+)\s+of\s+(\d+)\s+p[ages]*\s*\)"
Dim Regex1 As Regex = New Regex("\(\s*p[ages]*\s*(\d+)\s*of\s*(\d+)\s*p[ages]*\s*\)", RegexOptions.IgnoreCase)
Dim Match1 As Match

Match1 = Regex1.Match(MyString)
If Match1.Success Then
    'Rebuild the matched text; append "s" unless the total page count is exactly 1.
    MyString = Regex1.Replace(MyString, String.Format("(Page {0} of {1} Page{2})", Match1.Groups(1).Value, Match1.Groups(2).Value, _
    If(Match1.Groups(2).Value = "1", "", "s")))
End If



You may accept and vote for the solution if your problem is solved; otherwise, please post your queries.

[edit]Replaced (\d*) with (\d+) in the Regex patterns to ensure that at least one digit shall be present for page No. to match with the pattern - PES [/edit]
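
If a single string can contain more than one "(p X of Y page)" fragment, a MatchEvaluator lets each match be rebuilt from its own numbers. The following is only a sketch under some assumptions: the names PageTextFixer, PageRegex, FixMatch and FixPageText are made up for illustration, the pattern is a slightly tightened variant of the one above (it accepts only "p", "page" or "pages" before the first number), and the output is lower case to match the result asked for in the question. The Regex object is created once and reused, in line with the advice to keep it outside the loop.

```vb
Imports System
Imports System.Text.RegularExpressions

Module PageTextFixer
    ' Hypothetical helper. The Regex is created once and reused for every string.
    Private ReadOnly PageRegex As New Regex("\(\s*p(?:ages?)?\s*(\d+)\s*of\s*(\d+)\s*pages?\s*\)", RegexOptions.IgnoreCase Or RegexOptions.Compiled)

    ' Called once per match, so every match gets its own numbers and plural form.
    Private Function FixMatch(ByVal m As Match) As String
        Dim total As String = m.Groups(2).Value
        ' Assumption: only a total of exactly "1" is singular.
        Dim suffix As String = If(total = "1", "", "s")
        Return String.Format("(page {0} of {1} page{2})", m.Groups(1).Value, total, suffix)
    End Function

    Public Function FixPageText(ByVal input As String) As String
        Return PageRegex.Replace(input, New MatchEvaluator(AddressOf FixMatch))
    End Function

    Sub Main()
        ' Prints: Document 1 (page 4 of 12 pages).
        Console.WriteLine(FixPageText("Document 1 (p 4 of 12 page)."))
        ' Prints: Document 2 (page 1 of 1 page).
        Console.WriteLine(FixPageText("Document 2 (page 1 of 1 pages)."))
    End Sub
End Module
```

Strings that contain no such fragment are returned unchanged, so no separate IsMatch check is needed before calling Replace.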
 
If numberhere > 1 Then
    charsbefore = "Pages"
Else
    charsbefore = "Page"
End If


Of course, this is not the exact code, but it is the logic you would follow to solve your problem.
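
As a rough illustration of that logic applied to the strings in the question, the sketch below fixes every "N page" / "N pages" occurrence in one pass. The names PluralSketch, CountRegex, PickWord and FixPlurals are hypothetical, and the pattern is an assumption about how the counts appear in the text (digits, optional whitespace, then "page" or "pages").

```vb
Imports System
Imports System.Text.RegularExpressions

Module PluralSketch
    ' Hypothetical helper: a number, optional whitespace, then "page" or "pages".
    Private ReadOnly CountRegex As New Regex("\b(\d+)(\s*)pages?\b", RegexOptions.IgnoreCase Or RegexOptions.Compiled)

    ' Applies the rule from the answer: plural only when the number is greater than 1.
    Private Function PickWord(ByVal m As Match) As String
        ' Assumption: the digit run is small enough to fit in an Integer.
        Dim count As Integer = Integer.Parse(m.Groups(1).Value)
        Dim word As String = If(count > 1, "pages", "page")
        ' Keep the original number and spacing, replace only the word.
        Return m.Groups(1).Value & m.Groups(2).Value & word
    End Function

    Public Function FixPlurals(ByVal input As String) As String
        Return CountRegex.Replace(input, New MatchEvaluator(AddressOf PickWord))
    End Function

    Sub Main()
        ' Prints: Document 1 (page 4 of 12 pages)
        Console.WriteLine(FixPlurals("Document 1 (page 4 of 12 page)"))
        ' Prints: Document 1 (page 1 of 1 page)
        Console.WriteLine(FixPlurals("Document 1 (page 1 of 1 pages)"))
    End Sub
End Module
```

Note that under this rule a count of 0 also gets the singular "page"; compare with `count <> 1` instead if "0 pages" is wanted.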
 
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


