I have this line that I want to process with a regular expression:
<title>жадно девочка and 10 green elfs - Google.com - 05 </title>

I want a single regular expression that, given a very large string, will:
find <title> and </title>,
keep the name between them, which in this case is "жадно девочка and 10 green elfs",
omit " - Google.com - ",
and keep the end of the string: "05 ", or together with the closing tag: "05 </title>".
In the end it should look like this:
<title>жадно девочка and 10 green elfs - 05 </title>
Clear as mud? Thank you!

What I have tried:

I haven't tried anything yet.
I know how to build most of it myself, but for years I have had great trouble with the part of the expression that omits text.
Updated 21-Jul-21 1:02am
Comments
Patrice T 21-Jul-21 7:03am    
Show samples of what you want to match and what you don't want.
Example:
Input: <title> Sample </title>
match 1: '<title> Sa' aka '<title>' and 3 letters
match 2: 'le </title>' aka 3 letters and '</title>'
Exclude: 'mp'

1 solution

Indeed - clear as mud. I'm really not sure what exactly you want, but I'm assuming you want
to turn this:
<title>жадно девочка and 10 green elfs - Google.com - 05 </title>
Into this:
<title>жадно девочка and 10 green elfs 05 </title>

In which case, try this:
(?<=\<title\>.+?)\s*-.+?-\s*(?=.*?\</title>)

And use a Replace operation:
C#
using System.Text.RegularExpressions;

/// <summary>
///  Regular expression built for C# on: Wed, Jul 21, 2021, 12:00:03 PM
///  Using Expresso Version: 3.0.4750, http://www.ultrapico.com
///  
///  A description of the regular expression:
///  
///  Match a prefix but exclude it from the capture. [\<title\>.+?]
///      \<title\>.+?
///          Literal <
///          title
///          Literal >
///          Any character, one or more repetitions, as few as possible
///  \s*-.+?-\s*
///      Whitespace, any number of repetitions
///      -
///      Any character, one or more repetitions, as few as possible
///      -
///      Whitespace, any number of repetitions
///  Match a suffix but exclude it from the capture. [.*?\</title>]
///      .*?\</title>
///          Any character, any number of repetitions, as few as possible
///          Literal <
///          /title>
///  
///
/// </summary>
public static Regex regex = new Regex(
    "(?<=\\<title\\>.+?)\\s*-.+?-\\s*(?=.*?\\</title>)",
    RegexOptions.CultureInvariant
    | RegexOptions.Compiled
    );

// This is the replacement string: the matched " - Google.com - " becomes a single space
public static string regexReplace = " ";

// InputText is the string to search; this line belongs inside a method
string result = regex.Replace(InputText, regexReplace);
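
For anyone who wants to paste this into a project, here is a minimal self-contained sketch of the same Replace call. The class name TitleCleaner and the method name Clean are made up for illustration. The pattern is the one above, written as a verbatim string and without the backslashes before < and >, which .NET does not need. Note that the question's expected output keeps a " - " between the name and "05", so the replacement string is left as a parameter here.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class TitleCleaner
{
    // Same pattern as in the answer: the lookbehind requires "<title>" plus at least
    // one character before the match, and the lookahead requires "</title>" later on
    // the same line. Only the "-...-" segment with its surrounding whitespace is consumed.
    private static readonly Regex MiddleSegment = new Regex(
        @"(?<=<title>.+?)\s*-.+?-\s*(?=.*?</title>)",
        RegexOptions.CultureInvariant | RegexOptions.Compiled);

    // Replaces every matched segment with the given replacement text.
    public static string Clean(string input, string replacement)
    {
        return MiddleSegment.Replace(input, replacement);
    }
}
```

With the sample line from the question as input, Clean(input, " ") should give the output assumed in this answer, and Clean(input, " - ") should give the output shown in the question.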
If you are going to work with regular expressions, then get a copy of Expresso - it's free, and it examines, tests, and generates regular expressions.
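
One caveat with the pattern above: it starts at the first hyphen it finds after <title>, so a name that itself contains a hyphen (say "well-known elfs") may be cut in the wrong place. If the part to drop is always the literal text "Google.com", which is an assumption on my part, a capture-group version is a possible alternative. This is only a sketch; the class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class SiteNameStripper
{
    // Group 1: "<title>" plus the name. Group 2: the tail up to and including "</title>".
    // The middle part must be the literal site name surrounded by " - " separators.
    private static readonly Regex SiteSegment = new Regex(
        @"(<title>.*?)\s+-\s+Google\.com\s+-\s+(.*?</title>)",
        RegexOptions.CultureInvariant | RegexOptions.Compiled);

    // Rebuilds the title from the two captured groups, joined by a single " - ".
    public static string Strip(string input)
    {
        return SiteSegment.Replace(input, "$1 - $2");
    }
}
```

Because the name and the tail are captured rather than looked around, the replacement string "$1 - $2" decides what separates them; change it to "$1 $2" if you want a single space instead.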
 
Comments
Maciej Los 22-Jul-21 15:46pm    
:)

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


