I have an XML document as follows:

XML
<break name="article_1-1">
<h1>
  <page num="1" />Some heading</h1>
<bl>
  Human name Contributing Writer</bl>
<h3>
  OPINION
</h3>
<p>First Paragraph</p>
<p>Second Paragraph</p>
<p>Third Paragraph</p>
<bq>
  Some value
</bq>
<p>
  Fourth Paragraph with italic values
</p>
<fig>
  <img src="images/img_1-1.jpg" width="1553" height="1050" alt="" />
  <fc>
	Image caption
  </fc>
  <cr>PHOTOGRAPHS BY SOME HUMAN</cr>
</fig>
<h3>
  CITY, STATE
</h3>
</break>


I want to turn it into this:

XML
<break name="article_1-1">
<h1><page num="1" />Some heading</h1>
<bl>Human name Contributing Writer</bl>
<h3>OPINION</h3>
<p>First Paragraph</p>
<p>Second Paragraph</p>
<p>Third Paragraph</p>
<bq>Some value</bq>
<p>Fourth Paragraph with italic values</p>
<fig><img src="images/img_1-1.jpg" width="1553" height="1050" alt="" /><fc>Image caption</fc><cr>PHOTOGRAPHS BY SOME HUMAN</cr></fig>
<h3>CITY, STATE</h3>
</break>


I will remove the indentation at a later stage; my main focus is on bringing each element's opening and closing tags onto the same line.

I would like a regex for this. I have tried something, but I think there is a better way.

Please help.

Regards

What I have tried:

C#
string pattern = @"(?:(?:(<\w.>)|(<\w>)|(<\w..>|(<p>)|(\/>)))(\s+)|((<\/(?!(title)|(head)|(break)|(body))\w+>)(\s+)(<\/(?!(title)|(head)|(break)|(body))\w+>))|((<\/fc>)(\s+)(<cr>)))";

string substitution2 = @"$1$2$3$8$14$20$22";
Updated 11-Apr-19 0:51am

1 solution

Your regex has 22 capturing groups, but only 7 of them are used in the substitution, so you can probably remove some of them or make them non-capturing with (?:...).
Use Debuggex, in the links below; it shows a nice graph of your regex.
As far as I understand it, the pattern you show does not produce the result you want, so it is difficult to tell what each part is meant to do.
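
As an illustration of how far the group count can be cut, here is a minimal sketch that uses three small patterns instead of one large alternation. It is not a definitive solution: the class name TagJoiner, the method Join, and the list of container elements (break, body, head) that keep their line breaks are assumptions made for this example, and it assumes no attribute value contains < or >. For anything less regular than the sample, an XML parser would be more robust than a regex.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagJoiner
{
    // Assumption: these elements are block containers whose children
    // stay on their own lines. Adjust the list to your document.
    private const string Containers = "break|body|head";

    public static string Join(string xml)
    {
        // 1. Whitespace right after an opening or self-closing tag,
        //    unless the tag is a container, a closing tag, a comment
        //    or a processing instruction.
        string afterOpen =
            @"(<(?![/?!]|(?:" + Containers + @")\b)[^<>]*>)\s+";

        // 2. Whitespace right before a closing tag, unless it closes
        //    a container.
        string beforeClose =
            @"\s+(</(?!(?:" + Containers + @")\b)[^<>]*>)";

        // 3. Special case taken from the question: the caption and the
        //    credit inside <fig> go on the same line.
        string captionCredit = @"(</fc>)\s+(<cr>)";

        xml = Regex.Replace(xml, afterOpen, "$1");
        xml = Regex.Replace(xml, beforeClose, "$1");
        return Regex.Replace(xml, captionCredit, "$1$2");
    }
}
```

Each pattern needs only one or two capturing groups, and every group is used in its replacement string. Calling TagJoiner.Join on the sample from the question should leave the break element's children one per line while collapsing h1, bl, h3, bq, p and fig onto single lines.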

Here are a few useful links to help with building and debugging regexes.
Regex documentation:
perlre - perldoc.perl.org
Tools that help build and debug regexes:
.NET Regex Tester - Regex Storm
Expresso Regular Expression Tool
RegExr: Learn, Build, & Test RegEx
Online regex tester and debugger: PHP, PCRE, Python, Golang and JavaScript
This one shows the regex as a nice graph, which is really helpful for understanding what a regex is doing: Debuggex: Online visual regex tester. JavaScript, Python, and PCRE.
This site also shows the regex as a nice graph, but it cannot test what the regex matches: Regexper
 
Comments
CPallini 11-Apr-19 9:26am    
5.
Patrice T 11-Apr-19 9:48am    
Thank you

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


