Hello!

I have an XML that contains various <bold> and <italic> tags.

I am loading the file with
XDocument.Load(filename, LoadOptions.PreserveWhitespace)
, but after removing the tags and writing it to another file, all the <bold> and <italic> tags are getting replaced by newlines ("\n").

For example,

<text><bold>Text</bold></text>
is:
<text>
Text
</text>

<text><bold><italic>Text</italic></bold></text>
is:
<text>


Text

</text>

I want it to be:
<text>Text</text>


Please help.

Regards
Aman

What I have tried:

XDocument.Load(filename, LoadOptions.PreserveWhitespace)
Posted
Updated 11-Feb-19 9:14am

Solution 1

That's what LoadOptions.PreserveWhitespace does: it preserves all the insignificant whitespace in the input file, that is, the spaces ' ', newlines '\n', and tabs '\t' that sit between tags purely for layout.

Remove the option, and it'll probably disappear. If it doesn't, you need to look very closely at your input file.
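
To see what the option changes, here is a minimal sketch. The sample XML and the names WhitespaceLoadDemo and TextOf are made up for illustration; only XElement.Parse and LoadOptions come from LINQ to XML. It parses the same string with and without PreserveWhitespace and prints the resulting text.

```csharp
using System;
using System.Xml.Linq;

public static class WhitespaceLoadDemo
{
    // Parses the XML with the given options and returns the concatenated text content.
    public static string TextOf(string xml, LoadOptions options)
    {
        return XElement.Parse(xml, options).Value;
    }

    public static void Main()
    {
        // Hypothetical sample: a single space separates the two inline elements.
        const string xml = "<paragraph><bold>Bold</bold> <italic>Italic</italic></paragraph>";

        // Without the option, the whitespace-only text between the tags is dropped.
        Console.WriteLine(TextOf(xml, LoadOptions.None));

        // With the option, the space is kept as a text node.
        Console.WriteLine(TextOf(xml, LoadOptions.PreserveWhitespace));
    }
}
```

So dropping the option removes layout newlines, but it also removes a lone space between two inline elements, which matters if that space is real content.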
   
Comments
Primo Chalice 11-Feb-19 3:47am
   
Hello!

What happens is that when I don't give that option, then tags like <bold>Bold <italic>Italic are concatenated and I need those spaces between them.

Without LoadOptions - BoldItalic
With LoadOptions - Bold Italic -- This is what I want.

My main XML has a structure like <paragraph><bold><italic>Text Goes Here< /italic>< /bold>< /paragraph>. This becomes:

<paragraph>
<bold>
<italic>Text Goes Here
< /italic> -- ignore the space
< /bold> -- ignore the space
< /paragraph> -- ignore the space

I think this is the problem. Is there a way to prevent this from happening?

I am getting one result correctly and the other is at fault.

Please help.

Regards
Aman
OriginalGriff 11-Feb-19 3:52am
   
Have a look here: https://www.tutorialspoint.com/xml/xml_white_spaces.htm
Primo Chalice 11-Feb-19 4:32am
   
Hello!

Is there a way of preventing the child nodes from going to the next line i.e. preserving the main XML structure?

Regards
Aman
OriginalGriff 11-Feb-19 4:52am
   
That doesn't make a lot of sense in isolation. Bear in mind we can't see your screen, access your HDD, or read your mind; we only get exactly what you type to work with.
Primo Chalice 11-Feb-19 4:56am
   
Yes, sorry. What I meant was that since I am using XDocument.Load(), I think it is restructuring the XML file and moving the child nodes onto separate lines below their respective parent nodes. So I just wanted to know: is there a way to load the XML file using XDocument.Load() without changing the original layout?
OriginalGriff 11-Feb-19 5:10am
   
No, when you load the document it parses the XML and will discard what it considers irrelevant. At a guess, you need to look at what you are doing to produce the XML when you remove the bold and italic tags, not try to patch it up when you load the result.

Solution 2

When you specify LoadOptions.PreserveWhitespace while loading, no whitespace is added or removed, so I do not think this is where the problem is.

I think your problem is on saving.

Take a look at the SaveOptions Enum (System.Xml.Linq) | Microsoft Docs.

It is an optional parameter on XDocument.Save.
Specifically you would need the DisableFormatting flag to ensure the writer doesn't insert insignificant whitespaces.
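
As a rough sketch of that idea (the class name StripTagsDemo, the method StripFormatting, and the sample inputs are assumptions for illustration, not the poster's actual code): load with PreserveWhitespace, unwrap each <bold>/<italic> element by replacing it with its own child nodes, and serialize with SaveOptions.DisableFormatting.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class StripTagsDemo
{
    // Removes <bold> and <italic> wrappers but keeps their content,
    // then serializes without any added indentation or newlines.
    public static string StripFormatting(string xml)
    {
        XElement root = XElement.Parse(xml, LoadOptions.PreserveWhitespace);

        // Reverse document order so nested wrappers are unwrapped innermost first.
        var wrappers = root.Descendants()
            .Where(e => e.Name == "bold" || e.Name == "italic")
            .Reverse()
            .ToList();

        foreach (XElement wrapper in wrappers)
        {
            // Replace the element with its children (text and nested nodes).
            wrapper.ReplaceWith(wrapper.Nodes());
        }

        return root.ToString(SaveOptions.DisableFormatting);
    }

    public static void Main()
    {
        Console.WriteLine(StripFormatting("<text><bold><italic>Text</italic></bold></text>"));
        Console.WriteLine(StripFormatting("<paragraph><bold>Bold</bold> <italic>Italic</italic></paragraph>"));
    }
}
```

When writing to a file instead of a string, the same flag can be passed as the second argument: document.Save(path, SaveOptions.DisableFormatting).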

But in all cases, you need to learn to debug. Do not try to just look at the input and output and then randomly tweak some code. Single step over the relevant code in the debugger and observe it.

Are the extra newlines present after load? If yes, you need to look into how to load it correctly.

Are they present after removing the bold tags etc? If so, you need to look into the code doing the replacement and try to come up with a solution.

If they are not present in the element when you call save, then it is added by save and you need to look into the flags you pass on to save.

Be careful with the debugger in Visual Studio, it tries to "help" you by rendering new lines as spaces sometimes. Use the "text visualizer" available by clicking the small dropdown menu shown with a magnifying glass next to the value.
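
If the visualizer is awkward to use, a small helper can make the whitespace visible in plain output. This is only a sketch; WhitespaceDebug, Visible, and TextNodes are hypothetical names, and the sample XML is invented. It lists every text node in the tree with newlines and tabs escaped, so it can be called after load, after the tag removal, and just before save.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class WhitespaceDebug
{
    // Makes control characters visible, e.g. a newline is shown as the two characters \n.
    public static string Visible(string text)
    {
        return text.Replace("\r", "\\r").Replace("\n", "\\n").Replace("\t", "\\t");
    }

    // Returns every text node under the element, in document order, with whitespace escaped.
    public static List<string> TextNodes(XElement root)
    {
        return root.DescendantNodes()
            .OfType<XText>()
            .Select(t => Visible(t.Value))
            .ToList();
    }

    public static void Main()
    {
        // Hypothetical input that already contains layout newlines.
        const string xml = "<text>\n<bold>Text</bold>\n</text>";

        XElement preserved = XElement.Parse(xml, LoadOptions.PreserveWhitespace);
        Console.WriteLine(string.Join(" | ", TextNodes(preserved)));

        XElement stripped = XElement.Parse(xml);
        Console.WriteLine(string.Join(" | ", TextNodes(stripped)));
    }
}
```

If the escaped newlines already show up right after loading, they come from the input file rather than from the save step.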

Most likely using this flag is the correct approach in your case. But if you are responsible for generating the XML files, and you want to minimize the chance of other tools making similar errors when processing your files "down the line", you should look into the xml:space attribute. It tells any standards-compliant XML writer/loader to preserve the significant whitespace on both load and save without any additional parameters being needed. Specify it on the root element if you are lazy (and want to make it less likely that you forget it somewhere), or on individual elements if you want to keep the file "nice", so that it can still be formatted wherever whitespace does not matter.
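
A small sketch of the load side of this, under the assumption that the default reader behind XElement.Parse honors xml:space="preserve" (my understanding of how it treats significant whitespace, not something confirmed in this thread). XmlSpaceDemo and TextOf are hypothetical names and the sample XML is invented. No LoadOptions are passed in either call.

```csharp
using System;
using System.Xml.Linq;

public static class XmlSpaceDemo
{
    // Parses with default options (no PreserveWhitespace) and returns the text content.
    public static string TextOf(string xml)
    {
        return XElement.Parse(xml).Value;
    }

    public static void Main()
    {
        // Without xml:space, the lone space between the inline elements is treated as insignificant.
        Console.WriteLine(TextOf("<paragraph><bold>Bold</bold> <italic>Italic</italic></paragraph>"));

        // With xml:space="preserve" on the parent, the space is reported as significant and kept.
        Console.WriteLine(TextOf("<paragraph xml:space=\"preserve\"><bold>Bold</bold> <italic>Italic</italic></paragraph>"));
    }
}
```

How a given writer treats the attribute on save is worth checking separately with your own files before relying on it.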
   

Solution 3

Quote:
is:
<text>


Text

</text>


I want it to be:
<text>Text</text>


Your XML output looks exactly like that because...
Quote:
after removing the tags and writing it to another file, all the <bold> and <italic> tags are getting replaced by ("\n").


Conclusion: replace the unnecessary tags with an empty string, not with "\n".
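
Taken literally, on the raw text, that could look like the sketch below. It is only an illustration: TagTextStripper and Strip are hypothetical names, and the pattern assumes the tags carry no attributes and never appear inside comments or CDATA sections.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagTextStripper
{
    // Matches opening and closing <bold>/<italic> tags without attributes.
    private static readonly Regex FormattingTag = new Regex(@"</?(?:bold|italic)\s*>");

    // Replaces each matched tag with an empty string, leaving all other text untouched.
    public static string Strip(string xml)
    {
        return FormattingTag.Replace(xml, string.Empty);
    }

    public static void Main()
    {
        Console.WriteLine(Strip("<text><bold><italic>Text</italic></bold></text>"));
    }
}
```

Because this never parses or re-serializes the document, no whitespace is added or removed anywhere else in the file.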
   

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Web05 | 2.8.190214.1 | Last Updated 11 Feb 2019
Copyright © CodeProject, 1999-2019
All Rights Reserved.