How can I remove invisible junk characters from an XML file using C# code?

I want to read some XML files. When I read them, I find unwanted characters, such as stray symbols, that I need to remove. Can anyone help me?
Updated 8-Oct-19 22:40pm

C#
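using System.Linq;
using System.Xml;

// Keeps only the characters that XmlConvert.IsXmlChar accepts.
// The check runs per UTF-16 char, so surrogate pairs (e.g. emoji) may also be dropped.
public static string RemoveInvalidXmlChars(string text)
{
    var validChars = text.Where(ch => XmlConvert.IsXmlChar(ch)).ToArray();
    return new string(validChars);
}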
 
 
C#
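// Requires: using System.Text.RegularExpressions;
internal static void RectifyXML()
{
    // The path to the XML file
    string path = @"C:\CodeProject\test.xml";
    // Create the XmlDocument and load the file into it.
    // Load throws an XmlException if the file contains characters that are illegal in XML.
    System.Xml.XmlDocument CXML = new System.Xml.XmlDocument();
    CXML.Load(path);
    // Strip every character outside the ASCII range (U+0000 to U+007F).
    // Note that this also removes legitimate non-ASCII text such as accented letters.
    string correctedXmlString = Regex.Replace(CXML.InnerXml, @"[^\u0000-\u007F]", string.Empty);
    // Re-parse the cleaned markup and overwrite the original file.
    // Save replaces the existing file, so deleting it first is unnecessary and
    // would lose the file if LoadXml failed.
    CXML.LoadXml(correctedXmlString);
    CXML.Save(path);
}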
 
 
JavaScript
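// Matches every character outside the XML 1.0 Char production.
// The "u" flag is needed so the range above U+FFFF is read as code points, not surrogate halves.
var xmlPattern = "[^\\u0009\\u000A\\u000D\\u0020-\\uD7FF\\uE000-\\uFFFD\\u{10000}-\\u{10FFFF}]";

var newXml = xml.replace(new RegExp(xmlPattern, "gu"), "");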
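
Since the characters in the question are invisible, it may help to list them before deleting anything. The following is only a sketch: it assumes XML 1.0 rules and a runtime that supports `String.prototype.matchAll` (ES2020). The names `findInvalidXmlChars` and `INVALID_XML_CHAR` are hypothetical and do not come from any of the answers.

```javascript
// Hypothetical helper, not part of the answers in this thread.
// Anything outside the XML 1.0 Char production counts as invalid:
// #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
const INVALID_XML_CHAR = /[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\u{10000}-\u{10FFFF}]/gu;

// Returns the position and code point of every invalid character in the text.
function findInvalidXmlChars(xml) {
  const found = [];
  for (const match of xml.matchAll(INVALID_XML_CHAR)) {
    const hex = match[0].codePointAt(0).toString(16).toUpperCase().padStart(4, "0");
    found.push({ index: match.index, codePoint: "U+" + hex });
  }
  return found;
}

// Example input: a NUL, a backspace, and the noncharacter U+FFFE.
const sample = "<a>ok\u0000\u0008</a>\uFFFE";
console.log(findInvalidXmlChars(sample));
```

For the sample string this reports U+0000 at index 5, U+0008 at index 6, and U+FFFE at index 11. Knowing which code points appear can hint at where they came from, for example a wrong encoding when the file was written.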
 
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


