I have a string that contains some junk characters and some Hindi characters. I want to remove all of the junk characters while keeping the Hindi characters.

For example:

C#
string sjunk="<div>te♠st data here ♠ ♣ !@#$%^&* ♠ ♣<p>dfdsससे पहले कब-कब शाहरुख खान </p> ";
sjunk = System.Text.RegularExpressions.Regex.Replace(sjunk, @"[^\u0000-\u007F]", "");


The regex above removes both the Hindi characters and the junk characters. I want only the junk characters to be removed.


Thanks..

1 solution

You'll need to update your regular expression so that it also allows characters from the Devanagari Unicode block (U+0900 to U+097F):
C#
sjunk = System.Text.RegularExpressions.Regex.Replace(sjunk, @"[^\u0000-\u007F\u0900-\u097f]", "");
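
As a minimal sketch of the same idea, the filter can be wrapped in a reusable helper. The class name `JunkFilter` and the method name `KeepAsciiAndDevanagari` are hypothetical names chosen for illustration. The pattern uses .NET's named block `\p{IsDevanagari}`, which covers the same U+0900 to U+097F range as the explicit escape above.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original answer.
public static class JunkFilter
{
    // Matches any character that is neither Basic Latin (U+0000-U+007F)
    // nor in the Devanagari block (U+0900-U+097F).
    private static readonly Regex NotAsciiOrDevanagari =
        new Regex(@"[^\u0000-\u007F\p{IsDevanagari}]");

    // Returns the input with every matched character removed.
    public static string KeepAsciiAndDevanagari(string input)
    {
        return NotAsciiOrDevanagari.Replace(input, "");
    }
}
```

Calling `JunkFilter.KeepAsciiAndDevanagari(sjunk)` on the string from the question would drop symbols such as ♠ and ♣ and keep the Hindi text. Note that ASCII punctuation and the HTML tags are still kept, because they fall inside the U+0000 to U+007F range.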
 
Sergey Alexandrovich Kryukov 5-Nov-15 10:12am    
5ed.
—SA

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


