I am having trouble splitting a string. I can split the string, but I also need to apply a condition that controls how it is split.

My string's text is

France;Japan;Russia;China;"India;New-Delhi";"NewZeland;Wellington";England


I want to split the string using ; as the delimiter. No issues there.
But I also want ; to be treated as ordinary text when it appears inside " ".

The result that I need is
France
Japan
Russia
China
India;New-Delhi
NewZeland;Wellington
England


As opposed to
France
Japan
Russia
China
India
New-Delhi
NewZeland
Wellington
England


that I am getting.

Any suggestions? I am stuck. :'(

Update

I am getting the string from a CSV file.
I split the string using:
C#
MyString.Split(';');
Updated 14-Sep-15 3:31am
Comments
Herman<T>.Instance 14-Sep-15 9:18am    
Replace the ; between country and city with a :
Riya-Pandey 14-Sep-15 9:26am    
For that I will have to find the "" first.
Patrice T 14-Sep-15 9:20am    
What are you using to split the string ?
How do you get the string ?
Riya-Pandey 14-Sep-15 9:23am    
MyString.Split(';');

I am getting the string from a CSV file.
Andy Lanng 14-Sep-15 9:23am    
Ah, much better. That's a good question ^_^

Solution 1: Never use as a separator a character that appears in the data.
I guess you see why it is a problem.
If you can change how your CSV file is generated, use a separator that does not occur in the data.

Solution 2: it gets pretty complicated.
C#
/// requires: using System.Text.RegularExpressions;
string csvline = @"France;Japan;Russia;China;""India;New-Delhi"";""NewZeland;Wellington"";England";
/// replace all ';' with ','
csvline = csvline.Replace(";", ",");
/// search all quoted substrings with a ',' inside and put the ';' back
/// (caveat: the pattern cannot tell opening quotes from closing ones, so two
/// unquoted fields sitting between two quoted fields can get joined as well)
string pattern = @"(""[^""]+),([^""]+"")";
Regex rgx = new Regex(pattern);
Match m = rgx.Match(csvline);
while (m.Success)
{
    csvline = rgx.Replace(csvline, "$1;$2");
    m = rgx.Match(csvline);
}
/// remove the quotes
csvline = csvline.Replace("\"", "");
/// split at ,
string[] words = csvline.Split(',');


To see what a regex is doing: https://www.debuggex.com/
I eventually found a tester that generates C# code: https://www.myregextester.com/index.php
 
Comments
Patrice T 14-Sep-15 10:13am    
Whoever is unhappy with this solution can say what they don't like.
I see the vote changed, but even 3 stars is a downvote and harms reputation.
Leo Chapiro 14-Sep-15 10:39am    
I downvoted because your solution was obviously not ready: it was just something like "string[] words = csvline.Split(',');", which made no sense! Now that you have completed it, it looks good and I have voted +5.
Patrice T 14-Sep-15 10:54am    
Thanks :)
That is CP's fault: it changed my end-of-line comments and commented out everything but the last line.
So I changed to ///
Thanks7872 14-Sep-15 10:42am    
harm reputation???

So what? How useful are these points? They mean nothing, really.
Patrice T 14-Sep-15 10:48am    
OK, it is not money. But reputation reflects your activity on CP.
Maybe I still care about reputation because I only started to change color a short time ago. :)
According to Expresso, this regex will give what you want:
((?>"[^"]+")|[^;]+)

It will split
France;Japan;Russia;China;"India;New-Delhi";"NewZeland;Wellington";England

into
France
Japan
Russia
China
"India;New-Delhi"
"NewZeland;Wellington"
England

Then you can use .Trim('"') on the individual results to remove any quotes there:
C#
/// requires: using System.Linq; using System.Text.RegularExpressions;
string csvline = "France;Japan;Russia;China;\"India;New-Delhi\";\"NewZeland;Wellington\";England";
string pattern = "((?>\"(?:[^\"]+)\")|[^;]+)";
Regex rgx = new Regex(pattern);
var matches = rgx.Matches(csvline);
/// remove the quotes and get the words in order.
string[] words = matches.Cast<Match>().Select(m => m.Groups[1].Value.Trim('"')).ToArray();
 
First, I'd suggest looking at CSV parsers (just swap the comma for the semicolon). Handling delimiters inside quoted strings is a problem that has been examined in detail under the heading of CSV parsing.

This one looks good and lists the quote character as something it handles:
A Fast CSV Reader
This would be my first choice.
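
If adding a third-party library is not an option, a parser that ships with .NET can do the same job. The sketch below is only an illustration and is not the "A Fast CSV Reader" library mentioned above: it swaps in TextFieldParser from the Microsoft.VisualBasic assembly (which the project would need to reference). The class name CsvLineDemo and the method ParseLine are made up for this example.

```csharp
using System;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class CsvLineDemo
{
    // Hypothetical helper: parses one ';'-separated line, treating text
    // inside double quotes as a single field.
    public static string[] ParseLine(string line)
    {
        using (var parser = new TextFieldParser(new StringReader(line)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(";");                 // semicolon instead of comma
            parser.HasFieldsEnclosedInQuotes = true;   // keep "India;New-Delhi" together
            return parser.ReadFields();                // null when there is nothing to read
        }
    }
}
```

Since the data comes from a CSV file, the same parser can presumably be opened on the file path directly, with ReadFields() called in a loop until EndOfData is true.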

Another option might be to use regular expressions, focusing on look-ahead and look-behind groups. This can be a little overwhelming if you have not used them before.
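
As a rough sketch of the look-ahead idea (one possible pattern, not necessarily the one the poster had in mind): split on a ; only when the rest of the line contains an even number of quote characters, which means the ; sits outside any quoted field. The class name LookaheadSplit is hypothetical, and the sketch assumes quotes are always paired and never escaped.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class LookaheadSplit
{
    // Matches a ';' that is followed by an even number of '"' characters
    // up to the end of the line, i.e. a ';' outside quotes.
    private static readonly Regex Delimiter =
        new Regex(";(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)");

    public static string[] Split(string line)
    {
        // Split keeps the fields in their original order (empty ones too);
        // Trim removes the surrounding quotes from quoted fields.
        return Delimiter.Split(line).Select(f => f.Trim('"')).ToArray();
    }
}
```

The look-ahead rescans the remainder of the line for every ;, so this is likely to be slow on very long lines; for ordinary CSV rows that should not matter much.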

My final suggestion would be to split by the quote character first, then split by the semicolon.

A word of warning: some of these options will not work if there is a chance of a malformed string (i.e. unpaired quote characters):

C#
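// requires: using System; using System.Collections.Generic;
var strings = mystring.Split('"');
var result = new List<string>();

// With paired quotes the split yields an odd number of segments:
// even indexes lie outside quotes, odd indexes lie inside them.
if (strings.Length % 2 == 0)
    throw new FormatException("Unpaired quote character");

for (int x = 0; x < strings.Length; x++)
{
    if (x % 2 == 1)
    {
        // Quoted segment: keep it whole, semicolons included.
        result.Add(strings[x]);
        continue;
    }

    string segment = strings[x];
    bool first = x == 0, last = x == strings.Length - 1;

    // An empty segment, or a lone ';' between two quoted fields, holds no field.
    if (segment.Length == 0 || (segment == ";" && !first && !last)) continue;

    // Always expect ;" or "; around a quoted field, then drop that ';'.
    if (!first)
    {
        if (segment[0] != ';') throw new FormatException("Expected ';' after a closing quote");
        segment = segment.Substring(1);
    }
    if (!last)
    {
        if (!segment.EndsWith(";")) throw new FormatException("Expected ';' before an opening quote");
        segment = segment.Substring(0, segment.Length - 1);
    }

    result.AddRange(segment.Split(';'));
}

return result.ToArray();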


Hope that helps ^_^
Andy
 
Try this
C#
string test = "France;Japan;Russia;China;\"India;New-Delhi\";\"NewZeland;Wellington\";England";
List<string> Jointarray=new List<string>();
foreach (Match match in Regex.Matches(test, "\"[^\"]*\""))
{
     Jointarray.Add(match.ToString());
     test = test.Replace(match.ToString(), "");
}
Jointarray.AddRange(test.Split(';').ToList().Where(x => !string.IsNullOrEmpty(x))); 

Now, Jointarray is what you are looking for.

Regards,
 
Comments
Riya-Pandey 14-Sep-15 13:37pm    
Thanks Rohan. It works, but I have a problem: the values' positions get changed, and I can't have that because I have to enter data from the CSV file into a database. The table has 63 columns, and any column can hold multiple values [I mean a string with multiple ;]. So I need these values to stay in order. :'(
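
For the ordering requirement raised in this comment, one approach that keeps every value in its original position is to scan the line character by character and only treat ; as a delimiter while outside quotes. This is a minimal sketch under stated assumptions, not code from the thread: the class name QuotedSplitter is invented, quotes are assumed to be paired, and escaped quotes ("" inside a field) are not handled.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class QuotedSplitter
{
    // Hypothetical helper: splits on the delimiter except inside double
    // quotes, returning the fields in the order they appear in the line.
    public static string[] Split(string line, char delimiter = ';')
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        foreach (char c in line)
        {
            if (c == '"')
            {
                inQuotes = !inQuotes;             // toggle state, drop the quote itself
            }
            else if (c == delimiter && !inQuotes)
            {
                fields.Add(current.ToString());   // end of a field (may be empty)
                current.Clear();
            }
            else
            {
                current.Append(c);                // ordinary text, or ';' inside quotes
            }
        }

        if (inQuotes)
            throw new FormatException("Unpaired quote character in line.");

        fields.Add(current.ToString());           // the last field
        return fields.ToArray();
    }
}
```

Called as QuotedSplitter.Split(MyString), this keeps empty fields as empty strings, so a 63-column row should still produce 63 entries that line up with the table columns.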
try this solution

C#
string value = "France;Japan;Russia;China;\"India;New-Delhi\";\"NewZeland;Wellington\";England";
string ReplaceDoubleQuote = value.Replace("\"", " ");
string[] words = ReplaceDoubleQuote.Split(';');
 
Comments
Riya-Pandey 14-Sep-15 9:32am    
Nope, it's returning the same result. Not working.
Andy Lanng 14-Sep-15 9:40am    
The issue is that Quoted text should remain as a single string :S
string[] strarray = test.Split(new char[] { ';' }, 6);
string mainString = strarray[0] + strarray[1] + strarray[3] + ....


Or you can take these as separate strings:

string r = strarray[0];
string s = strarray[1];
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


