I am having trouble splitting a string. I can split it, but I also need to apply a condition to how it is split.

My string is:

France;Japan;Russia;China;"India;New-Delhi";"NewZeland;Wellington";England


I want to split the string using ; as the delimiter. No issues there.
But I also want ; to be treated as text when it appears inside " ".

The result I need is:
France
Japan
Russia
China
India;New-Delhi
NewZeland;Wellington
England


As opposed to
France
Japan
Russia
China
India
New-Delhi
NewZeland
Wellington
England


which is what I am getting.

Any suggestions? I am stuck. :'(

Update

I am getting the string from a CSV file.
I split the string using:
C#
MyString.Split(';');
Updated 14-Sep-15 3:31am
Comments
Herman<T>.Instance 14-Sep-15 9:18am    
replace ; between country and city with a :
Riya-Pandey 14-Sep-15 9:26am    
for that i will have to find the "" first.
Patrice T 14-Sep-15 9:20am    
What are you using to split the string ?
How do you get the string ?
Riya-Pandey 14-Sep-15 9:23am    
MyString.Split(';');

I am Getting the string from a CSV file
Andy Lanng 14-Sep-15 9:23am    
Ah, much better. That's a good question ^_^

string[] strarray = test.Split(new char[] { ';' }, 6);
string mainString = strarray[0] + strarray[1] + strarray[3] + ....


Or you can take these as separate strings:

string r = strarray[0];
string s = strarray[1];
 
According to Expresso, this regex will give what you want:
((?>"[^"]+")|[^;]+)

It will split
France;Japan;Russia;China;"India;New-Delhi";"NewZeland;Wellington";England

into
France
Japan
Russia
China
"India;New-Delhi"
"NewZeland;Wellington"
England

Then you can use .Trim('"') on the individual results to remove any quotes there:
C#
using System.Linq;
using System.Text.RegularExpressions;

string csvline = "France;Japan;Russia;China;\"India;New-Delhi\";\"NewZeland;Wellington\";England";
// Try a whole quoted field first (atomic group); otherwise take a run of characters up to the next ';'.
string pattern = "((?>\"(?:[^\"]+)\")|[^;]+)";
Regex rgx = new Regex(pattern);
var matches = rgx.Matches(csvline);
// Remove the quotes and get the words in order.
string[] words = matches.Cast<Match>().Select(m => m.Groups[1].Value.Trim('"')).ToArray();
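
One thing to watch with the match-based pattern: `[^;]+` needs at least one character, so an empty column (as in `a;;b`) produces no match and the later values shift left. If column positions matter, a character-by-character scan is one possible alternative. The following is only a minimal sketch: `QuoteAwareSplitter` and `SplitLine` are hypothetical names made up for this illustration, and the code assumes quotes are balanced and never escaped as `""` inside a field.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class QuoteAwareSplitter
{
    // Hypothetical helper: splits on the separator except inside double quotes.
    // Empty fields are kept, so column positions are preserved.
    public static List<string> SplitLine(string line, char separator = ';')
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        foreach (char c in line)
        {
            if (c == '"')
            {
                // Toggle the quoted state and drop the quote character itself.
                inQuotes = !inQuotes;
            }
            else if (c == separator && !inQuotes)
            {
                // Separator outside quotes: the current field is complete.
                fields.Add(current.ToString());
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        // The last field has no trailing separator, so add it explicitly.
        fields.Add(current.ToString());
        return fields;
    }
}
```

Calling `QuoteAwareSplitter.SplitLine(csvline)` on the line above should give the seven values in their original order, with `India;New-Delhi` and `NewZeland;Wellington` each kept as one value.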
 
Solution 1: Never use as a separator a character that appears in the data.
I guess you can see why that is a problem.
If you can change how the CSV file is generated, use a separator that does not occur in the data.

Solution 2: it gets pretty complicated.
C#
using System.Text.RegularExpressions;

string csvline = "France;Japan;Russia;China;\"India;New-Delhi\";\"NewZeland;Wellington\";England";
/// replace all ';' with ','
csvline = csvline.Replace(";", ",");
/// search every quoted substring with a ',' inside and turn that ',' back into ';'
string pattern = "(\"[^\"]+),([^\"]+\")";
Regex rgx = new Regex(pattern);
Match m = rgx.Match(csvline);
while (m.Success)
{
    csvline = rgx.Replace(csvline, "$1;$2");
    m = rgx.Match(csvline);
}
/// caveat: if two or more unquoted fields sit between two quoted ones, the pattern
/// can also match from a closing quote to the next opening quote
/// remove the quotes
csvline = csvline.Replace("\"", "");
/// split at ','
string[] words = csvline.Split(',');


To see what a regex is doing: https://www.debuggex.com/
I eventually found a tester that generates C# code: https://www.myregextester.com/index.php
 
Comments
Patrice T 14-Sep-15 10:13am    
Whoever is unhappy with this solution can say what they don't like.
I see the vote changed, but even 3 stars counts as a downvote and harms reputation.
Leo Chapiro 14-Sep-15 10:39am    
I have downvoted as your solution was obviously not ready, something like "string[] words = csvline.Split(',');" - that made no sense! Now after you have completed it looks good and I have voted with +5
Patrice T 14-Sep-15 10:54am    
Thanks :)
That is CP's fault: it changed my end-of-comment markers and commented out everything but the last line.
So I changed to ///
Thanks7872 14-Sep-15 10:42am    
harm reputation???

So what? How useful are these points? They mean nothing, really.
Patrice T 14-Sep-15 10:48am    
OK, it is not money. But reputation reflects your activity on CP.
Maybe I still care about reputation because I only started to change color a short time ago. :)
Try this
C#
string test = "France;Japan;Russia;China;\"India;New-Delhi\";\"NewZeland;Wellington\";England";
List<string> Jointarray=new List<string>();
foreach (Match match in Regex.Matches(test, "\"[^\"]*\""))
{
     Jointarray.Add(match.ToString());
     test = test.Replace(match.ToString(), "");
}
Jointarray.AddRange(test.Split(';').ToList().Where(x => !string.IsNullOrEmpty(x))); 

Now, Jointarray is what you are looking for.

Regards,
 
Comments
Riya-Pandey 14-Sep-15 13:37pm    
Thanks Rohan. It works, but I have a problem: the positions of the values change, and I can't have that because I have to load the data from the CSV file into a database. The table has 63 columns, and any of the columns can hold multiple values [I mean a string containing several ;]. So I need these values to stay in order. :'(
First, I'd suggest looking at CSV parsers (just swap the comma for the semicolon). Handling quoted strings inside CSV fields is a problem that has been examined in detail under the heading of CSV parsing:

This one looks good and lists the quote character as something it handles:
A Fast CSV Reader
This would be my first choice.
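
To illustrate what a parser-based approach looks like, here is a minimal sketch that uses `TextFieldParser` from the `Microsoft.VisualBasic.FileIO` namespace rather than the library linked above (this is a swapped-in alternative, not that library's API). It assumes the project can reference the Microsoft.VisualBasic assembly; `CsvWithTextFieldParser` and `ParseLine` are hypothetical names used only for this example.

```csharp
using System;
using System.IO;
using Microsoft.VisualBasic.FileIO;

static class CsvWithTextFieldParser
{
    // Hypothetical helper: parses one semicolon-delimited line,
    // treating double-quoted text as a single field.
    public static string[] ParseLine(string line)
    {
        using (var parser = new TextFieldParser(new StringReader(line)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(";");
            parser.HasFieldsEnclosedInQuotes = true;
            // ReadFields returns the fields of the current line in order.
            return parser.ReadFields();
        }
    }
}
```

For a whole file, the same parser can be constructed from the file path and `ReadFields()` called in a loop until `EndOfData` is true.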

Another option might be to use regular expressions, focusing on look-ahead and look-behind groups. This can be a little overwhelming if you have not used them before.
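
As a rough sketch of the look-ahead idea: split on a `;` only when the rest of the line contains an even number of quote characters, which means the `;` is not inside a quoted section. The names `LookaheadSplit` and `SplitLine` are hypothetical, and the pattern assumes every quote in the line is paired; with an unpaired quote the results are not meaningful.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class LookaheadSplit
{
    // Matches a ';' that is followed by zero or more complete pairs of quotes
    // up to the end of the line, i.e. a ';' that sits outside any quoted section.
    private static readonly Regex Separator =
        new Regex(";(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)");

    // Hypothetical helper: splits the line and strips the surrounding quotes.
    public static string[] SplitLine(string line)
    {
        return Separator.Split(line)
                        .Select(field => field.Trim('"'))
                        .ToArray();
    }
}
```

Because this splits rather than matches, the order of the values is kept and empty fields come back as empty strings. The look-ahead rescans the remainder of the line for every `;`, so it may be slow on very long lines.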

My final suggestion would be to split by the quote character first, then split by the semicolon.

A word of warning: some of these options will not work if there is a chance of a malformed string (i.e. unpaired quote characters):

C#
var strings = mystring.Split('"');
var result = new List<string>();

// Splitting on '"' leaves unquoted text at even indexes and quoted text at odd indexes.
// An even number of pieces means a quote was never closed.
if (strings.Length % 2 == 0)
    throw new FormatException("Unpaired quote character");

for (int x = 0; x < strings.Length; x++)
{
    if (x % 2 == 1)
    {
        // Quoted piece: keep it whole, semicolons included
        result.Add(strings[x]);
        continue;
    }

    // Unquoted piece: drop the separators next to the quoted neighbours, then split.
    // Note that Trim also discards empty fields at the edges of the piece.
    string piece = strings[x].Trim(';');
    if (piece.Length > 0)
        result.AddRange(piece.Split(';'));
}

return result.ToArray();


Hope that helps ^_^
Andy
 
Try this solution:

C#
string value = "France;Japan;Russia;China;\"India;New-Delhi\";\"NewZeland;Wellington\";England";
string ReplaceDoubleQuote = value.Replace("\"", " ");
string[] words = ReplaceDoubleQuote.Split(';');
 
Comments
Riya-Pandey 14-Sep-15 9:32am    
Nope, it's returning the same result. Not working.
Andy Lanng 14-Sep-15 9:40am    
The issue is that Quoted text should remain as a single string :S

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


