vCard Parser with Lightweight Approach II

31 Jul 2008, CPOL
vCard parser implemented using C#

The download contains a Visual Studio 2008 solution targeting .NET 2.0, including unit tests. The test cases contain some Chinese characters to exercise Quoted-Printable encoding.

Background

I previously published an article on CodeProject, vCard Reader with Lightweight Approach, which demonstrated how to parse vCard text in C# using regular expressions. That article and its code were a proof of concept; integrating the code into commercial applications required further work, which I did later on.

This article covers that follow-up work to strengthen the C# .NET vCard parser. As usual, you may use it without any warranty. Before reading further, please read vCard Reader with Lightweight Approach on CodeProject first.

When testing, please be aware of the following facts:

The vCard standard has been around for more than 10 years and is widely accepted across the industry. However, implementations from different vendors are somewhat buggy, which results in data corruption during data exchange.

For example:

  1. Microsoft Outlook 2003 can handle Unicode. When exporting to vCard, it encodes non-ASCII characters as Quoted-Printable over UTF-8; however, when importing, it fails to read those characters back. Also, NickName is defined in vCard 3.0 but not in vCard 2.1, yet Outlook's implementation of vCard 2.1 includes NickName.
  2. Yahoo has a similar problem. In addition, Rev in a Yahoo vCard is not a DateTime but a kind of 3-digit number, which is a bit dodgy.

These applications cannot eat their own dog food, so keep these facts in mind when you test exchanges of vCard objects. For more quirks of vCard implementations, please read this summary and the vcard-errata.
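
To make the "Quoted-Printable over UTF-8" case concrete, here is a minimal sketch of a decoder. It is not the QuotedPrintable class from the download; the class name QuotedPrintableSketch and the method DecodeUtf8 are assumptions made for illustration, and the sketch assumes the escaped bytes are UTF-8.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration; not the article's QuotedPrintable class.
public static class QuotedPrintableSketch
{
    // Decodes Quoted-Printable text, assuming the escaped bytes are UTF-8.
    public static string DecodeUtf8(string input)
    {
        List<byte> bytes = new List<byte>();
        int i = 0;
        while (i < input.Length)
        {
            char c = input[i];
            if (c == '=')
            {
                // Soft line break: '=' followed by CRLF, or a bare CR or LF.
                if (i + 1 < input.Length && (input[i + 1] == '\r' || input[i + 1] == '\n'))
                {
                    i++;
                    if (i < input.Length && input[i] == '\r') i++;
                    if (i < input.Length && input[i] == '\n') i++;
                    continue;
                }
                // Escaped byte: '=' followed by two hex digits.
                if (i + 2 < input.Length
                    && Uri.IsHexDigit(input[i + 1]) && Uri.IsHexDigit(input[i + 2]))
                {
                    bytes.Add(Convert.ToByte(input.Substring(i + 1, 2), 16));
                    i += 3;
                    continue;
                }
            }
            // Literal character. Quoted-Printable literals are expected to be ASCII;
            // anything else is re-encoded as UTF-8 one char at a time.
            bytes.AddRange(Encoding.UTF8.GetBytes(new char[] { c }));
            i++;
        }
        return Encoding.UTF8.GetString(bytes.ToArray());
    }
}
```

For example, DecodeUtf8("=E4=B8=AD=E6=96=87") yields the two Chinese characters U+4E2D U+6587, which is the kind of round trip worth checking against each vendor's export.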

vCard Handling using C#

[Figure: vCardModel.gif]

Compared with the code in the previous article, the class structure has been reorganized to separate concerns and to improve flexibility and maintainability.

C#
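using System;
using System.Collections.Specialized;
using System.Globalization;
using System.Text.RegularExpressions;

public class VCardReader
    {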
        /// <summary>
        /// Analyze vCard text into vCard properties.
        /// </summary>
        /// <param name="vCardText">vCard text.</param>
        /// <returns>vCard object.</returns>
        public static VCard ParseText(string vCardText)
        {
            VCard v = new VCard();
            RegexOptions options = RegexOptions.IgnoreCase |
                RegexOptions.Multiline | RegexOptions.IgnorePatternWhitespace;

            Regex regex;
            Match m;
            MatchCollection mc;

            // First pass: split the text into property lines. The value group also
            // consumes Quoted-Printable soft line breaks ("=" at the end of a line).
            // Whitespace inside the pattern is ignored (IgnorePatternWhitespace).
            regex = new Regex(@"((?<strElement>[\w]*)
                (;*(ENCODING=)?(?<strAttr>(QUOTED-PRINTABLE)))*  ([^:]*)*
                (:(?<strValue> (([^\n\r]*=[\n\r]+)*
                    [^\n\r]*[^=][\n\r]*) )))", options);
            MatchCollection matches = regex.Matches(vCardText);
 
            foreach (Match match in matches)
            { 
                string ss;
 
                string vCardLine = match.Value;
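                // The regex is case-insensitive, so normalize the property name before dispatching.
                switch (match.Groups["strElement"].Value.ToUpperInvariant())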
                {
                    case "FN":
                        regex = new Regex(@"(?<strElement>(FN))
			(;CHARSET=UTF-8)? (:(?<strFN>[^\n\r]*))", options);
                        m = regex.Match(vCardLine);
                        if (m.Success)
                            v.FormattedName = m.Groups["strFN"].Value;
                        break;
                    case "N":
                        regex = new Regex(@"(?<strElement>(N))
			(;CHARSET=UTF-8)?(:(?<strSurname>
			([^;\n\r]*))) (;(?<strGivenName>([^;\n\r]*)))?
 			(;(?<strMidName>([^;\n\r]*)))? (;(?<strPrefix>([^;\n\r]*)))? 
			(;(?<strSuffix>[^;\n\r]*))?", options);
                        m = regex.Match(vCardLine);
                        if (m.Success)
                        {
                            v.Surname = m.Groups["strSurname"].Value;
                            v.GivenName = m.Groups["strGivenName"].Value;
                            v.MiddleName = m.Groups["strMidName"].Value;
                            v.Prefix = m.Groups["strPrefix"].Value;
                            v.Suffix = m.Groups["strSuffix"].Value;
                        }
                        break;
                    case "TITLE":
                        regex = new Regex(@"(?<strElement>(TITLE))
			(;CHARSET=UTF-8)? (:(?<strTITLE>[^\n\r]*))", options);
                        m = regex.Match(vCardLine);
                        if (m.Success)
                            v.Title = m.Groups["strTITLE"].Value;
                        break;
                    case "ORG":
                        regex = new Regex(@"(?<strElement>(ORG)) (;CHARSET=utf-8)? 
 			(:(?<strORG>[^;\n\r]*))(;(?<strDept>[^\n\r]*))?", options);
                        m = regex.Match(vCardLine);
                        if (m.Success)
                        {
                            v.Org = m.Groups["strORG"].Value;
                            v.Department = m.Groups["strDept"].Value;
                        }

                        break;
                    case "BDAY":
                        regex = new Regex(@"(?<strElement>(BDAY)) 
  				(:(?<strBDAY>[^\n\r]*))", options);
                        m = regex.Match(vCardLine);
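                        if (m.Success)
                        {
                            string[] expectedFormats =
                                { "yyyyMMdd", "yyMMdd", "yyyy-MM-dd" };
                            // TryParseExact avoids an exception when a vendor
                            // writes the date in an unexpected format.
                            DateTime birthday;
                            if (DateTime.TryParseExact(
                                    m.Groups["strBDAY"].Value, expectedFormats,
                                    System.Globalization.CultureInfo.InvariantCulture,
                                    System.Globalization.DateTimeStyles.AllowWhiteSpaces,
                                    out birthday))
                                v.Birthday = birthday;
                        }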
                        break;
                    case "REV":
                        regex = new Regex(@"(?<strElement>(REV))
 			(;CHARSET=utf-8)?  (:(?<strREV>[^\n\r]*))", options);
                        m = regex.Match(vCardLine);
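                        if (m.Success)
                        {
                            string[] expectedFormats =
                                { "yyyyMMddHHmmss", "yyyyMMddTHHmmssZ" };
                            // Some vendors write REV as something other than a date
                            // (for example a short number), so do not throw on it.
                            DateTime rev;
                            if (DateTime.TryParseExact(
                                    m.Groups["strREV"].Value, expectedFormats,
                                    System.Globalization.CultureInfo.InvariantCulture,
                                    System.Globalization.DateTimeStyles.AllowWhiteSpaces,
                                    out rev))
                                v.Rev = rev;
                        }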
                        break;
                    case "EMAIL":
                        regex = new Regex(@"((?<strElement>(EMAIL))
 			((;(?<strAttr>(HOME|WORK)))|(;(?<strPref>(PREF))))* (;[^:]*)*
  			(:(?<strValue>[^\n\r]*)))", options);
                        mc = regex.Matches(vCardLine);
                        if (mc.Count > 0)
                        {
                            for (int i = 0; i < mc.Count; i++)
                            {
                                EmailAddress email = new EmailAddress();
                                v.Emails.Add(email);
                                m = mc[i];
                                email.Address = m.Groups["strValue"].Value;
                                ss = m.Groups["strAttr"].Value;
                                if (ss == "HOME")
                                    email.HomeWorkTypes = HomeWorkTypes.HOME;
                                else if (ss == "WORK")
                                    email.HomeWorkTypes = HomeWorkTypes.WORK;
 
                                if (m.Groups["strPref"].Value == "PREF")
                                    email.Pref = true;
                            }
                        }
 
                        break;
                    case "TEL":
                        regex = new Regex(@"((?<strElement>(TEL))
  			((;(?<strType>(VOICE|CELL|PAGER|MSG|FAX)))| 
			(;(?<strAttr>(HOME|WORK)))| (;(?<strPref>(PREF)))?)*  
			(:(?<strValue>[^\n\r]*)))", options);
                        mc = regex.Matches(vCardLine);
                        if (mc.Count > 0)
                        {
                            for (int i = 0; i < mc.Count; i++)
                            {
                                PhoneNumber phone = new PhoneNumber();
                                v.Phones.Add(phone);
                                m = mc[i];
                                phone.Number = m.Groups["strValue"].Value;
                                ss = m.Groups["strAttr"].Value;
                                if (ss == "HOME")
                                    phone.HomeWorkTypes = HomeWorkTypes.HOME;
                                else if (ss == "WORK")
                                    phone.HomeWorkTypes = HomeWorkTypes.WORK;
 
                                if (m.Groups["strPref"].Value == "PREF")
                                    phone.Pref = true;
 
                                CaptureCollection types = m.Groups["strType"].Captures;
                                foreach (Capture capture in types)
                                {
                                    switch (capture.Value)
                                    {
                                        case "VOICE":
                                            phone.PhoneTypes |= PhoneTypes.VOICE;
                                            break;
                                        case "CELL":
                                            phone.PhoneTypes |= PhoneTypes.CELL;
                                            break;
                                        case "PAGER":
                                            phone.PhoneTypes |= PhoneTypes.PAGER;
                                            break;
                                        case "MSG":
                                            phone.PhoneTypes |= PhoneTypes.MSG;
                                            break;
                                        case "FAX":
                                            phone.PhoneTypes |= PhoneTypes.FAX;
                                            break;
                                    }
                                } 
                            }
                        }
                        break;
                    case "ADR":
                        regex = new Regex(@"(?<strElement>(ADR))
 		(;(?<strAttr>(HOME|WORK)))?(;CHARSET=utf-8)?(:(?<strPo>([^;]*)))
		(;(?<strBlock>([^;]*)))  (;(?<strStreet>([^;]*)))  
		(;(?<strCity>([^;]*))) (;(?<strRegion>([^;]*))) 
		(;(?<strPostcode>([^;]*)))(;(?<strNation>[^\n\r]*))", options);
                        mc = regex.Matches(vCardLine);
                        if (mc.Count > 0)
                        {
                            for (int i = 0; i < mc.Count; i++)
                            {
                                Address address = new Address();
                                v.Addresses.Add(address);
                                m = mc[i];
                                ss = m.Groups["strAttr"].Value;
                                if (ss == "HOME")
                                    address.HomeWorkType = HomeWorkTypes.HOME;
                                else if (ss == "WORK")
                                    address.HomeWorkType = HomeWorkTypes.WORK;
 
                                address.PO = m.Groups["strPo"].Value;
                                address.Ext = m.Groups["strBlock"].Value;
                                address.Street = m.Groups["strStreet"].Value;
                                address.Locality = m.Groups["strCity"].Value;
                                address.Region = m.Groups["strRegion"].Value;
                                address.Postcode = m.Groups["strPostcode"].Value;
                                address.Country = m.Groups["strNation"].Value;
                            }
                        }
                        break;
                    case "NOTE":
                        regex = new Regex(@"((?<strElement>(NOTE))
 		((;CHARSET=[^;]*)?;*(ENCODING=)?(?<strAttr>(QUOTED-PRINTABLE)))* 
 		([^:]*)*  (:(?<strValue> (([^\n\r]*=[\n\r]+)*[^\n\r]*[^=][\n\r]*)
 		)))", options);
                        m = regex.Match(vCardLine);
                        if (m.Success)
                        {
                            if (m.Groups["strAttr"].Value == "QUOTED-PRINTABLE")
                                v.Note = QuotedPrintable.Decode
					(m.Groups["strValue"].Value);
                            else
                                v.Note = m.Groups["strValue"].Value;
                        }
                        break;
                    case "URL":
                        regex = new Regex(@"((?<strElement>(URL)) 
		(;*(?<strAttr>(HOME|WORK)))?   (:(?<strValue>[^\n\r]*)))", options);
                        mc = regex.Matches(vCardLine);
                        if (mc.Count > 0)
                        {
                            for (int i = 0; i < mc.Count; i++)
                            {
                                URL url = new URL();
                                v.URLs.Add(url);
                                m = mc[i];
                                url.Address = m.Groups["strValue"].Value;
                                ss = m.Groups["strAttr"].Value;
                                if (ss == "HOME")
                                    url.HomeWorkTypes = HomeWorkTypes.HOME;
                                else if (ss == "WORK")
                                    url.HomeWorkTypes = HomeWorkTypes.WORK;
                            }
                        }
 
                        break;
                    case "ROLE":
                        regex = new Regex(@"(?<strElement>(ROLE)) 
			(;CHARSET=utf-8)?  (:(?<strROLE>[^\n\r]*))", options);
                        m = regex.Match(vCardLine);
                        if (m.Success)
                            v.Role = m.Groups["strROLE"].Value;
                        break; 
                }
            }
 
            return v;
        } 
    } 

If you have ever reviewed existing vCard implementations in C, PHP or Pascal, you will see that using regular expressions makes the parser simpler and shorter.
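
The two-pass idea used by ParseText (first split the text into property lines, then parse each value by property) can be seen in isolation in the sketch below. It deliberately uses a simplified pattern of my own rather than the one in VCardReader: it ignores Quoted-Printable soft line breaks and grouped names such as item1.TEL. The names TwoPassSketch, SplitLines and SplitName are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical, simplified illustration of the two-pass approach.
public static class TwoPassSketch
{
    // Pass 1 pattern: NAME, optional ;parameters, then :value up to the end of the line.
    private static readonly Regex LineRegex = new Regex(
        @"^(?<name>[A-Za-z-]+)(?<params>(;[^:\r\n]*)*):(?<value>[^\r\n]*)",
        RegexOptions.Multiline);

    // Returns (property name, raw value) pairs in document order.
    public static List<KeyValuePair<string, string>> SplitLines(string vCardText)
    {
        List<KeyValuePair<string, string>> result =
            new List<KeyValuePair<string, string>>();
        foreach (Match m in LineRegex.Matches(vCardText))
        {
            result.Add(new KeyValuePair<string, string>(
                m.Groups["name"].Value.ToUpperInvariant(),
                m.Groups["value"].Value));
        }
        return result;
    }

    // Pass 2 for the N property: Surname;GivenName;MiddleName;Prefix;Suffix.
    public static string[] SplitName(string value)
    {
        return value.Split(';');
    }
}
```

Given a card containing the line N:Doe;John;;Mr.; the first pass returns the pair ("N", "Doe;John;;Mr.;") and SplitName turns the value into its five components. The full parser needs the richer patterns shown above to cope with parameters and encodings.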

Further Work for Integrating to Other Applications

Considering the current implementation of vCard support in Microsoft Outlook, Microsoft Outlook Express, Yahoo Mail and Eudora etc., the following properties and attributes are not supported in this vCard implementation:

  • Photo
  • Address labels
  • Delivery address types
  • Mailer
  • Timezone
  • EMail types
  • Sound
  • Public key
  • Geo
  • Extensions

You might need to modify the code to suit your data exchange needs.
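
As a rough idea of what such a modification could look like, the sketch below parses a Geo property in the same regex style. It assumes the vCard 3.0 form GEO:latitude;longitude; the GeoSketch class and TryParseGeo method are hypothetical and not part of the download, and wiring the result into the VCard class is left out.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical extension for illustration; not part of the article's download.
public static class GeoSketch
{
    // Assumes the vCard 3.0 form: GEO:<latitude>;<longitude>
    private static readonly Regex GeoRegex = new Regex(
        @"^GEO:(?<lat>[-+]?\d+(\.\d+)?);(?<lon>[-+]?\d+(\.\d+)?)\s*$",
        RegexOptions.IgnoreCase);

    // Returns false instead of throwing when the line is not a well-formed GEO property.
    public static bool TryParseGeo(string line, out double latitude, out double longitude)
    {
        latitude = 0;
        longitude = 0;
        Match m = GeoRegex.Match(line);
        if (!m.Success)
            return false;

        // Invariant culture so that '.' is always the decimal separator.
        return double.TryParse(m.Groups["lat"].Value, NumberStyles.Float,
                   CultureInfo.InvariantCulture, out latitude)
            && double.TryParse(m.Groups["lon"].Value, NumberStyles.Float,
                   CultureInfo.InvariantCulture, out longitude);
    }
}
```

In VCardReader, the equivalent would be one more case in the switch statement plus matching properties on VCard.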

Regular expression processing can be made more efficient by precompiling the expressions with Regex.CompileToAssembly.
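
Regex.CompileToAssembly is a .NET Framework API. A lighter alternative, sketched below, is to build each Regex once with RegexOptions.Compiled and reuse it, instead of constructing a new Regex for every matched line as ParseText does. The VCardPatterns class and ReadFormattedName method are assumptions made for illustration; the pattern is the FN pattern from the listing above.

```csharp
using System.Text.RegularExpressions;

// Hypothetical holder for shared Regex instances; not part of the article's download.
public static class VCardPatterns
{
    private const RegexOptions Options =
        RegexOptions.IgnoreCase | RegexOptions.Multiline |
        RegexOptions.IgnorePatternWhitespace | RegexOptions.Compiled;

    // Constructed once when the class is first used, then shared by all callers.
    public static readonly Regex FormattedName = new Regex(
        @"(?<strElement>(FN)) (;CHARSET=UTF-8)? (:(?<strFN>[^\n\r]*))", Options);

    // Returns the FN value, or null when the line is not an FN property.
    public static string ReadFormattedName(string vCardLine)
    {
        Match m = FormattedName.Match(vCardLine);
        return m.Success ? m.Groups["strFN"].Value : null;
    }
}
```

Whether this is measurably faster depends on how many cards are parsed per process; for a handful of cards the compilation cost can outweigh the gain, so it is worth measuring first.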

A VCardFileWriter might be needed to write vCard data to files in different encodings.
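
A minimal sketch of such a writer follows. The class name VCardFileWriterSketch and its Write method are assumptions; the only framework call involved is File.WriteAllText with an explicit Encoding.

```csharp
using System.IO;
using System.Text;

// Hypothetical writer for illustration; the real VCardFileWriter is not shown in this article.
public static class VCardFileWriterSketch
{
    // Writes already-formatted vCard text using the encoding chosen by the caller.
    public static void Write(string path, string vCardText, Encoding encoding)
    {
        File.WriteAllText(path, vCardText, encoding);
    }
}
```

For example, passing new UTF8Encoding(false) writes UTF-8 without a byte order mark, while Encoding.Unicode writes UTF-16 with one. Which encoding a given importer accepts has to be verified per vendor, as the quirks listed earlier suggest.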

An example of integration exists in an open source project that I started, SyncML.NET Client API with SyncML Client for Open Contacts. The latest version of the C# vCard components is in the download area of SyncML.NET Client API.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer
Australia
I started my IT career in 1992, programming embedded devices such as credit card readers, smart card readers and the Palm Pilot.

Since 2000, I have mostly been developing business applications on Windows platforms while also developing some tools for myself and developers around the world, so we developers could focus more on delivering business values rather than repetitive tasks of handling technical details.

Besides technical work, I enjoy reading literature, playing ball games, cooking and gardening.
