
Parsing a Key-Value pair with a Regular Expression

Marco Bertschi, 15 Jun 2014
A walk-through of how a Key-Value pair can be parsed using a Perl-compatible regex engine
This article demonstrates a solution using Qt. The regular expression itself can be used with any Perl-compatible regex engine. The main source of this tip can be found at the RegEx Forum. Thanks to all the fellow members who helped me on my way to this solution.

Introduction

Parsing a Key-Value pair ain't that hard, you say? I'm sure it isn't if you keep it as simple as possible: a Key can contain nothing but letters and digits, a Value can contain nothing but letters and digits, and the two are separated by a '=':

Key1=Value1 Key2=Value2
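
This simple format can be handled with a very small pattern. The following is only a minimal sketch: it assumes ASCII letters and digits, and it uses std::regex from the C++ standard library instead of Qt so that it compiles without any framework. The helper name parseSimplePairs is made up for illustration.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: collects every Key=Value pair in which both sides
// consist only of ASCII letters and digits.
std::vector<std::pair<std::string, std::string>> parseSimplePairs(const std::string& input) {
    static const std::regex pairRegex("([A-Za-z0-9]+)=([A-Za-z0-9]+)");
    std::vector<std::pair<std::string, std::string>> pairs;
    // Walk over all non-overlapping matches from left to right
    for (std::sregex_iterator it(input.begin(), input.end(), pairRegex), end; it != end; ++it) {
        pairs.emplace_back((*it)[1].str(), (*it)[2].str());
    }
    return pairs;
}
```

Called with the line above, parseSimplePairs returns the two pairs (Key1, Value1) and (Key2, Value2). As soon as a value may contain spaces, this approach is no longer enough.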

But what if we take that a bit further? I want a Value to be able to contain optional whitespace:

Key1=Value 1 Key2=Value Key3=Value 3

We are getting somewhere, as you can see. But I want more: I want the Value to be able to contain anything. Of course this leads to a problem, because the '=' is already reserved as the separator between Key and Value. It's solvable by escaping the '=' as '\=' whenever it occurs in a Value. I made up this practical example where the pattern might be of use:

ErrorMessage=The file was not found. Path\=C:/Temp/File.txt ErrorNumber=12312
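
The escaping convention can be sketched with two small helpers. This is an illustration only, not part of the original tip: the names escapeEquals and unescapeEquals are hypothetical, the code uses plain standard C++ rather than Qt, and it assumes that raw values do not already contain the sequence '\=' (the article does not define how a literal backslash would be escaped).

```cpp
#include <string>

// Hypothetical helper: prefixes every '=' in a raw value with a backslash.
std::string escapeEquals(const std::string& raw) {
    std::string result;
    result.reserve(raw.size());
    for (char c : raw) {
        if (c == '=') {
            result.push_back('\\');
        }
        result.push_back(c);
    }
    return result;
}

// Hypothetical helper: turns every "\=" in a parsed value back into "=".
std::string unescapeEquals(const std::string& value) {
    std::string result;
    result.reserve(value.size());
    for (std::size_t i = 0; i < value.size(); ++i) {
        // Skip the backslash only when it directly precedes a '='
        if (value[i] == '\\' && i + 1 < value.size() && value[i + 1] == '=') {
            continue;
        }
        result.push_back(value[i]);
    }
    return result;
}
```

A writer would call escapeEquals on each value before joining it to its key, and a reader would call unescapeEquals on each value after parsing.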

Using the regular expression

The solution presented here works in any Perl-compatible regex engine, even though I will use Qt to demonstrate its use. The regex looks rather daunting when you see it for the first time:

^((\b[^\s=]+)=(([^=]|\\=)+))*$

Once you put it into a tool such as Expresso, its structure becomes much clearer.

The regex essentially contains two capture groups of interest: the key, (\b[^\s=]+), and the value, (([^=]|\\=)+). The two must be separated by a '='. An outer group wraps the whole pair, so in the match result the pair is group 1, the key is group 2 and the value is group 3. A key can contain anything except whitespace or a '=', and a value can contain anything except an unescaped '='. The whole pair can repeat any number of times.
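
To see the group numbering in action, here is a minimal sketch that applies the same pattern with std::regex from the C++ standard library (its default ECMAScript grammar accepts this pattern; this is a stand-in for a Perl-compatible engine, not the Qt code of this tip). The helper name lastPair is hypothetical. Group 2 is read as the key and group 3 as the value.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: stores the key (group 2) and value (group 3) of the
// last pair in 'input'. Returns false if the input is empty or does not match.
bool lastPair(const std::string& input, std::string& key, std::string& value) {
    // Same pattern as above, written as a raw string literal so that
    // the backslashes do not have to be doubled for C++.
    static const std::regex keyValueRegex(R"(^((\b[^\s=]+)=(([^=]|\\=)+))*$)");
    std::smatch match;
    if (!std::regex_search(input, match, keyValueRegex) || !match[1].matched) {
        return false;
    }
    key = match[2].str();
    value = match[3].str();
    return true;
}
```

For the error-message example from the introduction, this reports the key ErrorNumber and the value 12312, because a repeated group only keeps what its final repetition captured.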

Now that you know what the regex does, you need to be able to parse a string with it. One important point remains: because the pair sits inside a repeated group, the match only reports the last Key-Value pair, so we need to process the input string multiple times.

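#include <QDebug>
#include <QRegularExpression>
#include <QString>

//Regular expression as described at
// http://www.codeproject.com/Tips752372/Parsing-a-Key-Value-pair-with-a-Regular-Expression
//(the statements below belong inside a function, e.g. main())
QRegularExpression keyValueRegex("^((\\b[^\\s=]+)=(([^=]|\\\\=)+))*$");
//The backslash has to be doubled so that the string really contains "\="
QString example = "ErrorMessage=File wasn't found_Path\\=C:/Temp/File.txt ErrorNumber=12312";
while(example.length() > 0){//As long as there is input left in example
   //Get the last Key-Value pair from the RegEx
   QRegularExpressionMatch keyValueRegexMatch = keyValueRegex.match(example);

   //Stop if no pair was captured, otherwise the loop would never end
   if(!keyValueRegexMatch.hasMatch() || keyValueRegexMatch.captured(1).isEmpty())
      break;

   //Output the found Key-Value Pair
   qDebug()<<"Key="<<keyValueRegexMatch.captured(2);
   qDebug()<<"Value="<<keyValueRegexMatch.captured(3);

   //Remove the processed Key-Value pair from the input (replace() modifies example in place)
   example.replace(keyValueRegexMatch.captured(1), "");
}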

The above solution is close to complete, but it needs some tweaking: if the removed pair was not the only one in the string, the whitespace that separated it from the preceding Value isn't removed, so the next pass captures that whitespace as part of the Value.
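
One possible tweak is to trim trailing whitespace before every pass. The sketch below shows the idea with std::regex from the C++ standard library rather than Qt; it is an illustration under that assumption, and the names parseAllPairs and KeyValue are hypothetical. Instead of a search-and-replace, it cuts the string at the position where the last pair starts.

```cpp
#include <algorithm>
#include <cctype>
#include <regex>
#include <string>
#include <utility>
#include <vector>

using KeyValue = std::pair<std::string, std::string>;

// Hypothetical helper: peels pairs off the end of the input one at a time,
// trimming trailing whitespace before every pass.
std::vector<KeyValue> parseAllPairs(std::string remaining) {
    static const std::regex keyValueRegex(R"(^((\b[^\s=]+)=(([^=]|\\=)+))*$)");
    std::vector<KeyValue> pairs;
    while (true) {
        // Drop the separator left behind by the previously removed pair
        while (!remaining.empty() &&
               std::isspace(static_cast<unsigned char>(remaining.back()))) {
            remaining.pop_back();
        }
        std::smatch match;
        if (remaining.empty() ||
            !std::regex_search(remaining, match, keyValueRegex) ||
            !match[1].matched) {
            break;  // nothing left, or the rest is not a valid list of pairs
        }
        pairs.emplace_back(match[2].str(), match[3].str());
        // Cut the string where the last pair begins; 'match' is stale afterwards
        remaining.erase(static_cast<std::size_t>(match.position(1)));
    }
    std::reverse(pairs.begin(), pairs.end());  // pairs were found back to front
    return pairs;
}
```

With the error-message example from the introduction, this yields ErrorMessage with the value "The file was not found. Path\=C:/Temp/File.txt" (no trailing space) followed by ErrorNumber with the value "12312". The same trimming step could be added to the Qt loop with QString::trimmed().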

Points of interest

It's fascinating how powerful regular expressions are. This example has also shown me that not all regex engines work the same way, and sometimes you need to tweak a regex to get it to work on a specific engine, even though it worked perfectly with another one. I tested this regex with the Qt regex engine (QRegularExpression, to be exact, not the older QRegExp) and the .NET regex engine, and I'm confident that it will work well with most of the popular regex engines out there.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Marco Bertschi
Software Developer (Junior)
Switzerland
Software Developer (Swiss Federal VET Diploma), experienced with Qt C++, C#, RFC 5424 and Arduino Boards.
Music enthusiast, runner, part-time psychologist for friends, awesome guy.

Article Copyright 2014 by Marco Bertschi